kvmarm.lists.cs.columbia.edu archive mirror
* [PATCH v4 0/6] Support writable CPU ID registers from userspace
@ 2023-03-17  5:06 Jing Zhang
  2023-03-17  5:06 ` [PATCH v4 1/6] KVM: arm64: Move CPU ID feature registers emulation into a separate file Jing Zhang
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-17  5:06 UTC
  To: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton
  Cc: Will Deacon, Paolo Bonzini, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Fuad Tabba, Reiji Watanabe, Ricardo Koller,
	Raghavendra Rao Ananta, Jing Zhang

This patchset refactors/adds code to support writable per-guest CPU ID feature
registers. Part of the code/ideas are from
https://lore.kernel.org/all/20220419065544.3616948-1-reijiw@google.com .
No functional change is intended in this patchset. With the new CPU ID feature
register infrastructure, only writes to ID_AA64PFR0_EL1.[CSV2|CSV3],
ID_AA64DFR0_EL1.PMUVer and ID_DFR0_EL1.PerfMon are allowed, matching KVM's
existing behaviour.

Writable (configurable) per-guest CPU ID feature registers are useful for
creating/migrating guests across ARM CPUs with different feature sets.
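
For illustration, here is a minimal userspace sketch of what such a write
looks like through the existing KVM_GET/SET_ONE_REG vcpu ioctls (a sketch
only: the set_csv2() helper and vcpu_fd plumbing are assumptions; the
ARM64_SYS_REG() encoding macro and the ioctls are the standard KVM uapi):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* ID_AA64PFR0_EL1 is encoded as (Op0=3, Op1=0, CRn=0, CRm=4, Op2=0) */
static int set_csv2(int vcpu_fd)
{
	struct kvm_one_reg reg;
	__u64 val;

	reg.id = ARM64_SYS_REG(3, 0, 0, 4, 0);
	reg.addr = (__u64)&val;

	/* Start from the value KVM currently exposes */
	if (ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg))
		return -1;

	/* CSV2 is bits [59:56]; KVM rejects values the HW can't back */
	val &= ~(0xfULL << 56);
	val |= 1ULL << 56;

	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}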

---

* v3 -> v4
  - Remove IDREG() macro for ID reg access, use simple array access instead
  - Rename kvm_arm_read_id_reg_with_encoding() to kvm_arm_read_id_reg()
  - Save perfmon value in ID_DFR0_EL1 instead of pmuver
  - Update perfmon in ID_DFR0_EL1 and pmuver in ID_AA64DFR0_EL1 atomically
  - Remove kvm_vcpu_has_pmu() in macro kvm_pmu_is_3p5()
  - Improve ID register sanity checking in kvm_arm_check_idreg_table()

* v2 -> v3
  - Rebased to 96a4627dbbd4 (kvmarm/next)
    Merge tag 'kvmarm-6.3' of https://github.com/oupton/linux into kvmarm-master/next
  - Add ID register emulation entry point function emulate_id_reg()
  - Fix consistency for ID_AA64DFR0_EL1.PMUVer and ID_DFR0_EL1.PerfMon
  - Improve the ID register table check by ensuring that every entry has
    a correct ID register encoding.
  - Address other comments from Reiji and Marc.

* v1 -> v2
  - Rebase to 7121a2e1d107 (kvmarm/next) Merge branch kvm-arm64/nv-prefix into kvmarm/next
  - Address writing issue for PMUVer

[1] https://lore.kernel.org/all/20230201025048.205820-1-jingzhangos@google.com
[2] https://lore.kernel.org/all/20230212215830.2975485-1-jingzhangos@google.com
[3] https://lore.kernel.org/all/20230228062246.1222387-1-jingzhangos@google.com

---

Jing Zhang (5):
  KVM: arm64: Move CPU ID feature registers emulation into a separate
    file
  KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3]
  KVM: arm64: Use per guest ID register for ID_AA64DFR0_EL1.PMUVer
  KVM: arm64: Introduce ID register specific descriptor
  KVM: arm64: Refactor writings for PMUVer/CSV2/CSV3

Reiji Watanabe (1):
  KVM: arm64: Save ID registers' sanitized value per guest

 arch/arm64/include/asm/cpufeature.h |  25 +
 arch/arm64/include/asm/kvm_host.h   |  24 +-
 arch/arm64/kernel/cpufeature.c      |  26 +-
 arch/arm64/kvm/Makefile             |   2 +-
 arch/arm64/kvm/arm.c                |  24 +-
 arch/arm64/kvm/hyp/nvhe/sys_regs.c  |   7 +-
 arch/arm64/kvm/id_regs.c            | 807 ++++++++++++++++++++++++++++
 arch/arm64/kvm/sys_regs.c           | 469 +---------------
 arch/arm64/kvm/sys_regs.h           |  43 ++
 include/kvm/arm_pmu.h               |   5 +-
 10 files changed, 930 insertions(+), 502 deletions(-)
 create mode 100644 arch/arm64/kvm/id_regs.c


base-commit: 96a4627dbbd48144a65af936b321701c70876026
-- 
2.40.0.rc1.284.g88254d51c5-goog



* [PATCH v4 1/6] KVM: arm64: Move CPU ID feature registers emulation into a separate file
  2023-03-17  5:06 [PATCH v4 0/6] Support writable CPU ID registers from userspace Jing Zhang
@ 2023-03-17  5:06 ` Jing Zhang
  2023-03-27 10:14   ` Marc Zyngier
  2023-03-17  5:06 ` [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest Jing Zhang
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Jing Zhang @ 2023-03-17  5:06 UTC
  To: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton
  Cc: Will Deacon, Paolo Bonzini, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Fuad Tabba, Reiji Watanabe, Ricardo Koller,
	Raghavendra Rao Ananta, Jing Zhang

Create a new file, id_regs.c, to hold the CPU ID feature register emulation
code, which is moved out of sys_regs.c; tweak the sys_regs code accordingly.

No functional change intended.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
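Note (for illustration only, not part of the patch): after this change the
sys_reg trap path forks on the architectural ID register encoding space
before consulting the main table. Condensed from the kvm_handle_sys_reg()
hunk below, with locals and error paths omitted:

	params = esr_sys64_to_params(esr);
	params.regval = vcpu_get_reg(vcpu, Rt);

	/* (Op0, Op1, CRn) == (3, 0, 0) with 1 <= CRm < 8 is the ID space */
	if (is_id_reg(reg_to_encoding(&params)))
		return emulate_id_reg(vcpu, &params);	/* id_regs.c */

	if (!emulate_sys_reg(vcpu, &params))		/* everything else */
		return 1;
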
 arch/arm64/kvm/Makefile   |   2 +-
 arch/arm64/kvm/id_regs.c  | 506 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kvm/sys_regs.c | 464 ++--------------------------------
 arch/arm64/kvm/sys_regs.h |  41 +++
 4 files changed, 575 insertions(+), 438 deletions(-)
 create mode 100644 arch/arm64/kvm/id_regs.c

diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index c0c050e53157..a6a315fcd81e 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -13,7 +13,7 @@ obj-$(CONFIG_KVM) += hyp/
 kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 	 inject_fault.o va_layout.o handle_exit.o \
 	 guest.o debug.o reset.o sys_regs.o stacktrace.o \
-	 vgic-sys-reg-v3.o fpsimd.o pkvm.o \
+	 vgic-sys-reg-v3.o fpsimd.o pkvm.o id_regs.o \
 	 arch_timer.o trng.o vmid.o emulate-nested.o nested.o \
 	 vgic/vgic.o vgic/vgic-init.o \
 	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
new file mode 100644
index 000000000000..08b738852955
--- /dev/null
+++ b/arch/arm64/kvm/id_regs.c
@@ -0,0 +1,506 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 - Google LLC
+ * Author: Jing Zhang <jingzhangos@google.com>
+ *
+ * Moved from arch/arm64/kvm/sys_regs.c
+ * Copyright (C) 2012,2013 - ARM Ltd
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bsearch.h>
+#include <linux/kvm_host.h>
+#include <asm/kvm_emulate.h>
+#include <asm/sysreg.h>
+#include <asm/cpufeature.h>
+#include <asm/kvm_nested.h>
+
+#include "sys_regs.h"
+
+static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
+{
+	if (kvm_vcpu_has_pmu(vcpu))
+		return vcpu->kvm->arch.dfr0_pmuver.imp;
+
+	return vcpu->kvm->arch.dfr0_pmuver.unimp;
+}
+
+static u8 perfmon_to_pmuver(u8 perfmon)
+{
+	switch (perfmon) {
+	case ID_DFR0_EL1_PerfMon_PMUv3:
+		return ID_AA64DFR0_EL1_PMUVer_IMP;
+	case ID_DFR0_EL1_PerfMon_IMPDEF:
+		return ID_AA64DFR0_EL1_PMUVer_IMP_DEF;
+	default:
+		/* Anything ARMv8.1+ and NI have the same value. For now. */
+		return perfmon;
+	}
+}
+
+static u8 pmuver_to_perfmon(u8 pmuver)
+{
+	switch (pmuver) {
+	case ID_AA64DFR0_EL1_PMUVer_IMP:
+		return ID_DFR0_EL1_PerfMon_PMUv3;
+	case ID_AA64DFR0_EL1_PMUVer_IMP_DEF:
+		return ID_DFR0_EL1_PerfMon_IMPDEF;
+	default:
+		/* Anything ARMv8.1+ and NI have the same value. For now. */
+		return pmuver;
+	}
+}
+
+/* Read a sanitised cpufeature ID register by sys_reg_desc */
+static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
+{
+	u32 id = reg_to_encoding(r);
+	u64 val;
+
+	if (sysreg_visible_as_raz(vcpu, r))
+		return 0;
+
+	val = read_sanitised_ftr_reg(id);
+
+	switch (id) {
+	case SYS_ID_AA64PFR0_EL1:
+		if (!vcpu_has_sve(vcpu))
+			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
+		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
+		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
+				  (u64)vcpu->kvm->arch.pfr0_csv2);
+		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
+				  (u64)vcpu->kvm->arch.pfr0_csv3);
+		if (kvm_vgic_global_state.type == VGIC_V3) {
+			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
+			val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
+		}
+		break;
+	case SYS_ID_AA64PFR1_EL1:
+		if (!kvm_has_mte(vcpu->kvm))
+			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE);
+
+		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME);
+		break;
+	case SYS_ID_AA64ISAR1_EL1:
+		if (!vcpu_has_ptrauth(vcpu))
+			val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA) |
+				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API) |
+				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPA) |
+				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPI));
+		break;
+	case SYS_ID_AA64ISAR2_EL1:
+		if (!vcpu_has_ptrauth(vcpu))
+			val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3) |
+				 ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3));
+		if (!cpus_have_final_cap(ARM64_HAS_WFXT))
+			val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
+		break;
+	case SYS_ID_AA64DFR0_EL1:
+		/* Limit debug to ARMv8.0 */
+		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
+		/* Set PMUver to the required version */
+		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
+				  vcpu_pmuver(vcpu));
+		/* Hide SPE from guests */
+		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);
+		break;
+	case SYS_ID_DFR0_EL1:
+		val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon),
+				  pmuver_to_perfmon(vcpu_pmuver(vcpu)));
+		break;
+	case SYS_ID_AA64MMFR2_EL1:
+		val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK;
+		break;
+	case SYS_ID_MMFR4_EL1:
+		val &= ~ARM64_FEATURE_MASK(ID_MMFR4_EL1_CCIDX);
+		break;
+	}
+
+	return val;
+}
+
+/* cpufeature ID register access trap handlers */
+
+static bool access_id_reg(struct kvm_vcpu *vcpu,
+			  struct sys_reg_params *p,
+			  const struct sys_reg_desc *r)
+{
+	if (p->is_write)
+		return write_to_read_only(vcpu, p, r);
+
+	p->regval = read_id_reg(vcpu, r);
+	if (vcpu_has_nv(vcpu))
+		access_nested_id_reg(vcpu, p, r);
+
+	return true;
+}
+
+/*
+ * cpufeature ID register user accessors
+ *
+ * For now, these registers are immutable for userspace, so no values
+ * are stored, and for set_id_reg() we don't allow the effective value
+ * to be changed.
+ */
+static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
+		      u64 *val)
+{
+	*val = read_id_reg(vcpu, rd);
+	return 0;
+}
+
+static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
+		      u64 val)
+{
+	/* This is what we mean by invariant: you can't change it. */
+	if (val != read_id_reg(vcpu, rd))
+		return -EINVAL;
+
+	return 0;
+}
+
+static unsigned int id_visibility(const struct kvm_vcpu *vcpu,
+				  const struct sys_reg_desc *r)
+{
+	u32 id = reg_to_encoding(r);
+
+	switch (id) {
+	case SYS_ID_AA64ZFR0_EL1:
+		if (!vcpu_has_sve(vcpu))
+			return REG_RAZ;
+		break;
+	}
+
+	return 0;
+}
+
+static unsigned int aa32_id_visibility(const struct kvm_vcpu *vcpu,
+				       const struct sys_reg_desc *r)
+{
+	/*
+	 * AArch32 ID registers are UNKNOWN if AArch32 isn't implemented at any
+	 * EL. Promote to RAZ/WI in order to guarantee consistency between
+	 * systems.
+	 */
+	if (!kvm_supports_32bit_el0())
+		return REG_RAZ | REG_USER_WI;
+
+	return id_visibility(vcpu, r);
+}
+
+static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
+			       const struct sys_reg_desc *rd,
+			       u64 val)
+{
+	u8 csv2, csv3;
+
+	/*
+	 * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
+	 * it doesn't promise more than what is actually provided (the
+	 * guest could otherwise be covered in ectoplasmic residue).
+	 */
+	csv2 = cpuid_feature_extract_unsigned_field(val, ID_AA64PFR0_EL1_CSV2_SHIFT);
+	if (csv2 > 1 ||
+	    (csv2 && arm64_get_spectre_v2_state() != SPECTRE_UNAFFECTED))
+		return -EINVAL;
+
+	/* Same thing for CSV3 */
+	csv3 = cpuid_feature_extract_unsigned_field(val, ID_AA64PFR0_EL1_CSV3_SHIFT);
+	if (csv3 > 1 ||
+	    (csv3 && arm64_get_meltdown_state() != SPECTRE_UNAFFECTED))
+		return -EINVAL;
+
+	/* We can only differ with CSV[23], and anything else is an error */
+	val ^= read_id_reg(vcpu, rd);
+	val &= ~(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
+		 ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
+	if (val)
+		return -EINVAL;
+
+	vcpu->kvm->arch.pfr0_csv2 = csv2;
+	vcpu->kvm->arch.pfr0_csv3 = csv3;
+
+	return 0;
+}
+
+static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
+			       const struct sys_reg_desc *rd,
+			       u64 val)
+{
+	u8 pmuver, host_pmuver;
+	bool valid_pmu;
+
+	host_pmuver = kvm_arm_pmu_get_pmuver_limit();
+
+	/*
+	 * Allow AA64DFR0_EL1.PMUver to be set from userspace as long
+	 * as it doesn't promise more than what the HW gives us. We
+	 * allow an IMPDEF PMU though, only if no PMU is supported
+	 * (KVM backward compatibility handling).
+	 */
+	pmuver = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), val);
+	if ((pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF && pmuver > host_pmuver))
+		return -EINVAL;
+
+	valid_pmu = (pmuver != 0 && pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF);
+
+	/* Make sure view register and PMU support do match */
+	if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
+		return -EINVAL;
+
+	/* We can only differ with PMUver, and anything else is an error */
+	val ^= read_id_reg(vcpu, rd);
+	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
+	if (val)
+		return -EINVAL;
+
+	if (valid_pmu)
+		vcpu->kvm->arch.dfr0_pmuver.imp = pmuver;
+	else
+		vcpu->kvm->arch.dfr0_pmuver.unimp = pmuver;
+
+	return 0;
+}
+
+static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
+			   const struct sys_reg_desc *rd,
+			   u64 val)
+{
+	u8 perfmon, host_perfmon;
+	bool valid_pmu;
+
+	host_perfmon = pmuver_to_perfmon(kvm_arm_pmu_get_pmuver_limit());
+
+	/*
+	 * Allow DFR0_EL1.PerfMon to be set from userspace as long as
+	 * it doesn't promise more than what the HW gives us on the
+	 * AArch64 side (as everything is emulated with that), and
+	 * that this is a PMUv3.
+	 */
+	perfmon = FIELD_GET(ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), val);
+	if ((perfmon != ID_DFR0_EL1_PerfMon_IMPDEF && perfmon > host_perfmon) ||
+	    (perfmon != 0 && perfmon < ID_DFR0_EL1_PerfMon_PMUv3))
+		return -EINVAL;
+
+	valid_pmu = (perfmon != 0 && perfmon != ID_DFR0_EL1_PerfMon_IMPDEF);
+
+	/* Make sure view register and PMU support do match */
+	if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
+		return -EINVAL;
+
+	/* We can only differ with PerfMon, and anything else is an error */
+	val ^= read_id_reg(vcpu, rd);
+	val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
+	if (val)
+		return -EINVAL;
+
+	if (valid_pmu)
+		vcpu->kvm->arch.dfr0_pmuver.imp = perfmon_to_pmuver(perfmon);
+	else
+		vcpu->kvm->arch.dfr0_pmuver.unimp = perfmon_to_pmuver(perfmon);
+
+	return 0;
+}
+
+/* sys_reg_desc initialiser for known cpufeature ID registers */
+#define ID_SANITISED(name) {			\
+	SYS_DESC(SYS_##name),			\
+	.access	= access_id_reg,		\
+	.get_user = get_id_reg,			\
+	.set_user = set_id_reg,			\
+	.visibility = id_visibility,		\
+}
+
+/* sys_reg_desc initialiser for known cpufeature ID registers */
+#define AA32_ID_SANITISED(name) {		\
+	SYS_DESC(SYS_##name),			\
+	.access	= access_id_reg,		\
+	.get_user = get_id_reg,			\
+	.set_user = set_id_reg,			\
+	.visibility = aa32_id_visibility,	\
+}
+
+/*
+ * sys_reg_desc initialiser for architecturally unallocated cpufeature ID
+ * register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2
+ * (1 <= crm < 8, 0 <= Op2 < 8).
+ */
+#define ID_UNALLOCATED(crm, op2) {			\
+	Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2),	\
+	.access = access_id_reg,			\
+	.get_user = get_id_reg,				\
+	.set_user = set_id_reg,				\
+	.visibility = raz_visibility			\
+}
+
+/*
+ * sys_reg_desc initialiser for known ID registers that we hide from guests.
+ * For now, these are exposed just like unallocated ID regs: they appear
+ * RAZ for the guest.
+ */
+#define ID_HIDDEN(name) {			\
+	SYS_DESC(SYS_##name),			\
+	.access = access_id_reg,		\
+	.get_user = get_id_reg,			\
+	.set_user = set_id_reg,			\
+	.visibility = raz_visibility,		\
+}
+
+static const struct sys_reg_desc id_reg_descs[] = {
+	/*
+	 * ID regs: all ID_SANITISED() entries here must have corresponding
+	 * entries in arm64_ftr_regs[].
+	 */
+
+	/* AArch64 mappings of the AArch32 ID registers */
+	/* CRm=1 */
+	AA32_ID_SANITISED(ID_PFR0_EL1),
+	AA32_ID_SANITISED(ID_PFR1_EL1),
+	{ SYS_DESC(SYS_ID_DFR0_EL1), .access = access_id_reg,
+	  .get_user = get_id_reg, .set_user = set_id_dfr0_el1,
+	  .visibility = aa32_id_visibility, },
+	ID_HIDDEN(ID_AFR0_EL1),
+	AA32_ID_SANITISED(ID_MMFR0_EL1),
+	AA32_ID_SANITISED(ID_MMFR1_EL1),
+	AA32_ID_SANITISED(ID_MMFR2_EL1),
+	AA32_ID_SANITISED(ID_MMFR3_EL1),
+
+	/* CRm=2 */
+	AA32_ID_SANITISED(ID_ISAR0_EL1),
+	AA32_ID_SANITISED(ID_ISAR1_EL1),
+	AA32_ID_SANITISED(ID_ISAR2_EL1),
+	AA32_ID_SANITISED(ID_ISAR3_EL1),
+	AA32_ID_SANITISED(ID_ISAR4_EL1),
+	AA32_ID_SANITISED(ID_ISAR5_EL1),
+	AA32_ID_SANITISED(ID_MMFR4_EL1),
+	AA32_ID_SANITISED(ID_ISAR6_EL1),
+
+	/* CRm=3 */
+	AA32_ID_SANITISED(MVFR0_EL1),
+	AA32_ID_SANITISED(MVFR1_EL1),
+	AA32_ID_SANITISED(MVFR2_EL1),
+	ID_UNALLOCATED(3, 3),
+	AA32_ID_SANITISED(ID_PFR2_EL1),
+	ID_HIDDEN(ID_DFR1_EL1),
+	AA32_ID_SANITISED(ID_MMFR5_EL1),
+	ID_UNALLOCATED(3, 7),
+
+	/* AArch64 ID registers */
+	/* CRm=4 */
+	{ SYS_DESC(SYS_ID_AA64PFR0_EL1), .access = access_id_reg,
+	  .get_user = get_id_reg, .set_user = set_id_aa64pfr0_el1, },
+	ID_SANITISED(ID_AA64PFR1_EL1),
+	ID_UNALLOCATED(4, 2),
+	ID_UNALLOCATED(4, 3),
+	ID_SANITISED(ID_AA64ZFR0_EL1),
+	ID_HIDDEN(ID_AA64SMFR0_EL1),
+	ID_UNALLOCATED(4, 6),
+	ID_UNALLOCATED(4, 7),
+
+	/* CRm=5 */
+	{ SYS_DESC(SYS_ID_AA64DFR0_EL1), .access = access_id_reg,
+	  .get_user = get_id_reg, .set_user = set_id_aa64dfr0_el1, },
+	ID_SANITISED(ID_AA64DFR1_EL1),
+	ID_UNALLOCATED(5, 2),
+	ID_UNALLOCATED(5, 3),
+	ID_HIDDEN(ID_AA64AFR0_EL1),
+	ID_HIDDEN(ID_AA64AFR1_EL1),
+	ID_UNALLOCATED(5, 6),
+	ID_UNALLOCATED(5, 7),
+
+	/* CRm=6 */
+	ID_SANITISED(ID_AA64ISAR0_EL1),
+	ID_SANITISED(ID_AA64ISAR1_EL1),
+	ID_SANITISED(ID_AA64ISAR2_EL1),
+	ID_UNALLOCATED(6, 3),
+	ID_UNALLOCATED(6, 4),
+	ID_UNALLOCATED(6, 5),
+	ID_UNALLOCATED(6, 6),
+	ID_UNALLOCATED(6, 7),
+
+	/* CRm=7 */
+	ID_SANITISED(ID_AA64MMFR0_EL1),
+	ID_SANITISED(ID_AA64MMFR1_EL1),
+	ID_SANITISED(ID_AA64MMFR2_EL1),
+	ID_UNALLOCATED(7, 3),
+	ID_UNALLOCATED(7, 4),
+	ID_UNALLOCATED(7, 5),
+	ID_UNALLOCATED(7, 6),
+	ID_UNALLOCATED(7, 7),
+};
+
+/**
+ * emulate_id_reg - Emulate a guest access to an AArch64 CPU ID feature register
+ * @vcpu: The VCPU pointer
+ * @params: Decoded system register parameters
+ *
+ * Return: 1, indicating the access has been handled (emulated or UNDEF'd).
+ */
+int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params)
+{
+	const struct sys_reg_desc *r;
+
+	r = find_reg(params, id_reg_descs, ARRAY_SIZE(id_reg_descs));
+
+	if (likely(r)) {
+		perform_access(vcpu, params, r);
+	} else {
+		print_sys_reg_msg(params,
+				  "Unsupported guest id_reg access at: %lx [%08lx]\n",
+				  *vcpu_pc(vcpu), *vcpu_cpsr(vcpu));
+		kvm_inject_undefined(vcpu);
+	}
+
+	return 1;
+}
+
+
+void kvm_arm_reset_id_regs(struct kvm_vcpu *vcpu)
+{
+	unsigned long i;
+
+	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++)
+		if (id_reg_descs[i].reset)
+			id_reg_descs[i].reset(vcpu, &id_reg_descs[i]);
+}
+
+int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+	return kvm_sys_reg_get_user(vcpu, reg,
+				    id_reg_descs, ARRAY_SIZE(id_reg_descs));
+}
+
+int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
+{
+	return kvm_sys_reg_set_user(vcpu, reg,
+				    id_reg_descs, ARRAY_SIZE(id_reg_descs));
+}
+
+bool kvm_arm_check_idreg_table(void)
+{
+	return check_sysreg_table(id_reg_descs, ARRAY_SIZE(id_reg_descs), false);
+}
+
+int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
+{
+	const struct sys_reg_desc *i2, *end2;
+	unsigned int total = 0;
+	int err;
+
+	i2 = id_reg_descs;
+	end2 = id_reg_descs + ARRAY_SIZE(id_reg_descs);
+
+	while (i2 != end2) {
+		err = walk_one_sys_reg(vcpu, i2++, &uind, &total);
+		if (err)
+			return err;
+	}
+	return total;
+}
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 53749d3a0996..22b60474fcab 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -53,16 +53,6 @@ static bool read_from_write_only(struct kvm_vcpu *vcpu,
 	return false;
 }
 
-static bool write_to_read_only(struct kvm_vcpu *vcpu,
-			       struct sys_reg_params *params,
-			       const struct sys_reg_desc *r)
-{
-	WARN_ONCE(1, "Unexpected sys_reg write to read-only register\n");
-	print_sys_reg_instr(params);
-	kvm_inject_undefined(vcpu);
-	return false;
-}
-
 u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
 {
 	u64 val = 0x8badf00d8badf00d;
@@ -1153,163 +1143,6 @@ static bool access_arch_timer(struct kvm_vcpu *vcpu,
 	return true;
 }
 
-static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
-{
-	if (kvm_vcpu_has_pmu(vcpu))
-		return vcpu->kvm->arch.dfr0_pmuver.imp;
-
-	return vcpu->kvm->arch.dfr0_pmuver.unimp;
-}
-
-static u8 perfmon_to_pmuver(u8 perfmon)
-{
-	switch (perfmon) {
-	case ID_DFR0_EL1_PerfMon_PMUv3:
-		return ID_AA64DFR0_EL1_PMUVer_IMP;
-	case ID_DFR0_EL1_PerfMon_IMPDEF:
-		return ID_AA64DFR0_EL1_PMUVer_IMP_DEF;
-	default:
-		/* Anything ARMv8.1+ and NI have the same value. For now. */
-		return perfmon;
-	}
-}
-
-static u8 pmuver_to_perfmon(u8 pmuver)
-{
-	switch (pmuver) {
-	case ID_AA64DFR0_EL1_PMUVer_IMP:
-		return ID_DFR0_EL1_PerfMon_PMUv3;
-	case ID_AA64DFR0_EL1_PMUVer_IMP_DEF:
-		return ID_DFR0_EL1_PerfMon_IMPDEF;
-	default:
-		/* Anything ARMv8.1+ and NI have the same value. For now. */
-		return pmuver;
-	}
-}
-
-/* Read a sanitised cpufeature ID register by sys_reg_desc */
-static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
-{
-	u32 id = reg_to_encoding(r);
-	u64 val;
-
-	if (sysreg_visible_as_raz(vcpu, r))
-		return 0;
-
-	val = read_sanitised_ftr_reg(id);
-
-	switch (id) {
-	case SYS_ID_AA64PFR0_EL1:
-		if (!vcpu_has_sve(vcpu))
-			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), (u64)vcpu->kvm->arch.pfr0_csv2);
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), (u64)vcpu->kvm->arch.pfr0_csv3);
-		if (kvm_vgic_global_state.type == VGIC_V3) {
-			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
-			val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
-		}
-		break;
-	case SYS_ID_AA64PFR1_EL1:
-		if (!kvm_has_mte(vcpu->kvm))
-			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE);
-
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME);
-		break;
-	case SYS_ID_AA64ISAR1_EL1:
-		if (!vcpu_has_ptrauth(vcpu))
-			val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA) |
-				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API) |
-				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPA) |
-				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPI));
-		break;
-	case SYS_ID_AA64ISAR2_EL1:
-		if (!vcpu_has_ptrauth(vcpu))
-			val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3) |
-				 ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3));
-		if (!cpus_have_final_cap(ARM64_HAS_WFXT))
-			val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
-		break;
-	case SYS_ID_AA64DFR0_EL1:
-		/* Limit debug to ARMv8.0 */
-		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
-		/* Set PMUver to the required version */
-		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
-				  vcpu_pmuver(vcpu));
-		/* Hide SPE from guests */
-		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);
-		break;
-	case SYS_ID_DFR0_EL1:
-		val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon),
-				  pmuver_to_perfmon(vcpu_pmuver(vcpu)));
-		break;
-	case SYS_ID_AA64MMFR2_EL1:
-		val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK;
-		break;
-	case SYS_ID_MMFR4_EL1:
-		val &= ~ARM64_FEATURE_MASK(ID_MMFR4_EL1_CCIDX);
-		break;
-	}
-
-	return val;
-}
-
-static unsigned int id_visibility(const struct kvm_vcpu *vcpu,
-				  const struct sys_reg_desc *r)
-{
-	u32 id = reg_to_encoding(r);
-
-	switch (id) {
-	case SYS_ID_AA64ZFR0_EL1:
-		if (!vcpu_has_sve(vcpu))
-			return REG_RAZ;
-		break;
-	}
-
-	return 0;
-}
-
-static unsigned int aa32_id_visibility(const struct kvm_vcpu *vcpu,
-				       const struct sys_reg_desc *r)
-{
-	/*
-	 * AArch32 ID registers are UNKNOWN if AArch32 isn't implemented at any
-	 * EL. Promote to RAZ/WI in order to guarantee consistency between
-	 * systems.
-	 */
-	if (!kvm_supports_32bit_el0())
-		return REG_RAZ | REG_USER_WI;
-
-	return id_visibility(vcpu, r);
-}
-
-static unsigned int raz_visibility(const struct kvm_vcpu *vcpu,
-				   const struct sys_reg_desc *r)
-{
-	return REG_RAZ;
-}
-
-/* cpufeature ID register access trap handlers */
-
-static bool access_id_reg(struct kvm_vcpu *vcpu,
-			  struct sys_reg_params *p,
-			  const struct sys_reg_desc *r)
-{
-	if (p->is_write)
-		return write_to_read_only(vcpu, p, r);
-
-	p->regval = read_id_reg(vcpu, r);
-	if (vcpu_has_nv(vcpu))
-		access_nested_id_reg(vcpu, p, r);
-
-	return true;
-}
-
 /* Visibility overrides for SVE-specific control registers */
 static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
 				   const struct sys_reg_desc *rd)
@@ -1320,144 +1153,6 @@ static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
 	return REG_HIDDEN;
 }
 
-static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
-			       const struct sys_reg_desc *rd,
-			       u64 val)
-{
-	u8 csv2, csv3;
-
-	/*
-	 * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
-	 * it doesn't promise more than what is actually provided (the
-	 * guest could otherwise be covered in ectoplasmic residue).
-	 */
-	csv2 = cpuid_feature_extract_unsigned_field(val, ID_AA64PFR0_EL1_CSV2_SHIFT);
-	if (csv2 > 1 ||
-	    (csv2 && arm64_get_spectre_v2_state() != SPECTRE_UNAFFECTED))
-		return -EINVAL;
-
-	/* Same thing for CSV3 */
-	csv3 = cpuid_feature_extract_unsigned_field(val, ID_AA64PFR0_EL1_CSV3_SHIFT);
-	if (csv3 > 1 ||
-	    (csv3 && arm64_get_meltdown_state() != SPECTRE_UNAFFECTED))
-		return -EINVAL;
-
-	/* We can only differ with CSV[23], and anything else is an error */
-	val ^= read_id_reg(vcpu, rd);
-	val &= ~(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
-		 ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
-	if (val)
-		return -EINVAL;
-
-	vcpu->kvm->arch.pfr0_csv2 = csv2;
-	vcpu->kvm->arch.pfr0_csv3 = csv3;
-
-	return 0;
-}
-
-static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
-			       const struct sys_reg_desc *rd,
-			       u64 val)
-{
-	u8 pmuver, host_pmuver;
-	bool valid_pmu;
-
-	host_pmuver = kvm_arm_pmu_get_pmuver_limit();
-
-	/*
-	 * Allow AA64DFR0_EL1.PMUver to be set from userspace as long
-	 * as it doesn't promise more than what the HW gives us. We
-	 * allow an IMPDEF PMU though, only if no PMU is supported
-	 * (KVM backward compatibility handling).
-	 */
-	pmuver = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), val);
-	if ((pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF && pmuver > host_pmuver))
-		return -EINVAL;
-
-	valid_pmu = (pmuver != 0 && pmuver != ID_AA64DFR0_EL1_PMUVer_IMP_DEF);
-
-	/* Make sure view register and PMU support do match */
-	if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
-		return -EINVAL;
-
-	/* We can only differ with PMUver, and anything else is an error */
-	val ^= read_id_reg(vcpu, rd);
-	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
-	if (val)
-		return -EINVAL;
-
-	if (valid_pmu)
-		vcpu->kvm->arch.dfr0_pmuver.imp = pmuver;
-	else
-		vcpu->kvm->arch.dfr0_pmuver.unimp = pmuver;
-
-	return 0;
-}
-
-static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
-			   const struct sys_reg_desc *rd,
-			   u64 val)
-{
-	u8 perfmon, host_perfmon;
-	bool valid_pmu;
-
-	host_perfmon = pmuver_to_perfmon(kvm_arm_pmu_get_pmuver_limit());
-
-	/*
-	 * Allow DFR0_EL1.PerfMon to be set from userspace as long as
-	 * it doesn't promise more than what the HW gives us on the
-	 * AArch64 side (as everything is emulated with that), and
-	 * that this is a PMUv3.
-	 */
-	perfmon = FIELD_GET(ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), val);
-	if ((perfmon != ID_DFR0_EL1_PerfMon_IMPDEF && perfmon > host_perfmon) ||
-	    (perfmon != 0 && perfmon < ID_DFR0_EL1_PerfMon_PMUv3))
-		return -EINVAL;
-
-	valid_pmu = (perfmon != 0 && perfmon != ID_DFR0_EL1_PerfMon_IMPDEF);
-
-	/* Make sure view register and PMU support do match */
-	if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
-		return -EINVAL;
-
-	/* We can only differ with PerfMon, and anything else is an error */
-	val ^= read_id_reg(vcpu, rd);
-	val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
-	if (val)
-		return -EINVAL;
-
-	if (valid_pmu)
-		vcpu->kvm->arch.dfr0_pmuver.imp = perfmon_to_pmuver(perfmon);
-	else
-		vcpu->kvm->arch.dfr0_pmuver.unimp = perfmon_to_pmuver(perfmon);
-
-	return 0;
-}
-
-/*
- * cpufeature ID register user accessors
- *
- * For now, these registers are immutable for userspace, so no values
- * are stored, and for set_id_reg() we don't allow the effective value
- * to be changed.
- */
-static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
-		      u64 *val)
-{
-	*val = read_id_reg(vcpu, rd);
-	return 0;
-}
-
-static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
-		      u64 val)
-{
-	/* This is what we mean by invariant: you can't change it. */
-	if (val != read_id_reg(vcpu, rd))
-		return -EINVAL;
-
-	return 0;
-}
-
 static int get_raz_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 		       u64 *val)
 {
@@ -1642,50 +1337,6 @@ static unsigned int elx2_visibility(const struct kvm_vcpu *vcpu,
 	.visibility = elx2_visibility,		\
 }
 
-/* sys_reg_desc initialiser for known cpufeature ID registers */
-#define ID_SANITISED(name) {			\
-	SYS_DESC(SYS_##name),			\
-	.access	= access_id_reg,		\
-	.get_user = get_id_reg,			\
-	.set_user = set_id_reg,			\
-	.visibility = id_visibility,		\
-}
-
-/* sys_reg_desc initialiser for known cpufeature ID registers */
-#define AA32_ID_SANITISED(name) {		\
-	SYS_DESC(SYS_##name),			\
-	.access	= access_id_reg,		\
-	.get_user = get_id_reg,			\
-	.set_user = set_id_reg,			\
-	.visibility = aa32_id_visibility,	\
-}
-
-/*
- * sys_reg_desc initialiser for architecturally unallocated cpufeature ID
- * register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2
- * (1 <= crm < 8, 0 <= Op2 < 8).
- */
-#define ID_UNALLOCATED(crm, op2) {			\
-	Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2),	\
-	.access = access_id_reg,			\
-	.get_user = get_id_reg,				\
-	.set_user = set_id_reg,				\
-	.visibility = raz_visibility			\
-}
-
-/*
- * sys_reg_desc initialiser for known ID registers that we hide from guests.
- * For now, these are exposed just like unallocated ID regs: they appear
- * RAZ for the guest.
- */
-#define ID_HIDDEN(name) {			\
-	SYS_DESC(SYS_##name),			\
-	.access = access_id_reg,		\
-	.get_user = get_id_reg,			\
-	.set_user = set_id_reg,			\
-	.visibility = raz_visibility,		\
-}
-
 static bool access_sp_el1(struct kvm_vcpu *vcpu,
 			  struct sys_reg_params *p,
 			  const struct sys_reg_desc *r)
@@ -1776,87 +1427,6 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 
 	{ SYS_DESC(SYS_MPIDR_EL1), NULL, reset_mpidr, MPIDR_EL1 },
 
-	/*
-	 * ID regs: all ID_SANITISED() entries here must have corresponding
-	 * entries in arm64_ftr_regs[].
-	 */
-
-	/* AArch64 mappings of the AArch32 ID registers */
-	/* CRm=1 */
-	AA32_ID_SANITISED(ID_PFR0_EL1),
-	AA32_ID_SANITISED(ID_PFR1_EL1),
-	{ SYS_DESC(SYS_ID_DFR0_EL1), .access = access_id_reg,
-	  .get_user = get_id_reg, .set_user = set_id_dfr0_el1,
-	  .visibility = aa32_id_visibility, },
-	ID_HIDDEN(ID_AFR0_EL1),
-	AA32_ID_SANITISED(ID_MMFR0_EL1),
-	AA32_ID_SANITISED(ID_MMFR1_EL1),
-	AA32_ID_SANITISED(ID_MMFR2_EL1),
-	AA32_ID_SANITISED(ID_MMFR3_EL1),
-
-	/* CRm=2 */
-	AA32_ID_SANITISED(ID_ISAR0_EL1),
-	AA32_ID_SANITISED(ID_ISAR1_EL1),
-	AA32_ID_SANITISED(ID_ISAR2_EL1),
-	AA32_ID_SANITISED(ID_ISAR3_EL1),
-	AA32_ID_SANITISED(ID_ISAR4_EL1),
-	AA32_ID_SANITISED(ID_ISAR5_EL1),
-	AA32_ID_SANITISED(ID_MMFR4_EL1),
-	AA32_ID_SANITISED(ID_ISAR6_EL1),
-
-	/* CRm=3 */
-	AA32_ID_SANITISED(MVFR0_EL1),
-	AA32_ID_SANITISED(MVFR1_EL1),
-	AA32_ID_SANITISED(MVFR2_EL1),
-	ID_UNALLOCATED(3,3),
-	AA32_ID_SANITISED(ID_PFR2_EL1),
-	ID_HIDDEN(ID_DFR1_EL1),
-	AA32_ID_SANITISED(ID_MMFR5_EL1),
-	ID_UNALLOCATED(3,7),
-
-	/* AArch64 ID registers */
-	/* CRm=4 */
-	{ SYS_DESC(SYS_ID_AA64PFR0_EL1), .access = access_id_reg,
-	  .get_user = get_id_reg, .set_user = set_id_aa64pfr0_el1, },
-	ID_SANITISED(ID_AA64PFR1_EL1),
-	ID_UNALLOCATED(4,2),
-	ID_UNALLOCATED(4,3),
-	ID_SANITISED(ID_AA64ZFR0_EL1),
-	ID_HIDDEN(ID_AA64SMFR0_EL1),
-	ID_UNALLOCATED(4,6),
-	ID_UNALLOCATED(4,7),
-
-	/* CRm=5 */
-	{ SYS_DESC(SYS_ID_AA64DFR0_EL1), .access = access_id_reg,
-	  .get_user = get_id_reg, .set_user = set_id_aa64dfr0_el1, },
-	ID_SANITISED(ID_AA64DFR1_EL1),
-	ID_UNALLOCATED(5,2),
-	ID_UNALLOCATED(5,3),
-	ID_HIDDEN(ID_AA64AFR0_EL1),
-	ID_HIDDEN(ID_AA64AFR1_EL1),
-	ID_UNALLOCATED(5,6),
-	ID_UNALLOCATED(5,7),
-
-	/* CRm=6 */
-	ID_SANITISED(ID_AA64ISAR0_EL1),
-	ID_SANITISED(ID_AA64ISAR1_EL1),
-	ID_SANITISED(ID_AA64ISAR2_EL1),
-	ID_UNALLOCATED(6,3),
-	ID_UNALLOCATED(6,4),
-	ID_UNALLOCATED(6,5),
-	ID_UNALLOCATED(6,6),
-	ID_UNALLOCATED(6,7),
-
-	/* CRm=7 */
-	ID_SANITISED(ID_AA64MMFR0_EL1),
-	ID_SANITISED(ID_AA64MMFR1_EL1),
-	ID_SANITISED(ID_AA64MMFR2_EL1),
-	ID_UNALLOCATED(7,3),
-	ID_UNALLOCATED(7,4),
-	ID_UNALLOCATED(7,5),
-	ID_UNALLOCATED(7,6),
-	ID_UNALLOCATED(7,7),
-
 	{ SYS_DESC(SYS_SCTLR_EL1), access_vm_reg, reset_val, SCTLR_EL1, 0x00C50078 },
 	{ SYS_DESC(SYS_ACTLR_EL1), access_actlr, reset_actlr, ACTLR_EL1 },
 	{ SYS_DESC(SYS_CPACR_EL1), NULL, reset_val, CPACR_EL1, 0 },
@@ -2531,8 +2101,8 @@ static const struct sys_reg_desc cp15_64_regs[] = {
 	{ SYS_DESC(SYS_AARCH32_CNTP_CVAL),    access_arch_timer },
 };
 
-static bool check_sysreg_table(const struct sys_reg_desc *table, unsigned int n,
-			       bool is_32)
+bool check_sysreg_table(const struct sys_reg_desc *table, unsigned int n,
+			bool is_32)
 {
 	unsigned int i;
 
@@ -2557,7 +2127,7 @@ int kvm_handle_cp14_load_store(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
-static void perform_access(struct kvm_vcpu *vcpu,
+void perform_access(struct kvm_vcpu *vcpu,
 			   struct sys_reg_params *params,
 			   const struct sys_reg_desc *r)
 {
@@ -2912,6 +2482,8 @@ void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
 {
 	unsigned long i;
 
+	kvm_arm_reset_id_regs(vcpu);
+
 	for (i = 0; i < ARRAY_SIZE(sys_reg_descs); i++)
 		if (sys_reg_descs[i].reset)
 			sys_reg_descs[i].reset(vcpu, &sys_reg_descs[i]);
@@ -2932,6 +2504,9 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu)
 	params = esr_sys64_to_params(esr);
 	params.regval = vcpu_get_reg(vcpu, Rt);
 
+	if (is_id_reg(reg_to_encoding(&params)))
+		return emulate_id_reg(vcpu, &params);
+
 	if (!emulate_sys_reg(vcpu, &params))
 		return 1;
 
@@ -3160,6 +2735,10 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
 	if (err != -ENOENT)
 		return err;
 
+	err = kvm_arm_get_id_reg(vcpu, reg);
+	if (err != -ENOENT)
+		return err;
+
 	return kvm_sys_reg_get_user(vcpu, reg,
 				    sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
 }
@@ -3204,6 +2783,10 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
 	if (err != -ENOENT)
 		return err;
 
+	err = kvm_arm_set_id_reg(vcpu, reg);
+	if (err != -ENOENT)
+		return err;
+
 	return kvm_sys_reg_set_user(vcpu, reg,
 				    sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
 }
@@ -3250,10 +2833,10 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
 	return true;
 }
 
-static int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
-			    const struct sys_reg_desc *rd,
-			    u64 __user **uind,
-			    unsigned int *total)
+int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
+		     const struct sys_reg_desc *rd,
+		     u64 __user **uind,
+		     unsigned int *total)
 {
 	/*
 	 * Ignore registers we trap but don't save,
@@ -3294,6 +2877,7 @@ unsigned long kvm_arm_num_sys_reg_descs(struct kvm_vcpu *vcpu)
 {
 	return ARRAY_SIZE(invariant_sys_regs)
 		+ num_demux_regs()
+		+ kvm_arm_walk_id_regs(vcpu, (u64 __user *)NULL)
 		+ walk_sys_regs(vcpu, (u64 __user *)NULL);
 }
 
@@ -3309,6 +2893,11 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 		uindices++;
 	}
 
+	err = kvm_arm_walk_id_regs(vcpu, uindices);
+	if (err < 0)
+		return err;
+	uindices += err;
+
 	err = walk_sys_regs(vcpu, uindices);
 	if (err < 0)
 		return err;
@@ -3323,6 +2912,7 @@ int __init kvm_sys_reg_table_init(void)
 	unsigned int i;
 
 	/* Make sure tables are unique and in order. */
+	valid &= kvm_arm_check_idreg_table();
 	valid &= check_sysreg_table(sys_reg_descs, ARRAY_SIZE(sys_reg_descs), false);
 	valid &= check_sysreg_table(cp14_regs, ARRAY_SIZE(cp14_regs), true);
 	valid &= check_sysreg_table(cp14_64_regs, ARRAY_SIZE(cp14_64_regs), true);
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 6b11f2cc7146..ad41305348f7 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -210,6 +210,35 @@ find_reg(const struct sys_reg_params *params, const struct sys_reg_desc table[],
 	return __inline_bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
 }
 
+static inline unsigned int raz_visibility(const struct kvm_vcpu *vcpu,
+					  const struct sys_reg_desc *r)
+{
+	return REG_RAZ;
+}
+
+static inline bool write_to_read_only(struct kvm_vcpu *vcpu,
+				      struct sys_reg_params *params,
+				      const struct sys_reg_desc *r)
+{
+	WARN_ONCE(1, "Unexpected sys_reg write to read-only register\n");
+	print_sys_reg_instr(params);
+	kvm_inject_undefined(vcpu);
+	return false;
+}
+
+/*
+ * Return true if the register's (Op0, Op1, CRn, CRm, Op2) is
+ * (3, 0, 0, crm, op2), where 1<=crm<8, 0<=op2<8.
+ */
+static inline bool is_id_reg(u32 id)
+{
+	return (sys_reg_Op0(id) == 3 && sys_reg_Op1(id) == 0 &&
+		sys_reg_CRn(id) == 0 && sys_reg_CRm(id) >= 1 &&
+		sys_reg_CRm(id) < 8);
+}
+
+void perform_access(struct kvm_vcpu *vcpu, struct sys_reg_params *params,
+		    const struct sys_reg_desc *r);
 const struct sys_reg_desc *get_reg_by_id(u64 id,
 					 const struct sys_reg_desc table[],
 					 unsigned int num);
@@ -220,6 +249,18 @@ int kvm_sys_reg_get_user(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg,
 			 const struct sys_reg_desc table[], unsigned int num);
 int kvm_sys_reg_set_user(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg,
 			 const struct sys_reg_desc table[], unsigned int num);
+bool check_sysreg_table(const struct sys_reg_desc *table, unsigned int n,
+			bool is_32);
+int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
+		     const struct sys_reg_desc *rd,
+		     u64 __user **uind,
+		     unsigned int *total);
+int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params);
+void kvm_arm_reset_id_regs(struct kvm_vcpu *vcpu);
+int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
+bool kvm_arm_check_idreg_table(void);
+int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind);
 
 #define AA32(_x)	.aarch32_map = AA32_##_x
 #define Op0(_x) 	.Op0 = _x
-- 
2.40.0.rc1.284.g88254d51c5-goog



* [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest
  2023-03-17  5:06 [PATCH v4 0/6] Support writable CPU ID registers from userspace Jing Zhang
  2023-03-17  5:06 ` [PATCH v4 1/6] KVM: arm64: Move CPU ID feature registers emulation into a separate file Jing Zhang
@ 2023-03-17  5:06 ` Jing Zhang
  2023-03-27 10:15   ` Marc Zyngier
  2023-03-17  5:06 ` [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3] Jing Zhang
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Jing Zhang @ 2023-03-17  5:06 UTC
  To: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton
  Cc: Will Deacon, Paolo Bonzini, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Fuad Tabba, Reiji Watanabe, Ricardo Koller,
	Raghavendra Rao Ananta, Jing Zhang

From: Reiji Watanabe <reijiw@google.com>

Introduce id_regs[] in kvm_arch as storage for the guest's ID registers,
and save the ID registers' sanitized values in the array at KVM_CREATE_VM
time. Use the saved values when ID registers are read by the guest or by
userspace (via KVM_GET_ONE_REG).

No functional change intended.

Signed-off-by: Reiji Watanabe <reijiw@google.com>
Co-developed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
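Note (for illustration only, not part of the patch): IDREG_IDX() packs the
(CRm, Op2) part of the encoding into a dense array index. Worked example
for ID_AA64PFR0_EL1, which is (Op0=3, Op1=0, CRn=0, CRm=4, Op2=0):

	/* IDREG_IDX(id) = ((sys_reg_CRm(id) - 1) << 3) | sys_reg_Op2(id) */
	idx = ((4 - 1) << 3) | 0;	/* = 24 */

	/*
	 * 7 CRm values (1..7) * 8 Op2 values (0..7) = 56 slots,
	 * hence KVM_ARM_ID_REG_NUM = 56.
	 */
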
 arch/arm64/include/asm/kvm_host.h | 11 ++++++++
 arch/arm64/kvm/arm.c              |  1 +
 arch/arm64/kvm/id_regs.c          | 44 ++++++++++++++++++++++++-------
 arch/arm64/kvm/sys_regs.c         |  2 +-
 arch/arm64/kvm/sys_regs.h         |  1 +
 5 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a1892a8f6032..fb6b50b1f111 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -245,6 +245,15 @@ struct kvm_arch {
 	 * the associated pKVM instance in the hypervisor.
 	 */
 	struct kvm_protected_vm pkvm;
+
+	/*
+	 * Save ID registers for the guest in id_regs[].
+	 * (Op0, Op1, CRn, CRm, Op2) of the ID registers to be saved in it
+	 * is (3, 0, 0, crm, op2), where 1<=crm<8, 0<=op2<8.
+	 */
+#define KVM_ARM_ID_REG_NUM	56
+#define IDREG_IDX(id)		(((sys_reg_CRm(id) - 1) << 3) | sys_reg_Op2(id))
+	u64 id_regs[KVM_ARM_ID_REG_NUM];
 };
 
 struct kvm_vcpu_fault_info {
@@ -1005,6 +1014,8 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
 				struct kvm_arm_copy_mte_tags *copy_tags);
 
+void kvm_arm_set_default_id_regs(struct kvm *kvm);
+
 /* Guest/host FPSIMD coordination helpers */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3bd732eaf087..4579c878ab30 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -153,6 +153,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 	set_default_spectre(kvm);
 	kvm_arm_init_hypercalls(kvm);
+	kvm_arm_set_default_id_regs(kvm);
 
 	/*
 	 * Initialise the default PMUver before there is a chance to
diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
index 08b738852955..e393b5730557 100644
--- a/arch/arm64/kvm/id_regs.c
+++ b/arch/arm64/kvm/id_regs.c
@@ -52,16 +52,9 @@ static u8 pmuver_to_perfmon(u8 pmuver)
 	}
 }
 
-/* Read a sanitised cpufeature ID register by sys_reg_desc */
-static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
+u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
 {
-	u32 id = reg_to_encoding(r);
-	u64 val;
-
-	if (sysreg_visible_as_raz(vcpu, r))
-		return 0;
-
-	val = read_sanitised_ftr_reg(id);
+	u64 val = vcpu->kvm->arch.id_regs[IDREG_IDX(id)];
 
 	switch (id) {
 	case SYS_ID_AA64PFR0_EL1:
@@ -126,6 +119,14 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r
 	return val;
 }
 
+static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
+{
+	if (sysreg_visible_as_raz(vcpu, r))
+		return 0;
+
+	return kvm_arm_read_id_reg(vcpu, reg_to_encoding(r));
+}
+
 /* cpufeature ID register access trap handlers */
 
 static bool access_id_reg(struct kvm_vcpu *vcpu,
@@ -504,3 +505,28 @@ int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
 	}
 	return total;
 }
+
+/*
+ * Set the guest's ID registers that are defined in id_reg_descs[]
+ * with ID_SANITISED() to the host's sanitized value.
+ */
+void kvm_arm_set_default_id_regs(struct kvm *kvm)
+{
+	int i;
+	u32 id;
+	u64 val;
+
+	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
+		id = reg_to_encoding(&id_reg_descs[i]);
+		if (WARN_ON_ONCE(!is_id_reg(id)))
+			/* Shouldn't happen */
+			continue;
+
+		if (id_reg_descs[i].visibility == raz_visibility)
+			/* Hidden or reserved ID register */
+			continue;
+
+		val = read_sanitised_ftr_reg(id);
+		kvm->arch.id_regs[IDREG_IDX(id)] = val;
+	}
+}
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 22b60474fcab..3243c924527e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -354,7 +354,7 @@ static bool trap_loregion(struct kvm_vcpu *vcpu,
 			  struct sys_reg_params *p,
 			  const struct sys_reg_desc *r)
 {
-	u64 val = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
+	u64 val = kvm_arm_read_id_reg(vcpu, SYS_ID_AA64MMFR1_EL1);
 	u32 sr = reg_to_encoding(r);
 
 	if (!(val & (0xfUL << ID_AA64MMFR1_EL1_LO_SHIFT))) {
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index ad41305348f7..ee136ba28fa5 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -261,6 +261,7 @@ int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
 int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
 bool kvm_arm_check_idreg_table(void);
 int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind);
+u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id);
 
 #define AA32(_x)	.aarch32_map = AA32_##_x
 #define Op0(_x) 	.Op0 = _x
-- 
2.40.0.rc1.284.g88254d51c5-goog



* [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3]
  2023-03-17  5:06 [PATCH v4 0/6] Support writable CPU ID registers from userspace Jing Zhang
  2023-03-17  5:06 ` [PATCH v4 1/6] KVM: arm64: Move CPU ID feature registers emulation into a separate file Jing Zhang
  2023-03-17  5:06 ` [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest Jing Zhang
@ 2023-03-17  5:06 ` Jing Zhang
  2023-03-27 10:31   ` Marc Zyngier
  2023-03-28 12:39   ` Fuad Tabba
  2023-03-17  5:06 ` [PATCH v4 4/6] KVM: arm64: Use per guest ID register for ID_AA64DFR0_EL1.PMUVer Jing Zhang
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-17  5:06 UTC
  To: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton
  Cc: Will Deacon, Paolo Bonzini, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Fuad Tabba, Reiji Watanabe, Ricardo Koller,
	Raghavendra Rao Ananta, Jing Zhang

With per-guest ID registers, the ID_AA64PFR0_EL1.[CSV2|CSV3] settings from
userspace can be stored in the corresponding ID register itself.

No functional change intended.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
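Note (for illustration only, not part of the patch): with the CSV2/CSV3
state folded into the stored register value, updating a field follows the
usual mask-and-prep pattern, e.g. for CSV2:

	u64 val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)];

	/* Advertise CSV2 == 1 when the host is unaffected */
	val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);

	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
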
 arch/arm64/include/asm/kvm_host.h  |  2 --
 arch/arm64/kvm/arm.c               | 19 +------------------
 arch/arm64/kvm/hyp/nvhe/sys_regs.c |  7 +++----
 arch/arm64/kvm/id_regs.c           | 30 ++++++++++++++++++++++--------
 4 files changed, 26 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index fb6b50b1f111..e926ea91a73c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -230,8 +230,6 @@ struct kvm_arch {
 
 	cpumask_var_t supported_cpus;
 
-	u8 pfr0_csv2;
-	u8 pfr0_csv3;
 	struct {
 		u8 imp:4;
 		u8 unimp:4;
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 4579c878ab30..c78d68d011cb 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -104,22 +104,6 @@ static int kvm_arm_default_max_vcpus(void)
 	return vgic_present ? kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
 }
 
-static void set_default_spectre(struct kvm *kvm)
-{
-	/*
-	 * The default is to expose CSV2 == 1 if the HW isn't affected.
-	 * Although this is a per-CPU feature, we make it global because
-	 * asymmetric systems are just a nuisance.
-	 *
-	 * Userspace can override this as long as it doesn't promise
-	 * the impossible.
-	 */
-	if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED)
-		kvm->arch.pfr0_csv2 = 1;
-	if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED)
-		kvm->arch.pfr0_csv3 = 1;
-}
-
 /**
  * kvm_arch_init_vm - initializes a VM data structure
  * @kvm:	pointer to the KVM struct
@@ -151,9 +135,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	/* The maximum number of VCPUs is limited by the host's GIC model */
 	kvm->max_vcpus = kvm_arm_default_max_vcpus();
 
-	set_default_spectre(kvm);
-	kvm_arm_init_hypercalls(kvm);
 	kvm_arm_set_default_id_regs(kvm);
+	kvm_arm_init_hypercalls(kvm);
 
 	/*
 	 * Initialise the default PMUver before there is a chance to
diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
index 08d2b004f4b7..0e1988740a65 100644
--- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
+++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
@@ -93,10 +93,9 @@ static u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
 		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
 
 	/* Spectre and Meltdown mitigation in KVM */
-	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
-			       (u64)kvm->arch.pfr0_csv2);
-	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
-			       (u64)kvm->arch.pfr0_csv3);
+	set_mask |= vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] &
+		(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
+			ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
 
 	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
 }
diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
index e393b5730557..b60ca1058301 100644
--- a/arch/arm64/kvm/id_regs.c
+++ b/arch/arm64/kvm/id_regs.c
@@ -61,12 +61,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
 		if (!vcpu_has_sve(vcpu))
 			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
 		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
-				  (u64)vcpu->kvm->arch.pfr0_csv2);
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
-				  (u64)vcpu->kvm->arch.pfr0_csv3);
 		if (kvm_vgic_global_state.type == VGIC_V3) {
 			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
 			val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
@@ -201,6 +195,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
 			       u64 val)
 {
 	u8 csv2, csv3;
+	u64 sval = val;
 
 	/*
 	 * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
@@ -225,8 +220,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
 	if (val)
 		return -EINVAL;
 
-	vcpu->kvm->arch.pfr0_csv2 = csv2;
-	vcpu->kvm->arch.pfr0_csv3 = csv3;
+	vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;
 
 	return 0;
 }
@@ -529,4 +523,24 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
 		val = read_sanitised_ftr_reg(id);
 		kvm->arch.id_regs[IDREG_IDX(id)] = val;
 	}
+	/*
+	 * The default is to expose CSV2 == 1 if the HW isn't affected.
+	 * Although this is a per-CPU feature, we make it global because
+	 * asymmetric systems are just a nuisance.
+	 *
+	 * Userspace can override this as long as it doesn't promise
+	 * the impossible.
+	 */
+	val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)];
+
+	if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
+		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
+	}
+	if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
+		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
+	}
+
+	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
 }
-- 
2.40.0.rc1.284.g88254d51c5-goog



* [PATCH v4 4/6] KVM: arm64: Use per guest ID register for ID_AA64DFR0_EL1.PMUVer
  2023-03-17  5:06 [PATCH v4 0/6] Support writable CPU ID registers from userspace Jing Zhang
                   ` (2 preceding siblings ...)
  2023-03-17  5:06 ` [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3] Jing Zhang
@ 2023-03-17  5:06 ` Jing Zhang
  2023-03-27 10:40   ` Marc Zyngier
  2023-03-17  5:06 ` [PATCH v4 5/6] KVM: arm64: Introduce ID register specific descriptor Jing Zhang
  2023-03-17  5:06 ` [PATCH v4 6/6] KVM: arm64: Refactor writings for PMUVer/CSV2/CSV3 Jing Zhang
  5 siblings, 1 reply; 25+ messages in thread
From: Jing Zhang @ 2023-03-17  5:06 UTC (permalink / raw)
  To: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton
  Cc: Will Deacon, Paolo Bonzini, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Fuad Tabba, Reiji Watanabe, Ricardo Koller,
	Raghavendra Rao Ananta, Jing Zhang

With per-guest ID registers, the PMUver settings from userspace
can be stored in the corresponding ID register itself.

No functional change intended.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
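Note (for illustration only, not part of the patch): since ID_DFR0_EL1.PerfMon
is the AArch32 view of ID_AA64DFR0_EL1.PMUVer, both stored fields are
rewritten together under kvm->lock so a reader never sees them disagree.
The pattern used by the setters below, condensed:

	mutex_lock(&kvm->lock);

	/* AArch64 view */
	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
		~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
		FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), pmuver);

	/* Matching AArch32 view */
	kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
		~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
	kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |=
		FIELD_PREP(ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon),
			   pmuver_to_perfmon(pmuver));

	mutex_unlock(&kvm->lock);
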
 arch/arm64/include/asm/kvm_host.h | 11 +++---
 arch/arm64/kvm/arm.c              |  6 ---
 arch/arm64/kvm/id_regs.c          | 61 +++++++++++++++++++++++++------
 include/kvm/arm_pmu.h             |  5 ++-
 4 files changed, 59 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e926ea91a73c..102860ba896d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -218,6 +218,12 @@ struct kvm_arch {
 #define KVM_ARCH_FLAG_EL1_32BIT				4
 	/* PSCI SYSTEM_SUSPEND enabled for the guest */
 #define KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED		5
+	/*
+	 * AA64DFR0_EL1.PMUver was set as ID_AA64DFR0_EL1_PMUVer_IMP_DEF
+	 * or DFR0_EL1.PerfMon was set as ID_DFR0_EL1_PerfMon_IMPDEF from
+	 * userspace for VCPUs without PMU.
+	 */
+#define KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU		6
 
 	unsigned long flags;
 
@@ -230,11 +236,6 @@ struct kvm_arch {
 
 	cpumask_var_t supported_cpus;
 
-	struct {
-		u8 imp:4;
-		u8 unimp:4;
-	} dfr0_pmuver;
-
 	/* Hypercall features firmware registers' descriptor */
 	struct kvm_smccc_features smccc_feat;
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index c78d68d011cb..fb2de2cb98cb 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -138,12 +138,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	kvm_arm_set_default_id_regs(kvm);
 	kvm_arm_init_hypercalls(kvm);
 
-	/*
-	 * Initialise the default PMUver before there is a chance to
-	 * create an actual PMU.
-	 */
-	kvm->arch.dfr0_pmuver.imp = kvm_arm_pmu_get_pmuver_limit();
-
 	return 0;
 
 err_free_cpumask:
diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
index b60ca1058301..3a87a3d2390d 100644
--- a/arch/arm64/kvm/id_regs.c
+++ b/arch/arm64/kvm/id_regs.c
@@ -21,9 +21,12 @@
 static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
 {
 	if (kvm_vcpu_has_pmu(vcpu))
-		return vcpu->kvm->arch.dfr0_pmuver.imp;
-
-	return vcpu->kvm->arch.dfr0_pmuver.unimp;
+		return FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
+				vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)]);
+	else if (test_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags))
+		return ID_AA64DFR0_EL1_PMUVer_IMP_DEF;
+	else
+		return 0;
 }
 
 static u8 perfmon_to_pmuver(u8 perfmon)
@@ -256,10 +259,23 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
 	if (val)
 		return -EINVAL;
 
-	if (valid_pmu)
-		vcpu->kvm->arch.dfr0_pmuver.imp = pmuver;
-	else
-		vcpu->kvm->arch.dfr0_pmuver.unimp = pmuver;
+	if (valid_pmu) {
+		mutex_lock(&vcpu->kvm->lock);
+		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
+			~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
+		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
+			FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), pmuver);
+
+		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
+			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
+		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
+				ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), pmuver_to_perfmon(pmuver));
+		mutex_unlock(&vcpu->kvm->lock);
+	} else if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) {
+		set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
+	} else {
+		clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
+	}
 
 	return 0;
 }
@@ -296,10 +312,23 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
 	if (val)
 		return -EINVAL;
 
-	if (valid_pmu)
-		vcpu->kvm->arch.dfr0_pmuver.imp = perfmon_to_pmuver(perfmon);
-	else
-		vcpu->kvm->arch.dfr0_pmuver.unimp = perfmon_to_pmuver(perfmon);
+	if (valid_pmu) {
+		mutex_lock(&vcpu->kvm->lock);
+		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
+			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
+		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
+			ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), perfmon);
+
+		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
+			~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
+		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |= FIELD_PREP(
+			ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), perfmon_to_pmuver(perfmon));
+		mutex_unlock(&vcpu->kvm->lock);
+	} else if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF) {
+		set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
+	} else {
+		clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
+	}
 
 	return 0;
 }
@@ -543,4 +572,14 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
 	}
 
 	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
+
+	/*
+	 * Initialise the default PMUver before there is a chance to
+	 * create an actual PMU.
+	 */
+	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
+		~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
+	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
+		FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
+			   kvm_arm_pmu_get_pmuver_limit());
 }
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 628775334d5e..51c7f3e7bdde 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -92,8 +92,9 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
 /*
  * Evaluates as true when emulating PMUv3p5, and false otherwise.
  */
-#define kvm_pmu_is_3p5(vcpu)						\
-	(vcpu->kvm->arch.dfr0_pmuver.imp >= ID_AA64DFR0_EL1_PMUVer_V3P5)
+#define kvm_pmu_is_3p5(vcpu)									\
+	 (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),					\
+		 vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)]) >= ID_AA64DFR0_EL1_PMUVer_V3P5)
 
 u8 kvm_arm_pmu_get_pmuver_limit(void);
 
-- 
2.40.0.rc1.284.g88254d51c5-goog


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 5/6] KVM: arm64: Introduce ID register specific descriptor
  2023-03-17  5:06 [PATCH v4 0/6] Support writable CPU ID registers from userspace Jing Zhang
                   ` (3 preceding siblings ...)
  2023-03-17  5:06 ` [PATCH v4 4/6] KVM: arm64: Use per guest ID register for ID_AA64DFR0_EL1.PMUVer Jing Zhang
@ 2023-03-17  5:06 ` Jing Zhang
  2023-03-27 11:28   ` Marc Zyngier
  2023-03-17  5:06 ` [PATCH v4 6/6] KVM: arm64: Refactor writings for PMUVer/CSV2/CSV3 Jing Zhang
  5 siblings, 1 reply; 25+ messages in thread
From: Jing Zhang @ 2023-03-17  5:06 UTC (permalink / raw)
  To: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton
  Cc: Will Deacon, Paolo Bonzini, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Fuad Tabba, Reiji Watanabe, Ricardo Koller,
	Raghavendra Rao Ananta, Jing Zhang

Introduce an ID-feature-register-specific descriptor that carries the
ID-register-specific fields and callbacks alongside the corresponding
general system register descriptor.
New fields will be added to the ID register descriptor later, as they
become necessary to support writable ID registers.

No functional change intended.

Co-developed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 arch/arm64/kvm/id_regs.c  | 187 +++++++++++++++++++++++++++-----------
 arch/arm64/kvm/sys_regs.c |   2 +-
 arch/arm64/kvm/sys_regs.h |   1 +
 3 files changed, 138 insertions(+), 52 deletions(-)

diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
index 3a87a3d2390d..9956c99d20f7 100644
--- a/arch/arm64/kvm/id_regs.c
+++ b/arch/arm64/kvm/id_regs.c
@@ -18,6 +18,10 @@
 
 #include "sys_regs.h"
 
+struct id_reg_desc {
+	const struct sys_reg_desc	reg_desc;
+};
+
 static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
 {
 	if (kvm_vcpu_has_pmu(vcpu))
@@ -334,21 +338,25 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
 }
 
 /* sys_reg_desc initialiser for known cpufeature ID registers */
-#define ID_SANITISED(name) {			\
-	SYS_DESC(SYS_##name),			\
-	.access	= access_id_reg,		\
-	.get_user = get_id_reg,			\
-	.set_user = set_id_reg,			\
-	.visibility = id_visibility,		\
+#define ID_SANITISED(name) {				\
+	.reg_desc = {					\
+		SYS_DESC(SYS_##name),			\
+		.access	= access_id_reg,		\
+		.get_user = get_id_reg,			\
+		.set_user = set_id_reg,			\
+		.visibility = id_visibility,		\
+	},						\
 }
 
 /* sys_reg_desc initialiser for known cpufeature ID registers */
-#define AA32_ID_SANITISED(name) {		\
-	SYS_DESC(SYS_##name),			\
-	.access	= access_id_reg,		\
-	.get_user = get_id_reg,			\
-	.set_user = set_id_reg,			\
-	.visibility = aa32_id_visibility,	\
+#define AA32_ID_SANITISED(name) {			\
+	.reg_desc = {					\
+		SYS_DESC(SYS_##name),			\
+		.access	= access_id_reg,		\
+		.get_user = get_id_reg,			\
+		.set_user = set_id_reg,			\
+		.visibility = aa32_id_visibility,	\
+	},						\
 }
 
 /*
@@ -356,12 +364,14 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
  * register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2
  * (1 <= crm < 8, 0 <= Op2 < 8).
  */
-#define ID_UNALLOCATED(crm, op2) {			\
-	Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2),	\
-	.access = access_id_reg,			\
-	.get_user = get_id_reg,				\
-	.set_user = set_id_reg,				\
-	.visibility = raz_visibility			\
+#define ID_UNALLOCATED(crm, op2) {				\
+	.reg_desc = {						\
+		Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2),	\
+		.access = access_id_reg,			\
+		.get_user = get_id_reg,				\
+		.set_user = set_id_reg,				\
+		.visibility = raz_visibility			\
+	},							\
 }
 
 /*
@@ -369,15 +379,17 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
  * For now, these are exposed just like unallocated ID regs: they appear
  * RAZ for the guest.
  */
-#define ID_HIDDEN(name) {			\
-	SYS_DESC(SYS_##name),			\
-	.access = access_id_reg,		\
-	.get_user = get_id_reg,			\
-	.set_user = set_id_reg,			\
-	.visibility = raz_visibility,		\
+#define ID_HIDDEN(name) {				\
+	.reg_desc = {					\
+		SYS_DESC(SYS_##name),			\
+		.access = access_id_reg,		\
+		.get_user = get_id_reg,			\
+		.set_user = set_id_reg,			\
+		.visibility = raz_visibility,		\
+	},						\
 }
 
-static const struct sys_reg_desc id_reg_descs[] = {
+static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
 	/*
 	 * ID regs: all ID_SANITISED() entries here must have corresponding
 	 * entries in arm64_ftr_regs[].
@@ -387,9 +399,13 @@ static const struct sys_reg_desc id_reg_descs[] = {
 	/* CRm=1 */
 	AA32_ID_SANITISED(ID_PFR0_EL1),
 	AA32_ID_SANITISED(ID_PFR1_EL1),
-	{ SYS_DESC(SYS_ID_DFR0_EL1), .access = access_id_reg,
-	  .get_user = get_id_reg, .set_user = set_id_dfr0_el1,
-	  .visibility = aa32_id_visibility, },
+	{ .reg_desc = {
+		SYS_DESC(SYS_ID_DFR0_EL1),
+		.access = access_id_reg,
+		.get_user = get_id_reg,
+		.set_user = set_id_dfr0_el1,
+		.visibility = aa32_id_visibility, },
+	},
 	ID_HIDDEN(ID_AFR0_EL1),
 	AA32_ID_SANITISED(ID_MMFR0_EL1),
 	AA32_ID_SANITISED(ID_MMFR1_EL1),
@@ -418,8 +434,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
 
 	/* AArch64 ID registers */
 	/* CRm=4 */
-	{ SYS_DESC(SYS_ID_AA64PFR0_EL1), .access = access_id_reg,
-	  .get_user = get_id_reg, .set_user = set_id_aa64pfr0_el1, },
+	{ .reg_desc = {
+		SYS_DESC(SYS_ID_AA64PFR0_EL1),
+		.access = access_id_reg,
+		.get_user = get_id_reg,
+		.set_user = set_id_aa64pfr0_el1, },
+	},
 	ID_SANITISED(ID_AA64PFR1_EL1),
 	ID_UNALLOCATED(4, 2),
 	ID_UNALLOCATED(4, 3),
@@ -429,8 +449,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
 	ID_UNALLOCATED(4, 7),
 
 	/* CRm=5 */
-	{ SYS_DESC(SYS_ID_AA64DFR0_EL1), .access = access_id_reg,
-	  .get_user = get_id_reg, .set_user = set_id_aa64dfr0_el1, },
+	{ .reg_desc = {
+		SYS_DESC(SYS_ID_AA64DFR0_EL1),
+		.access = access_id_reg,
+		.get_user = get_id_reg,
+		.set_user = set_id_aa64dfr0_el1, },
+	},
 	ID_SANITISED(ID_AA64DFR1_EL1),
 	ID_UNALLOCATED(5, 2),
 	ID_UNALLOCATED(5, 3),
@@ -469,12 +493,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
  */
 int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params)
 {
-	const struct sys_reg_desc *r;
+	u32 id;
 
-	r = find_reg(params, id_reg_descs, ARRAY_SIZE(id_reg_descs));
+	id = reg_to_encoding(params);
 
-	if (likely(r)) {
-		perform_access(vcpu, params, r);
+	if (likely(is_id_reg(id))) {
+		perform_access(vcpu, params, &id_reg_descs[IDREG_IDX(id)].reg_desc);
 	} else {
 		print_sys_reg_msg(params,
 				  "Unsupported guest id_reg access at: %lx [%08lx]\n",
@@ -491,38 +515,102 @@ void kvm_arm_reset_id_regs(struct kvm_vcpu *vcpu)
 	unsigned long i;
 
 	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++)
-		if (id_reg_descs[i].reset)
-			id_reg_descs[i].reset(vcpu, &id_reg_descs[i]);
+		if (id_reg_descs[i].reg_desc.reset)
+			id_reg_descs[i].reg_desc.reset(vcpu, &id_reg_descs[i].reg_desc);
 }
 
 int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 {
-	return kvm_sys_reg_get_user(vcpu, reg,
-				    id_reg_descs, ARRAY_SIZE(id_reg_descs));
+	u64 __user *uaddr = (u64 __user *)(unsigned long)reg->addr;
+	const struct sys_reg_desc *r;
+	struct sys_reg_params params;
+	u64 val;
+	int ret;
+	u32 id;
+
+	if (!index_to_params(reg->id, &params))
+		return -ENOENT;
+	id = reg_to_encoding(&params);
+
+	if (!is_id_reg(id))
+		return -ENOENT;
+
+	r = &id_reg_descs[IDREG_IDX(id)].reg_desc;
+	if (r->get_user) {
+		ret = (r->get_user)(vcpu, r, &val);
+	} else {
+		ret = 0;
+		val = vcpu->kvm->arch.id_regs[IDREG_IDX(id)];
+	}
+
+	if (!ret)
+		ret = put_user(val, uaddr);
+
+	return ret;
 }
 
 int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 {
-	return kvm_sys_reg_set_user(vcpu, reg,
-				    id_reg_descs, ARRAY_SIZE(id_reg_descs));
+	u64 __user *uaddr = (u64 __user *)(unsigned long)reg->addr;
+	const struct sys_reg_desc *r;
+	struct sys_reg_params params;
+	u64 val;
+	int ret;
+	u32 id;
+
+	if (!index_to_params(reg->id, &params))
+		return -ENOENT;
+	id = reg_to_encoding(&params);
+
+	if (!is_id_reg(id))
+		return -ENOENT;
+
+	if (get_user(val, uaddr))
+		return -EFAULT;
+
+	r = &id_reg_descs[IDREG_IDX(id)].reg_desc;
+
+	if (sysreg_user_write_ignore(vcpu, r))
+		return 0;
+
+	if (r->set_user) {
+		ret = (r->set_user)(vcpu, r, val);
+	} else {
+		WARN_ONCE(1, "ID register set_user callback is NULL\n");
+		ret = 0;
+	}
+
+	return ret;
 }
 
 bool kvm_arm_check_idreg_table(void)
 {
-	return check_sysreg_table(id_reg_descs, ARRAY_SIZE(id_reg_descs), false);
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
+		const struct sys_reg_desc *r = &id_reg_descs[i].reg_desc;
+
+		if (!is_id_reg(reg_to_encoding(r))) {
+			kvm_err("id_reg table %pS entry %d not set correctly\n",
+				&id_reg_descs[i].reg_desc, i);
+			return false;
+		}
+	}
+
+	return true;
 }
 
 int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
 {
-	const struct sys_reg_desc *i2, *end2;
+	const struct id_reg_desc *i2, *end2;
 	unsigned int total = 0;
 	int err;
 
 	i2 = id_reg_descs;
 	end2 = id_reg_descs + ARRAY_SIZE(id_reg_descs);
 
-	while (i2 != end2) {
-		err = walk_one_sys_reg(vcpu, i2++, &uind, &total);
+	for (; i2 != end2; i2++) {
+		err = walk_one_sys_reg(vcpu, &(i2->reg_desc), &uind, &total);
 		if (err)
 			return err;
 	}
@@ -540,12 +628,9 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
 	u64 val;
 
 	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
-		id = reg_to_encoding(&id_reg_descs[i]);
-		if (WARN_ON_ONCE(!is_id_reg(id)))
-			/* Shouldn't happen */
-			continue;
+		id = reg_to_encoding(&id_reg_descs[i].reg_desc);
 
-		if (id_reg_descs[i].visibility == raz_visibility)
+		if (id_reg_descs[i].reg_desc.visibility == raz_visibility)
 			/* Hidden or reserved ID register */
 			continue;
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 3243c924527e..608a0378bdae 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2519,7 +2519,7 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu)
  * Userspace API
  *****************************************************************************/
 
-static bool index_to_params(u64 id, struct sys_reg_params *params)
+bool index_to_params(u64 id, struct sys_reg_params *params)
 {
 	switch (id & KVM_REG_SIZE_MASK) {
 	case KVM_REG_SIZE_U64:
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index ee136ba28fa5..8fd0020c67ca 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -239,6 +239,7 @@ static inline bool is_id_reg(u32 id)
 
 void perform_access(struct kvm_vcpu *vcpu, struct sys_reg_params *params,
 		    const struct sys_reg_desc *r);
+bool index_to_params(u64 id, struct sys_reg_params *params);
 const struct sys_reg_desc *get_reg_by_id(u64 id,
 					 const struct sys_reg_desc table[],
 					 unsigned int num);
-- 
2.40.0.rc1.284.g88254d51c5-goog


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v4 6/6] KVM: arm64: Refactor writings for PMUVer/CSV2/CSV3
  2023-03-17  5:06 [PATCH v4 0/6] Support writable CPU ID registers from userspace Jing Zhang
                   ` (4 preceding siblings ...)
  2023-03-17  5:06 ` [PATCH v4 5/6] KVM: arm64: Introduce ID register specific descriptor Jing Zhang
@ 2023-03-17  5:06 ` Jing Zhang
  2023-03-27 13:34   ` Marc Zyngier
  5 siblings, 1 reply; 25+ messages in thread
From: Jing Zhang @ 2023-03-17  5:06 UTC (permalink / raw)
  To: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton
  Cc: Will Deacon, Paolo Bonzini, James Morse, Alexandru Elisei,
	Suzuki K Poulose, Fuad Tabba, Reiji Watanabe, Ricardo Koller,
	Raghavendra Rao Ananta, Jing Zhang

Save the KVM sanitised ID register value in the ID descriptor
(kvm_sys_val).
Add an init callback for every ID register to set up kvm_sys_val.
All per-vCPU sanitisation is still handled on the fly during ID
register reads and writes from userspace.
An arm64_ftr_bits array is used to indicate the writable feature
fields.

Refactor the writes to ID_AA64PFR0_EL1.[CSV2|CSV3],
ID_AA64DFR0_EL1.PMUVer and ID_DFR0_EL1.PerfMon based on the utilities
introduced by the ID register descriptor.

No functional change intended.

Co-developed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 arch/arm64/include/asm/cpufeature.h |  25 +++
 arch/arm64/include/asm/kvm_host.h   |   2 +-
 arch/arm64/kernel/cpufeature.c      |  26 +--
 arch/arm64/kvm/arm.c                |   2 +-
 arch/arm64/kvm/id_regs.c            | 325 ++++++++++++++++++++--------
 arch/arm64/kvm/sys_regs.c           |   3 +-
 arch/arm64/kvm/sys_regs.h           |   2 +-
 7 files changed, 261 insertions(+), 124 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index fc2c739f48f1..493ec530eefc 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -64,6 +64,30 @@ struct arm64_ftr_bits {
 	s64		safe_val; /* safe value for FTR_EXACT features */
 };
 
+#define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
+	{						\
+		.sign = SIGNED,				\
+		.visible = VISIBLE,			\
+		.strict = STRICT,			\
+		.type = TYPE,				\
+		.shift = SHIFT,				\
+		.width = WIDTH,				\
+		.safe_val = SAFE_VAL,			\
+	}
+
+/* Define a feature with unsigned values */
+#define ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
+	__ARM64_FTR_BITS(FTR_UNSIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
+
+/* Define a feature with a signed value */
+#define S_ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
+	__ARM64_FTR_BITS(FTR_SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
+
+#define ARM64_FTR_END					\
+	{						\
+		.width = 0,				\
+	}
+
 /*
  * Describe the early feature override to the core override code:
  *
@@ -911,6 +935,7 @@ static inline unsigned int get_vmid_bits(u64 mmfr1)
 	return 8;
 }
 
+s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new, s64 cur);
 struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id);
 
 extern struct arm64_ftr_override id_aa64mmfr1_override;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 102860ba896d..aa83dd79e7ff 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1013,7 +1013,7 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
 				struct kvm_arm_copy_mte_tags *copy_tags);
 
-void kvm_arm_set_default_id_regs(struct kvm *kvm);
+void kvm_arm_init_id_regs(struct kvm *kvm);
 
 /* Guest/host FPSIMD coordination helpers */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 23bd2a926b74..e18848ee4b98 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -139,30 +139,6 @@ void dump_cpu_features(void)
 	pr_emerg("0x%*pb\n", ARM64_NCAPS, &cpu_hwcaps);
 }
 
-#define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
-	{						\
-		.sign = SIGNED,				\
-		.visible = VISIBLE,			\
-		.strict = STRICT,			\
-		.type = TYPE,				\
-		.shift = SHIFT,				\
-		.width = WIDTH,				\
-		.safe_val = SAFE_VAL,			\
-	}
-
-/* Define a feature with unsigned values */
-#define ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
-	__ARM64_FTR_BITS(FTR_UNSIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
-
-/* Define a feature with a signed value */
-#define S_ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
-	__ARM64_FTR_BITS(FTR_SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
-
-#define ARM64_FTR_END					\
-	{						\
-		.width = 0,				\
-	}
-
 static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap);
 
 static bool __system_matches_cap(unsigned int n);
@@ -790,7 +766,7 @@ static u64 arm64_ftr_set_value(const struct arm64_ftr_bits *ftrp, s64 reg,
 	return reg;
 }
 
-static s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
+s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
 				s64 cur)
 {
 	s64 ret = 0;
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fb2de2cb98cb..e539d9ca9d01 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -135,7 +135,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	/* The maximum number of VCPUs is limited by the host's GIC model */
 	kvm->max_vcpus = kvm_arm_default_max_vcpus();
 
-	kvm_arm_set_default_id_regs(kvm);
+	kvm_arm_init_id_regs(kvm);
 	kvm_arm_init_hypercalls(kvm);
 
 	return 0;
diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
index 9956c99d20f7..726b810b6e06 100644
--- a/arch/arm64/kvm/id_regs.c
+++ b/arch/arm64/kvm/id_regs.c
@@ -18,10 +18,88 @@
 
 #include "sys_regs.h"
 
+/*
+ * Number of entries in id_reg_desc's ftr_bits[] (the number of 4-bit
+ * fields in a 64-bit register, plus 1 for the terminator entry).
+ */
+#define	FTR_FIELDS_NUM	17
+
 struct id_reg_desc {
 	const struct sys_reg_desc	reg_desc;
+	/*
+	 * KVM sanitised ID register value.
+	 * It is the default value for the per-VM emulated ID register.
+	 */
+	u64 kvm_sys_val;
+	/*
+	 * Used to validate ID register values with arm64_check_features().
+	 * The array must be terminated by an item whose width field is
+	 * zero, as that is what arm64_check_features() expects.
+	 * Only feature bits defined in this array are writable.
+	 */
+	struct arm64_ftr_bits	ftr_bits[FTR_FIELDS_NUM];
+
+	/*
+	 * init() is used to set up the KVM sanitised value stored in
+	 * kvm_sys_val.
+	 */
+	void (*init)(struct id_reg_desc *idr);
 };
 
+static struct id_reg_desc id_reg_descs[];
+
+/**
+ * arm64_check_features() - Check if a feature register value constitutes
+ * a subset of features indicated by @limit.
+ *
+ * @ftrp: Pointer to an array of arm64_ftr_bits. It must be terminated by
+ * an item whose width field is zero.
+ * @val: The feature register value to check
+ * @limit: The limit value of the feature register
+ *
+ * This function will check if each feature field of @val is the "safe" value
+ * against @limit based on @ftrp[], each of which specifies the target field
+ * (shift, width), whether or not the field is for a signed value (sign),
+ * how the field is determined to be "safe" (type), and the safe value
+ * (safe_val) when type == FTR_EXACT (safe_val won't be used by this
+ * function when type != FTR_EXACT). Any other fields in arm64_ftr_bits
+ * won't be used by this function. If a field value in @val is the same
+ * as the one in @limit, it is always considered the safe value regardless
+ * of the type. For register fields that are not in @ftrp[], only the value
+ * in @limit is considered the safe value.
+ *
+ * Return: 0 if all the fields are safe. Otherwise, return negative errno.
+ */
+static int arm64_check_features(const struct arm64_ftr_bits *ftrp, u64 val, u64 limit)
+{
+	u64 mask = 0;
+
+	for (; ftrp->width; ftrp++) {
+		s64 f_val, f_lim, safe_val;
+
+		f_val = arm64_ftr_value(ftrp, val);
+		f_lim = arm64_ftr_value(ftrp, limit);
+		mask |= arm64_ftr_mask(ftrp);
+
+		if (f_val == f_lim)
+			safe_val = f_val;
+		else
+			safe_val = arm64_ftr_safe_value(ftrp, f_val, f_lim);
+
+		if (safe_val != f_val)
+			return -E2BIG;
+	}
+
+	/*
+	 * For fields that are not indicated in ftrp, values in limit are the
+	 * safe values.
+	 */
+	if ((val & ~mask) != (limit & ~mask))
+		return -E2BIG;
+
+	return 0;
+}
+
 static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
 {
 	if (kvm_vcpu_has_pmu(vcpu))
@@ -67,7 +145,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
 	case SYS_ID_AA64PFR0_EL1:
 		if (!vcpu_has_sve(vcpu))
 			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
 		if (kvm_vgic_global_state.type == VGIC_V3) {
 			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
 			val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
@@ -94,15 +171,10 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
 			val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
 		break;
 	case SYS_ID_AA64DFR0_EL1:
-		/* Limit debug to ARMv8.0 */
-		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
 		/* Set PMUver to the required version */
 		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
 		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
 				  vcpu_pmuver(vcpu));
-		/* Hide SPE from guests */
-		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);
 		break;
 	case SYS_ID_DFR0_EL1:
 		val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
@@ -161,9 +233,15 @@ static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
 		      u64 val)
 {
-	/* This is what we mean by invariant: you can't change it. */
-	if (val != read_id_reg(vcpu, rd))
-		return -EINVAL;
+	int ret;
+	int id = reg_to_encoding(rd);
+
+	ret = arm64_check_features(id_reg_descs[IDREG_IDX(id)].ftr_bits, val,
+				   id_reg_descs[IDREG_IDX(id)].kvm_sys_val);
+	if (ret)
+		return ret;
+
+	vcpu->kvm->arch.id_regs[IDREG_IDX(id)] = val;
 
 	return 0;
 }
@@ -197,12 +275,47 @@ static unsigned int aa32_id_visibility(const struct kvm_vcpu *vcpu,
 	return id_visibility(vcpu, r);
 }
 
+static void init_id_reg(struct id_reg_desc *idr)
+{
+	idr->kvm_sys_val = read_sanitised_ftr_reg(reg_to_encoding(&idr->reg_desc));
+}
+
+static void init_id_aa64pfr0_el1(struct id_reg_desc *idr)
+{
+	u64 val;
+	u32 id = reg_to_encoding(&idr->reg_desc);
+
+	val = read_sanitised_ftr_reg(id);
+	/*
+	 * The default is to expose CSV2 == 1 if the HW isn't affected.
+	 * Although this is a per-CPU feature, we make it global because
+	 * asymmetric systems are just a nuisance.
+	 *
+	 * Userspace can override this as long as it doesn't promise
+	 * the impossible.
+	 */
+	if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
+		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
+	}
+	if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
+		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
+		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
+	}
+
+	val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
+
+	val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
+	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
+
+	idr->kvm_sys_val = val;
+}
+
 static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
 			       const struct sys_reg_desc *rd,
 			       u64 val)
 {
 	u8 csv2, csv3;
-	u64 sval = val;
 
 	/*
 	 * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
@@ -220,16 +333,29 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
 	    (csv3 && arm64_get_meltdown_state() != SPECTRE_UNAFFECTED))
 		return -EINVAL;
 
-	/* We can only differ with CSV[23], and anything else is an error */
-	val ^= read_id_reg(vcpu, rd);
-	val &= ~(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
-		 ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
-	if (val)
-		return -EINVAL;
+	return set_id_reg(vcpu, rd, val);
+}
 
-	vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;
+static void init_id_aa64dfr0_el1(struct id_reg_desc *idr)
+{
+	u64 val;
+	u32 id = reg_to_encoding(&idr->reg_desc);
 
-	return 0;
+	val = read_sanitised_ftr_reg(id);
+	/* Limit debug to ARMv8.0 */
+	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
+	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
+	/*
+	 * Initialise the default PMUver before there is a chance to
+	 * create an actual PMU.
+	 */
+	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
+	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
+			  kvm_arm_pmu_get_pmuver_limit());
+	/* Hide SPE from guests */
+	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);
+
+	idr->kvm_sys_val = val;
 }
 
 static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
@@ -238,6 +364,7 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
 {
 	u8 pmuver, host_pmuver;
 	bool valid_pmu;
+	int ret;
 
 	host_pmuver = kvm_arm_pmu_get_pmuver_limit();
 
@@ -257,39 +384,59 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
 	if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
 		return -EINVAL;
 
-	/* We can only differ with PMUver, and anything else is an error */
-	val ^= read_id_reg(vcpu, rd);
-	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
-	if (val)
-		return -EINVAL;
-
 	if (valid_pmu) {
 		mutex_lock(&vcpu->kvm->lock);
-		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
-			~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
-		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
-			FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), pmuver);
+		ret = set_id_reg(vcpu, rd, val);
+		if (ret) {
+			mutex_unlock(&vcpu->kvm->lock);
+			return ret;
+		}
 
 		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
 			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
 		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
 				ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), pmuver_to_perfmon(pmuver));
 		mutex_unlock(&vcpu->kvm->lock);
-	} else if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) {
-		set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
 	} else {
-		clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
+		/* We can only differ with PMUver, and anything else is an error */
+		val ^= read_id_reg(vcpu, rd);
+		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
+		if (val)
+			return -EINVAL;
+
+		if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
+			set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
+		else
+			clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
 	}
 
 	return 0;
 }
 
+static void init_id_dfr0_el1(struct id_reg_desc *idr)
+{
+	u64 val;
+	u32 id = reg_to_encoding(&idr->reg_desc);
+
+	val = read_sanitised_ftr_reg(id);
+	/*
+	 * Initialise the default PMUver before there is a chance to
+	 * create an actual PMU.
+	 */
+	val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
+	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon),
+			  kvm_arm_pmu_get_pmuver_limit());
+
+	idr->kvm_sys_val = val;
+}
+
 static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
 			   const struct sys_reg_desc *rd,
 			   u64 val)
 {
 	u8 perfmon, host_perfmon;
 	bool valid_pmu;
+	int ret;
 
 	host_perfmon = pmuver_to_perfmon(kvm_arm_pmu_get_pmuver_limit());
 
@@ -310,42 +456,48 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
 	if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
 		return -EINVAL;
 
-	/* We can only differ with PerfMon, and anything else is an error */
-	val ^= read_id_reg(vcpu, rd);
-	val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
-	if (val)
-		return -EINVAL;
-
 	if (valid_pmu) {
 		mutex_lock(&vcpu->kvm->lock);
-		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
-			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
-		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
-			ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), perfmon);
+		ret = set_id_reg(vcpu, rd, val);
+		if (ret) {
+			mutex_unlock(&vcpu->kvm->lock);
+			return ret;
+		}
 
 		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
 			~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
 		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |= FIELD_PREP(
 			ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), perfmon_to_pmuver(perfmon));
 		mutex_unlock(&vcpu->kvm->lock);
-	} else if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF) {
-		set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
 	} else {
-		clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
+		/* We can only differ with PerfMon, and anything else is an error */
+		val ^= read_id_reg(vcpu, rd);
+		val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
+		if (val)
+			return -EINVAL;
+
+		if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF)
+			set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
+		else
+			clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
 	}
 
 	return 0;
 }
 
 /* sys_reg_desc initialiser for known cpufeature ID registers */
+#define SYS_DESC_SANITISED(name) {			\
+	SYS_DESC(SYS_##name),				\
+	.access	= access_id_reg,			\
+	.get_user = get_id_reg,				\
+	.set_user = set_id_reg,				\
+	.visibility = id_visibility,			\
+}
+
 #define ID_SANITISED(name) {				\
-	.reg_desc = {					\
-		SYS_DESC(SYS_##name),			\
-		.access	= access_id_reg,		\
-		.get_user = get_id_reg,			\
-		.set_user = set_id_reg,			\
-		.visibility = id_visibility,		\
-	},						\
+	.reg_desc = SYS_DESC_SANITISED(name),		\
+	.ftr_bits = { ARM64_FTR_END, },			\
+	.init = init_id_reg,				\
 }
 
 /* sys_reg_desc initialiser for known cpufeature ID registers */
@@ -357,6 +507,8 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
 		.set_user = set_id_reg,			\
 		.visibility = aa32_id_visibility,	\
 	},						\
+	.ftr_bits = { ARM64_FTR_END, },			\
+	.init = init_id_reg,				\
 }
 
 /*
@@ -372,6 +524,7 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
 		.set_user = set_id_reg,				\
 		.visibility = raz_visibility			\
 	},							\
+	.ftr_bits = { ARM64_FTR_END, },				\
 }
 
 /*
@@ -387,9 +540,10 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
 		.set_user = set_id_reg,			\
 		.visibility = raz_visibility,		\
 	},						\
+	.ftr_bits = { ARM64_FTR_END, },			\
 }
 
-static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
+static struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
 	/*
 	 * ID regs: all ID_SANITISED() entries here must have corresponding
 	 * entries in arm64_ftr_regs[].
@@ -405,6 +559,11 @@ static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
 		.get_user = get_id_reg,
 		.set_user = set_id_dfr0_el1,
 		.visibility = aa32_id_visibility, },
+	  .ftr_bits = {
+		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
+			ID_DFR0_EL1_PerfMon_SHIFT, ID_DFR0_EL1_PerfMon_WIDTH, 0),
+		ARM64_FTR_END, },
+	  .init = init_id_dfr0_el1,
 	},
 	ID_HIDDEN(ID_AFR0_EL1),
 	AA32_ID_SANITISED(ID_MMFR0_EL1),
@@ -439,6 +598,13 @@ static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
 		.access = access_id_reg,
 		.get_user = get_id_reg,
 		.set_user = set_id_aa64pfr0_el1, },
+	  .ftr_bits = {
+		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
+			ID_AA64PFR0_EL1_CSV2_SHIFT, ID_AA64PFR0_EL1_CSV2_WIDTH, 0),
+		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
+			ID_AA64PFR0_EL1_CSV3_SHIFT, ID_AA64PFR0_EL1_CSV3_WIDTH, 0),
+		ARM64_FTR_END, },
+	  .init = init_id_aa64pfr0_el1,
 	},
 	ID_SANITISED(ID_AA64PFR1_EL1),
 	ID_UNALLOCATED(4, 2),
@@ -454,6 +620,11 @@ static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
 		.access = access_id_reg,
 		.get_user = get_id_reg,
 		.set_user = set_id_aa64dfr0_el1, },
+	  .ftr_bits = {
+		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
+			ID_AA64DFR0_EL1_PMUVer_SHIFT, ID_AA64DFR0_EL1_PMUVer_WIDTH, 0),
+		ARM64_FTR_END, },
+	  .init = init_id_aa64dfr0_el1,
 	},
 	ID_SANITISED(ID_AA64DFR1_EL1),
 	ID_UNALLOCATED(5, 2),
@@ -583,7 +754,7 @@ int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	return ret;
 }
 
-bool kvm_arm_check_idreg_table(void)
+bool kvm_arm_idreg_table_init(void)
 {
 	unsigned int i;
 
@@ -595,6 +766,9 @@ bool kvm_arm_check_idreg_table(void)
 				&id_reg_descs[i].reg_desc, i);
 			return false;
 		}
+
+		if (id_reg_descs[i].init)
+			id_reg_descs[i].init(&id_reg_descs[i]);
 	}
 
 	return true;
@@ -618,53 +792,16 @@ int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
 }
 
 /*
- * Set the guest's ID registers that are defined in id_reg_descs[]
- * with ID_SANITISED() to the host's sanitized value.
+ * Initialise the guest's ID registers with the KVM sanitised values that were
+ * set up during ID register descriptor initialisation.
  */
-void kvm_arm_set_default_id_regs(struct kvm *kvm)
+void kvm_arm_init_id_regs(struct kvm *kvm)
 {
 	int i;
 	u32 id;
-	u64 val;
 
 	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
 		id = reg_to_encoding(&id_reg_descs[i].reg_desc);
-
-		if (id_reg_descs[i].reg_desc.visibility == raz_visibility)
-			/* Hidden or reserved ID register */
-			continue;
-
-		val = read_sanitised_ftr_reg(id);
-		kvm->arch.id_regs[IDREG_IDX(id)] = val;
+		kvm->arch.id_regs[IDREG_IDX(id)] = id_reg_descs[i].kvm_sys_val;
 	}
-	/*
-	 * The default is to expose CSV2 == 1 if the HW isn't affected.
-	 * Although this is a per-CPU feature, we make it global because
-	 * asymmetric systems are just a nuisance.
-	 *
-	 * Userspace can override this as long as it doesn't promise
-	 * the impossible.
-	 */
-	val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)];
-
-	if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
-	}
-	if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
-		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
-	}
-
-	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
-
-	/*
-	 * Initialise the default PMUver before there is a chance to
-	 * create an actual PMU.
-	 */
-	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
-		~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
-	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
-		FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
-			   kvm_arm_pmu_get_pmuver_limit());
 }
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 608a0378bdae..61b9adfd5d5e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2912,14 +2912,13 @@ int __init kvm_sys_reg_table_init(void)
 	unsigned int i;
 
 	/* Make sure tables are unique and in order. */
-	valid &= kvm_arm_check_idreg_table();
 	valid &= check_sysreg_table(sys_reg_descs, ARRAY_SIZE(sys_reg_descs), false);
 	valid &= check_sysreg_table(cp14_regs, ARRAY_SIZE(cp14_regs), true);
 	valid &= check_sysreg_table(cp14_64_regs, ARRAY_SIZE(cp14_64_regs), true);
 	valid &= check_sysreg_table(cp15_regs, ARRAY_SIZE(cp15_regs), true);
 	valid &= check_sysreg_table(cp15_64_regs, ARRAY_SIZE(cp15_64_regs), true);
 	valid &= check_sysreg_table(invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs), false);
-
+	valid &= kvm_arm_idreg_table_init();
 	if (!valid)
 		return -EINVAL;
 
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 8fd0020c67ca..55484d1615be 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -260,7 +260,7 @@ int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params);
 void kvm_arm_reset_id_regs(struct kvm_vcpu *vcpu);
 int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
 int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg);
-bool kvm_arm_check_idreg_table(void);
+bool kvm_arm_idreg_table_init(void);
 int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind);
 u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id);
 
-- 
2.40.0.rc1.284.g88254d51c5-goog


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 1/6] KVM: arm64: Move CPU ID feature registers emulation into a separate file
  2023-03-17  5:06 ` [PATCH v4 1/6] KVM: arm64: Move CPU ID feature registers emulation into a separate file Jing Zhang
@ 2023-03-27 10:14   ` Marc Zyngier
  2023-03-28 17:16     ` Jing Zhang
  0 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2023-03-27 10:14 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

On Fri, 17 Mar 2023 05:06:32 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> Create a new file id_regs.c for CPU ID feature registers emulation code,
> which are moved from sys_regs.c and tweak sys_regs code accordingly.
> 
> No functional change intended.
> 
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/kvm/Makefile   |   2 +-
>  arch/arm64/kvm/id_regs.c  | 506 ++++++++++++++++++++++++++++++++++++++
>  arch/arm64/kvm/sys_regs.c | 464 ++--------------------------------
>  arch/arm64/kvm/sys_regs.h |  41 +++
>  4 files changed, 575 insertions(+), 438 deletions(-)
>  create mode 100644 arch/arm64/kvm/id_regs.c
> 
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index c0c050e53157..a6a315fcd81e 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -13,7 +13,7 @@ obj-$(CONFIG_KVM) += hyp/
>  kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
>  	 inject_fault.o va_layout.o handle_exit.o \
>  	 guest.o debug.o reset.o sys_regs.o stacktrace.o \
> -	 vgic-sys-reg-v3.o fpsimd.o pkvm.o \
> +	 vgic-sys-reg-v3.o fpsimd.o pkvm.o id_regs.o \
>  	 arch_timer.o trng.o vmid.o emulate-nested.o nested.o \
>  	 vgic/vgic.o vgic/vgic-init.o \
>  	 vgic/vgic-irqfd.o vgic/vgic-v2.o \
> diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> new file mode 100644
> index 000000000000..08b738852955
> --- /dev/null
> +++ b/arch/arm64/kvm/id_regs.c

[...]

> +/**
> + * emulate_id_reg - Emulate a guest access to an AArch64 CPU ID feature register
> + * @vcpu: The VCPU pointer
> + * @params: Decoded system register parameters
> + *
> + * Return: true if the ID register access was successful, false otherwise.
> + */
> +int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params)
> +{
> +	const struct sys_reg_desc *r;
> +
> +	r = find_reg(params, id_reg_descs, ARRAY_SIZE(id_reg_descs));
> +
> +	if (likely(r)) {
> +		perform_access(vcpu, params, r);
> +	} else {
> +		print_sys_reg_msg(params,
> +				  "Unsupported guest id_reg access at: %lx [%08lx]\n",
> +				  *vcpu_pc(vcpu), *vcpu_cpsr(vcpu));
> +		kvm_inject_undefined(vcpu);
> +	}
> +
> +	return 1;
> +}
> +
> +
> +void kvm_arm_reset_id_regs(struct kvm_vcpu *vcpu)
> +{
> +	unsigned long i;
> +
> +	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++)
> +		if (id_reg_descs[i].reset)
> +			id_reg_descs[i].reset(vcpu, &id_reg_descs[i]);
> +}

What does this mean? None of the idregs have a reset function, given
that they are global. Maybe this will make sense in the following
patches, but definitely not here.

> +
> +int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	return kvm_sys_reg_get_user(vcpu, reg,
> +				    id_reg_descs, ARRAY_SIZE(id_reg_descs));
> +}
> +
> +int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> +	return kvm_sys_reg_set_user(vcpu, reg,
> +				    id_reg_descs, ARRAY_SIZE(id_reg_descs));
> +}
> +
> +bool kvm_arm_check_idreg_table(void)
> +{
> +	return check_sysreg_table(id_reg_descs, ARRAY_SIZE(id_reg_descs), false);
> +}

All these helpers are called from sys_regs.c and directly call back
into it. Why not simply have a helper that gets the base and size of
the array, and stick to pure common code?
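
E.g. (a sketch, with an illustrative name) id_regs.c could expose just
the table and leave all the driving logic in sys_regs.c:

	const struct sys_reg_desc *kvm_arm_id_reg_table(unsigned int *num)
	{
		*num = ARRAY_SIZE(id_reg_descs);
		return id_reg_descs;
	}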

> +
> +int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
> +{
> +	const struct sys_reg_desc *i2, *end2;
> +	unsigned int total = 0;
> +	int err;
> +
> +	i2 = id_reg_descs;
> +	end2 = id_reg_descs + ARRAY_SIZE(id_reg_descs);
> +
> +	while (i2 != end2) {
> +		err = walk_one_sys_reg(vcpu, i2++, &uind, &total);
> +		if (err)
> +			return err;
> +	}
> +	return total;
> +}

This is an exact copy of walk_sys_regs. Surely this can be made common
code.
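
Both callers could then share something like this (sketch):

	static int walk_reg_table(const struct kvm_vcpu *vcpu,
				  const struct sys_reg_desc *table,
				  unsigned int num, u64 __user *uind)
	{
		unsigned int total = 0;
		int err;

		while (num--) {
			err = walk_one_sys_reg(vcpu, table++, &uind, &total);
			if (err)
				return err;
		}

		return total;
	}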

[...]

> @@ -2912,6 +2482,8 @@ void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long i;
>  
> +	kvm_arm_reset_id_regs(vcpu);
> +
>  	for (i = 0; i < ARRAY_SIZE(sys_reg_descs); i++)
>  		if (sys_reg_descs[i].reset)
>  			sys_reg_descs[i].reset(vcpu, &sys_reg_descs[i]);
> @@ -2932,6 +2504,9 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu)
>  	params = esr_sys64_to_params(esr);
>  	params.regval = vcpu_get_reg(vcpu, Rt);
>  
> +	if (is_id_reg(reg_to_encoding(&params)))
> +		return emulate_id_reg(vcpu, &params);
> +
>  	if (!emulate_sys_reg(vcpu, &params))
>  		return 1;
>  
> @@ -3160,6 +2735,10 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (err != -ENOENT)
>  		return err;
>  
> +	err = kvm_arm_get_id_reg(vcpu, reg);

Why not check for the encoding here, or in the helpers? It feels like
this is overhead that would be easy to reduce, given that we have
fewer idregs than normal sysregs.
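
E.g. (sketch) at the top of these two functions:

	struct sys_reg_params params;

	if (index_to_params(reg->id, &params) &&
	    is_id_reg(reg_to_encoding(&params)))
		return kvm_arm_get_id_reg(vcpu, reg);

(and the set_user equivalent), so the id_regs path is only taken for
actual ID register encodings.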

> +	if (err != -ENOENT)
> +		return err;
> +
>  	return kvm_sys_reg_get_user(vcpu, reg,
>  				    sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
>  }
> @@ -3204,6 +2783,10 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
>  	if (err != -ENOENT)
>  		return err;
>  
> +	err = kvm_arm_set_id_reg(vcpu, reg);

Same here.

> +	if (err != -ENOENT)
> +		return err;
> +
>  	return kvm_sys_reg_set_user(vcpu, reg,
>  				    sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
>  }
> @@ -3250,10 +2833,10 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
>  	return true;
>  }
>  
> -static int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
> -			    const struct sys_reg_desc *rd,
> -			    u64 __user **uind,
> -			    unsigned int *total)
> +int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
> +		     const struct sys_reg_desc *rd,
> +		     u64 __user **uind,
> +		     unsigned int *total)
>  {
>  	/*
>  	 * Ignore registers we trap but don't save,
> @@ -3294,6 +2877,7 @@ unsigned long kvm_arm_num_sys_reg_descs(struct kvm_vcpu *vcpu)
>  {
>  	return ARRAY_SIZE(invariant_sys_regs)
>  		+ num_demux_regs()
> +		+ kvm_arm_walk_id_regs(vcpu, (u64 __user *)NULL)
>  		+ walk_sys_regs(vcpu, (u64 __user *)NULL);
>  }
>  
> @@ -3309,6 +2893,11 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>  		uindices++;
>  	}
>  
> +	err = kvm_arm_walk_id_regs(vcpu, uindices);
> +	if (err < 0)
> +		return err;
> +	uindices += err;
> +
>  	err = walk_sys_regs(vcpu, uindices);
>  	if (err < 0)
>  		return err;
> @@ -3323,6 +2912,7 @@ int __init kvm_sys_reg_table_init(void)
>  	unsigned int i;
>  
>  	/* Make sure tables are unique and in order. */
> +	valid &= kvm_arm_check_idreg_table();
>  	valid &= check_sysreg_table(sys_reg_descs, ARRAY_SIZE(sys_reg_descs), false);
>  	valid &= check_sysreg_table(cp14_regs, ARRAY_SIZE(cp14_regs), true);
>  	valid &= check_sysreg_table(cp14_64_regs, ARRAY_SIZE(cp14_64_regs), true);
> diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> index 6b11f2cc7146..ad41305348f7 100644
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -210,6 +210,35 @@ find_reg(const struct sys_reg_params *params, const struct sys_reg_desc table[],
>  	return __inline_bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
>  }
>  
> +static inline unsigned int raz_visibility(const struct kvm_vcpu *vcpu,
> +					  const struct sys_reg_desc *r)
> +{
> +	return REG_RAZ;
> +}

No, please. This is used as a function pointer. You now potentially
force the compiler to emit as many copies of this as there are pointers.

> +
> +static inline bool write_to_read_only(struct kvm_vcpu *vcpu,
> +				      struct sys_reg_params *params,
> +				      const struct sys_reg_desc *r)
> +{
> +	WARN_ONCE(1, "Unexpected sys_reg write to read-only register\n");
> +	print_sys_reg_instr(params);
> +	kvm_inject_undefined(vcpu);
> +	return false;
> +}

Please make this common code, and not an inline function.
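
I.e. keep only the declaration in sys_regs.h and have a single
out-of-line definition in sys_regs.c (sketch):

	/* sys_regs.h */
	bool write_to_read_only(struct kvm_vcpu *vcpu,
				struct sys_reg_params *params,
				const struct sys_reg_desc *r);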

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest
  2023-03-17  5:06 ` [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest Jing Zhang
@ 2023-03-27 10:15   ` Marc Zyngier
  2023-03-28 17:36     ` Jing Zhang
  0 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2023-03-27 10:15 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

On Fri, 17 Mar 2023 05:06:33 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> From: Reiji Watanabe <reijiw@google.com>
> 
> Introduce id_regs[] in kvm_arch as a storage of guest's ID registers,
> and save ID registers' sanitized value in the array at KVM_CREATE_VM.
> Use the saved ones when ID registers are read by the guest or
> userspace (via KVM_GET_ONE_REG).
> 
> No functional change intended.
> 
> Signed-off-by: Reiji Watanabe <reijiw@google.com>
> Co-developed-by: Jing Zhang <jingzhangos@google.com>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h | 11 ++++++++
>  arch/arm64/kvm/arm.c              |  1 +
>  arch/arm64/kvm/id_regs.c          | 44 ++++++++++++++++++++++++-------
>  arch/arm64/kvm/sys_regs.c         |  2 +-
>  arch/arm64/kvm/sys_regs.h         |  1 +
>  5 files changed, 49 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index a1892a8f6032..fb6b50b1f111 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -245,6 +245,15 @@ struct kvm_arch {
>  	 * the associated pKVM instance in the hypervisor.
>  	 */
>  	struct kvm_protected_vm pkvm;
> +
> +	/*
> +	 * Save ID registers for the guest in id_regs[].
> +	 * (Op0, Op1, CRn, CRm, Op2) of the ID registers to be saved in it
> +	 * is (3, 0, 0, crm, op2), where 1<=crm<8, 0<=op2<8.
> +	 */
> +#define KVM_ARM_ID_REG_NUM	56
> +#define IDREG_IDX(id)		(((sys_reg_CRm(id) - 1) << 3) | sys_reg_Op2(id))
> +	u64 id_regs[KVM_ARM_ID_REG_NUM];

Place these registers in their own structure, and place this structure
*before* the pkvm structure. Document what guards these registers when
updated (my hunch is that this should rely on Oliver's locking fixes
if the update comes from a vcpu).
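
Something along these lines (a sketch; the exact guard is still to be
nailed down, hence the hedged comment):

	/*
	 * VM-wide ID registers, set at VM creation and possibly
	 * overridden by userspace. Concurrent updates must be
	 * serialised (e.g. by kvm->lock when written from a vcpu
	 * context).
	 */
	struct {
		u64 regs[KVM_ARM_ID_REG_NUM];
	} idregs;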

>  };
>  
>  struct kvm_vcpu_fault_info {
> @@ -1005,6 +1014,8 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>  				struct kvm_arm_copy_mte_tags *copy_tags);
>  
> +void kvm_arm_set_default_id_regs(struct kvm *kvm);
> +
>  /* Guest/host FPSIMD coordination helpers */
>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 3bd732eaf087..4579c878ab30 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -153,6 +153,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  
>  	set_default_spectre(kvm);
>  	kvm_arm_init_hypercalls(kvm);
> +	kvm_arm_set_default_id_regs(kvm);
>  
>  	/*
>  	 * Initialise the default PMUver before there is a chance to
> diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> index 08b738852955..e393b5730557 100644
> --- a/arch/arm64/kvm/id_regs.c
> +++ b/arch/arm64/kvm/id_regs.c
> @@ -52,16 +52,9 @@ static u8 pmuver_to_perfmon(u8 pmuver)
>  	}
>  }
>  
> -/* Read a sanitised cpufeature ID register by sys_reg_desc */
> -static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
> +u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
>  {
> -	u32 id = reg_to_encoding(r);
> -	u64 val;
> -
> -	if (sysreg_visible_as_raz(vcpu, r))
> -		return 0;
> -
> -	val = read_sanitised_ftr_reg(id);
> +	u64 val = vcpu->kvm->arch.id_regs[IDREG_IDX(id)];
>  
>  	switch (id) {
>  	case SYS_ID_AA64PFR0_EL1:
> @@ -126,6 +119,14 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r
>  	return val;
>  }
>  
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
> +{
> +	if (sysreg_visible_as_raz(vcpu, r))
> +		return 0;
> +
> +	return kvm_arm_read_id_reg(vcpu, reg_to_encoding(r));
> +}
> +
>  /* cpufeature ID register access trap handlers */
>  
>  static bool access_id_reg(struct kvm_vcpu *vcpu,
> @@ -504,3 +505,28 @@ int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
>  	}
>  	return total;
>  }
> +
> +/*
> + * Set the guest's ID registers that are defined in id_reg_descs[]
> + * with ID_SANITISED() to the host's sanitized value.
> + */
> +void kvm_arm_set_default_id_regs(struct kvm *kvm)
> +{
> +	int i;
> +	u32 id;
> +	u64 val;
> +
> +	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
> +		id = reg_to_encoding(&id_reg_descs[i]);
> +		if (WARN_ON_ONCE(!is_id_reg(id)))
> +			/* Shouldn't happen */
> +			continue;
> +
> +		if (id_reg_descs[i].visibility == raz_visibility)
> +			/* Hidden or reserved ID register */
> +			continue;

Relying on function pointer comparison is really fragile. If I wrap
raz_visibility() in another function, this won't catch it. It also
doesn't sit well with your 'inline' definition of this function.

More importantly, why do we care about checking for visibility at all?
We can happily populate the array and rely on the runtime visibility.
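
I.e. the loop could simply be (sketch):

	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
		id = reg_to_encoding(&id_reg_descs[i]);
		kvm->arch.id_regs[IDREG_IDX(id)] = read_sanitised_ftr_reg(id);
	}

read_sanitised_ftr_reg() should return 0 for registers the core code
doesn't know about, so RAZ entries simply stay at 0.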

> +
> +		val = read_sanitised_ftr_reg(id);
> +		kvm->arch.id_regs[IDREG_IDX(id)] = val;
> +	}
> +}

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3]
  2023-03-17  5:06 ` [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3] Jing Zhang
@ 2023-03-27 10:31   ` Marc Zyngier
  2023-03-28 19:54     ` Jing Zhang
  2023-03-28 12:39   ` Fuad Tabba
  1 sibling, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2023-03-27 10:31 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

On Fri, 17 Mar 2023 05:06:34 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> With per guest ID registers, ID_AA64PFR0_EL1.[CSV2|CSV3] settings from
> userspace can be stored in its corresponding ID register.
> 
> No functional change intended.
> 
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h  |  2 --
>  arch/arm64/kvm/arm.c               | 19 +------------------
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c |  7 +++----
>  arch/arm64/kvm/id_regs.c           | 30 ++++++++++++++++++++++--------
>  4 files changed, 26 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index fb6b50b1f111..e926ea91a73c 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -230,8 +230,6 @@ struct kvm_arch {
>  
>  	cpumask_var_t supported_cpus;
>  
> -	u8 pfr0_csv2;
> -	u8 pfr0_csv3;
>  	struct {
>  		u8 imp:4;
>  		u8 unimp:4;
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 4579c878ab30..c78d68d011cb 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -104,22 +104,6 @@ static int kvm_arm_default_max_vcpus(void)
>  	return vgic_present ? kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
>  }
>  
> -static void set_default_spectre(struct kvm *kvm)
> -{
> -	/*
> -	 * The default is to expose CSV2 == 1 if the HW isn't affected.
> -	 * Although this is a per-CPU feature, we make it global because
> -	 * asymmetric systems are just a nuisance.
> -	 *
> -	 * Userspace can override this as long as it doesn't promise
> -	 * the impossible.
> -	 */
> -	if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED)
> -		kvm->arch.pfr0_csv2 = 1;
> -	if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED)
> -		kvm->arch.pfr0_csv3 = 1;
> -}
> -
>  /**
>   * kvm_arch_init_vm - initializes a VM data structure
>   * @kvm:	pointer to the KVM struct
> @@ -151,9 +135,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	/* The maximum number of VCPUs is limited by the host's GIC model */
>  	kvm->max_vcpus = kvm_arm_default_max_vcpus();
>  
> -	set_default_spectre(kvm);
> -	kvm_arm_init_hypercalls(kvm);
>  	kvm_arm_set_default_id_regs(kvm);
> +	kvm_arm_init_hypercalls(kvm);

Please document the ordering dependency between idregs and hypercalls.

>  
>  	/*
>  	 * Initialise the default PMUver before there is a chance to
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> index 08d2b004f4b7..0e1988740a65 100644
> --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -93,10 +93,9 @@ static u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
>  		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
>  
>  	/* Spectre and Meltdown mitigation in KVM */
> -	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> -			       (u64)kvm->arch.pfr0_csv2);
> -	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> -			       (u64)kvm->arch.pfr0_csv3);
> +	set_mask |= vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] &

This really wants an accessor.
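
E.g. (a sketch; the IDREG() name is only illustrative):

	#define IDREG(kvm, id)	((kvm)->arch.id_regs[IDREG_IDX(id)])

which would turn this into "IDREG(vcpu->kvm, SYS_ID_AA64PFR0_EL1) & ...".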

> +		(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
> +			ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
>  
>  	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
>  }
> diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> index e393b5730557..b60ca1058301 100644
> --- a/arch/arm64/kvm/id_regs.c
> +++ b/arch/arm64/kvm/id_regs.c
> @@ -61,12 +61,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
>  		if (!vcpu_has_sve(vcpu))
>  			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
>  		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
> -		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> -		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> -				  (u64)vcpu->kvm->arch.pfr0_csv2);
> -		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> -		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> -				  (u64)vcpu->kvm->arch.pfr0_csv3);
>  		if (kvm_vgic_global_state.type == VGIC_V3) {
>  			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
>  			val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
> @@ -201,6 +195,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
>  			       u64 val)
>  {
>  	u8 csv2, csv3;
> +	u64 sval = val;
>  
>  	/*
>  	 * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
> @@ -225,8 +220,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
>  	if (val)
>  		return -EINVAL;
>  
> -	vcpu->kvm->arch.pfr0_csv2 = csv2;
> -	vcpu->kvm->arch.pfr0_csv3 = csv3;
> +	vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;

Accessor needed here too.

>
>  	return 0;
>  }
> @@ -529,4 +523,24 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
>  		val = read_sanitised_ftr_reg(id);
>  		kvm->arch.id_regs[IDREG_IDX(id)] = val;
>  	}
> +	/*

Add a blank line after the closing brace.

> +	 * The default is to expose CSV2 == 1 if the HW isn't affected.
> +	 * Although this is a per-CPU feature, we make it global because
> +	 * asymmetric systems are just a nuisance.
> +	 *
> +	 * Userspace can override this as long as it doesn't promise
> +	 * the impossible.
> +	 */
> +	val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)];

Accessor.

> +
> +	if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
> +		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> +		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
> +	}
> +	if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
> +		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> +		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
> +	}
> +
> +	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;

Accessor.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 4/6] KVM: arm64: Use per guest ID register for ID_AA64DFR0_EL1.PMUVer
  2023-03-17  5:06 ` [PATCH v4 4/6] KVM: arm64: Use per guest ID register for ID_AA64DFR0_EL1.PMUVer Jing Zhang
@ 2023-03-27 10:40   ` Marc Zyngier
  2023-03-28 20:20     ` Jing Zhang
  0 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2023-03-27 10:40 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

On Fri, 17 Mar 2023 05:06:35 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> With per guest ID registers, PMUver settings from userspace
> can be stored in its corresponding ID register.
> 
> No functional change intended.
> 
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h | 11 +++---
>  arch/arm64/kvm/arm.c              |  6 ---
>  arch/arm64/kvm/id_regs.c          | 61 +++++++++++++++++++++++++------
>  include/kvm/arm_pmu.h             |  5 ++-
>  4 files changed, 59 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e926ea91a73c..102860ba896d 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -218,6 +218,12 @@ struct kvm_arch {
>  #define KVM_ARCH_FLAG_EL1_32BIT				4
>  	/* PSCI SYSTEM_SUSPEND enabled for the guest */
>  #define KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED		5
> +	/*
> +	 * AA64DFR0_EL1.PMUver was set as ID_AA64DFR0_EL1_PMUVer_IMP_DEF
> +	 * or DFR0_EL1.PerfMon was set as ID_DFR0_EL1_PerfMon_IMPDEF from
> +	 * userspace for VCPUs without PMU.
> +	 */
> +#define KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU		6
>  
>  	unsigned long flags;
>  
> @@ -230,11 +236,6 @@ struct kvm_arch {
>  
>  	cpumask_var_t supported_cpus;
>  
> -	struct {
> -		u8 imp:4;
> -		u8 unimp:4;
> -	} dfr0_pmuver;
> -
>  	/* Hypercall features firmware registers' descriptor */
>  	struct kvm_smccc_features smccc_feat;
>  
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index c78d68d011cb..fb2de2cb98cb 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -138,12 +138,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	kvm_arm_set_default_id_regs(kvm);
>  	kvm_arm_init_hypercalls(kvm);
>  
> -	/*
> -	 * Initialise the default PMUver before there is a chance to
> -	 * create an actual PMU.
> -	 */
> -	kvm->arch.dfr0_pmuver.imp = kvm_arm_pmu_get_pmuver_limit();
> -
>  	return 0;
>  
>  err_free_cpumask:
> diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> index b60ca1058301..3a87a3d2390d 100644
> --- a/arch/arm64/kvm/id_regs.c
> +++ b/arch/arm64/kvm/id_regs.c
> @@ -21,9 +21,12 @@
>  static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
>  {
>  	if (kvm_vcpu_has_pmu(vcpu))
> -		return vcpu->kvm->arch.dfr0_pmuver.imp;
> -
> -	return vcpu->kvm->arch.dfr0_pmuver.unimp;
> +		return FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
> +				vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)]);
> +	else if (test_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags))
> +		return ID_AA64DFR0_EL1_PMUVer_IMP_DEF;
> +	else
> +		return 0;

Drop the pointless elses.
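
i.e. the exact same logic, just flattened (untested):

	if (kvm_vcpu_has_pmu(vcpu))
		return FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
				 vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)]);

	if (test_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags))
		return ID_AA64DFR0_EL1_PMUVer_IMP_DEF;

	return 0;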

>  }
>  
>  static u8 perfmon_to_pmuver(u8 perfmon)
> @@ -256,10 +259,23 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
>  	if (val)
>  		return -EINVAL;
>  
> -	if (valid_pmu)
> -		vcpu->kvm->arch.dfr0_pmuver.imp = pmuver;
> -	else
> -		vcpu->kvm->arch.dfr0_pmuver.unimp = pmuver;
> +	if (valid_pmu) {
> +		mutex_lock(&vcpu->kvm->lock);

Bingo! Taking kvm->lock in a vcpu ioctl means acquiring it while
already holding the vcpu mutex, which inverts the documented locking
order (kvm->lock is taken outside vcpu->mutex).

> +		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> +			~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> +		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
> +			FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), pmuver);
> +
> +		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
> +			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> +		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
> +				ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), pmuver_to_perfmon(pmuver));
> +		mutex_unlock(&vcpu->kvm->lock);
> +	} else if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) {
> +		set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +	} else {
> +		clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +	}

The last two cases are better written as:

	assign_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags,
		   pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF);

>  
>  	return 0;
>  }
> @@ -296,10 +312,23 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>  	if (val)
>  		return -EINVAL;
>  
> -	if (valid_pmu)
> -		vcpu->kvm->arch.dfr0_pmuver.imp = perfmon_to_pmuver(perfmon);
> -	else
> -		vcpu->kvm->arch.dfr0_pmuver.unimp = perfmon_to_pmuver(perfmon);
> +	if (valid_pmu) {
> +		mutex_lock(&vcpu->kvm->lock);

Same here (lock inversion)

> +		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
> +			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> +		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
> +			ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), perfmon);
> +
> +		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> +			~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> +		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |= FIELD_PREP(
> +			ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), perfmon_to_pmuver(perfmon));
> +		mutex_unlock(&vcpu->kvm->lock);
> +	} else if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF) {
> +		set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +	} else {
> +		clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +	}

Same here (assign_bit).
>  
>  	return 0;
>  }
> @@ -543,4 +572,14 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
>  	}
>  
>  	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
> +
> +	/*
> +	 * Initialise the default PMUver before there is a chance to
> +	 * create an actual PMU.
> +	 */
> +	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> +		~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> +	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
> +		FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
> +			   kvm_arm_pmu_get_pmuver_limit());

Please put these assignments on a single line...
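
One way to read this is to reuse the local 'val' that the function
already has, so each statement fits on one line (untested):

	val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)];
	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
			  kvm_arm_pmu_get_pmuver_limit());
	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] = val;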

>  }
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 628775334d5e..51c7f3e7bdde 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -92,8 +92,9 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
>  /*
>   * Evaluates as true when emulating PMUv3p5, and false otherwise.
>   */
> -#define kvm_pmu_is_3p5(vcpu)						\
> -	(vcpu->kvm->arch.dfr0_pmuver.imp >= ID_AA64DFR0_EL1_PMUVer_V3P5)
> +#define kvm_pmu_is_3p5(vcpu)									\
> +	 (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),					\
> +		 vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)]) >= ID_AA64DFR0_EL1_PMUVer_V3P5)

I'll stop mentioning the need for accessors...

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 5/6] KVM: arm64: Introduce ID register specific descriptor
  2023-03-17  5:06 ` [PATCH v4 5/6] KVM: arm64: Introduce ID register specific descriptor Jing Zhang
@ 2023-03-27 11:28   ` Marc Zyngier
  2023-03-29  3:46     ` Jing Zhang
  0 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2023-03-27 11:28 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

On Fri, 17 Mar 2023 05:06:36 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> Introduce an ID feature register specific descriptor to include ID
> register specific fields and callbacks besides its corresponding
> general system register descriptor.
> New fields for ID register descriptor would be added later when it
> is necessary to support a writable ID register.

What would these be? Could they make sense for "normal" sysregs as
well?

> 
> No functional change intended.
> 
> Co-developed-by: Reiji Watanabe <reijiw@google.com>
> Signed-off-by: Reiji Watanabe <reijiw@google.com>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/kvm/id_regs.c  | 187 +++++++++++++++++++++++++++-----------
>  arch/arm64/kvm/sys_regs.c |   2 +-
>  arch/arm64/kvm/sys_regs.h |   1 +
>  3 files changed, 138 insertions(+), 52 deletions(-)
> 
> diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> index 3a87a3d2390d..9956c99d20f7 100644
> --- a/arch/arm64/kvm/id_regs.c
> +++ b/arch/arm64/kvm/id_regs.c
> @@ -18,6 +18,10 @@
>  
>  #include "sys_regs.h"
>  
> +struct id_reg_desc {
> +	const struct sys_reg_desc	reg_desc;
> +};
> +

What is the advantage in having this wrapping structure that forces us
to reinvent the wheel (the structure is different) over an additional
pointer or even a side table?

>  static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
>  {
>  	if (kvm_vcpu_has_pmu(vcpu))
> @@ -334,21 +338,25 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>  }
>  
>  /* sys_reg_desc initialiser for known cpufeature ID registers */
> -#define ID_SANITISED(name) {			\
> -	SYS_DESC(SYS_##name),			\
> -	.access	= access_id_reg,		\
> -	.get_user = get_id_reg,			\
> -	.set_user = set_id_reg,			\
> -	.visibility = id_visibility,		\
> +#define ID_SANITISED(name) {				\
> +	.reg_desc = {					\
> +		SYS_DESC(SYS_##name),			\
> +		.access	= access_id_reg,		\
> +		.get_user = get_id_reg,			\
> +		.set_user = set_id_reg,			\
> +		.visibility = id_visibility,		\
> +	},						\
>  }
>  
>  /* sys_reg_desc initialiser for known cpufeature ID registers */
> -#define AA32_ID_SANITISED(name) {		\
> -	SYS_DESC(SYS_##name),			\
> -	.access	= access_id_reg,		\
> -	.get_user = get_id_reg,			\
> -	.set_user = set_id_reg,			\
> -	.visibility = aa32_id_visibility,	\
> +#define AA32_ID_SANITISED(name) {			\
> +	.reg_desc = {					\
> +		SYS_DESC(SYS_##name),			\
> +		.access	= access_id_reg,		\
> +		.get_user = get_id_reg,			\
> +		.set_user = set_id_reg,			\
> +		.visibility = aa32_id_visibility,	\
> +	},						\
>  }
>  
>  /*
> @@ -356,12 +364,14 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>   * register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2
>   * (1 <= crm < 8, 0 <= Op2 < 8).
>   */
> -#define ID_UNALLOCATED(crm, op2) {			\
> -	Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2),	\
> -	.access = access_id_reg,			\
> -	.get_user = get_id_reg,				\
> -	.set_user = set_id_reg,				\
> -	.visibility = raz_visibility			\
> +#define ID_UNALLOCATED(crm, op2) {				\
> +	.reg_desc = {						\
> +		Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2),	\
> +		.access = access_id_reg,			\
> +		.get_user = get_id_reg,				\
> +		.set_user = set_id_reg,				\
> +		.visibility = raz_visibility			\
> +	},							\
>  }
>  
>  /*
> @@ -369,15 +379,17 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>   * For now, these are exposed just like unallocated ID regs: they appear
>   * RAZ for the guest.
>   */
> -#define ID_HIDDEN(name) {			\
> -	SYS_DESC(SYS_##name),			\
> -	.access = access_id_reg,		\
> -	.get_user = get_id_reg,			\
> -	.set_user = set_id_reg,			\
> -	.visibility = raz_visibility,		\
> +#define ID_HIDDEN(name) {				\
> +	.reg_desc = {					\
> +		SYS_DESC(SYS_##name),			\
> +		.access = access_id_reg,		\
> +		.get_user = get_id_reg,			\
> +		.set_user = set_id_reg,			\
> +		.visibility = raz_visibility,		\
> +	},						\
>  }
>  
> -static const struct sys_reg_desc id_reg_descs[] = {
> +static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
>  	/*
>  	 * ID regs: all ID_SANITISED() entries here must have corresponding
>  	 * entries in arm64_ftr_regs[].
> @@ -387,9 +399,13 @@ static const struct sys_reg_desc id_reg_descs[] = {
>  	/* CRm=1 */
>  	AA32_ID_SANITISED(ID_PFR0_EL1),
>  	AA32_ID_SANITISED(ID_PFR1_EL1),
> -	{ SYS_DESC(SYS_ID_DFR0_EL1), .access = access_id_reg,
> -	  .get_user = get_id_reg, .set_user = set_id_dfr0_el1,
> -	  .visibility = aa32_id_visibility, },
> +	{ .reg_desc = {
> +		SYS_DESC(SYS_ID_DFR0_EL1),
> +		.access = access_id_reg,
> +		.get_user = get_id_reg,
> +		.set_user = set_id_dfr0_el1,
> +		.visibility = aa32_id_visibility, },
> +	},
>  	ID_HIDDEN(ID_AFR0_EL1),
>  	AA32_ID_SANITISED(ID_MMFR0_EL1),
>  	AA32_ID_SANITISED(ID_MMFR1_EL1),
> @@ -418,8 +434,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
>  
>  	/* AArch64 ID registers */
>  	/* CRm=4 */
> -	{ SYS_DESC(SYS_ID_AA64PFR0_EL1), .access = access_id_reg,
> -	  .get_user = get_id_reg, .set_user = set_id_aa64pfr0_el1, },
> +	{ .reg_desc = {
> +		SYS_DESC(SYS_ID_AA64PFR0_EL1),
> +		.access = access_id_reg,
> +		.get_user = get_id_reg,
> +		.set_user = set_id_aa64pfr0_el1, },
> +	},
>  	ID_SANITISED(ID_AA64PFR1_EL1),
>  	ID_UNALLOCATED(4, 2),
>  	ID_UNALLOCATED(4, 3),
> @@ -429,8 +449,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
>  	ID_UNALLOCATED(4, 7),
>  
>  	/* CRm=5 */
> -	{ SYS_DESC(SYS_ID_AA64DFR0_EL1), .access = access_id_reg,
> -	  .get_user = get_id_reg, .set_user = set_id_aa64dfr0_el1, },
> +	{ .reg_desc = {
> +		SYS_DESC(SYS_ID_AA64DFR0_EL1),
> +		.access = access_id_reg,
> +		.get_user = get_id_reg,
> +		.set_user = set_id_aa64dfr0_el1, },
> +	},
>  	ID_SANITISED(ID_AA64DFR1_EL1),
>  	ID_UNALLOCATED(5, 2),
>  	ID_UNALLOCATED(5, 3),
> @@ -469,12 +493,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
>   */
>  int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params)
>  {
> -	const struct sys_reg_desc *r;
> +	u32 id;
>  
> -	r = find_reg(params, id_reg_descs, ARRAY_SIZE(id_reg_descs));
> +	id = reg_to_encoding(params);
>  
> -	if (likely(r)) {
> -		perform_access(vcpu, params, r);
> +	if (likely(is_id_reg(id))) {
> +		perform_access(vcpu, params, &id_reg_descs[IDREG_IDX(id)].reg_desc);

How about minimising the diff and making the whole thing less verbose?

static const struct sys_reg_desc *id_to_id_reg_desc(struct sys_reg_params *params)
{
	u32 id;

	id = reg_to_encoding(params);
	if (is_id_reg(id))
		return &id_reg_descs[IDREG_IDX(id)].reg_desc;

	return NULL;
}

int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params)
{
	const struct sys_reg_desc *r;

	r = id_to_id_reg_desc(params);
	[...]
}

And use the helper everywhere?

>  	} else {
>  		print_sys_reg_msg(params,
>  				  "Unsupported guest id_reg access at: %lx [%08lx]\n",
> @@ -491,38 +515,102 @@ void kvm_arm_reset_id_regs(struct kvm_vcpu *vcpu)
>  	unsigned long i;
>  
>  	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++)
> -		if (id_reg_descs[i].reset)
> -			id_reg_descs[i].reset(vcpu, &id_reg_descs[i]);
> +		if (id_reg_descs[i].reg_desc.reset)
> +			id_reg_descs[i].reg_desc.reset(vcpu, &id_reg_descs[i].reg_desc);
>  }
>  
>  int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
> -	return kvm_sys_reg_get_user(vcpu, reg,
> -				    id_reg_descs, ARRAY_SIZE(id_reg_descs));
> +	u64 __user *uaddr = (u64 __user *)(unsigned long)reg->addr;
> +	const struct sys_reg_desc *r;
> +	struct sys_reg_params params;
> +	u64 val;
> +	int ret;
> +	u32 id;
> +
> +	if (!index_to_params(reg->id, &params))
> +		return -ENOENT;
> +	id = reg_to_encoding(&params);
> +
> +	if (!is_id_reg(id))
> +		return -ENOENT;
> +
> +	r = &id_reg_descs[IDREG_IDX(id)].reg_desc;
> +	if (r->get_user) {
> +		ret = (r->get_user)(vcpu, r, &val);
> +	} else {
> +		ret = 0;
> +		val = vcpu->kvm->arch.id_regs[IDREG_IDX(id)];
> +	}
> +
> +	if (!ret)
> +		ret = put_user(val, uaddr);

How about the visibility? Why isn't it checked?
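
Presumably the same treatment as in kvm_sys_reg_get_user(), i.e.
something like (untested, assuming the usual visibility helper):

	if (sysreg_hidden_user(vcpu, r))
		return -ENOENT;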

> +
> +	return ret;
>  }
>  
>  int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  {
> -	return kvm_sys_reg_set_user(vcpu, reg,
> -				    id_reg_descs, ARRAY_SIZE(id_reg_descs));
> +	u64 __user *uaddr = (u64 __user *)(unsigned long)reg->addr;
> +	const struct sys_reg_desc *r;
> +	struct sys_reg_params params;
> +	u64 val;
> +	int ret;
> +	u32 id;
> +
> +	if (!index_to_params(reg->id, &params))
> +		return -ENOENT;
> +	id = reg_to_encoding(&params);
> +
> +	if (!is_id_reg(id))
> +		return -ENOENT;
> +
> +	if (get_user(val, uaddr))
> +		return -EFAULT;
> +
> +	r = &id_reg_descs[IDREG_IDX(id)].reg_desc;
> +
> +	if (sysreg_user_write_ignore(vcpu, r))
> +		return 0;
> +
> +	if (r->set_user) {
> +		ret = (r->set_user)(vcpu, r, val);
> +	} else {
> +		WARN_ONCE(1, "ID register set_user callback is NULL\n");

Why the shouting? We didn't do that before. What's changed?

> +		ret = 0;
> +	}
> +
> +	return ret;
>  }
>  
>  bool kvm_arm_check_idreg_table(void)
>  {
> -	return check_sysreg_table(id_reg_descs, ARRAY_SIZE(id_reg_descs), false);
> +	unsigned int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
> +		const struct sys_reg_desc *r = &id_reg_descs[i].reg_desc;
> +
> +		if (!is_id_reg(reg_to_encoding(r))) {
> +			kvm_err("id_reg table %pS entry %d not set correctly\n",
> +				&id_reg_descs[i].reg_desc, i);
> +			return false;
> +		}
> +	}
> +
> +	return true;
>  }
>  
>  int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
>  {
> -	const struct sys_reg_desc *i2, *end2;
> +	const struct id_reg_desc *i2, *end2;
>  	unsigned int total = 0;
>  	int err;
>  
>  	i2 = id_reg_descs;
>  	end2 = id_reg_descs + ARRAY_SIZE(id_reg_descs);
>  
> -	while (i2 != end2) {
> -		err = walk_one_sys_reg(vcpu, i2++, &uind, &total);
> +	for (; i2 != end2; i2++) {
> +		err = walk_one_sys_reg(vcpu, &(i2->reg_desc), &uind, &total);
>  		if (err)
>  			return err;
>  	}
> @@ -540,12 +628,9 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
>  	u64 val;
>  
>  	for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
> -		id = reg_to_encoding(&id_reg_descs[i]);
> -		if (WARN_ON_ONCE(!is_id_reg(id)))
> -			/* Shouldn't happen */
> -			continue;
> +		id = reg_to_encoding(&id_reg_descs[i].reg_desc);

Why have you dropped that check? If it couldn't happen before, it
still shouldn't happen now.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 6/6] KVM: arm64: Refactor writings for PMUVer/CSV2/CSV3
  2023-03-17  5:06 ` [PATCH v4 6/6] KVM: arm64: Refactor writings for PMUVer/CSV2/CSV3 Jing Zhang
@ 2023-03-27 13:34   ` Marc Zyngier
  2023-03-29  4:29     ` Jing Zhang
  0 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2023-03-27 13:34 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

On Fri, 17 Mar 2023 05:06:37 +0000,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> Save KVM sanitised ID register value in ID descriptor (kvm_sys_val).

Why do we need to store a separate value *beside* the sanitised value
the kernel already holds?

> Add an init callback for every ID register to setup kvm_sys_val.

Same question.

> All per VCPU sanitizations are still handled on the fly during ID
> register read and write from userspace.
> An arm64_ftr_bits array is used to indicate writable feature fields.
> 
> Refactor writings for ID_AA64PFR0_EL1.[CSV2|CSV3],
> ID_AA64DFR0_EL1.PMUVer and ID_DFR0_ELF.PerfMon based on utilities
> introduced by ID register descriptor.
> 
> No functional change intended.
> 
> Co-developed-by: Reiji Watanabe <reijiw@google.com>
> Signed-off-by: Reiji Watanabe <reijiw@google.com>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/include/asm/cpufeature.h |  25 +++
>  arch/arm64/include/asm/kvm_host.h   |   2 +-
>  arch/arm64/kernel/cpufeature.c      |  26 +--
>  arch/arm64/kvm/arm.c                |   2 +-
>  arch/arm64/kvm/id_regs.c            | 325 ++++++++++++++++++++--------
>  arch/arm64/kvm/sys_regs.c           |   3 +-
>  arch/arm64/kvm/sys_regs.h           |   2 +-
>  7 files changed, 261 insertions(+), 124 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index fc2c739f48f1..493ec530eefc 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -64,6 +64,30 @@ struct arm64_ftr_bits {
>  	s64		safe_val; /* safe value for FTR_EXACT features */
>  };
>  
> +#define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> +	{						\
> +		.sign = SIGNED,				\
> +		.visible = VISIBLE,			\
> +		.strict = STRICT,			\
> +		.type = TYPE,				\
> +		.shift = SHIFT,				\
> +		.width = WIDTH,				\
> +		.safe_val = SAFE_VAL,			\
> +	}
> +
> +/* Define a feature with unsigned values */
> +#define ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> +	__ARM64_FTR_BITS(FTR_UNSIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
> +
> +/* Define a feature with a signed value */
> +#define S_ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> +	__ARM64_FTR_BITS(FTR_SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
> +
> +#define ARM64_FTR_END					\
> +	{						\
> +		.width = 0,				\
> +	}
> +
>  /*
>   * Describe the early feature override to the core override code:
>   *
> @@ -911,6 +935,7 @@ static inline unsigned int get_vmid_bits(u64 mmfr1)
>  	return 8;
>  }
>  
> +s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new, s64 cur);
>  struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id);
>  
>  extern struct arm64_ftr_override id_aa64mmfr1_override;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 102860ba896d..aa83dd79e7ff 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -1013,7 +1013,7 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>  long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>  				struct kvm_arm_copy_mte_tags *copy_tags);
>  
> -void kvm_arm_set_default_id_regs(struct kvm *kvm);
> +void kvm_arm_init_id_regs(struct kvm *kvm);
>  
>  /* Guest/host FPSIMD coordination helpers */
>  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 23bd2a926b74..e18848ee4b98 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -139,30 +139,6 @@ void dump_cpu_features(void)
>  	pr_emerg("0x%*pb\n", ARM64_NCAPS, &cpu_hwcaps);
>  }
>  
> -#define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> -	{						\
> -		.sign = SIGNED,				\
> -		.visible = VISIBLE,			\
> -		.strict = STRICT,			\
> -		.type = TYPE,				\
> -		.shift = SHIFT,				\
> -		.width = WIDTH,				\
> -		.safe_val = SAFE_VAL,			\
> -	}
> -
> -/* Define a feature with unsigned values */
> -#define ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> -	__ARM64_FTR_BITS(FTR_UNSIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
> -
> -/* Define a feature with a signed value */
> -#define S_ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> -	__ARM64_FTR_BITS(FTR_SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
> -
> -#define ARM64_FTR_END					\
> -	{						\
> -		.width = 0,				\
> -	}
> -
>  static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap);
>  
>  static bool __system_matches_cap(unsigned int n);
> @@ -790,7 +766,7 @@ static u64 arm64_ftr_set_value(const struct arm64_ftr_bits *ftrp, s64 reg,
>  	return reg;
>  }
>  
> -static s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
> +s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
>  				s64 cur)
>  {
>  	s64 ret = 0;
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fb2de2cb98cb..e539d9ca9d01 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -135,7 +135,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	/* The maximum number of VCPUs is limited by the host's GIC model */
>  	kvm->max_vcpus = kvm_arm_default_max_vcpus();
>  
> -	kvm_arm_set_default_id_regs(kvm);
> +	kvm_arm_init_id_regs(kvm);

How about picking the name once and for all from the first patch?

>  	kvm_arm_init_hypercalls(kvm);
>  
>  	return 0;
> diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> index 9956c99d20f7..726b810b6e06 100644
> --- a/arch/arm64/kvm/id_regs.c
> +++ b/arch/arm64/kvm/id_regs.c
> @@ -18,10 +18,88 @@
>  
>  #include "sys_regs.h"
>  
> +/*
> + * Number of entries in id_reg_desc's ftr_bits[] (Number of 4 bits fields
> + * in 64 bit register + 1 entry for a terminator entry).
> + */
> +#define	FTR_FIELDS_NUM	17

Please see SMFR0_EL1 for an example of a sysreg that doesn't follow
the 4bits-per-field format. I expect to see more of those in the
future.

And given that this is always a variable set of fields, why do we need
to define this as a fixed array that only bloats the structure? I'd
rather see a variable array in a side structure.
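
For example (sketch, field name invented):

	struct id_reg_desc {
		const struct sys_reg_desc	reg_desc;
		/* ARM64_FTR_END-terminated; NULL when nothing is writable */
		const struct arm64_ftr_bits	*ftr_bits;
	};

so that registers with no writable fields don't each carry 17 embedded
entries.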

> +
>  struct id_reg_desc {
>  	const struct sys_reg_desc	reg_desc;
> +	/*
> +	 * KVM sanitised ID register value.
> +	 * It is the default value for per VM emulated ID register.
> +	 */
> +	u64 kvm_sys_val;
> +	/*
> +	 * Used to validate the ID register values with arm64_check_features().
> +	 * The last item in the array must be terminated by an item whose
> +	 * width field is zero as that is expected by arm64_check_features().
> +	 * Only feature bits defined in this array are writable.
> +	 */
> +	struct arm64_ftr_bits	ftr_bits[FTR_FIELDS_NUM];
> +
> +	/*
> +	 * Basically init() is used to setup the KVM sanitised value
> +	 * stored in kvm_sys_val.
> +	 */
> +	void (*init)(struct id_reg_desc *idr);

Given that this callback only builds the value from the sanitised
view, and that it is very cheap (only a handful of masking
operations), why do we bother keeping the value around? Dropping it
would also allow this structure to be kept *const*, something that is
extremely desirable.

Also, why do we need an init() method when each sysreg already has a
reset() method? Surely this should be the same thing...

My gut feeling is that we should only have a callback returning the
limit value computed on the fly.
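
i.e. something like (sketch, name invented):

	u64 (*get_limit)(const struct id_reg_desc *idr);

recomputing the limit from read_sanitised_ftr_reg() on each call,
which would let id_reg_descs[] stay const.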

>  };
>  
> +static struct id_reg_desc id_reg_descs[];
> +
> +/**
> + * arm64_check_features() - Check if a feature register value constitutes
> + * a subset of features indicated by @limit.
> + *
> + * @ftrp: Pointer to an array of arm64_ftr_bits. It must be terminated by
> + * an item whose width field is zero.
> + * @val: The feature register value to check
> + * @limit: The limit value of the feature register
> + *
> + * This function will check if each feature field of @val is the "safe" value
> + * against @limit based on @ftrp[], each of which specifies the target field
> + * (shift, width), whether or not the field is for a signed value (sign),
> + * how the field is determined to be "safe" (type), and the safe value
> + * (safe_val) when type == FTR_EXACT (safe_val won't be used by this
> + * function when type != FTR_EXACT). Any other fields in arm64_ftr_bits
> + * won't be used by this function. If a field value in @val is the same
> + * as the one in @limit, it is always considered the safe value regardless
> + * of the type. For register fields that are not in @ftrp[], only the value
> + * in @limit is considered the safe value.
> + *
> + * Return: 0 if all the fields are safe. Otherwise, return negative errno.
> + */
> +static int arm64_check_features(const struct arm64_ftr_bits *ftrp, u64 val, u64 limit)
> +{
> +	u64 mask = 0;
> +
> +	for (; ftrp->width; ftrp++) {
> +		s64 f_val, f_lim, safe_val;
> +
> +		f_val = arm64_ftr_value(ftrp, val);
> +		f_lim = arm64_ftr_value(ftrp, limit);
> +		mask |= arm64_ftr_mask(ftrp);
> +
> +		if (f_val == f_lim)
> +			safe_val = f_val;
> +		else
> +			safe_val = arm64_ftr_safe_value(ftrp, f_val, f_lim);
> +
> +		if (safe_val != f_val)
> +			return -E2BIG;
> +	}
> +
> +	/*
> +	 * For fields that are not indicated in ftrp, values in limit are the
> +	 * safe values.
> +	 */
> +	if ((val & ~mask) != (limit & ~mask))
> +		return -E2BIG;
> +
> +	return 0;
> +}

I have the feeling that the core code already implements something
similar...

> +
>  static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
>  {
>  	if (kvm_vcpu_has_pmu(vcpu))
> @@ -67,7 +145,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
>  	case SYS_ID_AA64PFR0_EL1:
>  		if (!vcpu_has_sve(vcpu))
>  			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
> -		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
>  		if (kvm_vgic_global_state.type == VGIC_V3) {
>  			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
>  			val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
> @@ -94,15 +171,10 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
>  			val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
>  		break;
>  	case SYS_ID_AA64DFR0_EL1:
> -		/* Limit debug to ARMv8.0 */
> -		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
> -		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
>  		/* Set PMUver to the required version */
>  		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
>  		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
>  				  vcpu_pmuver(vcpu));
> -		/* Hide SPE from guests */
> -		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);
>  		break;
>  	case SYS_ID_DFR0_EL1:
>  		val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> @@ -161,9 +233,15 @@ static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
>  		      u64 val)
>  {
> -	/* This is what we mean by invariant: you can't change it. */
> -	if (val != read_id_reg(vcpu, rd))
> -		return -EINVAL;
> +	int ret;
> +	int id = reg_to_encoding(rd);
> +
> +	ret = arm64_check_features(id_reg_descs[IDREG_IDX(id)].ftr_bits, val,
> +				   id_reg_descs[IDREG_IDX(id)].kvm_sys_val);
> +	if (ret)
> +		return ret;
> +
> +	vcpu->kvm->arch.id_regs[IDREG_IDX(id)] = val;
>  
>  	return 0;
>  }
> @@ -197,12 +275,47 @@ static unsigned int aa32_id_visibility(const struct kvm_vcpu *vcpu,
>  	return id_visibility(vcpu, r);
>  }
>  
> +static void init_id_reg(struct id_reg_desc *idr)
> +{
> +	idr->kvm_sys_val = read_sanitised_ftr_reg(reg_to_encoding(&idr->reg_desc));
> +}
> +
> +static void init_id_aa64pfr0_el1(struct id_reg_desc *idr)
> +{
> +	u64 val;
> +	u32 id = reg_to_encoding(&idr->reg_desc);
> +
> +	val = read_sanitised_ftr_reg(id);
> +	/*
> +	 * The default is to expose CSV2 == 1 if the HW isn't affected.
> +	 * Although this is a per-CPU feature, we make it global because
> +	 * asymmetric systems are just a nuisance.
> +	 *
> +	 * Userspace can override this as long as it doesn't promise
> +	 * the impossible.
> +	 */
> +	if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
> +		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> +		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
> +	}
> +	if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
> +		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> +		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
> +	}
> +
> +	val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
> +
> +	val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
> +	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);

What? Why? What if I have a GICv2? What if I have no GIC?

> +
> +	idr->kvm_sys_val = val;
> +}

How does this compose with the runtime feature reduction that takes
place in access_nested_id_reg()?

> +
>  static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
>  			       const struct sys_reg_desc *rd,
>  			       u64 val)
>  {
>  	u8 csv2, csv3;
> -	u64 sval = val;
>  
>  	/*
>  	 * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
> @@ -220,16 +333,29 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
>  	    (csv3 && arm64_get_meltdown_state() != SPECTRE_UNAFFECTED))
>  		return -EINVAL;
>  
> -	/* We can only differ with CSV[23], and anything else is an error */
> -	val ^= read_id_reg(vcpu, rd);
> -	val &= ~(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
> -		 ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
> -	if (val)
> -		return -EINVAL;
> +	return set_id_reg(vcpu, rd, val);
> +}
>  
> -	vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;
> +static void init_id_aa64dfr0_el1(struct id_reg_desc *idr)
> +{
> +	u64 val;
> +	u32 id = reg_to_encoding(&idr->reg_desc);
>  
> -	return 0;
> +	val = read_sanitised_ftr_reg(id);
> +	/* Limit debug to ARMv8.0 */
> +	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
> +	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
> +	/*
> +	 * Initialise the default PMUver before there is a chance to
> +	 * create an actual PMU.
> +	 */
> +	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> +	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
> +			  kvm_arm_pmu_get_pmuver_limit());
> +	/* Hide SPE from guests */
> +	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);
> +
> +	idr->kvm_sys_val = val;
>  }
>  
>  static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
> @@ -238,6 +364,7 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
>  {
>  	u8 pmuver, host_pmuver;
>  	bool valid_pmu;
> +	int ret;
>  
>  	host_pmuver = kvm_arm_pmu_get_pmuver_limit();
>  
> @@ -257,39 +384,58 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
>  	if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
>  		return -EINVAL;
>  
> -	/* We can only differ with PMUver, and anything else is an error */
> -	val ^= read_id_reg(vcpu, rd);
> -	val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> -	if (val)
> -		return -EINVAL;
> -
>  	if (valid_pmu) {
>  		mutex_lock(&vcpu->kvm->lock);
> -		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> -			~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> -		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
> -			FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), pmuver);
> +		ret = set_id_reg(vcpu, rd, val);
> +		if (ret)
> +			return ret;

Next stop, Deadlock City, our final destination: the error path above
returns with vcpu->kvm->lock still held.

>  
>  		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
>  			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
>  		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
>  				ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), pmuver_to_perfmon(pmuver));
>  		mutex_unlock(&vcpu->kvm->lock);
> -	} else if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) {
> -		set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
>  	} else {
> -		clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +		/* We can only differ with PMUver, and anything else is an error */
> +		val ^= read_id_reg(vcpu, rd);
> +		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> +		if (val)
> +			return -EINVAL;

I find it very odd that you add all this infrastructure to check for
writable fields, and yet have to keep this comparison. It makes me
think that the data structures are not necessarily the right ones.

> +
> +		if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
> +			set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +		else
> +			clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +
>  	}
>  
>  	return 0;
>  }
>  
> +static void init_id_dfr0_el1(struct id_reg_desc *idr)
> +{
> +	u64 val;
> +	u32 id = reg_to_encoding(&idr->reg_desc);
> +
> +	val = read_sanitised_ftr_reg(id);
> +	/*
> +	 * Initialise the default PMUver before there is a chance to
> +	 * create an actual PMU.
> +	 */
> +	val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> +	val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon),
> +			  kvm_arm_pmu_get_pmuver_limit());
> +
> +	idr->kvm_sys_val = val;
> +}
> +
>  static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>  			   const struct sys_reg_desc *rd,
>  			   u64 val)
>  {
>  	u8 perfmon, host_perfmon;
>  	bool valid_pmu;
> +	int ret;
>  
>  	host_perfmon = pmuver_to_perfmon(kvm_arm_pmu_get_pmuver_limit());
>  
> @@ -310,42 +456,46 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>  	if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
>  		return -EINVAL;
>  
> -	/* We can only differ with PerfMon, and anything else is an error */
> -	val ^= read_id_reg(vcpu, rd);
> -	val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> -	if (val)
> -		return -EINVAL;
> -
>  	if (valid_pmu) {
>  		mutex_lock(&vcpu->kvm->lock);
> -		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
> -			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> -		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
> -			ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), perfmon);
> +		ret = set_id_reg(vcpu, rd, val);
> +		if (ret)
> +			return ret;

Same player, shoot again.

>  
>  		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
>  			~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
>  		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |= FIELD_PREP(
>  			ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), perfmon_to_pmuver(perfmon));
>  		mutex_unlock(&vcpu->kvm->lock);
> -	} else if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF) {
> -		set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
>  	} else {
> -		clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +		/* We can only differ with PerfMon, and anything else is an error */
> +		val ^= read_id_reg(vcpu, rd);
> +		val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> +		if (val)
> +			return -EINVAL;
> +
> +		if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF)
> +			set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> +		else
> +			clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
>  	}

Same remarks.

>  
>  	return 0;
>  }
>  
>  /* sys_reg_desc initialiser for known cpufeature ID registers */
> +#define SYS_DESC_SANITISED(name) {			\
> +	SYS_DESC(SYS_##name),				\
> +	.access	= access_id_reg,			\
> +	.get_user = get_id_reg,				\
> +	.set_user = set_id_reg,				\
> +	.visibility = id_visibility,			\
> +}
> +
>  #define ID_SANITISED(name) {				\
> -	.reg_desc = {					\
> -		SYS_DESC(SYS_##name),			\
> -		.access	= access_id_reg,		\
> -		.get_user = get_id_reg,			\
> -		.set_user = set_id_reg,			\
> -		.visibility = id_visibility,		\
> -	},						\
> +	.reg_desc = SYS_DESC_SANITISED(name),		\
> +	.ftr_bits = { ARM64_FTR_END, },			\
> +	.init = init_id_reg,				\
>  }
>  
>  /* sys_reg_desc initialiser for known cpufeature ID registers */
> @@ -357,6 +507,8 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>  		.set_user = set_id_reg,			\
>  		.visibility = aa32_id_visibility,	\
>  	},						\
> +	.ftr_bits = { ARM64_FTR_END, },			\
> +	.init = init_id_reg,				\
>  }
>  
>  /*
> @@ -372,6 +524,7 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>  		.set_user = set_id_reg,				\
>  		.visibility = raz_visibility			\
>  	},							\
> +	.ftr_bits = { ARM64_FTR_END, },				\
>  }
>  
>  /*
> @@ -387,9 +540,10 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
>  		.set_user = set_id_reg,			\
>  		.visibility = raz_visibility,		\
>  	},						\
> +	.ftr_bits = { ARM64_FTR_END, },			\
>  }
>  
> -static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
> +static struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
>  	/*
>  	 * ID regs: all ID_SANITISED() entries here must have corresponding
>  	 * entries in arm64_ftr_regs[].
> @@ -405,6 +559,11 @@ static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
>  		.get_user = get_id_reg,
>  		.set_user = set_id_dfr0_el1,
>  		.visibility = aa32_id_visibility, },
> +	  .ftr_bits = {
> +		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
> +			ID_DFR0_EL1_PerfMon_SHIFT, ID_DFR0_EL1_PerfMon_WIDTH, 0),
> +		ARM64_FTR_END, },
> +	  .init = init_id_dfr0_el1,
>  	},
>  	ID_HIDDEN(ID_AFR0_EL1),
>  	AA32_ID_SANITISED(ID_MMFR0_EL1),
> @@ -439,6 +598,13 @@ static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
>  		.access = access_id_reg,
>  		.get_user = get_id_reg,
>  		.set_user = set_id_aa64pfr0_el1, },
> +	  .ftr_bits = {
> +		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
> +			ID_AA64PFR0_EL1_CSV2_SHIFT, ID_AA64PFR0_EL1_CSV2_WIDTH, 0),
> +		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
> +			ID_AA64PFR0_EL1_CSV3_SHIFT, ID_AA64PFR0_EL1_CSV3_WIDTH, 0),
> +		ARM64_FTR_END, },

It really strikes me that you are duplicating data that already lives
in ftr_id_aa64pfr0[], only as a subset of the existing data.

You could instead have your 'init()' callback return a pair of values:
the default value based on the sanitised one, and a 64bit mask. At
this stage, you'll realise that this looks a lot like the feature
override, and that you should be able to reuse some of the existing
infrastructure.
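
Roughly (sketch, signature invented just to illustrate the shape):

	u64 (*init)(const struct id_reg_desc *idr, u64 *writable_mask);

returning the default value derived from the sanitised register, with
*writable_mask describing the fields userspace is allowed to change.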

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3]
  2023-03-17  5:06 ` [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3] Jing Zhang
  2023-03-27 10:31   ` Marc Zyngier
@ 2023-03-28 12:39   ` Fuad Tabba
  2023-03-28 20:01     ` Jing Zhang
  1 sibling, 1 reply; 25+ messages in thread
From: Fuad Tabba @ 2023-03-28 12:39 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton, Will Deacon,
	Paolo Bonzini, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi,

On Fri, Mar 17, 2023 at 5:06 AM Jing Zhang <jingzhangos@google.com> wrote:
>
> With per guest ID registers, ID_AA64PFR0_EL1.[CSV2|CSV3] settings from
> userspace can be stored in its corresponding ID register.
>
> No functional change intended.
>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h  |  2 --
>  arch/arm64/kvm/arm.c               | 19 +------------------
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c |  7 +++----
>  arch/arm64/kvm/id_regs.c           | 30 ++++++++++++++++++++++--------
>  4 files changed, 26 insertions(+), 32 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index fb6b50b1f111..e926ea91a73c 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -230,8 +230,6 @@ struct kvm_arch {
>
>         cpumask_var_t supported_cpus;
>
> -       u8 pfr0_csv2;
> -       u8 pfr0_csv3;
>         struct {
>                 u8 imp:4;
>                 u8 unimp:4;
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 4579c878ab30..c78d68d011cb 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -104,22 +104,6 @@ static int kvm_arm_default_max_vcpus(void)
>         return vgic_present ? kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
>  }
>
> -static void set_default_spectre(struct kvm *kvm)
> -{
> -       /*
> -        * The default is to expose CSV2 == 1 if the HW isn't affected.
> -        * Although this is a per-CPU feature, we make it global because
> -        * asymmetric systems are just a nuisance.
> -        *
> -        * Userspace can override this as long as it doesn't promise
> -        * the impossible.
> -        */
> -       if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED)
> -               kvm->arch.pfr0_csv2 = 1;
> -       if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED)
> -               kvm->arch.pfr0_csv3 = 1;
> -}
> -
>  /**
>   * kvm_arch_init_vm - initializes a VM data structure
>   * @kvm:       pointer to the KVM struct
> @@ -151,9 +135,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>         /* The maximum number of VCPUs is limited by the host's GIC model */
>         kvm->max_vcpus = kvm_arm_default_max_vcpus();
>
> -       set_default_spectre(kvm);
> -       kvm_arm_init_hypercalls(kvm);
>         kvm_arm_set_default_id_regs(kvm);
> +       kvm_arm_init_hypercalls(kvm);
>
>         /*
>          * Initialise the default PMUver before there is a chance to
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> index 08d2b004f4b7..0e1988740a65 100644
> --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -93,10 +93,9 @@ static u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
>                 PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
>
>         /* Spectre and Meltdown mitigation in KVM */
> -       set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> -                              (u64)kvm->arch.pfr0_csv2);
> -       set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> -                              (u64)kvm->arch.pfr0_csv3);
> +       set_mask |= vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] &
> +               (ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
> +                       ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));

This now triggers a compiler warning, since the variable `struct kvm
*kvm` isn't used anymore; this, however, isn't the main issue.

The main issue is that `struct kvm` here (vcpu->kvm) is the
hypervisor's version for protected VMs, and not the host's. Therefore,
reading that value is wrong. That said, this is an existing bug in
pKVM, since kvm->arch.pfr0_csv2 and kvm->arch.pfr0_csv3 are not
initialized.

The solution would be to track the spectre/meltdown state at hyp and
use that. I'll submit a patch that does that. In the meantime, I think
it would be better not to set the CSV bits for protected VMs, which is
the current behavior in practice.

Non-protected VMs in protected mode go back to the host on id register
traps, and use the host's `struct kvm`, so they should behave as
expected.

Thanks,
/fuad


>
>         return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
>  }
> diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> index e393b5730557..b60ca1058301 100644
> --- a/arch/arm64/kvm/id_regs.c
> +++ b/arch/arm64/kvm/id_regs.c
> @@ -61,12 +61,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
>                 if (!vcpu_has_sve(vcpu))
>                         val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
>                 val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
> -               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> -               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> -                                 (u64)vcpu->kvm->arch.pfr0_csv2);
> -               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> -               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> -                                 (u64)vcpu->kvm->arch.pfr0_csv3);
>                 if (kvm_vgic_global_state.type == VGIC_V3) {
>                         val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
>                         val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
> @@ -201,6 +195,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
>                                u64 val)
>  {
>         u8 csv2, csv3;
> +       u64 sval = val;
>
>         /*
>          * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
> @@ -225,8 +220,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
>         if (val)
>                 return -EINVAL;
>
> -       vcpu->kvm->arch.pfr0_csv2 = csv2;
> -       vcpu->kvm->arch.pfr0_csv3 = csv3;
> +       vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;
>
>         return 0;
>  }
> @@ -529,4 +523,24 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
>                 val = read_sanitised_ftr_reg(id);
>                 kvm->arch.id_regs[IDREG_IDX(id)] = val;
>         }
> +       /*
> +        * The default is to expose CSV2 == 1 if the HW isn't affected.
> +        * Although this is a per-CPU feature, we make it global because
> +        * asymmetric systems are just a nuisance.
> +        *
> +        * Userspace can override this as long as it doesn't promise
> +        * the impossible.
> +        */
> +       val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)];
> +
> +       if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
> +               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> +               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
> +       }
> +       if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
> +               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> +               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
> +       }
> +
> +       kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
>  }
> --
> 2.40.0.rc1.284.g88254d51c5-goog
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 1/6] KVM: arm64: Move CPU ID feature registers emulation into a separate file
  2023-03-27 10:14   ` Marc Zyngier
@ 2023-03-28 17:16     ` Jing Zhang
  0 siblings, 0 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-28 17:16 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi Marc,

On Mon, Mar 27, 2023 at 3:14 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 17 Mar 2023 05:06:32 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > Create a new file id_regs.c for CPU ID feature registers emulation code,
> > which are moved from sys_regs.c and tweak sys_regs code accordingly.
> >
> > No functional change intended.
> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  arch/arm64/kvm/Makefile   |   2 +-
> >  arch/arm64/kvm/id_regs.c  | 506 ++++++++++++++++++++++++++++++++++++++
> >  arch/arm64/kvm/sys_regs.c | 464 ++--------------------------------
> >  arch/arm64/kvm/sys_regs.h |  41 +++
> >  4 files changed, 575 insertions(+), 438 deletions(-)
> >  create mode 100644 arch/arm64/kvm/id_regs.c
> >
> > diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> > index c0c050e53157..a6a315fcd81e 100644
> > --- a/arch/arm64/kvm/Makefile
> > +++ b/arch/arm64/kvm/Makefile
> > @@ -13,7 +13,7 @@ obj-$(CONFIG_KVM) += hyp/
> >  kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
> >        inject_fault.o va_layout.o handle_exit.o \
> >        guest.o debug.o reset.o sys_regs.o stacktrace.o \
> > -      vgic-sys-reg-v3.o fpsimd.o pkvm.o \
> > +      vgic-sys-reg-v3.o fpsimd.o pkvm.o id_regs.o \
> >        arch_timer.o trng.o vmid.o emulate-nested.o nested.o \
> >        vgic/vgic.o vgic/vgic-init.o \
> >        vgic/vgic-irqfd.o vgic/vgic-v2.o \
> > diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> > new file mode 100644
> > index 000000000000..08b738852955
> > --- /dev/null
> > +++ b/arch/arm64/kvm/id_regs.c
>
> [...]
>
> > +/**
> > + * emulate_id_reg - Emulate a guest access to an AArch64 CPU ID feature register
> > + * @vcpu: The VCPU pointer
> > + * @params: Decoded system register parameters
> > + *
> > + * Return: true if the ID register access was successful, false otherwise.
> > + */
> > +int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params)
> > +{
> > +     const struct sys_reg_desc *r;
> > +
> > +     r = find_reg(params, id_reg_descs, ARRAY_SIZE(id_reg_descs));
> > +
> > +     if (likely(r)) {
> > +             perform_access(vcpu, params, r);
> > +     } else {
> > +             print_sys_reg_msg(params,
> > +                               "Unsupported guest id_reg access at: %lx [%08lx]\n",
> > +                               *vcpu_pc(vcpu), *vcpu_cpsr(vcpu));
> > +             kvm_inject_undefined(vcpu);
> > +     }
> > +
> > +     return 1;
> > +}
> > +
> > +
> > +void kvm_arm_reset_id_regs(struct kvm_vcpu *vcpu)
> > +{
> > +     unsigned long i;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++)
> > +             if (id_reg_descs[i].reset)
> > +                     id_reg_descs[i].reset(vcpu, &id_reg_descs[i]);
> > +}
>
> What does this mean? None of the idregs have a reset function, given
> that they are global. Maybe this will make sense in the following
> patches, but definitely not here.
>
You are right. It actually does nothing, since none of the idregs has
a reset function. Will remove this.
> > +
> > +int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > +     return kvm_sys_reg_get_user(vcpu, reg,
> > +                                 id_reg_descs, ARRAY_SIZE(id_reg_descs));
> > +}
> > +
> > +int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > +     return kvm_sys_reg_set_user(vcpu, reg,
> > +                                 id_reg_descs, ARRAY_SIZE(id_reg_descs));
> > +}
> > +
> > +bool kvm_arm_check_idreg_table(void)
> > +{
> > +     return check_sysreg_table(id_reg_descs, ARRAY_SIZE(id_reg_descs), false);
> > +}
>
> All these helpers are called from sys_regs.c and directly call back
> into it. Why not simply have a helper that gets the base and size of
> the array, and stick to pure common code?
>
As you know from the later patches in this series, a per-VM idregs
array and an idregs-specific structure are used. All these functions
would be implemented on top of them.
> > +
> > +int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
> > +{
> > +     const struct sys_reg_desc *i2, *end2;
> > +     unsigned int total = 0;
> > +     int err;
> > +
> > +     i2 = id_reg_descs;
> > +     end2 = id_reg_descs + ARRAY_SIZE(id_reg_descs);
> > +
> > +     while (i2 != end2) {
> > +             err = walk_one_sys_reg(vcpu, i2++, &uind, &total);
> > +             if (err)
> > +                     return err;
> > +     }
> > +     return total;
> > +}
>
> This is an exact copy of walk_sys_regs. Surely this can be made common
> code.
The reason for not using common code is the same as in my last
comment: an idregs-specific data structure would be used.
>
> [...]
>
> > @@ -2912,6 +2482,8 @@ void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
> >  {
> >       unsigned long i;
> >
> > +     kvm_arm_reset_id_regs(vcpu);
> > +
> >       for (i = 0; i < ARRAY_SIZE(sys_reg_descs); i++)
> >               if (sys_reg_descs[i].reset)
> >                       sys_reg_descs[i].reset(vcpu, &sys_reg_descs[i]);
> > @@ -2932,6 +2504,9 @@ int kvm_handle_sys_reg(struct kvm_vcpu *vcpu)
> >       params = esr_sys64_to_params(esr);
> >       params.regval = vcpu_get_reg(vcpu, Rt);
> >
> > +     if (is_id_reg(reg_to_encoding(&params)))
> > +             return emulate_id_reg(vcpu, &params);
> > +
> >       if (!emulate_sys_reg(vcpu, &params))
> >               return 1;
> >
> > @@ -3160,6 +2735,10 @@ int kvm_arm_sys_reg_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> >       if (err != -ENOENT)
> >               return err;
> >
> > +     err = kvm_arm_get_id_reg(vcpu, reg);
>
> Why not check for the encoding here, or in the helpers? It feels like
> this is overhead that would be easy to reduce, given that we have
> fewer idregs than normal sysregs.
Sure, will move the encoding check here.
>
> > +     if (err != -ENOENT)
> > +             return err;
> > +
> >       return kvm_sys_reg_get_user(vcpu, reg,
> >                                   sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> >  }
> > @@ -3204,6 +2783,10 @@ int kvm_arm_sys_reg_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg
> >       if (err != -ENOENT)
> >               return err;
> >
> > +     err = kvm_arm_set_id_reg(vcpu, reg);
>
> Same here.
Agreed.
>
> > +     if (err != -ENOENT)
> > +             return err;
> > +
> >       return kvm_sys_reg_set_user(vcpu, reg,
> >                                   sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
> >  }
> > @@ -3250,10 +2833,10 @@ static bool copy_reg_to_user(const struct sys_reg_desc *reg, u64 __user **uind)
> >       return true;
> >  }
> >
> > -static int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
> > -                         const struct sys_reg_desc *rd,
> > -                         u64 __user **uind,
> > -                         unsigned int *total)
> > +int walk_one_sys_reg(const struct kvm_vcpu *vcpu,
> > +                  const struct sys_reg_desc *rd,
> > +                  u64 __user **uind,
> > +                  unsigned int *total)
> >  {
> >       /*
> >        * Ignore registers we trap but don't save,
> > @@ -3294,6 +2877,7 @@ unsigned long kvm_arm_num_sys_reg_descs(struct kvm_vcpu *vcpu)
> >  {
> >       return ARRAY_SIZE(invariant_sys_regs)
> >               + num_demux_regs()
> > +             + kvm_arm_walk_id_regs(vcpu, (u64 __user *)NULL)
> >               + walk_sys_regs(vcpu, (u64 __user *)NULL);
> >  }
> >
> > @@ -3309,6 +2893,11 @@ int kvm_arm_copy_sys_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
> >               uindices++;
> >       }
> >
> > +     err = kvm_arm_walk_id_regs(vcpu, uindices);
> > +     if (err < 0)
> > +             return err;
> > +     uindices += err;
> > +
> >       err = walk_sys_regs(vcpu, uindices);
> >       if (err < 0)
> >               return err;
> > @@ -3323,6 +2912,7 @@ int __init kvm_sys_reg_table_init(void)
> >       unsigned int i;
> >
> >       /* Make sure tables are unique and in order. */
> > +     valid &= kvm_arm_check_idreg_table();
> >       valid &= check_sysreg_table(sys_reg_descs, ARRAY_SIZE(sys_reg_descs), false);
> >       valid &= check_sysreg_table(cp14_regs, ARRAY_SIZE(cp14_regs), true);
> >       valid &= check_sysreg_table(cp14_64_regs, ARRAY_SIZE(cp14_64_regs), true);
> > diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> > index 6b11f2cc7146..ad41305348f7 100644
> > --- a/arch/arm64/kvm/sys_regs.h
> > +++ b/arch/arm64/kvm/sys_regs.h
> > @@ -210,6 +210,35 @@ find_reg(const struct sys_reg_params *params, const struct sys_reg_desc table[],
> >       return __inline_bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
> >  }
> >
> > +static inline unsigned int raz_visibility(const struct kvm_vcpu *vcpu,
> > +                                       const struct sys_reg_desc *r)
> > +{
> > +     return REG_RAZ;
> > +}
>
> No, please. This is used as a function pointer. You now potentially
> force the compiler to emit as many copies of this as there are pointers.
>
Thanks, will fix this.
> > +
> > +static inline bool write_to_read_only(struct kvm_vcpu *vcpu,
> > +                                   struct sys_reg_params *params,
> > +                                   const struct sys_reg_desc *r)
> > +{
> > +     WARN_ONCE(1, "Unexpected sys_reg write to read-only register\n");
> > +     print_sys_reg_instr(params);
> > +     kvm_inject_undefined(vcpu);
> > +     return false;
> > +}
>
> Please make this common code, and not an inline function.
Sure, will do.
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.
Thanks,
Jing

* Re: [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest
  2023-03-27 10:15   ` Marc Zyngier
@ 2023-03-28 17:36     ` Jing Zhang
  2023-03-28 19:22       ` Marc Zyngier
  2023-03-29 16:26       ` Reiji Watanabe
  0 siblings, 2 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-28 17:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi Marc,

On Mon, Mar 27, 2023 at 3:15 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 17 Mar 2023 05:06:33 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > From: Reiji Watanabe <reijiw@google.com>
> >
> > Introduce id_regs[] in kvm_arch as a storage of guest's ID registers,
> > and save ID registers' sanitized value in the array at KVM_CREATE_VM.
> > Use the saved ones when ID registers are read by the guest or
> > userspace (via KVM_GET_ONE_REG).
> >
> > No functional change intended.
> >
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > Co-developed-by: Jing Zhang <jingzhangos@google.com>
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 11 ++++++++
> >  arch/arm64/kvm/arm.c              |  1 +
> >  arch/arm64/kvm/id_regs.c          | 44 ++++++++++++++++++++++++-------
> >  arch/arm64/kvm/sys_regs.c         |  2 +-
> >  arch/arm64/kvm/sys_regs.h         |  1 +
> >  5 files changed, 49 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index a1892a8f6032..fb6b50b1f111 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -245,6 +245,15 @@ struct kvm_arch {
> >        * the associated pKVM instance in the hypervisor.
> >        */
> >       struct kvm_protected_vm pkvm;
> > +
> > +     /*
> > +      * Save ID registers for the guest in id_regs[].
> > +      * (Op0, Op1, CRn, CRm, Op2) of the ID registers to be saved in it
> > +      * is (3, 0, 0, crm, op2), where 1<=crm<8, 0<=op2<8.
> > +      */
> > +#define KVM_ARM_ID_REG_NUM   56
> > +#define IDREG_IDX(id)                (((sys_reg_CRm(id) - 1) << 3) | sys_reg_Op2(id))
> > +     u64 id_regs[KVM_ARM_ID_REG_NUM];
>
> Place these registers in their own structure, and place this structure
> *before* the pvm structure. Document what guards these registers when
> updated (my hunch is that this should rely on Oliver's locking fixes
> if the update comes from a vcpu).
Sure, I will put them in their own structure and place it before the
pkvm structure.
IIUC, we usually don't need specific locking to update idregs here.
All idregs are 64 bit and can be read/written atomically. The only
case that may need locking is keeping PMUVer in AA64DFR0_EL1 and
PerfMon in DFR0_EL1 consistent with each other. If there is no use
case for two VCPU threads in a VM to update PMUVer and PerfMon
concurrently, then we don't need the locking (taken via the kvm lock
in a later patch) at all. WDYT?
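To make the two cases concrete, here is a rough sketch (illustrative
only, not the exact code from the series; aa64dfr0/dfr0 stand for
values computed by the caller):

	/* Case 1: a single idreg update. An aligned 64-bit store is
	 * atomic on arm64, so no extra locking is needed here. */
	kvm->arch.id_regs[IDREG_IDX(id)] = val;

	/* Case 2: PMUVer and PerfMon must stay consistent, so both
	 * registers are updated together under a lock (a later patch
	 * in this series uses the kvm lock for this). */
	mutex_lock(&kvm->lock);
	kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] = aa64dfr0;
	kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] = dfr0;
	mutex_unlock(&kvm->lock);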
>
> >  };
> >
> >  struct kvm_vcpu_fault_info {
> > @@ -1005,6 +1014,8 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >  long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> >                               struct kvm_arm_copy_mte_tags *copy_tags);
> >
> > +void kvm_arm_set_default_id_regs(struct kvm *kvm);
> > +
> >  /* Guest/host FPSIMD coordination helpers */
> >  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> >  void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 3bd732eaf087..4579c878ab30 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -153,6 +153,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> >
> >       set_default_spectre(kvm);
> >       kvm_arm_init_hypercalls(kvm);
> > +     kvm_arm_set_default_id_regs(kvm);
> >
> >       /*
> >        * Initialise the default PMUver before there is a chance to
> > diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> > index 08b738852955..e393b5730557 100644
> > --- a/arch/arm64/kvm/id_regs.c
> > +++ b/arch/arm64/kvm/id_regs.c
> > @@ -52,16 +52,9 @@ static u8 pmuver_to_perfmon(u8 pmuver)
> >       }
> >  }
> >
> > -/* Read a sanitised cpufeature ID register by sys_reg_desc */
> > -static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
> > +u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
> >  {
> > -     u32 id = reg_to_encoding(r);
> > -     u64 val;
> > -
> > -     if (sysreg_visible_as_raz(vcpu, r))
> > -             return 0;
> > -
> > -     val = read_sanitised_ftr_reg(id);
> > +     u64 val = vcpu->kvm->arch.id_regs[IDREG_IDX(id)];
> >
> >       switch (id) {
> >       case SYS_ID_AA64PFR0_EL1:
> > @@ -126,6 +119,14 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r
> >       return val;
> >  }
> >
> > +static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
> > +{
> > +     if (sysreg_visible_as_raz(vcpu, r))
> > +             return 0;
> > +
> > +     return kvm_arm_read_id_reg(vcpu, reg_to_encoding(r));
> > +}
> > +
> >  /* cpufeature ID register access trap handlers */
> >
> >  static bool access_id_reg(struct kvm_vcpu *vcpu,
> > @@ -504,3 +505,28 @@ int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
> >       }
> >       return total;
> >  }
> > +
> > +/*
> > + * Set the guest's ID registers that are defined in id_reg_descs[]
> > + * with ID_SANITISED() to the host's sanitized value.
> > + */
> > +void kvm_arm_set_default_id_regs(struct kvm *kvm)
> > +{
> > +     int i;
> > +     u32 id;
> > +     u64 val;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
> > +             id = reg_to_encoding(&id_reg_descs[i]);
> > +             if (WARN_ON_ONCE(!is_id_reg(id)))
> > +                     /* Shouldn't happen */
> > +                     continue;
> > +
> > +             if (id_reg_descs[i].visibility == raz_visibility)
> > +                     /* Hidden or reserved ID register */
> > +                     continue;
>
> Relying on function pointer comparison is really fragile. If I wrap
> raz_visibility() in another function, this won't catch it. It also
> doesn't bode well with your 'inline' definition of this function.
>
> More importantly, why do we care about checking for visibility at all?
> We can happily populate the array and rely on the runtime visibility.
Right. I'll remove this check.
>
> > +
> > +             val = read_sanitised_ftr_reg(id);
> > +             kvm->arch.id_regs[IDREG_IDX(id)] = val;
> > +     }
> > +}
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

Thanks,
Jing

* Re: [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest
  2023-03-28 17:36     ` Jing Zhang
@ 2023-03-28 19:22       ` Marc Zyngier
  2023-03-28 20:05         ` Jing Zhang
  2023-03-29 16:26       ` Reiji Watanabe
  1 sibling, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2023-03-28 19:22 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

On Tue, 28 Mar 2023 18:36:58 +0100,
Jing Zhang <jingzhangos@google.com> wrote:
> 
> Hi Marc,

[...]

> IIUC, we usually don't need specific locking to update idregs here.
> All idregs are 64 bit and can be read/written atomically. The only
> case that may need locking is keeping PMUVer in AA64DFR0_EL1 and
> PerfMon in DFR0_EL1 consistent with each other. If there is no use
> case for two VCPU threads in a VM to update PMUVer and PerfMon
> concurrently, then we don't need the locking (taken via the kvm lock
> in a later patch) at all. WDYT?

I think we generally need locking for any writable id-reg, the goal
being that they will ultimately *all* be writable. As you found out,
there is this need for the PMU fields, and I'm willing to bet that
there will be more of those.

And given that the locking you have used in some of the later patches
violates the locking order (don't worry, you're not alone!), we need
to use something else. Which is where Oliver's series comes into play.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3]
  2023-03-27 10:31   ` Marc Zyngier
@ 2023-03-28 19:54     ` Jing Zhang
  0 siblings, 0 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-28 19:54 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi Marc,

On Mon, Mar 27, 2023 at 3:32 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 17 Mar 2023 05:06:34 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > With per guest ID registers, ID_AA64PFR0_EL1.[CSV2|CSV3] settings from
> > userspace can be stored in its corresponding ID register.
> >
> > No functional change intended.
> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h  |  2 --
> >  arch/arm64/kvm/arm.c               | 19 +------------------
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c |  7 +++----
> >  arch/arm64/kvm/id_regs.c           | 30 ++++++++++++++++++++++--------
> >  4 files changed, 26 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index fb6b50b1f111..e926ea91a73c 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -230,8 +230,6 @@ struct kvm_arch {
> >
> >       cpumask_var_t supported_cpus;
> >
> > -     u8 pfr0_csv2;
> > -     u8 pfr0_csv3;
> >       struct {
> >               u8 imp:4;
> >               u8 unimp:4;
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 4579c878ab30..c78d68d011cb 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -104,22 +104,6 @@ static int kvm_arm_default_max_vcpus(void)
> >       return vgic_present ? kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
> >  }
> >
> > -static void set_default_spectre(struct kvm *kvm)
> > -{
> > -     /*
> > -      * The default is to expose CSV2 == 1 if the HW isn't affected.
> > -      * Although this is a per-CPU feature, we make it global because
> > -      * asymmetric systems are just a nuisance.
> > -      *
> > -      * Userspace can override this as long as it doesn't promise
> > -      * the impossible.
> > -      */
> > -     if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED)
> > -             kvm->arch.pfr0_csv2 = 1;
> > -     if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED)
> > -             kvm->arch.pfr0_csv3 = 1;
> > -}
> > -
> >  /**
> >   * kvm_arch_init_vm - initializes a VM data structure
> >   * @kvm:     pointer to the KVM struct
> > @@ -151,9 +135,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> >       /* The maximum number of VCPUs is limited by the host's GIC model */
> >       kvm->max_vcpus = kvm_arm_default_max_vcpus();
> >
> > -     set_default_spectre(kvm);
> > -     kvm_arm_init_hypercalls(kvm);
> >       kvm_arm_set_default_id_regs(kvm);
> > +     kvm_arm_init_hypercalls(kvm);
>
> Please document the ordering dependency between idregs and hypercalls.
I don't see an ordering dependency here. The reason I put
kvm_arm_set_default_id_regs() before kvm_arm_init_hypercalls() is that
kvm_arm_set_default_id_regs() now includes the code that used to live
in set_default_spectre().
>
> >
> >       /*
> >        * Initialise the default PMUver before there is a chance to
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > index 08d2b004f4b7..0e1988740a65 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -93,10 +93,9 @@ static u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> >               PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> >
> >       /* Spectre and Meltdown mitigation in KVM */
> > -     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> > -                            (u64)kvm->arch.pfr0_csv2);
> > -     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> > -                            (u64)kvm->arch.pfr0_csv3);
> > +     set_mask |= vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] &
>
> This really want an accessor.
>
> > +             (ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
> > +                     ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
> >
> >       return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> >  }
> > diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> > index e393b5730557..b60ca1058301 100644
> > --- a/arch/arm64/kvm/id_regs.c
> > +++ b/arch/arm64/kvm/id_regs.c
> > @@ -61,12 +61,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
> >               if (!vcpu_has_sve(vcpu))
> >                       val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
> >               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
> > -             val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> > -             val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> > -                               (u64)vcpu->kvm->arch.pfr0_csv2);
> > -             val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> > -             val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> > -                               (u64)vcpu->kvm->arch.pfr0_csv3);
> >               if (kvm_vgic_global_state.type == VGIC_V3) {
> >                       val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
> >                       val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
> > @@ -201,6 +195,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> >                              u64 val)
> >  {
> >       u8 csv2, csv3;
> > +     u64 sval = val;
> >
> >       /*
> >        * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
> > @@ -225,8 +220,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> >       if (val)
> >               return -EINVAL;
> >
> > -     vcpu->kvm->arch.pfr0_csv2 = csv2;
> > -     vcpu->kvm->arch.pfr0_csv3 = csv3;
> > +     vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;
>
> Accessor needed here to.
>
> >
> >       return 0;
> >  }
> > @@ -529,4 +523,24 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
> >               val = read_sanitised_ftr_reg(id);
> >               kvm->arch.id_regs[IDREG_IDX(id)] = val;
> >       }
> > +     /*
>
> Add a blank line after the closing bracket.
Will do.
>
> > +      * The default is to expose CSV2 == 1 if the HW isn't affected.
> > +      * Although this is a per-CPU feature, we make it global because
> > +      * asymmetric systems are just a nuisance.
> > +      *
> > +      * Userspace can override this as long as it doesn't promise
> > +      * the impossible.
> > +      */
> > +     val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)];
>
> Accessor.
>
> > +
> > +     if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
> > +             val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> > +             val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
> > +     }
> > +     if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
> > +             val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> > +             val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
> > +     }
> > +
> > +     kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
>
> Accessor.
Sure, I'll use the macro IDREG() as in the last version of the patch series.
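Something along these lines (the exact macro from the earlier version
may differ):

/* Hypothetical accessor shape for the per-VM idreg storage */
#define IDREG(kvm, id)	((kvm)->arch.id_regs[IDREG_IDX(id)])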
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

Thanks,
Jing

* Re: [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3]
  2023-03-28 12:39   ` Fuad Tabba
@ 2023-03-28 20:01     ` Jing Zhang
  2023-03-29  8:23       ` Fuad Tabba
  0 siblings, 1 reply; 25+ messages in thread
From: Jing Zhang @ 2023-03-28 20:01 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton, Will Deacon,
	Paolo Bonzini, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi Fuad,

On Tue, Mar 28, 2023 at 5:40 AM Fuad Tabba <tabba@google.com> wrote:
>
> Hi,
>
> On Fri, Mar 17, 2023 at 5:06 AM Jing Zhang <jingzhangos@google.com> wrote:
> >
> > With per guest ID registers, ID_AA64PFR0_EL1.[CSV2|CSV3] settings from
> > userspace can be stored in its corresponding ID register.
> >
> > No functional change intended.
> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h  |  2 --
> >  arch/arm64/kvm/arm.c               | 19 +------------------
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c |  7 +++----
> >  arch/arm64/kvm/id_regs.c           | 30 ++++++++++++++++++++++--------
> >  4 files changed, 26 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index fb6b50b1f111..e926ea91a73c 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -230,8 +230,6 @@ struct kvm_arch {
> >
> >         cpumask_var_t supported_cpus;
> >
> > -       u8 pfr0_csv2;
> > -       u8 pfr0_csv3;
> >         struct {
> >                 u8 imp:4;
> >                 u8 unimp:4;
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 4579c878ab30..c78d68d011cb 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -104,22 +104,6 @@ static int kvm_arm_default_max_vcpus(void)
> >         return vgic_present ? kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
> >  }
> >
> > -static void set_default_spectre(struct kvm *kvm)
> > -{
> > -       /*
> > -        * The default is to expose CSV2 == 1 if the HW isn't affected.
> > -        * Although this is a per-CPU feature, we make it global because
> > -        * asymmetric systems are just a nuisance.
> > -        *
> > -        * Userspace can override this as long as it doesn't promise
> > -        * the impossible.
> > -        */
> > -       if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED)
> > -               kvm->arch.pfr0_csv2 = 1;
> > -       if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED)
> > -               kvm->arch.pfr0_csv3 = 1;
> > -}
> > -
> >  /**
> >   * kvm_arch_init_vm - initializes a VM data structure
> >   * @kvm:       pointer to the KVM struct
> > @@ -151,9 +135,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> >         /* The maximum number of VCPUs is limited by the host's GIC model */
> >         kvm->max_vcpus = kvm_arm_default_max_vcpus();
> >
> > -       set_default_spectre(kvm);
> > -       kvm_arm_init_hypercalls(kvm);
> >         kvm_arm_set_default_id_regs(kvm);
> > +       kvm_arm_init_hypercalls(kvm);
> >
> >         /*
> >          * Initialise the default PMUver before there is a chance to
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > index 08d2b004f4b7..0e1988740a65 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -93,10 +93,9 @@ static u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> >                 PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> >
> >         /* Spectre and Meltdown mitigation in KVM */
> > -       set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> > -                              (u64)kvm->arch.pfr0_csv2);
> > -       set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> > -                              (u64)kvm->arch.pfr0_csv3);
> > +       set_mask |= vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] &
> > +               (ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
> > +                       ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
>
> This triggers a compiler warning now since the variable `struct kvm
> *kvm` isn't used anymore; this, however, isn't the main issue.
>
> The main issue is that `struct kvm` here (vcpu->kvm) is the
> hypervisor's version for protected VMs, and not the host's. Therefore,
> reading that value is wrong. That said, this is an existing bug in
> pKVM since kvm->arch.pfr0_csv2 and kvm->arch.pfr0_csv3 are not
> initialized.
>
> The solution would be to track the spectre/meltdown state at hyp and
> use that. I'll submit a patch that does that. In the meantime, I think
> that it would be better not to set the CSV bits for protected VMs,
> which is the current behavior in practice.
>
> Non-protected VMs in protected mode go back to the host on id register
> traps, and use the host's `struct kvm`, so they should behave as
> expected.
You mean just remove these two statements:

	/* Spectre and Meltdown mitigation in KVM */
	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
			       (u64)kvm->arch.pfr0_csv2);
	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
			       (u64)kvm->arch.pfr0_csv3);

Will it cause any problem for pKVM without your upcoming patch?
>
> Thanks,
> /fuad
>
>
> >
> >         return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> >  }
> > diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> > index e393b5730557..b60ca1058301 100644
> > --- a/arch/arm64/kvm/id_regs.c
> > +++ b/arch/arm64/kvm/id_regs.c
> > @@ -61,12 +61,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
> >                 if (!vcpu_has_sve(vcpu))
> >                         val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
> >                 val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
> > -               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> > -               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> > -                                 (u64)vcpu->kvm->arch.pfr0_csv2);
> > -               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> > -               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> > -                                 (u64)vcpu->kvm->arch.pfr0_csv3);
> >                 if (kvm_vgic_global_state.type == VGIC_V3) {
> >                         val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
> >                         val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
> > @@ -201,6 +195,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> >                                u64 val)
> >  {
> >         u8 csv2, csv3;
> > +       u64 sval = val;
> >
> >         /*
> >          * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
> > @@ -225,8 +220,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> >         if (val)
> >                 return -EINVAL;
> >
> > -       vcpu->kvm->arch.pfr0_csv2 = csv2;
> > -       vcpu->kvm->arch.pfr0_csv3 = csv3;
> > +       vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;
> >
> >         return 0;
> >  }
> > @@ -529,4 +523,24 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
> >                 val = read_sanitised_ftr_reg(id);
> >                 kvm->arch.id_regs[IDREG_IDX(id)] = val;
> >         }
> > +       /*
> > +        * The default is to expose CSV2 == 1 if the HW isn't affected.
> > +        * Although this is a per-CPU feature, we make it global because
> > +        * asymmetric systems are just a nuisance.
> > +        *
> > +        * Userspace can override this as long as it doesn't promise
> > +        * the impossible.
> > +        */
> > +       val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)];
> > +
> > +       if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
> > +               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> > +               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
> > +       }
> > +       if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
> > +               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> > +               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
> > +       }
> > +
> > +       kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
> >  }
> > --
> > 2.40.0.rc1.284.g88254d51c5-goog
> >
Thanks,
Jing

* Re: [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest
  2023-03-28 19:22       ` Marc Zyngier
@ 2023-03-28 20:05         ` Jing Zhang
  0 siblings, 0 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-28 20:05 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

On Tue, Mar 28, 2023 at 12:22 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 28 Mar 2023 18:36:58 +0100,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > Hi Marc,
>
> [...]
>
> > IIUC, we usually don't need specific locking to update idregs here.
> > All idregs are 64 bit and can be read/written atomically. The only
> > case that may need locking is keeping PMUVer in AA64DFR0_EL1 and
> > PerfMon in DFR0_EL1 consistent with each other. If there is no use
> > case for two VCPU threads in a VM to update PMUVer and PerfMon
> > concurrently, then we don't need the locking (taken via the kvm lock
> > in a later patch) at all. WDYT?
>
> I think we generally need locking for any writable id-reg, the goal
> being that they will ultimately *all* be writable. As you found out,
> there is this need for the PMU fields, and I'm willing to bet that
> there will be more of those.
>
> And given that the locking you have used in some of the later patches
> violates the locking order (don't worry, you're not alone!), we need
> to use something else. Which is where Oliver's series comes into play.
Got it. Thanks for the details. I'll add locking based on Oliver's series.
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.
Thanks,
Jing

* Re: [PATCH v4 4/6] KVM: arm64: Use per guest ID register for ID_AA64DFR0_EL1.PMUVer
  2023-03-27 10:40   ` Marc Zyngier
@ 2023-03-28 20:20     ` Jing Zhang
  0 siblings, 0 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-28 20:20 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi Marc,

On Mon, Mar 27, 2023 at 3:40 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 17 Mar 2023 05:06:35 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > With per guest ID registers, PMUver settings from userspace
> > can be stored in its corresponding ID register.
> >
> > No functional change intended.
> >
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 11 +++---
> >  arch/arm64/kvm/arm.c              |  6 ---
> >  arch/arm64/kvm/id_regs.c          | 61 +++++++++++++++++++++++++------
> >  include/kvm/arm_pmu.h             |  5 ++-
> >  4 files changed, 59 insertions(+), 24 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index e926ea91a73c..102860ba896d 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -218,6 +218,12 @@ struct kvm_arch {
> >  #define KVM_ARCH_FLAG_EL1_32BIT                              4
> >       /* PSCI SYSTEM_SUSPEND enabled for the guest */
> >  #define KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED         5
> > +     /*
> > +      * AA64DFR0_EL1.PMUver was set as ID_AA64DFR0_EL1_PMUVer_IMP_DEF
> > +      * or DFR0_EL1.PerfMon was set as ID_DFR0_EL1_PerfMon_IMPDEF from
> > +      * userspace for VCPUs without PMU.
> > +      */
> > +#define KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU           6
> >
> >       unsigned long flags;
> >
> > @@ -230,11 +236,6 @@ struct kvm_arch {
> >
> >       cpumask_var_t supported_cpus;
> >
> > -     struct {
> > -             u8 imp:4;
> > -             u8 unimp:4;
> > -     } dfr0_pmuver;
> > -
> >       /* Hypercall features firmware registers' descriptor */
> >       struct kvm_smccc_features smccc_feat;
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index c78d68d011cb..fb2de2cb98cb 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -138,12 +138,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> >       kvm_arm_set_default_id_regs(kvm);
> >       kvm_arm_init_hypercalls(kvm);
> >
> > -     /*
> > -      * Initialise the default PMUver before there is a chance to
> > -      * create an actual PMU.
> > -      */
> > -     kvm->arch.dfr0_pmuver.imp = kvm_arm_pmu_get_pmuver_limit();
> > -
> >       return 0;
> >
> >  err_free_cpumask:
> > diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> > index b60ca1058301..3a87a3d2390d 100644
> > --- a/arch/arm64/kvm/id_regs.c
> > +++ b/arch/arm64/kvm/id_regs.c
> > @@ -21,9 +21,12 @@
> >  static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
> >  {
> >       if (kvm_vcpu_has_pmu(vcpu))
> > -             return vcpu->kvm->arch.dfr0_pmuver.imp;
> > -
> > -     return vcpu->kvm->arch.dfr0_pmuver.unimp;
> > +             return FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
> > +                             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)]);
> > +     else if (test_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags))
> > +             return ID_AA64DFR0_EL1_PMUVer_IMP_DEF;
> > +     else
> > +             return 0;
>
> Drop the pointless elses.
Will do.
>
> >  }
> >
> >  static u8 perfmon_to_pmuver(u8 perfmon)
> > @@ -256,10 +259,23 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
> >       if (val)
> >               return -EINVAL;
> >
> > -     if (valid_pmu)
> > -             vcpu->kvm->arch.dfr0_pmuver.imp = pmuver;
> > -     else
> > -             vcpu->kvm->arch.dfr0_pmuver.unimp = pmuver;
> > +     if (valid_pmu) {
> > +             mutex_lock(&vcpu->kvm->lock);
>
> Bingo!
>
Will fix it.
> > +             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> > +                     ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> > +             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
> > +                     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), pmuver);
> > +
> > +             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
> > +                     ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> > +             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
> > +                             ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), pmuver_to_perfmon(pmuver));
> > +             mutex_unlock(&vcpu->kvm->lock);
> > +     } else if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) {
> > +             set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +     } else {
> > +             clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +     }
>
> The last two cases are better written as:
>
>         assign_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags,
>                    pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF);
>
Will do.
> >
> >       return 0;
> >  }
> > @@ -296,10 +312,23 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >       if (val)
> >               return -EINVAL;
> >
> > -     if (valid_pmu)
> > -             vcpu->kvm->arch.dfr0_pmuver.imp = perfmon_to_pmuver(perfmon);
> > -     else
> > -             vcpu->kvm->arch.dfr0_pmuver.unimp = perfmon_to_pmuver(perfmon);
> > +     if (valid_pmu) {
> > +             mutex_lock(&vcpu->kvm->lock);
>
> Same here (lock inversion)
Will fix it.
>
> > +             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
> > +                     ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> > +             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
> > +                     ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), perfmon);
> > +
> > +             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> > +                     ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> > +             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |= FIELD_PREP(
> > +                     ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), perfmon_to_pmuver(perfmon));
> > +             mutex_unlock(&vcpu->kvm->lock);
> > +     } else if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF) {
> > +             set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +     } else {
> > +             clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +     }
>
> Same here (assign_bit).
Will do.
> >
> >       return 0;
> >  }
> > @@ -543,4 +572,14 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
> >       }
> >
> >       kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
> > +
> > +     /*
> > +      * Initialise the default PMUver before there is a chance to
> > +      * create an actual PMU.
> > +      */
> > +     kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> > +             ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> > +     kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
> > +             FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
> > +                        kvm_arm_pmu_get_pmuver_limit());
>
> Please put these assignments on a single line...
Will do.
>
> >  }
> > diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> > index 628775334d5e..51c7f3e7bdde 100644
> > --- a/include/kvm/arm_pmu.h
> > +++ b/include/kvm/arm_pmu.h
> > @@ -92,8 +92,9 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
> >  /*
> >   * Evaluates as true when emulating PMUv3p5, and false otherwise.
> >   */
> > -#define kvm_pmu_is_3p5(vcpu)                                         \
> > -     (vcpu->kvm->arch.dfr0_pmuver.imp >= ID_AA64DFR0_EL1_PMUVer_V3P5)
> > +#define kvm_pmu_is_3p5(vcpu)                                                                 \
> > +      (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),                                 \
> > +              vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)]) >= ID_AA64DFR0_EL1_PMUVer_V3P5)
>
> I'll stop mentioning the need for accessors...
Will fix it.
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

Thanks,
Jing

* Re: [PATCH v4 5/6] KVM: arm64: Introduce ID register specific descriptor
  2023-03-27 11:28   ` Marc Zyngier
@ 2023-03-29  3:46     ` Jing Zhang
  0 siblings, 0 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-29  3:46 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi Marc,
On Mon, Mar 27, 2023 at 4:28 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 17 Mar 2023 05:06:36 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > Introduce an ID feature register specific descriptor to include ID
> > register specific fields and callbacks besides its corresponding
> > general system register descriptor.
> > New fields for ID register descriptor would be added later when it
> > is necessary to support a writable ID register.
>
> What would these be? Could they make sense for "normal" sysregs as
> well?
As you can see from the later patch, the fields added are only
applicable to idregs.
As you suggested, some of those fields may not be necessary. I'll try
to improve the idregs-specific data structures based on your comments,
or even drop the new structure entirely if possible.
>
> >
> > No functional change intended.
> >
> > Co-developed-by: Reiji Watanabe <reijiw@google.com>
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  arch/arm64/kvm/id_regs.c  | 187 +++++++++++++++++++++++++++-----------
> >  arch/arm64/kvm/sys_regs.c |   2 +-
> >  arch/arm64/kvm/sys_regs.h |   1 +
> >  3 files changed, 138 insertions(+), 52 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> > index 3a87a3d2390d..9956c99d20f7 100644
> > --- a/arch/arm64/kvm/id_regs.c
> > +++ b/arch/arm64/kvm/id_regs.c
> > @@ -18,6 +18,10 @@
> >
> >  #include "sys_regs.h"
> >
> > +struct id_reg_desc {
> > +     const struct sys_reg_desc       reg_desc;
> > +};
> > +
>
> What is the advantage in having this wrapping structure that forces us
> to reinvent the wheel (the structure is different) over an additional
> pointer or even a side table?
As stated in my last comment, I'll try to improve the data structure,
or avoid it altogether if possible.
>
> >  static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
> >  {
> >       if (kvm_vcpu_has_pmu(vcpu))
> > @@ -334,21 +338,25 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >  }
> >
> >  /* sys_reg_desc initialiser for known cpufeature ID registers */
> > -#define ID_SANITISED(name) {                 \
> > -     SYS_DESC(SYS_##name),                   \
> > -     .access = access_id_reg,                \
> > -     .get_user = get_id_reg,                 \
> > -     .set_user = set_id_reg,                 \
> > -     .visibility = id_visibility,            \
> > +#define ID_SANITISED(name) {                         \
> > +     .reg_desc = {                                   \
> > +             SYS_DESC(SYS_##name),                   \
> > +             .access = access_id_reg,                \
> > +             .get_user = get_id_reg,                 \
> > +             .set_user = set_id_reg,                 \
> > +             .visibility = id_visibility,            \
> > +     },                                              \
> >  }
> >
> >  /* sys_reg_desc initialiser for known cpufeature ID registers */
> > -#define AA32_ID_SANITISED(name) {            \
> > -     SYS_DESC(SYS_##name),                   \
> > -     .access = access_id_reg,                \
> > -     .get_user = get_id_reg,                 \
> > -     .set_user = set_id_reg,                 \
> > -     .visibility = aa32_id_visibility,       \
> > +#define AA32_ID_SANITISED(name) {                    \
> > +     .reg_desc = {                                   \
> > +             SYS_DESC(SYS_##name),                   \
> > +             .access = access_id_reg,                \
> > +             .get_user = get_id_reg,                 \
> > +             .set_user = set_id_reg,                 \
> > +             .visibility = aa32_id_visibility,       \
> > +     },                                              \
> >  }
> >
> >  /*
> > @@ -356,12 +364,14 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >   * register with encoding Op0=3, Op1=0, CRn=0, CRm=crm, Op2=op2
> >   * (1 <= crm < 8, 0 <= Op2 < 8).
> >   */
> > -#define ID_UNALLOCATED(crm, op2) {                   \
> > -     Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2),     \
> > -     .access = access_id_reg,                        \
> > -     .get_user = get_id_reg,                         \
> > -     .set_user = set_id_reg,                         \
> > -     .visibility = raz_visibility                    \
> > +#define ID_UNALLOCATED(crm, op2) {                           \
> > +     .reg_desc = {                                           \
> > +             Op0(3), Op1(0), CRn(0), CRm(crm), Op2(op2),     \
> > +             .access = access_id_reg,                        \
> > +             .get_user = get_id_reg,                         \
> > +             .set_user = set_id_reg,                         \
> > +             .visibility = raz_visibility                    \
> > +     },                                                      \
> >  }
> >
> >  /*
> > @@ -369,15 +379,17 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >   * For now, these are exposed just like unallocated ID regs: they appear
> >   * RAZ for the guest.
> >   */
> > -#define ID_HIDDEN(name) {                    \
> > -     SYS_DESC(SYS_##name),                   \
> > -     .access = access_id_reg,                \
> > -     .get_user = get_id_reg,                 \
> > -     .set_user = set_id_reg,                 \
> > -     .visibility = raz_visibility,           \
> > +#define ID_HIDDEN(name) {                            \
> > +     .reg_desc = {                                   \
> > +             SYS_DESC(SYS_##name),                   \
> > +             .access = access_id_reg,                \
> > +             .get_user = get_id_reg,                 \
> > +             .set_user = set_id_reg,                 \
> > +             .visibility = raz_visibility,           \
> > +     },                                              \
> >  }
> >
> > -static const struct sys_reg_desc id_reg_descs[] = {
> > +static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
> >       /*
> >        * ID regs: all ID_SANITISED() entries here must have corresponding
> >        * entries in arm64_ftr_regs[].
> > @@ -387,9 +399,13 @@ static const struct sys_reg_desc id_reg_descs[] = {
> >       /* CRm=1 */
> >       AA32_ID_SANITISED(ID_PFR0_EL1),
> >       AA32_ID_SANITISED(ID_PFR1_EL1),
> > -     { SYS_DESC(SYS_ID_DFR0_EL1), .access = access_id_reg,
> > -       .get_user = get_id_reg, .set_user = set_id_dfr0_el1,
> > -       .visibility = aa32_id_visibility, },
> > +     { .reg_desc = {
> > +             SYS_DESC(SYS_ID_DFR0_EL1),
> > +             .access = access_id_reg,
> > +             .get_user = get_id_reg,
> > +             .set_user = set_id_dfr0_el1,
> > +             .visibility = aa32_id_visibility, },
> > +     },
> >       ID_HIDDEN(ID_AFR0_EL1),
> >       AA32_ID_SANITISED(ID_MMFR0_EL1),
> >       AA32_ID_SANITISED(ID_MMFR1_EL1),
> > @@ -418,8 +434,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
> >
> >       /* AArch64 ID registers */
> >       /* CRm=4 */
> > -     { SYS_DESC(SYS_ID_AA64PFR0_EL1), .access = access_id_reg,
> > -       .get_user = get_id_reg, .set_user = set_id_aa64pfr0_el1, },
> > +     { .reg_desc = {
> > +             SYS_DESC(SYS_ID_AA64PFR0_EL1),
> > +             .access = access_id_reg,
> > +             .get_user = get_id_reg,
> > +             .set_user = set_id_aa64pfr0_el1, },
> > +     },
> >       ID_SANITISED(ID_AA64PFR1_EL1),
> >       ID_UNALLOCATED(4, 2),
> >       ID_UNALLOCATED(4, 3),
> > @@ -429,8 +449,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
> >       ID_UNALLOCATED(4, 7),
> >
> >       /* CRm=5 */
> > -     { SYS_DESC(SYS_ID_AA64DFR0_EL1), .access = access_id_reg,
> > -       .get_user = get_id_reg, .set_user = set_id_aa64dfr0_el1, },
> > +     { .reg_desc = {
> > +             SYS_DESC(SYS_ID_AA64DFR0_EL1),
> > +             .access = access_id_reg,
> > +             .get_user = get_id_reg,
> > +             .set_user = set_id_aa64dfr0_el1, },
> > +     },
> >       ID_SANITISED(ID_AA64DFR1_EL1),
> >       ID_UNALLOCATED(5, 2),
> >       ID_UNALLOCATED(5, 3),
> > @@ -469,12 +493,12 @@ static const struct sys_reg_desc id_reg_descs[] = {
> >   */
> >  int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params)
> >  {
> > -     const struct sys_reg_desc *r;
> > +     u32 id;
> >
> > -     r = find_reg(params, id_reg_descs, ARRAY_SIZE(id_reg_descs));
> > +     id = reg_to_encoding(params);
> >
> > -     if (likely(r)) {
> > -             perform_access(vcpu, params, r);
> > +     if (likely(is_id_reg(id))) {
> > +             perform_access(vcpu, params, &id_reg_descs[IDREG_IDX(id)].reg_desc);
>
> How about minimising the diff and making the whole thing less verbose?
>
> static const struct sys_reg_desc *id_to_id_reg_desc(struct sys_reg_params *params)
> {
>         u32 id;
>
>         id = reg_to_encoding(params);
>         if (is_id_reg(id))
>                 return &id_reg_descs[IDREG_IDX(id)].reg_desc;
>
>         return NULL;
> }
>
> int emulate_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *params)
> {
>         const struct sys_reg_desc *r;
>
>         r = id_to_id_reg_desc(params);
>         [...]
> }
>
> And use the helper everywhere?
Sure, will do.
>
> >       } else {
> >               print_sys_reg_msg(params,
> >                                 "Unsupported guest id_reg access at: %lx [%08lx]\n",
> > @@ -491,38 +515,102 @@ void kvm_arm_reset_id_regs(struct kvm_vcpu *vcpu)
> >       unsigned long i;
> >
> >       for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++)
> > -             if (id_reg_descs[i].reset)
> > -                     id_reg_descs[i].reset(vcpu, &id_reg_descs[i]);
> > +             if (id_reg_descs[i].reg_desc.reset)
> > +                     id_reg_descs[i].reg_desc.reset(vcpu, &id_reg_descs[i].reg_desc);
> >  }
> >
> >  int kvm_arm_get_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  {
> > -     return kvm_sys_reg_get_user(vcpu, reg,
> > -                                 id_reg_descs, ARRAY_SIZE(id_reg_descs));
> > +     u64 __user *uaddr = (u64 __user *)(unsigned long)reg->addr;
> > +     const struct sys_reg_desc *r;
> > +     struct sys_reg_params params;
> > +     u64 val;
> > +     int ret;
> > +     u32 id;
> > +
> > +     if (!index_to_params(reg->id, &params))
> > +             return -ENOENT;
> > +     id = reg_to_encoding(&params);
> > +
> > +     if (!is_id_reg(id))
> > +             return -ENOENT;
> > +
> > +     r = &id_reg_descs[IDREG_IDX(id)].reg_desc;
> > +     if (r->get_user) {
> > +             ret = (r->get_user)(vcpu, r, &val);
> > +     } else {
> > +             ret = 0;
> > +             val = vcpu->kvm->arch.id_regs[IDREG_IDX(id)];
> > +     }
> > +
> > +     if (!ret)
> > +             ret = put_user(val, uaddr);
>
> How about the visibility? Why isn't it checked?
The visibility check is done in get_user()->get_id_reg()->read_id_reg().
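For reference, the RAZ handling lives in read_id_reg() as introduced
in patch 2:

static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r)
{
	if (sysreg_visible_as_raz(vcpu, r))
		return 0;

	return kvm_arm_read_id_reg(vcpu, reg_to_encoding(r));
}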
>
> > +
> > +     return ret;
> >  }
> >
> >  int kvm_arm_set_id_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  {
> > -     return kvm_sys_reg_set_user(vcpu, reg,
> > -                                 id_reg_descs, ARRAY_SIZE(id_reg_descs));
> > +     u64 __user *uaddr = (u64 __user *)(unsigned long)reg->addr;
> > +     const struct sys_reg_desc *r;
> > +     struct sys_reg_params params;
> > +     u64 val;
> > +     int ret;
> > +     u32 id;
> > +
> > +     if (!index_to_params(reg->id, &params))
> > +             return -ENOENT;
> > +     id = reg_to_encoding(&params);
> > +
> > +     if (!is_id_reg(id))
> > +             return -ENOENT;
> > +
> > +     if (get_user(val, uaddr))
> > +             return -EFAULT;
> > +
> > +     r = &id_reg_descs[IDREG_IDX(id)].reg_desc;
> > +
> > +     if (sysreg_user_write_ignore(vcpu, r))
> > +             return 0;
> > +
> > +     if (r->set_user) {
> > +             ret = (r->set_user)(vcpu, r, val);
> > +     } else {
> > +             WARN_ONCE(1, "ID register set_user callback is NULL\n");
>
> Why the shouting? We didn't do that before. What's changed?
It was added according to one of Reiji's comments from
https://lore.kernel.org/all/CAAeT=Fz-G_EUmh=Pj3UHA7pnKKYi7UyYuedziJxfmSoKpntw3Q@mail.gmail.com.
WDYT?
>
> > +             ret = 0;
> > +     }
> > +
> > +     return ret;
> >  }
> >
> >  bool kvm_arm_check_idreg_table(void)
> >  {
> > -     return check_sysreg_table(id_reg_descs, ARRAY_SIZE(id_reg_descs), false);
> > +     unsigned int i;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
> > +             const struct sys_reg_desc *r = &id_reg_descs[i].reg_desc;
> > +
> > +             if (!is_id_reg(reg_to_encoding(r))) {
> > +                     kvm_err("id_reg table %pS entry %d not set correctly\n",
> > +                             &id_reg_descs[i].reg_desc, i);
> > +                     return false;
> > +             }
> > +     }
> > +
> > +     return true;
> >  }
> >
> >  int kvm_arm_walk_id_regs(struct kvm_vcpu *vcpu, u64 __user *uind)
> >  {
> > -     const struct sys_reg_desc *i2, *end2;
> > +     const struct id_reg_desc *i2, *end2;
> >       unsigned int total = 0;
> >       int err;
> >
> >       i2 = id_reg_descs;
> >       end2 = id_reg_descs + ARRAY_SIZE(id_reg_descs);
> >
> > -     while (i2 != end2) {
> > -             err = walk_one_sys_reg(vcpu, i2++, &uind, &total);
> > +     for (; i2 != end2; i2++) {
> > +             err = walk_one_sys_reg(vcpu, &(i2->reg_desc), &uind, &total);
> >               if (err)
> >                       return err;
> >       }
> > @@ -540,12 +628,9 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
> >       u64 val;
> >
> >       for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
> > -             id = reg_to_encoding(&id_reg_descs[i]);
> > -             if (WARN_ON_ONCE(!is_id_reg(id)))
> > -                     /* Shouldn't happen */
> > -                     continue;
> > +             id = reg_to_encoding(&id_reg_descs[i].reg_desc);
>
> Why have you dropped that check? If it shouldn't happen before, it
> still shouldn't happen.
The check is now done in kvm_arm_check_idreg_table().
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

* Re: [PATCH v4 6/6] KVM: arm64: Refactor writings for PMUVer/CSV2/CSV3
  2023-03-27 13:34   ` Marc Zyngier
@ 2023-03-29  4:29     ` Jing Zhang
  0 siblings, 0 replies; 25+ messages in thread
From: Jing Zhang @ 2023-03-29  4:29 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon, Paolo Bonzini,
	James Morse, Alexandru Elisei, Suzuki K Poulose, Fuad Tabba,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi Marc,

On Mon, Mar 27, 2023 at 6:34 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 17 Mar 2023 05:06:37 +0000,
> Jing Zhang <jingzhangos@google.com> wrote:
> >
> > Save KVM sanitised ID register value in ID descriptor (kvm_sys_val).
>
> Why do we need to store a separate value *beside* the sanitised value
> the kernel already holds?
>
> > Add an init callback for every ID register to setup kvm_sys_val.
>
> Same question.
It is used to store the value further sanitised by KVM, which might
differ from the one held by the kernel. But as you suggested later,
this isn't necessary: we can compute the KVM-sanitised value on the
fly, since doing so is cheap.
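Roughly like this (a sketch; the masking comment is a placeholder for
whatever KVM-specific sanitisation each register needs):

static u64 kvm_sanitised_id_reg(u32 id)
{
	u64 val = read_sanitised_ftr_reg(id);

	/* Apply any KVM-specific masking on the fly here (placeholder);
	 * no separate cached copy of the result is needed. */
	return val;
}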
>
> > All per VCPU sanitizations are still handled on the fly during ID
> > register read and write from userspace.
> > An arm64_ftr_bits array is used to indicate writable feature fields.
> >
> > Refactor writings for ID_AA64PFR0_EL1.[CSV2|CSV3],
> > ID_AA64DFR0_EL1.PMUVer and ID_DFR0_ELF.PerfMon based on utilities
> > introduced by ID register descriptor.
> >
> > No functional change intended.
> >
> > Co-developed-by: Reiji Watanabe <reijiw@google.com>
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  arch/arm64/include/asm/cpufeature.h |  25 +++
> >  arch/arm64/include/asm/kvm_host.h   |   2 +-
> >  arch/arm64/kernel/cpufeature.c      |  26 +--
> >  arch/arm64/kvm/arm.c                |   2 +-
> >  arch/arm64/kvm/id_regs.c            | 325 ++++++++++++++++++++--------
> >  arch/arm64/kvm/sys_regs.c           |   3 +-
> >  arch/arm64/kvm/sys_regs.h           |   2 +-
> >  7 files changed, 261 insertions(+), 124 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > index fc2c739f48f1..493ec530eefc 100644
> > --- a/arch/arm64/include/asm/cpufeature.h
> > +++ b/arch/arm64/include/asm/cpufeature.h
> > @@ -64,6 +64,30 @@ struct arm64_ftr_bits {
> >       s64             safe_val; /* safe value for FTR_EXACT features */
> >  };
> >
> > +#define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> > +     {                                               \
> > +             .sign = SIGNED,                         \
> > +             .visible = VISIBLE,                     \
> > +             .strict = STRICT,                       \
> > +             .type = TYPE,                           \
> > +             .shift = SHIFT,                         \
> > +             .width = WIDTH,                         \
> > +             .safe_val = SAFE_VAL,                   \
> > +     }
> > +
> > +/* Define a feature with unsigned values */
> > +#define ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> > +     __ARM64_FTR_BITS(FTR_UNSIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
> > +
> > +/* Define a feature with a signed value */
> > +#define S_ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> > +     __ARM64_FTR_BITS(FTR_SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
> > +
> > +#define ARM64_FTR_END                                        \
> > +     {                                               \
> > +             .width = 0,                             \
> > +     }
> > +
> >  /*
> >   * Describe the early feature override to the core override code:
> >   *
> > @@ -911,6 +935,7 @@ static inline unsigned int get_vmid_bits(u64 mmfr1)
> >       return 8;
> >  }
> >
> > +s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new, s64 cur);
> >  struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id);
> >
> >  extern struct arm64_ftr_override id_aa64mmfr1_override;
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 102860ba896d..aa83dd79e7ff 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -1013,7 +1013,7 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
> >  long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
> >                               struct kvm_arm_copy_mte_tags *copy_tags);
> >
> > -void kvm_arm_set_default_id_regs(struct kvm *kvm);
> > +void kvm_arm_init_id_regs(struct kvm *kvm);
> >
> >  /* Guest/host FPSIMD coordination helpers */
> >  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index 23bd2a926b74..e18848ee4b98 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -139,30 +139,6 @@ void dump_cpu_features(void)
> >       pr_emerg("0x%*pb\n", ARM64_NCAPS, &cpu_hwcaps);
> >  }
> >
> > -#define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> > -     {                                               \
> > -             .sign = SIGNED,                         \
> > -             .visible = VISIBLE,                     \
> > -             .strict = STRICT,                       \
> > -             .type = TYPE,                           \
> > -             .shift = SHIFT,                         \
> > -             .width = WIDTH,                         \
> > -             .safe_val = SAFE_VAL,                   \
> > -     }
> > -
> > -/* Define a feature with unsigned values */
> > -#define ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> > -     __ARM64_FTR_BITS(FTR_UNSIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
> > -
> > -/* Define a feature with a signed value */
> > -#define S_ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
> > -     __ARM64_FTR_BITS(FTR_SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
> > -
> > -#define ARM64_FTR_END                                        \
> > -     {                                               \
> > -             .width = 0,                             \
> > -     }
> > -
> >  static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap);
> >
> >  static bool __system_matches_cap(unsigned int n);
> > @@ -790,7 +766,7 @@ static u64 arm64_ftr_set_value(const struct arm64_ftr_bits *ftrp, s64 reg,
> >       return reg;
> >  }
> >
> > -static s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
> > +s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
> >                               s64 cur)
> >  {
> >       s64 ret = 0;
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index fb2de2cb98cb..e539d9ca9d01 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -135,7 +135,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> >       /* The maximum number of VCPUs is limited by the host's GIC model */
> >       kvm->max_vcpus = kvm_arm_default_max_vcpus();
> >
> > -     kvm_arm_set_default_id_regs(kvm);
> > +     kvm_arm_init_id_regs(kvm);
>
> How about picking the name once and for all from the first patch?
Sure, will do.
>
> >       kvm_arm_init_hypercalls(kvm);
> >
> >       return 0;
> > diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> > index 9956c99d20f7..726b810b6e06 100644
> > --- a/arch/arm64/kvm/id_regs.c
> > +++ b/arch/arm64/kvm/id_regs.c
> > @@ -18,10 +18,88 @@
> >
> >  #include "sys_regs.h"
> >
> > +/*
> > + * Number of entries in id_reg_desc's ftr_bits[] (Number of 4 bits fields
> > + * in 64 bit register + 1 entry for a terminator entry).
> > + */
> > +#define      FTR_FIELDS_NUM  17
>
> Please see SMFR0_EL1 for an example of a sysreg that doesn't follow
> the 4bits-per-field format. I expect to see more of those in the
> future.
>
> And given that this is always a variable set of fields, why do we need
> to define this as a fixed array that only bloats the structure? I'd
> rather see a variable array in a side structure.
>
Yes, it makes more sense to use a variable array here. Do you have any
suggestions? An xarray?
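For instance (a hypothetical sketch, not code from this series), the
descriptor could hold a pointer to a const, variable-length array
terminated by ARM64_FTR_END, defined per register next to the table:

	/* Hypothetical: no fixed-size array embedded in the descriptor. */
	struct id_reg_desc {
		const struct sys_reg_desc	reg_desc;
		/* NULL, or terminated by ARM64_FTR_END */
		const struct arm64_ftr_bits	*ftr_bits;
	};

	static const struct arm64_ftr_bits ftr_id_aa64pfr0_writable[] = {
		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
			ID_AA64PFR0_EL1_CSV2_SHIFT, ID_AA64PFR0_EL1_CSV2_WIDTH, 0),
		ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
			ID_AA64PFR0_EL1_CSV3_SHIFT, ID_AA64PFR0_EL1_CSV3_WIDTH, 0),
		ARM64_FTR_END,
	};

Registers without writable fields would then only pay for a NULL
pointer.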
> > +
> >  struct id_reg_desc {
> >       const struct sys_reg_desc       reg_desc;
> > +     /*
> > +      * KVM sanitised ID register value.
> > +      * It is the default value for per VM emulated ID register.
> > +      */
> > +     u64 kvm_sys_val;
> > +     /*
> > +      * Used to validate the ID register values with arm64_check_features().
> > +      * The last item in the array must be terminated by an item whose
> > +      * width field is zero as that is expected by arm64_check_features().
> > +      * Only feature bits defined in this array are writable.
> > +      */
> > +     struct arm64_ftr_bits   ftr_bits[FTR_FIELDS_NUM];
> > +
> > +     /*
> > +      * Basically init() is used to setup the KVM sanitised value
> > +      * stored in kvm_sys_val.
> > +      */
> > +     void (*init)(struct id_reg_desc *idr);
>
> Given that this callback only builds the value from the sanitised
> view, and that it is very cheap (only a set of masking), why do we
> bother keeping the value around? It would also allow this structure to
> be kept *const*, something that is extremely desirable.
>
> Also, why do we need an init() method when each sysreg already have a
> reset() method? Surely this should be the same thing...
>
> My gut feeling is that we should only have a callback returning the
> limit value computed on the fly.
Sure, will use a callback to return the limit value on the fly.
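If I understand correctly, something like this (a minimal sketch; it
applies the same massaging as init_id_aa64dfr0_el1() below, just
computed on demand so the descriptor can stay const):

	static u64 limit_id_aa64dfr0_el1(const struct id_reg_desc *idr)
	{
		u64 val = read_sanitised_ftr_reg(reg_to_encoding(&idr->reg_desc));

		/* Limit debug to ARMv8.0 */
		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
		/* Default PMUver, before there is a chance to create a PMU */
		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
				  kvm_arm_pmu_get_pmuver_limit());
		/* Hide SPE from guests */
		val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);

		return val;
	}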
>
> >  };
> >
> > +static struct id_reg_desc id_reg_descs[];
> > +
> > +/**
> > + * arm64_check_features() - Check if a feature register value constitutes
> > + * a subset of features indicated by @limit.
> > + *
> > + * @ftrp: Pointer to an array of arm64_ftr_bits. It must be terminated by
> > + * an item whose width field is zero.
> > + * @val: The feature register value to check
> > + * @limit: The limit value of the feature register
> > + *
> > + * This function will check if each feature field of @val is the "safe" value
> > + * against @limit based on @ftrp[], each of which specifies the target field
> > + * (shift, width), whether or not the field is for a signed value (sign),
> > + * how the field is determined to be "safe" (type), and the safe value
> > + * (safe_val) when type == FTR_EXACT (safe_val won't be used by this
> > + * function when type != FTR_EXACT). Any other fields in arm64_ftr_bits
> > + * won't be used by this function. If a field value in @val is the same
> > + * as the one in @limit, it is always considered the safe value regardless
> > + * of the type. For register fields that are not in @ftrp[], only the value
> > + * in @limit is considered the safe value.
> > + *
> > + * Return: 0 if all the fields are safe. Otherwise, return negative errno.
> > + */
> > +static int arm64_check_features(const struct arm64_ftr_bits *ftrp, u64 val, u64 limit)
> > +{
> > +     u64 mask = 0;
> > +
> > +     for (; ftrp->width; ftrp++) {
> > +             s64 f_val, f_lim, safe_val;
> > +
> > +             f_val = arm64_ftr_value(ftrp, val);
> > +             f_lim = arm64_ftr_value(ftrp, limit);
> > +             mask |= arm64_ftr_mask(ftrp);
> > +
> > +             if (f_val == f_lim)
> > +                     safe_val = f_val;
> > +             else
> > +                     safe_val =  arm64_ftr_safe_value(ftrp, f_val, f_lim);
> > +
> > +             if (safe_val != f_val)
> > +                     return -E2BIG;
> > +     }
> > +
> > +     /*
> > +      * For fields that are not indicated in ftrp, values in limit are the
> > +      * safe values.
> > +      */
> > +     if ((val & ~mask) != (limit & ~mask))
> > +             return -E2BIG;
> > +
> > +     return 0;
> > +}
>
> I have the feeling that the core code already implements something
> similar...
Right, it is similar to update_cpu_ftr_reg() in cpufeature.c. But that
function updates the kernel's sanitised state in place rather than
checking a candidate value against a limit, so it can't be reused
directly here.
>
> > +
> >  static u8 vcpu_pmuver(const struct kvm_vcpu *vcpu)
> >  {
> >       if (kvm_vcpu_has_pmu(vcpu))
> > @@ -67,7 +145,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
> >       case SYS_ID_AA64PFR0_EL1:
> >               if (!vcpu_has_sve(vcpu))
> >                       val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
> > -             val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
> >               if (kvm_vgic_global_state.type == VGIC_V3) {
> >                       val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
> >                       val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
> > @@ -94,15 +171,10 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
> >                       val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
> >               break;
> >       case SYS_ID_AA64DFR0_EL1:
> > -             /* Limit debug to ARMv8.0 */
> > -             val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
> > -             val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
> >               /* Set PMUver to the required version */
> >               val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> >               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
> >                                 vcpu_pmuver(vcpu));
> > -             /* Hide SPE from guests */
> > -             val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);
> >               break;
> >       case SYS_ID_DFR0_EL1:
> >               val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> > @@ -161,9 +233,15 @@ static int get_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> >  static int set_id_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
> >                     u64 val)
> >  {
> > -     /* This is what we mean by invariant: you can't change it. */
> > -     if (val != read_id_reg(vcpu, rd))
> > -             return -EINVAL;
> > +     int ret;
> > +     int id = reg_to_encoding(rd);
> > +
> > +     ret = arm64_check_features(id_reg_descs[IDREG_IDX(id)].ftr_bits, val,
> > +                                id_reg_descs[IDREG_IDX(id)].kvm_sys_val);
> > +     if (ret)
> > +             return ret;
> > +
> > +     vcpu->kvm->arch.id_regs[IDREG_IDX(id)] = val;
> >
> >       return 0;
> >  }
> > @@ -197,12 +275,47 @@ static unsigned int aa32_id_visibility(const struct kvm_vcpu *vcpu,
> >       return id_visibility(vcpu, r);
> >  }
> >
> > +static void init_id_reg(struct id_reg_desc *idr)
> > +{
> > +     idr->kvm_sys_val = read_sanitised_ftr_reg(reg_to_encoding(&idr->reg_desc));
> > +}
> > +
> > +static void init_id_aa64pfr0_el1(struct id_reg_desc *idr)
> > +{
> > +     u64 val;
> > +     u32 id = reg_to_encoding(&idr->reg_desc);
> > +
> > +     val = read_sanitised_ftr_reg(id);
> > +     /*
> > +      * The default is to expose CSV2 == 1 if the HW isn't affected.
> > +      * Although this is a per-CPU feature, we make it global because
> > +      * asymmetric systems are just a nuisance.
> > +      *
> > +      * Userspace can override this as long as it doesn't promise
> > +      * the impossible.
> > +      */
> > +     if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
> > +             val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> > +             val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
> > +     }
> > +     if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
> > +             val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> > +             val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
> > +     }
> > +
> > +     val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
> > +
> > +     val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
> > +     val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
>
> What? Why? What if I have a GICv2? What if I have no GIC?
:-) I forgot how these two lines got in here. Will remove them.
>
> > +
> > +     idr->kvm_sys_val = val;
> > +}
>
> How does this compose with the runtime feature reduction that takes
> place in access_nested_id_reg()?
kvm_sys_val is only used as the initial value for the per-VM idregs.
What access_id_reg() passes into access_nested_id_reg() is the value
read back at access time, so the runtime feature reduction still
applies on top of it.
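For reference, the read path looks roughly like this (simplified; the
exact code is in this series):

	static bool access_id_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
				  const struct sys_reg_desc *r)
	{
		if (p->is_write)
			return write_to_read_only(vcpu, p, r);

		/* per-VM value, seeded from kvm_sys_val at VM creation */
		p->regval = read_id_reg(vcpu, r);
		if (vcpu_has_nv(vcpu))
			access_nested_id_reg(vcpu, p, r);

		return true;
	}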
>
> > +
> >  static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> >                              const struct sys_reg_desc *rd,
> >                              u64 val)
> >  {
> >       u8 csv2, csv3;
> > -     u64 sval = val;
> >
> >       /*
> >        * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
> > @@ -220,16 +333,29 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> >           (csv3 && arm64_get_meltdown_state() != SPECTRE_UNAFFECTED))
> >               return -EINVAL;
> >
> > -     /* We can only differ with CSV[23], and anything else is an error */
> > -     val ^= read_id_reg(vcpu, rd);
> > -     val &= ~(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
> > -              ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
> > -     if (val)
> > -             return -EINVAL;
> > +     return set_id_reg(vcpu, rd, val);
> > +}
> >
> > -     vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;
> > +static void init_id_aa64dfr0_el1(struct id_reg_desc *idr)
> > +{
> > +     u64 val;
> > +     u32 id = reg_to_encoding(&idr->reg_desc);
> >
> > -     return 0;
> > +     val = read_sanitised_ftr_reg(id);
> > +     /* Limit debug to ARMv8.0 */
> > +     val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer);
> > +     val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), 6);
> > +     /*
> > +      * Initialise the default PMUver before there is a chance to
> > +      * create an actual PMU.
> > +      */
> > +     val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> > +     val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
> > +                       kvm_arm_pmu_get_pmuver_limit());
> > +     /* Hide SPE from guests */
> > +     val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMSVer);
> > +
> > +     idr->kvm_sys_val = val;
> >  }
> >
> >  static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
> > @@ -238,6 +364,7 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
> >  {
> >       u8 pmuver, host_pmuver;
> >       bool valid_pmu;
> > +     int ret;
> >
> >       host_pmuver = kvm_arm_pmu_get_pmuver_limit();
> >
> > @@ -257,39 +384,58 @@ static int set_id_aa64dfr0_el1(struct kvm_vcpu *vcpu,
> >       if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
> >               return -EINVAL;
> >
> > -     /* We can only differ with PMUver, and anything else is an error */
> > -     val ^= read_id_reg(vcpu, rd);
> > -     val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> > -     if (val)
> > -             return -EINVAL;
> > -
> >       if (valid_pmu) {
> >               mutex_lock(&vcpu->kvm->lock);
> > -             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> > -                     ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> > -             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |=
> > -                     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), pmuver);
> > +             ret = set_id_reg(vcpu, rd, val);
> > +             if (ret)
> > +                     return ret;
>
> Next stop, Deadlock City, our final destination.
Will fix it.
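A minimal sketch of the fix I have in mind (drop the lock before
bailing out on the error path):

	if (valid_pmu) {
		mutex_lock(&vcpu->kvm->lock);
		ret = set_id_reg(vcpu, rd, val);
		if (ret) {
			mutex_unlock(&vcpu->kvm->lock);
			return ret;
		}

		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
			~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
		vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
			ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), pmuver_to_perfmon(pmuver));
		mutex_unlock(&vcpu->kvm->lock);
	}

The same pattern applies to set_id_dfr0_el1() below.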
>
> >
> >               vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
> >                       ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> >               vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
> >                               ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), pmuver_to_perfmon(pmuver));
> >               mutex_unlock(&vcpu->kvm->lock);
> > -     } else if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF) {
> > -             set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> >       } else {
> > -             clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +             /* We can only differ with PMUver, and anything else is an error */
> > +             val ^= read_id_reg(vcpu, rd);
> > +             val &= ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> > +             if (val)
> > +                     return -EINVAL;
>
> I find it very odd that you add all this infrastructure to check for
> writable fields, and yet have to keep this comparison. It makes me
> thing that the data structures are not necessarily the right ones.
This comparison is still here because, in this patch, the whole ID
register is not yet writable, and on the invalid-PMU path set_id_reg()
(which performs all the checks) is not called. The comparison can be
removed once the whole ID register is made writable.
>
> > +
> > +             if (pmuver == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
> > +                     set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +             else
> > +                     clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +
> >       }
> >
> >       return 0;
> >  }
> >
> > +static void init_id_dfr0_el1(struct id_reg_desc *idr)
> > +{
> > +     u64 val;
> > +     u32 id = reg_to_encoding(&idr->reg_desc);
> > +
> > +     val = read_sanitised_ftr_reg(id);
> > +     /*
> > +      * Initialise the default PMUver before there is a chance to
> > +      * create an actual PMU.
> > +      */
> > +     val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> > +     val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon),
> > +                       kvm_arm_pmu_get_pmuver_limit());
> > +
> > +     idr->kvm_sys_val = val;
> > +}
> > +
> >  static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >                          const struct sys_reg_desc *rd,
> >                          u64 val)
> >  {
> >       u8 perfmon, host_perfmon;
> >       bool valid_pmu;
> > +     int ret;
> >
> >       host_perfmon = pmuver_to_perfmon(kvm_arm_pmu_get_pmuver_limit());
> >
> > @@ -310,42 +456,46 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >       if (kvm_vcpu_has_pmu(vcpu) != valid_pmu)
> >               return -EINVAL;
> >
> > -     /* We can only differ with PerfMon, and anything else is an error */
> > -     val ^= read_id_reg(vcpu, rd);
> > -     val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> > -     if (val)
> > -             return -EINVAL;
> > -
> >       if (valid_pmu) {
> >               mutex_lock(&vcpu->kvm->lock);
> > -             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] &=
> > -                     ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> > -             vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_DFR0_EL1)] |= FIELD_PREP(
> > -                     ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon), perfmon);
> > +             ret = set_id_reg(vcpu, rd, val);
> > +             if (ret)
> > +                     return ret;
>
> Same player, shoot again.
Will fix it.
>
> >
> >               vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] &=
> >                       ~ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer);
> >               vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64DFR0_EL1)] |= FIELD_PREP(
> >                       ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer), perfmon_to_pmuver(perfmon));
> >               mutex_unlock(&vcpu->kvm->lock);
> > -     } else if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF) {
> > -             set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> >       } else {
> > -             clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +             /* We can only differ with PerfMon, and anything else is an error */
> > +             val ^= read_id_reg(vcpu, rd);
> > +             val &= ~ARM64_FEATURE_MASK(ID_DFR0_EL1_PerfMon);
> > +             if (val)
> > +                     return -EINVAL;
> > +
> > +             if (perfmon == ID_DFR0_EL1_PerfMon_IMPDEF)
> > +                     set_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> > +             else
> > +                     clear_bit(KVM_ARCH_FLAG_VCPU_HAS_IMP_DEF_PMU, &vcpu->kvm->arch.flags);
> >       }
>
> Same remarks.
>
> >
> >       return 0;
> >  }
> >
> >  /* sys_reg_desc initialiser for known cpufeature ID registers */
> > +#define SYS_DESC_SANITISED(name) {                   \
> > +     SYS_DESC(SYS_##name),                           \
> > +     .access = access_id_reg,                        \
> > +     .get_user = get_id_reg,                         \
> > +     .set_user = set_id_reg,                         \
> > +     .visibility = id_visibility,                    \
> > +}
> > +
> >  #define ID_SANITISED(name) {                         \
> > -     .reg_desc = {                                   \
> > -             SYS_DESC(SYS_##name),                   \
> > -             .access = access_id_reg,                \
> > -             .get_user = get_id_reg,                 \
> > -             .set_user = set_id_reg,                 \
> > -             .visibility = id_visibility,            \
> > -     },                                              \
> > +     .reg_desc = SYS_DESC_SANITISED(name),           \
> > +     .ftr_bits = { ARM64_FTR_END, },                 \
> > +     .init = init_id_reg,                            \
> >  }
> >
> >  /* sys_reg_desc initialiser for known cpufeature ID registers */
> > @@ -357,6 +507,8 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >               .set_user = set_id_reg,                 \
> >               .visibility = aa32_id_visibility,       \
> >       },                                              \
> > +     .ftr_bits = { ARM64_FTR_END, },                 \
> > +     .init = init_id_reg,                            \
> >  }
> >
> >  /*
> > @@ -372,6 +524,7 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >               .set_user = set_id_reg,                         \
> >               .visibility = raz_visibility                    \
> >       },                                                      \
> > +     .ftr_bits = { ARM64_FTR_END, },                         \
> >  }
> >
> >  /*
> > @@ -387,9 +540,10 @@ static int set_id_dfr0_el1(struct kvm_vcpu *vcpu,
> >               .set_user = set_id_reg,                 \
> >               .visibility = raz_visibility,           \
> >       },                                              \
> > +     .ftr_bits = { ARM64_FTR_END, },                 \
> >  }
> >
> > -static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
> > +static struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
> >       /*
> >        * ID regs: all ID_SANITISED() entries here must have corresponding
> >        * entries in arm64_ftr_regs[].
> > @@ -405,6 +559,11 @@ static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
> >               .get_user = get_id_reg,
> >               .set_user = set_id_dfr0_el1,
> >               .visibility = aa32_id_visibility, },
> > +       .ftr_bits = {
> > +             ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
> > +                     ID_DFR0_EL1_PerfMon_SHIFT, ID_DFR0_EL1_PerfMon_WIDTH, 0),
> > +             ARM64_FTR_END, },
> > +       .init = init_id_dfr0_el1,
> >       },
> >       ID_HIDDEN(ID_AFR0_EL1),
> >       AA32_ID_SANITISED(ID_MMFR0_EL1),
> > @@ -439,6 +598,13 @@ static const struct id_reg_desc id_reg_descs[KVM_ARM_ID_REG_NUM] = {
> >               .access = access_id_reg,
> >               .get_user = get_id_reg,
> >               .set_user = set_id_aa64pfr0_el1, },
> > +       .ftr_bits = {
> > +             ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
> > +                     ID_AA64PFR0_EL1_CSV2_SHIFT, ID_AA64PFR0_EL1_CSV2_WIDTH, 0),
> > +             ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
> > +                     ID_AA64PFR0_EL1_CSV3_SHIFT, ID_AA64PFR0_EL1_CSV3_WIDTH, 0),
> > +             ARM64_FTR_END, },
>
> It really strikes me that you are 100% duplicating data that is
> already in ftr_id_aa64pfr0[]. Only that this is a subset of the
> existing data.
>
> You could instead have your 'init()' callback return a pair of values:
> the default value based on the sanitised one, and a 64bit mask. At
> this stage, you'll realise that this looks a lot like the feature
> override, and that you should be able to reuse some of the existing
> infrastructure.
Sure, will try to improve this based on your suggestion.
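Just to confirm my understanding: something with this shape (a
hypothetical sketch modelled on struct arm64_ftr_override, not actual
code from this series)?

	struct id_reg_desc {
		const struct sys_reg_desc reg_desc;
		/*
		 * Compute the default (limit) value and the mask of
		 * userspace-writable bits on the fly, so that no
		 * arm64_ftr_bits data needs to be duplicated here and
		 * the descriptor can stay const.
		 */
		void (*limits)(const struct id_reg_desc *idr, u64 *val, u64 *mask);
	};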
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.
Thanks,
Jing

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3]
  2023-03-28 20:01     ` Jing Zhang
@ 2023-03-29  8:23       ` Fuad Tabba
  0 siblings, 0 replies; 25+ messages in thread
From: Fuad Tabba @ 2023-03-29  8:23 UTC (permalink / raw)
  To: Jing Zhang
  Cc: KVM, KVMARM, ARMLinux, Marc Zyngier, Oliver Upton, Will Deacon,
	Paolo Bonzini, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Reiji Watanabe, Ricardo Koller, Raghavendra Rao Ananta

Hi Jing,

On Tue, Mar 28, 2023 at 9:01 PM Jing Zhang <jingzhangos@google.com> wrote:
>
> Hi Fuad,
>
> On Tue, Mar 28, 2023 at 5:40 AM Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi,
> >
> > On Fri, Mar 17, 2023 at 5:06 AM Jing Zhang <jingzhangos@google.com> wrote:
> > >
> > > With per guest ID registers, ID_AA64PFR0_EL1.[CSV2|CSV3] settings from
> > > userspace can be stored in its corresponding ID register.
> > >
> > > No functional change intended.
> > >
> > > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > > ---
> > >  arch/arm64/include/asm/kvm_host.h  |  2 --
> > >  arch/arm64/kvm/arm.c               | 19 +------------------
> > >  arch/arm64/kvm/hyp/nvhe/sys_regs.c |  7 +++----
> > >  arch/arm64/kvm/id_regs.c           | 30 ++++++++++++++++++++++--------
> > >  4 files changed, 26 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index fb6b50b1f111..e926ea91a73c 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -230,8 +230,6 @@ struct kvm_arch {
> > >
> > >         cpumask_var_t supported_cpus;
> > >
> > > -       u8 pfr0_csv2;
> > > -       u8 pfr0_csv3;
> > >         struct {
> > >                 u8 imp:4;
> > >                 u8 unimp:4;
> > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > index 4579c878ab30..c78d68d011cb 100644
> > > --- a/arch/arm64/kvm/arm.c
> > > +++ b/arch/arm64/kvm/arm.c
> > > @@ -104,22 +104,6 @@ static int kvm_arm_default_max_vcpus(void)
> > >         return vgic_present ? kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
> > >  }
> > >
> > > -static void set_default_spectre(struct kvm *kvm)
> > > -{
> > > -       /*
> > > -        * The default is to expose CSV2 == 1 if the HW isn't affected.
> > > -        * Although this is a per-CPU feature, we make it global because
> > > -        * asymmetric systems are just a nuisance.
> > > -        *
> > > -        * Userspace can override this as long as it doesn't promise
> > > -        * the impossible.
> > > -        */
> > > -       if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED)
> > > -               kvm->arch.pfr0_csv2 = 1;
> > > -       if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED)
> > > -               kvm->arch.pfr0_csv3 = 1;
> > > -}
> > > -
> > >  /**
> > >   * kvm_arch_init_vm - initializes a VM data structure
> > >   * @kvm:       pointer to the KVM struct
> > > @@ -151,9 +135,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
> > >         /* The maximum number of VCPUs is limited by the host's GIC model */
> > >         kvm->max_vcpus = kvm_arm_default_max_vcpus();
> > >
> > > -       set_default_spectre(kvm);
> > > -       kvm_arm_init_hypercalls(kvm);
> > >         kvm_arm_set_default_id_regs(kvm);
> > > +       kvm_arm_init_hypercalls(kvm);
> > >
> > >         /*
> > >          * Initialise the default PMUver before there is a chance to
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > > index 08d2b004f4b7..0e1988740a65 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > > @@ -93,10 +93,9 @@ static u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> > >                 PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> > >
> > >         /* Spectre and Meltdown mitigation in KVM */
> > > -       set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> > > -                              (u64)kvm->arch.pfr0_csv2);
> > > -       set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> > > -                              (u64)kvm->arch.pfr0_csv3);
> > > +       set_mask |= vcpu->kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] &
> > > +               (ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) |
> > > +                       ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3));
> >
> > This triggers a compiler warning now since the variable `struct kvm
> > *kvm` isn't used anymore; this, however, isn't the main issue.
> >
> > The main issue is that `struct kvm` here (vcpu->kvm) is the
> > hypervisor's version for protected vms, and not the host's. Therefore,
> > reading that value is wrong. That said, this is an existing bug in
> > pKVM since kvm->arch.pfr0_csv2 and kvm->arch.pfr0_csv3 are not
> > initialized.
> >
> > The solution would be to track the spectre/meltdown state at hyp and
> > use that. I'll submit a patch that does that. In the meantime, I think
> > that it would be better not to set the CSV bits for protected VMs,
> > which is the current behavior in practice.
> >
> > Non-protected VMs in protected mode go back to the host on id register
> > traps, and use the host's `struct kvm`, so they should behave as
> > expected.
> You mean just remove these two lines:
>  /* Spectre and Meltdown mitigation in KVM */
>  set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
>                         (u64)kvm->arch.pfr0_csv2);
>  set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
>                         (u64)kvm->arch.pfr0_csv3);
>
> Will it cause any problem for pKVM without your incoming patch?

Yes, just remove those lines, and please add a note referring to this
thread in the commit message.

It won't cause any problems because in pKVM, these values are never
set (initialized to zero). This means that a protected guest always
needs to use spectre/meltdown mitigations, which could be a
performance problem, but not a security one. This is still a bug of
course, which is why I need to fix it.

Thanks,
/fuad


> >
> > Thanks,
> > /fuad
> >
> >
> > >
> > >         return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> > >  }
> > > diff --git a/arch/arm64/kvm/id_regs.c b/arch/arm64/kvm/id_regs.c
> > > index e393b5730557..b60ca1058301 100644
> > > --- a/arch/arm64/kvm/id_regs.c
> > > +++ b/arch/arm64/kvm/id_regs.c
> > > @@ -61,12 +61,6 @@ u64 kvm_arm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id)
> > >                 if (!vcpu_has_sve(vcpu))
> > >                         val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE);
> > >                 val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AMU);
> > > -               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> > > -               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2),
> > > -                                 (u64)vcpu->kvm->arch.pfr0_csv2);
> > > -               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> > > -               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3),
> > > -                                 (u64)vcpu->kvm->arch.pfr0_csv3);
> > >                 if (kvm_vgic_global_state.type == VGIC_V3) {
> > >                         val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC);
> > >                         val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC), 1);
> > > @@ -201,6 +195,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> > >                                u64 val)
> > >  {
> > >         u8 csv2, csv3;
> > > +       u64 sval = val;
> > >
> > >         /*
> > >          * Allow AA64PFR0_EL1.CSV2 to be set from userspace as long as
> > > @@ -225,8 +220,7 @@ static int set_id_aa64pfr0_el1(struct kvm_vcpu *vcpu,
> > >         if (val)
> > >                 return -EINVAL;
> > >
> > > -       vcpu->kvm->arch.pfr0_csv2 = csv2;
> > > -       vcpu->kvm->arch.pfr0_csv3 = csv3;
> > > +       vcpu->kvm->arch.id_regs[IDREG_IDX(reg_to_encoding(rd))] = sval;
> > >
> > >         return 0;
> > >  }
> > > @@ -529,4 +523,24 @@ void kvm_arm_set_default_id_regs(struct kvm *kvm)
> > >                 val = read_sanitised_ftr_reg(id);
> > >                 kvm->arch.id_regs[IDREG_IDX(id)] = val;
> > >         }
> > > +       /*
> > > +        * The default is to expose CSV2 == 1 if the HW isn't affected.
> > > +        * Although this is a per-CPU feature, we make it global because
> > > +        * asymmetric systems are just a nuisance.
> > > +        *
> > > +        * Userspace can override this as long as it doesn't promise
> > > +        * the impossible.
> > > +        */
> > > +       val = kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)];
> > > +
> > > +       if (arm64_get_spectre_v2_state() == SPECTRE_UNAFFECTED) {
> > > +               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2);
> > > +               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2), 1);
> > > +       }
> > > +       if (arm64_get_meltdown_state() == SPECTRE_UNAFFECTED) {
> > > +               val &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3);
> > > +               val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3), 1);
> > > +       }
> > > +
> > > +       kvm->arch.id_regs[IDREG_IDX(SYS_ID_AA64PFR0_EL1)] = val;
> > >  }
> > > --
> > > 2.40.0.rc1.284.g88254d51c5-goog
> > >
> Thanks,
> Jing

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest
  2023-03-28 17:36     ` Jing Zhang
  2023-03-28 19:22       ` Marc Zyngier
@ 2023-03-29 16:26       ` Reiji Watanabe
  1 sibling, 0 replies; 25+ messages in thread
From: Reiji Watanabe @ 2023-03-29 16:26 UTC (permalink / raw)
  To: Jing Zhang
  Cc: Marc Zyngier, KVM, KVMARM, ARMLinux, Oliver Upton, Will Deacon,
	Paolo Bonzini, James Morse, Alexandru Elisei, Suzuki K Poulose,
	Fuad Tabba, Ricardo Koller, Raghavendra Rao Ananta

> > > +/*
> > > + * Set the guest's ID registers that are defined in id_reg_descs[]
> > > + * with ID_SANITISED() to the host's sanitized value.
> > > + */
> > > +void kvm_arm_set_default_id_regs(struct kvm *kvm)
> > > +{
> > > +     int i;
> > > +     u32 id;
> > > +     u64 val;
> > > +
> > > +     for (i = 0; i < ARRAY_SIZE(id_reg_descs); i++) {
> > > +             id = reg_to_encoding(&id_reg_descs[i]);
> > > +             if (WARN_ON_ONCE(!is_id_reg(id)))
> > > +                     /* Shouldn't happen */
> > > +                     continue;
> > > +
> > > +             if (id_reg_descs[i].visibility == raz_visibility)
> > > +                     /* Hidden or reserved ID register */
> > > +                     continue;
> >
> > Relying on function pointer comparison is really fragile. If I wrap
> > raz_visibility() in another function, this won't catch it. It also
> > doesn't bode well with your 'inline' definition of this function.
> >
> > More importantly, why do we care about checking for visibility at all?
> > We can happily populate the array and rely on the runtime visibility.
> Right. I'll remove this check.

Without the check, calling read_sanitised_ftr_reg() for some hidden
ID registers will trigger a warning, as some of them (e.g. the
reserved ones) are not in arm64_ftr_regs[]. This check is only needed
temporarily to avoid that warning (it is removed by later patches in
this series). It would be much less fragile to call the visibility
function instead, but I don't think that is a good way to check the
availability of the sanitized values for ID registers either. I
couldn't find a good (proper) way to check that without making
changes in cpufeature.c, and I'm not sure it is worth it for this
temporary purpose.

Thank you,
Reiji

> >
> > > +
> > > +             val = read_sanitised_ftr_reg(id);
> > > +             kvm->arch.id_regs[IDREG_IDX(id)] = val;
> > > +     }
> > > +}
> >
> > Thanks,
> >
> >         M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
>
> Thanks,
> Jing

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2023-03-29 16:26 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-17  5:06 [PATCH v4 0/6] Support writable CPU ID registers from userspace Jing Zhang
2023-03-17  5:06 ` [PATCH v4 1/6] KVM: arm64: Move CPU ID feature registers emulation into a separate file Jing Zhang
2023-03-27 10:14   ` Marc Zyngier
2023-03-28 17:16     ` Jing Zhang
2023-03-17  5:06 ` [PATCH v4 2/6] KVM: arm64: Save ID registers' sanitized value per guest Jing Zhang
2023-03-27 10:15   ` Marc Zyngier
2023-03-28 17:36     ` Jing Zhang
2023-03-28 19:22       ` Marc Zyngier
2023-03-28 20:05         ` Jing Zhang
2023-03-29 16:26       ` Reiji Watanabe
2023-03-17  5:06 ` [PATCH v4 3/6] KVM: arm64: Use per guest ID register for ID_AA64PFR0_EL1.[CSV2|CSV3] Jing Zhang
2023-03-27 10:31   ` Marc Zyngier
2023-03-28 19:54     ` Jing Zhang
2023-03-28 12:39   ` Fuad Tabba
2023-03-28 20:01     ` Jing Zhang
2023-03-29  8:23       ` Fuad Tabba
2023-03-17  5:06 ` [PATCH v4 4/6] KVM: arm64: Use per guest ID register for ID_AA64DFR0_EL1.PMUVer Jing Zhang
2023-03-27 10:40   ` Marc Zyngier
2023-03-28 20:20     ` Jing Zhang
2023-03-17  5:06 ` [PATCH v4 5/6] KVM: arm64: Introduce ID register specific descriptor Jing Zhang
2023-03-27 11:28   ` Marc Zyngier
2023-03-29  3:46     ` Jing Zhang
2023-03-17  5:06 ` [PATCH v4 6/6] KVM: arm64: Refactor writings for PMUVer/CSV2/CSV3 Jing Zhang
2023-03-27 13:34   ` Marc Zyngier
2023-03-29  4:29     ` Jing Zhang
