All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
@ 2019-01-07 12:05 ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-07 12:05 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall; +Cc: linux-arm-kernel, kvmarm, kvm

Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
from the firmware, so KVM implements an interface to provide that for
guests. When such a guest is migrated, we want to make sure we don't
loose the protection the guest relies on.

This introduces two new firmware registers in KVM's GET/SET_ONE_REG
interface, so userland can save the level of protection implemented by
the hypervisor and used by the guest. Upon restoring these registers,
we make sure we don't downgrade and reject any values that would mean
weaker protection.
There is some table in the code to describe the valid combinations.

Patch 1 implements the two firmware registers, patch 2 adds the
documentation.

This solution is using two hardcoded firmware registers for that. Not
sure if we should introduce something based on SMCCC instead, which
would allow us to report implementation of any SMCCC based service in a
generic way, or if this would be too generic.

ARM(32) is a bit of a pain (again), as the firmware register interface
is shared, but 32-bit does not implement all the workarounds.
For now I stuffed two wrappers into kvm_emulate.h, which doesn't sound
like the best solution. Happy to hear about better ideas.

This has been tested with a hack to allow faking the protection level
via a debugfs knob, then saving/restoring via some userland tool calling
the GET_ONE_REG/SET_ONE_REG ioctls.

Please have a look and comment!

Cheers,
Andre

Andre Przywara (2):
  KVM: arm/arm64: Add save/restore support for firmware workaround state
  KVM: doc: add API documentation on the KVM_REG_ARM_WORKAROUNDS
    register

 Documentation/virtual/kvm/arm/psci.txt |  20 ++++
 arch/arm/include/asm/kvm_emulate.h     |  10 ++
 arch/arm/include/uapi/asm/kvm.h        |   9 ++
 arch/arm64/include/asm/kvm_emulate.h   |  14 +++
 arch/arm64/include/uapi/asm/kvm.h      |   9 ++
 virt/kvm/arm/psci.c                    | 138 ++++++++++++++++++++++++-
 6 files changed, 198 insertions(+), 2 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
@ 2019-01-07 12:05 ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-07 12:05 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, Peter Maydell, kvmarm, kvm

Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
from the firmware, so KVM implements an interface to provide that for
guests. When such a guest is migrated, we want to make sure we don't
loose the protection the guest relies on.

This introduces two new firmware registers in KVM's GET/SET_ONE_REG
interface, so userland can save the level of protection implemented by
the hypervisor and used by the guest. Upon restoring these registers,
we make sure we don't downgrade and reject any values that would mean
weaker protection.
There is some table in the code to describe the valid combinations.

Patch 1 implements the two firmware registers, patch 2 adds the
documentation.

This solution is using two hardcoded firmware registers for that. Not
sure if we should introduce something based on SMCCC instead, which
would allow us to report implementation of any SMCCC based service in a
generic way, or if this would be too generic.

ARM(32) is a bit of a pain (again), as the firmware register interface
is shared, but 32-bit does not implement all the workarounds.
For now I stuffed two wrappers into kvm_emulate.h, which doesn't sound
like the best solution. Happy to hear about better ideas.

This has been tested with a hack to allow faking the protection level
via a debugfs knob, then saving/restoring via some userland tool calling
the GET_ONE_REG/SET_ONE_REG ioctls.

Please have a look and comment!

Cheers,
Andre

Andre Przywara (2):
  KVM: arm/arm64: Add save/restore support for firmware workaround state
  KVM: doc: add API documentation on the KVM_REG_ARM_WORKAROUNDS
    register

 Documentation/virtual/kvm/arm/psci.txt |  20 ++++
 arch/arm/include/asm/kvm_emulate.h     |  10 ++
 arch/arm/include/uapi/asm/kvm.h        |   9 ++
 arch/arm64/include/asm/kvm_emulate.h   |  14 +++
 arch/arm64/include/uapi/asm/kvm.h      |   9 ++
 virt/kvm/arm/psci.c                    | 138 ++++++++++++++++++++++++-
 6 files changed, 198 insertions(+), 2 deletions(-)

-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-07 12:05 ` Andre Przywara
@ 2019-01-07 12:05   ` Andre Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-07 12:05 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall; +Cc: linux-arm-kernel, kvmarm, kvm

KVM implements the firmware interface for mitigating cache speculation
vulnerabilities. Guests may use this interface to ensure mitigation is
active.
If we want to migrate such a guest to a host with a different support
level for those workarounds, migration might need to fail, to ensure that
critical guests don't loose their protection.

Introduce a way for userland to save and restore the workarounds state.
On restoring we do checks that make sure we don't downgrade our
mitigation level.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/include/asm/kvm_emulate.h   |  10 ++
 arch/arm/include/uapi/asm/kvm.h      |   9 ++
 arch/arm64/include/asm/kvm_emulate.h |  14 +++
 arch/arm64/include/uapi/asm/kvm.h    |   9 ++
 virt/kvm/arm/psci.c                  | 138 ++++++++++++++++++++++++++-
 5 files changed, 178 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index 77121b713bef..2255c50debab 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -275,6 +275,16 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
 	return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
 }
 
+static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
+static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
+						      bool flag)
+{
+}
+
 static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
 {
 	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 4602464ebdfb..02c93b1d8f6d 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -214,6 +214,15 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
 					 KVM_REG_ARM_FW | ((r) & 0xffff))
 #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
 
 /* Device Control API: ARM VGIC */
 #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 506386a3edde..a44f07f68da4 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -336,6 +336,20 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
 	return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
 }
 
+static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
+}
+
+static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
+						      bool flag)
+{
+	if (flag)
+		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
+	else
+		vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
+}
+
 static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
 {
 	if (vcpu_mode_is_32bit(vcpu)) {
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 97c3478ee6e7..4a19ef199a99 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -225,6 +225,15 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
 					 KVM_REG_ARM_FW | ((r) & 0xffff))
 #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
 
 /* Device Control API: ARM VGIC */
 #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 9b73d3ad918a..4c671908ef62 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -445,12 +445,18 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 
 int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
 {
-	return 1;		/* PSCI version */
+	return 3;		/* PSCI version and two workaround registers */
 }
 
 int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 {
-	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
+	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
+		return -EFAULT;
+
+	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1, uindices++))
+		return -EFAULT;
+
+	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++))
 		return -EFAULT;
 
 	return 0;
@@ -469,6 +475,45 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return 0;
 	}
 
+	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
+		void __user *uaddr = (void __user *)(long)reg->addr;
+		u64 val = 0;
+
+		if (kvm_arm_harden_branch_predictor())
+			val = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
+
+		if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
+			return -EFAULT;
+
+		return 0;
+	}
+
+	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
+		void __user *uaddr = (void __user *)(long)reg->addr;
+		u64 val = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
+
+		switch (kvm_arm_have_ssbd()) {
+		case KVM_SSBD_FORCE_DISABLE:
+		case KVM_SSBD_UNKNOWN:
+			break;
+		case KVM_SSBD_KERNEL:
+			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
+			break;
+		case KVM_SSBD_FORCE_ENABLE:
+		case KVM_SSBD_MITIGATED:
+			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED;
+			break;
+		}
+
+		if (kvm_arm_get_vcpu_workaround_2_flag(vcpu))
+			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED;
+
+		if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
+			return -EFAULT;
+
+		return 0;
+	}
+
 	return -EINVAL;
 }
 
@@ -499,5 +544,94 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		}
 	}
 
+	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
+		void __user *uaddr = (void __user *)(long)reg->addr;
+		u64 val;
+
+		if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
+			return -EFAULT;
+
+		/* Make sure we support WORKAROUND_1 if userland asks for it. */
+		if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL) &&
+		    !kvm_arm_harden_branch_predictor())
+			return -EINVAL;
+
+		/* Any other bit is reserved. */
+		if (val & ~KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL)
+			return -EINVAL;
+
+		return 0;
+	}
+
+	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
+		void __user *uaddr = (void __user *)(long)reg->addr;
+		unsigned int wa_state;
+		bool wa_flag;
+		u64 val;
+
+		if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
+			return -EFAULT;
+
+		/* Reject any unknown bits. */
+		if (val & ~(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK|
+			    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
+			return -EINVAL;
+
+		/*
+		 * The value passed from userland has to be compatible with
+		 * our own workaround status. We also have to consider the
+		 * requested per-VCPU state for some combinations:
+		 * --------------+-----------+-----------------+---------------
+		 * \ user value  |           |                 |
+		 *  ------------ | SSBD_NONE |   SSBD_KERNEL   |  SSBD_ALWAYS
+		 *  this kernel \|           |                 |
+		 * --------------+-----------+-----------------+---------------
+		 * UNKNOWN       |     OK    |   -EINVAL       |   -EINVAL
+		 * FORCE_DISABLE |           |                 |
+		 * --------------+-----------+-----------------+---------------
+		 * KERNEL        |     OK    | copy VCPU state | set VCPU state
+		 * --------------+-----------+-----------------+---------------
+		 * FORCE_ENABLE  |     OK    |      OK         |      OK
+		 * MITIGATED     |           |                 |
+		 * --------------+-----------+-----------------+---------------
+		 */
+
+		wa_state = val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
+		switch (wa_state) {
+		case  KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
+			/* We can always support no mitigation (1st column). */
+			return 0;
+		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
+		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED:
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		switch (kvm_arm_have_ssbd()) {
+		case KVM_SSBD_UNKNOWN:
+		case KVM_SSBD_FORCE_DISABLE:
+		default:
+			/* ... but some mitigation was requested (1st line). */
+			return -EINVAL;
+		case KVM_SSBD_FORCE_ENABLE:
+		case KVM_SSBD_MITIGATED:
+			/* Always-on is always compatible (3rd line). */
+			return 0;
+		case KVM_SSBD_KERNEL:		/* 2nd line */
+			wa_flag = val;
+			wa_flag |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
+
+			/* Force on when always-on is requested. */
+			if (wa_state == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED)
+				wa_flag = true;
+			break;
+		}
+
+		kvm_arm_set_vcpu_workaround_2_flag(vcpu, wa_flag);
+
+		return 0;
+	}
+
 	return -EINVAL;
 }
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-01-07 12:05   ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-07 12:05 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, Peter Maydell, kvmarm, kvm

KVM implements the firmware interface for mitigating cache speculation
vulnerabilities. Guests may use this interface to ensure mitigation is
active.
If we want to migrate such a guest to a host with a different support
level for those workarounds, migration might need to fail, to ensure that
critical guests don't loose their protection.

Introduce a way for userland to save and restore the workarounds state.
On restoring we do checks that make sure we don't downgrade our
mitigation level.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 arch/arm/include/asm/kvm_emulate.h   |  10 ++
 arch/arm/include/uapi/asm/kvm.h      |   9 ++
 arch/arm64/include/asm/kvm_emulate.h |  14 +++
 arch/arm64/include/uapi/asm/kvm.h    |   9 ++
 virt/kvm/arm/psci.c                  | 138 ++++++++++++++++++++++++++-
 5 files changed, 178 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index 77121b713bef..2255c50debab 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -275,6 +275,16 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
 	return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
 }
 
+static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
+static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
+						      bool flag)
+{
+}
+
 static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
 {
 	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 4602464ebdfb..02c93b1d8f6d 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -214,6 +214,15 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
 					 KVM_REG_ARM_FW | ((r) & 0xffff))
 #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
 
 /* Device Control API: ARM VGIC */
 #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 506386a3edde..a44f07f68da4 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -336,6 +336,20 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
 	return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
 }
 
+static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
+}
+
+static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
+						      bool flag)
+{
+	if (flag)
+		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
+	else
+		vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
+}
+
 static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
 {
 	if (vcpu_mode_is_32bit(vcpu)) {
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 97c3478ee6e7..4a19ef199a99 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -225,6 +225,15 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
 					 KVM_REG_ARM_FW | ((r) & 0xffff))
 #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
 
 /* Device Control API: ARM VGIC */
 #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index 9b73d3ad918a..4c671908ef62 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -445,12 +445,18 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 
 int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
 {
-	return 1;		/* PSCI version */
+	return 3;		/* PSCI version and two workaround registers */
 }
 
 int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 {
-	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
+	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
+		return -EFAULT;
+
+	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1, uindices++))
+		return -EFAULT;
+
+	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++))
 		return -EFAULT;
 
 	return 0;
@@ -469,6 +475,45 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return 0;
 	}
 
+	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
+		void __user *uaddr = (void __user *)(long)reg->addr;
+		u64 val = 0;
+
+		if (kvm_arm_harden_branch_predictor())
+			val = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
+
+		if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
+			return -EFAULT;
+
+		return 0;
+	}
+
+	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
+		void __user *uaddr = (void __user *)(long)reg->addr;
+		u64 val = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
+
+		switch (kvm_arm_have_ssbd()) {
+		case KVM_SSBD_FORCE_DISABLE:
+		case KVM_SSBD_UNKNOWN:
+			break;
+		case KVM_SSBD_KERNEL:
+			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
+			break;
+		case KVM_SSBD_FORCE_ENABLE:
+		case KVM_SSBD_MITIGATED:
+			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED;
+			break;
+		}
+
+		if (kvm_arm_get_vcpu_workaround_2_flag(vcpu))
+			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED;
+
+		if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
+			return -EFAULT;
+
+		return 0;
+	}
+
 	return -EINVAL;
 }
 
@@ -499,5 +544,94 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		}
 	}
 
+	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
+		void __user *uaddr = (void __user *)(long)reg->addr;
+		u64 val;
+
+		if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
+			return -EFAULT;
+
+		/* Make sure we support WORKAROUND_1 if userland asks for it. */
+		if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL) &&
+		    !kvm_arm_harden_branch_predictor())
+			return -EINVAL;
+
+		/* Any other bit is reserved. */
+		if (val & ~KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL)
+			return -EINVAL;
+
+		return 0;
+	}
+
+	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
+		void __user *uaddr = (void __user *)(long)reg->addr;
+		unsigned int wa_state;
+		bool wa_flag;
+		u64 val;
+
+		if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
+			return -EFAULT;
+
+		/* Reject any unknown bits. */
+		if (val & ~(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK|
+			    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
+			return -EINVAL;
+
+		/*
+		 * The value passed from userland has to be compatible with
+		 * our own workaround status. We also have to consider the
+		 * requested per-VCPU state for some combinations:
+		 * --------------+-----------+-----------------+---------------
+		 * \ user value  |           |                 |
+		 *  ------------ | SSBD_NONE |   SSBD_KERNEL   |  SSBD_ALWAYS
+		 *  this kernel \|           |                 |
+		 * --------------+-----------+-----------------+---------------
+		 * UNKNOWN       |     OK    |   -EINVAL       |   -EINVAL
+		 * FORCE_DISABLE |           |                 |
+		 * --------------+-----------+-----------------+---------------
+		 * KERNEL        |     OK    | copy VCPU state | set VCPU state
+		 * --------------+-----------+-----------------+---------------
+		 * FORCE_ENABLE  |     OK    |      OK         |      OK
+		 * MITIGATED     |           |                 |
+		 * --------------+-----------+-----------------+---------------
+		 */
+
+		wa_state = val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
+		switch (wa_state) {
+		case  KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
+			/* We can always support no mitigation (1st column). */
+			return 0;
+		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
+		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED:
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		switch (kvm_arm_have_ssbd()) {
+		case KVM_SSBD_UNKNOWN:
+		case KVM_SSBD_FORCE_DISABLE:
+		default:
+			/* ... but some mitigation was requested (1st line). */
+			return -EINVAL;
+		case KVM_SSBD_FORCE_ENABLE:
+		case KVM_SSBD_MITIGATED:
+			/* Always-on is always compatible (3rd line). */
+			return 0;
+		case KVM_SSBD_KERNEL:		/* 2nd line */
+			wa_flag = val;
+			wa_flag |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
+
+			/* Force on when always-on is requested. */
+			if (wa_state == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED)
+				wa_flag = true;
+			break;
+		}
+
+		kvm_arm_set_vcpu_workaround_2_flag(vcpu, wa_flag);
+
+		return 0;
+	}
+
 	return -EINVAL;
 }
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 2/2] KVM: doc: add API documentation on the KVM_REG_ARM_WORKAROUNDS register
  2019-01-07 12:05 ` Andre Przywara
@ 2019-01-07 12:05   ` Andre Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-07 12:05 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall; +Cc: linux-arm-kernel, kvmarm, kvm

Add documentation for the newly defined firmware registers to save and
restore any vulnerability migitation status.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 Documentation/virtual/kvm/arm/psci.txt | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
index aafdab887b04..1270dc22acac 100644
--- a/Documentation/virtual/kvm/arm/psci.txt
+++ b/Documentation/virtual/kvm/arm/psci.txt
@@ -28,3 +28,23 @@ The following register is defined:
   - Allows any PSCI version implemented by KVM and compatible with
     v0.2 to be set with SET_ONE_REG
   - Affects the whole VM (even if the register view is per-vcpu)
+
+* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
+  Holds the state of the firmware controlled workaround to mitigate
+  CVE-2017-5715, as described under SMCCC_ARCH_WORKAROUND_1 in [1].
+  Accepted values are:
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL: Workaround not available.
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL: Workaround active for the guest.
+
+* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
+  Holds the state of the firmware controlled workaround to mitigate
+  CVE-2018-3639, as described under SMCCC_ARCH_WORKAROUND_2 in [1].
+  Accepted values are:
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL: Workaround not available.
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL: Workaround available, and can
+      be disabled by a vCPU. If KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED is
+      set, it is active for this vCPU.
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED: Workaround always active
+      or not needed.
+
+[1] https://developer.arm.com/-/media/developer/pdf/ARM_DEN_0070A_Firmware_interfaces_for_mitigating_CVE-2017-5715.pdf
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 2/2] KVM: doc: add API documentation on the KVM_REG_ARM_WORKAROUNDS register
@ 2019-01-07 12:05   ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-07 12:05 UTC (permalink / raw)
  To: Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, Peter Maydell, kvmarm, kvm

Add documentation for the newly defined firmware registers to save and
restore any vulnerability migitation status.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 Documentation/virtual/kvm/arm/psci.txt | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/Documentation/virtual/kvm/arm/psci.txt b/Documentation/virtual/kvm/arm/psci.txt
index aafdab887b04..1270dc22acac 100644
--- a/Documentation/virtual/kvm/arm/psci.txt
+++ b/Documentation/virtual/kvm/arm/psci.txt
@@ -28,3 +28,23 @@ The following register is defined:
   - Allows any PSCI version implemented by KVM and compatible with
     v0.2 to be set with SET_ONE_REG
   - Affects the whole VM (even if the register view is per-vcpu)
+
+* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
+  Holds the state of the firmware controlled workaround to mitigate
+  CVE-2017-5715, as described under SMCCC_ARCH_WORKAROUND_1 in [1].
+  Accepted values are:
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL: Workaround not available.
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL: Workaround active for the guest.
+
+* KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
+  Holds the state of the firmware controlled workaround to mitigate
+  CVE-2018-3639, as described under SMCCC_ARCH_WORKAROUND_2 in [1].
+  Accepted values are:
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL: Workaround not available.
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL: Workaround available, and can
+      be disabled by a vCPU. If KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED is
+      set, it is active for this vCPU.
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED: Workaround always active
+      or not needed.
+
+[1] https://developer.arm.com/-/media/developer/pdf/ARM_DEN_0070A_Firmware_interfaces_for_mitigating_CVE-2017-5715.pdf
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-07 12:05   ` Andre Przywara
@ 2019-01-07 13:17     ` Steven Price
  -1 siblings, 0 replies; 50+ messages in thread
From: Steven Price @ 2019-01-07 13:17 UTC (permalink / raw)
  To: Andre Przywara, Marc Zyngier, Christoffer Dall
  Cc: kvmarm, linux-arm-kernel, kvm

On 07/01/2019 12:05, Andre Przywara wrote:
> KVM implements the firmware interface for mitigating cache speculation
> vulnerabilities. Guests may use this interface to ensure mitigation is
> active.
> If we want to migrate such a guest to a host with a different support
> level for those workarounds, migration might need to fail, to ensure that
> critical guests don't loose their protection.
> 
> Introduce a way for userland to save and restore the workarounds state.
> On restoring we do checks that make sure we don't downgrade our
> mitigation level.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm/include/asm/kvm_emulate.h   |  10 ++
>  arch/arm/include/uapi/asm/kvm.h      |   9 ++
>  arch/arm64/include/asm/kvm_emulate.h |  14 +++
>  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
>  virt/kvm/arm/psci.c                  | 138 ++++++++++++++++++++++++++-
>  5 files changed, 178 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
> index 77121b713bef..2255c50debab 100644
> --- a/arch/arm/include/asm/kvm_emulate.h
> +++ b/arch/arm/include/asm/kvm_emulate.h
> @@ -275,6 +275,16 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
>  	return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
>  }
>  
> +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> +{
> +	return false;
> +}
> +
> +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> +						      bool flag)
> +{
> +}
> +
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
>  {
>  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 4602464ebdfb..02c93b1d8f6d 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
>  
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 506386a3edde..a44f07f68da4 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -336,6 +336,20 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
>  	return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
>  }
>  
> +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
> +}
> +
> +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> +						      bool flag)
> +{
> +	if (flag)
> +		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
> +	else
> +		vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
> +}
> +
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
>  {
>  	if (vcpu_mode_is_32bit(vcpu)) {
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 97c3478ee6e7..4a19ef199a99 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -225,6 +225,15 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1

I can't help feeling we need more than one bit to deal with all the
possible states. The host can support/not-support the workaround (i.e
the HVC) and the guest can be using/not using the workaround.

In particular I can imagine the following situation:

* Guest starts on a host (host A) without the workaround HVC (so
configures not to use it). Assuming the host doesn't need the workaround
the guest is therefore not vulnerable.

* Migrated to a new host (host B) with the workaround HVC (this is
accepted), the guest is potentially vulnerable.

* Migration back to the original host (host A) is then rejected, even
though the guest isn't using the HVC.

I can see two options here:

* Reject the migration to host B as the guest may be vulnerable after
the migration. I.e. the workaround availability cannot change (either
way) during a migration

* Store an extra bit of information which is whether a particular guest
has the HVC exposed to it. Ideally the HVC handling for the workaround
would also get disabled when running on a host which supports the HVC
but was migrated from a host which doesn't. This prevents problems with
a guest which is e.g. migrated during boot and may do feature detection
after the migration.

Since this is a new ABI it would be good to get the register values
sorted even if we don't have a complete implementation of it.

> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
>  
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index 9b73d3ad918a..4c671908ef62 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -445,12 +445,18 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  
>  int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
>  {
> -	return 1;		/* PSCI version */
> +	return 3;		/* PSCI version and two workaround registers */
>  }
>  
>  int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>  {
> -	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
> +	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
> +		return -EFAULT;
> +
> +	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1, uindices++))
> +		return -EFAULT;
> +
> +	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++))
>  		return -EFAULT;
>  
>  	return 0;
> @@ -469,6 +475,45 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  		return 0;
>  	}
>  
> +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
> +		void __user *uaddr = (void __user *)(long)reg->addr;
> +		u64 val = 0;
> +
> +		if (kvm_arm_harden_branch_predictor())
> +			val = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
> +
> +		if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> +			return -EFAULT;
> +
> +		return 0;
> +	}
> +
> +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
> +		void __user *uaddr = (void __user *)(long)reg->addr;
> +		u64 val = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> +
> +		switch (kvm_arm_have_ssbd()) {
> +		case KVM_SSBD_FORCE_DISABLE:
> +		case KVM_SSBD_UNKNOWN:
> +			break;
> +		case KVM_SSBD_KERNEL:
> +			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
> +			break;
> +		case KVM_SSBD_FORCE_ENABLE:
> +		case KVM_SSBD_MITIGATED:
> +			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED;
> +			break;
> +		}
> +
> +		if (kvm_arm_get_vcpu_workaround_2_flag(vcpu))
> +			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED;
> +
> +		if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> +			return -EFAULT;
> +
> +		return 0;
> +	}
> +
>  	return -EINVAL;
>  }
>  
> @@ -499,5 +544,94 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  		}
>  	}
>  
> +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
> +		void __user *uaddr = (void __user *)(long)reg->addr;
> +		u64 val;
> +
> +		if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> +			return -EFAULT;
> +
> +		/* Make sure we support WORKAROUND_1 if userland asks for it. */
> +		if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL) &&
> +		    !kvm_arm_harden_branch_predictor())
> +			return -EINVAL;
> +
> +		/* Any other bit is reserved. */
> +		if (val & ~KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL)
> +			return -EINVAL;
> +
> +		return 0;
> +	}
> +
> +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
> +		void __user *uaddr = (void __user *)(long)reg->addr;
> +		unsigned int wa_state;
> +		bool wa_flag;
> +		u64 val;
> +
> +		if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> +			return -EFAULT;
> +
> +		/* Reject any unknown bits. */
> +		if (val & ~(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK|
> +			    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
> +			return -EINVAL;
> +
> +		/*
> +		 * The value passed from userland has to be compatible with
> +		 * our own workaround status. We also have to consider the
> +		 * requested per-VCPU state for some combinations:
> +		 * --------------+-----------+-----------------+---------------
> +		 * \ user value  |           |                 |
> +		 *  ------------ | SSBD_NONE |   SSBD_KERNEL   |  SSBD_ALWAYS
> +		 *  this kernel \|           |                 |
> +		 * --------------+-----------+-----------------+---------------
> +		 * UNKNOWN       |     OK    |   -EINVAL       |   -EINVAL
> +		 * FORCE_DISABLE |           |                 |
> +		 * --------------+-----------+-----------------+---------------
> +		 * KERNEL        |     OK    | copy VCPU state | set VCPU state
> +		 * --------------+-----------+-----------------+---------------
> +		 * FORCE_ENABLE  |     OK    |      OK         |      OK
> +		 * MITIGATED     |           |                 |
> +		 * --------------+-----------+-----------------+---------------
> +		 */
> +
> +		wa_state = val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
> +		switch (wa_state) {
> +		case  KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
> +			/* We can always support no mitigation (1st column). */
> +			return 0;
> +		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
> +		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED:
> +			break;
> +		default:
> +			return -EINVAL;
> +		}
> +
> +		switch (kvm_arm_have_ssbd()) {
> +		case KVM_SSBD_UNKNOWN:
> +		case KVM_SSBD_FORCE_DISABLE:
> +		default:
> +			/* ... but some mitigation was requested (1st line). */
> +			return -EINVAL;
> +		case KVM_SSBD_FORCE_ENABLE:
> +		case KVM_SSBD_MITIGATED:
> +			/* Always-on is always compatible (3rd line). */
> +			return 0;
> +		case KVM_SSBD_KERNEL:		/* 2nd line */
> +			wa_flag = val;
> +			wa_flag |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
> +
> +			/* Force on when always-on is requested. */
> +			if (wa_state == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED)
> +				wa_flag = true;
> +			break;
> +		}
> +
> +		kvm_arm_set_vcpu_workaround_2_flag(vcpu, wa_flag);

Since this line is only reached in the KVM_SSBD_KERNEL case I think it
should be moved up. I'd personally find the code easier to follow if the
default/UNKNOWN/FORCE_DISABLE case is the one that drops out and all the
others have a "return 0". It took me a while to be sure that wa_flag
wasn't used uninitialised here!

Steve

> +
> +		return 0;
> +	}
> +
>  	return -EINVAL;
>  }
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-01-07 13:17     ` Steven Price
  0 siblings, 0 replies; 50+ messages in thread
From: Steven Price @ 2019-01-07 13:17 UTC (permalink / raw)
  To: Andre Przywara, Marc Zyngier, Christoffer Dall
  Cc: Peter Maydell, kvmarm, linux-arm-kernel, kvm

On 07/01/2019 12:05, Andre Przywara wrote:
> KVM implements the firmware interface for mitigating cache speculation
> vulnerabilities. Guests may use this interface to ensure mitigation is
> active.
> If we want to migrate such a guest to a host with a different support
> level for those workarounds, migration might need to fail, to ensure that
> critical guests don't loose their protection.
> 
> Introduce a way for userland to save and restore the workarounds state.
> On restoring we do checks that make sure we don't downgrade our
> mitigation level.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm/include/asm/kvm_emulate.h   |  10 ++
>  arch/arm/include/uapi/asm/kvm.h      |   9 ++
>  arch/arm64/include/asm/kvm_emulate.h |  14 +++
>  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
>  virt/kvm/arm/psci.c                  | 138 ++++++++++++++++++++++++++-
>  5 files changed, 178 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
> index 77121b713bef..2255c50debab 100644
> --- a/arch/arm/include/asm/kvm_emulate.h
> +++ b/arch/arm/include/asm/kvm_emulate.h
> @@ -275,6 +275,16 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
>  	return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
>  }
>  
> +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> +{
> +	return false;
> +}
> +
> +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> +						      bool flag)
> +{
> +}
> +
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
>  {
>  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 4602464ebdfb..02c93b1d8f6d 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
>  
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 506386a3edde..a44f07f68da4 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -336,6 +336,20 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
>  	return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
>  }
>  
> +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
> +}
> +
> +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> +						      bool flag)
> +{
> +	if (flag)
> +		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
> +	else
> +		vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
> +}
> +
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
>  {
>  	if (vcpu_mode_is_32bit(vcpu)) {
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 97c3478ee6e7..4a19ef199a99 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -225,6 +225,15 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1

I can't help feeling we need more than one bit to deal with all the
possible states. The host can support/not-support the workaround (i.e
the HVC) and the guest can be using/not using the workaround.

In particular I can imagine the following situation:

* Guest starts on a host (host A) without the workaround HVC (so
configures not to use it). Assuming the host doesn't need the workaround
the guest is therefore not vulnerable.

* Migrated to a new host (host B) with the workaround HVC (this is
accepted), the guest is potentially vulnerable.

* Migration back to the original host (host A) is then rejected, even
though the guest isn't using the HVC.

I can see two options here:

* Reject the migration to host B as the guest may be vulnerable after
the migration. I.e. the workaround availability cannot change (either
way) during a migration

* Store an extra bit of information which is whether a particular guest
has the HVC exposed to it. Ideally the HVC handling for the workaround
would also get disabled when running on a host which supports the HVC
but was migrated from a host which doesn't. This prevents problems with
a guest which is e.g. migrated during boot and may do feature detection
after the migration.

Since this is a new ABI it would be good to get the register values
sorted even if we don't have a complete implementation of it.

> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
>  
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index 9b73d3ad918a..4c671908ef62 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -445,12 +445,18 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
>  
>  int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
>  {
> -	return 1;		/* PSCI version */
> +	return 3;		/* PSCI version and two workaround registers */
>  }
>  
>  int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
>  {
> -	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
> +	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
> +		return -EFAULT;
> +
> +	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1, uindices++))
> +		return -EFAULT;
> +
> +	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2, uindices++))
>  		return -EFAULT;
>  
>  	return 0;
> @@ -469,6 +475,45 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  		return 0;
>  	}
>  
> +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
> +		void __user *uaddr = (void __user *)(long)reg->addr;
> +		u64 val = 0;
> +
> +		if (kvm_arm_harden_branch_predictor())
> +			val = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL;
> +
> +		if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> +			return -EFAULT;
> +
> +		return 0;
> +	}
> +
> +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
> +		void __user *uaddr = (void __user *)(long)reg->addr;
> +		u64 val = KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> +
> +		switch (kvm_arm_have_ssbd()) {
> +		case KVM_SSBD_FORCE_DISABLE:
> +		case KVM_SSBD_UNKNOWN:
> +			break;
> +		case KVM_SSBD_KERNEL:
> +			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
> +			break;
> +		case KVM_SSBD_FORCE_ENABLE:
> +		case KVM_SSBD_MITIGATED:
> +			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED;
> +			break;
> +		}
> +
> +		if (kvm_arm_get_vcpu_workaround_2_flag(vcpu))
> +			val |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED;
> +
> +		if (copy_to_user(uaddr, &val, KVM_REG_SIZE(reg->id)))
> +			return -EFAULT;
> +
> +		return 0;
> +	}
> +
>  	return -EINVAL;
>  }
>  
> @@ -499,5 +544,94 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  		}
>  	}
>  
> +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
> +		void __user *uaddr = (void __user *)(long)reg->addr;
> +		u64 val;
> +
> +		if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> +			return -EFAULT;
> +
> +		/* Make sure we support WORKAROUND_1 if userland asks for it. */
> +		if ((val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL) &&
> +		    !kvm_arm_harden_branch_predictor())
> +			return -EINVAL;
> +
> +		/* Any other bit is reserved. */
> +		if (val & ~KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL)
> +			return -EINVAL;
> +
> +		return 0;
> +	}
> +
> +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
> +		void __user *uaddr = (void __user *)(long)reg->addr;
> +		unsigned int wa_state;
> +		bool wa_flag;
> +		u64 val;
> +
> +		if (copy_from_user(&val, uaddr, KVM_REG_SIZE(reg->id)))
> +			return -EFAULT;
> +
> +		/* Reject any unknown bits. */
> +		if (val & ~(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK|
> +			    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
> +			return -EINVAL;
> +
> +		/*
> +		 * The value passed from userland has to be compatible with
> +		 * our own workaround status. We also have to consider the
> +		 * requested per-VCPU state for some combinations:
> +		 * --------------+-----------+-----------------+---------------
> +		 * \ user value  |           |                 |
> +		 *  ------------ | SSBD_NONE |   SSBD_KERNEL   |  SSBD_ALWAYS
> +		 *  this kernel \|           |                 |
> +		 * --------------+-----------+-----------------+---------------
> +		 * UNKNOWN       |     OK    |   -EINVAL       |   -EINVAL
> +		 * FORCE_DISABLE |           |                 |
> +		 * --------------+-----------+-----------------+---------------
> +		 * KERNEL        |     OK    | copy VCPU state | set VCPU state
> +		 * --------------+-----------+-----------------+---------------
> +		 * FORCE_ENABLE  |     OK    |      OK         |      OK
> +		 * MITIGATED     |           |                 |
> +		 * --------------+-----------+-----------------+---------------
> +		 */
> +
> +		wa_state = val & KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
> +		switch (wa_state) {
> +		case  KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
> +			/* We can always support no mitigation (1st column). */
> +			return 0;
> +		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
> +		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED:
> +			break;
> +		default:
> +			return -EINVAL;
> +		}
> +
> +		switch (kvm_arm_have_ssbd()) {
> +		case KVM_SSBD_UNKNOWN:
> +		case KVM_SSBD_FORCE_DISABLE:
> +		default:
> +			/* ... but some mitigation was requested (1st line). */
> +			return -EINVAL;
> +		case KVM_SSBD_FORCE_ENABLE:
> +		case KVM_SSBD_MITIGATED:
> +			/* Always-on is always compatible (3rd line). */
> +			return 0;
> +		case KVM_SSBD_KERNEL:		/* 2nd line */
> +			wa_flag = val;
> +			wa_flag |= KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
> +
> +			/* Force on when always-on is requested. */
> +			if (wa_state == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED)
> +				wa_flag = true;
> +			break;
> +		}
> +
> +		kvm_arm_set_vcpu_workaround_2_flag(vcpu, wa_flag);

Since this line is only reached in the KVM_SSBD_KERNEL case I think it
should be moved up. I'd personally find the code easier to follow if the
default/UNKNOWN/FORCE_DISABLE case is the one that drops out and all the
others have a "return 0". It took me a while to be sure that wa_flag
wasn't used uninitialised here!

Steve

> +
> +		return 0;
> +	}
> +
>  	return -EINVAL;
>  }
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-07 13:17     ` Steven Price
@ 2019-01-21 17:04       ` Andre Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-21 17:04 UTC (permalink / raw)
  To: Steven Price; +Cc: kvm, Marc Zyngier, kvmarm, linux-arm-kernel

On Mon, 7 Jan 2019 13:17:37 +0000
Steven Price <steven.price@arm.com> wrote:

Hi,

> On 07/01/2019 12:05, Andre Przywara wrote:
> > KVM implements the firmware interface for mitigating cache
> > speculation vulnerabilities. Guests may use this interface to
> > ensure mitigation is active.
> > If we want to migrate such a guest to a host with a different
> > support level for those workarounds, migration might need to fail,
> > to ensure that critical guests don't loose their protection.
> > 
> > Introduce a way for userland to save and restore the workarounds
> > state. On restoring we do checks that make sure we don't downgrade
> > our mitigation level.
> > 
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > ---
> >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> >  virt/kvm/arm/psci.c                  | 138
> > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+), 2
> > deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > b/arch/arm/include/asm/kvm_emulate.h index
> > 77121b713bef..2255c50debab 100644 ---
> > a/arch/arm/include/asm/kvm_emulate.h +++
> > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu) +{
> > +	return false;
> > +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > b/arch/arm/include/uapi/asm/kvm.h index 4602464ebdfb..02c93b1d8f6d
> > 100644 --- a/arch/arm/include/uapi/asm/kvm.h
> > +++ b/arch/arm/include/uapi/asm/kvm.h
> > @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
> >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM |
> > KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1
> > KVM_REG_ARM_FW_REG(1) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4 
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h
> > b/arch/arm64/include/asm/kvm_emulate.h index
> > 506386a3edde..a44f07f68da4 100644 ---
> > a/arch/arm64/include/asm/kvm_emulate.h +++
> > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@ static
> > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu) +{
> > +	return vcpu->arch.workaround_flags &
> > VCPU_WORKAROUND_2_FLAG; +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +	if (flag)
> > +		vcpu->arch.workaround_flags |=
> > VCPU_WORKAROUND_2_FLAG;
> > +	else
> > +		vcpu->arch.workaround_flags &=
> > ~VCPU_WORKAROUND_2_FLAG; +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	if (vcpu_mode_is_32bit(vcpu)) {
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > b/arch/arm64/include/uapi/asm/kvm.h index
> > 97c3478ee6e7..4a19ef199a99 100644 ---
> > a/arch/arm64/include/uapi/asm/kvm.h +++
> > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > KVM_REG_ARM_FW_REG(0) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1  
> 
> I can't help feeling we need more than one bit to deal with all the
> possible states. The host can support/not-support the workaround (i.e
> the HVC) and the guest can be using/not using the workaround.

I don't think we care in the moment about the guest using it or not,
the current implementation is binary: Either the host offers the
workaround or not. We just pass this on to a KVM guest.

But it seems the workaround is *architected* to give three choices:
1) SMC call not implemented, meaning *either* not needed or just not
  implemented (old firmware).
2) SMC call implemented and required on that CPU.
3) SMC call implemented, but *not* required on that CPU.

Now it seems at least on the host side we leave something on the table,
as we neither consider the per-CPU nature of this workaround nor case
3, in which case we do the SMC call needlessly.
This should be fixed, I guess, but as a separate issue.

> In particular I can imagine the following situation:
> 
> * Guest starts on a host (host A) without the workaround HVC (so
> configures not to use it). Assuming the host doesn't need the
> workaround the guest is therefore not vulnerable.

But we don't know this. Not implemented could also (more likely,
actually) mean: workaround not supported, thus vulnerable (old
firmware/kernel).

> * Migrated to a new host (host B) with the workaround HVC (this is
> accepted), the guest is potentially vulnerable.

... as it was before, where we didn't know for sure if the system was
safe.

> * Migration back to the original host (host A) is then rejected, even
> though the guest isn't using the HVC.

Again, we can't be sure, so denying migration is on the safe side.
 
> I can see two options here:
> 
> * Reject the migration to host B as the guest may be vulnerable after
> the migration. I.e. the workaround availability cannot change (either
> way) during a migration
> 
> * Store an extra bit of information which is whether a particular
> guest has the HVC exposed to it. Ideally the HVC handling for the
> workaround would also get disabled when running on a host which
> supports the HVC but was migrated from a host which doesn't. This
> prevents problems with a guest which is e.g. migrated during boot and
> may do feature detection after the migration.
> 
> Since this is a new ABI it would be good to get the register values
> sorted even if we don't have a complete implementation of it.

I agree to that part: this userland interface should be as good as
possible. So I think as a separate issue we should upgrade both the
host side and the guest part of the workaround to deal with all three
cases, but should indeed create the interface in a forward compatible
way.

I will look into extending the register to use two bits to accommodate
all three cases.

> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > KVM_REG_ARM_FW_REG(2) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4 
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> > index 9b73d3ad918a..4c671908ef62 100644
> > --- a/virt/kvm/arm/psci.c
> > +++ b/virt/kvm/arm/psci.c
> > @@ -445,12 +445,18 @@ int kvm_hvc_call_handler(struct kvm_vcpu
> > *vcpu) 
> >  int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> >  {
> > -	return 1;		/* PSCI version */
> > +	return 3;		/* PSCI version and two
> > workaround registers */ }
> >  
> >  int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user
> > *uindices) {
> > -	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
> > +	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
> > +		return -EFAULT;
> > +
> > +	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
> > uindices++))
> > +		return -EFAULT;
> > +
> > +	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2,
> > uindices++)) return -EFAULT;
> >  
> >  	return 0;
> > @@ -469,6 +475,45 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu,
> > const struct kvm_one_reg *reg) return 0;
> >  	}
> >  
> > +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
> > +		void __user *uaddr = (void __user
> > *)(long)reg->addr;
> > +		u64 val = 0;
> > +
> > +		if (kvm_arm_harden_branch_predictor())
> > +			val =
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL; +
> > +		if (copy_to_user(uaddr, &val,
> > KVM_REG_SIZE(reg->id)))
> > +			return -EFAULT;
> > +
> > +		return 0;
> > +	}
> > +
> > +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
> > +		void __user *uaddr = (void __user
> > *)(long)reg->addr;
> > +		u64 val =
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL; +
> > +		switch (kvm_arm_have_ssbd()) {
> > +		case KVM_SSBD_FORCE_DISABLE:
> > +		case KVM_SSBD_UNKNOWN:
> > +			break;
> > +		case KVM_SSBD_KERNEL:
> > +			val |=
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
> > +			break;
> > +		case KVM_SSBD_FORCE_ENABLE:
> > +		case KVM_SSBD_MITIGATED:
> > +			val |=
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED;
> > +			break;
> > +		}
> > +
> > +		if (kvm_arm_get_vcpu_workaround_2_flag(vcpu))
> > +			val |=
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED; +
> > +		if (copy_to_user(uaddr, &val,
> > KVM_REG_SIZE(reg->id)))
> > +			return -EFAULT;
> > +
> > +		return 0;
> > +	}
> > +
> >  	return -EINVAL;
> >  }
> >  
> > @@ -499,5 +544,94 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu,
> > const struct kvm_one_reg *reg) }
> >  	}
> >  
> > +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
> > +		void __user *uaddr = (void __user
> > *)(long)reg->addr;
> > +		u64 val;
> > +
> > +		if (copy_from_user(&val, uaddr,
> > KVM_REG_SIZE(reg->id)))
> > +			return -EFAULT;
> > +
> > +		/* Make sure we support WORKAROUND_1 if userland
> > asks for it. */
> > +		if ((val &
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL) &&
> > +		    !kvm_arm_harden_branch_predictor())
> > +			return -EINVAL;
> > +
> > +		/* Any other bit is reserved. */
> > +		if (val &
> > ~KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL)
> > +			return -EINVAL;
> > +
> > +		return 0;
> > +	}
> > +
> > +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
> > +		void __user *uaddr = (void __user
> > *)(long)reg->addr;
> > +		unsigned int wa_state;
> > +		bool wa_flag;
> > +		u64 val;
> > +
> > +		if (copy_from_user(&val, uaddr,
> > KVM_REG_SIZE(reg->id)))
> > +			return -EFAULT;
> > +
> > +		/* Reject any unknown bits. */
> > +		if (val &
> > ~(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK|
> > +
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
> > +			return -EINVAL;
> > +
> > +		/*
> > +		 * The value passed from userland has to be
> > compatible with
> > +		 * our own workaround status. We also have to
> > consider the
> > +		 * requested per-VCPU state for some combinations:
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 * \ user value  |           |                 |
> > +		 *  ------------ | SSBD_NONE |   SSBD_KERNEL   |
> > SSBD_ALWAYS
> > +		 *  this kernel \|           |                 |
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 * UNKNOWN       |     OK    |   -EINVAL       |
> > -EINVAL
> > +		 * FORCE_DISABLE |           |                 |
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 * KERNEL        |     OK    | copy VCPU state |
> > set VCPU state
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 * FORCE_ENABLE  |     OK    |      OK
> > |      OK
> > +		 * MITIGATED     |           |                 |
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 */
> > +
> > +		wa_state = val &
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
> > +		switch (wa_state) {
> > +		case
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
> > +			/* We can always support no mitigation
> > (1st column). */
> > +			return 0;
> > +		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
> > +		case
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED:
> > +			break;
> > +		default:
> > +			return -EINVAL;
> > +		}
> > +
> > +		switch (kvm_arm_have_ssbd()) {
> > +		case KVM_SSBD_UNKNOWN:
> > +		case KVM_SSBD_FORCE_DISABLE:
> > +		default:
> > +			/* ... but some mitigation was requested
> > (1st line). */
> > +			return -EINVAL;
> > +		case KVM_SSBD_FORCE_ENABLE:
> > +		case KVM_SSBD_MITIGATED:
> > +			/* Always-on is always compatible (3rd
> > line). */
> > +			return 0;
> > +		case KVM_SSBD_KERNEL:		/* 2nd line */
> > +			wa_flag = val;
> > +			wa_flag |=
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK; +
> > +			/* Force on when always-on is requested. */
> > +			if (wa_state ==
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED)
> > +				wa_flag = true;
> > +			break;
> > +		}
> > +
> > +		kvm_arm_set_vcpu_workaround_2_flag(vcpu,
> > wa_flag);  
> 
> Since this line is only reached in the KVM_SSBD_KERNEL case I think it
> should be moved up. I'd personally find the code easier to follow if
> the default/UNKNOWN/FORCE_DISABLE case is the one that drops out and
> all the others have a "return 0". It took me a while to be sure that
> wa_flag wasn't used uninitialised here!

I will check, I think I tried this as well, but it was more messy
somewhere else.

Cheers,
Andre.

> 
> Steve
> 
> > +
> > +		return 0;
> > +	}
> > +
> >  	return -EINVAL;
> >  }
> >   
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-01-21 17:04       ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-21 17:04 UTC (permalink / raw)
  To: Steven Price
  Cc: Peter Maydell, kvm, Marc Zyngier, Christoffer Dall, kvmarm,
	linux-arm-kernel

On Mon, 7 Jan 2019 13:17:37 +0000
Steven Price <steven.price@arm.com> wrote:

Hi,

> On 07/01/2019 12:05, Andre Przywara wrote:
> > KVM implements the firmware interface for mitigating cache
> > speculation vulnerabilities. Guests may use this interface to
> > ensure mitigation is active.
> > If we want to migrate such a guest to a host with a different
> > support level for those workarounds, migration might need to fail,
> > to ensure that critical guests don't loose their protection.
> > 
> > Introduce a way for userland to save and restore the workarounds
> > state. On restoring we do checks that make sure we don't downgrade
> > our mitigation level.
> > 
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > ---
> >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> >  virt/kvm/arm/psci.c                  | 138
> > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+), 2
> > deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > b/arch/arm/include/asm/kvm_emulate.h index
> > 77121b713bef..2255c50debab 100644 ---
> > a/arch/arm/include/asm/kvm_emulate.h +++
> > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu) +{
> > +	return false;
> > +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > b/arch/arm/include/uapi/asm/kvm.h index 4602464ebdfb..02c93b1d8f6d
> > 100644 --- a/arch/arm/include/uapi/asm/kvm.h
> > +++ b/arch/arm/include/uapi/asm/kvm.h
> > @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
> >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM |
> > KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1
> > KVM_REG_ARM_FW_REG(1) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4 
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h
> > b/arch/arm64/include/asm/kvm_emulate.h index
> > 506386a3edde..a44f07f68da4 100644 ---
> > a/arch/arm64/include/asm/kvm_emulate.h +++
> > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@ static
> > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu) +{
> > +	return vcpu->arch.workaround_flags &
> > VCPU_WORKAROUND_2_FLAG; +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +	if (flag)
> > +		vcpu->arch.workaround_flags |=
> > VCPU_WORKAROUND_2_FLAG;
> > +	else
> > +		vcpu->arch.workaround_flags &=
> > ~VCPU_WORKAROUND_2_FLAG; +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	if (vcpu_mode_is_32bit(vcpu)) {
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > b/arch/arm64/include/uapi/asm/kvm.h index
> > 97c3478ee6e7..4a19ef199a99 100644 ---
> > a/arch/arm64/include/uapi/asm/kvm.h +++
> > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > KVM_REG_ARM_FW_REG(0) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1  
> 
> I can't help feeling we need more than one bit to deal with all the
> possible states. The host can support/not-support the workaround (i.e
> the HVC) and the guest can be using/not using the workaround.

I don't think we care in the moment about the guest using it or not,
the current implementation is binary: Either the host offers the
workaround or not. We just pass this on to a KVM guest.

But it seems the workaround is *architected* to give three choices:
1) SMC call not implemented, meaning *either* not needed or just not
  implemented (old firmware).
2) SMC call implemented and required on that CPU.
3) SMC call implemented, but *not* required on that CPU.

Now it seems at least on the host side we leave something on the table,
as we neither consider the per-CPU nature of this workaround nor case
3, in which case we do the SMC call needlessly.
This should be fixed, I guess, but as a separate issue.

> In particular I can imagine the following situation:
> 
> * Guest starts on a host (host A) without the workaround HVC (so
> configures not to use it). Assuming the host doesn't need the
> workaround the guest is therefore not vulnerable.

But we don't know this. Not implemented could also (more likely,
actually) mean: workaround not supported, thus vulnerable (old
firmware/kernel).

> * Migrated to a new host (host B) with the workaround HVC (this is
> accepted), the guest is potentially vulnerable.

... as it was before, where we didn't know for sure if the system was
safe.

> * Migration back to the original host (host A) is then rejected, even
> though the guest isn't using the HVC.

Again, we can't be sure, so denying migration is on the safe side.
 
> I can see two options here:
> 
> * Reject the migration to host B as the guest may be vulnerable after
> the migration. I.e. the workaround availability cannot change (either
> way) during a migration
> 
> * Store an extra bit of information which is whether a particular
> guest has the HVC exposed to it. Ideally the HVC handling for the
> workaround would also get disabled when running on a host which
> supports the HVC but was migrated from a host which doesn't. This
> prevents problems with a guest which is e.g. migrated during boot and
> may do feature detection after the migration.
> 
> Since this is a new ABI it would be good to get the register values
> sorted even if we don't have a complete implementation of it.

I agree to that part: this userland interface should be as good as
possible. So I think as a separate issue we should upgrade both the
host side and the guest part of the workaround to deal with all three
cases, but should indeed create the interface in a forward compatible
way.

I will look into extending the register to use two bits to accommodate
all three cases.

> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > KVM_REG_ARM_FW_REG(2) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4 
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> > index 9b73d3ad918a..4c671908ef62 100644
> > --- a/virt/kvm/arm/psci.c
> > +++ b/virt/kvm/arm/psci.c
> > @@ -445,12 +445,18 @@ int kvm_hvc_call_handler(struct kvm_vcpu
> > *vcpu) 
> >  int kvm_arm_get_fw_num_regs(struct kvm_vcpu *vcpu)
> >  {
> > -	return 1;		/* PSCI version */
> > +	return 3;		/* PSCI version and two
> > workaround registers */ }
> >  
> >  int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user
> > *uindices) {
> > -	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices))
> > +	if (put_user(KVM_REG_ARM_PSCI_VERSION, uindices++))
> > +		return -EFAULT;
> > +
> > +	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
> > uindices++))
> > +		return -EFAULT;
> > +
> > +	if (put_user(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2,
> > uindices++)) return -EFAULT;
> >  
> >  	return 0;
> > @@ -469,6 +475,45 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu,
> > const struct kvm_one_reg *reg) return 0;
> >  	}
> >  
> > +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
> > +		void __user *uaddr = (void __user
> > *)(long)reg->addr;
> > +		u64 val = 0;
> > +
> > +		if (kvm_arm_harden_branch_predictor())
> > +			val =
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL; +
> > +		if (copy_to_user(uaddr, &val,
> > KVM_REG_SIZE(reg->id)))
> > +			return -EFAULT;
> > +
> > +		return 0;
> > +	}
> > +
> > +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
> > +		void __user *uaddr = (void __user
> > *)(long)reg->addr;
> > +		u64 val =
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL; +
> > +		switch (kvm_arm_have_ssbd()) {
> > +		case KVM_SSBD_FORCE_DISABLE:
> > +		case KVM_SSBD_UNKNOWN:
> > +			break;
> > +		case KVM_SSBD_KERNEL:
> > +			val |=
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
> > +			break;
> > +		case KVM_SSBD_FORCE_ENABLE:
> > +		case KVM_SSBD_MITIGATED:
> > +			val |=
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED;
> > +			break;
> > +		}
> > +
> > +		if (kvm_arm_get_vcpu_workaround_2_flag(vcpu))
> > +			val |=
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED; +
> > +		if (copy_to_user(uaddr, &val,
> > KVM_REG_SIZE(reg->id)))
> > +			return -EFAULT;
> > +
> > +		return 0;
> > +	}
> > +
> >  	return -EINVAL;
> >  }
> >  
> > @@ -499,5 +544,94 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu,
> > const struct kvm_one_reg *reg) }
> >  	}
> >  
> > +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1) {
> > +		void __user *uaddr = (void __user
> > *)(long)reg->addr;
> > +		u64 val;
> > +
> > +		if (copy_from_user(&val, uaddr,
> > KVM_REG_SIZE(reg->id)))
> > +			return -EFAULT;
> > +
> > +		/* Make sure we support WORKAROUND_1 if userland
> > asks for it. */
> > +		if ((val &
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL) &&
> > +		    !kvm_arm_harden_branch_predictor())
> > +			return -EINVAL;
> > +
> > +		/* Any other bit is reserved. */
> > +		if (val &
> > ~KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL)
> > +			return -EINVAL;
> > +
> > +		return 0;
> > +	}
> > +
> > +	if (reg->id == KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2) {
> > +		void __user *uaddr = (void __user
> > *)(long)reg->addr;
> > +		unsigned int wa_state;
> > +		bool wa_flag;
> > +		u64 val;
> > +
> > +		if (copy_from_user(&val, uaddr,
> > KVM_REG_SIZE(reg->id)))
> > +			return -EFAULT;
> > +
> > +		/* Reject any unknown bits. */
> > +		if (val &
> > ~(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK|
> > +
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED))
> > +			return -EINVAL;
> > +
> > +		/*
> > +		 * The value passed from userland has to be
> > compatible with
> > +		 * our own workaround status. We also have to
> > consider the
> > +		 * requested per-VCPU state for some combinations:
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 * \ user value  |           |                 |
> > +		 *  ------------ | SSBD_NONE |   SSBD_KERNEL   |
> > SSBD_ALWAYS
> > +		 *  this kernel \|           |                 |
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 * UNKNOWN       |     OK    |   -EINVAL       |
> > -EINVAL
> > +		 * FORCE_DISABLE |           |                 |
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 * KERNEL        |     OK    | copy VCPU state |
> > set VCPU state
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 * FORCE_ENABLE  |     OK    |      OK
> > |      OK
> > +		 * MITIGATED     |           |                 |
> > +		 *
> > --------------+-----------+-----------------+---------------
> > +		 */
> > +
> > +		wa_state = val &
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK;
> > +		switch (wa_state) {
> > +		case
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
> > +			/* We can always support no mitigation
> > (1st column). */
> > +			return 0;
> > +		case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
> > +		case
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED:
> > +			break;
> > +		default:
> > +			return -EINVAL;
> > +		}
> > +
> > +		switch (kvm_arm_have_ssbd()) {
> > +		case KVM_SSBD_UNKNOWN:
> > +		case KVM_SSBD_FORCE_DISABLE:
> > +		default:
> > +			/* ... but some mitigation was requested
> > (1st line). */
> > +			return -EINVAL;
> > +		case KVM_SSBD_FORCE_ENABLE:
> > +		case KVM_SSBD_MITIGATED:
> > +			/* Always-on is always compatible (3rd
> > line). */
> > +			return 0;
> > +		case KVM_SSBD_KERNEL:		/* 2nd line */
> > +			wa_flag = val;
> > +			wa_flag |=
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK; +
> > +			/* Force on when always-on is requested. */
> > +			if (wa_state ==
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED)
> > +				wa_flag = true;
> > +			break;
> > +		}
> > +
> > +		kvm_arm_set_vcpu_workaround_2_flag(vcpu,
> > wa_flag);  
> 
> Since this line is only reached in the KVM_SSBD_KERNEL case I think it
> should be moved up. I'd personally find the code easier to follow if
> the default/UNKNOWN/FORCE_DISABLE case is the one that drops out and
> all the others have a "return 0". It took me a while to be sure that
> wa_flag wasn't used uninitialised here!

I will check, I think I tried this as well, but it was more messy
somewhere else.

Cheers,
Andre.

> 
> Steve
> 
> > +
> > +		return 0;
> > +	}
> > +
> >  	return -EINVAL;
> >  }
> >   
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
  2019-01-07 12:05 ` Andre Przywara
@ 2019-01-22 10:17   ` Dave Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-22 10:17 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Marc Zyngier, kvm, linux-arm-kernel, kvmarm

On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> from the firmware, so KVM implements an interface to provide that for
> guests. When such a guest is migrated, we want to make sure we don't
> loose the protection the guest relies on.
> 
> This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> interface, so userland can save the level of protection implemented by
> the hypervisor and used by the guest. Upon restoring these registers,
> we make sure we don't downgrade and reject any values that would mean
> weaker protection.

Just trolling here, but could we treat these as immutable, like the ID
registers?  

We don't support migration between nodes that are "too different" in any
case, so I wonder if adding complex logic to compare vulnerabilities and
workarounds is liable to create more problems than it solves...

Do we know of anyone who explicitly needs this flexibility yet?

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
@ 2019-01-22 10:17   ` Dave Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-22 10:17 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Marc Zyngier, kvm, Christoffer Dall, linux-arm-kernel, kvmarm

On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> from the firmware, so KVM implements an interface to provide that for
> guests. When such a guest is migrated, we want to make sure we don't
> loose the protection the guest relies on.
> 
> This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> interface, so userland can save the level of protection implemented by
> the hypervisor and used by the guest. Upon restoring these registers,
> we make sure we don't downgrade and reject any values that would mean
> weaker protection.

Just trolling here, but could we treat these as immutable, like the ID
registers?  

We don't support migration between nodes that are "too different" in any
case, so I wonder if adding complex logic to compare vulnerabilities and
workarounds is liable to create more problems than it solves...

Do we know of anyone who explicitly needs this flexibility yet?

[...]

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
  2019-01-22 10:17   ` Dave Martin
@ 2019-01-22 10:41     ` Andre Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-22 10:41 UTC (permalink / raw)
  To: Dave Martin; +Cc: Marc Zyngier, kvm, linux-arm-kernel, kvmarm

On Tue, 22 Jan 2019 10:17:00 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

> On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > Workarounds for Spectre variant 2 or 4 vulnerabilities require some
> > help from the firmware, so KVM implements an interface to provide
> > that for guests. When such a guest is migrated, we want to make
> > sure we don't loose the protection the guest relies on.
> > 
> > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > interface, so userland can save the level of protection implemented
> > by the hypervisor and used by the guest. Upon restoring these
> > registers, we make sure we don't downgrade and reject any values
> > that would mean weaker protection.  
> 
> Just trolling here, but could we treat these as immutable, like the ID
> registers?  
> 
> We don't support migration between nodes that are "too different" in
> any case, so I wonder if adding complex logic to compare
> vulnerabilities and workarounds is liable to create more problems
> than it solves...

That is a good point, and we should keep an eye on that it doesn't get
out of hands here. Indeed it is not clear yet how many users really
want to migrate between hosts with a different CPU or platform.
But ...
 
> Do we know of anyone who explicitly needs this flexibility yet?

I think there is a good use case to migrate from a vulnerable host
to one which implements mitigations or isn't vulnerable in the first
place, in which case we want to allow migrations. The scenario here
would probably to migrate VMs away, update the firmware, reboot
the host and migrate the VMs back.

For the other direction (increasing vulnerability) we deny it here,
which is in line with what you think of?

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
@ 2019-01-22 10:41     ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-22 10:41 UTC (permalink / raw)
  To: Dave Martin; +Cc: Marc Zyngier, kvm, Christoffer Dall, linux-arm-kernel, kvmarm

On Tue, 22 Jan 2019 10:17:00 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

> On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > Workarounds for Spectre variant 2 or 4 vulnerabilities require some
> > help from the firmware, so KVM implements an interface to provide
> > that for guests. When such a guest is migrated, we want to make
> > sure we don't loose the protection the guest relies on.
> > 
> > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > interface, so userland can save the level of protection implemented
> > by the hypervisor and used by the guest. Upon restoring these
> > registers, we make sure we don't downgrade and reject any values
> > that would mean weaker protection.  
> 
> Just trolling here, but could we treat these as immutable, like the ID
> registers?  
> 
> We don't support migration between nodes that are "too different" in
> any case, so I wonder if adding complex logic to compare
> vulnerabilities and workarounds is liable to create more problems
> than it solves...

That is a good point, and we should keep an eye on that it doesn't get
out of hands here. Indeed it is not clear yet how many users really
want to migrate between hosts with a different CPU or platform.
But ...
 
> Do we know of anyone who explicitly needs this flexibility yet?

I think there is a good use case to migrate from a vulnerable host
to one which implements mitigations or isn't vulnerable in the first
place, in which case we want to allow migrations. The scenario here
would probably to migrate VMs away, update the firmware, reboot
the host and migrate the VMs back.

For the other direction (increasing vulnerability) we deny it here,
which is in line with what you think of?

Cheers,
Andre.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
  2019-01-22 10:17   ` Dave Martin
@ 2019-01-22 11:11     ` Marc Zyngier
  -1 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-01-22 11:11 UTC (permalink / raw)
  To: Dave Martin; +Cc: Andre Przywara, kvm, linux-arm-kernel, kvmarm

On Tue, 22 Jan 2019 10:17:00 +0000,
Dave Martin <Dave.Martin@arm.com> wrote:
> 
> On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > from the firmware, so KVM implements an interface to provide that for
> > guests. When such a guest is migrated, we want to make sure we don't
> > loose the protection the guest relies on.
> > 
> > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > interface, so userland can save the level of protection implemented by
> > the hypervisor and used by the guest. Upon restoring these registers,
> > we make sure we don't downgrade and reject any values that would mean
> > weaker protection.
> 
> Just trolling here, but could we treat these as immutable, like the ID
> registers?  
> 
> We don't support migration between nodes that are "too different" in any
> case, so I wonder if adding complex logic to compare vulnerabilities and
> workarounds is liable to create more problems than it solves...

And that's exactly the case we're trying to avoid. Two instances of
the same HW. One with firmware mitigations, one without. Migrating in
one direction is perfectly safe, migrating in the other isn't.

It is not about migrating to different HW at all.

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
@ 2019-01-22 11:11     ` Marc Zyngier
  0 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-01-22 11:11 UTC (permalink / raw)
  To: Dave Martin
  Cc: Andre Przywara, kvm, Christoffer Dall, linux-arm-kernel, kvmarm

On Tue, 22 Jan 2019 10:17:00 +0000,
Dave Martin <Dave.Martin@arm.com> wrote:
> 
> On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > from the firmware, so KVM implements an interface to provide that for
> > guests. When such a guest is migrated, we want to make sure we don't
> > loose the protection the guest relies on.
> > 
> > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > interface, so userland can save the level of protection implemented by
> > the hypervisor and used by the guest. Upon restoring these registers,
> > we make sure we don't downgrade and reject any values that would mean
> > weaker protection.
> 
> Just trolling here, but could we treat these as immutable, like the ID
> registers?  
> 
> We don't support migration between nodes that are "too different" in any
> case, so I wonder if adding complex logic to compare vulnerabilities and
> workarounds is liable to create more problems than it solves...

And that's exactly the case we're trying to avoid. Two instances of
the same HW. One with firmware mitigations, one without. Migrating in
one direction is perfectly safe, migrating in the other isn't.

It is not about migrating to different HW at all.

	M.

-- 
Jazz is not dead, it just smell funny.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
  2019-01-22 11:11     ` Marc Zyngier
@ 2019-01-22 13:56       ` Dave Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-22 13:56 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andre Przywara, linux-arm-kernel, kvm, kvmarm

On Tue, Jan 22, 2019 at 11:11:09AM +0000, Marc Zyngier wrote:
> On Tue, 22 Jan 2019 10:17:00 +0000,
> Dave Martin <Dave.Martin@arm.com> wrote:
> > 
> > On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > > from the firmware, so KVM implements an interface to provide that for
> > > guests. When such a guest is migrated, we want to make sure we don't
> > > loose the protection the guest relies on.
> > > 
> > > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > > interface, so userland can save the level of protection implemented by
> > > the hypervisor and used by the guest. Upon restoring these registers,
> > > we make sure we don't downgrade and reject any values that would mean
> > > weaker protection.
> > 
> > Just trolling here, but could we treat these as immutable, like the ID
> > registers?  
> > 
> > We don't support migration between nodes that are "too different" in any
> > case, so I wonder if adding complex logic to compare vulnerabilities and
> > workarounds is liable to create more problems than it solves...
> 
> And that's exactly the case we're trying to avoid. Two instances of
> the same HW. One with firmware mitigations, one without. Migrating in
> one direction is perfectly safe, migrating in the other isn't.
> 
> It is not about migrating to different HW at all.

So this is a realistic scenario when deploying a firmware update across
a cluter that has homogeneous hardware -- there will temporarly be
different firmware versions running on different nodes?

My concern is really "will the checking be too buggy / untested in
practice to be justified by the use case".

I'll take a closer look at the checking logic.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
@ 2019-01-22 13:56       ` Dave Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-22 13:56 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andre Przywara, linux-arm-kernel, kvm, kvmarm

On Tue, Jan 22, 2019 at 11:11:09AM +0000, Marc Zyngier wrote:
> On Tue, 22 Jan 2019 10:17:00 +0000,
> Dave Martin <Dave.Martin@arm.com> wrote:
> > 
> > On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > > from the firmware, so KVM implements an interface to provide that for
> > > guests. When such a guest is migrated, we want to make sure we don't
> > > loose the protection the guest relies on.
> > > 
> > > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > > interface, so userland can save the level of protection implemented by
> > > the hypervisor and used by the guest. Upon restoring these registers,
> > > we make sure we don't downgrade and reject any values that would mean
> > > weaker protection.
> > 
> > Just trolling here, but could we treat these as immutable, like the ID
> > registers?  
> > 
> > We don't support migration between nodes that are "too different" in any
> > case, so I wonder if adding complex logic to compare vulnerabilities and
> > workarounds is liable to create more problems than it solves...
> 
> And that's exactly the case we're trying to avoid. Two instances of
> the same HW. One with firmware mitigations, one without. Migrating in
> one direction is perfectly safe, migrating in the other isn't.
> 
> It is not about migrating to different HW at all.

So this is a realistic scenario when deploying a firmware update across
a cluter that has homogeneous hardware -- there will temporarly be
different firmware versions running on different nodes?

My concern is really "will the checking be too buggy / untested in
practice to be justified by the use case".

I'll take a closer look at the checking logic.

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
  2019-01-22 13:56       ` Dave Martin
@ 2019-01-22 14:51         ` Marc Zyngier
  -1 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-01-22 14:51 UTC (permalink / raw)
  To: Dave Martin; +Cc: Andre Przywara, linux-arm-kernel, kvm, kvmarm

On Tue, 22 Jan 2019 13:56:34 +0000,
Dave Martin <Dave.Martin@arm.com> wrote:
> 
> On Tue, Jan 22, 2019 at 11:11:09AM +0000, Marc Zyngier wrote:
> > On Tue, 22 Jan 2019 10:17:00 +0000,
> > Dave Martin <Dave.Martin@arm.com> wrote:
> > > 
> > > On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > > > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > > > from the firmware, so KVM implements an interface to provide that for
> > > > guests. When such a guest is migrated, we want to make sure we don't
> > > > loose the protection the guest relies on.
> > > > 
> > > > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > > > interface, so userland can save the level of protection implemented by
> > > > the hypervisor and used by the guest. Upon restoring these registers,
> > > > we make sure we don't downgrade and reject any values that would mean
> > > > weaker protection.
> > > 
> > > Just trolling here, but could we treat these as immutable, like the ID
> > > registers?  
> > > 
> > > We don't support migration between nodes that are "too different" in any
> > > case, so I wonder if adding complex logic to compare vulnerabilities and
> > > workarounds is liable to create more problems than it solves...
> > 
> > And that's exactly the case we're trying to avoid. Two instances of
> > the same HW. One with firmware mitigations, one without. Migrating in
> > one direction is perfectly safe, migrating in the other isn't.
> > 
> > It is not about migrating to different HW at all.
> 
> So this is a realistic scenario when deploying a firmware update across
> a cluter that has homogeneous hardware -- there will temporarly be
> different firmware versions running on different nodes?

Case in point: I have on my desk two AMD Seattle systems. One with an
ancient firmware that doesn't mitigate anything, and one that has all
the mitigations applied (and correctly advertised). I can migrate
stuff back and forth, and that's really bad.

What people do in their data centre is none of my business,
really. What concerns me is that there is a potential for something
bad to happen without people noticing. And it is KVM's job to do the
right thing in this case.

> My concern is really "will the checking be too buggy / untested in
> practice to be justified by the use case".

Not doing anything is not going to make the current situation "less
buggy". We have all the stuff we need to test this. We can even
artificially create the various scenarios on a model.

> I'll take a closer look at the checking logic.

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
@ 2019-01-22 14:51         ` Marc Zyngier
  0 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-01-22 14:51 UTC (permalink / raw)
  To: Dave Martin; +Cc: Andre Przywara, linux-arm-kernel, kvm, kvmarm

On Tue, 22 Jan 2019 13:56:34 +0000,
Dave Martin <Dave.Martin@arm.com> wrote:
> 
> On Tue, Jan 22, 2019 at 11:11:09AM +0000, Marc Zyngier wrote:
> > On Tue, 22 Jan 2019 10:17:00 +0000,
> > Dave Martin <Dave.Martin@arm.com> wrote:
> > > 
> > > On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > > > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > > > from the firmware, so KVM implements an interface to provide that for
> > > > guests. When such a guest is migrated, we want to make sure we don't
> > > > loose the protection the guest relies on.
> > > > 
> > > > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > > > interface, so userland can save the level of protection implemented by
> > > > the hypervisor and used by the guest. Upon restoring these registers,
> > > > we make sure we don't downgrade and reject any values that would mean
> > > > weaker protection.
> > > 
> > > Just trolling here, but could we treat these as immutable, like the ID
> > > registers?  
> > > 
> > > We don't support migration between nodes that are "too different" in any
> > > case, so I wonder if adding complex logic to compare vulnerabilities and
> > > workarounds is liable to create more problems than it solves...
> > 
> > And that's exactly the case we're trying to avoid. Two instances of
> > the same HW. One with firmware mitigations, one without. Migrating in
> > one direction is perfectly safe, migrating in the other isn't.
> > 
> > It is not about migrating to different HW at all.
> 
> So this is a realistic scenario when deploying a firmware update across
> a cluter that has homogeneous hardware -- there will temporarly be
> different firmware versions running on different nodes?

Case in point: I have on my desk two AMD Seattle systems. One with an
ancient firmware that doesn't mitigate anything, and one that has all
the mitigations applied (and correctly advertised). I can migrate
stuff back and forth, and that's really bad.

What people do in their data centre is none of my business,
really. What concerns me is that there is a potential for something
bad to happen without people noticing. And it is KVM's job to do the
right thing in this case.

> My concern is really "will the checking be too buggy / untested in
> practice to be justified by the use case".

Not doing anything is not going to make the current situation "less
buggy". We have all the stuff we need to test this. We can even
artificially create the various scenarios on a model.

> I'll take a closer look at the checking logic.

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-07 12:05   ` Andre Przywara
@ 2019-01-22 15:17     ` Dave Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-22 15:17 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Marc Zyngier, kvm, linux-arm-kernel, kvmarm

On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:
> KVM implements the firmware interface for mitigating cache speculation
> vulnerabilities. Guests may use this interface to ensure mitigation is
> active.
> If we want to migrate such a guest to a host with a different support
> level for those workarounds, migration might need to fail, to ensure that
> critical guests don't loose their protection.
> 
> Introduce a way for userland to save and restore the workarounds state.
> On restoring we do checks that make sure we don't downgrade our
> mitigation level.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm/include/asm/kvm_emulate.h   |  10 ++
>  arch/arm/include/uapi/asm/kvm.h      |   9 ++
>  arch/arm64/include/asm/kvm_emulate.h |  14 +++
>  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
>  virt/kvm/arm/psci.c                  | 138 ++++++++++++++++++++++++++-
>  5 files changed, 178 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
> index 77121b713bef..2255c50debab 100644
> --- a/arch/arm/include/asm/kvm_emulate.h
> +++ b/arch/arm/include/asm/kvm_emulate.h
> @@ -275,6 +275,16 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
>  	return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
>  }
>  
> +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> +{
> +	return false;
> +}
> +
> +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> +						      bool flag)
> +{
> +}
> +
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
>  {
>  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 4602464ebdfb..02c93b1d8f6d 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
>  
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 506386a3edde..a44f07f68da4 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -336,6 +336,20 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
>  	return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
>  }
>  
> +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
> +}
> +
> +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> +						      bool flag)
> +{
> +	if (flag)
> +		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
> +	else
> +		vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
> +}
> +
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
>  {
>  	if (vcpu_mode_is_32bit(vcpu)) {
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 97c3478ee6e7..4a19ef199a99 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -225,6 +225,15 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4

If this is the first exposure of this information to userspace, I wonder
if we can come up with some common semantics that avoid having to add
new ad-hoc code (and bugs) every time a new vulnerability/workaround is
defined.

We seem to have at least the two following independent properties
for a vulnerability, with the listed values for each:

 * vulnerability (Vulnerable, Unknown, Not Vulnerable)

 * mitigation support (Not Requestable, Requestable)

Migrations must not move to the left in _either_ list for any
vulnerability.

If we want to hedge out bets we could follow the style of the ID
registers and allocate to each theoretical vulnerability a pair of
signed 2- or (for more expansion room if we think we might need it)
4-bit fields.

We could perhaps allocate as follows:

 * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
 *  0=Mitigation not requestable, 1=Mitigation requestable


Checking code wouldn't need to know which fields describe mitigation
mechanisms and which describe vulnerabilities: we'd just do a strict >=
comparison on each.

Further, if a register is never written before the vcpu is first run,
we should imply a write of 0 to it as part of KVM_RUN (so that if the
destination node has a negative value anywhere, KVM_RUN barfs cleanly.


(Those semantics should apply equally to the CPU ID registers, though
we don't currently do that.)

Thoughts?

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-01-22 15:17     ` Dave Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-22 15:17 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Marc Zyngier, kvm, Christoffer Dall, linux-arm-kernel, kvmarm

On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:
> KVM implements the firmware interface for mitigating cache speculation
> vulnerabilities. Guests may use this interface to ensure mitigation is
> active.
> If we want to migrate such a guest to a host with a different support
> level for those workarounds, migration might need to fail, to ensure that
> critical guests don't loose their protection.
> 
> Introduce a way for userland to save and restore the workarounds state.
> On restoring we do checks that make sure we don't downgrade our
> mitigation level.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  arch/arm/include/asm/kvm_emulate.h   |  10 ++
>  arch/arm/include/uapi/asm/kvm.h      |   9 ++
>  arch/arm64/include/asm/kvm_emulate.h |  14 +++
>  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
>  virt/kvm/arm/psci.c                  | 138 ++++++++++++++++++++++++++-
>  5 files changed, 178 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
> index 77121b713bef..2255c50debab 100644
> --- a/arch/arm/include/asm/kvm_emulate.h
> +++ b/arch/arm/include/asm/kvm_emulate.h
> @@ -275,6 +275,16 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
>  	return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
>  }
>  
> +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> +{
> +	return false;
> +}
> +
> +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> +						      bool flag)
> +{
> +}
> +
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
>  {
>  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 4602464ebdfb..02c93b1d8f6d 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
>  
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 506386a3edde..a44f07f68da4 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -336,6 +336,20 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
>  	return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
>  }
>  
> +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
> +}
> +
> +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> +						      bool flag)
> +{
> +	if (flag)
> +		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
> +	else
> +		vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
> +}
> +
>  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
>  {
>  	if (vcpu_mode_is_32bit(vcpu)) {
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 97c3478ee6e7..4a19ef199a99 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -225,6 +225,15 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
>  					 KVM_REG_ARM_FW | ((r) & 0xffff))
>  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4

If this is the first exposure of this information to userspace, I wonder
if we can come up with some common semantics that avoid having to add
new ad-hoc code (and bugs) every time a new vulnerability/workaround is
defined.

We seem to have at least the two following independent properties
for a vulnerability, with the listed values for each:

 * vulnerability (Vulnerable, Unknown, Not Vulnerable)

 * mitigation support (Not Requestable, Requestable)

Migrations must not move to the left in _either_ list for any
vulnerability.

If we want to hedge out bets we could follow the style of the ID
registers and allocate to each theoretical vulnerability a pair of
signed 2- or (for more expansion room if we think we might need it)
4-bit fields.

We could perhaps allocate as follows:

 * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
 *  0=Mitigation not requestable, 1=Mitigation requestable


Checking code wouldn't need to know which fields describe mitigation
mechanisms and which describe vulnerabilities: we'd just do a strict >=
comparison on each.

Further, if a register is never written before the vcpu is first run,
we should imply a write of 0 to it as part of KVM_RUN (so that if the
destination node has a negative value anywhere, KVM_RUN barfs cleanly.


(Those semantics should apply equally to the CPU ID registers, though
we don't currently do that.)

Thoughts?

[...]

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
  2019-01-22 14:51         ` Marc Zyngier
@ 2019-01-22 15:28           ` Dave Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-22 15:28 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andre Przywara, kvm, linux-arm-kernel, kvmarm

On Tue, Jan 22, 2019 at 02:51:11PM +0000, Marc Zyngier wrote:
> On Tue, 22 Jan 2019 13:56:34 +0000,
> Dave Martin <Dave.Martin@arm.com> wrote:
> > 
> > On Tue, Jan 22, 2019 at 11:11:09AM +0000, Marc Zyngier wrote:
> > > On Tue, 22 Jan 2019 10:17:00 +0000,
> > > Dave Martin <Dave.Martin@arm.com> wrote:
> > > > 
> > > > On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > > > > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > > > > from the firmware, so KVM implements an interface to provide that for
> > > > > guests. When such a guest is migrated, we want to make sure we don't
> > > > > loose the protection the guest relies on.
> > > > > 
> > > > > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > > > > interface, so userland can save the level of protection implemented by
> > > > > the hypervisor and used by the guest. Upon restoring these registers,
> > > > > we make sure we don't downgrade and reject any values that would mean
> > > > > weaker protection.
> > > > 
> > > > Just trolling here, but could we treat these as immutable, like the ID
> > > > registers?  
> > > > 
> > > > We don't support migration between nodes that are "too different" in any
> > > > case, so I wonder if adding complex logic to compare vulnerabilities and
> > > > workarounds is liable to create more problems than it solves...
> > > 
> > > And that's exactly the case we're trying to avoid. Two instances of
> > > the same HW. One with firmware mitigations, one without. Migrating in
> > > one direction is perfectly safe, migrating in the other isn't.
> > > 
> > > It is not about migrating to different HW at all.
> > 
> > So this is a realistic scenario when deploying a firmware update across
> > a cluter that has homogeneous hardware -- there will temporarly be
> > different firmware versions running on different nodes?
> 
> Case in point: I have on my desk two AMD Seattle systems. One with an
> ancient firmware that doesn't mitigate anything, and one that has all
> the mitigations applied (and correctly advertised). I can migrate
> stuff back and forth, and that's really bad.

Agreed.

> What people do in their data centre is none of my business,
> really. What concerns me is that there is a potential for something
> bad to happen without people noticing. And it is KVM's job to do the
> right thing in this case.

Fair enough.

> > My concern is really "will the checking be too buggy / untested in
> > practice to be justified by the use case".
> 
> Not doing anything is not going to make the current situation "less
> buggy". We have all the stuff we need to test this. We can even
> artificially create the various scenarios on a model.

Agreed.  My concern is about how this will scale if future
vulnerabilities are added to the mix.  We might ultimately end up in a
worse mess, but I may be being paranoid.

> > I'll take a closer look at the checking logic.

See the other thread.  I have an idea there for exposing the information
in a different way that may simplfy things (or be totally misguided...)

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register
@ 2019-01-22 15:28           ` Dave Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-22 15:28 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andre Przywara, kvm, linux-arm-kernel, kvmarm

On Tue, Jan 22, 2019 at 02:51:11PM +0000, Marc Zyngier wrote:
> On Tue, 22 Jan 2019 13:56:34 +0000,
> Dave Martin <Dave.Martin@arm.com> wrote:
> > 
> > On Tue, Jan 22, 2019 at 11:11:09AM +0000, Marc Zyngier wrote:
> > > On Tue, 22 Jan 2019 10:17:00 +0000,
> > > Dave Martin <Dave.Martin@arm.com> wrote:
> > > > 
> > > > On Mon, Jan 07, 2019 at 12:05:35PM +0000, Andre Przywara wrote:
> > > > > Workarounds for Spectre variant 2 or 4 vulnerabilities require some help
> > > > > from the firmware, so KVM implements an interface to provide that for
> > > > > guests. When such a guest is migrated, we want to make sure we don't
> > > > > loose the protection the guest relies on.
> > > > > 
> > > > > This introduces two new firmware registers in KVM's GET/SET_ONE_REG
> > > > > interface, so userland can save the level of protection implemented by
> > > > > the hypervisor and used by the guest. Upon restoring these registers,
> > > > > we make sure we don't downgrade and reject any values that would mean
> > > > > weaker protection.
> > > > 
> > > > Just trolling here, but could we treat these as immutable, like the ID
> > > > registers?  
> > > > 
> > > > We don't support migration between nodes that are "too different" in any
> > > > case, so I wonder if adding complex logic to compare vulnerabilities and
> > > > workarounds is liable to create more problems than it solves...
> > > 
> > > And that's exactly the case we're trying to avoid. Two instances of
> > > the same HW. One with firmware mitigations, one without. Migrating in
> > > one direction is perfectly safe, migrating in the other isn't.
> > > 
> > > It is not about migrating to different HW at all.
> > 
> > So this is a realistic scenario when deploying a firmware update across
> > a cluter that has homogeneous hardware -- there will temporarly be
> > different firmware versions running on different nodes?
> 
> Case in point: I have on my desk two AMD Seattle systems. One with an
> ancient firmware that doesn't mitigate anything, and one that has all
> the mitigations applied (and correctly advertised). I can migrate
> stuff back and forth, and that's really bad.

Agreed.

> What people do in their data centre is none of my business,
> really. What concerns me is that there is a potential for something
> bad to happen without people noticing. And it is KVM's job to do the
> right thing in this case.

Fair enough.

> > My concern is really "will the checking be too buggy / untested in
> > practice to be justified by the use case".
> 
> Not doing anything is not going to make the current situation "less
> buggy". We have all the stuff we need to test this. We can even
> artificially create the various scenarios on a model.

Agreed.  My concern is about how this will scale if future
vulnerabilities are added to the mix.  We might ultimately end up in a
worse mess, but I may be being paranoid.

> > I'll take a closer look at the checking logic.

See the other thread.  I have an idea there for exposing the information
in a different way that may simplfy things (or be totally misguided...)

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-22 15:17     ` Dave Martin
@ 2019-01-25 14:46       ` Andre Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-25 14:46 UTC (permalink / raw)
  To: Dave Martin; +Cc: Marc Zyngier, kvm, linux-arm-kernel, kvmarm

On Tue, 22 Jan 2019 15:17:14 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

Hi Dave,

thanks for having a look!

> On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:
> > KVM implements the firmware interface for mitigating cache
> > speculation vulnerabilities. Guests may use this interface to
> > ensure mitigation is active.
> > If we want to migrate such a guest to a host with a different
> > support level for those workarounds, migration might need to fail,
> > to ensure that critical guests don't loose their protection.
> > 
> > Introduce a way for userland to save and restore the workarounds
> > state. On restoring we do checks that make sure we don't downgrade
> > our mitigation level.
> > 
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > ---
> >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> >  virt/kvm/arm/psci.c                  | 138
> > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+), 2
> > deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > b/arch/arm/include/asm/kvm_emulate.h index
> > 77121b713bef..2255c50debab 100644 ---
> > a/arch/arm/include/asm/kvm_emulate.h +++
> > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu) +{
> > +	return false;
> > +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > b/arch/arm/include/uapi/asm/kvm.h index 4602464ebdfb..02c93b1d8f6d
> > 100644 --- a/arch/arm/include/uapi/asm/kvm.h
> > +++ b/arch/arm/include/uapi/asm/kvm.h
> > @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
> >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM |
> > KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1
> > KVM_REG_ARM_FW_REG(1) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4 
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h
> > b/arch/arm64/include/asm/kvm_emulate.h index
> > 506386a3edde..a44f07f68da4 100644 ---
> > a/arch/arm64/include/asm/kvm_emulate.h +++
> > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@ static
> > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu) +{
> > +	return vcpu->arch.workaround_flags &
> > VCPU_WORKAROUND_2_FLAG; +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +	if (flag)
> > +		vcpu->arch.workaround_flags |=
> > VCPU_WORKAROUND_2_FLAG;
> > +	else
> > +		vcpu->arch.workaround_flags &=
> > ~VCPU_WORKAROUND_2_FLAG; +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	if (vcpu_mode_is_32bit(vcpu)) {
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > b/arch/arm64/include/uapi/asm/kvm.h index
> > 97c3478ee6e7..4a19ef199a99 100644 ---
> > a/arch/arm64/include/uapi/asm/kvm.h +++
> > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > KVM_REG_ARM_FW_REG(0) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > KVM_REG_ARM_FW_REG(2) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4  
> 
> If this is the first exposure of this information to userspace, I
> wonder if we can come up with some common semantics that avoid having
> to add new ad-hoc code (and bugs) every time a new
> vulnerability/workaround is defined.
> 
> We seem to have at least the two following independent properties
> for a vulnerability, with the listed values for each:
> 
>  * vulnerability (Vulnerable, Unknown, Not Vulnerable)
> 
>  * mitigation support (Not Requestable, Requestable)
> 
> Migrations must not move to the left in _either_ list for any
> vulnerability.
> 
> If we want to hedge out bets we could follow the style of the ID
> registers and allocate to each theoretical vulnerability a pair of
> signed 2- or (for more expansion room if we think we might need it)
> 4-bit fields.
> 
> We could perhaps allocate as follows:
> 
>  * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
>  *  0=Mitigation not requestable, 1=Mitigation requestable

So as discussed in person, that sounds quite neat. I implemented
that, but the sign extension and masking to n bits is not very pretty
and limits readability.
However the property of having a kind of "vulnerability scale", where a
simple comparison would determine compatibility, is a good thing to
have and drastically simplifies the checking code.

> Checking code wouldn't need to know which fields describe mitigation
> mechanisms and which describe vulnerabilities: we'd just do a strict
> >= comparison on each.
> 
> Further, if a register is never written before the vcpu is first run,
> we should imply a write of 0 to it as part of KVM_RUN (so that if the
> destination node has a negative value anywhere, KVM_RUN barfs cleanly.

What I like about the signedness is this "0 means unknown", which is
magically forwards compatible. However I am not sure we can transfer
this semantic into every upcoming register that pops up in the future.
Actually we might not need this:
My understanding of how QEMU handles this in migration is that it reads
the f/w reg on the originating host A and writes this into the target
host B, without itself interpreting this in any way. It's up to the
target kernel (basically this code here) to check compatibility. So I am
not sure we actually need a stable scheme. If host A doesn't know about
a certain register, it won't appear in the result of the
KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at all. In
the opposite case the receiving host would reject an unknown register,
which I believe is safer, although I see that it leaves the "unknown"
case on the table.

It would be good to have some opinion of how forward looking we want to
(and can) be here.

Meanwhile I am sending a v2 which implements the linear scale idea,
without using signed values, as this indeed simplifies the code.
I have the signed version still in a branch here, let me know if you
want to have a look.

Cheers,
Andre.

> (Those semantics should apply equally to the CPU ID registers, though
> we don't currently do that.)
> 
> Thoughts?
> 
> [...]
> 
> Cheers
> ---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-01-25 14:46       ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-25 14:46 UTC (permalink / raw)
  To: Dave Martin; +Cc: Marc Zyngier, kvm, Christoffer Dall, linux-arm-kernel, kvmarm

On Tue, 22 Jan 2019 15:17:14 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

Hi Dave,

thanks for having a look!

> On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:
> > KVM implements the firmware interface for mitigating cache
> > speculation vulnerabilities. Guests may use this interface to
> > ensure mitigation is active.
> > If we want to migrate such a guest to a host with a different
> > support level for those workarounds, migration might need to fail,
> > to ensure that critical guests don't loose their protection.
> > 
> > Introduce a way for userland to save and restore the workarounds
> > state. On restoring we do checks that make sure we don't downgrade
> > our mitigation level.
> > 
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > ---
> >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> >  virt/kvm/arm/psci.c                  | 138
> > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+), 2
> > deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > b/arch/arm/include/asm/kvm_emulate.h index
> > 77121b713bef..2255c50debab 100644 ---
> > a/arch/arm/include/asm/kvm_emulate.h +++
> > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu) +{
> > +	return false;
> > +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > b/arch/arm/include/uapi/asm/kvm.h index 4602464ebdfb..02c93b1d8f6d
> > 100644 --- a/arch/arm/include/uapi/asm/kvm.h
> > +++ b/arch/arm/include/uapi/asm/kvm.h
> > @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
> >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM |
> > KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1
> > KVM_REG_ARM_FW_REG(1) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4 
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h
> > b/arch/arm64/include/asm/kvm_emulate.h index
> > 506386a3edde..a44f07f68da4 100644 ---
> > a/arch/arm64/include/asm/kvm_emulate.h +++
> > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@ static
> > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu) +{
> > +	return vcpu->arch.workaround_flags &
> > VCPU_WORKAROUND_2_FLAG; +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +	if (flag)
> > +		vcpu->arch.workaround_flags |=
> > VCPU_WORKAROUND_2_FLAG;
> > +	else
> > +		vcpu->arch.workaround_flags &=
> > ~VCPU_WORKAROUND_2_FLAG; +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	if (vcpu_mode_is_32bit(vcpu)) {
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > b/arch/arm64/include/uapi/asm/kvm.h index
> > 97c3478ee6e7..4a19ef199a99 100644 ---
> > a/arch/arm64/include/uapi/asm/kvm.h +++
> > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > KVM_REG_ARM_FW_REG(0) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > KVM_REG_ARM_FW_REG(2) +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4  
> 
> If this is the first exposure of this information to userspace, I
> wonder if we can come up with some common semantics that avoid having
> to add new ad-hoc code (and bugs) every time a new
> vulnerability/workaround is defined.
> 
> We seem to have at least the two following independent properties
> for a vulnerability, with the listed values for each:
> 
>  * vulnerability (Vulnerable, Unknown, Not Vulnerable)
> 
>  * mitigation support (Not Requestable, Requestable)
> 
> Migrations must not move to the left in _either_ list for any
> vulnerability.
> 
> If we want to hedge out bets we could follow the style of the ID
> registers and allocate to each theoretical vulnerability a pair of
> signed 2- or (for more expansion room if we think we might need it)
> 4-bit fields.
> 
> We could perhaps allocate as follows:
> 
>  * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
>  *  0=Mitigation not requestable, 1=Mitigation requestable

So as discussed in person, that sounds quite neat. I implemented
that, but the sign extension and masking to n bits is not very pretty
and limits readability.
However the property of having a kind of "vulnerability scale", where a
simple comparison would determine compatibility, is a good thing to
have and drastically simplifies the checking code.

> Checking code wouldn't need to know which fields describe mitigation
> mechanisms and which describe vulnerabilities: we'd just do a strict
> >= comparison on each.
> 
> Further, if a register is never written before the vcpu is first run,
> we should imply a write of 0 to it as part of KVM_RUN (so that if the
> destination node has a negative value anywhere, KVM_RUN barfs cleanly.

What I like about the signedness is this "0 means unknown", which is
magically forwards compatible. However I am not sure we can transfer
this semantic into every upcoming register that pops up in the future.
Actually we might not need this:
My understanding of how QEMU handles this in migration is that it reads
the f/w reg on the originating host A and writes this into the target
host B, without itself interpreting this in any way. It's up to the
target kernel (basically this code here) to check compatibility. So I am
not sure we actually need a stable scheme. If host A doesn't know about
a certain register, it won't appear in the result of the
KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at all. In
the opposite case the receiving host would reject an unknown register,
which I believe is safer, although I see that it leaves the "unknown"
case on the table.

It would be good to have some opinion of how forward looking we want to
(and can) be here.

Meanwhile I am sending a v2 which implements the linear scale idea,
without using signed values, as this indeed simplifies the code.
I have the signed version still in a branch here, let me know if you
want to have a look.

Cheers,
Andre.

> (Those semantics should apply equally to the CPU ID registers, though
> we don't currently do that.)
> 
> Thoughts?
> 
> [...]
> 
> Cheers
> ---Dave


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-25 14:46       ` Andre Przywara
@ 2019-01-29 21:32         ` Dave Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-29 21:32 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Marc Zyngier, linux-arm-kernel, kvm, kvmarm

On Fri, Jan 25, 2019 at 02:46:57PM +0000, Andre Przywara wrote:
> On Tue, 22 Jan 2019 15:17:14 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:
> 
> Hi Dave,
> 
> thanks for having a look!
> 
> > On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:
> > > KVM implements the firmware interface for mitigating cache
> > > speculation vulnerabilities. Guests may use this interface to
> > > ensure mitigation is active.
> > > If we want to migrate such a guest to a host with a different
> > > support level for those workarounds, migration might need to fail,
> > > to ensure that critical guests don't loose their protection.
> > > 
> > > Introduce a way for userland to save and restore the workarounds
> > > state. On restoring we do checks that make sure we don't downgrade
> > > our mitigation level.
> > > 
> > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > ---
> > >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> > >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> > >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> > >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> > >  virt/kvm/arm/psci.c                  | 138
> > > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+), 2
> > > deletions(-)
> > > 
> > > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > > b/arch/arm/include/asm/kvm_emulate.h index
> > > 77121b713bef..2255c50debab 100644 ---
> > > a/arch/arm/include/asm/kvm_emulate.h +++
> > > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > > return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> > >  
> > > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > > kvm_vcpu *vcpu) +{
> > > +	return false;
> > > +}
> > > +
> > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > kvm_vcpu *vcpu,
> > > +						      bool flag)
> > > +{
> > > +}
> > > +
> > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > >  {
> > >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > > b/arch/arm/include/uapi/asm/kvm.h index 4602464ebdfb..02c93b1d8f6d
> > > 100644 --- a/arch/arm/include/uapi/asm/kvm.h
> > > +++ b/arch/arm/include/uapi/asm/kvm.h
> > > @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
> > >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM |
> > > KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) & 0xffff))
> > >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1
> > > KVM_REG_ARM_FW_REG(1) +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4 
> > >  /* Device Control API: ARM VGIC */
> > >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > > diff --git a/arch/arm64/include/asm/kvm_emulate.h
> > > b/arch/arm64/include/asm/kvm_emulate.h index
> > > 506386a3edde..a44f07f68da4 100644 ---
> > > a/arch/arm64/include/asm/kvm_emulate.h +++
> > > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@ static
> > > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > > return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; }
> > >  
> > > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > > kvm_vcpu *vcpu) +{
> > > +	return vcpu->arch.workaround_flags &
> > > VCPU_WORKAROUND_2_FLAG; +}
> > > +
> > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > kvm_vcpu *vcpu,
> > > +						      bool flag)
> > > +{
> > > +	if (flag)
> > > +		vcpu->arch.workaround_flags |=
> > > VCPU_WORKAROUND_2_FLAG;
> > > +	else
> > > +		vcpu->arch.workaround_flags &=
> > > ~VCPU_WORKAROUND_2_FLAG; +}
> > > +
> > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > >  {
> > >  	if (vcpu_mode_is_32bit(vcpu)) {
> > > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > > b/arch/arm64/include/uapi/asm/kvm.h index
> > > 97c3478ee6e7..4a19ef199a99 100644 ---
> > > a/arch/arm64/include/uapi/asm/kvm.h +++
> > > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > KVM_REG_ARM_FW_REG(0) +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > KVM_REG_ARM_FW_REG(2) +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4  
> > 
> > If this is the first exposure of this information to userspace, I
> > wonder if we can come up with some common semantics that avoid having
> > to add new ad-hoc code (and bugs) every time a new
> > vulnerability/workaround is defined.
> > 
> > We seem to have at least the two following independent properties
> > for a vulnerability, with the listed values for each:
> > 
> >  * vulnerability (Vulnerable, Unknown, Not Vulnerable)
> > 
> >  * mitigation support (Not Requestable, Requestable)
> > 
> > Migrations must not move to the left in _either_ list for any
> > vulnerability.
> > 
> > If we want to hedge out bets we could follow the style of the ID
> > registers and allocate to each theoretical vulnerability a pair of
> > signed 2- or (for more expansion room if we think we might need it)
> > 4-bit fields.
> > 
> > We could perhaps allocate as follows:
> > 
> >  * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
> >  *  0=Mitigation not requestable, 1=Mitigation requestable
> 
> So as discussed in person, that sounds quite neat. I implemented
> that, but the sign extension and masking to n bits is not very pretty
> and limits readability.
> However the property of having a kind of "vulnerability scale", where a
> simple comparison would determine compatibility, is a good thing to
> have and drastically simplifies the checking code.
> 
> > Checking code wouldn't need to know which fields describe mitigation
> > mechanisms and which describe vulnerabilities: we'd just do a strict
> > >= comparison on each.
> > 
> > Further, if a register is never written before the vcpu is first run,
> > we should imply a write of 0 to it as part of KVM_RUN (so that if the
> > destination node has a negative value anywhere, KVM_RUN barfs cleanly.
> 
> What I like about the signedness is this "0 means unknown", which is
> magically forwards compatible. However I am not sure we can transfer
> this semantic into every upcoming register that pops up in the future.

I appreciate the concern, but can you give an example of how it might
break?

My idea is that you can check for compatibility by comparing fields
without any need to know what they mean, but we wouldn't pre-assign
meanings for the values of unallocated fields, just create a precedent
that future fields can follow (where it works).

This is much like the CPU ID features scheme itself.  A "0" might
mean that something is absent, but there's no way (or need) to know
what.

> Actually we might not need this:
> My understanding of how QEMU handles this in migration is that it reads
> the f/w reg on the originating host A and writes this into the target
> host B, without itself interpreting this in any way. It's up to the
> target kernel (basically this code here) to check compatibility. So I am
> not sure we actually need a stable scheme. If host A doesn't know about

Nothing stops userspace from interpreting the data, so there's a risk
people may grow to rely on it even if we don't want them to.

So we should try to have something that's forward-compatible if at all
possible...

> a certain register, it won't appear in the result of the
> KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at all. In
> the opposite case the receiving host would reject an unknown register,
> which I believe is safer, although I see that it leaves the "unknown"
> case on the table.
> 
> It would be good to have some opinion of how forward looking we want to
> (and can) be here.
> 
> Meanwhile I am sending a v2 which implements the linear scale idea,
> without using signed values, as this indeed simplifies the code.
> I have the signed version still in a branch here, let me know if you
> want to have a look.

Happy to take a look at it.

I was hoping that cpufeatures already had a helper for extracting a
signed field, but I didn't go looking for it...

At the asm level this is just a sbfx, so it's hardly expensive.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-01-29 21:32         ` Dave Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-29 21:32 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Marc Zyngier, linux-arm-kernel, kvm, kvmarm

On Fri, Jan 25, 2019 at 02:46:57PM +0000, Andre Przywara wrote:
> On Tue, 22 Jan 2019 15:17:14 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:
> 
> Hi Dave,
> 
> thanks for having a look!
> 
> > On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:
> > > KVM implements the firmware interface for mitigating cache
> > > speculation vulnerabilities. Guests may use this interface to
> > > ensure mitigation is active.
> > > If we want to migrate such a guest to a host with a different
> > > support level for those workarounds, migration might need to fail,
> > > to ensure that critical guests don't loose their protection.
> > > 
> > > Introduce a way for userland to save and restore the workarounds
> > > state. On restoring we do checks that make sure we don't downgrade
> > > our mitigation level.
> > > 
> > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > ---
> > >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> > >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> > >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> > >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> > >  virt/kvm/arm/psci.c                  | 138
> > > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+), 2
> > > deletions(-)
> > > 
> > > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > > b/arch/arm/include/asm/kvm_emulate.h index
> > > 77121b713bef..2255c50debab 100644 ---
> > > a/arch/arm/include/asm/kvm_emulate.h +++
> > > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > > return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> > >  
> > > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > > kvm_vcpu *vcpu) +{
> > > +	return false;
> > > +}
> > > +
> > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > kvm_vcpu *vcpu,
> > > +						      bool flag)
> > > +{
> > > +}
> > > +
> > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > >  {
> > >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > > b/arch/arm/include/uapi/asm/kvm.h index 4602464ebdfb..02c93b1d8f6d
> > > 100644 --- a/arch/arm/include/uapi/asm/kvm.h
> > > +++ b/arch/arm/include/uapi/asm/kvm.h
> > > @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
> > >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM |
> > > KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) & 0xffff))
> > >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1
> > > KVM_REG_ARM_FW_REG(1) +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4 
> > >  /* Device Control API: ARM VGIC */
> > >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > > diff --git a/arch/arm64/include/asm/kvm_emulate.h
> > > b/arch/arm64/include/asm/kvm_emulate.h index
> > > 506386a3edde..a44f07f68da4 100644 ---
> > > a/arch/arm64/include/asm/kvm_emulate.h +++
> > > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@ static
> > > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> > > return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK; }
> > >  
> > > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > > kvm_vcpu *vcpu) +{
> > > +	return vcpu->arch.workaround_flags &
> > > VCPU_WORKAROUND_2_FLAG; +}
> > > +
> > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > kvm_vcpu *vcpu,
> > > +						      bool flag)
> > > +{
> > > +	if (flag)
> > > +		vcpu->arch.workaround_flags |=
> > > VCPU_WORKAROUND_2_FLAG;
> > > +	else
> > > +		vcpu->arch.workaround_flags &=
> > > ~VCPU_WORKAROUND_2_FLAG; +}
> > > +
> > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > >  {
> > >  	if (vcpu_mode_is_32bit(vcpu)) {
> > > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > > b/arch/arm64/include/uapi/asm/kvm.h index
> > > 97c3478ee6e7..4a19ef199a99 100644 ---
> > > a/arch/arm64/include/uapi/asm/kvm.h +++
> > > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > KVM_REG_ARM_FW_REG(0) +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > KVM_REG_ARM_FW_REG(2) +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4  
> > 
> > If this is the first exposure of this information to userspace, I
> > wonder if we can come up with some common semantics that avoid having
> > to add new ad-hoc code (and bugs) every time a new
> > vulnerability/workaround is defined.
> > 
> > We seem to have at least the two following independent properties
> > for a vulnerability, with the listed values for each:
> > 
> >  * vulnerability (Vulnerable, Unknown, Not Vulnerable)
> > 
> >  * mitigation support (Not Requestable, Requestable)
> > 
> > Migrations must not move to the left in _either_ list for any
> > vulnerability.
> > 
> > If we want to hedge out bets we could follow the style of the ID
> > registers and allocate to each theoretical vulnerability a pair of
> > signed 2- or (for more expansion room if we think we might need it)
> > 4-bit fields.
> > 
> > We could perhaps allocate as follows:
> > 
> >  * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
> >  *  0=Mitigation not requestable, 1=Mitigation requestable
> 
> So as discussed in person, that sounds quite neat. I implemented
> that, but the sign extension and masking to n bits is not very pretty
> and limits readability.
> However the property of having a kind of "vulnerability scale", where a
> simple comparison would determine compatibility, is a good thing to
> have and drastically simplifies the checking code.
> 
> > Checking code wouldn't need to know which fields describe mitigation
> > mechanisms and which describe vulnerabilities: we'd just do a strict
> > >= comparison on each.
> > 
> > Further, if a register is never written before the vcpu is first run,
> > we should imply a write of 0 to it as part of KVM_RUN (so that if the
> > destination node has a negative value anywhere, KVM_RUN barfs cleanly.
> 
> What I like about the signedness is this "0 means unknown", which is
> magically forwards compatible. However I am not sure we can transfer
> this semantic into every upcoming register that pops up in the future.

I appreciate the concern, but can you give an example of how it might
break?

My idea is that you can check for compatibility by comparing fields
without any need to know what they mean, but we wouldn't pre-assign
meanings for the values of unallocated fields, just create a precedent
that future fields can follow (where it works).

This is much like the CPU ID features scheme itself.  A "0" might
mean that something is absent, but there's no way (or need) to know
what.

> Actually we might not need this:
> My understanding of how QEMU handles this in migration is that it reads
> the f/w reg on the originating host A and writes this into the target
> host B, without itself interpreting this in any way. It's up to the
> target kernel (basically this code here) to check compatibility. So I am
> not sure we actually need a stable scheme. If host A doesn't know about

Nothing stops userspace from interpreting the data, so there's a risk
people may grow to rely on it even if we don't want them to.

So we should try to have something that's forward-compatible if at all
possible...

> a certain register, it won't appear in the result of the
> KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at all. In
> the opposite case the receiving host would reject an unknown register,
> which I believe is safer, although I see that it leaves the "unknown"
> case on the table.
> 
> It would be good to have some opinion of how forward looking we want to
> (and can) be here.
> 
> Meanwhile I am sending a v2 which implements the linear scale idea,
> without using signed values, as this indeed simplifies the code.
> I have the signed version still in a branch here, let me know if you
> want to have a look.

Happy to take a look at it.

I was hoping that cpufeatures already had a helper for extracting a
signed field, but I didn't go looking for it...

At the asm level this is just a sbfx, so it's hardly expensive.

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-29 21:32         ` Dave Martin
@ 2019-01-30 11:39           ` Andre Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-30 11:39 UTC (permalink / raw)
  To: Dave Martin, Peter Maydell; +Cc: Marc Zyngier, linux-arm-kernel, kvm, kvmarm

On Tue, 29 Jan 2019 21:32:23 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

Hi Dave,

> On Fri, Jan 25, 2019 at 02:46:57PM +0000, Andre Przywara wrote:
> > On Tue, 22 Jan 2019 15:17:14 +0000
> > Dave Martin <Dave.Martin@arm.com> wrote:
> > 
> > Hi Dave,
> > 
> > thanks for having a look!
> >   
> > > On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:  
> > > > KVM implements the firmware interface for mitigating cache
> > > > speculation vulnerabilities. Guests may use this interface to
> > > > ensure mitigation is active.
> > > > If we want to migrate such a guest to a host with a different
> > > > support level for those workarounds, migration might need to
> > > > fail, to ensure that critical guests don't loose their
> > > > protection.
> > > > 
> > > > Introduce a way for userland to save and restore the workarounds
> > > > state. On restoring we do checks that make sure we don't
> > > > downgrade our mitigation level.
> > > > 
> > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > > ---
> > > >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> > > >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> > > >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> > > >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> > > >  virt/kvm/arm/psci.c                  | 138
> > > > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+),
> > > > 2 deletions(-)
> > > > 
> > > > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > > > b/arch/arm/include/asm/kvm_emulate.h index
> > > > 77121b713bef..2255c50debab 100644 ---
> > > > a/arch/arm/include/asm/kvm_emulate.h +++
> > > > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > > > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu
> > > > *vcpu) return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> > > >  
> > > > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > > > kvm_vcpu *vcpu) +{
> > > > +	return false;
> > > > +}
> > > > +
> > > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > > kvm_vcpu *vcpu,
> > > > +						      bool
> > > > flag) +{
> > > > +}
> > > > +
> > > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > > >  {
> > > >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > > > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > > > b/arch/arm/include/uapi/asm/kvm.h index
> > > > 4602464ebdfb..02c93b1d8f6d 100644 ---
> > > > a/arch/arm/include/uapi/asm/kvm.h +++
> > > > b/arch/arm/include/uapi/asm/kvm.h @@ -214,6 +214,15 @@ struct
> > > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > > (KVM_REG_ARM | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > > KVM_REG_ARM_FW_REG(0) +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1 KVM_REG_ARM_FW_REG(1)
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > > KVM_REG_ARM_FW_REG(2) +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED
> > > > 4 /* Device Control API: ARM VGIC */ #define
> > > > KVM_DEV_ARM_VGIC_GRP_ADDR	0 diff --git
> > > > a/arch/arm64/include/asm/kvm_emulate.h
> > > > b/arch/arm64/include/asm/kvm_emulate.h index
> > > > 506386a3edde..a44f07f68da4 100644 ---
> > > > a/arch/arm64/include/asm/kvm_emulate.h +++
> > > > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@
> > > > static inline unsigned long kvm_vcpu_get_mpidr_aff(struct
> > > > kvm_vcpu *vcpu) return vcpu_read_sys_reg(vcpu, MPIDR_EL1) &
> > > > MPIDR_HWID_BITMASK; } +static inline bool
> > > > kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu) +{
> > > > +	return vcpu->arch.workaround_flags &
> > > > VCPU_WORKAROUND_2_FLAG; +}
> > > > +
> > > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > > kvm_vcpu *vcpu,
> > > > +						      bool
> > > > flag) +{
> > > > +	if (flag)
> > > > +		vcpu->arch.workaround_flags |=
> > > > VCPU_WORKAROUND_2_FLAG;
> > > > +	else
> > > > +		vcpu->arch.workaround_flags &=
> > > > ~VCPU_WORKAROUND_2_FLAG; +}
> > > > +
> > > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > > >  {
> > > >  	if (vcpu_mode_is_32bit(vcpu)) {
> > > > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > > > b/arch/arm64/include/uapi/asm/kvm.h index
> > > > 97c3478ee6e7..4a19ef199a99 100644 ---
> > > > a/arch/arm64/include/uapi/asm/kvm.h +++
> > > > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > > KVM_REG_ARM_FW_REG(0) +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > > KVM_REG_ARM_FW_REG(2) +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4    
> > > 
> > > If this is the first exposure of this information to userspace, I
> > > wonder if we can come up with some common semantics that avoid
> > > having to add new ad-hoc code (and bugs) every time a new
> > > vulnerability/workaround is defined.
> > > 
> > > We seem to have at least the two following independent properties
> > > for a vulnerability, with the listed values for each:
> > > 
> > >  * vulnerability (Vulnerable, Unknown, Not Vulnerable)
> > > 
> > >  * mitigation support (Not Requestable, Requestable)
> > > 
> > > Migrations must not move to the left in _either_ list for any
> > > vulnerability.
> > > 
> > > If we want to hedge out bets we could follow the style of the ID
> > > registers and allocate to each theoretical vulnerability a pair of
> > > signed 2- or (for more expansion room if we think we might need
> > > it) 4-bit fields.
> > > 
> > > We could perhaps allocate as follows:
> > > 
> > >  * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
> > >  *  0=Mitigation not requestable, 1=Mitigation requestable  
> > 
> > So as discussed in person, that sounds quite neat. I implemented
> > that, but the sign extension and masking to n bits is not very
> > pretty and limits readability.
> > However the property of having a kind of "vulnerability scale",
> > where a simple comparison would determine compatibility, is a good
> > thing to have and drastically simplifies the checking code.
> >   
> > > Checking code wouldn't need to know which fields describe
> > > mitigation mechanisms and which describe vulnerabilities: we'd
> > > just do a strict  
> > > >= comparison on each.  
> > > 
> > > Further, if a register is never written before the vcpu is first
> > > run, we should imply a write of 0 to it as part of KVM_RUN (so
> > > that if the destination node has a negative value anywhere,
> > > KVM_RUN barfs cleanly.  
> > 
> > What I like about the signedness is this "0 means unknown", which is
> > magically forwards compatible. However I am not sure we can transfer
> > this semantic into every upcoming register that pops up in the
> > future.  
> 
> I appreciate the concern, but can you give an example of how it might
> break?

The general problem is that we don't know how future firmware registers
would need to look like and whether they are actually for workarounds.
Take for instance KVM_REG_ARM_FW_REG(0), which holds the PSCI version.
So at the very least we would need to reserve a region of the 64K
firmware registers to use this scheme, yet don't know how many we would
need.
 
> My idea is that you can check for compatibility by comparing fields
> without any need to know what they mean, but we wouldn't pre-assign
> meanings for the values of unallocated fields, just create a precedent
> that future fields can follow (where it works).

For clarity, what do you mean with "... you can check ...", exactly? I
think this "you" would be the receiving kernel, which is very strict
about unknown registers (-EINVAL), because we don't take any chances.
>From what I understand how QEMU works, is that it just takes the list
of registers from the originating kernel and asks the receiving kernel
about them. It doesn't try to interpret most registers in any way.

Now QEMU *could* ignore the -EINVAL return and proceed anyway, if it
would be very sure about the implications or the admin told it so.
But I believe this should be done on a per register basis, and in QEMU,
relying on some forward looking scheme sounds a bit fragile to me.
It is my understanding that QEMU does not want to gamble with migration.

> This is much like the CPU ID features scheme itself.  A "0" might
> mean that something is absent, but there's no way (or need) to know
> what.

So I think we don't disagree about that this is possible or even would
be nice, but it's just not how it's used today. I am not sure we want
to introduce something like this, given that we don't know if there will
be any future workaround registers at all. Sounds a bit over-engineered
and fragile to me.

Peter, can you give your opinion about whether having some generic class
of firmware workaround registers which could be checked in a generic way
is something we want?

> > Actually we might not need this:
> > My understanding of how QEMU handles this in migration is that it
> > reads the f/w reg on the originating host A and writes this into
> > the target host B, without itself interpreting this in any way.
> > It's up to the target kernel (basically this code here) to check
> > compatibility. So I am not sure we actually need a stable scheme.
> > If host A doesn't know about  
> 
> Nothing stops userspace from interpreting the data, so there's a risk
> people may grow to rely on it even if we don't want them to.

Well, but userland would not interpret unknown registers, under the
current scheme, would it?
So it can surely tinker with KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
because it knows about its meaning. But I would be very careful judging
about anything else.
The moment we introduce some scheme, we would have to stick with it
forever. I am just not sure that's worth it. At the end of the day you
could always update QEMU to ignore an -EINVAL on a new firmware w/a
register.
 
> So we should try to have something that's forward-compatible if at all
> possible...
> > a certain register, it won't appear in the result of the
> > KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at
> > all. In the opposite case the receiving host would reject an
> > unknown register, which I believe is safer, although I see that it
> > leaves the "unknown" case on the table.
> > 
> > It would be good to have some opinion of how forward looking we
> > want to (and can) be here.
> > 
> > Meanwhile I am sending a v2 which implements the linear scale idea,
> > without using signed values, as this indeed simplifies the code.
> > I have the signed version still in a branch here, let me know if you
> > want to have a look.  
> 
> Happy to take a look at it.

See below.

> I was hoping that cpufeatures already had a helper for extracting a
> signed field, but I didn't go looking for it...
> 
> At the asm level this is just a sbfx, so it's hardly expensive.

The length of the code or the "performance" is hardly an issue (we are
talking about migration here, which is mostly limited by the speed of
the network). And yes, we have sign_extend32() and (i & 0xf) to
convert, it just looks a bit odd in the code and in the API
documentation.

Cheers,
Andre

diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 6c6757c9571b..a7b10d835ce7 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -218,10 +218,10 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	(1U << 4)
 
 /* Device Control API: ARM VGIC */
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 367e96fe654e..7d03f8339100 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -229,10 +229,10 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED     (1U << 4)
 
 /* Device Control API: ARM VGIC */
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index fb6af5ca259e..cfb1519b9a11 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -498,7 +498,8 @@ static int get_kernel_wa_level(u64 regid)
 	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
 		switch (fake_kvm_arm_have_ssbd()) {
 		case KVM_SSBD_FORCE_DISABLE:
-			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
+			return sign_extend32(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL,
+					     KVM_REG_FEATURE_LEVEL_WIDTH - 1);
 		case KVM_SSBD_KERNEL:
 			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
 		case KVM_SSBD_FORCE_ENABLE:
@@ -574,7 +575,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	}
 
 	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
-		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
+		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
 
 		if (get_kernel_wa_level(reg->id) < wa_level)
 			return -EINVAL;
@@ -582,7 +583,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return 0;
 
 	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
-		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
+		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
 
 		if (get_kernel_wa_level(reg->id) < wa_level)
 			return -EINVAL;

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-01-30 11:39           ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-01-30 11:39 UTC (permalink / raw)
  To: Dave Martin, Peter Maydell; +Cc: Marc Zyngier, linux-arm-kernel, kvm, kvmarm

On Tue, 29 Jan 2019 21:32:23 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

Hi Dave,

> On Fri, Jan 25, 2019 at 02:46:57PM +0000, Andre Przywara wrote:
> > On Tue, 22 Jan 2019 15:17:14 +0000
> > Dave Martin <Dave.Martin@arm.com> wrote:
> > 
> > Hi Dave,
> > 
> > thanks for having a look!
> >   
> > > On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:  
> > > > KVM implements the firmware interface for mitigating cache
> > > > speculation vulnerabilities. Guests may use this interface to
> > > > ensure mitigation is active.
> > > > If we want to migrate such a guest to a host with a different
> > > > support level for those workarounds, migration might need to
> > > > fail, to ensure that critical guests don't loose their
> > > > protection.
> > > > 
> > > > Introduce a way for userland to save and restore the workarounds
> > > > state. On restoring we do checks that make sure we don't
> > > > downgrade our mitigation level.
> > > > 
> > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > > ---
> > > >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> > > >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> > > >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> > > >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> > > >  virt/kvm/arm/psci.c                  | 138
> > > > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+),
> > > > 2 deletions(-)
> > > > 
> > > > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > > > b/arch/arm/include/asm/kvm_emulate.h index
> > > > 77121b713bef..2255c50debab 100644 ---
> > > > a/arch/arm/include/asm/kvm_emulate.h +++
> > > > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > > > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu
> > > > *vcpu) return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> > > >  
> > > > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > > > kvm_vcpu *vcpu) +{
> > > > +	return false;
> > > > +}
> > > > +
> > > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > > kvm_vcpu *vcpu,
> > > > +						      bool
> > > > flag) +{
> > > > +}
> > > > +
> > > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > > >  {
> > > >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > > > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > > > b/arch/arm/include/uapi/asm/kvm.h index
> > > > 4602464ebdfb..02c93b1d8f6d 100644 ---
> > > > a/arch/arm/include/uapi/asm/kvm.h +++
> > > > b/arch/arm/include/uapi/asm/kvm.h @@ -214,6 +214,15 @@ struct
> > > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > > (KVM_REG_ARM | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > > KVM_REG_ARM_FW_REG(0) +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1 KVM_REG_ARM_FW_REG(1)
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > > KVM_REG_ARM_FW_REG(2) +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED
> > > > 4 /* Device Control API: ARM VGIC */ #define
> > > > KVM_DEV_ARM_VGIC_GRP_ADDR	0 diff --git
> > > > a/arch/arm64/include/asm/kvm_emulate.h
> > > > b/arch/arm64/include/asm/kvm_emulate.h index
> > > > 506386a3edde..a44f07f68da4 100644 ---
> > > > a/arch/arm64/include/asm/kvm_emulate.h +++
> > > > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@
> > > > static inline unsigned long kvm_vcpu_get_mpidr_aff(struct
> > > > kvm_vcpu *vcpu) return vcpu_read_sys_reg(vcpu, MPIDR_EL1) &
> > > > MPIDR_HWID_BITMASK; } +static inline bool
> > > > kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu) +{
> > > > +	return vcpu->arch.workaround_flags &
> > > > VCPU_WORKAROUND_2_FLAG; +}
> > > > +
> > > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > > kvm_vcpu *vcpu,
> > > > +						      bool
> > > > flag) +{
> > > > +	if (flag)
> > > > +		vcpu->arch.workaround_flags |=
> > > > VCPU_WORKAROUND_2_FLAG;
> > > > +	else
> > > > +		vcpu->arch.workaround_flags &=
> > > > ~VCPU_WORKAROUND_2_FLAG; +}
> > > > +
> > > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > > >  {
> > > >  	if (vcpu_mode_is_32bit(vcpu)) {
> > > > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > > > b/arch/arm64/include/uapi/asm/kvm.h index
> > > > 97c3478ee6e7..4a19ef199a99 100644 ---
> > > > a/arch/arm64/include/uapi/asm/kvm.h +++
> > > > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > > KVM_REG_ARM_FW_REG(0) +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > > KVM_REG_ARM_FW_REG(2) +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4    
> > > 
> > > If this is the first exposure of this information to userspace, I
> > > wonder if we can come up with some common semantics that avoid
> > > having to add new ad-hoc code (and bugs) every time a new
> > > vulnerability/workaround is defined.
> > > 
> > > We seem to have at least the two following independent properties
> > > for a vulnerability, with the listed values for each:
> > > 
> > >  * vulnerability (Vulnerable, Unknown, Not Vulnerable)
> > > 
> > >  * mitigation support (Not Requestable, Requestable)
> > > 
> > > Migrations must not move to the left in _either_ list for any
> > > vulnerability.
> > > 
> > > If we want to hedge out bets we could follow the style of the ID
> > > registers and allocate to each theoretical vulnerability a pair of
> > > signed 2- or (for more expansion room if we think we might need
> > > it) 4-bit fields.
> > > 
> > > We could perhaps allocate as follows:
> > > 
> > >  * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
> > >  *  0=Mitigation not requestable, 1=Mitigation requestable  
> > 
> > So as discussed in person, that sounds quite neat. I implemented
> > that, but the sign extension and masking to n bits is not very
> > pretty and limits readability.
> > However the property of having a kind of "vulnerability scale",
> > where a simple comparison would determine compatibility, is a good
> > thing to have and drastically simplifies the checking code.
> >   
> > > Checking code wouldn't need to know which fields describe
> > > mitigation mechanisms and which describe vulnerabilities: we'd
> > > just do a strict  
> > > >= comparison on each.  
> > > 
> > > Further, if a register is never written before the vcpu is first
> > > run, we should imply a write of 0 to it as part of KVM_RUN (so
> > > that if the destination node has a negative value anywhere,
> > > KVM_RUN barfs cleanly.  
> > 
> > What I like about the signedness is this "0 means unknown", which is
> > magically forwards compatible. However I am not sure we can transfer
> > this semantic into every upcoming register that pops up in the
> > future.  
> 
> I appreciate the concern, but can you give an example of how it might
> break?

The general problem is that we don't know how future firmware registers
would need to look like and whether they are actually for workarounds.
Take for instance KVM_REG_ARM_FW_REG(0), which holds the PSCI version.
So at the very least we would need to reserve a region of the 64K
firmware registers to use this scheme, yet don't know how many we would
need.
 
> My idea is that you can check for compatibility by comparing fields
> without any need to know what they mean, but we wouldn't pre-assign
> meanings for the values of unallocated fields, just create a precedent
> that future fields can follow (where it works).

For clarity, what do you mean with "... you can check ...", exactly? I
think this "you" would be the receiving kernel, which is very strict
about unknown registers (-EINVAL), because we don't take any chances.
From what I understand how QEMU works, is that it just takes the list
of registers from the originating kernel and asks the receiving kernel
about them. It doesn't try to interpret most registers in any way.

Now QEMU *could* ignore the -EINVAL return and proceed anyway, if it
would be very sure about the implications or the admin told it so.
But I believe this should be done on a per register basis, and in QEMU,
relying on some forward looking scheme sounds a bit fragile to me.
It is my understanding that QEMU does not want to gamble with migration.

> This is much like the CPU ID features scheme itself.  A "0" might
> mean that something is absent, but there's no way (or need) to know
> what.

So I think we don't disagree about that this is possible or even would
be nice, but it's just not how it's used today. I am not sure we want
to introduce something like this, given that we don't know if there will
be any future workaround registers at all. Sounds a bit over-engineered
and fragile to me.

Peter, can you give your opinion about whether having some generic class
of firmware workaround registers which could be checked in a generic way
is something we want?

> > Actually we might not need this:
> > My understanding of how QEMU handles this in migration is that it
> > reads the f/w reg on the originating host A and writes this into
> > the target host B, without itself interpreting this in any way.
> > It's up to the target kernel (basically this code here) to check
> > compatibility. So I am not sure we actually need a stable scheme.
> > If host A doesn't know about  
> 
> Nothing stops userspace from interpreting the data, so there's a risk
> people may grow to rely on it even if we don't want them to.

Well, but userland would not interpret unknown registers, under the
current scheme, would it?
So it can surely tinker with KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
because it knows about its meaning. But I would be very careful judging
about anything else.
The moment we introduce some scheme, we would have to stick with it
forever. I am just not sure that's worth it. At the end of the day you
could always update QEMU to ignore an -EINVAL on a new firmware w/a
register.
 
> So we should try to have something that's forward-compatible if at all
> possible...
> > a certain register, it won't appear in the result of the
> > KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at
> > all. In the opposite case the receiving host would reject an
> > unknown register, which I believe is safer, although I see that it
> > leaves the "unknown" case on the table.
> > 
> > It would be good to have some opinion of how forward looking we
> > want to (and can) be here.
> > 
> > Meanwhile I am sending a v2 which implements the linear scale idea,
> > without using signed values, as this indeed simplifies the code.
> > I have the signed version still in a branch here, let me know if you
> > want to have a look.  
> 
> Happy to take a look at it.

See below.

> I was hoping that cpufeatures already had a helper for extracting a
> signed field, but I didn't go looking for it...
> 
> At the asm level this is just a sbfx, so it's hardly expensive.

The length of the code or the "performance" is hardly an issue (we are
talking about migration here, which is mostly limited by the speed of
the network). And yes, we have sign_extend32() and (i & 0xf) to
convert, it just looks a bit odd in the code and in the API
documentation.

Cheers,
Andre

diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
index 6c6757c9571b..a7b10d835ce7 100644
--- a/arch/arm/include/uapi/asm/kvm.h
+++ b/arch/arm/include/uapi/asm/kvm.h
@@ -218,10 +218,10 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	(1U << 4)
 
 /* Device Control API: ARM VGIC */
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 367e96fe654e..7d03f8339100 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -229,10 +229,10 @@ struct kvm_vcpu_events {
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
-#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
+#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
 #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED     (1U << 4)
 
 /* Device Control API: ARM VGIC */
diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
index fb6af5ca259e..cfb1519b9a11 100644
--- a/virt/kvm/arm/psci.c
+++ b/virt/kvm/arm/psci.c
@@ -498,7 +498,8 @@ static int get_kernel_wa_level(u64 regid)
 	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
 		switch (fake_kvm_arm_have_ssbd()) {
 		case KVM_SSBD_FORCE_DISABLE:
-			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
+			return sign_extend32(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL,
+					     KVM_REG_FEATURE_LEVEL_WIDTH - 1);
 		case KVM_SSBD_KERNEL:
 			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
 		case KVM_SSBD_FORCE_ENABLE:
@@ -574,7 +575,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	}
 
 	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
-		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
+		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
 
 		if (get_kernel_wa_level(reg->id) < wa_level)
 			return -EINVAL;
@@ -582,7 +583,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return 0;
 
 	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
-		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
+		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
 
 		if (get_kernel_wa_level(reg->id) < wa_level)
 			return -EINVAL;



_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-30 11:39           ` Andre Przywara
@ 2019-01-30 12:07             ` Dave Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-30 12:07 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Marc Zyngier, kvm, linux-arm-kernel, kvmarm

On Wed, Jan 30, 2019 at 11:39:00AM +0000, Andre Przywara wrote:
> On Tue, 29 Jan 2019 21:32:23 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:
> 
> Hi Dave,
> 
> > On Fri, Jan 25, 2019 at 02:46:57PM +0000, Andre Przywara wrote:

[...]

> > > What I like about the signedness is this "0 means unknown", which is
> > > magically forwards compatible. However I am not sure we can transfer
> > > this semantic into every upcoming register that pops up in the
> > > future.  
> > 
> > I appreciate the concern, but can you give an example of how it might
> > break?
> 
> The general problem is that we don't know how future firmware registers
> would need to look like and whether they are actually for workarounds.
> Take for instance KVM_REG_ARM_FW_REG(0), which holds the PSCI version.
> So at the very least we would need to reserve a region of the 64K
> firmware registers to use this scheme, yet don't know how many we would
> need.

My idea was that we reserve a large block of register IDs for this
purpose.  This means that we can say in advance what the semantics
of these registers are going to be, and ensure plenty of expansion
room.

> > My idea is that you can check for compatibility by comparing fields
> > without any need to know what they mean, but we wouldn't pre-assign
> > meanings for the values of unallocated fields, just create a precedent
> > that future fields can follow (where it works).
> 
> For clarity, what do you mean with "... you can check ...", exactly? I
> think this "you" would be the receiving kernel, which is very strict
> about unknown registers (-EINVAL), because we don't take any chances.
> From what I understand how QEMU works, is that it just takes the list
> of registers from the originating kernel and asks the receiving kernel
> about them. It doesn't try to interpret most registers in any way.

Can we solve this by pre-allocating a block of registers for future
allocation: they all become RAZ, and writes are permitted provided that
the value written to each field satisfies our usual comparison rule
(every field must be written with <= 0 in this case).

This is what I had in mind.

> Now QEMU *could* ignore the -EINVAL return and proceed anyway, if it
> would be very sure about the implications or the admin told it so.
> But I believe this should be done on a per register basis, and in QEMU,
> relying on some forward looking scheme sounds a bit fragile to me.
> It is my understanding that QEMU does not want to gamble with migration.
> 
> > This is much like the CPU ID features scheme itself.  A "0" might
> > mean that something is absent, but there's no way (or need) to know
> > what.
> 
> So I think we don't disagree about that this is possible or even would
> be nice, but it's just not how it's used today. I am not sure we want
> to introduce something like this, given that we don't know if there will
> be any future workaround registers at all. Sounds a bit over-engineered
> and fragile to me.

Yes, that's a concern.

We could just allocate a single register with these semantics, but
use it as a template for future expansion if it turns out that we
need more fields.  We'll know pretty soon how fast the number of
fields is likely to grow.

> Peter, can you give your opinion about whether having some generic class
> of firmware workaround registers which could be checked in a generic way
> is something we want?
> 
> > > Actually we might not need this:
> > > My understanding of how QEMU handles this in migration is that it
> > > reads the f/w reg on the originating host A and writes this into
> > > the target host B, without itself interpreting this in any way.
> > > It's up to the target kernel (basically this code here) to check
> > > compatibility. So I am not sure we actually need a stable scheme.
> > > If host A doesn't know about  
> > 
> > Nothing stops userspace from interpreting the data, so there's a risk
> > people may grow to rely on it even if we don't want them to.
> 
> Well, but userland would not interpret unknown registers, under the
> current scheme, would it?
> So it can surely tinker with KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
> because it knows about its meaning. But I would be very careful judging
> about anything else.
> The moment we introduce some scheme, we would have to stick with it
> forever. I am just not sure that's worth it. At the end of the day you
> could always update QEMU to ignore an -EINVAL on a new firmware w/a
> register.

My point is, we are introducing a scheme whether we like it or not.

Perhaps we could do this, but make it explicit that these regs hold
KVM private metadata that userspace is not expected to interpret, say

	KVM_REG_ARM_PRIVATE_1
	KVM_REG_ARM_PRIVATE_2
	// ...

which will be listed by KVM_GET_REG_LIST, but with no #defines in the
UAPI headers, except perhaps to identify these IDs as a class (i.e.,
userspace can see it's in the KVM_REG_ARM_PRIVATE_ space, but it's
told what a given ID means).

Source and destination node might understand different numbers of such
registers: we'd need a way to handle this (or at least to guarantee that
the mismatch is detected).


>  
> > So we should try to have something that's forward-compatible if at all
> > possible...
> > > a certain register, it won't appear in the result of the
> > > KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at
> > > all. In the opposite case the receiving host would reject an
> > > unknown register, which I believe is safer, although I see that it
> > > leaves the "unknown" case on the table.
> > > 
> > > It would be good to have some opinion of how forward looking we
> > > want to (and can) be here.
> > > 
> > > Meanwhile I am sending a v2 which implements the linear scale idea,
> > > without using signed values, as this indeed simplifies the code.
> > > I have the signed version still in a branch here, let me know if you
> > > want to have a look.  
> > 
> > Happy to take a look at it.
> 
> See below.
> 
> > I was hoping that cpufeatures already had a helper for extracting a
> > signed field, but I didn't go looking for it...
> > 
> > At the asm level this is just a sbfx, so it's hardly expensive.
> 
> The length of the code or the "performance" is hardly an issue (we are
> talking about migration here, which is mostly limited by the speed of
> the network). And yes, we have sign_extend32() and (i & 0xf) to
> convert, it just looks a bit odd in the code and in the API
> documentation.
> 
> Cheers,
> Andre
> 
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 6c6757c9571b..a7b10d835ce7 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -218,10 +218,10 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	(1U << 4)
>  
>  /* Device Control API: ARM VGIC */
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 367e96fe654e..7d03f8339100 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -229,10 +229,10 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED     (1U << 4)
>  
>  /* Device Control API: ARM VGIC */
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index fb6af5ca259e..cfb1519b9a11 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -498,7 +498,8 @@ static int get_kernel_wa_level(u64 regid)
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>  		switch (fake_kvm_arm_have_ssbd()) {
>  		case KVM_SSBD_FORCE_DISABLE:
> -			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> +			return sign_extend32(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL,
> +					     KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  		case KVM_SSBD_KERNEL:
>  			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
>  		case KVM_SSBD_FORCE_ENABLE:
> @@ -574,7 +575,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	}
>  
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> -		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
> +		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  
>  		if (get_kernel_wa_level(reg->id) < wa_level)
>  			return -EINVAL;
> @@ -582,7 +583,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  		return 0;
>  
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> -		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
> +		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);

We could have a helper to do this.  I agree it's marginally uglier than
working with unsigned fields, we could probably use
cpuid_feature_extract_signed_field() to achieve the same.

I agree that this is bikeshedding though, and it doesn't matter one way
or the other unless there is some other compelling argument.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-01-30 12:07             ` Dave Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-01-30 12:07 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Marc Zyngier, Peter Maydell, kvm, linux-arm-kernel, kvmarm

On Wed, Jan 30, 2019 at 11:39:00AM +0000, Andre Przywara wrote:
> On Tue, 29 Jan 2019 21:32:23 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:
> 
> Hi Dave,
> 
> > On Fri, Jan 25, 2019 at 02:46:57PM +0000, Andre Przywara wrote:

[...]

> > > What I like about the signedness is this "0 means unknown", which is
> > > magically forwards compatible. However I am not sure we can transfer
> > > this semantic into every upcoming register that pops up in the
> > > future.  
> > 
> > I appreciate the concern, but can you give an example of how it might
> > break?
> 
> The general problem is that we don't know how future firmware registers
> would need to look like and whether they are actually for workarounds.
> Take for instance KVM_REG_ARM_FW_REG(0), which holds the PSCI version.
> So at the very least we would need to reserve a region of the 64K
> firmware registers to use this scheme, yet don't know how many we would
> need.

My idea was that we reserve a large block of register IDs for this
purpose.  This means that we can say in advance what the semantics
of these registers are going to be, and ensure plenty of expansion
room.

> > My idea is that you can check for compatibility by comparing fields
> > without any need to know what they mean, but we wouldn't pre-assign
> > meanings for the values of unallocated fields, just create a precedent
> > that future fields can follow (where it works).
> 
> For clarity, what do you mean with "... you can check ...", exactly? I
> think this "you" would be the receiving kernel, which is very strict
> about unknown registers (-EINVAL), because we don't take any chances.
> From what I understand how QEMU works, is that it just takes the list
> of registers from the originating kernel and asks the receiving kernel
> about them. It doesn't try to interpret most registers in any way.

Can we solve this by pre-allocating a block of registers for future
allocation: they all become RAZ, and writes are permitted provided that
the value written to each field satisfies our usual comparison rule
(every field must be written with <= 0 in this case).

This is what I had in mind.

> Now QEMU *could* ignore the -EINVAL return and proceed anyway, if it
> would be very sure about the implications or the admin told it so.
> But I believe this should be done on a per register basis, and in QEMU,
> relying on some forward looking scheme sounds a bit fragile to me.
> It is my understanding that QEMU does not want to gamble with migration.
> 
> > This is much like the CPU ID features scheme itself.  A "0" might
> > mean that something is absent, but there's no way (or need) to know
> > what.
> 
> So I think we don't disagree about that this is possible or even would
> be nice, but it's just not how it's used today. I am not sure we want
> to introduce something like this, given that we don't know if there will
> be any future workaround registers at all. Sounds a bit over-engineered
> and fragile to me.

Yes, that's a concern.

We could just allocate a single register with these semantics, but
use it as a template for future expansion if it turns out that we
need more fields.  We'll know pretty soon how fast the number of
fields is likely to grow.

> Peter, can you give your opinion about whether having some generic class
> of firmware workaround registers which could be checked in a generic way
> is something we want?
> 
> > > Actually we might not need this:
> > > My understanding of how QEMU handles this in migration is that it
> > > reads the f/w reg on the originating host A and writes this into
> > > the target host B, without itself interpreting this in any way.
> > > It's up to the target kernel (basically this code here) to check
> > > compatibility. So I am not sure we actually need a stable scheme.
> > > If host A doesn't know about  
> > 
> > Nothing stops userspace from interpreting the data, so there's a risk
> > people may grow to rely on it even if we don't want them to.
> 
> Well, but userland would not interpret unknown registers, under the
> current scheme, would it?
> So it can surely tinker with KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
> because it knows about its meaning. But I would be very careful judging
> about anything else.
> The moment we introduce some scheme, we would have to stick with it
> forever. I am just not sure that's worth it. At the end of the day you
> could always update QEMU to ignore an -EINVAL on a new firmware w/a
> register.

My point is, we are introducing a scheme whether we like it or not.

Perhaps we could do this, but make it explicit that these regs hold
KVM private metadata that userspace is not expected to interpret, say

	KVM_REG_ARM_PRIVATE_1
	KVM_REG_ARM_PRIVATE_2
	// ...

which will be listed by KVM_GET_REG_LIST, but with no #defines in the
UAPI headers, except perhaps to identify these IDs as a class (i.e.,
userspace can see it's in the KVM_REG_ARM_PRIVATE_ space, but it's
told what a given ID means).

Source and destination node might understand different numbers of such
registers: we'd need a way to handle this (or at least to guarantee that
the mismatch is detected).


>  
> > So we should try to have something that's forward-compatible if at all
> > possible...
> > > a certain register, it won't appear in the result of the
> > > KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at
> > > all. In the opposite case the receiving host would reject an
> > > unknown register, which I believe is safer, although I see that it
> > > leaves the "unknown" case on the table.
> > > 
> > > It would be good to have some opinion of how forward looking we
> > > want to (and can) be here.
> > > 
> > > Meanwhile I am sending a v2 which implements the linear scale idea,
> > > without using signed values, as this indeed simplifies the code.
> > > I have the signed version still in a branch here, let me know if you
> > > want to have a look.  
> > 
> > Happy to take a look at it.
> 
> See below.
> 
> > I was hoping that cpufeatures already had a helper for extracting a
> > signed field, but I didn't go looking for it...
> > 
> > At the asm level this is just a sbfx, so it's hardly expensive.
> 
> The length of the code or the "performance" is hardly an issue (we are
> talking about migration here, which is mostly limited by the speed of
> the network). And yes, we have sign_extend32() and (i & 0xf) to
> convert, it just looks a bit odd in the code and in the API
> documentation.
> 
> Cheers,
> Andre
> 
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 6c6757c9571b..a7b10d835ce7 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -218,10 +218,10 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	(1U << 4)
>  
>  /* Device Control API: ARM VGIC */
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 367e96fe654e..7d03f8339100 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -229,10 +229,10 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED     (1U << 4)
>  
>  /* Device Control API: ARM VGIC */
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index fb6af5ca259e..cfb1519b9a11 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -498,7 +498,8 @@ static int get_kernel_wa_level(u64 regid)
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>  		switch (fake_kvm_arm_have_ssbd()) {
>  		case KVM_SSBD_FORCE_DISABLE:
> -			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> +			return sign_extend32(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL,
> +					     KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  		case KVM_SSBD_KERNEL:
>  			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
>  		case KVM_SSBD_FORCE_ENABLE:
> @@ -574,7 +575,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	}
>  
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> -		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
> +		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  
>  		if (get_kernel_wa_level(reg->id) < wa_level)
>  			return -EINVAL;
> @@ -582,7 +583,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  		return 0;
>  
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> -		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
> +		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);

We could have a helper to do this.  I agree it's marginally uglier than
working with unsigned fields, we could probably use
cpuid_feature_extract_signed_field() to achieve the same.

I agree that this is bikeshedding though, and it doesn't matter one way
or the other unless there is some other compelling argument.

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-30 11:39           ` Andre Przywara
@ 2019-02-15  9:58             ` Andre Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-02-15  9:58 UTC (permalink / raw)
  To: Dave Martin, Peter Maydell, Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm

On Wed, 30 Jan 2019 11:39:00 +0000
Andre Przywara <andre.przywara@arm.com> wrote:

Peter, Marc, Christoffer,

can we have an opinion on whether it's useful to introduce some common scheme for firmware workaround system registers (parts of KVM_REG_ARM_FW_REG(x)), which would allow checking them for compatibility between two kernels without specifically knowing about them?
Dave suggested to introduce some kind of signed encoding in the 4 LSBs for all those registers (including future ones), where 0 means UNKNOWN and greater values are better. So without knowing about the particular register, one could judge whether it's safe to migrate.
I am just not sure how useful this is, given that QEMU seems to ask the receiving kernel about any sysreg, and doesn't particularly care about the meaning of those registers. And I am not sure we really want to introduce some kind of forward looking scheme in the kernel here, short of a working crystal ball. I think the kernel policy was always to be as strict as possible about those things.

Any opinions would be welcome, so that we can proceed on those patches.

More context below.

Many Thanks,
Andre.

> On Tue, 29 Jan 2019 21:32:23 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:
> 
> Hi Dave,
> 
> > On Fri, Jan 25, 2019 at 02:46:57PM +0000, Andre Przywara wrote:  
> > > On Tue, 22 Jan 2019 15:17:14 +0000
> > > Dave Martin <Dave.Martin@arm.com> wrote:
> > > 
> > > Hi Dave,
> > > 
> > > thanks for having a look!
> > >     
> > > > On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:    
> > > > > KVM implements the firmware interface for mitigating cache
> > > > > speculation vulnerabilities. Guests may use this interface to
> > > > > ensure mitigation is active.
> > > > > If we want to migrate such a guest to a host with a different
> > > > > support level for those workarounds, migration might need to
> > > > > fail, to ensure that critical guests don't loose their
> > > > > protection.
> > > > > 
> > > > > Introduce a way for userland to save and restore the workarounds
> > > > > state. On restoring we do checks that make sure we don't
> > > > > downgrade our mitigation level.
> > > > > 
> > > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > > > ---
> > > > >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> > > > >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> > > > >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> > > > >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> > > > >  virt/kvm/arm/psci.c                  | 138
> > > > > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+),
> > > > > 2 deletions(-)
> > > > > 
> > > > > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > > > > b/arch/arm/include/asm/kvm_emulate.h index
> > > > > 77121b713bef..2255c50debab 100644 ---
> > > > > a/arch/arm/include/asm/kvm_emulate.h +++
> > > > > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > > > > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu
> > > > > *vcpu) return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> > > > >  
> > > > > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > > > > kvm_vcpu *vcpu) +{
> > > > > +	return false;
> > > > > +}
> > > > > +
> > > > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > > > kvm_vcpu *vcpu,
> > > > > +						      bool
> > > > > flag) +{
> > > > > +}
> > > > > +
> > > > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > > > >  {
> > > > >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > > > > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > > > > b/arch/arm/include/uapi/asm/kvm.h index
> > > > > 4602464ebdfb..02c93b1d8f6d 100644 ---
> > > > > a/arch/arm/include/uapi/asm/kvm.h +++
> > > > > b/arch/arm/include/uapi/asm/kvm.h @@ -214,6 +214,15 @@ struct
> > > > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > > > (KVM_REG_ARM | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > > > KVM_REG_ARM_FW_REG(0) +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1 KVM_REG_ARM_FW_REG(1)
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > > > KVM_REG_ARM_FW_REG(2) +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED
> > > > > 4 /* Device Control API: ARM VGIC */ #define
> > > > > KVM_DEV_ARM_VGIC_GRP_ADDR	0 diff --git
> > > > > a/arch/arm64/include/asm/kvm_emulate.h
> > > > > b/arch/arm64/include/asm/kvm_emulate.h index
> > > > > 506386a3edde..a44f07f68da4 100644 ---
> > > > > a/arch/arm64/include/asm/kvm_emulate.h +++
> > > > > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@
> > > > > static inline unsigned long kvm_vcpu_get_mpidr_aff(struct
> > > > > kvm_vcpu *vcpu) return vcpu_read_sys_reg(vcpu, MPIDR_EL1) &
> > > > > MPIDR_HWID_BITMASK; } +static inline bool
> > > > > kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu) +{
> > > > > +	return vcpu->arch.workaround_flags &
> > > > > VCPU_WORKAROUND_2_FLAG; +}
> > > > > +
> > > > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > > > kvm_vcpu *vcpu,
> > > > > +						      bool
> > > > > flag) +{
> > > > > +	if (flag)
> > > > > +		vcpu->arch.workaround_flags |=
> > > > > VCPU_WORKAROUND_2_FLAG;
> > > > > +	else
> > > > > +		vcpu->arch.workaround_flags &=
> > > > > ~VCPU_WORKAROUND_2_FLAG; +}
> > > > > +
> > > > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > > > >  {
> > > > >  	if (vcpu_mode_is_32bit(vcpu)) {
> > > > > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > > > > b/arch/arm64/include/uapi/asm/kvm.h index
> > > > > 97c3478ee6e7..4a19ef199a99 100644 ---
> > > > > a/arch/arm64/include/uapi/asm/kvm.h +++
> > > > > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > > > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > > > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > > > KVM_REG_ARM_FW_REG(0) +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > > > KVM_REG_ARM_FW_REG(2) +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4      
> > > > 
> > > > If this is the first exposure of this information to userspace, I
> > > > wonder if we can come up with some common semantics that avoid
> > > > having to add new ad-hoc code (and bugs) every time a new
> > > > vulnerability/workaround is defined.
> > > > 
> > > > We seem to have at least the two following independent properties
> > > > for a vulnerability, with the listed values for each:
> > > > 
> > > >  * vulnerability (Vulnerable, Unknown, Not Vulnerable)
> > > > 
> > > >  * mitigation support (Not Requestable, Requestable)
> > > > 
> > > > Migrations must not move to the left in _either_ list for any
> > > > vulnerability.
> > > > 
> > > > If we want to hedge out bets we could follow the style of the ID
> > > > registers and allocate to each theoretical vulnerability a pair of
> > > > signed 2- or (for more expansion room if we think we might need
> > > > it) 4-bit fields.
> > > > 
> > > > We could perhaps allocate as follows:
> > > > 
> > > >  * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
> > > >  *  0=Mitigation not requestable, 1=Mitigation requestable    
> > > 
> > > So as discussed in person, that sounds quite neat. I implemented
> > > that, but the sign extension and masking to n bits is not very
> > > pretty and limits readability.
> > > However the property of having a kind of "vulnerability scale",
> > > where a simple comparison would determine compatibility, is a good
> > > thing to have and drastically simplifies the checking code.
> > >     
> > > > Checking code wouldn't need to know which fields describe
> > > > mitigation mechanisms and which describe vulnerabilities: we'd
> > > > just do a strict    
> > > > >= comparison on each.    
> > > > 
> > > > Further, if a register is never written before the vcpu is first
> > > > run, we should imply a write of 0 to it as part of KVM_RUN (so
> > > > that if the destination node has a negative value anywhere,
> > > > KVM_RUN barfs cleanly.    
> > > 
> > > What I like about the signedness is this "0 means unknown", which is
> > > magically forwards compatible. However I am not sure we can transfer
> > > this semantic into every upcoming register that pops up in the
> > > future.    
> > 
> > I appreciate the concern, but can you give an example of how it might
> > break?  
> 
> The general problem is that we don't know how future firmware registers
> would need to look like and whether they are actually for workarounds.
> Take for instance KVM_REG_ARM_FW_REG(0), which holds the PSCI version.
> So at the very least we would need to reserve a region of the 64K
> firmware registers to use this scheme, yet don't know how many we would
> need.
>  
> > My idea is that you can check for compatibility by comparing fields
> > without any need to know what they mean, but we wouldn't pre-assign
> > meanings for the values of unallocated fields, just create a precedent
> > that future fields can follow (where it works).  
> 
> For clarity, what do you mean with "... you can check ...", exactly? I
> think this "you" would be the receiving kernel, which is very strict
> about unknown registers (-EINVAL), because we don't take any chances.
> From what I understand how QEMU works, is that it just takes the list
> of registers from the originating kernel and asks the receiving kernel
> about them. It doesn't try to interpret most registers in any way.
> 
> Now QEMU *could* ignore the -EINVAL return and proceed anyway, if it
> would be very sure about the implications or the admin told it so.
> But I believe this should be done on a per register basis, and in QEMU,
> relying on some forward looking scheme sounds a bit fragile to me.
> It is my understanding that QEMU does not want to gamble with migration.
> 
> > This is much like the CPU ID features scheme itself.  A "0" might
> > mean that something is absent, but there's no way (or need) to know
> > what.  
> 
> So I think we don't disagree about that this is possible or even would
> be nice, but it's just not how it's used today. I am not sure we want
> to introduce something like this, given that we don't know if there will
> be any future workaround registers at all. Sounds a bit over-engineered
> and fragile to me.
> 
> Peter, can you give your opinion about whether having some generic class
> of firmware workaround registers which could be checked in a generic way
> is something we want?
> 
> > > Actually we might not need this:
> > > My understanding of how QEMU handles this in migration is that it
> > > reads the f/w reg on the originating host A and writes this into
> > > the target host B, without itself interpreting this in any way.
> > > It's up to the target kernel (basically this code here) to check
> > > compatibility. So I am not sure we actually need a stable scheme.
> > > If host A doesn't know about    
> > 
> > Nothing stops userspace from interpreting the data, so there's a risk
> > people may grow to rely on it even if we don't want them to.  
> 
> Well, but userland would not interpret unknown registers, under the
> current scheme, would it?
> So it can surely tinker with KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
> because it knows about its meaning. But I would be very careful judging
> about anything else.
> The moment we introduce some scheme, we would have to stick with it
> forever. I am just not sure that's worth it. At the end of the day you
> could always update QEMU to ignore an -EINVAL on a new firmware w/a
> register.
>  
> > So we should try to have something that's forward-compatible if at all
> > possible...  
> > > a certain register, it won't appear in the result of the
> > > KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at
> > > all. In the opposite case the receiving host would reject an
> > > unknown register, which I believe is safer, although I see that it
> > > leaves the "unknown" case on the table.
> > > 
> > > It would be good to have some opinion of how forward looking we
> > > want to (and can) be here.
> > > 
> > > Meanwhile I am sending a v2 which implements the linear scale idea,
> > > without using signed values, as this indeed simplifies the code.
> > > I have the signed version still in a branch here, let me know if you
> > > want to have a look.    
> > 
> > Happy to take a look at it.  
> 
> See below.
> 
> > I was hoping that cpufeatures already had a helper for extracting a
> > signed field, but I didn't go looking for it...
> > 
> > At the asm level this is just a sbfx, so it's hardly expensive.  
> 
> The length of the code or the "performance" is hardly an issue (we are
> talking about migration here, which is mostly limited by the speed of
> the network). And yes, we have sign_extend32() and (i & 0xf) to
> convert, it just looks a bit odd in the code and in the API
> documentation.
> 
> Cheers,
> Andre
> 
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 6c6757c9571b..a7b10d835ce7 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -218,10 +218,10 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	(1U << 4)
>  
>  /* Device Control API: ARM VGIC */
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 367e96fe654e..7d03f8339100 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -229,10 +229,10 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED     (1U << 4)
>  
>  /* Device Control API: ARM VGIC */
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index fb6af5ca259e..cfb1519b9a11 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -498,7 +498,8 @@ static int get_kernel_wa_level(u64 regid)
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>  		switch (fake_kvm_arm_have_ssbd()) {
>  		case KVM_SSBD_FORCE_DISABLE:
> -			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> +			return sign_extend32(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL,
> +					     KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  		case KVM_SSBD_KERNEL:
>  			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
>  		case KVM_SSBD_FORCE_ENABLE:
> @@ -574,7 +575,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	}
>  
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> -		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
> +		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  
>  		if (get_kernel_wa_level(reg->id) < wa_level)
>  			return -EINVAL;
> @@ -582,7 +583,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  		return 0;
>  
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> -		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
> +		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  
>  		if (get_kernel_wa_level(reg->id) < wa_level)
>  			return -EINVAL;
> 
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-15  9:58             ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-02-15  9:58 UTC (permalink / raw)
  To: Dave Martin, Peter Maydell, Marc Zyngier, Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm

On Wed, 30 Jan 2019 11:39:00 +0000
Andre Przywara <andre.przywara@arm.com> wrote:

Peter, Marc, Christoffer,

can we have an opinion on whether it's useful to introduce some common scheme for firmware workaround system registers (parts of KVM_REG_ARM_FW_REG(x)), which would allow checking them for compatibility between two kernels without specifically knowing about them?
Dave suggested to introduce some kind of signed encoding in the 4 LSBs for all those registers (including future ones), where 0 means UNKNOWN and greater values are better. So without knowing about the particular register, one could judge whether it's safe to migrate.
I am just not sure how useful this is, given that QEMU seems to ask the receiving kernel about any sysreg, and doesn't particularly care about the meaning of those registers. And I am not sure we really want to introduce some kind of forward looking scheme in the kernel here, short of a working crystal ball. I think the kernel policy was always to be as strict as possible about those things.

Any opinions would be welcome, so that we can proceed on those patches.

More context below.

Many Thanks,
Andre.

> On Tue, 29 Jan 2019 21:32:23 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:
> 
> Hi Dave,
> 
> > On Fri, Jan 25, 2019 at 02:46:57PM +0000, Andre Przywara wrote:  
> > > On Tue, 22 Jan 2019 15:17:14 +0000
> > > Dave Martin <Dave.Martin@arm.com> wrote:
> > > 
> > > Hi Dave,
> > > 
> > > thanks for having a look!
> > >     
> > > > On Mon, Jan 07, 2019 at 12:05:36PM +0000, Andre Przywara wrote:    
> > > > > KVM implements the firmware interface for mitigating cache
> > > > > speculation vulnerabilities. Guests may use this interface to
> > > > > ensure mitigation is active.
> > > > > If we want to migrate such a guest to a host with a different
> > > > > support level for those workarounds, migration might need to
> > > > > fail, to ensure that critical guests don't loose their
> > > > > protection.
> > > > > 
> > > > > Introduce a way for userland to save and restore the workarounds
> > > > > state. On restoring we do checks that make sure we don't
> > > > > downgrade our mitigation level.
> > > > > 
> > > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > > > ---
> > > > >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> > > > >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> > > > >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> > > > >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> > > > >  virt/kvm/arm/psci.c                  | 138
> > > > > ++++++++++++++++++++++++++- 5 files changed, 178 insertions(+),
> > > > > 2 deletions(-)
> > > > > 
> > > > > diff --git a/arch/arm/include/asm/kvm_emulate.h
> > > > > b/arch/arm/include/asm/kvm_emulate.h index
> > > > > 77121b713bef..2255c50debab 100644 ---
> > > > > a/arch/arm/include/asm/kvm_emulate.h +++
> > > > > b/arch/arm/include/asm/kvm_emulate.h @@ -275,6 +275,16 @@ static
> > > > > inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu
> > > > > *vcpu) return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK; }
> > > > >  
> > > > > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct
> > > > > kvm_vcpu *vcpu) +{
> > > > > +	return false;
> > > > > +}
> > > > > +
> > > > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > > > kvm_vcpu *vcpu,
> > > > > +						      bool
> > > > > flag) +{
> > > > > +}
> > > > > +
> > > > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > > > >  {
> > > > >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > > > > diff --git a/arch/arm/include/uapi/asm/kvm.h
> > > > > b/arch/arm/include/uapi/asm/kvm.h index
> > > > > 4602464ebdfb..02c93b1d8f6d 100644 ---
> > > > > a/arch/arm/include/uapi/asm/kvm.h +++
> > > > > b/arch/arm/include/uapi/asm/kvm.h @@ -214,6 +214,15 @@ struct
> > > > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > > > (KVM_REG_ARM | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > > > KVM_REG_ARM_FW_REG(0) +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1 KVM_REG_ARM_FW_REG(1)
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > > > KVM_REG_ARM_FW_REG(2) +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED
> > > > > 4 /* Device Control API: ARM VGIC */ #define
> > > > > KVM_DEV_ARM_VGIC_GRP_ADDR	0 diff --git
> > > > > a/arch/arm64/include/asm/kvm_emulate.h
> > > > > b/arch/arm64/include/asm/kvm_emulate.h index
> > > > > 506386a3edde..a44f07f68da4 100644 ---
> > > > > a/arch/arm64/include/asm/kvm_emulate.h +++
> > > > > b/arch/arm64/include/asm/kvm_emulate.h @@ -336,6 +336,20 @@
> > > > > static inline unsigned long kvm_vcpu_get_mpidr_aff(struct
> > > > > kvm_vcpu *vcpu) return vcpu_read_sys_reg(vcpu, MPIDR_EL1) &
> > > > > MPIDR_HWID_BITMASK; } +static inline bool
> > > > > kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu) +{
> > > > > +	return vcpu->arch.workaround_flags &
> > > > > VCPU_WORKAROUND_2_FLAG; +}
> > > > > +
> > > > > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct
> > > > > kvm_vcpu *vcpu,
> > > > > +						      bool
> > > > > flag) +{
> > > > > +	if (flag)
> > > > > +		vcpu->arch.workaround_flags |=
> > > > > VCPU_WORKAROUND_2_FLAG;
> > > > > +	else
> > > > > +		vcpu->arch.workaround_flags &=
> > > > > ~VCPU_WORKAROUND_2_FLAG; +}
> > > > > +
> > > > >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> > > > >  {
> > > > >  	if (vcpu_mode_is_32bit(vcpu)) {
> > > > > diff --git a/arch/arm64/include/uapi/asm/kvm.h
> > > > > b/arch/arm64/include/uapi/asm/kvm.h index
> > > > > 97c3478ee6e7..4a19ef199a99 100644 ---
> > > > > a/arch/arm64/include/uapi/asm/kvm.h +++
> > > > > b/arch/arm64/include/uapi/asm/kvm.h @@ -225,6 +225,15 @@ struct
> > > > > kvm_vcpu_events { #define KVM_REG_ARM_FW_REG(r)
> > > > > (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \ KVM_REG_ARM_FW | ((r) &
> > > > > 0xffff)) #define KVM_REG_ARM_PSCI_VERSION
> > > > > KVM_REG_ARM_FW_REG(0) +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > > > > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2
> > > > > KVM_REG_ARM_FW_REG(2) +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2 +#define
> > > > > KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4      
> > > > 
> > > > If this is the first exposure of this information to userspace, I
> > > > wonder if we can come up with some common semantics that avoid
> > > > having to add new ad-hoc code (and bugs) every time a new
> > > > vulnerability/workaround is defined.
> > > > 
> > > > We seem to have at least the two following independent properties
> > > > for a vulnerability, with the listed values for each:
> > > > 
> > > >  * vulnerability (Vulnerable, Unknown, Not Vulnerable)
> > > > 
> > > >  * mitigation support (Not Requestable, Requestable)
> > > > 
> > > > Migrations must not move to the left in _either_ list for any
> > > > vulnerability.
> > > > 
> > > > If we want to hedge out bets we could follow the style of the ID
> > > > registers and allocate to each theoretical vulnerability a pair of
> > > > signed 2- or (for more expansion room if we think we might need
> > > > it) 4-bit fields.
> > > > 
> > > > We could perhaps allocate as follows:
> > > > 
> > > >  * -1=Vulnerable, 0=Unknown, 1=Not Vulnerable
> > > >  *  0=Mitigation not requestable, 1=Mitigation requestable    
> > > 
> > > So as discussed in person, that sounds quite neat. I implemented
> > > that, but the sign extension and masking to n bits is not very
> > > pretty and limits readability.
> > > However the property of having a kind of "vulnerability scale",
> > > where a simple comparison would determine compatibility, is a good
> > > thing to have and drastically simplifies the checking code.
> > >     
> > > > Checking code wouldn't need to know which fields describe
> > > > mitigation mechanisms and which describe vulnerabilities: we'd
> > > > just do a strict    
> > > > >= comparison on each.    
> > > > 
> > > > Further, if a register is never written before the vcpu is first
> > > > run, we should imply a write of 0 to it as part of KVM_RUN (so
> > > > that if the destination node has a negative value anywhere,
> > > > KVM_RUN barfs cleanly.    
> > > 
> > > What I like about the signedness is this "0 means unknown", which is
> > > magically forwards compatible. However I am not sure we can transfer
> > > this semantic into every upcoming register that pops up in the
> > > future.    
> > 
> > I appreciate the concern, but can you give an example of how it might
> > break?  
> 
> The general problem is that we don't know how future firmware registers
> would need to look like and whether they are actually for workarounds.
> Take for instance KVM_REG_ARM_FW_REG(0), which holds the PSCI version.
> So at the very least we would need to reserve a region of the 64K
> firmware registers to use this scheme, yet don't know how many we would
> need.
>  
> > My idea is that you can check for compatibility by comparing fields
> > without any need to know what they mean, but we wouldn't pre-assign
> > meanings for the values of unallocated fields, just create a precedent
> > that future fields can follow (where it works).  
> 
> For clarity, what do you mean with "... you can check ...", exactly? I
> think this "you" would be the receiving kernel, which is very strict
> about unknown registers (-EINVAL), because we don't take any chances.
> From what I understand how QEMU works, is that it just takes the list
> of registers from the originating kernel and asks the receiving kernel
> about them. It doesn't try to interpret most registers in any way.
> 
> Now QEMU *could* ignore the -EINVAL return and proceed anyway, if it
> would be very sure about the implications or the admin told it so.
> But I believe this should be done on a per register basis, and in QEMU,
> relying on some forward looking scheme sounds a bit fragile to me.
> It is my understanding that QEMU does not want to gamble with migration.
> 
> > This is much like the CPU ID features scheme itself.  A "0" might
> > mean that something is absent, but there's no way (or need) to know
> > what.  
> 
> So I think we don't disagree about that this is possible or even would
> be nice, but it's just not how it's used today. I am not sure we want
> to introduce something like this, given that we don't know if there will
> be any future workaround registers at all. Sounds a bit over-engineered
> and fragile to me.
> 
> Peter, can you give your opinion about whether having some generic class
> of firmware workaround registers which could be checked in a generic way
> is something we want?
> 
> > > Actually we might not need this:
> > > My understanding of how QEMU handles this in migration is that it
> > > reads the f/w reg on the originating host A and writes this into
> > > the target host B, without itself interpreting this in any way.
> > > It's up to the target kernel (basically this code here) to check
> > > compatibility. So I am not sure we actually need a stable scheme.
> > > If host A doesn't know about    
> > 
> > Nothing stops userspace from interpreting the data, so there's a risk
> > people may grow to rely on it even if we don't want them to.  
> 
> Well, but userland would not interpret unknown registers, under the
> current scheme, would it?
> So it can surely tinker with KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1,
> because it knows about its meaning. But I would be very careful judging
> about anything else.
> The moment we introduce some scheme, we would have to stick with it
> forever. I am just not sure that's worth it. At the end of the day you
> could always update QEMU to ignore an -EINVAL on a new firmware w/a
> register.
>  
> > So we should try to have something that's forward-compatible if at all
> > possible...  
> > > a certain register, it won't appear in the result of the
> > > KVM_GET_REG_LIST ioctl, so it won't be transferred to host B at
> > > all. In the opposite case the receiving host would reject an
> > > unknown register, which I believe is safer, although I see that it
> > > leaves the "unknown" case on the table.
> > > 
> > > It would be good to have some opinion of how forward looking we
> > > want to (and can) be here.
> > > 
> > > Meanwhile I am sending a v2 which implements the linear scale idea,
> > > without using signed values, as this indeed simplifies the code.
> > > I have the signed version still in a branch here, let me know if you
> > > want to have a look.    
> > 
> > Happy to take a look at it.  
> 
> See below.
> 
> > I was hoping that cpufeatures already had a helper for extracting a
> > signed field, but I didn't go looking for it...
> > 
> > At the asm level this is just a sbfx, so it's hardly expensive.  
> 
> The length of the code or the "performance" is hardly an issue (we are
> talking about migration here, which is mostly limited by the speed of
> the network). And yes, we have sign_extend32() and (i & 0xf) to
> convert, it just looks a bit odd in the code and in the API
> documentation.
> 
> Cheers,
> Andre
> 
> diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> index 6c6757c9571b..a7b10d835ce7 100644
> --- a/arch/arm/include/uapi/asm/kvm.h
> +++ b/arch/arm/include/uapi/asm/kvm.h
> @@ -218,10 +218,10 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	(1U << 4)
>  
>  /* Device Control API: ARM VGIC */
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index 367e96fe654e..7d03f8339100 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -229,10 +229,10 @@ struct kvm_vcpu_events {
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	1
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	2
> -#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	3
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	(-1 & 0xf)
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN	0
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
>  #define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED     (1U << 4)
>  
>  /* Device Control API: ARM VGIC */
> diff --git a/virt/kvm/arm/psci.c b/virt/kvm/arm/psci.c
> index fb6af5ca259e..cfb1519b9a11 100644
> --- a/virt/kvm/arm/psci.c
> +++ b/virt/kvm/arm/psci.c
> @@ -498,7 +498,8 @@ static int get_kernel_wa_level(u64 regid)
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
>  		switch (fake_kvm_arm_have_ssbd()) {
>  		case KVM_SSBD_FORCE_DISABLE:
> -			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
> +			return sign_extend32(KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL,
> +					     KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  		case KVM_SSBD_KERNEL:
>  			return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL;
>  		case KVM_SSBD_FORCE_ENABLE:
> @@ -574,7 +575,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  	}
>  
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
> -		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
> +		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  
>  		if (get_kernel_wa_level(reg->id) < wa_level)
>  			return -EINVAL;
> @@ -582,7 +583,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
>  		return 0;
>  
>  	case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
> -		wa_level = val & KVM_REG_FEATURE_LEVEL_MASK;
> +		wa_level = sign_extend32(val, KVM_REG_FEATURE_LEVEL_WIDTH - 1);
>  
>  		if (get_kernel_wa_level(reg->id) < wa_level)
>  			return -EINVAL;
> 
> 


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-02-15  9:58             ` Andre Przywara
@ 2019-02-15 11:42               ` Marc Zyngier
  -1 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-02-15 11:42 UTC (permalink / raw)
  To: Andre Przywara; +Cc: kvm, Dave Martin, kvmarm, linux-arm-kernel

On Fri, 15 Feb 2019 09:58:57 +0000,
Andre Przywara <andre.przywara@arm.com> wrote:
> 
> On Wed, 30 Jan 2019 11:39:00 +0000
> Andre Przywara <andre.przywara@arm.com> wrote:
> 
> Peter, Marc, Christoffer,
> 
> can we have an opinion on whether it's useful to introduce some
> common scheme for firmware workaround system registers (parts of
> KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> compatibility between two kernels without specifically knowing about
> them?
> Dave suggested to introduce some kind of signed encoding in the 4
> LSBs for all those registers (including future ones), where 0 means
> UNKNOWN and greater values are better. So without knowing about the
> particular register, one could judge whether it's safe to migrate.
> I am just not sure how useful this is, given that QEMU seems to ask
> the receiving kernel about any sysreg, and doesn't particularly care
> about the meaning of those registers. And I am not sure we really
> want to introduce some kind of forward looking scheme in the kernel
> here, short of a working crystal ball. I think the kernel policy was
> always to be as strict as possible about those things.

I honestly don't understand how userspace can decide whether a given
configuration is migratable or not solely based on the value of such a
register. In my experience, the target system has a role to play, and
is the only place where we can find out about whether migration is
actually possible.

As you said, userspace doesn't interpret the data, nor should it. It
is only on the receiving end that compatibility is assessed and
whether some level of compatibility can be safely ensured.

So to sum it up, I don't believe in this approach as a general way of
describing the handling or errata.

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-15 11:42               ` Marc Zyngier
  0 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-02-15 11:42 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Peter Maydell, kvm, Christoffer Dall, Dave Martin, kvmarm,
	linux-arm-kernel

On Fri, 15 Feb 2019 09:58:57 +0000,
Andre Przywara <andre.przywara@arm.com> wrote:
> 
> On Wed, 30 Jan 2019 11:39:00 +0000
> Andre Przywara <andre.przywara@arm.com> wrote:
> 
> Peter, Marc, Christoffer,
> 
> can we have an opinion on whether it's useful to introduce some
> common scheme for firmware workaround system registers (parts of
> KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> compatibility between two kernels without specifically knowing about
> them?
> Dave suggested to introduce some kind of signed encoding in the 4
> LSBs for all those registers (including future ones), where 0 means
> UNKNOWN and greater values are better. So without knowing about the
> particular register, one could judge whether it's safe to migrate.
> I am just not sure how useful this is, given that QEMU seems to ask
> the receiving kernel about any sysreg, and doesn't particularly care
> about the meaning of those registers. And I am not sure we really
> want to introduce some kind of forward looking scheme in the kernel
> here, short of a working crystal ball. I think the kernel policy was
> always to be as strict as possible about those things.

I honestly don't understand how userspace can decide whether a given
configuration is migratable or not solely based on the value of such a
register. In my experience, the target system has a role to play, and
is the only place where we can find out about whether migration is
actually possible.

As you said, userspace doesn't interpret the data, nor should it. It
is only on the receiving end that compatibility is assessed and
whether some level of compatibility can be safely ensured.

So to sum it up, I don't believe in this approach as a general way of
describing the handling or errata.

Thanks,

	M.

-- 
Jazz is not dead, it just smell funny.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-02-15 11:42               ` Marc Zyngier
@ 2019-02-15 17:26                 ` Dave Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-02-15 17:26 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Andre Przywara, kvmarm, kvm

On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:
> On Fri, 15 Feb 2019 09:58:57 +0000,
> Andre Przywara <andre.przywara@arm.com> wrote:
> > 
> > On Wed, 30 Jan 2019 11:39:00 +0000
> > Andre Przywara <andre.przywara@arm.com> wrote:
> > 
> > Peter, Marc, Christoffer,
> > 
> > can we have an opinion on whether it's useful to introduce some
> > common scheme for firmware workaround system registers (parts of
> > KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> > compatibility between two kernels without specifically knowing about
> > them?
> > Dave suggested to introduce some kind of signed encoding in the 4
> > LSBs for all those registers (including future ones), where 0 means
> > UNKNOWN and greater values are better. So without knowing about the
> > particular register, one could judge whether it's safe to migrate.
> > I am just not sure how useful this is, given that QEMU seems to ask
> > the receiving kernel about any sysreg, and doesn't particularly care
> > about the meaning of those registers. And I am not sure we really
> > want to introduce some kind of forward looking scheme in the kernel
> > here, short of a working crystal ball. I think the kernel policy was
> > always to be as strict as possible about those things.
> 
> I honestly don't understand how userspace can decide whether a given
> configuration is migratable or not solely based on the value of such a
> register. In my experience, the target system has a role to play, and
> is the only place where we can find out about whether migration is
> actually possible.

Both origin and target system need to be taken into account.  I don't
think that's anything new.

> As you said, userspace doesn't interpret the data, nor should it. It
> is only on the receiving end that compatibility is assessed and
> whether some level of compatibility can be safely ensured.
> 
> So to sum it up, I don't believe in this approach as a general way of
> describing the handling or errata.

For context, my idea attempted to put KVM, not userspace, in charge of
the decision: userspace applies fixed comparison rules determined ahead
of time, but KVM supplies the values compared (and hence determines the
result).

My worry was that otherwise we may end up with a wild-west tangle of
arbitrary properties that userspace needs specific knowledge about.

We can tolerate a few though.  If we accumulate a significant number
of errata/vulnerability properties that need to be reported to
userspace, this may be worth revisiting.  If not, it doesn't matter.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-15 17:26                 ` Dave Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-02-15 17:26 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-arm-kernel, Andre Przywara, kvmarm, kvm

On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:
> On Fri, 15 Feb 2019 09:58:57 +0000,
> Andre Przywara <andre.przywara@arm.com> wrote:
> > 
> > On Wed, 30 Jan 2019 11:39:00 +0000
> > Andre Przywara <andre.przywara@arm.com> wrote:
> > 
> > Peter, Marc, Christoffer,
> > 
> > can we have an opinion on whether it's useful to introduce some
> > common scheme for firmware workaround system registers (parts of
> > KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> > compatibility between two kernels without specifically knowing about
> > them?
> > Dave suggested to introduce some kind of signed encoding in the 4
> > LSBs for all those registers (including future ones), where 0 means
> > UNKNOWN and greater values are better. So without knowing about the
> > particular register, one could judge whether it's safe to migrate.
> > I am just not sure how useful this is, given that QEMU seems to ask
> > the receiving kernel about any sysreg, and doesn't particularly care
> > about the meaning of those registers. And I am not sure we really
> > want to introduce some kind of forward looking scheme in the kernel
> > here, short of a working crystal ball. I think the kernel policy was
> > always to be as strict as possible about those things.
> 
> I honestly don't understand how userspace can decide whether a given
> configuration is migratable or not solely based on the value of such a
> register. In my experience, the target system has a role to play, and
> is the only place where we can find out about whether migration is
> actually possible.

Both origin and target system need to be taken into account.  I don't
think that's anything new.

> As you said, userspace doesn't interpret the data, nor should it. It
> is only on the receiving end that compatibility is assessed and
> whether some level of compatibility can be safely ensured.
> 
> So to sum it up, I don't believe in this approach as a general way of
> describing the handling or errata.

For context, my idea attempted to put KVM, not userspace, in charge of
the decision: userspace applies fixed comparison rules determined ahead
of time, but KVM supplies the values compared (and hence determines the
result).

My worry was that otherwise we may end up with a wild-west tangle of
arbitrary properties that userspace needs specific knowledge about.

We can tolerate a few though.  If we accumulate a significant number
of errata/vulnerability properties that need to be reported to
userspace, this may be worth revisiting.  If not, it doesn't matter.

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-02-15 17:26                 ` Dave Martin
@ 2019-02-18  9:07                   ` Marc Zyngier
  -1 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-02-18  9:07 UTC (permalink / raw)
  To: Dave Martin, Andre Przywara; +Cc: linux-arm-kernel, kvmarm, kvm

On Fri, 15 Feb 2019 17:26:02 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

> On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:
> > On Fri, 15 Feb 2019 09:58:57 +0000,
> > Andre Przywara <andre.przywara@arm.com> wrote:  
> > > 
> > > On Wed, 30 Jan 2019 11:39:00 +0000
> > > Andre Przywara <andre.przywara@arm.com> wrote:
> > > 
> > > Peter, Marc, Christoffer,
> > > 
> > > can we have an opinion on whether it's useful to introduce some
> > > common scheme for firmware workaround system registers (parts of
> > > KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> > > compatibility between two kernels without specifically knowing about
> > > them?
> > > Dave suggested to introduce some kind of signed encoding in the 4
> > > LSBs for all those registers (including future ones), where 0 means
> > > UNKNOWN and greater values are better. So without knowing about the
> > > particular register, one could judge whether it's safe to migrate.
> > > I am just not sure how useful this is, given that QEMU seems to ask
> > > the receiving kernel about any sysreg, and doesn't particularly care
> > > about the meaning of those registers. And I am not sure we really
> > > want to introduce some kind of forward looking scheme in the kernel
> > > here, short of a working crystal ball. I think the kernel policy was
> > > always to be as strict as possible about those things.  
> > 
> > I honestly don't understand how userspace can decide whether a given
> > configuration is migratable or not solely based on the value of such a
> > register. In my experience, the target system has a role to play, and
> > is the only place where we can find out about whether migration is
> > actually possible.  
> 
> Both origin and target system need to be taken into account.  I don't
> think that's anything new.

Well, that was what I understood from Andre's question.

> 
> > As you said, userspace doesn't interpret the data, nor should it. It
> > is only on the receiving end that compatibility is assessed and
> > whether some level of compatibility can be safely ensured.
> > 
> > So to sum it up, I don't believe in this approach as a general way of
> > describing the handling or errata.  
> 
> For context, my idea attempted to put KVM, not userspace, in charge of
> the decision: userspace applies fixed comparison rules determined ahead
> of time, but KVM supplies the values compared (and hence determines the
> result).
> 
> My worry was that otherwise we may end up with a wild-west tangle of
> arbitrary properties that userspace needs specific knowledge about.

And this is where our understanding differs. I do not think userspace
has to care at all. All it has to do is to provide the saved register
values to the target system, and let KVM accept or refuse these
settings. I can't see what providing a set of predefined values back to
userspace gains us.

An unknown register on the target system fails the restore phase:
that's absolutely fine, as we don't want to run on a system that
doesn't know about the mitigation.

An incompatible value fails the restore as well, as KVM itself finds
that this is a service it cannot safely provide.

No userspace involvement, no QEMU upgrade required. Only the kernel
knows about it.

> We can tolerate a few though.  If we accumulate a significant number
> of errata/vulnerability properties that need to be reported to
> userspace, this may be worth revisiting.  If not, it doesn't matter.

Andre: if you want this to make it into 5.1, the time is now.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-18  9:07                   ` Marc Zyngier
  0 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-02-18  9:07 UTC (permalink / raw)
  To: Dave Martin, Andre Przywara; +Cc: linux-arm-kernel, kvmarm, kvm

On Fri, 15 Feb 2019 17:26:02 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

> On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:
> > On Fri, 15 Feb 2019 09:58:57 +0000,
> > Andre Przywara <andre.przywara@arm.com> wrote:  
> > > 
> > > On Wed, 30 Jan 2019 11:39:00 +0000
> > > Andre Przywara <andre.przywara@arm.com> wrote:
> > > 
> > > Peter, Marc, Christoffer,
> > > 
> > > can we have an opinion on whether it's useful to introduce some
> > > common scheme for firmware workaround system registers (parts of
> > > KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> > > compatibility between two kernels without specifically knowing about
> > > them?
> > > Dave suggested to introduce some kind of signed encoding in the 4
> > > LSBs for all those registers (including future ones), where 0 means
> > > UNKNOWN and greater values are better. So without knowing about the
> > > particular register, one could judge whether it's safe to migrate.
> > > I am just not sure how useful this is, given that QEMU seems to ask
> > > the receiving kernel about any sysreg, and doesn't particularly care
> > > about the meaning of those registers. And I am not sure we really
> > > want to introduce some kind of forward looking scheme in the kernel
> > > here, short of a working crystal ball. I think the kernel policy was
> > > always to be as strict as possible about those things.  
> > 
> > I honestly don't understand how userspace can decide whether a given
> > configuration is migratable or not solely based on the value of such a
> > register. In my experience, the target system has a role to play, and
> > is the only place where we can find out about whether migration is
> > actually possible.  
> 
> Both origin and target system need to be taken into account.  I don't
> think that's anything new.

Well, that was what I understood from Andre's question.

> 
> > As you said, userspace doesn't interpret the data, nor should it. It
> > is only on the receiving end that compatibility is assessed and
> > whether some level of compatibility can be safely ensured.
> > 
> > So to sum it up, I don't believe in this approach as a general way of
> > describing the handling or errata.  
> 
> For context, my idea attempted to put KVM, not userspace, in charge of
> the decision: userspace applies fixed comparison rules determined ahead
> of time, but KVM supplies the values compared (and hence determines the
> result).
> 
> My worry was that otherwise we may end up with a wild-west tangle of
> arbitrary properties that userspace needs specific knowledge about.

And this is where our understanding differs. I do not think userspace
has to care at all. All it has to do is to provide the saved register
values to the target system, and let KVM accept or refuse these
settings. I can't see what providing a set of predefined values back to
userspace gains us.

An unknown register on the target system fails the restore phase:
that's absolutely fine, as we don't want to run on a system that
doesn't know about the mitigation.

An incompatible value fails the restore as well, as KVM itself finds
that this is a service it cannot safely provide.

No userspace involvement, no QEMU upgrade required. Only the kernel
knows about it.

> We can tolerate a few though.  If we accumulate a significant number
> of errata/vulnerability properties that need to be reported to
> userspace, this may be worth revisiting.  If not, it doesn't matter.

Andre: if you want this to make it into 5.1, the time is now.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-02-18  9:07                   ` Marc Zyngier
@ 2019-02-18 10:28                     ` Dave Martin
  -1 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-02-18 10:28 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andre Przywara, kvmarm, linux-arm-kernel, kvm

On Mon, Feb 18, 2019 at 09:07:31AM +0000, Marc Zyngier wrote:
> On Fri, 15 Feb 2019 17:26:02 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:
> 
> > On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:
> > > On Fri, 15 Feb 2019 09:58:57 +0000,
> > > Andre Przywara <andre.przywara@arm.com> wrote:  
> > > > 
> > > > On Wed, 30 Jan 2019 11:39:00 +0000
> > > > Andre Przywara <andre.przywara@arm.com> wrote:
> > > > 
> > > > Peter, Marc, Christoffer,
> > > > 
> > > > can we have an opinion on whether it's useful to introduce some
> > > > common scheme for firmware workaround system registers (parts of
> > > > KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> > > > compatibility between two kernels without specifically knowing about
> > > > them?
> > > > Dave suggested to introduce some kind of signed encoding in the 4
> > > > LSBs for all those registers (including future ones), where 0 means
> > > > UNKNOWN and greater values are better. So without knowing about the
> > > > particular register, one could judge whether it's safe to migrate.
> > > > I am just not sure how useful this is, given that QEMU seems to ask
> > > > the receiving kernel about any sysreg, and doesn't particularly care
> > > > about the meaning of those registers. And I am not sure we really
> > > > want to introduce some kind of forward looking scheme in the kernel
> > > > here, short of a working crystal ball. I think the kernel policy was
> > > > always to be as strict as possible about those things.  
> > > 
> > > I honestly don't understand how userspace can decide whether a given
> > > configuration is migratable or not solely based on the value of such a
> > > register. In my experience, the target system has a role to play, and
> > > is the only place where we can find out about whether migration is
> > > actually possible.  
> > 
> > Both origin and target system need to be taken into account.  I don't
> > think that's anything new.
> 
> Well, that was what I understood from Andre's question.
> 
> > 
> > > As you said, userspace doesn't interpret the data, nor should it. It
> > > is only on the receiving end that compatibility is assessed and
> > > whether some level of compatibility can be safely ensured.
> > > 
> > > So to sum it up, I don't believe in this approach as a general way of
> > > describing the handling or errata.  
> > 
> > For context, my idea attempted to put KVM, not userspace, in charge of
> > the decision: userspace applies fixed comparison rules determined ahead
> > of time, but KVM supplies the values compared (and hence determines the
> > result).
> > 
> > My worry was that otherwise we may end up with a wild-west tangle of
> > arbitrary properties that userspace needs specific knowledge about.
> 
> And this is where our understanding differs. I do not think userspace
> has to care at all. All it has to do is to provide the saved register
> values to the target system, and let KVM accept or refuse these
> settings. I can't see what providing a set of predefined values back to
> userspace gains us.

Can we just pull all the UAPI header definitions then?  If this is
really kernel private, we don't even need userspace to know what the
IDs mean, let alone what's in the registers.

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-18 10:28                     ` Dave Martin
  0 siblings, 0 replies; 50+ messages in thread
From: Dave Martin @ 2019-02-18 10:28 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andre Przywara, kvmarm, linux-arm-kernel, kvm

On Mon, Feb 18, 2019 at 09:07:31AM +0000, Marc Zyngier wrote:
> On Fri, 15 Feb 2019 17:26:02 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:
> 
> > On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:
> > > On Fri, 15 Feb 2019 09:58:57 +0000,
> > > Andre Przywara <andre.przywara@arm.com> wrote:  
> > > > 
> > > > On Wed, 30 Jan 2019 11:39:00 +0000
> > > > Andre Przywara <andre.przywara@arm.com> wrote:
> > > > 
> > > > Peter, Marc, Christoffer,
> > > > 
> > > > can we have an opinion on whether it's useful to introduce some
> > > > common scheme for firmware workaround system registers (parts of
> > > > KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> > > > compatibility between two kernels without specifically knowing about
> > > > them?
> > > > Dave suggested to introduce some kind of signed encoding in the 4
> > > > LSBs for all those registers (including future ones), where 0 means
> > > > UNKNOWN and greater values are better. So without knowing about the
> > > > particular register, one could judge whether it's safe to migrate.
> > > > I am just not sure how useful this is, given that QEMU seems to ask
> > > > the receiving kernel about any sysreg, and doesn't particularly care
> > > > about the meaning of those registers. And I am not sure we really
> > > > want to introduce some kind of forward looking scheme in the kernel
> > > > here, short of a working crystal ball. I think the kernel policy was
> > > > always to be as strict as possible about those things.  
> > > 
> > > I honestly don't understand how userspace can decide whether a given
> > > configuration is migratable or not solely based on the value of such a
> > > register. In my experience, the target system has a role to play, and
> > > is the only place where we can find out about whether migration is
> > > actually possible.  
> > 
> > Both origin and target system need to be taken into account.  I don't
> > think that's anything new.
> 
> Well, that was what I understood from Andre's question.
> 
> > 
> > > As you said, userspace doesn't interpret the data, nor should it. It
> > > is only on the receiving end that compatibility is assessed and
> > > whether some level of compatibility can be safely ensured.
> > > 
> > > So to sum it up, I don't believe in this approach as a general way of
> > > describing the handling or errata.  
> > 
> > For context, my idea attempted to put KVM, not userspace, in charge of
> > the decision: userspace applies fixed comparison rules determined ahead
> > of time, but KVM supplies the values compared (and hence determines the
> > result).
> > 
> > My worry was that otherwise we may end up with a wild-west tangle of
> > arbitrary properties that userspace needs specific knowledge about.
> 
> And this is where our understanding differs. I do not think userspace
> has to care at all. All it has to do is to provide the saved register
> values to the target system, and let KVM accept or refuse these
> settings. I can't see what providing a set of predefined values back to
> userspace gains us.

Can we just pull all the UAPI header definitions then?  If this is
really kernel private, we don't even need userspace to know what the
IDs mean, let alone what's in the registers.

[...]

Cheers
---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-02-18 10:28                     ` Dave Martin
@ 2019-02-18 10:59                       ` Marc Zyngier
  -1 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-02-18 10:59 UTC (permalink / raw)
  To: Dave Martin; +Cc: Andre Przywara, kvmarm, linux-arm-kernel, kvm

On Mon, 18 Feb 2019 10:28:54 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

> On Mon, Feb 18, 2019 at 09:07:31AM +0000, Marc Zyngier wrote:
> > On Fri, 15 Feb 2019 17:26:02 +0000
> > Dave Martin <Dave.Martin@arm.com> wrote:
> >   
> > > On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:  
> > > > On Fri, 15 Feb 2019 09:58:57 +0000,
> > > > Andre Przywara <andre.przywara@arm.com> wrote:    
> > > > > 
> > > > > On Wed, 30 Jan 2019 11:39:00 +0000
> > > > > Andre Przywara <andre.przywara@arm.com> wrote:
> > > > > 
> > > > > Peter, Marc, Christoffer,
> > > > > 
> > > > > can we have an opinion on whether it's useful to introduce some
> > > > > common scheme for firmware workaround system registers (parts of
> > > > > KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> > > > > compatibility between two kernels without specifically knowing about
> > > > > them?
> > > > > Dave suggested to introduce some kind of signed encoding in the 4
> > > > > LSBs for all those registers (including future ones), where 0 means
> > > > > UNKNOWN and greater values are better. So without knowing about the
> > > > > particular register, one could judge whether it's safe to migrate.
> > > > > I am just not sure how useful this is, given that QEMU seems to ask
> > > > > the receiving kernel about any sysreg, and doesn't particularly care
> > > > > about the meaning of those registers. And I am not sure we really
> > > > > want to introduce some kind of forward looking scheme in the kernel
> > > > > here, short of a working crystal ball. I think the kernel policy was
> > > > > always to be as strict as possible about those things.    
> > > > 
> > > > I honestly don't understand how userspace can decide whether a given
> > > > configuration is migratable or not solely based on the value of such a
> > > > register. In my experience, the target system has a role to play, and
> > > > is the only place where we can find out about whether migration is
> > > > actually possible.    
> > > 
> > > Both origin and target system need to be taken into account.  I don't
> > > think that's anything new.  
> > 
> > Well, that was what I understood from Andre's question.
> >   
> > >   
> > > > As you said, userspace doesn't interpret the data, nor should it. It
> > > > is only on the receiving end that compatibility is assessed and
> > > > whether some level of compatibility can be safely ensured.
> > > > 
> > > > So to sum it up, I don't believe in this approach as a general way of
> > > > describing the handling or errata.    
> > > 
> > > For context, my idea attempted to put KVM, not userspace, in charge of
> > > the decision: userspace applies fixed comparison rules determined ahead
> > > of time, but KVM supplies the values compared (and hence determines the
> > > result).
> > > 
> > > My worry was that otherwise we may end up with a wild-west tangle of
> > > arbitrary properties that userspace needs specific knowledge about.  
> > 
> > And this is where our understanding differs. I do not think userspace
> > has to care at all. All it has to do is to provide the saved register
> > values to the target system, and let KVM accept or refuse these
> > settings. I can't see what providing a set of predefined values back to
> > userspace gains us.  
> 
> Can we just pull all the UAPI header definitions then?  If this is
> really kernel private, we don't even need userspace to know what the
> IDs mean, let alone what's in the registers.

I'm in two minds about this. Indeed, userspace shouldn't know about
this. And yet this is userspace visible. If we make it kernel private,
we still have the risk of accidentally breaking compatibility because
"this is kernel private and we can do what we want".

Sticking it into UAPI makes it abundantly clear that you cannot mess
with this at all without the risk of breaking save/restore.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-18 10:59                       ` Marc Zyngier
  0 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-02-18 10:59 UTC (permalink / raw)
  To: Dave Martin; +Cc: Andre Przywara, kvmarm, linux-arm-kernel, kvm

On Mon, 18 Feb 2019 10:28:54 +0000
Dave Martin <Dave.Martin@arm.com> wrote:

> On Mon, Feb 18, 2019 at 09:07:31AM +0000, Marc Zyngier wrote:
> > On Fri, 15 Feb 2019 17:26:02 +0000
> > Dave Martin <Dave.Martin@arm.com> wrote:
> >   
> > > On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:  
> > > > On Fri, 15 Feb 2019 09:58:57 +0000,
> > > > Andre Przywara <andre.przywara@arm.com> wrote:    
> > > > > 
> > > > > On Wed, 30 Jan 2019 11:39:00 +0000
> > > > > Andre Przywara <andre.przywara@arm.com> wrote:
> > > > > 
> > > > > Peter, Marc, Christoffer,
> > > > > 
> > > > > can we have an opinion on whether it's useful to introduce some
> > > > > common scheme for firmware workaround system registers (parts of
> > > > > KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> > > > > compatibility between two kernels without specifically knowing about
> > > > > them?
> > > > > Dave suggested to introduce some kind of signed encoding in the 4
> > > > > LSBs for all those registers (including future ones), where 0 means
> > > > > UNKNOWN and greater values are better. So without knowing about the
> > > > > particular register, one could judge whether it's safe to migrate.
> > > > > I am just not sure how useful this is, given that QEMU seems to ask
> > > > > the receiving kernel about any sysreg, and doesn't particularly care
> > > > > about the meaning of those registers. And I am not sure we really
> > > > > want to introduce some kind of forward looking scheme in the kernel
> > > > > here, short of a working crystal ball. I think the kernel policy was
> > > > > always to be as strict as possible about those things.    
> > > > 
> > > > I honestly don't understand how userspace can decide whether a given
> > > > configuration is migratable or not solely based on the value of such a
> > > > register. In my experience, the target system has a role to play, and
> > > > is the only place where we can find out about whether migration is
> > > > actually possible.    
> > > 
> > > Both origin and target system need to be taken into account.  I don't
> > > think that's anything new.  
> > 
> > Well, that was what I understood from Andre's question.
> >   
> > >   
> > > > As you said, userspace doesn't interpret the data, nor should it. It
> > > > is only on the receiving end that compatibility is assessed and
> > > > whether some level of compatibility can be safely ensured.
> > > > 
> > > > So to sum it up, I don't believe in this approach as a general way of
> > > > describing the handling or errata.    
> > > 
> > > For context, my idea attempted to put KVM, not userspace, in charge of
> > > the decision: userspace applies fixed comparison rules determined ahead
> > > of time, but KVM supplies the values compared (and hence determines the
> > > result).
> > > 
> > > My worry was that otherwise we may end up with a wild-west tangle of
> > > arbitrary properties that userspace needs specific knowledge about.  
> > 
> > And this is where our understanding differs. I do not think userspace
> > has to care at all. All it has to do is to provide the saved register
> > values to the target system, and let KVM accept or refuse these
> > settings. I can't see what providing a set of predefined values back to
> > userspace gains us.  
> 
> Can we just pull all the UAPI header definitions then?  If this is
> really kernel private, we don't even need userspace to know what the
> IDs mean, let alone what's in the registers.

I'm in two minds about this. Indeed, userspace shouldn't know about
this. And yet this is userspace visible. If we make it kernel private,
we still have the risk of accidentally breaking compatibility because
"this is kernel private and we can do what we want".

Sticking it into UAPI makes it abundantly clear that you cannot mess
with this at all without the risk of breaking save/restore.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-02-18  9:07                   ` Marc Zyngier
@ 2019-02-18 11:29                     ` André Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: André Przywara @ 2019-02-18 11:29 UTC (permalink / raw)
  To: Marc Zyngier, Dave Martin; +Cc: linux-arm-kernel, kvmarm, kvm

On 18/02/2019 09:07, Marc Zyngier wrote:
> On Fri, 15 Feb 2019 17:26:02 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:

Hi,

> 
>> On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:
>>> On Fri, 15 Feb 2019 09:58:57 +0000,
>>> Andre Przywara <andre.przywara@arm.com> wrote:  
>>>>
>>>> On Wed, 30 Jan 2019 11:39:00 +0000
>>>> Andre Przywara <andre.przywara@arm.com> wrote:
>>>>
>>>> Peter, Marc, Christoffer,
>>>>
>>>> can we have an opinion on whether it's useful to introduce some
>>>> common scheme for firmware workaround system registers (parts of
>>>> KVM_REG_ARM_FW_REG(x)), which would allow checking them for
>>>> compatibility between two kernels without specifically knowing about
>>>> them?
>>>> Dave suggested to introduce some kind of signed encoding in the 4
>>>> LSBs for all those registers (including future ones), where 0 means
>>>> UNKNOWN and greater values are better. So without knowing about the
>>>> particular register, one could judge whether it's safe to migrate.
>>>> I am just not sure how useful this is, given that QEMU seems to ask
>>>> the receiving kernel about any sysreg, and doesn't particularly care
>>>> about the meaning of those registers. And I am not sure we really
>>>> want to introduce some kind of forward looking scheme in the kernel
>>>> here, short of a working crystal ball. I think the kernel policy was
>>>> always to be as strict as possible about those things.  
>>>
>>> I honestly don't understand how userspace can decide whether a given
>>> configuration is migratable or not solely based on the value of such a
>>> register. In my experience, the target system has a role to play, and
>>> is the only place where we can find out about whether migration is
>>> actually possible.  
>>
>> Both origin and target system need to be taken into account.  I don't
>> think that's anything new.
> 
> Well, that was what I understood from Andre's question.
> 
>>
>>> As you said, userspace doesn't interpret the data, nor should it. It
>>> is only on the receiving end that compatibility is assessed and
>>> whether some level of compatibility can be safely ensured.
>>>
>>> So to sum it up, I don't believe in this approach as a general way of
>>> describing the handling or errata.  
>>
>> For context, my idea attempted to put KVM, not userspace, in charge of
>> the decision: userspace applies fixed comparison rules determined ahead
>> of time, but KVM supplies the values compared (and hence determines the
>> result).
>>
>> My worry was that otherwise we may end up with a wild-west tangle of
>> arbitrary properties that userspace needs specific knowledge about.
> 
> And this is where our understanding differs. I do not think userspace
> has to care at all. All it has to do is to provide the saved register
> values to the target system, and let KVM accept or refuse these
> settings. I can't see what providing a set of predefined values back to
> userspace gains us.
> 
> An unknown register on the target system fails the restore phase:
> that's absolutely fine, as we don't want to run on a system that
> doesn't know about the mitigation.
> 
> An incompatible value fails the restore as well, as KVM itself finds
> that this is a service it cannot safely provide.
> 
> No userspace involvement, no QEMU upgrade required. Only the kernel
> knows about it.

Yes, this is what I understand as well. From experience, many times when
we were not strict enough about some userland interface, it backfired.

The only case where such a forward-looking scheme would make sense is
the case where the source system has a new kernel, advertising a new
firmware workaround register, in an unknown or missing state (0 or -1).
An older kernel on the target system might not know about this register.
That would translate into "unknown", which is compatible with 0 or -1
from the source. So migration would be fine, but we deny it because the
new kernel returns -EINVAL.

But I am not sure this construct is worth implementing in the kernel. If
people care about this case, they could implement a workaround in
userland instead. Or just upgrade the target kernel before migration.

>> We can tolerate a few though.  If we accumulate a significant number
>> of errata/vulnerability properties that need to be reported to
>> userspace, this may be worth revisiting.  If not, it doesn't matter.
> 
> Andre: if you want this to make it into 5.1, the time is now.

OK. So is v2 [1] fine then? This implements the much easier "bigger is
better" scheme, but being 0 based instead of using a 4-bit signed encoding.
Let me know if there is something to rework in there.

Cheers,
Andre.

[1]
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-January/627739.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-18 11:29                     ` André Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: André Przywara @ 2019-02-18 11:29 UTC (permalink / raw)
  To: Marc Zyngier, Dave Martin; +Cc: linux-arm-kernel, kvmarm, kvm

On 18/02/2019 09:07, Marc Zyngier wrote:
> On Fri, 15 Feb 2019 17:26:02 +0000
> Dave Martin <Dave.Martin@arm.com> wrote:

Hi,

> 
>> On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:
>>> On Fri, 15 Feb 2019 09:58:57 +0000,
>>> Andre Przywara <andre.przywara@arm.com> wrote:  
>>>>
>>>> On Wed, 30 Jan 2019 11:39:00 +0000
>>>> Andre Przywara <andre.przywara@arm.com> wrote:
>>>>
>>>> Peter, Marc, Christoffer,
>>>>
>>>> can we have an opinion on whether it's useful to introduce some
>>>> common scheme for firmware workaround system registers (parts of
>>>> KVM_REG_ARM_FW_REG(x)), which would allow checking them for
>>>> compatibility between two kernels without specifically knowing about
>>>> them?
>>>> Dave suggested to introduce some kind of signed encoding in the 4
>>>> LSBs for all those registers (including future ones), where 0 means
>>>> UNKNOWN and greater values are better. So without knowing about the
>>>> particular register, one could judge whether it's safe to migrate.
>>>> I am just not sure how useful this is, given that QEMU seems to ask
>>>> the receiving kernel about any sysreg, and doesn't particularly care
>>>> about the meaning of those registers. And I am not sure we really
>>>> want to introduce some kind of forward looking scheme in the kernel
>>>> here, short of a working crystal ball. I think the kernel policy was
>>>> always to be as strict as possible about those things.  
>>>
>>> I honestly don't understand how userspace can decide whether a given
>>> configuration is migratable or not solely based on the value of such a
>>> register. In my experience, the target system has a role to play, and
>>> is the only place where we can find out about whether migration is
>>> actually possible.  
>>
>> Both origin and target system need to be taken into account.  I don't
>> think that's anything new.
> 
> Well, that was what I understood from Andre's question.
> 
>>
>>> As you said, userspace doesn't interpret the data, nor should it. It
>>> is only on the receiving end that compatibility is assessed and
>>> whether some level of compatibility can be safely ensured.
>>>
>>> So to sum it up, I don't believe in this approach as a general way of
>>> describing the handling or errata.  
>>
>> For context, my idea attempted to put KVM, not userspace, in charge of
>> the decision: userspace applies fixed comparison rules determined ahead
>> of time, but KVM supplies the values compared (and hence determines the
>> result).
>>
>> My worry was that otherwise we may end up with a wild-west tangle of
>> arbitrary properties that userspace needs specific knowledge about.
> 
> And this is where our understanding differs. I do not think userspace
> has to care at all. All it has to do is to provide the saved register
> values to the target system, and let KVM accept or refuse these
> settings. I can't see what providing a set of predefined values back to
> userspace gains us.
> 
> An unknown register on the target system fails the restore phase:
> that's absolutely fine, as we don't want to run on a system that
> doesn't know about the mitigation.
> 
> An incompatible value fails the restore as well, as KVM itself finds
> that this is a service it cannot safely provide.
> 
> No userspace involvement, no QEMU upgrade required. Only the kernel
> knows about it.

Yes, this is what I understand as well. From experience, many times when
we were not strict enough about some userland interface, it backfired.

The only case where such a forward-looking scheme would make sense is
the case where the source system has a new kernel, advertising a new
firmware workaround register, in an unknown or missing state (0 or -1).
An older kernel on the target system might not know about this register.
That would translate into "unknown", which is compatible with 0 or -1
from the source. So migration would be fine, but we deny it because the
new kernel returns -EINVAL.

But I am not sure this construct is worth implementing in the kernel. If
people care about this case, they could implement a workaround in
userland instead. Or just upgrade the target kernel before migration.

>> We can tolerate a few though.  If we accumulate a significant number
>> of errata/vulnerability properties that need to be reported to
>> userspace, this may be worth revisiting.  If not, it doesn't matter.
> 
> Andre: if you want this to make it into 5.1, the time is now.

OK. So is v2 [1] fine then? This implements the much easier "bigger is
better" scheme, but being 0 based instead of using a 4-bit signed encoding.
Let me know if there is something to rework in there.

Cheers,
Andre.

[1]
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-January/627739.html

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-02-18 11:29                     ` André Przywara
@ 2019-02-18 14:15                       ` Marc Zyngier
  -1 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-02-18 14:15 UTC (permalink / raw)
  To: André Przywara; +Cc: linux-arm-kernel, Dave Martin, kvm, kvmarm

On Mon, 18 Feb 2019 11:29:57 +0000
André Przywara <andre.przywara@arm.com> wrote:

> On 18/02/2019 09:07, Marc Zyngier wrote:
> > On Fri, 15 Feb 2019 17:26:02 +0000
> > Dave Martin <Dave.Martin@arm.com> wrote:  
> 
> Hi,
> 
> >   
> >> On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:  
> >>> On Fri, 15 Feb 2019 09:58:57 +0000,
> >>> Andre Przywara <andre.przywara@arm.com> wrote:    
> >>>>
> >>>> On Wed, 30 Jan 2019 11:39:00 +0000
> >>>> Andre Przywara <andre.przywara@arm.com> wrote:
> >>>>
> >>>> Peter, Marc, Christoffer,
> >>>>
> >>>> can we have an opinion on whether it's useful to introduce some
> >>>> common scheme for firmware workaround system registers (parts of
> >>>> KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> >>>> compatibility between two kernels without specifically knowing about
> >>>> them?
> >>>> Dave suggested to introduce some kind of signed encoding in the 4
> >>>> LSBs for all those registers (including future ones), where 0 means
> >>>> UNKNOWN and greater values are better. So without knowing about the
> >>>> particular register, one could judge whether it's safe to migrate.
> >>>> I am just not sure how useful this is, given that QEMU seems to ask
> >>>> the receiving kernel about any sysreg, and doesn't particularly care
> >>>> about the meaning of those registers. And I am not sure we really
> >>>> want to introduce some kind of forward looking scheme in the kernel
> >>>> here, short of a working crystal ball. I think the kernel policy was
> >>>> always to be as strict as possible about those things.    
> >>>
> >>> I honestly don't understand how userspace can decide whether a given
> >>> configuration is migratable or not solely based on the value of such a
> >>> register. In my experience, the target system has a role to play, and
> >>> is the only place where we can find out about whether migration is
> >>> actually possible.    
> >>
> >> Both origin and target system need to be taken into account.  I don't
> >> think that's anything new.  
> > 
> > Well, that was what I understood from Andre's question.
> >   
> >>  
> >>> As you said, userspace doesn't interpret the data, nor should it. It
> >>> is only on the receiving end that compatibility is assessed and
> >>> whether some level of compatibility can be safely ensured.
> >>>
> >>> So to sum it up, I don't believe in this approach as a general way of
> >>> describing the handling or errata.    
> >>
> >> For context, my idea attempted to put KVM, not userspace, in charge of
> >> the decision: userspace applies fixed comparison rules determined ahead
> >> of time, but KVM supplies the values compared (and hence determines the
> >> result).
> >>
> >> My worry was that otherwise we may end up with a wild-west tangle of
> >> arbitrary properties that userspace needs specific knowledge about.  
> > 
> > And this is where our understanding differs. I do not think userspace
> > has to care at all. All it has to do is to provide the saved register
> > values to the target system, and let KVM accept or refuse these
> > settings. I can't see what providing a set of predefined values back to
> > userspace gains us.
> > 
> > An unknown register on the target system fails the restore phase:
> > that's absolutely fine, as we don't want to run on a system that
> > doesn't know about the mitigation.
> > 
> > An incompatible value fails the restore as well, as KVM itself finds
> > that this is a service it cannot safely provide.
> > 
> > No userspace involvement, no QEMU upgrade required. Only the kernel
> > knows about it.  
> 
> Yes, this is what I understand as well. From experience, many times when
> we were not strict enough about some userland interface, it backfired.
> 
> The only case where such a forward-looking scheme would make sense is
> the case where the source system has a new kernel, advertising a new
> firmware workaround register, in an unknown or missing state (0 or -1).
> An older kernel on the target system might not know about this register.
> That would translate into "unknown", which is compatible with 0 or -1
> from the source. So migration would be fine, but we deny it because the
> new kernel returns -EINVAL.
> 
> But I am not sure this construct is worth implementing in the kernel. If
> people care about this case, they could implement a workaround in
> userland instead. Or just upgrade the target kernel before migration.

Upgrading the target may not be convenient. But more importantly, I
don't think we expect downgrades to be supported. This can break for an
infinity of reasons, such as the feature set implemented on the source
not being there on the target.

As you said, if userspace wants to bypass these restrictions, it can
alter the data before restoring.

> 
> >> We can tolerate a few though.  If we accumulate a significant number
> >> of errata/vulnerability properties that need to be reported to
> >> userspace, this may be worth revisiting.  If not, it doesn't matter.  
> > 
> > Andre: if you want this to make it into 5.1, the time is now.  
> 
> OK. So is v2 [1] fine then? This implements the much easier "bigger is
> better" scheme, but being 0 based instead of using a 4-bit signed encoding.
> Let me know if there is something to rework in there.
> 
> Cheers,
> Andre.
> 
> [1]
> http://lists.infradead.org/pipermail/linux-arm-kernel/2019-January/627739.html

Sorry, I've lost track of which is which. Please post something that is
consistent,  and addresses Steve's concerns if there is still any. Make
sure it applies on top of the current kvmarm/next, and provide
evidences that you've tested migration on the expected working and
expected failing configurations.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-18 14:15                       ` Marc Zyngier
  0 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2019-02-18 14:15 UTC (permalink / raw)
  To: André Przywara; +Cc: linux-arm-kernel, Dave Martin, kvm, kvmarm

On Mon, 18 Feb 2019 11:29:57 +0000
André Przywara <andre.przywara@arm.com> wrote:

> On 18/02/2019 09:07, Marc Zyngier wrote:
> > On Fri, 15 Feb 2019 17:26:02 +0000
> > Dave Martin <Dave.Martin@arm.com> wrote:  
> 
> Hi,
> 
> >   
> >> On Fri, Feb 15, 2019 at 11:42:27AM +0000, Marc Zyngier wrote:  
> >>> On Fri, 15 Feb 2019 09:58:57 +0000,
> >>> Andre Przywara <andre.przywara@arm.com> wrote:    
> >>>>
> >>>> On Wed, 30 Jan 2019 11:39:00 +0000
> >>>> Andre Przywara <andre.przywara@arm.com> wrote:
> >>>>
> >>>> Peter, Marc, Christoffer,
> >>>>
> >>>> can we have an opinion on whether it's useful to introduce some
> >>>> common scheme for firmware workaround system registers (parts of
> >>>> KVM_REG_ARM_FW_REG(x)), which would allow checking them for
> >>>> compatibility between two kernels without specifically knowing about
> >>>> them?
> >>>> Dave suggested to introduce some kind of signed encoding in the 4
> >>>> LSBs for all those registers (including future ones), where 0 means
> >>>> UNKNOWN and greater values are better. So without knowing about the
> >>>> particular register, one could judge whether it's safe to migrate.
> >>>> I am just not sure how useful this is, given that QEMU seems to ask
> >>>> the receiving kernel about any sysreg, and doesn't particularly care
> >>>> about the meaning of those registers. And I am not sure we really
> >>>> want to introduce some kind of forward looking scheme in the kernel
> >>>> here, short of a working crystal ball. I think the kernel policy was
> >>>> always to be as strict as possible about those things.    
> >>>
> >>> I honestly don't understand how userspace can decide whether a given
> >>> configuration is migratable or not solely based on the value of such a
> >>> register. In my experience, the target system has a role to play, and
> >>> is the only place where we can find out about whether migration is
> >>> actually possible.    
> >>
> >> Both origin and target system need to be taken into account.  I don't
> >> think that's anything new.  
> > 
> > Well, that was what I understood from Andre's question.
> >   
> >>  
> >>> As you said, userspace doesn't interpret the data, nor should it. It
> >>> is only on the receiving end that compatibility is assessed and
> >>> whether some level of compatibility can be safely ensured.
> >>>
> >>> So to sum it up, I don't believe in this approach as a general way of
> >>> describing the handling or errata.    
> >>
> >> For context, my idea attempted to put KVM, not userspace, in charge of
> >> the decision: userspace applies fixed comparison rules determined ahead
> >> of time, but KVM supplies the values compared (and hence determines the
> >> result).
> >>
> >> My worry was that otherwise we may end up with a wild-west tangle of
> >> arbitrary properties that userspace needs specific knowledge about.  
> > 
> > And this is where our understanding differs. I do not think userspace
> > has to care at all. All it has to do is to provide the saved register
> > values to the target system, and let KVM accept or refuse these
> > settings. I can't see what providing a set of predefined values back to
> > userspace gains us.
> > 
> > An unknown register on the target system fails the restore phase:
> > that's absolutely fine, as we don't want to run on a system that
> > doesn't know about the mitigation.
> > 
> > An incompatible value fails the restore as well, as KVM itself finds
> > that this is a service it cannot safely provide.
> > 
> > No userspace involvement, no QEMU upgrade required. Only the kernel
> > knows about it.  
> 
> Yes, this is what I understand as well. From experience, many times when
> we were not strict enough about some userland interface, it backfired.
> 
> The only case where such a forward-looking scheme would make sense is
> the case where the source system has a new kernel, advertising a new
> firmware workaround register, in an unknown or missing state (0 or -1).
> An older kernel on the target system might not know about this register.
> That would translate into "unknown", which is compatible with 0 or -1
> from the source. So migration would be fine, but we deny it because the
> new kernel returns -EINVAL.
> 
> But I am not sure this construct is worth implementing in the kernel. If
> people care about this case, they could implement a workaround in
> userland instead. Or just upgrade the target kernel before migration.

Upgrading the target may not be convenient. But more importantly, I
don't think we expect downgrades to be supported. This can break for an
infinity of reasons, such as the feature set implemented on the source
not being there on the target.

As you said, if userspace wants to bypass these restrictions, it can
alter the data before restoring.

> 
> >> We can tolerate a few though.  If we accumulate a significant number
> >> of errata/vulnerability properties that need to be reported to
> >> userspace, this may be worth revisiting.  If not, it doesn't matter.  
> > 
> > Andre: if you want this to make it into 5.1, the time is now.  
> 
> OK. So is v2 [1] fine then? This implements the much easier "bigger is
> better" scheme, but being 0 based instead of using a 4-bit signed encoding.
> Let me know if there is something to rework in there.
> 
> Cheers,
> Andre.
> 
> [1]
> http://lists.infradead.org/pipermail/linux-arm-kernel/2019-January/627739.html

Sorry, I've lost track of which is which. Please post something that is
consistent,  and addresses Steve's concerns if there is still any. Make
sure it applies on top of the current kvmarm/next, and provide
evidences that you've tested migration on the expected working and
expected failing configurations.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
  2019-01-07 13:17     ` Steven Price
@ 2019-02-22 12:26       ` Andre Przywara
  -1 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-02-22 12:26 UTC (permalink / raw)
  To: Steven Price; +Cc: kvm, Marc Zyngier, kvmarm, linux-arm-kernel

On Mon, 7 Jan 2019 13:17:37 +0000
Steven Price <steven.price@arm.com> wrote:

Hi,

> On 07/01/2019 12:05, Andre Przywara wrote:
> > KVM implements the firmware interface for mitigating cache speculation
> > vulnerabilities. Guests may use this interface to ensure mitigation is
> > active.
> > If we want to migrate such a guest to a host with a different support
> > level for those workarounds, migration might need to fail, to ensure that
> > critical guests don't loose their protection.
> > 
> > Introduce a way for userland to save and restore the workarounds state.
> > On restoring we do checks that make sure we don't downgrade our
> > mitigation level.
> > 
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > ---
> >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> >  virt/kvm/arm/psci.c                  | 138 ++++++++++++++++++++++++++-
> >  5 files changed, 178 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
> > index 77121b713bef..2255c50debab 100644
> > --- a/arch/arm/include/asm/kvm_emulate.h
> > +++ b/arch/arm/include/asm/kvm_emulate.h
> > @@ -275,6 +275,16 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> >  	return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
> >  }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> > +{
> > +	return false;
> > +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> > index 4602464ebdfb..02c93b1d8f6d 100644
> > --- a/arch/arm/include/uapi/asm/kvm.h
> > +++ b/arch/arm/include/uapi/asm/kvm.h
> > @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
> >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
> >  					 KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
> >  
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> > index 506386a3edde..a44f07f68da4 100644
> > --- a/arch/arm64/include/asm/kvm_emulate.h
> > +++ b/arch/arm64/include/asm/kvm_emulate.h
> > @@ -336,6 +336,20 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> >  	return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
> >  }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> > +{
> > +	return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
> > +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +	if (flag)
> > +		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
> > +	else
> > +		vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
> > +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	if (vcpu_mode_is_32bit(vcpu)) {
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 97c3478ee6e7..4a19ef199a99 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -225,6 +225,15 @@ struct kvm_vcpu_events {
> >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
> >  					 KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1  
> 
> I can't help feeling we need more than one bit to deal with all the
> possible states. The host can support/not-support the workaround (i.e
> the HVC) and the guest can be using/not using the workaround.
> 
> In particular I can imagine the following situation:
> 
> * Guest starts on a host (host A) without the workaround HVC (so
> configures not to use it). Assuming the host doesn't need the workaround
> the guest is therefore not vulnerable.
> 
> * Migrated to a new host (host B) with the workaround HVC (this is
> accepted), the guest is potentially vulnerable.
> 
> * Migration back to the original host (host A) is then rejected, even
> though the guest isn't using the HVC.
> 
> I can see two options here:
> 
> * Reject the migration to host B as the guest may be vulnerable after
> the migration. I.e. the workaround availability cannot change (either
> way) during a migration
> 
> * Store an extra bit of information which is whether a particular guest
> has the HVC exposed to it. Ideally the HVC handling for the workaround
> would also get disabled when running on a host which supports the HVC
> but was migrated from a host which doesn't. This prevents problems with
> a guest which is e.g. migrated during boot and may do feature detection
> after the migration.
> 
> Since this is a new ABI it would be good to get the register values
> sorted even if we don't have a complete implementation of it.

So I thought about this a bit more and now implemented something like a combination of your two options above:
- There is a new UNAFFECTED state, which is currently unused. As mentioned before, the current NOT_AVAIL does not mean not needed, but actually translates as "unknown". At the moment this is our best guess on this matter. There are patches on the list which extend the host workaround code, so we can then actually differentiate between "unknown" and "always mitigated". Once they have landed, we can then communicate this new state to userland.
- For now we are very strict and allow only migration if the workaround levels are identical. Since the guest may have detected either one state during its boot, we don't have much
of a choice here. When we later gain the UNAFFECTED state, we can additionally allow migration from "NOT_AVAIL" to "UNAFFECTED".

I hope this covers your concerns.

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state
@ 2019-02-22 12:26       ` Andre Przywara
  0 siblings, 0 replies; 50+ messages in thread
From: Andre Przywara @ 2019-02-22 12:26 UTC (permalink / raw)
  To: Steven Price
  Cc: Peter Maydell, kvm, Marc Zyngier, Christoffer Dall, kvmarm,
	linux-arm-kernel

On Mon, 7 Jan 2019 13:17:37 +0000
Steven Price <steven.price@arm.com> wrote:

Hi,

> On 07/01/2019 12:05, Andre Przywara wrote:
> > KVM implements the firmware interface for mitigating cache speculation
> > vulnerabilities. Guests may use this interface to ensure mitigation is
> > active.
> > If we want to migrate such a guest to a host with a different support
> > level for those workarounds, migration might need to fail, to ensure that
> > critical guests don't loose their protection.
> > 
> > Introduce a way for userland to save and restore the workarounds state.
> > On restoring we do checks that make sure we don't downgrade our
> > mitigation level.
> > 
> > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > ---
> >  arch/arm/include/asm/kvm_emulate.h   |  10 ++
> >  arch/arm/include/uapi/asm/kvm.h      |   9 ++
> >  arch/arm64/include/asm/kvm_emulate.h |  14 +++
> >  arch/arm64/include/uapi/asm/kvm.h    |   9 ++
> >  virt/kvm/arm/psci.c                  | 138 ++++++++++++++++++++++++++-
> >  5 files changed, 178 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
> > index 77121b713bef..2255c50debab 100644
> > --- a/arch/arm/include/asm/kvm_emulate.h
> > +++ b/arch/arm/include/asm/kvm_emulate.h
> > @@ -275,6 +275,16 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> >  	return vcpu_cp15(vcpu, c0_MPIDR) & MPIDR_HWID_BITMASK;
> >  }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> > +{
> > +	return false;
> > +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	*vcpu_cpsr(vcpu) |= PSR_E_BIT;
> > diff --git a/arch/arm/include/uapi/asm/kvm.h b/arch/arm/include/uapi/asm/kvm.h
> > index 4602464ebdfb..02c93b1d8f6d 100644
> > --- a/arch/arm/include/uapi/asm/kvm.h
> > +++ b/arch/arm/include/uapi/asm/kvm.h
> > @@ -214,6 +214,15 @@ struct kvm_vcpu_events {
> >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM | KVM_REG_SIZE_U64 | \
> >  					 KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2	KVM_REG_ARM_FW_REG(2)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_MASK	0x3
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL	1
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNAFFECTED	2
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED	4
> >  
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> > index 506386a3edde..a44f07f68da4 100644
> > --- a/arch/arm64/include/asm/kvm_emulate.h
> > +++ b/arch/arm64/include/asm/kvm_emulate.h
> > @@ -336,6 +336,20 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
> >  	return vcpu_read_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
> >  }
> >  
> > +static inline bool kvm_arm_get_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu)
> > +{
> > +	return vcpu->arch.workaround_flags & VCPU_WORKAROUND_2_FLAG;
> > +}
> > +
> > +static inline void kvm_arm_set_vcpu_workaround_2_flag(struct kvm_vcpu *vcpu,
> > +						      bool flag)
> > +{
> > +	if (flag)
> > +		vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
> > +	else
> > +		vcpu->arch.workaround_flags &= ~VCPU_WORKAROUND_2_FLAG;
> > +}
> > +
> >  static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
> >  {
> >  	if (vcpu_mode_is_32bit(vcpu)) {
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> > index 97c3478ee6e7..4a19ef199a99 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -225,6 +225,15 @@ struct kvm_vcpu_events {
> >  #define KVM_REG_ARM_FW_REG(r)		(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
> >  					 KVM_REG_ARM_FW | ((r) & 0xffff))
> >  #define KVM_REG_ARM_PSCI_VERSION	KVM_REG_ARM_FW_REG(0)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1	KVM_REG_ARM_FW_REG(1)
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL	0
> > +#define KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL	1  
> 
> I can't help feeling we need more than one bit to deal with all the
> possible states. The host can support/not-support the workaround (i.e
> the HVC) and the guest can be using/not using the workaround.
> 
> In particular I can imagine the following situation:
> 
> * Guest starts on a host (host A) without the workaround HVC (so
> configures not to use it). Assuming the host doesn't need the workaround
> the guest is therefore not vulnerable.
> 
> * Migrated to a new host (host B) with the workaround HVC (this is
> accepted), the guest is potentially vulnerable.
> 
> * Migration back to the original host (host A) is then rejected, even
> though the guest isn't using the HVC.
> 
> I can see two options here:
> 
> * Reject the migration to host B as the guest may be vulnerable after
> the migration. I.e. the workaround availability cannot change (either
> way) during a migration
> 
> * Store an extra bit of information which is whether a particular guest
> has the HVC exposed to it. Ideally the HVC handling for the workaround
> would also get disabled when running on a host which supports the HVC
> but was migrated from a host which doesn't. This prevents problems with
> a guest which is e.g. migrated during boot and may do feature detection
> after the migration.
> 
> Since this is a new ABI it would be good to get the register values
> sorted even if we don't have a complete implementation of it.

So I thought about this a bit more and now implemented something like a combination of your two options above:
- There is a new UNAFFECTED state, which is currently unused. As mentioned before, the current NOT_AVAIL does not mean not needed, but actually translates as "unknown". At the moment this is our best guess on this matter. There are patches on the list which extend the host workaround code, so we can then actually differentiate between "unknown" and "always mitigated". Once they have landed, we can then communicate this new state to userland.
- For now we are very strict and allow only migration if the workaround levels are identical. Since the guest may have detected either one state during its boot, we don't have much
of a choice here. When we later gain the UNAFFECTED state, we can additionally allow migration from "NOT_AVAIL" to "UNAFFECTED".

I hope this covers your concerns.

Cheers,
Andre.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2019-02-22 12:30 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-07 12:05 [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register Andre Przywara
2019-01-07 12:05 ` Andre Przywara
2019-01-07 12:05 ` [PATCH 1/2] KVM: arm/arm64: Add save/restore support for firmware workaround state Andre Przywara
2019-01-07 12:05   ` Andre Przywara
2019-01-07 13:17   ` Steven Price
2019-01-07 13:17     ` Steven Price
2019-01-21 17:04     ` Andre Przywara
2019-01-21 17:04       ` Andre Przywara
2019-02-22 12:26     ` Andre Przywara
2019-02-22 12:26       ` Andre Przywara
2019-01-22 15:17   ` Dave Martin
2019-01-22 15:17     ` Dave Martin
2019-01-25 14:46     ` Andre Przywara
2019-01-25 14:46       ` Andre Przywara
2019-01-29 21:32       ` Dave Martin
2019-01-29 21:32         ` Dave Martin
2019-01-30 11:39         ` Andre Przywara
2019-01-30 11:39           ` Andre Przywara
2019-01-30 12:07           ` Dave Martin
2019-01-30 12:07             ` Dave Martin
2019-02-15  9:58           ` Andre Przywara
2019-02-15  9:58             ` Andre Przywara
2019-02-15 11:42             ` Marc Zyngier
2019-02-15 11:42               ` Marc Zyngier
2019-02-15 17:26               ` Dave Martin
2019-02-15 17:26                 ` Dave Martin
2019-02-18  9:07                 ` Marc Zyngier
2019-02-18  9:07                   ` Marc Zyngier
2019-02-18 10:28                   ` Dave Martin
2019-02-18 10:28                     ` Dave Martin
2019-02-18 10:59                     ` Marc Zyngier
2019-02-18 10:59                       ` Marc Zyngier
2019-02-18 11:29                   ` André Przywara
2019-02-18 11:29                     ` André Przywara
2019-02-18 14:15                     ` Marc Zyngier
2019-02-18 14:15                       ` Marc Zyngier
2019-01-07 12:05 ` [PATCH 2/2] KVM: doc: add API documentation on the KVM_REG_ARM_WORKAROUNDS register Andre Przywara
2019-01-07 12:05   ` Andre Przywara
2019-01-22 10:17 ` [PATCH 0/2] KVM: arm/arm64: Add VCPU workarounds firmware register Dave Martin
2019-01-22 10:17   ` Dave Martin
2019-01-22 10:41   ` Andre Przywara
2019-01-22 10:41     ` Andre Przywara
2019-01-22 11:11   ` Marc Zyngier
2019-01-22 11:11     ` Marc Zyngier
2019-01-22 13:56     ` Dave Martin
2019-01-22 13:56       ` Dave Martin
2019-01-22 14:51       ` Marc Zyngier
2019-01-22 14:51         ` Marc Zyngier
2019-01-22 15:28         ` Dave Martin
2019-01-22 15:28           ` Dave Martin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.