linux-mips.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept
@ 2020-09-23 22:45 Sean Christopherson
  2020-09-23 22:45 ` [RFC PATCH 1/3] KVM: Export kvm_make_all_cpus_request() for use in marking VMs as bugged Sean Christopherson
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Sean Christopherson @ 2020-09-23 22:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Marc Zyngier, James Morse,
	Julien Thierry, Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

This series introduces a concept we've discussed a few times in x86 land.
The crux of the problem is that x86 has a few cases where KVM could
theoretically encounter a software or hardware bug deep in a call stack
without any sane way to propagate the error out to userspace.

Another use case would be for scenarios where letting the VM live will
do more harm than good, e.g. we've been using KVM_BUG_ON for early TDX
enabling as botching anything related to secure paging all but guarantees
there will be a flood of WARNs and error messages because lower level PTE
operations will fail if an upper level operation failed.

The basic idea is to WARN_ONCE if a bug is encountered, kick all vCPUs out
to userspace, and mark the VM as bugged so that no ioctls() can be issued
on the VM or its devices/vCPUs.

RFC as I've done nowhere near enough testing to verify that rejecting the
ioctls(), evicting running vCPUs, etc... works as intended.

Sean Christopherson (3):
  KVM: Export kvm_make_all_cpus_request() for use in marking VMs as
    bugged
  KVM: Add infrastructure and macro to mark VM as bugged
  KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the
    VM

 arch/x86/kvm/svm/svm.c   |  2 +-
 arch/x86/kvm/vmx/vmx.c   | 23 ++++++++++++--------
 arch/x86/kvm/x86.c       |  4 ++++
 include/linux/kvm_host.h | 45 ++++++++++++++++++++++++++++++++--------
 virt/kvm/kvm_main.c      | 11 +++++-----
 5 files changed, 61 insertions(+), 24 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 1/3] KVM: Export kvm_make_all_cpus_request() for use in marking VMs as bugged
  2020-09-23 22:45 [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Sean Christopherson
@ 2020-09-23 22:45 ` Sean Christopherson
  2020-09-23 22:45 ` [RFC PATCH 2/3] KVM: Add infrastructure and macro to mark VM " Sean Christopherson
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2020-09-23 22:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Marc Zyngier, James Morse,
	Julien Thierry, Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

Export kvm_make_all_cpus_request() and hoist the request helper
declarations of request up to the KVM_REQ_* definitions in preparation
for adding a "VM bugged" framework.  The framework will add KVM_BUG()
and KVM_BUG_ON() as alternatives to full BUG()/BUG_ON() for cases where
KVM has definitely hit a bug (in itself or in silicon) and the VM is all
but guaranteed to be hosed.  Marking a VM bugged will trigger a request
to all vCPUs to allow arch code to forcefully evict each vCPU from its
run loop.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 include/linux/kvm_host.h | 18 +++++++++---------
 virt/kvm/kvm_main.c      |  1 +
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 05e3c2fb3ef7..1d70aeeb7ec3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -154,6 +154,15 @@ static inline bool is_error_page(struct page *page)
 })
 #define KVM_ARCH_REQ(nr)           KVM_ARCH_REQ_FLAGS(nr, 0)
 
+bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
+				 struct kvm_vcpu *except,
+				 unsigned long *vcpu_bitmap, cpumask_var_t tmp);
+bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req);
+bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
+				      struct kvm_vcpu *except);
+bool kvm_make_cpus_request_mask(struct kvm *kvm, unsigned int req,
+				unsigned long *vcpu_bitmap);
+
 #define KVM_USERSPACE_IRQ_SOURCE_ID		0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
 
@@ -845,15 +854,6 @@ void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc);
 void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 #endif
 
-bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
-				 struct kvm_vcpu *except,
-				 unsigned long *vcpu_bitmap, cpumask_var_t tmp);
-bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req);
-bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
-				      struct kvm_vcpu *except);
-bool kvm_make_cpus_request_mask(struct kvm *kvm, unsigned int req,
-				unsigned long *vcpu_bitmap);
-
 long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg);
 long kvm_arch_vcpu_ioctl(struct file *filp,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cf88233b819a..bf3f333c7a19 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -306,6 +306,7 @@ bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
 {
 	return kvm_make_all_cpus_request_except(kvm, req, NULL);
 }
+EXPORT_SYMBOL_GPL(kvm_make_all_cpus_request);
 
 #ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
 void kvm_flush_remote_tlbs(struct kvm *kvm)
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 2/3] KVM: Add infrastructure and macro to mark VM as bugged
  2020-09-23 22:45 [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Sean Christopherson
  2020-09-23 22:45 ` [RFC PATCH 1/3] KVM: Export kvm_make_all_cpus_request() for use in marking VMs as bugged Sean Christopherson
@ 2020-09-23 22:45 ` Sean Christopherson
  2020-09-23 22:45 ` [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM Sean Christopherson
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2020-09-23 22:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Marc Zyngier, James Morse,
	Julien Thierry, Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

Introduce the concept of a "bugged VM", i.e. a VM that has encountered a
KVM bug and/or a CPU bug and is, for all intents and purposes, dead in
the water.  Marking a VM as bugged is especially helpful to handle
scenarios that can only be reach if there is a software/hardware bug,
but can't easily return an error, e.g. x86's register caching callbacks.

Reject all ioctls() if a VM is bugged, and provide a new request so that
arch specific code can kick running vCPUs out to userspace.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Cc: linux-mips@vger.kernel.org
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: kvm-ppc@vger.kernel.org
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 include/linux/kvm_host.h | 27 +++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      | 10 +++++-----
 2 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 1d70aeeb7ec3..cb527d55908d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -146,6 +146,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_MMU_RELOAD        (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_PENDING_TIMER     2
 #define KVM_REQ_UNHALT            3
+#define KVM_REQ_VM_BUGGED	  (4 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQUEST_ARCH_BASE     8
 
 #define KVM_ARCH_REQ_FLAGS(nr, flags) ({ \
@@ -513,6 +514,8 @@ struct kvm {
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
 	unsigned int max_halt_poll_ns;
+
+	bool vm_bugged;
 };
 
 #define kvm_err(fmt, ...) \
@@ -541,6 +544,30 @@ struct kvm {
 #define vcpu_err(vcpu, fmt, ...)					\
 	kvm_err("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
 
+static inline void kvm_vm_bugged(struct kvm *kvm)
+{
+	kvm->vm_bugged = true;
+	kvm_make_all_cpus_request(kvm, KVM_REQ_VM_BUGGED);
+}
+
+#define KVM_BUG(cond, kvm, fmt...)				\
+({								\
+	int __ret = (cond);					\
+								\
+	if (WARN_ONCE(__ret && !(kvm)->vm_bugged, fmt))		\
+		kvm_vm_bugged(kvm);				\
+	unlikely(__ret);					\
+})
+
+#define KVM_BUG_ON(cond, kvm)					\
+({								\
+	int __ret = (cond);					\
+								\
+	if (WARN_ON_ONCE(__ret && !(kvm)->vm_bugged))		\
+		kvm_vm_bugged(kvm);				\
+	unlikely(__ret);					\
+})
+
 static inline bool kvm_dirty_log_manual_protect_and_init_set(struct kvm *kvm)
 {
 	return !!(kvm->manual_dirty_log_protect & KVM_DIRTY_LOG_INITIALLY_SET);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index bf3f333c7a19..e216ce9d1c39 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3191,7 +3191,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
 	struct kvm_fpu *fpu = NULL;
 	struct kvm_sregs *kvm_sregs = NULL;
 
-	if (vcpu->kvm->mm != current->mm)
+	if (vcpu->kvm->mm != current->mm || vcpu->kvm->vm_bugged)
 		return -EIO;
 
 	if (unlikely(_IOC_TYPE(ioctl) != KVMIO))
@@ -3397,7 +3397,7 @@ static long kvm_vcpu_compat_ioctl(struct file *filp,
 	void __user *argp = compat_ptr(arg);
 	int r;
 
-	if (vcpu->kvm->mm != current->mm)
+	if (vcpu->kvm->mm != current->mm || vcpu->kvm->vm_bugged)
 		return -EIO;
 
 	switch (ioctl) {
@@ -3463,7 +3463,7 @@ static long kvm_device_ioctl(struct file *filp, unsigned int ioctl,
 {
 	struct kvm_device *dev = filp->private_data;
 
-	if (dev->kvm->mm != current->mm)
+	if (dev->kvm->mm != current->mm || dev->kvm->vm_bugged)
 		return -EIO;
 
 	switch (ioctl) {
@@ -3679,7 +3679,7 @@ static long kvm_vm_ioctl(struct file *filp,
 	void __user *argp = (void __user *)arg;
 	int r;
 
-	if (kvm->mm != current->mm)
+	if (kvm->mm != current->mm || kvm->vm_bugged)
 		return -EIO;
 	switch (ioctl) {
 	case KVM_CREATE_VCPU:
@@ -3874,7 +3874,7 @@ static long kvm_vm_compat_ioctl(struct file *filp,
 	struct kvm *kvm = filp->private_data;
 	int r;
 
-	if (kvm->mm != current->mm)
+	if (kvm->mm != current->mm || kvm->vm_bugged)
 		return -EIO;
 	switch (ioctl) {
 	case KVM_GET_DIRTY_LOG: {
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
  2020-09-23 22:45 [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Sean Christopherson
  2020-09-23 22:45 ` [RFC PATCH 1/3] KVM: Export kvm_make_all_cpus_request() for use in marking VMs as bugged Sean Christopherson
  2020-09-23 22:45 ` [RFC PATCH 2/3] KVM: Add infrastructure and macro to mark VM " Sean Christopherson
@ 2020-09-23 22:45 ` Sean Christopherson
  2020-09-24 12:34   ` Vitaly Kuznetsov
  2020-09-24  6:37 ` [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Christian Borntraeger
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2020-09-23 22:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Marc Zyngier, James Morse,
	Julien Thierry, Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

Add support for KVM_REQ_VM_BUGG in x86, and replace a variety of WARNs
with KVM_BUG() and KVM_BUG_ON().  Return -EIO if a KVM_BUG is hit to
align with the common KVM behavior of rejecting iocts() with -EIO if the
VM is bugged.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/svm/svm.c |  2 +-
 arch/x86/kvm/vmx/vmx.c | 23 ++++++++++++++---------
 arch/x86/kvm/x86.c     |  4 ++++
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3da5b2f1b4a1..e684794c6249 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1380,7 +1380,7 @@ static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 		load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu));
 		break;
 	default:
-		WARN_ON_ONCE(1);
+		KVM_BUG_ON(1, vcpu->kvm);
 	}
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6f9a0c6d5dc5..810d46ab0a47 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2250,7 +2250,7 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 		vcpu->arch.cr4 |= vmcs_readl(GUEST_CR4) & guest_owned_bits;
 		break;
 	default:
-		WARN_ON_ONCE(1);
+		KVM_BUG_ON(1, vcpu->kvm);
 		break;
 	}
 }
@@ -4960,6 +4960,7 @@ static int handle_cr(struct kvm_vcpu *vcpu)
 			return kvm_complete_insn_gp(vcpu, err);
 		case 3:
 			WARN_ON_ONCE(enable_unrestricted_guest);
+
 			err = kvm_set_cr3(vcpu, val);
 			return kvm_complete_insn_gp(vcpu, err);
 		case 4:
@@ -4985,14 +4986,13 @@ static int handle_cr(struct kvm_vcpu *vcpu)
 		}
 		break;
 	case 2: /* clts */
-		WARN_ONCE(1, "Guest should always own CR0.TS");
-		vmx_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
-		trace_kvm_cr_write(0, kvm_read_cr0(vcpu));
-		return kvm_skip_emulated_instruction(vcpu);
+		KVM_BUG(1, vcpu->kvm, "Guest always owns CR0.TS");
+		return -EIO;
 	case 1: /*mov from cr*/
 		switch (cr) {
 		case 3:
 			WARN_ON_ONCE(enable_unrestricted_guest);
+
 			val = kvm_read_cr3(vcpu);
 			kvm_register_write(vcpu, reg, val);
 			trace_kvm_cr_read(cr, val);
@@ -5330,7 +5330,9 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
 
 static int handle_nmi_window(struct kvm_vcpu *vcpu)
 {
-	WARN_ON_ONCE(!enable_vnmi);
+	if (KVM_BUG_ON(!enable_vnmi, vcpu->kvm))
+		return -EIO;
+
 	exec_controls_clearbit(to_vmx(vcpu), CPU_BASED_NMI_WINDOW_EXITING);
 	++vcpu->stat.nmi_window_exits;
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
@@ -5908,7 +5910,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	 * below) should never happen as that means we incorrectly allowed a
 	 * nested VM-Enter with an invalid vmcs12.
 	 */
-	WARN_ON_ONCE(vmx->nested.nested_run_pending);
+	if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
+		return -EIO;
 
 	/* If guest state is invalid, start emulating */
 	if (vmx->emulation_required)
@@ -6258,7 +6261,9 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
 	int max_irr;
 	bool max_irr_updated;
 
-	WARN_ON(!vcpu->arch.apicv_active);
+	if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
+		return -EIO;
+
 	if (pi_test_on(&vmx->pi_desc)) {
 		pi_clear_on(&vmx->pi_desc);
 		/*
@@ -6345,7 +6350,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
 	gate_desc *desc;
 	u32 intr_info = vmx_get_intr_info(vcpu);
 
-	if (WARN_ONCE(!is_external_intr(intr_info),
+	if (KVM_BUG(!is_external_intr(intr_info), vcpu->kvm,
 	    "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info))
 		return;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 17f4995e80a7..672eb5142b34 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8363,6 +8363,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	bool req_immediate_exit = false;
 
 	if (kvm_request_pending(vcpu)) {
+		if (kvm_check_request(KVM_REQ_VM_BUGGED, vcpu)) {
+			r = -EIO;
+			goto out;
+		}
 		if (kvm_check_request(KVM_REQ_GET_VMCS12_PAGES, vcpu)) {
 			if (unlikely(!kvm_x86_ops.nested_ops->get_vmcs12_pages(vcpu))) {
 				r = 0;
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept
  2020-09-23 22:45 [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Sean Christopherson
                   ` (2 preceding siblings ...)
  2020-09-23 22:45 ` [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM Sean Christopherson
@ 2020-09-24  6:37 ` Christian Borntraeger
  2020-09-25 16:32 ` Marc Zyngier
  2020-09-29  9:27 ` Cornelia Huck
  5 siblings, 0 replies; 15+ messages in thread
From: Christian Borntraeger @ 2020-09-24  6:37 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Marc Zyngier, James Morse, Julien Thierry,
	Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Janosch Frank, David Hildenbrand, Cornelia Huck,
	Claudio Imbrenda



On 24.09.20 00:45, Sean Christopherson wrote:
> This series introduces a concept we've discussed a few times in x86 land.
> The crux of the problem is that x86 has a few cases where KVM could
> theoretically encounter a software or hardware bug deep in a call stack
> without any sane way to propagate the error out to userspace.
> 
> Another use case would be for scenarios where letting the VM live will
> do more harm than good, e.g. we've been using KVM_BUG_ON for early TDX
> enabling as botching anything related to secure paging all but guarantees
> there will be a flood of WARNs and error messages because lower level PTE
> operations will fail if an upper level operation failed.
> 
> The basic idea is to WARN_ONCE if a bug is encountered, kick all vCPUs out
> to userspace, and mark the VM as bugged so that no ioctls() can be issued
> on the VM or its devices/vCPUs.
> 
> RFC as I've done nowhere near enough testing to verify that rejecting the
> ioctls(), evicting running vCPUs, etc... works as intended.

I like the idea. Especially when we add a common "understanding" in QEMU
across all platforms. That would then even allow to propagate an error.
> 
> Sean Christopherson (3):
>   KVM: Export kvm_make_all_cpus_request() for use in marking VMs as
>     bugged
>   KVM: Add infrastructure and macro to mark VM as bugged
>   KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the
>     VM
> 
>  arch/x86/kvm/svm/svm.c   |  2 +-
>  arch/x86/kvm/vmx/vmx.c   | 23 ++++++++++++--------
>  arch/x86/kvm/x86.c       |  4 ++++
>  include/linux/kvm_host.h | 45 ++++++++++++++++++++++++++++++++--------
>  virt/kvm/kvm_main.c      | 11 +++++-----
>  5 files changed, 61 insertions(+), 24 deletions(-)
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
  2020-09-23 22:45 ` [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM Sean Christopherson
@ 2020-09-24 12:34   ` Vitaly Kuznetsov
       [not found]     ` <20200924181134.GB9649@linux.intel.com>
  0 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2020-09-24 12:34 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel,
	Marc Zyngier, James Morse, Julien Thierry, Suzuki K Poulose,
	linux-arm-kernel, Huacai Chen, Aleksandar Markovic, linux-mips,
	Paul Mackerras, kvm-ppc, Christian Borntraeger, Janosch Frank,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> Add support for KVM_REQ_VM_BUGG in x86, and replace a variety of WARNs
> with KVM_BUG() and KVM_BUG_ON().  Return -EIO if a KVM_BUG is hit to
> align with the common KVM behavior of rejecting iocts() with -EIO if the
> VM is bugged.
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  arch/x86/kvm/svm/svm.c |  2 +-
>  arch/x86/kvm/vmx/vmx.c | 23 ++++++++++++++---------
>  arch/x86/kvm/x86.c     |  4 ++++
>  3 files changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3da5b2f1b4a1..e684794c6249 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1380,7 +1380,7 @@ static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
>  		load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu));
>  		break;
>  	default:
> -		WARN_ON_ONCE(1);
> +		KVM_BUG_ON(1, vcpu->kvm);
>  	}
>  }
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 6f9a0c6d5dc5..810d46ab0a47 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2250,7 +2250,7 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
>  		vcpu->arch.cr4 |= vmcs_readl(GUEST_CR4) & guest_owned_bits;
>  		break;
>  	default:
> -		WARN_ON_ONCE(1);
> +		KVM_BUG_ON(1, vcpu->kvm);
>  		break;
>  	}
>  }
> @@ -4960,6 +4960,7 @@ static int handle_cr(struct kvm_vcpu *vcpu)
>  			return kvm_complete_insn_gp(vcpu, err);
>  		case 3:
>  			WARN_ON_ONCE(enable_unrestricted_guest);
> +
>  			err = kvm_set_cr3(vcpu, val);
>  			return kvm_complete_insn_gp(vcpu, err);
>  		case 4:
> @@ -4985,14 +4986,13 @@ static int handle_cr(struct kvm_vcpu *vcpu)
>  		}
>  		break;
>  	case 2: /* clts */
> -		WARN_ONCE(1, "Guest should always own CR0.TS");
> -		vmx_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
> -		trace_kvm_cr_write(0, kvm_read_cr0(vcpu));
> -		return kvm_skip_emulated_instruction(vcpu);
> +		KVM_BUG(1, vcpu->kvm, "Guest always owns CR0.TS");
> +		return -EIO;
>  	case 1: /*mov from cr*/
>  		switch (cr) {
>  		case 3:
>  			WARN_ON_ONCE(enable_unrestricted_guest);
> +

Here, were you intended to replace WARN_ON_ONCE() with KVM_BUG_ON() or
this is just a stray newline added?

>  			val = kvm_read_cr3(vcpu);
>  			kvm_register_write(vcpu, reg, val);
>  			trace_kvm_cr_read(cr, val);
> @@ -5330,7 +5330,9 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
>  
>  static int handle_nmi_window(struct kvm_vcpu *vcpu)
>  {
> -	WARN_ON_ONCE(!enable_vnmi);
> +	if (KVM_BUG_ON(!enable_vnmi, vcpu->kvm))
> +		return -EIO;
> +
>  	exec_controls_clearbit(to_vmx(vcpu), CPU_BASED_NMI_WINDOW_EXITING);
>  	++vcpu->stat.nmi_window_exits;
>  	kvm_make_request(KVM_REQ_EVENT, vcpu);
> @@ -5908,7 +5910,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>  	 * below) should never happen as that means we incorrectly allowed a
>  	 * nested VM-Enter with an invalid vmcs12.
>  	 */
> -	WARN_ON_ONCE(vmx->nested.nested_run_pending);
> +	if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
> +		return -EIO;
>  
>  	/* If guest state is invalid, start emulating */
>  	if (vmx->emulation_required)
> @@ -6258,7 +6261,9 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  	int max_irr;
>  	bool max_irr_updated;
>  
> -	WARN_ON(!vcpu->arch.apicv_active);
> +	if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
> +		return -EIO;
> +
>  	if (pi_test_on(&vmx->pi_desc)) {
>  		pi_clear_on(&vmx->pi_desc);
>  		/*
> @@ -6345,7 +6350,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
>  	gate_desc *desc;
>  	u32 intr_info = vmx_get_intr_info(vcpu);
>  
> -	if (WARN_ONCE(!is_external_intr(intr_info),
> +	if (KVM_BUG(!is_external_intr(intr_info), vcpu->kvm,
>  	    "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info))
>  		return;
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 17f4995e80a7..672eb5142b34 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8363,6 +8363,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	bool req_immediate_exit = false;
>  
>  	if (kvm_request_pending(vcpu)) {
> +		if (kvm_check_request(KVM_REQ_VM_BUGGED, vcpu)) {

Do we want to allow userspace to continue executing the guest or should
we make KVM_REQ_VM_BUGGED permanent by replacing kvm_check_request()
with kvm_test_request()?

> +			r = -EIO;
> +			goto out;
> +		}
>  		if (kvm_check_request(KVM_REQ_GET_VMCS12_PAGES, vcpu)) {
>  			if (unlikely(!kvm_x86_ops.nested_ops->get_vmcs12_pages(vcpu))) {
>  				r = 0;

-- 
Vitaly


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
       [not found]     ` <20200924181134.GB9649@linux.intel.com>
@ 2020-09-25  9:50       ` Vitaly Kuznetsov
  2020-09-25 17:12         ` Sean Christopherson
  0 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2020-09-25  9:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Marc Zyngier, James Morse, Julien Thierry,
	Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

Sean Christopherson <sean.j.christopherson@intel.com> writes:

> On Thu, Sep 24, 2020 at 02:34:14PM +0200, Vitaly Kuznetsov wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> > index 6f9a0c6d5dc5..810d46ab0a47 100644
>> > --- a/arch/x86/kvm/vmx/vmx.c
>> > +++ b/arch/x86/kvm/vmx/vmx.c
>> > @@ -2250,7 +2250,7 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
>> >  		vcpu->arch.cr4 |= vmcs_readl(GUEST_CR4) & guest_owned_bits;
>> >  		break;
>> >  	default:
>> > -		WARN_ON_ONCE(1);
>> > +		KVM_BUG_ON(1, vcpu->kvm);
>> >  		break;
>> >  	}
>> >  }
>> > @@ -4960,6 +4960,7 @@ static int handle_cr(struct kvm_vcpu *vcpu)
>> >  			return kvm_complete_insn_gp(vcpu, err);
>> >  		case 3:
>> >  			WARN_ON_ONCE(enable_unrestricted_guest);
>> > +
>> >  			err = kvm_set_cr3(vcpu, val);
>> >  			return kvm_complete_insn_gp(vcpu, err);
>> >  		case 4:
>> > @@ -4985,14 +4986,13 @@ static int handle_cr(struct kvm_vcpu *vcpu)
>> >  		}
>> >  		break;
>> >  	case 2: /* clts */
>> > -		WARN_ONCE(1, "Guest should always own CR0.TS");
>> > -		vmx_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
>> > -		trace_kvm_cr_write(0, kvm_read_cr0(vcpu));
>> > -		return kvm_skip_emulated_instruction(vcpu);
>> > +		KVM_BUG(1, vcpu->kvm, "Guest always owns CR0.TS");
>> > +		return -EIO;
>> >  	case 1: /*mov from cr*/
>> >  		switch (cr) {
>> >  		case 3:
>> >  			WARN_ON_ONCE(enable_unrestricted_guest);
>> > +
>> 
>> Here, were you intended to replace WARN_ON_ONCE() with KVM_BUG_ON() or
>> this is just a stray newline added?
>
> I think it's just a stray newline.  At one point I had converted this to a
> KVM_BUG_ON(), but then reversed direction because it's not fatal to the guest,
> i.e. KVM should continue to function even though it's spuriously intercepting
> CR3 loads.
>
> Which, rereading this patch, completely contradicts the KVM_BUG() for CLTS.
>
> That's probably something we should sort out in this RFC: is KVM_BUG() only
> to be used if the bug is fatal/dangerous, or should it be used any time the
> error is definitely a KVM (or hardware) bug.

Personally, I'm feeling adventurous so my vote goes to the later :-)
Whenever a KVM bug was discovered by a VM it's much safer to stop
executing it as who knows what the implications might be?

In this particular case I can think of a nested scenario when L1 didn't
ask for CR3 intercept but L0 is still injecting it. It is not fatal by
itself but probably there is bug in calculating intercepts in L0 so
if we're getting something extra maybe we're also missing some? And this
doesn't sound good at all.

>
>> >  			val = kvm_read_cr3(vcpu);
>> >  			kvm_register_write(vcpu, reg, val);
>> >  			trace_kvm_cr_read(cr, val);
>> > @@ -5330,7 +5330,9 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
>> >  
>> >  static int handle_nmi_window(struct kvm_vcpu *vcpu)
>> >  {
>> > -	WARN_ON_ONCE(!enable_vnmi);
>> > +	if (KVM_BUG_ON(!enable_vnmi, vcpu->kvm))
>> > +		return -EIO;
>> > +
>> >  	exec_controls_clearbit(to_vmx(vcpu), CPU_BASED_NMI_WINDOW_EXITING);
>> >  	++vcpu->stat.nmi_window_exits;
>> >  	kvm_make_request(KVM_REQ_EVENT, vcpu);
>> > @@ -5908,7 +5910,8 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>> >  	 * below) should never happen as that means we incorrectly allowed a
>> >  	 * nested VM-Enter with an invalid vmcs12.
>> >  	 */
>> > -	WARN_ON_ONCE(vmx->nested.nested_run_pending);
>> > +	if (KVM_BUG_ON(vmx->nested.nested_run_pending, vcpu->kvm))
>> > +		return -EIO;
>> >  
>> >  	/* If guest state is invalid, start emulating */
>> >  	if (vmx->emulation_required)
>> > @@ -6258,7 +6261,9 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>> >  	int max_irr;
>> >  	bool max_irr_updated;
>> >  
>> > -	WARN_ON(!vcpu->arch.apicv_active);
>> > +	if (KVM_BUG_ON(!vcpu->arch.apicv_active, vcpu->kvm))
>> > +		return -EIO;
>> > +
>> >  	if (pi_test_on(&vmx->pi_desc)) {
>> >  		pi_clear_on(&vmx->pi_desc);
>> >  		/*
>> > @@ -6345,7 +6350,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
>> >  	gate_desc *desc;
>> >  	u32 intr_info = vmx_get_intr_info(vcpu);
>> >  
>> > -	if (WARN_ONCE(!is_external_intr(intr_info),
>> > +	if (KVM_BUG(!is_external_intr(intr_info), vcpu->kvm,
>> >  	    "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info))
>> >  		return;
>> >  
>> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> > index 17f4995e80a7..672eb5142b34 100644
>> > --- a/arch/x86/kvm/x86.c
>> > +++ b/arch/x86/kvm/x86.c
>> > @@ -8363,6 +8363,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>> >  	bool req_immediate_exit = false;
>> >  
>> >  	if (kvm_request_pending(vcpu)) {
>> > +		if (kvm_check_request(KVM_REQ_VM_BUGGED, vcpu)) {
>> 
>> Do we want to allow userspace to continue executing the guest or should
>> we make KVM_REQ_VM_BUGGED permanent by replacing kvm_check_request()
>> with kvm_test_request()?
>
> In theory, it should be impossible to reach this again as "r = -EIO" will
> bounce this out to userspace, the common checks to deny all ioctls() will
> prevent reinvoking KVM_RUN.

Do we actually want to prevent *all* ioctls? E.g. when 'vm bugged'
condition is triggered userspace may want to extract some information to
assist debugging but even things like KVM_GET_[S]REGS will just return
-EIO. I'm not sure it is generally safe to enable *everything* (except
for KVM_RUN which should definitely be forbidden) so maybe your approach
is preferable.

>
>> > +			r = -EIO;
>> > +			goto out;
>> > +		}
>> >  		if (kvm_check_request(KVM_REQ_GET_VMCS12_PAGES, vcpu)) {
>> >  			if (unlikely(!kvm_x86_ops.nested_ops->get_vmcs12_pages(vcpu))) {
>> >  				r = 0;
>> 
>> -- 
>> Vitaly
>> 
>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept
  2020-09-23 22:45 [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Sean Christopherson
                   ` (3 preceding siblings ...)
  2020-09-24  6:37 ` [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Christian Borntraeger
@ 2020-09-25 16:32 ` Marc Zyngier
  2020-09-25 17:00   ` Sean Christopherson
  2020-09-25 21:05   ` Paolo Bonzini
  2020-09-29  9:27 ` Cornelia Huck
  5 siblings, 2 replies; 15+ messages in thread
From: Marc Zyngier @ 2020-09-25 16:32 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, James Morse, Julien Thierry,
	Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

Hi Sean,

On Wed, 23 Sep 2020 23:45:27 +0100,
Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> 
> This series introduces a concept we've discussed a few times in x86 land.
> The crux of the problem is that x86 has a few cases where KVM could
> theoretically encounter a software or hardware bug deep in a call stack
> without any sane way to propagate the error out to userspace.
> 
> Another use case would be for scenarios where letting the VM live will
> do more harm than good, e.g. we've been using KVM_BUG_ON for early TDX
> enabling as botching anything related to secure paging all but guarantees
> there will be a flood of WARNs and error messages because lower level PTE
> operations will fail if an upper level operation failed.
> 
> The basic idea is to WARN_ONCE if a bug is encountered, kick all vCPUs out
> to userspace, and mark the VM as bugged so that no ioctls() can be issued
> on the VM or its devices/vCPUs.
> 
> RFC as I've done nowhere near enough testing to verify that rejecting the
> ioctls(), evicting running vCPUs, etc... works as intended.

I'm quite like the idea. However, I wonder whether preventing the
vcpus from re-entering the guest is enough. When something goes really
wrong, is it safe to allow the userspace process to terminate normally
and free the associated memory? And is it still safe to allow new VMs
to be started?

I can't really imagine a case where such extreme measures would be
necessary on arm64, but I thought I'd ask.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept
  2020-09-25 16:32 ` Marc Zyngier
@ 2020-09-25 17:00   ` Sean Christopherson
  2020-09-25 21:05   ` Paolo Bonzini
  1 sibling, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2020-09-25 17:00 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, James Morse, Julien Thierry,
	Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

On Fri, Sep 25, 2020 at 05:32:53PM +0100, Marc Zyngier wrote:
> Hi Sean,
> 
> On Wed, 23 Sep 2020 23:45:27 +0100,
> Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> > 
> > This series introduces a concept we've discussed a few times in x86 land.
> > The crux of the problem is that x86 has a few cases where KVM could
> > theoretically encounter a software or hardware bug deep in a call stack
> > without any sane way to propagate the error out to userspace.
> > 
> > Another use case would be for scenarios where letting the VM live will
> > do more harm than good, e.g. we've been using KVM_BUG_ON for early TDX
> > enabling as botching anything related to secure paging all but guarantees
> > there will be a flood of WARNs and error messages because lower level PTE
> > operations will fail if an upper level operation failed.
> > 
> > The basic idea is to WARN_ONCE if a bug is encountered, kick all vCPUs out
> > to userspace, and mark the VM as bugged so that no ioctls() can be issued
> > on the VM or its devices/vCPUs.
> > 
> > RFC as I've done nowhere near enough testing to verify that rejecting the
> > ioctls(), evicting running vCPUs, etc... works as intended.
> 
> I'm quite like the idea. However, I wonder whether preventing the
> vcpus from re-entering the guest is enough. When something goes really
> wrong, is it safe to allow the userspace process to terminate normally
> and free the associated memory?

Yes and no.  Yes, there are potential scenarios where freeing memory is unsafe,
e.g. with TDX, improper sanitization of memory can lead to machine checks due
to integrity errors, i.e. freeing memory that wasn't sanitized is not safe.

But, our in-development code intentionally leaks pages that couldn't be
sanitized (with plenty of yelling).  So, "no" in the sense that, IMO, KVM
should be written such that it's sufficiently paranoid when handling "special"
memory (or other state).

> And is it still safe to allow new VMs to be started?

Hmm, anything that is truly fatal to the host/KVM should probably use BUG()
or even panic() directly.  E.g. to avoid a userspace bypass by unloading and
reloading KVM when it's built as a module, we'd have to set a flag in the
kernel proper.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
  2020-09-25  9:50       ` Vitaly Kuznetsov
@ 2020-09-25 17:12         ` Sean Christopherson
  2020-09-25 21:06           ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2020-09-25 17:12 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Marc Zyngier, James Morse, Julien Thierry,
	Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

On Fri, Sep 25, 2020 at 11:50:38AM +0200, Vitaly Kuznetsov wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> 
> > On Thu, Sep 24, 2020 at 02:34:14PM +0200, Vitaly Kuznetsov wrote:
> >> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> >> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >> > index 6f9a0c6d5dc5..810d46ab0a47 100644
> >> > --- a/arch/x86/kvm/vmx/vmx.c
> >> > +++ b/arch/x86/kvm/vmx/vmx.c
> >> > @@ -4985,14 +4986,13 @@ static int handle_cr(struct kvm_vcpu *vcpu)
> >> >  		}
> >> >  		break;
> >> >  	case 2: /* clts */
> >> > -		WARN_ONCE(1, "Guest should always own CR0.TS");
> >> > -		vmx_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
> >> > -		trace_kvm_cr_write(0, kvm_read_cr0(vcpu));
> >> > -		return kvm_skip_emulated_instruction(vcpu);
> >> > +		KVM_BUG(1, vcpu->kvm, "Guest always owns CR0.TS");
> >> > +		return -EIO;
> >> >  	case 1: /*mov from cr*/
> >> >  		switch (cr) {
> >> >  		case 3:
> >> >  			WARN_ON_ONCE(enable_unrestricted_guest);
> >> > +
> >> 
> >> Here, were you intended to replace WARN_ON_ONCE() with KVM_BUG_ON() or
> >> this is just a stray newline added?
> >
> > I think it's just a stray newline.  At one point I had converted this to a
> > KVM_BUG_ON(), but then reversed direction because it's not fatal to the guest,
> > i.e. KVM should continue to function even though it's spuriously intercepting
> > CR3 loads.
> >
> > Which, rereading this patch, completely contradicts the KVM_BUG() for CLTS.
> >
> > That's probably something we should sort out in this RFC: is KVM_BUG() only
> > to be used if the bug is fatal/dangerous, or should it be used any time the
> > error is definitely a KVM (or hardware) bug.
> 
> Personally, I'm feeling adventurous so my vote goes to the later :-)
> Whenever a KVM bug was discovered by a VM it's much safer to stop
> executing it as who knows what the implications might be?

Not necessarily, e.g. terminating the VM may corrupt the VM's file system,
which is less safe, for lack of a better word, from the VM's perspective.

> In this particular case I can think of a nested scenario when L1 didn't
> ask for CR3 intercept but L0 is still injecting it. It is not fatal by
> itself but probably there is bug in calculating intercepts in L0 so
> if we're getting something extra maybe we're also missing some? And this
> doesn't sound good at all.

Hmm, but by that argument this scenario would fall into the "dangerous" part
of "bug is fatal/dangerous".  I guess my opinion is that we should set a
fairly high bar for using KVM_BUG() so that KVM can be aggressive in shutting
down.

> > In theory, it should be impossible to reach this again as "r = -EIO" will
> > bounce this out to userspace, the common checks to deny all ioctls() will
> > prevent reinvoking KVM_RUN.
> 
> Do we actually want to prevent *all* ioctls? E.g. when 'vm bugged'
> condition is triggered userspace may want to extract some information to
> assist debugging but even things like KVM_GET_[S]REGS will just return
> -EIO. I'm not sure it is generally safe to enable *everything* (except
> for KVM_RUN which should definitely be forbidden) so maybe your approach
> is preferable.

The answer to this probably depends on the answer to the first question of
when it's appropriate to use KVM_BUG().  E.g. if we limit usage to fatal or
dangrous cases, then blocking all ioctls() is probably the right thing do do.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept
  2020-09-25 16:32 ` Marc Zyngier
  2020-09-25 17:00   ` Sean Christopherson
@ 2020-09-25 21:05   ` Paolo Bonzini
  1 sibling, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2020-09-25 21:05 UTC (permalink / raw)
  To: Marc Zyngier, Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, James Morse, Julien Thierry, Suzuki K Poulose,
	linux-arm-kernel, Huacai Chen, Aleksandar Markovic, linux-mips,
	Paul Mackerras, kvm-ppc, Christian Borntraeger, Janosch Frank,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda

On 25/09/20 18:32, Marc Zyngier wrote:
> I'm quite like the idea. However, I wonder whether preventing the
> vcpus from re-entering the guest is enough. When something goes really
> wrong, is it safe to allow the userspace process to terminate normally
> and free the associated memory? And is it still safe to allow new VMs
> to be started?

For something that bad, where e.g. you can't rule out future memory
corruptions via use-after-free bugs or similar, you're probably entering
BUG_ON territory.

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
  2020-09-25 17:12         ` Sean Christopherson
@ 2020-09-25 21:06           ` Paolo Bonzini
  2020-09-29  3:52             ` Sean Christopherson
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2020-09-25 21:06 UTC (permalink / raw)
  To: Sean Christopherson, Vitaly Kuznetsov
  Cc: Wanpeng Li, Jim Mattson, Joerg Roedel, kvm, linux-kernel,
	Marc Zyngier, James Morse, Julien Thierry, Suzuki K Poulose,
	linux-arm-kernel, Huacai Chen, Aleksandar Markovic, linux-mips,
	Paul Mackerras, kvm-ppc, Christian Borntraeger, Janosch Frank,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda

On 25/09/20 19:12, Sean Christopherson wrote:
>> Do we actually want to prevent *all* ioctls? E.g. when 'vm bugged'
>> condition is triggered userspace may want to extract some information to
>> assist debugging but even things like KVM_GET_[S]REGS will just return
>> -EIO. I'm not sure it is generally safe to enable *everything* (except
>> for KVM_RUN which should definitely be forbidden) so maybe your approach
>> is preferable.
>
> The answer to this probably depends on the answer to the first question of
> when it's appropriate to use KVM_BUG().  E.g. if we limit usage to fatal or
> dangrous cases, then blocking all ioctls() is probably the right thing do do.

I think usage should be limited to dangerous cases, basically WARN_ON
level.  However I agree with Vitaly that KVM_GET_* should be allowed.

The other question is whether to return -EIO or KVM_EXIT_INTERNAL_ERROR.
 The latter is more likely to be handled already by userspace.

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
  2020-09-25 21:06           ` Paolo Bonzini
@ 2020-09-29  3:52             ` Sean Christopherson
  2020-09-29  9:15               ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Sean Christopherson @ 2020-09-29  3:52 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Marc Zyngier, James Morse, Julien Thierry,
	Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

On Fri, Sep 25, 2020 at 11:06:10PM +0200, Paolo Bonzini wrote:
> On 25/09/20 19:12, Sean Christopherson wrote:
> >> Do we actually want to prevent *all* ioctls? E.g. when 'vm bugged'
> >> condition is triggered userspace may want to extract some information to
> >> assist debugging but even things like KVM_GET_[S]REGS will just return
> >> -EIO. I'm not sure it is generally safe to enable *everything* (except
> >> for KVM_RUN which should definitely be forbidden) so maybe your approach
> >> is preferable.
> >
> > The answer to this probably depends on the answer to the first question of
> > when it's appropriate to use KVM_BUG().  E.g. if we limit usage to fatal or
> > dangrous cases, then blocking all ioctls() is probably the right thing do do.
> 
> I think usage should be limited to dangerous cases, basically WARN_ON
> level.  However I agree with Vitaly that KVM_GET_* should be allowed.

Makes sense.

On the topic of feedback from Vitaly, while dredging through my mailbox I
rediscovered his suggestion of kvm->kvm_internal_bug (or maybe just
kvm->internal_bug) instead of kvm->vm_bugged[*].  Like past me, I like the
"internal" variants better.

[*] https://lkml.kernel.org/r/20190930153358.GD14693@linux.intel.com

> The other question is whether to return -EIO or KVM_EXIT_INTERNAL_ERROR.
>  The latter is more likely to be handled already by userspace.

And probably less confusing for unsuspecting users.  E.g. -EIO is most
likely to be interpreted as "I screwed up", whereas KVM_EXIT_INTERNAL_ERROR
will correctly be read as "KVM screwed up".

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
  2020-09-29  3:52             ` Sean Christopherson
@ 2020-09-29  9:15               ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2020-09-29  9:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Marc Zyngier, James Morse, Julien Thierry,
	Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda

On 29/09/20 05:52, Sean Christopherson wrote:
>> I think usage should be limited to dangerous cases, basically WARN_ON
>> level.  However I agree with Vitaly that KVM_GET_* should be allowed.
>
> On the topic of feedback from Vitaly, while dredging through my mailbox I
> rediscovered his suggestion of kvm->kvm_internal_bug (or maybe just
> kvm->internal_bug) instead of kvm->vm_bugged[*].

Also agrees with KVM_EXIT_INTERNAL_ERROR.

>> The other question is whether to return -EIO or KVM_EXIT_INTERNAL_ERROR.
>>  The latter is more likely to be handled already by userspace.
>
> And probably less confusing for unsuspecting users.  E.g. -EIO is most
> likely to be interpreted as "I screwed up", whereas KVM_EXIT_INTERNAL_ERROR
> will correctly be read as "KVM screwed up".

All good points, seems like you have enough review material for the
non-RFC version.

Paolo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept
  2020-09-23 22:45 [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Sean Christopherson
                   ` (4 preceding siblings ...)
  2020-09-25 16:32 ` Marc Zyngier
@ 2020-09-29  9:27 ` Cornelia Huck
  5 siblings, 0 replies; 15+ messages in thread
From: Cornelia Huck @ 2020-09-29  9:27 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Marc Zyngier, James Morse,
	Julien Thierry, Suzuki K Poulose, linux-arm-kernel, Huacai Chen,
	Aleksandar Markovic, linux-mips, Paul Mackerras, kvm-ppc,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda

On Wed, 23 Sep 2020 15:45:27 -0700
Sean Christopherson <sean.j.christopherson@intel.com> wrote:

> This series introduces a concept we've discussed a few times in x86 land.
> The crux of the problem is that x86 has a few cases where KVM could
> theoretically encounter a software or hardware bug deep in a call stack
> without any sane way to propagate the error out to userspace.
> 
> Another use case would be for scenarios where letting the VM live will
> do more harm than good, e.g. we've been using KVM_BUG_ON for early TDX
> enabling as botching anything related to secure paging all but guarantees
> there will be a flood of WARNs and error messages because lower level PTE
> operations will fail if an upper level operation failed.
> 
> The basic idea is to WARN_ONCE if a bug is encountered, kick all vCPUs out
> to userspace, and mark the VM as bugged so that no ioctls() can be issued
> on the VM or its devices/vCPUs.

I think this makes a lot of sense.

Are there other user space interactions where we want to generate an
error for a bugged VM, e.g. via eventfd?

And can we make the 'bugged' information available to user space in a
structured way?

> 
> RFC as I've done nowhere near enough testing to verify that rejecting the
> ioctls(), evicting running vCPUs, etc... works as intended.
> 
> Sean Christopherson (3):
>   KVM: Export kvm_make_all_cpus_request() for use in marking VMs as
>     bugged
>   KVM: Add infrastructure and macro to mark VM as bugged
>   KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the
>     VM
> 
>  arch/x86/kvm/svm/svm.c   |  2 +-
>  arch/x86/kvm/vmx/vmx.c   | 23 ++++++++++++--------
>  arch/x86/kvm/x86.c       |  4 ++++
>  include/linux/kvm_host.h | 45 ++++++++++++++++++++++++++++++++--------
>  virt/kvm/kvm_main.c      | 11 +++++-----
>  5 files changed, 61 insertions(+), 24 deletions(-)
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-09-29  9:28 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-23 22:45 [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Sean Christopherson
2020-09-23 22:45 ` [RFC PATCH 1/3] KVM: Export kvm_make_all_cpus_request() for use in marking VMs as bugged Sean Christopherson
2020-09-23 22:45 ` [RFC PATCH 2/3] KVM: Add infrastructure and macro to mark VM " Sean Christopherson
2020-09-23 22:45 ` [RFC PATCH 3/3] KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM Sean Christopherson
2020-09-24 12:34   ` Vitaly Kuznetsov
     [not found]     ` <20200924181134.GB9649@linux.intel.com>
2020-09-25  9:50       ` Vitaly Kuznetsov
2020-09-25 17:12         ` Sean Christopherson
2020-09-25 21:06           ` Paolo Bonzini
2020-09-29  3:52             ` Sean Christopherson
2020-09-29  9:15               ` Paolo Bonzini
2020-09-24  6:37 ` [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept Christian Borntraeger
2020-09-25 16:32 ` Marc Zyngier
2020-09-25 17:00   ` Sean Christopherson
2020-09-25 21:05   ` Paolo Bonzini
2020-09-29  9:27 ` Cornelia Huck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).