All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Mickaël Salaün" <mic@digikod.net>
To: Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H . Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Kees Cook <keescook@chromium.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>
Cc: "Mickaël Salaün" <mic@digikod.net>,
	"Alexander Graf" <graf@amazon.com>,
	"Chao Peng" <chao.p.peng@linux.intel.com>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	"Forrest Yuan Yu" <yuanyu@google.com>,
	"James Gowans" <jgowans@amazon.com>,
	"James Morris" <jamorris@linux.microsoft.com>,
	"John Andersen" <john.s.andersen@intel.com>,
	"Madhavan T . Venkataraman" <madvenka@linux.microsoft.com>,
	"Marian Rotariu" <marian.c.rotariu@gmail.com>,
	"Mihai Donțu" <mdontu@bitdefender.com>,
	"Nicușor Cîțu" <nicu.citu@icloud.com>,
	"Thara Gopinath" <tgopinath@microsoft.com>,
	"Trilok Soni" <quic_tsoni@quicinc.com>,
	"Wei Liu" <wei.liu@kernel.org>, "Will Deacon" <will@kernel.org>,
	"Yu Zhang" <yu.c.zhang@linux.intel.com>,
	"Zahra Tarkhani" <ztarkhani@microsoft.com>,
	"Ștefan Șicleru" <ssicleru@bitdefender.com>,
	dev@lists.cloudhypervisor.org, kvm@vger.kernel.org,
	linux-hardening@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org, qemu-devel@nongnu.org,
	virtualization@lists.linux-foundation.org, x86@kernel.org,
	xen-devel@lists.xenproject.org
Subject: [RFC PATCH v2 02/19] KVM: x86: Add new hypercall to lock control registers
Date: Sun, 12 Nov 2023 21:23:09 -0500	[thread overview]
Message-ID: <20231113022326.24388-3-mic@digikod.net> (raw)
In-Reply-To: <20231113022326.24388-1-mic@digikod.net>

This enables guests to lock their CR0 and CR4 registers with a subset of
X86_CR0_WP, X86_CR4_SMEP, X86_CR4_SMAP, X86_CR4_UMIP, X86_CR4_FSGSBASE
and X86_CR4_CET flags.

The new KVM_HC_LOCK_CR_UPDATE hypercall takes three arguments.  The
first is to identify the control register, the second is a bit mask to
pin (i.e. mark as read-only), and the third is for optional flags.

These register flags should already be pinned by Linux guests, but once
compromised, this self-protection mechanism could be disabled, which is
not the case with this dedicated hypercall.

Once the CRs are pinned by the guest, if it attempts to change them,
then a general protection fault is sent to the guest.

This hypercall may evolve and support new kind of registers or pinning.
The optional KVM_LOCK_CR_UPDATE_VERSION flag enables guests to know the
supported abilities by mapping the returned version with the related
features.

Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
---

Changes since v1:
* Guard KVM_HC_LOCK_CR_UPDATE hypercall with CONFIG_HEKI.
* Move extern cr4_pinned_mask to x86.h (suggested by Kees Cook).
* Move VMX CR checks from vmx_set_cr*() to handle_cr() to make it
  possible to return to user space (see next commit).
* Change the heki_check_cr()'s first argument to vcpu.
* Don't use -KVM_EPERM in heki_check_cr().
* Generate a fault when the guest requests a denied CR update.
* Add a flags argument to get the version of this hypercall. Being able
  to do a preper version check was suggested by Wei Liu.
---
 Documentation/virt/kvm/x86/hypercalls.rst | 17 +++++
 arch/x86/include/uapi/asm/kvm_para.h      |  2 +
 arch/x86/kernel/cpu/common.c              |  4 +-
 arch/x86/kvm/vmx/vmx.c                    |  5 ++
 arch/x86/kvm/x86.c                        | 84 +++++++++++++++++++++++
 arch/x86/kvm/x86.h                        | 22 ++++++
 include/linux/kvm_host.h                  |  5 ++
 include/uapi/linux/kvm_para.h             |  1 +
 8 files changed, 139 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/x86/hypercalls.rst b/Documentation/virt/kvm/x86/hypercalls.rst
index 10db7924720f..3178576f4c47 100644
--- a/Documentation/virt/kvm/x86/hypercalls.rst
+++ b/Documentation/virt/kvm/x86/hypercalls.rst
@@ -190,3 +190,20 @@ the KVM_CAP_EXIT_HYPERCALL capability. Userspace must enable that capability
 before advertising KVM_FEATURE_HC_MAP_GPA_RANGE in the guest CPUID.  In
 addition, if the guest supports KVM_FEATURE_MIGRATION_CONTROL, userspace
 must also set up an MSR filter to process writes to MSR_KVM_MIGRATION_CONTROL.
+
+9. KVM_HC_LOCK_CR_UPDATE
+------------------------
+
+:Architecture: x86
+:Status: active
+:Purpose: Request some control registers to be restricted.
+
+- a0: identify a control register
+- a1: bit mask to make some flags read-only
+- a2: optional KVM_LOCK_CR_UPDATE_VERSION flag that will return the version of
+      this hypercall. Version 1 supports CR0 and CR4 pinning.
+
+The hypercall lets a guest request control register flags to be pinned for
+itself.
+
+Returns 0 on success or a KVM error code otherwise.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 6e64b27b2c1e..efc5ccc0060f 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -150,4 +150,6 @@ struct kvm_vcpu_pv_apf_data {
 #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
 #define KVM_PV_EOI_DISABLED 0x0
 
+#define KVM_LOCK_CR_UPDATE_VERSION (1 << 0)
+
 #endif /* _UAPI_ASM_X86_KVM_PARA_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4e5ffc8b0e46..f18ee7ce0496 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -400,9 +400,11 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 }
 
 /* These bits should not change their value after CPU init is finished. */
-static const unsigned long cr4_pinned_mask =
+const unsigned long cr4_pinned_mask =
 	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
 	X86_CR4_FSGSBASE | X86_CR4_CET;
+EXPORT_SYMBOL_GPL(cr4_pinned_mask);
+
 static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
 static unsigned long cr4_pinned_bits __ro_after_init;
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6e502ba93141..f487bf16dd96 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5452,6 +5452,11 @@ static int handle_cr(struct kvm_vcpu *vcpu)
 	case 0: /* mov to cr */
 		val = kvm_register_read(vcpu, reg);
 		trace_kvm_cr_write(cr, val);
+
+		ret = heki_check_cr(vcpu, cr, val);
+		if (ret)
+			return ret;
+
 		switch (cr) {
 		case 0:
 			err = handle_set_cr0(vcpu, val);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e3eb608b6692..4e6c4c21f12c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8054,11 +8054,86 @@ static unsigned long emulator_get_cr(struct x86_emulate_ctxt *ctxt, int cr)
 	return value;
 }
 
+#ifdef CONFIG_HEKI
+
+#define HEKI_ABI_VERSION 1
+
+static int heki_lock_cr(struct kvm_vcpu *const vcpu, const unsigned long cr,
+			unsigned long pin, unsigned long flags)
+{
+	if (flags) {
+		if ((flags == KVM_LOCK_CR_UPDATE_VERSION) && !cr && !pin)
+			return HEKI_ABI_VERSION;
+		return -KVM_EINVAL;
+	}
+
+	if (!pin)
+		return -KVM_EINVAL;
+
+	switch (cr) {
+	case 0:
+		/* Cf. arch/x86/kernel/cpu/common.c */
+		if (!(pin & X86_CR0_WP))
+			return -KVM_EINVAL;
+
+		if ((pin & read_cr0()) != pin)
+			return -KVM_EINVAL;
+
+		atomic_long_or(pin, &vcpu->kvm->heki_pinned_cr0);
+		return 0;
+	case 4:
+		/* Checks for irrelevant bits. */
+		if ((pin & cr4_pinned_mask) != pin)
+			return -KVM_EINVAL;
+
+		/* Ignores bits not present in host. */
+		pin &= __read_cr4();
+		atomic_long_or(pin, &vcpu->kvm->heki_pinned_cr4);
+		return 0;
+	}
+	return -KVM_EINVAL;
+}
+
+int heki_check_cr(struct kvm_vcpu *const vcpu, const unsigned long cr,
+		  const unsigned long val)
+{
+	unsigned long pinned;
+
+	switch (cr) {
+	case 0:
+		pinned = atomic_long_read(&vcpu->kvm->heki_pinned_cr0);
+		if ((val & pinned) != pinned) {
+			pr_warn_ratelimited(
+				"heki: Blocked CR0 update: 0x%lx\n", val);
+			kvm_inject_gp(vcpu, 0);
+			return 1;
+		}
+		return 0;
+	case 4:
+		pinned = atomic_long_read(&vcpu->kvm->heki_pinned_cr4);
+		if ((val & pinned) != pinned) {
+			pr_warn_ratelimited(
+				"heki: Blocked CR4 update: 0x%lx\n", val);
+			kvm_inject_gp(vcpu, 0);
+			return 1;
+		}
+		return 0;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(heki_check_cr);
+
+#endif /* CONFIG_HEKI */
+
 static int emulator_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
 	int res = 0;
 
+	res = heki_check_cr(vcpu, cr, val);
+	if (res)
+		return res;
+
 	switch (cr) {
 	case 0:
 		res = kvm_set_cr0(vcpu, mk_cr_64(kvm_read_cr0(vcpu), val));
@@ -9918,6 +9993,15 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		vcpu->arch.complete_userspace_io = complete_hypercall_exit;
 		return 0;
 	}
+#ifdef CONFIG_HEKI
+	case KVM_HC_LOCK_CR_UPDATE:
+		if (a0 > U32_MAX) {
+			ret = -KVM_EINVAL;
+		} else {
+			ret = heki_lock_cr(vcpu, a0, a1, a2);
+		}
+		break;
+#endif /* CONFIG_HEKI */
 	default:
 		ret = -KVM_ENOSYS;
 		break;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 1e7be1f6ab29..193093112b55 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -290,6 +290,26 @@ static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
 	return !(kvm->arch.disabled_quirks & quirk);
 }
 
+#ifdef CONFIG_HEKI
+
+int heki_check_cr(struct kvm_vcpu *vcpu, unsigned long cr, unsigned long val);
+
+#else /* CONFIG_HEKI */
+
+static inline int heki_check_cr(struct kvm_vcpu *vcpu, unsigned long cr,
+				unsigned long val)
+{
+	return 0;
+}
+
+static inline int heki_lock_cr(struct kvm_vcpu *const vcpu, unsigned long cr,
+			       unsigned long pin)
+{
+	return 0;
+}
+
+#endif /* CONFIG_HEKI */
+
 void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);
 
 u64 get_kvmclock_ns(struct kvm *kvm);
@@ -325,6 +345,8 @@ extern u64 host_xcr0;
 extern u64 host_xss;
 extern u64 host_arch_capabilities;
 
+extern const unsigned long cr4_pinned_mask;
+
 extern struct kvm_caps kvm_caps;
 
 extern bool enable_pmu;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 687589ce9f63..6864c80ff936 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -835,6 +835,11 @@ struct kvm {
 	bool vm_bugged;
 	bool vm_dead;
 
+#ifdef CONFIG_HEKI
+	atomic_long_t heki_pinned_cr0;
+	atomic_long_t heki_pinned_cr4;
+#endif /* CONFIG_HEKI */
+
 #ifdef CONFIG_HAVE_KVM_PM_NOTIFIER
 	struct notifier_block pm_notifier;
 #endif
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 960c7e93d1a9..2ed418704603 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -30,6 +30,7 @@
 #define KVM_HC_SEND_IPI		10
 #define KVM_HC_SCHED_YIELD		11
 #define KVM_HC_MAP_GPA_RANGE		12
+#define KVM_HC_LOCK_CR_UPDATE		13
 
 /*
  * hypercalls use architecture specific
-- 
2.42.1


  parent reply	other threads:[~2023-11-13  2:24 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-13  2:23 [RFC PATCH v2 00/19] Hypervisor-Enforced Kernel Integrity Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 01/19] virt: Introduce Hypervisor Enforced Kernel Integrity (Heki) Mickaël Salaün
2023-11-13  2:23 ` Mickaël Salaün [this message]
2023-11-13  2:23 ` [RFC PATCH v2 03/19] KVM: x86: Add notifications for Heki policy configuration and violation Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 04/19] heki: Lock guest control registers at the end of guest kernel init Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 05/19] KVM: VMX: Add MBEC support Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 06/19] KVM: x86: Add kvm_x86_ops.fault_gva() Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 07/19] KVM: x86: Make memory attribute helpers more generic Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 08/19] KVM: x86: Extend kvm_vm_set_mem_attributes() with a mask Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 09/19] KVM: x86: Extend kvm_range_has_memory_attributes() with match_all Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 10/19] KVM: x86: Implement per-guest-page permissions Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 11/19] KVM: x86: Add new hypercall to set EPT permissions Mickaël Salaün
2023-11-13  4:45   ` kernel test robot
2023-11-13  2:23 ` [RFC PATCH v2 12/19] x86: Implement the Memory Table feature to store arbitrary per-page data Mickaël Salaün
2023-11-22  7:19   ` kernel test robot
2023-11-13  2:23 ` [RFC PATCH v2 13/19] heki: Implement a kernel page table walker Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 14/19] heki: x86: Initialize permissions counters for pages mapped into KVA Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 15/19] heki: x86: Initialize permissions counters for pages in vmap()/vunmap() Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 16/19] heki: x86: Update permissions counters when guest page permissions change Mickaël Salaün
2023-11-13  2:23 ` [RFC PATCH v2 17/19] heki: x86: Update permissions counters during text patching Mickaël Salaün
2023-11-13  8:19   ` Peter Zijlstra
2023-11-27 16:48     ` Madhavan T. Venkataraman
2023-11-27 20:08       ` Peter Zijlstra
2023-11-29 21:07         ` Madhavan T. Venkataraman
2023-11-30 11:33           ` Peter Zijlstra
2023-12-06 16:37             ` Madhavan T. Venkataraman
2023-12-06 18:51               ` Peter Zijlstra
2023-12-08 18:41                 ` Madhavan T. Venkataraman
2023-12-01  0:45           ` Edgecombe, Rick P
2023-12-06 16:41             ` Madhavan T. Venkataraman
2023-11-13  2:23 ` [RFC PATCH v2 18/19] heki: x86: Protect guest kernel memory using the KVM hypervisor Mickaël Salaün
2023-11-13  8:54   ` Peter Zijlstra
2023-11-27 17:05     ` Madhavan T. Venkataraman
2023-11-27 20:03       ` Peter Zijlstra
2023-11-29 19:47         ` Madhavan T. Venkataraman
2023-11-13  2:23 ` [RFC PATCH v2 19/19] virt: Add Heki KUnit tests Mickaël Salaün
2023-11-13  5:18 [RFC PATCH v2 14/19] heki: x86: Initialize permissions counters for pages mapped into KVA kernel test robot
2023-11-14  1:22 ` kernel test robot
2023-11-13  7:42 [RFC PATCH v2 10/19] KVM: x86: Implement per-guest-page permissions kernel test robot
2023-11-14  1:27 ` kernel test robot
2023-11-13  8:14 [RFC PATCH v2 18/19] heki: x86: Protect guest kernel memory using the KVM hypervisor kernel test robot
2023-11-14  1:30 ` kernel test robot
2023-11-13 12:37 [RFC PATCH v2 10/19] KVM: x86: Implement per-guest-page permissions kernel test robot
2023-11-14  1:29 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231113022326.24388-3-mic@digikod.net \
    --to=mic@digikod.net \
    --cc=bp@alien8.de \
    --cc=chao.p.peng@linux.intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dev@lists.cloudhypervisor.org \
    --cc=graf@amazon.com \
    --cc=hpa@zytor.com \
    --cc=jamorris@linux.microsoft.com \
    --cc=jgowans@amazon.com \
    --cc=john.s.andersen@intel.com \
    --cc=keescook@chromium.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=madvenka@linux.microsoft.com \
    --cc=marian.c.rotariu@gmail.com \
    --cc=mdontu@bitdefender.com \
    --cc=mingo@redhat.com \
    --cc=nicu.citu@icloud.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quic_tsoni@quicinc.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=seanjc@google.com \
    --cc=ssicleru@bitdefender.com \
    --cc=tglx@linutronix.de \
    --cc=tgopinath@microsoft.com \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=wei.liu@kernel.org \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    --cc=yu.c.zhang@linux.intel.com \
    --cc=yuanyu@google.com \
    --cc=ztarkhani@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.