linux-kernel.vger.kernel.org archive mirror
* [RESEND PATCH v13 00/11] Introduce support for guest CET feature
@ 2020-07-16  3:16 Yang Weijiang
  2020-07-16  3:16 ` [RESEND v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Yang Weijiang
                   ` (11 more replies)
  0 siblings, 12 replies; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

Control-flow Enforcement Technology (CET) provides protection against
Return/Jump-Oriented Programming (ROP/JOP) attacks. There are two CET
sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT);
SHSTK prevents ROP attacks and IBT prevents JOP attacks.
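
As a conceptual model of the shadow stack (illustration only, not KVM
code): on CALL the CPU pushes the return address to both the data stack and
the shadow stack, and on RET it compares the two, e.g.:

  /*
   * Conceptual sketch of the SHSTK check; shstk_violation() is a
   * hypothetical stand-in for the control-protection fault (#CP)
   * the CPU raises (see patch 10 for #CP handling).
   */
  void shstk_check_on_ret(u64 data_stack_ra, u64 shadow_stack_ra)
  {
          if (data_stack_ra != shadow_stack_ra)
                  shstk_violation();      /* return address was tampered */
  }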

Several parts of KVM have been updated to provide VM CET support, including:
CPUID/XSAVES configuration, MSR pass-through, the user space MSR access
interface, vmentry/vmexit configuration, nested VM support, etc. These
patches depend on the CET kernel patches for xsaves support and CET
definitions, e.g., MSRs and related feature flags.

CET kernel patches are here:
https://lkml.kernel.org/r/20200429220732.31602-1-yu-cheng.yu@intel.com

v13:
- Added CET definitions as a separate patch to facilitate KVM testing.
- Disabled CET support in KVM if unrestricted_guest is turned off, since
  in that case CET related instructions/infrastructure cannot be emulated
  well.
- Don't expose the CET feature to the guest if the host kernel doesn't
  support CET.
- Rebased the series to v5.8-rc5, commit: 11ba468877bb

v12:
- Fixed a few issues per Sean and Paolo's review feedback.
- Refactored patches so they are properly arranged.
- Removed unnecessary hard-coded CET states for host/guest.
- Added compile-time assertions for vmcs_field_to_offset_table to detect
  mismatch of the field type and field encoding number.
- Added a custom MSR MSR_KVM_GUEST_SSP for guest active SSP save/restore.
- Rebased patches to 5.7-rc3.

v11:
- Fixed a guest vmentry failure issue when guest reboots.
- Used vm_xxx_control_{set, clear}bit() to avoid side effects; these
  helpers update the cached control values instead of only the raw VMCS
  field bits.
- Added vcpu->arch.guest_supported_xss, dedicated to the guest runtime
  mask; this avoids supported_xss being overwritten by an old qemu.
- Separated vmentry/vmexit state setting with the CR0/CR4 dependency check
  to make the patch clearer.
- Added CET VMCS states in dump_vmcs() for debugging purpose.
- Other refactor based on testing.
- This patch series is built on top of the branch below plus the CET kernel
  patches, which provide xsaves support:
  https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=cpu-caps

v10:
- Refactored code per Sean's review feedback.
- Added CET support for nested VM.
- Removed the fix-patch for CPUID(0xd,N) enumeration, as this part was done
  by Paolo and Sean.
- This new patchset is based on Paolo's queued cpu_caps branch.
- Modified patch per XSAVES related change.
- Consolidated KVM unit-test patch with KVM patches.

v9:
- Refactored msr-check functions per Sean's feedback.
- Fixed a few issues per Sean's suggestion.
- Rebased patch to kernel-v5.4.
- Moved CET CPUID feature bits and CR4.CET to last patch.

v8:
- Addressed Jim and Sean's feedback on: 1) CPUID(0xD,i) enumeration. 2)
  sanity checks when configuring guest CET. 3) function improvements.
- Added more sanity check functions.
- Set the default host state at vmexit so that the guest won't leak CET
  state to the host on vmexit.
- Added CR0.WP vs. CR4.CET mutual constraints.

v7:
- Rebased patches to kernel v5.3.
- Sean suggested changing the CPUID(0xd, n) enumeration code to align with
  the existing one; I think it's better to make the fix an independent
  patch since the XSS MSR is widely used on x86 platforms.
- Checked more host and guest state before configuring guest CET,
  per Sean's feedback.
- Added error checks before the guest accesses CET MSRs, per Sean's
  feedback.
- Other minor fixes suggested by Sean.

v6:
- Rebase patch to kernel v5.2.
- Move the CPUID(0xD, n>=1) helper to a separate patch.
- Merge the xsave size fix into another patch.
- Other minor fixes per community feedback.

v5:
- Rebase patch to kernel v5.1.
- Wrap CPUID(0xD, n>=1) code to a helper function.
- Pass through MSR_IA32_PL1_SSP and MSR_IA32_PL2_SSP to Guest.
- Add Co-developed-by tags to patch descriptions.
- Refine patch description.

v4:
- Add Sean's patch for loading Guest fpu state before accessing XSAVES
  managed CET MSRs.
- Fold CET bit setting into the CPUID configuration patch.
- Add a VMX interface to query Host XSS.
- Check Host and Guest XSS support bits before setting Guest XSS.
- Make Guest SHSTK and IBT feature enabling independent.
- Do not report CET support to the Guest when the Host CET feature is
  disabled.

v3:
- Modified patches to make Guest CET independent of Host enabling.
- Added patch 8 to provide user space access to Guest CET MSRs.
- Modified code comments and patch descriptions to reflect changes.

v2:
- Re-ordered the patch sequence and combined one patch.
- Added more description for CET related VMCS fields.
- Added Host CET capability check while enabling the Guest CET loading bit.
- Added Host CET capability check while reporting Guest CPUID(EAX=7, ECX=0).
- Modified the code reporting Guest CPUID(EAX=0xD, ECX>=1) to make it clearer.
- Added Host and Guest XSS mask check while setting bits for Guest XSS.

Sean Christopherson (1):
  KVM: x86: Load guest fpu state when access MSRs managed by XSAVES

Yang Weijiang (10):
  KVM: x86: Include CET definitions for KVM test purpose
  KVM: VMX: Introduce CET VMCS fields and flags
  KVM: VMX: Set guest CET MSRs per KVM and host configuration
  KVM: VMX: Configure CET settings upon guest CR0/4 changing
  KVM: x86: Refresh CPUID once guest changes XSS bits
  KVM: x86: Add userspace access interface for CET MSRs
  KVM: VMX: Enable CET support for nested VM
  KVM: VMX: Add VMCS dump and sanity check for CET states
  KVM: x86: Add #CP support in guest exception dispatch
  KVM: x86: Enable CET virtualization and advertise CET to userspace

 arch/x86/include/asm/kvm_host.h      |   4 +-
 arch/x86/include/asm/vmx.h           |   8 +
 arch/x86/include/uapi/asm/kvm.h      |   1 +
 arch/x86/include/uapi/asm/kvm_para.h |   7 +-
 arch/x86/kvm/cpuid.c                 |  28 ++-
 arch/x86/kvm/vmx/capabilities.h      |   5 +
 arch/x86/kvm/vmx/nested.c            |  34 ++++
 arch/x86/kvm/vmx/vmcs12.c            | 267 ++++++++++++++++-----------
 arch/x86/kvm/vmx/vmcs12.h            |  14 +-
 arch/x86/kvm/vmx/vmx.c               | 262 +++++++++++++++++++++++++-
 arch/x86/kvm/x86.c                   |  53 +++++-
 arch/x86/kvm/x86.h                   |   2 +-
 include/linux/kvm_host.h             |  32 ++++
 13 files changed, 590 insertions(+), 127 deletions(-)

-- 
2.17.2


* [RESEND v13 01/11] KVM: x86: Include CET definitions for KVM test purpose
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-16  3:16 ` [RESEND v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags Yang Weijiang
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

These definitions are added by the CET kernel patches and referenced by KVM.
If the CET KVM patches are tested without the CET kernel patches, this patch
should be included.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 include/linux/kvm_host.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d564855243d8..aae1775a29ac 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -35,6 +35,38 @@
 
 #include <asm/kvm_host.h>
 
+#ifndef CONFIG_X86_INTEL_CET
+#define XFEATURE_CET_USER   11
+#define XFEATURE_CET_KERNEL 12
+
+#define XFEATURE_MASK_CET_USER         (1 << XFEATURE_CET_USER)
+#define XFEATURE_MASK_CET_KERNEL       (1 << XFEATURE_CET_KERNEL)
+
+/* Control-flow Enforcement Technology MSRs */
+#define MSR_IA32_U_CET         0x6a0 /* user mode cet setting */
+#define MSR_IA32_S_CET         0x6a2 /* kernel mode cet setting */
+#define MSR_IA32_PL0_SSP       0x6a4 /* kernel shstk pointer */
+#define MSR_IA32_PL1_SSP       0x6a5 /* ring-1 shstk pointer */
+#define MSR_IA32_PL2_SSP       0x6a6 /* ring-2 shstk pointer */
+#define MSR_IA32_PL3_SSP       0x6a7 /* user shstk pointer */
+#define MSR_IA32_INT_SSP_TAB   0x6a8 /* exception shstk table */
+
+#define X86_CR4_CET_BIT        23 /* enable Control-flow Enforcement */
+#define X86_CR4_CET            _BITUL(X86_CR4_CET_BIT)
+
+#define X86_FEATURE_SHSTK      (16*32+ 7) /* Shadow Stack */
+#define X86_FEATURE_IBT        (18*32+20) /* Indirect Branch Tracking */
+
+/* MSR_IA32_U_CET and MSR_IA32_S_CET bits */
+#define MSR_IA32_CET_SHSTK_EN          0x0000000000000001ULL
+#define MSR_IA32_CET_WRSS_EN           0x0000000000000002ULL
+#define MSR_IA32_CET_ENDBR_EN          0x0000000000000004ULL
+#define MSR_IA32_CET_LEG_IW_EN         0x0000000000000008ULL
+#define MSR_IA32_CET_NO_TRACK_EN       0x0000000000000010ULL
+#define MSR_IA32_CET_WAIT_ENDBR        0x0000000000000800ULL
+#define MSR_IA32_CET_BITMAP_MASK       0xfffffffffffff000ULL
+#endif
+
 #ifndef KVM_MAX_VCPU_ID
 #define KVM_MAX_VCPU_ID KVM_MAX_VCPUS
 #endif
-- 
2.17.2


* [RESEND v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
  2020-07-16  3:16 ` [RESEND v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 19:48   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Yang Weijiang
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CET (Control-flow Enforcement Technology) is a CPU feature used to prevent
Return/Jump-Oriented Programming (ROP/JOP) attacks. It provides the following
sub-features to defend against ROP/JOP style control-flow subversion attacks:

Shadow Stack (SHSTK):
  A second stack for the program, used exclusively for control transfer
  operations.

Indirect Branch Tracking (IBT):
  Code branching protection to defend against jump/call oriented programming.

Several new CET MSRs are defined in the kernel to support CET:
  MSR_IA32_{U,S}_CET: Controls the CET settings for user mode and kernel mode
  respectively.

  MSR_IA32_PL{0,1,2,3}_SSP: Stores shadow stack pointers for CPL-0,1,2,3
  protection respectively.

  MSR_IA32_INT_SSP_TAB: Stores the base address of the shadow stack pointer
  table.

Two XSAVES state bits are introduced for CET:
  IA32_XSS:[bit 11]: Controls saving/restoring of user mode CET states.
  IA32_XSS:[bit 12]: Controls saving/restoring of kernel mode CET states.

Six VMCS fields are introduced for CET:
  {HOST,GUEST}_S_CET: Stores CET settings for kernel mode.
  {HOST,GUEST}_SSP: Stores the shadow stack pointer of the current task/thread.
  {HOST,GUEST}_INTR_SSP_TABLE: Stores the base address of the shadow stack
  pointer table.

If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET states are restored from the
following VMCS fields at VM-Exit:
  HOST_S_CET
  HOST_SSP
  HOST_INTR_SSP_TABLE

If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET states are loaded from the
following VMCS fields at VM-Entry:
  GUEST_S_CET
  GUEST_SSP
  GUEST_INTR_SSP_TABLE
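
Once defined, the new fields are read and written like any other
natural-width VMCS field, e.g. (as later patches in this series do):

  unsigned long ssp;

  ssp = vmcs_readl(GUEST_SSP);    /* read the guest's shadow stack pointer */
  vmcs_writel(GUEST_SSP, ssp);    /* ...or load a new value */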

Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/vmx.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index cd7de4b401fe..879c57ff2dc5 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -94,6 +94,7 @@
 #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
 #define VM_EXIT_PT_CONCEAL_PIP			0x01000000
 #define VM_EXIT_CLEAR_IA32_RTIT_CTL		0x02000000
+#define VM_EXIT_LOAD_CET_STATE                  0x10000000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -107,6 +108,7 @@
 #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
 #define VM_ENTRY_PT_CONCEAL_PIP			0x00020000
 #define VM_ENTRY_LOAD_IA32_RTIT_CTL		0x00040000
+#define VM_ENTRY_LOAD_CET_STATE                 0x00100000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
@@ -328,6 +330,9 @@ enum vmcs_field {
 	GUEST_PENDING_DBG_EXCEPTIONS    = 0x00006822,
 	GUEST_SYSENTER_ESP              = 0x00006824,
 	GUEST_SYSENTER_EIP              = 0x00006826,
+	GUEST_S_CET                     = 0x00006828,
+	GUEST_SSP                       = 0x0000682a,
+	GUEST_INTR_SSP_TABLE            = 0x0000682c,
 	HOST_CR0                        = 0x00006c00,
 	HOST_CR3                        = 0x00006c02,
 	HOST_CR4                        = 0x00006c04,
@@ -340,6 +345,9 @@ enum vmcs_field {
 	HOST_IA32_SYSENTER_EIP          = 0x00006c12,
 	HOST_RSP                        = 0x00006c14,
 	HOST_RIP                        = 0x00006c16,
+	HOST_S_CET                      = 0x00006c18,
+	HOST_SSP                        = 0x00006c1a,
+	HOST_INTR_SSP_TABLE             = 0x00006c1c
 };
 
 /*
-- 
2.17.2


* [RESEND v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
  2020-07-16  3:16 ` [RESEND v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Yang Weijiang
  2020-07-16  3:16 ` [RESEND v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 20:14   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing Yang Weijiang
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CET MSRs are passed through to the guest directly to enhance performance.
CET runtime control settings are stored in MSR_IA32_{U,S}_CET, Shadow Stack
Pointers (SSP) are stored in MSR_IA32_PL{0,1,2,3}_SSP, and the SSP table base
address is stored in MSR_IA32_INT_SSP_TAB; these MSRs are defined in the
kernel and re-used here.

MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user-mode protection; their
contents are switched between threads during scheduling, so it makes sense to
pass them through so that the guest kernel can use xsaves/xrstors to operate
on them efficiently. The other MSRs are used for supervisor-mode protection.
See the SDM for detailed info.

The difference between the CET VMCS fields and the CET MSRs is that the
former are used at VM-Entry/VM-Exit, whereas the latter store CET state
across task/thread scheduling.
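
Condensed, the pass-through rule implemented below is roughly ("xss_bit"
stands in for the XSS component mask of the MSR in question):

  /*
   * Illustration only: keep intercepting a CET MSR unless KVM supports
   * its backing XSS component and the guest enumerates SHSTK or IBT.
   */
  bool incpt = !(supported_xss & xss_bit) ||
               !(guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
                 guest_cpuid_has(vcpu, X86_FEATURE_IBT));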

Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 46 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c     |  3 +++
 2 files changed, 49 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 13745f2a5ecd..a9f135c52cbc 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3126,6 +3126,13 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
 		vmcs_writel(GUEST_CR3, guest_cr3);
 }
 
+static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)
+{
+	return ((supported_xss & xss_states) &&
+		(guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_IBT)));
+}
+
 int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7230,6 +7237,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
 		vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
 }
 
+static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+	bool incpt;
+
+	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);
+	/*
+	 * U_CET is required for USER CET, and U_CET, PL3_SSP are bound as
+	 * one component and controlled by IA32_XSS[bit 11].
+	 */
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW,
+				  incpt);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW,
+				  incpt);
+
+	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL);
+	/*
+	 * S_CET is required for KERNEL CET, and PL0_SSP ... PL2_SSP are
+	 * bound as one component and controlled by IA32_XSS[bit 12].
+	 */
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW,
+				  incpt);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW,
+				  incpt);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW,
+				  incpt);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW,
+				  incpt);
+
+	incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+	/* SSP_TAB is only available for KERNEL SHSTK. */
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW,
+				  incpt);
+}
+
 static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7268,6 +7311,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 			vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
 		}
 	}
+
+	if (supported_xss & (XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_CET_USER))
+		vmx_update_intercept_for_cet_msr(vcpu);
 }
 
 static __init void vmx_set_cpu_caps(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 88c593f83b28..ea8a9dc9fbad 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -184,6 +184,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU)
 
+#define KVM_SUPPORTED_XSS       (XFEATURE_MASK_CET_USER | \
+				 XFEATURE_MASK_CET_KERNEL)
+
 u64 __read_mostly host_efer;
 EXPORT_SYMBOL_GPL(host_efer);
 
-- 
2.17.2


* [RESEND v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (2 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 20:31   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits Yang Weijiang
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CR4.CET is the master control bit for the CET feature. There are mutual
constraints between CR0.WP and CR4.CET, so the dependent bit needs to be
checked when changing these control registers.

The processor does not allow CR4.CET to be set if CR0.WP = 0; similarly, it
does not allow CR0.WP to be cleared while CR4.CET = 1. In either case, KVM
injects a #GP into the guest.

The CET state load bits are set/cleared along with the CR4.CET bit.

Note:
The SHSTK and IBT features share one control MSR: MSR_IA32_{U,S}_CET, which
makes it difficult to hide one feature from the other when SHSTK != IBT.
After discussion in the community, it was agreed to let the guest control the
two features independently, as this won't introduce a security hole.
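
Condensed into one hypothetical helper, the constraint enforced by the two
checks below is:

  /*
   * For illustration only: CR4.CET requires CR0.WP = 1, and CR0.WP may
   * not be cleared while CR4.CET = 1; a violation yields #GP.
   */
  static bool cet_cr_state_valid(unsigned long cr0, unsigned long cr4)
  {
          return !(cr4 & X86_CR4_CET) || (cr0 & X86_CR0_WP);
  }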

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/capabilities.h |  5 +++++
 arch/x86/kvm/vmx/vmx.c          | 30 ++++++++++++++++++++++++++++--
 arch/x86/kvm/x86.c              |  3 +++
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 4bbd8b448d22..dbc87c5997cc 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -103,6 +103,11 @@ static inline bool cpu_has_load_perf_global_ctrl(void)
 	       (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
 }
 
+static inline bool cpu_has_load_cet_ctrl(void)
+{
+	return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE) &&
+		(vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_CET_STATE);
+}
 static inline bool cpu_has_vmx_mpx(void)
 {
 	return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_BNDCFGS) &&
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a9f135c52cbc..0089943fbb31 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2510,7 +2510,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      VM_EXIT_LOAD_IA32_EFER |
 	      VM_EXIT_CLEAR_BNDCFGS |
 	      VM_EXIT_PT_CONCEAL_PIP |
-	      VM_EXIT_CLEAR_IA32_RTIT_CTL;
+	      VM_EXIT_CLEAR_IA32_RTIT_CTL |
+	      VM_EXIT_LOAD_CET_STATE;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -2534,7 +2535,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      VM_ENTRY_LOAD_IA32_EFER |
 	      VM_ENTRY_LOAD_BNDCFGS |
 	      VM_ENTRY_PT_CONCEAL_PIP |
-	      VM_ENTRY_LOAD_IA32_RTIT_CTL;
+	      VM_ENTRY_LOAD_IA32_RTIT_CTL |
+	      VM_ENTRY_LOAD_CET_STATE;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
 				&_vmentry_control) < 0)
 		return -EIO;
@@ -3133,6 +3135,12 @@ static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)
 		guest_cpuid_has(vcpu, X86_FEATURE_IBT)));
 }
 
+static bool is_cet_supported(struct kvm_vcpu *vcpu)
+{
+	return is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER |
+				      XFEATURE_MASK_CET_KERNEL);
+}
+
 int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -3173,6 +3181,10 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			return 1;
 	}
 
+	if ((cr4 & X86_CR4_CET) && (!is_cet_supported(vcpu) ||
+	    !(kvm_read_cr0(vcpu) & X86_CR0_WP)))
+		return 1;
+
 	if (vmx->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
 		return 1;
 
@@ -3204,6 +3216,20 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
 	}
 
+	if (cpu_has_load_cet_ctrl()) {
+		if ((hw_cr4 & X86_CR4_CET) && is_cet_supported(vcpu)) {
+			vm_entry_controls_setbit(to_vmx(vcpu),
+						 VM_ENTRY_LOAD_CET_STATE);
+			vm_exit_controls_setbit(to_vmx(vcpu),
+						VM_EXIT_LOAD_CET_STATE);
+		} else {
+			vm_entry_controls_clearbit(to_vmx(vcpu),
+						   VM_ENTRY_LOAD_CET_STATE);
+			vm_exit_controls_clearbit(to_vmx(vcpu),
+						  VM_EXIT_LOAD_CET_STATE);
+		}
+	}
+
 	vmcs_writel(CR4_READ_SHADOW, cr4);
 	vmcs_writel(GUEST_CR4, hw_cr4);
 	return 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ea8a9dc9fbad..906e07039d59 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -815,6 +815,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
 		return 1;
 
+	if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET))
+		return 1;
+
 	kvm_x86_ops.set_cr0(vcpu, cr0);
 
 	if ((cr0 ^ old_cr0) & X86_CR0_PG) {
-- 
2.17.2


* [RESEND v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (3 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 20:32   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES Yang Weijiang
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CPUID(0xd, 1) reports the currently required storage size of the states
enabled in XCR0 | XSS. When the guest updates XSS, it's necessary to update
this CPUID leaf, otherwise the guest will fetch a stale state size, which
results in WARN traces while the guest is running.

supported_xss is initialized to host_xss & KVM_SUPPORTED_XSS to indicate the
MSR_IA32_XSS bits supported by KVM, but the actual XSS bits seen by the guest
depend on the guest's CPUID(0xd,1).{ECX, EDX} settings.
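
A rough worked example of why the refresh matters (CET state sizes as
defined by the CET kernel patches; a sketch, not taken from this series):

  /*
   * With the compacted format, CPUID(0xd,1).EBX must cover every state
   * enabled in XCR0 | XSS. Assuming the CET xstate sizes from the CET
   * kernel patches:
   *   XFEATURE_CET_USER   (XSS bit 11): 16 bytes (u_cet + pl3_ssp)
   *   XFEATURE_CET_KERNEL (XSS bit 12): 24 bytes (pl0/1/2_ssp)
   * a guest setting XSS bit 11 grows the required size by 16 bytes, so
   * a stale EBX would make the guest under-size its XSAVES buffer.
   */
  best->ebx = xstate_required_size(vcpu->arch.xcr0 | vcpu->arch.ia32_xss, true);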

Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            | 23 +++++++++++++++++++----
 arch/x86/kvm/x86.c              | 12 ++++++++----
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be5363b21540..e8c749596ba2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -654,6 +654,7 @@ struct kvm_vcpu_arch {
 
 	u64 xcr0;
 	u64 guest_supported_xcr0;
+	u64 guest_supported_xss;
 
 	struct kvm_pio_request pio;
 	void *pio_data;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 8a294f9747aa..d97b2a6e8a8c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -88,14 +88,29 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 		vcpu->arch.guest_supported_xcr0 = 0;
 	} else {
 		vcpu->arch.guest_supported_xcr0 =
-			(best->eax | ((u64)best->edx << 32)) & supported_xcr0;
+			(((u64)best->edx << 32) | best->eax) & supported_xcr0;
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, false);
 	}
 
 	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
-	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
-		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+	if (best) {
+		if (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
+		    cpuid_entry_has(best, X86_FEATURE_XSAVEC))  {
+			u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss;
+
+			best->ebx = xstate_required_size(xstate, true);
+		}
+
+		if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) {
+			best->ecx = 0;
+			best->edx = 0;
+		}
+		vcpu->arch.guest_supported_xss =
+			(((u64)best->edx << 32) | best->ecx) & supported_xss;
+
+	} else {
+		vcpu->arch.guest_supported_xss = 0;
+	}
 
 	/*
 	 * The existing code assumes virtual address is 48-bit or 57-bit in the
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 906e07039d59..8aed32ff9c0c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2912,9 +2912,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
 		 * XSAVES/XRSTORS to save/restore PT MSRs.
 		 */
-		if (data & ~supported_xss)
+		if (data & ~vcpu->arch.guest_supported_xss)
 			return 1;
-		vcpu->arch.ia32_xss = data;
+		if (vcpu->arch.ia32_xss != data) {
+			vcpu->arch.ia32_xss = data;
+			kvm_update_cpuid(vcpu);
+		}
 		break;
 	case MSR_SMI_COUNT:
 		if (!msr_info->host_initiated)
@@ -9779,8 +9782,9 @@ int kvm_arch_hardware_setup(void *opaque)
 
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
 
-	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-		supported_xss = 0;
+	supported_xss = 0;
+	if (kvm_cpu_cap_has(X86_FEATURE_XSAVES))
+		supported_xss = host_xss & KVM_SUPPORTED_XSS;
 
 #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
 	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
-- 
2.17.2


* [RESEND v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (4 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 20:32   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 07/11] KVM: x86: Add userspace access interface for CET MSRs Yang Weijiang
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

From: Sean Christopherson <sean.j.christopherson@intel.com>

A handful of CET MSRs are not context switched through "traditional"
methods, e.g. VMCS or manual switching, but rather are passed through
to the guest and are saved and restored by XSAVES/XRSTORS, i.e. in the
guest's FPU state.

Load the guest's FPU state if userspace is accessing MSRs whose values
are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(),
can simply do {RD,WR}MSR to access the guest's value.

Note that guest_cpuid_has() is not queried as host userspace is allowed
to access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.
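
The resulting access pattern is roughly (kvm_load_guest_fpu() and
kvm_put_guest_fpu() are the existing helpers in x86.c):

  u64 data;

  kvm_load_guest_fpu(vcpu);          /* guest XSAVES state is now live */
  rdmsrl(MSR_IA32_PL3_SSP, data);    /* RDMSR reads the guest's value  */
  kvm_put_guest_fpu(vcpu);           /* restore userspace FPU state    */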

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/x86.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8aed32ff9c0c..c437ddc22ad6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -107,6 +107,8 @@ static void enter_smm(struct kvm_vcpu *vcpu);
 static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 static void store_regs(struct kvm_vcpu *vcpu);
 static int sync_regs(struct kvm_vcpu *vcpu);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
 
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_x86_ops);
@@ -3356,6 +3358,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 }
 EXPORT_SYMBOL_GPL(kvm_get_msr_common);
 
+static bool is_xsaves_msr(u32 index)
+{
+	return index == MSR_IA32_U_CET ||
+	       (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
+}
+
 /*
  * Read or write a bunch of msrs. All parameters are kernel addresses.
  *
@@ -3366,11 +3374,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
 		    int (*do_msr)(struct kvm_vcpu *vcpu,
 				  unsigned index, u64 *data))
 {
+	bool fpu_loaded = false;
 	int i;
 
-	for (i = 0; i < msrs->nmsrs; ++i)
+	for (i = 0; i < msrs->nmsrs; ++i) {
+		if (vcpu && !fpu_loaded && supported_xss &&
+		    is_xsaves_msr(entries[i].index)) {
+			kvm_load_guest_fpu(vcpu);
+			fpu_loaded = true;
+		}
 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
 			break;
+	}
+	if (fpu_loaded)
+		kvm_put_guest_fpu(vcpu);
 
 	return i;
 }
-- 
2.17.2


* [RESEND v13 07/11] KVM: x86: Add userspace access interface for CET MSRs
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (5 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 20:54   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 08/11] KVM: VMX: Enable CET support for nested VM Yang Weijiang
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

There are two different places storing guest CET state. State managed with
XSAVES/XRSTORS, as saved/restored in the previous patch, can be read/written
directly from/to the MSRs. State stored in VMCS fields is accessed via
vmcs_read/vmcs_write.

To correctly read/write the CET MSRs, it's necessary to check whether a
kernel FPU context switch has happened and reload the guest FPU context if
needed.
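
From host userspace, the new MSRs (including the custom MSR_KVM_GUEST_SSP)
are reached through the standard KVM_GET/SET_MSRS ioctls, e.g. (a sketch;
vcpu_fd is assumed to be an open KVM vCPU file descriptor):

  #include <linux/kvm.h>
  #include <sys/ioctl.h>

  struct {
          struct kvm_msrs hdr;
          struct kvm_msr_entry entries[1];
  } msrs = {
          .hdr.nmsrs = 1,
          .entries[0].index = MSR_KVM_GUEST_SSP,
  };

  /* Host-initiated access, so the accessibility checks above pass. */
  ioctl(vcpu_fd, KVM_GET_MSRS, &msrs);    /* entries[0].data = guest SSP */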

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/uapi/asm/kvm_para.h |   7 +-
 arch/x86/kvm/vmx/vmx.c               | 148 +++++++++++++++++++++++++++
 arch/x86/kvm/x86.c                   |   4 +
 3 files changed, 156 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 812e9b4c1114..2d3422dc4c81 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -47,12 +47,13 @@
 /* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */
 #define MSR_KVM_WALL_CLOCK_NEW  0x4b564d00
 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
-#define MSR_KVM_ASYNC_PF_EN 0x4b564d02
-#define MSR_KVM_STEAL_TIME  0x4b564d03
-#define MSR_KVM_PV_EOI_EN      0x4b564d04
+#define MSR_KVM_ASYNC_PF_EN	0x4b564d02
+#define MSR_KVM_STEAL_TIME	0x4b564d03
+#define MSR_KVM_PV_EOI_EN	0x4b564d04
 #define MSR_KVM_POLL_CONTROL	0x4b564d05
 #define MSR_KVM_ASYNC_PF_INT	0x4b564d06
 #define MSR_KVM_ASYNC_PF_ACK	0x4b564d07
+#define MSR_KVM_GUEST_SSP	0x4b564d08
 
 struct kvm_steal_time {
 	__u64 steal;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0089943fbb31..4ce61427ed49 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1819,6 +1819,94 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 	}
 }
 
+static void vmx_get_xsave_msr(struct msr_data *msr_info)
+{
+	local_irq_disable();
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+	rdmsrl(msr_info->index, msr_info->data);
+	local_irq_enable();
+}
+
+static void vmx_set_xsave_msr(struct msr_data *msr_info)
+{
+	local_irq_disable();
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+	wrmsrl(msr_info->index, msr_info->data);
+	local_irq_enable();
+}
+
+#define CET_MSR_RSVD_BITS_1  GENMASK(2, 0)
+#define CET_MSR_RSVD_BITS_2  GENMASK(9, 6)
+
+static bool cet_check_msr_valid(struct kvm_vcpu *vcpu,
+				struct msr_data *msr, u64 rsvd_bits)
+{
+	u64 data = msr->data;
+	u32 index = msr->index;
+
+	if ((index == MSR_IA32_PL0_SSP || index == MSR_IA32_PL1_SSP ||
+	    index == MSR_IA32_PL2_SSP || index == MSR_IA32_PL3_SSP ||
+	    index == MSR_IA32_INT_SSP_TAB || index == MSR_KVM_GUEST_SSP) &&
+	    is_noncanonical_address(data, vcpu))
+		return false;
+
+	if ((index == MSR_IA32_S_CET || index == MSR_IA32_U_CET) &&
+	    data & MSR_IA32_CET_ENDBR_EN) {
+		u64 bitmap_base = data >> 12;
+
+		if (is_noncanonical_address(bitmap_base, vcpu))
+			return false;
+	}
+
+	return !(data & rsvd_bits);
+}
+
+static bool cet_check_ssp_msr_accessible(struct kvm_vcpu *vcpu,
+					 struct msr_data *msr)
+{
+	u32 index = msr->index;
+
+	if (!boot_cpu_has(X86_FEATURE_SHSTK))
+		return false;
+
+	if (!msr->host_initiated &&
+	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
+		return false;
+
+	if (index == MSR_KVM_GUEST_SSP)
+		return msr->host_initiated &&
+		       guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+
+	if (index == MSR_IA32_INT_SSP_TAB)
+		return true;
+
+	if (index == MSR_IA32_PL3_SSP)
+		return supported_xss & XFEATURE_MASK_CET_USER;
+
+	return supported_xss & XFEATURE_MASK_CET_KERNEL;
+}
+
+static bool cet_check_ctl_msr_accessible(struct kvm_vcpu *vcpu,
+					 struct msr_data *msr)
+{
+	u32 index = msr->index;
+
+	if (!boot_cpu_has(X86_FEATURE_SHSTK) &&
+	    !boot_cpu_has(X86_FEATURE_IBT))
+		return false;
+
+	if (!msr->host_initiated &&
+	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
+	    !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
+		return false;
+
+	if (index == MSR_IA32_U_CET)
+		return supported_xss & XFEATURE_MASK_CET_USER;
+
+	return supported_xss & XFEATURE_MASK_CET_KERNEL;
+}
 /*
  * Reads an msr value (of 'msr_index') into 'pdata'.
  * Returns 0 on success, non-0 otherwise.
@@ -1951,6 +2039,31 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
 		break;
+	case MSR_KVM_GUEST_SSP:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		msr_info->data = vmcs_readl(GUEST_SSP);
+		break;
+	case MSR_IA32_S_CET:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		msr_info->data = vmcs_readl(GUEST_S_CET);
+		break;
+	case MSR_IA32_INT_SSP_TAB:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
+		break;
+	case MSR_IA32_U_CET:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		vmx_get_xsave_msr(msr_info);
+		break;
+	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		vmx_get_xsave_msr(msr_info);
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -2221,6 +2334,41 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			vmx->pt_desc.guest.addr_a[index / 2] = data;
 		break;
+	case MSR_KVM_GUEST_SSP:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1))
+			return 1;
+		vmcs_writel(GUEST_SSP, data);
+		break;
+	case MSR_IA32_S_CET:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2))
+			return 1;
+		vmcs_writel(GUEST_S_CET, data);
+		break;
+	case MSR_IA32_INT_SSP_TAB:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, 0))
+			return 1;
+		vmcs_writel(GUEST_INTR_SSP_TABLE, data);
+		break;
+	case MSR_IA32_U_CET:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2))
+			return 1;
+		vmx_set_xsave_msr(msr_info);
+		break;
+	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1))
+			return 1;
+		vmx_set_xsave_msr(msr_info);
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c437ddc22ad6..c71a9ceac05e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1234,6 +1234,10 @@ static const u32 msrs_to_save_all[] = {
 	MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
+
+	MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_S_CET,
+	MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP,
+	MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, MSR_KVM_GUEST_SSP,
 };
 
 static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
-- 
2.17.2


* [RESEND v13 08/11] KVM: VMX: Enable CET support for nested VM
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (6 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 07/11] KVM: x86: Add userspace access interface for CET MSRs Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 21:20   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states Yang Weijiang
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CET MSRs are passed through to guests for performance reasons. Configure the
MSRs to match the L0/L1 settings so that a nested VM is able to run with CET.

Add assertions for the vmcs12 offset table initialization; these assertions
detect mismatches between a VMCS field's encoding and its data type at
compile time.
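
For reference, the assertions key off the width bits of the VMCS field
encoding (SDM Vol. 3, Appendix B); a summary of the masks used below:

  /*
   * VMCS field encoding width/access bits the assertions test:
   *   (enc & 0x6000) == 0x0000 : 16-bit field
   *   (enc & 0x6000) == 0x2000 : 64-bit field
   *   (enc & 0x6001) == 0x2001 : high 32 bits of a 64-bit field
   *   (enc & 0x6000) == 0x4000 : 32-bit field
   *   (enc & 0x6000) == 0x6000 : natural-width field
   * e.g. GUEST_SSP = 0x682a: 0x682a & 0x6000 == 0x6000, so it must map
   * to a natural_width member of vmcs12, which FIELDN() enforces.
   */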

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/nested.c |  34 +++++
 arch/x86/kvm/vmx/vmcs12.c | 267 +++++++++++++++++++++++---------------
 arch/x86/kvm/vmx/vmcs12.h |  14 +-
 arch/x86/kvm/vmx/vmx.c    |  10 ++
 4 files changed, 216 insertions(+), 109 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d4a4cec034d0..ddb1a69ce947 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -550,6 +550,18 @@ static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap)
 	}
 }
 
+static void nested_vmx_update_intercept_for_msr(struct kvm_vcpu *vcpu,
+						u32 msr,
+						unsigned long *msr_bitmap_l1,
+						unsigned long *msr_bitmap_l0,
+						int type)
+{
+	if (!msr_write_intercepted_l01(vcpu, msr))
+		nested_vmx_disable_intercept_for_msr(msr_bitmap_l1,
+						     msr_bitmap_l0,
+						     msr, type);
+}
+
 /*
  * Merge L0's and L1's MSR bitmap, return false to indicate that
  * we do not use the hardware.
@@ -621,6 +633,28 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
 	nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0,
 					     MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
 
+	/* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_U_CET,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_S_CET,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
 	/*
 	 * Checking the L0->L1 bitmap is trying to verify two things:
 	 *
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index c8e51c004f78..147e0d8eeab2 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -4,31 +4,76 @@
 
 #define ROL16(val, n) ((u16)(((u16)(val) << (n)) | ((u16)(val) >> (16 - (n)))))
 #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x)
-#define FIELD(number, name)	[ROL16(number, 6)] = VMCS12_OFFSET(name)
-#define FIELD64(number, name)						\
-	FIELD(number, name),						\
-	[ROL16(number##_HIGH, 6)] = VMCS12_OFFSET(name) + sizeof(u32)
+
+#define VMCS_CHECK_SIZE16(number) \
+	(BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2000) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2001) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x4000) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x6000))
+
+#define VMCS_CHECK_SIZE32(number) \
+	(BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x6000))
+
+#define VMCS_CHECK_SIZE64(number) \
+	(BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2001) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x4000) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x6000))
+
+#define VMCS_CHECK_SIZE_N(number) \
+	(BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2000) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2001) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x4000))
+
+#define FIELD16(number, name) \
+	[ROL16(number, 6)] = VMCS_CHECK_SIZE16(number) + VMCS12_OFFSET(name)
+
+#define FIELD32(number, name) \
+	[ROL16(number, 6)] = VMCS_CHECK_SIZE32(number) + VMCS12_OFFSET(name)
+
+#define FIELD64(number, name)  \
+	FIELD32(number, name), \
+	[ROL16(number##_HIGH, 6)] = VMCS_CHECK_SIZE32(number) + \
+	VMCS12_OFFSET(name) + sizeof(u32)
+#define FIELDN(number, name) \
+	[ROL16(number, 6)] = VMCS_CHECK_SIZE_N(number) + VMCS12_OFFSET(name)
 
 const unsigned short vmcs_field_to_offset_table[] = {
-	FIELD(VIRTUAL_PROCESSOR_ID, virtual_processor_id),
-	FIELD(POSTED_INTR_NV, posted_intr_nv),
-	FIELD(GUEST_ES_SELECTOR, guest_es_selector),
-	FIELD(GUEST_CS_SELECTOR, guest_cs_selector),
-	FIELD(GUEST_SS_SELECTOR, guest_ss_selector),
-	FIELD(GUEST_DS_SELECTOR, guest_ds_selector),
-	FIELD(GUEST_FS_SELECTOR, guest_fs_selector),
-	FIELD(GUEST_GS_SELECTOR, guest_gs_selector),
-	FIELD(GUEST_LDTR_SELECTOR, guest_ldtr_selector),
-	FIELD(GUEST_TR_SELECTOR, guest_tr_selector),
-	FIELD(GUEST_INTR_STATUS, guest_intr_status),
-	FIELD(GUEST_PML_INDEX, guest_pml_index),
-	FIELD(HOST_ES_SELECTOR, host_es_selector),
-	FIELD(HOST_CS_SELECTOR, host_cs_selector),
-	FIELD(HOST_SS_SELECTOR, host_ss_selector),
-	FIELD(HOST_DS_SELECTOR, host_ds_selector),
-	FIELD(HOST_FS_SELECTOR, host_fs_selector),
-	FIELD(HOST_GS_SELECTOR, host_gs_selector),
-	FIELD(HOST_TR_SELECTOR, host_tr_selector),
+	FIELD16(VIRTUAL_PROCESSOR_ID, virtual_processor_id),
+	FIELD16(POSTED_INTR_NV, posted_intr_nv),
+	FIELD16(GUEST_ES_SELECTOR, guest_es_selector),
+	FIELD16(GUEST_CS_SELECTOR, guest_cs_selector),
+	FIELD16(GUEST_SS_SELECTOR, guest_ss_selector),
+	FIELD16(GUEST_DS_SELECTOR, guest_ds_selector),
+	FIELD16(GUEST_FS_SELECTOR, guest_fs_selector),
+	FIELD16(GUEST_GS_SELECTOR, guest_gs_selector),
+	FIELD16(GUEST_LDTR_SELECTOR, guest_ldtr_selector),
+	FIELD16(GUEST_TR_SELECTOR, guest_tr_selector),
+	FIELD16(GUEST_INTR_STATUS, guest_intr_status),
+	FIELD16(GUEST_PML_INDEX, guest_pml_index),
+	FIELD16(HOST_ES_SELECTOR, host_es_selector),
+	FIELD16(HOST_CS_SELECTOR, host_cs_selector),
+	FIELD16(HOST_SS_SELECTOR, host_ss_selector),
+	FIELD16(HOST_DS_SELECTOR, host_ds_selector),
+	FIELD16(HOST_FS_SELECTOR, host_fs_selector),
+	FIELD16(HOST_GS_SELECTOR, host_gs_selector),
+	FIELD16(HOST_TR_SELECTOR, host_tr_selector),
 	FIELD64(IO_BITMAP_A, io_bitmap_a),
 	FIELD64(IO_BITMAP_B, io_bitmap_b),
 	FIELD64(MSR_BITMAP, msr_bitmap),
@@ -64,90 +109,96 @@ const unsigned short vmcs_field_to_offset_table[] = {
 	FIELD64(HOST_IA32_PAT, host_ia32_pat),
 	FIELD64(HOST_IA32_EFER, host_ia32_efer),
 	FIELD64(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl),
-	FIELD(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control),
-	FIELD(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control),
-	FIELD(EXCEPTION_BITMAP, exception_bitmap),
-	FIELD(PAGE_FAULT_ERROR_CODE_MASK, page_fault_error_code_mask),
-	FIELD(PAGE_FAULT_ERROR_CODE_MATCH, page_fault_error_code_match),
-	FIELD(CR3_TARGET_COUNT, cr3_target_count),
-	FIELD(VM_EXIT_CONTROLS, vm_exit_controls),
-	FIELD(VM_EXIT_MSR_STORE_COUNT, vm_exit_msr_store_count),
-	FIELD(VM_EXIT_MSR_LOAD_COUNT, vm_exit_msr_load_count),
-	FIELD(VM_ENTRY_CONTROLS, vm_entry_controls),
-	FIELD(VM_ENTRY_MSR_LOAD_COUNT, vm_entry_msr_load_count),
-	FIELD(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field),
-	FIELD(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code),
-	FIELD(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len),
-	FIELD(TPR_THRESHOLD, tpr_threshold),
-	FIELD(SECONDARY_VM_EXEC_CONTROL, secondary_vm_exec_control),
-	FIELD(VM_INSTRUCTION_ERROR, vm_instruction_error),
-	FIELD(VM_EXIT_REASON, vm_exit_reason),
-	FIELD(VM_EXIT_INTR_INFO, vm_exit_intr_info),
-	FIELD(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code),
-	FIELD(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field),
-	FIELD(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code),
-	FIELD(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len),
-	FIELD(VMX_INSTRUCTION_INFO, vmx_instruction_info),
-	FIELD(GUEST_ES_LIMIT, guest_es_limit),
-	FIELD(GUEST_CS_LIMIT, guest_cs_limit),
-	FIELD(GUEST_SS_LIMIT, guest_ss_limit),
-	FIELD(GUEST_DS_LIMIT, guest_ds_limit),
-	FIELD(GUEST_FS_LIMIT, guest_fs_limit),
-	FIELD(GUEST_GS_LIMIT, guest_gs_limit),
-	FIELD(GUEST_LDTR_LIMIT, guest_ldtr_limit),
-	FIELD(GUEST_TR_LIMIT, guest_tr_limit),
-	FIELD(GUEST_GDTR_LIMIT, guest_gdtr_limit),
-	FIELD(GUEST_IDTR_LIMIT, guest_idtr_limit),
-	FIELD(GUEST_ES_AR_BYTES, guest_es_ar_bytes),
-	FIELD(GUEST_CS_AR_BYTES, guest_cs_ar_bytes),
-	FIELD(GUEST_SS_AR_BYTES, guest_ss_ar_bytes),
-	FIELD(GUEST_DS_AR_BYTES, guest_ds_ar_bytes),
-	FIELD(GUEST_FS_AR_BYTES, guest_fs_ar_bytes),
-	FIELD(GUEST_GS_AR_BYTES, guest_gs_ar_bytes),
-	FIELD(GUEST_LDTR_AR_BYTES, guest_ldtr_ar_bytes),
-	FIELD(GUEST_TR_AR_BYTES, guest_tr_ar_bytes),
-	FIELD(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info),
-	FIELD(GUEST_ACTIVITY_STATE, guest_activity_state),
-	FIELD(GUEST_SYSENTER_CS, guest_sysenter_cs),
-	FIELD(HOST_IA32_SYSENTER_CS, host_ia32_sysenter_cs),
-	FIELD(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value),
-	FIELD(CR0_GUEST_HOST_MASK, cr0_guest_host_mask),
-	FIELD(CR4_GUEST_HOST_MASK, cr4_guest_host_mask),
-	FIELD(CR0_READ_SHADOW, cr0_read_shadow),
-	FIELD(CR4_READ_SHADOW, cr4_read_shadow),
-	FIELD(EXIT_QUALIFICATION, exit_qualification),
-	FIELD(GUEST_LINEAR_ADDRESS, guest_linear_address),
-	FIELD(GUEST_CR0, guest_cr0),
-	FIELD(GUEST_CR3, guest_cr3),
-	FIELD(GUEST_CR4, guest_cr4),
-	FIELD(GUEST_ES_BASE, guest_es_base),
-	FIELD(GUEST_CS_BASE, guest_cs_base),
-	FIELD(GUEST_SS_BASE, guest_ss_base),
-	FIELD(GUEST_DS_BASE, guest_ds_base),
-	FIELD(GUEST_FS_BASE, guest_fs_base),
-	FIELD(GUEST_GS_BASE, guest_gs_base),
-	FIELD(GUEST_LDTR_BASE, guest_ldtr_base),
-	FIELD(GUEST_TR_BASE, guest_tr_base),
-	FIELD(GUEST_GDTR_BASE, guest_gdtr_base),
-	FIELD(GUEST_IDTR_BASE, guest_idtr_base),
-	FIELD(GUEST_DR7, guest_dr7),
-	FIELD(GUEST_RSP, guest_rsp),
-	FIELD(GUEST_RIP, guest_rip),
-	FIELD(GUEST_RFLAGS, guest_rflags),
-	FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions),
-	FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp),
-	FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip),
-	FIELD(HOST_CR0, host_cr0),
-	FIELD(HOST_CR3, host_cr3),
-	FIELD(HOST_CR4, host_cr4),
-	FIELD(HOST_FS_BASE, host_fs_base),
-	FIELD(HOST_GS_BASE, host_gs_base),
-	FIELD(HOST_TR_BASE, host_tr_base),
-	FIELD(HOST_GDTR_BASE, host_gdtr_base),
-	FIELD(HOST_IDTR_BASE, host_idtr_base),
-	FIELD(HOST_IA32_SYSENTER_ESP, host_ia32_sysenter_esp),
-	FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip),
-	FIELD(HOST_RSP, host_rsp),
-	FIELD(HOST_RIP, host_rip),
+	FIELD32(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control),
+	FIELD32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control),
+	FIELD32(EXCEPTION_BITMAP, exception_bitmap),
+	FIELD32(PAGE_FAULT_ERROR_CODE_MASK, page_fault_error_code_mask),
+	FIELD32(PAGE_FAULT_ERROR_CODE_MATCH, page_fault_error_code_match),
+	FIELD32(CR3_TARGET_COUNT, cr3_target_count),
+	FIELD32(VM_EXIT_CONTROLS, vm_exit_controls),
+	FIELD32(VM_EXIT_MSR_STORE_COUNT, vm_exit_msr_store_count),
+	FIELD32(VM_EXIT_MSR_LOAD_COUNT, vm_exit_msr_load_count),
+	FIELD32(VM_ENTRY_CONTROLS, vm_entry_controls),
+	FIELD32(VM_ENTRY_MSR_LOAD_COUNT, vm_entry_msr_load_count),
+	FIELD32(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field),
+	FIELD32(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code),
+	FIELD32(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len),
+	FIELD32(TPR_THRESHOLD, tpr_threshold),
+	FIELD32(SECONDARY_VM_EXEC_CONTROL, secondary_vm_exec_control),
+	FIELD32(VM_INSTRUCTION_ERROR, vm_instruction_error),
+	FIELD32(VM_EXIT_REASON, vm_exit_reason),
+	FIELD32(VM_EXIT_INTR_INFO, vm_exit_intr_info),
+	FIELD32(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code),
+	FIELD32(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field),
+	FIELD32(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code),
+	FIELD32(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len),
+	FIELD32(VMX_INSTRUCTION_INFO, vmx_instruction_info),
+	FIELD32(GUEST_ES_LIMIT, guest_es_limit),
+	FIELD32(GUEST_CS_LIMIT, guest_cs_limit),
+	FIELD32(GUEST_SS_LIMIT, guest_ss_limit),
+	FIELD32(GUEST_DS_LIMIT, guest_ds_limit),
+	FIELD32(GUEST_FS_LIMIT, guest_fs_limit),
+	FIELD32(GUEST_GS_LIMIT, guest_gs_limit),
+	FIELD32(GUEST_LDTR_LIMIT, guest_ldtr_limit),
+	FIELD32(GUEST_TR_LIMIT, guest_tr_limit),
+	FIELD32(GUEST_GDTR_LIMIT, guest_gdtr_limit),
+	FIELD32(GUEST_IDTR_LIMIT, guest_idtr_limit),
+	FIELD32(GUEST_ES_AR_BYTES, guest_es_ar_bytes),
+	FIELD32(GUEST_CS_AR_BYTES, guest_cs_ar_bytes),
+	FIELD32(GUEST_SS_AR_BYTES, guest_ss_ar_bytes),
+	FIELD32(GUEST_DS_AR_BYTES, guest_ds_ar_bytes),
+	FIELD32(GUEST_FS_AR_BYTES, guest_fs_ar_bytes),
+	FIELD32(GUEST_GS_AR_BYTES, guest_gs_ar_bytes),
+	FIELD32(GUEST_LDTR_AR_BYTES, guest_ldtr_ar_bytes),
+	FIELD32(GUEST_TR_AR_BYTES, guest_tr_ar_bytes),
+	FIELD32(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info),
+	FIELD32(GUEST_ACTIVITY_STATE, guest_activity_state),
+	FIELD32(GUEST_SYSENTER_CS, guest_sysenter_cs),
+	FIELD32(HOST_IA32_SYSENTER_CS, host_ia32_sysenter_cs),
+	FIELD32(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value),
+	FIELDN(CR0_GUEST_HOST_MASK, cr0_guest_host_mask),
+	FIELDN(CR4_GUEST_HOST_MASK, cr4_guest_host_mask),
+	FIELDN(CR0_READ_SHADOW, cr0_read_shadow),
+	FIELDN(CR4_READ_SHADOW, cr4_read_shadow),
+	FIELDN(EXIT_QUALIFICATION, exit_qualification),
+	FIELDN(GUEST_LINEAR_ADDRESS, guest_linear_address),
+	FIELDN(GUEST_CR0, guest_cr0),
+	FIELDN(GUEST_CR3, guest_cr3),
+	FIELDN(GUEST_CR4, guest_cr4),
+	FIELDN(GUEST_ES_BASE, guest_es_base),
+	FIELDN(GUEST_CS_BASE, guest_cs_base),
+	FIELDN(GUEST_SS_BASE, guest_ss_base),
+	FIELDN(GUEST_DS_BASE, guest_ds_base),
+	FIELDN(GUEST_FS_BASE, guest_fs_base),
+	FIELDN(GUEST_GS_BASE, guest_gs_base),
+	FIELDN(GUEST_LDTR_BASE, guest_ldtr_base),
+	FIELDN(GUEST_TR_BASE, guest_tr_base),
+	FIELDN(GUEST_GDTR_BASE, guest_gdtr_base),
+	FIELDN(GUEST_IDTR_BASE, guest_idtr_base),
+	FIELDN(GUEST_DR7, guest_dr7),
+	FIELDN(GUEST_RSP, guest_rsp),
+	FIELDN(GUEST_RIP, guest_rip),
+	FIELDN(GUEST_RFLAGS, guest_rflags),
+	FIELDN(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions),
+	FIELDN(GUEST_SYSENTER_ESP, guest_sysenter_esp),
+	FIELDN(GUEST_SYSENTER_EIP, guest_sysenter_eip),
+	FIELDN(GUEST_S_CET, guest_s_cet),
+	FIELDN(GUEST_SSP, guest_ssp),
+	FIELDN(GUEST_INTR_SSP_TABLE, guest_ssp_tbl),
+	FIELDN(HOST_CR0, host_cr0),
+	FIELDN(HOST_CR3, host_cr3),
+	FIELDN(HOST_CR4, host_cr4),
+	FIELDN(HOST_FS_BASE, host_fs_base),
+	FIELDN(HOST_GS_BASE, host_gs_base),
+	FIELDN(HOST_TR_BASE, host_tr_base),
+	FIELDN(HOST_GDTR_BASE, host_gdtr_base),
+	FIELDN(HOST_IDTR_BASE, host_idtr_base),
+	FIELDN(HOST_IA32_SYSENTER_ESP, host_ia32_sysenter_esp),
+	FIELDN(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip),
+	FIELDN(HOST_RSP, host_rsp),
+	FIELDN(HOST_RIP, host_rip),
+	FIELDN(HOST_S_CET, host_s_cet),
+	FIELDN(HOST_SSP, host_ssp),
+	FIELDN(HOST_INTR_SSP_TABLE, host_ssp_tbl),
 };
 const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs_field_to_offset_table);
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 80232daf00ff..016896c9e701 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -115,7 +115,13 @@ struct __packed vmcs12 {
 	natural_width host_ia32_sysenter_eip;
 	natural_width host_rsp;
 	natural_width host_rip;
-	natural_width paddingl[8]; /* room for future expansion */
+	natural_width host_s_cet;
+	natural_width host_ssp;
+	natural_width host_ssp_tbl;
+	natural_width guest_s_cet;
+	natural_width guest_ssp;
+	natural_width guest_ssp_tbl;
+	natural_width paddingl[2]; /* room for future expansion */
 	u32 pin_based_vm_exec_control;
 	u32 cpu_based_vm_exec_control;
 	u32 exception_bitmap;
@@ -295,6 +301,12 @@ static inline void vmx_check_vmcs12_offsets(void)
 	CHECK_OFFSET(host_ia32_sysenter_eip, 656);
 	CHECK_OFFSET(host_rsp, 664);
 	CHECK_OFFSET(host_rip, 672);
+	CHECK_OFFSET(host_s_cet, 680);
+	CHECK_OFFSET(host_ssp, 688);
+	CHECK_OFFSET(host_ssp_tbl, 696);
+	CHECK_OFFSET(guest_s_cet, 704);
+	CHECK_OFFSET(guest_ssp, 712);
+	CHECK_OFFSET(guest_ssp_tbl, 720);
 	CHECK_OFFSET(pin_based_vm_exec_control, 744);
 	CHECK_OFFSET(cpu_based_vm_exec_control, 748);
 	CHECK_OFFSET(exception_bitmap, 752);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4ce61427ed49..d465ff990094 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7321,6 +7321,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 	cr4_fixed1_update(X86_CR4_PKE,        ecx, feature_bit(PKU));
 	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
 	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
+	cr4_fixed1_update(X86_CR4_CET,	      ecx, feature_bit(SHSTK));
 
 #undef cr4_fixed1_update
 }
@@ -7340,6 +7341,15 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 			vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
 		}
 	}
+
+	if (is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER |
+	    XFEATURE_MASK_CET_KERNEL)) {
+		vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_CET_STATE;
+		vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_CET_STATE;
+	} else {
+		vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_CET_STATE;
+		vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_CET_STATE;
+	}
 }
 
 static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
-- 
2.17.2

* [RESEND v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (7 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 08/11] KVM: VMX: Enable CET support for nested VM Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 21:29   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 10/11] KVM: x86: Add #CP support in guest exception dispatch Yang Weijiang
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

Dump the CET VMCS state for debugging purposes. Since CET kernel protection
is not yet enabled, the related host MSRs should be zero; if they are filled
by mistake, warn once upon detecting it, as their values will be lost.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d465ff990094..5d4250b9dec8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6056,6 +6056,12 @@ void dump_vmcs(void)
 		pr_err("InterruptStatus = %04x\n",
 		       vmcs_read16(GUEST_INTR_STATUS));
 
+	if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) {
+		pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET));
+		pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP));
+		pr_err("SSP TABLE = 0x%016lx\n",
+		       vmcs_readl(GUEST_INTR_SSP_TABLE));
+	}
 	pr_err("*** Host State ***\n");
 	pr_err("RIP = 0x%016lx  RSP = 0x%016lx\n",
 	       vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP));
@@ -6130,6 +6136,12 @@ void dump_vmcs(void)
 	if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID)
 		pr_err("Virtual processor ID = 0x%04x\n",
 		       vmcs_read16(VIRTUAL_PROCESSOR_ID));
+	if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) {
+		pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET));
+		pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP));
+		pr_err("SSP TABLE = 0x%016lx\n",
+		       vmcs_readl(HOST_INTR_SSP_TABLE));
+	}
 }
 
 /*
@@ -8205,6 +8217,7 @@ static __init int hardware_setup(void)
 	unsigned long host_bndcfgs;
 	struct desc_ptr dt;
 	int r, i, ept_lpage_level;
+	u64 cet_msr;
 
 	store_idt(&dt);
 	host_idt_base = dt.address;
@@ -8365,6 +8378,16 @@ static __init int hardware_setup(void)
 			return r;
 	}
 
+	if (boot_cpu_has(X86_FEATURE_IBT) || boot_cpu_has(X86_FEATURE_SHSTK)) {
+		rdmsrl(MSR_IA32_S_CET, cet_msr);
+		WARN_ONCE(cet_msr, "KVM: CET S_CET in host will be lost!\n");
+	}
+
+	if (boot_cpu_has(X86_FEATURE_SHSTK)) {
+		rdmsrl(MSR_IA32_PL0_SSP, cet_msr);
+		WARN_ONCE(cet_msr, "KVM: CET PL0_SSP in host will be lost!\n");
+	}
+
 	vmx_set_cpu_caps();
 
 	r = alloc_kvm_area();
-- 
2.17.2


* [RESEND v13 10/11] KVM: x86: Add #CP support in guest exception dispatch
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (8 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 21:29   ` Sean Christopherson
  2020-07-16  3:16 ` [RESEND v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace Yang Weijiang
  2020-07-22 19:48 ` [RESEND PATCH v13 00/11] Introduce support for guest CET feature Sean Christopherson
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

The CPU defines #CP (vector 21) to handle CET-induced exceptions. It is
accompanied by several error codes corresponding to the different CET
violation cases; see the SDM for a detailed description. The exception is
classified as a contributory exception w.r.t. #DF.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/uapi/asm/kvm.h | 1 +
 arch/x86/kvm/x86.c              | 1 +
 arch/x86/kvm/x86.h              | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 0780f97c1850..fb33cacc8935 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -31,6 +31,7 @@
 #define MC_VECTOR 18
 #define XM_VECTOR 19
 #define VE_VECTOR 20
+#define CP_VECTOR 21
 
 /* Select x86 specific features in <linux/kvm.h> */
 #define __KVM_HAVE_PIT
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c71a9ceac05e..76892fb0b0a0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -405,6 +405,7 @@ static int exception_class(int vector)
 	case NP_VECTOR:
 	case SS_VECTOR:
 	case GP_VECTOR:
+	case CP_VECTOR:
 		return EXCPT_CONTRIBUTORY;
 	default:
 		break;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 6eb62e97e59f..53a92fc5f065 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -115,7 +115,7 @@ static inline bool x86_exception_has_error_code(unsigned int vector)
 {
 	static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
 			BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
-			BIT(PF_VECTOR) | BIT(AC_VECTOR);
+			BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);
 
 	return (1U << vector) & exception_has_error_code;
 }
-- 
2.17.2


* [RESEND v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (9 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 10/11] KVM: x86: Add #CP support in guest exception dispatch Yang Weijiang
@ 2020-07-16  3:16 ` Yang Weijiang
  2020-07-22 21:33   ` Sean Christopherson
  2020-07-22 19:48 ` [RESEND PATCH v13 00/11] Introduce support for guest CET feature Sean Christopherson
  11 siblings, 1 reply; 24+ messages in thread
From: Yang Weijiang @ 2020-07-16  3:16 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

Set the feature bits so that CET capabilities can be seen in the guest via
CPUID enumeration. Add CR4.CET bit support in order to allow the guest to
set the CET master control bit (CR4.CET).

Disable the KVM CET feature if unrestricted_guest is turned off because KVM
cannot emulate guest CET behavior well in that case.

Don't expose the CET feature if the dependent CET bits are cleared in host
XSS.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/cpuid.c            |  5 +++--
 arch/x86/kvm/vmx/vmx.c          |  5 +++++
 arch/x86/kvm/x86.c              | 11 +++++++++++
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e8c749596ba2..c4c82db68b6a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -99,7 +99,8 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+			  | X86_CR4_CET))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index d97b2a6e8a8c..a085b8c57f34 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -340,7 +340,8 @@ void kvm_set_cpu_caps(void)
 		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
 		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
 		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
-		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
+		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
+		F(SHSTK)
 	);
 	/* Set LA57 based on hardware capability. */
 	if (cpuid_ecx(7) & F(LA57))
@@ -356,7 +357,7 @@ void kvm_set_cpu_caps(void)
 	kvm_cpu_cap_mask(CPUID_7_EDX,
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
 		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
-		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM)
+		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(IBT)
 	);
 
 	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5d4250b9dec8..31593339b6fe 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7542,6 +7542,11 @@ static __init void vmx_set_cpu_caps(void)
 
 	if (vmx_waitpkg_supported())
 		kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
+
+	if (!enable_unrestricted_guest) {
+		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+		kvm_cpu_cap_clear(X86_FEATURE_IBT);
+	}
 }
 
 static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76892fb0b0a0..c7393d62ad72 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9808,10 +9808,21 @@ int kvm_arch_hardware_setup(void *opaque)
 	if (kvm_cpu_cap_has(X86_FEATURE_XSAVES))
 		supported_xss = host_xss & KVM_SUPPORTED_XSS;
 
+	if (!(supported_xss & (XFEATURE_MASK_CET_USER |
+	    XFEATURE_MASK_CET_KERNEL))) {
+		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+		kvm_cpu_cap_clear(X86_FEATURE_IBT);
+	}
+
 #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
 	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
 #undef __kvm_cpu_cap_has
 
+	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
+		supported_xss &= ~(XFEATURE_MASK_CET_USER |
+				   XFEATURE_MASK_CET_KERNEL);
+
 	if (kvm_has_tsc_control) {
 		/*
 		 * Make sure the user can only configure tsc_khz values that
-- 
2.17.2


* Re: [RESEND PATCH v13 00/11] Introduce support for guest CET feature
  2020-07-16  3:16 [RESEND PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (10 preceding siblings ...)
  2020-07-16  3:16 ` [RESEND v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace Yang Weijiang
@ 2020-07-22 19:48 ` Sean Christopherson
  2020-07-23  3:17   ` Yang Weijiang
  11 siblings, 1 reply; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 19:48 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:16AM +0800, Yang Weijiang wrote:
> Control-flow Enforcement Technology (CET) provides protection against
> Return/Jump-Oriented Programming (ROP/JOP) attack. There're two CET
> sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
> SHSTK is to prevent ROP programming and IBT is to prevent JOP programming.
> 
> Several parts in KVM have been updated to provide VM CET support, including:
> CPUID/XSAVES config, MSR pass-through, user space MSR access interface, 
> vmentry/vmexit config, nested VM etc. These patches have dependency on CET
> kernel patches for xsaves support and CET definitions, e.g., MSR and related
> feature flags.
> 
> CET kernel patches are here:
> https://lkml.kernel.org/r/20200429220732.31602-1-yu-cheng.yu@intel.com
> 
> v13:
> - Added CET definitions as a separate patch to facilitate KVM test.

What I actually want to do is pull in the actual kernel patches themselves so
that we can upstream KVM support without having to wait for the kernel to
sort out the ABI, which seems like it's going to drag on.

I was thinking that we'd only need the MSR/CR4/CPUID definitions, but forgot
that KVM also needs XSAVES context switching, so it's not as simple as I was
thinking.  It's still relatively simple, but it means there would be
functional changes in the kernel.

I'll respond to the main SSP series to pose the question of taking the two
small-ish kernel patches through the KVM tree.

>  arch/x86/include/asm/kvm_host.h      |   4 +-
>  arch/x86/include/asm/vmx.h           |   8 +
>  arch/x86/include/uapi/asm/kvm.h      |   1 +
>  arch/x86/include/uapi/asm/kvm_para.h |   7 +-
>  arch/x86/kvm/cpuid.c                 |  28 ++-
>  arch/x86/kvm/vmx/capabilities.h      |   5 +
>  arch/x86/kvm/vmx/nested.c            |  34 ++++
>  arch/x86/kvm/vmx/vmcs12.c            | 267 ++++++++++++++++-----------
>  arch/x86/kvm/vmx/vmcs12.h            |  14 +-
>  arch/x86/kvm/vmx/vmx.c               | 262 +++++++++++++++++++++++++-
>  arch/x86/kvm/x86.c                   |  53 +++++-
>  arch/x86/kvm/x86.h                   |   2 +-
>  include/linux/kvm_host.h             |  32 ++++
>  13 files changed, 590 insertions(+), 127 deletions(-)

I have quite a few comments/changes (will respond to individual patches),
but have done all the updates/rework and, assuming I haven't broken things,
we're nearing the point where I can carry this and push it past the finish
line, e.g. get acks from tip/x86 maintainers for the kernel patches and
send a pull request to Paolo.

I pushed the result to:

  https://github.com/sean-jc/linux/releases/tag/kvm-cet-v14-rc1

Can you please review and test?  If everything looks good, I'll post v14.
If not, I'll work offline with you to get it into shape.

Thanks!

* Re: [RESEND v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags
  2020-07-16  3:16 ` [RESEND v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags Yang Weijiang
@ 2020-07-22 19:48   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 19:48 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:18AM +0800, Yang Weijiang wrote:
> CET (Control-flow Enforcement Technology) is a CPU feature used to prevent
> Return/Jump-Oriented Programming (ROP/JOP) attacks. It provides the following
> sub-features to defend against ROP/JOP-style control-flow subversion attacks:
> 
> Shadow Stack (SHSTK):
>   A second stack for a program, used exclusively for control transfer
>   operations.
> 
> Indirect Branch Tracking (IBT):
>   Code branching protection to defend against jump/call oriented programming.
> 
> Several new CET MSRs are defined in kernel to support CET:
>   MSR_IA32_{U,S}_CET: Controls the CET settings for user mode and kernel mode
>   respectively.
> 
>   MSR_IA32_PL{0,1,2,3}_SSP: Stores shadow stack pointers for CPL-0,1,2,3
>   protection respectively.
> 
>   MSR_IA32_INT_SSP_TAB: Stores base address of shadow stack pointer table.
> 
> Two XSAVES state bits are introduced for CET:
>   IA32_XSS:[bit 11]: Control saving/restoring user mode CET states
>   IA32_XSS:[bit 12]: Control saving/restoring kernel mode CET states.
> 
> Six VMCS fields are introduced for CET:
>   {HOST,GUEST}_S_CET: Stores CET settings for kernel mode.
>   {HOST,GUEST}_SSP: Stores shadow stack pointer of current task/thread.
>   {HOST,GUEST}_INTR_SSP_TABLE: Stores base address of shadow stack pointer
>   table.
> 
> If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET states are restored from below
> VMCS fields at VM-Exit:
>   HOST_S_CET
>   HOST_SSP
>   HOST_INTR_SSP_TABLE
> 
> If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET states are loaded from below
> VMCS fields at VM-Entry:
>   GUEST_S_CET
>   GUEST_SSP
>   GUEST_INTR_SSP_TABLE

No changes to the patch itself, but I tweaked the formatting of the changelog
a bit and expanded the introduction for SHSTK and IBT to provide a bit more
background.

* Re: [RESEND v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration
  2020-07-16  3:16 ` [RESEND v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Yang Weijiang
@ 2020-07-22 20:14   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 20:14 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:19AM +0800, Yang Weijiang wrote:
> CET MSRs are passed through to the guest directly to enhance performance.
> CET runtime control settings are stored in MSR_IA32_{U,S}_CET, shadow stack
> pointers (SSPs) are stored in MSR_IA32_PL{0,1,2,3}_SSP, and the SSP table
> base address is stored in MSR_IA32_INT_SSP_TAB; these MSRs are defined in
> the kernel and re-used here.
> 
> MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user-mode protection, and
> their contents are switched between threads during scheduling, so it makes
> sense to pass them through so that the guest kernel can use xsaves/xrstors
> to operate on them efficiently. The other MSRs are used for non-user-mode
> protection. See the SDM for detailed info.
> 
> The difference between the CET VMCS fields and the CET MSRs is that the
> former are used during VM-Enter/VM-Exit, whereas the latter provide CET
> state storage across task/thread scheduling.

I moved this patch to after CET virtualization is enabled.  Functionally,
KVM should be able to virtualize CET without allowing the guest to access
the MSRs directly.  Moving this after CET enabling will allow bisecting this
specific patch, e.g. if there's a problem with pass-through but not basic
support, or vice versa, and will also allow better testing of the MSR access
code.  At least, that's the theory.
 
> Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 46 ++++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c     |  3 +++
>  2 files changed, 49 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 13745f2a5ecd..a9f135c52cbc 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -3126,6 +3126,13 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long pgd)
>  		vmcs_writel(GUEST_CR3, guest_cr3);
>  }
>  
> +static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)

s/xss_states/xss_state, i.e. make it singular instead of plural to match the
function name, and because the below check is at best ambiguous for multiple
states, e.g. it arguably should be ((supported_xss & xss_states) == xss_states).

> +{
> +	return ((supported_xss & xss_states) &&

Nit, the () around the entire statement is unnecessary.

Checking supported_xss is incorrect, this needs to check guest_supported_xss.

> +		(guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> +		guest_cpuid_has(vcpu, X86_FEATURE_IBT)));

Nit, please align inner statements, e.g. so it looks like:

               (guest_cpuid_has(...) ||
                guest_cpuid_has(...)))
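
All of the above folded together (untested, and assuming
vcpu->arch.guest_supported_xss is plumbed in as suggested):

static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_state)
{
	/* The state must be XSAVES-managed and CET must be exposed. */
	return (vcpu->arch.guest_supported_xss & xss_state) &&
	       (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
		guest_cpuid_has(vcpu, X86_FEATURE_IBT));
}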

> +}
> +
>  int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -7230,6 +7237,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
>  		vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
>  }
>  
> +static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;

Naming the local 'bitmap' will avoid wrapping lines.  Everything except
INT_SSP_TAB fits under 80 chars, and for that one it's ok to poke out a bit.

> +	bool incpt;
> +
> +	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);

> +	/*
> +	 * U_CET is required for USER CET, and U_CET, PL3_SSP are bound as
> +	 * one component and controlled by IA32_XSS[bit 11].
> +	 */
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW,
> +				  incpt);
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW,
> +				  incpt);

This is wrong, PL3_SSP should be intercepted if IBT is supported but SHSTK is
not.  Weird XSAVES virtualization hole aside, we need to be consistent with
the emulation of the MSRs.
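
E.g. something like this (a sketch, using the 'bitmap' local suggested
above):

	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);
	vmx_set_intercept_for_msr(bitmap, MSR_IA32_U_CET, MSR_TYPE_RW, incpt);

	/* PL3_SSP is SHSTK-only state, keep intercepting it if SHSTK
	 * isn't exposed to the guest, even when IBT is. */
	incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
	vmx_set_intercept_for_msr(bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt);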

> +
> +	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL);
> +	/*
> +	 * S_CET is required for KERNEL CET, and PL0_SSP ... PL2_SSP are
> +	 * bound as one component and controlled by IA32_XSS[bit 12].
> +	 */
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW,
> +				  incpt);
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW,
> +				  incpt);
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW,
> +				  incpt);
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW,
> +				  incpt);
> +
> +	incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> +	/* SSP_TAB is only available for KERNEL SHSTK.*/
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW,
> +				  incpt);
> +}
> +
>  static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -7268,6 +7311,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>  			vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
>  		}
>  	}
> +
> +	if (supported_xss & (XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_CET_USER))

Given that the proposed kernel support bundles USER and KERNEL together, we
can simplify the KVM implementation by adding kvm_cet_supported(), with a
similar implementation to MPX:

static inline bool kvm_cet_supported(void)
{
	const u64 mask = XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL;

	return (supported_xss & mask) == mask;
}

> +		vmx_update_intercept_for_cet_msr(vcpu);
>  }
>  
>  static __init void vmx_set_cpu_caps(void)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 88c593f83b28..ea8a9dc9fbad 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -184,6 +184,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
>  				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
>  				| XFEATURE_MASK_PKRU)
>  
> +#define KVM_SUPPORTED_XSS       (XFEATURE_MASK_CET_USER | \
> +				 XFEATURE_MASK_CET_KERNEL)

Xiaoyao called out that this belongs in a later patch, which it does, but we
can actually do away with it entirely.  Because CET is dependent on multiple
features and feature flags, we can't do a straight mask in any case, i.e.
having KVM_SUPPORTED_XSS doesn't add value as VMX needs to manually set the
CET flags anyways.

>  u64 __read_mostly host_efer;
>  EXPORT_SYMBOL_GPL(host_efer);
>  
> -- 
> 2.17.2
> 

* Re: [RESEND v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing
  2020-07-16  3:16 ` [RESEND v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing Yang Weijiang
@ 2020-07-22 20:31   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 20:31 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:20AM +0800, Yang Weijiang wrote:
> CR4.CET is the master control bit for CET. There are mutual constraints
> between CR0.WP and CR4.CET, so the dependent bit needs to be checked when
> changing either control register.
> 
> The processor does not allow CR4.CET to be set if CR0.WP = 0; similarly, it
> does not allow CR0.WP to be cleared while CR4.CET = 1. In either case, KVM
> injects #GP into the guest.
> 
> CET state load bit is set/cleared along with CR4.CET bit set/clear.
> 
> Note:
> SHSTK and IBT share one control MSR, MSR_IA32_{U,S}_CET, which means it's
> difficult to hide one feature from the other in the case of SHSTK != IBT.
> After discussion in the community, it was agreed to let the guest control
> the two features independently, as doing so doesn't introduce a security
> hole.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/vmx/capabilities.h |  5 +++++
>  arch/x86/kvm/vmx/vmx.c          | 30 ++++++++++++++++++++++++++++--
>  arch/x86/kvm/x86.c              |  3 +++
>  3 files changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index 4bbd8b448d22..dbc87c5997cc 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -103,6 +103,11 @@ static inline bool cpu_has_load_perf_global_ctrl(void)
>  	       (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
>  }
>  
> +static inline bool cpu_has_load_cet_ctrl(void)
> +{
> +	return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE) &&
> +		(vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_CET_STATE);

Nit, please align indentation.

> +}
>  static inline bool cpu_has_vmx_mpx(void)
>  {
>  	return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_BNDCFGS) &&
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a9f135c52cbc..0089943fbb31 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2510,7 +2510,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  	      VM_EXIT_LOAD_IA32_EFER |
>  	      VM_EXIT_CLEAR_BNDCFGS |
>  	      VM_EXIT_PT_CONCEAL_PIP |
> -	      VM_EXIT_CLEAR_IA32_RTIT_CTL;
> +	      VM_EXIT_CLEAR_IA32_RTIT_CTL |
> +	      VM_EXIT_LOAD_CET_STATE;
>  	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
>  				&_vmexit_control) < 0)
>  		return -EIO;
> @@ -2534,7 +2535,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>  	      VM_ENTRY_LOAD_IA32_EFER |
>  	      VM_ENTRY_LOAD_BNDCFGS |
>  	      VM_ENTRY_PT_CONCEAL_PIP |
> -	      VM_ENTRY_LOAD_IA32_RTIT_CTL;
> +	      VM_ENTRY_LOAD_IA32_RTIT_CTL |
> +	      VM_ENTRY_LOAD_CET_STATE;
>  	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
>  				&_vmentry_control) < 0)
>  		return -EIO;
> @@ -3133,6 +3135,12 @@ static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)
>  		guest_cpuid_has(vcpu, X86_FEATURE_IBT)));
>  }
>  
> +static bool is_cet_supported(struct kvm_vcpu *vcpu)
> +{
> +	return is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER |
> +				      XFEATURE_MASK_CET_KERNEL);
> +}
> +
>  int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -3173,6 +3181,10 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>  			return 1;
>  	}
>  
> +	if ((cr4 & X86_CR4_CET) && (!is_cet_supported(vcpu) ||

This is the wrong way to check CR4.CET support, and is in the wrong place.
The correct way to check support is to teach __cr4_reserved_bits() to mark
CET reserved if SHSTK and IBT are not supported.

It's the wrong place because the CR4.CET vs. CR0.WP check should _not_ be
done for nested transitions, which call vmx_set_cr*() directly precisely
to avoid ordering problems.
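
E.g. a minimal sketch of the __cr4_reserved_bits() change, following the
shape of the existing feature checks in x86.h:

	if (!__cpu_has(__c, X86_FEATURE_SHSTK) &&	\
	    !__cpu_has(__c, X86_FEATURE_IBT))		\
		__reserved_bits |= X86_CR4_CET;		\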

> +	    !(kvm_read_cr0(vcpu) & X86_CR0_WP)))
> +		return 1;
> +
>  	if (vmx->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
>  		return 1;
>  
> @@ -3204,6 +3216,20 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>  			hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
>  	}
>  
> +	if (cpu_has_load_cet_ctrl()) {
> +		if ((hw_cr4 & X86_CR4_CET) && is_cet_supported(vcpu)) {
> +			vm_entry_controls_setbit(to_vmx(vcpu),
> +						 VM_ENTRY_LOAD_CET_STATE);
> +			vm_exit_controls_setbit(to_vmx(vcpu),
> +						VM_EXIT_LOAD_CET_STATE);
> +		} else {
> +			vm_entry_controls_clearbit(to_vmx(vcpu),
> +						   VM_ENTRY_LOAD_CET_STATE);
> +			vm_exit_controls_clearbit(to_vmx(vcpu),
> +						  VM_EXIT_LOAD_CET_STATE);
> +		}
> +	}

Disabling the load based on CR4.CET is incorrect.  The MSRs are exposed to
the guest irrespective of CR4.CET, which means that _not_ loading guest
state will allow the guest to read the host values, i.e. will leak host
data to the guest.

On the other hand, we do need to explicitly clear the controls in
setup_vmcs_config() if only one is supported, otherwise it'd be possible to
load guest state on VM-Enter but not restore host state on VM-Exit.  That
shouldn't happen on real silicon, but is technically a legal configuration
and therefore theoretically possible for a VM.
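
E.g. at the end of setup_vmcs_config(), something like (sketch):

	/*
	 * CET state must be loaded on VM-Enter and restored on VM-Exit as
	 * a pair, otherwise guest state could be loaded without restoring
	 * host state.
	 */
	if (!(_vmentry_control & VM_ENTRY_LOAD_CET_STATE) ||
	    !(_vmexit_control & VM_EXIT_LOAD_CET_STATE)) {
		_vmentry_control &= ~VM_ENTRY_LOAD_CET_STATE;
		_vmexit_control &= ~VM_EXIT_LOAD_CET_STATE;
	}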

> +
>  	vmcs_writel(CR4_READ_SHADOW, cr4);
>  	vmcs_writel(GUEST_CR4, hw_cr4);
>  	return 0;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ea8a9dc9fbad..906e07039d59 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -815,6 +815,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
>  	if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
>  		return 1;
>  
> +	if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET))
> +		return 1;
> +
>  	kvm_x86_ops.set_cr0(vcpu, cr0);
>  
>  	if ((cr0 ^ old_cr0) & X86_CR0_PG) {
> -- 
> 2.17.2
> 

* Re: [RESEND v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits
  2020-07-16  3:16 ` [RESEND v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits Yang Weijiang
@ 2020-07-22 20:32   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 20:32 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:21AM +0800, Yang Weijiang wrote:
> CPUID(0xd, 1) reports the current required storage size of XCR0 | XSS.
> When the guest updates XSS, it's necessary to update the CPUID leaf;
> otherwise the guest will fetch the old state size, which results in WARN
> traces while the guest is running.
> 
> supported_xss is initialized to host_xss & KVM_SUPPORTED_XSS to indicate
> the MSR_IA32_XSS bits currently supported in KVM, but the actual XSS bits
> seen by the guest depend on the setting of CPUID(0xd,1).{ECX, EDX} for the
> guest.
> 
> Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/cpuid.c            | 23 +++++++++++++++++++----
>  arch/x86/kvm/x86.c              | 12 ++++++++----
>  3 files changed, 28 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index be5363b21540..e8c749596ba2 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -654,6 +654,7 @@ struct kvm_vcpu_arch {
>  
>  	u64 xcr0;
>  	u64 guest_supported_xcr0;
> +	u64 guest_supported_xss;
>  
>  	struct kvm_pio_request pio;
>  	void *pio_data;
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 8a294f9747aa..d97b2a6e8a8c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -88,14 +88,29 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>  		vcpu->arch.guest_supported_xcr0 = 0;
>  	} else {
>  		vcpu->arch.guest_supported_xcr0 =
> -			(best->eax | ((u64)best->edx << 32)) & supported_xcr0;
> +			(((u64)best->edx << 32) | best->eax) & supported_xcr0;

While I don't necessarily disagree with the change, it doesn't belong in
this patch.

>  		best->ebx = xstate_required_size(vcpu->arch.xcr0, false);
>  	}
>  
>  	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
> -	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
> -		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
> -		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
> +	if (best) {
> +		if (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
> +		    cpuid_entry_has(best, X86_FEATURE_XSAVEC))  {
> +			u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss;
> +
> +			best->ebx = xstate_required_size(xstate, true);
> +		}
> +
> +		if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) {
> +			best->ecx = 0;
> +			best->edx = 0;
> +		}
> +		vcpu->arch.guest_supported_xss =
> +			(((u64)best->edx << 32) | best->ecx) & supported_xss;
> +
> +	} else {
> +		vcpu->arch.guest_supported_xss = 0;
> +	}
>  
>  	/*
>  	 * The existing code assumes virtual address is 48-bit or 57-bit in the
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 906e07039d59..8aed32ff9c0c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2912,9 +2912,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		 * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
>  		 * XSAVES/XRSTORS to save/restore PT MSRs.
>  		 */
> -		if (data & ~supported_xss)
> +		if (data & ~vcpu->arch.guest_supported_xss)
>  			return 1;
> -		vcpu->arch.ia32_xss = data;
> +		if (vcpu->arch.ia32_xss != data) {
> +			vcpu->arch.ia32_xss = data;
> +			kvm_update_cpuid(vcpu);
> +		}
>  		break;
>  	case MSR_SMI_COUNT:
>  		if (!msr_info->host_initiated)
> @@ -9779,8 +9782,9 @@ int kvm_arch_hardware_setup(void *opaque)
>  
>  	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
>  
> -	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
> -		supported_xss = 0;
> +	supported_xss = 0;
> +	if (kvm_cpu_cap_has(X86_FEATURE_XSAVES))
> +		supported_xss = host_xss & KVM_SUPPORTED_XSS;

I'm updating supported_xss in the actual enabling patch instead.

>  
>  #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
>  	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
> -- 
> 2.17.2
> 

* Re: [RESEND v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES
  2020-07-16  3:16 ` [RESEND v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES Yang Weijiang
@ 2020-07-22 20:32   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 20:32 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:22AM +0800, Yang Weijiang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> A handful of CET MSRs are not context switched through "traditional"
> methods, e.g. VMCS or manual switching, but rather are passed through
> to the guest and are saved and restored by XSAVES/XRSTORS, i.e. in the
> guest's FPU state.
> 
> Load the guest's FPU state if userspace is accessing MSRs whose values
> are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(),
> can simply do {RD,WR}MSR to access the guest's value.
> 
> Note that guest_cpuid_has() is not queried as host userspace is allowed
> to access MSRs that have not been exposed to the guest, e.g. it might do
> KVM_SET_MSRS prior to KVM_SET_CPUID2.

No comments on the patch itself.  Added a blurb to the changelog to call
out that the vcpu==NULL case is possible due to KVM_GET_MSRS also being a
device-scope ioctl().
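
For reference, the gist of the mechanism in __msr_io() (paraphrased
sketch, where is_xsaves_msr() is the new helper from the patch):

	bool fpu_loaded = false;
	int i;

	for (i = 0; i < msrs->nmsrs; ++i) {
		/* Load the guest FPU on demand, at most once per ioctl(). */
		if (vcpu && !fpu_loaded && supported_xss &&
		    is_xsaves_msr(entries[i].index)) {
			kvm_load_guest_fpu(vcpu);
			fpu_loaded = true;
		}
		if (do_msr(vcpu, entries[i].index, &entries[i].data))
			break;
	}
	if (fpu_loaded)
		kvm_put_guest_fpu(vcpu);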

> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Co-developed-by: Yang Weijiang <weijiang.yang@intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>

* Re: [RESEND v13 07/11] KVM: x86: Add userspace access interface for CET MSRs
  2020-07-16  3:16 ` [RESEND v13 07/11] KVM: x86: Add userspace access interface for CET MSRs Yang Weijiang
@ 2020-07-22 20:54   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 20:54 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:23AM +0800, Yang Weijiang wrote:
> Guest CET state is stored in two different places: state managed with
> XSAVES/XRSTORS, as saved/restored in the previous patch, can be read/written
> directly via the MSRs, while state stored in VMCS fields is accessed via
> vmcs_read/vmcs_write.
> 
> To correctly read/write the CET MSRs, it's necessary to check whether a
> kernel FPU context switch has happened and reload the guest FPU context if
> needed.
> 
> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/include/uapi/asm/kvm_para.h |   7 +-
>  arch/x86/kvm/vmx/vmx.c               | 148 +++++++++++++++++++++++++++
>  arch/x86/kvm/x86.c                   |   4 +
>  3 files changed, 156 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index 812e9b4c1114..2d3422dc4c81 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -47,12 +47,13 @@
>  /* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */
>  #define MSR_KVM_WALL_CLOCK_NEW  0x4b564d00
>  #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
> -#define MSR_KVM_ASYNC_PF_EN 0x4b564d02
> -#define MSR_KVM_STEAL_TIME  0x4b564d03
> -#define MSR_KVM_PV_EOI_EN      0x4b564d04
> +#define MSR_KVM_ASYNC_PF_EN	0x4b564d02
> +#define MSR_KVM_STEAL_TIME	0x4b564d03
> +#define MSR_KVM_PV_EOI_EN	0x4b564d04

Again, not a bad change, but doesn't belong in this patch/series.

>  #define MSR_KVM_POLL_CONTROL	0x4b564d05
>  #define MSR_KVM_ASYNC_PF_INT	0x4b564d06
>  #define MSR_KVM_ASYNC_PF_ACK	0x4b564d07
> +#define MSR_KVM_GUEST_SSP	0x4b564d08

Adding the synthetic MSR should be a separate patch, both for bisection and
to provide the justification for adding the synthetic MSR (and to call out
that it's host VMM only).

>  struct kvm_steal_time {
>  	__u64 steal;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 0089943fbb31..4ce61427ed49 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1819,6 +1819,94 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
>  	}
>  }
>  
> +static void vmx_get_xsave_msr(struct msr_data *msr_info)
> +{
> +	local_irq_disable();
> +	if (test_thread_flag(TIF_NEED_FPU_LOAD))
> +		switch_fpu_return();
> +	rdmsrl(msr_info->index, msr_info->data);
> +	local_irq_enable();
> +}
> +
> +static void vmx_set_xsave_msr(struct msr_data *msr_info)
> +{
> +	local_irq_disable();
> +	if (test_thread_flag(TIF_NEED_FPU_LOAD))
> +		switch_fpu_return();
> +	wrmsrl(msr_info->index, msr_info->data);
> +	local_irq_enable();
> +}
> +
> +#define CET_MSR_RSVD_BITS_1  GENMASK(2, 0)
> +#define CET_MSR_RSVD_BITS_2  GENMASK(9, 6)

Rather than use #defines, we can use a single case statement to perform
the reserved bit checks for U_CET/S_CET and all the SSP MSRs.  I'm not
totally against using macros, but to do so we really need informative
names, e.g. RSVD_BITS_1/2 are way too arbitrary.  And coming up with names
is always hard, so it seems easier to avoid the issue entirely :-).

> +static bool cet_check_msr_valid(struct kvm_vcpu *vcpu,
> +				struct msr_data *msr, u64 rsvd_bits)

For me, "check" is still ambiguous with respect to the return.  It's also
longer than "is", e.g. changing these to cet_is_msr_{valid,accessible}()
shortens lines and avoids a few wraps.

> +{
> +	u64 data = msr->data;
> +	u32 index = msr->index;
> +
> +	if ((index == MSR_IA32_PL0_SSP || index == MSR_IA32_PL1_SSP ||
> +	    index == MSR_IA32_PL2_SSP || index == MSR_IA32_PL3_SSP ||
> +	    index == MSR_IA32_INT_SSP_TAB || index == MSR_KVM_GUEST_SSP) &&
> +	    is_noncanonical_address(data, vcpu))
> +		return false;
> +
> +	if ((index  == MSR_IA32_S_CET || index == MSR_IA32_U_CET) &&
> +	    data & MSR_IA32_CET_ENDBR_EN) {

I'm pretty sure conditioning the canonical check on ENDBR_EN is wrong.  The
SDM is ambiguous, but I peeked at internal simulator code and it performs
the check regardless of ENDBR_EN.  Can you double check on silicon?

> +		u64 bitmap_base = data >> 12;
> +
> +		if (is_noncanonical_address(bitmap_base, vcpu))

This is wrong.  The canonical check needs to be performed on the unshifted
value.

Putting this together with the above check, this whole function boils down
to:

	return !(data & rsvd_bits) && !is_noncanonical_address(data, vcpu);

At that point, I don't see much value in a separate helper as the entire
check can be squeezed onto a single line.  It'll poke out a few chars, but
I think that's ok.

> +			return false;
> +	}
> +
> +	return !(data & rsvd_bits);
> +}
> +
> +static bool cet_check_ssp_msr_accessible(struct kvm_vcpu *vcpu,
> +					 struct msr_data *msr)
> +{
> +	u32 index = msr->index;
> +
> +	if (!boot_cpu_has(X86_FEATURE_SHSTK))

This is pointless, we need a full kvm_cet_supported() check, e.g. if CET
is supported by the kernel but not KVM.  That exists as the supported_xss
check below, but _that_ check actually needs to be against
vcpu->arch.guest_supported_xss.

> +		return false;
> +
> +	if (!msr->host_initiated &&

If the supported_xss/kvm_cet_supported() check is hoisted up, then the
host_initiated case can be short-circuited immediately.

With some further massaging, this becomes:

	u64 mask;

	if (!kvm_cet_supported())
		return false;

	if (msr->host_initiated)
		return true;

	if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
	    msr->index == MSR_KVM_GUEST_SSP)
		return false;

	if (msr->index == MSR_IA32_INT_SSP_TAB)
		return true;

	mask = (msr->index == MSR_IA32_PL3_SSP) ? XFEATURE_MASK_CET_USER :
						  XFEATURE_MASK_CET_KERNEL;
	return !!(vcpu->arch.guest_supported_xss & mask);

I think that gets all the cases correct?  The INT_SSP_TAB without CET_USER
or CET_KERNEL is weird, but that'd be an unusable model so it's probably not
worth spending much time on what's the least insane approach.

> +	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> +		return false;
> +
> +	if (index == MSR_KVM_GUEST_SSP)
> +		return msr->host_initiated &&
> +		       guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> +
> +	if (index == MSR_IA32_INT_SSP_TAB)
> +		return true;
> +
> +	if (index == MSR_IA32_PL3_SSP)
> +		return supported_xss & XFEATURE_MASK_CET_USER;
> +
> +	return supported_xss & XFEATURE_MASK_CET_KERNEL;
> +}
> +
> +static bool cet_check_ctl_msr_accessible(struct kvm_vcpu *vcpu,
> +					 struct msr_data *msr)
> +{
> +	u32 index = msr->index;
> +
> +	if (!boot_cpu_has(X86_FEATURE_SHSTK) &&
> +	    !boot_cpu_has(X86_FEATURE_IBT))
> +		return false;
> +
> +	if (!msr->host_initiated &&
> +	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
> +	    !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
> +		return false;
> +
> +	if (index == MSR_IA32_U_CET)
> +		return supported_xss & XFEATURE_MASK_CET_USER;
> +
> +	return supported_xss & XFEATURE_MASK_CET_KERNEL;

Same comments as above.

> +}
>  /*
>   * Reads an msr value (of 'msr_index') into 'pdata'.
>   * Returns 0 on success, non-0 otherwise.
> @@ -1951,6 +2039,31 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		else
>  			msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
>  		break;
> +	case MSR_KVM_GUEST_SSP:
> +		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		msr_info->data = vmcs_readl(GUEST_SSP);
> +		break;
> +	case MSR_IA32_S_CET:
> +		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		msr_info->data = vmcs_readl(GUEST_S_CET);
> +		break;
> +	case MSR_IA32_INT_SSP_TAB:
> +		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> +		break;
> +	case MSR_IA32_U_CET:
> +		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		vmx_get_xsave_msr(msr_info);
> +		break;
> +	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> +		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		vmx_get_xsave_msr(msr_info);
> +		break;
>  	case MSR_TSC_AUX:
>  		if (!msr_info->host_initiated &&
>  		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> @@ -2221,6 +2334,41 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		else
>  			vmx->pt_desc.guest.addr_a[index / 2] = data;
>  		break;
> +	case MSR_KVM_GUEST_SSP:
> +		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1))
> +			return 1;
> +		vmcs_writel(GUEST_SSP, data);
> +		break;
> +	case MSR_IA32_S_CET:
> +		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2))
> +			return 1;
> +		vmcs_writel(GUEST_S_CET, data);
> +		break;
> +	case MSR_IA32_INT_SSP_TAB:
> +		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		if (!cet_check_msr_valid(vcpu, msr_info, 0))
> +			return 1;
> +		vmcs_writel(GUEST_INTR_SSP_TABLE, data);
> +		break;
> +	case MSR_IA32_U_CET:
> +		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2))
> +			return 1;
> +		vmx_set_xsave_msr(msr_info);
> +		break;
> +	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> +		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
> +			return 1;
> +		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1))
> +			return 1;
> +		vmx_set_xsave_msr(msr_info);

There are essentially three groups: S_CET/U_CET, INT_SSP_TAB, and all the SSP
MSRs.  If we group them together, then the reserved bit and canonical checks
naturally get combined.  It requires an extra check to direct to the VMCS vs.
XSAVE state, but this isn't a fast path and it cuts down on the duplicate code
without having to add more utility functions.  E.g.

	case MSR_IA32_S_CET:
	case MSR_IA32_U_CET:
		if (!cet_is_control_msr_accessible(vcpu, msr_info))
			return 1;
		if ((data & GENMASK(9, 6)) || is_noncanonical_address(data, vcpu))
			return 1;
		if (msr_index == MSR_IA32_S_CET)
			vmcs_writel(GUEST_S_CET, data);
		else
			vmx_set_xsave_msr(msr_info);
		break;
	case MSR_IA32_INT_SSP_TAB:
		if (!cet_is_control_msr_accessible(vcpu, msr_info))
			return 1;
		if (is_noncanonical_address(data, vcpu))
			return 1;
		vmcs_writel(GUEST_INTR_SSP_TABLE, data);
		break;
	case MSR_KVM_GUEST_SSP:
	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
		if (!cet_is_ssp_msr_accessible(vcpu, msr_info))
			return 1;
		if ((data & GENMASK(2, 0)) || is_noncanonical_address(data, vcpu))
			return 1;
		if (msr_index == MSR_KVM_GUEST_SSP)
			vmcs_writel(GUEST_SSP, data);
		else
			vmx_set_xsave_msr(msr_info);
		break;


> +		break;
>  	case MSR_TSC_AUX:
>  		if (!msr_info->host_initiated &&
>  		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c437ddc22ad6..c71a9ceac05e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1234,6 +1234,10 @@ static const u32 msrs_to_save_all[] = {
>  	MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
>  	MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
>  	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
> +
> +	MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_S_CET,
> +	MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP,
> +	MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, MSR_KVM_GUEST_SSP,
>  };

It's somewhat arbitrary, but I think it makes sense to report the MSRs as
to-be-saved in a separate patch.  The XSS change definitely should be its
own patch.  Then, with the synthetic MSR as a separate patch, we end up
with a sequence like:

9398804f6577 KVM: x86: Report XSS as an MSR to be saved if there are supported features
...
5bc17d5211ed KVM: x86: Load guest fpu state when accessing MSRs managed by XSAVES
e96552044086 KVM: VMX: Emulate reads and writes to CET MSRs
a853510b851d KVM: VMX: Add a synthetic MSR to allow userspace VMM to access GUEST_SSP
9aef9270286b KVM: x86: Report CET MSRs as to-be-saved if CET is supported

Another required change is that all these MSRs need to be conditionally
reported based on KVM support, i.e. handled in kvm_init_msr_list().
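
E.g. in kvm_init_msr_list(), mirroring the existing MSR_IA32_BNDCFGS
handling (sketch, assuming kvm_cet_supported() from above):

		case MSR_IA32_U_CET:
		case MSR_IA32_S_CET:
		case MSR_KVM_GUEST_SSP:
		case MSR_IA32_INT_SSP_TAB:
		case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
			if (!kvm_cet_supported())
				continue;
			break;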

* Re: [RESEND v13 08/11] KVM: VMX: Enable CET support for nested VM
  2020-07-16  3:16 ` [RESEND v13 08/11] KVM: VMX: Enable CET support for nested VM Yang Weijiang
@ 2020-07-22 21:20   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 21:20 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:24AM +0800, Yang Weijiang wrote:
> CET MSRs are passed through to guests for performance reasons. Configure
> the MSRs to match the L0/L1 settings so that a nested VM is able to run
> with CET.
> 
> Add assertions for the vmcs12 offset table initialization; these assertions
> can detect a mismatch between a VMCS field's encoding and its data type at
> compile time.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/vmx/nested.c |  34 +++++
>  arch/x86/kvm/vmx/vmcs12.c | 267 +++++++++++++++++++++++---------------
>  arch/x86/kvm/vmx/vmcs12.h |  14 +-
>  arch/x86/kvm/vmx/vmx.c    |  10 ++
>  4 files changed, 216 insertions(+), 109 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index d4a4cec034d0..ddb1a69ce947 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -550,6 +550,18 @@ static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap)
>  	}
>  }
>  
> +static void nested_vmx_update_intercept_for_msr(struct kvm_vcpu *vcpu,

"update" is misleading.  That implies the helper can set or clear
interception, whereas this is purely a one-way ticket for disabling
interception.  nested_vmx_cond_disable_intercept_for_msr() is the best I
could come up with.  It's long, but wrapping can be avoided with some
extra massaging.
> +						u32 msr,
> +						unsigned long *msr_bitmap_l1,
> +						unsigned long *msr_bitmap_l0,
> +						int type)
> +{
> +	if (!msr_write_intercepted_l01(vcpu, msr))
> +		nested_vmx_disable_intercept_for_msr(msr_bitmap_l1,
> +						     msr_bitmap_l0,
> +						     msr, type);

This can avoid wrapping by renaming variables and refactoring code:

	if (msr_write_intercepted_l01(vcpu, msr))
		return;

	nested_vmx_disable_intercept_for_msr(bitmap_12, bitmap_02, msr, type);

And since there are existing users, the helper should also be added in a
separate patch.  Doing so does two things: allows further consolidation of
code, and separates the new logic from the CET logic, e.g. if the new helper
is broken then (with luck) bisection will point at the helper patch and not
the CET patch.
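
I.e. the standalone helper would end up looking something like (sketch):

static void nested_vmx_cond_disable_intercept_for_msr(struct kvm_vcpu *vcpu,
						      u32 msr,
						      unsigned long *bitmap_12,
						      unsigned long *bitmap_02,
						      int type)
{
	/* Disable interception only if L0 already passes the MSR to L1. */
	if (msr_write_intercepted_l01(vcpu, msr))
		return;

	nested_vmx_disable_intercept_for_msr(bitmap_12, bitmap_02, msr, type);
}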

> +}
> +
>  /*
>   * Merge L0's and L1's MSR bitmap, return false to indicate that
>   * we do not use the hardware.
> @@ -621,6 +633,28 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
>  	nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0,
>  					     MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
>  
> +	/* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */
> +	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_U_CET,
> +					    msr_bitmap_l1, msr_bitmap_l0,
> +					    MSR_TYPE_RW);
> +	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP,
> +					    msr_bitmap_l1, msr_bitmap_l0,
> +					    MSR_TYPE_RW);
> +	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_S_CET,
> +					    msr_bitmap_l1, msr_bitmap_l0,
> +					    MSR_TYPE_RW);
> +	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
> +					    msr_bitmap_l1, msr_bitmap_l0,
> +					    MSR_TYPE_RW);
> +	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
> +					    msr_bitmap_l1, msr_bitmap_l0,
> +					    MSR_TYPE_RW);
> +	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
> +					    msr_bitmap_l1, msr_bitmap_l0,
> +					    MSR_TYPE_RW);
> +	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
> +					    msr_bitmap_l1, msr_bitmap_l0,
> +					    MSR_TYPE_RW);
>  	/*
>  	 * Checking the L0->L1 bitmap is trying to verify two things:
>  	 *
> diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
> index c8e51c004f78..147e0d8eeab2 100644
> --- a/arch/x86/kvm/vmx/vmcs12.c
> +++ b/arch/x86/kvm/vmx/vmcs12.c
> @@ -4,31 +4,76 @@
>  
>  #define ROL16(val, n) ((u16)(((u16)(val) << (n)) | ((u16)(val) >> (16 - (n)))))
>  #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x)
> -#define FIELD(number, name)	[ROL16(number, 6)] = VMCS12_OFFSET(name)
> -#define FIELD64(number, name)						\
> -	FIELD(number, name),						\
> -	[ROL16(number##_HIGH, 6)] = VMCS12_OFFSET(name) + sizeof(u32)
> +

Again, this does not belong in this series.  At the very least, not in this
patch.  I also suspect we can use some macro shenanigans to automagically
detect the field size, i.e. instead of having FIELDN, FIELD32, etc...
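
E.g. one (untested) possibility, keying off bits 14:13 of the field
encoding and handling only the 32-bit case for illustration:

#define VMCS_FIELD_IS_U32(number)	((((number) >> 13) & 0x3) == 2)

#define FIELD_SIZE_OK(number, name)					\
	(!VMCS_FIELD_IS_U32(number) ||					\
	 sizeof_field(struct vmcs12, name) == sizeof(u32))

#define FIELD(number, name)						\
	[ROL16(number, 6)] = VMCS12_OFFSET(name) +			\
		BUILD_BUG_ON_ZERO(!FIELD_SIZE_OK(number, name))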

...

>  const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs_field_to_offset_table);
> diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
> index 80232daf00ff..016896c9e701 100644
> --- a/arch/x86/kvm/vmx/vmcs12.h
> +++ b/arch/x86/kvm/vmx/vmcs12.h
> @@ -115,7 +115,13 @@ struct __packed vmcs12 {
>  	natural_width host_ia32_sysenter_eip;
>  	natural_width host_rsp;
>  	natural_width host_rip;
> -	natural_width paddingl[8]; /* room for future expansion */
> +	natural_width host_s_cet;
> +	natural_width host_ssp;
> +	natural_width host_ssp_tbl;
> +	natural_width guest_s_cet;
> +	natural_width guest_ssp;
> +	natural_width guest_ssp_tbl;
> +	natural_width paddingl[2]; /* room for future expansion */
>  	u32 pin_based_vm_exec_control;
>  	u32 cpu_based_vm_exec_control;
>  	u32 exception_bitmap;
> @@ -295,6 +301,12 @@ static inline void vmx_check_vmcs12_offsets(void)
>  	CHECK_OFFSET(host_ia32_sysenter_eip, 656);
>  	CHECK_OFFSET(host_rsp, 664);
>  	CHECK_OFFSET(host_rip, 672);
> +	CHECK_OFFSET(host_s_cet, 680);
> +	CHECK_OFFSET(host_ssp, 688);
> +	CHECK_OFFSET(host_ssp_tbl, 696);
> +	CHECK_OFFSET(guest_s_cet, 704);
> +	CHECK_OFFSET(guest_ssp, 712);
> +	CHECK_OFFSET(guest_ssp_tbl, 720);
>  	CHECK_OFFSET(pin_based_vm_exec_control, 744);
>  	CHECK_OFFSET(cpu_based_vm_exec_control, 748);
>  	CHECK_OFFSET(exception_bitmap, 752);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 4ce61427ed49..d465ff990094 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7321,6 +7321,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
>  	cr4_fixed1_update(X86_CR4_PKE,        ecx, feature_bit(PKU));
>  	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
>  	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
> +	cr4_fixed1_update(X86_CR4_CET,	      ecx, feature_bit(SHSTK));
>  
>  #undef cr4_fixed1_update
>  }
> @@ -7340,6 +7341,15 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>  			vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
>  		}
>  	}
> +
> +	if (is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER |
> +	    XFEATURE_MASK_CET_KERNEL)) {

I prefer the above MPX style of:

	if (kvm_cet_supported()) {
		bool cet_enabled = guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
				   guest_cpuid_has(vcpu, X86_FEATURE_IBT);

		if (cet_enabled) {
			msrs->entry_ctls_high |= VM_ENTRY_LOAD_CET_STATE;
			msrs->exit_ctls_high |= VM_EXIT_LOAD_CET_STATE;
		} else {
			msrs->entry_ctls_high &= ~VM_ENTRY_LOAD_CET_STATE;
			msrs->exit_ctls_high &= ~VM_EXIT_LOAD_CET_STATE;
		}
	}

That's also more in line with the logic for computing secondary execution
controls.  Not that it really matters, but it means we're not updating the
MSRs when KVM doesn't support CET in the first place.

> +		vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_CET_STATE;
> +		vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_CET_STATE;

The line lengths can be shortened by adding a prep patch to grab
vmx->nested.msrs in a local msrs variable, that way the extra level of
indentation doesn't need a wrap.  'vmx' itself is unnecessary.
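
I.e. with a prep patch adding:

	struct nested_vmx_msrs *msrs = &to_vmx(vcpu)->nested.msrs;

the 'vmx' local can go away entirely.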

> +	} else {
> +		vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_CET_STATE;
> +		vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_CET_STATE;
> +	}
>  }
>  
>  static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
> -- 
> 2.17.2
> 


* Re: [RESEND v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states
  2020-07-16  3:16 ` [RESEND v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states Yang Weijiang
@ 2020-07-22 21:29   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 21:29 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:25AM +0800, Yang Weijiang wrote:
> Dump CET VMCS states for debugging purposes. Since CET kernel protection
> is not enabled, warn once if the related host MSRs are unexpectedly
> non-zero, i.e. were filled by mistake.

This can all be folded into the enabling patch.  It isn't so much code that
it would bloat that patch, and the "host MSRs will be lost" warnings are
confusing without the context that KVM doesn't stuff the host values into
the VMCS.

> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d465ff990094..5d4250b9dec8 100644

...

> @@ -8205,6 +8217,7 @@ static __init int hardware_setup(void)
>  	unsigned long host_bndcfgs;
>  	struct desc_ptr dt;
>  	int r, i, ept_lpage_level;
> +	u64 cet_msr;
>  
>  	store_idt(&dt);
>  	host_idt_base = dt.address;
> @@ -8365,6 +8378,16 @@ static __init int hardware_setup(void)
>  			return r;
>  	}
>  
> +	if (boot_cpu_has(X86_FEATURE_IBT) || boot_cpu_has(X86_FEATURE_SHSTK)) {
> +		rdmsrl(MSR_IA32_S_CET, cet_msr);
> +		WARN_ONCE(cet_msr, "KVM: CET S_CET in host will be lost!\n");
> +	}
> +
> +	if (boot_cpu_has(X86_FEATURE_SHSTK)) {
> +		rdmsrl(MSR_IA32_PL0_SSP, cet_msr);
> +		WARN_ONCE(cet_msr, "KVM: CET PL0_SSP in host will be lost!\n");
> +	}

Largely arbitrary, but I'd prefer to do these checks up near the BNDCFGS
check, just so that all of these sorts of warnings are clustered together.
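
I.e. something like this, next to the existing MPX warning (sketch; the
BNDCFGS hunk is the current code, the CET hunks are the patch's moved):

	if (boot_cpu_has(X86_FEATURE_MPX)) {
		rdmsrl(MSR_IA32_BNDCFGS, host_bndcfgs);
		WARN_ONCE(host_bndcfgs, "KVM: BNDCFGS in host will be lost");
	}

	if (boot_cpu_has(X86_FEATURE_IBT) || boot_cpu_has(X86_FEATURE_SHSTK)) {
		rdmsrl(MSR_IA32_S_CET, cet_msr);
		WARN_ONCE(cet_msr, "KVM: CET S_CET in host will be lost!\n");
	}

	if (boot_cpu_has(X86_FEATURE_SHSTK)) {
		rdmsrl(MSR_IA32_PL0_SSP, cet_msr);
		WARN_ONCE(cet_msr, "KVM: CET PL0_SSP in host will be lost!\n");
	}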

> +
>  	vmx_set_cpu_caps();
>  
>  	r = alloc_kvm_area();
> -- 
> 2.17.2
> 


* Re: [RESEND v13 10/11] KVM: x86: Add #CP support in guest exception dispatch
  2020-07-16  3:16 ` [RESEND v13 10/11] KVM: x86: Add #CP support in guest exception dispatch Yang Weijiang
@ 2020-07-22 21:29   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 21:29 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:26AM +0800, Yang Weijiang wrote:
> The CPU defines #CP (vector 21) to handle CET-induced exceptions; it is
> accompanied by several error codes corresponding to the different CET
> violation cases, see the SDM for a detailed description. The exception
> is classified as a contributory exception w.r.t. #DF.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---

This patch can be moved much earlier in the series.
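
(For reference, a rough sketch of what the dispatch change boils down to
in x86.c's exception_class(); this is not the actual diff, and CP_VECTOR
(21) comes from the patch's new #define:)

	static int exception_class(int vector)
	{
		switch (vector) {
		case PF_VECTOR:
			return EXCPT_PF;
		case DE_VECTOR:
		case TS_VECTOR:
		case NP_VECTOR:
		case SS_VECTOR:
		case GP_VECTOR:
		case CP_VECTOR:	/* new: #CP is contributory w.r.t. #DF */
			return EXCPT_CONTRIBUTORY;
		default:
			break;
		}

		return EXCPT_BENIGN;
	}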


* Re: [RESEND v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace
  2020-07-16  3:16 ` [RESEND v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace Yang Weijiang
@ 2020-07-22 21:33   ` Sean Christopherson
  0 siblings, 0 replies; 24+ messages in thread
From: Sean Christopherson @ 2020-07-22 21:33 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Thu, Jul 16, 2020 at 11:16:27AM +0800, Yang Weijiang wrote:
> Set the feature bits so that CET capabilities can be seen in the guest via
> CPUID enumeration. Add CR4.CET bit support in order to allow the guest to
> set the CET master control bit (CR4.CET).
> 
> Disable the KVM CET feature once unrestricted_guest is turned off, because
> KVM cannot emulate guest CET behavior well in that case.
> 
> Don't expose the CET feature if the dependent CET bits are cleared in the
> host XSS.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  3 ++-
>  arch/x86/kvm/cpuid.c            |  5 +++--
>  arch/x86/kvm/vmx/vmx.c          |  5 +++++
>  arch/x86/kvm/x86.c              | 11 +++++++++++
>  4 files changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e8c749596ba2..c4c82db68b6a 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -99,7 +99,8 @@
>  			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>  			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> +			  | X86_CR4_CET))
>  
>  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>  
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index d97b2a6e8a8c..a085b8c57f34 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -340,7 +340,8 @@ void kvm_set_cpu_caps(void)
>  		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
>  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
>  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> -		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
> +		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
> +		F(SHSTK)
>  	);
>  	/* Set LA57 based on hardware capability. */
>  	if (cpuid_ecx(7) & F(LA57))
> @@ -356,7 +357,7 @@ void kvm_set_cpu_caps(void)
>  	kvm_cpu_cap_mask(CPUID_7_EDX,
>  		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
>  		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
> -		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM)
> +		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(IBT)
>  	);
>  
>  	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 5d4250b9dec8..31593339b6fe 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7542,6 +7542,11 @@ static __init void vmx_set_cpu_caps(void)
>  
>  	if (vmx_waitpkg_supported())
>  		kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
> +
> +	if (!enable_unrestricted_guest) {

This also needs to check cpu_has_load_cet_ctrl().
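
(Presumably a capabilities.h helper in the same mold as
cpu_has_load_ia32_efer(); a sketch, assuming both controls are required:)

	static inline bool cpu_has_load_cet_ctrl(void)
	{
		return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE) &&
		       (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_CET_STATE);
	}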

> +		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
> +		kvm_cpu_cap_clear(X86_FEATURE_IBT);
> +	}
>  }
>  
>  static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 76892fb0b0a0..c7393d62ad72 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9808,10 +9808,21 @@ int kvm_arch_hardware_setup(void *opaque)
>  	if (kvm_cpu_cap_has(X86_FEATURE_XSAVES))
>  		supported_xss = host_xss & KVM_SUPPORTED_XSS;
>  
> +	if (!(supported_xss & (XFEATURE_MASK_CET_USER |
> +	    XFEATURE_MASK_CET_KERNEL))) {
> +		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
> +		kvm_cpu_cap_clear(X86_FEATURE_IBT);

I played around with a variety of options, and ended up with:

	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
		supported_xss = 0;
	else
		supported_xss &= host_xss;

	/* Update CET features now that supported_xss is finalized. */
	if (!kvm_cet_supported()) {
		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
		kvm_cpu_cap_clear(X86_FEATURE_IBT);
	}

in x86.c / kvm_arch_hardware_setup(), and 

	if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest) {
		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
		kvm_cpu_cap_clear(X86_FEATURE_IBT);
	} else if (kvm_cpu_cap_has(X86_FEATURE_SHSTK) ||
		   kvm_cpu_cap_has(X86_FEATURE_IBT)) {
		supported_xss |= XFEATURE_MASK_CET_USER |
				 XFEATURE_MASK_CET_KERNEL;
	}

in vmx.c / vmx_set_cpu_caps.

That avoids KVM_SUPPORTED_XSS, and was the least ugly option I could devise
for avoiding the cyclical dependency between XSS and SHSTK/IBT without
potentially exploding SVM in the future.  Note the ordering works out
because vmx_set_cpu_caps() runs from ->hardware_setup(), i.e. before the
x86.c code above finalizes supported_xss: VMX seeds the CET bits into
supported_xss, and the common code then clears the caps if the host's XSS
can't back them.

> +	}
> +
>  #define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
>  	cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
>  #undef __kvm_cpu_cap_has
>  
> +	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
> +	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
> +		supported_xss &= ~(XFEATURE_MASK_CET_USER |
> +				   XFEATURE_MASK_CET_KERNEL);
> +
>  	if (kvm_has_tsc_control) {
>  		/*
>  		 * Make sure the user can only configure tsc_khz values that
> -- 
> 2.17.2
> 


* Re: [RESEND PATCH v13 00/11] Introduce support for guest CET feature
  2020-07-22 19:48 ` [RESEND PATCH v13 00/11] Introduce support for guest CET feature Sean Christopherson
@ 2020-07-23  3:17   ` Yang Weijiang
  0 siblings, 0 replies; 24+ messages in thread
From: Yang Weijiang @ 2020-07-23  3:17 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Yang Weijiang, kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Wed, Jul 22, 2020 at 12:48:05PM -0700, Sean Christopherson wrote:
> On Thu, Jul 16, 2020 at 11:16:16AM +0800, Yang Weijiang wrote:
> > Control-flow Enforcement Technology (CET) provides protection against
> > Return/Jump-Oriented Programming (ROP/JOP) attack. There're two CET
> > sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
> > SHSTK is to prevent ROP programming and IBT is to prevent JOP programming.
> > 
> > Several parts in KVM have been updated to provide VM CET support, including:
> > CPUID/XSAVES config, MSR pass-through, user space MSR access interface, 
> > vmentry/vmexit config, nested VM etc. These patches have dependency on CET
> > kernel patches for xsaves support and CET definitions, e.g., MSR and related
> > feature flags.
> > 
> > CET kernel patches are here:
> > https://lkml.kernel.org/r/20200429220732.31602-1-yu-cheng.yu@intel.com
> > 
> > v13:
> > - Added CET definitions as a separate patch to facilitate KVM test.
> 
> What I actually want to do is pull in the actual kernel patches themselves
> so that we can upstream KVM support without having to wait for the kernel
> to sort out the ABI, which seems like it's going to drag on.
That's an innovative idea, beyond anything I'd imagined, great! :-)
> 
> I was thinking that we'd only need the MSR/CR4/CPUID definitions, but forgot
> that KVM also needs XSAVES context switching, so it's not as simple as I was
> thinking.  It's still relatively simple, but it means there would be
> functional changes in the kernel.
> 
> I'll respond to the main SSP series to pose the question of taking the two
> small-ish kernel patches through the KVM tree.
> 
> >  arch/x86/include/asm/kvm_host.h      |   4 +-
> >  arch/x86/include/asm/vmx.h           |   8 +
> >  arch/x86/include/uapi/asm/kvm.h      |   1 +
> >  arch/x86/include/uapi/asm/kvm_para.h |   7 +-
> >  arch/x86/kvm/cpuid.c                 |  28 ++-
> >  arch/x86/kvm/vmx/capabilities.h      |   5 +
> >  arch/x86/kvm/vmx/nested.c            |  34 ++++
> >  arch/x86/kvm/vmx/vmcs12.c            | 267 ++++++++++++++++-----------
> >  arch/x86/kvm/vmx/vmcs12.h            |  14 +-
> >  arch/x86/kvm/vmx/vmx.c               | 262 +++++++++++++++++++++++++-
> >  arch/x86/kvm/x86.c                   |  53 +++++-
> >  arch/x86/kvm/x86.h                   |   2 +-
> >  include/linux/kvm_host.h             |  32 ++++
> >  13 files changed, 590 insertions(+), 127 deletions(-)
> 
> I have quite a few comments/changes (will respond to individual patches),
> but have done all the updates/rework and, assuming I haven't broken things,
> we're nearing the point where I can carry this and push it past the finish
> line, e.g. get acks from tip/x86 maintainers for the kernel patches and
> send a pull request to Paolo.
> 
> I pushed the result to:
> 
>   https://github.com/sean-jc/linux/releases/tag/kvm-cet-v14-rc1
> 
> can you please review and test?  If everything looks good, I'll post v14.
> If not, I'll work offline with you to get it into shape.
>
Thanks a lot for the efforts! I'll review and test the new patches and
let you know the status.

> Thanks!

