KVM Archive on lore.kernel.org
* [PATCH v13 00/11] Introduce support for guest CET feature
@ 2020-07-01  8:04 Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Yang Weijiang
                   ` (11 more replies)
  0 siblings, 12 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

Control-flow Enforcement Technology (CET) provides protection against
Return/Jump-Oriented Programming (ROP/JOP) attacks. There are two CET
sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
SHSTK prevents ROP attacks and IBT prevents JOP attacks.

Several parts of KVM have been updated to provide VM CET support, including:
CPUID/XSAVES config, MSR pass-through, user space MSR access interface,
vmentry/vmexit config, nested VM, etc. These patches depend on the CET
kernel patches for XSAVES support and CET definitions, e.g., MSRs and related
feature flags.

CET kernel patches are here:
https://lkml.kernel.org/r/20200429220732.31602-1-yu-cheng.yu@intel.com

v13:
- Added CET definitions as a separate patch to facilitate KVM testing.
- Disabled CET support in KVM if unrestricted_guest is turned off since
  in this case CET related instructions/infrastructure cannot be emulated
  well.

v12:
- Fixed a few issues per Sean and Paolo's review feedback.
- Refactored patches to make them properly arranged.
- Removed unnecessary hard-coded CET states for host/guest.
- Added compile-time assertions for vmcs_field_to_offset_table to detect
  mismatch of the field type and field encoding number.
- Added a custom MSR MSR_KVM_GUEST_SSP for guest active SSP save/restore.
- Rebased patches to 5.7-rc3.

v11:
- Fixed a guest vmentry failure issue when guest reboots.
- Used vm_xxx_control_{set, clear}bit() to avoid side effects; it'll
  clear cached data instead of pure VMCS field bits.
- Added vcpu->arch.guest_supported_xss dedicated to the guest runtime mask;
  this avoids supported_xss being overwritten by an old QEMU.
- Separated vmentry/vmexit state setting from the CR0/CR4 dependency check
  to make the patch clearer.
- Added CET VMCS states in dump_vmcs() for debugging purpose.
- Other refactor based on testing.
- This patch series is built on top of the branch below and the CET kernel
  patches for XSAVES support:
  https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=cpu-caps

v10:
- Refactored code per Sean's review feedback.
- Added CET support for nested VM.
- Removed fix-patch for CPUID(0xd,N) enumeration as this part is done
  by Paolo and Sean.
- This new patchset is based on Paolo's queued cpu_caps branch.
- Modified patch per XSAVES related change.
- Consolidated KVM unit-test patch with KVM patches.

v9:
- Refactored msr-check functions per Sean's feedback.
- Fixed a few issues per Sean's suggestion.
- Rebased patch to kernel-v5.4.
- Moved CET CPUID feature bits and CR4.CET to last patch.

v8:
- Addressed Jim and Sean's feedback on: 1) CPUID(0xD,i) enumeration. 2)
  sanity checks when configuring guest CET. 3) function improvement.
- Added more sanity check functions.
- Set host vmexit default status so that guest won't leak CET status to
  host when vmexit.
- Added CR0.WP vs. CR4.CET mutual constraints.

v7:
- Rebased patch to kernel v5.3
- Sean suggested changing the CPUID(0xd, n) enumeration code to align with
  the existing one, and I think it's better to make the fix an independent patch
  since the XSS MSR is used widely on x86 platforms.
- Check more host and guest status before configuring guest CET
  per Sean's feedback.
- Add error checks before the guest accesses CET MSRs per Sean's feedback.
- Other minor fixes suggested by Sean.

v6:
- Rebase patch to kernel v5.2.
- Move CPUID(0xD, n>=1) helper to a separate patch.
- Merge xsave size fix with other patch.
- Other minor fixes per community feedback.

v5:
- Rebase patch to kernel v5.1.
- Wrap CPUID(0xD, n>=1) code to a helper function.
- Pass through MSR_IA32_PL1_SSP and MSR_IA32_PL2_SSP to Guest.
- Add Co-developed-by expression in patch description.
- Refine patch description.

v4:
- Add Sean's patch for loading Guest fpu state before accessing XSAVES
  managed CET MSRs.
- Fold the CET bit settings into the CPUID configuration patch.
- Add VMX interface to query Host XSS.
- Check Host and Guest XSS support bits before setting Guest XSS.
- Make Guest SHSTK and IBT feature enabling independent.
- Do not report CET support to Guest when Host CET feature is Disabled.

v3:
- Modified patches to make Guest CET independent of Host enabling.
- Added patch 8 to add user space access for Guest CET MSR access.
- Modified code comments and patch description to reflect changes.

v2:
- Re-ordered patch sequence, combined one patch.
- Added more description for CET related VMCS fields.
- Added Host CET capability check while enabling Guest CET loading bit.
- Added Host CET capability check while reporting Guest CPUID(EAX=7, ECX=0).
- Modified code in reporting Guest CPUID(EAX=D,ECX>=1), make it clearer.
- Added Host and Guest XSS mask check while setting bits for Guest XSS.

Sean Christopherson (1):
  KVM: x86: Load guest fpu state when access MSRs managed by XSAVES

Yang Weijiang (10):
  KVM: x86: Include CET definitions for KVM test purpose
  KVM: VMX: Introduce CET VMCS fields and flags
  KVM: VMX: Set guest CET MSRs per KVM and host configuration
  KVM: VMX: Configure CET settings upon guest CR0/4 changing
  KVM: x86: Refresh CPUID once guest changes XSS bits
  KVM: x86: Add userspace access interface for CET MSRs
  KVM: VMX: Enable CET support for nested VM
  KVM: VMX: Add VMCS dump and sanity check for CET states
  KVM: x86: Add #CP support in guest exception dispatch
  KVM: x86: Enable CET virtualization and advertise CET to userspace

 arch/x86/include/asm/kvm_host.h      |   4 +-
 arch/x86/include/asm/vmx.h           |   8 +
 arch/x86/include/uapi/asm/kvm.h      |   1 +
 arch/x86/include/uapi/asm/kvm_para.h |   7 +-
 arch/x86/kvm/cpuid.c                 |  28 ++-
 arch/x86/kvm/vmx/capabilities.h      |   5 +
 arch/x86/kvm/vmx/nested.c            |  34 ++++
 arch/x86/kvm/vmx/vmcs12.c            | 275 ++++++++++++++++-----------
 arch/x86/kvm/vmx/vmcs12.h            |  14 +-
 arch/x86/kvm/vmx/vmx.c               | 262 ++++++++++++++++++++++++-
 arch/x86/kvm/x86.c                   |  47 ++++-
 arch/x86/kvm/x86.h                   |   2 +-
 include/linux/kvm_host.h             |  32 ++++
 13 files changed, 588 insertions(+), 131 deletions(-)

-- 
2.17.2



* [PATCH v13 01/11] KVM: x86: Include CET definitions for KVM test purpose
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags Yang Weijiang
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

These definitions are added by the CET kernel patches and referenced by KVM.
If the CET KVM patches are tested without the CET kernel patches, this patch
should be included.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 include/linux/kvm_host.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 01276e3d01b9..20e0fe70d3f7 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -35,6 +35,38 @@
 
 #include <asm/kvm_host.h>
 
+#ifndef CONFIG_X86_INTEL_CET
+#define XFEATURE_CET_USER   11
+#define XFEATURE_CET_KERNEL 12
+
+#define XFEATURE_MASK_CET_USER         (1 << XFEATURE_CET_USER)
+#define XFEATURE_MASK_CET_KERNEL       (1 << XFEATURE_CET_KERNEL)
+
+/* Control-flow Enforcement Technology MSRs */
+#define MSR_IA32_U_CET         0x6a0 /* user mode cet setting */
+#define MSR_IA32_S_CET         0x6a2 /* kernel mode cet setting */
+#define MSR_IA32_PL0_SSP       0x6a4 /* kernel shstk pointer */
+#define MSR_IA32_PL1_SSP       0x6a5 /* ring-1 shstk pointer */
+#define MSR_IA32_PL2_SSP       0x6a6 /* ring-2 shstk pointer */
+#define MSR_IA32_PL3_SSP       0x6a7 /* user shstk pointer */
+#define MSR_IA32_INT_SSP_TAB   0x6a8 /* exception shstk table */
+
+#define X86_CR4_CET_BIT        23 /* enable Control-flow Enforcement */
+#define X86_CR4_CET            _BITUL(X86_CR4_CET_BIT)
+
+#define X86_FEATURE_SHSTK      (16*32+ 7) /* Shadow Stack */
+#define X86_FEATURE_IBT        (18*32+20) /* Indirect Branch Tracking */
+
+/* MSR_IA32_U_CET and MSR_IA32_S_CET bits */
+#define MSR_IA32_CET_SHSTK_EN          0x0000000000000001ULL
+#define MSR_IA32_CET_WRSS_EN           0x0000000000000002ULL
+#define MSR_IA32_CET_ENDBR_EN          0x0000000000000004ULL
+#define MSR_IA32_CET_LEG_IW_EN         0x0000000000000008ULL
+#define MSR_IA32_CET_NO_TRACK_EN       0x0000000000000010ULL
+#define MSR_IA32_CET_WAIT_ENDBR        0x0000000000000800ULL
+#define MSR_IA32_CET_BITMAP_MASK       0xfffffffffffff000ULL
+#endif
+
 #ifndef KVM_MAX_VCPU_ID
 #define KVM_MAX_VCPU_ID KVM_MAX_VCPUS
 #endif
-- 
2.17.2



* [PATCH v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Yang Weijiang
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CET (Control-flow Enforcement Technology) is a CPU feature used to prevent
Return/Jump-Oriented Programming (ROP/JOP) attacks. It provides the following
sub-features to defend against ROP/JOP style control-flow subversion attacks:

Shadow Stack (SHSTK):
  A second stack for the program, used exclusively for control transfer
  operations.

Indirect Branch Tracking (IBT):
  Code branching protection to defend against jump/call oriented programming.

Several new CET MSRs are defined in the kernel to support CET:
  MSR_IA32_{U,S}_CET: Control the CET settings for user mode and kernel mode
  respectively.

  MSR_IA32_PL{0,1,2,3}_SSP: Stores shadow stack pointers for CPL-0,1,2,3
  protection respectively.

  MSR_IA32_INT_SSP_TAB: Stores base address of shadow stack pointer table.

Two XSAVES state bits are introduced for CET:
  IA32_XSS:[bit 11]: Controls saving/restoring user mode CET states.
  IA32_XSS:[bit 12]: Controls saving/restoring kernel mode CET states.
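
For illustration only (not part of this patch), the two IA32_XSS bits above
can be written as masks in standalone C. The helper names below are
hypothetical; only the bit positions (11 and 12) come from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as described in the commit message above. */
#define XSS_CET_USER_BIT   11
#define XSS_CET_KERNEL_BIT 12

/* Hypothetical helpers, named only for this sketch. */
static inline uint64_t xss_cet_user_mask(void)
{
	return UINT64_C(1) << XSS_CET_USER_BIT;
}

static inline uint64_t xss_cet_kernel_mask(void)
{
	return UINT64_C(1) << XSS_CET_KERNEL_BIT;
}

/* True if XSAVES/XRSTORS would save/restore user mode CET state. */
static inline bool xss_saves_user_cet(uint64_t xss)
{
	return (xss & xss_cet_user_mask()) != 0;
}

/* True if XSAVES/XRSTORS would save/restore kernel mode CET state. */
static inline bool xss_saves_kernel_cet(uint64_t xss)
{
	return (xss & xss_cet_kernel_mask()) != 0;
}
```

These masks correspond to XFEATURE_MASK_CET_USER and XFEATURE_MASK_CET_KERNEL
from patch 01 of this series.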

Six VMCS fields are introduced for CET:
  {HOST,GUEST}_S_CET: Stores CET settings for kernel mode.
  {HOST,GUEST}_SSP: Stores shadow stack pointer of current task/thread.
  {HOST,GUEST}_INTR_SSP_TABLE: Stores base address of shadow stack pointer
  table.

If VM_EXIT_LOAD_CET_STATE = 1, the host CET states are restored from the
following VMCS fields at VM-Exit:
  HOST_S_CET
  HOST_SSP
  HOST_INTR_SSP_TABLE

If VM_ENTRY_LOAD_CET_STATE = 1, the guest CET states are loaded from the
following VMCS fields at VM-Entry:
  GUEST_S_CET
  GUEST_SSP
  GUEST_INTR_SSP_TABLE

Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/vmx.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 5e090d1f03f8..f301def9125a 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -94,6 +94,7 @@
 #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
 #define VM_EXIT_PT_CONCEAL_PIP			0x01000000
 #define VM_EXIT_CLEAR_IA32_RTIT_CTL		0x02000000
+#define VM_EXIT_LOAD_CET_STATE                  0x10000000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -107,6 +108,7 @@
 #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
 #define VM_ENTRY_PT_CONCEAL_PIP			0x00020000
 #define VM_ENTRY_LOAD_IA32_RTIT_CTL		0x00040000
+#define VM_ENTRY_LOAD_CET_STATE                 0x00100000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
@@ -328,6 +330,9 @@ enum vmcs_field {
 	GUEST_PENDING_DBG_EXCEPTIONS    = 0x00006822,
 	GUEST_SYSENTER_ESP              = 0x00006824,
 	GUEST_SYSENTER_EIP              = 0x00006826,
+	GUEST_S_CET                     = 0x00006828,
+	GUEST_SSP                       = 0x0000682a,
+	GUEST_INTR_SSP_TABLE            = 0x0000682c,
 	HOST_CR0                        = 0x00006c00,
 	HOST_CR3                        = 0x00006c02,
 	HOST_CR4                        = 0x00006c04,
@@ -340,6 +345,9 @@ enum vmcs_field {
 	HOST_IA32_SYSENTER_EIP          = 0x00006c12,
 	HOST_RSP                        = 0x00006c14,
 	HOST_RIP                        = 0x00006c16,
+	HOST_S_CET                      = 0x00006c18,
+	HOST_SSP                        = 0x00006c1a,
+	HOST_INTR_SSP_TABLE             = 0x00006c1c
 };
 
 /*
-- 
2.17.2



* [PATCH v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-02 15:13   ` Xiaoyao Li
  2020-07-01  8:04 ` [PATCH v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing Yang Weijiang
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CET MSRs are passed through to the guest directly to improve performance. CET
runtime control settings are stored in MSR_IA32_{U,S}_CET, Shadow Stack Pointers
(SSP) are stored in MSR_IA32_PL{0,1,2,3}_SSP, and the SSP table base address is
stored in MSR_IA32_INT_SSP_TAB. These MSRs are defined in the kernel and re-used
here.

MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user-mode protection. Their
contents are switched between threads during scheduling, so it makes sense to
pass them through so that the guest kernel can use xsaves/xrstors to operate on
them efficiently. The other MSRs are used for non-user mode protection. See the
SDM for detailed info.

The difference between the CET VMCS fields and the CET MSRs is that the former
are used during VMEnter/VMExit, whereas the latter are used for CET state
storage between task/thread scheduling.
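
As a rough standalone model of the pass-through decision (not the kernel code
itself), the intercept choice for each MSR group can be sketched as below. The
function names are hypothetical, and the guest CPUID lookups are replaced by
plain booleans; the logic is meant to mirror vmx_update_intercept_for_cet_msr()
in the diff that follows.

```c
#include <stdbool.h>
#include <stdint.h>

/* XSS bit positions as stated in this series (bits 11 and 12). */
#define CET_USER_MASK   (UINT64_C(1) << 11)
#define CET_KERNEL_MASK (UINT64_C(1) << 12)

/* A CET state component is usable if KVM supports its XSS bit and the
 * guest has at least one of SHSTK or IBT. */
static bool cet_state_supported(uint64_t supported_xss, uint64_t xss_states,
				bool has_shstk, bool has_ibt)
{
	return (supported_xss & xss_states) != 0 && (has_shstk || has_ibt);
}

/* U_CET and PL3_SSP: intercept unless user CET state is supported. */
static bool intercept_user_cet_msrs(uint64_t supported_xss,
				    bool has_shstk, bool has_ibt)
{
	return !cet_state_supported(supported_xss, CET_USER_MASK,
				    has_shstk, has_ibt);
}

/* S_CET and PL0_SSP..PL2_SSP: intercept unless kernel CET state is supported. */
static bool intercept_kernel_cet_msrs(uint64_t supported_xss,
				      bool has_shstk, bool has_ibt)
{
	return !cet_state_supported(supported_xss, CET_KERNEL_MASK,
				    has_shstk, has_ibt);
}

/* INT_SSP_TAB: additionally requires SHSTK to be passed through. */
static bool intercept_int_ssp_tab(uint64_t supported_xss,
				  bool has_shstk, bool has_ibt)
{
	return intercept_kernel_cet_msrs(supported_xss, has_shstk, has_ibt) ||
	       !has_shstk;
}
```

In this model a guest with IBT but no SHSTK gets the user and kernel CET MSRs
passed through, while MSR_IA32_INT_SSP_TAB stays intercepted.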

Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 46 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c     |  3 +++
 2 files changed, 49 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d52d470e36b1..97e766875a7e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3020,6 +3020,13 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long cr3)
 		vmcs_writel(GUEST_CR3, guest_cr3);
 }
 
+static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)
+{
+	return ((supported_xss & xss_states) &&
+		(guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+		guest_cpuid_has(vcpu, X86_FEATURE_IBT)));
+}
+
 int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7098,6 +7105,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
 		vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
 }
 
+static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+	bool incpt;
+
+	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);
+	/*
+	 * U_CET is required for USER CET, and U_CET, PL3_SSP are bound as
+	 * one component and controlled by IA32_XSS[bit 11].
+	 */
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW,
+				  incpt);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW,
+				  incpt);
+
+	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL);
+	/*
+	 * S_CET is required for KERNEL CET, and PL0_SSP ... PL2_SSP are
+	 * bound as one component and controlled by IA32_XSS[bit 12].
+	 */
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW,
+				  incpt);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW,
+				  incpt);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW,
+				  incpt);
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW,
+				  incpt);
+
+	incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+	/* SSP_TAB is only available for KERNEL SHSTK.*/
+	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW,
+				  incpt);
+}
+
 static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7136,6 +7179,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 			vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
 		}
 	}
+
+	if (supported_xss & (XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_CET_USER))
+		vmx_update_intercept_for_cet_msr(vcpu);
 }
 
 static __init void vmx_set_cpu_caps(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c5835f9cb9ad..6390b62c12ed 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -186,6 +186,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU)
 
+#define KVM_SUPPORTED_XSS       (XFEATURE_MASK_CET_USER | \
+				 XFEATURE_MASK_CET_KERNEL)
+
 u64 __read_mostly host_efer;
 EXPORT_SYMBOL_GPL(host_efer);
 
-- 
2.17.2



* [PATCH v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (2 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits Yang Weijiang
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CR4.CET is the master control bit for CET. There are mutual constraints
between CR0.WP and CR4.CET, so the dependent bit needs to be checked when
changing the control registers.

The processor does not allow CR4.CET to be set if CR0.WP = 0; similarly, it does
not allow CR0.WP to be cleared while CR4.CET = 1. In either case, KVM injects
#GP into the guest.
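
A minimal standalone sketch of the two checks, for illustration only: the
function names are hypothetical, CR0.WP is bit 16, and CR4.CET is bit 23 as
defined in patch 01. A return value of false corresponds to the case where KVM
would inject #GP.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_WP  (UINT64_C(1) << 16)
#define CR4_CET (UINT64_C(1) << 23)

/* A CR4 write may set CET only if CET is supported for the guest and
 * CR0.WP is currently set. */
static bool cr4_write_allowed(uint64_t new_cr4, uint64_t cur_cr0,
			      bool cet_supported)
{
	if (!(new_cr4 & CR4_CET))
		return true;
	return cet_supported && (cur_cr0 & CR0_WP) != 0;
}

/* A CR0 write may clear WP only while CR4.CET is clear. */
static bool cr0_write_allowed(uint64_t new_cr0, uint64_t cur_cr4)
{
	return (new_cr0 & CR0_WP) != 0 || !(cur_cr4 & CR4_CET);
}
```

The sketch covers only the CET-related constraint; the real handlers perform
many other validity checks on CR0 and CR4.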

The CET state load bits are set/cleared along with CR4.CET.

Note:
SHSTK and IBT share the control MSRs MSR_IA32_{U,S}_CET, which means
it's difficult to hide one feature from the other when SHSTK != IBT.
After discussion in the community, it was agreed to allow the guest to control
the two features independently as this won't introduce a security hole.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/capabilities.h |  5 +++++
 arch/x86/kvm/vmx/vmx.c          | 30 ++++++++++++++++++++++++++++--
 arch/x86/kvm/x86.c              |  3 +++
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 8903475f751e..52223f7d31d8 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -101,6 +101,11 @@ static inline bool cpu_has_load_perf_global_ctrl(void)
 	       (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
 }
 
+static inline bool cpu_has_load_cet_ctrl(void)
+{
+	return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE) &&
+		(vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_CET_STATE);
+}
 static inline bool cpu_has_vmx_mpx(void)
 {
 	return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_BNDCFGS) &&
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 97e766875a7e..7137e252ab38 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2440,7 +2440,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      VM_EXIT_LOAD_IA32_EFER |
 	      VM_EXIT_CLEAR_BNDCFGS |
 	      VM_EXIT_PT_CONCEAL_PIP |
-	      VM_EXIT_CLEAR_IA32_RTIT_CTL;
+	      VM_EXIT_CLEAR_IA32_RTIT_CTL |
+	      VM_EXIT_LOAD_CET_STATE;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -2464,7 +2465,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
 	      VM_ENTRY_LOAD_IA32_EFER |
 	      VM_ENTRY_LOAD_BNDCFGS |
 	      VM_ENTRY_PT_CONCEAL_PIP |
-	      VM_ENTRY_LOAD_IA32_RTIT_CTL;
+	      VM_ENTRY_LOAD_IA32_RTIT_CTL |
+	      VM_ENTRY_LOAD_CET_STATE;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
 				&_vmentry_control) < 0)
 		return -EIO;
@@ -3027,6 +3029,12 @@ static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)
 		guest_cpuid_has(vcpu, X86_FEATURE_IBT)));
 }
 
+static bool is_cet_supported(struct kvm_vcpu *vcpu)
+{
+	return is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER |
+				      XFEATURE_MASK_CET_KERNEL);
+}
+
 int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -3067,6 +3075,10 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			return 1;
 	}
 
+	if ((cr4 & X86_CR4_CET) && (!is_cet_supported(vcpu) ||
+	    !(kvm_read_cr0(vcpu) & X86_CR0_WP)))
+		return 1;
+
 	if (vmx->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
 		return 1;
 
@@ -3097,6 +3109,20 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
 	}
 
+	if (cpu_has_load_cet_ctrl()) {
+		if ((hw_cr4 & X86_CR4_CET) && is_cet_supported(vcpu)) {
+			vm_entry_controls_setbit(to_vmx(vcpu),
+						 VM_ENTRY_LOAD_CET_STATE);
+			vm_exit_controls_setbit(to_vmx(vcpu),
+						VM_EXIT_LOAD_CET_STATE);
+		} else {
+			vm_entry_controls_clearbit(to_vmx(vcpu),
+						   VM_ENTRY_LOAD_CET_STATE);
+			vm_exit_controls_clearbit(to_vmx(vcpu),
+						  VM_EXIT_LOAD_CET_STATE);
+		}
+	}
+
 	vmcs_writel(CR4_READ_SHADOW, cr4);
 	vmcs_writel(GUEST_CR4, hw_cr4);
 	return 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6390b62c12ed..b63727318da1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -803,6 +803,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 	if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
 		return 1;
 
+	if (!(cr0 & X86_CR0_WP) && kvm_read_cr4_bits(vcpu, X86_CR4_CET))
+		return 1;
+
 	kvm_x86_ops.set_cr0(vcpu, cr0);
 
 	if ((cr0 ^ old_cr0) & X86_CR0_PG) {
-- 
2.17.2



* [PATCH v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (3 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES Yang Weijiang
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CPUID(0xd, 1) reports the storage size currently required for XCR0 | XSS.
When the guest updates XSS, the CPUID leaf must be updated as well; otherwise
the guest fetches a stale state size, which results in WARN traces while the
guest is running.

supported_xss is initialized to host_xss & KVM_SUPPORTED_XSS to indicate the
MSR_IA32_XSS bits currently supported by KVM, but the actual XSS bits seen by
the guest depend on the guest's setting of CPUID(0xd,1).{ECX, EDX}.
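
The relationship between the guest's CPUID(0xd,1).{ECX, EDX} and the XSS bits
the guest may set can be sketched in standalone C as follows. This is an
illustration under simplifying assumptions, not the kernel code: the function
names are hypothetical and the CPUID entry is reduced to two 32-bit values and
an XSAVES flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* XSS bits the guest may use: the bits enumerated in its
 * CPUID(0xd,1).{ECX, EDX}, limited to what KVM supports. Without XSAVES
 * the guest gets no XSS bits at all. */
static uint64_t guest_supported_xss(uint32_t ecx, uint32_t edx,
				    bool guest_has_xsaves,
				    uint64_t supported_xss)
{
	if (!guest_has_xsaves)
		return 0;
	return (((uint64_t)edx << 32) | ecx) & supported_xss;
}

/* A guest write to IA32_XSS is rejected if it sets any bit outside the
 * guest-supported mask. */
static bool xss_write_allowed(uint64_t data, uint64_t guest_xss_mask)
{
	return (data & ~guest_xss_mask) == 0;
}
```

For example, with supported_xss covering bits 11 and 12 and a guest CPUID that
enumerates only bit 11, a guest write that sets bit 12 would be rejected.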

Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            | 23 +++++++++++++++++++----
 arch/x86/kvm/x86.c              | 12 ++++++++----
 3 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 42a2d0d3984a..f68c825e94ad 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -649,6 +649,7 @@ struct kvm_vcpu_arch {
 
 	u64 xcr0;
 	u64 guest_supported_xcr0;
+	u64 guest_supported_xss;
 	u32 guest_xstate_size;
 
 	struct kvm_pio_request pio;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 901cd1fdecd9..984ab2b395b3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -89,15 +89,30 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 		vcpu->arch.guest_xstate_size = XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET;
 	} else {
 		vcpu->arch.guest_supported_xcr0 =
-			(best->eax | ((u64)best->edx << 32)) & supported_xcr0;
+			(((u64)best->edx << 32) | best->eax) & supported_xcr0;
 		vcpu->arch.guest_xstate_size = best->ebx =
 			xstate_required_size(vcpu->arch.xcr0, false);
 	}
 
 	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
-	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
-		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+	if (best) {
+		if (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
+		    cpuid_entry_has(best, X86_FEATURE_XSAVEC))  {
+			u64 xstate = vcpu->arch.xcr0 | vcpu->arch.ia32_xss;
+
+			best->ebx = xstate_required_size(xstate, true);
+		}
+
+		if (!cpuid_entry_has(best, X86_FEATURE_XSAVES)) {
+			best->ecx = 0;
+			best->edx = 0;
+		}
+		vcpu->arch.guest_supported_xss =
+			(((u64)best->edx << 32) | best->ecx) & supported_xss;
+
+	} else {
+		vcpu->arch.guest_supported_xss = 0;
+	}
 
 	/*
 	 * The existing code assumes virtual address is 48-bit or 57-bit in the
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b63727318da1..c866087ed0ef 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2843,9 +2843,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
 		 * XSAVES/XRSTORS to save/restore PT MSRs.
 		 */
-		if (data & ~supported_xss)
+		if (data & ~vcpu->arch.guest_supported_xss)
 			return 1;
-		vcpu->arch.ia32_xss = data;
+		if (vcpu->arch.ia32_xss != data) {
+			vcpu->arch.ia32_xss = data;
+			kvm_update_cpuid(vcpu);
+		}
 		break;
 	case MSR_SMI_COUNT:
 		if (!msr_info->host_initiated)
@@ -9678,8 +9681,9 @@ int kvm_arch_hardware_setup(void *opaque)
 
 	memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
 
-	if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
-		supported_xss = 0;
+	supported_xss = 0;
+	if (kvm_cpu_cap_has(X86_FEATURE_XSAVES))
+		supported_xss = host_xss & KVM_SUPPORTED_XSS;
 
 	cr4_reserved_bits = kvm_host_cr4_reserved_bits(&boot_cpu_data);
 
-- 
2.17.2



* [PATCH v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (4 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 07/11] KVM: x86: Add userspace access interface for CET MSRs Yang Weijiang
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

From: Sean Christopherson <sean.j.christopherson@intel.com>

A handful of CET MSRs are not context switched through "traditional"
methods, e.g. VMCS or manual switching, but rather are passed through
to the guest and are saved and restored by XSAVES/XRSTORS, i.e. in the
guest's FPU state.

Load the guest's FPU state if userspace is accessing MSRs whose values
are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(),
can simply do {RD,WR}MSR to access the guest's value.

Note that guest_cpuid_has() is not queried as host userspace is allowed
to access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.
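
To illustrate the intended behaviour (this sketch is not part of the patch),
the following standalone C model shows that the guest FPU state would be loaded
at most once per batch of MSR accesses, and only when the batch touches an
XSAVES-managed MSR. The MSR indices are those from patch 01;
fpu_loads_for_batch() is a hypothetical name, and it only counts loads instead
of touching real FPU state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_U_CET   0x6a0u
#define MSR_PL0_SSP 0x6a4u
#define MSR_PL3_SSP 0x6a7u

/* MSRs whose guest values live in the XSAVES-managed FPU state. */
static bool is_xsaves_msr(uint32_t index)
{
	return index == MSR_U_CET ||
	       (index >= MSR_PL0_SSP && index <= MSR_PL3_SSP);
}

/* Returns how many times the guest FPU state would be loaded while
 * processing the batch: 0 if no XSAVES-managed MSR is present, else 1. */
static int fpu_loads_for_batch(const uint32_t *indices, size_t n)
{
	bool fpu_loaded = false;
	int loads = 0;

	for (size_t i = 0; i < n; i++) {
		if (!fpu_loaded && is_xsaves_msr(indices[i])) {
			fpu_loaded = true;	/* load once, on first need */
			loads++;
		}
		/* The per-MSR read/write handler would run here. */
	}
	return loads;
}
```

Loading lazily keeps batches without CET MSRs free of any FPU state switch.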

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/x86.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c866087ed0ef..50f80dcab3a9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -109,6 +109,8 @@ static void enter_smm(struct kvm_vcpu *vcpu);
 static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 static void store_regs(struct kvm_vcpu *vcpu);
 static int sync_regs(struct kvm_vcpu *vcpu);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
 
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_x86_ops);
@@ -3267,6 +3269,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 }
 EXPORT_SYMBOL_GPL(kvm_get_msr_common);
 
+static bool is_xsaves_msr(u32 index)
+{
+	return index == MSR_IA32_U_CET ||
+	       (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
+}
+
 /*
  * Read or write a bunch of msrs. All parameters are kernel addresses.
  *
@@ -3277,11 +3285,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
 		    int (*do_msr)(struct kvm_vcpu *vcpu,
 				  unsigned index, u64 *data))
 {
+	bool fpu_loaded = false;
 	int i;
 
-	for (i = 0; i < msrs->nmsrs; ++i)
+	for (i = 0; i < msrs->nmsrs; ++i) {
+		if (vcpu && !fpu_loaded && supported_xss &&
+		    is_xsaves_msr(entries[i].index)) {
+			kvm_load_guest_fpu(vcpu);
+			fpu_loaded = true;
+		}
 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
 			break;
+	}
+	if (fpu_loaded)
+		kvm_put_guest_fpu(vcpu);
 
 	return i;
 }
-- 
2.17.2



* [PATCH v13 07/11] KVM: x86: Add userspace access interface for CET MSRs
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (5 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 08/11] KVM: VMX: Enable CET support for nested VM Yang Weijiang
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

Guest CET state is stored in two different places. State managed with
XSAVES/XRSTORS, which the previous patch saves and restores, can be read
from and written to the MSRs directly. State stored in VMCS fields is
accessed via vmcs_read/vmcs_write.

To correctly read/write the CET MSRs, it's necessary to check whether a
kernel FPU context switch has happened and to reload the guest FPU context
if needed.
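
As a rough illustration of the validity checks applied before a write is
accepted, here is a minimal user-space sketch. It is not the code in this
patch: the sketch_* names are hypothetical, the canonical check assumes
48-bit virtual addresses (the kernel derives the width from the vCPU), and
the IBT bitmap-base check done for U_CET/S_CET is left out. The two masks
mirror GENMASK(2, 0) and GENMASK(9, 6) from the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit ranges as CET_MSR_RSVD_BITS_1 and CET_MSR_RSVD_BITS_2. */
#define SKETCH_RSVD_BITS_SSP 0x7ULL    /* bits 2:0 */
#define SKETCH_RSVD_BITS_CET 0x3c0ULL  /* bits 9:6 */

/*
 * Assumption: 48-bit virtual addresses, so bits 63:47 must be either
 * all zeroes or all ones for the address to be canonical.
 */
static bool sketch_is_canonical(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffffULL;
}

/* SSP-style MSRs hold an address: canonical and reserved bits clear. */
static bool sketch_ssp_write_valid(uint64_t data)
{
	return sketch_is_canonical(data) && !(data & SKETCH_RSVD_BITS_SSP);
}

/* Control MSRs (U_CET/S_CET): only the reserved-bit check is modelled. */
static bool sketch_cet_ctl_write_valid(uint64_t data)
{
	return !(data & SKETCH_RSVD_BITS_CET);
}
```

In the patch itself a failed check makes vmx_set_msr() return 1, which
KVM's callers treat as a failed MSR access.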

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/uapi/asm/kvm_para.h |   7 +-
 arch/x86/kvm/vmx/vmx.c               | 148 +++++++++++++++++++++++++++
 arch/x86/kvm/x86.c                   |   4 +
 3 files changed, 156 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 2a8e0b6b9805..211bba6f7d8a 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -46,10 +46,11 @@
 /* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */
 #define MSR_KVM_WALL_CLOCK_NEW  0x4b564d00
 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01
-#define MSR_KVM_ASYNC_PF_EN 0x4b564d02
-#define MSR_KVM_STEAL_TIME  0x4b564d03
-#define MSR_KVM_PV_EOI_EN      0x4b564d04
+#define MSR_KVM_ASYNC_PF_EN     0x4b564d02
+#define MSR_KVM_STEAL_TIME      0x4b564d03
+#define MSR_KVM_PV_EOI_EN       0x4b564d04
 #define MSR_KVM_POLL_CONTROL	0x4b564d05
+#define MSR_KVM_GUEST_SSP       0x4b564d06
 
 struct kvm_steal_time {
 	__u64 steal;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7137e252ab38..7f3a65ee64c5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1777,6 +1777,94 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 	}
 }
 
+static void vmx_get_xsave_msr(struct msr_data *msr_info)
+{
+	local_irq_disable();
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+	rdmsrl(msr_info->index, msr_info->data);
+	local_irq_enable();
+}
+
+static void vmx_set_xsave_msr(struct msr_data *msr_info)
+{
+	local_irq_disable();
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+	wrmsrl(msr_info->index, msr_info->data);
+	local_irq_enable();
+}
+
+#define CET_MSR_RSVD_BITS_1  GENMASK(2, 0)
+#define CET_MSR_RSVD_BITS_2  GENMASK(9, 6)
+
+static bool cet_check_msr_valid(struct kvm_vcpu *vcpu,
+				struct msr_data *msr, u64 rsvd_bits)
+{
+	u64 data = msr->data;
+	u32 index = msr->index;
+
+	if ((index == MSR_IA32_PL0_SSP || index == MSR_IA32_PL1_SSP ||
+	    index == MSR_IA32_PL2_SSP || index == MSR_IA32_PL3_SSP ||
+	    index == MSR_IA32_INT_SSP_TAB || index == MSR_KVM_GUEST_SSP) &&
+	    is_noncanonical_address(data, vcpu))
+		return false;
+
+	if ((index  == MSR_IA32_S_CET || index == MSR_IA32_U_CET) &&
+	    data & MSR_IA32_CET_ENDBR_EN) {
+		u64 bitmap_base = data >> 12;
+
+		if (is_noncanonical_address(bitmap_base, vcpu))
+			return false;
+	}
+
+	return !(data & rsvd_bits);
+}
+
+static bool cet_check_ssp_msr_accessible(struct kvm_vcpu *vcpu,
+					 struct msr_data *msr)
+{
+	u32 index = msr->index;
+
+	if (!boot_cpu_has(X86_FEATURE_SHSTK))
+		return false;
+
+	if (!msr->host_initiated &&
+	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
+		return false;
+
+	if (index == MSR_KVM_GUEST_SSP)
+		return msr->host_initiated &&
+		       guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+
+	if (index == MSR_IA32_INT_SSP_TAB)
+		return true;
+
+	if (index == MSR_IA32_PL3_SSP)
+		return supported_xss & XFEATURE_MASK_CET_USER;
+
+	return supported_xss & XFEATURE_MASK_CET_KERNEL;
+}
+
+static bool cet_check_ctl_msr_accessible(struct kvm_vcpu *vcpu,
+					 struct msr_data *msr)
+{
+	u32 index = msr->index;
+
+	if (!boot_cpu_has(X86_FEATURE_SHSTK) &&
+	    !boot_cpu_has(X86_FEATURE_IBT))
+		return false;
+
+	if (!msr->host_initiated &&
+	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
+	    !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
+		return false;
+
+	if (index == MSR_IA32_U_CET)
+		return supported_xss & XFEATURE_MASK_CET_USER;
+
+	return supported_xss & XFEATURE_MASK_CET_KERNEL;
+}
 /*
  * Reads an msr value (of 'msr_index') into 'pdata'.
  * Returns 0 on success, non-0 otherwise.
@@ -1909,6 +1997,31 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
 		break;
+	case MSR_KVM_GUEST_SSP:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		msr_info->data = vmcs_readl(GUEST_SSP);
+		break;
+	case MSR_IA32_S_CET:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		msr_info->data = vmcs_readl(GUEST_S_CET);
+		break;
+	case MSR_IA32_INT_SSP_TAB:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
+		break;
+	case MSR_IA32_U_CET:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		vmx_get_xsave_msr(msr_info);
+		break;
+	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		vmx_get_xsave_msr(msr_info);
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -2165,6 +2278,41 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			vmx->pt_desc.guest.addr_a[index / 2] = data;
 		break;
+	case MSR_KVM_GUEST_SSP:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1))
+			return 1;
+		vmcs_writel(GUEST_SSP, data);
+		break;
+	case MSR_IA32_S_CET:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2))
+			return 1;
+		vmcs_writel(GUEST_S_CET, data);
+		break;
+	case MSR_IA32_INT_SSP_TAB:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, 0))
+			return 1;
+		vmcs_writel(GUEST_INTR_SSP_TABLE, data);
+		break;
+	case MSR_IA32_U_CET:
+		if (!cet_check_ctl_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_2))
+			return 1;
+		vmx_set_xsave_msr(msr_info);
+		break;
+	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
+		if (!cet_check_ssp_msr_accessible(vcpu, msr_info))
+			return 1;
+		if (!cet_check_msr_valid(vcpu, msr_info, CET_MSR_RSVD_BITS_1))
+			return 1;
+		vmx_set_xsave_msr(msr_info);
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 50f80dcab3a9..9c16ce65fe74 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1228,6 +1228,10 @@ static const u32 msrs_to_save_all[] = {
 	MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
 	MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
+
+	MSR_IA32_XSS, MSR_IA32_U_CET, MSR_IA32_S_CET,
+	MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP,
+	MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, MSR_KVM_GUEST_SSP,
 };
 
 static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
-- 
2.17.2



* [PATCH v13 08/11] KVM: VMX: Enable CET support for nested VM
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (6 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 07/11] KVM: x86: Add userspace access interface for CET MSRs Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states Yang Weijiang
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

CET MSRs are passed through to guests for performance reasons. Configure
the MSRs to match the L0/L1 settings so that a nested VM is able to run
with CET.

Add assertions to the vmcs12 offset table initialization. These assertions
detect a mismatch between a VMCS field encoding and its data type at
compile time.
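
For readers unfamiliar with the encoding, the idea behind the assertions
can be sketched as follows. Bits 14:13 of a VMCS field encoding give the
field width and bit 0 selects the high half of a 64-bit field, so a
width mismatch is detectable from the constant alone. This is only an
illustration: the sketch_* names are hypothetical and the checks are
written with C11 static_assert rather than the BUILD_BUG_ON_ZERO() form
used in the diff. The four encodings are the values from asm/vmx.h.

```c
#include <assert.h>
#include <stdint.h>

/* Width codes held in bits 14:13 of a VMCS field encoding. */
enum sketch_vmcs_width {
	SKETCH_W16  = 0,	/* 16-bit field */
	SKETCH_W64  = 1,	/* 64-bit field */
	SKETCH_W32  = 2,	/* 32-bit field */
	SKETCH_WNAT = 3,	/* natural-width field */
};

#define SKETCH_FIELD_WIDTH(enc) (((enc) >> 13) & 0x3)

/* Encodings as defined in asm/vmx.h. */
static_assert(SKETCH_FIELD_WIDTH(0x0800) == SKETCH_W16,  "GUEST_ES_SELECTOR");
static_assert(SKETCH_FIELD_WIDTH(0x2000) == SKETCH_W64,  "IO_BITMAP_A");
static_assert(SKETCH_FIELD_WIDTH(0x4000) == SKETCH_W32,  "PIN_BASED_VM_EXEC_CONTROL");
static_assert(SKETCH_FIELD_WIDTH(0x6800) == SKETCH_WNAT, "GUEST_CR0");

/* Run-time equivalent of the width lookup. */
static inline unsigned int sketch_field_width(uint16_t enc)
{
	return SKETCH_FIELD_WIDTH(enc);
}

/* Same rotation as ROL16(number, 6), used to index the offset table. */
static inline uint16_t sketch_table_index(uint16_t enc)
{
	return (uint16_t)((enc << 6) | (enc >> 10));
}
```

Declaring GUEST_CR0 with a 32-bit helper would then fail the build instead
of silently producing a wrong offset table entry.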

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/nested.c |  34 +++++
 arch/x86/kvm/vmx/vmcs12.c | 275 ++++++++++++++++++++++----------------
 arch/x86/kvm/vmx/vmcs12.h |  14 +-
 arch/x86/kvm/vmx/vmx.c    |  10 ++
 4 files changed, 220 insertions(+), 113 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index fd78ffbde644..ce29475226b6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -555,6 +555,18 @@ static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap)
 	}
 }
 
+static void nested_vmx_update_intercept_for_msr(struct kvm_vcpu *vcpu,
+						u32 msr,
+						unsigned long *msr_bitmap_l1,
+						unsigned long *msr_bitmap_l0,
+						int type)
+{
+	if (!msr_write_intercepted_l01(vcpu, msr))
+		nested_vmx_disable_intercept_for_msr(msr_bitmap_l1,
+						     msr_bitmap_l0,
+						     msr, type);
+}
+
 /*
  * Merge L0's and L1's MSR bitmap, return false to indicate that
  * we do not use the hardware.
@@ -626,6 +638,28 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
 	nested_vmx_disable_intercept_for_msr(msr_bitmap_l1, msr_bitmap_l0,
 					     MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
 
+	/* Pass CET MSRs to nested VM if L0 and L1 are set to pass-through. */
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_U_CET,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_S_CET,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
+	nested_vmx_update_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
+					    msr_bitmap_l1, msr_bitmap_l0,
+					    MSR_TYPE_RW);
 	/*
 	 * Checking the L0->L1 bitmap is trying to verify two things:
 	 *
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index 53dfb401316d..f68ac66b1170 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -4,31 +4,76 @@
 
 #define ROL16(val, n) ((u16)(((u16)(val) << (n)) | ((u16)(val) >> (16 - (n)))))
 #define VMCS12_OFFSET(x) offsetof(struct vmcs12, x)
-#define FIELD(number, name)	[ROL16(number, 6)] = VMCS12_OFFSET(name)
-#define FIELD64(number, name)						\
-	FIELD(number, name),						\
-	[ROL16(number##_HIGH, 6)] = VMCS12_OFFSET(name) + sizeof(u32)
+
+#define VMCS_CHECK_SIZE16(number) \
+	(BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2000) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2001) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x4000) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x6000))
+
+#define VMCS_CHECK_SIZE32(number) \
+	(BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x6000))
+
+#define VMCS_CHECK_SIZE64(number) \
+	(BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2001) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x4000) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x6000))
+
+#define VMCS_CHECK_SIZE_N(number) \
+	(BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2000) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6001) == 0x2001) + \
+	BUILD_BUG_ON_ZERO(__builtin_constant_p(number) && \
+	((number) & 0x6000) == 0x4000))
+
+#define FIELD16(number, name) \
+	[ROL16(number, 6)] = VMCS_CHECK_SIZE16(number) + VMCS12_OFFSET(name)
+
+#define FIELD32(number, name) \
+	[ROL16(number, 6)] = VMCS_CHECK_SIZE32(number) + VMCS12_OFFSET(name)
+
+#define FIELD64(number, name)  \
+	FIELD32(number, name), \
+	[ROL16(number##_HIGH, 6)] = VMCS_CHECK_SIZE32(number) + \
+	VMCS12_OFFSET(name) + sizeof(u32)
+#define FIELDN(number, name) \
+	[ROL16(number, 6)] = VMCS_CHECK_SIZE_N(number) + VMCS12_OFFSET(name)
 
 const unsigned short vmcs_field_to_offset_table[] = {
-	FIELD(VIRTUAL_PROCESSOR_ID, virtual_processor_id),
-	FIELD(POSTED_INTR_NV, posted_intr_nv),
-	FIELD(GUEST_ES_SELECTOR, guest_es_selector),
-	FIELD(GUEST_CS_SELECTOR, guest_cs_selector),
-	FIELD(GUEST_SS_SELECTOR, guest_ss_selector),
-	FIELD(GUEST_DS_SELECTOR, guest_ds_selector),
-	FIELD(GUEST_FS_SELECTOR, guest_fs_selector),
-	FIELD(GUEST_GS_SELECTOR, guest_gs_selector),
-	FIELD(GUEST_LDTR_SELECTOR, guest_ldtr_selector),
-	FIELD(GUEST_TR_SELECTOR, guest_tr_selector),
-	FIELD(GUEST_INTR_STATUS, guest_intr_status),
-	FIELD(GUEST_PML_INDEX, guest_pml_index),
-	FIELD(HOST_ES_SELECTOR, host_es_selector),
-	FIELD(HOST_CS_SELECTOR, host_cs_selector),
-	FIELD(HOST_SS_SELECTOR, host_ss_selector),
-	FIELD(HOST_DS_SELECTOR, host_ds_selector),
-	FIELD(HOST_FS_SELECTOR, host_fs_selector),
-	FIELD(HOST_GS_SELECTOR, host_gs_selector),
-	FIELD(HOST_TR_SELECTOR, host_tr_selector),
+	FIELD16(VIRTUAL_PROCESSOR_ID, virtual_processor_id),
+	FIELD16(POSTED_INTR_NV, posted_intr_nv),
+	FIELD16(GUEST_ES_SELECTOR, guest_es_selector),
+	FIELD16(GUEST_CS_SELECTOR, guest_cs_selector),
+	FIELD16(GUEST_SS_SELECTOR, guest_ss_selector),
+	FIELD16(GUEST_DS_SELECTOR, guest_ds_selector),
+	FIELD16(GUEST_FS_SELECTOR, guest_fs_selector),
+	FIELD16(GUEST_GS_SELECTOR, guest_gs_selector),
+	FIELD16(GUEST_LDTR_SELECTOR, guest_ldtr_selector),
+	FIELD16(GUEST_TR_SELECTOR, guest_tr_selector),
+	FIELD16(GUEST_INTR_STATUS, guest_intr_status),
+	FIELD16(GUEST_PML_INDEX, guest_pml_index),
+	FIELD16(HOST_ES_SELECTOR, host_es_selector),
+	FIELD16(HOST_CS_SELECTOR, host_cs_selector),
+	FIELD16(HOST_SS_SELECTOR, host_ss_selector),
+	FIELD16(HOST_DS_SELECTOR, host_ds_selector),
+	FIELD16(HOST_FS_SELECTOR, host_fs_selector),
+	FIELD16(HOST_GS_SELECTOR, host_gs_selector),
+	FIELD16(HOST_TR_SELECTOR, host_tr_selector),
 	FIELD64(IO_BITMAP_A, io_bitmap_a),
 	FIELD64(IO_BITMAP_B, io_bitmap_b),
 	FIELD64(MSR_BITMAP, msr_bitmap),
@@ -64,94 +109,100 @@ const unsigned short vmcs_field_to_offset_table[] = {
 	FIELD64(HOST_IA32_PAT, host_ia32_pat),
 	FIELD64(HOST_IA32_EFER, host_ia32_efer),
 	FIELD64(HOST_IA32_PERF_GLOBAL_CTRL, host_ia32_perf_global_ctrl),
-	FIELD(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control),
-	FIELD(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control),
-	FIELD(EXCEPTION_BITMAP, exception_bitmap),
-	FIELD(PAGE_FAULT_ERROR_CODE_MASK, page_fault_error_code_mask),
-	FIELD(PAGE_FAULT_ERROR_CODE_MATCH, page_fault_error_code_match),
-	FIELD(CR3_TARGET_COUNT, cr3_target_count),
-	FIELD(VM_EXIT_CONTROLS, vm_exit_controls),
-	FIELD(VM_EXIT_MSR_STORE_COUNT, vm_exit_msr_store_count),
-	FIELD(VM_EXIT_MSR_LOAD_COUNT, vm_exit_msr_load_count),
-	FIELD(VM_ENTRY_CONTROLS, vm_entry_controls),
-	FIELD(VM_ENTRY_MSR_LOAD_COUNT, vm_entry_msr_load_count),
-	FIELD(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field),
-	FIELD(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code),
-	FIELD(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len),
-	FIELD(TPR_THRESHOLD, tpr_threshold),
-	FIELD(SECONDARY_VM_EXEC_CONTROL, secondary_vm_exec_control),
-	FIELD(VM_INSTRUCTION_ERROR, vm_instruction_error),
-	FIELD(VM_EXIT_REASON, vm_exit_reason),
-	FIELD(VM_EXIT_INTR_INFO, vm_exit_intr_info),
-	FIELD(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code),
-	FIELD(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field),
-	FIELD(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code),
-	FIELD(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len),
-	FIELD(VMX_INSTRUCTION_INFO, vmx_instruction_info),
-	FIELD(GUEST_ES_LIMIT, guest_es_limit),
-	FIELD(GUEST_CS_LIMIT, guest_cs_limit),
-	FIELD(GUEST_SS_LIMIT, guest_ss_limit),
-	FIELD(GUEST_DS_LIMIT, guest_ds_limit),
-	FIELD(GUEST_FS_LIMIT, guest_fs_limit),
-	FIELD(GUEST_GS_LIMIT, guest_gs_limit),
-	FIELD(GUEST_LDTR_LIMIT, guest_ldtr_limit),
-	FIELD(GUEST_TR_LIMIT, guest_tr_limit),
-	FIELD(GUEST_GDTR_LIMIT, guest_gdtr_limit),
-	FIELD(GUEST_IDTR_LIMIT, guest_idtr_limit),
-	FIELD(GUEST_ES_AR_BYTES, guest_es_ar_bytes),
-	FIELD(GUEST_CS_AR_BYTES, guest_cs_ar_bytes),
-	FIELD(GUEST_SS_AR_BYTES, guest_ss_ar_bytes),
-	FIELD(GUEST_DS_AR_BYTES, guest_ds_ar_bytes),
-	FIELD(GUEST_FS_AR_BYTES, guest_fs_ar_bytes),
-	FIELD(GUEST_GS_AR_BYTES, guest_gs_ar_bytes),
-	FIELD(GUEST_LDTR_AR_BYTES, guest_ldtr_ar_bytes),
-	FIELD(GUEST_TR_AR_BYTES, guest_tr_ar_bytes),
-	FIELD(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info),
-	FIELD(GUEST_ACTIVITY_STATE, guest_activity_state),
-	FIELD(GUEST_SYSENTER_CS, guest_sysenter_cs),
-	FIELD(HOST_IA32_SYSENTER_CS, host_ia32_sysenter_cs),
-	FIELD(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value),
-	FIELD(CR0_GUEST_HOST_MASK, cr0_guest_host_mask),
-	FIELD(CR4_GUEST_HOST_MASK, cr4_guest_host_mask),
-	FIELD(CR0_READ_SHADOW, cr0_read_shadow),
-	FIELD(CR4_READ_SHADOW, cr4_read_shadow),
-	FIELD(CR3_TARGET_VALUE0, cr3_target_value0),
-	FIELD(CR3_TARGET_VALUE1, cr3_target_value1),
-	FIELD(CR3_TARGET_VALUE2, cr3_target_value2),
-	FIELD(CR3_TARGET_VALUE3, cr3_target_value3),
-	FIELD(EXIT_QUALIFICATION, exit_qualification),
-	FIELD(GUEST_LINEAR_ADDRESS, guest_linear_address),
-	FIELD(GUEST_CR0, guest_cr0),
-	FIELD(GUEST_CR3, guest_cr3),
-	FIELD(GUEST_CR4, guest_cr4),
-	FIELD(GUEST_ES_BASE, guest_es_base),
-	FIELD(GUEST_CS_BASE, guest_cs_base),
-	FIELD(GUEST_SS_BASE, guest_ss_base),
-	FIELD(GUEST_DS_BASE, guest_ds_base),
-	FIELD(GUEST_FS_BASE, guest_fs_base),
-	FIELD(GUEST_GS_BASE, guest_gs_base),
-	FIELD(GUEST_LDTR_BASE, guest_ldtr_base),
-	FIELD(GUEST_TR_BASE, guest_tr_base),
-	FIELD(GUEST_GDTR_BASE, guest_gdtr_base),
-	FIELD(GUEST_IDTR_BASE, guest_idtr_base),
-	FIELD(GUEST_DR7, guest_dr7),
-	FIELD(GUEST_RSP, guest_rsp),
-	FIELD(GUEST_RIP, guest_rip),
-	FIELD(GUEST_RFLAGS, guest_rflags),
-	FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions),
-	FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp),
-	FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip),
-	FIELD(HOST_CR0, host_cr0),
-	FIELD(HOST_CR3, host_cr3),
-	FIELD(HOST_CR4, host_cr4),
-	FIELD(HOST_FS_BASE, host_fs_base),
-	FIELD(HOST_GS_BASE, host_gs_base),
-	FIELD(HOST_TR_BASE, host_tr_base),
-	FIELD(HOST_GDTR_BASE, host_gdtr_base),
-	FIELD(HOST_IDTR_BASE, host_idtr_base),
-	FIELD(HOST_IA32_SYSENTER_ESP, host_ia32_sysenter_esp),
-	FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip),
-	FIELD(HOST_RSP, host_rsp),
-	FIELD(HOST_RIP, host_rip),
+	FIELD32(PIN_BASED_VM_EXEC_CONTROL, pin_based_vm_exec_control),
+	FIELD32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control),
+	FIELD32(EXCEPTION_BITMAP, exception_bitmap),
+	FIELD32(PAGE_FAULT_ERROR_CODE_MASK, page_fault_error_code_mask),
+	FIELD32(PAGE_FAULT_ERROR_CODE_MATCH, page_fault_error_code_match),
+	FIELD32(CR3_TARGET_COUNT, cr3_target_count),
+	FIELD32(VM_EXIT_CONTROLS, vm_exit_controls),
+	FIELD32(VM_EXIT_MSR_STORE_COUNT, vm_exit_msr_store_count),
+	FIELD32(VM_EXIT_MSR_LOAD_COUNT, vm_exit_msr_load_count),
+	FIELD32(VM_ENTRY_CONTROLS, vm_entry_controls),
+	FIELD32(VM_ENTRY_MSR_LOAD_COUNT, vm_entry_msr_load_count),
+	FIELD32(VM_ENTRY_INTR_INFO_FIELD, vm_entry_intr_info_field),
+	FIELD32(VM_ENTRY_EXCEPTION_ERROR_CODE, vm_entry_exception_error_code),
+	FIELD32(VM_ENTRY_INSTRUCTION_LEN, vm_entry_instruction_len),
+	FIELD32(TPR_THRESHOLD, tpr_threshold),
+	FIELD32(SECONDARY_VM_EXEC_CONTROL, secondary_vm_exec_control),
+	FIELD32(VM_INSTRUCTION_ERROR, vm_instruction_error),
+	FIELD32(VM_EXIT_REASON, vm_exit_reason),
+	FIELD32(VM_EXIT_INTR_INFO, vm_exit_intr_info),
+	FIELD32(VM_EXIT_INTR_ERROR_CODE, vm_exit_intr_error_code),
+	FIELD32(IDT_VECTORING_INFO_FIELD, idt_vectoring_info_field),
+	FIELD32(IDT_VECTORING_ERROR_CODE, idt_vectoring_error_code),
+	FIELD32(VM_EXIT_INSTRUCTION_LEN, vm_exit_instruction_len),
+	FIELD32(VMX_INSTRUCTION_INFO, vmx_instruction_info),
+	FIELD32(GUEST_ES_LIMIT, guest_es_limit),
+	FIELD32(GUEST_CS_LIMIT, guest_cs_limit),
+	FIELD32(GUEST_SS_LIMIT, guest_ss_limit),
+	FIELD32(GUEST_DS_LIMIT, guest_ds_limit),
+	FIELD32(GUEST_FS_LIMIT, guest_fs_limit),
+	FIELD32(GUEST_GS_LIMIT, guest_gs_limit),
+	FIELD32(GUEST_LDTR_LIMIT, guest_ldtr_limit),
+	FIELD32(GUEST_TR_LIMIT, guest_tr_limit),
+	FIELD32(GUEST_GDTR_LIMIT, guest_gdtr_limit),
+	FIELD32(GUEST_IDTR_LIMIT, guest_idtr_limit),
+	FIELD32(GUEST_ES_AR_BYTES, guest_es_ar_bytes),
+	FIELD32(GUEST_CS_AR_BYTES, guest_cs_ar_bytes),
+	FIELD32(GUEST_SS_AR_BYTES, guest_ss_ar_bytes),
+	FIELD32(GUEST_DS_AR_BYTES, guest_ds_ar_bytes),
+	FIELD32(GUEST_FS_AR_BYTES, guest_fs_ar_bytes),
+	FIELD32(GUEST_GS_AR_BYTES, guest_gs_ar_bytes),
+	FIELD32(GUEST_LDTR_AR_BYTES, guest_ldtr_ar_bytes),
+	FIELD32(GUEST_TR_AR_BYTES, guest_tr_ar_bytes),
+	FIELD32(GUEST_INTERRUPTIBILITY_INFO, guest_interruptibility_info),
+	FIELD32(GUEST_ACTIVITY_STATE, guest_activity_state),
+	FIELD32(GUEST_SYSENTER_CS, guest_sysenter_cs),
+	FIELD32(HOST_IA32_SYSENTER_CS, host_ia32_sysenter_cs),
+	FIELD32(VMX_PREEMPTION_TIMER_VALUE, vmx_preemption_timer_value),
+	FIELDN(CR0_GUEST_HOST_MASK, cr0_guest_host_mask),
+	FIELDN(CR4_GUEST_HOST_MASK, cr4_guest_host_mask),
+	FIELDN(CR0_READ_SHADOW, cr0_read_shadow),
+	FIELDN(CR4_READ_SHADOW, cr4_read_shadow),
+	FIELDN(CR3_TARGET_VALUE0, cr3_target_value0),
+	FIELDN(CR3_TARGET_VALUE1, cr3_target_value1),
+	FIELDN(CR3_TARGET_VALUE2, cr3_target_value2),
+	FIELDN(CR3_TARGET_VALUE3, cr3_target_value3),
+	FIELDN(EXIT_QUALIFICATION, exit_qualification),
+	FIELDN(GUEST_LINEAR_ADDRESS, guest_linear_address),
+	FIELDN(GUEST_CR0, guest_cr0),
+	FIELDN(GUEST_CR3, guest_cr3),
+	FIELDN(GUEST_CR4, guest_cr4),
+	FIELDN(GUEST_ES_BASE, guest_es_base),
+	FIELDN(GUEST_CS_BASE, guest_cs_base),
+	FIELDN(GUEST_SS_BASE, guest_ss_base),
+	FIELDN(GUEST_DS_BASE, guest_ds_base),
+	FIELDN(GUEST_FS_BASE, guest_fs_base),
+	FIELDN(GUEST_GS_BASE, guest_gs_base),
+	FIELDN(GUEST_LDTR_BASE, guest_ldtr_base),
+	FIELDN(GUEST_TR_BASE, guest_tr_base),
+	FIELDN(GUEST_GDTR_BASE, guest_gdtr_base),
+	FIELDN(GUEST_IDTR_BASE, guest_idtr_base),
+	FIELDN(GUEST_DR7, guest_dr7),
+	FIELDN(GUEST_RSP, guest_rsp),
+	FIELDN(GUEST_RIP, guest_rip),
+	FIELDN(GUEST_RFLAGS, guest_rflags),
+	FIELDN(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions),
+	FIELDN(GUEST_SYSENTER_ESP, guest_sysenter_esp),
+	FIELDN(GUEST_SYSENTER_EIP, guest_sysenter_eip),
+	FIELDN(GUEST_S_CET, guest_s_cet),
+	FIELDN(GUEST_SSP, guest_ssp),
+	FIELDN(GUEST_INTR_SSP_TABLE, guest_ssp_tbl),
+	FIELDN(HOST_CR0, host_cr0),
+	FIELDN(HOST_CR3, host_cr3),
+	FIELDN(HOST_CR4, host_cr4),
+	FIELDN(HOST_FS_BASE, host_fs_base),
+	FIELDN(HOST_GS_BASE, host_gs_base),
+	FIELDN(HOST_TR_BASE, host_tr_base),
+	FIELDN(HOST_GDTR_BASE, host_gdtr_base),
+	FIELDN(HOST_IDTR_BASE, host_idtr_base),
+	FIELDN(HOST_IA32_SYSENTER_ESP, host_ia32_sysenter_esp),
+	FIELDN(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip),
+	FIELDN(HOST_RSP, host_rsp),
+	FIELDN(HOST_RIP, host_rip),
+	FIELDN(HOST_S_CET, host_s_cet),
+	FIELDN(HOST_SSP, host_ssp),
+	FIELDN(HOST_INTR_SSP_TABLE, host_ssp_tbl),
 };
 const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs_field_to_offset_table);
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index d0c6df373f67..62b7be68f05c 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -118,7 +118,13 @@ struct __packed vmcs12 {
 	natural_width host_ia32_sysenter_eip;
 	natural_width host_rsp;
 	natural_width host_rip;
-	natural_width paddingl[8]; /* room for future expansion */
+	natural_width host_s_cet;
+	natural_width host_ssp;
+	natural_width host_ssp_tbl;
+	natural_width guest_s_cet;
+	natural_width guest_ssp;
+	natural_width guest_ssp_tbl;
+	natural_width paddingl[2]; /* room for future expansion */
 	u32 pin_based_vm_exec_control;
 	u32 cpu_based_vm_exec_control;
 	u32 exception_bitmap;
@@ -301,6 +307,12 @@ static inline void vmx_check_vmcs12_offsets(void)
 	CHECK_OFFSET(host_ia32_sysenter_eip, 656);
 	CHECK_OFFSET(host_rsp, 664);
 	CHECK_OFFSET(host_rip, 672);
+	CHECK_OFFSET(host_s_cet, 680);
+	CHECK_OFFSET(host_ssp, 688);
+	CHECK_OFFSET(host_ssp_tbl, 696);
+	CHECK_OFFSET(guest_s_cet, 704);
+	CHECK_OFFSET(guest_ssp, 712);
+	CHECK_OFFSET(guest_ssp_tbl, 720);
 	CHECK_OFFSET(pin_based_vm_exec_control, 744);
 	CHECK_OFFSET(cpu_based_vm_exec_control, 748);
 	CHECK_OFFSET(exception_bitmap, 752);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7f3a65ee64c5..32893573b630 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7189,6 +7189,7 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 	cr4_fixed1_update(X86_CR4_PKE,        ecx, feature_bit(PKU));
 	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
 	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
+	cr4_fixed1_update(X86_CR4_CET,	      ecx, feature_bit(SHSTK));
 
 #undef cr4_fixed1_update
 }
@@ -7208,6 +7209,15 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 			vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
 		}
 	}
+
+	if (is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER |
+	    XFEATURE_MASK_CET_KERNEL)) {
+		vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_CET_STATE;
+		vmx->nested.msrs.exit_ctls_high |= VM_EXIT_LOAD_CET_STATE;
+	} else {
+		vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_CET_STATE;
+		vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_LOAD_CET_STATE;
+	}
 }
 
 static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
-- 
2.17.2



* [PATCH v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (7 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 08/11] KVM: VMX: Enable CET support for nested VM Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 10/11] KVM: x86: Add #CP support in guest exception dispatch Yang Weijiang
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

Dump CET VMCS state for debugging purposes. Since CET kernel protection is
not enabled, warn once if the related MSRs in the host are found to have
been filled by mistake.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 32893573b630..70cb2d4a1391 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5941,6 +5941,12 @@ void dump_vmcs(void)
 		pr_err("InterruptStatus = %04x\n",
 		       vmcs_read16(GUEST_INTR_STATUS));
 
+	if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) {
+		pr_err("S_CET = 0x%016lx\n", vmcs_readl(GUEST_S_CET));
+		pr_err("SSP = 0x%016lx\n", vmcs_readl(GUEST_SSP));
+		pr_err("SSP TABLE = 0x%016lx\n",
+		       vmcs_readl(GUEST_INTR_SSP_TABLE));
+	}
 	pr_err("*** Host State ***\n");
 	pr_err("RIP = 0x%016lx  RSP = 0x%016lx\n",
 	       vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP));
@@ -6023,6 +6029,12 @@ void dump_vmcs(void)
 	if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID)
 		pr_err("Virtual processor ID = 0x%04x\n",
 		       vmcs_read16(VIRTUAL_PROCESSOR_ID));
+	if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) {
+		pr_err("S_CET = 0x%016lx\n", vmcs_readl(HOST_S_CET));
+		pr_err("SSP = 0x%016lx\n", vmcs_readl(HOST_SSP));
+		pr_err("SSP TABLE = 0x%016lx\n",
+		       vmcs_readl(HOST_INTR_SSP_TABLE));
+	}
 }
 
 /*
@@ -8075,6 +8087,7 @@ static __init int hardware_setup(void)
 	unsigned long host_bndcfgs;
 	struct desc_ptr dt;
 	int r, i, ept_lpage_level;
+	u64 cet_msr;
 
 	store_idt(&dt);
 	host_idt_base = dt.address;
@@ -8236,6 +8249,16 @@ static __init int hardware_setup(void)
 			return r;
 	}
 
+	if (boot_cpu_has(X86_FEATURE_IBT) || boot_cpu_has(X86_FEATURE_SHSTK)) {
+		rdmsrl(MSR_IA32_S_CET, cet_msr);
+		WARN_ONCE(cet_msr, "KVM: CET S_CET in host will be lost!\n");
+	}
+
+	if (boot_cpu_has(X86_FEATURE_SHSTK)) {
+		rdmsrl(MSR_IA32_PL0_SSP, cet_msr);
+		WARN_ONCE(cet_msr, "KVM: CET PL0_SSP in host will be lost!\n");
+	}
+
 	vmx_set_cpu_caps();
 
 	r = alloc_kvm_area();
-- 
2.17.2



* [PATCH v13 10/11] KVM: x86: Add #CP support in guest exception dispatch
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (8 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-01  8:04 ` [PATCH v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace Yang Weijiang
  2020-07-13 18:13 ` [PATCH v13 00/11] Introduce support for guest CET feature Sean Christopherson
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

The CPU defines #CP (vector 21) to handle CET-induced exceptions. It is
accompanied by several error codes corresponding to different CET
violation cases; see the SDM for a detailed description. The exception is
classified as a contributory exception w.r.t. #DF.
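
The two properties above can be modelled in a few lines of stand-alone C.
This is a sketch only: the sketch_* names are hypothetical, the vector
numbers are the ones from uapi/asm/kvm.h, and the contributory set follows
the existing cases in exception_class() plus #CP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Exception vector numbers as defined in uapi/asm/kvm.h. */
enum {
	SK_DE = 0,  SK_DF = 8,  SK_TS = 10, SK_NP = 11, SK_SS = 12,
	SK_GP = 13, SK_PF = 14, SK_AC = 17, SK_CP = 21,
};

#define SK_BIT(n) (1U << (n))

/* Vectors that push an error code, now including #CP. */
static bool sketch_has_error_code(unsigned int vector)
{
	const uint32_t mask = SK_BIT(SK_DF) | SK_BIT(SK_TS) | SK_BIT(SK_NP) |
			      SK_BIT(SK_SS) | SK_BIT(SK_GP) | SK_BIT(SK_PF) |
			      SK_BIT(SK_AC) | SK_BIT(SK_CP);

	/* Guard the shift: only vectors 0..31 fit in the mask. */
	return vector < 32 && (SK_BIT(vector) & mask);
}

/* Vectors treated as contributory when deciding whether to raise #DF. */
static bool sketch_is_contributory(unsigned int vector)
{
	switch (vector) {
	case SK_DE:
	case SK_TS:
	case SK_NP:
	case SK_SS:
	case SK_GP:
	case SK_CP:
		return true;
	default:
		return false;
	}
}
```

With this classification, a #CP raised while delivering another
contributory exception escalates to #DF, the same as #GP would.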

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/uapi/asm/kvm.h | 1 +
 arch/x86/kvm/x86.c              | 1 +
 arch/x86/kvm/x86.h              | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 3f3f780c8c65..78e5c4266270 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -31,6 +31,7 @@
 #define MC_VECTOR 18
 #define XM_VECTOR 19
 #define VE_VECTOR 20
+#define CP_VECTOR 21
 
 /* Select x86 specific features in <linux/kvm.h> */
 #define __KVM_HAVE_PIT
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9c16ce65fe74..94ca5b56d233 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -407,6 +407,7 @@ static int exception_class(int vector)
 	case NP_VECTOR:
 	case SS_VECTOR:
 	case GP_VECTOR:
+	case CP_VECTOR:
 		return EXCPT_CONTRIBUTORY;
 	default:
 		break;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index b968acc0516f..7374e77c91d8 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -115,7 +115,7 @@ static inline bool x86_exception_has_error_code(unsigned int vector)
 {
 	static u32 exception_has_error_code = BIT(DF_VECTOR) | BIT(TS_VECTOR) |
 			BIT(NP_VECTOR) | BIT(SS_VECTOR) | BIT(GP_VECTOR) |
-			BIT(PF_VECTOR) | BIT(AC_VECTOR);
+			BIT(PF_VECTOR) | BIT(AC_VECTOR) | BIT(CP_VECTOR);
 
 	return (1U << vector) & exception_has_error_code;
 }
-- 
2.17.2



* [PATCH v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (9 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 10/11] KVM: x86: Add #CP support in guest exception dispatch Yang Weijiang
@ 2020-07-01  8:04 ` Yang Weijiang
  2020-07-13 18:13 ` [PATCH v13 00/11] Introduce support for guest CET feature Sean Christopherson
  11 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-01  8:04 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, sean.j.christopherson, jmattson
  Cc: yu.c.zhang, Yang Weijiang

Set the feature bits so that CET capabilities are visible to the guest via
CPUID enumeration. Add CR4.CET support so that the guest can set the CET
master control bit (CR4.CET).

Disable the KVM CET feature when unrestricted_guest is turned off, because
KVM cannot emulate guest CET behavior well in that case.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 3 ++-
 arch/x86/kvm/cpuid.c            | 5 +++--
 arch/x86/kvm/vmx/vmx.c          | 5 +++++
 arch/x86/kvm/x86.c              | 5 +++++
 4 files changed, 15 insertions(+), 3 deletions(-)
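
Note: the XSS filtering added to kvm_arch_hardware_setup() can be sketched in
standalone C as below. This is an illustration under stated assumptions, not
the KVM code: compute_supported_xss() is a hypothetical stand-in, the boolean
parameters replace the kvm_cpu_cap_has() queries, the starting value of 0 is
assumed, and the bit positions (11 for CET user state, 12 for CET kernel
state) follow the IA32_XSS comments in patch 03.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_XSS bit positions for the two CET state components. */
#define XFEATURE_MASK_CET_USER   (UINT64_C(1) << 11)
#define XFEATURE_MASK_CET_KERNEL (UINT64_C(1) << 12)
#define KVM_SUPPORTED_XSS \
	(XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)

/* Hypothetical stand-in for the setup-time computation of supported_xss. */
static uint64_t compute_supported_xss(uint64_t host_xss, bool has_xsaves,
				      bool has_shstk, bool has_ibt)
{
	uint64_t supported_xss = 0;	/* assumed initial value */

	/* Only XSS bits that both the host and KVM support are candidates. */
	if (has_xsaves)
		supported_xss = host_xss & KVM_SUPPORTED_XSS;

	/* With neither CET sub-feature available, drop both CET components. */
	if (!has_shstk && !has_ibt)
		supported_xss &= ~(XFEATURE_MASK_CET_USER |
				   XFEATURE_MASK_CET_KERNEL);

	return supported_xss;
}
```

Either sub-feature alone keeps the CET bits; they are cleared only when both
SHSTK and IBT are unavailable, for example after vmx_set_cpu_caps() clears
them because unrestricted_guest is off.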

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f68c825e94ad..21f3c89d8c70 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -95,7 +95,8 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+			  | X86_CR4_CET))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 984ab2b395b3..333a9e0d7cdf 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -344,7 +344,8 @@ void kvm_set_cpu_caps(void)
 		F(AVX512VBMI) | F(LA57) | 0 /*PKU*/ | 0 /*OSPKE*/ | F(RDPID) |
 		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
 		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
-		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
+		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
+		F(SHSTK)
 	);
 	/* Set LA57 based on hardware capability. */
 	if (cpuid_ecx(7) & F(LA57))
@@ -353,7 +354,7 @@ void kvm_set_cpu_caps(void)
 	kvm_cpu_cap_mask(CPUID_7_EDX,
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
 		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
-		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM)
+		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) | F(IBT)
 	);
 
 	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 70cb2d4a1391..7dac5747adc8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7411,6 +7411,11 @@ static __init void vmx_set_cpu_caps(void)
 	/* CPUID 0x80000001 */
 	if (!cpu_has_vmx_rdtscp())
 		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
+
+	if (!enable_unrestricted_guest) {
+		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+		kvm_cpu_cap_clear(X86_FEATURE_IBT);
+	}
 }
 
 static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 94ca5b56d233..a4cf5f3211f3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9707,6 +9707,11 @@ int kvm_arch_hardware_setup(void *opaque)
 	if (kvm_cpu_cap_has(X86_FEATURE_XSAVES))
 		supported_xss = host_xss & KVM_SUPPORTED_XSS;
 
+	if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+	    !kvm_cpu_cap_has(X86_FEATURE_IBT))
+		supported_xss &= ~(XFEATURE_MASK_CET_USER |
+				   XFEATURE_MASK_CET_KERNEL);
+
 	cr4_reserved_bits = kvm_host_cr4_reserved_bits(&boot_cpu_data);
 
 	if (kvm_has_tsc_control) {
-- 
2.17.2



* Re: [PATCH v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration
  2020-07-01  8:04 ` [PATCH v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Yang Weijiang
@ 2020-07-02 15:13   ` Xiaoyao Li
  2020-07-03 15:02     ` Yang Weijiang
  0 siblings, 1 reply; 16+ messages in thread
From: Xiaoyao Li @ 2020-07-02 15:13 UTC (permalink / raw)
  To: Yang Weijiang, kvm, linux-kernel, pbonzini,
	sean.j.christopherson, jmattson
  Cc: yu.c.zhang

On 7/1/2020 4:04 PM, Yang Weijiang wrote:
> CET MSRs are passed through to the guest directly to improve performance. CET
> runtime control settings are stored in MSR_IA32_{U,S}_CET, Shadow Stack
> Pointers (SSP) are stored in MSR_IA32_PL{0,1,2,3}_SSP, and the SSP table base
> address is stored in MSR_IA32_INT_SSP_TAB. These MSRs are defined in the
> kernel and re-used here.
> 
> MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user-mode protection. Their
> contents are switched between threads during scheduling, so it makes sense to
> pass them through so that the guest kernel can use xsaves/xrstors to operate
> on them efficiently. The other MSRs are used for non-user-mode protection.
> See the SDM for details.
> 
> The difference between CET VMCS fields and CET MSRs is that the former are
> used during VMEnter/VMExit, whereas the latter are used for CET state storage
> across task/thread scheduling.
> 
> Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>   arch/x86/kvm/vmx/vmx.c | 46 ++++++++++++++++++++++++++++++++++++++++++
>   arch/x86/kvm/x86.c     |  3 +++
>   2 files changed, 49 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d52d470e36b1..97e766875a7e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -3020,6 +3020,13 @@ void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long cr3)
>   		vmcs_writel(GUEST_CR3, guest_cr3);
>   }
>   
> +static bool is_cet_state_supported(struct kvm_vcpu *vcpu, u32 xss_states)
> +{
> +	return ((supported_xss & xss_states) &&
> +		(guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> +		guest_cpuid_has(vcpu, X86_FEATURE_IBT)));
> +}
> +
>   int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>   {
>   	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -7098,6 +7105,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
>   		vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
>   }
>   
> +static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
> +	bool incpt;
> +
> +	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_USER);
> +	/*
> +	 * U_CET is required for USER CET, and U_CET, PL3_SSP are bound as
> +	 * one component and controlled by IA32_XSS[bit 11].
> +	 */
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW,
> +				  incpt);
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW,
> +				  incpt);
> +
> +	incpt = !is_cet_state_supported(vcpu, XFEATURE_MASK_CET_KERNEL);
> +	/*
> +	 * S_CET is required for KERNEL CET, and PL0_SSP ... PL2_SSP are
> +	 * bound as one component and controlled by IA32_XSS[bit 12].
> +	 */
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW,
> +				  incpt);
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW,
> +				  incpt);
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW,
> +				  incpt);
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW,
> +				  incpt);
> +
> +	incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
> +	/* SSP_TAB is only available for KERNEL SHSTK.*/
> +	vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW,
> +				  incpt);
> +}
> +
>   static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>   {
>   	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -7136,6 +7179,9 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>   			vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
>   		}
>   	}
> +
> +	if (supported_xss & (XFEATURE_MASK_CET_KERNEL | XFEATURE_MASK_CET_USER))
> +		vmx_update_intercept_for_cet_msr(vcpu);
>   }
>   
>   static __init void vmx_set_cpu_caps(void)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c5835f9cb9ad..6390b62c12ed 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -186,6 +186,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
>   				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
>   				| XFEATURE_MASK_PKRU)
>   
> +#define KVM_SUPPORTED_XSS       (XFEATURE_MASK_CET_USER | \
> +				 XFEATURE_MASK_CET_KERNEL)
> +

Does this definition need to be moved to Patch 5?

>   u64 __read_mostly host_efer;
>   EXPORT_SYMBOL_GPL(host_efer);
>   
> 
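
For reference, the intercept decisions in the quoted
vmx_update_intercept_for_cet_msr() can be sketched in standalone C as below.
This is only an illustration: struct cet_intercepts and both functions are
hypothetical names, the guest CPUID checks are reduced to two booleans, and a
true field means accesses to that MSR group are intercepted rather than passed
through.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_XSS bits for the CET state components, per the quoted comments. */
#define XFEATURE_MASK_CET_USER   (UINT64_C(1) << 11)
#define XFEATURE_MASK_CET_KERNEL (UINT64_C(1) << 12)

/* Hypothetical result type: true means the MSR group is intercepted. */
struct cet_intercepts {
	bool user;	/* MSR_IA32_U_CET, MSR_IA32_PL3_SSP */
	bool kernel;	/* MSR_IA32_S_CET, MSR_IA32_PL0_SSP..PL2_SSP */
	bool ssp_tab;	/* MSR_IA32_INT_SSP_TAB */
};

/* Mirrors is_cet_state_supported() with the vCPU reduced to two flags. */
static bool cet_state_supported(uint64_t supported_xss, uint64_t xss_states,
				bool guest_shstk, bool guest_ibt)
{
	return (supported_xss & xss_states) && (guest_shstk || guest_ibt);
}

static struct cet_intercepts cet_msr_intercepts(uint64_t supported_xss,
						bool guest_shstk,
						bool guest_ibt)
{
	struct cet_intercepts i;

	i.user = !cet_state_supported(supported_xss, XFEATURE_MASK_CET_USER,
				      guest_shstk, guest_ibt);
	i.kernel = !cet_state_supported(supported_xss,
					XFEATURE_MASK_CET_KERNEL,
					guest_shstk, guest_ibt);
	/* The SSP table additionally requires SHSTK in guest CPUID. */
	i.ssp_tab = i.kernel || !guest_shstk;

	return i;
}
```

For a guest that exposes only IBT while both XSS components are supported,
this passes through the user and kernel groups but still intercepts
MSR_IA32_INT_SSP_TAB.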



* Re: [PATCH v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration
  2020-07-02 15:13   ` Xiaoyao Li
@ 2020-07-03 15:02     ` Yang Weijiang
  0 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-03 15:02 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Yang Weijiang, kvm, linux-kernel, pbonzini,
	sean.j.christopherson, jmattson, yu.c.zhang

On Thu, Jul 02, 2020 at 11:13:35PM +0800, Xiaoyao Li wrote:
> On 7/1/2020 4:04 PM, Yang Weijiang wrote:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index c5835f9cb9ad..6390b62c12ed 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -186,6 +186,9 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
> >   				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
> >   				| XFEATURE_MASK_PKRU)
> > +#define KVM_SUPPORTED_XSS       (XFEATURE_MASK_CET_USER | \
> > +				 XFEATURE_MASK_CET_KERNEL)
> > +
> 
> Does this definition need to be moved to Patch 5?
> 
Good catch, thanks! I'll move it in the next series.

> >   u64 __read_mostly host_efer;
> >   EXPORT_SYMBOL_GPL(host_efer);
> > 


* Re: [PATCH v13 00/11] Introduce support for guest CET feature
  2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
                   ` (10 preceding siblings ...)
  2020-07-01  8:04 ` [PATCH v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace Yang Weijiang
@ 2020-07-13 18:13 ` Sean Christopherson
  2020-07-15  0:40   ` Yang Weijiang
  11 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2020-07-13 18:13 UTC (permalink / raw)
  To: Yang Weijiang; +Cc: kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Wed, Jul 01, 2020 at 04:04:00PM +0800, Yang Weijiang wrote:
> Control-flow Enforcement Technology (CET) provides protection against
> Return/Jump-Oriented Programming (ROP/JOP) attack. There're two CET
> sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
> SHSTK is to prevent ROP programming and IBT is to prevent JOP programming.
> 
> Several parts in KVM have been updated to provide VM CET support, including:
> CPUID/XSAVES config, MSR pass-through, user space MSR access interface, 
> vmentry/vmexit config, nested VM etc. These patches have dependency on CET
> kernel patches for xsaves support and CET definitions, e.g., MSR and related
> feature flags.
> 
> CET kernel patches are here:
> https://lkml.kernel.org/r/20200429220732.31602-1-yu-cheng.yu@intel.com
> 
> v13:
> - Added CET definitions as a separate patch to facilitate KVM test.
> - Disabled CET support in KVM if unrestricted_guest is turned off since
>   in this case CET related instructions/infrastructure cannot be emulated
>   well.

This needs to be rebased, I can't get it to apply on any kvm branch nor on
any 5.8 rc.  And when you send series, especially large series that touch
lots of code, please explicitly state what commit the series is based on to
make it easy for reviewers to apply the patches, even if the series needs a
rebase.


* Re: [PATCH v13 00/11] Introduce support for guest CET feature
  2020-07-13 18:13 ` [PATCH v13 00/11] Introduce support for guest CET feature Sean Christopherson
@ 2020-07-15  0:40   ` Yang Weijiang
  0 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2020-07-15  0:40 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Yang Weijiang, kvm, linux-kernel, pbonzini, jmattson, yu.c.zhang

On Mon, Jul 13, 2020 at 11:13:26AM -0700, Sean Christopherson wrote:
> On Wed, Jul 01, 2020 at 04:04:00PM +0800, Yang Weijiang wrote:
> > Control-flow Enforcement Technology (CET) provides protection against
> > Return/Jump-Oriented Programming (ROP/JOP) attack. There're two CET
> > sub-features: Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT).
> > SHSTK is to prevent ROP programming and IBT is to prevent JOP programming.
> > 
> > Several parts in KVM have been updated to provide VM CET support, including:
> > CPUID/XSAVES config, MSR pass-through, user space MSR access interface, 
> > vmentry/vmexit config, nested VM etc. These patches have dependency on CET
> > kernel patches for xsaves support and CET definitions, e.g., MSR and related
> > feature flags.
> > 
> > CET kernel patches are here:
> > https://lkml.kernel.org/r/20200429220732.31602-1-yu-cheng.yu@intel.com
> > 
> > v13:
> > - Added CET definitions as a separate patch to facilitate KVM test.
> > - Disabled CET support in KVM if unrestricted_guest is turned off since
> >   in this case CET related instructions/infrastructure cannot be emulated
> >   well.
> 
> This needs to be rebased, I can't get it to apply on any kvm branch nor on
> any 5.8 rc.  And when you send series, especially large series that touch
> lots of code, please explicitly state what commit the series is based on to
> make it easy for reviewers to apply the patches, even if the series needs a
> rebase.
Sorry for the inconvenience, I'll rebase and resend this series.


Thread overview: 16+ messages
2020-07-01  8:04 [PATCH v13 00/11] Introduce support for guest CET feature Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 01/11] KVM: x86: Include CET definitions for KVM test purpose Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 02/11] KVM: VMX: Introduce CET VMCS fields and flags Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 03/11] KVM: VMX: Set guest CET MSRs per KVM and host configuration Yang Weijiang
2020-07-02 15:13   ` Xiaoyao Li
2020-07-03 15:02     ` Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 04/11] KVM: VMX: Configure CET settings upon guest CR0/4 changing Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 05/11] KVM: x86: Refresh CPUID once guest changes XSS bits Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 06/11] KVM: x86: Load guest fpu state when access MSRs managed by XSAVES Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 07/11] KVM: x86: Add userspace access interface for CET MSRs Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 08/11] KVM: VMX: Enable CET support for nested VM Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 09/11] KVM: VMX: Add VMCS dump and sanity check for CET states Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 10/11] KVM: x86: Add #CP support in guest exception dispatch Yang Weijiang
2020-07-01  8:04 ` [PATCH v13 11/11] KVM: x86: Enable CET virtualization and advertise CET to userspace Yang Weijiang
2020-07-13 18:13 ` [PATCH v13 00/11] Introduce support for guest CET feature Sean Christopherson
2020-07-15  0:40   ` Yang Weijiang
