linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support
@ 2019-03-18 15:03 Yang Weijiang
  2019-03-18 15:03 ` [RFC PATCH v4 1/8] KVM:VMX: Define CET VMCS fields and bits Yang Weijiang
                   ` (9 more replies)
  0 siblings, 10 replies; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Yang Weijiang

Control-flow Enforcement Technology (CET) provides protection against
return/jump-oriented programming (ROP) attacks. To make kvm Guest OS own
the capability, this patch-set is required. It enables CET related CPUID
report, xsaves/xrstors, vmx entry configuration etc. for Guest OS.

PATCH 1    : Define CET VMCS fields and bits.
PATCH 2/3  : Report CET feature support in CPUID.
PATCH 4    : Fix xsaves size calculation issue.
PATCH 5    : Pass through CET MSRs to Guest.
PATCH 6    : Set Guest CET state auto loading bit.
PATCH 7    : Load Guest fpu state when accessing CET XSAVES managed MSRs.
PATCH 8    : Add CET MSR user space access interface.

Changelog:

 v4:
 - Add Sean's patch for loading Guest fpu state before access XSAVES
   managed CET MSRs.
 - Melt down CET bits setting into CPUID configuration patch.
 - Add VMX interface to query Host XSS.
 - Check Host and Guest XSS support bits before set Guest XSS.
 - Make Guest SHSTK and IBT feature enabling independent.
 - Do not report CET support to Guest when Host CET feature is
   Disabled.

 v3:
 - Modified patches to make Guest CET independent to Host enabling.
 - Added patch 8 to add user space access for Guest CET MSR access.
 - Modified code comments and patch description to reflect changes.

 v2:
 - Re-ordered patch sequence, combined one patch.
 - Added more description for CET related VMCS fields.
 - Added Host CET capability check while enabling Guest CET loading bit.
 - Added Host CET capability check while reporting Guest CPUID(EAX=7,
   EXC=0).
 - Modified code in reporting Guest CPUID(EAX=D,ECX>=1), make it clearer.
 - Added Host and Guest XSS mask check while setting bits for Guest XSS.


Sean Christopherson (1):
  KVM:x86: load guest fpu state when accessing MSRs managed by XSAVES

Yang Weijiang (7):
  KVM:VMX: Define CET VMCS fields and bits
  KVM:CPUID: Add CET CPUID support for Guest
  KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1).
  KVM:VMX: Pass through host CET related MSRs to Guest.
  KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest
  KVM:x86: Allow Guest to set supported bits in XSS
  KVM:x86: Add user-space read/write interface for CET MSRs

 arch/x86/include/asm/kvm_host.h  |  5 +-
 arch/x86/include/asm/msr-index.h |  2 +
 arch/x86/include/asm/vmx.h       |  8 +++
 arch/x86/kvm/cpuid.c             | 53 ++++++++++++++------
 arch/x86/kvm/vmx.c               | 86 ++++++++++++++++++++++++++++++--
 arch/x86/kvm/x86.c               | 32 +++++++++++-
 arch/x86/kvm/x86.h               |  4 ++
 7 files changed, 167 insertions(+), 23 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH v4 1/8] KVM:VMX: Define CET VMCS fields and bits
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
@ 2019-03-18 15:03 ` Yang Weijiang
  2019-04-02 19:57   ` Sean Christopherson
  2019-03-18 15:03 ` [RFC PATCH v4 2/8] KVM:CPUID: Add CET CPUID support for Guest Yang Weijiang
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Yang Weijiang, Zhang Yi Z

CET - Control-flow Enforcement Technology, it's used to
protect against return/jump oriented programming (ROP)
attacks. It provides the following capabilities to defend
against ROP/JOP style control-flow subversion attacks:
- Shadow Stack (SHSTK):
                 A second stack for the program that is
                 used exclusively for control transfer
                 operations.
- Indirect Branch Tracking (IBT):
                 Free branch protection to defend against
                 Jump/Call Oriented Programming.

On processors that support CET, VMX saves/restores
the states of IA32_S_CET, SSP and IA32_INTR_SSP_TABL_ADDR MSR
to the VMCS area for Guest/Host unconditionally.

If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET MSRs are
restored from VMCS host-state area at VM exit as follows:

- HOST_S_CET: Host supervisor mode IA32_S_CET MSR is loaded
                   from this field.

- HOST_SSP : Host SSP is loaded from this field.

- HOST_INTR_SSP_TABLE : Host IA32_INTR_SSP_TABL_ADDR
                             MSR is loaded from this field.

If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET MSRs are loaded
from VMCS guest-state area at VM entry as follows:

- GUEST_S_CET : Guest supervisor mode IA32_S_CET MSR is loaded
                     from this field.

- GUEST_SSP :        Guest SSP is loaded from this field.

- GUEST_INTR_SSP_TABLE : Guest IA32_INTR_SSP_TABL_ADDR
                             MSR is loaded from this field.

Additionally, to context switch guest and host CET states, the VMM
uses xsaves/xrstors instructions to save/restore the guest CET states
at VM exit/entry. The CET xsave area is within thread_struct.fpu area.
If OS execution flow changes during task switch/interrupt/exception etc.,
the OS also relies on xsaves/xrstors to switch CET states accordingly.

Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/vmx.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index ade0f153947d..fd70c4577c5a 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -98,6 +98,7 @@
 #define VM_EXIT_LOAD_IA32_EFER                  0x00200000
 #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000
 #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
+#define VM_EXIT_LOAD_HOST_CET_STATE             0x10000000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -109,6 +110,7 @@
 #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
 #define VM_ENTRY_LOAD_IA32_EFER                 0x00008000
 #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
+#define VM_ENTRY_LOAD_GUEST_CET_STATE           0x00100000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
@@ -325,6 +327,9 @@ enum vmcs_field {
 	GUEST_PENDING_DBG_EXCEPTIONS    = 0x00006822,
 	GUEST_SYSENTER_ESP              = 0x00006824,
 	GUEST_SYSENTER_EIP              = 0x00006826,
+	GUEST_S_CET                     = 0x00006828,
+	GUEST_SSP                       = 0x0000682a,
+	GUEST_INTR_SSP_TABLE            = 0x0000682c,
 	HOST_CR0                        = 0x00006c00,
 	HOST_CR3                        = 0x00006c02,
 	HOST_CR4                        = 0x00006c04,
@@ -337,6 +342,9 @@ enum vmcs_field {
 	HOST_IA32_SYSENTER_EIP          = 0x00006c12,
 	HOST_RSP                        = 0x00006c14,
 	HOST_RIP                        = 0x00006c16,
+	HOST_S_CET                      = 0x00006c18,
+	HOST_SSP                        = 0x00006c1a,
+	HOST_INTR_SSP_TABLE             = 0x00006c1c
 };
 
 /*
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v4 2/8] KVM:CPUID: Add CET CPUID support for Guest
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
  2019-03-18 15:03 ` [RFC PATCH v4 1/8] KVM:VMX: Define CET VMCS fields and bits Yang Weijiang
@ 2019-03-18 15:03 ` Yang Weijiang
  2019-04-02 20:21   ` Sean Christopherson
  2019-04-02 20:40   ` Sean Christopherson
  2019-03-18 15:03 ` [RFC PATCH v4 3/8] KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1) Yang Weijiang
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Yang Weijiang, Zhang Yi Z

CET SHSTK and IBT capability are reported via
CPUID.(EAX=7, ECX=0):ECX[bit 7] and EDX[bit 20] respectively.
CR4.CET[bit 23] is CET master enable bit, it controls CET feature
enabling. CET user mode and supervisor mode xsaves component size
is reported via CPUID.(EAX=0xD, ECX=1):ECX[bit 11] and ECX[bit 12]
respectively.

Note: Although SHSTK or IBT can be enabled independently,
      both of them are controlled by CR4.CET.

Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  4 +++-
 arch/x86/kvm/cpuid.c            | 41 ++++++++++++++++++++++-----------
 arch/x86/kvm/vmx.c              |  6 +++++
 arch/x86/kvm/x86.h              |  4 ++++
 4 files changed, 40 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 55e51ff7e421..fc038bf1924a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -90,7 +90,8 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+			  | X86_CR4_CET))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
@@ -1185,6 +1186,7 @@ struct kvm_x86_ops {
 
 	int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
 				   uint16_t *vmcs_version);
+	u64 (*vmx_supported_xss)(void);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7bcfa61375c0..53abd6019c68 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -65,6 +65,11 @@ u64 kvm_supported_xcr0(void)
 	return xcr0;
 }
 
+u64 kvm_supported_xss(void)
+{
+	return KVM_SUPPORTED_XSS & kvm_x86_ops->vmx_supported_xss();
+}
+
 #define F(x) bit(X86_FEATURE_##x)
 
 /* For scattered features from cpufeatures.h; we currently expose none */
@@ -406,12 +411,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ |
 		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
 		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
-		F(CLDEMOTE);
+		F(CLDEMOTE) | F(SHSTK);
 
 	/* cpuid 7.0.edx*/
 	const u32 kvm_cpuid_7_0_edx_x86_features =
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
-		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES);
+		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(IBT);
 
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
@@ -564,14 +569,16 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	}
 	case 0xd: {
 		int idx, i;
-		u64 supported = kvm_supported_xcr0();
+		u64 u_supported = kvm_supported_xcr0();
+		u64 s_supported = kvm_supported_xss();
+		u64 supported;
 
-		entry->eax &= supported;
-		entry->ebx = xstate_required_size(supported, false);
+		entry->eax &= u_supported;
+		entry->ebx = xstate_required_size(u_supported, false);
 		entry->ecx = entry->ebx;
-		entry->edx &= supported >> 32;
+		entry->edx &= u_supported >> 32;
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-		if (!supported)
+		if (!u_supported && !s_supported)
 			break;
 
 		for (idx = 1, i = 1; idx < 64; ++idx) {
@@ -583,19 +590,25 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			if (idx == 1) {
 				entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
 				cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
+				supported = u_supported | s_supported;
 				entry[i].ebx = 0;
-				if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
+				entry[i].edx = 0;
+				entry[i].ecx &= s_supported;
+				if (entry[i].eax & (F(XSAVES) | F(XSAVEC))) {
 					entry[i].ebx =
-						xstate_required_size(supported,
-								     true);
+					xstate_required_size(supported,
+							     true);
+				}
 			} else {
+				supported = (entry[i].ecx & 1) ? s_supported :
+								 u_supported;
 				if (entry[i].eax == 0 || !(supported & mask))
 					continue;
-				if (WARN_ON_ONCE(entry[i].ecx & 1))
-					continue;
+				entry[i].ecx &= 1;
+				entry[i].edx = 0;
+				if (entry[i].ecx)
+					entry[i].ebx = 0;
 			}
-			entry[i].ecx = 0;
-			entry[i].edx = 0;
 			entry[i].flags |=
 			       KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 			++*nent;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7bbb8b26e901..53cef5a3db96 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4169,6 +4169,11 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	return 0;
 }
 
+static __always_inline u64 vmx_supported_xss(void)
+{
+	return host_xss;
+}
+
 static void vmx_leave_nested(struct kvm_vcpu *vcpu);
 
 /*
@@ -15107,6 +15112,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.enable_smi_window = enable_smi_window,
 
 	.nested_enable_evmcs = nested_enable_evmcs,
+	.vmx_supported_xss = vmx_supported_xss,
 };
 
 static void vmx_cleanup_l1d_flush(void)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 224cd0a47568..c61da41c3c5c 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -283,6 +283,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
 				| XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU)
+
+#define KVM_SUPPORTED_XSS	(XFEATURE_MASK_SHSTK_USER \
+				| XFEATURE_MASK_SHSTK_KERNEL)
+
 extern u64 host_xcr0;
 
 extern u64 kvm_supported_xcr0(void);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v4 3/8] KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1).
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
  2019-03-18 15:03 ` [RFC PATCH v4 1/8] KVM:VMX: Define CET VMCS fields and bits Yang Weijiang
  2019-03-18 15:03 ` [RFC PATCH v4 2/8] KVM:CPUID: Add CET CPUID support for Guest Yang Weijiang
@ 2019-03-18 15:03 ` Yang Weijiang
  2019-04-02 20:27   ` Sean Christopherson
  2019-03-18 15:03 ` [RFC PATCH v4 4/8] KVM:VMX: Pass through host CET related MSRs to Guest Yang Weijiang
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Yang Weijiang, Zhang Yi Z

According to latest Software Development Manual vol.2/3.2,
for CPUID.(EAX=0xD,ECX=1), it should report xsaves area size
containing all states enabled  by XCR0|IA32_MSR_XSS.

Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/cpuid.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 53abd6019c68..29d6a5cdc746 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -126,7 +126,8 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 
 	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
 	if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+		best->ebx = xstate_required_size(vcpu->arch.xcr0 |
+			    kvm_supported_xss(), true);
 
 	/*
 	 * The existing code assumes virtual address is 48-bit or 57-bit in the
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v4 4/8] KVM:VMX: Pass through host CET related MSRs to Guest.
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
                   ` (2 preceding siblings ...)
  2019-03-18 15:03 ` [RFC PATCH v4 3/8] KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1) Yang Weijiang
@ 2019-03-18 15:03 ` Yang Weijiang
  2019-04-02 20:27   ` Sean Christopherson
  2019-03-18 15:03 ` [RFC PATCH v4 5/8] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest Yang Weijiang
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Yang Weijiang, Zhang Yi Z

The CET runtime settings, i.e., CET state control bits(IA32_U_CET/
IA32_S_CET), CET SSP(IA32_PL3_SSP/IA32_PL0_SSP) and SSP table address
(IA32_INTERRUPT_SSP_TABLE_ADDR) are task/thread specific, therefore,
OS needs to save/restore the states properly during context switch,
e.g., task/thread switching, interrupt/exception handling, it uses
xsaves/xrstors to achieve that.

The difference between VMCS CET area fields and xsave CET area, is that
the former is for state retention during Guest/Host context
switch while the latter is for state retention during OS execution.

Linux currently doesn't support CPL1 and CPL2, so SSPs for these level
are skipped here.

Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 53cef5a3db96..28b8ac027bd7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11774,6 +11774,7 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
 static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long *msr_bitmap;
 
 	if (cpu_has_secondary_exec_ctrls()) {
 		vmx_compute_secondary_exec_control(vmx);
@@ -11791,6 +11792,18 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 		nested_vmx_cr_fixed1_bits_update(vcpu);
 		nested_vmx_entry_exit_ctls_update(vcpu);
 	}
+
+	msr_bitmap = vmx->vmcs01.msr_bitmap;
+
+	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW);
+	}
+
 }
 
 static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v4 5/8] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
                   ` (3 preceding siblings ...)
  2019-03-18 15:03 ` [RFC PATCH v4 4/8] KVM:VMX: Pass through host CET related MSRs to Guest Yang Weijiang
@ 2019-03-18 15:03 ` Yang Weijiang
  2019-04-02 20:30   ` Sean Christopherson
  2019-03-18 15:03 ` [RFC PATCH v4 6/8] KVM:x86: Allow Guest to set supported bits in XSS Yang Weijiang
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Yang Weijiang, Zhang Yi Z

"Load Guest CET state" bit controls whether guest CET states
will be loaded at Guest entry. Before doing that, KVM needs
to check if CPU CET feature is available.

Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 28b8ac027bd7..246467c12930 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -55,6 +55,7 @@
 #include <asm/mmu_context.h>
 #include <asm/spec-ctrl.h>
 #include <asm/mshyperv.h>
+#include <asm/cet.h>
 
 #include "trace.h"
 #include "pmu.h"
@@ -5414,6 +5415,23 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			return 1;
 	}
 
+	/*
+	 * To enable Guest CET, check whether CPU CET feature is
+	 * available, if it's there, set Guest CET state loading bit
+	 * per CR4.CET status, otherwise, return a fault to Guest.
+	 */
+	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
+		if (cr4 & X86_CR4_CET)
+			vmcs_set_bits(VM_ENTRY_CONTROLS,
+				      VM_ENTRY_LOAD_GUEST_CET_STATE);
+		else
+			vmcs_clear_bits(VM_ENTRY_CONTROLS,
+					VM_ENTRY_LOAD_GUEST_CET_STATE);
+	} else if (cr4 & X86_CR4_CET) {
+		return 1;
+	}
+
 	if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
 		return 1;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v4 6/8] KVM:x86: Allow Guest to set supported bits in XSS
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
                   ` (4 preceding siblings ...)
  2019-03-18 15:03 ` [RFC PATCH v4 5/8] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest Yang Weijiang
@ 2019-03-18 15:03 ` Yang Weijiang
  2019-03-18 15:03 ` [RFC PATCH v4 7/8] KVM:x86: load guest fpu state when accessing MSRs managed by XSAVES Yang Weijiang
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Yang Weijiang, Zhang Yi Z

Now that KVM supports setting CET related bits.
Previously, KVM did not support setting any bits in XSS
and so hardcoded its check to inject a #GP if Guest
attempted to write a non-zero value to IA32_XSS.

Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            | 13 ++++++++++---
 arch/x86/kvm/vmx.c              |  7 ++-----
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fc038bf1924a..30dfb5a65d1b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -616,6 +616,7 @@ struct kvm_vcpu_arch {
 
 	u64 xcr0;
 	u64 guest_supported_xcr0;
+	u64 guest_supported_xss;
 	u32 guest_xstate_size;
 
 	struct kvm_pio_request pio;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 29d6a5cdc746..e66a0557899c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -125,9 +125,16 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 	}
 
 	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
-	if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0 |
-			    kvm_supported_xss(), true);
+	if (best) {
+		if (best->eax & (F(XSAVES) | F(XSAVEC)))
+			best->ebx = xstate_required_size(vcpu->arch.xcr0 |
+				    kvm_supported_xss(), true);
+
+		vcpu->arch.guest_supported_xss = best->ecx &
+			kvm_supported_xss();
+	} else {
+		vcpu->arch.guest_supported_xss = 0;
+	}
 
 	/*
 	 * The existing code assumes virtual address is 48-bit or 57-bit in the
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 246467c12930..816e11a66557 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4327,12 +4327,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_XSS:
 		if (!vmx_xsaves_supported())
 			return 1;
-		/*
-		 * The only supported bit as of Skylake is bit 8, but
-		 * it is not supported on KVM.
-		 */
-		if (data != 0)
+		if (data & ~vcpu->arch.guest_supported_xss)
 			return 1;
+
 		vcpu->arch.ia32_xss = data;
 		if (vcpu->arch.ia32_xss != host_xss)
 			add_atomic_switch_msr(vmx, MSR_IA32_XSS,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v4 7/8] KVM:x86: load guest fpu state when accessing MSRs managed by XSAVES
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
                   ` (5 preceding siblings ...)
  2019-03-18 15:03 ` [RFC PATCH v4 6/8] KVM:x86: Allow Guest to set supported bits in XSS Yang Weijiang
@ 2019-03-18 15:03 ` Yang Weijiang
  2019-04-02 20:39   ` Sean Christopherson
  2019-03-18 15:03 ` [RFC PATCH v4 8/8] KVM:x86: Add user-space read/write interface for CET MSRs Yang Weijiang
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Sean Christopherson, Yang Weijiang

From: Sean Christopherson <sean.j.christopherson@intel.com>

A handful of CET MSRs are not context switched through "traditional"
methods, e.g. VMCS or manual switching, but rather are passed through
to the guest and are saved and restored by XSAVES/XRSTORS, i.e. the
guest's FPU state.

Load the guest's FPU state if userspace is accessing MSRs whose values
are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(),
can simply do {RD,WR}MSR to access the guest's value.

Note that guest_cpuid_has() is not queried as host userspace is allowed
to access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/x86.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a0f8b71b2132..ed747dfbff69 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -104,6 +104,8 @@ static void enter_smm(struct kvm_vcpu *vcpu);
 static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 static void store_regs(struct kvm_vcpu *vcpu);
 static int sync_regs(struct kvm_vcpu *vcpu);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
 
 struct kvm_x86_ops *kvm_x86_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_x86_ops);
@@ -2889,6 +2891,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 }
 EXPORT_SYMBOL_GPL(kvm_get_msr_common);
 
+static bool is_xsaves_msr(u32 index)
+{
+	return index == MSR_IA32_U_CET ||
+	       (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
+}
+
 /*
  * Read or write a bunch of msrs. All parameters are kernel addresses.
  *
@@ -2899,11 +2907,33 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
 		    int (*do_msr)(struct kvm_vcpu *vcpu,
 				  unsigned index, u64 *data))
 {
+	bool fpu_loaded = false;
+	bool xss_loaded = false;
 	int i;
+	u64 cet_bits = XFEATURE_MASK_SHSTK_USER | XFEATURE_MASK_SHSTK_KERNEL;
+	u64 host_xss = 0;
+
+	for (i = 0; i < msrs->nmsrs; ++i) {
+		if (!fpu_loaded && is_xsaves_msr(entries[i].index)) {
+			if (!kvm_x86_ops->xsaves_supported())
+				continue;
+
+			if (!xss_loaded) {
+				rdmsrl(MSR_IA32_XSS, host_xss);
+				xss_loaded = true;
+			}
+
+			if ((host_xss & cet_bits) != cet_bits)
+				continue;
 
-	for (i = 0; i < msrs->nmsrs; ++i)
+			kvm_load_guest_fpu(vcpu);
+			fpu_loaded = true;
+		}
 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
 			break;
+	}
+	if (fpu_loaded)
+		kvm_put_guest_fpu(vcpu);
 
 	return i;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH v4 8/8] KVM:x86: Add user-space read/write interface for CET MSRs
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
                   ` (6 preceding siblings ...)
  2019-03-18 15:03 ` [RFC PATCH v4 7/8] KVM:x86: load guest fpu state when accessing MSRs managed by XSAVES Yang Weijiang
@ 2019-03-18 15:03 ` Yang Weijiang
  2019-04-02 20:35   ` Sean Christopherson
  2019-03-25 20:45 ` [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
  2019-04-02 20:10 ` Sean Christopherson
  9 siblings, 1 reply; 21+ messages in thread
From: Yang Weijiang @ 2019-03-18 15:03 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu
  Cc: Yang Weijiang

When VMExit occurs, the Guest CET MSRs are stored in two different
places, U_CET/PL0_SSP/PL1_SSP/PL2_SSP/PL3_SSP are stored in fpu
xsave area, they are operated by XSAVES/XRSTORS, so before access
these MSRs, kvm_load_guest_fpu is required to restore them to Host MSRs.
After finish operation, need to restore Host MSRs by kvm_put_guest_fpu,
these two functions are call in __msr_io(). S_CET and INTR_SSP_TABLE
are stored in VMCS fields, so they can be access via vmcs_read/vmcs_write
directly.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/msr-index.h |  2 ++
 arch/x86/kvm/vmx.c               | 42 ++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 298721ff00f4..cc94ab14a946 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -782,6 +782,8 @@
 #define MSR_IA32_U_CET		0x6a0 /* user mode cet setting */
 #define MSR_IA32_S_CET		0x6a2 /* kernel mode cet setting */
 #define MSR_IA32_PL0_SSP	0x6a4 /* kernel shstk pointer */
+#define MSR_IA32_PL1_SSP	0x6a5 /* ring 1 shstk pointer */
+#define MSR_IA32_PL2_SSP	0x6a6 /* ring 2 shstk pointer */
 #define MSR_IA32_PL3_SSP	0x6a7 /* user shstk pointer */
 #define MSR_IA32_INT_SSP_TAB	0x6a8 /* exception shstk table */
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 816e11a66557..5923b89fe120 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4153,6 +4153,27 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		msr_info->data = vcpu->arch.ia32_xss;
 		break;
+	case MSR_IA32_S_CET:
+		msr_info->data = vmcs_readl(GUEST_S_CET);
+		break;
+	case MSR_IA32_U_CET:
+		rdmsrl(MSR_IA32_U_CET, msr_info->data);
+		break;
+	case MSR_IA32_INT_SSP_TAB:
+		msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
+		break;
+	case MSR_IA32_PL0_SSP:
+		rdmsrl(MSR_IA32_PL0_SSP, msr_info->data);
+		break;
+	case MSR_IA32_PL1_SSP:
+		rdmsrl(MSR_IA32_PL1_SSP, msr_info->data);
+		break;
+	case MSR_IA32_PL2_SSP:
+		rdmsrl(MSR_IA32_PL2_SSP, msr_info->data);
+		break;
+	case MSR_IA32_PL3_SSP:
+		rdmsrl(MSR_IA32_PL3_SSP, msr_info->data);
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -4337,6 +4358,27 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			clear_atomic_switch_msr(vmx, MSR_IA32_XSS);
 		break;
+	case MSR_IA32_S_CET:
+		vmcs_writel(GUEST_S_CET, data);
+		break;
+	case MSR_IA32_U_CET:
+		wrmsrl(MSR_IA32_U_CET, data);
+		break;
+	case MSR_IA32_INT_SSP_TAB:
+		vmcs_writel(GUEST_INTR_SSP_TABLE, data);
+		break;
+	case MSR_IA32_PL0_SSP:
+		wrmsrl(MSR_IA32_PL0_SSP, data);
+		break;
+	case MSR_IA32_PL1_SSP:
+		wrmsrl(MSR_IA32_PL1_SSP, data);
+		break;
+	case MSR_IA32_PL2_SSP:
+		wrmsrl(MSR_IA32_PL2_SSP, data);
+		break;
+	case MSR_IA32_PL3_SSP:
+		wrmsrl(MSR_IA32_PL3_SSP, data);
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
                   ` (7 preceding siblings ...)
  2019-03-18 15:03 ` [RFC PATCH v4 8/8] KVM:x86: Add user-space read/write interface for CET MSRs Yang Weijiang
@ 2019-03-25 20:45 ` Yang Weijiang
  2019-03-26 22:31   ` Sean Christopherson
  2019-04-02 20:10 ` Sean Christopherson
  9 siblings, 1 reply; 21+ messages in thread
From: Yang Weijiang @ 2019-03-25 20:45 UTC (permalink / raw)
  To: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu

On Mon, Mar 18, 2019 at 11:03:43PM +0800, Yang Weijiang wrote:
> Control-flow Enforcement Technology (CET) provides protection against
> return/jump-oriented programming (ROP) attacks. To make kvm Guest OS own
> the capability, this patch-set is required. It enables CET related CPUID
> report, xsaves/xrstors, vmx entry configuration etc. for Guest OS.
> 
> PATCH 1    : Define CET VMCS fields and bits.
> PATCH 2/3  : Report CET feature support in CPUID.
> PATCH 4    : Fix xsaves size calculation issue.
> PATCH 5    : Pass through CET MSRs to Guest.
> PATCH 6    : Set Guest CET state auto loading bit.
> PATCH 7    : Load Guest fpu state when accessing CET XSAVES managed MSRs.
> PATCH 8    : Add CET MSR user space access interface.
> 
> Changelog:
> 
>  v4:
>  - Add Sean's patch for loading Guest fpu state before access XSAVES
>    managed CET MSRs.
>  - Melt down CET bits setting into CPUID configuration patch.
>  - Add VMX interface to query Host XSS.
>  - Check Host and Guest XSS support bits before set Guest XSS.
>  - Make Guest SHSTK and IBT feature enabling independent.
>  - Do not report CET support to Guest when Host CET feature is
>    Disabled.
> 
>  v3:
>  - Modified patches to make Guest CET independent to Host enabling.
>  - Added patch 8 to add user space access for Guest CET MSR access.
>  - Modified code comments and patch description to reflect changes.
> 
>  v2:
>  - Re-ordered patch sequence, combined one patch.
>  - Added more description for CET related VMCS fields.
>  - Added Host CET capability check while enabling Guest CET loading bit.
>  - Added Host CET capability check while reporting Guest CPUID(EAX=7,
>    EXC=0).
>  - Modified code in reporting Guest CPUID(EAX=D,ECX>=1), make it clearer.
>  - Added Host and Guest XSS mask check while setting bits for Guest XSS.
> 
> 
> Sean Christopherson (1):
>   KVM:x86: load guest fpu state when accessing MSRs managed by XSAVES
> 
> Yang Weijiang (7):
>   KVM:VMX: Define CET VMCS fields and bits
>   KVM:CPUID: Add CET CPUID support for Guest
>   KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1).
>   KVM:VMX: Pass through host CET related MSRs to Guest.
>   KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest
>   KVM:x86: Allow Guest to set supported bits in XSS
>   KVM:x86: Add user-space read/write interface for CET MSRs
> 
>  arch/x86/include/asm/kvm_host.h  |  5 +-
>  arch/x86/include/asm/msr-index.h |  2 +
>  arch/x86/include/asm/vmx.h       |  8 +++
>  arch/x86/kvm/cpuid.c             | 53 ++++++++++++++------
>  arch/x86/kvm/vmx.c               | 86 ++++++++++++++++++++++++++++++--
>  arch/x86/kvm/x86.c               | 32 +++++++++++-
>  arch/x86/kvm/x86.h               |  4 ++
>  7 files changed, 167 insertions(+), 23 deletions(-)
> 
> -- 
> 2.17.1
Hi, Paolo and Sean,
Do you have any comments on v4 patches?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support
  2019-03-25 20:45 ` [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
@ 2019-03-26 22:31   ` Sean Christopherson
  0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-03-26 22:31 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu

On Tue, Mar 26, 2019 at 04:45:34AM +0800, Yang Weijiang wrote:
> Hi, Paolo and Sean,
> Do you have any comments on v4 patches?

My backlog is a bit full at the moment, I'll try to review the series
later this week.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 1/8] KVM:VMX: Define CET VMCS fields and bits
  2019-03-18 15:03 ` [RFC PATCH v4 1/8] KVM:VMX: Define CET VMCS fields and bits Yang Weijiang
@ 2019-04-02 19:57   ` Sean Christopherson
  0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 19:57 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu,
	Zhang Yi Z

On Mon, Mar 18, 2019 at 11:03:44PM +0800, Yang Weijiang wrote:
> CET - Control-flow Enforcement Technology, it's used to
> protect against return/jump oriented programming (ROP)
> attacks. It provides the following capabilities to defend
> against ROP/JOP style control-flow subversion attacks:
> - Shadow Stack (SHSTK):
>                  A second stack for the program that is
>                  used exclusively for control transfer
>                  operations.
> - Indirect Branch Tracking (IBT):
>                  Free branch protection to defend against
>                  Jump/Call Oriented Programming.
> 
> On processors that support CET, VMX saves/restores
> the states of IA32_S_CET, SSP and IA32_INTR_SSP_TABL_ADDR MSR
                                                  ^^^^
Please spell out "table" in full, here and below.

> to the VMCS area for Guest/Host unconditionally.
> 
> If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host CET MSRs are
> restored from VMCS host-state area at VM exit as follows:

This contradicts the above statement that "VMX saves/restores ...
IA32_S_CET, SSP and IA32_INTR_SSP_TABL_ADDR MSR" unconditionally.

> 
> - HOST_S_CET: Host supervisor mode IA32_S_CET MSR is loaded
>                    from this field.

Maybe provide a brief introduction on the MSRs instead of mixing that
in with the VMCS field definition?  "Host supervisor mode IA32_S_CET"
makes it sound as if the MSR only exists at cpl=0 or something.

> 
> - HOST_SSP : Host SSP is loaded from this field.
> 
> - HOST_INTR_SSP_TABLE : Host IA32_INTR_SSP_TABL_ADDR
                                             ^^^^
>                              MSR is loaded from this field.
> 
> If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest CET MSRs are loaded
> from VMCS guest-state area at VM entry as follows:
> 
> - GUEST_S_CET : Guest supervisor mode IA32_S_CET MSR is loaded
>                      from this field.
> 
> - GUEST_SSP :        Guest SSP is loaded from this field.
> 
> - GUEST_INTR_SSP_TABLE : Guest IA32_INTR_SSP_TABL_ADDR
                                               ^^^^
>                              MSR is loaded from this field.

You can probably drop the per-field descriptions, the intro regarding
VM_EXIT_LOAD_{HOST,GUEST}_CET_STATE should make it obvious to readers
that these fields contain host/guest values, and the names line up with
the MSRs (assuming you add a brief intro on the MSRs).  E.g.:

If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host's CET MSRs are restored
from the following VMCS fields at VM-Exit:

 - HOST_S_CET
 - HOST_SSP
 - HOST_INTR_SSP_TABLE

> 
> Additionally, to context switch guest and host CET states, the VMM
> uses xsaves/xrstors instructions to save/restore the guest CET states
> at VM exit/entry. The CET xsave area is within thread_struct.fpu area.
> If OS execution flow changes during task switch/interrupt/exception etc.,
> the OS also relies on xsaves/xrstors to switch CET states accordingly.
> 
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>

Assuming you both worked on these patches, a Co-developed-by: tag is
needed for Zhang Yi.  The current docs on Co-developed-by: are a bit
sparse, linux-next has a patch with more information on its usage[1].

https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/+/24a2bb90741bf86ddcf6558be419e182ff44fc95

> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/include/asm/vmx.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index ade0f153947d..fd70c4577c5a 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -98,6 +98,7 @@
>  #define VM_EXIT_LOAD_IA32_EFER                  0x00200000
>  #define VM_EXIT_SAVE_VMX_PREEMPTION_TIMER       0x00400000
>  #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
> +#define VM_EXIT_LOAD_HOST_CET_STATE             0x10000000
>  
>  #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
>  
> @@ -109,6 +110,7 @@
>  #define VM_ENTRY_LOAD_IA32_PAT			0x00004000
>  #define VM_ENTRY_LOAD_IA32_EFER                 0x00008000
>  #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
> +#define VM_ENTRY_LOAD_GUEST_CET_STATE           0x00100000
>  
>  #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
>  
> @@ -325,6 +327,9 @@ enum vmcs_field {
>  	GUEST_PENDING_DBG_EXCEPTIONS    = 0x00006822,
>  	GUEST_SYSENTER_ESP              = 0x00006824,
>  	GUEST_SYSENTER_EIP              = 0x00006826,
> +	GUEST_S_CET                     = 0x00006828,
> +	GUEST_SSP                       = 0x0000682a,
> +	GUEST_INTR_SSP_TABLE            = 0x0000682c,
>  	HOST_CR0                        = 0x00006c00,
>  	HOST_CR3                        = 0x00006c02,
>  	HOST_CR4                        = 0x00006c04,
> @@ -337,6 +342,9 @@ enum vmcs_field {
>  	HOST_IA32_SYSENTER_EIP          = 0x00006c12,
>  	HOST_RSP                        = 0x00006c14,
>  	HOST_RIP                        = 0x00006c16,
> +	HOST_S_CET                      = 0x00006c18,
> +	HOST_SSP                        = 0x00006c1a,
> +	HOST_INTR_SSP_TABLE             = 0x00006c1c
>  };
>  
>  /*
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support
  2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
                   ` (8 preceding siblings ...)
  2019-03-25 20:45 ` [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
@ 2019-04-02 20:10 ` Sean Christopherson
  9 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 20:10 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu

On Mon, Mar 18, 2019 at 11:03:43PM +0800, Yang Weijiang wrote:
> Control-flow Enforcement Technology (CET) provides protection against
> return/jump-oriented programming (ROP) attacks. To make kvm Guest OS own
> the capability, this patch-set is required. It enables CET related CPUID
> report, xsaves/xrstors, vmx entry configuration etc. for Guest OS.

...

> Yang Weijiang (7):
>   KVM:VMX: Define CET VMCS fields and bits
>   KVM:CPUID: Add CET CPUID support for Guest
>   KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1).

I like the earnestness of scoping to CPUID, but "KVM: x86:" is sufficient
and more correct, e.g. ARM has CPUID as well and most architectures have
an equivalent.

>   KVM:VMX: Pass through host CET related MSRs to Guest.
>   KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest
>   KVM:x86: Allow Guest to set supported bits in XSS
>   KVM:x86: Add user-space read/write interface for CET MSRs

Please add spaces between KVM: and the more specific scope.

> 
>  arch/x86/include/asm/kvm_host.h  |  5 +-
>  arch/x86/include/asm/msr-index.h |  2 +
>  arch/x86/include/asm/vmx.h       |  8 +++
>  arch/x86/kvm/cpuid.c             | 53 ++++++++++++++------
>  arch/x86/kvm/vmx.c               | 86 ++++++++++++++++++++++++++++++--

Please rebase to the latest KVM for the next version, this is getting
fairly stale.  Thanks!

>  arch/x86/kvm/x86.c               | 32 +++++++++++-
>  arch/x86/kvm/x86.h               |  4 ++
>  7 files changed, 167 insertions(+), 23 deletions(-)
> 
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 2/8] KVM:CPUID: Add CET CPUID support for Guest
  2019-03-18 15:03 ` [RFC PATCH v4 2/8] KVM:CPUID: Add CET CPUID support for Guest Yang Weijiang
@ 2019-04-02 20:21   ` Sean Christopherson
  2019-04-02 20:40   ` Sean Christopherson
  1 sibling, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 20:21 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu,
	Zhang Yi Z

On Mon, Mar 18, 2019 at 11:03:45PM +0800, Yang Weijiang wrote:
> CET SHSTK and IBT capability are reported via
> CPUID.(EAX=7, ECX=0):ECX[bit 7] and EDX[bit 20] respectively.
> CR4.CET[bit 23] is CET master enable bit, it controls CET feature
> enabling. CET user mode and supervisor mode xsaves component size
> is reported via CPUID.(EAX=0xD, ECX=1):ECX[bit 11] and ECX[bit 12]
> respectively.
> 
> Note: Although SHSTK or IBT can be enabled independently,
>       both of them are controlled by CR4.CET.
> 
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  4 +++-
>  arch/x86/kvm/cpuid.c            | 41 ++++++++++++++++++++++-----------
>  arch/x86/kvm/vmx.c              |  6 +++++
>  arch/x86/kvm/x86.h              |  4 ++++
>  4 files changed, 40 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 55e51ff7e421..fc038bf1924a 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -90,7 +90,8 @@
>  			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>  			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> +			  | X86_CR4_CET))
>  
>  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>  
> @@ -1185,6 +1186,7 @@ struct kvm_x86_ops {
>  
>  	int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
>  				   uint16_t *vmcs_version);
> +	u64 (*vmx_supported_xss)(void);
>  };
>  
>  struct kvm_arch_async_pf {
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 7bcfa61375c0..53abd6019c68 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -65,6 +65,11 @@ u64 kvm_supported_xcr0(void)
>  	return xcr0;
>  }
>  
> +u64 kvm_supported_xss(void)
> +{
> +	return KVM_SUPPORTED_XSS & kvm_x86_ops->vmx_supported_xss();
> +}
> +
>  #define F(x) bit(X86_FEATURE_##x)
>  
>  /* For scattered features from cpufeatures.h; we currently expose none */
> @@ -406,12 +411,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ |
>  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
>  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> -		F(CLDEMOTE);
> +		F(CLDEMOTE) | F(SHSTK);
>  
>  	/* cpuid 7.0.edx*/
>  	const u32 kvm_cpuid_7_0_edx_x86_features =
>  		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
> -		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES);
> +		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(IBT);

The features should not be advertised to userspace until KVM fully supports
them, which AIUI doesn't happen until the last patch in this series.

>  
>  	/* all calls to cpuid_count() should be made on the same cpu */
>  	get_cpu();
> @@ -564,14 +569,16 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  	}
>  	case 0xd: {
>  		int idx, i;
> -		u64 supported = kvm_supported_xcr0();
> +		u64 u_supported = kvm_supported_xcr0();
> +		u64 s_supported = kvm_supported_xss();
> +		u64 supported;
>  
> -		entry->eax &= supported;
> -		entry->ebx = xstate_required_size(supported, false);
> +		entry->eax &= u_supported;
> +		entry->ebx = xstate_required_size(u_supported, false);
>  		entry->ecx = entry->ebx;
> -		entry->edx &= supported >> 32;
> +		entry->edx &= u_supported >> 32;
>  		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> -		if (!supported)
> +		if (!u_supported && !s_supported)
>  			break;
>  
>  		for (idx = 1, i = 1; idx < 64; ++idx) {
> @@ -583,19 +590,25 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  			if (idx == 1) {
>  				entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
>  				cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
> +				supported = u_supported | s_supported;
>  				entry[i].ebx = 0;
> -				if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
> +				entry[i].edx = 0;
> +				entry[i].ecx &= s_supported;
> +				if (entry[i].eax & (F(XSAVES) | F(XSAVEC))) {
>  					entry[i].ebx =
> -						xstate_required_size(supported,
> -								     true);
> +					xstate_required_size(supported,
> +							     true);

The indentation depth is getting pretty gnarly, I at a glance I think it'd
be much cleaner to move the guts of the for loop to a helper, e.g.:

		for (idx = 1, i = 1; idx < 64; ++idx, ++i) {
			if (*nent >= maxnent)
				goto out;

			__do_cpuid_0xd_ent(i, u_supported, s_supported);

			++*nent;
		}
> +				}
>  			} else {
> +				supported = (entry[i].ecx & 1) ? s_supported :
> +								 u_supported;
>  				if (entry[i].eax == 0 || !(supported & mask))
>  					continue;
> -				if (WARN_ON_ONCE(entry[i].ecx & 1))
> -					continue;
> +				entry[i].ecx &= 1;
> +				entry[i].edx = 0;
> +				if (entry[i].ecx)
> +					entry[i].ebx = 0;
>  			}
> -			entry[i].ecx = 0;
> -			entry[i].edx = 0;
>  			entry[i].flags |=
>  			       KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
>  			++*nent;

Not your code, but the "++i;" that's sitting below this should be handled
in the for loop (see above).  This is a good opportunity to make that change.

> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 7bbb8b26e901..53cef5a3db96 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4169,6 +4169,11 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	return 0;
>  }
>  
> +static __always_inline u64 vmx_supported_xss(void)
> +{
> +	return host_xss;
> +}
> +
>  static void vmx_leave_nested(struct kvm_vcpu *vcpu);
>  
>  /*
> @@ -15107,6 +15112,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
>  	.enable_smi_window = enable_smi_window,
>  
>  	.nested_enable_evmcs = nested_enable_evmcs,
> +	.vmx_supported_xss = vmx_supported_xss,
>  };
>  
>  static void vmx_cleanup_l1d_flush(void)
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 224cd0a47568..c61da41c3c5c 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -283,6 +283,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
>  				| XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \
>  				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
>  				| XFEATURE_MASK_PKRU)
> +
> +#define KVM_SUPPORTED_XSS	(XFEATURE_MASK_SHSTK_USER \
> +				| XFEATURE_MASK_SHSTK_KERNEL)
> +
>  extern u64 host_xcr0;
>  
>  extern u64 kvm_supported_xcr0(void);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 3/8] KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1).
  2019-03-18 15:03 ` [RFC PATCH v4 3/8] KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1) Yang Weijiang
@ 2019-04-02 20:27   ` Sean Christopherson
  0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 20:27 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu,
	Zhang Yi Z

On Mon, Mar 18, 2019 at 11:03:46PM +0800, Yang Weijiang wrote:
> According to latest Software Development Manual vol.2/3.2,

Most readers of this code are already familiar with the SDM acronym,
and if they're not, "Intel SDM" provides great search results.  No
need to define it here.

Avoid referencing specific sections, they tend to change.

> for CPUID.(EAX=0xD,ECX=1), it should report xsaves area size

When quoting the SDM, it's usually best to avoid "should" and simply
state what the SDM says.  In other words, the SDM doesn't hedge on
what the processor should (not) do, it documents the exact behavior.

E.g.:

According to the SDM, Vol 2,  CPUID(EAX=0xD,ECX=1) reports the XSAVE are
size containing all states enabled by XCR0|IA32_MSR_XSS.
 
> containing all states enabled  by XCR0|IA32_MSR_XSS.
> 
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/cpuid.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 53abd6019c68..29d6a5cdc746 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -126,7 +126,8 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
>  
>  	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
>  	if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
> -		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
> +		best->ebx = xstate_required_size(vcpu->arch.xcr0 |
> +			    kvm_supported_xss(), true);
>  
>  	/*
>  	 * The existing code assumes virtual address is 48-bit or 57-bit in the
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 4/8] KVM:VMX: Pass through host CET related MSRs to Guest.
  2019-03-18 15:03 ` [RFC PATCH v4 4/8] KVM:VMX: Pass through host CET related MSRs to Guest Yang Weijiang
@ 2019-04-02 20:27   ` Sean Christopherson
  2019-04-02 20:46     ` Yang Weijiang
  0 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 20:27 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu,
	Zhang Yi Z

On Mon, Mar 18, 2019 at 11:03:47PM +0800, Yang Weijiang wrote:
> The CET runtime settings, i.e., CET state control bits(IA32_U_CET/
> IA32_S_CET), CET SSP(IA32_PL3_SSP/IA32_PL0_SSP) and SSP table address
> (IA32_INTERRUPT_SSP_TABLE_ADDR) are task/thread specific, therefore,
> OS needs to save/restore the states properly during context switch,
> e.g., task/thread switching, interrupt/exception handling, it uses
> xsaves/xrstors to achieve that.
> 
> The difference between VMCS CET area fields and xsave CET area, is that
> the former is for state retention during Guest/Host context
> switch while the latter is for state retention during OS execution.
> 
> Linux currently doesn't support CPL1 and CPL2, so SSPs for these level
> are skipped here.
> 
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/vmx.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 53cef5a3db96..28b8ac027bd7 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11774,6 +11774,7 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
>  static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	unsigned long *msr_bitmap;
>  
>  	if (cpu_has_secondary_exec_ctrls()) {
>  		vmx_compute_secondary_exec_control(vmx);
> @@ -11791,6 +11792,18 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>  		nested_vmx_cr_fixed1_bits_update(vcpu);
>  		nested_vmx_entry_exit_ctls_update(vcpu);
>  	}
> +
> +	msr_bitmap = vmx->vmcs01.msr_bitmap;
> +
> +	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> +	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW);

I thought we had agreed to pass-through PL1 and PL2, or am I misremembering?

> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW);
> +	}
> +
>  }
>  
>  static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 5/8] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest
  2019-03-18 15:03 ` [RFC PATCH v4 5/8] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest Yang Weijiang
@ 2019-04-02 20:30   ` Sean Christopherson
  0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 20:30 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu,
	Zhang Yi Z

On Mon, Mar 18, 2019 at 11:03:48PM +0800, Yang Weijiang wrote:
> "Load Guest CET state" bit controls whether guest CET states
> will be loaded at Guest entry. Before doing that, KVM needs
> to check if CPU CET feature is available.

Use the changelog to describe *why* the patch does what it does, for
something this simple it's obvious *what* the patch adds.  Specifically,
there was a decent amount of discussion regarding the virtualization
hole that gets introduced when SHSTK!=IBT, and that need to be captured
in the changelog.

> 
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/vmx.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 28b8ac027bd7..246467c12930 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -55,6 +55,7 @@
>  #include <asm/mmu_context.h>
>  #include <asm/spec-ctrl.h>
>  #include <asm/mshyperv.h>
> +#include <asm/cet.h>
>  
>  #include "trace.h"
>  #include "pmu.h"
> @@ -5414,6 +5415,23 @@ static int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>  			return 1;
>  	}
>  
> +	/*
> +	 * To enable Guest CET, check whether CPU CET feature is
> +	 * available, if it's there, set Guest CET state loading bit
> +	 * per CR4.CET status, otherwise, return a fault to Guest.
> +	 */

No need for the comment, the code is straightforward.

> +	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> +	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
> +		if (cr4 & X86_CR4_CET)
> +			vmcs_set_bits(VM_ENTRY_CONTROLS,
> +				      VM_ENTRY_LOAD_GUEST_CET_STATE);
> +		else
> +			vmcs_clear_bits(VM_ENTRY_CONTROLS,
> +					VM_ENTRY_LOAD_GUEST_CET_STATE);
> +	} else if (cr4 & X86_CR4_CET) {
> +		return 1;
> +	}
> +
>  	if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
>  		return 1;
>  
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 8/8] KVM:x86: Add user-space read/write interface for CET MSRs
  2019-03-18 15:03 ` [RFC PATCH v4 8/8] KVM:x86: Add user-space read/write interface for CET MSRs Yang Weijiang
@ 2019-04-02 20:35   ` Sean Christopherson
  0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 20:35 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu

On Mon, Mar 18, 2019 at 11:03:51PM +0800, Yang Weijiang wrote:
> When VMExit occurs, the Guest CET MSRs are stored in two different
> places, U_CET/PL0_SSP/PL1_SSP/PL2_SSP/PL3_SSP are stored in fpu
> xsave area, they are operated by XSAVES/XRSTORS, so before access
> these MSRs, kvm_load_guest_fpu is required to restore them to Host MSRs.
> After finish operation, need to restore Host MSRs by kvm_put_guest_fpu,
> these two functions are call in __msr_io(). S_CET and INTR_SSP_TABLE
> are stored in VMCS fields, so they can be access via vmcs_read/vmcs_write
> directly.

Hmm, the changelog reads like the kvm_load_guest_fpu() behavior is
introduced in this patch.  It should be reworded to make it more obvious
that a previous patch handled that behavior.  Alternatively, just fold
this into said previous patch.

> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/include/asm/msr-index.h |  2 ++
>  arch/x86/kvm/vmx.c               | 42 ++++++++++++++++++++++++++++++++
>  2 files changed, 44 insertions(+)
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 298721ff00f4..cc94ab14a946 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -782,6 +782,8 @@
>  #define MSR_IA32_U_CET		0x6a0 /* user mode cet setting */
>  #define MSR_IA32_S_CET		0x6a2 /* kernel mode cet setting */
>  #define MSR_IA32_PL0_SSP	0x6a4 /* kernel shstk pointer */
> +#define MSR_IA32_PL1_SSP	0x6a5 /* ring 1 shstk pointer */
> +#define MSR_IA32_PL2_SSP	0x6a6 /* ring 2 shstk pointer */
>  #define MSR_IA32_PL3_SSP	0x6a7 /* user shstk pointer */
>  #define MSR_IA32_INT_SSP_TAB	0x6a8 /* exception shstk table */
>  
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 816e11a66557..5923b89fe120 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4153,6 +4153,27 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  			return 1;
>  		msr_info->data = vcpu->arch.ia32_xss;
>  		break;
> +	case MSR_IA32_S_CET:
> +		msr_info->data = vmcs_readl(GUEST_S_CET);
> +		break;
> +	case MSR_IA32_U_CET:
> +		rdmsrl(MSR_IA32_U_CET, msr_info->data);
> +		break;
> +	case MSR_IA32_INT_SSP_TAB:
> +		msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> +		break;
> +	case MSR_IA32_PL0_SSP:
> +		rdmsrl(MSR_IA32_PL0_SSP, msr_info->data);
> +		break;
> +	case MSR_IA32_PL1_SSP:
> +		rdmsrl(MSR_IA32_PL1_SSP, msr_info->data);
> +		break;
> +	case MSR_IA32_PL2_SSP:
> +		rdmsrl(MSR_IA32_PL2_SSP, msr_info->data);
> +		break;
> +	case MSR_IA32_PL3_SSP:
> +		rdmsrl(MSR_IA32_PL3_SSP, msr_info->data);
> +		break;
>  	case MSR_TSC_AUX:
>  		if (!msr_info->host_initiated &&
>  		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> @@ -4337,6 +4358,27 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		else
>  			clear_atomic_switch_msr(vmx, MSR_IA32_XSS);
>  		break;
> +	case MSR_IA32_S_CET:
> +		vmcs_writel(GUEST_S_CET, data);
> +		break;
> +	case MSR_IA32_U_CET:
> +		wrmsrl(MSR_IA32_U_CET, data);
> +		break;
> +	case MSR_IA32_INT_SSP_TAB:
> +		vmcs_writel(GUEST_INTR_SSP_TABLE, data);
> +		break;
> +	case MSR_IA32_PL0_SSP:
> +		wrmsrl(MSR_IA32_PL0_SSP, data);
> +		break;
> +	case MSR_IA32_PL1_SSP:
> +		wrmsrl(MSR_IA32_PL1_SSP, data);
> +		break;
> +	case MSR_IA32_PL2_SSP:
> +		wrmsrl(MSR_IA32_PL2_SSP, data);
> +		break;
> +	case MSR_IA32_PL3_SSP:
> +		wrmsrl(MSR_IA32_PL3_SSP, data);
> +		break;
>  	case MSR_TSC_AUX:
>  		if (!msr_info->host_initiated &&
>  		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 7/8] KVM:x86: load guest fpu state when accessing MSRs managed by XSAVES
  2019-03-18 15:03 ` [RFC PATCH v4 7/8] KVM:x86: load guest fpu state when accessing MSRs managed by XSAVES Yang Weijiang
@ 2019-04-02 20:39   ` Sean Christopherson
  0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 20:39 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu

On Mon, Mar 18, 2019 at 11:03:50PM +0800, Yang Weijiang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> A handful of CET MSRs are not context switched through "traditional"
> methods, e.g. VMCS or manual switching, but rather are passed through
> to the guest and are saved and restored by XSAVES/XRSTORS, i.e. the
> guest's FPU state.
> 
> Load the guest's FPU state if userspace is accessing MSRs whose values
> are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(),
> can simply do {RD,WR}MSR to access the guest's value.
> 
> Note that guest_cpuid_has() is not queried as host userspace is allowed
> to access MSRs that have not been exposed to the guest, e.g. it might do
> KVM_SET_MSRS prior to KVM_SET_CPUID2.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>

Needs Co-developed-by: since you made modifications to the patch I sent.

> ---
>  arch/x86/kvm/x86.c | 32 +++++++++++++++++++++++++++++++-
>  1 file changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a0f8b71b2132..ed747dfbff69 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -104,6 +104,8 @@ static void enter_smm(struct kvm_vcpu *vcpu);
>  static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
>  static void store_regs(struct kvm_vcpu *vcpu);
>  static int sync_regs(struct kvm_vcpu *vcpu);
> +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
> +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
>  
>  struct kvm_x86_ops *kvm_x86_ops __read_mostly;
>  EXPORT_SYMBOL_GPL(kvm_x86_ops);
> @@ -2889,6 +2891,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  }
>  EXPORT_SYMBOL_GPL(kvm_get_msr_common);
>  
> +static bool is_xsaves_msr(u32 index)
> +{
> +	return index == MSR_IA32_U_CET ||
> +	       (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
> +}
> +
>  /*
>   * Read or write a bunch of msrs. All parameters are kernel addresses.
>   *
> @@ -2899,11 +2907,33 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
>  		    int (*do_msr)(struct kvm_vcpu *vcpu,
>  				  unsigned index, u64 *data))
>  {
> +	bool fpu_loaded = false;
> +	bool xss_loaded = false;
>  	int i;
> +	u64 cet_bits = XFEATURE_MASK_SHSTK_USER | XFEATURE_MASK_SHSTK_KERNEL;
> +	u64 host_xss = 0;
> +
> +	for (i = 0; i < msrs->nmsrs; ++i) {
> +		if (!fpu_loaded && is_xsaves_msr(entries[i].index)) {
> +			if (!kvm_x86_ops->xsaves_supported())
> +				continue;
> +
> +			if (!xss_loaded) {
> +				rdmsrl(MSR_IA32_XSS, host_xss);

Rather than do rdmsr(), we can query host_xss via kvm_x86_ops, e.g.:

			if (!kvm_x86_ops->xsaves_supported() ||
			    !kvm_x86_ops->supported_xss())
				continue;

> +				xss_loaded = true;
> +			}
> +
> +			if ((host_xss & cet_bits) != cet_bits)
> +				continue;
>  
> -	for (i = 0; i < msrs->nmsrs; ++i)
> +			kvm_load_guest_fpu(vcpu);
> +			fpu_loaded = true;
> +		}
>  		if (do_msr(vcpu, entries[i].index, &entries[i].data))
>  			break;
> +	}
> +	if (fpu_loaded)
> +		kvm_put_guest_fpu(vcpu);
>  
>  	return i;
>  }
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 2/8] KVM:CPUID: Add CET CPUID support for Guest
  2019-03-18 15:03 ` [RFC PATCH v4 2/8] KVM:CPUID: Add CET CPUID support for Guest Yang Weijiang
  2019-04-02 20:21   ` Sean Christopherson
@ 2019-04-02 20:40   ` Sean Christopherson
  1 sibling, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2019-04-02 20:40 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu,
	Zhang Yi Z

On Mon, Mar 18, 2019 at 11:03:45PM +0800, Yang Weijiang wrote:
> CET SHSTK and IBT capability are reported via
> CPUID.(EAX=7, ECX=0):ECX[bit 7] and EDX[bit 20] respectively.
> CR4.CET[bit 23] is CET master enable bit, it controls CET feature
> enabling. CET user mode and supervisor mode xsaves component size
> is reported via CPUID.(EAX=0xD, ECX=1):ECX[bit 11] and ECX[bit 12]
> respectively.
> 
> Note: Although SHSTK or IBT can be enabled independently,
>       both of them are controlled by CR4.CET.
> 
> Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  4 +++-
>  arch/x86/kvm/cpuid.c            | 41 ++++++++++++++++++++++-----------
>  arch/x86/kvm/vmx.c              |  6 +++++
>  arch/x86/kvm/x86.h              |  4 ++++
>  4 files changed, 40 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 55e51ff7e421..fc038bf1924a 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -90,7 +90,8 @@
>  			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>  			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> +			  | X86_CR4_CET))
>  
>  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>  
> @@ -1185,6 +1186,7 @@ struct kvm_x86_ops {
>  
>  	int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
>  				   uint16_t *vmcs_version);
> +	u64 (*vmx_supported_xss)(void);

Heh, the purpose of kvm_x86_ops is to abstract things a bit, i.e. drop
the "vmx" prefix.

>  };
>  
>  struct kvm_arch_async_pf {

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH v4 4/8] KVM:VMX: Pass through host CET related MSRs to Guest.
  2019-04-02 20:27   ` Sean Christopherson
@ 2019-04-02 20:46     ` Yang Weijiang
  0 siblings, 0 replies; 21+ messages in thread
From: Yang Weijiang @ 2019-04-02 20:46 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: pbonzini, kvm, mst, rkrcmar, jmattson, linux-kernel, yu-cheng.yu,
	Zhang Yi Z

On Tue, Apr 02, 2019 at 01:27:33PM -0700, Sean Christopherson wrote:
> On Mon, Mar 18, 2019 at 11:03:47PM +0800, Yang Weijiang wrote:
> > The CET runtime settings, i.e., CET state control bits(IA32_U_CET/
> > IA32_S_CET), CET SSP(IA32_PL3_SSP/IA32_PL0_SSP) and SSP table address
> > (IA32_INTERRUPT_SSP_TABLE_ADDR) are task/thread specific, therefore,
> > OS needs to save/restore the states properly during context switch,
> > e.g., task/thread switching, interrupt/exception handling, it uses
> > xsaves/xrstors to achieve that.
> > 
> > The difference between VMCS CET area fields and xsave CET area, is that
> > the former is for state retention during Guest/Host context
> > switch while the latter is for state retention during OS execution.
> > 
> > Linux currently doesn't support CPL1 and CPL2, so SSPs for these level
> > are skipped here.
> > 
> > Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > ---
> >  arch/x86/kvm/vmx.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > index 53cef5a3db96..28b8ac027bd7 100644
> > --- a/arch/x86/kvm/vmx.c
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -11774,6 +11774,7 @@ static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
> >  static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
> >  {
> >  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +	unsigned long *msr_bitmap;
> >  
> >  	if (cpu_has_secondary_exec_ctrls()) {
> >  		vmx_compute_secondary_exec_control(vmx);
> > @@ -11791,6 +11792,18 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
> >  		nested_vmx_cr_fixed1_bits_update(vcpu);
> >  		nested_vmx_entry_exit_ctls_update(vcpu);
> >  	}
> > +
> > +	msr_bitmap = vmx->vmcs01.msr_bitmap;
> > +
> > +	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> > +	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
> > +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW);
> > +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW);
> > +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW);
> > +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW);
> 
> I thought we had agreed to pass-through PL1 and PL2, or am I misremembering?
>
Thanks Sean! I missed your last comments on this patch, will add them in
next version.
> > +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW);
> > +	}
> > +
> >  }
> >  
> >  static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
> > -- 
> > 2.17.1
> > 

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2019-04-03 13:52 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-18 15:03 [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
2019-03-18 15:03 ` [RFC PATCH v4 1/8] KVM:VMX: Define CET VMCS fields and bits Yang Weijiang
2019-04-02 19:57   ` Sean Christopherson
2019-03-18 15:03 ` [RFC PATCH v4 2/8] KVM:CPUID: Add CET CPUID support for Guest Yang Weijiang
2019-04-02 20:21   ` Sean Christopherson
2019-04-02 20:40   ` Sean Christopherson
2019-03-18 15:03 ` [RFC PATCH v4 3/8] KVM:CPUID: Fix xsaves area size calculation for CPUID.(EAX=0xD,ECX=1) Yang Weijiang
2019-04-02 20:27   ` Sean Christopherson
2019-03-18 15:03 ` [RFC PATCH v4 4/8] KVM:VMX: Pass through host CET related MSRs to Guest Yang Weijiang
2019-04-02 20:27   ` Sean Christopherson
2019-04-02 20:46     ` Yang Weijiang
2019-03-18 15:03 ` [RFC PATCH v4 5/8] KVM:VMX: Load Guest CET via VMCS when CET is enabled in Guest Yang Weijiang
2019-04-02 20:30   ` Sean Christopherson
2019-03-18 15:03 ` [RFC PATCH v4 6/8] KVM:x86: Allow Guest to set supported bits in XSS Yang Weijiang
2019-03-18 15:03 ` [RFC PATCH v4 7/8] KVM:x86: load guest fpu state when accessing MSRs managed by XSAVES Yang Weijiang
2019-04-02 20:39   ` Sean Christopherson
2019-03-18 15:03 ` [RFC PATCH v4 8/8] KVM:x86: Add user-space read/write interface for CET MSRs Yang Weijiang
2019-04-02 20:35   ` Sean Christopherson
2019-03-25 20:45 ` [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support Yang Weijiang
2019-03-26 22:31   ` Sean Christopherson
2019-04-02 20:10 ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).