* [PATCH v5 0/8] Introduce support for Guest CET feature
@ 2019-05-22  7:00 Yang Weijiang
  2019-05-22  7:00 ` [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits Yang Weijiang
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:00 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

Control-flow Enforcement Technology (CET) provides protection against
Return/Jump-Oriented Programming (ROP/JOP) attacks. It includes two
sub-features: shadow stack (SHSTK) and indirect branch tracking (IBT).

KVM modifications are required to support the Guest CET feature.
This patch series implements CET-related CPUID/XSAVES enumeration, MSR
handling, VM-entry configuration, etc., so that the Guest kernel can set
up its CET runtime infrastructure based on them. Some MSRs and related
feature flags used in the patches reference the definitions in the
kernel patch series.

The CET kernel patch series is here:
https://lkml.org/lkml/2018/11/20/225.
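
For illustration only (not part of the series): a guest kernel would
first discover these features via CPUID before touching any CET MSRs.
The helper below is hypothetical and only checks the enumeration bits
these patches expose.

/* CPUID.(EAX=7,ECX=0): ECX[7] = SHSTK, EDX[20] = IBT */
#include <stdbool.h>
#include <cpuid.h>

static bool guest_has_cet(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return false;

	return (ecx & (1u << 7)) || (edx & (1u << 20));
}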

PATCH 1    : Define CET VMCS fields and control bits.
PATCH 2    : Enumerate CET features and XSAVES support in CPUID.
PATCH 3    : Fix the XSAVE size calculation issue.
PATCH 4    : Pass through CET MSRs to Guest.
PATCH 5    : Set the Guest CET state auto-load bit.
PATCH 6    : Allow Guest to set supported bits in XSS.
PATCH 7    : Load Guest FPU states for XSAVES-managed MSRs.
PATCH 8    : Add user-space access interface for CET states.


 v4 -> v5:
  - Rebase patch to kernel v5.1.
  - Wrap the CPUID(0xD, n>=1) code into a helper function.
  - Pass through MSR_IA32_PL1_SSP and MSR_IA32_PL2_SSP to Guest.
  - Add Co-developed-by tags in the patch descriptions.
  - Refine patch description.

 v3 -> v4:
 - Add Sean's patch for loading Guest fpu state before access XSAVES
   managed CET MSRs.
 - Fold the CET bit setting into the CPUID configuration patch.
 - Add VMX interface to query Host XSS.
 - Check Host and Guest XSS support bits before set Guest XSS.
 - Make Guest SHSTK and IBT feature enabling independent.
 - Do not report CET support to Guest when Host CET feature is disabled.

 v2 -> v3:
 - Modified patches to make Guest CET independent of Host enabling.
 - Added patch 8 to provide user-space access to Guest CET MSRs.
 - Modified code comments and patch description to reflect changes.

 v1 -> v2:
 - Re-ordered patch sequence, combined one patch.
 - Added more description for CET related VMCS fields.
 - Added Host CET capability check while enabling Guest CET loading bit.
 - Added Host CET capability check while reporting Guest CPUID(EAX=7, ECX=0).
 - Modified code in reporting Guest CPUID(EAX=0xD, ECX>=1) to make it clearer.
 - Added Host and Guest XSS mask check while setting bits for Guest XSS.

Sean Christopherson (1):
  KVM: x86: Load Guest fpu state when accessing MSRs managed by XSAVES

Yang Weijiang (7):
  KVM: VMX: Define CET VMCS fields and control bits
  KVM: x86: Implement CET CPUID support for Guest
  KVM: x86: Fix XSAVE size calculation issue
  KVM: VMX: Pass through CET related MSRs to Guest
  KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest
  KVM: x86: Allow Guest to set supported bits in XSS
  KVM: x86: Add user-space access interface for CET MSRs

 arch/x86/include/asm/kvm_host.h  |   5 +-
 arch/x86/include/asm/msr-index.h |   2 +
 arch/x86/include/asm/vmx.h       |   8 +++
 arch/x86/kvm/cpuid.c             | 109 +++++++++++++++++++++----------
 arch/x86/kvm/vmx/vmx.c           |  83 +++++++++++++++++++++--
 arch/x86/kvm/x86.c               |  29 +++++++-
 arch/x86/kvm/x86.h               |   4 ++
 7 files changed, 197 insertions(+), 43 deletions(-)

-- 
2.17.2


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits
  2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
@ 2019-05-22  7:00 ` Yang Weijiang
  2019-06-04 14:46   ` Sean Christopherson
  2019-05-22  7:00 ` [PATCH v5 2/8] KVM: x86: Implement CET CPUID support for Guest Yang Weijiang
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:00 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

CET (Control-flow Enforcement Technology) is an upcoming Intel® processor
family feature that blocks return/jump-oriented programming (ROP) attacks.
It provides the following capabilities to defend
against ROP/JOP style control-flow subversion attacks:

- Shadow Stack (SHSTK):
  A second stack for the program that is used exclusively for
  control transfer operations.

- Indirect Branch Tracking (IBT):
  Free branch protection to defend against jump/call oriented
  programming.

Several new CET MSRs are defined in the kernel to support CET:
MSR_IA32_{U,S}_CET - MSRs to control the CET settings for user
mode and supervisor mode respectively.

MSR_IA32_PL{0,1,2,3}_SSP - MSRs to store shadow stack pointers for
CPL-0,1,2,3 levels.

MSR_IA32_INT_SSP_TAB - MSR to store base address of shadow stack
pointer table.

Two XSAVES state components are introduced for CET:
IA32_XSS:[bit 11] - bit for save/restore of user mode CET states.
IA32_XSS:[bit 12] - bit for save/restore of supervisor mode CET states.

Six VMCS fields are introduced for CET: {HOST,GUEST}_S_CET stores the
CET settings in supervisor mode, {HOST,GUEST}_SSP stores the shadow
stack pointer in supervisor mode, and {HOST,GUEST}_INTR_SSP_TABLE stores
the base address of the shadow stack pointer table.

If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host's CET MSRs are restored
from the VMCS fields below at VM-Exit:
- HOST_S_CET
- HOST_SSP
- HOST_INTR_SSP_TABLE

If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest's CET MSRs are loaded
from the VMCS fields below at VM-Entry:
- GUEST_S_CET
- GUEST_SSP
- GUEST_INTR_SSP_TABLE
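
For illustration only (not part of this patch), the new controls and
fields could be wired together roughly as below. The function name is
made up and the zero values are placeholders; real code would mirror the
host's actual IA32_S_CET/SSP configuration:

/* Sketch: enable CET state auto-load/restore for a VMCS. */
static void vmx_setup_cet_state(void)
{
	/* Restore host CET state from the HOST_* fields at VM-Exit. */
	vmcs_set_bits(VM_EXIT_CONTROLS, VM_EXIT_LOAD_HOST_CET_STATE);
	vmcs_writel(HOST_S_CET, 0);
	vmcs_writel(HOST_SSP, 0);
	vmcs_writel(HOST_INTR_SSP_TABLE, 0);

	/* Load guest CET state from the GUEST_* fields at VM-Entry. */
	vmcs_set_bits(VM_ENTRY_CONTROLS, VM_ENTRY_LOAD_GUEST_CET_STATE);
	vmcs_writel(GUEST_S_CET, 0);
	vmcs_writel(GUEST_SSP, 0);
	vmcs_writel(GUEST_INTR_SSP_TABLE, 0);
}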

Apart from the VMCS auto-load fields, KVM calls kvm_load_guest_fpu() and
kvm_put_guest_fpu() to save/restore the guest CET MSR states around
guest entry/exit. XSAVES/XRSTORS are executed underneath these functions
if they are supported. The CET xsave area is consolidated with the other
XSAVE components in the thread_struct.fpu field.

When a context switch happens due to a task switch/interrupt/exception
etc., the kernel also relies on the above functions to switch CET states
properly.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
---
 arch/x86/include/asm/vmx.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 4e4133e86484..d84804c7ddaa 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -103,6 +103,7 @@
 #define VM_EXIT_CLEAR_BNDCFGS                   0x00800000
 #define VM_EXIT_PT_CONCEAL_PIP			0x01000000
 #define VM_EXIT_CLEAR_IA32_RTIT_CTL		0x02000000
+#define VM_EXIT_LOAD_HOST_CET_STATE             0x10000000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR	0x00036dff
 
@@ -116,6 +117,7 @@
 #define VM_ENTRY_LOAD_BNDCFGS                   0x00010000
 #define VM_ENTRY_PT_CONCEAL_PIP			0x00020000
 #define VM_ENTRY_LOAD_IA32_RTIT_CTL		0x00040000
+#define VM_ENTRY_LOAD_GUEST_CET_STATE           0x00100000
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR	0x000011ff
 
@@ -334,6 +336,9 @@ enum vmcs_field {
 	GUEST_PENDING_DBG_EXCEPTIONS    = 0x00006822,
 	GUEST_SYSENTER_ESP              = 0x00006824,
 	GUEST_SYSENTER_EIP              = 0x00006826,
+	GUEST_S_CET                     = 0x00006828,
+	GUEST_SSP                       = 0x0000682a,
+	GUEST_INTR_SSP_TABLE            = 0x0000682c,
 	HOST_CR0                        = 0x00006c00,
 	HOST_CR3                        = 0x00006c02,
 	HOST_CR4                        = 0x00006c04,
@@ -346,6 +351,9 @@ enum vmcs_field {
 	HOST_IA32_SYSENTER_EIP          = 0x00006c12,
 	HOST_RSP                        = 0x00006c14,
 	HOST_RIP                        = 0x00006c16,
+	HOST_S_CET                      = 0x00006c18,
+	HOST_SSP                        = 0x00006c1a,
+	HOST_INTR_SSP_TABLE             = 0x00006c1c
 };
 
 /*
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 2/8] KVM: x86: Implement CET CPUID support for Guest
  2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
  2019-05-22  7:00 ` [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits Yang Weijiang
@ 2019-05-22  7:00 ` Yang Weijiang
  2019-06-04 19:58   ` Sean Christopherson
  2019-05-22  7:00 ` [PATCH v5 3/8] KVM: x86: Fix XSAVE size calculation issue Yang Weijiang
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:00 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

CET SHSTK and IBT features are introduced here so that
CPUID.(EAX=7, ECX=0):ECX[bit 7] and EDX[bit 20] reflect them.
CET xsave components for user and supervisor mode are reported
via CPUID.(EAX=0xD, ECX=1):ECX[bit 11] and ECX[bit 12]
respectively.

To keep the code clean, the CPUID(0xD, n>=1) reporting code is
wrapped in a helper function.
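
For illustration only: after this patch a VMM can see the new bits in
KVM's supported-CPUID report. A minimal, hypothetical user-space check
(error handling trimmed) might look like:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int nent = 128, i;
	struct kvm_cpuid2 *cpuid;

	cpuid = calloc(1, sizeof(*cpuid) + nent * sizeof(cpuid->entries[0]));
	cpuid->nent = nent;
	if (kvm < 0 || ioctl(kvm, KVM_GET_SUPPORTED_CPUID, cpuid) < 0)
		return 1;

	for (i = 0; i < cpuid->nent; i++) {
		if (cpuid->entries[i].function != 7 || cpuid->entries[i].index != 0)
			continue;
		/* CPUID.(EAX=7,ECX=0): ECX[7] = SHSTK, EDX[20] = IBT */
		printf("SHSTK: %d IBT: %d\n",
		       !!(cpuid->entries[i].ecx & (1u << 7)),
		       !!(cpuid->entries[i].edx & (1u << 20)));
	}
	return 0;
}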

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
---
 arch/x86/include/asm/kvm_host.h |  4 +-
 arch/x86/kvm/cpuid.c            | 97 +++++++++++++++++++++------------
 arch/x86/kvm/vmx/vmx.c          |  6 ++
 arch/x86/kvm/x86.h              |  4 ++
 4 files changed, 76 insertions(+), 35 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a5db4475e72d..8c3f0ddc7676 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -91,7 +91,8 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+			  | X86_CR4_CET))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
@@ -1192,6 +1193,7 @@ struct kvm_x86_ops {
 	int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
 				   uint16_t *vmcs_version);
 	uint16_t (*nested_get_evmcs_version)(struct kvm_vcpu *vcpu);
+	u64 (*supported_xss)(void);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index fd3951638ae4..b9fc967fe55a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -65,6 +65,11 @@ u64 kvm_supported_xcr0(void)
 	return xcr0;
 }
 
+u64 kvm_supported_xss(void)
+{
+	return KVM_SUPPORTED_XSS & kvm_x86_ops->supported_xss();
+}
+
 #define F(x) bit(X86_FEATURE_##x)
 
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -316,6 +321,50 @@ static int __do_cpuid_ent_emulated(struct kvm_cpuid_entry2 *entry,
 	return 0;
 }
 
+static inline int __do_cpuid_dx_leaf(struct kvm_cpuid_entry2 *entry, int *nent,
+				     int maxnent, u64 xss_mask, u64 xcr0_mask,
+				     u32 eax_mask)
+{
+	int idx, i;
+	u64 mask;
+	u64 supported;
+
+	for (idx = 1, i = 1; idx < 64; ++idx) {
+		mask = ((u64)1 << idx);
+		if (*nent >= maxnent)
+			return -EINVAL;
+
+		do_cpuid_1_ent(&entry[i], 0xD, idx);
+		if (idx == 1) {
+			entry[i].eax &= eax_mask;
+			cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
+			supported = xcr0_mask | xss_mask;
+			entry[i].ebx = 0;
+			entry[i].edx = 0;
+			entry[i].ecx &= xss_mask;
+			if (entry[i].eax & (F(XSAVES) | F(XSAVEC))) {
+				entry[i].ebx =
+					xstate_required_size(supported,
+							     true);
+			}
+		} else {
+			supported = (entry[i].ecx & 1) ? xss_mask :
+				     xcr0_mask;
+			if (entry[i].eax == 0 || !(supported & mask))
+				continue;
+			entry[i].ecx &= 1;
+			entry[i].edx = 0;
+			if (entry[i].ecx)
+				entry[i].ebx = 0;
+		}
+		entry[i].flags |=
+			KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+		++*nent;
+		++i;
+	}
+	return 0;
+}
+
 static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 				 u32 index, int *nent, int maxnent)
 {
@@ -405,12 +454,13 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ |
 		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
 		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
-		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B);
+		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | F(SHSTK);
 
 	/* cpuid 7.0.edx*/
 	const u32 kvm_cpuid_7_0_edx_x86_features =
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
-		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP);
+		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
+		F(IBT);
 
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
@@ -565,44 +615,23 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 		break;
 	}
 	case 0xd: {
-		int idx, i;
-		u64 supported = kvm_supported_xcr0();
+		u64 u_supported = kvm_supported_xcr0();
+		u64 s_supported = kvm_supported_xss();
+		u32 eax_mask = kvm_cpuid_D_1_eax_x86_features;
 
-		entry->eax &= supported;
-		entry->ebx = xstate_required_size(supported, false);
+		entry->eax &= u_supported;
+		entry->ebx = xstate_required_size(u_supported, false);
 		entry->ecx = entry->ebx;
-		entry->edx &= supported >> 32;
+		entry->edx &= u_supported >> 32;
 		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-		if (!supported)
+
+		if (!u_supported && !s_supported)
 			break;
 
-		for (idx = 1, i = 1; idx < 64; ++idx) {
-			u64 mask = ((u64)1 << idx);
-			if (*nent >= maxnent)
-				goto out;
+		if (__do_cpuid_dx_leaf(entry, nent, maxnent, s_supported,
+				       u_supported, eax_mask) < 0)
+			goto out;
 
-			do_cpuid_1_ent(&entry[i], function, idx);
-			if (idx == 1) {
-				entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
-				cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
-				entry[i].ebx = 0;
-				if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
-					entry[i].ebx =
-						xstate_required_size(supported,
-								     true);
-			} else {
-				if (entry[i].eax == 0 || !(supported & mask))
-					continue;
-				if (WARN_ON_ONCE(entry[i].ecx & 1))
-					continue;
-			}
-			entry[i].ecx = 0;
-			entry[i].edx = 0;
-			entry[i].flags |=
-			       KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-			++*nent;
-			++i;
-		}
 		break;
 	}
 	/* Intel PT */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7c015416fd58..574428375ff9 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1637,6 +1637,11 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
 	return !(val & ~valid_bits);
 }
 
+static __always_inline u64 vmx_supported_xss(void)
+{
+	return host_xss;
+}
+
 static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 {
 	switch (msr->index) {
@@ -7711,6 +7716,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.set_nested_state = NULL,
 	.get_vmcs12_pages = NULL,
 	.nested_enable_evmcs = NULL,
+	.supported_xss = vmx_supported_xss,
 };
 
 static void vmx_cleanup_l1d_flush(void)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 28406aa1136d..e96616149f84 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -288,6 +288,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
 				| XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU)
+
+#define KVM_SUPPORTED_XSS	(XFEATURE_MASK_CET_USER \
+				| XFEATURE_MASK_CET_KERNEL)
+
 extern u64 host_xcr0;
 
 extern u64 kvm_supported_xcr0(void);
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 3/8] KVM: x86: Fix XSAVE size calculation issue
  2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
  2019-05-22  7:00 ` [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits Yang Weijiang
  2019-05-22  7:00 ` [PATCH v5 2/8] KVM: x86: Implement CET CPUID support for Guest Yang Weijiang
@ 2019-05-22  7:00 ` Yang Weijiang
  2019-05-22  7:00 ` [PATCH v5 4/8] KVM: VMX: Pass through CET related MSRs to Guest Yang Weijiang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:00 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

According to the SDM Vol 2, CPUID.(EAX=0xD, ECX=1):EBX reports the
XSAVE area size containing all states enabled by XCR0 | IA32_XSS.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
---
 arch/x86/kvm/cpuid.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b9fc967fe55a..7be16ef0ea4a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -123,7 +123,8 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 
 	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
 	if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+		best->ebx = xstate_required_size(vcpu->arch.xcr0 |
+			    kvm_supported_xss(), true);
 
 	/*
 	 * The existing code assumes virtual address is 48-bit or 57-bit in the
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 4/8] KVM: VMX: Pass through CET related MSRs to Guest
  2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
                   ` (2 preceding siblings ...)
  2019-05-22  7:00 ` [PATCH v5 3/8] KVM: x86: Fix XSAVE size calculation issue Yang Weijiang
@ 2019-05-22  7:00 ` Yang Weijiang
  2019-06-04 19:59   ` Sean Christopherson
  2019-05-22  7:00 ` [PATCH v5 5/8] KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest Yang Weijiang
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:00 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

CET MSRs are passed through to the Guest directly to enhance performance.
CET runtime control settings are stored in MSR_IA32_{U,S}_CET,
Shadow Stack Pointers (SSP) are stored in MSR_IA32_PL{0,1,2,3}_SSP,
and the SSP table base address is stored in MSR_IA32_INT_SSP_TAB.
These MSRs are defined in the kernel and re-used here.

MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user mode protection,
and their contents can differ from process to process, so the kernel
needs to save/restore them during a context switch. It therefore makes
sense to pass them through so that the guest kernel can use
XSAVES/XRSTORS to operate on them efficiently. The other MSRs are used
for non-user mode protection. See the CET spec for detailed info.

The difference between the CET VMCS state fields and the xsave
components is that the former are used for CET state storage across
VM-Entry/VM-Exit, whereas the latter are used for state retention across
Guest task/process switches.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 574428375ff9..9321da538f65 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6942,6 +6942,7 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
 static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	unsigned long *msr_bitmap;
 
 	if (cpu_has_secondary_exec_ctrls()) {
 		vmx_compute_secondary_exec_control(vmx);
@@ -6963,6 +6964,19 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
 	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
 			guest_cpuid_has(vcpu, X86_FEATURE_INTEL_PT))
 		update_intel_pt_cfg(vcpu);
+
+	msr_bitmap = vmx->vmcs01.msr_bitmap;
+
+	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW);
+		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW);
+	}
 }
 
 static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
@@ -7163,6 +7177,7 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
 		spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
 		vcpu->pre_pcpu = -1;
 	}
+
 }
 
 /*
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 5/8] KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest
  2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
                   ` (3 preceding siblings ...)
  2019-05-22  7:00 ` [PATCH v5 4/8] KVM: VMX: Pass through CET related MSRs to Guest Yang Weijiang
@ 2019-05-22  7:00 ` Yang Weijiang
  2019-06-04 20:03   ` Sean Christopherson
  2019-05-22  7:00 ` [PATCH v5 6/8] KVM: x86: Allow Guest to set supported bits in XSS Yang Weijiang
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:00 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

"Load Guest CET state" bit controls whether Guest CET states
will be loaded at Guest entry. Before doing that, KVM needs
to check if CPU CET feature is available to Guest.

Note: SHSTK and IBT features share one control MSR:
MSR_IA32_{U,S}_CET, which means it's difficult to hide
one feature from another in the case of SHSTK != IBT,
after discussed in community, it's agreed to allow Guest
control two features independently as it won't introduce
security hole.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9321da538f65..1c0d487a4037 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -47,6 +47,7 @@
 #include <asm/spec-ctrl.h>
 #include <asm/virtext.h>
 #include <asm/vmx.h>
+#include <asm/cet.h>
 
 #include "capabilities.h"
 #include "cpuid.h"
@@ -2929,6 +2930,17 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 		if (!nested_vmx_allowed(vcpu) || is_smm(vcpu))
 			return 1;
 	}
+	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
+	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
+		if (cr4 & X86_CR4_CET)
+			vmcs_set_bits(VM_ENTRY_CONTROLS,
+				      VM_ENTRY_LOAD_GUEST_CET_STATE);
+		else
+			vmcs_clear_bits(VM_ENTRY_CONTROLS,
+					VM_ENTRY_LOAD_GUEST_CET_STATE);
+	} else if (cr4 & X86_CR4_CET) {
+		return 1;
+	}
 
 	if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
 		return 1;
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 6/8] KVM: x86: Allow Guest to set supported bits in XSS
  2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
                   ` (4 preceding siblings ...)
  2019-05-22  7:00 ` [PATCH v5 5/8] KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest Yang Weijiang
@ 2019-05-22  7:00 ` Yang Weijiang
  2019-05-22  7:01 ` [PATCH v5 7/8] KVM: x86: Load Guest fpu state when accessing MSRs managed by XSAVES Yang Weijiang
  2019-05-22  7:01 ` [PATCH v5 8/8] KVM: x86: Add user-space access interface for CET MSRs Yang Weijiang
  7 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:00 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

KVM now supports setting CET-related bits in XSS.
Previously, KVM did not support setting any bits in XSS,
so it hardcoded its check to inject a #GP if the Guest
attempted to write a non-zero value to XSS.
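
For illustration only (mask values per the CET spec, helper name made
up): with this change a guest kernel that sees the CET bits in
CPUID.(EAX=0xD, ECX=1):ECX can set them in IA32_XSS without the write
being rejected:

/* Sketch of a guest enabling the CET XSAVES components (XSS bits 11/12). */
#define XFEATURE_MASK_CET_USER		(1ULL << 11)
#define XFEATURE_MASK_CET_KERNEL	(1ULL << 12)

static void guest_enable_cet_xstate(void)
{
	u64 xss;

	rdmsrl(MSR_IA32_XSS, xss);
	xss |= XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL;
	wrmsrl(MSR_IA32_XSS, xss);	/* previously #GP for any non-zero value */
}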

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            | 13 ++++++++++---
 arch/x86/kvm/vmx/vmx.c          |  7 ++-----
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8c3f0ddc7676..035367694056 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -620,6 +620,7 @@ struct kvm_vcpu_arch {
 
 	u64 xcr0;
 	u64 guest_supported_xcr0;
+	u64 guest_supported_xss;
 	u32 guest_xstate_size;
 
 	struct kvm_pio_request pio;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7be16ef0ea4a..b645a143584f 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -122,9 +122,16 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 	}
 
 	best = kvm_find_cpuid_entry(vcpu, 0xD, 1);
-	if (best && (best->eax & (F(XSAVES) | F(XSAVEC))))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0 |
-			    kvm_supported_xss(), true);
+	if (best) {
+		if (best->eax & (F(XSAVES) | F(XSAVEC)))
+			best->ebx = xstate_required_size(vcpu->arch.xcr0 |
+				    kvm_supported_xss(), true);
+
+		vcpu->arch.guest_supported_xss = best->ecx &
+			kvm_supported_xss();
+	} else {
+		vcpu->arch.guest_supported_xss = 0;
+	}
 
 	/*
 	 * The existing code assumes virtual address is 48-bit or 57-bit in the
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1c0d487a4037..dec6bda20235 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1945,12 +1945,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_XSS:
 		if (!vmx_xsaves_supported())
 			return 1;
-		/*
-		 * The only supported bit as of Skylake is bit 8, but
-		 * it is not supported on KVM.
-		 */
-		if (data != 0)
+		if (data & ~vcpu->arch.guest_supported_xss)
 			return 1;
+
 		vcpu->arch.ia32_xss = data;
 		if (vcpu->arch.ia32_xss != host_xss)
 			add_atomic_switch_msr(vmx, MSR_IA32_XSS,
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 7/8] KVM: x86: Load Guest fpu state when accessing MSRs managed by XSAVES
  2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
                   ` (5 preceding siblings ...)
  2019-05-22  7:00 ` [PATCH v5 6/8] KVM: x86: Allow Guest to set supported bits in XSS Yang Weijiang
@ 2019-05-22  7:01 ` Yang Weijiang
  2019-05-22  7:01 ` [PATCH v5 8/8] KVM: x86: Add user-space access interface for CET MSRs Yang Weijiang
  7 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:01 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

From: Sean Christopherson <sean.j.christopherson@intel.com>

A handful of CET MSRs are not context switched through "traditional"
methods, e.g. VMCS or manual switching, but rather are passed through
to the guest and are saved and restored by XSAVES/XRSTORS, i.e. the
guest's FPU state.

Load the guest's FPU state if userspace is accessing MSRs whose values
are managed by XSAVES so that the MSR helper, e.g. vmx_{get,set}_msr(),
can simply do {RD,WR}MSR to access the guest's value.

Note that guest_cpuid_has() is not queried as host userspace is allowed
to access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/x86.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c3fb4f6b678d..8f1bc65495a9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -105,6 +105,8 @@ static void enter_smm(struct kvm_vcpu *vcpu);
 static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 static void store_regs(struct kvm_vcpu *vcpu);
 static int sync_regs(struct kvm_vcpu *vcpu);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
 
 struct kvm_x86_ops *kvm_x86_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_x86_ops);
@@ -2901,6 +2903,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 }
 EXPORT_SYMBOL_GPL(kvm_get_msr_common);
 
+static bool is_xsaves_msr(u32 index)
+{
+	return index == MSR_IA32_U_CET ||
+	       (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
+}
+
 /*
  * Read or write a bunch of msrs. All parameters are kernel addresses.
  *
@@ -2911,11 +2919,30 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
 		    int (*do_msr)(struct kvm_vcpu *vcpu,
 				  unsigned index, u64 *data))
 {
+	bool fpu_loaded = false;
 	int i;
+	u64 cet_bits = XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL;
+	u64 host_xss = 0;
+
+	for (i = 0; i < msrs->nmsrs; ++i) {
+		if (!fpu_loaded && is_xsaves_msr(entries[i].index)) {
+			if (!kvm_x86_ops->xsaves_supported() ||
+			    !kvm_x86_ops->supported_xss())
+				continue;
+
+			host_xss = kvm_x86_ops->supported_xss();
 
-	for (i = 0; i < msrs->nmsrs; ++i)
+			if ((host_xss & cet_bits) != cet_bits)
+				continue;
+
+			kvm_load_guest_fpu(vcpu);
+			fpu_loaded = true;
+		}
 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
 			break;
+	}
+	if (fpu_loaded)
+		kvm_put_guest_fpu(vcpu);
 
 	return i;
 }
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v5 8/8] KVM: x86: Add user-space access interface for CET MSRs
  2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
                   ` (6 preceding siblings ...)
  2019-05-22  7:01 ` [PATCH v5 7/8] KVM: x86: Load Guest fpu state when accessing MSRs managed by XSAVES Yang Weijiang
@ 2019-05-22  7:01 ` Yang Weijiang
  7 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2019-05-22  7:01 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson, mst, rkrcmar, jmattson,
	linux-kernel, kvm, yu-cheng.yu
  Cc: weijiang.yang

There are two different places where Guest CET states are stored.
The states managed with XSAVES/XRSTORS, saved/restored as in the
previous patch, can be read/written directly from/to the MSRs.
Those stored in VMCS fields are accessed via vmcs_readl()/
vmcs_writel().
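
For illustration only: with this interface in place, user space can
reach the CET MSRs through the regular KVM_GET_MSRS/KVM_SET_MSRS vCPU
ioctls. A rough, hypothetical sketch of reading one of them:

#include <sys/ioctl.h>
#include <linux/kvm.h>

#define MSR_IA32_PL3_SSP	0x6a7	/* user shstk pointer */

static int get_pl3_ssp(int vcpu_fd, __u64 *val)
{
	__u64 buf[(sizeof(struct kvm_msrs) + sizeof(struct kvm_msr_entry)) / 8] = { 0 };
	struct kvm_msrs *msrs = (struct kvm_msrs *)buf;

	msrs->nmsrs = 1;
	msrs->entries[0].index = MSR_IA32_PL3_SSP;

	/* KVM_GET_MSRS returns the number of MSRs successfully read. */
	if (ioctl(vcpu_fd, KVM_GET_MSRS, msrs) != 1)
		return -1;

	*val = msrs->entries[0].data;
	return 0;
}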

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/include/asm/msr-index.h |  2 ++
 arch/x86/kvm/vmx/vmx.c           | 43 ++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index dc0a67c1ed80..53a4ef337846 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -827,6 +827,8 @@
 #define MSR_IA32_U_CET		0x6a0 /* user mode cet setting */
 #define MSR_IA32_S_CET		0x6a2 /* kernel mode cet setting */
 #define MSR_IA32_PL0_SSP	0x6a4 /* kernel shstk pointer */
+#define MSR_IA32_PL1_SSP	0x6a5 /* ring 1 shstk pointer */
+#define MSR_IA32_PL2_SSP	0x6a6 /* ring 2 shstk pointer */
 #define MSR_IA32_PL3_SSP	0x6a7 /* user shstk pointer */
 #define MSR_IA32_INT_SSP_TAB	0x6a8 /* exception shstk table */
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dec6bda20235..233f58af3878 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1777,6 +1777,27 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
 		break;
+	case MSR_IA32_S_CET:
+		msr_info->data = vmcs_readl(GUEST_S_CET);
+		break;
+	case MSR_IA32_U_CET:
+		rdmsrl(MSR_IA32_U_CET, msr_info->data);
+		break;
+	case MSR_IA32_INT_SSP_TAB:
+		msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
+		break;
+	case MSR_IA32_PL0_SSP:
+		rdmsrl(MSR_IA32_PL0_SSP, msr_info->data);
+		break;
+	case MSR_IA32_PL1_SSP:
+		rdmsrl(MSR_IA32_PL1_SSP, msr_info->data);
+		break;
+	case MSR_IA32_PL2_SSP:
+		rdmsrl(MSR_IA32_PL2_SSP, msr_info->data);
+		break;
+	case MSR_IA32_PL3_SSP:
+		rdmsrl(MSR_IA32_PL3_SSP, msr_info->data);
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
@@ -2012,6 +2033,28 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			vmx->pt_desc.guest.addr_a[index / 2] = data;
 		break;
+	case MSR_IA32_S_CET:
+		vmcs_writel(GUEST_S_CET, data);
+		break;
+	case MSR_IA32_U_CET:
+		wrmsrl(MSR_IA32_U_CET, data);
+		break;
+	case MSR_IA32_INT_SSP_TAB:
+		vmcs_writel(GUEST_INTR_SSP_TABLE, data);
+		break;
+	case MSR_IA32_PL0_SSP:
+		wrmsrl(MSR_IA32_PL0_SSP, data);
+		break;
+	case MSR_IA32_PL1_SSP:
+		wrmsrl(MSR_IA32_PL1_SSP, data);
+		break;
+	case MSR_IA32_PL2_SSP:
+		wrmsrl(MSR_IA32_PL2_SSP, data);
+		break;
+	case MSR_IA32_PL3_SSP:
+		wrmsrl(MSR_IA32_PL3_SSP, data);
+		break;
+
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits
  2019-05-22  7:00 ` [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits Yang Weijiang
@ 2019-06-04 14:46   ` Sean Christopherson
  2019-06-05  2:30     ` Yang Weijiang
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2019-06-04 14:46 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, mst, rkrcmar, jmattson, linux-kernel, kvm, yu-cheng.yu

On Wed, May 22, 2019 at 03:00:54PM +0800, Yang Weijiang wrote:
> CET(Control-flow Enforcement Technology) is an upcoming Intel® processor
> family feature that blocks return/jump-oriented programming (ROP) attacks.
> It provides the following capabilities to defend
> against ROP/JOP style control-flow subversion attacks:
> 
> - Shadow Stack (SHSTK):
>   A second stack for the program that is used exclusively for
>   control transfer operations.
> 
> - Indirect Branch Tracking (IBT):
>   Free branch protection to defend against jump/call oriented
>   programming.

What is "free" referring to here?  The software enabling certainly isn't
free, and I doubt the hardware/ucode cost is completely free.
 
> Several new CET MSRs are defined in kernel to support CET:
> MSR_IA32_{U,S}_CET - MSRs to control the CET settings for user
> mode and suervisor mode respectively.
> 
> MSR_IA32_PL{0,1,2,3}_SSP - MSRs to store shadow stack pointers for
> CPL-0,1,2,3 levels.
> 
> MSR_IA32_INT_SSP_TAB - MSR to store base address of shadow stack
> pointer table.

For consistency (within the changelog), these should be list style, e.g.:


  - MSR_IA32_{U,S}_CET: Control CET settings for user mode and supervisor
                        mode respectively.

  - MSR_IA32_PL{0,1,2,3}_SSP: Store shadow stack pointers for CPL levels.

  - MSR_IA32_INT_SSP_TAB: Stores base address of shadow stack pointer
                          table.

> Two XSAVES state components are introduced for CET:
> IA32_XSS:[bit 11] - bit for save/restor user mode CET states
> IA32_XSS:[bit 12] - bit for save/restor supervisor mode CET states.

Likewise, use a consistent list format.

> 6 VMCS fields are introduced for CET, {HOST,GUEST}_S_CET is to store
> CET settings in supervisor mode. {HOST,GUEST}_SSP is to store shadow
> stack pointers in supervisor mode. {HOST,GUEST}_INTR_SSP_TABLE is to
> store base address of shadow stack pointer table.

It'd probably be easier to use a list format for the fields, e.g.:

6 VMCS fields are introduced for CET:

  - {HOST,GUEST}_S_CET: stores CET settings for supervisor mode.

  - {HOST,GUEST}_SSP: stores shadow stack pointers for supervisor mode.

  - {HOST,GUEST}_INTR_SSP_TABLE: stores the base address of the shadow
                                 stack pointer table.

> If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host's CET MSRs are restored
> from below VMCS fields at VM-Exit:
> - HOST_S_CET
> - HOST_SSP
> - HOST_INTR_SSP_TABLE

Personal preference, I like indenting lists like this with a space or two
so that the list is clearly delineated.

> If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest's CET MSRs are loaded
> from below VMCS fields at VM-Entry:
> - GUEST_S_CET
> - GUEST_SSP
> - GUEST_INTR_SSP_TABLE
> 
> Apart from VMCS auto-load fields, KVM calls kvm_load_guest_fpu() and
> kvm_put_guest_fpu() to save/restore the guest CET MSR states at
> VM exit/entry. XSAVES/XRSTORS are executed underneath these functions
> if they are supported. The CET xsave area is consolidated with other
> XSAVE components in thread_struct.fpu field.
> 
> When context switch happens during task switch/interrupt/exception etc.,
> Kernel also relies on above functions to switch CET states properly.

These paragraphs about the FPU and KVM behavior don't belong in this
patch.
 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>

Co-developed-by needs to be accompanied by a SOB.  And your SOB should
be last since you sent the patch.  This comment applies to all patches.

See "12) When to use Acked-by:, Cc:, and Co-developed-by:" in
Documentation/process/submitting-patches.rst for details (I recommend
looking at a v5.2-rc* version, a docs update was merged for v5.2).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 2/8] KVM: x86: Implement CET CPUID support for Guest
  2019-05-22  7:00 ` [PATCH v5 2/8] KVM: x86: Implement CET CPUID support for Guest Yang Weijiang
@ 2019-06-04 19:58   ` Sean Christopherson
  2019-06-05  2:51     ` Yang Weijiang
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2019-06-04 19:58 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, mst, rkrcmar, jmattson, linux-kernel, kvm, yu-cheng.yu

On Wed, May 22, 2019 at 03:00:55PM +0800, Yang Weijiang wrote:
> CET SHSTK and IBT features are introduced here so that
> CPUID.(EAX=7, ECX=0):ECX[bit 7] and EDX[bit 20] reflect them.
> CET xsave components for supervisor and user mode are reported
> via CPUID.(EAX=0xD, ECX=1):ECX[bit 11] and ECX[bit 12]
> respectively.
> 
> To make the code look clean, wrap CPUID(0xD,n>=1) report code in
> a helper function now.

Create the helper in a separate patch so that it's introduced without
any functional changes.
 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  4 +-
>  arch/x86/kvm/cpuid.c            | 97 +++++++++++++++++++++------------
>  arch/x86/kvm/vmx/vmx.c          |  6 ++
>  arch/x86/kvm/x86.h              |  4 ++
>  4 files changed, 76 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index a5db4475e72d..8c3f0ddc7676 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -91,7 +91,8 @@
>  			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>  			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> +			  | X86_CR4_CET))

As I mentioned in v4, the patch ordering is wrong.  Features shouldn't be
advertised to userspace or exposed to the guest until they're fully
supported in KVM, i.e. the bulk of this patch to advertise the CPUID bits
and allow CR4.CET=1 belongs at the end of the series.

>  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>  
> @@ -1192,6 +1193,7 @@ struct kvm_x86_ops {
>  	int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
>  				   uint16_t *vmcs_version);
>  	uint16_t (*nested_get_evmcs_version)(struct kvm_vcpu *vcpu);
> +	u64 (*supported_xss)(void);
>  };
>  
>  struct kvm_arch_async_pf {
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index fd3951638ae4..b9fc967fe55a 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -65,6 +65,11 @@ u64 kvm_supported_xcr0(void)
>  	return xcr0;
>  }
>  
> +u64 kvm_supported_xss(void)
> +{
> +	return KVM_SUPPORTED_XSS & kvm_x86_ops->supported_xss();
> +}
> +
>  #define F(x) bit(X86_FEATURE_##x)
>  
>  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> @@ -316,6 +321,50 @@ static int __do_cpuid_ent_emulated(struct kvm_cpuid_entry2 *entry,
>  	return 0;
>  }
>  
> +static inline int __do_cpuid_dx_leaf(struct kvm_cpuid_entry2 *entry, int *nent,
> +				     int maxnent, u64 xss_mask, u64 xcr0_mask,
> +				     u32 eax_mask)
> +{
> +	int idx, i;
> +	u64 mask;
> +	u64 supported;
> +
> +	for (idx = 1, i = 1; idx < 64; ++idx) {
> +		mask = ((u64)1 << idx);
> +		if (*nent >= maxnent)
> +			return -EINVAL;
> +
> +		do_cpuid_1_ent(&entry[i], 0xD, idx);
> +		if (idx == 1) {
> +			entry[i].eax &= eax_mask;
> +			cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
> +			supported = xcr0_mask | xss_mask;
> +			entry[i].ebx = 0;
> +			entry[i].edx = 0;
> +			entry[i].ecx &= xss_mask;
> +			if (entry[i].eax & (F(XSAVES) | F(XSAVEC))) {
> +				entry[i].ebx =
> +					xstate_required_size(supported,
> +							     true);
> +			}
> +		} else {
> +			supported = (entry[i].ecx & 1) ? xss_mask :
> +				     xcr0_mask;
> +			if (entry[i].eax == 0 || !(supported & mask))
> +				continue;
> +			entry[i].ecx &= 1;
> +			entry[i].edx = 0;
> +			if (entry[i].ecx)
> +				entry[i].ebx = 0;
> +		}
> +		entry[i].flags |=
> +			KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> +		++*nent;
> +		++i;
> +	}
> +	return 0;
> +}
> +
>  static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  				 u32 index, int *nent, int maxnent)
>  {
> @@ -405,12 +454,13 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ |
>  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
>  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> -		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B);
> +		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | F(SHSTK);
>  
>  	/* cpuid 7.0.edx*/
>  	const u32 kvm_cpuid_7_0_edx_x86_features =
>  		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
> -		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP);
> +		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
> +		F(IBT);
>  
>  	/* all calls to cpuid_count() should be made on the same cpu */
>  	get_cpu();
> @@ -565,44 +615,23 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>  		break;
>  	}
>  	case 0xd: {
> -		int idx, i;
> -		u64 supported = kvm_supported_xcr0();
> +		u64 u_supported = kvm_supported_xcr0();
> +		u64 s_supported = kvm_supported_xss();
> +		u32 eax_mask = kvm_cpuid_D_1_eax_x86_features;
>  
> -		entry->eax &= supported;
> -		entry->ebx = xstate_required_size(supported, false);
> +		entry->eax &= u_supported;
> +		entry->ebx = xstate_required_size(u_supported, false);
>  		entry->ecx = entry->ebx;
> -		entry->edx &= supported >> 32;
> +		entry->edx &= u_supported >> 32;
>  		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> -		if (!supported)
> +
> +		if (!u_supported && !s_supported)
>  			break;
>  
> -		for (idx = 1, i = 1; idx < 64; ++idx) {
> -			u64 mask = ((u64)1 << idx);
> -			if (*nent >= maxnent)
> -				goto out;
> +		if (__do_cpuid_dx_leaf(entry, nent, maxnent, s_supported,
> +				       u_supported, eax_mask) < 0)
> +			goto out;
>  
> -			do_cpuid_1_ent(&entry[i], function, idx);
> -			if (idx == 1) {
> -				entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
> -				cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
> -				entry[i].ebx = 0;
> -				if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
> -					entry[i].ebx =
> -						xstate_required_size(supported,
> -								     true);
> -			} else {
> -				if (entry[i].eax == 0 || !(supported & mask))
> -					continue;
> -				if (WARN_ON_ONCE(entry[i].ecx & 1))
> -					continue;
> -			}
> -			entry[i].ecx = 0;
> -			entry[i].edx = 0;
> -			entry[i].flags |=
> -			       KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> -			++*nent;
> -			++i;
> -		}
>  		break;
>  	}
>  	/* Intel PT */
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 7c015416fd58..574428375ff9 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1637,6 +1637,11 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
>  	return !(val & ~valid_bits);
>  }
>  
> +static __always_inline u64 vmx_supported_xss(void)

This can't be __always_inline since it's invoked indirectly.  Out of
curiosity, does the compiler generate a warning of any kind?

> +{
> +	return host_xss;
> +}
> +
>  static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
>  {
>  	switch (msr->index) {
> @@ -7711,6 +7716,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
>  	.set_nested_state = NULL,
>  	.get_vmcs12_pages = NULL,
>  	.nested_enable_evmcs = NULL,
> +	.supported_xss = vmx_supported_xss,
>  };
>  
>  static void vmx_cleanup_l1d_flush(void)
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 28406aa1136d..e96616149f84 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -288,6 +288,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
>  				| XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \
>  				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
>  				| XFEATURE_MASK_PKRU)
> +
> +#define KVM_SUPPORTED_XSS	(XFEATURE_MASK_CET_USER \
> +				| XFEATURE_MASK_CET_KERNEL)
> +
>  extern u64 host_xcr0;
>  
>  extern u64 kvm_supported_xcr0(void);
> -- 
> 2.17.2
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 4/8] KVM: VMX: Pass through CET related MSRs to Guest
  2019-05-22  7:00 ` [PATCH v5 4/8] KVM: VMX: Pass through CET related MSRs to Guest Yang Weijiang
@ 2019-06-04 19:59   ` Sean Christopherson
  0 siblings, 0 replies; 16+ messages in thread
From: Sean Christopherson @ 2019-06-04 19:59 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, mst, rkrcmar, jmattson, linux-kernel, kvm, yu-cheng.yu

On Wed, May 22, 2019 at 03:00:57PM +0800, Yang Weijiang wrote:
> CET MSRs pass through Guest directly to enhance performance.
> CET runtime control settings are stored in MSR_IA32_{U,S}_CET,
> Shadow Stack Pointer(SSP) are presented in MSR_IA32_PL{0,1,2,3}_SSP,
> SSP table base address is stored in MSR_IA32_INT_SSP_TAB,
> these MSRs are defined in kernel and re-used here.
> 
> MSR_IA32_U_CET and MSR_IA32_PL3_SSP are used for user mode protection,
> the contents could differ from process to process, therefore,
> kernel needs to save/restore them during context switch, so it makes
> sense to pass through them so that the guest kernel can
> use xsaves/xrstors to operate them efficiently. Ohter MSRs are used
> for non-user mode protection. See CET spec for detailed info.
> 
> The difference between CET VMCS state fields and xsave components is,
> the former used for CET state storage during VMEnter/VMExit,
> whereas the latter used for state retention between Guest task/process
> switch.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 574428375ff9..9321da538f65 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6942,6 +6942,7 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
>  static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	unsigned long *msr_bitmap;
>  
>  	if (cpu_has_secondary_exec_ctrls()) {
>  		vmx_compute_secondary_exec_control(vmx);
> @@ -6963,6 +6964,19 @@ static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
>  	if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
>  			guest_cpuid_has(vcpu, X86_FEATURE_INTEL_PT))
>  		update_intel_pt_cfg(vcpu);
> +
> +	msr_bitmap = vmx->vmcs01.msr_bitmap;
> +
> +	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> +	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_U_CET, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_S_CET, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL0_SSP, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL1_SSP, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL2_SSP, MSR_TYPE_RW);
> +		vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_PL3_SSP, MSR_TYPE_RW);
> +	}
>  }
>  
>  static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
> @@ -7163,6 +7177,7 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
>  		spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
>  		vcpu->pre_pcpu = -1;
>  	}
> +

Spurious whitespace change.

>  }
>  
>  /*
> -- 
> 2.17.2
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 5/8] KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest
  2019-05-22  7:00 ` [PATCH v5 5/8] KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest Yang Weijiang
@ 2019-06-04 20:03   ` Sean Christopherson
  2019-06-05  1:49     ` Yang Weijiang
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2019-06-04 20:03 UTC (permalink / raw)
  To: Yang Weijiang
  Cc: pbonzini, mst, rkrcmar, jmattson, linux-kernel, kvm, yu-cheng.yu

On Wed, May 22, 2019 at 03:00:58PM +0800, Yang Weijiang wrote:
> "Load Guest CET state" bit controls whether Guest CET states
> will be loaded at Guest entry. Before doing that, KVM needs
> to check if CPU CET feature is available to Guest.
> 
> Note: SHSTK and IBT features share one control MSR:
> MSR_IA32_{U,S}_CET, which means it's difficult to hide
> one feature from another in the case of SHSTK != IBT,
> after discussed in community, it's agreed to allow Guest
> control two features independently as it won't introduce
> security hole.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9321da538f65..1c0d487a4037 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -47,6 +47,7 @@
>  #include <asm/spec-ctrl.h>
>  #include <asm/virtext.h>
>  #include <asm/vmx.h>
> +#include <asm/cet.h>

Is this include actually needed?  I haven't attempted to compile, but a
glance everything should be in cpufeatures.h or vmx.h.

>  #include "capabilities.h"
>  #include "cpuid.h"
> @@ -2929,6 +2930,17 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
>  		if (!nested_vmx_allowed(vcpu) || is_smm(vcpu))
>  			return 1;
>  	}
> +	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> +	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
> +		if (cr4 & X86_CR4_CET)
> +			vmcs_set_bits(VM_ENTRY_CONTROLS,
> +				      VM_ENTRY_LOAD_GUEST_CET_STATE);
> +		else
> +			vmcs_clear_bits(VM_ENTRY_CONTROLS,
> +					VM_ENTRY_LOAD_GUEST_CET_STATE);
> +	} else if (cr4 & X86_CR4_CET) {
> +		return 1;
> +	}

Don't we also need to check for host CET support prior to toggling
VM_ENTRY_LOAD_GUEST_CET_STATE?

>  
>  	if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
>  		return 1;
> -- 
> 2.17.2
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 5/8] KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest
  2019-06-04 20:03   ` Sean Christopherson
@ 2019-06-05  1:49     ` Yang Weijiang
  0 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2019-06-05  1:49 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Yang Weijiang, pbonzini, mst, rkrcmar, jmattson, linux-kernel,
	kvm, yu-cheng.yu

On Tue, Jun 04, 2019 at 01:03:36PM -0700, Sean Christopherson wrote:
> On Wed, May 22, 2019 at 03:00:58PM +0800, Yang Weijiang wrote:
> > "Load Guest CET state" bit controls whether Guest CET states
> > will be loaded at Guest entry. Before doing that, KVM needs
> > to check if CPU CET feature is available to Guest.
> > 
> > Note: SHSTK and IBT features share one control MSR:
> > MSR_IA32_{U,S}_CET, which means it's difficult to hide
> > one feature from another in the case of SHSTK != IBT,
> > after discussed in community, it's agreed to allow Guest
> > control two features independently as it won't introduce
> > security hole.
> > 
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 9321da538f65..1c0d487a4037 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -47,6 +47,7 @@
> >  #include <asm/spec-ctrl.h>
> >  #include <asm/virtext.h>
> >  #include <asm/vmx.h>
> > +#include <asm/cet.h>
> 
> Is this include actually needed?  I haven't attempted to compile, but a
> glance everything should be in cpufeatures.h or vmx.h.
> 
  Thanks Sean!
  My original purpose was to re-use the macro cpu_x86_cet_enabled() to
  check the host CET status; somehow the check is not there. To resolve
  your question below I need to use the macro for that check, so I will
  keep this include and add the check in the next version.

> >  #include "capabilities.h"
> >  #include "cpuid.h"
> > @@ -2929,6 +2930,17 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> >  		if (!nested_vmx_allowed(vcpu) || is_smm(vcpu))
> >  			return 1;
> >  	}
> > +	if (guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) ||
> > +	    guest_cpuid_has(vcpu, X86_FEATURE_IBT)) {
> > +		if (cr4 & X86_CR4_CET)
> > +			vmcs_set_bits(VM_ENTRY_CONTROLS,
> > +				      VM_ENTRY_LOAD_GUEST_CET_STATE);
> > +		else
> > +			vmcs_clear_bits(VM_ENTRY_CONTROLS,
> > +					VM_ENTRY_LOAD_GUEST_CET_STATE);
> > +	} else if (cr4 & X86_CR4_CET) {
> > +		return 1;
> > +	}
> 
> Don't we also need to check for host CET support prior to toggling
> VM_ENTRY_LOAD_GUEST_CET_STATE?

Yes, the check needs to be added back. The v3 patch changed the CET CPUID
enumeration for the guest and lost the check from then on.
> 
> >  
> >  	if (to_vmx(vcpu)->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
> >  		return 1;
> > -- 
> > 2.17.2
> > 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits
  2019-06-04 14:46   ` Sean Christopherson
@ 2019-06-05  2:30     ` Yang Weijiang
  0 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2019-06-05  2:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Yang Weijiang, pbonzini, mst, rkrcmar, jmattson, linux-kernel,
	kvm, yu-cheng.yu

On Tue, Jun 04, 2019 at 07:46:13AM -0700, Sean Christopherson wrote:
> On Wed, May 22, 2019 at 03:00:54PM +0800, Yang Weijiang wrote:
> > CET(Control-flow Enforcement Technology) is an upcoming Intel® processor
> > family feature that blocks return/jump-oriented programming (ROP) attacks.
> > It provides the following capabilities to defend
> > against ROP/JOP style control-flow subversion attacks:
> > 
> > - Shadow Stack (SHSTK):
> >   A second stack for the program that is used exclusively for
> >   control transfer operations.
> > 
> > - Indirect Branch Tracking (IBT):
> >   Free branch protection to defend against jump/call oriented
> >   programming.
> 
> What is "free" referring to here?  The software enabling certainly isn't
> free, and I doubt the hardware/ucode cost is completely free.
>
Thank you for pointing it out!
"free" comes from the spec; I guess the author means the major effort of
enabling IBT is in the compiler and HW, i.e. little effort is needed for
SW enabling. But as you mentioned, there is actually dedicated effort to
enable it, so I will change the wording.

> > Several new CET MSRs are defined in kernel to support CET:
> > MSR_IA32_{U,S}_CET - MSRs to control the CET settings for user
> > mode and suervisor mode respectively.
> > 
> > MSR_IA32_PL{0,1,2,3}_SSP - MSRs to store shadow stack pointers for
> > CPL-0,1,2,3 levels.
> > 
> > MSR_IA32_INT_SSP_TAB - MSR to store base address of shadow stack
> > pointer table.
> 
> For consistency (within the changelog), these should be list style, e.g.:
> 
> 
>   - MSR_IA32_{U,S}_CET: Control CET settings for user mode and suervisor
>                         mode respectively.
> 
>   - MSR_IA32_PL{0,1,2,3}_SSP: Store shadow stack pointers for CPL levels.
> 
>   - MSR_IA32_INT_SSP_TAB: Stores base address of shadow stack pointer
>                           table.
> 
OK, will change it in next version.
> > Two XSAVES state components are introduced for CET:
> > IA32_XSS:[bit 11] - bit for save/restore user mode CET states
> > IA32_XSS:[bit 12] - bit for save/restore supervisor mode CET states.
> 
> Likewise, use a consistent list format.
> 
> > 6 VMCS fields are introduced for CET, {HOST,GUEST}_S_CET is to store
> > CET settings in supervisor mode. {HOST,GUEST}_SSP is to store shadow
> > stack pointers in supervisor mode. {HOST,GUEST}_INTR_SSP_TABLE is to
> > store base address of shadow stack pointer table.
> 
> It'd probably be easier to use a list format for the fields, e.g.:
> 
> 6 VMCS fields are introduced for CET:
> 
>   - {HOST,GUEST}_S_CET: stores CET settings for supervisor mode.
> 
>   - {HOST,GUEST}_SSP: stores shadow stack pointers for supervisor mode.
> 
>   - {HOST,GUEST}_INTR_SSP_TABLE: stores the base address of the shadow
>                                  stack pointer table.
> 
OK, will modify it.
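For reference, a rough sketch of what the new definitions in
arch/x86/include/asm/vmx.h could look like, covering the fields listed
above and the entry/exit control bits mentioned below (the encodings are
my reading of the SDM and are illustrative only; the actual hunk in
patch 1 is authoritative):

	/* VM-exit and VM-entry control bits for loading CET state. */
	#define VM_EXIT_LOAD_HOST_CET_STATE	0x10000000
	#define VM_ENTRY_LOAD_GUEST_CET_STATE	0x00100000

	/* Natural-width guest/host state fields in enum vmcs_field. */
	GUEST_S_CET		= 0x00006828,
	GUEST_SSP		= 0x0000682a,
	GUEST_INTR_SSP_TABLE	= 0x0000682c,
	HOST_S_CET		= 0x00006c18,
	HOST_SSP		= 0x00006c1a,
	HOST_INTR_SSP_TABLE	= 0x00006c1c,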
> > If VM_EXIT_LOAD_HOST_CET_STATE = 1, the host's CET MSRs are restored
> > from below VMCS fields at VM-Exit:
> > - HOST_S_CET
> > - HOST_SSP
> > - HOST_INTR_SSP_TABLE
> 
> Personal preference, I like indenting lists like this with a space or two
> so that the list is clearly delineated.
Good suggestion, thanks!
> 
> > If VM_ENTRY_LOAD_GUEST_CET_STATE = 1, the guest's CET MSRs are loaded
> > from below VMCS fields at VM-Entry:
> > - GUEST_S_CET
> > - GUEST_SSP
> > - GUEST_INTR_SSP_TABLE
> > 
> > Apart from VMCS auto-load fields, KVM calls kvm_load_guest_fpu() and
> > kvm_put_guest_fpu() to save/restore the guest CET MSR states at
> > VM exit/entry. XSAVES/XRSTORS are executed underneath these functions
> > if they are supported. The CET xsave area is consolidated with other
> > XSAVE components in thread_struct.fpu field.
> > 
> > When context switch happens during task switch/interrupt/exception etc.,
> > Kernel also relies on above functions to switch CET states properly.
> 
> These paragraphs about the FPU and KVM behavior don't belong in this
> patch.

OK, it does look redundant; I will remove it.
>  
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> 
> Co-developed-by needs to be accompanied by a SOB.  And your SOB should
> be last since you sent the patch.  This comment applies to all patches.
> 
> See "12) When to use Acked-by:, Cc:, and Co-developed-by:" in
> Documentation/process/submitting-patches.rst for details (I recommend
> looking at a v5.2-rc* version, a docs update was merged for v5.2).
Got it, will change all the signatures.
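For example, following submitting-patches.rst the tag block for these
patches would presumably end up as (illustrative only):

  Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
  Signed-off-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
  Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>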

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v5 2/8] KVM: x86: Implement CET CPUID support for Guest
  2019-06-04 19:58   ` Sean Christopherson
@ 2019-06-05  2:51     ` Yang Weijiang
  0 siblings, 0 replies; 16+ messages in thread
From: Yang Weijiang @ 2019-06-05  2:51 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Yang Weijiang, pbonzini, mst, rkrcmar, jmattson, linux-kernel,
	kvm, yu-cheng.yu

On Tue, Jun 04, 2019 at 12:58:01PM -0700, Sean Christopherson wrote:
> On Wed, May 22, 2019 at 03:00:55PM +0800, Yang Weijiang wrote:
> > CET SHSTK and IBT features are introduced here so that
> > CPUID.(EAX=7, ECX=0):ECX[bit 7] and EDX[bit 20] reflect them.
> > CET xsave components for supervisor and user mode are reported
> > via CPUID.(EAX=0xD, ECX=1):ECX[bit 11] and ECX[bit 12]
> > respectively.
> > 
> > To make the code look clean, wrap CPUID(0xD,n>=1) report code in
> > a helper function now.
> 
> Create the helper in a separate patch so that it's introduced without
> any functional changes.
OK, I will add a separate patch to introduce the helper.
>  
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > Co-developed-by: Zhang Yi Z <yi.z.zhang@linux.intel.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  4 +-
> >  arch/x86/kvm/cpuid.c            | 97 +++++++++++++++++++++------------
> >  arch/x86/kvm/vmx/vmx.c          |  6 ++
> >  arch/x86/kvm/x86.h              |  4 ++
> >  4 files changed, 76 insertions(+), 35 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index a5db4475e72d..8c3f0ddc7676 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -91,7 +91,8 @@
> >  			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
> >  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
> >  			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> > -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> > +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> > +			  | X86_CR4_CET))
> 
> As I mentioned in v4, the patch ordering is wrong.  Features shouldn't be
> advertised to userspace or exposed to the guest until they're fully
> supported in KVM, i.e. the bulk of this patch to advertise the CPUID bits
> and allow CR4.CET=1 belongs at the end of the series.
> 
How about merging it into patch 6/8?
> >  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
> >  
> > @@ -1192,6 +1193,7 @@ struct kvm_x86_ops {
> >  	int (*nested_enable_evmcs)(struct kvm_vcpu *vcpu,
> >  				   uint16_t *vmcs_version);
> >  	uint16_t (*nested_get_evmcs_version)(struct kvm_vcpu *vcpu);
> > +	u64 (*supported_xss)(void);
> >  };
> >  
> >  struct kvm_arch_async_pf {
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index fd3951638ae4..b9fc967fe55a 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -65,6 +65,11 @@ u64 kvm_supported_xcr0(void)
> >  	return xcr0;
> >  }
> >  
> > +u64 kvm_supported_xss(void)
> > +{
> > +	return KVM_SUPPORTED_XSS & kvm_x86_ops->supported_xss();
> > +}
> > +
> >  #define F(x) bit(X86_FEATURE_##x)
> >  
> >  int kvm_update_cpuid(struct kvm_vcpu *vcpu)
> > @@ -316,6 +321,50 @@ static int __do_cpuid_ent_emulated(struct kvm_cpuid_entry2 *entry,
> >  	return 0;
> >  }
> >  
> > +static inline int __do_cpuid_dx_leaf(struct kvm_cpuid_entry2 *entry, int *nent,
> > +				     int maxnent, u64 xss_mask, u64 xcr0_mask,
> > +				     u32 eax_mask)
> > +{
> > +	int idx, i;
> > +	u64 mask;
> > +	u64 supported;
> > +
> > +	for (idx = 1, i = 1; idx < 64; ++idx) {
> > +		mask = ((u64)1 << idx);
> > +		if (*nent >= maxnent)
> > +			return -EINVAL;
> > +
> > +		do_cpuid_1_ent(&entry[i], 0xD, idx);
> > +		if (idx == 1) {
> > +			entry[i].eax &= eax_mask;
> > +			cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
> > +			supported = xcr0_mask | xss_mask;
> > +			entry[i].ebx = 0;
> > +			entry[i].edx = 0;
> > +			entry[i].ecx &= xss_mask;
> > +			if (entry[i].eax & (F(XSAVES) | F(XSAVEC))) {
> > +				entry[i].ebx =
> > +					xstate_required_size(supported,
> > +							     true);
> > +			}
> > +		} else {
> > +			supported = (entry[i].ecx & 1) ? xss_mask :
> > +				     xcr0_mask;
> > +			if (entry[i].eax == 0 || !(supported & mask))
> > +				continue;
> > +			entry[i].ecx &= 1;
> > +			entry[i].edx = 0;
> > +			if (entry[i].ecx)
> > +				entry[i].ebx = 0;
> > +		}
> > +		entry[i].flags |=
> > +			KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> > +		++*nent;
> > +		++i;
> > +	}
> > +	return 0;
> > +}
> > +
> >  static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >  				 u32 index, int *nent, int maxnent)
> >  {
> > @@ -405,12 +454,13 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >  		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ |
> >  		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
> >  		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
> > -		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B);
> > +		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | F(SHSTK);
> >  
> >  	/* cpuid 7.0.edx*/
> >  	const u32 kvm_cpuid_7_0_edx_x86_features =
> >  		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
> > -		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP);
> > +		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
> > +		F(IBT);
> >  
> >  	/* all calls to cpuid_count() should be made on the same cpu */
> >  	get_cpu();
> > @@ -565,44 +615,23 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
> >  		break;
> >  	}
> >  	case 0xd: {
> > -		int idx, i;
> > -		u64 supported = kvm_supported_xcr0();
> > +		u64 u_supported = kvm_supported_xcr0();
> > +		u64 s_supported = kvm_supported_xss();
> > +		u32 eax_mask = kvm_cpuid_D_1_eax_x86_features;
> >  
> > -		entry->eax &= supported;
> > -		entry->ebx = xstate_required_size(supported, false);
> > +		entry->eax &= u_supported;
> > +		entry->ebx = xstate_required_size(u_supported, false);
> >  		entry->ecx = entry->ebx;
> > -		entry->edx &= supported >> 32;
> > +		entry->edx &= u_supported >> 32;
> >  		entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> > -		if (!supported)
> > +
> > +		if (!u_supported && !s_supported)
> >  			break;
> >  
> > -		for (idx = 1, i = 1; idx < 64; ++idx) {
> > -			u64 mask = ((u64)1 << idx);
> > -			if (*nent >= maxnent)
> > -				goto out;
> > +		if (__do_cpuid_dx_leaf(entry, nent, maxnent, s_supported,
> > +				       u_supported, eax_mask) < 0)
> > +			goto out;
> >  
> > -			do_cpuid_1_ent(&entry[i], function, idx);
> > -			if (idx == 1) {
> > -				entry[i].eax &= kvm_cpuid_D_1_eax_x86_features;
> > -				cpuid_mask(&entry[i].eax, CPUID_D_1_EAX);
> > -				entry[i].ebx = 0;
> > -				if (entry[i].eax & (F(XSAVES)|F(XSAVEC)))
> > -					entry[i].ebx =
> > -						xstate_required_size(supported,
> > -								     true);
> > -			} else {
> > -				if (entry[i].eax == 0 || !(supported & mask))
> > -					continue;
> > -				if (WARN_ON_ONCE(entry[i].ecx & 1))
> > -					continue;
> > -			}
> > -			entry[i].ecx = 0;
> > -			entry[i].edx = 0;
> > -			entry[i].flags |=
> > -			       KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
> > -			++*nent;
> > -			++i;
> > -		}
> >  		break;
> >  	}
> >  	/* Intel PT */
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 7c015416fd58..574428375ff9 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -1637,6 +1637,11 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
> >  	return !(val & ~valid_bits);
> >  }
> >  
> > +static __always_inline u64 vmx_supported_xss(void)
> 
> This can't be __always_inline since it's invoked indirectly.  Out of
> curiosity, does the compiler generate a warning of any kind?
> 
So what's your suggestion? Just drop the __always_inline annotation?
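If so, the hunk would presumably just become a plain static helper, e.g.
(sketch only):

	static u64 vmx_supported_xss(void)
	{
		return host_xss;
	}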
> > +{
> > +	return host_xss;
> > +}
> > +
> >  static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> >  {
> >  	switch (msr->index) {
> > @@ -7711,6 +7716,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
> >  	.set_nested_state = NULL,
> >  	.get_vmcs12_pages = NULL,
> >  	.nested_enable_evmcs = NULL,
> > +	.supported_xss = vmx_supported_xss,
> >  };
> >  
> >  static void vmx_cleanup_l1d_flush(void)
> > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > index 28406aa1136d..e96616149f84 100644
> > --- a/arch/x86/kvm/x86.h
> > +++ b/arch/x86/kvm/x86.h
> > @@ -288,6 +288,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, unsigned long cr2,
> >  				| XFEATURE_MASK_YMM | XFEATURE_MASK_BNDREGS \
> >  				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
> >  				| XFEATURE_MASK_PKRU)
> > +
> > +#define KVM_SUPPORTED_XSS	(XFEATURE_MASK_CET_USER \
> > +				| XFEATURE_MASK_CET_KERNEL)
> > +
> >  extern u64 host_xcr0;
> >  
> >  extern u64 kvm_supported_xcr0(void);
> > -- 
> > 2.17.2
> > 

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-06-05  2:52 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-22  7:00 [PATCH v5 0/8] Introduce support for Guest CET feature Yang Weijiang
2019-05-22  7:00 ` [PATCH v5 1/8] KVM: VMX: Define CET VMCS fields and control bits Yang Weijiang
2019-06-04 14:46   ` Sean Christopherson
2019-06-05  2:30     ` Yang Weijiang
2019-05-22  7:00 ` [PATCH v5 2/8] KVM: x86: Implement CET CPUID support for Guest Yang Weijiang
2019-06-04 19:58   ` Sean Christopherson
2019-06-05  2:51     ` Yang Weijiang
2019-05-22  7:00 ` [PATCH v5 3/8] KVM: x86: Fix XSAVE size calculation issue Yang Weijiang
2019-05-22  7:00 ` [PATCH v5 4/8] KVM: VMX: Pass through CET related MSRs to Guest Yang Weijiang
2019-06-04 19:59   ` Sean Christopherson
2019-05-22  7:00 ` [PATCH v5 5/8] KVM: VMX: Load Guest CET via VMCS when CET is enabled in Guest Yang Weijiang
2019-06-04 20:03   ` Sean Christopherson
2019-06-05  1:49     ` Yang Weijiang
2019-05-22  7:00 ` [PATCH v5 6/8] KVM: x86: Allow Guest to set supported bits in XSS Yang Weijiang
2019-05-22  7:01 ` [PATCH v5 7/8] KVM: x86: Load Guest fpu state when accessing MSRs managed by XSAVES Yang Weijiang
2019-05-22  7:01 ` [PATCH v5 8/8] KVM: x86: Add user-space access interface for CET MSRs Yang Weijiang
