linux-kernel.vger.kernel.org archive mirror
* [PATCH v1 0/6] LASS KVM virtualization support
@ 2023-06-01 14:23 Zeng Guang
  2023-06-01 14:23 ` [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize() Zeng Guang
                   ` (8 more replies)
  0 siblings, 9 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-01 14:23 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

Linear Address Space Separation (LASS)[1] is an independent mechanism
that enforces mode-based protection on any access to a linear
address.

Based on the linear-address organization, the 64-bit canonical linear
address space is partitioned into two halves: all linear addresses
whose most significant bit is 0 are user space addresses, while linear
addresses whose most significant bit is 1 are supervisor space addresses.

LASS aims to prevent any attempt by user mode to probe supervisor space
addresses, and likewise to stop any attempt from supervisor mode to
access (if SMAP is enabled) or execute user space addresses.

When the platform has the LASS capability, KVM exposes this feature to
the guest VM, enumerated by CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6], and
allows the guest to enable it via CR4.LASS[bit 27] on demand. For
instructions executed directly in the guest, hardware performs the
check. But KVM also needs to behave the same as hardware and apply LASS
to the various guest memory accesses made when emulating privileged
instructions in software.

KVM performs the following LASS violation checks on the emulation path.
User-mode access to supervisor space address:
        LA[bit 63] && (CPL == 3)
Supervisor-mode access to user space address:
        Instruction fetch: !LA[bit 63] && (CPL < 3)
        Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
                     CPL < 3) || Implicit supervisor access)
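
For illustration, these rules reduce to a small predicate. The sketch
below is not the KVM implementation (see vmx_check_lass() in patch 3);
the helper name and parameter list are hypothetical, and an implicit
supervisor access is modeled as supervisor mode with RFLAGS.AC treated
as clear.

static bool lass_violation(unsigned long la, unsigned int cpl, bool fetch,
			   bool smap, bool rflags_ac, bool implicit)
{
	bool user_addr = !(la >> 63);		/* LA[bit 63] clear */
	bool user_mode = !implicit && cpl == 3;

	/* Access mode matches the address space: no violation. */
	if (user_mode == user_addr)
		return false;

	/* User-mode access to a supervisor space address. */
	if (user_mode)
		return true;

	/* Supervisor-mode instruction fetch from a user space address. */
	if (fetch)
		return true;

	/* Supervisor-mode data access: violation only if SMAP and AC clear. */
	return smap && (implicit || !rflags_ac);
}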

This patch series provides a LASS KVM solution.

We tested the basic functionality of LASS virtualization, including
LASS enumeration and enabling, in non-root and nested environments. As
the KVM unit test framework is not compatible with the LASS rules, we
used a kernel module and an application test to trigger LASS violations
instead. With KVM's forced emulation mechanism, we also verified that
instruction fetches and data accesses on several emulation paths behave
the same as on hardware.
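
For reference, the user-mode half of such a test can be as small as
probing a supervisor space address and expecting a fault. The sketch
below is illustrative only; the probed address and program shape are
assumptions, not the actual test code.

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf env;

static void on_fault(int sig)
{
	siglongjmp(env, 1);
}

int main(void)
{
	/* Arbitrary supervisor space address (bit 63 set), hypothetical. */
	volatile char *kaddr = (volatile char *)0xffff888000000000UL;

	signal(SIGSEGV, on_fault);
	if (sigsetjmp(env, 1) == 0) {
		(void)*kaddr;	/* with CR4.LASS set, faults before paging */
		puts("FAIL: probe of supervisor address did not fault");
		return 1;
	}
	puts("PASS: probe of supervisor address faulted");
	return 0;
}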

[1] Intel ISE, chapter "Linear Address Space Separation (LASS)":
https://cdrdv2.intel.com/v1/dl/getContent/671368

------------------------------------------------------

v0->v1
1. Adapt to new __linearize() API
2. Function refactor of vmx_check_lass()
3. Refine commit message to be more precise
4. Drop LASS kvm cap detection depending
   on hardware capability


Binbin Wu (1):
  KVM: x86: Consolidate flags for __linearize()

Zeng Guang (5):
  KVM: x86: Virtualize CR4.LASS
  KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  KVM: x86: Add emulator helper for LASS violation check
  KVM: x86: LASS protection on KVM emulation
  KVM: x86: Advertise LASS CPUID to user space

 arch/x86/include/asm/kvm-x86-ops.h |  3 +-
 arch/x86/include/asm/kvm_host.h    |  4 ++-
 arch/x86/kvm/cpuid.c               |  5 ++-
 arch/x86/kvm/emulate.c             | 47 +++++++++++++++++++++++-----
 arch/x86/kvm/kvm_emulate.h         |  6 ++++
 arch/x86/kvm/vmx/nested.c          |  3 ++
 arch/x86/kvm/vmx/sgx.c             |  4 +++
 arch/x86/kvm/vmx/vmx.c             | 50 ++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h             |  2 ++
 arch/x86/kvm/x86.c                 | 12 +++++++
 arch/x86/kvm/x86.h                 |  2 ++
 11 files changed, 126 insertions(+), 12 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize()
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
@ 2023-06-01 14:23 ` Zeng Guang
  2023-06-27 17:40   ` Sean Christopherson
  2023-06-01 14:23 ` [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 36+ messages in thread
From: Zeng Guang @ 2023-06-01 14:23 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Binbin Wu, Zeng Guang

From: Binbin Wu <binbin.wu@linux.intel.com>

__linearize() has two bool parameters, write and fetch, and a new flag
will be needed to support new features (e.g. LAM needs a flag to skip
address untagging under some conditions).

Define a 32-bit flags parameter and consolidate the two bools into it.

No functional change intended.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/emulate.c     | 19 +++++++++++++------
 arch/x86/kvm/kvm_emulate.h |  4 ++++
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 936a397a08cd..9508836e8a35 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
 static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 				       struct segmented_address addr,
 				       unsigned *max_size, unsigned size,
-				       bool write, bool fetch,
-				       enum x86emul_mode mode, ulong *linear)
+				       u32 flags, enum x86emul_mode mode,
+				       ulong *linear)
 {
 	struct desc_struct desc;
 	bool usable;
@@ -696,6 +696,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 	u32 lim;
 	u16 sel;
 	u8  va_bits;
+	bool fetch = !!(flags & X86EMUL_F_FETCH);
+	bool write = !!(flags & X86EMUL_F_WRITE);
 
 	la = seg_base(ctxt, addr.seg) + addr.ea;
 	*max_size = 0;
@@ -757,7 +759,11 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
 		     ulong *linear)
 {
 	unsigned max_size;
-	return __linearize(ctxt, addr, &max_size, size, write, false,
+	u32 flags = 0;
+
+	if (write)
+		flags |= X86EMUL_F_WRITE;
+	return __linearize(ctxt, addr, &max_size, size, flags,
 			   ctxt->mode, linear);
 }
 
@@ -768,10 +774,11 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 	unsigned max_size;
 	struct segmented_address addr = { .seg = VCPU_SREG_CS,
 					   .ea = dst };
+	u32 flags = X86EMUL_F_FETCH;
 
 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
-	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+	rc = __linearize(ctxt, addr, &max_size, 1, flags, ctxt->mode, &linear);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->_eip = addr.ea;
 	return rc;
@@ -896,6 +903,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
 	int cur_size = ctxt->fetch.end - ctxt->fetch.data;
 	struct segmented_address addr = { .seg = VCPU_SREG_CS,
 					   .ea = ctxt->eip + cur_size };
+	u32 flags = X86EMUL_F_FETCH;
 
 	/*
 	 * We do not know exactly how many bytes will be needed, and
@@ -907,8 +915,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
 	 * boundary check itself.  Instead, we use max_size to check
 	 * against op_size.
 	 */
-	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
-			 &linear);
+	rc = __linearize(ctxt, addr, &max_size, 0, flags, ctxt->mode, &linear);
 	if (unlikely(rc != X86EMUL_CONTINUE))
 		return rc;
 
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index ab65f3a47dfd..5b9ec610b2cb 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -88,6 +88,10 @@ struct x86_instruction_info {
 #define X86EMUL_IO_NEEDED       5 /* IO is needed to complete emulation */
 #define X86EMUL_INTERCEPTED     6 /* Intercepted by nested VMCB/VMCS */
 
+/* x86-specific emulation flags */
+#define X86EMUL_F_FETCH			BIT(0)
+#define X86EMUL_F_WRITE			BIT(1)
+
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
 	/*
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
  2023-06-01 14:23 ` [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize() Zeng Guang
@ 2023-06-01 14:23 ` Zeng Guang
  2023-06-05  1:57   ` Binbin Wu
                     ` (2 more replies)
  2023-06-01 14:23 ` [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
                   ` (6 subsequent siblings)
  8 siblings, 3 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-01 14:23 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

Virtualize CR4.LASS[bit 27] under KVM control instead of making it
guest-owned, as CR4.LASS is generally set once for each vCPU at boot
time and won't be toggled at runtime. Besides, KVM allows guest software
to set CR4.LASS only if the VM has the LASS capability enumerated by
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6].

Update cr4_fixed1 to set the CR4.LASS bit in the emulated
IA32_VMX_CR4_FIXED1 MSR for guests, allowing guests to enable LASS in
nested VMX operation as well.

Note: Setting CR4.LASS to 1 enables LASS in IA-32e mode. It doesn't take
effect in legacy mode even if CR4.LASS is set.
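
From the guest's point of view, enabling LASS is then a single CR4
write once the CPUID bit is visible. A hypothetical ring-0 sketch, not
taken from any existing kernel code:

static inline void lass_enable(void)
{
	unsigned long cr4;

	/* Read-modify-write CR4; only valid once CPUID enumerates LASS. */
	asm volatile("mov %%cr4, %0" : "=r"(cr4));
	cr4 |= 1UL << 27;	/* CR4.LASS */
	asm volatile("mov %0, %%cr4" :: "r"(cr4));
}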

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 3 +++
 arch/x86/kvm/x86.h              | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fb9d1f2d6136..92d8e65fe88c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,7 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 44fb619803b8..a33205ded85c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7603,6 +7603,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
 	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
 
+	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
+	cr4_fixed1_update(X86_CR4_LASS,       eax, feature_bit(LASS));
+
 #undef cr4_fixed1_update
 }
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c544602d07a3..e1295f490308 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -529,6 +529,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
 		__reserved_bits |= X86_CR4_VMXE;        \
 	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
 		__reserved_bits |= X86_CR4_PCIDE;       \
+	if (!__cpu_has(__c, X86_FEATURE_LASS))          \
+		__reserved_bits |= X86_CR4_LASS;        \
 	__reserved_bits;                                \
 })
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
  2023-06-01 14:23 ` [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize() Zeng Guang
  2023-06-01 14:23 ` [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
@ 2023-06-01 14:23 ` Zeng Guang
  2023-06-05  3:31   ` Binbin Wu
                     ` (3 more replies)
  2023-06-01 14:23 ` [PATCH v1 4/6] KVM: x86: Add emulator helper " Zeng Guang
                   ` (5 subsequent siblings)
  8 siblings, 4 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-01 14:23 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

Intel introduces the LASS (Linear Address Space Separation) feature,
providing an independent mechanism to achieve mode-based protection.

LASS partitions the 64-bit linear address space into two halves,
user-mode addresses (LA[bit 63]=0) and supervisor-mode addresses
(LA[bit 63]=1). It stops any code execution or conditional data
access[1]
    1. from user mode to the supervisor-mode address space
    2. from supervisor mode to the user-mode address space
and generates a LASS violation fault accordingly.

[1] A supervisor-mode data access causes a LASS violation only if
supervisor-mode access protection is enabled (CR4.SMAP = 1) and either
RFLAGS.AC = 0 or the access implicitly accesses a system data structure.

The following are the rules of the LASS violation check on a linear
address (LA).
User access to supervisor-mode address space:
    LA[bit 63] && (CPL == 3)
Supervisor access to user-mode address space:
    Instruction fetch: !LA[bit 63] && (CPL < 3)
    Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
                 CPL < 3) || Implicit supervisor access)

Add new ops in kvm_x86_ops to do LASS violation check.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  3 +-
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/kvm/kvm_emulate.h         |  1 +
 arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h             |  2 ++
 5 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 13bc212cd4bc..8980a3bfa687 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
 KVM_X86_OP(msr_filter_changed)
 KVM_X86_OP(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
-KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
+KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
+KVM_X86_OP_OPTIONAL_RET0(check_lass)
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 92d8e65fe88c..98666d1e7727 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
 	 * Returns vCPU specific APICv inhibit reasons
 	 */
 	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
+
+	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 5b9ec610b2cb..f1439ab7c14b 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -91,6 +91,7 @@ struct x86_instruction_info {
 /* x86-specific emulation flags */
 #define X86EMUL_F_FETCH			BIT(0)
 #define X86EMUL_F_WRITE			BIT(1)
+#define X86EMUL_F_SKIPLASS		BIT(2)
 
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a33205ded85c..876997e8448e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
 	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
 }
 
+/*
+ * Determine whether an access to the linear address causes a LASS violation.
+ * LASS protection is only effective in long mode. As a prerequisite, caller
+ * should make sure vCPU running in long mode and invoke this api to do LASS
+ * violation check.
+ */
+bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
+{
+	bool user_mode, user_as, rflags_ac;
+
+	if (!!(flags & X86EMUL_F_SKIPLASS) ||
+	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
+		return false;
+
+	WARN_ON_ONCE(!is_long_mode(vcpu));
+
+	user_as = !(la >> 63);
+
+	/*
+	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
+	 * accesses a system data structure. For implicit accesses to system
+	 * data structure, the processor acts as if RFLAGS.AC is clear.
+	 */
+	if (access & PFERR_IMPLICIT_ACCESS) {
+		user_mode = false;
+		rflags_ac = false;
+	} else {
+		user_mode = vmx_get_cpl(vcpu) == 3;
+		if (!user_mode)
+			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
+	}
+
+	if (user_mode == user_as)
+		return false;
+
+	/*
+	 * Supervisor-mode _data_ accesses to user address space
+	 * cause LASS violations only if SMAP is enabled.
+	 */
+	if (!user_mode && !(access & PFERR_FETCH_MASK))
+		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
+
+	return true;
+}
+
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
@@ -8269,6 +8314,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.complete_emulated_msr = kvm_complete_insn_gp,
 
 	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
+
+	.check_lass = vmx_check_lass,
 };
 
 static unsigned int vmx_handle_intel_pt_intr(void)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9e66531861cf..f2e775b9849b 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
 u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
 u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
 
+bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
+
 static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
 					     int type, bool value)
 {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v1 4/6] KVM: x86: Add emulator helper for LASS violation check
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
                   ` (2 preceding siblings ...)
  2023-06-01 14:23 ` [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
@ 2023-06-01 14:23 ` Zeng Guang
  2023-06-27 18:28   ` Sean Christopherson
  2023-06-01 14:23 ` [PATCH v1 5/6] KVM: x86: LASS protection on KVM emulation Zeng Guang
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 36+ messages in thread
From: Zeng Guang @ 2023-06-01 14:23 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

When LASS is enabled, KVM needs to apply the LASS violation check to
instruction emulation. Add a helper for the x86 emulator to perform the
LASS protection check.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/kvm/kvm_emulate.h |  1 +
 arch/x86/kvm/x86.c         | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index f1439ab7c14b..fd1c2b22867e 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -230,6 +230,7 @@ struct x86_emulate_ops {
 	int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
 	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
 	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
+	bool (*check_lass)(struct x86_emulate_ctxt *ctxt, u64 access, u64 la, u32 flags);
 };
 
 /* Type, address-of, and value of an instruction's operand. */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c0778ca39650..faf01fecc4ca 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8287,6 +8287,17 @@ static void emulator_vm_bugged(struct x86_emulate_ctxt *ctxt)
 		kvm_vm_bugged(kvm);
 }
 
+static bool emulator_check_lass(struct x86_emulate_ctxt *ctxt,
+				u64 access, u64 la, u32 flags)
+{
+	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+
+	if (!is_long_mode(vcpu))
+		return false;
+
+	return static_call(kvm_x86_check_lass)(vcpu, access, la, flags);
+}
+
 static const struct x86_emulate_ops emulate_ops = {
 	.vm_bugged           = emulator_vm_bugged,
 	.read_gpr            = emulator_read_gpr,
@@ -8332,6 +8343,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.leave_smm           = emulator_leave_smm,
 	.triple_fault        = emulator_triple_fault,
 	.set_xcr             = emulator_set_xcr,
+	.check_lass          = emulator_check_lass,
 };
 
 static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v1 5/6] KVM: x86: LASS protection on KVM emulation
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
                   ` (3 preceding siblings ...)
  2023-06-01 14:23 ` [PATCH v1 4/6] KVM: x86: Add emulator helper " Zeng Guang
@ 2023-06-01 14:23 ` Zeng Guang
  2023-06-06  4:20   ` Binbin Wu
  2023-06-01 14:23 ` [PATCH v1 6/6] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 36+ messages in thread
From: Zeng Guang @ 2023-06-01 14:23 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

Do the LASS violation check for instructions emulated by KVM. Note that
for instructions executed directly in the guest, hardware performs the
check.

Not all instruction emulation leads to accesses to guest linear
addresses, because 1) some instructions, like CPUID and RDMSR, don't
take memory operands, and 2) instruction fetch in most cases has already
been done inside the guest.

Four cases in which KVM uses a linear address to access guest memory:
- KVM emulates instruction fetches or data accesses
- KVM emulates implicit data access to a system data structure
- VMX instruction emulation
- SGX ENCLS instruction emulation

The LASS violation check applies to these linear addresses so as to
enforce mode-based protection just as hardware does.

As exceptions, the target memory addresses of emulated invlpg, branch,
and call instructions don't require the LASS violation check.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/kvm/emulate.c    | 30 ++++++++++++++++++++++++++++--
 arch/x86/kvm/vmx/nested.c |  3 +++
 arch/x86/kvm/vmx/sgx.c    |  4 ++++
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 9508836e8a35..ed5191fa2079 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -698,6 +698,7 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 	u8  va_bits;
 	bool fetch = !!(flags & X86EMUL_F_FETCH);
 	bool write = !!(flags & X86EMUL_F_WRITE);
+	u64 access = fetch ? PFERR_FETCH_MASK : 0;
 
 	la = seg_base(ctxt, addr.seg) + addr.ea;
 	*max_size = 0;
@@ -743,6 +744,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 		}
 		break;
 	}
+
+	if (ctxt->ops->check_lass(ctxt, access, *linear, flags))
+		goto bad;
+
 	if (la & (insn_alignment(ctxt, size) - 1))
 		return emulate_gp(ctxt, 0);
 	return X86EMUL_CONTINUE;
@@ -774,7 +779,11 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 	unsigned max_size;
 	struct segmented_address addr = { .seg = VCPU_SREG_CS,
 					   .ea = dst };
-	u32 flags = X86EMUL_F_FETCH;
+	/*
+	 * LASS doesn't apply to addresses that specify the targets of jump and
+	 * call instructions.
+	 */
+	u32 flags = X86EMUL_F_FETCH | X86EMUL_F_SKIPLASS;
 
 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
@@ -853,6 +862,13 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
 static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
 			      void *data, unsigned size)
 {
+	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
+		ctxt->exception.vector = GP_VECTOR;
+		ctxt->exception.error_code = 0;
+		ctxt->exception.error_code_valid = true;
+		return X86EMUL_PROPAGATE_FAULT;
+	}
+
 	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
 }
 
@@ -860,6 +876,13 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
 			       ulong linear, void *data,
 			       unsigned int size)
 {
+	if (ctxt->ops->check_lass(ctxt, PFERR_IMPLICIT_ACCESS, linear, 0)) {
+		ctxt->exception.vector = GP_VECTOR;
+		ctxt->exception.error_code = 0;
+		ctxt->exception.error_code_valid = true;
+		return X86EMUL_PROPAGATE_FAULT;
+	}
+
 	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
 }
 
@@ -3448,8 +3471,11 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
 {
 	int rc;
 	ulong linear;
+	unsigned max_size;
 
-	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
+	/* LASS doesn't apply to the memory address for invlpg */
+	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1,
+			 X86EMUL_F_SKIPLASS, ctxt->mode, &linear);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->ops->invlpg(ctxt, linear);
 	/* Disable writeback. */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index e35cf0bd0df9..bb1c3fa13c13 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4986,6 +4986,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 		 * destination for long mode!
 		 */
 		exn = is_noncanonical_address(*ret, vcpu);
+
+		if (!exn)
+			exn = vmx_check_lass(vcpu, 0, *ret, 0);
 	} else {
 		/*
 		 * When not in long mode, the virtual/linear address is
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 2261b684a7d4..3825275827eb 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -46,6 +46,10 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
 			((s.base != 0 || s.limit != 0xffffffff) &&
 			(((u64)*gva + size - 1) > s.limit + 1));
 	}
+
+	if (!fault && is_long_mode(vcpu))
+		fault = vmx_check_lass(vcpu, 0, *gva, 0);
+
 	if (fault)
 		kvm_inject_gp(vcpu, 0);
 	return fault ? -EINVAL : 0;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v1 6/6] KVM: x86: Advertise LASS CPUID to user space
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
                   ` (4 preceding siblings ...)
  2023-06-01 14:23 ` [PATCH v1 5/6] KVM: x86: LASS protection on KVM emulation Zeng Guang
@ 2023-06-01 14:23 ` Zeng Guang
  2023-06-02  0:35 ` [PATCH v1 0/6] LASS KVM virtualization support Sean Christopherson
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-01 14:23 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

LASS (Linear Address Space Separation) is an independent mechanism to
enforce mode-based protection: it prevents user-mode accesses to
supervisor-mode addresses, and vice versa. Because the LASS protections
are applied before paging, malicious software cannot acquire any
paging-based timing information to compromise the security of the
system.

The CPUID bit that enumerates LASS support:
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6]

Advertise LASS to user space to support LASS virtualization.
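
For reference, user space can confirm the advertised bit via the
KVM_GET_SUPPORTED_CPUID ioctl. A minimal sketch, illustrative only: it
assumes this series is applied and that 128 CPUID entries are enough.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	struct kvm_cpuid2 *cpuid;
	unsigned int i;

	cpuid = calloc(1, sizeof(*cpuid) + 128 * sizeof(cpuid->entries[0]));
	if (kvm < 0 || !cpuid)
		return 1;
	cpuid->nent = 128;
	if (ioctl(kvm, KVM_GET_SUPPORTED_CPUID, cpuid) < 0) {
		perror("KVM_GET_SUPPORTED_CPUID");
		return 1;
	}
	for (i = 0; i < cpuid->nent; i++) {
		struct kvm_cpuid_entry2 *e = &cpuid->entries[i];

		/* CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6] */
		if (e->function == 0x7 && e->index == 1)
			printf("LASS %ssupported by KVM\n",
			       (e->eax & (1u << 6)) ? "" : "not ");
	}
	return 0;
}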

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/kvm/cpuid.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0c9660a07b23..a7fafe99ffe4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -646,9 +646,8 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
 
 	kvm_cpu_cap_mask(CPUID_7_1_EAX,
-		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
-		F(FZRM) | F(FSRS) | F(FSRC) |
-		F(AMX_FP16) | F(AVX_IFMA)
+		F(AVX_VNNI) | F(AVX512_BF16) | F(LASS) | F(CMPCCXADD) |
+		F(FZRM) | F(FSRS) | F(FSRC) | F(AMX_FP16) | F(AVX_IFMA)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 0/6] LASS KVM virtualization support
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
                   ` (5 preceding siblings ...)
  2023-06-01 14:23 ` [PATCH v1 6/6] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
@ 2023-06-02  0:35 ` Sean Christopherson
  2023-06-06  2:22   ` Zeng Guang
  2023-06-05  1:39 ` Binbin Wu
  2023-06-27 17:08 ` Sean Christopherson
  8 siblings, 1 reply; 36+ messages in thread
From: Sean Christopherson @ 2023-06-02  0:35 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel

On Thu, Jun 01, 2023, Zeng Guang wrote:
> v0->v1

Heh, the kernel process is a bit of a heathen and starts counting patch versions
at '1', not '0'.  I.e. this should be v2, not v1.  No need to resend, just an FYI
for the future.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 0/6] LASS KVM virtualization support
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
                   ` (6 preceding siblings ...)
  2023-06-02  0:35 ` [PATCH v1 0/6] LASS KVM virtualization support Sean Christopherson
@ 2023-06-05  1:39 ` Binbin Wu
  2023-06-06  2:40   ` Zeng Guang
  2023-06-27 17:08 ` Sean Christopherson
  8 siblings, 1 reply; 36+ messages in thread
From: Binbin Wu @ 2023-06-05  1:39 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H Peter Anvin, Borislav Petkov, kvm, x86,
	linux-kernel



On 6/1/2023 10:23 PM, Zeng Guang wrote:
> Linear Address Space Separation (LASS)[1] is an independent mechanism
> that enforces the mode-based protections on any access to a linear
> address.
>
> Based on a linear-address organization, the 64-bit canonical linear
> address space is partitioned into two halves: all linear addresses
> whose most significant bit is 0 are user space addresses, while linear
> addresses whose most significant bit is 1 are supervisor space address.
>
> LASS aims to prevent any attempt to probe supervisor space addresses by
> user mode, and likewise stop any attempt to access (if SMAP enabled) or
> execute user space addresses from supervisor mode.
>
> When platform has LASS capability, KVM requires to expose this feature
> to guest VM enumerated by CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], and
> allow guest to enable it via CR4.LASS[bit 27] on demand. For instruction
> executed in the guest directly, hardware will perform the check. But KVM
> also needs to behave same as hardware to apply LASS to kinds of guest
> memory accesses when emulating privileged instructions by software.
Not just privileged instructions, e.g. MMIO access instructions.

>
> KVM will take following LASS voilations check on emulation path.
/s/voilations/violations

> User-mode access to supervisor space address:
>          LA[bit 63] && (CPL == 3)
> Supervisor-mode access to user space address:
>          Instruction fetch: !LA[bit 63] && (CPL < 3)
>          Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>                       CPL < 3) || Implicit supervisor access)
>
> This patch series provide a LASS KVM solution.
>
> We tested the basic function of LASS virtualization including LASS
> enumeration and enabling in non-root and nested environment. As KVM
> unittest framework is not compatible to LASS rule, we use kernel module
> and application test to emulate LASS violation instead. With KVM forced
> emulation mechanism, we also verified the LASS functionality on some
> emulation path with instruction fetch and data access to have same
> behavior as hardware.
>
> [1] Intel ISE https://cdrdv2.intel.com/v1/dl/getContent/671368
> Chapter Linear Address Space Separation (LASS)
>
> ------------------------------------------------------
>
> v0->v1
> 1. Adapt to new __linearize() API
> 2. Function refactor of vmx_check_lass()
> 3. Refine commit message to be more precise
> 4. Drop LASS kvm cap detection depending
>     on hardware capability
>
>
> Binbin Wu (1):
>    KVM: x86: Consolidate flags for __linearize()
>
> Zeng Guang (5):
>    KVM: x86: Virtualize CR4.LASS
>    KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
>    KVM: x86: Add emulator helper for LASS violation check
>    KVM: x86: LASS protection on KVM emulation
>    KVM: x86: Advertise LASS CPUID to user space
>
>   arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>   arch/x86/include/asm/kvm_host.h    |  4 ++-
>   arch/x86/kvm/cpuid.c               |  5 ++-
>   arch/x86/kvm/emulate.c             | 47 +++++++++++++++++++++++-----
>   arch/x86/kvm/kvm_emulate.h         |  6 ++++
>   arch/x86/kvm/vmx/nested.c          |  3 ++
>   arch/x86/kvm/vmx/sgx.c             |  4 +++
>   arch/x86/kvm/vmx/vmx.c             | 50 ++++++++++++++++++++++++++++++
>   arch/x86/kvm/vmx/vmx.h             |  2 ++
>   arch/x86/kvm/x86.c                 | 12 +++++++
>   arch/x86/kvm/x86.h                 |  2 ++
>   11 files changed, 126 insertions(+), 12 deletions(-)
>
> -- 2.27.0


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS
  2023-06-01 14:23 ` [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
@ 2023-06-05  1:57   ` Binbin Wu
  2023-06-06  2:57     ` Zeng Guang
  2023-06-27 17:43   ` Sean Christopherson
  2023-08-16 22:16   ` Sean Christopherson
  2 siblings, 1 reply; 36+ messages in thread
From: Binbin Wu @ 2023-06-05  1:57 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 6/1/2023 10:23 PM, Zeng Guang wrote:
> Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
> as CR4.LASS generally set once for each vCPU at boot time and won't be
> toggled at runtime. Besides, only if VM has LASS capability enumerated with
> CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
> to set CR4.LASS.
>
> Updating cr4_fixed1 to set CR4.LASS bit in the emulated IA32_VMX_CR4_FIXED1
> MSR for guests and allow guests to enable LASS in nested VMX operaion as well.
s/operaion/operation

>
> Notes: Setting CR4.LASS to 1 enable LASS in IA-32e mode. It doesn't take
> effect in legacy mode even if CR4.LASS is set.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>   arch/x86/include/asm/kvm_host.h | 2 +-
>   arch/x86/kvm/vmx/vmx.c          | 3 +++
>   arch/x86/kvm/x86.h              | 2 ++
>   3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index fb9d1f2d6136..92d8e65fe88c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -125,7 +125,7 @@
>   			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>   			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>   			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))
Suppose there is some bare-metal Linux patch to define the LASS-related
macros.
It may be better to describe the dependent patch(es) in the cover letter.


>   
>   #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>   
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 44fb619803b8..a33205ded85c 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7603,6 +7603,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
>   	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
>   	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
>   
> +	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
> +	cr4_fixed1_update(X86_CR4_LASS,       eax, feature_bit(LASS));
> +
>   #undef cr4_fixed1_update
>   }
>   
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index c544602d07a3..e1295f490308 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -529,6 +529,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
>   		__reserved_bits |= X86_CR4_VMXE;        \
>   	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
>   		__reserved_bits |= X86_CR4_PCIDE;       \
> +	if (!__cpu_has(__c, X86_FEATURE_LASS))          \
> +		__reserved_bits |= X86_CR4_LASS;        \
>   	__reserved_bits;                                \
>   })
>   


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-01 14:23 ` [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
@ 2023-06-05  3:31   ` Binbin Wu
  2023-06-05 12:53     ` Zhi Wang
  2023-06-07  6:28     ` Zeng Guang
  2023-06-05  3:47   ` Chao Gao
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 36+ messages in thread
From: Binbin Wu @ 2023-06-05  3:31 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 6/1/2023 10:23 PM, Zeng Guang wrote:
> Intel introduces LASS (Linear Address Separation) feature providing
                       ^
  missing "Space" here
> an independent mechanism to achieve the mode-based protection.
>
> LASS partitions 64-bit linear address space into two halves, user-mode
> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
> stops any code execution or conditional data access[1]
>      1. from user mode to supervisor-mode address space
>      2. from supervisor mode to user-mode address space
> and generates LASS violation fault accordingly.
>
> [1]A supervisor mode data access causes a LASS violation only if supervisor
> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
> or the access implicitly accesses a system data structure.
>
> Following are the rules of LASS violation check on the linear address(LA).
> User access to supervisor-mode address space:
>      LA[bit 63] && (CPL == 3)
> Supervisor access to user-mode address space:
>      Instruction fetch: !LA[bit 63] && (CPL < 3)
>      Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>                   CPL < 3) || Implicit supervisor access)
>
> Add new ops in kvm_x86_ops to do LASS violation check.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>   arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>   arch/x86/include/asm/kvm_host.h    |  2 ++
>   arch/x86/kvm/kvm_emulate.h         |  1 +
>   arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
>   arch/x86/kvm/vmx/vmx.h             |  2 ++
>   5 files changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 13bc212cd4bc..8980a3bfa687 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
>   KVM_X86_OP(msr_filter_changed)
>   KVM_X86_OP(complete_emulated_msr)
>   KVM_X86_OP(vcpu_deliver_sipi_vector)
> -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
> +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
> +KVM_X86_OP_OPTIONAL_RET0(check_lass)
>   
>   #undef KVM_X86_OP
>   #undef KVM_X86_OP_OPTIONAL
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 92d8e65fe88c..98666d1e7727 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
>   	 * Returns vCPU specific APICv inhibit reasons
>   	 */
>   	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> +
> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>   };
>   
>   struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index 5b9ec610b2cb..f1439ab7c14b 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>   /* x86-specific emulation flags */
>   #define X86EMUL_F_FETCH			BIT(0)
>   #define X86EMUL_F_WRITE			BIT(1)
> +#define X86EMUL_F_SKIPLASS		BIT(2)
>   
>   struct x86_emulate_ops {
>   	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a33205ded85c..876997e8448e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
>   	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>   }
>   
> +/*
> + * Determine whether an access to the linear address causes a LASS violation.
> + * LASS protection is only effective in long mode. As a prerequisite, caller
> + * should make sure vCPU running in long mode and invoke this api to do LASS
> + * violation check.
> + */
> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
> +{
> +	bool user_mode, user_as, rflags_ac;
> +
> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
> +		return false;
> +
> +	WARN_ON_ONCE(!is_long_mode(vcpu));
IMHO, it's better to skip the following checks and return false if it is 
out of long mode.

> +
> +	user_as = !(la >> 63);
It's better to describe how LASS treats linear addresses in
compatibility mode in the changelog and/or in a comment,
i.e. for a linear address with only 32 bits (or 16 bits), the processor
treats bit 63 as if it were 0.


> +
> +	/*
> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
> +	 * accesses a system data structure. For implicit accesses to system
> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
> +	 */
> +	if (access & PFERR_IMPLICIT_ACCESS) {
> +		user_mode = false;
> +		rflags_ac = false;
> +	} else {
> +		user_mode = vmx_get_cpl(vcpu) == 3;
> +		if (!user_mode)
> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> +	}
> +
> +	if (user_mode == user_as)
> +		return false;
> +
> +	/*
> +	 * Supervisor-mode _data_ accesses to user address space
> +	 * cause LASS violations only if SMAP is enabled.
> +	 */
> +	if (!user_mode && !(access & PFERR_FETCH_MASK))
> +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
> +
> +	return true;
> +}
> +
>   static struct kvm_x86_ops vmx_x86_ops __initdata = {
>   	.name = KBUILD_MODNAME,
>   
> @@ -8269,6 +8314,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>   	.complete_emulated_msr = kvm_complete_insn_gp,
>   
>   	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
> +
> +	.check_lass = vmx_check_lass,
>   };
>   
>   static unsigned int vmx_handle_intel_pt_intr(void)
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 9e66531861cf..f2e775b9849b 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
>   u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>   u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>   
> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
> +
>   static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>   					     int type, bool value)
>   {


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-01 14:23 ` [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
  2023-06-05  3:31   ` Binbin Wu
@ 2023-06-05  3:47   ` Chao Gao
  2023-06-06  6:22     ` Zeng Guang
  2023-06-05 14:07   ` Yuan Yao
  2023-06-27 18:26   ` Sean Christopherson
  3 siblings, 1 reply; 36+ messages in thread
From: Chao Gao @ 2023-06-05  3:47 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Thu, Jun 01, 2023 at 10:23:06PM +0800, Zeng Guang wrote:
>diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>index 92d8e65fe88c..98666d1e7727 100644
>--- a/arch/x86/include/asm/kvm_host.h
>+++ b/arch/x86/include/asm/kvm_host.h
>@@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
> 	 * Returns vCPU specific APICv inhibit reasons
> 	 */
> 	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>+
>+	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);

It is better to declare @la as gva_t since the address is a virtual address.

Both @access and @flags provide additional information about a memory access. I
think we can drop one of them, e.g. by adding a new bit X86EMUL_F_IMPLICIT_ACCESS.

Or maybe in the first place, we can just extend PFERR_? for SKIP_LASS/LAM
behavior instead of adding another set of flags (X86EMUL_F_?). The benefit of
adding new flags is they won't collide with future hardware extensions. I am not
sure.
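
For illustration, the first alternative might look roughly like the
standalone sketch below. X86EMUL_F_IMPLICIT_ACCESS, the reduced
parameter list, and the free-standing helper are all hypothetical, not
existing KVM code.

#include <stdbool.h>
#include <stdint.h>

#define X86EMUL_F_FETCH			(1u << 0)
#define X86EMUL_F_WRITE			(1u << 1)
#define X86EMUL_F_SKIPLASS		(1u << 2)
#define X86EMUL_F_IMPLICIT_ACCESS	(1u << 3)	/* folds @access in */

/*
 * With the access type carried in @flags, the hook can shrink from
 * (vcpu, access, la, flags) to (vcpu, la, flags). The vCPU state is
 * passed explicitly here only to keep the sketch self-contained.
 */
static bool check_lass(uint64_t la, uint32_t flags, unsigned int cpl,
		       bool smap, bool rflags_ac)
{
	bool implicit = flags & X86EMUL_F_IMPLICIT_ACCESS;
	bool user_addr = !(la >> 63);
	bool user_mode = !implicit && cpl == 3;

	if ((flags & X86EMUL_F_SKIPLASS) || user_mode == user_addr)
		return false;
	if (!user_mode && !(flags & X86EMUL_F_FETCH))	/* supervisor data */
		return smap && (implicit || !rflags_ac);
	return true;	/* user probe of supervisor space, or supervisor fetch */
}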

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-05  3:31   ` Binbin Wu
@ 2023-06-05 12:53     ` Zhi Wang
  2023-06-06  2:57       ` Binbin Wu
  2023-06-07  6:28     ` Zeng Guang
  1 sibling, 1 reply; 36+ messages in thread
From: Zhi Wang @ 2023-06-05 12:53 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Zeng Guang, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin, kvm,
	x86, linux-kernel

On Mon, 5 Jun 2023 11:31:48 +0800
Binbin Wu <binbin.wu@linux.intel.com> wrote:

> 
> 
> On 6/1/2023 10:23 PM, Zeng Guang wrote:
> > Intel introduces LASS (Linear Address Separation) feature providing
>                        ^
>   missing "Space" here
> > an independent mechanism to achieve the mode-based protection.
> >
> > LASS partitions 64-bit linear address space into two halves, user-mode
> > address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
> > stops any code execution or conditional data access[1]
> >      1. from user mode to supervisor-mode address space
> >      2. from supervisor mode to user-mode address space
> > and generates LASS violation fault accordingly.
> >
> > [1]A supervisor mode data access causes a LASS violation only if supervisor
> > mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
> > or the access implicitly accesses a system data structure.
> >
> > Following are the rules of LASS violation check on the linear address(LA).
> > User access to supervisor-mode address space:
> >      LA[bit 63] && (CPL == 3)
> > Supervisor access to user-mode address space:
> >      Instruction fetch: !LA[bit 63] && (CPL < 3)
> >      Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
> >                   CPL < 3) || Implicit supervisor access)
> >
> > Add new ops in kvm_x86_ops to do LASS violation check.
> >
> > Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> > Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> > ---
> >   arch/x86/include/asm/kvm-x86-ops.h |  3 +-
> >   arch/x86/include/asm/kvm_host.h    |  2 ++
> >   arch/x86/kvm/kvm_emulate.h         |  1 +
> >   arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
> >   arch/x86/kvm/vmx/vmx.h             |  2 ++
> >   5 files changed, 54 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> > index 13bc212cd4bc..8980a3bfa687 100644
> > --- a/arch/x86/include/asm/kvm-x86-ops.h
> > +++ b/arch/x86/include/asm/kvm-x86-ops.h
> > @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
> >   KVM_X86_OP(msr_filter_changed)
> >   KVM_X86_OP(complete_emulated_msr)
> >   KVM_X86_OP(vcpu_deliver_sipi_vector)
> > -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
> > +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
> > +KVM_X86_OP_OPTIONAL_RET0(check_lass)
> >   
> >   #undef KVM_X86_OP
> >   #undef KVM_X86_OP_OPTIONAL
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 92d8e65fe88c..98666d1e7727 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
> >   	 * Returns vCPU specific APICv inhibit reasons
> >   	 */
> >   	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> > +
> > +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
> >   };
> >   
> >   struct kvm_x86_nested_ops {
> > diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> > index 5b9ec610b2cb..f1439ab7c14b 100644
> > --- a/arch/x86/kvm/kvm_emulate.h
> > +++ b/arch/x86/kvm/kvm_emulate.h
> > @@ -91,6 +91,7 @@ struct x86_instruction_info {
> >   /* x86-specific emulation flags */
> >   #define X86EMUL_F_FETCH			BIT(0)
> >   #define X86EMUL_F_WRITE			BIT(1)
> > +#define X86EMUL_F_SKIPLASS		BIT(2)
> >   
> >   struct x86_emulate_ops {
> >   	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index a33205ded85c..876997e8448e 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
> >   	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
> >   }
> >   
> > +/*
> > + * Determine whether an access to the linear address causes a LASS violation.
> > + * LASS protection is only effective in long mode. As a prerequisite, caller
> > + * should make sure vCPU running in long mode and invoke this api to do LASS
> > + * violation check.
> > + */
> > +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
> > +{
> > +	bool user_mode, user_as, rflags_ac;
> > +
> > +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
> > +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
> > +		return false;
> > +
> > +	WARN_ON_ONCE(!is_long_mode(vcpu));
> IMHO, it's better to skip the following checks and return false if it is 
> out of long mode.
>
The long mode check is done by the caller, implemented in the next patch. :)

+	if (!is_long_mode(vcpu))
+		return false;

> > +
> > +	user_as = !(la >> 63);
> It's better to describe how LASS treat linear address in compatibility 
> mode in changelog or/and in comment,
> i.e. for a linear address with only 32 bits (or 16 bits), the processor 
> treats bit 63 as if it were 0.
>
> 
> > +
> > +	/*
> > +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
> > +	 * accesses a system data structure. For implicit accesses to system
> > +	 * data structure, the processor acts as if RFLAGS.AC is clear.
> > +	 */
> > +	if (access & PFERR_IMPLICIT_ACCESS) {
> > +		user_mode = false;
> > +		rflags_ac = false;
> > +	} else {
> > +		user_mode = vmx_get_cpl(vcpu) == 3;
> > +		if (!user_mode)
> > +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> > +	}
> > +
> > +	if (user_mode == user_as)
> > +		return false;
> > +
> > +	/*
> > +	 * Supervisor-mode _data_ accesses to user address space
> > +	 * cause LASS violations only if SMAP is enabled.
> > +	 */
> > +	if (!user_mode && !(access & PFERR_FETCH_MASK))
> > +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
> > +
> > +	return true;
> > +}
> > +
> >   static struct kvm_x86_ops vmx_x86_ops __initdata = {
> >   	.name = KBUILD_MODNAME,
> >   
> > @@ -8269,6 +8314,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
> >   	.complete_emulated_msr = kvm_complete_insn_gp,
> >   
> >   	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
> > +
> > +	.check_lass = vmx_check_lass,
> >   };
> >   
> >   static unsigned int vmx_handle_intel_pt_intr(void)
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index 9e66531861cf..f2e775b9849b 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
> >   u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
> >   u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
> >   
> > +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
> > +
> >   static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
> >   					     int type, bool value)
> >   {
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-01 14:23 ` [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
  2023-06-05  3:31   ` Binbin Wu
  2023-06-05  3:47   ` Chao Gao
@ 2023-06-05 14:07   ` Yuan Yao
  2023-06-06  3:08     ` Zeng Guang
  2023-06-27 18:26   ` Sean Christopherson
  3 siblings, 1 reply; 36+ messages in thread
From: Yuan Yao @ 2023-06-05 14:07 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Thu, Jun 01, 2023 at 10:23:06PM +0800, Zeng Guang wrote:
> Intel introduces LASS (Linear Address Separation) feature providing
> an independent mechanism to achieve the mode-based protection.
>
> LASS partitions 64-bit linear address space into two halves, user-mode
> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
> stops any code execution or conditional data access[1]
>     1. from user mode to supervisor-mode address space
>     2. from supervisor mode to user-mode address space
> and generates LASS violation fault accordingly.
>
> [1]A supervisor mode data access causes a LASS violation only if supervisor
> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
> or the access implicitly accesses a system data structure.
>
> Following are the rules of LASS violation check on the linear address(LA).
> User access to supervisor-mode address space:
>     LA[bit 63] && (CPL == 3)
> Supervisor access to user-mode address space:
>     Instruction fetch: !LA[bit 63] && (CPL < 3)
>     Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>                  CPL < 3) || Implicit supervisor access)
>
> Add new ops in kvm_x86_ops to do LASS violation check.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>  arch/x86/include/asm/kvm_host.h    |  2 ++
>  arch/x86/kvm/kvm_emulate.h         |  1 +
>  arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
>  arch/x86/kvm/vmx/vmx.h             |  2 ++
>  5 files changed, 54 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 13bc212cd4bc..8980a3bfa687 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
>  KVM_X86_OP(msr_filter_changed)
>  KVM_X86_OP(complete_emulated_msr)
>  KVM_X86_OP(vcpu_deliver_sipi_vector)
> -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
> +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
> +KVM_X86_OP_OPTIONAL_RET0(check_lass)
>
>  #undef KVM_X86_OP
>  #undef KVM_X86_OP_OPTIONAL
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 92d8e65fe88c..98666d1e7727 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
>  	 * Returns vCPU specific APICv inhibit reasons
>  	 */
>  	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> +
> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>  };
>
>  struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index 5b9ec610b2cb..f1439ab7c14b 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>  /* x86-specific emulation flags */
>  #define X86EMUL_F_FETCH			BIT(0)
>  #define X86EMUL_F_WRITE			BIT(1)
> +#define X86EMUL_F_SKIPLASS		BIT(2)
>
>  struct x86_emulate_ops {
>  	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a33205ded85c..876997e8448e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
>  	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>  }
>
> +/*
> + * Determine whether an access to the linear address causes a LASS violation.
> + * LASS protection is only effective in long mode. As a prerequisite, caller
> + * should make sure vCPU running in long mode and invoke this api to do LASS
> + * violation check.
> + */
> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
> +{
> +	bool user_mode, user_as, rflags_ac;
> +
> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
> +		return false;
> +
> +	WARN_ON_ONCE(!is_long_mode(vcpu));
> +
> +	user_as = !(la >> 63);
> +
> +	/*
> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
> +	 * accesses a system data structure. For implicit accesses to system
> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
> +	 */
> +	if (access & PFERR_IMPLICIT_ACCESS) {
> +		user_mode = false;
> +		rflags_ac = false;
> +	} else {
> +		user_mode = vmx_get_cpl(vcpu) == 3;
> +		if (!user_mode)
> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> +	}
> +
> +	if (user_mode == user_as)

Confused by user_as; it's the role of the address (U/S), so how about
"user_addr"? "if (user_mode == user_addr)" looks clearer
to me.

> +		return false;
> +
> +	/*
> +	 * Supervisor-mode _data_ accesses to user address space
> +	 * cause LASS violations only if SMAP is enabled.
> +	 */
> +	if (!user_mode && !(access & PFERR_FETCH_MASK))
> +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
> +
> +	return true;
> +}
> +
>  static struct kvm_x86_ops vmx_x86_ops __initdata = {
>  	.name = KBUILD_MODNAME,
>
> @@ -8269,6 +8314,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>  	.complete_emulated_msr = kvm_complete_insn_gp,
>
>  	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
> +
> +	.check_lass = vmx_check_lass,
>  };
>
>  static unsigned int vmx_handle_intel_pt_intr(void)
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 9e66531861cf..f2e775b9849b 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
>  u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>  u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>
> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
> +
>  static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>  					     int type, bool value)
>  {
> --
> 2.27.0
>


* Re: [PATCH v1 0/6] LASS KVM virtualization support
  2023-06-02  0:35 ` [PATCH v1 0/6] LASS KVM virtualization support Sean Christopherson
@ 2023-06-06  2:22   ` Zeng Guang
  0 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-06  2:22 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/2/2023 8:35 AM, Sean Christopherson wrote:
> On Thu, Jun 01, 2023, Zeng Guang wrote:
>> v0->v1
> Heh, the kernel process is a bit of a heathen and starts counting patch versions
> at '1', not '0'.  I.e. this should be v2, not v1.  No need to resend, just an FYI
> for the future.

Got it!
Thanks.



* Re: [PATCH v1 0/6] LASS KVM virtualization support
  2023-06-05  1:39 ` Binbin Wu
@ 2023-06-06  2:40   ` Zeng Guang
  0 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-06  2:40 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Dave Hansen, H Peter Anvin, Borislav Petkov, kvm, x86, linux-kernel


On 6/5/2023 9:39 AM, Binbin Wu wrote:
>
> On 6/1/2023 10:23 PM, Zeng Guang wrote:
>> Subject:
>> [PATCH v1 0/6] LASS KVM virtualization support
>> From:
>> Zeng Guang <guang.zeng@intel.com>
>> Date:
>> 6/1/2023, 10:23 PM
>>
>> To:
>> Paolo Bonzini <pbonzini@redhat.com>, Sean Christopherson
>> <seanjc@google.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar
>> <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen
>> <dave.hansen@linux.intel.com>, H Peter Anvin <hpa@zytor.com>,
>> kvm@vger.kernel.org
>> CC:
>> x86@kernel.org, linux-kernel@vger.kernel.org, Zeng Guang
>> <guang.zeng@intel.com>
>>
>>
>> Linear Address Space Separation (LASS)[1] is an independent mechanism
>> that enforces the mode-based protections on any access to a linear
>> address.
>>
>> Based on a linear-address organization, the 64-bit canonical linear
>> address space is partitioned into two halves: all linear addresses
>> whose most significant bit is 0 are user space addresses, while linear
>> addresses whose most significant bit is 1 are supervisor space address.
>>
>> LASS aims to prevent any attempt to probe supervisor space addresses by
>> user mode, and likewise stop any attempt to access (if SMAP enabled) or
>> execute user space addresses from supervisor mode.
>>
>> When platform has LASS capability, KVM requires to expose this feature
>> to guest VM enumerated by CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], and
>> allow guest to enable it via CR4.LASS[bit 27] on demand. For instruction
>> executed in the guest directly, hardware will perform the check. But KVM
>> also needs to behave same as hardware to apply LASS to kinds of guest
>> memory accesses when emulating privileged instructions by software.
> Not just privileged instructions, e.g. MMIO access instructions.
OK. I'll revise it.
>> KVM will take following LASS voilations check on emulation path.
> /s/voilations/violations
Thanks.
>> User-mode access to supervisor space address:
>>           LA[bit 63] && (CPL == 3)
>> Supervisor-mode access to user space address:
>>           Instruction fetch: !LA[bit 63] && (CPL < 3)
>>           Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>>                        CPL < 3) || Implicit supervisor access)
>>
>> This patch series provide a LASS KVM solution.
>>
>> We tested the basic function of LASS virtualization including LASS
>> enumeration and enabling in non-root and nested environment. As KVM
>> unittest framework is not compatible to LASS rule, we use kernel module
>> and application test to emulate LASS violation instead. With KVM forced
>> emulation mechanism, we also verified the LASS functionality on some
>> emulation path with instruction fetch and data access to have same
>> behavior as hardware.
>>
>> [1] Intel ISEhttps://cdrdv2.intel.com/v1/dl/getContent/671368
>> Chapter Linear Address Space Separation (LASS)
>>
>> ------------------------------------------------------
>>
>> v0->v1
>> 1. Adapt to new __linearize() API
>> 2. Function refactor of vmx_check_lass()
>> 3. Refine commit message to be more precise
>> 4. Drop LASS kvm cap detection depending
>>      on hardware capability
>>
>>
>> Binbin Wu (1):
>>     KVM: x86: Consolidate flags for __linearize()
>>
>> Zeng Guang (5):
>>     KVM: x86: Virtualize CR4.LASS
>>     KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
>>     KVM: x86: Add emulator helper for LASS violation check
>>     KVM: x86: LASS protection on KVM emulation
>>     KVM: x86: Advertise LASS CPUID to user space
>>
>>    arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>>    arch/x86/include/asm/kvm_host.h    |  4 ++-
>>    arch/x86/kvm/cpuid.c               |  5 ++-
>>    arch/x86/kvm/emulate.c             | 47 +++++++++++++++++++++++-----
>>    arch/x86/kvm/kvm_emulate.h         |  6 ++++
>>    arch/x86/kvm/vmx/nested.c          |  3 ++
>>    arch/x86/kvm/vmx/sgx.c             |  4 +++
>>    arch/x86/kvm/vmx/vmx.c             | 50 ++++++++++++++++++++++++++++++
>>    arch/x86/kvm/vmx/vmx.h             |  2 ++
>>    arch/x86/kvm/x86.c                 | 12 +++++++
>>    arch/x86/kvm/x86.h                 |  2 ++
>>    11 files changed, 126 insertions(+), 12 deletions(-)
>>
>> -- 2.27.0


* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-05 12:53     ` Zhi Wang
@ 2023-06-06  2:57       ` Binbin Wu
  2023-06-06  3:53         ` Zhi Wang
  0 siblings, 1 reply; 36+ messages in thread
From: Binbin Wu @ 2023-06-06  2:57 UTC (permalink / raw)
  To: Zeng Guang, Zhi Wang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 6/5/2023 8:53 PM, Zhi Wang wrote:
> On Mon, 5 Jun 2023 11:31:48 +0800
> Binbin Wu <binbin.wu@linux.intel.com> wrote:
>
>>
>> On 6/1/2023 10:23 PM, Zeng Guang wrote:
>>> Intel introduces LASS (Linear Address Separation) feature providing
>>                         ^
>>    missing "Space" here
>>> an independent mechanism to achieve the mode-based protection.
>>>
>>> LASS partitions 64-bit linear address space into two halves, user-mode
>>> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
>>> stops any code execution or conditional data access[1]
>>>       1. from user mode to supervisor-mode address space
>>>       2. from supervisor mode to user-mode address space
>>> and generates LASS violation fault accordingly.
>>>
>>> [1]A supervisor mode data access causes a LASS violation only if supervisor
>>> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
>>> or the access implicitly accesses a system data structure.
>>>
>>> Following are the rules of LASS violation check on the linear address(LA).
>>> User access to supervisor-mode address space:
>>>       LA[bit 63] && (CPL == 3)
>>> Supervisor access to user-mode address space:
>>>       Instruction fetch: !LA[bit 63] && (CPL < 3)
>>>       Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>>>                    CPL < 3) || Implicit supervisor access)
>>>
>>> Add new ops in kvm_x86_ops to do LASS violation check.
>>>
>>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>>> ---
>>>    arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>>>    arch/x86/include/asm/kvm_host.h    |  2 ++
>>>    arch/x86/kvm/kvm_emulate.h         |  1 +
>>>    arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
>>>    arch/x86/kvm/vmx/vmx.h             |  2 ++
>>>    5 files changed, 54 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
>>> index 13bc212cd4bc..8980a3bfa687 100644
>>> --- a/arch/x86/include/asm/kvm-x86-ops.h
>>> +++ b/arch/x86/include/asm/kvm-x86-ops.h
>>> @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
>>>    KVM_X86_OP(msr_filter_changed)
>>>    KVM_X86_OP(complete_emulated_msr)
>>>    KVM_X86_OP(vcpu_deliver_sipi_vector)
>>> -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
>>> +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
>>> +KVM_X86_OP_OPTIONAL_RET0(check_lass)
>>>    
>>>    #undef KVM_X86_OP
>>>    #undef KVM_X86_OP_OPTIONAL
>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>> index 92d8e65fe88c..98666d1e7727 100644
>>> --- a/arch/x86/include/asm/kvm_host.h
>>> +++ b/arch/x86/include/asm/kvm_host.h
>>> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
>>>    	 * Returns vCPU specific APICv inhibit reasons
>>>    	 */
>>>    	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>>> +
>>> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>>>    };
>>>    
>>>    struct kvm_x86_nested_ops {
>>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
>>> index 5b9ec610b2cb..f1439ab7c14b 100644
>>> --- a/arch/x86/kvm/kvm_emulate.h
>>> +++ b/arch/x86/kvm/kvm_emulate.h
>>> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>>>    /* x86-specific emulation flags */
>>>    #define X86EMUL_F_FETCH			BIT(0)
>>>    #define X86EMUL_F_WRITE			BIT(1)
>>> +#define X86EMUL_F_SKIPLASS		BIT(2)
>>>    
>>>    struct x86_emulate_ops {
>>>    	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index a33205ded85c..876997e8448e 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
>>>    	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>>>    }
>>>    
>>> +/*
>>> + * Determine whether an access to the linear address causes a LASS violation.
>>> + * LASS protection is only effective in long mode. As a prerequisite, caller
>>> + * should make sure vCPU running in long mode and invoke this api to do LASS
>>> + * violation check.
>>> + */
>>> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
>>> +{
>>> +	bool user_mode, user_as, rflags_ac;
>>> +
>>> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
>>> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
>>> +		return false;
>>> +
>>> +	WARN_ON_ONCE(!is_long_mode(vcpu));
>> IMHO, it's better to skip the following checks and return false if it is
>> out of long mode.
>>
> The check of long mode is in the caller, implemented in the next patch. :)
>
> +	if (!is_long_mode(vcpu))
> +		return false;
I know the callers have checked the mode; however, IMHO, it's better as
follows:

+	if (!!(flags & X86EMUL_F_SKIPLASS) ||
+	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || WARN_ON_ONCE(!is_long_mode(vcpu)))
+		return false;



>>> +
>>> +	user_as = !(la >> 63);
>> It's better to describe how LASS treat linear address in compatibility
>> mode in changelog or/and in comment,
>> i.e. for a linear address with only 32 bits (or 16 bits), the processor
>> treats bit 63 as if it were 0.
>>
>>
>>> +
>>> +	/*
>>> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
>>> +	 * accesses a system data structure. For implicit accesses to system
>>> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
>>> +	 */
>>> +	if (access & PFERR_IMPLICIT_ACCESS) {
>>> +		user_mode = false;
>>> +		rflags_ac = false;
>>> +	} else {
>>> +		user_mode = vmx_get_cpl(vcpu) == 3;
>>> +		if (!user_mode)
>>> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
>>> +	}
>>> +
>>> +	if (user_mode == user_as)
>>> +		return false;
>>> +
>>> +	/*
>>> +	 * Supervisor-mode _data_ accesses to user address space
>>> +	 * cause LASS violations only if SMAP is enabled.
>>> +	 */
>>> +	if (!user_mode && !(access & PFERR_FETCH_MASK))
>>> +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
>>> +
>>> +	return true;
>>> +}
>>> +
>>>    static struct kvm_x86_ops vmx_x86_ops __initdata = {
>>>    	.name = KBUILD_MODNAME,
>>>    
>>> @@ -8269,6 +8314,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>>>    	.complete_emulated_msr = kvm_complete_insn_gp,
>>>    
>>>    	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
>>> +
>>> +	.check_lass = vmx_check_lass,
>>>    };
>>>    
>>>    static unsigned int vmx_handle_intel_pt_intr(void)
>>> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
>>> index 9e66531861cf..f2e775b9849b 100644
>>> --- a/arch/x86/kvm/vmx/vmx.h
>>> +++ b/arch/x86/kvm/vmx/vmx.h
>>> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
>>>    u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>>>    u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>>>    
>>> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>>> +
>>>    static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>>>    					     int type, bool value)
>>>    {



* Re: [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS
  2023-06-05  1:57   ` Binbin Wu
@ 2023-06-06  2:57     ` Zeng Guang
  0 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-06  2:57 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/5/2023 9:57 AM, Binbin Wu wrote:
>
> On 6/1/2023 10:23 PM, Zeng Guang wrote:
>> Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
>> as CR4.LASS generally set once for each vCPU at boot time and won't be
>> toggled at runtime. Besides, only if VM has LASS capability enumerated with
>> CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
>> to set CR4.LASS.
>>
>> Updating cr4_fixed1 to set CR4.LASS bit in the emulated IA32_VMX_CR4_FIXED1
>> MSR for guests and allow guests to enable LASS in nested VMX operaion as well.
> s/operaion/operation
Thanks.
>
>> Notes: Setting CR4.LASS to 1 enable LASS in IA-32e mode. It doesn't take
>> effect in legacy mode even if CR4.LASS is set.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>> ---
>>    arch/x86/include/asm/kvm_host.h | 2 +-
>>    arch/x86/kvm/vmx/vmx.c          | 3 +++
>>    arch/x86/kvm/x86.h              | 2 ++
>>    3 files changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index fb9d1f2d6136..92d8e65fe88c 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -125,7 +125,7 @@
>>    			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>>    			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>>    			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
>> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
>> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))
> Suppose there are bare-metal Linux patches to define the LASS-related
> macros.
> It may be better to describe the dependent patch(es) in the cover letter.
>
Good suggestion.



* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-05 14:07   ` Yuan Yao
@ 2023-06-06  3:08     ` Zeng Guang
  0 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-06  3:08 UTC (permalink / raw)
  To: Yuan Yao
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/5/2023 10:07 PM, Yuan Yao wrote:
> On Thu, Jun 01, 2023 at 10:23:06PM +0800, Zeng Guang wrote:
>> Intel introduces LASS (Linear Address Separation) feature providing
>> an independent mechanism to achieve the mode-based protection.
>>
>> LASS partitions 64-bit linear address space into two halves, user-mode
>> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
>> stops any code execution or conditional data access[1]
>>      1. from user mode to supervisor-mode address space
>>      2. from supervisor mode to user-mode address space
>> and generates LASS violation fault accordingly.
>>
>> +/*
>> + * Determine whether an access to the linear address causes a LASS violation.
>> + * LASS protection is only effective in long mode. As a prerequisite, caller
>> + * should make sure vCPU running in long mode and invoke this api to do LASS
>> + * violation check.
>> + */
>> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
>> +{
>> +	bool user_mode, user_as, rflags_ac;
>> +
>> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
>> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
>> +		return false;
>> +
>> +	WARN_ON_ONCE(!is_long_mode(vcpu));
>> +
>> +	user_as = !(la >> 63);
>> +
>> +	/*
>> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
>> +	 * accesses a system data structure. For implicit accesses to system
>> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
>> +	 */
>> +	if (access & PFERR_IMPLICIT_ACCESS) {
>> +		user_mode = false;
>> +		rflags_ac = false;
>> +	} else {
>> +		user_mode = vmx_get_cpl(vcpu) == 3;
>> +		if (!user_mode)
>> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
>> +	}
>> +
>> +	if (user_mode == user_as)
> Confused by user_as; it's the role of the address (U/S), so how about
> "user_addr"? "if (user_mode == user_addr)" looks clearer
> to me.
>
Actually "as" stands for "address space". I suppose it more precise. :)


* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-06  2:57       ` Binbin Wu
@ 2023-06-06  3:53         ` Zhi Wang
  0 siblings, 0 replies; 36+ messages in thread
From: Zhi Wang @ 2023-06-06  3:53 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Zeng Guang, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H Peter Anvin, kvm,
	x86, linux-kernel

On Tue, 6 Jun 2023 10:57:23 +0800
Binbin Wu <binbin.wu@linux.intel.com> wrote:

> 
> 
> On 6/5/2023 8:53 PM, Zhi Wang wrote:
> > On Mon, 5 Jun 2023 11:31:48 +0800
> > Binbin Wu <binbin.wu@linux.intel.com> wrote:
> >
> >>
> >> On 6/1/2023 10:23 PM, Zeng Guang wrote:
> >>> Intel introduces LASS (Linear Address Separation) feature providing
> >>                         ^
> >>    missing "Space" here
> >>> an independent mechanism to achieve the mode-based protection.
> >>>
> >>> LASS partitions 64-bit linear address space into two halves, user-mode
> >>> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
> >>> stops any code execution or conditional data access[1]
> >>>       1. from user mode to supervisor-mode address space
> >>>       2. from supervisor mode to user-mode address space
> >>> and generates LASS violation fault accordingly.
> >>>
> >>> [1]A supervisor mode data access causes a LASS violation only if supervisor
> >>> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
> >>> or the access implicitly accesses a system data structure.
> >>>
> >>> Following are the rules of LASS violation check on the linear address(LA).
> >>> User access to supervisor-mode address space:
> >>>       LA[bit 63] && (CPL == 3)
> >>> Supervisor access to user-mode address space:
> >>>       Instruction fetch: !LA[bit 63] && (CPL < 3)
> >>>       Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
> >>>                    CPL < 3) || Implicit supervisor access)
> >>>
> >>> Add new ops in kvm_x86_ops to do LASS violation check.
> >>>
> >>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> >>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> >>> ---
> >>>    arch/x86/include/asm/kvm-x86-ops.h |  3 +-
> >>>    arch/x86/include/asm/kvm_host.h    |  2 ++
> >>>    arch/x86/kvm/kvm_emulate.h         |  1 +
> >>>    arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
> >>>    arch/x86/kvm/vmx/vmx.h             |  2 ++
> >>>    5 files changed, 54 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> >>> index 13bc212cd4bc..8980a3bfa687 100644
> >>> --- a/arch/x86/include/asm/kvm-x86-ops.h
> >>> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> >>> @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
> >>>    KVM_X86_OP(msr_filter_changed)
> >>>    KVM_X86_OP(complete_emulated_msr)
> >>>    KVM_X86_OP(vcpu_deliver_sipi_vector)
> >>> -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
> >>> +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
> >>> +KVM_X86_OP_OPTIONAL_RET0(check_lass)
> >>>    
> >>>    #undef KVM_X86_OP
> >>>    #undef KVM_X86_OP_OPTIONAL
> >>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >>> index 92d8e65fe88c..98666d1e7727 100644
> >>> --- a/arch/x86/include/asm/kvm_host.h
> >>> +++ b/arch/x86/include/asm/kvm_host.h
> >>> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
> >>>    	 * Returns vCPU specific APICv inhibit reasons
> >>>    	 */
> >>>    	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> >>> +
> >>> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
> >>>    };
> >>>    
> >>>    struct kvm_x86_nested_ops {
> >>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> >>> index 5b9ec610b2cb..f1439ab7c14b 100644
> >>> --- a/arch/x86/kvm/kvm_emulate.h
> >>> +++ b/arch/x86/kvm/kvm_emulate.h
> >>> @@ -91,6 +91,7 @@ struct x86_instruction_info {
> >>>    /* x86-specific emulation flags */
> >>>    #define X86EMUL_F_FETCH			BIT(0)
> >>>    #define X86EMUL_F_WRITE			BIT(1)
> >>> +#define X86EMUL_F_SKIPLASS		BIT(2)
> >>>    
> >>>    struct x86_emulate_ops {
> >>>    	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
> >>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> >>> index a33205ded85c..876997e8448e 100644
> >>> --- a/arch/x86/kvm/vmx/vmx.c
> >>> +++ b/arch/x86/kvm/vmx/vmx.c
> >>> @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
> >>>    	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
> >>>    }
> >>>    
> >>> +/*
> >>> + * Determine whether an access to the linear address causes a LASS violation.
> >>> + * LASS protection is only effective in long mode. As a prerequisite, caller
> >>> + * should make sure vCPU running in long mode and invoke this api to do LASS
> >>> + * violation check.
> >>> + */
> >>> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
> >>> +{
> >>> +	bool user_mode, user_as, rflags_ac;
> >>> +
> >>> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
> >>> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
> >>> +		return false;
> >>> +
> >>> +	WARN_ON_ONCE(!is_long_mode(vcpu));
> >> IMHO, it's better to skip the following checks and return false if it is
> >> out of long mode.
> >>
> > The check of long mode is in the caller, implemented in the next patch. :)
> >
> > +	if (!is_long_mode(vcpu))
> > +		return false;
> I know the callers have checked the mode; however, IMHO, it's better as
> follows:
> 
> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || WARN_ON_ONCE(!is_long_mode(vcpu)))
> +		return false;
>
Uh. I see. LGTM. 
> 
> 
> >>> +
> >>> +	user_as = !(la >> 63);
> >> It's better to describe how LASS treat linear address in compatibility
> >> mode in changelog or/and in comment,
> >> i.e. for a linear address with only 32 bits (or 16 bits), the processor
> >> treats bit 63 as if it were 0.
> >>
> >>
> >>> +
> >>> +	/*
> >>> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
> >>> +	 * accesses a system data structure. For implicit accesses to system
> >>> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
> >>> +	 */
> >>> +	if (access & PFERR_IMPLICIT_ACCESS) {
> >>> +		user_mode = false;
> >>> +		rflags_ac = false;
> >>> +	} else {
> >>> +		user_mode = vmx_get_cpl(vcpu) == 3;
> >>> +		if (!user_mode)
> >>> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> >>> +	}
> >>> +
> >>> +	if (user_mode == user_as)
> >>> +		return false;
> >>> +
> >>> +	/*
> >>> +	 * Supervisor-mode _data_ accesses to user address space
> >>> +	 * cause LASS violations only if SMAP is enabled.
> >>> +	 */
> >>> +	if (!user_mode && !(access & PFERR_FETCH_MASK))
> >>> +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
> >>> +
> >>> +	return true;
> >>> +}
> >>> +
> >>>    static struct kvm_x86_ops vmx_x86_ops __initdata = {
> >>>    	.name = KBUILD_MODNAME,
> >>>    
> >>> @@ -8269,6 +8314,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
> >>>    	.complete_emulated_msr = kvm_complete_insn_gp,
> >>>    
> >>>    	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
> >>> +
> >>> +	.check_lass = vmx_check_lass,
> >>>    };
> >>>    
> >>>    static unsigned int vmx_handle_intel_pt_intr(void)
> >>> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> >>> index 9e66531861cf..f2e775b9849b 100644
> >>> --- a/arch/x86/kvm/vmx/vmx.h
> >>> +++ b/arch/x86/kvm/vmx/vmx.h
> >>> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
> >>>    u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
> >>>    u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
> >>>    
> >>> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
> >>> +
> >>>    static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
> >>>    					     int type, bool value)
> >>>    {
> 



* Re: [PATCH v1 5/6] KVM: x86: LASS protection on KVM emulation
  2023-06-01 14:23 ` [PATCH v1 5/6] KVM: x86: LASS protection on KVM emulation Zeng Guang
@ 2023-06-06  4:20   ` Binbin Wu
  0 siblings, 0 replies; 36+ messages in thread
From: Binbin Wu @ 2023-06-06  4:20 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 6/1/2023 10:23 PM, Zeng Guang wrote:
> Do LASS violation check for instructions emulated by KVM. Note that for
> instructions executed in the guest directly, hardware will perform the
> check.
>
> Not all instruction emulation leads to accesses to guest linear addresses
> because 1) some instructions like CPUID, RDMSR, don't take memory as
> operands 2) instruction fetch in most cases is already done inside the
> guest.
>
> Four cases in which KVM uses a linear address to access guest memory:
> - KVM emulates instruction fetches or data accesses
> - KVM emulates implicit data access to a system data structure
> - VMX instruction emulation
> - SGX ENCLS instruction emulation
>
> LASS violation check applies to these linear addresses so as to enforce
> mode-based protections as hardware behaves.
>
> As exceptions, the target memory address of emulation of invlpg, branch
> and call instructions doesn't require LASS violation check.
I think LASS doesn't apply to the target addresses in the descriptors of
INVPCID and INVVPID.
Although no code change is needed, IMHO, it's better to describe this in the
changelog and/or comments.
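
E.g. a comment roughly like the following in kvm_handle_invpcid() (the
wording is only a suggestion):

	/*
	 * LASS doesn't apply to the linear addresses contained in the
	 * INVPCID descriptor, only to the linear address of the descriptor
	 * itself, which is covered when the descriptor is read via
	 * kvm_read_guest_virt().
	 */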

>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>   arch/x86/kvm/emulate.c    | 30 ++++++++++++++++++++++++++++--
>   arch/x86/kvm/vmx/nested.c |  3 +++
>   arch/x86/kvm/vmx/sgx.c    |  4 ++++
>   3 files changed, 35 insertions(+), 2 deletions(-)
>
[...]
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index e35cf0bd0df9..bb1c3fa13c13 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4986,6 +4986,9 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
>   		 * destination for long mode!
>   		 */
>   		exn = is_noncanonical_address(*ret, vcpu);
> +
> +		if (!exn)
> +			exn = vmx_check_lass(vcpu, 0, *ret, 0);
Can be simpler by using logical-or:

exn = is_noncanonical_address(*ret, vcpu) || vmx_check_lass(vcpu, 0, *ret, 0);



>   	} else {
>   		/*
>   		 * When not in long mode, the virtual/linear address is
> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> index 2261b684a7d4..3825275827eb 100644
> --- a/arch/x86/kvm/vmx/sgx.c
> +++ b/arch/x86/kvm/vmx/sgx.c
> @@ -46,6 +46,10 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
>   			((s.base != 0 || s.limit != 0xffffffff) &&
>   			(((u64)*gva + size - 1) > s.limit + 1));
>   	}
> +
> +	if (!fault && is_long_mode(vcpu))
> +		fault = vmx_check_lass(vcpu, 0, *gva, 0);
> +
>   	if (fault)
>   		kvm_inject_gp(vcpu, 0);
>   	return fault ? -EINVAL : 0;



* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-05  3:47   ` Chao Gao
@ 2023-06-06  6:22     ` Zeng Guang
  0 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-06  6:22 UTC (permalink / raw)
  To: Chao Gao
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/5/2023 11:47 AM, Chao Gao wrote:
> On Thu, Jun 01, 2023 at 10:23:06PM +0800, Zeng Guang wrote:
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 92d8e65fe88c..98666d1e7727 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
>> 	 * Returns vCPU specific APICv inhibit reasons
>> 	 */
>> 	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>> +
>> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
> It is better to declare the @la as gva_t since the address is a virtual address.
>
> Both @access and @flags provide additional information about a memory access. I
> think we can drop one of them e.g. adding a new bit X86EMUL_F_IMPLICIT_ACCESS.
>
> Or maybe in the first place, we can just extend PFERR_? for SKIP_LASS/LAM
> behavior instead of adding another set of flags (X86EMUL_F_?). The benefit of
> adding new flags is they won't collide with future hardware extensions. I am not
> sure.
Makes sense. I prefer adding a new bit to the X86EMUL flags.
PFERR_ is used for the page fault case and isn't really appropriate for
LASS/LAM usage.
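
E.g. something along these lines (the exact name and bit position are just
placeholders for now):

	/* x86-specific emulation flags */
	#define X86EMUL_F_FETCH			BIT(0)
	#define X86EMUL_F_WRITE			BIT(1)
	#define X86EMUL_F_IMPLICIT_ACCESS	BIT(2)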


* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-05  3:31   ` Binbin Wu
  2023-06-05 12:53     ` Zhi Wang
@ 2023-06-07  6:28     ` Zeng Guang
  1 sibling, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-07  6:28 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/5/2023 11:31 AM, Binbin Wu wrote:
>
> On 6/1/2023 10:23 PM, Zeng Guang wrote:
>> Intel introduces LASS (Linear Address Separation) feature providing
>                         ^
>    missing "Space" here
Thanks.
>> an independent mechanism to achieve the mode-based protection.
>>
>> LASS partitions 64-bit linear address space into two halves, user-mode
>> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
>> stops any code execution or conditional data access[1]
>>       1. from user mode to supervisor-mode address space
>>       2. from supervisor mode to user-mode address space
>> and generates LASS violation fault accordingly.
>>
>> [1]A supervisor mode data access causes a LASS violation only if supervisor
>> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
>> or the access implicitly accesses a system data structure.
>>
>> Following are the rules of LASS violation check on the linear address(LA).
>> User access to supervisor-mode address space:
>>       LA[bit 63] && (CPL == 3)
>> Supervisor access to user-mode address space:
>>       Instruction fetch: !LA[bit 63] && (CPL < 3)
>>       Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>>                    CPL < 3) || Implicit supervisor access)
>>
>> Add new ops in kvm_x86_ops to do LASS violation check.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>> ---
>>    arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>>    arch/x86/include/asm/kvm_host.h    |  2 ++
>>    arch/x86/kvm/kvm_emulate.h         |  1 +
>>    arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
>>    arch/x86/kvm/vmx/vmx.h             |  2 ++
>>    5 files changed, 54 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
>> index 13bc212cd4bc..8980a3bfa687 100644
>> --- a/arch/x86/include/asm/kvm-x86-ops.h
>> +++ b/arch/x86/include/asm/kvm-x86-ops.h
>> @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
>>    KVM_X86_OP(msr_filter_changed)
>>    KVM_X86_OP(complete_emulated_msr)
>>    KVM_X86_OP(vcpu_deliver_sipi_vector)
>> -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
>> +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
>> +KVM_X86_OP_OPTIONAL_RET0(check_lass)
>>    
>>    #undef KVM_X86_OP
>>    #undef KVM_X86_OP_OPTIONAL
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 92d8e65fe88c..98666d1e7727 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
>>    	 * Returns vCPU specific APICv inhibit reasons
>>    	 */
>>    	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>> +
>> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>>    };
>>    
>>    struct kvm_x86_nested_ops {
>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
>> index 5b9ec610b2cb..f1439ab7c14b 100644
>> --- a/arch/x86/kvm/kvm_emulate.h
>> +++ b/arch/x86/kvm/kvm_emulate.h
>> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>>    /* x86-specific emulation flags */
>>    #define X86EMUL_F_FETCH			BIT(0)
>>    #define X86EMUL_F_WRITE			BIT(1)
>> +#define X86EMUL_F_SKIPLASS		BIT(2)
>>    
>>    struct x86_emulate_ops {
>>    	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index a33205ded85c..876997e8448e 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
>>    	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>>    }
>>    
>> +/*
>> + * Determine whether an access to the linear address causes a LASS violation.
>> + * LASS protection is only effective in long mode. As a prerequisite, caller
>> + * should make sure vCPU running in long mode and invoke this api to do LASS
>> + * violation check.
>> + */
>> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
>> +{
>> +	bool user_mode, user_as, rflags_ac;
>> +
>> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
>> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
>> +		return false;
>> +
>> +	WARN_ON_ONCE(!is_long_mode(vcpu));
> IMHO, it's better to skip the following checks and return false if it is
> out of long mode.
In some cases, a CPU mode check already exists before the LASS violation
check is invoked, e.g. VMX instruction emulation. So it's designed to make
vmx_check_lass() focus on the LASS violation alone, and leave it to the
caller to take care of the CPU mode in advance.

The purpose is to avoid duplicating the CPU mode check, though the impact
seems insignificant. :)
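
In other words, callers are expected to follow a pattern roughly like the
sketch below (the actual helper comes with patch 4/6; naming and plumbing
details aside):

	static bool emulator_check_lass(struct x86_emulate_ctxt *ctxt,
					u64 access, u64 la, u32 flags)
	{
		struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);

		/* LASS is enforced only in IA-32e (long) mode. */
		if (!is_long_mode(vcpu))
			return false;

		return static_call(kvm_x86_check_lass)(vcpu, access, la, flags);
	}
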
>> +
>> +	user_as = !(la >> 63);
> It's better to describe how LASS treat linear address in compatibility
> mode in changelog or/and in comment,
> i.e. for a linear address with only 32 bits (or 16 bits), the processor
> treats bit 63 as if it were 0.
>
OK, will add comments on specific treatment in compatibility mode.
>> +
>> +	/*
>> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
>> +	 * accesses a system data structure. For implicit accesses to system
>> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
>> +	 */
>> +	if (access & PFERR_IMPLICIT_ACCESS) {
>> +		user_mode = false;
>> +		rflags_ac = false;
>> +	} else {
>> +		user_mode = vmx_get_cpl(vcpu) == 3;
>> +		if (!user_mode)
>> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
>> +	}
>> +
>> +	if (user_mode == user_as)
>> +		return false;
>> +
>> +	/*
>> +	 * Supervisor-mode _data_ accesses to user address space
>> +	 * cause LASS violations only if SMAP is enabled.
>> +	 */
>> +	if (!user_mode && !(access & PFERR_FETCH_MASK))
>> +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
>> +
>> +	return true;
>> +}
>> +
>>    static struct kvm_x86_ops vmx_x86_ops __initdata = {
>>    	.name = KBUILD_MODNAME,
>>    
>> @@ -8269,6 +8314,8 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>>    	.complete_emulated_msr = kvm_complete_insn_gp,
>>    
>>    	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
>> +
>> +	.check_lass = vmx_check_lass,
>>    };
>>    
>>    static unsigned int vmx_handle_intel_pt_intr(void)
>> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
>> index 9e66531861cf..f2e775b9849b 100644
>> --- a/arch/x86/kvm/vmx/vmx.h
>> +++ b/arch/x86/kvm/vmx/vmx.h
>> @@ -433,6 +433,8 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
>>    u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>>    u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>>    
>> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>> +
>>    static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>>    					     int type, bool value)
>>    {


* Re: [PATCH v1 0/6] LASS KVM virtualization support
  2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
                   ` (7 preceding siblings ...)
  2023-06-05  1:39 ` Binbin Wu
@ 2023-06-27 17:08 ` Sean Christopherson
  2023-06-28  8:42   ` Zeng Guang
  8 siblings, 1 reply; 36+ messages in thread
From: Sean Christopherson @ 2023-06-27 17:08 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel

On Thu, Jun 01, 2023, Zeng Guang wrote:
> This patch series provide a LASS KVM solution.

... and depends on kernel enabling that can be found at

https://lore.kernel.org/all/20230609183632.48706-1-alexander.shishkin@linux.intel.com

> We tested the basic function of LASS virtualization including LASS
> enumeration and enabling in non-root and nested environment. As KVM
> unittest framework is not compatible to LASS rule, we use kernel module
> and application test to emulate LASS violation instead. With KVM forced
> emulation mechanism, we also verified the LASS functionality on some
> emulation path with instruction fetch and data access to have same
> behavior as hardware.
> 
> [1] Intel ISE https://cdrdv2.intel.com/v1/dl/getContent/671368
> Chapter Linear Address Space Separation (LASS)


* Re: [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize()
  2023-06-01 14:23 ` [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize() Zeng Guang
@ 2023-06-27 17:40   ` Sean Christopherson
  2023-06-28  5:13     ` Binbin Wu
  2023-06-28  7:27     ` Zeng Guang
  0 siblings, 2 replies; 36+ messages in thread
From: Sean Christopherson @ 2023-06-27 17:40 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel, Binbin Wu

On Thu, Jun 01, 2023, Zeng Guang wrote:
> From: Binbin Wu <binbin.wu@linux.intel.com>
> 
> Define a 32-bit parameter and consolidate the two bools into it.

Please write changelogs so that they make sense without the context of the shortlog.
In isolation, the above provides zero context.  And there's no need to provide a
play-by-play description of the change, e.g. this can be:

  Consolidate __linearize()'s @write and @fetch into a set of flags so that
  additional flags can be added without needing more/new boolean parameters,
  e.g. to precisely identify the access type for LASS.

> __linearize() has two bool parameters write and fetch. And new flag
> will be needed to support new feature (e.g. LAM needs a flag to skip

s/LAM/LASS

> address untag under some conditions).

Looks like this was copy+pasted from LAM.  AIUI, there is no untagging in LASS.  Please,
please take the time to proofread what you're posting.  To you it's a minor typo,
to others, incorrect statements like this can create a lot of confusion.

> No functional change intended.
> 
> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> ---
>  arch/x86/kvm/emulate.c     | 19 +++++++++++++------
>  arch/x86/kvm/kvm_emulate.h |  4 ++++
>  2 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 936a397a08cd..9508836e8a35 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
>  static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>  				       struct segmented_address addr,
>  				       unsigned *max_size, unsigned size,
> -				       bool write, bool fetch,
> -				       enum x86emul_mode mode, ulong *linear)
> +				       u32 flags, enum x86emul_mode mode,

"unsigned int", not "u32".  They're obviously the same effective thing, but using
"unsigned int" captures that the number of bits doesn't truly matter, e.g. isn't
reflected in hardware or anything.  This could just as easily be a u16, but there's
obviously no reason to squeeze this into a u16.
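
I.e. (same signature as in the diff above, just with the type swapped):

	static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
					       struct segmented_address addr,
					       unsigned *max_size, unsigned size,
					       unsigned int flags, enum x86emul_mode mode,
					       ulong *linear)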

> +				       ulong *linear)
>  {
>  	struct desc_struct desc;
>  	bool usable;
> @@ -696,6 +696,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>  	u32 lim;
>  	u16 sel;
>  	u8  va_bits;
> +	bool fetch = !!(flags & X86EMUL_F_FETCH);
> +	bool write = !!(flags & X86EMUL_F_WRITE);
>  
>  	la = seg_base(ctxt, addr.seg) + addr.ea;
>  	*max_size = 0;
> @@ -757,7 +759,11 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
>  		     ulong *linear)
>  {
>  	unsigned max_size;
> -	return __linearize(ctxt, addr, &max_size, size, write, false,
> +	u32 flags = 0;
> +
> +	if (write)
> +		flags |= X86EMUL_F_WRITE;
> +	return __linearize(ctxt, addr, &max_size, size, flags,
>  			   ctxt->mode, linear);

I'm tempted to have this be:

	return __linearize(ctxt, addr, &max_size, size,
			   write ? X86EMUL_F_WRITE : 0, ctxt->mode, linear);

Mostly so that it's obvious "flags" is constant.  The alternative would be

	const unsigned int flags = write ? X86EMUL_F_WRITE : 0;

But my preference is probably to omit "flags" entirely.

>  }
>  
> @@ -768,10 +774,11 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>  	unsigned max_size;
>  	struct segmented_address addr = { .seg = VCPU_SREG_CS,
>  					   .ea = dst };
> +	u32 flags = X86EMUL_F_FETCH;
>  
>  	if (ctxt->op_bytes != sizeof(unsigned long))
>  		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
> -	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
> +	rc = __linearize(ctxt, addr, &max_size, 1, flags, ctxt->mode, &linear);

Meh, just pass X86EMUL_F_FETCH directly, i.e. drop the local "flags".

>  	if (rc == X86EMUL_CONTINUE)
>  		ctxt->_eip = addr.ea;
>  	return rc;
> @@ -896,6 +903,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
>  	int cur_size = ctxt->fetch.end - ctxt->fetch.data;
>  	struct segmented_address addr = { .seg = VCPU_SREG_CS,
>  					   .ea = ctxt->eip + cur_size };
> +	u32 flags = X86EMUL_F_FETCH;
>  
>  	/*
>  	 * We do not know exactly how many bytes will be needed, and
> @@ -907,8 +915,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
>  	 * boundary check itself.  Instead, we use max_size to check
>  	 * against op_size.
>  	 */
> -	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
> -			 &linear);
> +	rc = __linearize(ctxt, addr, &max_size, 0, flags, ctxt->mode, &linear);

And here.

>  	if (unlikely(rc != X86EMUL_CONTINUE))
>  		return rc;
>  
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index ab65f3a47dfd..5b9ec610b2cb 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -88,6 +88,10 @@ struct x86_instruction_info {
>  #define X86EMUL_IO_NEEDED       5 /* IO is needed to complete emulation */
>  #define X86EMUL_INTERCEPTED     6 /* Intercepted by nested VMCB/VMCS */
>  
> +/* x86-specific emulation flags */
> +#define X86EMUL_F_FETCH			BIT(0)
> +#define X86EMUL_F_WRITE			BIT(1)
> +
>  struct x86_emulate_ops {
>  	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
>  	/*
> -- 
> 2.27.0
> 


* Re: [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS
  2023-06-01 14:23 ` [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
  2023-06-05  1:57   ` Binbin Wu
@ 2023-06-27 17:43   ` Sean Christopherson
  2023-06-28  8:19     ` Zeng Guang
  2023-08-16 22:16   ` Sean Christopherson
  2 siblings, 1 reply; 36+ messages in thread
From: Sean Christopherson @ 2023-06-27 17:43 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel

On Thu, Jun 01, 2023, Zeng Guang wrote:
> Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
> as CR4.LASS generally set once for each vCPU at boot time and won't be
> toggled at runtime. Besides, only if VM has LASS capability enumerated with
> CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
> to set CR4.LASS.
>
> Updating cr4_fixed1 to set CR4.LASS bit in the emulated IA32_VMX_CR4_FIXED1
> MSR for guests and allow guests to enable LASS in nested VMX operaion as well.

s/operaion/operation.


* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-01 14:23 ` [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
                     ` (2 preceding siblings ...)
  2023-06-05 14:07   ` Yuan Yao
@ 2023-06-27 18:26   ` Sean Christopherson
  2023-06-27 22:45     ` Sean Christopherson
  2023-06-30 18:50     ` Zeng Guang
  3 siblings, 2 replies; 36+ messages in thread
From: Sean Christopherson @ 2023-06-27 18:26 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel

Similar to comments on an earlier patch, don't try to describe the literal code
change, e.g. this does far more than just "Add new ops in kvm_x86_ops".  Something
like

  KVM: VMX: Add emulation of LASS violation checks on linear address 

On Thu, Jun 01, 2023, Zeng Guang wrote:
> Intel introduces LASS (Linear Address Separation) feature providing
> an independent mechanism to achieve the mode-based protection.
> 
> LASS partitions 64-bit linear address space into two halves, user-mode
> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
> stops any code execution or conditional data access[1]
>     1. from user mode to supervisor-mode address space
>     2. from supervisor mode to user-mode address space
> and generates LASS violation fault accordingly.
> 
> [1]A supervisor mode data access causes a LASS violation only if supervisor
> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
> or the access implicitly accesses a system data structure.
> 
> Following are the rules of LASS violation check on the linear address(LA).
> User access to supervisor-mode address space:
>     LA[bit 63] && (CPL == 3)
> Supervisor access to user-mode address space:
>     Instruction fetch: !LA[bit 63] && (CPL < 3)
>     Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>                  CPL < 3) || Implicit supervisor access)
> 
> Add new ops in kvm_x86_ops to do LASS violation check.
> 
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>  arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>  arch/x86/include/asm/kvm_host.h    |  2 ++
>  arch/x86/kvm/kvm_emulate.h         |  1 +
>  arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
>  arch/x86/kvm/vmx/vmx.h             |  2 ++
>  5 files changed, 54 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 13bc212cd4bc..8980a3bfa687 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
>  KVM_X86_OP(msr_filter_changed)
>  KVM_X86_OP(complete_emulated_msr)
>  KVM_X86_OP(vcpu_deliver_sipi_vector)
> -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
> +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
> +KVM_X86_OP_OPTIONAL_RET0(check_lass)

Use "is_lass_violation" instead of "check_lass" for all function names.  "check"
doesn't convey that the function is a predicate, i.e. that it returns true/false.

>  #undef KVM_X86_OP
>  #undef KVM_X86_OP_OPTIONAL
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 92d8e65fe88c..98666d1e7727 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
>  	 * Returns vCPU specific APICv inhibit reasons
>  	 */
>  	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> +
> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>  };
>  
>  struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index 5b9ec610b2cb..f1439ab7c14b 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>  /* x86-specific emulation flags */
>  #define X86EMUL_F_FETCH			BIT(0)
>  #define X86EMUL_F_WRITE			BIT(1)
> +#define X86EMUL_F_SKIPLASS		BIT(2)

This belongs in the patch that integrates LASS into the emulator.  And rather than
make the flag a command (SKIPLASS), I think it makes sense to make the flag describe
the access.  It'll mean more flags, but those are free.  That way the originators of
the accesses, e.g. em_invlpg(), don't need to document how they interact with LASS,
e.g. this code is self-documenting, and if someone wants to understand why KVM
has a dedicated X86EMUL_F_INVLPG flag, it's easy enough to find the consumer.
And in the unlikely scenario that other things care about INVLPG, branch targets,
etc., we won't end up with X86EMUL_F_SKIPLASS, X86EMUL_F_SKIPOTHER, etc.

	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1,
			 X86EMUL_F_INVLPG, ctxt->mode, &linear);

So this?

  #define X86EMUL_F_IMPLICIT		BIT(2)
  #define X86EMUL_F_INVLPG		BIT(3)
  #define X86EMUL_F_BRANCH_TARGET	BIT(4)

>  struct x86_emulate_ops {
>  	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a33205ded85c..876997e8448e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
>  	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>  }
>  
> +/*
> + * Determine whether an access to the linear address causes a LASS violation.
> + * LASS protection is only effective in long mode. As a prerequisite, caller
> + * should make sure vCPU running in long mode and invoke this api to do LASS
> + * violation check.
> + */
> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
> +{
> +	bool user_mode, user_as, rflags_ac;
> +
> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
> +		return false;
> +
> +	WARN_ON_ONCE(!is_long_mode(vcpu));

This is silly.  By making this a prereq, 2 of the 3 callers are forced to explicitly
check for is_long_mode(), *and* it unnecessarily bleeds LASS details outside of VMX.

> +
> +	user_as = !(la >> 63);
> +
> +	/*
> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
> +	 * accesses a system data structure. For implicit accesses to system
> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
> +	 */
> +	if (access & PFERR_IMPLICIT_ACCESS) {

Please don't use PFERR_IMPLICIT_ACCESS, just extend the new flags.  This is
obviously not coming from a page fault.  PFERR_IMPLICIT_ACCESS really shouldn't
exist, but at least there was reasonable justification for adding it (changing
all of the paths that lead to permission_fault() would have required a massive
overhaul).

***HOWEVER***

I think the entire approach of hooking __linearize() may be a mistake, and LASS
should instead be implemented in a wrapper of ->gva_to_gpa().  The two calls to
__linearize() that are escaped with SKIPLASS are escaped *because* they don't
actually access memory (branch targets and INVLPG), and so if LASS is enforced
only when ->gva_to_gpa() is invoked, all of these new flags go away because the
cases that ignore LASS are naturally handled.

That should also make it unnecessary to add one-off checks since e.g. kvm_handle_invpcid()
will hit kvm_read_guest_virt() and thus ->gva_to_gpa(), i.e. we won't end up playing
an ongoing game of whack-a-mole.

And one question that needs to be answered is what happens when an access rolls
over from supervisor to user, e.g. if the kernel accesses 8 bytes at -1ull and thus
reads all Fs => 0x6, does the access get a LASS violation on the access to user
memory?  User=>supervisor can't happen because non-canonical checks have higher
priority, but supervisor=>user can.  And that matters because it will impact
whether or not KVM needs to check each *page* or just the initial address.
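
For concreteness: an 8-byte supervisor read at LA = 0xffffffffffffffff touches
one byte with bit 63 set (the supervisor half), then wraps around to bytes 0x0
through 0x6, all of which have bit 63 clear and thus land in the user half.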

> +		user_mode = false;
> +		rflags_ac = false;
> +	} else {
> +		user_mode = vmx_get_cpl(vcpu) == 3;
> +		if (!user_mode)
> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> +	}
> +
> +	if (user_mode == user_as)
> +		return false;
> +
> +	/*
> +	 * Supervisor-mode _data_ accesses to user address space
> +	 * cause LASS violations only if SMAP is enabled.
> +	 */
> +	if (!user_mode && !(access & PFERR_FETCH_MASK))
> +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;

This is all more complex than it needs to be.  Using local bools just so that
the "user_mode == user_as" is common is not a good tradeoff.  User addresses have
*significantly* different behavior, lean into that instead of dancing around it.

bool is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
		       unsigned int flags)
{
	const bool is_supervisor_address = addr & BIT_ULL(63);
	const bool implicit = flags & X86EMUL_F_IMPLICIT;

	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
		return false;

	/*
	 * INVLPG isn't subject to LASS, e.g. to allow invalidating userspace
	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
	 * to LASS in order to simplify far control transfers (the subsequent
	 * fetch will enforce LASS as appropriate).
	 */
	if (flags & (X86EMUL_F_BRANCH_TARGET | X86EMUL_F_INVLPG))
		return false;

	if (!implicit && vmx_get_cpl(vcpu) == 3)
		return is_supervisor_address;

	/* LASS is enforced for supervisor access iff SMAP is enabled. */
	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
		return false;

	/* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
	if (!implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
		return false;

	return !is_supervisor_address;
}

> +	return true;
> +}
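
For reference, if the emulator hooks do stick around, the consumer in
__linearize() boils down to something like this (a sketch, assuming the ops
plumbing from patch 4):

	if (ctxt->ops->is_lass_violation(ctxt, la, flags))
		goto bad;	/* #SS(0) for stack accesses, else #GP(0) */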

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 4/6] KVM: x86: Add emulator helper for LASS violation check
  2023-06-01 14:23 ` [PATCH v1 4/6] KVM: x86: Add emulator helper " Zeng Guang
@ 2023-06-27 18:28   ` Sean Christopherson
  2023-06-29 15:06     ` Zeng Guang
  0 siblings, 1 reply; 36+ messages in thread
From: Sean Christopherson @ 2023-06-27 18:28 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel

On Thu, Jun 01, 2023, Zeng Guang wrote:
> When LASS is enabled, KVM needs to apply the LASS violation check to
> instruction emulations. Add a helper for the x86 emulator to perform
> LASS protection.
> 
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>  arch/x86/kvm/kvm_emulate.h |  1 +
>  arch/x86/kvm/x86.c         | 12 ++++++++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index f1439ab7c14b..fd1c2b22867e 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -230,6 +230,7 @@ struct x86_emulate_ops {
>  	int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
>  	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
>  	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
> +	bool (*check_lass)(struct x86_emulate_ctxt *ctxt, u64 access, u64 la, u32 flags);
>  };
>  
>  /* Type, address-of, and value of an instruction's operand. */
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c0778ca39650..faf01fecc4ca 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8287,6 +8287,17 @@ static void emulator_vm_bugged(struct x86_emulate_ctxt *ctxt)
>  		kvm_vm_bugged(kvm);
>  }
>  
> +static bool emulator_check_lass(struct x86_emulate_ctxt *ctxt,
> +				u64 access, u64 la, u32 flags)
> +{
> +	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
> +
> +	if (!is_long_mode(vcpu))
> +		return false;

Likely a moot point if we wrap ->gva_to_gpa(), but move this into the vendor
implementation.

And if we keep these emulator hooks, massage the patch ordering:

  1. Add plumbing to emulator to pass new flags
  2. Add kvm_x86_ops definition and invocation from emulator
  3. Implement and wire up vmx_is_lass_violation()

That way the changes to each area of KVM are better isolated.

> +	return static_call(kvm_x86_check_lass)(vcpu, access, la, flags);
> +}
> +
>  static const struct x86_emulate_ops emulate_ops = {
>  	.vm_bugged           = emulator_vm_bugged,
>  	.read_gpr            = emulator_read_gpr,
> @@ -8332,6 +8343,7 @@ static const struct x86_emulate_ops emulate_ops = {
>  	.leave_smm           = emulator_leave_smm,
>  	.triple_fault        = emulator_triple_fault,
>  	.set_xcr             = emulator_set_xcr,
> +	.check_lass          = emulator_check_lass,
>  };
>  
>  static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-27 18:26   ` Sean Christopherson
@ 2023-06-27 22:45     ` Sean Christopherson
  2023-06-30 18:50     ` Zeng Guang
  1 sibling, 0 replies; 36+ messages in thread
From: Sean Christopherson @ 2023-06-27 22:45 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel

On Tue, Jun 27, 2023, Sean Christopherson wrote:
> > +	/*
> > +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
> > +	 * accesses a system data structure. For implicit accesses to system
> > +	 * data structure, the processor acts as if RFLAGS.AC is clear.
> > +	 */
> > +	if (access & PFERR_IMPLICIT_ACCESS) {
> 
> Please don't use PFERR_IMPLICIT_ACCESS, just extend the new flags.  This is
> obviously not coming from a page fault.  PFERR_IMPLICIT_ACCESS really shouldn't
> exist, but at least there was reasonable justification for adding it (changing
> all of the paths that lead to permission_fault() would have required a massive
> overhaul).
> 
> ***HOWEVER***
> 
> I think the entire approach of hooking __linearize() may be a mistake, and LASS
> should instead be implemented in a wrapper of ->gva_to_gpa().  The two calls to
> __linearize() that are escaped with SKIPLASS are escaped *because* they don't
> actually access memory (branch targets and INVLPG), and so if LASS is enforced
> only when ->gva_to_gpa() is invoked, all of these new flags go away because the
> cases that ignore LASS are naturally handled.
> 
> That should also make it unnecessary to add one-off checks since e.g. kvm_handle_invpcid()
> will hit kvm_read_guest_virt() and thus ->gva_to_gpa(), i.e. we won't end up playing
> an ongoing game of whack-a-mole.

Drat, that won't work, at least not without quite a few more changes.

  1. kvm_{read,write,fetch}_guest_virt() are effectively defined to work with a
     fully resolved linear address, i.e. callers assume failure means #PF

  2. Similar to (1), segment information isn't available, i.e. KVM wouldn't know
     when to inject #SS instead of #GP

And IIUC, LASS violations are higher priority than instruction-specific alignment
checks, e.g. on CMPXCHG16B.  

And looking at LAM, that untagging needs to be done before the canonical checks,
which means that we'll need at least X86EMUL_F_INVLPG.
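
I.e. the checks in __linearize() would need to be ordered roughly like so (a
sketch; the untag_addr() and is_lass_violation() hooks here are placeholders,
not a final API):

	la = seg_base(ctxt, addr.seg) + addr.ea;

	/* 1. LAM untagging comes first, before any canonicality check. */
	la = ctxt->ops->untag_addr(ctxt, la, flags);

	/* 2. Canonical check, on the untagged address. */
	if (emul_is_noncanonical_address(la, ctxt))
		goto bad;

	/* 3. LASS, which outranks instruction-specific alignment checks. */
	if (ctxt->ops->is_lass_violation(ctxt, la, flags))
		goto bad;

	/* 4. Per-instruction alignment checks, e.g. CMPXCHG16B, come last. */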

So my idea of shoving this into a ->gva_to_gpa() wrapper won't work well.  Bummer.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize()
  2023-06-27 17:40   ` Sean Christopherson
@ 2023-06-28  5:13     ` Binbin Wu
  2023-06-28  7:27     ` Zeng Guang
  1 sibling, 0 replies; 36+ messages in thread
From: Binbin Wu @ 2023-06-28  5:13 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Zeng Guang, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 6/28/2023 1:40 AM, Sean Christopherson wrote:
> On Thu, Jun 01, 2023, Zeng Guang wrote:
>> From: Binbin Wu <binbin.wu@linux.intel.com>
>>
>> Define a 32-bit parameter and consolidate the two bools into it.
> Please write changelogs so that they make sense without the context of the shortlog.
> In isolation, the above provides zero context.  And there's no need to provide a
> play-by-play description of the change, e.g. this can be:
>
>    Consolidate __linearize()'s @write and @fetch into a set of flags so that
>    additional flags can be added without needing more/new boolean parameters,
>    e.g. to precisely identify the access type for LASS.
>
>> __linearize() has two bool parameters write and fetch. And new flag
>> will be needed to support new feature (e.g. LAM needs a flag to skip
> s/LAM/LASS
>
>> address untag under some conditions).
> Looks like this was copy+pasted from LAM.  AIUI, there is no untagging in LASS.  Please,
> please take the time to proofread what you're posting.  To you it's a minor typo,
> to others, incorrect statements like this can create a lot of confusion.
>
>> No functional change intended.
>>
>> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> ---
>>   arch/x86/kvm/emulate.c     | 19 +++++++++++++------
>>   arch/x86/kvm/kvm_emulate.h |  4 ++++
>>   2 files changed, 17 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 936a397a08cd..9508836e8a35 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
>>   static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>>   				       struct segmented_address addr,
>>   				       unsigned *max_size, unsigned size,
>> -				       bool write, bool fetch,
>> -				       enum x86emul_mode mode, ulong *linear)
>> +				       u32 flags, enum x86emul_mode mode,
> "unsigned int", not "u32".  They're obviously the same effective thing, but using
> "unsigned int" captures that the number of bits doesn't truly matter, e.g. isn't
> reflected in hardware or anything.  This could just as easily be a u16, but there's
> obviously no reason to squeeze this into a u16.
OK. Thanks.

Except for the "u32" / "unsigned int" thing, I have updated the patch in
the KVM LAM enabling patch series v9:
https://lore.kernel.org/kvm/20230606091842.13123-2-binbin.wu@linux.intel.com/
It has dropped the unnecessary local variables you mentioned below and 
improved the changelog a bit to be more informative.

Do you have any comments on this version?


>
>> +				       ulong *linear)
>>   {
>>   	struct desc_struct desc;
>>   	bool usable;
>> @@ -696,6 +696,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>>   	u32 lim;
>>   	u16 sel;
>>   	u8  va_bits;
>> +	bool fetch = !!(flags & X86EMUL_F_FETCH);
>> +	bool write = !!(flags & X86EMUL_F_WRITE);
>>   
>>   	la = seg_base(ctxt, addr.seg) + addr.ea;
>>   	*max_size = 0;
>> @@ -757,7 +759,11 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
>>   		     ulong *linear)
>>   {
>>   	unsigned max_size;
>> -	return __linearize(ctxt, addr, &max_size, size, write, false,
>> +	u32 flags = 0;
>> +
>> +	if (write)
>> +		flags |= X86EMUL_F_WRITE;
>> +	return __linearize(ctxt, addr, &max_size, size, flags,
>>   			   ctxt->mode, linear);
> I'm tempted to have this be:
>
> 	return __linearize(ctxt, addr, &max_size, size,
> 			   write ? X86EMUL_F_WRITE : 0, ctxt->mode, linear);
>
> Mostly so that it's obvious "flags" is constant.  The alternative would be
>
> 	const unsigned int flags = write ? X86EMUL_F_WRITE : 0;
>
> But my preference is probably to omit "flags" entirely.
>
>>   }
>>   
>> @@ -768,10 +774,11 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
>>   	unsigned max_size;
>>   	struct segmented_address addr = { .seg = VCPU_SREG_CS,
>>   					   .ea = dst };
>> +	u32 flags = X86EMUL_F_FETCH;
>>   
>>   	if (ctxt->op_bytes != sizeof(unsigned long))
>>   		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
>> -	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
>> +	rc = __linearize(ctxt, addr, &max_size, 1, flags, ctxt->mode, &linear);
> Meh, just pass X86EMUL_F_FETCH directly, i.e. drop the local "flags".
>
>>   	if (rc == X86EMUL_CONTINUE)
>>   		ctxt->_eip = addr.ea;
>>   	return rc;
>> @@ -896,6 +903,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
>>   	int cur_size = ctxt->fetch.end - ctxt->fetch.data;
>>   	struct segmented_address addr = { .seg = VCPU_SREG_CS,
>>   					   .ea = ctxt->eip + cur_size };
>> +	u32 flags = X86EMUL_F_FETCH;
>>   
>>   	/*
>>   	 * We do not know exactly how many bytes will be needed, and
>> @@ -907,8 +915,7 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
>>   	 * boundary check itself.  Instead, we use max_size to check
>>   	 * against op_size.
>>   	 */
>> -	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
>> -			 &linear);
>> +	rc = __linearize(ctxt, addr, &max_size, 0, flags, ctxt->mode, &linear);
> And here.
>
>>   	if (unlikely(rc != X86EMUL_CONTINUE))
>>   		return rc;
>>   
>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
>> index ab65f3a47dfd..5b9ec610b2cb 100644
>> --- a/arch/x86/kvm/kvm_emulate.h
>> +++ b/arch/x86/kvm/kvm_emulate.h
>> @@ -88,6 +88,10 @@ struct x86_instruction_info {
>>   #define X86EMUL_IO_NEEDED       5 /* IO is needed to complete emulation */
>>   #define X86EMUL_INTERCEPTED     6 /* Intercepted by nested VMCB/VMCS */
>>   
>> +/* x86-specific emulation flags */
>> +#define X86EMUL_F_FETCH			BIT(0)
>> +#define X86EMUL_F_WRITE			BIT(1)
>> +
>>   struct x86_emulate_ops {
>>   	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
>>   	/*
>> -- 
>> 2.27.0
>>


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize()
  2023-06-27 17:40   ` Sean Christopherson
  2023-06-28  5:13     ` Binbin Wu
@ 2023-06-28  7:27     ` Zeng Guang
  1 sibling, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-28  7:27 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel, Binbin Wu


On 6/28/2023 1:40 AM, Sean Christopherson wrote:
> On Thu, Jun 01, 2023, Zeng Guang wrote:
>> From: Binbin Wu <binbin.wu@linux.intel.com>
>>
>> Define a 32-bit parameter and consolidate the two bools into it.
> Please write changelogs so that they make sense without the context of the shortlog.
> In isolation, the above provides zero context.  And there's no need to provide a
> play-by-play description of the change, e.g. this can be:
>
>    Consolidate __linearize()'s @write and @fetch into a set of flags so that
>    additional flags can be added without needing more/new boolean parameters,
>    e.g. to precisely identify the access type for LASS.
>
>> __linearize() has two bool parameters write and fetch. And new flag
>> will be needed to support new feature (e.g. LAM needs a flag to skip
> s/LAM/LASS
>
>> address untag under some conditions).
> Looks like this was copy+pasted from LAM.  AIUI, there is no untagging in LASS.  Please,
> please take the time to proofread what you're posting.  To you it's a minor typo,
> to others, incorrect statements like this can create a lot of confusion.

LASS and LAM intend to use the same emulator framework, and this
prerequisite patch was taken directly from one of the LAM patches. No
excuse though; we will rework it to be feature- and hardware-agnostic
as you suggested.
Thanks.




^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS
  2023-06-27 17:43   ` Sean Christopherson
@ 2023-06-28  8:19     ` Zeng Guang
  0 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-28  8:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/28/2023 1:43 AM, Sean Christopherson wrote:
> On Thu, Jun 01, 2023, Zeng Guang wrote:
>> Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
>> as CR4.LASS generally set once for each vCPU at boot time and won't be
>> toggled at runtime. Besides, only if VM has LASS capability enumerated with
>> CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
>> to set CR4.LASS.
>>
>> Updating cr4_fixed1 to set CR4.LASS bit in the emulated IA32_VMX_CR4_FIXED1
>> MSR for guests and allow guests to enable LASS in nested VMX operaion as well.
> s/operaion/operation.
Will fix in next version. Thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 0/6] LASS KVM virtualization support
  2023-06-27 17:08 ` Sean Christopherson
@ 2023-06-28  8:42   ` Zeng Guang
  0 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-28  8:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/28/2023 1:08 AM, Sean Christopherson wrote:
> On Thu, Jun 01, 2023, Zeng Guang wrote:
>> This patch series provide a LASS KVM solution.
> ... and depends on kernel enabling that can be found at
>
> https://lore.kernel.org/all/20230609183632.48706-1-alexander.shishkin@linux.intel.com
OK. I will add the dependency statement in the next version.
>> We tested the basic function of LASS virtualization including LASS
>> enumeration and enabling in non-root and nested environment. As KVM
>> unittest framework is not compatible to LASS rule, we use kernel module
>> and application test to emulate LASS violation instead. With KVM forced
>> emulation mechanism, we also verified the LASS functionality on some
>> emulation path with instruction fetch and data access to have same
>> behavior as hardware.
>>
>> [1] Intel ISE https://cdrdv2.intel.com/v1/dl/getContent/671368
>> Chapter Linear Address Space Separation (LASS)


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 4/6] KVM: x86: Add emulator helper for LASS violation check
  2023-06-27 18:28   ` Sean Christopherson
@ 2023-06-29 15:06     ` Zeng Guang
  0 siblings, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-29 15:06 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/28/2023 2:28 AM, Sean Christopherson wrote:
> On Thu, Jun 01, 2023, Zeng Guang wrote:
>> When LASS is enabled, KVM needs to apply the LASS violation check to
>> instruction emulations. Add a helper for the x86 emulator to perform
>> LASS protection.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>> ---
>>   arch/x86/kvm/kvm_emulate.h |  1 +
>>   arch/x86/kvm/x86.c         | 12 ++++++++++++
>>   2 files changed, 13 insertions(+)
>>
>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
>> index f1439ab7c14b..fd1c2b22867e 100644
>> --- a/arch/x86/kvm/kvm_emulate.h
>> +++ b/arch/x86/kvm/kvm_emulate.h
>> @@ -230,6 +230,7 @@ struct x86_emulate_ops {
>>   	int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
>>   	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
>>   	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
>> +	bool (*check_lass)(struct x86_emulate_ctxt *ctxt, u64 access, u64 la, u32 flags);
>>   };
>>   
>>   /* Type, address-of, and value of an instruction's operand. */
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index c0778ca39650..faf01fecc4ca 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -8287,6 +8287,17 @@ static void emulator_vm_bugged(struct x86_emulate_ctxt *ctxt)
>>   		kvm_vm_bugged(kvm);
>>   }
>>   
>> +static bool emulator_check_lass(struct x86_emulate_ctxt *ctxt,
>> +				u64 access, u64 la, u32 flags)
>> +{
>> +	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
>> +
>> +	if (!is_long_mode(vcpu))
>> +		return false;
> Likely a moot point if we wrap ->gva_to_gpa(), but move this into the vendor
> implementation.
Moving the CPU mode check into is_lass_violation() is the right way to
go. I had hoped that keeping it as a separate check at the call sites
would bring some benefit, but indeed it doesn't help much.

>
> And if we keep these emulator hooks, massage the patch ordering:
>
>    1. Add plumbing to emulator to pass new flags
>    2. Add kvm_x86_ops definition and invocation from emulator
>    3. Implement and wire up vmx_is_lass_violation()
>
> That way the changes to each area of KVM are better isolated.
OK. Will reorganize the patch accordingly.
Thanks.
>> +	return static_call(kvm_x86_check_lass)(vcpu, access, la, flags);
>> +}
>> +
>>   static const struct x86_emulate_ops emulate_ops = {
>>   	.vm_bugged           = emulator_vm_bugged,
>>   	.read_gpr            = emulator_read_gpr,
>> @@ -8332,6 +8343,7 @@ static const struct x86_emulate_ops emulate_ops = {
>>   	.leave_smm           = emulator_leave_smm,
>>   	.triple_fault        = emulator_triple_fault,
>>   	.set_xcr             = emulator_set_xcr,
>> +	.check_lass          = emulator_check_lass,
>>   };
>>   
>>   static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
>> -- 
>> 2.27.0
>>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check
  2023-06-27 18:26   ` Sean Christopherson
  2023-06-27 22:45     ` Sean Christopherson
@ 2023-06-30 18:50     ` Zeng Guang
  1 sibling, 0 replies; 36+ messages in thread
From: Zeng Guang @ 2023-06-30 18:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 6/28/2023 2:26 AM, Sean Christopherson wrote:
> Similar to comments on an earlier patch, don't try to describe the literal code
> change, e.g. this does far more than just "Add new ops in kvm_x86_ops".  Something
> like
>
>    KVM: VMX: Add emulation of LASS violation checks on linear address
OK. I will modify the changelog to give an overview of the patch
implementation and be more concise.
>
> On Thu, Jun 01, 2023, Zeng Guang wrote:
> Intel introduces the LASS (Linear Address Space Separation) feature, providing
>> an independent mechanism to achieve the mode-based protection.
>>
>> LASS partitions 64-bit linear address space into two halves, user-mode
>> address (LA[bit 63]=0) and supervisor-mode address (LA[bit 63]=1). It
>> stops any code execution or conditional data access[1]
>>      1. from user mode to supervisor-mode address space
>>      2. from supervisor mode to user-mode address space
>> and generates LASS violation fault accordingly.
>>
>> [1]A supervisor mode data access causes a LASS violation only if supervisor
>> mode access protection is enabled (CR4.SMAP = 1) and either RFLAGS.AC = 0
>> or the access implicitly accesses a system data structure.
>>
>> Following are the rules of LASS violation check on the linear address(LA).
>> User access to supervisor-mode address space:
>>      LA[bit 63] && (CPL == 3)
>> Supervisor access to user-mode address space:
>>      Instruction fetch: !LA[bit 63] && (CPL < 3)
>>      Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
>>                   CPL < 3) || Implicit supervisor access)
>>
>> Add new ops in kvm_x86_ops to do LASS violation check.
>>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>> ---
>>   arch/x86/include/asm/kvm-x86-ops.h |  3 +-
>>   arch/x86/include/asm/kvm_host.h    |  2 ++
>>   arch/x86/kvm/kvm_emulate.h         |  1 +
>>   arch/x86/kvm/vmx/vmx.c             | 47 ++++++++++++++++++++++++++++++
>>   arch/x86/kvm/vmx/vmx.h             |  2 ++
>>   5 files changed, 54 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
>> index 13bc212cd4bc..8980a3bfa687 100644
>> --- a/arch/x86/include/asm/kvm-x86-ops.h
>> +++ b/arch/x86/include/asm/kvm-x86-ops.h
>> @@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
>>   KVM_X86_OP(msr_filter_changed)
>>   KVM_X86_OP(complete_emulated_msr)
>>   KVM_X86_OP(vcpu_deliver_sipi_vector)
>> -KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
>> +KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
>> +KVM_X86_OP_OPTIONAL_RET0(check_lass)
> Use "is_lass_violation" instead of "check_lass" for all function names.  "check"
> doesn't convey that the function is a predicate, i.e. that it returns true/false.
That does make sense. Will change it.
>>   #undef KVM_X86_OP
>>   #undef KVM_X86_OP_OPTIONAL
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 92d8e65fe88c..98666d1e7727 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1731,6 +1731,8 @@ struct kvm_x86_ops {
>>   	 * Returns vCPU specific APICv inhibit reasons
>>   	 */
>>   	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
>> +
>> +	bool (*check_lass)(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags);
>>   };
>>   
>>   struct kvm_x86_nested_ops {
>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
>> index 5b9ec610b2cb..f1439ab7c14b 100644
>> --- a/arch/x86/kvm/kvm_emulate.h
>> +++ b/arch/x86/kvm/kvm_emulate.h
>> @@ -91,6 +91,7 @@ struct x86_instruction_info {
>>   /* x86-specific emulation flags */
>>   #define X86EMUL_F_FETCH			BIT(0)
>>   #define X86EMUL_F_WRITE			BIT(1)
>> +#define X86EMUL_F_SKIPLASS		BIT(2)
> This belongs in the patch that integrates LASS into the emulator.  And rather than
> make the flag a command (SKIPLASS), I think it makes sense to make the flag describe
> the access.  It'll mean more flags, but those are free.  That way the originators of
> the accesses, e.g. em_invlpg(), don't need to document how they interact with LASS,
> e.g. this code is self-documenting, and if someone wants to understand why KVM
> has a dedicated X86EMUL_F_INVLPG flag, it's easy enough to find the consumer.
> And in the unlikely scenario that other things care about INVLPG, branch targets,
> etc., we won't end up with X86EMUL_F_SKIPLASS, X86EMUL_F_SKIPOTHER, etc.
>
> 	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1,
> 			 X86EMUL_F_INVLPG, ctxt->mode, &linear);
>
> So this?
>
>    #define X86EMUL_F_IMPLICIT
>    #define X86EMUL_F_INVLPG
>    #define X86EMUL_F_BRANCH_TARGET
This way, the emulator just conveys the type of the operation, which
makes the flags a common interface across platforms and also conceals
vendor-specific implementation details. We will follow this idea.
Thanks.
>>   struct x86_emulate_ops {
>>   	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index a33205ded85c..876997e8448e 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -8130,6 +8130,51 @@ static void vmx_vm_destroy(struct kvm *kvm)
>>   	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>>   }
>>   
>> +/*
>> + * Determine whether an access to the linear address causes a LASS violation.
>> + * LASS protection is only effective in long mode. As a prerequisite, caller
>> + * should make sure vCPU running in long mode and invoke this api to do LASS
>> + * violation check.
>> + */
>> +bool vmx_check_lass(struct kvm_vcpu *vcpu, u64 access, u64 la, u32 flags)
>> +{
>> +	bool user_mode, user_as, rflags_ac;
>> +
>> +	if (!!(flags & X86EMUL_F_SKIPLASS) ||
>> +	    !kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS))
>> +		return false;
>> +
>> +	WARN_ON_ONCE(!is_long_mode(vcpu));
> This is silly.  By making this a prereq, 2 of the 3 callers are forced to explicitly
> check for is_long_mode(), *and* it unnecessarily bleeds LASS details outside of VMX.
Right. Will drop this implementation.
>> +
>> +	user_as = !(la >> 63);
>> +
>> +	/*
>> +	 * An access is a supervisor-mode access if CPL < 3 or if it implicitly
>> +	 * accesses a system data structure. For implicit accesses to system
>> +	 * data structure, the processor acts as if RFLAGS.AC is clear.
>> +	 */
>> +	if (access & PFERR_IMPLICIT_ACCESS) {
> Please don't use PFERR_IMPLICIT_ACCESS, just extend the new flags.  This is
> obviously not coming from a page fault.  PFERR_IMPLICIT_ACCESS really shouldn't
> exist, but at least there was reasonable justification for adding it (changing
> all of the paths that lead to permission_fault() would have required a massive
> overhaul).
I've realized it's not proper to use "PFERR_xxx" flags, which are
actually defined for MMU processing. They easily cause confusion.
> ***HOWEVER***
>
> I think the entire approach of hooking __linearize() may be a mistake, and LASS
> should instead be implemented in a wrapper of ->gva_to_gpa().  The two calls to
> __linearize() that are escaped with SKIPLASS are escaped *because* they don't
> actually access memory (branch targets and INVLPG), and so if LASS is enforced
> only when ->gva_to_gpa() is invoked, all of these new flags go away because the
> cases that ignore LASS are naturally handled.
>
> That should also make it unnecessary to add one-off checks since e.g. kvm_handle_invpcid()
> will hit kvm_read_guest_virt() and thus ->gva_to_gpa(), i.e. we won't end up playing
> an ongoing game of whack-a-mole.
We considered this scheme before, i.e. doing the LASS violation check
before the page table walk in ->gva_to_gpa(). But we found it difficult
to identify and handle the fault arising from a LASS violation there;
that may require changing the MMU architecture. Thus we recommend the
current implementation.
> And one question that needs to be answered is what happens when an access rolls
> over from supervisor to user, e.g. if the kernel accesses 8 bytes at -1ull and thus
> reads all Fs => 0x6, does the access get a LASS violation on the access to user
> memory?  User=>supervisor can't happen because non-canonical checks have higher
> priority, but supervisor=>user can.  And that matters because it will impact
> whether or not KVM needs to check each *page* or just the initial address.

I verified by experiment on the Simics platform that such an access does
cause a LASS violation. KVM therefore has to apply the LASS violation
check to both ends of the address range for instruction fetches and data
accesses during emulation. Currently it doesn't handle this corner case.
Thanks for pointing out this potential problem.
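
Roughly, the emulation path will need something like this (sketch only,
naming per your earlier suggestion):

	/*
	 * Check both ends of the access: a supervisor access that wraps
	 * around can spill into the user half of the address space.
	 */
	if (is_lass_violation(vcpu, la, flags) ||
	    is_lass_violation(vcpu, la + size - 1, flags))
		goto bad;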

>
>> +		user_mode = false;
>> +		rflags_ac = false;
>> +	} else {
>> +		user_mode = vmx_get_cpl(vcpu) == 3;
>> +		if (!user_mode)
>> +			rflags_ac = !!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
>> +	}
>> +
>> +	if (user_mode == user_as)
>> +		return false;
>> +
>> +	/*
>> +	 * Supervisor-mode _data_ accesses to user address space
>> +	 * cause LASS violations only if SMAP is enabled.
>> +	 */
>> +	if (!user_mode && !(access & PFERR_FETCH_MASK))
>> +		return kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) && !rflags_ac;
> This is all more complex than it needs to be.  Using local bools just so that
> the "user_mode == user_as" is common is not a good tradeoff.  User addresses have
> *significantly* different behavior, lean into that instead of dancing around it.
>
> bool is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
> 		       unsigned int flags)
> {
> 	const bool is_supervisor_address = addr & BIT_ULL(63);
> 	const bool implicit = flags & X86EMUL_F_IMPLICIT;
>
> 	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
> 		return false;
>
> 	/*
> 	 * INVLPG isn't subject to LASS, e.g. to allow invalidating userspace
> 	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
> 	 * to LASS in order to simplify far control transfers (the subsequent
> 	 * fetch will enforce LASS as appropriate).
> 	 */
> 	if (flags & (X86EMUL_F_BRANCH_TARGET | X86EMUL_F_INVLPG))
> 		return false;
>
> 	if (!implicit && vmx_get_cpl(vcpu) == 3)
> 		return is_supervisor_address;
>
> 	/* LASS is enforced for supervisor access iff SMAP is enabled. */
> 	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
> 		return false;
>
> 	/* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
> 	if (!implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
> 		return false;
>
> 	return !is_supervisor_address;
> }

OK. I will refine the implementation further.
Thanks.

>> +	return true;
>> +}

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS
  2023-06-01 14:23 ` [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
  2023-06-05  1:57   ` Binbin Wu
  2023-06-27 17:43   ` Sean Christopherson
@ 2023-08-16 22:16   ` Sean Christopherson
  2 siblings, 0 replies; 36+ messages in thread
From: Sean Christopherson @ 2023-08-16 22:16 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel

Belatedly, same complaints as the LAM series[*], this patch doesn't actually
virtualize LASS.  Just squash this with the previous patch, "KVM: VMX: Implement and
apply vmx_is_lass_violation() for LASS protection", though I would keep the
shortlog from this patch.

[*] https://lore.kernel.org/all/ZN1CjTQ0zWiOxk6j@google.com

On Thu, Jun 01, 2023, Zeng Guang wrote:
> Virtualize CR4.LASS[bit 27] under KVM control instead of being guest-owned
> as CR4.LASS generally set once for each vCPU at boot time and won't be
> toggled at runtime. Besides, only if VM has LASS capability enumerated with
> CPUID.(EAX=07H.ECX=1):EAX.LASS[bit 6], KVM allows guest software to be able
> to set CR4.LASS.
> 
> Updating cr4_fixed1 to set CR4.LASS bit in the emulated IA32_VMX_CR4_FIXED1
> MSR for guests and allow guests to enable LASS in nested VMX operaion as well.
> 
> Notes: Setting CR4.LASS to 1 enables LASS in IA-32e mode. It doesn't take
> effect in legacy mode even if CR4.LASS is set.
> 
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2023-08-16 22:17 UTC | newest]

Thread overview: 36+ messages
2023-06-01 14:23 [PATCH v1 0/6] LASS KVM virtualization support Zeng Guang
2023-06-01 14:23 ` [PATCH v1 1/6] KVM: x86: Consolidate flags for __linearize() Zeng Guang
2023-06-27 17:40   ` Sean Christopherson
2023-06-28  5:13     ` Binbin Wu
2023-06-28  7:27     ` Zeng Guang
2023-06-01 14:23 ` [PATCH v1 2/6] KVM: x86: Virtualize CR4.LASS Zeng Guang
2023-06-05  1:57   ` Binbin Wu
2023-06-06  2:57     ` Zeng Guang
2023-06-27 17:43   ` Sean Christopherson
2023-06-28  8:19     ` Zeng Guang
2023-08-16 22:16   ` Sean Christopherson
2023-06-01 14:23 ` [PATCH v1 3/6] KVM: VMX: Add new ops in kvm_x86_ops for LASS violation check Zeng Guang
2023-06-05  3:31   ` Binbin Wu
2023-06-05 12:53     ` Zhi Wang
2023-06-06  2:57       ` Binbin Wu
2023-06-06  3:53         ` Zhi Wang
2023-06-07  6:28     ` Zeng Guang
2023-06-05  3:47   ` Chao Gao
2023-06-06  6:22     ` Zeng Guang
2023-06-05 14:07   ` Yuan Yao
2023-06-06  3:08     ` Zeng Guang
2023-06-27 18:26   ` Sean Christopherson
2023-06-27 22:45     ` Sean Christopherson
2023-06-30 18:50     ` Zeng Guang
2023-06-01 14:23 ` [PATCH v1 4/6] KVM: x86: Add emulator helper " Zeng Guang
2023-06-27 18:28   ` Sean Christopherson
2023-06-29 15:06     ` Zeng Guang
2023-06-01 14:23 ` [PATCH v1 5/6] KVM: x86: LASS protection on KVM emulation Zeng Guang
2023-06-06  4:20   ` Binbin Wu
2023-06-01 14:23 ` [PATCH v1 6/6] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
2023-06-02  0:35 ` [PATCH v1 0/6] LASS KVM virtualization support Sean Christopherson
2023-06-06  2:22   ` Zeng Guang
2023-06-05  1:39 ` Binbin Wu
2023-06-06  2:40   ` Zeng Guang
2023-06-27 17:08 ` Sean Christopherson
2023-06-28  8:42   ` Zeng Guang
