* [PATCH v2 0/8] LASS KVM virtualization support
@ 2023-07-19  2:45 Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 1/8] KVM: x86: Consolidate flags for __linearize() Zeng Guang
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

Linear Address Space Separation (LASS)[1] is a new mechanism that
enforces the same mode-based protections as paging, i.e. SMAP/SMEP,
but without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, "probes" by malicious
software will provide no paging-based timing information.

Based purely on the linear-address organization, LASS partitions the
64-bit linear address space into two halves: user-mode addresses
(LA[bit 63]=0) and supervisor-mode addresses (LA[bit 63]=1).

LASS aims to prevent any attempt by user mode to probe supervisor-mode
addresses, and likewise to stop supervisor mode from accessing (if SMAP is
enabled) or executing user-mode addresses.

When the platform has LASS capability, KVM needs to expose the feature to
the guest VM, enumerated by CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6], and
allow the guest to enable it via CR4.LASS[bit 27] on demand. For
instructions executed directly by the guest, hardware performs the check.
But KVM also needs to behave the same as hardware and apply LASS to all
kinds of guest memory accesses when emulating instructions in software.
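
As a rough illustration only (not part of this series), a guest kernel
could enumerate and enable the feature along these lines; cpuid_count()
and cr4_set_bits() are existing kernel helpers, while X86_CR4_LASS is
defined by the kernel enabling series [2]:

	unsigned int eax, ebx, ecx, edx;

	/* CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6] enumerates LASS support. */
	cpuid_count(0x00000007, 1, &eax, &ebx, &ecx, &edx);
	if (eax & BIT(6))
		cr4_set_bits(X86_CR4_LASS);	/* CR4.LASS is bit 27 */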

KVM performs the following LASS violation checks on the emulation path.
User-mode access to supervisor space address:
        LA[bit 63] && (CPL == 3)
Supervisor-mode access to user space address:
        Instruction fetch: !LA[bit 63] && (CPL < 3)
        Data access: !LA[bit 63] && (CR4.SMAP==1) && ((RFLAGS.AC == 0 &&
                     CPL < 3) || Implicit supervisor access)
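
The sketch below is purely illustrative and mirrors the rules above; the
actual implementation in this series is vmx_is_lass_violation() (patch 6),
which additionally handles implicit supervisor accesses, branch targets,
INVLPG and wraparound accesses:

	/* Illustrative only: does the access at @la violate LASS? */
	static bool lass_violation_sketch(struct kvm_vcpu *vcpu, u64 la,
					  bool fetch, bool implicit)
	{
		bool supervisor_la = la & BIT_ULL(63);

		/* User-mode access to a supervisor space address. */
		if (vmx_get_cpl(vcpu) == 3)
			return supervisor_la;

		/* Supervisor-mode instruction fetch from user space. */
		if (fetch)
			return !supervisor_la;

		/* Supervisor-mode data access to user space. */
		return !supervisor_la &&
		       kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP) &&
		       (!(kvm_get_rflags(vcpu) & X86_EFLAGS_AC) || implicit);
	}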

This patch series provides a LASS KVM solution and depends on the kernel
enabling series that can be found at [2].

We tested the basic functionality of LASS virtualization, including LASS
enumeration and enabling, in both non-root and nested environments. As the
KVM unit test framework is not compatible with the LASS rules, we instead
use a kernel module and a user-space application test to trigger LASS
violations (a rough sketch of such a probe is given below). With KVM's
forced emulation mechanism, we also verified that instruction fetches and
data accesses on several emulation paths behave the same as hardware.
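
For reference, a minimal user-space probe of the kind used in the
application test might look like the sketch below (illustrative only; the
address is arbitrary, and with CR4.LASS set in the guest the read takes a
#GP before any page walk, which the process observes as SIGSEGV):

	#include <setjmp.h>
	#include <signal.h>
	#include <stdio.h>

	static sigjmp_buf env;

	static void on_fault(int sig)
	{
		siglongjmp(env, 1);
	}

	int main(void)
	{
		/* Supervisor space address, i.e. LA[bit 63] = 1. */
		volatile char *kaddr = (volatile char *)0xffff888000000000UL;

		signal(SIGSEGV, on_fault);
		if (!sigsetjmp(env, 1)) {
			(void)*kaddr;	/* expected to fault under LASS */
			puts("no fault observed");
		} else {
			puts("faulted as expected");
		}
		return 0;
	}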

Extending kselftest to support LASS is under investigation and
experimentation.

[1] Intel ISE spec https://cdrdv2.intel.com/v1/dl/getContent/671368
Chapter Linear Address Space Separation (LASS)

[2] LASS kernel patch series
https://lore.kernel.org/all/20230609183632.48706-1-alexander.shishkin@linux.intel.com/

------------------------------------------------------------------------

v1->v2
1. Refactor and optimize the instruction emulation interface by
   introducing a new set of operation type definitions prefixed with
   "X86EMUL_F_" to distinguish access types.
2. Reorganize the patches so that each area of KVM is better isolated.
3. Refine the LASS violation check design to account for wraparound
   accesses across the address space boundary.

v0->v1
1. Adapt to new __linearize() API
2. Function refactor of vmx_check_lass()
3. Refine commit message to be more precise
4. Drop LASS kvm cap detection depending
   on hardware capability

Binbin Wu (4):
  KVM: x86: Consolidate flags for __linearize()
  KVM: x86: Use a new flag for branch instructions
  KVM: x86: Add an emulation flag for implicit system access
  KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()

Zeng Guang (4):
  KVM: emulator: Add emulation of LASS violation checks on linear
    address
  KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS
    protection
  KVM: x86: Virtualize CR4.LASS
  KVM: x86: Advertise LASS CPUID to user space

 arch/x86/include/asm/kvm-x86-ops.h |  3 ++-
 arch/x86/include/asm/kvm_host.h    |  5 +++-
 arch/x86/kvm/cpuid.c               |  5 ++--
 arch/x86/kvm/emulate.c             | 37 ++++++++++++++++++++---------
 arch/x86/kvm/kvm_emulate.h         |  9 +++++++
 arch/x86/kvm/vmx/nested.c          |  3 ++-
 arch/x86/kvm/vmx/sgx.c             |  4 ++++
 arch/x86/kvm/vmx/vmx.c             | 38 ++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h             |  3 +++
 arch/x86/kvm/x86.c                 | 10 ++++++++
 arch/x86/kvm/x86.h                 |  2 ++
 11 files changed, 102 insertions(+), 17 deletions(-)

-- 
2.27.0



* [PATCH v2 1/8] KVM: x86: Consolidate flags for __linearize()
  2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
@ 2023-07-19  2:45 ` Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions Zeng Guang
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Binbin Wu, Zeng Guang

From: Binbin Wu <binbin.wu@linux.intel.com>

Consolidate @write and @fetch of __linearize() into a set of flags so that
additional flags can be added without needing more/new boolean parameters,
to precisely identify the access type.

No functional change intended.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/emulate.c     | 21 +++++++++++----------
 arch/x86/kvm/kvm_emulate.h |  4 ++++
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 936a397a08cd..3ddfbc99fa4f 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -687,8 +687,8 @@ static unsigned insn_alignment(struct x86_emulate_ctxt *ctxt, unsigned size)
 static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 				       struct segmented_address addr,
 				       unsigned *max_size, unsigned size,
-				       bool write, bool fetch,
-				       enum x86emul_mode mode, ulong *linear)
+				       enum x86emul_mode mode, ulong *linear,
+				       unsigned int flags)
 {
 	struct desc_struct desc;
 	bool usable;
@@ -717,11 +717,11 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 		if (!usable)
 			goto bad;
 		/* code segment in protected mode or read-only data segment */
-		if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8))
-					|| !(desc.type & 2)) && write)
+		if ((((ctxt->mode != X86EMUL_MODE_REAL) && (desc.type & 8)) || !(desc.type & 2)) &&
+		    (flags & X86EMUL_F_WRITE))
 			goto bad;
 		/* unreadable code segment */
-		if (!fetch && (desc.type & 8) && !(desc.type & 2))
+		if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
 			goto bad;
 		lim = desc_limit_scaled(&desc);
 		if (!(desc.type & 8) && (desc.type & 4)) {
@@ -757,8 +757,8 @@ static int linearize(struct x86_emulate_ctxt *ctxt,
 		     ulong *linear)
 {
 	unsigned max_size;
-	return __linearize(ctxt, addr, &max_size, size, write, false,
-			   ctxt->mode, linear);
+	return __linearize(ctxt, addr, &max_size, size, ctxt->mode, linear,
+			   write ? X86EMUL_F_WRITE : 0);
 }
 
 static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
@@ -771,7 +771,8 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 
 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
-	rc = __linearize(ctxt, addr, &max_size, 1, false, true, ctxt->mode, &linear);
+	rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
+			 X86EMUL_F_FETCH);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->_eip = addr.ea;
 	return rc;
@@ -907,8 +908,8 @@ static int __do_insn_fetch_bytes(struct x86_emulate_ctxt *ctxt, int op_size)
 	 * boundary check itself.  Instead, we use max_size to check
 	 * against op_size.
 	 */
-	rc = __linearize(ctxt, addr, &max_size, 0, false, true, ctxt->mode,
-			 &linear);
+	rc = __linearize(ctxt, addr, &max_size, 0, ctxt->mode, &linear,
+			 X86EMUL_F_FETCH);
 	if (unlikely(rc != X86EMUL_CONTINUE))
 		return rc;
 
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index ab65f3a47dfd..86bbe997162d 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -88,6 +88,10 @@ struct x86_instruction_info {
 #define X86EMUL_IO_NEEDED       5 /* IO is needed to complete emulation */
 #define X86EMUL_INTERCEPTED     6 /* Intercepted by nested VMCB/VMCS */
 
+/* x86-specific emulation flags */
+#define X86EMUL_F_WRITE			BIT(0)
+#define X86EMUL_F_FETCH			BIT(1)
+
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
 	/*
-- 
2.27.0



* [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions
  2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 1/8] KVM: x86: Consolidate flags for __linearize() Zeng Guang
@ 2023-07-19  2:45 ` Zeng Guang
  2023-08-15 22:51   ` Sean Christopherson
  2023-07-19  2:45 ` [PATCH v2 3/8] KVM: x86: Add an emulation flag for implicit system access Zeng Guang
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Binbin Wu, Zeng Guang

From: Binbin Wu <binbin.wu@linux.intel.com>

Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
assign_eip(), since strictly speaking it is not behavior of instruction
fetch.

Another reason is to distinguish instruction fetch and execution of branch
instruction for feature(s) that handle differently on them.

Branch instruction is not data access instruction, so skip checking against
execute-only code segment as instruction fetch.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/emulate.c     | 5 +++--
 arch/x86/kvm/kvm_emulate.h | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 3ddfbc99fa4f..8e706d19ae45 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -721,7 +721,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 		    (flags & X86EMUL_F_WRITE))
 			goto bad;
 		/* unreadable code segment */
-		if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
+		if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH))
+			&& (desc.type & 8) && !(desc.type & 2))
 			goto bad;
 		lim = desc_limit_scaled(&desc);
 		if (!(desc.type & 8) && (desc.type & 4)) {
@@ -772,7 +773,7 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
 	rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
-			 X86EMUL_F_FETCH);
+			 X86EMUL_F_BRANCH);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->_eip = addr.ea;
 	return rc;
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 86bbe997162d..9fc7d34a4ac1 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -91,6 +91,7 @@ struct x86_instruction_info {
 /* x86-specific emulation flags */
 #define X86EMUL_F_WRITE			BIT(0)
 #define X86EMUL_F_FETCH			BIT(1)
+#define X86EMUL_F_BRANCH		BIT(2)
 
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
-- 
2.27.0



* [PATCH v2 3/8] KVM: x86: Add an emulation flag for implicit system access
  2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 1/8] KVM: x86: Consolidate flags for __linearize() Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions Zeng Guang
@ 2023-07-19  2:45 ` Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg() Zeng Guang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Binbin Wu, Zeng Guang

From: Binbin Wu <binbin.wu@linux.intel.com>

Add an emulation flag X86EMUL_F_IMPLICIT to identify the behavior of
implicit system access in instruction emulation.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/kvm_emulate.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 9fc7d34a4ac1..c0e48f4fa7c4 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -92,6 +92,7 @@ struct x86_instruction_info {
 #define X86EMUL_F_WRITE			BIT(0)
 #define X86EMUL_F_FETCH			BIT(1)
 #define X86EMUL_F_BRANCH		BIT(2)
+#define X86EMUL_F_IMPLICIT		BIT(3)
 
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
-- 
2.27.0



* [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()
  2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
                   ` (2 preceding siblings ...)
  2023-07-19  2:45 ` [PATCH v2 3/8] KVM: x86: Add an emulation flag for implicit system access Zeng Guang
@ 2023-07-19  2:45 ` Zeng Guang
  2023-08-15 23:11   ` Sean Christopherson
  2023-07-19  2:45 ` [PATCH v2 5/8] KVM: emulator: Add emulation of LASS violation checks on linear address Zeng Guang
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Binbin Wu, Zeng Guang

From: Binbin Wu <binbin.wu@linux.intel.com>

Add an emulation flag X86EMUL_F_INVTLB, which is used to identify an
instruction that does TLB invalidation without true memory access.

Only invlpg & invlpga implemented in emulator belong to this kind.
invlpga doesn't need additional information for emulation. Just pass
the flag to em_invlpg().

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/emulate.c     | 4 +++-
 arch/x86/kvm/kvm_emulate.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 8e706d19ae45..9b4b3ce6d52a 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3443,8 +3443,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
 {
 	int rc;
 	ulong linear;
+	unsigned max_size;
 
-	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
+	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
+		&linear, X86EMUL_F_INVTLB);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->ops->invlpg(ctxt, linear);
 	/* Disable writeback. */
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index c0e48f4fa7c4..c944055091e1 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -93,6 +93,7 @@ struct x86_instruction_info {
 #define X86EMUL_F_FETCH			BIT(1)
 #define X86EMUL_F_BRANCH		BIT(2)
 #define X86EMUL_F_IMPLICIT		BIT(3)
+#define X86EMUL_F_INVTLB		BIT(4)
 
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
-- 
2.27.0



* [PATCH v2 5/8] KVM: emulator: Add emulation of LASS violation checks on linear address
  2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
                   ` (3 preceding siblings ...)
  2023-07-19  2:45 ` [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg() Zeng Guang
@ 2023-07-19  2:45 ` Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection Zeng Guang
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

When the Intel CPU feature Linear Address Space Separation (LASS) is
enabled, the KVM emulator must perform a LASS violation check on every
access to guest memory via a linear address.

Define a new function prototype in kvm_x86_ops as the emulator's
interface for identifying whether a LASS violation occurs. The actual
implementation can then be provided according to vendor-specific
requirements.

The emulator uses the passed (address, size) pair and the instruction
operation type (flags) to enforce LASS protection when KVM emulates
instruction fetches and data accesses, including implicit data accesses
to system data structures.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/include/asm/kvm-x86-ops.h |  3 ++-
 arch/x86/include/asm/kvm_host.h    |  3 +++
 arch/x86/kvm/emulate.c             | 11 +++++++++++
 arch/x86/kvm/kvm_emulate.h         |  2 ++
 arch/x86/kvm/x86.c                 | 10 ++++++++++
 5 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 13bc212cd4bc..a301f0a46381 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -132,7 +132,8 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
 KVM_X86_OP(msr_filter_changed)
 KVM_X86_OP(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
-KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
+KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons)
+KVM_X86_OP_OPTIONAL_RET0(is_lass_violation)
 
 #undef KVM_X86_OP
 #undef KVM_X86_OP_OPTIONAL
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index fb9d1f2d6136..791f0dd48cd9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1731,6 +1731,9 @@ struct kvm_x86_ops {
 	 * Returns vCPU specific APICv inhibit reasons
 	 */
 	unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
+
+	bool (*is_lass_violation)(struct kvm_vcpu *vcpu, unsigned long addr,
+				  unsigned int size, unsigned int flags);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 9b4b3ce6d52a..7bb595811486 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -742,6 +742,10 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 		}
 		break;
 	}
+
+	if (ctxt->ops->is_lass_violation(ctxt, *linear, size, flags))
+		goto bad;
+
 	if (la & (insn_alignment(ctxt, size) - 1))
 		return emulate_gp(ctxt, 0);
 	return X86EMUL_CONTINUE;
@@ -848,6 +852,9 @@ static inline int jmp_rel(struct x86_emulate_ctxt *ctxt, int rel)
 static int linear_read_system(struct x86_emulate_ctxt *ctxt, ulong linear,
 			      void *data, unsigned size)
 {
+	if (ctxt->ops->is_lass_violation(ctxt, linear, size, X86EMUL_F_IMPLICIT))
+		return emulate_gp(ctxt, 0);
+
 	return ctxt->ops->read_std(ctxt, linear, data, size, &ctxt->exception, true);
 }
 
@@ -855,6 +862,10 @@ static int linear_write_system(struct x86_emulate_ctxt *ctxt,
 			       ulong linear, void *data,
 			       unsigned int size)
 {
+	if (ctxt->ops->is_lass_violation(ctxt, linear, size,
+					 X86EMUL_F_IMPLICIT | X86EMUL_F_WRITE))
+		return emulate_gp(ctxt, 0);
+
 	return ctxt->ops->write_std(ctxt, linear, data, size, &ctxt->exception, true);
 }
 
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index c944055091e1..6f0996d0da56 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -232,6 +232,8 @@ struct x86_emulate_ops {
 	int (*leave_smm)(struct x86_emulate_ctxt *ctxt);
 	void (*triple_fault)(struct x86_emulate_ctxt *ctxt);
 	int (*set_xcr)(struct x86_emulate_ctxt *ctxt, u32 index, u64 xcr);
+	bool (*is_lass_violation)(struct x86_emulate_ctxt *ctxt, unsigned long addr,
+				  unsigned int size, unsigned int flags);
 };
 
 /* Type, address-of, and value of an instruction's operand. */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 04b57a336b34..6448ff706539 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8287,6 +8287,15 @@ static void emulator_vm_bugged(struct x86_emulate_ctxt *ctxt)
 		kvm_vm_bugged(kvm);
 }
 
+static bool emulator_is_lass_violation(struct x86_emulate_ctxt *ctxt,
+				       unsigned long addr,
+				       unsigned int size,
+				       unsigned int flags)
+{
+	return static_call(kvm_x86_is_lass_violation)(emul_to_vcpu(ctxt),
+						      addr, size, flags);
+}
+
 static const struct x86_emulate_ops emulate_ops = {
 	.vm_bugged           = emulator_vm_bugged,
 	.read_gpr            = emulator_read_gpr,
@@ -8332,6 +8341,7 @@ static const struct x86_emulate_ops emulate_ops = {
 	.leave_smm           = emulator_leave_smm,
 	.triple_fault        = emulator_triple_fault,
 	.set_xcr             = emulator_set_xcr,
+	.is_lass_violation   = emulator_is_lass_violation,
 };
 
 static void toggle_interruptibility(struct kvm_vcpu *vcpu, u32 mask)
-- 
2.27.0



* [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection
  2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
                   ` (4 preceding siblings ...)
  2023-07-19  2:45 ` [PATCH v2 5/8] KVM: emulator: Add emulation of LASS violation checks on linear address Zeng Guang
@ 2023-07-19  2:45 ` Zeng Guang
  2023-08-07  7:03   ` Binbin Wu
  2023-07-19  2:45 ` [PATCH v2 7/8] KVM: x86: Virtualize CR4.LASS Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 8/8] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
  7 siblings, 1 reply; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

Implement and wire up vmx_is_lass_violation() in kvm_x86_ops for VMX.

LASS violation check takes effect in KVM emulation of instruction fetch
and data access including implicit access when vCPU is running in long
mode, and also involved in emulation of VMX instruction and SGX ENCLS
instruction to enforce the mode-based protections before paging.

But the target memory address of emulation of TLB invalidation and branch
instructions aren't subject to LASS as exceptions.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/kvm/vmx/nested.c |  3 ++-
 arch/x86/kvm/vmx/sgx.c    |  4 ++++
 arch/x86/kvm/vmx/vmx.c    | 35 +++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h    |  3 +++
 4 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index e35cf0bd0df9..72e78566a3b6 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4985,7 +4985,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
 		 * non-canonical form. This is the only check on the memory
 		 * destination for long mode!
 		 */
-		exn = is_noncanonical_address(*ret, vcpu);
+		exn = is_noncanonical_address(*ret, vcpu) ||
+		      vmx_is_lass_violation(vcpu, *ret, len, 0);
 	} else {
 		/*
 		 * When not in long mode, the virtual/linear address is
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 2261b684a7d4..f8de637ce634 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -46,6 +46,10 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
 			((s.base != 0 || s.limit != 0xffffffff) &&
 			(((u64)*gva + size - 1) > s.limit + 1));
 	}
+
+	if (!fault)
+		fault = vmx_is_lass_violation(vcpu, *gva, size, 0);
+
 	if (fault)
 		kvm_inject_gp(vcpu, 0);
 	return fault ? -EINVAL : 0;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 44fb619803b8..15a7c6e7a25d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8127,6 +8127,40 @@ static void vmx_vm_destroy(struct kvm *kvm)
 	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
 }
 
+bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
+			   unsigned int size, unsigned int flags)
+{
+	const bool is_supervisor_address = !!(addr & BIT_ULL(63));
+	const bool implicit = !!(flags & X86EMUL_F_IMPLICIT);
+	const bool fetch = !!(flags & X86EMUL_F_FETCH);
+	const bool is_wraparound_access = size ? (addr + size - 1) < addr : false;
+
+	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
+		return false;
+
+	/*
+	 * INVTLB isn't subject to LASS, e.g. to allow invalidating userspace
+	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
+	 * to LASS in order to simplifiy far control transfers (the subsequent
+	 * fetch will enforce LASS as appropriate).
+	 */
+	if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVTLB))
+		return false;
+
+	if (!implicit && vmx_get_cpl(vcpu) == 3)
+		return is_supervisor_address;
+
+	/* LASS is enforced for supervisor-mode access iff SMAP is enabled. */
+	if (!fetch && !kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
+		return false;
+
+	/* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
+	if (!fetch && !implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
+		return false;
+
+	return is_wraparound_access ? true : !is_supervisor_address;
+}
+
 static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.name = KBUILD_MODNAME,
 
@@ -8266,6 +8300,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.complete_emulated_msr = kvm_complete_insn_gp,
 
 	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
+	.is_lass_violation = vmx_is_lass_violation,
 };
 
 static unsigned int vmx_handle_intel_pt_intr(void)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9e66531861cf..c1e541a790bb 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -433,6 +433,9 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
 u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
 u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
 
+bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
+			   unsigned int size, unsigned int flags);
+
 static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
 					     int type, bool value)
 {
-- 
2.27.0



* [PATCH v2 7/8] KVM: x86: Virtualize CR4.LASS
  2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
                   ` (5 preceding siblings ...)
  2023-07-19  2:45 ` [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection Zeng Guang
@ 2023-07-19  2:45 ` Zeng Guang
  2023-07-19  2:45 ` [PATCH v2 8/8] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
  7 siblings, 0 replies; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

Virtualize CR4.LASS[bit 27] under KVM control instead of making it
guest-owned, as CR4.LASS is generally set once for each vCPU at boot time
and won't be toggled at runtime. Besides, KVM allows guest software to set
CR4.LASS only if the VM has the LASS capability enumerated by
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6].

Update cr4_fixed1 to set the CR4.LASS bit in the emulated
IA32_VMX_CR4_FIXED1 MSR for guests, allowing guests to enable LASS in
nested VMX operation as well.

Note: Setting CR4.LASS to 1 enables LASS only in IA-32e mode. It doesn't
take effect in legacy mode even if CR4.LASS is set.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 3 +++
 arch/x86/kvm/x86.h              | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 791f0dd48cd9..a881b0518a18 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,7 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP | X86_CR4_LASS))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 15a7c6e7a25d..e74991bed362 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7603,6 +7603,9 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 	cr4_fixed1_update(X86_CR4_UMIP,       ecx, feature_bit(UMIP));
 	cr4_fixed1_update(X86_CR4_LA57,       ecx, feature_bit(LA57));
 
+	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
+	cr4_fixed1_update(X86_CR4_LASS,       eax, feature_bit(LASS));
+
 #undef cr4_fixed1_update
 }
 
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c544602d07a3..e1295f490308 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -529,6 +529,8 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type);
 		__reserved_bits |= X86_CR4_VMXE;        \
 	if (!__cpu_has(__c, X86_FEATURE_PCID))          \
 		__reserved_bits |= X86_CR4_PCIDE;       \
+	if (!__cpu_has(__c, X86_FEATURE_LASS))          \
+		__reserved_bits |= X86_CR4_LASS;        \
 	__reserved_bits;                                \
 })
 
-- 
2.27.0



* [PATCH v2 8/8] KVM: x86: Advertise LASS CPUID to user space
  2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
                   ` (6 preceding siblings ...)
  2023-07-19  2:45 ` [PATCH v2 7/8] KVM: x86: Virtualize CR4.LASS Zeng Guang
@ 2023-07-19  2:45 ` Zeng Guang
  7 siblings, 0 replies; 21+ messages in thread
From: Zeng Guang @ 2023-07-19  2:45 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Zeng Guang

Linear Address Space Separation (LASS) is an independent mechanism
that enforces mode-based protection: it prevents user-mode accesses
to supervisor-mode addresses, and vice versa. Because the LASS
protections are applied before paging, malicious software cannot
acquire any paging-based timing information to compromise the
security of the system.

The CPUID bit definition to support LASS:
CPUID.(EAX=07H,ECX=1):EAX.LASS[bit 6]

Advertise LASS to user space to support LASS virtualization.

Signed-off-by: Zeng Guang <guang.zeng@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---
 arch/x86/kvm/cpuid.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0c9660a07b23..a7fafe99ffe4 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -646,9 +646,8 @@ void kvm_set_cpu_caps(void)
 		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
 
 	kvm_cpu_cap_mask(CPUID_7_1_EAX,
-		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
-		F(FZRM) | F(FSRS) | F(FSRC) |
-		F(AMX_FP16) | F(AVX_IFMA)
+		F(AVX_VNNI) | F(AVX512_BF16) | F(LASS) | F(CMPCCXADD) |
+		F(FZRM) | F(FSRS) | F(FSRC) | F(AMX_FP16) | F(AVX_IFMA)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
-- 
2.27.0



* Re: [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection
  2023-07-19  2:45 ` [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection Zeng Guang
@ 2023-08-07  7:03   ` Binbin Wu
  2023-08-15 23:46     ` Sean Christopherson
  0 siblings, 1 reply; 21+ messages in thread
From: Binbin Wu @ 2023-08-07  7:03 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 7/19/2023 10:45 AM, Zeng Guang wrote:
> Implement and wire up vmx_is_lass_violation() in kvm_x86_ops for VMX.
>
> LASS violation check takes effect in KVM emulation of instruction fetch
> and data access including implicit access when vCPU is running in long
> mode, and also involved in emulation of VMX instruction and SGX ENCLS
> instruction to enforce the mode-based protections before paging.
>
> But the target memory address of emulation of TLB invalidation and branch
> instructions aren't subject to LASS as exceptions.
>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> ---
>   arch/x86/kvm/vmx/nested.c |  3 ++-
>   arch/x86/kvm/vmx/sgx.c    |  4 ++++
>   arch/x86/kvm/vmx/vmx.c    | 35 +++++++++++++++++++++++++++++++++++
>   arch/x86/kvm/vmx/vmx.h    |  3 +++
>   4 files changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index e35cf0bd0df9..72e78566a3b6 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4985,7 +4985,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
>   		 * non-canonical form. This is the only check on the memory
>   		 * destination for long mode!
>   		 */
> -		exn = is_noncanonical_address(*ret, vcpu);
> +		exn = is_noncanonical_address(*ret, vcpu) ||
> +		      vmx_is_lass_violation(vcpu, *ret, len, 0);
>   	} else {
>   		/*
>   		 * When not in long mode, the virtual/linear address is
> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> index 2261b684a7d4..f8de637ce634 100644
> --- a/arch/x86/kvm/vmx/sgx.c
> +++ b/arch/x86/kvm/vmx/sgx.c
> @@ -46,6 +46,10 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
>   			((s.base != 0 || s.limit != 0xffffffff) &&
>   			(((u64)*gva + size - 1) > s.limit + 1));
>   	}
> +
> +	if (!fault)
> +		fault = vmx_is_lass_violation(vcpu, *gva, size, 0);
> +
>   	if (fault)
>   		kvm_inject_gp(vcpu, 0);
>   	return fault ? -EINVAL : 0;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 44fb619803b8..15a7c6e7a25d 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8127,6 +8127,40 @@ static void vmx_vm_destroy(struct kvm *kvm)
>   	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>   }
>   
> +bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
> +			   unsigned int size, unsigned int flags)
> +{
> +	const bool is_supervisor_address = !!(addr & BIT_ULL(63));
> +	const bool implicit = !!(flags & X86EMUL_F_IMPLICIT);
> +	const bool fetch = !!(flags & X86EMUL_F_FETCH);
> +	const bool is_wraparound_access = size ? (addr + size - 1) < addr : false;
> +
> +	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
> +		return false;
> +
> +	/*
> +	 * INVTLB isn't subject to LASS, e.g. to allow invalidating userspace
> +	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
> +	 * to LASS in order to simplifiy far control transfers (the subsequent
s/simplifiy/simplify

> +	 * fetch will enforce LASS as appropriate).
> +	 */
> +	if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVTLB))
> +		return false;
> +
> +	if (!implicit && vmx_get_cpl(vcpu) == 3)
> +		return is_supervisor_address;
> +
> +	/* LASS is enforced for supervisor-mode access iff SMAP is enabled. */
To be more accurate, supervisor-mode data access.
Also, is "iff" here a typo for "if", or does it stand for "if and only
if"?
It is not accurate to use "if and only if" here because besides SMAP,
there are other conditions, i.e. implicit access or RFLAGS.AC.

> +	if (!fetch && !kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
> +		return false;
> +
> +	/* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
> +	if (!fetch && !implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
> +		return false;
> +
> +	return is_wraparound_access ? true : !is_supervisor_address;
> +}
> +
>   static struct kvm_x86_ops vmx_x86_ops __initdata = {
>   	.name = KBUILD_MODNAME,
>   
> @@ -8266,6 +8300,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>   	.complete_emulated_msr = kvm_complete_insn_gp,
>   
>   	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
> +	.is_lass_violation = vmx_is_lass_violation,
>   };
>   
>   static unsigned int vmx_handle_intel_pt_intr(void)
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 9e66531861cf..c1e541a790bb 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -433,6 +433,9 @@ void vmx_enable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type);
>   u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu);
>   u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu);
>   
> +bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
> +			   unsigned int size, unsigned int flags);
> +
>   static inline void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr,
>   					     int type, bool value)
>   {



* Re: [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions
  2023-07-19  2:45 ` [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions Zeng Guang
@ 2023-08-15 22:51   ` Sean Christopherson
  2023-08-16  7:34     ` Binbin Wu
  0 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2023-08-15 22:51 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel, Binbin Wu

Branch *targets*, not branch instructions.  

On Wed, Jul 19, 2023, Zeng Guang wrote:
> From: Binbin Wu <binbin.wu@linux.intel.com>
> 
> Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
> assign_eip(), since strictly speaking it is not behavior of instruction
> fetch.

Eh, I'd just drop this paragraph, as evidenced by this code existing as-is for
years, we wouldn't introduce X86EMUL_F_BRANCH just because resolving a branch
target isn't strictly an instruction fetch.

> Another reason is to distinguish instruction fetch and execution of branch
> instruction for feature(s) that handle differently on them.

Similar to the shortlog, it's about computing the branch target, not executing a
branch instruction.  That distinction matters, e.g. a Jcc that is not taken will
*not* follow the branch target, but the instruction is still *executed*.  And there
exist instructions that compute branch targets, but aren't what most people would
typically consider a branch instruction, e.g. XBEGIN.

> Branch instruction is not data access instruction, so skip checking against
> execute-only code segment as instruction fetch.

Rather than call out individual use case, I would simply state that as of this
patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far as KVM is
concerned.  That lets the reader know that (a) there's no intended change in
behavior and (b) that the intent is to effectively split all consumption of
X86EMUL_F_FETCH into (X86EMUL_F_FETCH | X86EMUL_F_BRANCH).

> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> ---
>  arch/x86/kvm/emulate.c     | 5 +++--
>  arch/x86/kvm/kvm_emulate.h | 1 +
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 3ddfbc99fa4f..8e706d19ae45 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -721,7 +721,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>  		    (flags & X86EMUL_F_WRITE))
>  			goto bad;
>  		/* unreadable code segment */
> -		if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
> +		if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH))
> +			&& (desc.type & 8) && !(desc.type & 2))

Put the && on the first line, and align indentation.

		/* unreadable code segment */
		if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH)) &&
		    (desc.type & 8) && !(desc.type & 2))
			goto bad;


* Re: [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()
  2023-07-19  2:45 ` [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg() Zeng Guang
@ 2023-08-15 23:11   ` Sean Christopherson
  2023-08-16  7:55     ` Binbin Wu
  0 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2023-08-15 23:11 UTC (permalink / raw)
  To: Zeng Guang
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel, Binbin Wu

On Wed, Jul 19, 2023, Zeng Guang wrote:
> From: Binbin Wu <binbin.wu@linux.intel.com>
> 
> Add an emulation flag X86EMUL_F_INVTLB, which is used to identify an
> instruction that does TLB invalidation without true memory access.
> 
> Only invlpg & invlpga implemented in emulator belong to this kind.
> invlpga doesn't need additional information for emulation. Just pass
> the flag to em_invlpg().

Please add a paragraph explaining *why* this flag is being added.  Ideally, the
previous patch would also explain the need for an IMPLICIT flag, but that one
doesn't bug me all that much because implicit accesses are known to be special
snowflakes, i.e. it's easy to imagine that KVM would need to identify such
accesses.  But for INVLPG, without already knowing the details of LASS (or LAM),
it's harder to think of why it needs to exist.

> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> ---
>  arch/x86/kvm/emulate.c     | 4 +++-
>  arch/x86/kvm/kvm_emulate.h | 1 +
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 8e706d19ae45..9b4b3ce6d52a 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -3443,8 +3443,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
>  {
>  	int rc;
>  	ulong linear;
> +	unsigned max_size;

	unsigned int

> -	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
> +	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
> +		&linear, X86EMUL_F_INVTLB);

Align indentation:

	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
			 &linear, X86EMUL_F_INVTLB);

>  	if (rc == X86EMUL_CONTINUE)
>  		ctxt->ops->invlpg(ctxt, linear);
>  	/* Disable writeback. */
> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> index c0e48f4fa7c4..c944055091e1 100644
> --- a/arch/x86/kvm/kvm_emulate.h
> +++ b/arch/x86/kvm/kvm_emulate.h
> @@ -93,6 +93,7 @@ struct x86_instruction_info {
>  #define X86EMUL_F_FETCH			BIT(1)
>  #define X86EMUL_F_BRANCH		BIT(2)
>  #define X86EMUL_F_IMPLICIT		BIT(3)
> +#define X86EMUL_F_INVTLB		BIT(4)

Why F_INVTLB instead of X86EMUL_F_INVLPG?  Ah, because LAM is ignored for the
linear address in the INVPCID and INVVPID descriptors.  Hrm.

I think my vote is to call this X86EMUL_F_INVLPG even though *in theory* it's not
strictly limited to INVLPG.  Odds are good KVM's emulator will never support
INVPCID or INVVPID, and IMO even though F_INVLPG would be somewhat of a misnomer,
it's much more intuitive even for INVPCID and INVVPID descriptors.  F_INVTLB makes
me think more of the actual act of invalidating the TLB.

I'm not dead set against INVTLB if someone really likes it, but I did scratch my
head for a second when I saw it.


* Re: [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection
  2023-08-07  7:03   ` Binbin Wu
@ 2023-08-15 23:46     ` Sean Christopherson
  2023-08-17 16:15       ` Zeng Guang
  0 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2023-08-15 23:46 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Zeng Guang, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Mon, Aug 07, 2023, Binbin Wu wrote:
> 
> On 7/19/2023 10:45 AM, Zeng Guang wrote:
> > Implement and wire up vmx_is_lass_violation() in kvm_x86_ops for VMX.
> > 
> > LASS violation check takes effect in KVM emulation of instruction fetch
> > and data access including implicit access when vCPU is running in long
> > mode, and also involved in emulation of VMX instruction and SGX ENCLS
> > instruction to enforce the mode-based protections before paging.
> > 
> > But the target memory address of emulation of TLB invalidation and branch
> > instructions aren't subject to LASS as exceptions.

Same nit about branch instructions.  And I would explicitly say "linear address"
instead of "target memory address", the "target" part makes it a bit ambiguous.

How about this?

Linear addresses used for TLB invalidation (INVLPG, INVPCID, and INVVPID) and
branch targets are not subject to LASS enforcement.

> > 
> > Signed-off-by: Zeng Guang <guang.zeng@intel.com>
> > Tested-by: Xuelian Guo <xuelian.guo@intel.com>
> > ---
> >   arch/x86/kvm/vmx/nested.c |  3 ++-
> >   arch/x86/kvm/vmx/sgx.c    |  4 ++++
> >   arch/x86/kvm/vmx/vmx.c    | 35 +++++++++++++++++++++++++++++++++++
> >   arch/x86/kvm/vmx/vmx.h    |  3 +++
> >   4 files changed, 44 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index e35cf0bd0df9..72e78566a3b6 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -4985,7 +4985,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
> >   		 * non-canonical form. This is the only check on the memory
> >   		 * destination for long mode!
> >   		 */
> > -		exn = is_noncanonical_address(*ret, vcpu);
> > +		exn = is_noncanonical_address(*ret, vcpu) ||
> > +		      vmx_is_lass_violation(vcpu, *ret, len, 0);
> >   	} else {
> >   		/*
> >   		 * When not in long mode, the virtual/linear address is
> > diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
> > index 2261b684a7d4..f8de637ce634 100644
> > --- a/arch/x86/kvm/vmx/sgx.c
> > +++ b/arch/x86/kvm/vmx/sgx.c
> > @@ -46,6 +46,10 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
> >   			((s.base != 0 || s.limit != 0xffffffff) &&
> >   			(((u64)*gva + size - 1) > s.limit + 1));
> >   	}
> > +
> > +	if (!fault)
> > +		fault = vmx_is_lass_violation(vcpu, *gva, size, 0);

At the risk of bleeding details where they don't need to go... LASS is Long Mode
only, so I believe this chunk can be:

	if (!IS_ALIGNED(*gva, alignment)) {
		fault = true;
	} else if (likely(is_64_bit_mode(vcpu))) {
		fault = is_noncanonical_address(*gva, vcpu) ||
			vmx_is_lass_violation(vcpu, *gva, size, 0);
	} else {
		*gva &= 0xffffffff;
		fault = (s.unusable) ||
			(s.type != 2 && s.type != 3) ||
			(*gva > s.limit) ||
			((s.base != 0 || s.limit != 0xffffffff) &&
			(((u64)*gva + size - 1) > s.limit + 1));
	}

which IIRC matches some earlier emulator code.

> > +
> >   	if (fault)
> >   		kvm_inject_gp(vcpu, 0);
> >   	return fault ? -EINVAL : 0;
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 44fb619803b8..15a7c6e7a25d 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -8127,6 +8127,40 @@ static void vmx_vm_destroy(struct kvm *kvm)
> >   	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
> >   }
> > +bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
> > +			   unsigned int size, unsigned int flags)
> > +{
> > +	const bool is_supervisor_address = !!(addr & BIT_ULL(63));
> > +	const bool implicit = !!(flags & X86EMUL_F_IMPLICIT);
> > +	const bool fetch = !!(flags & X86EMUL_F_FETCH);
> > +	const bool is_wraparound_access = size ? (addr + size - 1) < addr : false;

Shouldn't this WARN if size==0?  Ah, the "pre"-fetch fetch to get the max insn
size passes '0'.  Well that's annoying.

Please don't use a local variable to track if an access wraps.  It's used exactly
once, and there's zero reason to use a ternary operator at the return.  E.g. this
is much easier on the eyes:

	if (size && (addr + size - 1) < addr)
		return true;

	return !is_supervisor_address;

Hrm, and typing that out makes me go "huh?"  Ah, it's the "implicit" thing that
turned me around.  Can you rename "implicit" to "implicit_supervisor"?  The
F_IMPLICIT flag is fine, it's just this code:

	if (!implicit && vmx_get_cpl(vcpu) == 3)
		return is_supervisor_address;

where it's easy to miss that "implicit" is "implicit supervisor".

And one more nit, rather than detect wraparound, I think it would be better to
detect that bit 63 isn't set.  Functionally, they're the same, but detecting
wraparound makes it look like wraparound itself is problematic, which isn't
technically true, it's just the only case where an access can possibly straddle
user and kernel address spaces.

And I think we should call out that if LAM is supported, @addr has already been
untagged.  Yeah, it's peeking ahead a bit, but I'd rather have a comment that
is a bit premature than forget to add the appropriate comment in the LAM series.

> > +
> > +	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
> > +		return false;
> > +
> > +	/*
> > +	 * INVTLB isn't subject to LASS, e.g. to allow invalidating userspace
> > +	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
> > +	 * to LASS in order to simplifiy far control transfers (the subsequent
> s/simplifiy/simplifiy
> 
> > +	 * fetch will enforce LASS as appropriate).
> > +	 */
> > +	if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVTLB))
> > +		return false;
> > +
> > +	if (!implicit && vmx_get_cpl(vcpu) == 3)
> > +		return is_supervisor_address;
> > +
> > +	/* LASS is enforced for supervisor-mode access iff SMAP is enabled. */
> To be more accurate, supervisor-mode data access.
> Also, is "iff" here a typo for "if", or does it stand for "if and only
> if"?

The latter.

> It is not accurate to use "if and only if" here because besides SMAP, there
> are other conditions, i.e. implicit access or RFLAGS.AC.

I was trying to avoid a multi-line comment when I suggested the above.  Hmm, and
I think we could/should consolidate the two if-statements.  This?

	/*
	 * LASS enforcement for supervisor-mode data accesses depends on SMAP
	 * being enabled, and like SMAP ignores explicit accesses if RFLAGS.AC=1.
	 */
	if (!fetch) {
		if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
			return false;

		if (!implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
			return false;
	}

> > +	if (!fetch && !kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
> > +		return false;
> > +
> > +	/* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
> > +	if (!fetch && !implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
> > +		return false;

All in all, this?  (wildly untested)

	const bool is_supervisor_address = !!(addr & BIT_ULL(63));
	const bool implicit_supervisor = !!(flags & X86EMUL_F_IMPLICIT);
	const bool fetch = !!(flags & X86EMUL_F_FETCH);

	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
		return false;

	/*
	 * INVTLB isn't subject to LASS, e.g. to allow invalidating userspace
	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
	 * to LASS in order to simplifiy far control transfers (the subsequent
	 * fetch will enforce LASS as appropriate).
	 */
	if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVTLB))
		return false;

	if (!implicit_supervisor && vmx_get_cpl(vcpu) == 3)
		return is_supervisor_address;

	/*
	 * LASS enforcement for supervisor-mode data accesses depends on SMAP
	 * being enabled, and like SMAP ignores explicit accesses if RFLAGS.AC=1.
	 */
	if (!fetch) {
		if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
			return false;

		if (!implicit_supervisor && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
			return false;
	}

	/*
	 * The entire access must be in the appropriate address space.  Note,
	 * if LAM is supported, @addr has already been untagged, so barring a
	 * massive architecture change to expand the canonical address range,
	 * it's impossible for a user access to straddle user and supervisor
	 * address spaces.
	 */
	if (size && !((addr + size - 1) & BIT_ULL(63)))
		return true;

	return !is_supervisor_address;


* Re: [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions
  2023-08-15 22:51   ` Sean Christopherson
@ 2023-08-16  7:34     ` Binbin Wu
  2023-08-16 14:38       ` Sean Christopherson
  0 siblings, 1 reply; 21+ messages in thread
From: Binbin Wu @ 2023-08-16  7:34 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Zeng Guang, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 8/16/2023 6:51 AM, Sean Christopherson wrote:
> Branch *targets*, not branch instructions.
>
> On Wed, Jul 19, 2023, Zeng Guang wrote:
>> From: Binbin Wu <binbin.wu@linux.intel.com>
>>
>> Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
>> assign_eip(), since strictly speaking it is not behavior of instruction
>> fetch.
> Eh, I'd just drop this paragraph, as evidenced by this code existing as-is for
> years, we wouldn't introduce X86EMUL_F_BRANCH just because resolving a branch
> target isn't strictly an instruction fetch.
>
>> Another reason is to distinguish instruction fetch and execution of branch
>> instruction for feature(s) that handle differently on them.
> Similar to the shortlog, it's about computing the branch target, not executing a
> branch instruction.  That distinction matters, e.g. a Jcc that is not taken will
> *not* follow the branch target, but the instruction is still *executed*.  And there
> exist instructions that compute branch targets, but aren't what most people would
> typically consider a branch instruction, e.g. XBEGIN.
>
>> Branch instruction is not data access instruction, so skip checking against
>> execute-only code segment as instruction fetch.
> Rather than call out individual use case, I would simply state that as of this
> patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far as KVM is
> concerned.  That lets the reader know that (a) there's no intended change in
> behavior and (b) that the intent is to effectively split all consumption of
> X86EMUL_F_FETCH into (X86EMUL_F_FETCH | X86EMUL_F_BRANCH).

How about this:

     KVM: x86: Use a new flag for branch targets

     Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
     assign_eip() to distinguish instruction fetch from branch target
     computation for feature(s) that handle them differently.

     As of this patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical
     as far as KVM is concerned.

     No functional change intended.


>> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> ---
>>   arch/x86/kvm/emulate.c     | 5 +++--
>>   arch/x86/kvm/kvm_emulate.h | 1 +
>>   2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 3ddfbc99fa4f..8e706d19ae45 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -721,7 +721,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
>>   		    (flags & X86EMUL_F_WRITE))
>>   			goto bad;
>>   		/* unreadable code segment */
>> -		if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
>> +		if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH))
>> +			&& (desc.type & 8) && !(desc.type & 2))
> Put the && on the first line, and align indentation.
I should have been more careful on the alignment & indentation.
Will update it. Thanks.

>
> 		/* unreadable code segment */
> 		if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH)) &&
> 		    (desc.type & 8) && !(desc.type & 2))
> 			goto bad;



* Re: [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()
  2023-08-15 23:11   ` Sean Christopherson
@ 2023-08-16  7:55     ` Binbin Wu
  2023-08-16 14:27       ` Sean Christopherson
  0 siblings, 1 reply; 21+ messages in thread
From: Binbin Wu @ 2023-08-16  7:55 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Zeng Guang, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 8/16/2023 7:11 AM, Sean Christopherson wrote:
> On Wed, Jul 19, 2023, Zeng Guang wrote:
>> From: Binbin Wu <binbin.wu@linux.intel.com>
>>
>> Add an emulation flag X86EMUL_F_INVTLB, which is used to identify an
>> instruction that does TLB invalidation without true memory access.
>>
>> Only invlpg & invlpga implemented in emulator belong to this kind.
>> invlpga doesn't need additional information for emulation. Just pass
>> the flag to em_invlpg().
> Please add a paragraph explaining *why* this flag is being added.  Ideally, the
> previous patch would also explain the need for an IMPLICIT flag, but that one
> doesn't bug me all that much because implicit accesses are known to be special
> snowflakes, i.e. it's easy to imagine that KVM would need to identify such
> accesses.  But for INVLPG, without already knowing the details of LASS (or LAM),
> it's harder to think of why it needs to exist.
OK, will add the reason for this case and for IMPLICIT as well.
Thanks.


>
>> Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>> ---
>>   arch/x86/kvm/emulate.c     | 4 +++-
>>   arch/x86/kvm/kvm_emulate.h | 1 +
>>   2 files changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 8e706d19ae45..9b4b3ce6d52a 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -3443,8 +3443,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
>>   {
>>   	int rc;
>>   	ulong linear;
>> +	unsigned max_size;
> 	unsigned int
Let me think about why I used 'unsigned'...
It's because the existing code uses 'unsigned'.
I suppose it is considered bad practice?
I will clean up the existing code as well. Is it OK to clean it up
opportunistically inside this patch?


>> -	rc = linearize(ctxt, ctxt->src.addr.mem, 1, false, &linear);
>> +	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
>> +		&linear, X86EMUL_F_INVTLB);
> Align indentation:
Will update it.

>
> 	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
> 			 &linear, X86EMUL_F_INVTLB);
>
>>   	if (rc == X86EMUL_CONTINUE)
>>   		ctxt->ops->invlpg(ctxt, linear);
>>   	/* Disable writeback. */
>> diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
>> index c0e48f4fa7c4..c944055091e1 100644
>> --- a/arch/x86/kvm/kvm_emulate.h
>> +++ b/arch/x86/kvm/kvm_emulate.h
>> @@ -93,6 +93,7 @@ struct x86_instruction_info {
>>   #define X86EMUL_F_FETCH			BIT(1)
>>   #define X86EMUL_F_BRANCH		BIT(2)
>>   #define X86EMUL_F_IMPLICIT		BIT(3)
>> +#define X86EMUL_F_INVTLB		BIT(4)
> Why F_INVTLB instead of X86EMUL_F_INVLPG?  Ah, because LAM is ignored for the
> linear address in the INVPCID and INVVPID descriptors.  Hrm.
>
> I think my vote is to call this X86EMUL_F_INVLPG even though *in theory* it's not
> strictly limited to INVLPG.  Odds are good KVM's emulator will never support
> INVPCID or INVVPID,
One case is that kvm_handle_invpcid() is in the common KVM x86 code.
LAM doesn't apply to the address in the INVPCID descriptor, but I am
not sure whether SVM will need it in the future.
But for now, F_INVLPG is OK if you think F_INVTLB brings confusion.


> and IMO even though F_INVLPG would be somewhat of a misnomer,
> it's much more intuitive even for INVPCID and INVVPID descriptors.  F_INVTLB makes
> me think more of the actual act of invalidating the TLB.
>
> I'm not dead set against INVTLB if someone really likes it, but I did scratch my
> head for a second when I saw it.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg()
  2023-08-16  7:55     ` Binbin Wu
@ 2023-08-16 14:27       ` Sean Christopherson
  0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2023-08-16 14:27 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Zeng Guang, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Wed, Aug 16, 2023, Binbin Wu wrote:
> 
> 
> On 8/16/2023 7:11 AM, Sean Christopherson wrote:
> > On Wed, Jul 19, 2023, Zeng Guang wrote:
> > > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> > > index 8e706d19ae45..9b4b3ce6d52a 100644
> > > --- a/arch/x86/kvm/emulate.c
> > > +++ b/arch/x86/kvm/emulate.c
> > > @@ -3443,8 +3443,10 @@ static int em_invlpg(struct x86_emulate_ctxt *ctxt)
> > >   {
> > >   	int rc;
> > >   	ulong linear;
> > > +	unsigned max_size;
> > 	unsigned int
> Let me think about why I used 'unsigned'...
> It's because the existing code uses 'unsigned'.
> I suppose it is considered bad practice?

Yeah, use "unsigned int" when writing new code.

> I will clean up the existing code as well. Is it OK to clean it up
> opportunistically inside this patch?

No, don't bother cleaning up existing usage.  If a patch touches the "bad" code,
then by all means do an opportunistic cleanup.  But we have too much "legacy" code
in KVM for a wholesale cleanup of bare unsigned usage to be worth the churn and
git blame pollution.  See also:

https://lore.kernel.org/all/ZNvIRS%2FYExLtGO2B@google.com

> > > diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
> > > index c0e48f4fa7c4..c944055091e1 100644
> > > --- a/arch/x86/kvm/kvm_emulate.h
> > > +++ b/arch/x86/kvm/kvm_emulate.h
> > > @@ -93,6 +93,7 @@ struct x86_instruction_info {
> > >   #define X86EMUL_F_FETCH			BIT(1)
> > >   #define X86EMUL_F_BRANCH		BIT(2)
> > >   #define X86EMUL_F_IMPLICIT		BIT(3)
> > > +#define X86EMUL_F_INVTLB		BIT(4)
> > Why F_INVTLB instead of X86EMUL_F_INVLPG?  Ah, because LAM is ignored for the
> > linear address in the INVPCID and INVVPID descriptors.  Hrm.
> > 
> > I think my vote is to call this X86EMUL_F_INVLPG even though *in theory* it's not
> > strictly limited to INVLPG.  Odds are good KVM's emulator will never support
> > INVPCID or INVVPID,
> One case is that kvm_handle_invpcid() is in the common KVM x86 code.
> LAM doesn't apply to the address in the INVPCID descriptor, but I am
> not sure whether SVM will need it in the future.

Right, but the emulator itself doesn't handle INVPCID or INVVPID, so there's no
direct "conflict" at this time.

> But for now, F_INVLPG is OK if you think F_INVTLB brings confusion.

Yeah, please use F_INVLPG unless someone has a strong objection.

^ permalink raw reply	[flat|nested] 21+ messages in thread
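
For reference, folding the rename agreed above (and the indentation fix from
the earlier reply) back into the posted hunks would give roughly the following;
this is a sketch of the expected next version, not code actually posted in this
thread:

#define X86EMUL_F_INVLPG		BIT(4)

	rc = __linearize(ctxt, ctxt->src.addr.mem, &max_size, 1, ctxt->mode,
			 &linear, X86EMUL_F_INVLPG);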

* Re: [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions
  2023-08-16  7:34     ` Binbin Wu
@ 2023-08-16 14:38       ` Sean Christopherson
  2023-08-17  1:38         ` Binbin Wu
  0 siblings, 1 reply; 21+ messages in thread
From: Sean Christopherson @ 2023-08-16 14:38 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Zeng Guang, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Wed, Aug 16, 2023, Binbin Wu wrote:
> 
> 
> On 8/16/2023 6:51 AM, Sean Christopherson wrote:
> > Branch *targets*, not branch instructions.
> > 
> > On Wed, Jul 19, 2023, Zeng Guang wrote:
> > > From: Binbin Wu <binbin.wu@linux.intel.com>
> > > 
> > > Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
> > > assign_eip(), since strictly speaking it is not behavior of instruction
> > > fetch.
> > Eh, I'd just drop this paragraph, as evidenced by this code existing as-is for
> > years, we wouldn't introduce X86EMUL_F_BRANCH just because resolving a branch
> > target isn't strictly an instruction fetch.
> > 
> > > Another reason is to distinguish instruction fetch and execution of branch
> > > instruction for feature(s) that handle differently on them.
> > Similar to the shortlog, it's about computing the branch target, not executing a
> > branch instruction.  That distinction matters, e.g. a Jcc that is not taken will
> > *not* follow the branch target, but the instruction is still *executed*.  And there
> > exist instructions that compute branch targets, but aren't what most people would
> > typically consider a branch instruction, e.g. XBEGIN.
> > 
> > > Branch instruction is not data access instruction, so skip checking against
> > > execute-only code segment as instruction fetch.
> > Rather than call out individual use case, I would simply state that as of this
> > patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far as KVM is
> > concerned.  That lets the reader know that (a) there's no intended change in
> > behavior and (b) that the intent is to effectively split all consumption of
> > X86EMUL_F_FETCH into (X86EMUL_F_FETCH | X86EMUL_F_BRANCH).
> 
> How about this:
> 
>     KVM: x86: Use a new flag for branch targets
> 
>     Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
> assign_eip()
>     to distinguish instruction fetch and branch target computation for
> feature(s)

Just "features", i.e. no parentheses...

>     that handle differently on them.

...and tack on ", e.g. LASS and LAM." at the end.  There's zero reason not to more
explicitly call out why the flag is being added.  Trying to predict the future in
changelogs is generally discouraged, but having understandable changelogs is more
important.

>     As of this patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as
> far as
>     KVM is concerned.
> 
>     No functional change intended.

Heh, you need to fix whatever is forcefully wrapping lines, but other than the
nit above, the content itself is good.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions
  2023-08-16 14:38       ` Sean Christopherson
@ 2023-08-17  1:38         ` Binbin Wu
  2023-08-17 14:45           ` Sean Christopherson
  0 siblings, 1 reply; 21+ messages in thread
From: Binbin Wu @ 2023-08-17  1:38 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Zeng Guang, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel



On 8/16/2023 10:38 PM, Sean Christopherson wrote:
> On Wed, Aug 16, 2023, Binbin Wu wrote:
>>
>> On 8/16/2023 6:51 AM, Sean Christopherson wrote:
>>> Branch *targets*, not branch instructions.
>>>
>>> On Wed, Jul 19, 2023, Zeng Guang wrote:
>>>> From: Binbin Wu <binbin.wu@linux.intel.com>
>>>>
>>>> Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
>>>> assign_eip(), since strictly speaking it is not behavior of instruction
>>>> fetch.
>>> Eh, I'd just drop this paragraph, as evidenced by this code existing as-is for
>>> years, we wouldn't introduce X86EMUL_F_BRANCH just because resolving a branch
>>> target isn't strictly an instruction fetch.
>>>
>>>> Another reason is to distinguish instruction fetch and execution of branch
>>>> instruction for feature(s) that handle differently on them.
>>> Similar to the shortlog, it's about computing the branch target, not executing a
>>> branch instruction.  That distinction matters, e.g. a Jcc that is not taken will
>>> *not* follow the branch target, but the instruction is still *executed*.  And there
>>> exist instructions that compute branch targets, but aren't what most people would
>>> typically consider a branch instruction, e.g. XBEGIN.
>>>
>>>> Branch instruction is not data access instruction, so skip checking against
>>>> execute-only code segment as instruction fetch.
>>> Rather than call out individual use case, I would simply state that as of this
>>> patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far as KVM is
>>> concerned.  That lets the reader know that (a) there's no intended change in
>>> behavior and (b) that the intent is to effectively split all consumption of
>>> X86EMUL_F_FETCH into (X86EMUL_F_FETCH | X86EMUL_F_BRANCH).
>> How about this:
>>
>>      KVM: x86: Use a new flag for branch targets
>>
>>      Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
>> assign_eip()
>>      to distinguish instruction fetch and branch target computation for
>> feature(s)
> Just "features", i.e. no parentheses...
>
>>      that handle differently on them.
> ...and tack on ", e.g. LASS and LAM." at the end.
OK, but only LASS here, since LAM only applies to addresses for data
accesses, i.e., no need to distinguish the two flags.

> There's zero reason not to more
> explicitly call out why the flag is being added.  Trying to predict the future in
> changelogs is generally discouraged, but having understandable changelogs is more
> important.
>
>>      As of this patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as
>> far as
>>      KVM is concerned.
>>
>>      No functional change intended.
> Heh, you need to fix whatever is forcefully wrapping lines, but other than the
> nit above, the content itself is good.
Sure, I think the wrapped lines are due to the additional indentation I added;
it should be OK in the changelog.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions
  2023-08-17  1:38         ` Binbin Wu
@ 2023-08-17 14:45           ` Sean Christopherson
  0 siblings, 0 replies; 21+ messages in thread
From: Sean Christopherson @ 2023-08-17 14:45 UTC (permalink / raw)
  To: Binbin Wu
  Cc: Zeng Guang, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm, x86,
	linux-kernel

On Thu, Aug 17, 2023, Binbin Wu wrote:
> 
> 
> On 8/16/2023 10:38 PM, Sean Christopherson wrote:
> > On Wed, Aug 16, 2023, Binbin Wu wrote:
> > > 
> > > On 8/16/2023 6:51 AM, Sean Christopherson wrote:
> > > > Rather than call out individual use case, I would simply state that as of this
> > > > patch, X86EMUL_F_BRANCH and X86EMUL_F_FETCH are identical as far as KVM is
> > > > concerned.  That lets the reader know that (a) there's no intended change in
> > > > behavior and (b) that the intent is to effectively split all consumption of
> > > > X86EMUL_F_FETCH into (X86EMUL_F_FETCH | X86EMUL_F_BRANCH).
> > > How about this:
> > > 
> > >      KVM: x86: Use a new flag for branch targets
> > > 
> > >      Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
> > > assign_eip()
> > >      to distinguish instruction fetch and branch target computation for
> > > feature(s)
> > Just "features", i.e. no parentheses...
> > 
> > >      that handle differently on them.
> > ...and tack on ", e.g. LASS and LAM." at the end.
> OK, but only LASS here, since LAM only applies to addresses for data
> accesses, i.e., no need to distinguish the two flags.

Oh, right.   Thanks!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection
  2023-08-15 23:46     ` Sean Christopherson
@ 2023-08-17 16:15       ` Zeng Guang
  0 siblings, 0 replies; 21+ messages in thread
From: Zeng Guang @ 2023-08-17 16:15 UTC (permalink / raw)
  To: Sean Christopherson, Binbin Wu
  Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H Peter Anvin, kvm, x86, linux-kernel


On 8/16/2023 7:46 AM, Sean Christopherson wrote:
> On Mon, Aug 07, 2023, Binbin Wu wrote:
>> On 7/19/2023 10:45 AM, Zeng Guang wrote:
>>> Implement and wire up vmx_is_lass_violation() in kvm_x86_ops for VMX.
>>>
>>> LASS violation check takes effect in KVM emulation of instruction fetch
>>> and data access including implicit access when vCPU is running in long
>>> mode, and also involved in emulation of VMX instruction and SGX ENCLS
>>> instruction to enforce the mode-based protections before paging.
>>>
>>> But the target memory address of emulation of TLB invalidation and branch
>>> instructions aren't subject to LASS as exceptions.
> Same nit about branch instructions.  And I would explicitly say "linear address"
> instead of "target memory address", the "target" part makes it a bit ambiguous.
>
> How about this?
>
> Linear addresses used for TLB invalidation (INVLPG, INVPCID, and INVVPID) and
> branch targets are not subject to LASS enforcement.

That's much more precise and I will change it as you suggest.
Thanks.

>>> Signed-off-by: Zeng Guang <guang.zeng@intel.com>
>>> Tested-by: Xuelian Guo <xuelian.guo@intel.com>
>>> ---
>>>    arch/x86/kvm/vmx/nested.c |  3 ++-
>>>    arch/x86/kvm/vmx/sgx.c    |  4 ++++
>>>    arch/x86/kvm/vmx/vmx.c    | 35 +++++++++++++++++++++++++++++++++++
>>>    arch/x86/kvm/vmx/vmx.h    |  3 +++
>>>    4 files changed, 44 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>>> index e35cf0bd0df9..72e78566a3b6 100644
>>> --- a/arch/x86/kvm/vmx/nested.c
>>> +++ b/arch/x86/kvm/vmx/nested.c
>>> @@ -4985,7 +4985,8 @@ int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
>>>    		 * non-canonical form. This is the only check on the memory
>>>    		 * destination for long mode!
>>>    		 */
>>> -		exn = is_noncanonical_address(*ret, vcpu);
>>> +		exn = is_noncanonical_address(*ret, vcpu) ||
>>> +		      vmx_is_lass_violation(vcpu, *ret, len, 0);
>>>    	} else {
>>>    		/*
>>>    		 * When not in long mode, the virtual/linear address is
>>> diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
>>> index 2261b684a7d4..f8de637ce634 100644
>>> --- a/arch/x86/kvm/vmx/sgx.c
>>> +++ b/arch/x86/kvm/vmx/sgx.c
>>> @@ -46,6 +46,10 @@ static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
>>>    			((s.base != 0 || s.limit != 0xffffffff) &&
>>>    			(((u64)*gva + size - 1) > s.limit + 1));
>>>    	}
>>> +
>>> +	if (!fault)
>>> +		fault = vmx_is_lass_violation(vcpu, *gva, size, 0);
> At the risk of bleeding details where they don't need to go... LASS is Long Mode
> only, so I believe this chunk can be:
>
> 	if (!IS_ALIGNED(*gva, alignment)) {
> 		fault = true;
> 	} else if (likely(is_64_bit_mode(vcpu))) {
> 		fault = is_noncanonical_address(*gva, vcpu) ||
> 			vmx_is_lass_violation(vcpu, *gva, size, 0);
> 	} else {
> 		*gva &= 0xffffffff;
> 		fault = (s.unusable) ||
> 			(s.type != 2 && s.type != 3) ||
> 			(*gva > s.limit) ||
> 			((s.base != 0 || s.limit != 0xffffffff) &&
> 			(((u64)*gva + size - 1) > s.limit + 1));
> 	}
>
> which IIRC matches some earlier emulator code.
Just as you mentioned, LASS is long mode only, while the "ENCLS" instruction
can be executed in kernel mode in 64-bit mode. Thus the LASS violation check
only needs to be taken into account for 64-bit mode.
>>> +
>>>    	if (fault)
>>>    		kvm_inject_gp(vcpu, 0);
>>>    	return fault ? -EINVAL : 0;
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index 44fb619803b8..15a7c6e7a25d 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -8127,6 +8127,40 @@ static void vmx_vm_destroy(struct kvm *kvm)
>>>    	free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm));
>>>    }
>>> +bool vmx_is_lass_violation(struct kvm_vcpu *vcpu, unsigned long addr,
>>> +			   unsigned int size, unsigned int flags)
>>> +{
>>> +	const bool is_supervisor_address = !!(addr & BIT_ULL(63));
>>> +	const bool implicit = !!(flags & X86EMUL_F_IMPLICIT);
>>> +	const bool fetch = !!(flags & X86EMUL_F_FETCH);
>>> +	const bool is_wraparound_access = size ? (addr + size - 1) < addr : false;
> Shouldn't this WARN if size==0?  Ah, the "pre"-fetch fetch to get the max insn
> size passes '0'.  Well that's annoying.
>
Right, it passes size as "0" in the instruction pre-fetch implemented in the
emulator. An instruction fetch could be done twice if the instruction straddles
two pages, so we take this situation into account for the wraparound case in
LASS violation detection.
> Please don't use a local variable to track if an access wraps.  It's used exactly
> once, and there's zero reason to use a ternary operator at the return.  E.g. this
> is much easier on the eyes:
>
> 	if (size && (addr + size - 1) < addr)
> 		return true;
>
> 	return !is_supervisor_address;
>
> Hrm, and typing that out makes me go "huh?"  Ah, it's the "implicit" thing that
> turned me around.  Can you rename "implicit" to "implicit_supervisor"?  The
> F_IMPLICIT flag is fine, it's just this code:
>
> 	if (!implicit && vmx_get_cpl(vcpu) == 3)
> 		return is_supervisor_address;
>
> where it's easy to miss that "implicit" is "implicit supervisor".
"implicit" does mean implicit supervisor-mode access regardless of CPL.
Using "implicit_supervisor" should be better.
Thanks.

> And one more nit, rather than detect wraparound, I think it would be better to
> detect that bit 63 isn't set.  Functionally, they're the same, but detecting
> wraparound makes it look like wraparound itself is problematic, which isn't
> technically true, it's just the only case where an access can possibly straddle
> user and kernel address spaces.
>
> And I think we should call out that if LAM is supported, @addr has already been
> untagged.  Yeah, it's peeking ahead a bit, but I'd rather have a comment that
> is a bit premature than forget to add the appropriate comment in the LAM series.
>
>>> +
>>> +	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
>>> +		return false;
>>> +
>>> +	/*
>>> +	 * INVTLB isn't subject to LASS, e.g. to allow invalidating userspace
>>> +	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
>>> +	 * to LASS in order to simplifiy far control transfers (the subsequent
>> s/simplifiy/simplify
>>
>>> +	 * fetch will enforce LASS as appropriate).
>>> +	 */
>>> +	if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVTLB))
>>> +		return false;
>>> +
>>> +	if (!implicit && vmx_get_cpl(vcpu) == 3)
>>> +		return is_supervisor_address;
>>> +
>>> +	/* LASS is enforced for supervisor-mode access iff SMAP is enabled. */
>> To be more accurate, supervisor-mode data access.
>> Also, "iff" here is not is a typo for "if" or it stands for "if and only
>> if"?
> The latter.
>
>> It is not accurate to use "if and only if" here because besides SMAP, there
>> are other conditions, i.e. implicit or RFLAGS.AC.
> I was trying to avoid a multi-line comment when I suggested the above.  Hmm, and
> I think we could/should consolidate the two if-statements.  This?
>
> 	/*
> 	 * LASS enforcement for supervisor-mode data accesses depends on SMAP
> 	 * being enabled, and like SMAP ignores explicit accesses if RFLAGS.AC=1.
> 	 */
> 	if (!fetch) {
> 		if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
> 			return false;
>
> 		if (!implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
> 			return false;
> 	}
>
>>> +	if (!fetch && !kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
>>> +		return false;
>>> +
>>> +	/* Like SMAP, RFLAGS.AC disables LASS checks in supervisor mode. */
>>> +	if (!fetch && !implicit && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
>>> +		return false;
> All in all, this?  (wildly untested)
>
> 	const bool is_supervisor_address = !!(addr & BIT_ULL(63));
> 	const bool implicit_supervisor = !!(flags & X86EMUL_F_IMPLICIT);
> 	const bool fetch = !!(flags & X86EMUL_F_FETCH);
>
> 	if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_LASS) || !is_long_mode(vcpu))
> 		return false;
>
> 	/*
> 	 * INVTLB isn't subject to LASS, e.g. to allow invalidating userspace
> 	 * addresses without toggling RFLAGS.AC.  Branch targets aren't subject
> 	 * to LASS in order to simplify far control transfers (the subsequent
> 	 * fetch will enforce LASS as appropriate).
> 	 */
> 	if (flags & (X86EMUL_F_BRANCH | X86EMUL_F_INVTLB))
> 		return false;
>
> 	if (!implicit_supervisor && vmx_get_cpl(vcpu) == 3)
> 		return is_supervisor_address;
>
> 	/*
> 	 * LASS enforcement for supervisor-mode data accesses depends on SMAP
> 	 * being enabled, and like SMAP ignores explicit accesses if RFLAGS.AC=1.
> 	 */
> 	if (!fetch) {
> 		if (!kvm_is_cr4_bit_set(vcpu, X86_CR4_SMAP))
> 			return false;
>
> 		if (!implicit_supervisor && (kvm_get_rflags(vcpu) & X86_EFLAGS_AC))
> 			return false;
> 	}
>
> 	/*
> 	 * The entire access must be in the appropriate address space.  Note,
> 	 * if LAM is supported, @addr has already been untagged, so barring a
> 	 * massive architecture change to expand the canonical address range,
> 	 * it's impossible for a user access to straddle user and supervisor
> 	 * address spaces.
> 	 */
> 	if (size && !((addr + size - 1) & BIT_ULL(63)))
> 		return true;
>
> 	return !is_supervisor_address;

OK. I'll adopt everything you suggested for the next version.
Thanks.


^ permalink raw reply	[flat|nested] 21+ messages in thread
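
To make the bit-63 form of the straddle check above concrete, here is a small
stand-alone illustration in plain C (assuming 64-bit unsigned arithmetic; this
is not KVM code, only the arithmetic from the suggested check): an access that
starts at the very top of the supervisor half and wraps has its last byte in
the user half, so bit 63 of (addr + size - 1) is clear and the access is
flagged.

#include <assert.h>
#include <stdint.h>

#define BIT_ULL(n) (1ULL << (n))

static int straddles_into_user_half(uint64_t addr, unsigned int size)
{
	return size && !((addr + size - 1) & BIT_ULL(63));
}

int main(void)
{
	/* Wraps past 2^64: the last byte is linear address 0, a user address. */
	assert(straddles_into_user_half(0xffffffffffffffffULL, 2));
	/* Stays entirely in the supervisor half: not flagged. */
	assert(!straddles_into_user_half(0xffff800000000000ULL, 8));
	/* size == 0 is the emulator's "max insn size" probe: not flagged here. */
	assert(!straddles_into_user_half(0xffffffffffffffffULL, 0));
	return 0;
}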

* [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions
  2023-07-18 13:18 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
@ 2023-07-18 13:18 ` Zeng Guang
  0 siblings, 0 replies; 21+ messages in thread
From: Zeng Guang @ 2023-07-18 13:18 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H Peter Anvin, kvm
  Cc: x86, linux-kernel, Binbin Wu, Zeng Guang

From: Binbin Wu <binbin.wu@linux.intel.com>

Use the new flag X86EMUL_F_BRANCH instead of X86EMUL_F_FETCH in
assign_eip(), since strictly speaking it is not behavior of instruction
fetch.

Another reason is to distinguish instruction fetch and execution of branch
instruction for feature(s) that handle differently on them.

Branch instruction is not data access instruction, so skip checking against
execute-only code segment as instruction fetch.

Signed-off-by: Binbin Wu <binbin.wu@linux.intel.com>
Signed-off-by: Zeng Guang <guang.zeng@intel.com>
---
 arch/x86/kvm/emulate.c     | 5 +++--
 arch/x86/kvm/kvm_emulate.h | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 3ddfbc99fa4f..8e706d19ae45 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -721,7 +721,8 @@ static __always_inline int __linearize(struct x86_emulate_ctxt *ctxt,
 		    (flags & X86EMUL_F_WRITE))
 			goto bad;
 		/* unreadable code segment */
-		if (!(flags & X86EMUL_F_FETCH) && (desc.type & 8) && !(desc.type & 2))
+		if (!(flags & (X86EMUL_F_FETCH | X86EMUL_F_BRANCH))
+			&& (desc.type & 8) && !(desc.type & 2))
 			goto bad;
 		lim = desc_limit_scaled(&desc);
 		if (!(desc.type & 8) && (desc.type & 4)) {
@@ -772,7 +773,7 @@ static inline int assign_eip(struct x86_emulate_ctxt *ctxt, ulong dst)
 	if (ctxt->op_bytes != sizeof(unsigned long))
 		addr.ea = dst & ((1UL << (ctxt->op_bytes << 3)) - 1);
 	rc = __linearize(ctxt, addr, &max_size, 1, ctxt->mode, &linear,
-			 X86EMUL_F_FETCH);
+			 X86EMUL_F_BRANCH);
 	if (rc == X86EMUL_CONTINUE)
 		ctxt->_eip = addr.ea;
 	return rc;
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index 86bbe997162d..9fc7d34a4ac1 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -91,6 +91,7 @@ struct x86_instruction_info {
 /* x86-specific emulation flags */
 #define X86EMUL_F_WRITE			BIT(0)
 #define X86EMUL_F_FETCH			BIT(1)
+#define X86EMUL_F_BRANCH		BIT(2)
 
 struct x86_emulate_ops {
 	void (*vm_bugged)(struct x86_emulate_ctxt *ctxt);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-08-17 16:15 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-19  2:45 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
2023-07-19  2:45 ` [PATCH v2 1/8] KVM: x86: Consolidate flags for __linearize() Zeng Guang
2023-07-19  2:45 ` [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions Zeng Guang
2023-08-15 22:51   ` Sean Christopherson
2023-08-16  7:34     ` Binbin Wu
2023-08-16 14:38       ` Sean Christopherson
2023-08-17  1:38         ` Binbin Wu
2023-08-17 14:45           ` Sean Christopherson
2023-07-19  2:45 ` [PATCH v2 3/8] KVM: x86: Add an emulation flag for implicit system access Zeng Guang
2023-07-19  2:45 ` [PATCH v2 4/8] KVM: x86: Add X86EMUL_F_INVTLB and pass it in em_invlpg() Zeng Guang
2023-08-15 23:11   ` Sean Christopherson
2023-08-16  7:55     ` Binbin Wu
2023-08-16 14:27       ` Sean Christopherson
2023-07-19  2:45 ` [PATCH v2 5/8] KVM: emulator: Add emulation of LASS violation checks on linear address Zeng Guang
2023-07-19  2:45 ` [PATCH v2 6/8] KVM: VMX: Implement and apply vmx_is_lass_violation() for LASS protection Zeng Guang
2023-08-07  7:03   ` Binbin Wu
2023-08-15 23:46     ` Sean Christopherson
2023-08-17 16:15       ` Zeng Guang
2023-07-19  2:45 ` [PATCH v2 7/8] KVM: x86: Virtualize CR4.LASS Zeng Guang
2023-07-19  2:45 ` [PATCH v2 8/8] KVM: x86: Advertise LASS CPUID to user space Zeng Guang
  -- strict thread matches above, loose matches on Subject: below --
2023-07-18 13:18 [PATCH v2 0/8] LASS KVM virtualization support Zeng Guang
2023-07-18 13:18 ` [PATCH v2 2/8] KVM: x86: Use a new flag for branch instructions Zeng Guang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).