All of lore.kernel.org
* [PATCH 0/7]  Enable LASS (Linear Address space Separation)
@ 2023-01-10  5:51 Yian Chen
  2023-01-10  5:51 ` [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits Yian Chen
                   ` (8 more replies)
  0 siblings, 9 replies; 40+ messages in thread
From: Yian Chen @ 2023-01-10  5:51 UTC (permalink / raw)
  To: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai, Yian Chen

LASS (Linear Address Space Separation) is a security
extension that prevents speculative address accesses across
the user/kernel boundary. The LASS details are published in
Chapter 11 of
https://cdrdv2.intel.com/v1/dl/getContent/671368

LASS works in 64-bit mode only and partitions the 64-bit
virtual address space into two halves:
    1. Lower half (LA[63]=0) --> user space
    2. Upper half (LA[63]=1) --> kernel space
When LASS is enabled, a general protection #GP(0) fault is
generated if software accesses an address in the opposite
half from the one in which it resides, e.g., from user space
to the upper half, or from kernel space to the lower half.
This protection applies to data accesses, code execution,
and cache line flushing instructions.
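The partitioning rule above can be sketched as a small model. To be
clear, this is only an illustration with hypothetical helper names;
the real check is performed in hardware on every access, not in
software:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the LASS check: bit 63 of the linear address
 * selects the half, and an access faults (#GP(0)) when the mode
 * of the accessor does not match the half being accessed.
 */
static bool lass_would_fault(unsigned long long linear_addr, bool user_mode)
{
	bool upper_half = (linear_addr >> 63) & 1;

	/*
	 * User mode may only touch the lower half (LA[63]=0);
	 * kernel mode may only touch the upper half (LA[63]=1).
	 */
	return user_mode ? upper_half : !upper_half;
}
```

For example, a user-mode read of the vsyscall page at
0xffffffffff600000 lands in the upper half and would fault under
this model, which is why legacy vsyscall conflicts with LASS.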

Almost all kernel accesses are to the upper half of the virtual
address space. However, there are valid reasons for the kernel
to access the lower half. For these cases, the kernel can
temporarily suspend the enforcement of LASS by disabling SMAP
(Supervisor Mode Access Prevention).

Kernel code that copies data to/from user addresses already
disables SMAP using the stac()/clac() functions. New functions
low_addr_access_begin()/low_addr_access_end() are added to
also disable/enable SMAP around other code that legitimately
needs to access the lower half of the virtual address space.
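As a rough illustration of the intended usage pattern, here is a toy
model, not the kernel implementation: the real helpers are
alternatives that execute the STAC/CLAC instructions to toggle
EFLAGS.AC only when the LASS feature is present, and the flag below
merely stands in for that CPU state:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for EFLAGS.AC in this model. */
static bool eflags_ac;

static void low_addr_access_begin(void) { eflags_ac = true;  }
static void low_addr_access_end(void)   { eflags_ac = false; }

static char demo_buf[1];	/* stands in for a lower half mapping */

/*
 * A kernel path that must touch a lower half address brackets the
 * access with begin/end, keeping the suspension window as small as
 * possible. Returns the final AC state (0: enforcement restored).
 */
static int poke_lower_half(char *dst, char val)
{
	low_addr_access_begin();
	*dst = val;		/* would #GP(0) without AC set */
	low_addr_access_end();
	return eflags_ac;
}
```

The design point mirrors stac()/clac() for SMAP: the suspension is
scoped tightly around the one access that needs it.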

User space cannot use any kernel address while LASS is
enabled. Unfortunately, the legacy vsyscall functions used
by old versions of glibc are located in the address range
0xffffffffff600000-0xffffffffff601000 and emulated by the
kernel. Therefore, to comply with the LASS policy, legacy
vsyscall is disabled by default. I am looking for input from
Andy and others on whether this approach is acceptable.

This patch set enforces LASS by default when the platform
supports it. The enforcement can be disabled via the command
line parameter "clearcpuid" or by setting "vsyscall=emulate/xonly".

As of now there is no publicly available CPU supporting LASS.
The first one to support it will be the Sierra Forest line. The
Intel Simics® Simulator was used as the software development
and testing vehicle for this patch set.

Paul Lai (1):
  x86/kvm: Expose LASS feature to VM guest

Yian Chen (6):
  x86/cpu: Enumerate LASS CPUID and CR4 bits
  x86: Add CONFIG option X86_LASS
  x86/cpu: Disable kernel LASS when patching kernel alternatives
  x86/vsyscall: Setup vsyscall to compromise LASS protection
  x86/cpu: Enable LASS (Linear Address Space Separation)
  x86/cpu: Set LASS as pinning sensitive CR4 bit

 .../admin-guide/kernel-parameters.txt         | 12 +++++++----
 arch/x86/Kconfig                              | 10 +++++++++
 arch/x86/entry/vsyscall/vsyscall_64.c         | 14 +++++++++++++
 arch/x86/include/asm/cpufeatures.h            |  1 +
 arch/x86/include/asm/disabled-features.h      |  8 ++++++-
 arch/x86/include/asm/kvm_host.h               |  3 ++-
 arch/x86/include/asm/smap.h                   | 13 ++++++++++++
 arch/x86/include/uapi/asm/processor-flags.h   |  2 ++
 arch/x86/kernel/Makefile                      |  2 ++
 arch/x86/kernel/alternative.c                 | 21 +++++++++++++++++--
 arch/x86/kernel/cpu/common.c                  | 20 +++++++++++++++++-
 arch/x86/kvm/cpuid.c                          |  2 +-
 tools/arch/x86/include/asm/cpufeatures.h      |  1 +
 .../arch/x86/include/asm/disabled-features.h  |  8 ++++++-
 tools/objtool/arch/x86/special.c              |  2 ++
 15 files changed, 108 insertions(+), 11 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
@ 2023-01-10  5:51 ` Yian Chen
  2023-01-10 20:14   ` Sohil Mehta
  2023-01-10  5:51 ` [PATCH 2/7] x86: Add CONFIG option X86_LASS Yian Chen
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 40+ messages in thread
From: Yian Chen @ 2023-01-10  5:51 UTC (permalink / raw)
  To: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai, Yian Chen

LASS (Linear Address Space Separation) is a CPU feature that
prevents speculative address accesses across user/kernel mode.

LASS partitions the 64-bit virtual address space into two
halves, lower addresses (LA[63]=0) and upper addresses
(LA[63]=1). It stops any data access or code execution
    1. from the upper half address space to any lower half address
    2. from the lower half address space to any upper half address
and generates a #GP fault for a violation.

In Linux, this means LASS allows neither kernel code to
access any user space address nor user code to access any
kernel space address.
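For reference, word 12 bit 6 in the enumeration above corresponds to
CPUID.(EAX=7,ECX=1):EAX[6], and can be probed from user space with a
sketch like the following (illustrative only; the macro names are
made up here, and the result depends on the CPU it runs on):

```c
#include <cpuid.h>

#define LASS_CPUID_LEAF		0x7
#define LASS_CPUID_SUBLEAF	0x1
#define LASS_EAX_BIT		6	/* matches X86_FEATURE_LASS = 12*32+6 */

/* Returns 1 if the CPU enumerates LASS, 0 otherwise. */
static int cpu_has_lass(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(LASS_CPUID_LEAF, LASS_CPUID_SUBLEAF,
			       &eax, &ebx, &ecx, &edx))
		return 0;
	return (eax >> LASS_EAX_BIT) & 1;
}
```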

Signed-off-by: Yian Chen <yian.chen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/cpufeatures.h          | 1 +
 arch/x86/include/uapi/asm/processor-flags.h | 2 ++
 tools/arch/x86/include/asm/cpufeatures.h    | 1 +
 3 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 61012476d66e..03b375db026b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -311,6 +311,7 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16		(12*32+ 5) /* AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LASS		(12*32+ 6) /* Linear address space separation */
 #define X86_FEATURE_CMPCCXADD           (12*32+ 7) /* "" CMPccXADD instructions */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* "" AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* "" Support for VPMADD52[H,L]UQ */
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index c47cc7f2feeb..fd84ea8240fc 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -132,6 +132,8 @@
 #define X86_CR4_PKE		_BITUL(X86_CR4_PKE_BIT)
 #define X86_CR4_CET_BIT		23 /* enable Control-flow Enforcement Technology */
 #define X86_CR4_CET		_BITUL(X86_CR4_CET_BIT)
+#define X86_CR4_LASS_BIT	27 /* enable LASS support */
+#define X86_CR4_LASS		_BITUL(X86_CR4_LASS_BIT)
 
 /*
  * x86-64 Task Priority Register, CR8
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 61012476d66e..03b375db026b 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -311,6 +311,7 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16		(12*32+ 5) /* AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LASS		(12*32+ 6) /* Linear address space separation */
 #define X86_FEATURE_CMPCCXADD           (12*32+ 7) /* "" CMPccXADD instructions */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* "" AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* "" Support for VPMADD52[H,L]UQ */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 2/7] x86: Add CONFIG option X86_LASS
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
  2023-01-10  5:51 ` [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits Yian Chen
@ 2023-01-10  5:51 ` Yian Chen
  2023-01-10 21:05   ` Sohil Mehta
  2023-01-10  5:52 ` [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives Yian Chen
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 40+ messages in thread
From: Yian Chen @ 2023-01-10  5:51 UTC (permalink / raw)
  To: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai, Yian Chen

LASS is an Intel x86-64 only feature. Add the CONFIG
option X86_LASS and the flag DISABLE_LASS to allow opting
the feature in or out of the kernel binary.

CONFIG_X86_LASS is enabled by default because it
is a security feature which should have little
to no overhead or side effects. If any issues are
found with specific use cases, the CONFIG option
makes it easy to disable.

Signed-off-by: Yian Chen <yian.chen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/Kconfig                               | 10 ++++++++++
 arch/x86/include/asm/disabled-features.h       |  8 +++++++-
 tools/arch/x86/include/asm/disabled-features.h |  8 +++++++-
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3604074a878b..38b1497afd75 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1826,6 +1826,16 @@ config ARCH_USES_PG_UNCACHED
 	def_bool y
 	depends on X86_PAT
 
+config X86_LASS
+	def_bool y
+	prompt "Linear Address Space Separation"
+	depends on X86_64 && CPU_SUP_INTEL
+	help
+	  Linear Address Space Separation (LASS) is a processor
+	  feature that mitigates address space layout probes.
+
+	  if unsure, say Y.
+
 config X86_UMIP
 	def_bool y
 	prompt "User Mode Instruction Prevention" if EXPERT
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index c44b56f7ffba..0cad37d59e0f 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -16,6 +16,12 @@
 # define DISABLE_UMIP	(1<<(X86_FEATURE_UMIP & 31))
 #endif
 
+#ifdef CONFIG_X86_LASS
+# define DISABLE_LASS	0
+#else
+# define DISABLE_LASS	(1<<(X86_FEATURE_LASS & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME		(1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR	(1<<(X86_FEATURE_K6_MTRR & 31))
@@ -115,7 +121,7 @@
 #define DISABLED_MASK10	0
 #define DISABLED_MASK11	(DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
 			 DISABLE_CALL_DEPTH_TRACKING)
-#define DISABLED_MASK12	0
+#define DISABLED_MASK12	(DISABLE_LASS)
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index c44b56f7ffba..0cad37d59e0f 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -16,6 +16,12 @@
 # define DISABLE_UMIP	(1<<(X86_FEATURE_UMIP & 31))
 #endif
 
+#ifdef CONFIG_X86_LASS
+# define DISABLE_LASS	0
+#else
+# define DISABLE_LASS	(1<<(X86_FEATURE_LASS & 31))
+#endif
+
 #ifdef CONFIG_X86_64
 # define DISABLE_VME		(1<<(X86_FEATURE_VME & 31))
 # define DISABLE_K6_MTRR	(1<<(X86_FEATURE_K6_MTRR & 31))
@@ -115,7 +121,7 @@
 #define DISABLED_MASK10	0
 #define DISABLED_MASK11	(DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
 			 DISABLE_CALL_DEPTH_TRACKING)
-#define DISABLED_MASK12	0
+#define DISABLED_MASK12	(DISABLE_LASS)
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
  2023-01-10  5:51 ` [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits Yian Chen
  2023-01-10  5:51 ` [PATCH 2/7] x86: Add CONFIG option X86_LASS Yian Chen
@ 2023-01-10  5:52 ` Yian Chen
  2023-01-10 21:04   ` Peter Zijlstra
  2023-01-10 22:41   ` Sohil Mehta
  2023-01-10  5:52 ` [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection Yian Chen
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Yian Chen @ 2023-01-10  5:52 UTC (permalink / raw)
  To: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai, Yian Chen

Most of the kernel is mapped at virtual addresses
in the upper half of the address range. But the kernel
deliberately initializes a temporary mm area
within the lower half of the address range
for text poking, see commit 4fc19708b165
("x86/alternatives: Initialize temporary mm
for patching").

LASS blocks kernel accesses to lower half addresses, but
the enforcement is suspended while the AC bit in the EFLAGS
register is set. Hence, use the stac and clac instructions
around accesses to such addresses to avoid triggering a
LASS #GP fault.

Kernel objtool validation warns if the binary calls a
non-whitelisted function from inside a stac/clac region,
or references any function with a dynamic function pointer
inside the region; see section 9 of the document
tools/objtool/Documentation/objtool.txt.

For these reasons, and considering that text poking sizes
are usually small, make simple modifications to
text_poke_memcpy() and text_poke_memset() to avoid
non-whitelisted function calls inside the stac/clac
guard.

GCC may detect these open-coded loops and replace them
with calls to its built-in functions. The replacement
would break the objtool validation criteria. Hence, add
the compiler option -fno-builtin for the file.

Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Yian Chen <yian.chen@intel.com>
---
 arch/x86/include/asm/smap.h      | 13 +++++++++++++
 arch/x86/kernel/Makefile         |  2 ++
 arch/x86/kernel/alternative.c    | 21 +++++++++++++++++++--
 tools/objtool/arch/x86/special.c |  2 ++
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index bab490379c65..6f7ac0839b10 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -39,6 +39,19 @@ static __always_inline void stac(void)
 	alternative("", __ASM_STAC, X86_FEATURE_SMAP);
 }
 
+/* Deactivate/activate LASS via AC bit in EFLAGS register */
+static __always_inline void low_addr_access_begin(void)
+{
+	/* Note: a barrier is implicit in alternative() */
+	alternative("", __ASM_STAC, X86_FEATURE_LASS);
+}
+
+static __always_inline void low_addr_access_end(void)
+{
+	/* Note: a barrier is implicit in alternative() */
+	alternative("", __ASM_CLAC, X86_FEATURE_LASS);
+}
+
 static __always_inline unsigned long smap_save(void)
 {
 	unsigned long flags;
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 96d51bbc2bd4..f8a455fc56a2 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -7,6 +7,8 @@ extra-y	+= vmlinux.lds
 
 CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
 
+CFLAGS_alternative.o	+= -fno-builtin
+
 ifdef CONFIG_FUNCTION_TRACER
 # Do not profile debug and lowlevel utilities
 CFLAGS_REMOVE_tsc.o = -pg
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7d8c3cbde368..4de8b54fb5f2 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1530,14 +1530,31 @@ __ro_after_init unsigned long poking_addr;
 
 static void text_poke_memcpy(void *dst, const void *src, size_t len)
 {
-	memcpy(dst, src, len);
+	const char *s = src;
+	char *d = dst;
+
+	/* The parameter dst ends up referencing to the global variable
+	 * poking_addr, which is mapped to the low half address space.
+	 * In kernel, accessing the low half address range is prevented
+	 * by LASS. So relax LASS prevention while accessing the memory
+	 * range.
+	 */
+	low_addr_access_begin();
+	while (len-- > 0)
+		*d++ = *s++;
+	low_addr_access_end();
 }
 
 static void text_poke_memset(void *dst, const void *src, size_t len)
 {
 	int c = *(const int *)src;
+	char *d = dst;
 
-	memset(dst, c, len);
+	/* The same comment as it is in function text_poke_memcpy */
+	low_addr_access_begin();
+	while (len-- > 0)
+		*d++ = c;
+	low_addr_access_end();
 }
 
 typedef void text_poke_f(void *dst, const void *src, size_t len);
diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c
index 7c97b7391279..3a34ebe3966a 100644
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -6,11 +6,13 @@
 
 #define X86_FEATURE_POPCNT (4 * 32 + 23)
 #define X86_FEATURE_SMAP   (9 * 32 + 20)
+#define X86_FEATURE_LASS   (12 * 32 + 6)
 
 void arch_handle_alternative(unsigned short feature, struct special_alt *alt)
 {
 	switch (feature) {
 	case X86_FEATURE_SMAP:
+	case X86_FEATURE_LASS:
 		/*
 		 * If UACCESS validation is enabled; force that alternative;
 		 * otherwise force it the other way.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
                   ` (2 preceding siblings ...)
  2023-01-10  5:52 ` [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives Yian Chen
@ 2023-01-10  5:52 ` Yian Chen
  2023-01-11  0:34   ` Sohil Mehta
  2023-01-21  4:09   ` Andy Lutomirski
  2023-01-10  5:52 ` [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation) Yian Chen
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Yian Chen @ 2023-01-10  5:52 UTC (permalink / raw)
  To: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai, Yian Chen

The kernel enables LASS automatically at boot time on LASS
capable platforms. Any access to kernel addresses
or upper half addresses from user space then triggers a
#GP fault.

Legacy vsyscall does not comply with LASS, because
the vsyscall functions are mapped in the range
0xffffffffff600000-0xffffffffff601000.

In theory, it would be possible to write a #GP fault handler
to emulate the old vsyscall behavior, but vsyscall has been
deprecated for some time, so this has not been done.

Therefore, when the kernel enforces LASS, vsyscall does not
work and should be disabled. On the other hand, the user can
relax the enforcement by clearing the LASS CPUID bit
(clearcpuid=lass/390) or enabling vsyscall (vsyscall=xxx) on
the kernel command line. The user can also opt out of LASS
in the kernel config when building the kernel binary.

Signed-off-by: Yian Chen <yian.chen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 12 ++++++++----
 arch/x86/entry/vsyscall/vsyscall_64.c           | 14 ++++++++++++++
 2 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6cfa6e3996cf..3988e0c8c175 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6755,10 +6755,14 @@
 			versions of glibc use these calls.  Because these
 			functions are at fixed addresses, they make nice
 			targets for exploits that can control RIP.
-
-			emulate     [default] Vsyscalls turn into traps and are
-			            emulated reasonably safely.  The vsyscall
-				    page is readable.
+			In newer versions of Intel platforms that come with
+			LASS(Linear Address Space separation) protection,
+			vsyscall is disabled by default. Enabling vsyscall
+			via the parameter overrides LASS protection.
+
+			emulate     [default if not LASS capable] Vsyscalls
+				    turn into traps and are emulated reasonably
+				    safely.  The vsyscall page is readable.
 
 			xonly       Vsyscalls turn into traps and are
 			            emulated reasonably safely.  The vsyscall
diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
index 4af81df133ee..2691f26835d1 100644
--- a/arch/x86/entry/vsyscall/vsyscall_64.c
+++ b/arch/x86/entry/vsyscall/vsyscall_64.c
@@ -63,6 +63,12 @@ static int __init vsyscall_setup(char *str)
 		else
 			return -EINVAL;
 
+		if (cpu_feature_enabled(X86_FEATURE_LASS) &&
+		    vsyscall_mode != NONE) {
+			setup_clear_cpu_cap(X86_FEATURE_LASS);
+			pr_warn("LASS disabled by command line enabling vsyscall\n");
+		}
+
 		return 0;
 	}
 
@@ -379,6 +385,14 @@ void __init map_vsyscall(void)
 	extern char __vsyscall_page;
 	unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
 
+	/*
+	 * When LASS is on, vsyscall triggers a #GP fault,
+	 * so that force vsyscall_mode to NONE.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
+		vsyscall_mode = NONE;
+		return;
+	}
 	/*
 	 * For full emulation, the page needs to exist for real.  In
 	 * execute-only mode, there is no PTE at all backing the vsyscall
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation)
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
                   ` (3 preceding siblings ...)
  2023-01-10  5:52 ` [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection Yian Chen
@ 2023-01-10  5:52 ` Yian Chen
  2023-01-11 22:22   ` Sohil Mehta
  2023-01-12 18:17   ` Dave Hansen
  2023-01-10  5:52 ` [PATCH 6/7] x86/cpu: Set LASS as pinning sensitive CR4 bit Yian Chen
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Yian Chen @ 2023-01-10  5:52 UTC (permalink / raw)
  To: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai, Yian Chen

LASS is enabled by setting a CR4 bit if the platform
supports the feature.

LASS may be disabled early in boot, for example, by the
command line parameter clearcpuid=lass/390 or the
vsyscall flag. In such cases, the CPU feature and CR4
bits will be cleared.

Signed-off-by: Yian Chen <yian.chen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/common.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 9cfca3d7d0e2..efc7c7623968 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -412,6 +412,23 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 	cr4_clear_bits(X86_CR4_UMIP);
 }
 
+static __always_inline void setup_lass(struct cpuinfo_x86 *c)
+{
+	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
+		cr4_set_bits(X86_CR4_LASS);
+	} else {
+		/*
+		 * only clear the feature and cr4 bits when hardware
+		 * supports LASS, in case it was enabled in a previous
+		 * boot (e.g., via kexec)
+		 */
+		if (cpu_has(c, X86_FEATURE_LASS)) {
+			cr4_clear_bits(X86_CR4_LASS);
+			clear_cpu_cap(c, X86_FEATURE_LASS);
+		}
+	}
+}
+
 /* These bits should not change their value after CPU init is finished. */
 static const unsigned long cr4_pinned_mask =
 	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
@@ -1848,6 +1865,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	setup_smep(c);
 	setup_smap(c);
 	setup_umip(c);
+	setup_lass(c);
 
 	/* Enable FSGSBASE instructions if available. */
 	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 6/7] x86/cpu: Set LASS as pinning sensitive CR4 bit
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
                   ` (4 preceding siblings ...)
  2023-01-10  5:52 ` [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation) Yian Chen
@ 2023-01-10  5:52 ` Yian Chen
  2023-01-10  5:52 ` [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest Yian Chen
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 40+ messages in thread
From: Yian Chen @ 2023-01-10  5:52 UTC (permalink / raw)
  To: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai, Yian Chen

Security protection features are pinning sensitive.
LASS is such a security feature. Therefore, add it
to the set of pinning sensitive bits.

Signed-off-by: Yian Chen <yian.chen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index efc7c7623968..e224cbaf7866 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -432,7 +432,7 @@ static __always_inline void setup_lass(struct cpuinfo_x86 *c)
 /* These bits should not change their value after CPU init is finished. */
 static const unsigned long cr4_pinned_mask =
 	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
-	X86_CR4_FSGSBASE | X86_CR4_CET;
+	X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_LASS;
 static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning);
 static unsigned long cr4_pinned_bits __ro_after_init;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
                   ` (5 preceding siblings ...)
  2023-01-10  5:52 ` [PATCH 6/7] x86/cpu: Set LASS as pinning sensitive CR4 bit Yian Chen
@ 2023-01-10  5:52 ` Yian Chen
  2023-02-07  3:21   ` Wang, Lei
  2023-01-10 19:48 ` [PATCH 0/7] Enable LASS (Linear Address space Separation) Sohil Mehta
  2023-01-10 22:57 ` Dave Hansen
  8 siblings, 1 reply; 40+ messages in thread
From: Yian Chen @ 2023-01-10  5:52 UTC (permalink / raw)
  To: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai, Yian Chen

From: Paul Lai <paul.c.lai@intel.com>

Expose the LASS feature, which is enumerated in
CPUID.7.1.EAX and enabled via a CR4 bit, to VM guests.

Signed-off-by: Paul Lai <paul.c.lai@intel.com>
Signed-off-by: Yian Chen <yian.chen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 3 ++-
 arch/x86/kvm/cpuid.c            | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f35f1ff4427b..bd39f45e9b5a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,7 +125,8 @@
 			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
 			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
-			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+			  | X86_CR4_LASS))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b14653b61470..e0f53f85f5ae 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
 
 	kvm_cpu_cap_mask(CPUID_7_1_EAX,
 		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
-		F(AVX_IFMA)
+		F(AVX_IFMA) | F(LASS)
 	);
 
 	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 0/7] Enable LASS (Linear Address space Separation)
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
                   ` (6 preceding siblings ...)
  2023-01-10  5:52 ` [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest Yian Chen
@ 2023-01-10 19:48 ` Sohil Mehta
  2023-01-10 22:57 ` Dave Hansen
  8 siblings, 0 replies; 40+ messages in thread
From: Sohil Mehta @ 2023-01-10 19:48 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai, kvm, linux-doc, linux-mm

Yian,

I added a few missing lists pertaining to KVM, MM, and
documentation since these patches impact those areas.

In the future, scripts/get_maintainer.pl can help you generate
the relevant maintainers and lists.

On 1/9/2023 9:51 PM, Yian Chen wrote:
> LASS (Linear Address Space Separation) is a security
> extension that prevents speculative address accesses across
> user/kernel mode. The LASS details have been published in
> Chapter 11 in
> https://cdrdv2.intel.com/v1/dl/getContent/671368
> 
> LASS works in 64-bit mode only and partitions the 64-bit
> virtual address space into two halves:
>      1. Lower half (LA[63]=0) --> user space
>      2. Upper half (LA[63]=1) --> kernel space
> When LASS is enabled, a general protection #GP(0) fault will
> be generated if software accesses the address from the half in
> which it resides to another half, e.g., either from user space
> to upper half, or from kernel space to lower half. This
> protection applies to data access, code execution, cache line
> flushing instructions.
> 
> Almost all kernel accesses are to the upper half of the virtual
> address space. However, there are valid reasons for kernel to
> access the lower half. For these cases,  kernel can temporarily
> suspend the enforcement of LASS by disabling SMAP (Supervisor
> Mode Access Prevention).
> 
> Kernel access to copy data to/from user addresses already
> disables SMAP using the stac()/clac() functions. New functions
> low_addr_access_begin()/low_addr_access_end() are added to
> also disable/enable SMAP around other code that legitimately
> needs to access the lower half of the virtual address space.
> 
> User space cannot use any kernel address while LASS is
> enabled. Less fortunately, legacy vsyscall functions used
> by old version of glibc are located in the address range
> 0xffffffffff600000-0xffffffffff601000 and emulated in kernel.
> Therefore, to comply with LASS policy, the legacy vsyscall is
> disabled by default. I am looking for input from Andy and
> others if this approach is acceptable.
> 
> This patch set by default enforces LASS when the platform
> supports it. It can be disabled via the command line parameter
> "clearcpuid" or by setting "vsyscall=emulate/xonly".
> 
> As of now there is no publicly available CPU supporting LASS.
> The first one to support LASS is Sierra Forest line. The Intel
> Simics® Simulator was used as software development and testing
> vehicle for this patch set.
> 
> Paul Lai (1):
>    x86/kvm: Expose LASS feature to VM guest
> 
> Yian Chen (6):
>    x86/cpu: Enumerate LASS CPUID and CR4 bits
>    x86: Add CONFIG option X86_LASS
>    x86/cpu: Disable kernel LASS when patching kernel alternatives
>    x86/vsyscall: Setup vsyscall to compromise LASS protection
>    x86/cpu: Enable LASS (Linear Address Space Separation)
>    x86/cpu: Set LASS as pinning sensitive CR4 bit
> 

It's usually good practice to include a base-commit to make it easier to 
apply these patches.

>   .../admin-guide/kernel-parameters.txt         | 12 +++++++----
>   arch/x86/Kconfig                              | 10 +++++++++
>   arch/x86/entry/vsyscall/vsyscall_64.c         | 14 +++++++++++++
>   arch/x86/include/asm/cpufeatures.h            |  1 +
>   arch/x86/include/asm/disabled-features.h      |  8 ++++++-
>   arch/x86/include/asm/kvm_host.h               |  3 ++-
>   arch/x86/include/asm/smap.h                   | 13 ++++++++++++
>   arch/x86/include/uapi/asm/processor-flags.h   |  2 ++
>   arch/x86/kernel/Makefile                      |  2 ++
>   arch/x86/kernel/alternative.c                 | 21 +++++++++++++++++--
>   arch/x86/kernel/cpu/common.c                  | 20 +++++++++++++++++-
>   arch/x86/kvm/cpuid.c                          |  2 +-
>   tools/arch/x86/include/asm/cpufeatures.h      |  1 +
>   .../arch/x86/include/asm/disabled-features.h  |  8 ++++++-
>   tools/objtool/arch/x86/special.c              |  2 ++
>   15 files changed, 108 insertions(+), 11 deletions(-)
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits
  2023-01-10  5:51 ` [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits Yian Chen
@ 2023-01-10 20:14   ` Sohil Mehta
  2023-01-11  0:13     ` Dave Hansen
  2023-01-11 19:21     ` Chen, Yian
  0 siblings, 2 replies; 40+ messages in thread
From: Sohil Mehta @ 2023-01-10 20:14 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai

On 1/9/2023 9:51 PM, Yian Chen wrote:
> LASS (Linear Address Space Separation) is a CPU feature to
> prevent speculative address access in user/kernel mode.
> 

Would it be better to say?

LASS (Linear Address Space Separation) is a security feature that 
intends to prevent unintentional speculative address access across 
user/kernel mode.


> LASS partitions 64-bit virtual address space into two
> halves, lower address (LA[63]=0) and upper address
> (LA[63]=1). It stops any data access or code execution
>      1. from upper half address space to any lower half address
>      2. from lower half address space to any upper half address
> and generates #GP fault for a violation.
> 

I am not sure if this is the best way to say it. The kernel already 
partitions the address space this way. LASS takes what is already the 
typical OS implementation and bakes it into the hardware architecture.

> In Linux, this means LASS does not allow both kernel code
> to access any user space address and user code to access
> any kernel space address.
> 

There is clearly an overlap between the protections provided by paging 
and with SMAP and SMEP. It would be useful to paraphrase some of the 
information mentioned in the spec regarding how LASS differs from them.

"With these mode-based protections, paging can prevent malicious 
software from directly reading or writing memory inappropriately. To 
enforce these protections, the processor must traverse the hierarchy of 
paging structures in memory. Unprivileged software can use timing 
information resulting from this traversal to determine details about the 
paging structures, and these details may be used to determine the layout 
of supervisor memory.

Linear-address space separation (LASS) is an independent mechanism that 
enforces the same mode-based protections as paging but without 
traversing the paging structures. Because the protections enforced by 
LASS are applied before paging, “probes” by malicious software will 
provide no paging-based timing information."

> Signed-off-by: Yian Chen <yian.chen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>



* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-10  5:52 ` [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives Yian Chen
@ 2023-01-10 21:04   ` Peter Zijlstra
  2023-01-11  1:01     ` Chen, Yian
  2023-01-10 22:41   ` Sohil Mehta
  1 sibling, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2023-01-10 21:04 UTC (permalink / raw)
  To: Yian Chen
  Cc: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai

On Mon, Jan 09, 2023 at 09:52:00PM -0800, Yian Chen wrote:
> Most of the kernel is mapped at virtual addresses
> in the upper half of the address range. But kernel
> deliberately initialized a temporary mm area
> within the lower half of the address range
> for text poking, see commit 4fc19708b165
> ("x86/alternatives: Initialize temporary mm
> for patching").
> 
> LASS stops access to a lower half address in kernel,
> and this can be deactivated if AC bit in EFLAGS
> register is set. Hence use stac and clac instructions
> around access to the address to avoid triggering a
> LASS #GP fault.
> 
> Kernel objtool validation warns if the binary calls
> to a non-whitelisted function that exists outside of
> the stac/clac guard, or references any function with a
> dynamic function pointer inside the guard; see section
> 9 in the document tools/objtool/Documentation/objtool.txt.
> 
> For these reasons, also considering text poking size is
> usually small, simple modifications have been done
> in function text_poke_memcpy() and text_poke_memset() to
> avoid non-whitelisted function calls inside the stac/clac
> guard.
> 
> Gcc may detect and replace the target with its built-in
> functions. However, the replacement would break the
> objtool validation criteria. Hence, add compiler option
> -fno-builtin for the file.

Please reflow to 72 characters consistently, this is silly.

> Co-developed-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Yian Chen <yian.chen@intel.com>
> ---
>  arch/x86/include/asm/smap.h      | 13 +++++++++++++
>  arch/x86/kernel/Makefile         |  2 ++
>  arch/x86/kernel/alternative.c    | 21 +++++++++++++++++++--
>  tools/objtool/arch/x86/special.c |  2 ++
>  4 files changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
> index bab490379c65..6f7ac0839b10 100644
> --- a/arch/x86/include/asm/smap.h
> +++ b/arch/x86/include/asm/smap.h
> @@ -39,6 +39,19 @@ static __always_inline void stac(void)
>  	alternative("", __ASM_STAC, X86_FEATURE_SMAP);
>  }
>  
> +/* Deactivate/activate LASS via AC bit in EFLAGS register */
> +static __always_inline void low_addr_access_begin(void)
> +{
> +	/* Note: a barrier is implicit in alternative() */
> +	alternative("", __ASM_STAC, X86_FEATURE_LASS);
> +}
> +
> +static __always_inline void low_addr_access_end(void)
> +{
> +	/* Note: a barrier is implicit in alternative() */
> +	alternative("", __ASM_CLAC, X86_FEATURE_LASS);
> +}

Can't say I like the name. Also if you look at bit 63 as a sign bit,
it's actively wrong since -1 is lower than 0.

> +
>  static __always_inline unsigned long smap_save(void)
>  {
>  	unsigned long flags;
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index 96d51bbc2bd4..f8a455fc56a2 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -7,6 +7,8 @@ extra-y	+= vmlinux.lds
>  
>  CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
>  
> +CFLAGS_alternative.o	+= -fno-builtin
> +
>  ifdef CONFIG_FUNCTION_TRACER
>  # Do not profile debug and lowlevel utilities
>  CFLAGS_REMOVE_tsc.o = -pg
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 7d8c3cbde368..4de8b54fb5f2 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1530,14 +1530,31 @@ __ro_after_init unsigned long poking_addr;
>  
>  static void text_poke_memcpy(void *dst, const void *src, size_t len)
>  {
> -	memcpy(dst, src, len);
> +	const char *s = src;
> +	char *d = dst;
> +
> +	/* The parameter dst ends up referencing to the global variable
> +	 * poking_addr, which is mapped to the low half address space.
> +	 * In kernel, accessing the low half address range is prevented
> +	 * by LASS. So relax LASS prevention while accessing the memory
> +	 * range.
> +	 */
> +	low_addr_access_begin();
> +	while (len-- > 0)
> +		*d++ = *s++;
> +	low_addr_access_end();
>  }
>  
>  static void text_poke_memset(void *dst, const void *src, size_t len)
>  {
>  	int c = *(const int *)src;
> +	char *d = dst;
>  
> -	memset(dst, c, len);
> +	/* The same comment as it is in function text_poke_memcpy */
> +	low_addr_access_begin();
> +	while (len-- > 0)
> +		*d++ = c;
> +	low_addr_access_end();
>  }

This is horrific tinkering :/

Also, what about the EFI mm? IIRC EFI also lives in the user address
space.


* Re: [PATCH 2/7] x86: Add CONFIG option X86_LASS
  2023-01-10  5:51 ` [PATCH 2/7] x86: Add CONFIG option X86_LASS Yian Chen
@ 2023-01-10 21:05   ` Sohil Mehta
  2023-01-12  0:13     ` Chen, Yian
  0 siblings, 1 reply; 40+ messages in thread
From: Sohil Mehta @ 2023-01-10 21:05 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai

The subject can be better stated as:

x86/Kconfig: Add config option for LASS

On 1/9/2023 9:51 PM, Yian Chen wrote:
> LASS is an Intel x86-64 only feature. 

> Add CONFIG
> option X86_LASS and flag DISABLE_LASS to choose
> opt-in/out the feature from kernel binary.

The second sentence is unnecessary.

> CONFIG_X86_LASS is enabled by default because it
> is a security feature which should have little
> to no overhead or side effects. 

Doesn't it have a side effect that it modifies default vsyscall behavior?

I am guessing the impact of the vsyscall change would be minimal. 
However, should LASS be disabled by default at least initially to 
minimize the impact on userspace?

A follow-up patch can then enable this by default once the overall 
impact is clearly known.

> If any issues are
> found with specific use cases, the CONFIG option
> makes it easy to disable.

This sentence is unnecessary.

>   
> +config X86_LASS
> +	def_bool y
> +	prompt "Linear Address Space Separation"
> +	depends on X86_64 && CPU_SUP_INTEL
> +	help
> +	  Linear Address Space Separation (LASS) is a processor
> +	  feature that mitigates address space layout probes.
> +

Let's try to be consistent about what LASS is expected to do. This 
definition is very different from the one in patch 1/7.

Also, we should include some information here on how enabling the LASS 
config option can impact vsyscall behavior.

> +	  if unsure, say Y.
> +


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-10  5:52 ` [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives Yian Chen
  2023-01-10 21:04   ` Peter Zijlstra
@ 2023-01-10 22:41   ` Sohil Mehta
  2023-01-12  0:27     ` Chen, Yian
  1 sibling, 1 reply; 40+ messages in thread
From: Sohil Mehta @ 2023-01-10 22:41 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai

On 1/9/2023 9:52 PM, Yian Chen wrote:

> LASS stops access to a lower half address in kernel,
> and this can be deactivated if AC bit in EFLAGS
> register is set. Hence use stac and clac instructions
> around access to the address to avoid triggering a
> LASS #GP fault.
> 

It seems we are implicitly relying on the stac() and clac() calls that 
are added for SMAP. Have you tried running with SMAP disabled, i.e., 
"clearcpuid=smap"?

I believe there needs to be a dependency between LASS and SMAP.

diff --git a/arch/x86/kernel/cpu/cpuid-deps.c 
b/arch/x86/kernel/cpu/cpuid-deps.c
index d95221117129..00bc7e4a65d2 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -79,6 +79,7 @@ static const struct cpuid_dep cpuid_deps[] = {
  	{ X86_FEATURE_XFD,			X86_FEATURE_XSAVES    },
  	{ X86_FEATURE_XFD,			X86_FEATURE_XGETBV1   },
  	{ X86_FEATURE_AMX_TILE,			X86_FEATURE_XFD       },
+	{ X86_FEATURE_LASS,			X86_FEATURE_SMAP      },
  	{}
  };


* Re: [PATCH 0/7] Enable LASS (Linear Address space Separation)
  2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
                   ` (7 preceding siblings ...)
  2023-01-10 19:48 ` [PATCH 0/7] Enable LASS (Linear Address space Separation) Sohil Mehta
@ 2023-01-10 22:57 ` Dave Hansen
  8 siblings, 0 replies; 40+ messages in thread
From: Dave Hansen @ 2023-01-10 22:57 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Tony Luck, Sohil Mehta, Paul Lai

On 1/9/23 21:51, Yian Chen wrote:
> LASS (Linear Address Space Separation) is a security
> extension that prevents speculative address accesses across 
> user/kernel mode. The LASS details have been published in
> Chapter 11 in 
> https://cdrdv2.intel.com/v1/dl/getContent/671368
> 
> LASS works in 64-bit mode only and partitions the 64-bit
> virtual address space into two halves:
>     1. Lower half (LA[63]=0) --> user space
>     2. Upper half (LA[63]=1) --> kernel space
> When LASS is enabled, a general protection #GP(0) fault will
> be generated if software accesses the address from the half in
> which it resides to another half, e.g., either from user space
> to upper half, or from kernel space to lower half. This
> protection applies to data access, code execution, cache line
> flushing instructions.

This does a good job of explaining the nuts and bolts -- *what* LASS
does.  It does a less good job of explaining why this was built, how it
can benefit end users and who cares about it.

LASS seemed really cool when we were reeling from Meltdown.  It would
*obviously* have been a godsend five years ago.  But, it's less clear
what role it plays today and how important it is.

Could you enlighten us, please?


* Re: [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits
  2023-01-10 20:14   ` Sohil Mehta
@ 2023-01-11  0:13     ` Dave Hansen
  2023-01-11 23:23       ` Chen, Yian
  2023-01-11 19:21     ` Chen, Yian
  1 sibling, 1 reply; 40+ messages in thread
From: Dave Hansen @ 2023-01-11  0:13 UTC (permalink / raw)
  To: Sohil Mehta, Yian Chen, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai

On 1/10/23 12:14, Sohil Mehta wrote:
> On 1/9/2023 9:51 PM, Yian Chen wrote:
>> LASS (Linear Address Space Separation) is a CPU feature to
>> prevent speculative address access in user/kernel mode.
> 
> Would it be better to say?
> 
> LASS (Linear Address Space Separation) is a security feature that
> intends to prevent unintentional speculative address access across
> user/kernel mode.

It's more than that, though.  The spec actually says this pretty nicely:

> Linear-address space separation (LASS) is an independent mechanism
> that enforces the same mode-based protections as paging but without
> traversing the paging structures. Because the protections enforced by
> LASS are applied before paging, “probes” by malicious software will
> provide no paging-based timing information

So, it's not _just_ that it can prevent some speculative accesses.  It
completely short-circuits paging itself and *ALL* of the baggage that
goes along with paging.

The TLB, mid-level caches, the page walker itself, the data cache
impact...  all of it.  Gone.

*THAT* is the important part here, IMNHO.


* Re: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection
  2023-01-10  5:52 ` [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection Yian Chen
@ 2023-01-11  0:34   ` Sohil Mehta
  2023-01-12  1:43     ` Chen, Yian
  2023-01-21  4:09   ` Andy Lutomirski
  1 sibling, 1 reply; 40+ messages in thread
From: Sohil Mehta @ 2023-01-11  0:34 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai

On 1/9/2023 9:52 PM, Yian Chen wrote:

> The user can also opt-out LASS in config file to build kernel
> binary.

This line is unnecessary.

> Signed-off-by: Yian Chen <yian.chen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>   Documentation/admin-guide/kernel-parameters.txt | 12 ++++++++----
>   arch/x86/entry/vsyscall/vsyscall_64.c           | 14 ++++++++++++++
>   2 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 6cfa6e3996cf..3988e0c8c175 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6755,10 +6755,14 @@
>   			versions of glibc use these calls.  Because these
>   			functions are at fixed addresses, they make nice
>   			targets for exploits that can control RIP.
> -
> -			emulate     [default] Vsyscalls turn into traps and are
> -			            emulated reasonably safely.  The vsyscall
> -				    page is readable.

The existing documentation here is incorrect. The default vsyscall mode 
is actually xonly. This has been so since:
commit 625b7b7f79c6 (x86/vsyscall: Change the default vsyscall mode to 
xonly)

> +			In newer versions of Intel platforms that come with

Words such as "newer" in the kernel start losing meaning very quickly. 
Also, this comment looks out of place in between the vsyscall sub-options.

> +			LASS(Linear Address Space separation) protection,
> +			vsyscall is disabled by default. Enabling vsyscall
> +			via the parameter overrides LASS protection.
> +


IIUC, you are making the default mode somewhat dynamic.

vsyscall = xonly (if LASS is not enabled)
vsyscall = none (if LASS is enabled)

The decision to disable vsyscall doesn't happen at compile time but it 
is taken at runtime when the LASS feature is detected. This would make 
the system behavior highly platform dependent.

Instead of doing this dance, can we provide a simplified behavior to the 
user/admin and move the decision making to compile time?

Option 1: A bigger hammer
Set the default vsyscall option as CONFIG_LEGACY_VSYSCALL_NONE. 
CONFIG_X86_LASS is set by default. Changing the compile time VSYSCALL 
option would disable LASS.

Option 2: A smaller hammer
CONFIG_X86_LASS is off by default. Vsyscall default stays as 
CONFIG_LEGACY_VSYSCALL_XONLY. Selecting LASS automatically chooses 
CONFIG_LEGACY_VSYSCALL_NONE. In this case, even if the hardware doesn't 
support LASS, vsyscall would still remain disabled.

In both of these cases, any command line input to override the vsyscall 
behavior can disable LASS as well.
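
As a sketch, Option 2 might be spelled like this in Kconfig (hedged:
since a `choice` entry cannot be `select`ed, the dependency is written
the other way around, so LASS is only offered once vsyscall emulation is
compiled out):

```kconfig
config X86_LASS
	bool "Linear Address Space Separation"
	# off by default; only offered when vsyscall emulation is already
	# compiled out, so enabling LASS cannot silently change vsyscall
	# behavior at runtime
	depends on X86_64 && CPU_SUP_INTEL && LEGACY_VSYSCALL_NONE
```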


> +			emulate     [default if not LASS capable] Vsyscalls
> +				    turn into traps and are emulated reasonably
> +				    safely.  The vsyscall page is readable.
>   
>   			xonly       Vsyscalls turn into traps and are
>   			            emulated reasonably safely.  The vsyscall
> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
> index 4af81df133ee..2691f26835d1 100644
> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> @@ -63,6 +63,12 @@ static int __init vsyscall_setup(char *str)
>   		else
>   			return -EINVAL;
>   
> +		if (cpu_feature_enabled(X86_FEATURE_LASS) &&
> +		    vsyscall_mode != NONE) {
> +			setup_clear_cpu_cap(X86_FEATURE_LASS);
> +			pr_warn("LASS disabled by command line enabling vsyscall\n");

A warning seems too drastic here. A pr_info() should probably suffice.

> +		}
> +
>   		return 0;
>   	}
>   
> @@ -379,6 +385,14 @@ void __init map_vsyscall(void)
>   	extern char __vsyscall_page;
>   	unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
>   
> +	/*
> +	 * When LASS is on, vsyscall triggers a #GP fault,
> +	 * so that force vsyscall_mode to NONE.
> +	 */

This comment doesn't make much sense nor does it provide the full 
picture. Some of the reasoning from the cover letter/commit log can be 
duplicated here.

> +	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> +		vsyscall_mode = NONE;
> +		return;
> +	}
>   	/*
>   	 * For full emulation, the page needs to exist for real.  In
>   	 * execute-only mode, there is no PTE at all backing the vsyscall



* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-10 21:04   ` Peter Zijlstra
@ 2023-01-11  1:01     ` Chen, Yian
  2023-01-11  9:10       ` Peter Zijlstra
  0 siblings, 1 reply; 40+ messages in thread
From: Chen, Yian @ 2023-01-11  1:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai



On 1/10/2023 1:04 PM, Peter Zijlstra wrote:
> On Mon, Jan 09, 2023 at 09:52:00PM -0800, Yian Chen wrote:
>> Most of the kernel is mapped at virtual addresses
>> in the upper half of the address range. But kernel
>> deliberately initialized a temporary mm area
>> within the lower half of the address range
>> for text poking, see commit 4fc19708b165
>> ("x86/alternatives: Initialize temporary mm
>> for patching").
>>
>> LASS stops access to a lower half address in kernel,
>> and this can be deactivated if AC bit in EFLAGS
>> register is set. Hence use stac and clac instructions
>> around access to the address to avoid triggering a
>> LASS #GP fault.
>>
>> Kernel objtool validation warns if the binary calls
>> to a non-whitelisted function that exists outside of
>> the stac/clac guard, or references any function with a
>> dynamic function pointer inside the guard; see section
>> 9 in the document tools/objtool/Documentation/objtool.txt.
>>
>> For these reasons, also considering text poking size is
>> usually small, simple modifications have been done
>> in function text_poke_memcpy() and text_poke_memset() to
>> avoid non-whitelisted function calls inside the stac/clac
>> guard.
>>
>> Gcc may detect and replace the target with its built-in
>> functions. However, the replacement would break the
>> objtool validation criteria. Hence, add compiler option
>> -fno-builtin for the file.
> 
> Please reflow to 72 characters consistently, this is silly.
> 
Sure. I will reflow the commit message per the guideline.

>> Co-developed-by: Tony Luck <tony.luck@intel.com>
>> Signed-off-by: Tony Luck <tony.luck@intel.com>
>> Signed-off-by: Yian Chen <yian.chen@intel.com>
>> ---
>>   arch/x86/include/asm/smap.h      | 13 +++++++++++++
>>   arch/x86/kernel/Makefile         |  2 ++
>>   arch/x86/kernel/alternative.c    | 21 +++++++++++++++++++--
>>   tools/objtool/arch/x86/special.c |  2 ++
>>   4 files changed, 36 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
>> index bab490379c65..6f7ac0839b10 100644
>> --- a/arch/x86/include/asm/smap.h
>> +++ b/arch/x86/include/asm/smap.h
>> @@ -39,6 +39,19 @@ static __always_inline void stac(void)
>>   	alternative("", __ASM_STAC, X86_FEATURE_SMAP);
>>   }
>>   
>> +/* Deactivate/activate LASS via AC bit in EFLAGS register */
>> +static __always_inline void low_addr_access_begin(void)
>> +{
>> +	/* Note: a barrier is implicit in alternative() */
>> +	alternative("", __ASM_STAC, X86_FEATURE_LASS);
>> +}
>> +
>> +static __always_inline void low_addr_access_end(void)
>> +{
>> +	/* Note: a barrier is implicit in alternative() */
>> +	alternative("", __ASM_CLAC, X86_FEATURE_LASS);
>> +}
> 
> Can't say I like the name. 
Indeed, there are alternative ways to name the functions, for example
enable_kernel_lass()/disable_kernel_lass(), or simply keeping
stac()/clac() unchanged.

I chose this name because it is straightforward about its purpose and 
helps in understanding when to use the functions.

> Also if you look at bit 63 as a sign bit,
> it's actively wrong since -1 is lower than 0.

This could be a trade-off choice. While considering address manipulation 
and calculation, an address is likely treated as unsigned. I would be 
happy to get input for better naming.

>> +
>>   static __always_inline unsigned long smap_save(void)
>>   {
>>   	unsigned long flags;
>> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
>> index 96d51bbc2bd4..f8a455fc56a2 100644
>> --- a/arch/x86/kernel/Makefile
>> +++ b/arch/x86/kernel/Makefile
>> @@ -7,6 +7,8 @@ extra-y	+= vmlinux.lds
>>   
>>   CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
>>   
>> +CFLAGS_alternative.o	+= -fno-builtin
>> +
>>   ifdef CONFIG_FUNCTION_TRACER
>>   # Do not profile debug and lowlevel utilities
>>   CFLAGS_REMOVE_tsc.o = -pg
>> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
>> index 7d8c3cbde368..4de8b54fb5f2 100644
>> --- a/arch/x86/kernel/alternative.c
>> +++ b/arch/x86/kernel/alternative.c
>> @@ -1530,14 +1530,31 @@ __ro_after_init unsigned long poking_addr;
>>   
>>   static void text_poke_memcpy(void *dst, const void *src, size_t len)
>>   {
>> -	memcpy(dst, src, len);
>> +	const char *s = src;
>> +	char *d = dst;
>> +
>> +	/* The parameter dst ends up referencing to the global variable
>> +	 * poking_addr, which is mapped to the low half address space.
>> +	 * In kernel, accessing the low half address range is prevented
>> +	 * by LASS. So relax LASS prevention while accessing the memory
>> +	 * range.
>> +	 */
>> +	low_addr_access_begin();
>> +	while (len-- > 0)
>> +		*d++ = *s++;
>> +	low_addr_access_end();
>>   }
>>   
>>   static void text_poke_memset(void *dst, const void *src, size_t len)
>>   {
>>   	int c = *(const int *)src;
>> +	char *d = dst;
>>   
>> -	memset(dst, c, len);
>> +	/* The same comment as it is in function text_poke_memcpy */
>> +	low_addr_access_begin();
>> +	while (len-- > 0)
>> +		*d++ = c;
>> +	low_addr_access_end();
>>   }
> 
> This is horrific tinkering :/
>
It seems difficult to find a perfect solution here, since a function 
call or a function pointer inside the stac/clac guard will trigger an 
objtool warning (the reasons are stated in the commit message).

To avoid the warning, I considered this might be okay since the poked 
text is usually only a few bytes.

> Also, what about the EFI mm? IIRC EFI also lives in the user address
> space.

I didn't encounter any EFI mm related problems while testing the 
implementation. I will follow up after investigating the EFI mm further.

Thanks,
Yian


* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-11  1:01     ` Chen, Yian
@ 2023-01-11  9:10       ` Peter Zijlstra
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Zijlstra @ 2023-01-11  9:10 UTC (permalink / raw)
  To: Chen, Yian
  Cc: linux-kernel, x86, Andy Lutomirski, Dave Hansen, Ravi Shankar,
	Tony Luck, Sohil Mehta, Paul Lai

On Tue, Jan 10, 2023 at 05:01:59PM -0800, Chen, Yian wrote:

> > > +/* Deactivate/activate LASS via AC bit in EFLAGS register */
> > > +static __always_inline void low_addr_access_begin(void)
> > > +{
> > > +	/* Note: a barrier is implicit in alternative() */
> > > +	alternative("", __ASM_STAC, X86_FEATURE_LASS);
> > > +}
> > > +
> > > +static __always_inline void low_addr_access_end(void)
> > > +{
> > > +	/* Note: a barrier is implicit in alternative() */
> > > +	alternative("", __ASM_CLAC, X86_FEATURE_LASS);
> > > +}
> > 
> > Can't say I like the name.
> Indeed, there are alternative ways to name the functions. for example,
> enable_kernel_lass()/disable_kernel_lass(), or simply keep no change to use
> stac()/clac().
> 
> I choose this name because it is straight forward to the purpose and helps
> in understanding when to use the functions.

Given we normally refer to the kernel address space as negative, it is
somewhat confusing.

  lass_access_{begin,end}()

or something might be better names.

> > Also if you look at bit 63 as a sign bit,
> > it's actively wrong since -1 is lower than 0.
> This could be a trade-off choice. While considering address manipulation
> and calculation, it is likely an unsigned. I would be happy to get input for
> better naming.

Note that Documentation/x86/x86_64/mm.rst likes to refer to the kernel
range as negative.

Also things like making a canonical address use sign-extension.

> > This is horrific tinkering :/
> > 
> This part seems difficult to have a perfect solution since function call or
> function pointer inside the guard of instruction stac and clac will trigger
> objtool warning (stated the reasons in the commit msg)

Yeah, I'm familiar with that objtool warning, I wrote that particular check :-)

Still, this is a horrific solution. Adding something like
__inline_mem{set,cpy}() is a much saner option.

Something a little like the completely untested below.

---
diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index 888731ccf1f6..f43fc2d9b182 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -23,6 +23,16 @@ extern void *memcpy(void *to, const void *from, size_t len);
 #endif
 extern void *__memcpy(void *to, const void *from, size_t len);
 
+static __always_inline void *__inline_memcpy(void *to, const void *from, size_t len)
+{
+	void *ret = to;
+
+	asm volatile ("rep movsb"
+		      : "+D" (to), "+S" (from), "+c" (len)
+		      : : "memory");
+	return ret;
+}
+
 #define __HAVE_ARCH_MEMSET
 #if defined(__SANITIZE_MEMORY__) && defined(__NO_FORTIFY)
 extern void *__msan_memset(void *s, int c, size_t n);
@@ -33,6 +43,17 @@ void *memset(void *s, int c, size_t n);
 #endif
 void *__memset(void *s, int c, size_t n);
 
+static __always_inline void *__inline_memset(void *s, int v, size_t n)
+{
+	void *ret = s;
+
+	asm volatile("rep stosb"
+		     : "+D" (s), "+c" (n)
+		     : "a" ((uint8_t)v)
+		     : "memory");
+	return ret;
+}
+
 #define __HAVE_ARCH_MEMSET16
 static inline void *memset16(uint16_t *s, uint16_t v, size_t n)
 {


* Re: [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits
  2023-01-10 20:14   ` Sohil Mehta
  2023-01-11  0:13     ` Dave Hansen
@ 2023-01-11 19:21     ` Chen, Yian
  1 sibling, 0 replies; 40+ messages in thread
From: Chen, Yian @ 2023-01-11 19:21 UTC (permalink / raw)
  To: Sohil Mehta, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai



On 1/10/2023 12:14 PM, Sohil Mehta wrote:
> On 1/9/2023 9:51 PM, Yian Chen wrote:
>> LASS (Linear Address Space Separation) is a CPU feature to
>> prevent speculative address access in user/kernel mode.
>>
> 
> Would it be better to say?
> 
> LASS (Linear Address Space Separation) is a security feature that 
> intends to prevent unintentional speculative address access across 
> user/kernel mode.
> 
>
Sure, I will revise the statement to be more precise.

>> LASS partitions 64-bit virtual address space into two
>> halves, lower address (LA[63]=0) and upper address
>> (LA[63]=1). It stops any data access or code execution
>>      1. from upper half address space to any lower half address
>>      2. from lower half address space to any upper half address
>> and generates #GP fault for a violation.
>>
> 
> I am not sure if this is the best way to say it. The kernel already 
> partitions the address space this way. LASS takes what is already the 
> typical OS implementation and bakes it into the hardware architecture.
> 
Yes, LASS by design matches the addressing usage in the OS. I will try 
to include this in the statement.
>> In Linux, this means LASS does not allow both kernel code
>> to access any user space address and user code to access
>> any kernel space address.
>>
> 
> There is clearly an overlap between the protections provided by paging 
> and with SMAP and SMEP. It would be useful to paraphrase some of the 
> information mentioned in the spec regarding how LASS differs from them.
> 
Yes, I will differentiate between LASS and SMAP more clearly.

> "With these mode-based protections, paging can prevent malicious 
> software from directly reading or writing memory inappropriately. To 
> enforce these protections, the processor must traverse the hierarchy of 
> paging structures in memory. Unprivileged software can use timing 
> information resulting from this traversal to determine details about the 
> paging structures, and these details may be used to determine the layout 
> of supervisor memory.
> 
> Linear-address space separation (LASS) is an independent mechanism that 
> enforces the same mode-based protections as paging but without 
> traversing the paging structures. Because the protections enforced by 
> LASS are applied before paging, “probes” by malicious software will 
> provide no paging-based timing information."
> 
Yes, I will also state the advantage of LASS.

>> Signed-off-by: Yian Chen <yian.chen@intel.com>
>> Reviewed-by: Tony Luck <tony.luck@intel.com>
> 

Thanks,
Yian


* Re: [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation)
  2023-01-10  5:52 ` [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation) Yian Chen
@ 2023-01-11 22:22   ` Sohil Mehta
  2023-01-12 17:56     ` Chen, Yian
  2023-01-12 18:17   ` Dave Hansen
  1 sibling, 1 reply; 40+ messages in thread
From: Sohil Mehta @ 2023-01-11 22:22 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai, Ricardo Neri

> +static __always_inline void setup_lass(struct cpuinfo_x86 *c)
> +{
> +	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> +		cr4_set_bits(X86_CR4_LASS);
> +	} else {
> +		/*
> +		 * only clear the feature and cr4 bits when hardware
> +		 * supports LASS, in case it was enabled in a previous
> +		 * boot (e.g., via kexec)
> +		 */
> +		if (cpu_has(c, X86_FEATURE_LASS)) {
> +			cr4_clear_bits(X86_CR4_LASS);
> +			clear_cpu_cap(c, X86_FEATURE_LASS);
> +		}
> +	}
> +}

I am quite confused by the "else" code flow. Can you please help me 
understand how this code path would be exercised?

Also, why don't other features such as SMAP or SMEP need this type of 
handling? I see something on similar lines for UMIP.

Also, how does the CR4 pinning code in the following patch play into 
this? Could it flag a warning when cr4_clear_bits() is called above?

> +
>   /* These bits should not change their value after CPU init is finished. */
>   static const unsigned long cr4_pinned_mask =
>   	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
> @@ -1848,6 +1865,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
>   	setup_smep(c);
>   	setup_smap(c);
>   	setup_umip(c);
> +	setup_lass(c);
>   


* Re: [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits
  2023-01-11  0:13     ` Dave Hansen
@ 2023-01-11 23:23       ` Chen, Yian
  2023-01-12  0:06         ` Luck, Tony
  0 siblings, 1 reply; 40+ messages in thread
From: Chen, Yian @ 2023-01-11 23:23 UTC (permalink / raw)
  To: Dave Hansen, Sohil Mehta, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai



On 1/10/2023 4:13 PM, Dave Hansen wrote:
> On 1/10/23 12:14, Sohil Mehta wrote:
>> On 1/9/2023 9:51 PM, Yian Chen wrote:
>>> LASS (Linear Address Space Separation) is a CPU feature to
>>> prevent speculative address access in user/kernel mode.
>>
>> Would it be better to say?
>>
>> LASS (Linear Address Space Separation) is a security feature that
>> intends to prevent unintentional speculative address access across
>> user/kernel mode.
> 
> It's more than that, though.  The spec actually says this pretty nicely:
> >> Linear-address space separation (LASS) is an independent mechanism
>> that enforces the same mode-based protections as paging but without
>> traversing the paging structures. Because the protections enforced by
>> LASS are applied before paging, “probes” by malicious software will
>> provide no paging-based timing information
> 
> So, it's not _just_ that it can prevent some speculative accesses.  It
> completely short-circuits paging itself and *ALL* of the baggage that
> goes along with paging.
> 
> The TLB, mid-level caches, the page walker itself, the data cache
> impact...  all of it.  Gone.
> 
> *THAT* is the important part here, IMNHO.
Sure, how about rewriting the first paragraph as follows:

LASS (Linear Address Space Separation) is an independent security 
mechanism that enforces the same kernel/user mode-based protections 
as Intel SMEP/SMAP, but without traversing the paging structures. 
Because the protection is applied before paging, it leaks no 
paging-based timing information that malicious software could use to 
probe the layout of supervisor memory. 
The LASS details have been published in Chapter 11 in
https://cdrdv2.intel.com/v1/dl/getContent/671368



* RE: [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits
  2023-01-11 23:23       ` Chen, Yian
@ 2023-01-12  0:06         ` Luck, Tony
  2023-01-12  0:15           ` Chen, Yian
  0 siblings, 1 reply; 40+ messages in thread
From: Luck, Tony @ 2023-01-12  0:06 UTC (permalink / raw)
  To: Chen, Yian, Hansen, Dave, Mehta, Sohil, linux-kernel, x86,
	Lutomirski, Andy, Dave Hansen, Shankar, Ravi V, Lai, Paul C

> The LASS details have been published in Chapter 11 in
> https://cdrdv2.intel.com/v1/dl/getContent/671368

Intel is terrible about maintaining web URLs like this. Better to
give this reference as:

  The December 2022 edition of the Intel Architecture
  Instruction Set Extensions and Future Features Programming
  Reference manual.

Maybe also still give the URL in a reply like this, but avoid using
it in a commit message that will be preserved forever.

-Tony



* Re: [PATCH 2/7] x86: Add CONFIG option X86_LASS
  2023-01-10 21:05   ` Sohil Mehta
@ 2023-01-12  0:13     ` Chen, Yian
  0 siblings, 0 replies; 40+ messages in thread
From: Chen, Yian @ 2023-01-12  0:13 UTC (permalink / raw)
  To: Sohil Mehta, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai



On 1/10/2023 1:05 PM, Sohil Mehta wrote:
> The subject can be better stated as:
> 
> x86/Kconfig: Add config option for LASS
> 
> On 1/9/2023 9:51 PM, Yian Chen wrote:
>> LASS is an Intel x86-64 only feature. 
> 
>> Add CONFIG
>> option X86_LASS and flag DISABLE_LASS to choose
>> opt-in/out the feature from kernel binary.
> The second sentence is unnecessary.
> 
Sure, it makes sense to remove the unnecessary sentence here.
>> CONFIG_X86_LASS is enabled by default because it
>> is a security feature which should have little
>> to no overhead or side effects. 
> 
> Doesn't it have a side effect that it modifies default vsyscall behavior?
> 
> I am guessing the impact of the vsyscall change would be minimal. 
> However, should LASS be disabled by default at least initially to 
> minimize the impact on userspace?
> 
> A follow-up patch can then enable this by default once the overall 
> impact is clearly known.
> 
>> If any issues are
>> found with specific use cases, the CONFIG option
>> makes it easy to disable.
> This sentence is unnecessary.
> 
Sure, I will remove this statement too.

>> +config X86_LASS
>> +    def_bool y
>> +    prompt "Linear Address Space Separation"
>> +    depends on X86_64 && CPU_SUP_INTEL
>> +    help
>> +      Linear Address Space Separation (LASS) is a processor
>> +      feature that mitigates address space layout probes.
>> +
> 
> Let's try to be consistent about what LASS is expected to do. This 
> definition is very different from the one in patch 1/7.
> 
> Also, we should include some information here on how enabling the LASS 
> config option can impact vsyscall behavior.
> 
Sure, I will rewrite this help message and explain the impact on legacy 
vsyscall as well.

>> +      if unsure, say Y.
>> +
> 

thanks,
Yian


* Re: [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits
  2023-01-12  0:06         ` Luck, Tony
@ 2023-01-12  0:15           ` Chen, Yian
  0 siblings, 0 replies; 40+ messages in thread
From: Chen, Yian @ 2023-01-12  0:15 UTC (permalink / raw)
  To: Luck, Tony, Hansen, Dave, Mehta, Sohil, linux-kernel, x86,
	Lutomirski, Andy, Dave Hansen, Shankar, Ravi V, Lai, Paul C



On 1/11/2023 4:06 PM, Luck, Tony wrote:
>> The LASS details have been published in Chapter 11 in
>> https://cdrdv2.intel.com/v1/dl/getContent/671368
> 
> Intel is terrible about maintaining web URLs like this. Better to
> give this reference as:
> 
>    The December 2022 edition of the Intel Architecture
>    Instruction Set Extensions and Future Features Programming
>    Reference manual.
> 
> Maybe also still give the URL in a reply like this, but avoid using
> it in a commit message that will be preserved forever.
> 
Sure, I will add the full document name and versioning info.

> -Tony
> 

Thanks,
Yian


* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-10 22:41   ` Sohil Mehta
@ 2023-01-12  0:27     ` Chen, Yian
  2023-01-12  0:37       ` Dave Hansen
  0 siblings, 1 reply; 40+ messages in thread
From: Chen, Yian @ 2023-01-12  0:27 UTC (permalink / raw)
  To: Sohil Mehta, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai



On 1/10/2023 2:41 PM, Sohil Mehta wrote:
> On 1/9/2023 9:52 PM, Yian Chen wrote:
> 
>> LASS stops access to a lower half address in kernel,
>> and this can be deactivated if AC bit in EFLAGS
>> register is set. Hence use stac and clac instructions
>> around access to the address to avoid triggering a
>> LASS #GP fault.
>>
> 
> It seems we are implicitly relying on the stac() and clac() calls 
> that are added for SMAP. Have you tried running with SMAP disabled, 
> i.e., "clearcpuid=smap"?
>
Yes, I tested with clearcpuid=smap.

> I believe there needs to be a dependency between LASS and SMAP.
> 
Yes, in kernel mode LASS depends on SMAP to work, but in user mode it 
doesn't, so the dependency below may not capture the user-space side 
of the feature.

> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c 
> b/arch/x86/kernel/cpu/cpuid-deps.c
> index d95221117129..00bc7e4a65d2 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -79,6 +79,7 @@ static const struct cpuid_dep cpuid_deps[] = {
>       { X86_FEATURE_XFD,            X86_FEATURE_XSAVES    },
>       { X86_FEATURE_XFD,            X86_FEATURE_XGETBV1   },
>       { X86_FEATURE_AMX_TILE,            X86_FEATURE_XFD       },
> +    { X86_FEATURE_LASS,            X86_FEATURE_SMAP      },
>       {}
>   };

Thanks,
Yian


* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-12  0:27     ` Chen, Yian
@ 2023-01-12  0:37       ` Dave Hansen
  2023-01-12 18:36         ` Chen, Yian
  2023-02-01  2:10         ` Sohil Mehta
  0 siblings, 2 replies; 40+ messages in thread
From: Dave Hansen @ 2023-01-12  0:37 UTC (permalink / raw)
  To: Chen, Yian, Sohil Mehta, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai

On 1/11/23 16:27, Chen, Yian wrote:
>> It seems we are implicitly relying on the stac() and clac()
>> calls that are added for SMAP. Have you tried running with SMAP
>> disabled, i.e., "clearcpuid=smap"?
>> 
> Yes, I tested with clearcpuid=smap.
It works by accident, then.

clearcpuid=smap means that the kernel should be running as if
CPUID.(EAX=07H,ECX=0H):EBX.SMAP[bit 20]==0.  STAC/CLAC should #UD in
that case.

The only reason that it happens to work is that STAC/CLAC apparently
actually continue to work even if CR4.SMAP==0.

I'm actually a _bit_ surprised by this, but I bet there's a good reason
for it.

In any case, please just make LASS dependent on SMAP.  It's the right
thing to do on several levels.


* Re: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection
  2023-01-11  0:34   ` Sohil Mehta
@ 2023-01-12  1:43     ` Chen, Yian
  2023-01-12  2:49       ` Sohil Mehta
  0 siblings, 1 reply; 40+ messages in thread
From: Chen, Yian @ 2023-01-12  1:43 UTC (permalink / raw)
  To: Sohil Mehta, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai



On 1/10/2023 4:34 PM, Sohil Mehta wrote:
> On 1/9/2023 9:52 PM, Yian Chen wrote:
> 
>> The user can also opt-out LASS in config file to build kernel
>> binary.
> 
> This line is unnecessary.
> 
Sure, this line can be dropped.

>> Signed-off-by: Yian Chen <yian.chen@intel.com>
>> Reviewed-by: Tony Luck <tony.luck@intel.com>
>> ---
>>   Documentation/admin-guide/kernel-parameters.txt | 12 ++++++++----
>>   arch/x86/entry/vsyscall/vsyscall_64.c           | 14 ++++++++++++++
>>   2 files changed, 22 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index 6cfa6e3996cf..3988e0c8c175 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -6755,10 +6755,14 @@
>>               versions of glibc use these calls.  Because these
>>               functions are at fixed addresses, they make nice
>>               targets for exploits that can control RIP.
>> -
>> -            emulate     [default] Vsyscalls turn into traps and are
>> -                        emulated reasonably safely.  The vsyscall
>> -                    page is readable.
> 
> The existing documentation here is incorrect. The default vsyscall mode 
> is actually xonly. This has been so since:
> commit 625b7b7f79c6 (x86/vsyscall: Change the default vsyscall mode to 
> xonly)
>
Yes, you are right, but this patch can overwrite and correct the 
existing text. I am assuming we don't need to fix the existing 
documentation separately before updating it for LASS.

>> +            In newer versions of Intel platforms that come with
> 
> Words such as "newer" in the kernel start losing meaning very quickly. 
> Also, this comment looks out of place in between the vsyscall sub-options.
> 
>> +            LASS(Linear Address Space separation) protection,
>> +            vsyscall is disabled by default. Enabling vsyscall
>> +            via the parameter overrides LASS protection.
>> +
Sure, I will take out this part change.
> 
> 
> IIUC, you are making the default mode somewhat dynamic.
> 
> vsyscall = xonly (if LASS is not enabled)
> vsyscall = none (if LASS is enabled)
> 
Yes, this looks better.

> The decision to disable vsyscall doesn't happen at compile time but it 
> is taken at runtime when the LASS feature is detected. This would make 
> the system behavior highly platform dependent.
> 
> Instead of doing this dance, can we provide a simplified behavior to the 
> user/admin and move the decision making to compile time?
> 
The current strategy is to disable vsyscall by default only on 
LASS-capable platforms, so the dynamic decision is likely necessary.

> Option 1: A bigger hammer
> Set the default vsyscall option as CONFIG_LEGACY_VSYSCALL_NONE. 
> CONFIG_X86_LASS is set by default. Changing the compile time VSYSCALL 
> option would disable LASS.
>
This would disable vsyscall by default on all platforms, regardless of 
LASS. I am not sure we want to go that far.

> Option 2: A smaller hammer
> CONFIG_X86_LASS is off by default. Vsyscall default stays as 
> CONFIG_LEGACY_VSYSCALL_XONLY. Selecting LASS automatically chooses 
> CONFIG_LEGACY_VSYSCALL_NONE. In this case, even if the hardware doesn't 
> support LASS, vsyscall would still remain disabled.
> 
This disables LASS by default, and then LASS may never be enabled in 
the first place.

> In both of these cases, any command line input to override the vsyscall 
> behavior can disable LASS as well.
> 
> 
>> +            emulate     [default if not LASS capable] Vsyscalls
>> +                    turn into traps and are emulated reasonably
>> +                    safely.  The vsyscall page is readable.
>>               xonly       Vsyscalls turn into traps and are
>>                           emulated reasonably safely.  The vsyscall
>> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c 
>> b/arch/x86/entry/vsyscall/vsyscall_64.c
>> index 4af81df133ee..2691f26835d1 100644
>> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
>> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
>> @@ -63,6 +63,12 @@ static int __init vsyscall_setup(char *str)
>>           else
>>               return -EINVAL;
>> +        if (cpu_feature_enabled(X86_FEATURE_LASS) &&
>> +            vsyscall_mode != NONE) {
>> +            setup_clear_cpu_cap(X86_FEATURE_LASS);
>> +            pr_warn("LASS disabled by command line enabling 
>> vsyscall\n");
> 
> A warning seems too drastic here. A pr_info() should probably suffice.
> 
Sure, I will change it to pr_info().

>> +        }
>> +
>>           return 0;
>>       }
>> @@ -379,6 +385,14 @@ void __init map_vsyscall(void)
>>       extern char __vsyscall_page;
>>       unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
>> +    /*
>> +     * When LASS is on, vsyscall triggers a #GP fault,
>> +     * so that force vsyscall_mode to NONE.
>> +     */
> 
> This comment doesn't make much sense nor does it provide the full 
> picture. Some of the reasoning from the cover letter/commit log can be 
> duplicated here.
> 
Sure, how about a more detailed inline comment like the following:
+	/*
+	 * When LASS protection is on, a user-space vsyscall triggers
+	 * a #GP fault since the vsyscall page lives at
+	 * 0xffffffffff600000-0xffffffffff601000, so force
+	 * vsyscall_mode to NONE and disable this mapping.
+	 */

>> +    if (cpu_feature_enabled(X86_FEATURE_LASS)) {
>> +        vsyscall_mode = NONE;
>> +        return;
>> +    }
>>       /*
>>        * For full emulation, the page needs to exist for real.  In
>>        * execute-only mode, there is no PTE at all backing the vsyscall
> 
Thanks,
Yian



* Re: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection
  2023-01-12  1:43     ` Chen, Yian
@ 2023-01-12  2:49       ` Sohil Mehta
  0 siblings, 0 replies; 40+ messages in thread
From: Sohil Mehta @ 2023-01-12  2:49 UTC (permalink / raw)
  To: Chen, Yian, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai

>> The existing documentation here is incorrect. The default vsyscall 
>> mode is actually xonly. This has been so since:
>> commit 625b7b7f79c6 (x86/vsyscall: Change the default vsyscall mode to 
>> xonly)
>>
> Yes, you are right, but this patch can overwrite and correct the 
> existing text. I am assuming we don't need to fix the existing 
> documentation separately before updating it for LASS.
> 

We should fix this independently of the LASS enabling. I sent a patch 
earlier today to address it. I apologize, I missed cc'ing you.

https://lore.kernel.org/lkml/20230111193211.1987047-1-sohil.mehta@intel.com/

>>> +            In newer versions of Intel platforms that come with
>>
>> Words such as "newer" in the kernel start losing meaning very quickly. 
>> Also, this comment looks out of place in between the vsyscall 
>> sub-options.
>>
>>> +            LASS(Linear Address Space separation) protection,
>>> +            vsyscall is disabled by default. Enabling vsyscall
>>> +            via the parameter overrides LASS protection.
>>> +
> Sure, I will take out this part change.

Actually, having some text here might be OK. I mistakenly thought it 
was placed between the sub-options. But avoid merging it with the 
previous paragraph, as is the case right now.

>> Instead of doing this dance, can we provide a simplified behavior to 
>> the user/admin and move the decision making to compile time?
>>
> The current strategy is to disable vsyscall by default only on 
> LASS-capable platforms, so the dynamic decision is likely necessary.
> 

Making this dynamic and platform dependent would make things hard to 
debug and isolate. It would be a perfect recipe for "But, it works on my 
system!" type of issues.

Let's see what others have to say.

-Sohil


* Re: [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation)
  2023-01-11 22:22   ` Sohil Mehta
@ 2023-01-12 17:56     ` Chen, Yian
  0 siblings, 0 replies; 40+ messages in thread
From: Chen, Yian @ 2023-01-12 17:56 UTC (permalink / raw)
  To: Sohil Mehta, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Paul Lai, Ricardo Neri



On 1/11/2023 2:22 PM, Sohil Mehta wrote:
>> +static __always_inline void setup_lass(struct cpuinfo_x86 *c)
>> +{
>> +    if (cpu_feature_enabled(X86_FEATURE_LASS)) {
>> +        cr4_set_bits(X86_CR4_LASS);
>> +    } else {
>> +        /*
>> +         * only clear the feature and cr4 bits when hardware
>> +         * supports LASS, in case it was enabled in a previous
>> +         * boot (e.g., via kexec)
>> +         */
>> +        if (cpu_has(c, X86_FEATURE_LASS)) {
>> +            cr4_clear_bits(X86_CR4_LASS);
>> +            clear_cpu_cap(c, X86_FEATURE_LASS);
>> +        }
>> +    }
>> +}
> 
> I am quite confused by the "else" code flow. Can you please help 
> me understand how this code path would be exercised?
>
The "else" branch disables LASS on every CPU. In addition to clearing 
the CR4 bit, it is better to clear the CPU feature bit as well for a 
consistent result on every CPU; otherwise the feature bit could end up 
cleared only on the boot CPU, for example if the user enables vsyscall.

> Also, why don't other features such as SMAP or SMEP need this type of 
> handling? I see something on similar lines for UMIP.
> 
It seems there is no need for such an if-else for SMAP/SMEP since the 
X86_SMAP/X86_SMEP config options are gone from arch/x86/Kconfig.
> 
> Also, how does the CR4 pinning code in the following patch play into 
> this?
I did not test whether the pinning code works as commented: "/* These 
bits should not change their value after CPU init is finished. */"

> Could it flag a warning when cr4_clear_bits() is called above?
I did not observe a warning from cr4_clear_bits(); this should be okay 
since it is called during init.



>> +
>>   /* These bits should not change their value after CPU init is 
>> finished. */
>>   static const unsigned long cr4_pinned_mask =
>>       X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
>> @@ -1848,6 +1865,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
>>       setup_smep(c);
>>       setup_smap(c);
>>       setup_umip(c);
>> +    setup_lass(c);


* Re: [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation)
  2023-01-10  5:52 ` [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation) Yian Chen
  2023-01-11 22:22   ` Sohil Mehta
@ 2023-01-12 18:17   ` Dave Hansen
  2023-01-13  1:17     ` Sohil Mehta
  1 sibling, 1 reply; 40+ messages in thread
From: Dave Hansen @ 2023-01-12 18:17 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Sohil Mehta, Paul Lai

On 1/9/23 21:52, Yian Chen wrote:
> +static __always_inline void setup_lass(struct cpuinfo_x86 *c)
> +{
> +	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> +		cr4_set_bits(X86_CR4_LASS);
> +	} else {
> +		/*
> +		 * only clear the feature and cr4 bits when hardware
> +		 * supports LASS, in case it was enabled in a previous
> +		 * boot (e.g., via kexec)
> +		 */
> +		if (cpu_has(c, X86_FEATURE_LASS)) {
> +			cr4_clear_bits(X86_CR4_LASS);
> +			clear_cpu_cap(c, X86_FEATURE_LASS);
> +		}
> +	}
> +}

Could you try testing this, please?

Please remember that there are three things in play here:
 1. disabled-features.h.  Makes cpu_feature_enabled(X86_FEATURE_LASS)==0
    when CONFIG_X86_LASS=n.
 2. The X86_FEATURE_LASS software feature bit itself.  clearcpuid=lass
    would clear it.
 3. The actual CPU enumeration of X86_FEATURE_LASS

The else{} is handling the case where X86_FEATURE_LASS is compiled out
but where the CPU supports LASS.  It doesn't do anything when the CPU
lacks LASS support *OR* when clearcpuid=lass is used.

In the end, we want X86_CR4_LASS set when the kernel wants LASS and
clear in *ALL* other cases.  That would be simply:

	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
		cr4_set_bits(X86_CR4_LASS);
	} else {
		cr4_clear_bits(X86_CR4_LASS);
	}

The cr4_clear_bits(X86_CR4_LASS) should be safe regardless of CPU or
kernel support for LASS.

As for the:

	clear_cpu_cap(c, X86_FEATURE_LASS);

It really only matters for kernels where CONFIG_X86_LASS=n but where the
CPU supports it.  I'm not clear on what specifically you expect to gain
from it, though.
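
The three-way interaction listed above can be reduced to a small truth table. The following is an illustrative Python sketch (not kernel code) of how cpu_feature_enabled(X86_FEATURE_LASS) collapses the Kconfig option, the software feature bit (clearable via clearcpuid=lass), and the CPU enumeration into one predicate, which is why the simplified if/else covers every case:

```python
def cr4_lass_should_be_set(config_lass, cpu_enumerates_lass, clearcpuid_lass):
    """Model of cpu_feature_enabled(X86_FEATURE_LASS): true only when
    support is compiled in (disabled-features.h), the CPU enumerates
    LASS, and the software bit was not cleared on the command line.
    setup_lass() then sets CR4.LASS iff this is true, and clears it in
    all other cases."""
    return config_lass and cpu_enumerates_lass and not clearcpuid_lass
```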

I'm also wondering if we even want a Kconfig option.  Is anyone
realistically going to be compiling this support out?


* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-12  0:37       ` Dave Hansen
@ 2023-01-12 18:36         ` Chen, Yian
  2023-01-12 18:48           ` Dave Hansen
  2023-02-01  2:10         ` Sohil Mehta
  1 sibling, 1 reply; 40+ messages in thread
From: Chen, Yian @ 2023-01-12 18:36 UTC (permalink / raw)
  To: Dave Hansen, Sohil Mehta, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai



On 1/11/2023 4:37 PM, Dave Hansen wrote:
> On 1/11/23 16:27, Chen, Yian wrote:
>>> It seems we are implicitly relying on the stac() and clac()
>>> calls that are added for SMAP. Have you tried running with SMAP
>>> disabled, i.e., "clearcpuid=smap"?
>>>
>> Yes, I tested with clearcpuid=smap.
> It works by accident, then.
> 
> clearcpuid=smap means that the kernel should be running as if
> CPUID.(EAX=07H,ECX=0H):EBX.SMAP[bit 20]==0.  STAC/CLAC should #UD in
> that case.
> 
It could be something wrong in my Simics simulation environment.

> The only reason that it happens to work is that STAC/CLAC apparently
> actually continue to work even if CR4.SMAP==0.
> 
> I'm actually a _bit_ surprised by this, but I bet there's a good reason
> for it.
> 
> In any case, please just make LASS dependent on SMAP.  It's the right
> thing to do on several levels.
Sure, I will add the dependency.


* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-12 18:36         ` Chen, Yian
@ 2023-01-12 18:48           ` Dave Hansen
  2023-02-01  2:25             ` Sohil Mehta
  0 siblings, 1 reply; 40+ messages in thread
From: Dave Hansen @ 2023-01-12 18:48 UTC (permalink / raw)
  To: Chen, Yian, Sohil Mehta, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai

On 1/12/23 10:36, Chen, Yian wrote:
> On 1/11/2023 4:37 PM, Dave Hansen wrote:
>> On 1/11/23 16:27, Chen, Yian wrote:
>>>> It seems we are implicitly relying on the stac() and clac()
>>>> calls that are added for SMAP. Have you tried running with SMAP
>>>> disabled, i.e., "clearcpuid=smap"?
>>>>
>>> Yes, I tested with clearcpuid=smap.
>> It works by accident, then.
>>
>> clearcpuid=smap means that the kernel should be running as if
>> CPUID.(EAX=07H,ECX=0H):EBX.SMAP[bit 20]==0.  STAC/CLAC should #UD in
>> that case.
>>
> It could be something wrong in my Simics simulation environment.

Also, Andy Cooper made a very good point: when the kernel enables
paging, it's running with a low address so that the instruction pointer
stays valid as paging becomes enabled.

But, if LASS were enabled and enforced at that point, you'd get a LASS
fault and go kablooey.  Can you see what simics does in that case, and
also make sure we're clearing CR4.LASS when enabling paging?  That would
obviate the need to do it later in C code too.


* Re: [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation)
  2023-01-12 18:17   ` Dave Hansen
@ 2023-01-13  1:17     ` Sohil Mehta
  2023-01-13 19:39       ` Sohil Mehta
  0 siblings, 1 reply; 40+ messages in thread
From: Sohil Mehta @ 2023-01-13  1:17 UTC (permalink / raw)
  To: Dave Hansen, Yian Chen, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai

On 1/12/2023 10:17 AM, Dave Hansen wrote:

> In the end, we want X86_CR4_LASS set when the kernel wants LASS and
> clear in *ALL* other cases.  That would be simply:
> 
> 	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> 		cr4_set_bits(X86_CR4_LASS);
> 	} else {
> 		cr4_clear_bits(X86_CR4_LASS);
> 	}
> 

Thanks for the explanation. This is very helpful.

> I'm also wondering if we even want a Kconfig option.  Is anyone
> realistically going to be compiling this support out?

I was initially thinking we should leave the Kconfig option for users
that really *need* vsyscall support. But, thinking about it more, we
don't need an LASS Kconfig option for that.

We can make CONFIG_LEGACY_VSYSCALL_NONE as the default instead of the
current CONFIG_LEGACY_VSYSCALL_XONLY. The kernel can disable LASS at
boot time if a user/admin has shown explicit preference by selecting
CONFIG_LEGACY_VSYSCALL_XONLY or by providing command line parameter
vsyscall=xonly/emulate.

Thoughts?
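
As a sketch of this proposal (illustrative Python, not kernel code; the function and parameter names are made up), the boot-time decision would reduce to:

```python
def lass_stays_enabled(cpu_has_lass, kconfig_vsyscall_default,
                       cmdline_vsyscall=None):
    """Proposal: LEGACY_VSYSCALL_NONE becomes the Kconfig default, and
    LASS survives boot only when neither the Kconfig choice nor the
    command line asks for a working vsyscall page."""
    # A command-line vsyscall= option overrides the Kconfig default.
    effective_mode = cmdline_vsyscall or kconfig_vsyscall_default
    return cpu_has_lass and effective_mode == "none"
```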


* Re: [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation)
  2023-01-13  1:17     ` Sohil Mehta
@ 2023-01-13 19:39       ` Sohil Mehta
  0 siblings, 0 replies; 40+ messages in thread
From: Sohil Mehta @ 2023-01-13 19:39 UTC (permalink / raw)
  To: Dave Hansen, Yian Chen, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai, Kees Cook,
	Florian Weimer, Borislav Petkov

Adding folks from the previous discussion when a similar suggestion was
made by Kees. There didn't seem to be any major objection that time.

https://lore.kernel.org/lkml/202205111424.DEB0E3418@keescook/

On 1/12/2023 5:17 PM, Sohil Mehta wrote:
> We can make CONFIG_LEGACY_VSYSCALL_NONE as the default instead of the
> current CONFIG_LEGACY_VSYSCALL_XONLY. The kernel can disable LASS at
> boot time if a user/admin has shown explicit preference by selecting
> CONFIG_LEGACY_VSYSCALL_XONLY or by providing command line parameter
> vsyscall=xonly/emulate.
> 
> Thoughts?



* Re: [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection
  2023-01-10  5:52 ` [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection Yian Chen
  2023-01-11  0:34   ` Sohil Mehta
@ 2023-01-21  4:09   ` Andy Lutomirski
  1 sibling, 0 replies; 40+ messages in thread
From: Andy Lutomirski @ 2023-01-21  4:09 UTC (permalink / raw)
  To: Yian Chen, Linux Kernel Mailing List, the arch/x86 maintainers,
	Dave Hansen, Shankar, Ravi V, Tony Luck, Sohil Mehta, Paul Lai



On Mon, Jan 9, 2023, at 9:52 PM, Yian Chen wrote:
> Kernel enables LASS automatically at starting time in LASS
> capable platforms. Any access to kernel addresses
> or upper half addresses from user space triggers a #GP
> fault.
>
> Legacy vsyscall does not comply with LASS, because
> the vsyscall functions are mapped in the range
> 0xffffffffff600000-0xffffffffff601000.
>
> In theory, it would be possible to write a #GP fault handler
> to emulate the old vsyscall behavior, but vsyscall has been
> deprecated for some time, so this has not been done.

The ISE docs are really quite explicit that #GP will have RIP pointing at the vDSO if LASS is on. Unless I’ve missed something, this should be quite straightforward to handle without breaking userspace compatibility.  Let’s please do this.

>
> Therefore, when kernel enforces LASS, vsyscall does not work
> and should be disabled. On the other hand, the user can relax
> the enforcement by clearing lass cpu id (clearcpuid=lass/390)
> or enabling vsyscall (vsyscall=xxx) from kernel command line.
> The user can also opt-out LASS in config file to build kernel
> binary.
>
> Signed-off-by: Yian Chen <yian.chen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 12 ++++++++----
>  arch/x86/entry/vsyscall/vsyscall_64.c           | 14 ++++++++++++++
>  2 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index 6cfa6e3996cf..3988e0c8c175 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6755,10 +6755,14 @@
>  			versions of glibc use these calls.  Because these
>  			functions are at fixed addresses, they make nice
>  			targets for exploits that can control RIP.
> -
> -			emulate     [default] Vsyscalls turn into traps and are
> -			            emulated reasonably safely.  The vsyscall
> -				    page is readable.
> +			In newer versions of Intel platforms that come with
> +			LASS(Linear Address Space separation) protection,
> +			vsyscall is disabled by default. Enabling vsyscall
> +			via the parameter overrides LASS protection.
> +
> +			emulate     [default if not LASS capable] Vsyscalls
> +				    turn into traps and are emulated reasonably
> +				    safely.  The vsyscall page is readable.
> 
>  			xonly       Vsyscalls turn into traps and are
>  			            emulated reasonably safely.  The vsyscall
> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c 
> b/arch/x86/entry/vsyscall/vsyscall_64.c
> index 4af81df133ee..2691f26835d1 100644
> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> @@ -63,6 +63,12 @@ static int __init vsyscall_setup(char *str)
>  		else
>  			return -EINVAL;
> 
> +		if (cpu_feature_enabled(X86_FEATURE_LASS) &&
> +		    vsyscall_mode != NONE) {
> +			setup_clear_cpu_cap(X86_FEATURE_LASS);
> +			pr_warn("LASS disabled by command line enabling vsyscall\n");
> +		}
> +
>  		return 0;
>  	}
> 
> @@ -379,6 +385,14 @@ void __init map_vsyscall(void)
>  	extern char __vsyscall_page;
>  	unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
> 
> +	/*
> +	 * When LASS is on, a vsyscall would trigger a #GP fault,
> +	 * so force vsyscall_mode to NONE.
> +	 */
> +	if (cpu_feature_enabled(X86_FEATURE_LASS)) {
> +		vsyscall_mode = NONE;
> +		return;
> +	}
>  	/*
>  	 * For full emulation, the page needs to exist for real.  In
>  	 * execute-only mode, there is no PTE at all backing the vsyscall
> -- 
> 2.34.1

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-12  0:37       ` Dave Hansen
  2023-01-12 18:36         ` Chen, Yian
@ 2023-02-01  2:10         ` Sohil Mehta
  1 sibling, 0 replies; 40+ messages in thread
From: Sohil Mehta @ 2023-02-01  2:10 UTC (permalink / raw)
  To: Dave Hansen, Chen, Yian, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai

On 1/11/2023 4:37 PM, Dave Hansen wrote:
> clearcpuid=smap means that the kernel should be running as if
> CPUID.(EAX=07H,ECX=0H):EBX.SMAP[bit 20]==0.  STAC/CLAC should #UD in
> that case.
> 
> The only reason that it happens to work is that STAC/CLAC apparently
> actually continue to work even if CR4.SMAP==0.
> 

It seems this is by design. I was trying to experiment with
clearcpuid=smap on an older platform (KBL). I noticed that even with
CR4.SMAP==0, the STAC/CLAC instructions do not fault.

The STAC instruction operation in the SDM also suggests that it may be
intentional. It does *not* list CR4.SMAP=0 as a reason for #UD.

#UD	If the LOCK prefix is used.
	If the CPL > 0.
	If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.


> I'm actually a _bit_ surprised by this, but I bet there's a good reason
> for it.
> 

I would love to find out the actual reasoning behind this. I'll try to
poke some of the architects internally.

> In any case, please just make LASS dependent on SMAP.  It's the right
> thing to do on several levels.

Anyway, the end result remains the same. We should still make LASS
dependent on SMAP since:

1) The spec says that LASS enforcement only happens when SMAP is enabled.
"A supervisor-mode data access causes a LASS violation only if
supervisor-mode access protection is enabled (because CR4.SMAP = 1)"

2) In the extremely improbable case that LASS is available but SMAP is
not available on the hardware, the STAC and CLAC instructions will fault
due to the missing SMAP CPUID bit.
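The dependency can be encoded the same way other CPUID-dependent features are handled in arch/x86/kernel/cpu/cpuid-deps.c, where clearing a parent feature recursively clears everything that depends on it. A simplified, self-contained model of that mechanism (feature names abbreviated; this is not the kernel's actual table):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative feature IDs; the real kernel uses X86_FEATURE_* words/bits. */
enum { FEAT_SMAP, FEAT_LASS, NR_FEATS };

static bool caps[NR_FEATS];

/* Dependency pairs modeled on cpuid-deps.c: { dependent, depends-on }. */
static const int deps[][2] = {
	{ FEAT_LASS, FEAT_SMAP },	/* LASS depends on SMAP */
};

/* Clearing a feature also clears every feature that depends on it. */
static void clear_feature(int feat)
{
	caps[feat] = false;
	for (unsigned int i = 0; i < sizeof(deps) / sizeof(deps[0]); i++)
		if (deps[i][1] == feat)
			clear_feature(deps[i][0]);
}
```

With such an entry in place, clearcpuid=smap (or missing SMAP hardware) automatically disables LASS as well, which matches the spec's requirement that LASS enforcement is tied to CR4.SMAP.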

-Sohil


* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-01-12 18:48           ` Dave Hansen
@ 2023-02-01  2:25             ` Sohil Mehta
  2023-02-01 18:20               ` Dave Hansen
  0 siblings, 1 reply; 40+ messages in thread
From: Sohil Mehta @ 2023-02-01  2:25 UTC (permalink / raw)
  To: Dave Hansen, Chen, Yian, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai

On 1/12/2023 10:48 AM, Dave Hansen wrote:
> Also, Andy Cooper made a very good point: when the kernel enables
> paging, it's running with a low address so that the instruction pointer
> stays valid as paging becomes enabled.
> 
> But, if LASS were enabled and enforced at that point, you'd get a LASS
> fault and go kablooey.  Can you see what simics does in that case, and
> also make sure we're clearing CR4.LASS when enabling paging?  That would
> obviate the need to do it later in C code too.

CR4 and CR0 are always restored to a known state during kexec. So,
running with LASS enabled should not happen during early boot.

machine_kexec()
-> relocate_kernel()
   -> identity_mapped()

> 	/*
> 	 * Set cr0 to a known state:
> 	 *  - Paging enabled
> 	 *  - Alignment check disabled
> 	 *  - Write protect disabled
> 	 *  - No task switch
> 	 *  - Don't do FP software emulation.
> 	 *  - Protected mode enabled
> 	 */
> 	movq	%cr0, %rax
> 	andq	$~(X86_CR0_AM | X86_CR0_WP | X86_CR0_TS | X86_CR0_EM), %rax
> 	orl	$(X86_CR0_PG | X86_CR0_PE), %eax
> 	movq	%rax, %cr0
> 
> 	/*
> 	 * Set cr4 to a known state:
> 	 *  - physical address extension enabled
> 	 *  - 5-level paging, if it was enabled before
> 	 */
> 	movl	$X86_CR4_PAE, %eax
> 	testq	$X86_CR4_LA57, %r13
> 	jz	1f
> 	orl	$X86_CR4_LA57, %eax
> 1:
> 	movq	%rax, %cr4
> 
> 	jmp 1f
> 1:

Dave, does this address your concern or were you looking for something
else? Is there some path other than kexec that should also be audited
for this scenario?




* Re: [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives
  2023-02-01  2:25             ` Sohil Mehta
@ 2023-02-01 18:20               ` Dave Hansen
  0 siblings, 0 replies; 40+ messages in thread
From: Dave Hansen @ 2023-02-01 18:20 UTC (permalink / raw)
  To: Sohil Mehta, Chen, Yian, linux-kernel, x86, Andy Lutomirski,
	Dave Hansen, Ravi Shankar, Tony Luck, Paul Lai

On 1/31/23 18:25, Sohil Mehta wrote:
>> 	/*
>> 	 * Set cr4 to a known state:
>> 	 *  - physical address extension enabled
>> 	 *  - 5-level paging, if it was enabled before
>> 	 */
>> 	movl	$X86_CR4_PAE, %eax
>> 	testq	$X86_CR4_LA57, %r13
>> 	jz	1f
>> 	orl	$X86_CR4_LA57, %eax
>> 1:
>> 	movq	%rax, %cr4
>>
>> 	jmp 1f
>> 1:
> Dave, does this address your concern or were you looking for something
> else? Is there some path other than kexec that should also be audited
> for this scenario?

Yep, that addresses it.  I don't know of any other path that would
matter.  Couldn't hurt to poke around and look for other CR4
manipulation that might need to be LASS-aware, though.



* Re: [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest
  2023-01-10  5:52 ` [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest Yian Chen
@ 2023-02-07  3:21   ` Wang, Lei
  2023-02-09 17:18     ` Sean Christopherson
  0 siblings, 1 reply; 40+ messages in thread
From: Wang, Lei @ 2023-02-07  3:21 UTC (permalink / raw)
  To: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Sohil Mehta, Paul Lai

Could you also CC the KVM mailing list (kvm@vger.kernel.org) for KVM guys to review?

BR,
Lei

On 1/10/2023 1:52 PM, Yian Chen wrote:
> From: Paul Lai <paul.c.lai@intel.com>
> 
> Expose the LASS feature, which is enumerated in CPUID.7.1.EAX
> and enabled via a CR4 bit, to VM guests.
> 
> Signed-off-by: Paul Lai <paul.c.lai@intel.com>
> Signed-off-by: Yian Chen <yian.chen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 3 ++-
>  arch/x86/kvm/cpuid.c            | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f35f1ff4427b..bd39f45e9b5a 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -125,7 +125,8 @@
>  			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
>  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
>  			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> +			  | X86_CR4_LASS))
>  
>  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
>  
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b14653b61470..e0f53f85f5ae 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
>  
>  	kvm_cpu_cap_mask(CPUID_7_1_EAX,
>  		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
> -		F(AVX_IFMA)
> +		F(AVX_IFMA) | F(LASS)
>  	);
>  
>  	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,


* Re: [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest
  2023-02-07  3:21   ` Wang, Lei
@ 2023-02-09 17:18     ` Sean Christopherson
  0 siblings, 0 replies; 40+ messages in thread
From: Sean Christopherson @ 2023-02-09 17:18 UTC (permalink / raw)
  To: Wang, Lei
  Cc: Yian Chen, linux-kernel, x86, Andy Lutomirski, Dave Hansen,
	Ravi Shankar, Tony Luck, Sohil Mehta, Paul Lai

On Tue, Feb 07, 2023, Wang, Lei wrote:
> Could you also CC the KVM mailing list (kvm@vger.kernel.org) for KVM guys to review?

As Sohil pointed out[*], use scripts/get_maintainer.pl.  Cc'ing kvm@ on random
bits of the series isn't sufficient, e.g. I completely missed Sohil's Cc on the
cover letter.  For me personally at least, if I'm Cc'd on one patch then I want
to be Cc'd on all patches.

That's somewhat of a moot point for the future of LASS enabling, because the KVM
enabling needs to be a separate series.

[*] https://lore.kernel.org/all/66857084-fbed-3e9a-ed2c-7167010cbad9@intel.com

> On 1/10/2023 1:52 PM, Yian Chen wrote:
> > From: Paul Lai <paul.c.lai@intel.com>
> > 
> > Expose the LASS feature, which is enumerated in CPUID.7.1.EAX
> > and enabled via a CR4 bit, to VM guests.
> > 
> > Signed-off-by: Paul Lai <paul.c.lai@intel.com>
> > Signed-off-by: Yian Chen <yian.chen@intel.com>
> > Reviewed-by: Tony Luck <tony.luck@intel.com>

So I'm pretty sure this "review" was to satisfy Intel's requirements to never post
anything to x86@ that wasn't reviewed internally by one of a set of select
individuals.  Please drop that requirement for KVM x86, I truly think it's doing
more harm than good.

This is laughably inadequate "enabling".  This has architecturally visible effects
on _all_ guest virtual/linear accesses.  That statement alone should set off alarms
left and right that simply advertising support via KVM_GET_SUPPORTED_CPUID and
marking the CR4 bit as not unconditionally reserved is insufficient.

My recommendation is to look at the touchpoints for X86_CR4_LA57 and follow the
various breadcrumbs to identify what all needs to be done to properly support LASS.

More importantly, I want tests.  There's _zero_ chance this was reasonably tested.
E.g. the most basic negative testcase of attempting to set X86_CR4_LASS on a host
without LASS is missed (see __cr4_reserved_bits()).
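That negative case is easy to model on the kernel side: KVM's __cr4_reserved_bits() pattern keeps a CR4 bit reserved unless guest CPUID enumerates the corresponding feature, and attempts to set a reserved bit inject #GP. A hedged user-space sketch of that pattern (struct and helper names are made up; the bit position is taken from the ISE and should be checked against the header):

```c
#include <assert.h>
#include <stdint.h>

/* CR4.LASS is bit 27 per the Intel ISE; treat this value as an assumption. */
#define X86_CR4_LASS	(1UL << 27)

/* Stand-in for the guest CPUID state KVM tracks per vCPU. */
struct vcpu {
	int has_lass;
};

/* Modeled on __cr4_reserved_bits(): a bit is reserved unless the
 * matching guest CPUID feature is enumerated. */
static uint64_t cr4_reserved_bits(const struct vcpu *v)
{
	uint64_t reserved = 0;

	if (!v->has_lass)
		reserved |= X86_CR4_LASS;
	return reserved;
}

/* CR4 write is legal only if no reserved bit is set; else inject #GP. */
static int set_cr4_ok(const struct vcpu *v, uint64_t cr4)
{
	return !(cr4 & cr4_reserved_bits(v));
}
```

A KVM-unit-test or selftest for LASS would assert exactly this: setting CR4.LASS on a vCPU whose CPUID lacks LASS must fault, and must succeed once the feature is advertised.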

I will not so much as glance at future versions of LASS enabling if there aren't
test cases in KVM-unit-tests and/or selftests that give me a high level of confidence
that KVM correctly handles the various edge cases, e.g. that KVM correctly handles
an implicit supervisor access from user mode while in the emulator.

That goes for all new hardware enabling: no tests, no review.  Please relay that
message to managers, leadership, and everyone working on KVM.  Ample warning has
been given that new KVM features need to be accompanied by tests.

> > ---
> >  arch/x86/include/asm/kvm_host.h | 3 ++-
> >  arch/x86/kvm/cpuid.c            | 2 +-
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index f35f1ff4427b..bd39f45e9b5a 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -125,7 +125,8 @@
> >  			  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | X86_CR4_PCIDE \
> >  			  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
> >  			  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
> > -			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
> > +			  | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
> > +			  | X86_CR4_LASS))
> >  
> >  #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
> >  
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index b14653b61470..e0f53f85f5ae 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -664,7 +664,7 @@ void kvm_set_cpu_caps(void)
> >  
> >  	kvm_cpu_cap_mask(CPUID_7_1_EAX,
> >  		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) | F(AMX_FP16) |
> > -		F(AVX_IFMA)
> > +		F(AVX_IFMA) | F(LASS)
> >  	);
> >  
> >  	kvm_cpu_cap_init_kvm_defined(CPUID_7_1_EDX,


end of thread, other threads:[~2023-02-09 17:18 UTC | newest]

Thread overview: 40+ messages
2023-01-10  5:51 [PATCH 0/7] Enable LASS (Linear Address space Separation) Yian Chen
2023-01-10  5:51 ` [PATCH 1/7] x86/cpu: Enumerate LASS CPUID and CR4 bits Yian Chen
2023-01-10 20:14   ` Sohil Mehta
2023-01-11  0:13     ` Dave Hansen
2023-01-11 23:23       ` Chen, Yian
2023-01-12  0:06         ` Luck, Tony
2023-01-12  0:15           ` Chen, Yian
2023-01-11 19:21     ` Chen, Yian
2023-01-10  5:51 ` [PATCH 2/7] x86: Add CONFIG option X86_LASS Yian Chen
2023-01-10 21:05   ` Sohil Mehta
2023-01-12  0:13     ` Chen, Yian
2023-01-10  5:52 ` [PATCH 3/7] x86/cpu: Disable kernel LASS when patching kernel alternatives Yian Chen
2023-01-10 21:04   ` Peter Zijlstra
2023-01-11  1:01     ` Chen, Yian
2023-01-11  9:10       ` Peter Zijlstra
2023-01-10 22:41   ` Sohil Mehta
2023-01-12  0:27     ` Chen, Yian
2023-01-12  0:37       ` Dave Hansen
2023-01-12 18:36         ` Chen, Yian
2023-01-12 18:48           ` Dave Hansen
2023-02-01  2:25             ` Sohil Mehta
2023-02-01 18:20               ` Dave Hansen
2023-02-01  2:10         ` Sohil Mehta
2023-01-10  5:52 ` [PATCH 4/7] x86/vsyscall: Setup vsyscall to compromise LASS protection Yian Chen
2023-01-11  0:34   ` Sohil Mehta
2023-01-12  1:43     ` Chen, Yian
2023-01-12  2:49       ` Sohil Mehta
2023-01-21  4:09   ` Andy Lutomirski
2023-01-10  5:52 ` [PATCH 5/7] x86/cpu: Enable LASS (Linear Address Space Separation) Yian Chen
2023-01-11 22:22   ` Sohil Mehta
2023-01-12 17:56     ` Chen, Yian
2023-01-12 18:17   ` Dave Hansen
2023-01-13  1:17     ` Sohil Mehta
2023-01-13 19:39       ` Sohil Mehta
2023-01-10  5:52 ` [PATCH 6/7] x86/cpu: Set LASS as pinning sensitive CR4 bit Yian Chen
2023-01-10  5:52 ` [PATCH 7/7] x86/kvm: Expose LASS feature to VM guest Yian Chen
2023-02-07  3:21   ` Wang, Lei
2023-02-09 17:18     ` Sean Christopherson
2023-01-10 19:48 ` [PATCH 0/7] Enable LASS (Linear Address space Separation) Sohil Mehta
2023-01-10 22:57 ` Dave Hansen
