* [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
@ 2023-08-04  9:02 Marco Elver
  2023-08-04  9:02 ` [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks Marco Elver
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Marco Elver @ 2023-08-04  9:02 UTC (permalink / raw)
  To: elver, Andrew Morton, Kees Cook
  Cc: Guenter Roeck, Peter Zijlstra, Mark Rutland, Steven Rostedt,
	Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, Miguel Ojeda, linux-arm-kernel,
	kvmarm, linux-kernel, llvm, Dmitry Vyukov, Alexander Potapenko,
	kasan-dev, linux-toolchains

[1]: "On X86-64 and AArch64 targets, this attribute changes the calling
convention of a function. The preserve_most calling convention attempts
to make the code in the caller as unintrusive as possible. This
convention behaves identically to the C calling convention on how
arguments and return values are passed, but it uses a different set of
caller/callee-saved registers. This alleviates the burden of saving and
recovering a large register set before and after the call in the
caller."

[1] https://clang.llvm.org/docs/AttributeReference.html#preserve-most

Introduce the attribute to compiler_types.h as __preserve_most.

Use of this attribute results in better code generation for calls to
very rarely called functions, such as error-reporting functions, or
rarely executed slow paths.
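
For illustration, a minimal usage sketch (the function and variable names
here are made up for the example and are not part of this patch):

/* Rarely called reporting slow path; keeps the caller's registers intact. */
static void __preserve_most __cold report_bad_value(const void *ptr)
{
	pr_err("unexpected value at %p\n", ptr);
}

static inline void check_value(const void *ptr, bool ok)
{
	/*
	 * Because report_bad_value() uses the preserve_most convention (and
	 * is rarely called), the hot caller does not need to spill and
	 * reload its live registers around this unlikely call.
	 */
	if (unlikely(!ok))
		report_bad_value(ptr);
}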

Beware that the attribute conflicts with instrumentation calls inserted
on function entry which do not use __preserve_most themselves. Notably,
function tracing which assumes the normal C calling convention for the
given architecture.  Where the attribute is supported, __preserve_most
will imply notrace. It is recommended to restrict use of the attribute
to functions that should or already disable tracing.

Signed-off-by: Marco Elver <elver@google.com>
---
v2:
* Imply notrace, to avoid any conflicts with tracing which is inserted
  on function entry. See added comments.
---
 include/linux/compiler_types.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 547ea1ff806e..12c4540335b7 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -106,6 +106,33 @@ static inline void __chk_io_ptr(const volatile void __iomem *ptr) { }
 #define __cold
 #endif
 
+/*
+ * On x86-64 and arm64 targets, __preserve_most changes the calling convention
+ * of a function to make the code in the caller as unintrusive as possible. This
+ * convention behaves identically to the C calling convention on how arguments
+ * and return values are passed, but uses a different set of caller- and callee-
+ * saved registers.
+ *
+ * The purpose is to alleviate the burden of saving and recovering a large
+ * register set before and after the call in the caller.  This is beneficial for
+ * rarely taken slow paths, such as error-reporting functions that may be called
+ * from hot paths.
+ *
+ * Note: This may conflict with instrumentation inserted on function entry which
+ * does not use __preserve_most or equivalent convention (if in assembly). Since
+ * function tracing assumes the normal C calling convention, where the attribute
+ * is supported, __preserve_most implies notrace.
+ *
+ * Optional: not supported by gcc.
+ *
+ * clang: https://clang.llvm.org/docs/AttributeReference.html#preserve-most
+ */
+#if __has_attribute(__preserve_most__)
+# define __preserve_most notrace __attribute__((__preserve_most__))
+#else
+# define __preserve_most
+#endif
+
 /* Builtins */
 
 /*
-- 
2.41.0.640.ga95def55d0-goog



* [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks
  2023-08-04  9:02 [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Marco Elver
@ 2023-08-04  9:02 ` Marco Elver
  2023-08-04 16:03   ` Steven Rostedt
  2023-08-04  9:02 ` [PATCH v2 3/3] list_debug: Introduce CONFIG_DEBUG_LIST_MINIMAL Marco Elver
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Marco Elver @ 2023-08-04  9:02 UTC (permalink / raw)
  To: elver, Andrew Morton, Kees Cook
  Cc: Guenter Roeck, Peter Zijlstra, Mark Rutland, Steven Rostedt,
	Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, Miguel Ojeda, linux-arm-kernel,
	kvmarm, linux-kernel, llvm, Dmitry Vyukov, Alexander Potapenko,
	kasan-dev, linux-toolchains

Turn the list debug checking functions __list_*_valid() into inline
functions that wrap the out-of-line functions. Care is taken to ensure
the inline wrappers are always inlined, so that additional compiler
instrumentation (such as sanitizers) does not result in redundant
outlining.
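
Condensed, the resulting shape (taken from the include/linux/list.h hunk
below, list_add variant shown) is:

#ifdef CONFIG_DEBUG_LIST
/* Out-of-line implementation keeps the actual checking and reporting. */
extern bool ___list_add_valid(struct list_head *new,
			      struct list_head *prev,
			      struct list_head *next);

/* Always-inlined wrapper; later patches add inline checks here. */
static __always_inline bool __list_add_valid(struct list_head *new,
					     struct list_head *prev,
					     struct list_head *next)
{
	return ___list_add_valid(new, prev, next);
}
#endif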

This change is preparation for performing checks in the inline wrappers.

No functional change intended.

Signed-off-by: Marco Elver <elver@google.com>
---
 arch/arm64/kvm/hyp/nvhe/list_debug.c |  6 +++---
 include/linux/list.h                 | 15 +++++++++++++--
 lib/list_debug.c                     | 11 +++++------
 3 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/list_debug.c b/arch/arm64/kvm/hyp/nvhe/list_debug.c
index d68abd7ea124..589284496ac5 100644
--- a/arch/arm64/kvm/hyp/nvhe/list_debug.c
+++ b/arch/arm64/kvm/hyp/nvhe/list_debug.c
@@ -26,8 +26,8 @@ static inline __must_check bool nvhe_check_data_corruption(bool v)
 
 /* The predicates checked here are taken from lib/list_debug.c. */
 
-bool __list_add_valid(struct list_head *new, struct list_head *prev,
-		      struct list_head *next)
+bool ___list_add_valid(struct list_head *new, struct list_head *prev,
+		       struct list_head *next)
 {
 	if (NVHE_CHECK_DATA_CORRUPTION(next->prev != prev) ||
 	    NVHE_CHECK_DATA_CORRUPTION(prev->next != next) ||
@@ -37,7 +37,7 @@ bool __list_add_valid(struct list_head *new, struct list_head *prev,
 	return true;
 }
 
-bool __list_del_entry_valid(struct list_head *entry)
+bool ___list_del_entry_valid(struct list_head *entry)
 {
 	struct list_head *prev, *next;
 
diff --git a/include/linux/list.h b/include/linux/list.h
index f10344dbad4d..e0b2cf904409 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -39,10 +39,21 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
 }
 
 #ifdef CONFIG_DEBUG_LIST
-extern bool __list_add_valid(struct list_head *new,
+extern bool ___list_add_valid(struct list_head *new,
 			      struct list_head *prev,
 			      struct list_head *next);
-extern bool __list_del_entry_valid(struct list_head *entry);
+static __always_inline bool __list_add_valid(struct list_head *new,
+					     struct list_head *prev,
+					     struct list_head *next)
+{
+	return ___list_add_valid(new, prev, next);
+}
+
+extern bool ___list_del_entry_valid(struct list_head *entry);
+static __always_inline bool __list_del_entry_valid(struct list_head *entry)
+{
+	return ___list_del_entry_valid(entry);
+}
 #else
 static inline bool __list_add_valid(struct list_head *new,
 				struct list_head *prev,
diff --git a/lib/list_debug.c b/lib/list_debug.c
index d98d43f80958..fd69009cc696 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -17,8 +17,8 @@
  * attempt).
  */
 
-bool __list_add_valid(struct list_head *new, struct list_head *prev,
-		      struct list_head *next)
+bool ___list_add_valid(struct list_head *new, struct list_head *prev,
+		       struct list_head *next)
 {
 	if (CHECK_DATA_CORRUPTION(prev == NULL,
 			"list_add corruption. prev is NULL.\n") ||
@@ -37,9 +37,9 @@ bool __list_add_valid(struct list_head *new, struct list_head *prev,
 
 	return true;
 }
-EXPORT_SYMBOL(__list_add_valid);
+EXPORT_SYMBOL(___list_add_valid);
 
-bool __list_del_entry_valid(struct list_head *entry)
+bool ___list_del_entry_valid(struct list_head *entry)
 {
 	struct list_head *prev, *next;
 
@@ -65,6 +65,5 @@ bool __list_del_entry_valid(struct list_head *entry)
 		return false;
 
 	return true;
-
 }
-EXPORT_SYMBOL(__list_del_entry_valid);
+EXPORT_SYMBOL(___list_del_entry_valid);
-- 
2.41.0.640.ga95def55d0-goog



* [PATCH v2 3/3] list_debug: Introduce CONFIG_DEBUG_LIST_MINIMAL
  2023-08-04  9:02 [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Marco Elver
  2023-08-04  9:02 ` [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks Marco Elver
@ 2023-08-04  9:02 ` Marco Elver
  2023-08-04 15:58 ` [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Steven Rostedt
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Marco Elver @ 2023-08-04  9:02 UTC (permalink / raw)
  To: elver, Andrew Morton, Kees Cook
  Cc: Guenter Roeck, Peter Zijlstra, Mark Rutland, Steven Rostedt,
	Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose,
	Zenghui Yu, Catalin Marinas, Will Deacon, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, Miguel Ojeda, linux-arm-kernel,
	kvmarm, linux-kernel, llvm, Dmitry Vyukov, Alexander Potapenko,
	kasan-dev, linux-toolchains

Numerous production kernel configs (see [1, 2]) are choosing to enable
CONFIG_DEBUG_LIST, which is also being recommended by KSPP for hardened
configs [3]. The feature was never designed with performance in mind,
yet common list manipulation happens in hot paths all over the kernel.

Introduce CONFIG_DEBUG_LIST_MINIMAL, which performs list pointer
checking inline, and only upon list corruption delegates to the
reporting slow path.

To generate optimal machine code with CONFIG_DEBUG_LIST_MINIMAL:

  1. Elide checking for pointer values which upon dereference would
     result in an immediate access fault -- therefore "minimal" checks.
     The trade-off is lower-quality error reports.

  2. Use the newly introduced __preserve_most function attribute
     (available with Clang, but not yet with GCC) to minimize the code
     footprint for calling the reporting slow path. As a result,
     function size of callers is reduced by avoiding saving registers
     before calling the rarely called reporting slow path.

     Note that all TUs in lib/Makefile already disable function tracing,
     including list_debug.c, and __preserve_most's implied notrace has
     no effect in this case.

  3. Because the inline checks are a subset of the full set of checks in
     ___list_*_valid(), always return false if the inline checks failed.
     This avoids a redundant compare and conditional branch right after
     returning from the slow path.

As a side-effect of the checks being inline, if the compiler can prove
some condition to always be true, it can completely elide some checks.
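
Condensed from the include/linux/list.h hunk below, the list_add variant of
the inline check then has roughly this shape (the list_del variant is
analogous):

static __always_inline bool __list_add_valid(struct list_head *new,
					     struct list_head *prev,
					     struct list_head *next)
{
	bool ret = true;

	if (IS_ENABLED(CONFIG_DEBUG_LIST_MINIMAL)) {
		/*
		 * Minimal inline checks: NULL next/prev are left to the
		 * immediate dereference in the condition below to fault on.
		 */
		if (likely(next->prev == prev && prev->next == next &&
			   new != prev && new != next))
			return true;
		ret = false;
	}

	/* Rarely taken reporting slow path (__cold __preserve_most). */
	ret &= ___list_add_valid(new, prev, next);
	return ret;
}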

Running netperf with CONFIG_DEBUG_LIST_MINIMAL (using a Clang compiler
with "preserve_most") shows throughput improvements, in my case of ~7%
on average (up to 20-30% on some test cases).

Link: https://r.android.com/1266735 [1]
Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/blob/main/config [2]
Link: https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings [3]
Signed-off-by: Marco Elver <elver@google.com>
---
v2:
* Note that lib/Makefile disables function tracing for everything and
  __preserve_most's implied notrace is a noop here.
---
 arch/arm64/kvm/hyp/nvhe/list_debug.c |  2 +
 include/linux/list.h                 | 56 +++++++++++++++++++++++++---
 lib/Kconfig.debug                    | 15 ++++++++
 lib/list_debug.c                     |  2 +
 4 files changed, 69 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/list_debug.c b/arch/arm64/kvm/hyp/nvhe/list_debug.c
index 589284496ac5..df718e29f6d4 100644
--- a/arch/arm64/kvm/hyp/nvhe/list_debug.c
+++ b/arch/arm64/kvm/hyp/nvhe/list_debug.c
@@ -26,6 +26,7 @@ static inline __must_check bool nvhe_check_data_corruption(bool v)
 
 /* The predicates checked here are taken from lib/list_debug.c. */
 
+__list_valid_slowpath
 bool ___list_add_valid(struct list_head *new, struct list_head *prev,
 		       struct list_head *next)
 {
@@ -37,6 +38,7 @@ bool ___list_add_valid(struct list_head *new, struct list_head *prev,
 	return true;
 }
 
+__list_valid_slowpath
 bool ___list_del_entry_valid(struct list_head *entry)
 {
 	struct list_head *prev, *next;
diff --git a/include/linux/list.h b/include/linux/list.h
index e0b2cf904409..a28a215a3eb1 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -39,20 +39,64 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
 }
 
 #ifdef CONFIG_DEBUG_LIST
-extern bool ___list_add_valid(struct list_head *new,
-			      struct list_head *prev,
-			      struct list_head *next);
+
+#ifdef CONFIG_DEBUG_LIST_MINIMAL
+# define __list_valid_slowpath __cold __preserve_most
+#else
+# define __list_valid_slowpath
+#endif
+
+extern bool __list_valid_slowpath ___list_add_valid(struct list_head *new,
+						    struct list_head *prev,
+						    struct list_head *next);
 static __always_inline bool __list_add_valid(struct list_head *new,
 					     struct list_head *prev,
 					     struct list_head *next)
 {
-	return ___list_add_valid(new, prev, next);
+	bool ret = true;
+
+	if (IS_ENABLED(CONFIG_DEBUG_LIST_MINIMAL)) {
+		/*
+		 * In the minimal config, elide checking if next and prev are
+		 * NULL, since the immediate dereference of them below would
+		 * result in a fault if NULL.
+		 *
+		 * With the minimal config we can afford to inline the checks,
+		 * which also gives the compiler a chance to elide some of them
+		 * completely if they can be proven at compile-time. If one of
+		 * the pre-conditions does not hold, the slow-path will show a
+		 * report which pre-condition failed.
+		 */
+		if (likely(next->prev == prev && prev->next == next && new != prev && new != next))
+			return true;
+		ret = false;
+	}
+
+	ret &= ___list_add_valid(new, prev, next);
+	return ret;
 }
 
-extern bool ___list_del_entry_valid(struct list_head *entry);
+extern bool __list_valid_slowpath ___list_del_entry_valid(struct list_head *entry);
 static __always_inline bool __list_del_entry_valid(struct list_head *entry)
 {
-	return ___list_del_entry_valid(entry);
+	bool ret = true;
+
+	if (IS_ENABLED(CONFIG_DEBUG_LIST_MINIMAL)) {
+		struct list_head *prev = entry->prev;
+		struct list_head *next = entry->next;
+
+		/*
+		 * In the minimal config, elide checking if next and prev are
+		 * NULL, LIST_POISON1 or LIST_POISON2, since the immediate
+		 * dereference of them below would result in a fault.
+		 */
+		if (likely(prev->next == entry && next->prev == entry))
+			return true;
+		ret = false;
+	}
+
+	ret &= ___list_del_entry_valid(entry);
+	return ret;
 }
 #else
 static inline bool __list_add_valid(struct list_head *new,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index fbc89baf7de6..e72cf08af0fa 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1680,6 +1680,21 @@ config DEBUG_LIST
 
 	  If unsure, say N.
 
+config DEBUG_LIST_MINIMAL
+	bool "Minimal linked list debug checks"
+	default !DEBUG_KERNEL
+	depends on DEBUG_LIST
+	help
+	  Only perform the minimal set of checks in the linked-list walking
+	  routines to catch corruptions that are not guaranteed to result in an
+	  immediate access fault.
+
+	  This trades lower quality error reports for improved performance: the
+	  generated code should be more optimal and provide trade-offs that may
+	  better serve safety- and performance-critical environments.
+
+	  If unsure, say Y.
+
 config DEBUG_PLIST
 	bool "Debug priority linked list manipulation"
 	depends on DEBUG_KERNEL
diff --git a/lib/list_debug.c b/lib/list_debug.c
index fd69009cc696..daad32855f0d 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -17,6 +17,7 @@
  * attempt).
  */
 
+__list_valid_slowpath
 bool ___list_add_valid(struct list_head *new, struct list_head *prev,
 		       struct list_head *next)
 {
@@ -39,6 +40,7 @@ bool ___list_add_valid(struct list_head *new, struct list_head *prev,
 }
 EXPORT_SYMBOL(___list_add_valid);
 
+__list_valid_slowpath
 bool ___list_del_entry_valid(struct list_head *entry)
 {
 	struct list_head *prev, *next;
-- 
2.41.0.640.ga95def55d0-goog



* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-04  9:02 [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Marco Elver
  2023-08-04  9:02 ` [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks Marco Elver
  2023-08-04  9:02 ` [PATCH v2 3/3] list_debug: Introduce CONFIG_DEBUG_LIST_MINIMAL Marco Elver
@ 2023-08-04 15:58 ` Steven Rostedt
  2023-08-04 18:14 ` Peter Zijlstra
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2023-08-04 15:58 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Nathan Chancellor, Nick Desaulniers, Tom Rix, Miguel Ojeda,
	linux-arm-kernel, kvmarm, linux-kernel, llvm, Dmitry Vyukov,
	Alexander Potapenko, kasan-dev, linux-toolchains

On Fri,  4 Aug 2023 11:02:56 +0200
Marco Elver <elver@google.com> wrote:

> [1]: "On X86-64 and AArch64 targets, this attribute changes the calling
> convention of a function. The preserve_most calling convention attempts
> to make the code in the caller as unintrusive as possible. This
> convention behaves identically to the C calling convention on how
> arguments and return values are passed, but it uses a different set of
> caller/callee-saved registers. This alleviates the burden of saving and
> recovering a large register set before and after the call in the
> caller."
> 
> [1] https://clang.llvm.org/docs/AttributeReference.html#preserve-most
> 
> Introduce the attribute to compiler_types.h as __preserve_most.
> 
> Use of this attribute results in better code generation for calls to
> very rarely called functions, such as error-reporting functions, or
> rarely executed slow paths.
> 
> Beware that the attribute conflicts with instrumentation calls inserted
> on function entry which do not use __preserve_most themselves. Notably,
> function tracing which assumes the normal C calling convention for the
> given architecture.  Where the attribute is supported, __preserve_most
> will imply notrace. It is recommended to restrict use of the attribute
> to functions that should or already disable tracing.
> 
> Signed-off-by: Marco Elver <elver@google.com>
> ---
> v2:
> * Imply notrace, to avoid any conflicts with tracing which is inserted
>   on function entry. See added comments.

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve

> ---
>  include/linux/compiler_types.h | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 547ea1ff806e..12c4540335b7 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -106,6 +106,33 @@ static inline void __chk_io_ptr(const volatile void __iomem *ptr) { }
>  #define __cold
>  #endif
>  
> +/*
> + * On x86-64 and arm64 targets, __preserve_most changes the calling convention
> + * of a function to make the code in the caller as unintrusive as possible. This
> + * convention behaves identically to the C calling convention on how arguments
> + * and return values are passed, but uses a different set of caller- and callee-
> + * saved registers.
> + *
> + * The purpose is to alleviate the burden of saving and recovering a large
> + * register set before and after the call in the caller.  This is beneficial for
> + * rarely taken slow paths, such as error-reporting functions that may be called
> + * from hot paths.
> + *
> + * Note: This may conflict with instrumentation inserted on function entry which
> + * does not use __preserve_most or equivalent convention (if in assembly). Since
> + * function tracing assumes the normal C calling convention, where the attribute
> + * is supported, __preserve_most implies notrace.
> + *
> + * Optional: not supported by gcc.
> + *
> + * clang: https://clang.llvm.org/docs/AttributeReference.html#preserve-most
> + */
> +#if __has_attribute(__preserve_most__)
> +# define __preserve_most notrace __attribute__((__preserve_most__))
> +#else
> +# define __preserve_most
> +#endif
> +
>  /* Builtins */
>  
>  /*



* Re: [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks
  2023-08-04  9:02 ` [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks Marco Elver
@ 2023-08-04 16:03   ` Steven Rostedt
  2023-08-04 17:49     ` Marco Elver
  2023-08-05  6:30     ` Miguel Ojeda
  0 siblings, 2 replies; 26+ messages in thread
From: Steven Rostedt @ 2023-08-04 16:03 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Nathan Chancellor, Nick Desaulniers, Tom Rix, Miguel Ojeda,
	linux-arm-kernel, kvmarm, linux-kernel, llvm, Dmitry Vyukov,
	Alexander Potapenko, kasan-dev, linux-toolchains

On Fri,  4 Aug 2023 11:02:57 +0200
Marco Elver <elver@google.com> wrote:

> Turn the list debug checking functions __list_*_valid() into inline
> functions that wrap the out-of-line functions. Care is taken to ensure
> the inline wrappers are always inlined, so that additional compiler
> instrumentation (such as sanitizers) does not result in redundant
> outlining.
> 
> This change is preparation for performing checks in the inline wrappers.
> 
> No functional change intended.

I think the entire pattern of underscored functions calling more-underscored
functions in the kernel is an abomination. Yes, there's lots of precedent
for this craziness, but let's not extend it.

Can we give actual real names to why the function is "special" besides that
it now has another underscore added to it?

I've been guilty of this madness myself, but I have learned the errors of
my ways, and have been avoiding doing so in any new code I write.

-- Steve


> 
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/list_debug.c |  6 +++---
>  include/linux/list.h                 | 15 +++++++++++++--
>  lib/list_debug.c                     | 11 +++++------
>  3 files changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/list_debug.c b/arch/arm64/kvm/hyp/nvhe/list_debug.c
> index d68abd7ea124..589284496ac5 100644
> --- a/arch/arm64/kvm/hyp/nvhe/list_debug.c
> +++ b/arch/arm64/kvm/hyp/nvhe/list_debug.c
> @@ -26,8 +26,8 @@ static inline __must_check bool nvhe_check_data_corruption(bool v)
>  
>  /* The predicates checked here are taken from lib/list_debug.c. */
>  
> -bool __list_add_valid(struct list_head *new, struct list_head *prev,
> -		      struct list_head *next)
> +bool ___list_add_valid(struct list_head *new, struct list_head *prev,
> +		       struct list_head *next)
>  {
>  	if (NVHE_CHECK_DATA_CORRUPTION(next->prev != prev) ||
>  	    NVHE_CHECK_DATA_CORRUPTION(prev->next != next) ||
> @@ -37,7 +37,7 @@ bool __list_add_valid(struct list_head *new, struct list_head *prev,
>  	return true;
>  }
>  
> -bool __list_del_entry_valid(struct list_head *entry)
> +bool ___list_del_entry_valid(struct list_head *entry)
>  {
>  	struct list_head *prev, *next;
>  
> diff --git a/include/linux/list.h b/include/linux/list.h
> index f10344dbad4d..e0b2cf904409 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -39,10 +39,21 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
>  }
>  
>  #ifdef CONFIG_DEBUG_LIST
> -extern bool __list_add_valid(struct list_head *new,
> +extern bool ___list_add_valid(struct list_head *new,
>  			      struct list_head *prev,
>  			      struct list_head *next);
> -extern bool __list_del_entry_valid(struct list_head *entry);
> +static __always_inline bool __list_add_valid(struct list_head *new,
> +					     struct list_head *prev,
> +					     struct list_head *next)
> +{
> +	return ___list_add_valid(new, prev, next);
> +}
> +
> +extern bool ___list_del_entry_valid(struct list_head *entry);
> +static __always_inline bool __list_del_entry_valid(struct list_head *entry)
> +{
> +	return ___list_del_entry_valid(entry);
> +}
>  #else
>  static inline bool __list_add_valid(struct list_head *new,
>  				struct list_head *prev,
> diff --git a/lib/list_debug.c b/lib/list_debug.c
> index d98d43f80958..fd69009cc696 100644
> --- a/lib/list_debug.c
> +++ b/lib/list_debug.c
> @@ -17,8 +17,8 @@
>   * attempt).
>   */
>  
> -bool __list_add_valid(struct list_head *new, struct list_head *prev,
> -		      struct list_head *next)
> +bool ___list_add_valid(struct list_head *new, struct list_head *prev,
> +		       struct list_head *next)
>  {
>  	if (CHECK_DATA_CORRUPTION(prev == NULL,
>  			"list_add corruption. prev is NULL.\n") ||
> @@ -37,9 +37,9 @@ bool __list_add_valid(struct list_head *new, struct list_head *prev,
>  
>  	return true;
>  }
> -EXPORT_SYMBOL(__list_add_valid);
> +EXPORT_SYMBOL(___list_add_valid);
>  
> -bool __list_del_entry_valid(struct list_head *entry)
> +bool ___list_del_entry_valid(struct list_head *entry)
>  {
>  	struct list_head *prev, *next;
>  
> @@ -65,6 +65,5 @@ bool __list_del_entry_valid(struct list_head *entry)
>  		return false;
>  
>  	return true;
> -
>  }
> -EXPORT_SYMBOL(__list_del_entry_valid);
> +EXPORT_SYMBOL(___list_del_entry_valid);



* Re: [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks
  2023-08-04 16:03   ` Steven Rostedt
@ 2023-08-04 17:49     ` Marco Elver
  2023-08-04 17:57       ` Steven Rostedt
  2023-08-05  6:30     ` Miguel Ojeda
  1 sibling, 1 reply; 26+ messages in thread
From: Marco Elver @ 2023-08-04 17:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Nathan Chancellor, Nick Desaulniers, Tom Rix, Miguel Ojeda,
	linux-arm-kernel, kvmarm, linux-kernel, llvm, Dmitry Vyukov,
	Alexander Potapenko, kasan-dev, linux-toolchains

On Fri, 4 Aug 2023 at 18:03, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri,  4 Aug 2023 11:02:57 +0200
> Marco Elver <elver@google.com> wrote:
>
> > Turn the list debug checking functions __list_*_valid() into inline
> > functions that wrap the out-of-line functions. Care is taken to ensure
> > the inline wrappers are always inlined, so that additional compiler
> > instrumentation (such as sanitizers) does not result in redundant
> > outlining.
> >
> > This change is preparation for performing checks in the inline wrappers.
> >
> > No functional change intended.
>
> I think the entire pattern of underscored functions calling more-underscored
> functions in the kernel is an abomination. Yes, there's lots of precedent
> for this craziness, but let's not extend it.
>
> Can we give actual real names to why the function is "special" besides that
> it now has another underscore added to it?
>
> I've been guilty of this madness myself, but I have learned the errors of
> my ways, and have been avoiding doing so in any new code I write.

That's fair. We can call them __list_*_valid() (inline), and
__list_*_valid_or_report() ?


* Re: [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks
  2023-08-04 17:49     ` Marco Elver
@ 2023-08-04 17:57       ` Steven Rostedt
  2023-08-04 17:59         ` Steven Rostedt
  0 siblings, 1 reply; 26+ messages in thread
From: Steven Rostedt @ 2023-08-04 17:57 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Nathan Chancellor, Nick Desaulniers, Tom Rix, Miguel Ojeda,
	linux-arm-kernel, kvmarm, linux-kernel, llvm, Dmitry Vyukov,
	Alexander Potapenko, kasan-dev, linux-toolchains

On Fri, 4 Aug 2023 19:49:48 +0200
Marco Elver <elver@google.com> wrote:

> > I've been guilty of this madness myself, but I have learned the errors of
> > my ways, and have been avoiding doing so in any new code I write.  
> 
> That's fair. We can call them __list_*_valid() (inline), and
> __list_*_valid_or_report() ?

__list_*_valid_check() ?

-- Steve


* Re: [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks
  2023-08-04 17:57       ` Steven Rostedt
@ 2023-08-04 17:59         ` Steven Rostedt
  2023-08-04 18:08           ` Marco Elver
  2023-08-04 18:19           ` Peter Zijlstra
  0 siblings, 2 replies; 26+ messages in thread
From: Steven Rostedt @ 2023-08-04 17:59 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Nathan Chancellor, Nick Desaulniers, Tom Rix, Miguel Ojeda,
	linux-arm-kernel, kvmarm, linux-kernel, llvm, Dmitry Vyukov,
	Alexander Potapenko, kasan-dev, linux-toolchains

On Fri, 4 Aug 2023 13:57:57 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 4 Aug 2023 19:49:48 +0200
> Marco Elver <elver@google.com> wrote:
> 
> > > I've been guilty of this madness myself, but I have learned the errors of
> > > my ways, and have been avoiding doing so in any new code I write.    
> > 
> > That's fair. We can call them __list_*_valid() (inline), and
> > __list_*_valid_or_report() ?  
> 
> __list_*_valid_check() ?
> 

I have to admit, I think the main reason kernel developers default to using
these useless underscores is because kernel developers are notoriously
lousy at naming. ;-)

-- Steve



* Re: [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks
  2023-08-04 17:59         ` Steven Rostedt
@ 2023-08-04 18:08           ` Marco Elver
  2023-08-04 18:19           ` Peter Zijlstra
  1 sibling, 0 replies; 26+ messages in thread
From: Marco Elver @ 2023-08-04 18:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Nathan Chancellor, Nick Desaulniers, Tom Rix, Miguel Ojeda,
	linux-arm-kernel, kvmarm, linux-kernel, llvm, Dmitry Vyukov,
	Alexander Potapenko, kasan-dev, linux-toolchains

On Fri, 4 Aug 2023 at 19:59, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 4 Aug 2023 13:57:57 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > On Fri, 4 Aug 2023 19:49:48 +0200
> > Marco Elver <elver@google.com> wrote:
> >
> > > > I've been guilty of this madness myself, but I have learned the errors of
> > > > my ways, and have been avoiding doing so in any new code I write.
> > >
> > > That's fair. We can call them __list_*_valid() (inline), and
> > > __list_*_valid_or_report() ?
> >
> > __list_*_valid_check() ?

Well, in patch 3/3, the inline function will also do a reduced set of
checks, so "valid_check" is also misleading because both will do checks.

The key distinguishing thing between the inline and non-inline version
is that the non-inline version will check more things, and also
produce reports.

So I can see either of these being appropriate:

 1. __list_*_valid_or_report()
 2. __list_*_full_valid()

Preference?

> I have to admit, I think the main reason kernel developers default to using
> these useless underscores is because kernel developers are notoriously
> lousy at naming. ;-)

Heh, naming is hard. ;-)


* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-04  9:02 [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Marco Elver
                   ` (2 preceding siblings ...)
  2023-08-04 15:58 ` [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Steven Rostedt
@ 2023-08-04 18:14 ` Peter Zijlstra
  2023-08-05  6:35 ` Miguel Ojeda
  2023-08-07 11:41 ` Florian Weimer
  5 siblings, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2023-08-04 18:14 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Mark Rutland,
	Steven Rostedt, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Nathan Chancellor, Nick Desaulniers, Tom Rix, Miguel Ojeda,
	linux-arm-kernel, kvmarm, linux-kernel, llvm, Dmitry Vyukov,
	Alexander Potapenko, kasan-dev, linux-toolchains

On Fri, Aug 04, 2023 at 11:02:56AM +0200, Marco Elver wrote:
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 547ea1ff806e..12c4540335b7 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -106,6 +106,33 @@ static inline void __chk_io_ptr(const volatile void __iomem *ptr) { }
>  #define __cold
>  #endif
>  
> +/*
> + * On x86-64 and arm64 targets, __preserve_most changes the calling convention
> + * of a function to make the code in the caller as unintrusive as possible. This
> + * convention behaves identically to the C calling convention on how arguments
> + * and return values are passed, but uses a different set of caller- and callee-
> + * saved registers.
> + *
> + * The purpose is to alleviate the burden of saving and recovering a large
> + * register set before and after the call in the caller.  This is beneficial for
> + * rarely taken slow paths, such as error-reporting functions that may be called
> + * from hot paths.
> + *
> + * Note: This may conflict with instrumentation inserted on function entry which
> + * does not use __preserve_most or equivalent convention (if in assembly). Since
> + * function tracing assumes the normal C calling convention, where the attribute
> + * is supported, __preserve_most implies notrace.
> + *
> + * Optional: not supported by gcc.
> + *
> + * clang: https://clang.llvm.org/docs/AttributeReference.html#preserve-most
> + */
> +#if __has_attribute(__preserve_most__)
> +# define __preserve_most notrace __attribute__((__preserve_most__))
> +#else
> +# define __preserve_most
> +#endif

This seems to shrink the ARM64 vmlinux just a little and mirrors what we
do on x86 in asm. I'll leave it to the arm64 people to judge if this is
worth the hassle.

Index: linux-2.6/arch/arm64/include/asm/preempt.h
===================================================================
--- linux-2.6.orig/arch/arm64/include/asm/preempt.h
+++ linux-2.6/arch/arm64/include/asm/preempt.h
@@ -88,15 +88,18 @@ void preempt_schedule_notrace(void);
 #ifdef CONFIG_PREEMPT_DYNAMIC
 
 DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
-void dynamic_preempt_schedule(void);
+void __preserve_most dynamic_preempt_schedule(void);
 #define __preempt_schedule()		dynamic_preempt_schedule()
-void dynamic_preempt_schedule_notrace(void);
+void __preserve_most dynamic_preempt_schedule_notrace(void);
 #define __preempt_schedule_notrace()	dynamic_preempt_schedule_notrace()
 
 #else /* CONFIG_PREEMPT_DYNAMIC */
 
-#define __preempt_schedule()		preempt_schedule()
-#define __preempt_schedule_notrace()	preempt_schedule_notrace()
+void __preserve_most preserve_preempt_schedule(void);
+void __preserve_most preserve_preempt_schedule_notrace(void);
+
+#define __preempt_schedule()		preserve_preempt_schedule()
+#define __preempt_schedule_notrace()	preserve_preempt_schedule_notrace()
 
 #endif /* CONFIG_PREEMPT_DYNAMIC */
 #endif /* CONFIG_PREEMPTION */
Index: linux-2.6/kernel/sched/core.c
===================================================================
--- linux-2.6.orig/kernel/sched/core.c
+++ linux-2.6/kernel/sched/core.c
@@ -6915,7 +6915,7 @@ DEFINE_STATIC_CALL(preempt_schedule, pre
 EXPORT_STATIC_CALL_TRAMP(preempt_schedule);
 #elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
 static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule);
-void __sched notrace dynamic_preempt_schedule(void)
+void __sched __preserve_most dynamic_preempt_schedule(void)
 {
 	if (!static_branch_unlikely(&sk_dynamic_preempt_schedule))
 		return;
@@ -6924,6 +6924,11 @@ void __sched notrace dynamic_preempt_sch
 NOKPROBE_SYMBOL(dynamic_preempt_schedule);
 EXPORT_SYMBOL(dynamic_preempt_schedule);
 #endif
+#else
+void __sched __preserve_most preserve_preempt_schedule(void)
+{
+	preempt_schedule();
+}
 #endif
 
 /**
@@ -6988,7 +6993,7 @@ DEFINE_STATIC_CALL(preempt_schedule_notr
 EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace);
 #elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
 static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace);
-void __sched notrace dynamic_preempt_schedule_notrace(void)
+void __sched __preserve_most dynamic_preempt_schedule_notrace(void)
 {
 	if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace))
 		return;
@@ -6997,6 +7002,11 @@ void __sched notrace dynamic_preempt_sch
 NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace);
 EXPORT_SYMBOL(dynamic_preempt_schedule_notrace);
 #endif
+#else
+void __sched __preserve_most preserve_preempt_schedule_notrace(void)
+{
+	preempt_schedule_notrace();
+}
 #endif
 
 #endif /* CONFIG_PREEMPTION */


* Re: [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks
  2023-08-04 17:59         ` Steven Rostedt
  2023-08-04 18:08           ` Marco Elver
@ 2023-08-04 18:19           ` Peter Zijlstra
  1 sibling, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2023-08-04 18:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Mark Rutland, Marc Zyngier, Oliver Upton, James Morse,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Nathan Chancellor, Nick Desaulniers, Tom Rix, Miguel Ojeda,
	linux-arm-kernel, kvmarm, linux-kernel, llvm, Dmitry Vyukov,
	Alexander Potapenko, kasan-dev, linux-toolchains

On Fri, Aug 04, 2023 at 01:59:02PM -0400, Steven Rostedt wrote:
> On Fri, 4 Aug 2023 13:57:57 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Fri, 4 Aug 2023 19:49:48 +0200
> > Marco Elver <elver@google.com> wrote:
> > 
> > > > I've been guilty of this madness myself, but I have learned the errors of
> > > > my ways, and have been avoiding doing so in any new code I write.    
> > > 
> > > That's fair. We can call them __list_*_valid() (inline), and
> > > __list_*_valid_or_report() ?  
> > 
> > __list_*_valid_check() ?
> > 
> 
> I have to admit, I think the main reason kernel developers default to using
> these useless underscores is because kernel developers are notoriously
> lousy at naming. ;-)

Well, that and I detest novella-length identifiers.


* Re: [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks
  2023-08-04 16:03   ` Steven Rostedt
  2023-08-04 17:49     ` Marco Elver
@ 2023-08-05  6:30     ` Miguel Ojeda
  1 sibling, 0 replies; 26+ messages in thread
From: Miguel Ojeda @ 2023-08-05  6:30 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Peter Zijlstra, Mark Rutland, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

On Fri, Aug 4, 2023 at 6:03 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Can we give actual real names to why the function is "special" besides that
> it now has another underscore added to it?

+1, some docs on top of that can also help a lot to explain the
intended difference quickly to a reader.

Cheers,
Miguel


* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-04  9:02 [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Marco Elver
                   ` (3 preceding siblings ...)
  2023-08-04 18:14 ` Peter Zijlstra
@ 2023-08-05  6:35 ` Miguel Ojeda
  2023-08-07 11:41 ` Florian Weimer
  5 siblings, 0 replies; 26+ messages in thread
From: Miguel Ojeda @ 2023-08-05  6:35 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

On Fri, Aug 4, 2023 at 11:06 AM Marco Elver <elver@google.com> wrote:
>
> will imply notrace. It is recommended to restrict use of the attribute
> to functions that should or already disable tracing.

Should the last sentence here be added into the code comment?

Apart from that this looks fine from a `compiler_attributes.h`
perspective (even if we are not there anymore):

Reviewed-by: Miguel Ojeda <ojeda@kernel.org>

Cheers,
Miguel


* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-04  9:02 [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Marco Elver
                   ` (4 preceding siblings ...)
  2023-08-05  6:35 ` Miguel Ojeda
@ 2023-08-07 11:41 ` Florian Weimer
  2023-08-07 12:24   ` Marco Elver
                     ` (3 more replies)
  5 siblings, 4 replies; 26+ messages in thread
From: Florian Weimer @ 2023-08-07 11:41 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

* Marco Elver:

> [1]: "On X86-64 and AArch64 targets, this attribute changes the calling
> convention of a function. The preserve_most calling convention attempts
> to make the code in the caller as unintrusive as possible. This
> convention behaves identically to the C calling convention on how
> arguments and return values are passed, but it uses a different set of
> caller/callee-saved registers. This alleviates the burden of saving and
> recovering a large register set before and after the call in the
> caller."
>
> [1] https://clang.llvm.org/docs/AttributeReference.html#preserve-most

You dropped the interesting part:

| If the arguments are passed in callee-saved registers, then they will
| be preserved by the callee across the call. This doesn’t apply for
| values returned in callee-saved registers.
| 
|  ·  On X86-64 the callee preserves all general purpose registers, except
|     for R11. R11 can be used as a scratch register. Floating-point
|     registers (XMMs/YMMs) are not preserved and need to be saved by the
|     caller.
|     
|  ·  On AArch64 the callee preserve all general purpose registers, except
|     X0-X8 and X16-X18.

Ideally, this would be documented in the respective psABI supplement.
I filled in some gaps and filed:

  Document the ABI for __preserve_most__ function calls
  <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/45>

Doesn't this change impact the kernel module ABI?

I would really expect a check here

> +#if __has_attribute(__preserve_most__)
> +# define __preserve_most notrace __attribute__((__preserve_most__))
> +#else
> +# define __preserve_most
> +#endif

that this is not a compilation for a module.  Otherwise modules built
with a compiler with __preserve_most__ attribute support are
incompatible with kernels built with a compiler without that attribute.

Thanks,
Florian



* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 11:41 ` Florian Weimer
@ 2023-08-07 12:24   ` Marco Elver
  2023-08-07 12:36     ` Florian Weimer
  2023-08-07 12:38     ` Jakub Jelinek
  2023-08-07 12:31   ` Peter Zijlstra
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 26+ messages in thread
From: Marco Elver @ 2023-08-07 12:24 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

On Mon, 7 Aug 2023 at 13:41, Florian Weimer <fweimer@redhat.com> wrote:
>
> * Marco Elver:
>
> > [1]: "On X86-64 and AArch64 targets, this attribute changes the calling
> > convention of a function. The preserve_most calling convention attempts
> > to make the code in the caller as unintrusive as possible. This
> > convention behaves identically to the C calling convention on how
> > arguments and return values are passed, but it uses a different set of
> > caller/callee-saved registers. This alleviates the burden of saving and
> > recovering a large register set before and after the call in the
> > caller."
> >
> > [1] https://clang.llvm.org/docs/AttributeReference.html#preserve-most
>
> You dropped the interesting part:

I will add it back for the kernel documentation.

> | If the arguments are passed in callee-saved registers, then they will
> | be preserved by the callee across the call. This doesn’t apply for
> | values returned in callee-saved registers.
> |
> |  ·  On X86-64 the callee preserves all general purpose registers, except
> |     for R11. R11 can be used as a scratch register. Floating-point
> |     registers (XMMs/YMMs) are not preserved and need to be saved by the
> |     caller.
> |
> |  ·  On AArch64 the callee preserve all general purpose registers, except
> |     X0-X8 and X16-X18.
>
> Ideally, this would be documented in the respective psABI supplement.
> I filled in some gaps and filed:
>
>   Document the ABI for __preserve_most__ function calls
>   <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/45>

Good idea. I had already created
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899, and we need
better spec to proceed for GCC anyway.

> Doesn't this change impact the kernel module ABI?
>
> I would really expect a check here
>
> > +#if __has_attribute(__preserve_most__)
> > +# define __preserve_most notrace __attribute__((__preserve_most__))
> > +#else
> > +# define __preserve_most
> > +#endif
>
> that this is not a compilation for a module.  Otherwise modules built
> with a compiler with __preserve_most__ attribute support are
> incompatible with kernels built with a compiler without that attribute.

That's true, but is it a real problem? Isn't it known that trying to
make kernel modules built for a kernel with a different config (incl.
compiler) is not guaranteed to work? See IBT, CFI schemes, kernel
sanitizers, etc?

If we were to start trying to introduce some kind of minimal kernel to
module ABI so that modules and kernels built with different toolchains
keep working together, we'd need a mechanism to guarantee this minimal
ABI or prohibit incompatible modules and kernels somehow. Is there a
precedent for this somewhere?


* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 11:41 ` Florian Weimer
  2023-08-07 12:24   ` Marco Elver
@ 2023-08-07 12:31   ` Peter Zijlstra
  2023-08-08  2:16     ` Steven Rostedt
  2023-08-07 15:27   ` Nick Desaulniers
  2023-08-08 10:57   ` Peter Zijlstra
  3 siblings, 1 reply; 26+ messages in thread
From: Peter Zijlstra @ 2023-08-07 12:31 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

On Mon, Aug 07, 2023 at 01:41:07PM +0200, Florian Weimer wrote:
> * Marco Elver:
> 
> > [1]: "On X86-64 and AArch64 targets, this attribute changes the calling
> > convention of a function. The preserve_most calling convention attempts
> > to make the code in the caller as unintrusive as possible. This
> > convention behaves identically to the C calling convention on how
> > arguments and return values are passed, but it uses a different set of
> > caller/callee-saved registers. This alleviates the burden of saving and
> > recovering a large register set before and after the call in the
> > caller."
> >
> > [1] https://clang.llvm.org/docs/AttributeReference.html#preserve-most
> 
> You dropped the interesting part:
> 
> | If the arguments are passed in callee-saved registers, then they will
> | be preserved by the callee across the call. This doesn’t apply for
> | values returned in callee-saved registers.
> | 
> |  ·  On X86-64 the callee preserves all general purpose registers, except
> |     for R11. R11 can be used as a scratch register. Floating-point
> |     registers (XMMs/YMMs) are not preserved and need to be saved by the
> |     caller.
> |     
> |  ·  On AArch64 the callee preserve all general purpose registers, except
> |     X0-X8 and X16-X18.
> 
> Ideally, this would be documented in the respective psABI supplement.
> I filled in some gaps and filed:
> 
>   Document the ABI for __preserve_most__ function calls
>   <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/45>
> 
> Doesn't this change impact the kernel module ABI?
> 
> I would really expect a check here
> 
> > +#if __has_attribute(__preserve_most__)
> > +# define __preserve_most notrace __attribute__((__preserve_most__))
> > +#else
> > +# define __preserve_most
> > +#endif
> 
> that this is not a compilation for a module.  Otherwise modules built
> with a compiler with __preserve_most__ attribute support are
> incompatible with kernels built with a compiler without that attribute.

We have a metric ton of options that can break module ABI. If you're
daft enough to not build with the exact same compiler and .config you
get to keep the pieces.


* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 12:24   ` Marco Elver
@ 2023-08-07 12:36     ` Florian Weimer
  2023-08-07 13:07       ` Marco Elver
  2023-08-07 15:06       ` Peter Zijlstra
  2023-08-07 12:38     ` Jakub Jelinek
  1 sibling, 2 replies; 26+ messages in thread
From: Florian Weimer @ 2023-08-07 12:36 UTC (permalink / raw)
  To: Marco Elver
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

* Marco Elver:

> Good idea. I had already created
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899, and we need
> better spec to proceed for GCC anyway.

Thanks for the reference.

>> Doesn't this change impact the kernel module ABI?
>>
>> I would really expect a check here
>>
>> > +#if __has_attribute(__preserve_most__)
>> > +# define __preserve_most notrace __attribute__((__preserve_most__))
>> > +#else
>> > +# define __preserve_most
>> > +#endif
>>
>> that this is not a compilation for a module.  Otherwise modules built
>> with a compiler with __preserve_most__ attribute support are
>> incompatible with kernels built with a compiler without that attribute.
>
> That's true, but is it a real problem? Isn't it known that trying to
> make kernel modules built for a kernel with a different config (incl.
> compiler) is not guaranteed to work? See IBT, CFI schemes, kernel
> sanitizers, etc?
>
> If we were to start trying to introduce some kind of minimal kernel to
> module ABI so that modules and kernels built with different toolchains
> keep working together, we'd need a mechanism to guarantee this minimal
> ABI or prohibit incompatible modules and kernels somehow. Is there a
> precedent for this somewhere?

I think the GCC vs Clang thing is expected to work today, isn't it?
Using the Clang-based BPF tools with a GCC-compiled kernel requires a
matching ABI.

The other things you listed result in fairly obvious breakage, sometimes
even module loading failures.  Unconditional crashes are possible as
well.  With __preserve_most__, the issues are much more subtle and may
only appear for some kernel/module compiler combinations and
optimization settings.  The impact of incorrectly clobbered registers
tends to be like that.

Thanks,
Florian



* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 12:24   ` Marco Elver
  2023-08-07 12:36     ` Florian Weimer
@ 2023-08-07 12:38     ` Jakub Jelinek
  2023-08-07 12:43       ` Florian Weimer
  1 sibling, 1 reply; 26+ messages in thread
From: Jakub Jelinek @ 2023-08-07 12:38 UTC (permalink / raw)
  To: Marco Elver
  Cc: Florian Weimer, Andrew Morton, Kees Cook, Guenter Roeck,
	Peter Zijlstra, Mark Rutland, Steven Rostedt, Marc Zyngier,
	Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, Miguel Ojeda, linux-arm-kernel,
	kvmarm, linux-kernel, llvm, Dmitry Vyukov, Alexander Potapenko,
	kasan-dev, linux-toolchains

On Mon, Aug 07, 2023 at 02:24:26PM +0200, Marco Elver wrote:
> > | If the arguments are passed in callee-saved registers, then they will
> > | be preserved by the callee across the call. This doesn’t apply for
> > | values returned in callee-saved registers.
> > |
> > |  ·  On X86-64 the callee preserves all general purpose registers, except
> > |     for R11. R11 can be used as a scratch register. Floating-point
> > |     registers (XMMs/YMMs) are not preserved and need to be saved by the
> > |     caller.
> > |
> > |  ·  On AArch64 the callee preserve all general purpose registers, except
> > |     X0-X8 and X16-X18.
> >
> > Ideally, this would be documented in the respective psABI supplement.
> > I filled in some gaps and filed:
> >
> >   Document the ABI for __preserve_most__ function calls
> >   <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/45>
> 
> Good idea. I had already created
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899, and we need
> better spec to proceed for GCC anyway.

"Registers used for passing arguments
are preserved by the called function, but registers used for
returning results are not."

You mean just GPRs, or also vector SSE or MMX registers?  Because if some
of those are to be preserved by the callee, there is an issue that they need
to be handled e.g. during unwinding, with all the consequences.  It is hard
or impossible to guess what size needs to be saved/restored, both normally
and during unwinding.  The caller could be, say, -mavx512f and expect
preservation of all 512 bits while the callee is -msse2 or -mavx{,2}, or the
caller -mavx{,2} and expect preservation of all 256 bits while the callee is
-msse2, etc.  The MS ABI "solves" that by making just the low 128 bits
preserved and the upper bits clobbered.

	Jakub



* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 12:38     ` Jakub Jelinek
@ 2023-08-07 12:43       ` Florian Weimer
  2023-08-07 13:06         ` Jakub Jelinek
  0 siblings, 1 reply; 26+ messages in thread
From: Florian Weimer @ 2023-08-07 12:43 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Peter Zijlstra, Mark Rutland, Steven Rostedt, Marc Zyngier,
	Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, Miguel Ojeda, linux-arm-kernel,
	kvmarm, linux-kernel, llvm, Dmitry Vyukov, Alexander Potapenko,
	kasan-dev, linux-toolchains

* Jakub Jelinek:

> On Mon, Aug 07, 2023 at 02:24:26PM +0200, Marco Elver wrote:
>> > | If the arguments are passed in callee-saved registers, then they will
>> > | be preserved by the callee across the call. This doesn’t apply for
>> > | values returned in callee-saved registers.
>> > |
>> > |  ·  On X86-64 the callee preserves all general purpose registers, except
>> > |     for R11. R11 can be used as a scratch register. Floating-point
>> > |     registers (XMMs/YMMs) are not preserved and need to be saved by the
>> > |     caller.
>> > |
>> > |  ·  On AArch64 the callee preserve all general purpose registers, except
>> > |     X0-X8 and X16-X18.
>> >
>> > Ideally, this would be documented in the respective psABI supplement.
>> > I filled in some gaps and filed:
>> >
>> >   Document the ABI for __preserve_most__ function calls
>> >   <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/45>
>> 
>> Good idea. I had already created
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899, and we need
>> better spec to proceed for GCC anyway.
>
> "Registers used for passing arguments
> are preserved by the called function, but registers used for
> returning results are not."
>
> You mean just GPRs or also vector SSE or MMX registers?

I think this is pretty clear for x86-64:

| Floating-point registers (XMMs/YMMs) are not preserved and need to be
| saved by the caller.

The issue is more with future GPR extensions like APX.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 12:43       ` Florian Weimer
@ 2023-08-07 13:06         ` Jakub Jelinek
  0 siblings, 0 replies; 26+ messages in thread
From: Jakub Jelinek @ 2023-08-07 13:06 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Peter Zijlstra, Mark Rutland, Steven Rostedt, Marc Zyngier,
	Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, Miguel Ojeda, linux-arm-kernel,
	kvmarm, linux-kernel, llvm, Dmitry Vyukov, Alexander Potapenko,
	kasan-dev, linux-toolchains

On Mon, Aug 07, 2023 at 02:43:39PM +0200, Florian Weimer wrote:
> > On Mon, Aug 07, 2023 at 02:24:26PM +0200, Marco Elver wrote:
> >> > | If the arguments are passed in callee-saved registers, then they will
> >> > | be preserved by the callee across the call. This doesn’t apply for
> >> > | values returned in callee-saved registers.
> >> > |
> >> > |  ·  On X86-64 the callee preserves all general purpose registers, except
> >> > |     for R11. R11 can be used as a scratch register. Floating-point
> >> > |     registers (XMMs/YMMs) are not preserved and need to be saved by the
> >> > |     caller.
> >> > |
> >> > |  ·  On AArch64 the callee preserve all general purpose registers, except
> >> > |     X0-X8 and X16-X18.
> >> >
> >> > Ideally, this would be documented in the respective psABI supplement.
> >> > I filled in some gaps and filed:
> >> >
> >> >   Document the ABI for __preserve_most__ function calls
> >> >   <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/45>
> >> 
> >> Good idea. I had already created
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899, and we need
> >> better spec to proceed for GCC anyway.
> >
> > "Registers used for passing arguments
> > are preserved by the called function, but registers used for
> > returning results are not."
> >
> > You mean just GPRs or also vector SSE or MMX registers?
> 
> I think this is pretty clear for x86-64:
> 
> | Floating-point registers (XMMs/YMMs) are not preserved and need to be
> | saved by the caller.

The above wording conflicts with that, so it should be clarified.

	Jakub


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 12:36     ` Florian Weimer
@ 2023-08-07 13:07       ` Marco Elver
  2023-08-07 15:06       ` Peter Zijlstra
  1 sibling, 0 replies; 26+ messages in thread
From: Marco Elver @ 2023-08-07 13:07 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Andrew Morton, Kees Cook, Guenter Roeck, Peter Zijlstra,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

On Mon, 7 Aug 2023 at 14:37, Florian Weimer <fweimer@redhat.com> wrote:
>
> * Marco Elver:
>
> > Good idea. I had already created
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110899, and we need
> > better spec to proceed for GCC anyway.
>
> Thanks for the reference.
>
> >> Doesn't this change impact the kernel module ABI?
> >>
> >> I would really expect a check here
> >>
> >> > +#if __has_attribute(__preserve_most__)
> >> > +# define __preserve_most notrace __attribute__((__preserve_most__))
> >> > +#else
> >> > +# define __preserve_most
> >> > +#endif
> >>
> >> that this is not a compilation for a module.  Otherwise modules built
> >> with a compiler with __preserve_most__ attribute support are
> >> incompatible with kernels built with a compiler without that attribute.
> >
> > That's true, but is it a real problem? Isn't it known that trying to
> > make kernel modules built for a kernel with a different config (incl.
> > compiler) is not guaranteed to work? See IBT, CFI schemes, kernel
> > sanitizers, etc?
> >
> > If we were to start trying to introduce some kind of minimal kernel to
> > module ABI so that modules and kernels built with different toolchains
> > keep working together, we'd need a mechanism to guarantee this minimal
> > ABI or prohibit incompatible modules and kernels somehow. Is there a
> > precedent for this somewhere?
>
> I think the GCC vs Clang thing is expected to work today, isn't it?

I, personally, wouldn't bet on it. It very much depends on the kernel
config used.

> Using the Clang-based BPF tools with a GCC-compiled kernel requires a
> matching ABI.

BPF is a different story altogether, and falls more into the category
of user space to kernel ABI, which the kernel has strong guarantees
on.

> The other things you listed result in fairly obvious breakage, sometimes
> even module loading failures.  Unconditional crashes are possible as
> well.  With __preserve_most__, the issues are much more subtle and may
> only appear for some kernel/module compiler combinations and
> optimization settings.  The impact of incorrectly clobbered registers
> tends to be like that.

One way around this would be to make the availability of the attribute
a Kconfig variable. Then externally compiled kernel modules should do
the right thing, since they ought to use the right .config when being
built.

I can make that change for v3.
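
Roughly like this (only a sketch; CONFIG_CC_HAS_PRESERVE_MOST is a
placeholder name for a symbol that Kconfig would set via a compiler test,
like the existing CC_HAS_* symbols):

#if defined(CONFIG_CC_HAS_PRESERVE_MOST) && __has_attribute(__preserve_most__)
# define __preserve_most notrace __attribute__((__preserve_most__))
#else
# define __preserve_most
#endif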

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 12:36     ` Florian Weimer
  2023-08-07 13:07       ` Marco Elver
@ 2023-08-07 15:06       ` Peter Zijlstra
  1 sibling, 0 replies; 26+ messages in thread
From: Peter Zijlstra @ 2023-08-07 15:06 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

On Mon, Aug 07, 2023 at 02:36:53PM +0200, Florian Weimer wrote:

> I think the GCC vs Clang thing is expected to work today, isn't it?
> Using the Clang-based BPF tools with a GCC-compiled kernel requires a
> matching ABI.

Nope, all bets are off. There is no module ABI, in the widest possible
sense.

There's all sorts of subtle breakage; I don't remember the exact
details, but IIRC building the kernel with a compiler that has
asm-goto-output and modules with a compiler that doesn't have it gets
you fireworks.

We absolutely don't even bother tracking any of this.

There are also things like GCC plugins; they can randomly (literally, in
the case of struct randomization) change things around in ways that aren't
compatible with what the other compiler does -- or perhaps even a later
version of the same compiler.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 11:41 ` Florian Weimer
  2023-08-07 12:24   ` Marco Elver
  2023-08-07 12:31   ` Peter Zijlstra
@ 2023-08-07 15:27   ` Nick Desaulniers
  2023-08-08 10:57   ` Peter Zijlstra
  3 siblings, 0 replies; 26+ messages in thread
From: Nick Desaulniers @ 2023-08-07 15:27 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Peter Zijlstra, Mark Rutland, Steven Rostedt, Marc Zyngier,
	Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
	Catalin Marinas, Will Deacon, Nathan Chancellor, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains,
	Jakub Jelinek, Greg KH

On Mon, Aug 7, 2023 at 4:41 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Marco Elver:
>
> > [1]: "On X86-64 and AArch64 targets, this attribute changes the calling
> > convention of a function. The preserve_most calling convention attempts
> > to make the code in the caller as unintrusive as possible. This
> > convention behaves identically to the C calling convention on how
> > arguments and return values are passed, but it uses a different set of
> > caller/callee-saved registers. This alleviates the burden of saving and
> > recovering a large register set before and after the call in the
> > caller."
> >
> > [1] https://clang.llvm.org/docs/AttributeReference.html#preserve-most
>
> You dropped the interesting part:
>
> | If the arguments are passed in callee-saved registers, then they will
> | be preserved by the callee across the call. This doesn’t apply for
> | values returned in callee-saved registers.
> |
> |  ·  On X86-64 the callee preserves all general purpose registers, except
> |     for R11. R11 can be used as a scratch register. Floating-point
> |     registers (XMMs/YMMs) are not preserved and need to be saved by the
> |     caller.
> |
> |  ·  On AArch64 the callee preserve all general purpose registers, except
> |     X0-X8 and X16-X18.
>
> Ideally, this would be documented in the respective psABI supplement.
> I filled in some gaps and filed:
>
>   Document the ABI for __preserve_most__ function calls
>   <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/45>
>
> Doesn't this change impact the kernel module ABI?
>
> I would really expect a check here
>
> > +#if __has_attribute(__preserve_most__)
> > +# define __preserve_most notrace __attribute__((__preserve_most__))
> > +#else
> > +# define __preserve_most
> > +#endif
>
> that this is not a compilation for a module.  Otherwise modules built
> with a compiler with __preserve_most__ attribute support are
> incompatible with kernels built with a compiler without that attribute.

Surely the Linux kernel has a stable ABI for modules, right? Nah.
https://www.kernel.org/doc/Documentation/process/stable-api-nonsense.rst

>
> Thanks,
> Florian
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 12:31   ` Peter Zijlstra
@ 2023-08-08  2:16     ` Steven Rostedt
  0 siblings, 0 replies; 26+ messages in thread
From: Steven Rostedt @ 2023-08-08  2:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Florian Weimer, Marco Elver, Andrew Morton, Kees Cook,
	Guenter Roeck, Mark Rutland, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains

On Mon, 7 Aug 2023 14:31:37 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> > that this is not a compilation for a module.  Otherwise modules built
> > with a compiler with __preserve_most__ attribute support are
> > incompatible with kernels built with a compiler without that attribute.  
> 
> We have a metric ton of options that can break module ABI. If you're
> daft enough to not build with the exact same compiler and .config you
> get to keep the pieces.

I believe there are enough build-time checks for compiler options (used to
enable features) that trying to load a module built with another compiler
is pretty much guaranteed to fail today.

-- Steve

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-07 11:41 ` Florian Weimer
                     ` (2 preceding siblings ...)
  2023-08-07 15:27   ` Nick Desaulniers
@ 2023-08-08 10:57   ` Peter Zijlstra
  2023-08-08 11:41     ` Florian Weimer
  3 siblings, 1 reply; 26+ messages in thread
From: Peter Zijlstra @ 2023-08-08 10:57 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains,
	Josh Poimboeuf

On Mon, Aug 07, 2023 at 01:41:07PM +0200, Florian Weimer wrote:
> * Marco Elver:
> 
> > [1]: "On X86-64 and AArch64 targets, this attribute changes the calling
> > convention of a function. The preserve_most calling convention attempts
> > to make the code in the caller as unintrusive as possible. This
> > convention behaves identically to the C calling convention on how
> > arguments and return values are passed, but it uses a different set of
> > caller/callee-saved registers. This alleviates the burden of saving and
> > recovering a large register set before and after the call in the
> > caller."
> >
> > [1] https://clang.llvm.org/docs/AttributeReference.html#preserve-most
> 
> You dropped the interesting part:
> 
> | If the arguments are passed in callee-saved registers, then they will
> | be preserved by the callee across the call. This doesn’t apply for
> | values returned in callee-saved registers.
> | 
> |  ·  On X86-64 the callee preserves all general purpose registers, except
> |     for R11. R11 can be used as a scratch register. Floating-point
> |     registers (XMMs/YMMs) are not preserved and need to be saved by the
> |     caller.
> |     
> |  ·  On AArch64 the callee preserve all general purpose registers, except
> |     X0-X8 and X16-X18.
> 
> Ideally, this would be documented in the respective psABI supplement.
> I filled in some gaps and filed:
> 
>   Document the ABI for __preserve_most__ function calls
>   <https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/45>
> 
> Doesn't this change impact the kernel module ABI?
> 
> I would really expect a check here

So in the GCC bugzilla you raise the point of unwinding.

So in arch/x86 we've been doing what this attribute proposes (except
without that weird r11 exception) for a long while.

We simply do: asm("call foo_thunk"); instead of a C call, and then
have the 'thunk' thing PUSH/POP all the registers around a regular C
function.
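
Roughly along these lines (illustrative only; the names are made up, the
real thunks live in .S files using the usual PUSH/POP macros, and the real
asm also carries the stack-pointer constraint we use for such calls):

void my_report_slowpath(void);		/* an ordinary C function */

/*
 * The compiler only sees the asm() statement, so it treats no registers as
 * clobbered by the call.  The thunk pushes the caller-clobbered GPRs, calls
 * my_report_slowpath(), pops them again and returns.
 */
static __always_inline void maybe_report(bool bad)
{
	if (unlikely(bad))
		asm volatile("call my_report_slowpath_thunk" ::: "memory");
}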

Paravirt does quite a lot of that as well.

In a very few cases we implement a function specifically to avoid all
the PUSH/POP nonsense because it's so small the stack overhead kills it.

For unwinding this means that the 'thunk' becomes invisible when IP is
not inside it. But since the thunk is purely 'uninteresting' PUSH/POP
around a real C call, this is not an issue.

[[ tail calls leaving their sibling invisible is a far more annoying
   issue ]]

If the function is one of those special things, then it will be a leaf
function and we get to see it through the IP.


Now, the problem with __preserve_most is that it makes it really easy to
deviate from this pattern, you can trivially write a function that is
not a trivial wrapper and then does not show up on unwind. This might
indeed be a problem.

Ofc. the kernel doesn't longjmp much, and we also don't throw many
exceptions, but the live-patch people might care (although the ORC unwinder
should be able to deal with all this perfectly fine).

In userspace, would not .eh_frame still be able to unwind this?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute
  2023-08-08 10:57   ` Peter Zijlstra
@ 2023-08-08 11:41     ` Florian Weimer
  0 siblings, 0 replies; 26+ messages in thread
From: Florian Weimer @ 2023-08-08 11:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Marco Elver, Andrew Morton, Kees Cook, Guenter Roeck,
	Mark Rutland, Steven Rostedt, Marc Zyngier, Oliver Upton,
	James Morse, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	Miguel Ojeda, linux-arm-kernel, kvmarm, linux-kernel, llvm,
	Dmitry Vyukov, Alexander Potapenko, kasan-dev, linux-toolchains,
	Josh Poimboeuf

* Peter Zijlstra:

> Now, the problem with __preserve_most is that it makes it really easy to
> deviate from this pattern, you can trivially write a function that is
> not a trivial wrapper and then does not show up on unwind. This might
> indeed be a problem.

Backtrace generation shouldn't be impacted by a compiler implementation
of __preserve_most__.  If unwinding implies restoring register contents,
the question becomes whether the unwinder can be taught to do this
natively.  For .eh_frame/PT_GNU_EH_FRAME-based unwinders and
__preserve_most__, I think that's true because they already support
custom ABIs (and GCC uses them for local functions).  In other cases, if
the unwinder does not support the extra registers, then it might still
be possible to compensate for that via code generation (e.g., setjmp
won't be __preserve_most__, so the compiler would have to preserve
register contents by other means, also accounting for the returns-twice
nature, likewise for exception handling landing pads).

But __preserve_all__ is a completely different beast.  I *think* it is
possible to do this with helpers (state size, state save, state restore)
and strategically placed restores after returns-twice functions and the
like, but people disagree.  This has come up before in the context of
the s390x vector ABI and the desire to add new callee-saved registers.
We just couldn't make that work at the time.  On the other hand,
__preserve_all__ goes into the other direction (opt-in of extra saves),
so it may be conceptually easier.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-08-08 17:29 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-04  9:02 [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Marco Elver
2023-08-04  9:02 ` [PATCH v2 2/3] list_debug: Introduce inline wrappers for debug checks Marco Elver
2023-08-04 16:03   ` Steven Rostedt
2023-08-04 17:49     ` Marco Elver
2023-08-04 17:57       ` Steven Rostedt
2023-08-04 17:59         ` Steven Rostedt
2023-08-04 18:08           ` Marco Elver
2023-08-04 18:19           ` Peter Zijlstra
2023-08-05  6:30     ` Miguel Ojeda
2023-08-04  9:02 ` [PATCH v2 3/3] list_debug: Introduce CONFIG_DEBUG_LIST_MINIMAL Marco Elver
2023-08-04 15:58 ` [PATCH v2 1/3] compiler_types: Introduce the Clang __preserve_most function attribute Steven Rostedt
2023-08-04 18:14 ` Peter Zijlstra
2023-08-05  6:35 ` Miguel Ojeda
2023-08-07 11:41 ` Florian Weimer
2023-08-07 12:24   ` Marco Elver
2023-08-07 12:36     ` Florian Weimer
2023-08-07 13:07       ` Marco Elver
2023-08-07 15:06       ` Peter Zijlstra
2023-08-07 12:38     ` Jakub Jelinek
2023-08-07 12:43       ` Florian Weimer
2023-08-07 13:06         ` Jakub Jelinek
2023-08-07 12:31   ` Peter Zijlstra
2023-08-08  2:16     ` Steven Rostedt
2023-08-07 15:27   ` Nick Desaulniers
2023-08-08 10:57   ` Peter Zijlstra
2023-08-08 11:41     ` Florian Weimer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).