linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] lib/string.c: implement stpcpy
@ 2020-08-25 13:58 Nick Desaulniers
  2020-08-25 18:51 ` Nathan Chancellor
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Nick Desaulniers @ 2020-08-25 13:58 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: clang-built-linux, Nick Desaulniers, stable, Andy Lavr,
	Arvind Sankar, Joe Perches, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Kees Cook, Andy Shevchenko, Alexandru Ardelean,
	Yury Norov, linux-kernel

LLVM implemented a recent "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)` where the return value is used to
`stpcpy(dest, str) - dest`. This generally avoids the machinery involved
in parsing format strings.  `stpcpy` is just like `strcpy` except it
returns the pointer to the new tail of `dest`.  This optimization was
introduced into clang-12.
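
A minimal userspace sketch of the equivalence this lowering relies on,
assuming only what is described above. The names `demo_stpcpy`,
`len_via_sprintf`, `len_via_stpcpy` and `check_lowering` are hypothetical and
exist only for illustration; `demo_stpcpy` repeats the loop from this patch
rather than calling any libc implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in; same loop as the stpcpy in this patch. */
char *demo_stpcpy(char *dest, const char *src)
{
	while ((*dest++ = *src++) != '\0')
		/* nothing */;
	return --dest;
}

/* What the source code says: the length comes from the format parser. */
size_t len_via_sprintf(char *dest, const char *str)
{
	return (size_t)sprintf(dest, "%s", str);
}

/* What the lowering emits instead: a copy plus a pointer subtraction. */
size_t len_via_stpcpy(char *dest, const char *str)
{
	return (size_t)(demo_stpcpy(dest, str) - dest);
}

/* Asserts that both forms yield the same length and buffer contents. */
void check_lowering(const char *str)
{
	char a[64], b[64];

	assert(strlen(str) < sizeof(a));	/* neither form checks bounds */
	assert(len_via_sprintf(a, str) == len_via_stpcpy(b, str));
	assert(strcmp(a, b) == 0);
}
```

Calling `check_lowering("lib/string.c")` exercises both paths on the same
input. Neither path checks the destination size, which is why the sketch
asserts the length up front.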

Implement this so that we don't observe linkage failures due to missing
symbol definitions for `stpcpy`.

Similar to last year's fire drill with:
commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")

The kernel is somewhere between a "freestanding" environment (no full libc)
and "hosted" environment (many symbols from libc exist with the same
type, function signature, and semantics).

As H. Peter Anvin notes, there's not really a great way to inform the
compiler that you're targeting a freestanding environment but would like
to opt-in to some libcall optimizations (see pr/47280 below), rather than
opt-out.

Arvind notes, -fno-builtin-* behaves slightly differently between GCC
and Clang, and Clang is missing many __builtin_* definitions, which I
consider a bug in Clang and am working on fixing.

Masahiro summarizes the subtle distinction between compilers justly:
  To prevent transformation from foo() into bar(), there are two ways in
  Clang to do that; -fno-builtin-foo, and -fno-builtin-bar.  There is
  only one in GCC; -fno-builtin-foo.

(Any difference in that behavior in Clang is likely a bug from a missing
__builtin_* definition.)

Masahiro also notes:
  We want to disable optimization from foo() to bar(),
  but we may still benefit from the optimization from
  foo() into something else. If GCC implements the same transform, we
  would run into a problem because it is not -fno-builtin-bar, but
  -fno-builtin-foo that disables that optimization.

  In this regard, -fno-builtin-foo would be more future-proof than
  -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We
  may want to prevent calls from foo() being optimized into calls to
  bar(), but we still may want other optimization on calls to foo().

It seems that compilers today don't quite provide the fine-grained control
over which libcall optimizations pseudo-freestanding environments would
prefer.

Finally, Kees notes that this interface is unsafe, so we should not
encourage its use.  As such, I've removed the declaration from any
header, but it still needs to be exported to avoid linkage errors in
modules.

Cc: stable@vger.kernel.org
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://bugs.llvm.org/show_bug.cgi?id=47280
Link: https://github.com/ClangBuiltLinux/linux/issues/1126
Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Link: https://reviews.llvm.org/D85963
Suggested-by: Andy Lavr <andy.lavr@gmail.com>
Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
Suggested-by: Joe Perches <joe@perches.com>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Changes V3:
* Drop Sami's Tested by tag; newer patch.
* Add EXPORT_SYMBOL as per Andy.
* Rewrite commit message, rewrote part of what Masahiro said to be
  generic in terms of foo() and bar().
* Prefer %NUL-terminated to NULL terminated. NUL is the ASCII character
  '\0', as per Arvind and Rasmus.

Changes V2:
* Added Sami's Tested by; though the patch changed implementation, the
  missing symbol at link time was the problem Sami was observing.
* Fix __restrict -> __restrict__ typo as per Joe.
* Drop note about restrict from commit message as per Arvind.
* Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL
* Fix off by one error as per Arvind; I had another off by one error in
  my test program that was masking this.

 lib/string.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/lib/string.c b/lib/string.c
index 6012c385fb31..6bd0cf0fb009 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count)
 }
 EXPORT_SYMBOL(strscpy_pad);
 
+/**
+ * stpcpy - copy a string from src to dest returning a pointer to the new end
+ *          of dest, including src's %NUL-terminator. May overrun dest.
+ * @dest: pointer to end of string being copied into. Must be large enough
+ *        to receive copy.
+ * @src: pointer to the beginning of string being copied from. Must not overlap
+ *       dest.
+ *
+ * stpcpy differs from strcpy in a key way: the return value is the new
+ * %NUL-terminated character. (for strcpy, the return value is a pointer to
+ * src. This interface is considered unsafe as it doesn't perform bounds
+ * checking of the inputs. As such it's not recommended for usage. Instead,
+ * its definition is provided in case the compiler lowers other libcalls to
+ * stpcpy.
+ */
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
+{
+	while ((*dest++ = *src++) != '\0')
+		/* nothing */;
+	return --dest;
+}
+EXPORT_SYMBOL(stpcpy);
+
 #ifndef __HAVE_ARCH_STRCAT
 /**
  * strcat - Append one %NUL-terminated string to another
-- 
2.28.0.297.g1956fa8f8d-goog



* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-25 13:58 [PATCH v3] lib/string.c: implement stpcpy Nick Desaulniers
@ 2020-08-25 18:51 ` Nathan Chancellor
  2020-08-26 15:41 ` Sedat Dilek
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Nathan Chancellor @ 2020-08-25 18:51 UTC (permalink / raw)
  To: Nick Desaulniers, h
  Cc: Masahiro Yamada, clang-built-linux, stable, Andy Lavr,
	Arvind Sankar, Joe Perches, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Kees Cook, Andy Shevchenko, Alexandru Ardelean,
	Yury Norov, linux-kernel

On Tue, Aug 25, 2020 at 06:58:36AM -0700, Nick Desaulniers wrote:
> LLVM implemented a recent "libcall optimization" that lowers calls to
> `sprintf(dest, "%s", str)` where the return value is used to
> `stpcpy(dest, str) - dest`. This generally avoids the machinery involved
> in parsing format strings.  `stpcpy` is just like `strcpy` except it
> returns the pointer to the new tail of `dest`.  This optimization was
> introduced into clang-12.
> 
> Implement this so that we don't observe linkage failures due to missing
> symbol definitions for `stpcpy`.
> 
> Similar to last year's fire drill with:
> commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
> 
> The kernel is somewhere between a "freestanding" environment (no full libc)
> and "hosted" environment (many symbols from libc exist with the same
> type, function signature, and semantics).
> 
> As H. Peter Anvin notes, there's not really a great way to inform the
> compiler that you're targeting a freestanding environment but would like
> to opt-in to some libcall optimizations (see pr/47280 below), rather than
> opt-out.
> 
> Arvind notes, -fno-builtin-* behaves slightly differently between GCC
> and Clang, and Clang is missing many __builtin_* definitions, which I
> consider a bug in Clang and am working on fixing.
> 
> Masahiro summarizes the subtle distinction between compilers justly:
>   To prevent transformation from foo() into bar(), there are two ways in
>   Clang to do that; -fno-builtin-foo, and -fno-builtin-bar.  There is
>   only one in GCC; -fno-builtin-foo.
> 
> (Any difference in that behavior in Clang is likely a bug from a missing
> __builtin_* definition.)
> 
> Masahiro also notes:
>   We want to disable optimization from foo() to bar(),
>   but we may still benefit from the optimization from
>   foo() into something else. If GCC implements the same transform, we
>   would run into a problem because it is not -fno-builtin-bar, but
>   -fno-builtin-foo that disables that optimization.
> 
>   In this regard, -fno-builtin-foo would be more future-proof than
>   -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We
>   may want to prevent calls from foo() being optimized into calls to
>   bar(), but we still may want other optimization on calls to foo().
> 
> It seems that compilers today don't quite provide the fine-grained control
> over which libcall optimizations pseudo-freestanding environments would
> prefer.
> 
> Finally, Kees notes that this interface is unsafe, so we should not
> encourage its use.  As such, I've removed the declaration from any
> header, but it still needs to be exported to avoid linkage errors in
> modules.
> 
> Cc: stable@vger.kernel.org
> Link: https://bugs.llvm.org/show_bug.cgi?id=47162
> Link: https://bugs.llvm.org/show_bug.cgi?id=47280
> Link: https://github.com/ClangBuiltLinux/linux/issues/1126
> Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
> Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
> Link: https://reviews.llvm.org/D85963
> Suggested-by: Andy Lavr <andy.lavr@gmail.com>
> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
> Suggested-by: Joe Perches <joe@perches.com>
> Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Reported-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Tested-by: Nathan Chancellor <natechancellor@gmail.com>

> ---
> Changes V3:
> * Drop Sami's Tested by tag; newer patch.
> * Add EXPORT_SYMBOL as per Andy.
> * Rewrite commit message, rewrote part of what Masahiro said to be
>   generic in terms of foo() and bar().
> * Prefer %NUL-terminated to NULL terminated. NUL is the ASCII character
>   '\0', as per Arvind and Rasmus.
> 
> Changes V2:
> * Added Sami's Tested by; though the patch changed implementation, the
>   missing symbol at link time was the problem Sami was observing.
> * Fix __restrict -> __restrict__ typo as per Joe.
> * Drop note about restrict from commit message as per Arvind.
> * Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL
> * Fix off by one error as per Arvind; I had another off by one error in
>   my test program that was masking this.
> 
>  lib/string.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/lib/string.c b/lib/string.c
> index 6012c385fb31..6bd0cf0fb009 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count)
>  }
>  EXPORT_SYMBOL(strscpy_pad);
>  
> +/**
> + * stpcpy - copy a string from src to dest returning a pointer to the new end
> + *          of dest, including src's %NUL-terminator. May overrun dest.
> + * @dest: pointer to end of string being copied into. Must be large enough
> + *        to receive copy.
> + * @src: pointer to the beginning of string being copied from. Must not overlap
> + *       dest.
> + *
> + * stpcpy differs from strcpy in a key way: the return value is the new
> + * %NUL-terminated character. (for strcpy, the return value is a pointer to
> + * src. This interface is considered unsafe as it doesn't perform bounds
> + * checking of the inputs. As such it's not recommended for usage. Instead,
> + * its definition is provided in case the compiler lowers other libcalls to
> + * stpcpy.
> + */
> +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
> +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
> +{
> +	while ((*dest++ = *src++) != '\0')
> +		/* nothing */;
> +	return --dest;
> +}
> +EXPORT_SYMBOL(stpcpy);
> +
>  #ifndef __HAVE_ARCH_STRCAT
>  /**
>   * strcat - Append one %NUL-terminated string to another
> -- 
> 2.28.0.297.g1956fa8f8d-goog
> 


* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-25 13:58 [PATCH v3] lib/string.c: implement stpcpy Nick Desaulniers
  2020-08-25 18:51 ` Nathan Chancellor
@ 2020-08-26 15:41 ` Sedat Dilek
  2020-08-26 15:58 ` Masahiro Yamada
  2020-08-26 16:49 ` Masahiro Yamada
  3 siblings, 0 replies; 25+ messages in thread
From: Sedat Dilek @ 2020-08-26 15:41 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Masahiro Yamada, Clang-Built-Linux ML, stable, Andy Lavr,
	Arvind Sankar, Joe Perches, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Kees Cook, Andy Shevchenko, Alexandru Ardelean,
	Yury Norov, linux-kernel

On Tue, Aug 25, 2020 at 3:58 PM 'Nick Desaulniers' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
>
> LLVM implemented a recent "libcall optimization" that lowers calls to
> `sprintf(dest, "%s", str)` where the return value is used to
> `stpcpy(dest, str) - dest`. This generally avoids the machinery involved
> in parsing format strings.  `stpcpy` is just like `strcpy` except it
> returns the pointer to the new tail of `dest`.  This optimization was
> introduced into clang-12.
>
> Implement this so that we don't observe linkage failures due to missing
> symbol definitions for `stpcpy`.
>
> Similar to last year's fire drill with:
> commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
>
> The kernel is somewhere between a "freestanding" environment (no full libc)
> and "hosted" environment (many symbols from libc exist with the same
> type, function signature, and semantics).
>
> As H. Peter Anvin notes, there's not really a great way to inform the
> compiler that you're targeting a freestanding environment but would like
> to opt-in to some libcall optimizations (see pr/47280 below), rather than
> opt-out.
>
> Arvind notes, -fno-builtin-* behaves slightly differently between GCC
> and Clang, and Clang is missing many __builtin_* definitions, which I
> consider a bug in Clang and am working on fixing.
>
> Masahiro summarizes the subtle distinction between compilers justly:
>   To prevent transformation from foo() into bar(), there are two ways in
>   Clang to do that; -fno-builtin-foo, and -fno-builtin-bar.  There is
>   only one in GCC; -fno-builtin-foo.
>
> (Any difference in that behavior in Clang is likely a bug from a missing
> __builtin_* definition.)
>
> Masahiro also notes:
>   We want to disable optimization from foo() to bar(),
>   but we may still benefit from the optimization from
>   foo() into something else. If GCC implements the same transform, we
>   would run into a problem because it is not -fno-builtin-bar, but
>   -fno-builtin-foo that disables that optimization.
>
>   In this regard, -fno-builtin-foo would be more future-proof than
>   -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We
>   may want to prevent calls from foo() being optimized into calls to
>   bar(), but we still may want other optimization on calls to foo().
>
> It seems that compilers today don't quite provide the fine-grained control
> over which libcall optimizations pseudo-freestanding environments would
> prefer.
>
> Finally, Kees notes that this interface is unsafe, so we should not
> encourage its use.  As such, I've removed the declaration from any
> header, but it still needs to be exported to avoid linkage errors in
> modules.
>
> Cc: stable@vger.kernel.org
> Link: https://bugs.llvm.org/show_bug.cgi?id=47162
> Link: https://bugs.llvm.org/show_bug.cgi?id=47280
> Link: https://github.com/ClangBuiltLinux/linux/issues/1126
> Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
> Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
> Link: https://reviews.llvm.org/D85963
> Suggested-by: Andy Lavr <andy.lavr@gmail.com>
> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
> Suggested-by: Joe Perches <joe@perches.com>
> Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Reported-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # included in Sami's clang-cfi Git

- Sedat -

> ---
> Changes V3:
> * Drop Sami's Tested by tag; newer patch.
> * Add EXPORT_SYMBOL as per Andy.
> * Rewrite commit message, rewrote part of what Masahiro said to be
>   generic in terms of foo() and bar().
> * Prefer %NUL-terminated to NULL terminated. NUL is the ASCII character
>   '\0', as per Arvind and Rasmus.
>
> Changes V2:
> * Added Sami's Tested by; though the patch changed implementation, the
>   missing symbol at link time was the problem Sami was observing.
> * Fix __restrict -> __restrict__ typo as per Joe.
> * Drop note about restrict from commit message as per Arvind.
> * Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL
> * Fix off by one error as per Arvind; I had another off by one error in
>   my test program that was masking this.
>
>  lib/string.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/lib/string.c b/lib/string.c
> index 6012c385fb31..6bd0cf0fb009 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count)
>  }
>  EXPORT_SYMBOL(strscpy_pad);
>
> +/**
> + * stpcpy - copy a string from src to dest returning a pointer to the new end
> + *          of dest, including src's %NUL-terminator. May overrun dest.
> + * @dest: pointer to end of string being copied into. Must be large enough
> + *        to receive copy.
> + * @src: pointer to the beginning of string being copied from. Must not overlap
> + *       dest.
> + *
> + * stpcpy differs from strcpy in a key way: the return value is the new
> + * %NUL-terminated character. (for strcpy, the return value is a pointer to
> + * src. This interface is considered unsafe as it doesn't perform bounds
> + * checking of the inputs. As such it's not recommended for usage. Instead,
> + * its definition is provided in case the compiler lowers other libcalls to
> + * stpcpy.
> + */
> +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
> +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
> +{
> +       while ((*dest++ = *src++) != '\0')
> +               /* nothing */;
> +       return --dest;
> +}
> +EXPORT_SYMBOL(stpcpy);
> +
>  #ifndef __HAVE_ARCH_STRCAT
>  /**
>   * strcat - Append one %NUL-terminated string to another
> --
> 2.28.0.297.g1956fa8f8d-goog
>


* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-25 13:58 [PATCH v3] lib/string.c: implement stpcpy Nick Desaulniers
  2020-08-25 18:51 ` Nathan Chancellor
  2020-08-26 15:41 ` Sedat Dilek
@ 2020-08-26 15:58 ` Masahiro Yamada
  2020-09-06  9:57   ` Kees Cook
  2020-08-26 16:49 ` Masahiro Yamada
  3 siblings, 1 reply; 25+ messages in thread
From: Masahiro Yamada @ 2020-08-26 15:58 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: clang-built-linux, stable, Andy Lavr, Arvind Sankar, Joe Perches,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Kees Cook,
	Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Tue, Aug 25, 2020 at 10:58 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> LLVM implemented a recent "libcall optimization" that lowers calls to
> `sprintf(dest, "%s", str)` where the return value is used to
> `stpcpy(dest, str) - dest`. This generally avoids the machinery involved
> in parsing format strings.  `stpcpy` is just like `strcpy` except it
> returns the pointer to the new tail of `dest`.  This optimization was
> introduced into clang-12.
>
> Implement this so that we don't observe linkage failures due to missing
> symbol definitions for `stpcpy`.
>
> Similar to last year's fire drill with:
> commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
>
> The kernel is somewhere between a "freestanding" environment (no full libc)
> and "hosted" environment (many symbols from libc exist with the same
> type, function signature, and semantics).
>
> As H. Peter Anvin notes, there's not really a great way to inform the
> compiler that you're targeting a freestanding environment but would like
> to opt-in to some libcall optimizations (see pr/47280 below), rather than
> opt-out.
>
> Arvind notes, -fno-builtin-* behaves slightly differently between GCC
> and Clang, and Clang is missing many __builtin_* definitions, which I
> consider a bug in Clang and am working on fixing.
>
> Masahiro summarizes the subtle distinction between compilers justly:
>   To prevent transformation from foo() into bar(), there are two ways in
>   Clang to do that; -fno-builtin-foo, and -fno-builtin-bar.  There is
>   only one in GCC; -fno-builtin-foo.
>
> (Any difference in that behavior in Clang is likely a bug from a missing
> __builtin_* definition.)
>
> Masahiro also notes:
>   We want to disable optimization from foo() to bar(),
>   but we may still benefit from the optimization from
>   foo() into something else. If GCC implements the same transform, we
>   would run into a problem because it is not -fno-builtin-bar, but
>   -fno-builtin-foo that disables that optimization.
>
>   In this regard, -fno-builtin-foo would be more future-proof than
>   -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We
>   may want to prevent calls from foo() being optimized into calls to
>   bar(), but we still may want other optimization on calls to foo().
>
> It seems that compilers today don't quite provide the fine-grained control
> over which libcall optimizations pseudo-freestanding environments would
> prefer.
>
> Finally, Kees notes that this interface is unsafe, so we should not
> encourage its use.  As such, I've removed the declaration from any
> header, but it still needs to be exported to avoid linkage errors in
> modules.
>
> Cc: stable@vger.kernel.org
> Link: https://bugs.llvm.org/show_bug.cgi?id=47162
> Link: https://bugs.llvm.org/show_bug.cgi?id=47280
> Link: https://github.com/ClangBuiltLinux/linux/issues/1126
> Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
> Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
> Link: https://reviews.llvm.org/D85963
> Suggested-by: Andy Lavr <andy.lavr@gmail.com>
> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
> Suggested-by: Joe Perches <joe@perches.com>
> Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Reported-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> Changes V3:
> * Drop Sami's Tested by tag; newer patch.
> * Add EXPORT_SYMBOL as per Andy.
> * Rewrite commit message, rewrote part of what Masahiro said to be
>   generic in terms of foo() and bar().
> * Prefer %NUL-terminated to NULL terminated. NUL is the ASCII character
>   '\0', as per Arvind and Rasmus.
>
> Changes V2:
> * Added Sami's Tested by; though the patch changed implementation, the
>   missing symbol at link time was the problem Sami was observing.
> * Fix __restrict -> __restrict__ typo as per Joe.
> * Drop note about restrict from commit message as per Arvind.
> * Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL
> * Fix off by one error as per Arvind; I had another off by one error in
>   my test program that was masking this.
>
>  lib/string.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/lib/string.c b/lib/string.c
> index 6012c385fb31..6bd0cf0fb009 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count)
>  }
>  EXPORT_SYMBOL(strscpy_pad);
>
> +/**
> + * stpcpy - copy a string from src to dest returning a pointer to the new end
> + *          of dest, including src's %NUL-terminator. May overrun dest.
> + * @dest: pointer to end of string being copied into. Must be large enough
> + *        to receive copy.
> + * @src: pointer to the beginning of string being copied from. Must not overlap
> + *       dest.
> + *
> + * stpcpy differs from strcpy in a key way: the return value is the new
> + * %NUL-terminated character. (for strcpy, the return value is a pointer to
> + * src.


return a pointer to src?


"man 3 strcpy" says:

The strcpy() and strncpy() functions return
a pointer to the destination string *dest*.
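
To make the difference in return values concrete, here is a small userspace
sketch. `sketch_stpcpy` and `check_return_values` are hypothetical names used
only for illustration; `sketch_stpcpy` repeats the loop from the patch, while
`strcpy` is the standard C library function:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in repeating the loop from the patch under review. */
char *sketch_stpcpy(char *dest, const char *src)
{
	while ((*dest++ = *src++) != '\0')
		/* nothing */;
	return --dest;
}

/* Asserts the two return-value contracts for one input string. */
void check_return_values(const char *src)
{
	char a[32], b[32];
	char *r1, *r2;

	assert(strlen(src) < sizeof(a));	/* neither copy checks bounds */

	r1 = strcpy(a, src);		/* returns dest, not src */
	r2 = sketch_stpcpy(b, src);	/* returns a pointer to the copied NUL */

	assert(r1 == a);
	assert(r2 == b + strlen(src));
	assert(*r2 == '\0');
}
```

For an empty source string both functions return the start of the
destination buffer, since the terminating NUL lands at offset 0.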








>  This interface is considered unsafe as it doesn't perform bounds
> + * checking of the inputs. As such it's not recommended for usage. Instead,
> + * its definition is provided in case the compiler lowers other libcalls to
> + * stpcpy.
> + */
> +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
> +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
> +{
> +       while ((*dest++ = *src++) != '\0')
> +               /* nothing */;
> +       return --dest;
> +}
> +EXPORT_SYMBOL(stpcpy);
> +
>  #ifndef __HAVE_ARCH_STRCAT
>  /**
>   * strcat - Append one %NUL-terminated string to another
> --
> 2.28.0.297.g1956fa8f8d-goog
>


--
Best Regards
Masahiro Yamada


* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-25 13:58 [PATCH v3] lib/string.c: implement stpcpy Nick Desaulniers
                   ` (2 preceding siblings ...)
  2020-08-26 15:58 ` Masahiro Yamada
@ 2020-08-26 16:49 ` Masahiro Yamada
  2020-08-26 16:57   ` Joe Perches
  3 siblings, 1 reply; 25+ messages in thread
From: Masahiro Yamada @ 2020-08-26 16:49 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: clang-built-linux, stable, Andy Lavr, Arvind Sankar, Joe Perches,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Kees Cook,
	Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Tue, Aug 25, 2020 at 10:58 PM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> LLVM implemented a recent "libcall optimization" that lowers calls to
> `sprintf(dest, "%s", str)` where the return value is used to
> `stpcpy(dest, str) - dest`. This generally avoids the machinery involved
> in parsing format strings.  `stpcpy` is just like `strcpy` except it
> returns the pointer to the new tail of `dest`.  This optimization was
> introduced into clang-12.
>
> Implement this so that we don't observe linkage failures due to missing
> symbol definitions for `stpcpy`.
>
> Similar to last year's fire drill with:
> commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
>
> The kernel is somewhere between a "freestanding" environment (no full libc)
> and "hosted" environment (many symbols from libc exist with the same
> type, function signature, and semantics).
>
> As H. Peter Anvin notes, there's not really a great way to inform the
> compiler that you're targeting a freestanding environment but would like
> to opt-in to some libcall optimizations (see pr/47280 below), rather than
> opt-out.
>
> Arvind notes, -fno-builtin-* behaves slightly differently between GCC
> and Clang, and Clang is missing many __builtin_* definitions, which I
> consider a bug in Clang and am working on fixing.
>
> Masahiro summarizes the subtle distinction between compilers justly:
>   To prevent transformation from foo() into bar(), there are two ways in
>   Clang to do that; -fno-builtin-foo, and -fno-builtin-bar.  There is
>   only one in GCC; -fno-builtin-foo.
>
> (Any difference in that behavior in Clang is likely a bug from a missing
> __builtin_* definition.)
>
> Masahiro also notes:
>   We want to disable optimization from foo() to bar(),
>   but we may still benefit from the optimization from
>   foo() into something else. If GCC implements the same transform, we
>   would run into a problem because it is not -fno-builtin-bar, but
>   -fno-builtin-foo that disables that optimization.
>
>   In this regard, -fno-builtin-foo would be more future-proof than
>   -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We
>   may want to prevent calls from foo() being optimized into calls to
>   bar(), but we still may want other optimization on calls to foo().
>
> It seems that compilers today don't quite provide the fine-grained control
> over which libcall optimizations pseudo-freestanding environments would
> prefer.
>
> Finally, Kees notes that this interface is unsafe, so we should not
> encourage its use.  As such, I've removed the declaration from any
> header, but it still needs to be exported to avoid linkage errors in
> modules.
>
> Cc: stable@vger.kernel.org
> Link: https://bugs.llvm.org/show_bug.cgi?id=47162
> Link: https://bugs.llvm.org/show_bug.cgi?id=47280
> Link: https://github.com/ClangBuiltLinux/linux/issues/1126
> Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
> Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
> Link: https://reviews.llvm.org/D85963
> Suggested-by: Andy Lavr <andy.lavr@gmail.com>
> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
> Suggested-by: Joe Perches <joe@perches.com>
> Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Reported-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> Changes V3:
> * Drop Sami's Tested by tag; newer patch.
> * Add EXPORT_SYMBOL as per Andy.
> * Rewrite commit message, rewrote part of what Masahiro said to be
>   generic in terms of foo() and bar().
> * Prefer %NUL-terminated to NULL terminated. NUL is the ASCII character
>   '\0', as per Arvind and Rasmus.
>
> Changes V2:
> * Added Sami's Tested by; though the patch changed implementation, the
>   missing symbol at link time was the problem Sami was observing.
> * Fix __restrict -> __restrict__ typo as per Joe.
> * Drop note about restrict from commit message as per Arvind.
> * Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL
> * Fix off by one error as per Arvind; I had another off by one error in
>   my test program that was masking this.
>
>  lib/string.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/lib/string.c b/lib/string.c
> index 6012c385fb31..6bd0cf0fb009 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count)
>  }
>  EXPORT_SYMBOL(strscpy_pad);
>
> +/**
> + * stpcpy - copy a string from src to dest returning a pointer to the new end
> + *          of dest, including src's %NUL-terminator. May overrun dest.
> + * @dest: pointer to end of string being copied into. Must be large enough
> + *        to receive copy.
> + * @src: pointer to the beginning of string being copied from. Must not overlap
> + *       dest.
> + *
> + * stpcpy differs from strcpy in a key way: the return value is the new
> + * %NUL-terminated character. (for strcpy, the return value is a pointer to
> + * src. This interface is considered unsafe as it doesn't perform bounds
> + * checking of the inputs. As such it's not recommended for usage. Instead,
> + * its definition is provided in case the compiler lowers other libcalls to
> + * stpcpy.


I do not have time to keep track of the discussion fully,
but could you give me a little more context on why
the usage of stpcpy() is not recommended?

The implementation of strcpy() is almost the same.
It is unclear to me what makes stpcpy() unsafe.



> + */
> +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
> +char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
> +{
> +       while ((*dest++ = *src++) != '\0')
> +               /* nothing */;
> +       return --dest;
> +}
> +EXPORT_SYMBOL(stpcpy);
> +
>  #ifndef __HAVE_ARCH_STRCAT
>  /**
>   * strcat - Append one %NUL-terminated string to another
> --
> 2.28.0.297.g1956fa8f8d-goog
>
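
For reference, the return-value difference described in the quoted kernel-doc
comment can be sketched in userspace C. This is only an illustration: the
names my_stpcpy(), copied_length() and join_two() are hypothetical, and the
loop is copied from the quoted patch under a different name so that it does
not clash with a libc-provided stpcpy(). Like strcpy(), it performs no bounds
checking, so the caller must guarantee that dest is large enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same loop as the quoted patch, renamed to avoid clashing with libc. */
static char *my_stpcpy(char *restrict dest, const char *restrict src)
{
	while ((*dest++ = *src++) != '\0')
		/* nothing */;
	return --dest;	/* points at the NUL that was just written */
}

/* The value `sprintf(dest, "%s", str)` would return, computed via stpcpy. */
static int copied_length(char *dest, const char *str)
{
	assert(dest != NULL && str != NULL);
	return (int)(my_stpcpy(dest, str) - dest);
}

/* Chaining on the returned tail appends without rescanning dest. */
static size_t join_two(char *dest, const char *a, const char *b)
{
	char *tail = my_stpcpy(dest, a);

	tail = my_stpcpy(tail, b);
	return (size_t)(tail - dest);
}
```

strcpy() would return dest itself in each of these calls, so the length would
have to be recomputed with strlen().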


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-26 16:49 ` Masahiro Yamada
@ 2020-08-26 16:57   ` Joe Perches
  2020-08-26 16:58     ` Nick Desaulniers
  0 siblings, 1 reply; 25+ messages in thread
From: Joe Perches @ 2020-08-26 16:57 UTC (permalink / raw)
  To: Masahiro Yamada, Nick Desaulniers
  Cc: clang-built-linux, stable, Andy Lavr, Arvind Sankar,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Kees Cook,
	Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Thu, 2020-08-27 at 01:49 +0900, Masahiro Yamada wrote:
> I do not have time to keep track of the discussion fully,
> but could you give me a little more context why
> the usage of stpcpy() is not recommended ?
> 
> The implementation of strcpy() is almost the same.
> It is unclear to me what makes stpcpy() unsafe..

It's the same thing that makes strcpy unsafe:

Unchecked buffer lengths with no guarantee src is terminated.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-26 16:57   ` Joe Perches
@ 2020-08-26 16:58     ` Nick Desaulniers
  2020-08-26 22:59       ` Masahiro Yamada
  0 siblings, 1 reply; 25+ messages in thread
From: Nick Desaulniers @ 2020-08-26 16:58 UTC (permalink / raw)
  To: Joe Perches
  Cc: Masahiro Yamada, clang-built-linux, stable, Andy Lavr,
	Arvind Sankar, Rasmus Villemoes, Sami Tolvanen, Andrew Morton,
	Kees Cook, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Wed, Aug 26, 2020 at 9:57 AM Joe Perches <joe@perches.com> wrote:
>
> On Thu, 2020-08-27 at 01:49 +0900, Masahiro Yamada wrote:
> > I do not have time to keep track of the discussion fully,
> > but could you give me a little more context why
> > the usage of stpcpy() is not recommended ?
> >
> > The implementation of strcpy() is almost the same.
> > It is unclear to me what makes stpcpy() unsafe..

https://lore.kernel.org/lkml/202008150921.B70721A359@keescook/

>
> It's the same thing that makes strcpy unsafe:
>
> Unchecked buffer lengths with no guarantee src is terminated.

-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-26 16:58     ` Nick Desaulniers
@ 2020-08-26 22:59       ` Masahiro Yamada
  2020-08-26 23:38         ` Kees Cook
  0 siblings, 1 reply; 25+ messages in thread
From: Masahiro Yamada @ 2020-08-26 22:59 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Joe Perches, clang-built-linux, stable, Andy Lavr, Arvind Sankar,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Kees Cook,
	Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 1:58 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Wed, Aug 26, 2020 at 9:57 AM Joe Perches <joe@perches.com> wrote:
> >
> > On Thu, 2020-08-27 at 01:49 +0900, Masahiro Yamada wrote:
> > > I do not have time to keep track of the discussion fully,
> > > but could you give me a little more context why
> > > the usage of stpcpy() is not recommended ?
> > >
> > > The implementation of strcpy() is almost the same.
> > > It is unclear to me what makes stpcpy() unsafe..
>
> https://lore.kernel.org/lkml/202008150921.B70721A359@keescook/
>
> >
> > It's the same thing that makes strcpy unsafe:
> >
> > Unchecked buffer lengths with no guarantee src is terminated.
>


OK, then stpcpy(), strcpy() and sprintf()
have the same level of unsafety.


strcpy() is used everywhere.

I am not convinced why only stpcpy() should be hidden.




-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-26 22:59       ` Masahiro Yamada
@ 2020-08-26 23:38         ` Kees Cook
  2020-08-26 23:57           ` Joe Perches
  2020-08-27  8:59           ` Andy Shevchenko
  0 siblings, 2 replies; 25+ messages in thread
From: Kees Cook @ 2020-08-26 23:38 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Nick Desaulniers, Joe Perches, clang-built-linux, stable,
	Andy Lavr, Arvind Sankar, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 07:59:45AM +0900, Masahiro Yamada wrote:
> On Thu, Aug 27, 2020 at 1:58 AM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> >
> > On Wed, Aug 26, 2020 at 9:57 AM Joe Perches <joe@perches.com> wrote:
> > >
> > > On Thu, 2020-08-27 at 01:49 +0900, Masahiro Yamada wrote:
> > > > I do not have time to keep track of the discussion fully,
> > > > but could you give me a little more context why
> > > > the usage of stpcpy() is not recommended ?
> > > >
> > > > The implementation of strcpy() is almost the same.
> > > > It is unclear to me what makes stpcpy() unsafe..
> >
> > https://lore.kernel.org/lkml/202008150921.B70721A359@keescook/
> >
> > >
> > > It's the same thing that makes strcpy unsafe:
> > >
> > > Unchecked buffer lengths with no guarantee src is terminated.
> >
> 
> 
> OK, then stpcpy(), strcpy() and sprintf()
> have the same level of unsafety.

Yes. And even snprintf() is dangerous because its return value is how
much it WOULD have written, which when (commonly) used as an offset for
further pointer writes, causes OOB writes too. :(
https://github.com/KSPP/linux/issues/105
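
A minimal userspace sketch of that snprintf() pitfall, assuming standard C99
snprintf() semantics. The helper names offset_after_naive() and
offset_after_clamped() are hypothetical and exist only for illustration; in
the kernel, scnprintf() is the helper that returns the number of characters
actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fragile pattern: treats snprintf()'s return value as "bytes written". */
static size_t offset_after_naive(char *buf, size_t size, const char *s)
{
	int n = snprintf(buf, size, "%s", s);

	return (size_t)n;	/* exceeds size - 1 when output was truncated */
}

/* Clamped variant: the offset never points past the terminating NUL. */
static size_t offset_after_clamped(char *buf, size_t size, const char *s)
{
	int n;

	assert(size > 0);
	n = snprintf(buf, size, "%s", s);
	if (n < 0)
		return 0;		/* output error */
	if ((size_t)n >= size)
		return size - 1;	/* truncated; buf[size - 1] is the NUL */
	return (size_t)n;
}
```

Using the naive offset as `buf + offset` for a follow-up write lands outside
the buffer whenever the first write was truncated.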

> strcpy() is used everywhere.

Yes. It's very frustrating, but it's not an excuse to continue
using it nor introducing more bad APIs.

$ git grep '\bstrcpy\b' | wc -l
2212
$ git grep '\bstrncpy\b' | wc -l
751
$ git grep '\bstrlcpy\b' | wc -l
1712

$ git grep '\bstrscpy\b' | wc -l
1066

https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
https://github.com/KSPP/linux/issues/88

https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
https://github.com/KSPP/linux/issues/89

https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
https://github.com/KSPP/linux/issues/90

We have no way right now to block the addition of deprecated API usage,
which makes ever catching up on this replacement very challenging. The
only way we caught up with VLA removal was because of -Wvla on sfr's
-next builds.

I guess we could set up a robot to just watch -next commits and yell
about new instances, but patches come and go -- I worry it'd be noisy...

> I am not convinced why only stpcpy() should be hidden.

Because nothing uses it right now. It's only the compiler suddenly now
trying to use it directly...

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-26 23:38         ` Kees Cook
@ 2020-08-26 23:57           ` Joe Perches
  2020-08-27  2:33             ` Kees Cook
  2020-08-27  8:59           ` Andy Shevchenko
  1 sibling, 1 reply; 25+ messages in thread
From: Joe Perches @ 2020-08-26 23:57 UTC (permalink / raw)
  To: Kees Cook, Masahiro Yamada
  Cc: Nick Desaulniers, clang-built-linux, stable, Andy Lavr,
	Arvind Sankar, Rasmus Villemoes, Sami Tolvanen, Andrew Morton,
	Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Wed, 2020-08-26 at 16:38 -0700, Kees Cook wrote:
> On Thu, Aug 27, 2020 at 07:59:45AM +0900, Masahiro Yamada wrote:
[]
> > OK, then stpcpy(), strcpy() and sprintf()
> > have the same level of unsafety.
> 
> Yes. And even snprintf() is dangerous because its return value is how
> much it WOULD have written, which when (commonly) used as an offset for
> further pointer writes, causes OOB writes too. :(
> https://github.com/KSPP/linux/issues/105
> 
> > strcpy() is used everywhere.
> 
> Yes. It's very frustrating, but it's not an excuse to continue
> using it nor introducing more bad APIs.
> 
> $ git grep '\bstrcpy\b' | wc -l
> 2212
> $ git grep '\bstrncpy\b' | wc -l
> 751
> $ git grep '\bstrlcpy\b' | wc -l
> 1712
> 
> $ git grep '\bstrscpy\b' | wc -l
> 1066
> 
> https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
> https://github.com/KSPP/linux/issues/88
> 
> https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
> https://github.com/KSPP/linux/issues/89
> 
> https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
> https://github.com/KSPP/linux/issues/90
> 
> We have no way right now to block the addition of deprecated API usage,
> which makes ever catching up on this replacement very challenging.

These could be added to checkpatch's deprecated_api test.
---
 scripts/checkpatch.pl | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 149518d2a6a7..f9ccb2a63a95 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -605,6 +605,9 @@ foreach my $entry (@mode_permission_funcs) {
 $mode_perms_search = "(?:${mode_perms_search})";
 
 our %deprecated_apis = (
+	"strcpy"				=> "strscpy",
+	"strncpy"				=> "strscpy",
+	"strlcpy"				=> "strscpy",
 	"synchronize_rcu_bh"			=> "synchronize_rcu",
 	"synchronize_rcu_bh_expedited"		=> "synchronize_rcu_expedited",
 	"call_rcu_bh"				=> "call_rcu",



^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-26 23:57           ` Joe Perches
@ 2020-08-27  2:33             ` Kees Cook
  2020-08-27  2:42               ` Joe Perches
  0 siblings, 1 reply; 25+ messages in thread
From: Kees Cook @ 2020-08-27  2:33 UTC (permalink / raw)
  To: Joe Perches
  Cc: Masahiro Yamada, Nick Desaulniers, clang-built-linux, stable,
	Andy Lavr, Arvind Sankar, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Wed, Aug 26, 2020 at 04:57:41PM -0700, Joe Perches wrote:
> On Wed, 2020-08-26 at 16:38 -0700, Kees Cook wrote:
> > On Thu, Aug 27, 2020 at 07:59:45AM +0900, Masahiro Yamada wrote:
> []
> > > OK, then stpcpy(), strcpy() and sprintf()
> > > have the same level of unsafety.
> > 
> > Yes. And even snprintf() is dangerous because its return value is how
> > much it WOULD have written, which when (commonly) used as an offset for
> > further pointer writes, causes OOB writes too. :(
> > https://github.com/KSPP/linux/issues/105
> > 
> > > strcpy() is used everywhere.
> > 
> > Yes. It's very frustrating, but it's not an excuse to continue
> > using it nor introducing more bad APIs.
> > 
> > $ git grep '\bstrcpy\b' | wc -l
> > 2212
> > $ git grep '\bstrncpy\b' | wc -l
> > 751
> > $ git grep '\bstrlcpy\b' | wc -l
> > 1712
> > 
> > $ git grep '\bstrscpy\b' | wc -l
> > 1066
> > 
> > https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
> > https://github.com/KSPP/linux/issues/88
> > 
> > https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
> > https://github.com/KSPP/linux/issues/89
> > 
> > https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
> > https://github.com/KSPP/linux/issues/90
> > 
> > We have no way right now to block the addition of deprecated API usage,
> > which makes ever catching up on this replacement very challenging.
> 
> These could be added to checkpatch's deprecated_api test.
> ---
>  scripts/checkpatch.pl | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 149518d2a6a7..f9ccb2a63a95 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -605,6 +605,9 @@ foreach my $entry (@mode_permission_funcs) {
>  $mode_perms_search = "(?:${mode_perms_search})";
>  
>  our %deprecated_apis = (
> +	"strcpy"				=> "strscpy",
> +	"strncpy"				=> "strscpy",
> +	"strlcpy"				=> "strscpy",
>  	"synchronize_rcu_bh"			=> "synchronize_rcu",
>  	"synchronize_rcu_bh_expedited"		=> "synchronize_rcu_expedited",
>  	"call_rcu_bh"				=> "call_rcu",
> 
> 

Good idea, yeah. We, unfortunately, need to leave strncpy() off this
list for now because it's not *strictly* deprecated (see the notes in
bug report[1]), but the others can be.

[1] https://github.com/KSPP/linux/issues/89

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-27  2:33             ` Kees Cook
@ 2020-08-27  2:42               ` Joe Perches
  2020-08-27 18:26                 ` Kees Cook
  0 siblings, 1 reply; 25+ messages in thread
From: Joe Perches @ 2020-08-27  2:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: Masahiro Yamada, Nick Desaulniers, clang-built-linux, stable,
	Andy Lavr, Arvind Sankar, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Wed, 2020-08-26 at 19:33 -0700, Kees Cook wrote:
> On Wed, Aug 26, 2020 at 04:57:41PM -0700, Joe Perches wrote:
> > On Wed, 2020-08-26 at 16:38 -0700, Kees Cook wrote:
> > > On Thu, Aug 27, 2020 at 07:59:45AM +0900, Masahiro Yamada wrote:
> > []
> > > > OK, then stpcpy(), strcpy() and sprintf()
> > > > have the same level of unsafety.
> > > 
> > > Yes. And even snprintf() is dangerous because its return value is how
> > > much it WOULD have written, which when (commonly) used as an offset for
> > > further pointer writes, causes OOB writes too. :(
> > > https://github.com/KSPP/linux/issues/105
> > > 
> > > > strcpy() is used everywhere.
> > > 
> > > Yes. It's very frustrating, but it's not an excuse to continue
> > > using it nor introducing more bad APIs.
> > > 
> > > $ git grep '\bstrcpy\b' | wc -l
> > > 2212
> > > $ git grep '\bstrncpy\b' | wc -l
> > > 751
> > > $ git grep '\bstrlcpy\b' | wc -l
> > > 1712
> > > 
> > > $ git grep '\bstrscpy\b' | wc -l
> > > 1066
> > > 
> > > https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
> > > https://github.com/KSPP/linux/issues/88
> > > 
> > > https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
> > > https://github.com/KSPP/linux/issues/89
> > > 
> > > https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
> > > https://github.com/KSPP/linux/issues/90
> > > 
> > > We have no way right now to block the addition of deprecated API usage,
> > > which makes ever catching up on this replacement very challenging.
> > 
> > These could be added to checkpatch's deprecated_api test.
> > ---
> >  scripts/checkpatch.pl | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > index 149518d2a6a7..f9ccb2a63a95 100755
> > --- a/scripts/checkpatch.pl
> > +++ b/scripts/checkpatch.pl
> > @@ -605,6 +605,9 @@ foreach my $entry (@mode_permission_funcs) {
> >  $mode_perms_search = "(?:${mode_perms_search})";
> >  
> >  our %deprecated_apis = (
> > +	"strcpy"				=> "strscpy",
> > +	"strncpy"				=> "strscpy",
> > +	"strlcpy"				=> "strscpy",
> >  	"synchronize_rcu_bh"			=> "synchronize_rcu",
> >  	"synchronize_rcu_bh_expedited"		=> "synchronize_rcu_expedited",
> >  	"call_rcu_bh"				=> "call_rcu",
> > 
> > 
> 
> Good idea, yeah. We, unfortunately, need to leave strncpy() off this
> list for now because it's not *strictly* deprecated (see the notes in
> bug report[1]), but the others can be.

OK, but it is in Documentation/process/deprecated.rst

strncpy() on NUL-terminated strings
-----------------------------------
Use of strncpy() does not guarantee that the destination buffer
will be NUL terminated. This can lead to various linear read overflows
and other misbehavior due to the missing termination. It also NUL-pads the
destination buffer if the source contents are shorter than the destination
buffer size, which may be a needless performance penalty for callers using
only NUL-terminated strings. The safe replacement is strscpy().
(Users of strscpy() still needing NUL-padding should instead
use strscpy_pad().)

If a caller is using non-NUL-terminated strings, strncpy() can
still be used, but destinations should be marked with the `__nonstring
<https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html>`_
attribute to avoid future compiler warnings.
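
The two strncpy() behaviours described in that excerpt (no guaranteed
termination, and NUL-padding of short sources) can be observed with a small
userspace sketch. FIELD_LEN, fill_field() and field_is_terminated() are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <string.h>

#define FIELD_LEN 4

/* Returns 1 if the FIELD_LEN-byte field contains a NUL byte. */
static int field_is_terminated(const char *field)
{
	return memchr(field, '\0', FIELD_LEN) != NULL;
}

/* Pre-fills the field with 'x' so that strncpy()'s padding is visible. */
static void fill_field(char *field, const char *src)
{
	assert(field != NULL && src != NULL);
	memset(field, 'x', FIELD_LEN);
	strncpy(field, src, FIELD_LEN);
}
```

With a source of FIELD_LEN or more characters the field ends up without a
terminator; with a shorter source every remaining byte is overwritten with
NUL.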



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-26 23:38         ` Kees Cook
  2020-08-26 23:57           ` Joe Perches
@ 2020-08-27  8:59           ` Andy Shevchenko
  2020-08-27 18:30             ` Kees Cook
  1 sibling, 1 reply; 25+ messages in thread
From: Andy Shevchenko @ 2020-08-27  8:59 UTC (permalink / raw)
  To: Kees Cook
  Cc: Masahiro Yamada, Nick Desaulniers, Joe Perches,
	clang-built-linux, stable, Andy Lavr, Arvind Sankar,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Andy Shevchenko,
	Alexandru Ardelean, Yury Norov, Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 2:40 AM Kees Cook <keescook@chromium.org> wrote:
> On Thu, Aug 27, 2020 at 07:59:45AM +0900, Masahiro Yamada wrote:
> > On Thu, Aug 27, 2020 at 1:58 AM Nick Desaulniers
> > <ndesaulniers@google.com> wrote:
> > > On Wed, Aug 26, 2020 at 9:57 AM Joe Perches <joe@perches.com> wrote:
> > > > On Thu, 2020-08-27 at 01:49 +0900, Masahiro Yamada wrote:
> > > > > I do not have time to keep track of the discussion fully,
> > > > > but could you give me a little more context why
> > > > > the usage of stpcpy() is not recommended ?
> > > > >
> > > > > The implementation of strcpy() is almost the same.
> > > > > It is unclear to me what makes stpcpy() unsafe..
> > >
> > > https://lore.kernel.org/lkml/202008150921.B70721A359@keescook/
> > >
> > > >
> > > > It's the same thing that makes strcpy unsafe:
> > > >
> > > > Unchecked buffer lengths with no guarantee src is terminated.
> > >
> >
> >
> > OK, then stpcpy(), strcpy() and sprintf()
> > have the same level of unsafety.
>
> Yes. And even snprintf() is dangerous because its return value is how
> much it WOULD have written, which when (commonly) used as an offset for
> further pointer writes, causes OOB writes too. :(
> https://github.com/KSPP/linux/issues/105
>
> > strcpy() is used everywhere.
>
> Yes. It's very frustrating, but it's not an excuse to continue
> using it nor introducing more bad APIs.

strcpy() is not a bad API for the cases when you know what you are
doing. The problem is that most developers do not know what they are
doing.
No need to split everything into bad and good by its name or semantics,
each API has its own pros and cons and programmers must use their
brains.

>
> $ git grep '\bstrcpy\b' | wc -l
> 2212
> $ git grep '\bstrncpy\b' | wc -l
> 751
> $ git grep '\bstrlcpy\b' | wc -l
> 1712
>
> $ git grep '\bstrscpy\b' | wc -l
> 1066
>
> https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
> https://github.com/KSPP/linux/issues/88
>
> https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
> https://github.com/KSPP/linux/issues/89
>
> https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
> https://github.com/KSPP/linux/issues/90
>
> We have no way right now to block the addition of deprecated API usage,
> which makes ever catching up on this replacement very challenging. The
> only way we caught up with VLA removal was because of -Wvla on sfr's
> -next builds.
>
> I guess we could set up a robot to just watch -next commits and yell
> about new instances, but patches come and go -- I worry it'd be noisy...
>
> > I am not convinced why only stpcpy() should be hidden.
>
> Because nothing uses it right now. It's only the compiler suddenly now
> trying to use it directly...
>
> --
> Kees Cook



-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-27  2:42               ` Joe Perches
@ 2020-08-27 18:26                 ` Kees Cook
  0 siblings, 0 replies; 25+ messages in thread
From: Kees Cook @ 2020-08-27 18:26 UTC (permalink / raw)
  To: Joe Perches
  Cc: Masahiro Yamada, Nick Desaulniers, clang-built-linux, stable,
	Andy Lavr, Arvind Sankar, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Wed, Aug 26, 2020 at 07:42:17PM -0700, Joe Perches wrote:
> On Wed, 2020-08-26 at 19:33 -0700, Kees Cook wrote:
> > On Wed, Aug 26, 2020 at 04:57:41PM -0700, Joe Perches wrote:
> > > On Wed, 2020-08-26 at 16:38 -0700, Kees Cook wrote:
> > > > On Thu, Aug 27, 2020 at 07:59:45AM +0900, Masahiro Yamada wrote:
> > > []
> > > > > OK, then stpcpy(), strcpy() and sprintf()
> > > > > have the same level of unsafety.
> > > > 
> > > > Yes. And even snprintf() is dangerous because its return value is how
> > > > much it WOULD have written, which when (commonly) used as an offset for
> > > > further pointer writes, causes OOB writes too. :(
> > > > https://github.com/KSPP/linux/issues/105
> > > > 
> > > > > strcpy() is used everywhere.
> > > > 
> > > > Yes. It's very frustrating, but it's not an excuse to continue
> > > > using it nor introducing more bad APIs.
> > > > 
> > > > $ git grep '\bstrcpy\b' | wc -l
> > > > 2212
> > > > $ git grep '\bstrncpy\b' | wc -l
> > > > 751
> > > > $ git grep '\bstrlcpy\b' | wc -l
> > > > 1712
> > > > 
> > > > $ git grep '\bstrscpy\b' | wc -l
> > > > 1066
> > > > 
> > > > https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
> > > > https://github.com/KSPP/linux/issues/88
> > > > 
> > > > https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
> > > > https://github.com/KSPP/linux/issues/89
> > > > 
> > > > https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
> > > > https://github.com/KSPP/linux/issues/90
> > > > 
> > > > We have no way right now to block the addition of deprecated API usage,
> > > > which makes ever catching up on this replacement very challenging.
> > > 
> > > These could be added to checkpatch's deprecated_api test.
> > > ---
> > >  scripts/checkpatch.pl | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > > index 149518d2a6a7..f9ccb2a63a95 100755
> > > --- a/scripts/checkpatch.pl
> > > +++ b/scripts/checkpatch.pl
> > > @@ -605,6 +605,9 @@ foreach my $entry (@mode_permission_funcs) {
> > >  $mode_perms_search = "(?:${mode_perms_search})";
> > >  
> > >  our %deprecated_apis = (
> > > +	"strcpy"				=> "strscpy",
> > > +	"strncpy"				=> "strscpy",
> > > +	"strlcpy"				=> "strscpy",
> > >  	"synchronize_rcu_bh"			=> "synchronize_rcu",
> > >  	"synchronize_rcu_bh_expedited"		=> "synchronize_rcu_expedited",
> > >  	"call_rcu_bh"				=> "call_rcu",
> > > 
> > > 
> > 
> > Good idea, yeah. We, unfortunately, need to leave strncpy() off this
> > list for now because it's not *strictly* deprecated (see the notes in
> > bug report[1]), but the others can be.
> 
> OK, but it is in Documentation/process/deprecated.rst
> 
> strncpy() on NUL-terminated strings

"... on NUL-terminated strings". It's "valid" to use it on known-size
(either external or by definition) NUL-padded buffers (e.g. NLA_STRING).

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-27  8:59           ` Andy Shevchenko
@ 2020-08-27 18:30             ` Kees Cook
  2020-08-27 19:37               ` Joe Perches
  2020-08-27 20:05               ` Andy Shevchenko
  0 siblings, 2 replies; 25+ messages in thread
From: Kees Cook @ 2020-08-27 18:30 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Masahiro Yamada, Nick Desaulniers, Joe Perches,
	clang-built-linux, stable, Andy Lavr, Arvind Sankar,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Andy Shevchenko,
	Alexandru Ardelean, Yury Norov, Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 11:59:24AM +0300, Andy Shevchenko wrote:
> strcpy() is not a bad API for the cases when you know what you are
> doing. The problem is that most developers do not know what they are
> doing.
> No need to split everything into bad and good by its name or semantics,
> each API has its own pros and cons and programmers must use their
> brains.

I equate "unsafe" or "fragile" with "bad". There's no reason to use our
brains for remembering what's safe or not when we can just remove unsafe
things from the available APIs, and/or lean on the compiler to help
(e.g. CONFIG_FORTIFY_SOURCE).

Most of the uses of strcpy() in the kernel are just copying between two
known-at-compile-time NUL-terminated character arrays. We had wanted to
introduce stracpy() for this, but Linus objected to yet more string
functions. So for now, I'm aiming to remove strlcpy() completely first,
then to look at strcpy() -> strscpy() for cases where target size is NOT
compile-time known, and then to convert the kernel's strcpy() into
_requiring_ that source/dest lengths are known at compile time.

And then tackle strncpy(), which is a mess.
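
The idea of a copy that takes its bound from the destination array itself
might be sketched as below. This is not the proposed stracpy() nor the
kernel's strscpy(): bounded_copy() and copy_to_array() are hypothetical
names, the -1 truncation result is an arbitrary choice for the sketch, and
the macro assumes the GCC/Clang extensions __typeof__,
__builtin_types_compatible_p and statement expressions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copies at most size - 1 characters and always NUL-terminates dest.
 * Returns the copied length, or -1 if src had to be truncated.
 */
static long bounded_copy(char *dest, const char *src, size_t size)
{
	size_t i;

	assert(size > 0);
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dest[i] = src[i];
	dest[i] = '\0';
	return src[i] == '\0' ? (long)i : -1;
}

/*
 * Array-only wrapper: the bound is sizeof(dest), so the caller cannot pass
 * a wrong length, and passing a plain pointer fails to compile.
 */
#define copy_to_array(dest, src) ({					\
	_Static_assert(!__builtin_types_compatible_p(			\
			__typeof__(dest), __typeof__(&(dest)[0])),	\
		       "dest must be an array");			\
	bounded_copy((dest), (src), sizeof(dest));			\
})
```

Because the size is derived from the type of dest, a mismatch between the
buffer and the length argument cannot be introduced at the call site.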

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-27 18:30             ` Kees Cook
@ 2020-08-27 19:37               ` Joe Perches
  2020-08-27 19:41                 ` Kees Cook
  2020-08-27 20:05               ` Andy Shevchenko
  1 sibling, 1 reply; 25+ messages in thread
From: Joe Perches @ 2020-08-27 19:37 UTC (permalink / raw)
  To: Kees Cook, Andy Shevchenko
  Cc: Masahiro Yamada, Nick Desaulniers, clang-built-linux, stable,
	Andy Lavr, Arvind Sankar, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Thu, 2020-08-27 at 11:30 -0700, Kees Cook wrote:

> Most of the uses of strcpy() in the kernel are just copying between two
> known-at-compile-time NUL-terminated character arrays. We had wanted to
> introduce stracpy() for this, but Linus objected to yet more string
> functions.

https://lore.kernel.org/kernel-hardening/24bb53c57767c1c2a8f266c305a670f7@sk2.org/T/

I still think stracpy is a good idea.

Maybe when the strcpy/strlcpy uses are removed
it'll be more acceptable.

And here's a cocci script to convert most of them.
https://lore.kernel.org/kernel-hardening/b9bb5550b264d4b29b2b20f7ff8b1b40d20def6a.camel@perches.com/



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-27 19:37               ` Joe Perches
@ 2020-08-27 19:41                 ` Kees Cook
  0 siblings, 0 replies; 25+ messages in thread
From: Kees Cook @ 2020-08-27 19:41 UTC (permalink / raw)
  To: Joe Perches
  Cc: Andy Shevchenko, Masahiro Yamada, Nick Desaulniers,
	clang-built-linux, stable, Andy Lavr, Arvind Sankar,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Andy Shevchenko,
	Alexandru Ardelean, Yury Norov, Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 12:37:03PM -0700, Joe Perches wrote:
> On Thu, 2020-08-27 at 11:30 -0700, Kees Cook wrote:
> 
> > Most of the uses of strcpy() in the kernel are just copying between two
> > known-at-compile-time NUL-terminated character arrays. We had wanted to
> > introduce stracpy() for this, but Linus objected to yet more string
> > functions.
> 
> https://lore.kernel.org/kernel-hardening/24bb53c57767c1c2a8f266c305a670f7@sk2.org/T/
> 
> I still think stracpy is a good idea.
> 
> Maybe when the strcpy/strlcpy uses are removed
> it'll be more acceptable.
> 
> And here's a cocci script to convert most of them.
> https://lore.kernel.org/kernel-hardening/b9bb5550b264d4b29b2b20f7ff8b1b40d20def6a.camel@perches.com/

Yeah, thanks again for that. Most of this is very mechanical. (strncpy is not, unfortunately)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-27 18:30             ` Kees Cook
  2020-08-27 19:37               ` Joe Perches
@ 2020-08-27 20:05               ` Andy Shevchenko
  2020-08-27 22:26                 ` Kees Cook
  1 sibling, 1 reply; 25+ messages in thread
From: Andy Shevchenko @ 2020-08-27 20:05 UTC (permalink / raw)
  To: Kees Cook
  Cc: Masahiro Yamada, Nick Desaulniers, Joe Perches,
	clang-built-linux, stable, Andy Lavr, Arvind Sankar,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Andy Shevchenko,
	Alexandru Ardelean, Yury Norov, Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 9:30 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, Aug 27, 2020 at 11:59:24AM +0300, Andy Shevchenko wrote:
> > strcpy() is not a bad API for the cases when you know what you are
> > doing. The problem is that most developers do not know what they are
> > doing.
> > No need to split everything into bad and good by its name or semantics,
> > each API has its own pros and cons and programmers must use their
> > brains.
>
> I equate "unsafe" or "fragile" with "bad". There's no reason to use our
> brains for remembering what's safe or not when we can just remove unsafe
> things from the available APIs, and/or lean on the compiler to help
> (e.g. CONFIG_FORTIFY_SOURCE).
>
> Most of the uses of strcpy() in the kernel are just copying between two
> known-at-compile-time NUL-terminated character arrays. We had wanted to
> introduce stracpy() for this, but Linus objected to yet more string
> functions. So for now, I'm aiming to remove strlcpy() completely first,
> then to look at strcpy() -> strscpy() for cases where target size is NOT
> compile-time known, and then to convert the kernel's strcpy() into
> _requiring_ that source/dest lengths are known at compile time.
>
> And then tackle strncpy(), which is a mess.

In general it's better to have a robust API, but what may go wrong
with the interface where we have no length of the buffer passed, but
we all know that it's PAGE_SIZE?
So, what's wrong with doing something like
strcpy(buf, "Yes, we know we won't overflow here\n");
?


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-27 20:05               ` Andy Shevchenko
@ 2020-08-27 22:26                 ` Kees Cook
  2020-08-28  8:17                   ` Andy Shevchenko
  0 siblings, 1 reply; 25+ messages in thread
From: Kees Cook @ 2020-08-27 22:26 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Masahiro Yamada, Nick Desaulniers, Joe Perches,
	clang-built-linux, stable, Andy Lavr, Arvind Sankar,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Andy Shevchenko,
	Alexandru Ardelean, Yury Norov, Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 11:05:42PM +0300, Andy Shevchenko wrote:
> In general it's better to have a robust API, but what may go wrong
> with the interface where we have no length of the buffer passed, but
> we all know that it's PAGE_SIZE?
> So, what's wrong with doing something like
> strcpy(buf, "Yes, we know we won't overflow here\n");

(There's a whole thread[1] about this right now, actually.)

The problem isn't the uses where it's safe (obviously), it's about the
uses where it is NOT safe. (Or _looks_ safe but isn't.) In order to
eliminate bug classes, we need to remove the APIs that are foot-guns. Even
if one developer never gets it wrong, others might.

[1] https://lore.kernel.org/lkml/c256eba42a564c01a8e470320475d46f@AcuMS.aculab.com/T/#mac95487d7ae427de03251b49b75dd4de40c2462d

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-27 22:26                 ` Kees Cook
@ 2020-08-28  8:17                   ` Andy Shevchenko
  2020-08-31 23:21                     ` Nick Desaulniers
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Shevchenko @ 2020-08-28  8:17 UTC (permalink / raw)
  To: Kees Cook
  Cc: Masahiro Yamada, Nick Desaulniers, Joe Perches,
	clang-built-linux, stable, Andy Lavr, Arvind Sankar,
	Rasmus Villemoes, Sami Tolvanen, Andrew Morton, Andy Shevchenko,
	Alexandru Ardelean, Yury Norov, Linux Kernel Mailing List

On Fri, Aug 28, 2020 at 1:26 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, Aug 27, 2020 at 11:05:42PM +0300, Andy Shevchenko wrote:
> > In general it's better to have a robust API, but what can go wrong
> > with an interface that is not passed the buffer length when we all
> > know that it's PAGE_SIZE?
> > So, what's wrong with doing something like
> > strcpy(buf, "Yes, we know we won't overflow here\n");
>
> (There's a whole thread[1] about this right now, actually.)
>
> The problem isn't the uses where it's safe (obviously), it's about the
> uses where it is NOT safe. (Or _looks_ safe but isn't.) In order to
> eliminate bug classes, we need to remove the APIs that are foot-guns. Even
> if one developer never gets it wrong, others might.
>
> [1] https://lore.kernel.org/lkml/c256eba42a564c01a8e470320475d46f@AcuMS.aculab.com/T/#mac95487d7ae427de03251b49b75dd4de40c2462d

Seems to me that this is a fixation on an abstract problem that never
exists (of course, if a developer has brains to think).

-- 
With Best Regards,
Andy Shevchenko


* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-28  8:17                   ` Andy Shevchenko
@ 2020-08-31 23:21                     ` Nick Desaulniers
  2020-09-01  8:51                       ` David Laight
  0 siblings, 1 reply; 25+ messages in thread
From: Nick Desaulniers @ 2020-08-31 23:21 UTC (permalink / raw)
  To: Andy Shevchenko, Kees Cook
  Cc: Masahiro Yamada, Joe Perches, clang-built-linux, stable,
	Andy Lavr, Arvind Sankar, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 1:59 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> strcpy() is not a bad API for the cases when you know what you are
> doing. The problem is that most developers do not know what they are
> doing.
> No need to split everything into bad and good by its name or semantics;
> each API has its own pros and cons and programmers must use their
> brains.

On Fri, Aug 28, 2020 at 1:17 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> Seems to me that this is a fixation on an abstract problem that never
> exists (of course, if a developer has brains to think).

Of course, no "True Scotsman" would accidentally misuse C string.h API!
https://yourlogicalfallacyis.com/no-true-scotsman

(I will note the irony of the off-by-one error in my v1 implementation of
stpcpy. I've also missed strncpy zeroing the rest of a destination
buffer before.  I might not be a "True Scotsman.")
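
For anyone who has not been bitten by the strncpy behaviour mentioned
above, a small userspace sketch of the two surprises follows. The
function names are hypothetical; only the standard strncpy(), memset()
and memchr() calls are relied upon.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if strncpy() zero-filled everything after the copied string. */
int strncpy_pads_tail(void)
{
	char buf[8];
	size_t i;

	memset(buf, 'x', sizeof(buf));
	strncpy(buf, "ab", sizeof(buf));	/* writes 'a', 'b', then six NULs */
	assert(buf[0] == 'a' && buf[1] == 'b');

	for (i = 2; i < sizeof(buf); i++)
		if (buf[i] != '\0')
			return 0;
	return 1;
}

/* Returns 1 if no NUL terminator is written when src fills the buffer. */
int strncpy_can_skip_terminator(void)
{
	char buf[4];

	strncpy(buf, "abcd", sizeof(buf));	/* exactly four bytes, no NUL */
	return memchr(buf, '\0', sizeof(buf)) == NULL;
}
```

Both functions are expected to return 1: the short source is padded out
to the full count, and the exactly-fitting source is left unterminated.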

On Thu, Aug 27, 2020 at 11:30 AM Kees Cook <keescook@chromium.org> wrote:
>
> I equate "unsafe" or "fragile" with "bad". There's no reason to use our
> brains for remembering what's safe or not when we can just remove unsafe
> things from the available APIs, and/or lean on the compiler to help
> (e.g. CONFIG_FORTIFY_SOURCE).

Having seatbelts is great (i.e. fortify source), but is no substitute
for driving carefully (having proper APIs that help me not shoot my
foot off).  I think it's nice to have *both*, but if I drove solely
relying on my seatbelts, we might all be in trouble.  Not disagreeing
with you, Kees.
-- 
Thanks,
~Nick Desaulniers


* RE: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-31 23:21                     ` Nick Desaulniers
@ 2020-09-01  8:51                       ` David Laight
  0 siblings, 0 replies; 25+ messages in thread
From: David Laight @ 2020-09-01  8:51 UTC (permalink / raw)
  To: 'Nick Desaulniers', Andy Shevchenko, Kees Cook
  Cc: Masahiro Yamada, Joe Perches, clang-built-linux, stable,
	Andy Lavr, Arvind Sankar, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

> Of course, no "True Scotsman" would accidentally misuse C string.h API!
> https://yourlogicalfallacyis.com/no-true-scotsman

Google will find plenty of:
	str[strlen(str)] = 0;
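
Presumably the point is that the line above cannot do anything useful:
strlen() only stops at a NUL that is already there, so the store writes 0
over a byte that is already 0, and if the buffer were not terminated,
strlen() would already have read past the end. A minimal sketch (the
function name is hypothetical):

```c
#include <assert.h>
#include <string.h>

/* Applies the idiom and returns 1 if the buffer is byte-for-byte unchanged. */
int idiom_changes_nothing(void)
{
	char str[8] = "abc";
	char before[8];

	assert(strlen(str) < sizeof(str));	/* str is properly terminated */

	memcpy(before, str, sizeof(str));
	str[strlen(str)] = 0;	/* stores 0 over the NUL strlen() just found */
	return memcmp(before, str, sizeof(str)) == 0;
}
```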

   David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-26 15:58 ` Masahiro Yamada
@ 2020-09-06  9:57   ` Kees Cook
  0 siblings, 0 replies; 25+ messages in thread
From: Kees Cook @ 2020-09-06  9:57 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Nick Desaulniers, clang-built-linux, stable, Andy Lavr,
	Arvind Sankar, Joe Perches, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Alexandru Ardelean, Yury Norov,
	Linux Kernel Mailing List

On Thu, Aug 27, 2020 at 12:58:35AM +0900, Masahiro Yamada wrote:
> On Tue, Aug 25, 2020 at 10:58 PM Nick Desaulniers
> <ndesaulniers@google.com> wrote:
> > [...]
> > +/**
> > + * stpcpy - copy a string from src to dest returning a pointer to the new end
> > + *          of dest, including src's %NUL-terminator. May overrun dest.
> > + * @dest: pointer to end of string being copied into. Must be large enough
> > + *        to receive copy.
> > + * @src: pointer to the beginning of string being copied from. Must not overlap
> > + *       dest.
> > + *
> > + * stpcpy differs from strcpy in a key way: the return value is the new
> > + * %NUL-terminated character. (for strcpy, the return value is a pointer to
> > + * src.
> 
> 
> return a pointer to src?
> 
> "man 3 strcpy" says:
> 
> The strcpy() and strncpy() functions return
> a pointer to the destination string *dest*.

Agreed; that's a typo.

-- 
Kees Cook


* Re: [PATCH v3] lib/string.c: implement stpcpy
  2020-08-25 14:00 Nick Desaulniers
@ 2020-08-26 15:22 ` Kees Cook
  0 siblings, 0 replies; 25+ messages in thread
From: Kees Cook @ 2020-08-26 15:22 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Masahiro Yamada, clang-built-linux, stable, Andy Lavr,
	Arvind Sankar, Joe Perches, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Yury Norov, Alexandru Ardelean,
	linux-kernel

On Tue, Aug 25, 2020 at 07:00:00AM -0700, Nick Desaulniers wrote:
> LLVM implemented a recent "libcall optimization" that lowers calls to
> `sprintf(dest, "%s", str)` where the return value is used to
> `stpcpy(dest, str) - dest`. This generally avoids the machinery involved
> in parsing format strings.  `stpcpy` is just like `strcpy` except it
> returns the pointer to the new tail of `dest`.  This optimization was
> introduced into clang-12.
> 
> Implement this so that we don't observe linkage failures due to missing
> symbol definitions for `stpcpy`.
> 
> Similar to last year's fire drill with:
> commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
> 
> The kernel is somewhere between a "freestanding" environment (no full libc)
> and "hosted" environment (many symbols from libc exist with the same
> type, function signature, and semantics).
> 
> As H. Peter Anvin notes, there's not really a great way to inform the
> compiler that you're targeting a freestanding environment but would like
> to opt-in to some libcall optimizations (see pr/47280 below), rather than
> opt-out.
> 
> Arvind notes, -fno-builtin-* behaves slightly differently between GCC
> and Clang, and Clang is missing many __builtin_* definitions, which I
> consider a bug in Clang and am working on fixing.
> 
> Masahiro summarizes the subtle distinction between compilers justly:
>   To prevent transformation from foo() into bar(), there are two ways in
>   Clang to do that; -fno-builtin-foo, and -fno-builtin-bar.  There is
>   only one in GCC; -fno-builtin-foo.
> 
> (Any difference in that behavior in Clang is likely a bug from a missing
> __builtin_* definition.)
> 
> Masahiro also notes:
>   We want to disable optimization from foo() to bar(),
>   but we may still benefit from the optimization from
>   foo() into something else. If GCC implements the same transform, we
>   would run into a problem because it is not -fno-builtin-bar, but
>   -fno-builtin-foo that disables that optimization.
> 
>   In this regard, -fno-builtin-foo would be more future-proof than
>   -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We
>   may want to prevent calls from foo() being optimized into calls to
>   bar(), but we still may want other optimization on calls to foo().
> 
> It seems that compilers today don't quite provide the fine-grained control
> over which libcall optimizations pseudo-freestanding environments would
> prefer.
> 
> Finally, Kees notes that this interface is unsafe, so we should not
> encourage its use.  As such, I've removed the declaration from any
> header, but it still needs to be exported to avoid linkage errors in
> modules.
> 
> Cc: stable@vger.kernel.org
> Link: https://bugs.llvm.org/show_bug.cgi?id=47162
> Link: https://bugs.llvm.org/show_bug.cgi?id=47280
> Link: https://github.com/ClangBuiltLinux/linux/issues/1126
> Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
> Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
> Link: https://reviews.llvm.org/D85963
> Suggested-by: Andy Lavr <andy.lavr@gmail.com>
> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
> Suggested-by: Joe Perches <joe@perches.com>
> Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Reported-by: Sami Tolvanen <samitolvanen@google.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Acked-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook


* [PATCH v3] lib/string.c: implement stpcpy
@ 2020-08-25 14:00 Nick Desaulniers
  2020-08-26 15:22 ` Kees Cook
  0 siblings, 1 reply; 25+ messages in thread
From: Nick Desaulniers @ 2020-08-25 14:00 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: clang-built-linux, Nick Desaulniers, stable, Andy Lavr,
	Arvind Sankar, Joe Perches, Rasmus Villemoes, Sami Tolvanen,
	Andrew Morton, Andy Shevchenko, Kees Cook, Yury Norov,
	Alexandru Ardelean, linux-kernel

LLVM implemented a recent "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)` where the return value is used to
`stpcpy(dest, str) - dest`. This generally avoids the machinery involved
in parsing format strings.  `stpcpy` is just like `strcpy` except it
returns the pointer to the new tail of `dest`.  This optimization was
introduced into clang-12.
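
As a rough userspace illustration of that lowering, the sketch below
computes the length both ways and checks that they agree. sketch_stpcpy
is a hypothetical local stand-in modelled on the loop in this patch, so
the example does not depend on libc providing stpcpy; the other helper
names are likewise assumptions and not anything from LLVM or the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for stpcpy: copies src and returns a pointer to
 * the NUL terminator it wrote into dest. No bounds checking.
 */
static char *sketch_stpcpy(char *dest, const char *src)
{
	while ((*dest++ = *src++) != '\0')
		/* nothing */;
	return --dest;
}

/* Length as reported by the original call form. */
int len_via_sprintf(char *dest, const char *str)
{
	return sprintf(dest, "%s", str);
}

/* Length as computed by the lowered form: stpcpy(dest, str) - dest. */
int len_via_stpcpy(char *dest, const char *str)
{
	return (int)(sketch_stpcpy(dest, str) - dest);
}

/* Returns 1 if both forms produce the same bytes and the same length. */
int forms_agree(const char *str)
{
	char a[64], b[64];

	assert(strlen(str) < sizeof(a));	/* neither form checks bounds */
	return len_via_sprintf(a, str) == len_via_stpcpy(b, str) &&
	       strcmp(a, b) == 0;
}
```

The pointer subtraction only yields the length because the returned
pointer refers to the terminator itself rather than one byte past it.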

Implement this so that we don't observe linkage failures due to missing
symbol definitions for `stpcpy`.

Similar to last year's fire drill with:
commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")

The kernel is somewhere between a "freestanding" environment (no full libc)
and "hosted" environment (many symbols from libc exist with the same
type, function signature, and semantics).

As H. Peter Anvin notes, there's not really a great way to inform the
compiler that you're targeting a freestanding environment but would like
to opt-in to some libcall optimizations (see pr/47280 below), rather than
opt-out.

Arvind notes, -fno-builtin-* behaves slightly differently between GCC
and Clang, and Clang is missing many __builtin_* definitions, which I
consider a bug in Clang and am working on fixing.

Masahiro summarizes the subtle distinction between compilers justly:
  To prevent transformation from foo() into bar(), there are two ways in
  Clang to do that; -fno-builtin-foo, and -fno-builtin-bar.  There is
  only one in GCC; -fno-builtin-foo.

(Any difference in that behavior in Clang is likely a bug from a missing
__builtin_* definition.)

Masahiro also notes:
  We want to disable optimization from foo() to bar(),
  but we may still benefit from the optimization from
  foo() into something else. If GCC implements the same transform, we
  would run into a problem because it is not -fno-builtin-bar, but
  -fno-builtin-foo that disables that optimization.

  In this regard, -fno-builtin-foo would be more future-proof than
  -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We
  may want to prevent calls from foo() being optimized into calls to
  bar(), but we still may want other optimization on calls to foo().

It seems that compilers today don't quite provide the fine-grained control
over which libcall optimizations pseudo-freestanding environments would
prefer.

Finally, Kees notes that this interface is unsafe, so we should not
encourage its use.  As such, I've removed the declaration from any
header, but it still needs to be exported to avoid linkage errors in
modules.

Cc: stable@vger.kernel.org
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://bugs.llvm.org/show_bug.cgi?id=47280
Link: https://github.com/ClangBuiltLinux/linux/issues/1126
Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Link: https://reviews.llvm.org/D85963
Suggested-by: Andy Lavr <andy.lavr@gmail.com>
Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
Suggested-by: Joe Perches <joe@perches.com>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Changes V3:
* Drop Sami's Tested-by tag; newer patch.
* Add EXPORT_SYMBOL as per Andy.
* Rewrite commit message, rewrote part of what Masahiro said to be
  generic in terms of foo() and bar().
* Prefer %NUL-terminated to NULL terminated. NUL is the ASCII character
  '\0', as per Arvind and Rasmus.

Changes V2:
* Added Sami's Tested-by; though the patch changed implementation, the
  missing symbol at link time was the problem Sami was observing.
* Fix __restrict -> __restrict__ typo as per Joe.
* Drop note about restrict from commit message as per Arvind.
* Fix NULL -> NUL as per Arvind; NUL is ASCII '\0'. TIL
* Fix off-by-one error as per Arvind; I had another off-by-one error in
  my test program that was masking this.

 lib/string.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/lib/string.c b/lib/string.c
index 6012c385fb31..6bd0cf0fb009 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -272,6 +272,30 @@ ssize_t strscpy_pad(char *dest, const char *src, size_t count)
 }
 EXPORT_SYMBOL(strscpy_pad);
 
+/**
+ * stpcpy - copy a string from src to dest returning a pointer to the new end
+ *          of dest, including src's %NUL-terminator. May overrun dest.
+ * @dest: pointer to end of string being copied into. Must be large enough
+ *        to receive copy.
+ * @src: pointer to the beginning of string being copied from. Must not overlap
+ *       dest.
+ *
+ * stpcpy differs from strcpy in a key way: the return value is a pointer
+ * to the new %NUL-terminating character in @dest. (For strcpy, the return
+ * value is a pointer to the start of @dest.) This interface is considered
+ * unsafe as it doesn't perform bounds checking of the inputs. As such it's
+ * not recommended for usage. Instead, its definition is provided in case
+ * the compiler lowers other libcalls to stpcpy.
+ */
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src);
+char *stpcpy(char *__restrict__ dest, const char *__restrict__ src)
+{
+	while ((*dest++ = *src++) != '\0')
+		/* nothing */;
+	return --dest;
+}
+EXPORT_SYMBOL(stpcpy);
+
 #ifndef __HAVE_ARCH_STRCAT
 /**
  * strcat - Append one %NUL-terminated string to another
-- 
2.28.0.297.g1956fa8f8d-goog



Thread overview: 25+ messages
2020-08-25 13:58 [PATCH v3] lib/string.c: implement stpcpy Nick Desaulniers
2020-08-25 18:51 ` Nathan Chancellor
2020-08-26 15:41 ` Sedat Dilek
2020-08-26 15:58 ` Masahiro Yamada
2020-09-06  9:57   ` Kees Cook
2020-08-26 16:49 ` Masahiro Yamada
2020-08-26 16:57   ` Joe Perches
2020-08-26 16:58     ` Nick Desaulniers
2020-08-26 22:59       ` Masahiro Yamada
2020-08-26 23:38         ` Kees Cook
2020-08-26 23:57           ` Joe Perches
2020-08-27  2:33             ` Kees Cook
2020-08-27  2:42               ` Joe Perches
2020-08-27 18:26                 ` Kees Cook
2020-08-27  8:59           ` Andy Shevchenko
2020-08-27 18:30             ` Kees Cook
2020-08-27 19:37               ` Joe Perches
2020-08-27 19:41                 ` Kees Cook
2020-08-27 20:05               ` Andy Shevchenko
2020-08-27 22:26                 ` Kees Cook
2020-08-28  8:17                   ` Andy Shevchenko
2020-08-31 23:21                     ` Nick Desaulniers
2020-09-01  8:51                       ` David Laight
2020-08-25 14:00 Nick Desaulniers
2020-08-26 15:22 ` Kees Cook
