kernel-hardening.lists.openwall.com archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] lib: Add safe string funtions
@ 2019-02-18 23:23 Tobin C. Harding
  2019-02-18 23:23 ` [PATCH 1/6] lib/string: Enable string selftesting Tobin C. Harding
                   ` (6 more replies)
  0 siblings, 7 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-18 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, kernel-hardening,
	linux-kernel

No file maintainer, CC'ing all those who touched this file :) And
Shua for kselftest stuff.


Hi Kess,

During your talk at LCA you mentioned that we could do with a couple
more safe string functions.  One to zero the tail of the destination
buffer after call to strscpy() and also the self explanatory
strscpy_from_user().

Here is a patch set with my attempts to implement these two functions.

While doing this I noticed that we have a test module for lib/string.c
(lib/test_string.c) that is not tied into kselftest.  So I enable this
first up in patch 1.

Patch 2 and 3 are function docstring cleanups.

Patch 4 adds the copy and zero function, naming it strscpy_zeroed().
I'd love some help naming this better.  Patch also adds test code.

Patch 5 fixes function docstring to correctly document the behavior of
strncpy_from_user().

Patch 6 adds strscpy_from_user().  Does not include test code.

I had to do a bit of learning for getting the tests hooked into
kselftest, I think its all correct.  Module is built correctly when the
config option is present and the tests run both via

	make -C tools/testing/selftests TARGETS=lib run_tests

and via loading the module manually.  As a side note, this series leaves
tools/testing/selftests/lib with 4 shell scripts that are identical
except the test name.  We could probably do with refactoring them into a
single script.

Patchset introduces a checkpatch warning

	WARNING:CONFIG_DESCRIPTION: please write a paragraph that describes the config symbol fully

I couldn't work out if this is a false positive or not?  Does the new
config option CONFIG_TEST_STRING need more documentation?  I don't see
where extra docs should be added and it seems self explanatory as is.


thanks,
Tobin.

Tobin C. Harding (6):
  lib/string: Enable string selftesting
  lib/string: Fix erroneous 'overflow' documentation
  lib/string: Use correct docstring format
  lib/string: Add string copy/zero function
  lib: Fix function documentation for strncpy_from_user
  lib: Add function strscpy_from_user()

 include/linux/string.h                |  4 ++
 lib/Kconfig.debug                     | 14 +++++++
 lib/Makefile                          |  2 +-
 lib/string.c                          | 41 ++++++++++++++----
 lib/strncpy_from_user.c               | 60 ++++++++++++++++++++++-----
 lib/test_string.c                     | 35 +++++++++++++++-
 tools/testing/selftests/lib/Makefile  |  2 +-
 tools/testing/selftests/lib/config    |  1 +
 tools/testing/selftests/lib/string.sh | 19 +++++++++
 9 files changed, 157 insertions(+), 21 deletions(-)
 create mode 100755 tools/testing/selftests/lib/string.sh

-- 
2.20.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 1/6] lib/string: Enable string selftesting
  2019-02-18 23:23 [PATCH 0/6] lib: Add safe string funtions Tobin C. Harding
@ 2019-02-18 23:23 ` Tobin C. Harding
  2019-02-19 10:55   ` Andy Shevchenko
  2019-02-20 23:57   ` Kees Cook
  2019-02-18 23:23 ` [PATCH 2/6] lib/string: Fix erroneous 'overflow' documentation Tobin C. Harding
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-18 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, kernel-hardening,
	linux-kernel

Currently we have a test module but it is not tied into the kselftest
infrastructure.  In preparation for adding string manipulation functions
and testing we should enable kselftest to utilize the test module.

Enable string testing via kselftest infrastructure.

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
---
 lib/Kconfig.debug                     | 14 ++++++++++++++
 lib/Makefile                          |  2 +-
 lib/test_string.c                     |  4 ++--
 tools/testing/selftests/lib/Makefile  |  2 +-
 tools/testing/selftests/lib/config    |  1 +
 tools/testing/selftests/lib/string.sh | 19 +++++++++++++++++++
 6 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100755 tools/testing/selftests/lib/string.sh

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index d4df5b24d75e..0dca64c1d8a4 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1802,8 +1802,22 @@ config ASYNC_RAID6_TEST
 config TEST_HEXDUMP
 	tristate "Test functions located in the hexdump module at runtime"
 
+config TEST_STRING
+       tristate "Perform selftest on string manipulation functions"
+       default n
+       help
+        Enable this option to test string manipulation functions.
+	Currently this only tests memset_{16,32,64}.
+
+	If unsure, say N.
+
 config TEST_STRING_HELPERS
 	tristate "Test functions located in the string_helpers module at runtime"
+	default n
+	help
+	 Enable this option to unit test code in lib/string_helpers.c
+
+	 If unsure, say N.
 
 config TEST_KSTRTOX
 	tristate "Test kstrto*() family of functions at runtime"
diff --git a/lib/Makefile b/lib/Makefile
index e1b59da71418..9c30e1fee27f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -39,7 +39,7 @@ obj-y += bcd.o div64.o sort.o parser.o debug_locks.o random32.o \
 	 bsearch.o find_bit.o llist.o memweight.o kfifo.o \
 	 percpu-refcount.o rhashtable.o reciprocal_div.o \
 	 once.o refcount.o usercopy.o errseq.o bucket_locks.o
-obj-$(CONFIG_STRING_SELFTEST) += test_string.o
+obj-$(CONFIG_TEST_STRING) += test_string.o
 obj-y += string_helpers.o
 obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
 obj-y += hexdump.o
diff --git a/lib/test_string.c b/lib/test_string.c
index 0fcdb82dca86..a9cba442389a 100644
--- a/lib/test_string.c
+++ b/lib/test_string.c
@@ -111,7 +111,7 @@ static __init int memset64_selftest(void)
 	return 0;
 }
 
-static __init int string_selftest_init(void)
+static __init int test_string_init(void)
 {
 	int test, subtest;
 
@@ -137,5 +137,5 @@ static __init int string_selftest_init(void)
 	return 0;
 }
 
-module_init(string_selftest_init);
+module_init(test_string_init);
 MODULE_LICENSE("GPL v2");
diff --git a/tools/testing/selftests/lib/Makefile b/tools/testing/selftests/lib/Makefile
index 70d5711e3ac8..2ee4559b277e 100644
--- a/tools/testing/selftests/lib/Makefile
+++ b/tools/testing/selftests/lib/Makefile
@@ -3,6 +3,6 @@
 # No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
 all:
 
-TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh
+TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh string.sh
 
 include ../lib.mk
diff --git a/tools/testing/selftests/lib/config b/tools/testing/selftests/lib/config
index 126933bcc950..2032402ad409 100644
--- a/tools/testing/selftests/lib/config
+++ b/tools/testing/selftests/lib/config
@@ -1,3 +1,4 @@
 CONFIG_TEST_PRINTF=m
 CONFIG_TEST_BITMAP=m
+CONFIG_TEST_STRING=m
 CONFIG_PRIME_NUMBERS=m
diff --git a/tools/testing/selftests/lib/string.sh b/tools/testing/selftests/lib/string.sh
new file mode 100755
index 000000000000..99024b6f3a6a
--- /dev/null
+++ b/tools/testing/selftests/lib/string.sh
@@ -0,0 +1,19 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# Runs string manipulation tests using test_string kernel module
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+if ! /sbin/modprobe -q -n test_string; then
+	echo "string: module test_string is not found [SKIP]"
+	exit $ksft_skip
+fi
+
+if /sbin/modprobe -q test_string; then
+	/sbin/modprobe -q -r test_string
+	echo "string: ok"
+else
+	echo "string: [FAIL]"
+	exit 1
+fi
-- 
2.20.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 2/6] lib/string: Fix erroneous 'overflow' documentation
  2019-02-18 23:23 [PATCH 0/6] lib: Add safe string funtions Tobin C. Harding
  2019-02-18 23:23 ` [PATCH 1/6] lib/string: Enable string selftesting Tobin C. Harding
@ 2019-02-18 23:23 ` Tobin C. Harding
  2019-02-21  0:02   ` Kees Cook
  2019-02-18 23:23 ` [PATCH 3/6] lib/string: Use correct docstring format Tobin C. Harding
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-18 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, kernel-hardening,
	linux-kernel

Current documentation uses 'overflow' to describe a situation where less
data is written to a buffer than buffer size not more.  'overflow' is
the wrong word here - since we don't typically say 'underflow' change
the whole sentence.

Fix erroneous 'overflow' documentation for under filled buffer.

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
---
 lib/string.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 38e4ca08e757..7f1d72db53c5 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -173,8 +173,8 @@ EXPORT_SYMBOL(strlcpy);
  *
  * Preferred to strncpy() since it always returns a valid string, and
  * doesn't unnecessarily force the tail of the destination buffer to be
- * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy()
- * with an overflow test, then just memset() the tail of the dest buffer.
+ * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy(),
+ * check the return size, then just memset() the tail of the dest buffer.
  */
 ssize_t strscpy(char *dest, const char *src, size_t count)
 {
-- 
2.20.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 3/6] lib/string: Use correct docstring format
  2019-02-18 23:23 [PATCH 0/6] lib: Add safe string funtions Tobin C. Harding
  2019-02-18 23:23 ` [PATCH 1/6] lib/string: Enable string selftesting Tobin C. Harding
  2019-02-18 23:23 ` [PATCH 2/6] lib/string: Fix erroneous 'overflow' documentation Tobin C. Harding
@ 2019-02-18 23:23 ` Tobin C. Harding
  2019-02-21  0:07   ` Kees Cook
  2019-02-18 23:23 ` [PATCH 4/6] lib/string: Add string copy/zero function Tobin C. Harding
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-18 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, kernel-hardening,
	linux-kernel

Currently the docstring comments for strscpy() are not in the correct
format.  Prior to working on this file fix up the docstring.

Use correct docstring format for strscpy().

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
---
 lib/string.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 7f1d72db53c5..65969cf32f5d 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -159,11 +159,9 @@ EXPORT_SYMBOL(strlcpy);
  * @src: Where to copy the string from
  * @count: Size of destination buffer
  *
- * Copy the string, or as much of it as fits, into the dest buffer.
- * The routine returns the number of characters copied (not including
- * the trailing NUL) or -E2BIG if the destination buffer wasn't big enough.
- * The behavior is undefined if the string buffers overlap.
- * The destination buffer is always NUL terminated, unless it's zero-sized.
+ * Copy the string, or as much of it as fits, into the dest buffer.  The
+ * behavior is undefined if the string buffers overlap.  The destination
+ * buffer is always NUL terminated, unless it's zero-sized.
  *
  * Preferred to strlcpy() since the API doesn't require reading memory
  * from the src string beyond the specified "count" bytes, and since
@@ -175,6 +173,9 @@ EXPORT_SYMBOL(strlcpy);
  * doesn't unnecessarily force the tail of the destination buffer to be
  * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy(),
  * check the return size, then just memset() the tail of the dest buffer.
+ *
+ * Return: The number of characters copied (not including the trailing
+ *         NUL) or -E2BIG if the destination buffer wasn't big enough.
  */
 ssize_t strscpy(char *dest, const char *src, size_t count)
 {
-- 
2.20.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 4/6] lib/string: Add string copy/zero function
  2019-02-18 23:23 [PATCH 0/6] lib: Add safe string funtions Tobin C. Harding
                   ` (2 preceding siblings ...)
  2019-02-18 23:23 ` [PATCH 3/6] lib/string: Use correct docstring format Tobin C. Harding
@ 2019-02-18 23:23 ` Tobin C. Harding
  2019-02-21  0:48   ` Kees Cook
  2019-02-18 23:23 ` [PATCH 5/6] lib: Fix function documentation for strncpy_from_user Tobin C. Harding
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-18 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, kernel-hardening,
	linux-kernel

We have a function to copy strings safely and we have a function to copy
strings _and_ zero the tail of the destination (if source string is
shorter than destination buffer) but we do not have a function to do
both at once.  This means developers must write this themselves if they
desire this functionality.  This is a chore, and also leaves us open to
off by one errors unnecessarily.

Add a function that calls strscpy() then memset()s the tail to zero if
the source string is shorter than the destination buffer.

Add testing via kselftest.

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
---
 include/linux/string.h |  4 ++++
 lib/Kconfig.debug      |  2 +-
 lib/string.c           | 30 ++++++++++++++++++++++++++++--
 lib/test_string.c      | 31 +++++++++++++++++++++++++++++++
 4 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/include/linux/string.h b/include/linux/string.h
index 7927b875f80c..695a5e6a31e3 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -31,6 +31,10 @@ size_t strlcpy(char *, const char *, size_t);
 #ifndef __HAVE_ARCH_STRSCPY
 ssize_t strscpy(char *, const char *, size_t);
 #endif
+
+/* Wrapper function, no arch specific code required */
+ssize_t strscpy_zeroed(char *dest, const char *src, size_t count);
+
 #ifndef __HAVE_ARCH_STRCAT
 extern char * strcat(char *, const char *);
 #endif
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 0dca64c1d8a4..faa15ff47c4f 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1807,7 +1807,7 @@ config TEST_STRING
        default n
        help
         Enable this option to test string manipulation functions.
-	Currently this only tests memset_{16,32,64}.
+	Currently this only tests memset_{16,32,64} and strscpy_zeroed().
 
 	If unsure, say N.
 
diff --git a/lib/string.c b/lib/string.c
index 65969cf32f5d..ff5106e8249f 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -171,8 +171,7 @@ EXPORT_SYMBOL(strlcpy);
  *
  * Preferred to strncpy() since it always returns a valid string, and
  * doesn't unnecessarily force the tail of the destination buffer to be
- * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy(),
- * check the return size, then just memset() the tail of the dest buffer.
+ * zeroed.  If the zeroing is desired use strscpy_zeroed().
  *
  * Return: The number of characters copied (not including the trailing
  *         NUL) or -E2BIG if the destination buffer wasn't big enough.
@@ -238,6 +237,33 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
 EXPORT_SYMBOL(strscpy);
 #endif
 
+/**
+ * strscopy_zeroed() - Copy a C-string into a sized buffer
+ * @dest: Where to copy the string to
+ * @src: Where to copy the string from
+ * @count: Size of destination buffer
+ *
+ * If the source string is shorter than the destination buffer, zeros
+ * the tail of the destination buffer.
+ *
+ * Return: The number of characters copied (not including the trailing
+ *         NUL) or -E2BIG if the destination buffer wasn't big enough.
+ */
+ssize_t strscpy_zeroed(char *dest, const char *src, size_t count)
+{
+	ssize_t written;
+
+	written = strscpy(dest, src, count);
+	if (written < 0)
+		return written;
+
+	if (written < count)
+		memset(dest + written, 0, count - written);
+
+	return written;
+}
+EXPORT_SYMBOL(strscpy_zeroed);
+
 #ifndef __HAVE_ARCH_STRCAT
 /**
  * strcat - Append one %NUL-terminated string to another
diff --git a/lib/test_string.c b/lib/test_string.c
index a9cba442389a..cc4eef51a395 100644
--- a/lib/test_string.c
+++ b/lib/test_string.c
@@ -111,6 +111,32 @@ static __init int memset64_selftest(void)
 	return 0;
 }
 
+static __init int strscpy_zeroed_selftest(void)
+{
+	char buf[6];
+	int written;
+
+	memset(buf, 'a', sizeof(buf));
+
+	written = strscpy_zeroed(buf, "bb", 4);
+	if (written != 2)
+		return 1;
+
+	/* Copied correctly */
+	if (buf[0] != 'b' || buf[1] != 'b')
+		return 2;
+
+	/* Zeroed correctly */
+	if (buf[2] != '\0' || buf[3] != '\0')
+		return 3;
+
+	/* Only touched what it was supposed to */
+	if (buf[4] != 'a' || buf[5] != 'a')
+		return 4;
+
+	return 0;
+}
+
 static __init int test_string_init(void)
 {
 	int test, subtest;
@@ -130,6 +156,11 @@ static __init int test_string_init(void)
 	if (subtest)
 		goto fail;
 
+	test = 4;
+	subtest = strscpy_zeroed_selftest();
+	if (subtest)
+		goto fail;
+
 	pr_info("String selftests succeeded\n");
 	return 0;
 fail:
-- 
2.20.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-18 23:23 [PATCH 0/6] lib: Add safe string funtions Tobin C. Harding
                   ` (3 preceding siblings ...)
  2019-02-18 23:23 ` [PATCH 4/6] lib/string: Add string copy/zero function Tobin C. Harding
@ 2019-02-18 23:23 ` Tobin C. Harding
  2019-02-19  0:51   ` Jann Horn
  2019-02-18 23:23 ` [PATCH 6/6] lib: Add function strscpy_from_user() Tobin C. Harding
  2019-02-20 23:31 ` [PATCH 0/6] lib: Add safe string funtions Kees Cook
  6 siblings, 1 reply; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-18 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, kernel-hardening,
	linux-kernel

Current function documentation for strncpy_from_user() is incorrect.  If
@count (size of destination buffer) is less than the length of the user
string the function does _not_ return @count but rather returns -EFAULT.

Document correctly the function return value, also add note that string
may be left non-null terminated.

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
---
 lib/strncpy_from_user.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 58eacd41526c..11fe9a4a00fd 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -82,22 +82,19 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
 }
 
 /**
- * strncpy_from_user: - Copy a NUL terminated string from userspace.
+ * strncpy_from_user() - Copy a NUL terminated string from userspace.
  * @dst:   Destination address, in kernel space.  This buffer must be at
  *         least @count bytes long.
  * @src:   Source address, in user space.
- * @count: Maximum number of bytes to copy, including the trailing NUL.
+ * @count: Maximum number of bytes to copy, including the trailing %NUL.
  *
  * Copies a NUL-terminated string from userspace to kernel space.
  *
- * On success, returns the length of the string (not including the trailing
- * NUL).
- *
- * If access to userspace fails, returns -EFAULT (some data may have been
- * copied).
- *
- * If @count is smaller than the length of the string, copies @count bytes
- * and returns @count.
+ * Return: If access to userspace fails, returns -EFAULT.  Otherwise,
+ *         return the number of characters copied excluding the trailing
+ *         %NUL, if the length of the user string exceeds @count return
+ *         -EFAULT (in which case @dst will be left without a %NUL
+ *         terminator).
  */
 long strncpy_from_user(char *dst, const char __user *src, long count)
 {
-- 
2.20.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 6/6] lib: Add function strscpy_from_user()
  2019-02-18 23:23 [PATCH 0/6] lib: Add safe string funtions Tobin C. Harding
                   ` (4 preceding siblings ...)
  2019-02-18 23:23 ` [PATCH 5/6] lib: Fix function documentation for strncpy_from_user Tobin C. Harding
@ 2019-02-18 23:23 ` Tobin C. Harding
  2019-02-19  2:09   ` Jann Horn
  2019-02-19  2:12   ` Jann Horn
  2019-02-20 23:31 ` [PATCH 0/6] lib: Add safe string funtions Kees Cook
  6 siblings, 2 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-18 23:23 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, kernel-hardening,
	linux-kernel

Currently we have strncpy_from_userspace().  If the user string is
longer than the destination kernel buffer we get an error code -EFAULT.
We are unable to recover from here because this is the same error
returned if the access to userspace fails totally.

There is no reason we cannot continue execution with the user string
truncated.

Add a function strscpy_from_user() that guarantees the string written is
null-terminated.  If user string is longer than destination buffer
truncates the string.  Returns the number of characters written
excluding the null-terminator.

Signed-off-by: Tobin C. Harding <tobin@kernel.org>
---
 lib/strncpy_from_user.c | 43 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 11fe9a4a00fd..6bd603ccec7a 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -120,3 +120,46 @@ long strncpy_from_user(char *dst, const char __user *src, long count)
 	return -EFAULT;
 }
 EXPORT_SYMBOL(strncpy_from_user);
+
+/**
+ * strscpy_from_user() - Copy a NUL terminated string from userspace.
+ * @dst:   Destination address, in kernel space.  This buffer must be at
+ *         least @count bytes long.
+ * @src:   Source address, in user space.
+ * @count: Maximum number of bytes to copy, including the trailing %NUL.
+ *
+ * Copies a NUL-terminated string from userspace to kernel space.  When
+ * the function returns @dst is guaranteed to be null terminated.
+ *
+ * Return: If access to userspace fails, returns -EFAULT.  Otherwise,
+ *         return the number of characters copied excluding the trailing
+ *         %NUL.
+ */
+long strscpy_from_user(char *dst, const char __user *src, long count)
+{
+	unsigned long max_addr, src_addr;
+
+	if (unlikely(count <= 0))
+		return 0;
+
+	max_addr = user_addr_max();
+	src_addr = (unsigned long)src;
+	if (likely(src_addr < max_addr)) {
+		unsigned long max = max_addr - src_addr;
+		long retval;
+
+		kasan_check_write(dst, count);
+		check_object_size(dst, count, false);
+		if (user_access_begin(src, max)) {
+			retval = do_strncpy_from_user(dst, src, count, max);
+			user_access_end();
+			if (retval == -EFAULT) {
+				dst[count-1] = '\0';
+				return count - 1;
+			}
+			return retval;
+		}
+	}
+	return -EFAULT;
+}
+EXPORT_SYMBOL(strscpy_from_user);
-- 
2.20.1

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-18 23:23 ` [PATCH 5/6] lib: Fix function documentation for strncpy_from_user Tobin C. Harding
@ 2019-02-19  0:51   ` Jann Horn
  2019-02-19 21:52     ` Tobin C. Harding
  2019-02-21  1:05     ` Kees Cook
  0 siblings, 2 replies; 41+ messages in thread
From: Jann Horn @ 2019-02-19  0:51 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Kees Cook, Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, kernel list, Andy Lutomirski

+cc Andy because he's also preparing a patch for this function

On Tue, Feb 19, 2019 at 12:25 AM Tobin C. Harding <tobin@kernel.org> wrote:
> Current function documentation for strncpy_from_user() is incorrect.  If
> @count (size of destination buffer) is less than the length of the user
> string the function does _not_ return @count but rather returns -EFAULT.
>
> Document correctly the function return value, also add note that string
> may be left non-null terminated.
>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  lib/strncpy_from_user.c | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> index 58eacd41526c..11fe9a4a00fd 100644
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -82,22 +82,19 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
>  }
>
>  /**
> - * strncpy_from_user: - Copy a NUL terminated string from userspace.
> + * strncpy_from_user() - Copy a NUL terminated string from userspace.
>   * @dst:   Destination address, in kernel space.  This buffer must be at
>   *         least @count bytes long.
>   * @src:   Source address, in user space.
> - * @count: Maximum number of bytes to copy, including the trailing NUL.
> + * @count: Maximum number of bytes to copy, including the trailing %NUL.
>   *
>   * Copies a NUL-terminated string from userspace to kernel space.
>   *
> - * On success, returns the length of the string (not including the trailing
> - * NUL).
> - *
> - * If access to userspace fails, returns -EFAULT (some data may have been
> - * copied).
> - *
> - * If @count is smaller than the length of the string, copies @count bytes
> - * and returns @count.
> + * Return: If access to userspace fails, returns -EFAULT.  Otherwise,
> + *         return the number of characters copied excluding the trailing
> + *         %NUL, if the length of the user string exceeds @count return
> + *         -EFAULT (in which case @dst will be left without a %NUL
> + *         terminator).
>   */

AFAICS the byte_at_a_time loop exits when max==0 is reached, and then
if `res >= count` (in other words, if we've copied as many bytes as
requested, haven't encountered a null byte so far, and haven't reached
the end of the address space), we return `res`, which is the same as
`count`. Are you sure?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 6/6] lib: Add function strscpy_from_user()
  2019-02-18 23:23 ` [PATCH 6/6] lib: Add function strscpy_from_user() Tobin C. Harding
@ 2019-02-19  2:09   ` Jann Horn
  2019-02-19  2:12   ` Jann Horn
  1 sibling, 0 replies; 41+ messages in thread
From: Jann Horn @ 2019-02-19  2:09 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Kees Cook, Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, kernel list

On Tue, Feb 19, 2019 at 12:25 AM Tobin C. Harding <tobin@kernel.org> wrote:
> Currently we have strncpy_from_userspace().  If the user string is
> longer than the destination kernel buffer we get an error code -EFAULT.

No, see the other thread. If the user string is too long,
strncpy_from_userspace() fills the output buffer with non-null bytes
and returns the supplied length.

> We are unable to recover from here because this is the same error
> returned if the access to userspace fails totally.
>
> There is no reason we cannot continue execution with the user string
> truncated.
>
> Add a function strscpy_from_user() that guarantees the string written is
> null-terminated.  If user string is longer than destination buffer
> truncates the string.  Returns the number of characters written
> excluding the null-terminator.
>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  lib/strncpy_from_user.c | 43 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
>
> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> index 11fe9a4a00fd..6bd603ccec7a 100644
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -120,3 +120,46 @@ long strncpy_from_user(char *dst, const char __user *src, long count)
>         return -EFAULT;
>  }
>  EXPORT_SYMBOL(strncpy_from_user);
> +
> +/**
> + * strscpy_from_user() - Copy a NUL terminated string from userspace.
> + * @dst:   Destination address, in kernel space.  This buffer must be at
> + *         least @count bytes long.
> + * @src:   Source address, in user space.
> + * @count: Maximum number of bytes to copy, including the trailing %NUL.
> + *
> + * Copies a NUL-terminated string from userspace to kernel space.  When
> + * the function returns @dst is guaranteed to be null terminated.
> + *
> + * Return: If access to userspace fails, returns -EFAULT.

That's wrong. Actually, you only return -EFAULT if the specified
source address points to an address outside the userspace address
range.

> Otherwise,
> + *         return the number of characters copied excluding the trailing
> + *         %NUL.
> + */
> +long strscpy_from_user(char *dst, const char __user *src, long count)
> +{
> +       unsigned long max_addr, src_addr;
> +
> +       if (unlikely(count <= 0))
> +               return 0;

The "supply a signed long and quietly bail out if it's smaller than
zero" pattern seems bad to me. If count is zero, you can't guarantee
that the buffer will be null-terminated, and if it's smaller than
zero, something has gone very wrong.

> +       max_addr = user_addr_max();
> +       src_addr = (unsigned long)src;
> +       if (likely(src_addr < max_addr)) {
> +               unsigned long max = max_addr - src_addr;
> +               long retval;
> +
> +               kasan_check_write(dst, count);
> +               check_object_size(dst, count, false);
> +               if (user_access_begin(src, max)) {
> +                       retval = do_strncpy_from_user(dst, src, count, max);
> +                       user_access_end();
> +                       if (retval == -EFAULT) {
> +                               dst[count-1] = '\0';
> +                               return count - 1;

Uh... this looks bad. If do_strncpy_from_user() gets a fault -
anywhere -, you just put a nullbyte at the end of the supplied buffer
and return? As far as I can tell, this means that the caller will
think that you've filled the entire buffer, but actually everything
except for the last byte might be uninitialized.

> +                       }
> +                       return retval;
> +               }
> +       }
> +       return -EFAULT;
> +}
> +EXPORT_SYMBOL(strscpy_from_user);
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 6/6] lib: Add function strscpy_from_user()
  2019-02-18 23:23 ` [PATCH 6/6] lib: Add function strscpy_from_user() Tobin C. Harding
  2019-02-19  2:09   ` Jann Horn
@ 2019-02-19  2:12   ` Jann Horn
  2019-02-19 21:53     ` Tobin C. Harding
  1 sibling, 1 reply; 41+ messages in thread
From: Jann Horn @ 2019-02-19  2:12 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Kees Cook, Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, kernel list

On Tue, Feb 19, 2019 at 12:25 AM Tobin C. Harding <tobin@kernel.org> wrote:
> Currently we have strncpy_from_userspace().  If the user string is
> longer than the destination kernel buffer we get an error code -EFAULT.
> We are unable to recover from here because this is the same error
> returned if the access to userspace fails totally.
>
> There is no reason we cannot continue execution with the user string
> truncated.
>
> Add a function strscpy_from_user() that guarantees the string written is
> null-terminated.  If user string is longer than destination buffer
> truncates the string.  Returns the number of characters written
> excluding the null-terminator.
>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  lib/strncpy_from_user.c | 43 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
>
> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> index 11fe9a4a00fd..6bd603ccec7a 100644
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c

This file is only built when CONFIG_GENERIC_STRNCPY_FROM_USER is set.
Some architectures have their own versions of strncpy_from_user() and
don't set that, so on those architectures, your code wouldn't be built
into the kernel.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/6] lib/string: Enable string selftesting
  2019-02-18 23:23 ` [PATCH 1/6] lib/string: Enable string selftesting Tobin C. Harding
@ 2019-02-19 10:55   ` Andy Shevchenko
  2019-02-19 21:55     ` Tobin C. Harding
  2019-02-20 23:57   ` Kees Cook
  1 sibling, 1 reply; 41+ messages in thread
From: Andy Shevchenko @ 2019-02-19 10:55 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Kees Cook, Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, Linux Kernel Mailing List

On Tue, Feb 19, 2019 at 4:44 AM Tobin C. Harding <tobin@kernel.org> wrote:
>
> Currently we have a test module but it is not tied into the kselftest
> infrastructure.  In preparation for adding string manipulation functions
> and testing we should enable kselftest to utilize the test module.
>
> Enable string testing via kselftest infrastructure.
>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  lib/Kconfig.debug                     | 14 ++++++++++++++
>  lib/Makefile                          |  2 +-
>  lib/test_string.c                     |  4 ++--
>  tools/testing/selftests/lib/Makefile  |  2 +-
>  tools/testing/selftests/lib/config    |  1 +
>  tools/testing/selftests/lib/string.sh | 19 +++++++++++++++++++
>  6 files changed, 38 insertions(+), 4 deletions(-)
>  create mode 100755 tools/testing/selftests/lib/string.sh
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index d4df5b24d75e..0dca64c1d8a4 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1802,8 +1802,22 @@ config ASYNC_RAID6_TEST
>  config TEST_HEXDUMP
>         tristate "Test functions located in the hexdump module at runtime"
>
> +config TEST_STRING
> +       tristate "Perform selftest on string manipulation functions"

> +       default n

Redundant

> +       help
> +        Enable this option to test string manipulation functions.
> +       Currently this only tests memset_{16,32,64}.
> +
> +       If unsure, say N.
> +
>  config TEST_STRING_HELPERS
>         tristate "Test functions located in the string_helpers module at runtime"

> +       default n

Redundant

> +       help
> +        Enable this option to unit test code in lib/string_helpers.c
> +
> +        If unsure, say N.
>
>  config TEST_KSTRTOX
>         tristate "Test kstrto*() family of functions at runtime"
> diff --git a/lib/Makefile b/lib/Makefile
> index e1b59da71418..9c30e1fee27f 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -39,7 +39,7 @@ obj-y += bcd.o div64.o sort.o parser.o debug_locks.o random32.o \
>          bsearch.o find_bit.o llist.o memweight.o kfifo.o \
>          percpu-refcount.o rhashtable.o reciprocal_div.o \
>          once.o refcount.o usercopy.o errseq.o bucket_locks.o
> -obj-$(CONFIG_STRING_SELFTEST) += test_string.o
> +obj-$(CONFIG_TEST_STRING) += test_string.o
>  obj-y += string_helpers.o
>  obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
>  obj-y += hexdump.o
> diff --git a/lib/test_string.c b/lib/test_string.c
> index 0fcdb82dca86..a9cba442389a 100644
> --- a/lib/test_string.c
> +++ b/lib/test_string.c
> @@ -111,7 +111,7 @@ static __init int memset64_selftest(void)
>         return 0;
>  }
>
> -static __init int string_selftest_init(void)
> +static __init int test_string_init(void)
>  {
>         int test, subtest;
>
> @@ -137,5 +137,5 @@ static __init int string_selftest_init(void)
>         return 0;
>  }
>
> -module_init(string_selftest_init);
> +module_init(test_string_init);
>  MODULE_LICENSE("GPL v2");
> diff --git a/tools/testing/selftests/lib/Makefile b/tools/testing/selftests/lib/Makefile
> index 70d5711e3ac8..2ee4559b277e 100644
> --- a/tools/testing/selftests/lib/Makefile
> +++ b/tools/testing/selftests/lib/Makefile
> @@ -3,6 +3,6 @@
>  # No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
>  all:
>
> -TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh
> +TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh string.sh
>
>  include ../lib.mk
> diff --git a/tools/testing/selftests/lib/config b/tools/testing/selftests/lib/config
> index 126933bcc950..2032402ad409 100644
> --- a/tools/testing/selftests/lib/config
> +++ b/tools/testing/selftests/lib/config
> @@ -1,3 +1,4 @@
>  CONFIG_TEST_PRINTF=m
>  CONFIG_TEST_BITMAP=m
> +CONFIG_TEST_STRING=m
>  CONFIG_PRIME_NUMBERS=m
> diff --git a/tools/testing/selftests/lib/string.sh b/tools/testing/selftests/lib/string.sh
> new file mode 100755
> index 000000000000..99024b6f3a6a
> --- /dev/null
> +++ b/tools/testing/selftests/lib/string.sh
> @@ -0,0 +1,19 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +# Runs string manipulation tests using test_string kernel module
> +
> +# Kselftest framework requirement - SKIP code is 4.
> +ksft_skip=4
> +
> +if ! /sbin/modprobe -q -n test_string; then
> +       echo "string: module test_string is not found [SKIP]"
> +       exit $ksft_skip
> +fi
> +
> +if /sbin/modprobe -q test_string; then
> +       /sbin/modprobe -q -r test_string
> +       echo "string: ok"
> +else
> +       echo "string: [FAIL]"
> +       exit 1
> +fi
> --
> 2.20.1
>


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-19  0:51   ` Jann Horn
@ 2019-02-19 21:52     ` Tobin C. Harding
  2019-02-21  1:05     ` Kees Cook
  1 sibling, 0 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-19 21:52 UTC (permalink / raw)
  To: Jann Horn
  Cc: Tobin C. Harding, Kees Cook, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	kernel list, Andy Lutomirski

On Tue, Feb 19, 2019 at 01:51:45AM +0100, Jann Horn wrote:
> +cc Andy because he's also preparing a patch for this function
> 
> On Tue, Feb 19, 2019 at 12:25 AM Tobin C. Harding <tobin@kernel.org> wrote:
> > Current function documentation for strncpy_from_user() is incorrect.  If
> > @count (size of destination buffer) is less than the length of the user
> > string the function does _not_ return @count but rather returns -EFAULT.
> >
> > Document correctly the function return value, also add note that string
> > may be left non-null terminated.
> >
> > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > ---
> >  lib/strncpy_from_user.c | 17 +++++++----------
> >  1 file changed, 7 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> > index 58eacd41526c..11fe9a4a00fd 100644
> > --- a/lib/strncpy_from_user.c
> > +++ b/lib/strncpy_from_user.c
> > @@ -82,22 +82,19 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
> >  }
> >
> >  /**
> > - * strncpy_from_user: - Copy a NUL terminated string from userspace.
> > + * strncpy_from_user() - Copy a NUL terminated string from userspace.
> >   * @dst:   Destination address, in kernel space.  This buffer must be at
> >   *         least @count bytes long.
> >   * @src:   Source address, in user space.
> > - * @count: Maximum number of bytes to copy, including the trailing NUL.
> > + * @count: Maximum number of bytes to copy, including the trailing %NUL.
> >   *
> >   * Copies a NUL-terminated string from userspace to kernel space.
> >   *
> > - * On success, returns the length of the string (not including the trailing
> > - * NUL).
> > - *
> > - * If access to userspace fails, returns -EFAULT (some data may have been
> > - * copied).
> > - *
> > - * If @count is smaller than the length of the string, copies @count bytes
> > - * and returns @count.
> > + * Return: If access to userspace fails, returns -EFAULT.  Otherwise,
> > + *         return the number of characters copied excluding the trailing
> > + *         %NUL, if the length of the user string exceeds @count return
> > + *         -EFAULT (in which case @dst will be left without a %NUL
> > + *         terminator).
> >   */
> 
> AFAICS the byte_at_a_time loop exits when max==0 is reached, and then
> if `res >= count` (in other words, if we've copied as many bytes as
> requested, haven't encountered a null byte so far, and haven't reached
> the end of the address space), we return `res`, which is the same as
> `count`. Are you sure?

Thanks for the review Jaan.  Seems I was in a world of confused, I
misread count to be the size of the destination buffer not the size of
the source string.

Also, re-reading strscpy() and why its better I think I may not be the
correct person to implement strscpy_from_user() since I'm not across all
the subtleties.

If v2 seems wanted for the first patches in this set I'll drop the
*_from_user() stuff.

thanks,
Tobin.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 6/6] lib: Add function strscpy_from_user()
  2019-02-19  2:12   ` Jann Horn
@ 2019-02-19 21:53     ` Tobin C. Harding
  0 siblings, 0 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-19 21:53 UTC (permalink / raw)
  To: Jann Horn
  Cc: Tobin C. Harding, Kees Cook, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	kernel list

On Tue, Feb 19, 2019 at 03:12:33AM +0100, Jann Horn wrote:
> On Tue, Feb 19, 2019 at 12:25 AM Tobin C. Harding <tobin@kernel.org> wrote:
> > Currently we have strncpy_from_userspace().  If the user string is
> > longer than the destination kernel buffer we get an error code -EFAULT.
> > We are unable to recover from here because this is the same error
> > returned if the access to userspace fails totally.
> >
> > There is no reason we cannot continue execution with the user string
> > truncated.
> >
> > Add a function strscpy_from_user() that guarantees the string written is
> > null-terminated.  If user string is longer than destination buffer
> > truncates the string.  Returns the number of characters written
> > excluding the null-terminator.
> >
> > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > ---
> >  lib/strncpy_from_user.c | 43 +++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 43 insertions(+)
> >
> > diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> > index 11fe9a4a00fd..6bd603ccec7a 100644
> > --- a/lib/strncpy_from_user.c
> > +++ b/lib/strncpy_from_user.c
> 
> This file is only built when CONFIG_GENERIC_STRNCPY_FROM_USER is set.
> Some architectures have their own versions of strncpy_from_user() and
> don't set that, so on those architectures, your code wouldn't be built
> into the kernel.

thanks!  Dropping *_from_user() stuff from set.

	Tobin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/6] lib/string: Enable string selftesting
  2019-02-19 10:55   ` Andy Shevchenko
@ 2019-02-19 21:55     ` Tobin C. Harding
  2019-02-20 10:49       ` Andy Shevchenko
  2019-02-20 23:58       ` Kees Cook
  0 siblings, 2 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-19 21:55 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Tobin C. Harding, Kees Cook, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	Linux Kernel Mailing List

On Tue, Feb 19, 2019 at 12:55:09PM +0200, Andy Shevchenko wrote:
> On Tue, Feb 19, 2019 at 4:44 AM Tobin C. Harding <tobin@kernel.org> wrote:
> >
> > Currently we have a test module but it is not tied into the kselftest
> > infrastructure.  In preparation for adding string manipulation functions
> > and testing we should enable kselftest to utilize the test module.
> >
> > Enable string testing via kselftest infrastructure.
> >
> > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > ---
> >  lib/Kconfig.debug                     | 14 ++++++++++++++
> >  lib/Makefile                          |  2 +-
> >  lib/test_string.c                     |  4 ++--
> >  tools/testing/selftests/lib/Makefile  |  2 +-
> >  tools/testing/selftests/lib/config    |  1 +
> >  tools/testing/selftests/lib/string.sh | 19 +++++++++++++++++++
> >  6 files changed, 38 insertions(+), 4 deletions(-)
> >  create mode 100755 tools/testing/selftests/lib/string.sh
> >
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index d4df5b24d75e..0dca64c1d8a4 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1802,8 +1802,22 @@ config ASYNC_RAID6_TEST
> >  config TEST_HEXDUMP
> >         tristate "Test functions located in the hexdump module at runtime"
> >
> > +config TEST_STRING
> > +       tristate "Perform selftest on string manipulation functions"
> 
> > +       default n
> 
> Redundant

Cool, thanks.

> > +       help
> > +        Enable this option to test string manipulation functions.
> > +       Currently this only tests memset_{16,32,64}.
> > +
> > +       If unsure, say N.

Does that mean that this is redundant too?


thanks,
Tobin.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/6] lib/string: Enable string selftesting
  2019-02-19 21:55     ` Tobin C. Harding
@ 2019-02-20 10:49       ` Andy Shevchenko
  2019-02-20 23:58       ` Kees Cook
  1 sibling, 0 replies; 41+ messages in thread
From: Andy Shevchenko @ 2019-02-20 10:49 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Tobin C. Harding, Kees Cook, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	Linux Kernel Mailing List

On Tue, Feb 19, 2019 at 11:55 PM Tobin C. Harding <me@tobin.cc> wrote:
> On Tue, Feb 19, 2019 at 12:55:09PM +0200, Andy Shevchenko wrote:
> > On Tue, Feb 19, 2019 at 4:44 AM Tobin C. Harding <tobin@kernel.org> wrote:
> > >
> > > Currently we have a test module but it is not tied into the kselftest
> > > infrastructure.  In preparation for adding string manipulation functions
> > > and testing we should enable kselftest to utilize the test module.

> > > +       default n
> >
> > Redundant
>
> Cool, thanks.

> > > +       If unsure, say N.
>
> Does that mean that this is redundant too?

It's usual pattern in many help summaries, I don't know what way is better.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/6] lib: Add safe string funtions
  2019-02-18 23:23 [PATCH 0/6] lib: Add safe string funtions Tobin C. Harding
                   ` (5 preceding siblings ...)
  2019-02-18 23:23 ` [PATCH 6/6] lib: Add function strscpy_from_user() Tobin C. Harding
@ 2019-02-20 23:31 ` Kees Cook
  2019-02-21  5:15   ` Tobin C. Harding
  6 siblings, 1 reply; 41+ messages in thread
From: Kees Cook @ 2019-02-20 23:31 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, LKML

On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
> During your talk at LCA you mentioned that we could do with a couple
> more safe string functions.  One to zero the tail of the destination
> buffer after call to strscpy() and also the self explanatory
> strscpy_from_user().

Thanks for jumping in with this! :)

> I couldn't work out if this is a false positive or not?  Does the new
> config option CONFIG_TEST_STRING need more documentation?  I don't see
> where extra docs should be added and it seems self explanatory as is.

Usually this just means the help string in Kconfig is "too short".
Sometimes this is a false positive -- really up to you if you think it
needs more. :)

On to individual patches...

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/6] lib/string: Enable string selftesting
  2019-02-18 23:23 ` [PATCH 1/6] lib/string: Enable string selftesting Tobin C. Harding
  2019-02-19 10:55   ` Andy Shevchenko
@ 2019-02-20 23:57   ` Kees Cook
  2019-02-21  5:16     ` Tobin C. Harding
  1 sibling, 1 reply; 41+ messages in thread
From: Kees Cook @ 2019-02-20 23:57 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, LKML

On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
>
> Currently we have a test module but it is not tied into the kselftest
> infrastructure.  In preparation for adding string manipulation functions
> and testing we should enable kselftest to utilize the test module.
>
> Enable string testing via kselftest infrastructure.
>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  lib/Kconfig.debug                     | 14 ++++++++++++++
>  lib/Makefile                          |  2 +-
>  lib/test_string.c                     |  4 ++--
>  tools/testing/selftests/lib/Makefile  |  2 +-
>  tools/testing/selftests/lib/config    |  1 +
>  tools/testing/selftests/lib/string.sh | 19 +++++++++++++++++++
>  6 files changed, 38 insertions(+), 4 deletions(-)
>  create mode 100755 tools/testing/selftests/lib/string.sh
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index d4df5b24d75e..0dca64c1d8a4 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1802,8 +1802,22 @@ config ASYNC_RAID6_TEST
>  config TEST_HEXDUMP
>         tristate "Test functions located in the hexdump module at runtime"
>
> +config TEST_STRING
> +       tristate "Perform selftest on string manipulation functions"
> +       default n
> +       help
> +        Enable this option to test string manipulation functions.
> +       Currently this only tests memset_{16,32,64}.
> +
> +       If unsure, say N.
> +
>  config TEST_STRING_HELPERS
>         tristate "Test functions located in the string_helpers module at runtime"
> +       default n
> +       help
> +        Enable this option to unit test code in lib/string_helpers.c
> +
> +        If unsure, say N.
>
>  config TEST_KSTRTOX
>         tristate "Test kstrto*() family of functions at runtime"
> diff --git a/lib/Makefile b/lib/Makefile
> index e1b59da71418..9c30e1fee27f 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -39,7 +39,7 @@ obj-y += bcd.o div64.o sort.o parser.o debug_locks.o random32.o \
>          bsearch.o find_bit.o llist.o memweight.o kfifo.o \
>          percpu-refcount.o rhashtable.o reciprocal_div.o \
>          once.o refcount.o usercopy.o errseq.o bucket_locks.o
> -obj-$(CONFIG_STRING_SELFTEST) += test_string.o
> +obj-$(CONFIG_TEST_STRING) += test_string.o

This patch should remove 'config STRING_SELFTEST' from lib/Kconfig too.

> diff --git a/tools/testing/selftests/lib/Makefile b/tools/testing/selftests/lib/Makefile
> index 70d5711e3ac8..2ee4559b277e 100644
> --- a/tools/testing/selftests/lib/Makefile
> +++ b/tools/testing/selftests/lib/Makefile
> @@ -3,6 +3,6 @@
>  # No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
>  all:
>
> -TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh
> +TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh string.sh
>
>  include ../lib.mk
> diff --git a/tools/testing/selftests/lib/config b/tools/testing/selftests/lib/config
> index 126933bcc950..2032402ad409 100644
> --- a/tools/testing/selftests/lib/config
> +++ b/tools/testing/selftests/lib/config
> @@ -1,3 +1,4 @@
>  CONFIG_TEST_PRINTF=m
>  CONFIG_TEST_BITMAP=m
> +CONFIG_TEST_STRING=m
>  CONFIG_PRIME_NUMBERS=m
> diff --git a/tools/testing/selftests/lib/string.sh b/tools/testing/selftests/lib/string.sh
> new file mode 100755
> index 000000000000..99024b6f3a6a
> --- /dev/null
> +++ b/tools/testing/selftests/lib/string.sh
> @@ -0,0 +1,19 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +# Runs string manipulation tests using test_string kernel module
> +
> +# Kselftest framework requirement - SKIP code is 4.
> +ksft_skip=4
> +
> +if ! /sbin/modprobe -q -n test_string; then
> +       echo "string: module test_string is not found [SKIP]"
> +       exit $ksft_skip
> +fi
> +
> +if /sbin/modprobe -q test_string; then
> +       /sbin/modprobe -q -r test_string
> +       echo "string: ok"
> +else
> +       echo "string: [FAIL]"
> +       exit 1
> +fi
> --
> 2.20.1
>

You mentioned "redundant scripts" here. You might want to refactor
first, and have a common tool that does the core testing, and then
have the scripts doing one line each:

i.e.:

#!/bin/bash
exec./test_module.sh prime_numbers selftest=65536

with "test_module.sh" doing all the rest.

I bet there are other test_*.ko tests we could wire up too, and this
refactor will make that much easier. And actually, maybe we should
just have a single test running that just reads the "config" file for
the list of test modules, and runs them with the correct output format
to show which are skipped, etc.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/6] lib/string: Enable string selftesting
  2019-02-19 21:55     ` Tobin C. Harding
  2019-02-20 10:49       ` Andy Shevchenko
@ 2019-02-20 23:58       ` Kees Cook
  1 sibling, 0 replies; 41+ messages in thread
From: Kees Cook @ 2019-02-20 23:58 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Andy Shevchenko, Tobin C. Harding, Shuah Khan,
	Alexander Shishkin, Greg Kroah-Hartman, Andy Shevchenko,
	Kernel Hardening, Linux Kernel Mailing List

On Tue, Feb 19, 2019 at 1:55 PM Tobin C. Harding <me@tobin.cc> wrote:
>
> On Tue, Feb 19, 2019 at 12:55:09PM +0200, Andy Shevchenko wrote:
> > On Tue, Feb 19, 2019 at 4:44 AM Tobin C. Harding <tobin@kernel.org> wrote:
> > > +       If unsure, say N.
>
> Does that mean that this is redundant too?

I've started leaving this off Kconfigs more and more lately. I don't
think it's needed here.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/6] lib/string: Fix erroneous 'overflow' documentation
  2019-02-18 23:23 ` [PATCH 2/6] lib/string: Fix erroneous 'overflow' documentation Tobin C. Harding
@ 2019-02-21  0:02   ` Kees Cook
  2019-02-21  5:17     ` Tobin C. Harding
  0 siblings, 1 reply; 41+ messages in thread
From: Kees Cook @ 2019-02-21  0:02 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, LKML

On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
>
> Current documentation uses 'overflow' to describe a situation where less
> data is written to a buffer than buffer size not more.  'overflow' is
> the wrong word here - since we don't typically say 'underflow' change
> the whole sentence.
>
> Fix erroneous 'overflow' documentation for under filled buffer.
>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  lib/string.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/string.c b/lib/string.c
> index 38e4ca08e757..7f1d72db53c5 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -173,8 +173,8 @@ EXPORT_SYMBOL(strlcpy);
>   *
>   * Preferred to strncpy() since it always returns a valid string, and
>   * doesn't unnecessarily force the tail of the destination buffer to be
> - * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy()
> - * with an overflow test, then just memset() the tail of the dest buffer.
> + * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy(),
> + * check the return size, then just memset() the tail of the dest buffer.
>   */

I'd just fold this patch into the strscpy_zeroed() patch. No need for
a kind of "no op" change here when we'll just change it again with a
better advice ("use strscpy_zeroed()!")

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 3/6] lib/string: Use correct docstring format
  2019-02-18 23:23 ` [PATCH 3/6] lib/string: Use correct docstring format Tobin C. Harding
@ 2019-02-21  0:07   ` Kees Cook
  2019-02-21  4:14     ` Randy Dunlap
  0 siblings, 1 reply; 41+ messages in thread
From: Kees Cook @ 2019-02-21  0:07 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, LKML

On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
>
> Currently the docstring comments for strscpy() are not in the correct
> format.  Prior to working on this file fix up the docstring.
>
> Use correct docstring format for strscpy().

Is this attached to "make htmldocs" anywhere? Maybe in the device
driver api doc? That's where I put refcount_t. See
driver-api/basics.rst and put something like:

String Handling
--------------------

.. kernel-doc:: lib/string.c
   :internal:

and add that chunk to this patch.

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  lib/string.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/lib/string.c b/lib/string.c
> index 7f1d72db53c5..65969cf32f5d 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -159,11 +159,9 @@ EXPORT_SYMBOL(strlcpy);
>   * @src: Where to copy the string from
>   * @count: Size of destination buffer
>   *
> - * Copy the string, or as much of it as fits, into the dest buffer.
> - * The routine returns the number of characters copied (not including
> - * the trailing NUL) or -E2BIG if the destination buffer wasn't big enough.
> - * The behavior is undefined if the string buffers overlap.
> - * The destination buffer is always NUL terminated, unless it's zero-sized.
> + * Copy the string, or as much of it as fits, into the dest buffer.  The
> + * behavior is undefined if the string buffers overlap.  The destination
> + * buffer is always NUL terminated, unless it's zero-sized.
>   *
>   * Preferred to strlcpy() since the API doesn't require reading memory
>   * from the src string beyond the specified "count" bytes, and since
> @@ -175,6 +173,9 @@ EXPORT_SYMBOL(strlcpy);
>   * doesn't unnecessarily force the tail of the destination buffer to be
>   * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy(),
>   * check the return size, then just memset() the tail of the dest buffer.
> + *
> + * Return: The number of characters copied (not including the trailing
> + *         NUL) or -E2BIG if the destination buffer wasn't big enough.
>   */
>  ssize_t strscpy(char *dest, const char *src, size_t count)
>  {
> --
> 2.20.1
>


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/6] lib/string: Add string copy/zero function
  2019-02-18 23:23 ` [PATCH 4/6] lib/string: Add string copy/zero function Tobin C. Harding
@ 2019-02-21  0:48   ` Kees Cook
  2019-02-21  5:20     ` Tobin C. Harding
                       ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Kees Cook @ 2019-02-21  0:48 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, LKML, Matthew Wilcox,
	Rasmus Villemoes, Daniel Micay

On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
>
> We have a function to copy strings safely and we have a function to copy
> strings _and_ zero the tail of the destination (if source string is
> shorter than destination buffer) but we do not have a function to do
> both at once.  This means developers must write this themselves if they
> desire this functionality.  This is a chore, and also leaves us open to
> off by one errors unnecessarily.
>
> Add a function that calls strscpy() then memset()s the tail to zero if
> the source string is shorter than the destination buffer.
>
> Add testing via kselftest.
>
> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> ---
>  include/linux/string.h |  4 ++++
>  lib/Kconfig.debug      |  2 +-
>  lib/string.c           | 30 ++++++++++++++++++++++++++++--
>  lib/test_string.c      | 31 +++++++++++++++++++++++++++++++
>  4 files changed, 64 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/string.h b/include/linux/string.h
> index 7927b875f80c..695a5e6a31e3 100644
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -31,6 +31,10 @@ size_t strlcpy(char *, const char *, size_t);
>  #ifndef __HAVE_ARCH_STRSCPY
>  ssize_t strscpy(char *, const char *, size_t);
>  #endif
> +
> +/* Wrapper function, no arch specific code required */
> +ssize_t strscpy_zeroed(char *dest, const char *src, size_t count);

bikeshed: I think "pad" is shorter and more descriptive. How about
something like strspad() strscpy_pad() or strscpy_zero()? (just to
shorten it slightly)

Not a blocker, just a TODO: we need a wrapper to do
CONFIG_FORTIFY_SOURCE checking for strscpy() (and strscpy_zeroed()) to
check for __builtin_object_size() vs the "size" argument, as done in
strlcpy() in include/linux/string.h

> @@ -238,6 +237,33 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
>  EXPORT_SYMBOL(strscpy);
>  #endif
>
> +/**
> + * strscopy_zeroed() - Copy a C-string into a sized buffer
> + * @dest: Where to copy the string to
> + * @src: Where to copy the string from
> + * @count: Size of destination buffer
> + *
> + * If the source string is shorter than the destination buffer, zeros
> + * the tail of the destination buffer.
> + *
> + * Return: The number of characters copied (not including the trailing
> + *         NUL) or -E2BIG if the destination buffer wasn't big enough.
> + */
> +ssize_t strscpy_zeroed(char *dest, const char *src, size_t count)
> +{
> +       ssize_t written;
> +
> +       written = strscpy(dest, src, count);
> +       if (written < 0)
> +               return written;

If written < 0 we filled everything (i.e. we wrote "count - 1" bytes).
If we also exactly wrote "count - 1", then we also don't need the zero
padding either, since strscpy wrote the trailing NUL.

so:

if (written < 0 || (count && written == count - 1))
    return written;

> +
> +       if (written < count)
> +               memset(dest + written, 0, count - written);

Now we know written must be [0, count - 2], so we can just:

memset(dest + written + 1, 0, count - written - 1);

The pattern (which should be added to the seltest) is:

count           source  written                                 pad@
0               *       -E2BIG (0 char, 0 NUL, 0 to zero)

1               "a"     -E2BIG (0 char, 1 NUL, 0 to zero)
1               ""      0 (0 char, 1 NUL, 0 to zero)

2               "ab"    -E2BIG (1 char, 1 NUL, 0 to zero)
2               "a"     1 (1 char, 1 NUL, 0 to zero)
2               ""      0 (0 char, 1 NUL, 1 to zero)            dest + 1

3               "abc"   -E2BIG (2 char, 1 NUL, 0 to zero)
3               "ab"    2 (2 char, 1 NUL, 0 to zero)
3               "a"     1 (1 char, 1 NUL, 1 to zero)            dest + 2
3               ""      0 (0 char, 1 NUL, 2 to zero)            dest + 1

4               "abcd"  -E2BIG (3 char, 1 NUL, 0 to zero)
4               "abc"   3 (3 char, 1 NUL, 0 to zero)
4               "ab"    2 (2 char, 1 NUL, 1 to zero)            dest + 3
4               "a"     1 (1 char, 1 NUL, 2 to zero)            dest + 2
4               ""      0 (0 char, 1 NUL, 3 to zero)            dest + 1


> +
> +       return written;
> +}
> +EXPORT_SYMBOL(strscpy_zeroed);
> +
>  #ifndef __HAVE_ARCH_STRCAT
>  /**
>   * strcat - Append one %NUL-terminated string to another
> diff --git a/lib/test_string.c b/lib/test_string.c
> index a9cba442389a..cc4eef51a395 100644
> --- a/lib/test_string.c
> +++ b/lib/test_string.c
> @@ -111,6 +111,32 @@ static __init int memset64_selftest(void)
>         return 0;
>  }
>
> +static __init int strscpy_zeroed_selftest(void)
> +{
> +       char buf[6];
> +       int written;
> +
> +       memset(buf, 'a', sizeof(buf));
> +
> +       written = strscpy_zeroed(buf, "bb", 4);
> +       if (written != 2)
> +               return 1;
> +
> +       /* Copied correctly */
> +       if (buf[0] != 'b' || buf[1] != 'b')
> +               return 2;
> +
> +       /* Zeroed correctly */
> +       if (buf[2] != '\0' || buf[3] != '\0')
> +               return 3;
> +
> +       /* Only touched what it was supposed to */
> +       if (buf[4] != 'a' || buf[5] != 'a')
> +               return 4;
> +
> +       return 0;
> +}

Cool, I like both the positive and negative tests. :) Can you add all
the cases above, too, which should validate the various corners?

> +
>  static __init int test_string_init(void)
>  {
>         int test, subtest;
> @@ -130,6 +156,11 @@ static __init int test_string_init(void)
>         if (subtest)
>                 goto fail;
>
> +       test = 4;
> +       subtest = strscpy_zeroed_selftest();
> +       if (subtest)
> +               goto fail;
> +
>         pr_info("String selftests succeeded\n");
>         return 0;
>  fail:
> --
> 2.20.1
>

Nice! :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-19  0:51   ` Jann Horn
  2019-02-19 21:52     ` Tobin C. Harding
@ 2019-02-21  1:05     ` Kees Cook
  2019-02-21  5:24       ` Tobin C. Harding
  2019-02-21 14:28       ` Jann Horn
  1 sibling, 2 replies; 41+ messages in thread
From: Kees Cook @ 2019-02-21  1:05 UTC (permalink / raw)
  To: Jann Horn
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	kernel list, Andy Lutomirski, Rasmus Villemoes, Daniel Micay

On Mon, Feb 18, 2019 at 4:52 PM Jann Horn <jannh@google.com> wrote:
> AFAICS the byte_at_a_time loop exits when max==0 is reached, and then
> if `res >= count` (in other words, if we've copied as many bytes as
> requested, haven't encountered a null byte so far, and haven't reached
> the end of the address space), we return `res`, which is the same as
> `count`. Are you sure?

Oh, whew, there is only 1 arch-specific implementation of this. I
thought you meant there was multiple implementations.

So, generally speaking, I'd love to split all strncpy* uses into
strscpy_zero() (when expecting to do str->str copies), and some new
function, named like mempadstr() or str2mem() that copies a str to a
__nonstring char array, with trailing padding, if there is space. Then
there is no more mixing the two cases and botching things.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 3/6] lib/string: Use correct docstring format
  2019-02-21  0:07   ` Kees Cook
@ 2019-02-21  4:14     ` Randy Dunlap
  2019-02-21  5:27       ` Kees Cook
  0 siblings, 1 reply; 41+ messages in thread
From: Randy Dunlap @ 2019-02-21  4:14 UTC (permalink / raw)
  To: Kees Cook, Tobin C. Harding
  Cc: Shuah Khan, Alexander Shishkin, Greg Kroah-Hartman,
	Andy Shevchenko, Kernel Hardening, LKML

On 2/20/19 4:07 PM, Kees Cook wrote:
> On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
>>
>> Currently the docstring comments for strscpy() are not in the correct
>> format.  Prior to working on this file fix up the docstring.
>>
>> Use correct docstring format for strscpy().
> 
> Is this attached to "make htmldocs" anywhere? Maybe in the device
> driver api doc? That's where I put refcount_t. See
> driver-api/basics.rst and put something like:
> 
> String Handling
> --------------------
> 
> .. kernel-doc:: lib/string.c
>    :internal:
> 
> and add that chunk to this patch.

It's already in Documentation/core-api/kernel-api.rst, under
"String Manipulation."

> Acked-by: Kees Cook <keescook@chromium.org>
> 
> -Kees
> 
>>
>> Signed-off-by: Tobin C. Harding <tobin@kernel.org>
>> ---
>>  lib/string.c | 11 ++++++-----
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/lib/string.c b/lib/string.c
>> index 7f1d72db53c5..65969cf32f5d 100644
>> --- a/lib/string.c
>> +++ b/lib/string.c
>> @@ -159,11 +159,9 @@ EXPORT_SYMBOL(strlcpy);
>>   * @src: Where to copy the string from
>>   * @count: Size of destination buffer
>>   *
>> - * Copy the string, or as much of it as fits, into the dest buffer.
>> - * The routine returns the number of characters copied (not including
>> - * the trailing NUL) or -E2BIG if the destination buffer wasn't big enough.
>> - * The behavior is undefined if the string buffers overlap.
>> - * The destination buffer is always NUL terminated, unless it's zero-sized.
>> + * Copy the string, or as much of it as fits, into the dest buffer.  The
>> + * behavior is undefined if the string buffers overlap.  The destination
>> + * buffer is always NUL terminated, unless it's zero-sized.
>>   *
>>   * Preferred to strlcpy() since the API doesn't require reading memory
>>   * from the src string beyond the specified "count" bytes, and since
>> @@ -175,6 +173,9 @@ EXPORT_SYMBOL(strlcpy);
>>   * doesn't unnecessarily force the tail of the destination buffer to be
>>   * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy(),
>>   * check the return size, then just memset() the tail of the dest buffer.
>> + *
>> + * Return: The number of characters copied (not including the trailing
>> + *         NUL) or -E2BIG if the destination buffer wasn't big enough.
>>   */
>>  ssize_t strscpy(char *dest, const char *src, size_t count)
>>  {
>> --
>> 2.20.1
>>
> 
> 


-- 
~Randy

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/6] lib: Add safe string funtions
  2019-02-20 23:31 ` [PATCH 0/6] lib: Add safe string funtions Kees Cook
@ 2019-02-21  5:15   ` Tobin C. Harding
  0 siblings, 0 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-21  5:15 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening, LKML

On Wed, Feb 20, 2019 at 03:31:07PM -0800, Kees Cook wrote:
> On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
> > During your talk at LCA you mentioned that we could do with a couple
> > more safe string functions.  One to zero the tail of the destination
> > buffer after call to strscpy() and also the self explanatory
> > strscpy_from_user().
> 
> Thanks for jumping in with this! :)

Good to be working with you again.

> > I couldn't work out if this is a false positive or not?  Does the new
> > config option CONFIG_TEST_STRING need more documentation?  I don't see
> > where extra docs should be added and it seems self explanatory as is.
> 
> Usually this just means the help string in Kconfig is "too short".
> Sometimes this is a false positive -- really up to you if you think it
> needs more. :)

Cool, thanks.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 1/6] lib/string: Enable string selftesting
  2019-02-20 23:57   ` Kees Cook
@ 2019-02-21  5:16     ` Tobin C. Harding
  0 siblings, 0 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-21  5:16 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening, LKML

On Wed, Feb 20, 2019 at 03:57:18PM -0800, Kees Cook wrote:
> On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
> >
> > Currently we have a test module but it is not tied into the kselftest
> > infrastructure.  In preparation for adding string manipulation functions
> > and testing we should enable kselftest to utilize the test module.
> >
> > Enable string testing via kselftest infrastructure.
> >
> > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > ---
> >  lib/Kconfig.debug                     | 14 ++++++++++++++
> >  lib/Makefile                          |  2 +-
> >  lib/test_string.c                     |  4 ++--
> >  tools/testing/selftests/lib/Makefile  |  2 +-
> >  tools/testing/selftests/lib/config    |  1 +
> >  tools/testing/selftests/lib/string.sh | 19 +++++++++++++++++++
> >  6 files changed, 38 insertions(+), 4 deletions(-)
> >  create mode 100755 tools/testing/selftests/lib/string.sh
> >
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index d4df5b24d75e..0dca64c1d8a4 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -1802,8 +1802,22 @@ config ASYNC_RAID6_TEST
> >  config TEST_HEXDUMP
> >         tristate "Test functions located in the hexdump module at runtime"
> >
> > +config TEST_STRING
> > +       tristate "Perform selftest on string manipulation functions"
> > +       default n
> > +       help
> > +        Enable this option to test string manipulation functions.
> > +       Currently this only tests memset_{16,32,64}.
> > +
> > +       If unsure, say N.
> > +
> >  config TEST_STRING_HELPERS
> >         tristate "Test functions located in the string_helpers module at runtime"
> > +       default n
> > +       help
> > +        Enable this option to unit test code in lib/string_helpers.c
> > +
> > +        If unsure, say N.
> >
> >  config TEST_KSTRTOX
> >         tristate "Test kstrto*() family of functions at runtime"
> > diff --git a/lib/Makefile b/lib/Makefile
> > index e1b59da71418..9c30e1fee27f 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -39,7 +39,7 @@ obj-y += bcd.o div64.o sort.o parser.o debug_locks.o random32.o \
> >          bsearch.o find_bit.o llist.o memweight.o kfifo.o \
> >          percpu-refcount.o rhashtable.o reciprocal_div.o \
> >          once.o refcount.o usercopy.o errseq.o bucket_locks.o
> > -obj-$(CONFIG_STRING_SELFTEST) += test_string.o
> > +obj-$(CONFIG_TEST_STRING) += test_string.o
> 
> This patch should remove 'config STRING_SELFTEST' from lib/Kconfig too.
> 
> > diff --git a/tools/testing/selftests/lib/Makefile b/tools/testing/selftests/lib/Makefile
> > index 70d5711e3ac8..2ee4559b277e 100644
> > --- a/tools/testing/selftests/lib/Makefile
> > +++ b/tools/testing/selftests/lib/Makefile
> > @@ -3,6 +3,6 @@
> >  # No binaries, but make sure arg-less "make" doesn't trigger "run_tests"
> >  all:
> >
> > -TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh
> > +TEST_PROGS := printf.sh bitmap.sh prime_numbers.sh string.sh
> >
> >  include ../lib.mk
> > diff --git a/tools/testing/selftests/lib/config b/tools/testing/selftests/lib/config
> > index 126933bcc950..2032402ad409 100644
> > --- a/tools/testing/selftests/lib/config
> > +++ b/tools/testing/selftests/lib/config
> > @@ -1,3 +1,4 @@
> >  CONFIG_TEST_PRINTF=m
> >  CONFIG_TEST_BITMAP=m
> > +CONFIG_TEST_STRING=m
> >  CONFIG_PRIME_NUMBERS=m
> > diff --git a/tools/testing/selftests/lib/string.sh b/tools/testing/selftests/lib/string.sh
> > new file mode 100755
> > index 000000000000..99024b6f3a6a
> > --- /dev/null
> > +++ b/tools/testing/selftests/lib/string.sh
> > @@ -0,0 +1,19 @@
> > +#!/bin/sh
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Runs string manipulation tests using test_string kernel module
> > +
> > +# Kselftest framework requirement - SKIP code is 4.
> > +ksft_skip=4
> > +
> > +if ! /sbin/modprobe -q -n test_string; then
> > +       echo "string: module test_string is not found [SKIP]"
> > +       exit $ksft_skip
> > +fi
> > +
> > +if /sbin/modprobe -q test_string; then
> > +       /sbin/modprobe -q -r test_string
> > +       echo "string: ok"
> > +else
> > +       echo "string: [FAIL]"
> > +       exit 1
> > +fi
> > --
> > 2.20.1
> >
> 
> You mentioned "redundant scripts" here. You might want to refactor
> first, and have a common tool that does the core testing, and then
> have the scripts doing one line each:
> 
> i.e.:
> 
> #!/bin/bash
> exec./test_module.sh prime_numbers selftest=65536
> 
> with "test_module.sh" doing all the rest.
> 
> I bet there are other test_*.ko tests we could wire up too, and this
> refactor will make that much easier. And actually, maybe we should
> just have a single test running that just reads the "config" file for
> the list of test modules, and runs them with the correct output format
> to show which are skipped, etc.

Got it.  I'll have a go at all this and pre-pend it to v2.

thanks,
Tobin.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/6] lib/string: Fix erroneous 'overflow' documentation
  2019-02-21  0:02   ` Kees Cook
@ 2019-02-21  5:17     ` Tobin C. Harding
  0 siblings, 0 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-21  5:17 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening, LKML

On Wed, Feb 20, 2019 at 04:02:37PM -0800, Kees Cook wrote:
> On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
> >
> > Current documentation uses 'overflow' to describe a situation where less
> > data is written to a buffer than buffer size not more.  'overflow' is
> > the wrong word here - since we don't typically say 'underflow' change
> > the whole sentence.
> >
> > Fix erroneous 'overflow' documentation for under filled buffer.
> >
> > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > ---
> >  lib/string.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/string.c b/lib/string.c
> > index 38e4ca08e757..7f1d72db53c5 100644
> > --- a/lib/string.c
> > +++ b/lib/string.c
> > @@ -173,8 +173,8 @@ EXPORT_SYMBOL(strlcpy);
> >   *
> >   * Preferred to strncpy() since it always returns a valid string, and
> >   * doesn't unnecessarily force the tail of the destination buffer to be
> > - * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy()
> > - * with an overflow test, then just memset() the tail of the dest buffer.
> > + * zeroed.  If the zeroing is desired, it's likely cleaner to use strscpy(),
> > + * check the return size, then just memset() the tail of the dest buffer.
> >   */
> 
> I'd just fold this patch into the strscpy_zeroed() patch. No need for
> a kind of "no op" change here when we'll just change it again with a
> better advice ("use strscpy_zeroed()!")

Got it.

thanks,
Tobin.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/6] lib/string: Add string copy/zero function
  2019-02-21  0:48   ` Kees Cook
@ 2019-02-21  5:20     ` Tobin C. Harding
  2019-02-21 12:02     ` Andy Shevchenko
  2019-02-25 20:09     ` Tobin C. Harding
  2 siblings, 0 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-21  5:20 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening, LKML,
	Matthew Wilcox, Rasmus Villemoes, Daniel Micay

On Wed, Feb 20, 2019 at 04:48:18PM -0800, Kees Cook wrote:
> On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
> >
> > We have a function to copy strings safely and we have a function to copy
> > strings _and_ zero the tail of the destination (if source string is
> > shorter than destination buffer) but we do not have a function to do
> > both at once.  This means developers must write this themselves if they
> > desire this functionality.  This is a chore, and also leaves us open to
> > off by one errors unnecessarily.
> >
> > Add a function that calls strscpy() then memset()s the tail to zero if
> > the source string is shorter than the destination buffer.
> >
> > Add testing via kselftest.
> >
> > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > ---
> >  include/linux/string.h |  4 ++++
> >  lib/Kconfig.debug      |  2 +-
> >  lib/string.c           | 30 ++++++++++++++++++++++++++++--
> >  lib/test_string.c      | 31 +++++++++++++++++++++++++++++++
> >  4 files changed, 64 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/string.h b/include/linux/string.h
> > index 7927b875f80c..695a5e6a31e3 100644
> > --- a/include/linux/string.h
> > +++ b/include/linux/string.h
> > @@ -31,6 +31,10 @@ size_t strlcpy(char *, const char *, size_t);
> >  #ifndef __HAVE_ARCH_STRSCPY
> >  ssize_t strscpy(char *, const char *, size_t);
> >  #endif
> > +
> > +/* Wrapper function, no arch specific code required */
> > +ssize_t strscpy_zeroed(char *dest, const char *src, size_t count);
> 
> bikeshed: I think "pad" is shorter and more descriptive. How about
> something like strspad() strscpy_pad() or strscpy_zero()? (just to
> shorten it slightly)

I like strscpy_pad()

> Not a blocker, just a TODO: we need a wrapper to do
> CONFIG_FORTIFY_SOURCE checking for strscpy() (and strscpy_zeroed()) to
> check for __builtin_object_size() vs the "size" argument, as done in
> strlcpy() in include/linux/string.h

I'll look into this for v2

> > @@ -238,6 +237,33 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
> >  EXPORT_SYMBOL(strscpy);
> >  #endif
> >
> > +/**
> > + * strscopy_zeroed() - Copy a C-string into a sized buffer
> > + * @dest: Where to copy the string to
> > + * @src: Where to copy the string from
> > + * @count: Size of destination buffer
> > + *
> > + * If the source string is shorter than the destination buffer, zeros
> > + * the tail of the destination buffer.
> > + *
> > + * Return: The number of characters copied (not including the trailing
> > + *         NUL) or -E2BIG if the destination buffer wasn't big enough.
> > + */
> > +ssize_t strscpy_zeroed(char *dest, const char *src, size_t count)
> > +{
> > +       ssize_t written;
> > +
> > +       written = strscpy(dest, src, count);
> > +       if (written < 0)
> > +               return written;
> 
> If written < 0 we filled everything (i.e. we wrote "count - 1" bytes).
> If we also exactly wrote "count - 1", then we also don't need the zero
> padding either, since strscpy wrote the trailing NUL.
> 
> so:
> 
> if (written < 0 || (count && written == count - 1))
>     return written;
> 
> > +
> > +       if (written < count)
> > +               memset(dest + written, 0, count - written);
> 
> Now we know written must be [0, count - 2], so we can just:
> 
> memset(dest + written + 1, 0, count - written - 1);
> 
> The pattern (which should be added to the seltest) is:
> 
> count           source  written                                 pad@
> 0               *       -E2BIG (0 char, 0 NUL, 0 to zero)
> 
> 1               "a"     -E2BIG (0 char, 1 NUL, 0 to zero)
> 1               ""      0 (0 char, 1 NUL, 0 to zero)
> 
> 2               "ab"    -E2BIG (1 char, 1 NUL, 0 to zero)
> 2               "a"     1 (1 char, 1 NUL, 0 to zero)
> 2               ""      0 (0 char, 1 NUL, 1 to zero)            dest + 1
> 
> 3               "abc"   -E2BIG (2 char, 1 NUL, 0 to zero)
> 3               "ab"    2 (2 char, 1 NUL, 0 to zero)
> 3               "a"     1 (1 char, 1 NUL, 1 to zero)            dest + 2
> 3               ""      0 (0 char, 1 NUL, 2 to zero)            dest + 1
> 
> 4               "abcd"  -E2BIG (3 char, 1 NUL, 0 to zero)
> 4               "abc"   3 (3 char, 1 NUL, 0 to zero)
> 4               "ab"    2 (2 char, 1 NUL, 1 to zero)            dest + 3
> 4               "a"     1 (1 char, 1 NUL, 2 to zero)            dest + 2
> 4               ""      0 (0 char, 1 NUL, 3 to zero)            dest + 1

So thorough, you're the man.

> > +
> > +       return written;
> > +}
> > +EXPORT_SYMBOL(strscpy_zeroed);
> > +
> >  #ifndef __HAVE_ARCH_STRCAT
> >  /**
> >   * strcat - Append one %NUL-terminated string to another
> > diff --git a/lib/test_string.c b/lib/test_string.c
> > index a9cba442389a..cc4eef51a395 100644
> > --- a/lib/test_string.c
> > +++ b/lib/test_string.c
> > @@ -111,6 +111,32 @@ static __init int memset64_selftest(void)
> >         return 0;
> >  }
> >
> > +static __init int strscpy_zeroed_selftest(void)
> > +{
> > +       char buf[6];
> > +       int written;
> > +
> > +       memset(buf, 'a', sizeof(buf));
> > +
> > +       written = strscpy_zeroed(buf, "bb", 4);
> > +       if (written != 2)
> > +               return 1;
> > +
> > +       /* Copied correctly */
> > +       if (buf[0] != 'b' || buf[1] != 'b')
> > +               return 2;
> > +
> > +       /* Zeroed correctly */
> > +       if (buf[2] != '\0' || buf[3] != '\0')
> > +               return 3;
> > +
> > +       /* Only touched what it was supposed to */
> > +       if (buf[4] != 'a' || buf[5] != 'a')
> > +               return 4;
> > +
> > +       return 0;
> > +}
> 
> Cool, I like both the positive and negative tests. :) Can you add all
> the cases above, too, which should validate the various corners?

Sure thing.

> > +
> >  static __init int test_string_init(void)
> >  {
> >         int test, subtest;
> > @@ -130,6 +156,11 @@ static __init int test_string_init(void)
> >         if (subtest)
> >                 goto fail;
> >
> > +       test = 4;
> > +       subtest = strscpy_zeroed_selftest();
> > +       if (subtest)
> > +               goto fail;
> > +
> >         pr_info("String selftests succeeded\n");
> >         return 0;
> >  fail:
> > --
> > 2.20.1
> >
> 
> Nice! :)

Cheers.  And they said we don't test in kernel land :)

	Tobin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21  1:05     ` Kees Cook
@ 2019-02-21  5:24       ` Tobin C. Harding
  2019-02-21  6:02         ` Kees Cook
  2019-02-21 14:28       ` Jann Horn
  1 sibling, 1 reply; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-21  5:24 UTC (permalink / raw)
  To: Kees Cook
  Cc: Jann Horn, Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	kernel list, Andy Lutomirski, Rasmus Villemoes, Daniel Micay

On Wed, Feb 20, 2019 at 05:05:10PM -0800, Kees Cook wrote:
> On Mon, Feb 18, 2019 at 4:52 PM Jann Horn <jannh@google.com> wrote:
> > AFAICS the byte_at_a_time loop exits when max==0 is reached, and then
> > if `res >= count` (in other words, if we've copied as many bytes as
> > requested, haven't encountered a null byte so far, and haven't reached
> > the end of the address space), we return `res`, which is the same as
> > `count`. Are you sure?
> 
> Oh, whew, there is only 1 arch-specific implementation of this. I
> thought you meant there was multiple implementations.
> 
> So, generally speaking, I'd love to split all strncpy* uses into
> strscpy_zero() (when expecting to do str->str copies), and some new
> function, named like mempadstr() or str2mem() that copies a str to a
> __nonstring char array, with trailing padding, if there is space. Then
> there is no more mixing the two cases and botching things.

Oh cool, treewide changes, I'm down with that.  So to v2 I'll add
str2mem() and then attack the tree as suggested.  What could possibly go
wrong :)?

	Tobin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 3/6] lib/string: Use correct docstring format
  2019-02-21  4:14     ` Randy Dunlap
@ 2019-02-21  5:27       ` Kees Cook
  0 siblings, 0 replies; 41+ messages in thread
From: Kees Cook @ 2019-02-21  5:27 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening, LKML

On Wed, Feb 20, 2019 at 8:14 PM Randy Dunlap <rdunlap@infradead.org> wrote:
> It's already in Documentation/core-api/kernel-api.rst, under
> "String Manipulation."

Ah! Thanks, yes, I missed it. :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21  5:24       ` Tobin C. Harding
@ 2019-02-21  6:02         ` Kees Cook
  2019-02-21 14:58           ` Rasmus Villemoes
                             ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Kees Cook @ 2019-02-21  6:02 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: Jann Horn, Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	kernel list, Andy Lutomirski, Rasmus Villemoes, Daniel Micay,
	Arnd Bergmann, Stephen Rothwell, Miguel Ojeda,
	Gustavo A. R. Silva

On Wed, Feb 20, 2019 at 9:25 PM Tobin C. Harding <me@tobin.cc> wrote:
>
> On Wed, Feb 20, 2019 at 05:05:10PM -0800, Kees Cook wrote:
> > So, generally speaking, I'd love to split all strncpy* uses into
> > strscpy_zero() (when expecting to do str->str copies), and some new
> > function, named like mempadstr() or str2mem() that copies a str to a
> > __nonstring char array, with trailing padding, if there is space. Then
> > there is no more mixing the two cases and botching things.

I should use "converts" instead of "copies" above, just to drive the
point home. :)

>
> Oh cool, treewide changes, I'm down with that.  So to v2 I'll add
> str2mem() and then attack the tree as suggested.  What could possibly go
> wrong :)?

Some clear documentation needs to be written for str2mem() to really
help people understand what a "non string" character array is
(especially given that it LOOKS like it has NUL termination -- when in
fact it's just "padding").

The tree-wide changes will likely take a while (and don't need to be
part of this series unless you want to find a couple good examples)
since we have to do them case-by-case: it's not always obvious when
it's actually a non-string, so getting help from maintainers here will
be needed. (And maybe some kind of flow chart added to
Documentation/process/deprecated.rst for how to stop using strncpy()
and strlcpy().)

What I can't quite figure out yet is how to find a way for sfr to flag
newly added users of strcpy, strncpy, and strlcpy. We might need to
bring back __deprecated, but hide it behind a W=linux-next flag or
something crazy. Stephen, in your builds you're already injecting
-Wimplicit-fallthrough: do you do W=1 or anything like that? If not, I
think we need some W= setting for your linux-next builds that generate
the maintainer-nag warnings...

-Kees

P.S. Here's C string API Rant (I just had to get this out, please feel
free to ignore):

strcpy returns dest ... which is already known, so it's a meaningless
return value.

strncpy returns dest (still meaningless)

strlcpy returns strlen(src) ... the length we WOULD have copied. Why
would we care about that? I'm operating on dest. Were there callers
that needed to both copy part of src and learn how long it was at the
same time?

strscpy returns -E2BIG or non-NUL bytes copied: yay, a return about
what actually happened from the operation!

... snprintf returns what it WOULD have written, much like strlcpy
above. At least snprintf has an excuse: it can be used to calculate an
allocation size (called with a NULL dest and 0 size) ... but shouldn't
"how big is this format string going to be?" be a separate function? I
wonder if we can kill all kernel uses of snprintf too (after
introducing a "how big would it be?" function and switching all other
callers over to scnprintf)...

So scnprintf() does the right thing (count of non-NUL bytes copied out).

So now our safe(r?) string API versions use different letters to show
they're safe: "s" in strcpy and "cn" in sprintf. /me cries forever.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/6] lib/string: Add string copy/zero function
  2019-02-21  0:48   ` Kees Cook
  2019-02-21  5:20     ` Tobin C. Harding
@ 2019-02-21 12:02     ` Andy Shevchenko
  2019-02-25 20:09     ` Tobin C. Harding
  2 siblings, 0 replies; 41+ messages in thread
From: Andy Shevchenko @ 2019-02-21 12:02 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening, LKML,
	Matthew Wilcox, Rasmus Villemoes, Daniel Micay

On Thu, Feb 21, 2019 at 2:49 AM Kees Cook <keescook@chromium.org> wrote:
> On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
> >
> > We have a function to copy strings safely and we have a function to copy
> > strings _and_ zero the tail of the destination (if source string is
> > shorter than destination buffer) but we do not have a function to do
> > both at once.  This means developers must write this themselves if they
> > desire this functionality.  This is a chore, and also leaves us open to
> > off by one errors unnecessarily.
> >
> > Add a function that calls strscpy() then memset()s the tail to zero if
> > the source string is shorter than the destination buffer.

> > +/* Wrapper function, no arch specific code required */
> > +ssize_t strscpy_zeroed(char *dest, const char *src, size_t count);
>
> bikeshed: I think "pad" is shorter and more descriptive. How about
> something like strspad() strscpy_pad() or strscpy_zero()? (just to
> shorten it slightly)

zero / zeroed examples in the kernel have semantics of getting some
area completely zeroed. OTOH pad means different and we have examples
as well (see seq_pad() as one).

So, I would definitely vote for _pad b/c of semantics.

-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21  1:05     ` Kees Cook
  2019-02-21  5:24       ` Tobin C. Harding
@ 2019-02-21 14:28       ` Jann Horn
  2019-02-21 22:52         ` Kees Cook
  1 sibling, 1 reply; 41+ messages in thread
From: Jann Horn @ 2019-02-21 14:28 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	kernel list, Andy Lutomirski, Rasmus Villemoes, Daniel Micay

On Thu, Feb 21, 2019 at 2:05 AM Kees Cook <keescook@chromium.org> wrote:
> On Mon, Feb 18, 2019 at 4:52 PM Jann Horn <jannh@google.com> wrote:
> > AFAICS the byte_at_a_time loop exits when max==0 is reached, and then
> > if `res >= count` (in other words, if we've copied as many bytes as
> > requested, haven't encountered a null byte so far, and haven't reached
> > the end of the address space), we return `res`, which is the same as
> > `count`. Are you sure?
>
> Oh, whew, there is only 1 arch-specific implementation of this. I
> thought you meant there was multiple implementations.

Not sure what you're talking about. Are you talking about
implementations of strncpy_from_user()?


csky: custom strncpy_from_user() calls custom __do_strncpy_from_user,
which is a macro containing a big asm() block.


h8300:
;;; long strncpy_from_user(void *to, void *from, size_t n)
strncpy_from_user:
mov.l er2,er2
bne 1f
sub.l er0,er0
rts
[...]


mips:
static inline long
strncpy_from_user(char *__to, const char __user *__from, long __len)
{
long res;

if (eva_kernel_access()) {
__asm__ __volatile__(
"move\t$4, %1\n\t"
"move\t$5, %2\n\t"
"move\t$6, %3\n\t"
[...]


unicore32:
ENTRY(__strncpy_from_user)
mov ip, r1
1: sub.a r2, r2, #1
ldrusr r3, r1, 1, ns
bfs 2f
stb.w r3, [r0]+, #1
cxor.a r3, #0
[...]


and so on.

> So, generally speaking, I'd love to split all strncpy* uses into
> strscpy_zero() (when expecting to do str->str copies), and some new
> function, named like mempadstr() or str2mem() that copies a str to a
> __nonstring char array, with trailing padding, if there is space. Then
> there is no more mixing the two cases and botching things.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21  6:02         ` Kees Cook
@ 2019-02-21 14:58           ` Rasmus Villemoes
  2019-02-21 23:03             ` Kees Cook
  2019-02-21 16:06           ` Jann Horn
  2019-02-21 20:26           ` Stephen Rothwell
  2 siblings, 1 reply; 41+ messages in thread
From: Rasmus Villemoes @ 2019-02-21 14:58 UTC (permalink / raw)
  To: Kees Cook, Tobin C. Harding
  Cc: Jann Horn, Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	kernel list, Andy Lutomirski, Daniel Micay, Arnd Bergmann,
	Stephen Rothwell, Miguel Ojeda, Gustavo A. R. Silva

On 21/02/2019 07.02, Kees Cook wrote:

> P.S. Here's C string API Rant (I just had to get this out, please feel
> free to ignore):

I'll bite. First, it's "linux kernel string API", only some of string.h
interfaces are in std C. Sure, none of those satisfy all use cases, but
adding Yet Another One also has its costs.

> strcpy returns dest ... which is already known, so it's a meaningless
> return value.
> 
> strncpy returns dest (still meaningless)

Yeah, same for memcpy, memset. Those are classic C interfaces. It does
allow some micro-optimizations in some cases - since one is likely to
continue doing other stuff with dest, the compiler might use the fact
that it has dest in the return register instead of spilling and
reloading it. But I do wonder what the real rationale was.

> strlcpy returns strlen(src) ... the length we WOULD have copied. Why
> would we care about that? I'm operating on dest. Were there callers
> that needed to both copy part of src and learn how long it was at the
> same time?

The rationale is exactly the same as for snprintf - to allow you to
optimistically format/copy to a buffer, and know exactly how much to
realloc() in case it turned out to be too small. Of course, nobody ever
uses strlcpy like that, since just doing strdup() is much much simpler
and less error-prone. So yes, strlcpy() is often a silly interface to use.

> strscpy returns -E2BIG or non-NUL bytes copied: yay, a return about
> what actually happened from the operation!

Well, strictly speaking, strlcpy()'s return value also tells you exactly
what happened, just not in the kernel "negative errno for error
condition" style.

> ... snprintf returns what it WOULD have written, much like strlcpy
> above. At least snprintf has an excuse: it can be used to calculate an
> allocation size (called with a NULL dest and 0 size) ... but shouldn't
> "how big is this format string going to be?" be a separate function?

Why? Sure, you can make a wrapper vhowmuch(const char *fmt, va_args ap),
but its implementation would have to be "return vsnprintf(NULL, 0, fmt,
ap);", since we really must reuse the exact same logic for computing the
length as for doing the actual printf'ing (otherwise they'd get out of
sync).

> I wonder if we can kill all kernel uses of snprintf too (after
> introducing a "how big would it be?" function and switching all other
> callers over to scnprintf)...

Please no. snprintf() is standardized, has sane semantics (though one
sometimes _want_ scnprintf semantics - in which case one can and should
use that...), and, importantly, gcc understands those semantics. So we
get optimizations and diagnostics that we'd miss if we kill off explicit
snprintf and replace with non-standard howmuch+scnprintf.

> So scnprintf() does the right thing (count of non-NUL bytes copied out).

The right thing, when that's the thing you want to know. Which it is in
the "build a string in a buffer by repeated printf calls, perhaps
intermixed with a little flow control logic". But not so in a "format
this stuff to this fixed-size char buffer inside that struct, and let me
know [i.e., 'give me a way of knowing'] if it all fit".


Hello, I'm you super-fantastic-one-size-fits-all string copying API.
Please carefully fill out the following form, and I'll get back to you ASAP.

A1: destination
A2: max bytes to write
B1: source
B2: max bytes to read
C1: do you want me to nul-terminate the destination?
C2: if C1, should that be done by truncating the result if necessary?
C3: how do you want me to handle leftover space in dest, if any?
C4: should the destination be left entirely untouched in case the result
does not fit?
C5: is it an error if I do not encounter a nul byte somewhere in [B1,
B1+B2-1] ?
C6: how do you want me to report on the results of my efforts and my
discoveries?
C7: are there any special conditions I should take into account?
(overlapping buffers, allergies, ...)

Put -1 under A2 to mean "assume there's room enough". Put -1 under B2 to
mean "assume there's a nul byte before walking into lala-land".

Answering yes to C1 and C2 and writing 0 under A2 causes your CapsLock
to stay permanently on. If there's any ambiguity in your answer to C6,
the machine reboots.


Rasmus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21  6:02         ` Kees Cook
  2019-02-21 14:58           ` Rasmus Villemoes
@ 2019-02-21 16:06           ` Jann Horn
  2019-02-21 23:14             ` Kees Cook
  2019-02-21 20:26           ` Stephen Rothwell
  2 siblings, 1 reply; 41+ messages in thread
From: Jann Horn @ 2019-02-21 16:06 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Tobin C. Harding, Shuah Khan,
	Alexander Shishkin, Greg Kroah-Hartman, Andy Shevchenko,
	Kernel Hardening, kernel list, Andy Lutomirski, Rasmus Villemoes,
	Daniel Micay, Arnd Bergmann, Stephen Rothwell, Miguel Ojeda,
	Gustavo A. R. Silva

On Thu, Feb 21, 2019 at 7:02 AM Kees Cook <keescook@chromium.org> wrote:
> On Wed, Feb 20, 2019 at 9:25 PM Tobin C. Harding <me@tobin.cc> wrote:
> >
> > On Wed, Feb 20, 2019 at 05:05:10PM -0800, Kees Cook wrote:
> > > So, generally speaking, I'd love to split all strncpy* uses into
> > > strscpy_zero() (when expecting to do str->str copies), and some new
> > > function, named like mempadstr() or str2mem() that copies a str to a
> > > __nonstring char array, with trailing padding, if there is space. Then
> > > there is no more mixing the two cases and botching things.
>
> I should use "converts" instead of "copies" above, just to drive the
> point home. :)
>
> >
> > Oh cool, treewide changes, I'm down with that.  So to v2 I'll add
> > str2mem() and then attack the tree as suggested.  What could possibly go
> > wrong :)?
>
> Some clear documentation needs to be written for str2mem() to really
> help people understand what a "non string" character array is
> (especially given that it LOOKS like it has NUL termination -- when in
> fact it's just "padding").
>
> The tree-wide changes will likely take a while (and don't need to be
> part of this series unless you want to find a couple good examples)
> since we have to do them case-by-case: it's not always obvious when
> it's actually a non-string, so getting help from maintainers here will
> be needed. (And maybe some kind of flow chart added to
> Documentation/process/deprecated.rst for how to stop using strncpy()
> and strlcpy().)

Wild brainstorming ahead...

Such non-string character arrays are usually fixed-size, right? We
could use struct types around such character arrays to hide the actual
characters (so that people don't accidentally use them with C string
APIs), and enforce that the length is passed around as part of the
type... something like:

#define char_array(len) struct { char __ca_data[len]; }
#define __CA_UNWRAP(captr) (captr)->__ca_data, sizeof((captr)->__ca_data)

void __str2ca(char *dst, size_t dst_len, char *src) {
  size_t src_len = strlen(src);
  if (WARN_ON(src_len > dst_len)) {
    // or whatever else we think is the safest way to deal with this
    // without crashing, if BUG() is not an option.
    memset(dst, 0, dst_len);
    return;
  }
  memcpy(dst, src, src_len);
  memset(dst + src_len, 0, dst_len - src_len);
}
#define str2ca(dst, src) __str2ca(__CA_UNWRAP(dst), (src))

static inline void __ca2ca(char *dst, size_t dst_len, char *src,
size_t src_len) {
  BUILD_BUG_ON(dst_len < src_len);
  memcpy(dst, src, src_len);
  if (src_len != dst_len) {
    memset(dst + src_len, 0, dst_len - src_len);
  }
}
#define ca2ca(dst, src) __ca2ca(__CA_UNWRAP(dst), __CA_UNWRAP(src))


And if you want to do direct assignments instead of using helpers, you
make a typedef like this (since just assigning between two equivalent
structs doesn't work):

typedef char_array(20) blah_name;
blah_name global_name;
int main(int argc, char **argv) {
  blah_name local_name;
  str2ca(&local_name, argv[1]);
  global_name = local_name;
}

You'd also have to use a typedef like this if you want to pass
references to other functions.


Something like this would also be neat for classic C strings in some
situations - if you can put bounds in the types, or (if the string is
dynamically-sized) in the struct, you don't need to messily pass
around bounds for things like snprint() and so on.

> What I can't quite figure out yet is how to find a way for sfr to flag
> newly added users of strcpy, strncpy, and strlcpy. We might need to
> bring back __deprecated, but hide it behind a W=linux-next flag or
> something crazy. Stephen, in your builds you're already injecting
> -Wimplicit-fallthrough: do you do W=1 or anything like that? If not, I
> think we need some W= setting for your linux-next builds that generate
> the maintainer-nag warnings...
>
> -Kees
>
> P.S. Here's C string API Rant (I just had to get this out, please feel
> free to ignore):
>
> strcpy returns dest ... which is already known, so it's a meaningless
> return value.
>
> strncpy returns dest (still meaningless)

Weeeell... it kinda makes sense if you're trying to micro-optimize the
amount of stack spills. If you write code like this:

char *a = malloc(10);
a = strcpy(a, other_string);
return a;

... then the compiler doesn't have to shove `a` in one of the
callee-saved registers or spill it to the stack around the strcpy().
Plus, it might allow tail-call optimizations in some cases.

> strlcpy returns strlen(src) ... the length we WOULD have copied. Why
> would we care about that? I'm operating on dest. Were there callers
> that needed to both copy part of src and learn how long it was at the
> same time?

You could use it to optimistically attempt a copy in a fastpath, and
then fall back to a reallocation if that failed.

> strscpy returns -E2BIG or non-NUL bytes copied: yay, a return about
> what actually happened from the operation!
>
> ... snprintf returns what it WOULD have written, much like strlcpy
> above. At least snprintf has an excuse: it can be used to calculate an
> allocation size (called with a NULL dest and 0 size) ... but shouldn't
> "how big is this format string going to be?" be a separate function? I
> wonder if we can kill all kernel uses of snprintf too (after
> introducing a "how big would it be?" function and switching all other
> callers over to scnprintf)...
>
> So scnprintf() does the right thing (count of non-NUL bytes copied out).

That seems kinda suboptimal. Wouldn't you ideally want to bail out
with an error, or at least do a WARN(), if you're trying to format a
string that doesn't fit into its destination? Like, for example, if
you're assembling a path, and the path doesn't fit into a buffer, and
so you just use half of it, you might end up in a very different place
from where you intended to go.

> So now our safe(r?) string API versions use different letters to show
> they're safe: "s" in strcpy and "cn" in sprintf. /me cries forever.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21  6:02         ` Kees Cook
  2019-02-21 14:58           ` Rasmus Villemoes
  2019-02-21 16:06           ` Jann Horn
@ 2019-02-21 20:26           ` Stephen Rothwell
  2019-02-21 23:16             ` Kees Cook
  2 siblings, 1 reply; 41+ messages in thread
From: Stephen Rothwell @ 2019-02-21 20:26 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Jann Horn, Tobin C. Harding, Shuah Khan,
	Alexander Shishkin, Greg Kroah-Hartman, Andy Shevchenko,
	Kernel Hardening, kernel list, Andy Lutomirski, Rasmus Villemoes,
	Daniel Micay, Arnd Bergmann, Miguel Ojeda, Gustavo A. R. Silva

[-- Attachment #1: Type: text/plain, Size: 692 bytes --]

Hi Kees,

On Wed, 20 Feb 2019 22:02:32 -0800 Kees Cook <keescook@chromium.org> wrote:
>
> What I can't quite figure out yet is how to find a way for sfr to flag
> newly added users of strcpy, strncpy, and strlcpy. We might need to
> bring back __deprecated, but hide it behind a W=linux-next flag or
> something crazy. Stephen, in your builds you're already injecting
> -Wimplicit-fallthrough: do you do W=1 or anything like that? If not, I
> think we need some W= setting for your linux-next builds that generate
> the maintainer-nag warnings...

I just have a set of compiler flags that my build scripts explicitly
enable by setting KCFLAGS.

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21 14:28       ` Jann Horn
@ 2019-02-21 22:52         ` Kees Cook
  0 siblings, 0 replies; 41+ messages in thread
From: Kees Cook @ 2019-02-21 22:52 UTC (permalink / raw)
  To: Jann Horn
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening,
	kernel list, Andy Lutomirski, Rasmus Villemoes, Daniel Micay

On Thu, Feb 21, 2019 at 6:28 AM Jann Horn <jannh@google.com> wrote:
> On Thu, Feb 21, 2019 at 2:05 AM Kees Cook <keescook@chromium.org> wrote:
> > On Mon, Feb 18, 2019 at 4:52 PM Jann Horn <jannh@google.com> wrote:
> > > AFAICS the byte_at_a_time loop exits when max==0 is reached, and then
> > > if `res >= count` (in other words, if we've copied as many bytes as
> > > requested, haven't encountered a null byte so far, and haven't reached
> > > the end of the address space), we return `res`, which is the same as
> > > `count`. Are you sure?
> >
> > Oh, whew, there is only 1 arch-specific implementation of this. I
> > thought you meant there was multiple implementations.
>
> Not sure what you're talking about. Are you talking about
> implementations of strncpy_from_user()?

Ah, I used a bad match. But it seems it's only about half (and not
x86, arm, powerpc).

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21 14:58           ` Rasmus Villemoes
@ 2019-02-21 23:03             ` Kees Cook
  2019-02-25 15:41               ` Rasmus Villemoes
  0 siblings, 1 reply; 41+ messages in thread
From: Kees Cook @ 2019-02-21 23:03 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Tobin C. Harding, Jann Horn, Tobin C. Harding, Shuah Khan,
	Alexander Shishkin, Greg Kroah-Hartman, Andy Shevchenko,
	Kernel Hardening, kernel list, Andy Lutomirski, Daniel Micay,
	Arnd Bergmann, Stephen Rothwell, Miguel Ojeda,
	Gustavo A. R. Silva

On Thu, Feb 21, 2019 at 6:58 AM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> On 21/02/2019 07.02, Kees Cook wrote:
>
> > P.S. Here's C string API Rant (I just had to get this out, please feel
> > free to ignore):
>
> I'll bite. First, it's "linux kernel string API", only some of string.h
> interfaces are in std C. Sure, none of those satisfy all use cases, but
> adding Yet Another One also has its costs.

Well, yes, strscpy and scnprintf appear to be only in the kernel (did
the originate externally somewhere I'm not aware of?)

> > strcpy returns dest ... which is already known, so it's a meaningless
> > return value.
> >
> > strncpy returns dest (still meaningless)
>
> Yeah, same for memcpy, memset. Those are classic C interfaces. It does
> allow some micro-optimizations in some cases - since one is likely to

Right, yes, but this level of optimization is best left to the
compiler. There's much more interesting results a caller should care
about. :)

> > strlcpy returns strlen(src) ... the length we WOULD have copied. Why
> > would we care about that? I'm operating on dest. Were there callers
> > that needed to both copy part of src and learn how long it was at the
> > same time?
>
> The rationale is exactly the same as for snprintf - to allow you to

Okay, but there's as separate function for that: strlen().

> > strscpy returns -E2BIG or non-NUL bytes copied: yay, a return about
> > what actually happened from the operation!
>
> Well, strictly speaking, strlcpy()'s return value also tells you exactly
> what happened, just not in the kernel "negative errno for error
> condition" style.

True, yes, but it's unfriendly: it requires careful math, just like
snprintf, which depends on rechecking the args, making sure you're not
doing an off-by-one from the NUL, etc etc.

> > ... snprintf returns what it WOULD have written, much like strlcpy
> > above. At least snprintf has an excuse: it can be used to calculate an
> > allocation size (called with a NULL dest and 0 size) ... but shouldn't
> > "how big is this format string going to be?" be a separate function?
>
> Why? Sure, you can make a wrapper vhowmuch(const char *fmt, va_args ap),
> but its implementation would have to be "return vsnprintf(NULL, 0, fmt,
> ap);", since we really must reuse the exact same logic for computing the
> length as for doing the actual printf'ing (otherwise they'd get out of
> sync).

Well, I'd likely go the opposite directly: make snprintf() a wrapper
and call vhowmuch(), etc, convert until snprintf could be removed. But
really the best might be removal of snprintf first, then refactor it
with vhowmuch() etc. Lots of ways to solve it, I suppose. But dropping
the unfriendly API would be nice.

> Please no. snprintf() is standardized, has sane semantics (though one

No, it doesn't -- it has well-defined semantics, but using it
correctly is too hard. It's a regular source of bugs. (Not *nearly* as
bad as strncpy, so let's stick to dropping one bad API at a time...)

> sometimes _want_ scnprintf semantics - in which case one can and should
> use that...), and, importantly, gcc understands those semantics. So we
> get optimizations and diagnostics that we'd miss if we kill off explicit
> snprintf and replace with non-standard howmuch+scnprintf.

Those diagnostics can be had with the __printf() markings already...

> > So scnprintf() does the right thing (count of non-NUL bytes copied out).
>
> The right thing, when that's the thing you want to know. Which it is in
> the "build a string in a buffer by repeated printf calls, perhaps
> intermixed with a little flow control logic". But not so in a "format
> this stuff to this fixed-size char buffer inside that struct, and let me
> know [i.e., 'give me a way of knowing'] if it all fit".

Do we need ssprintf() to get the -E2BIG result like strscpy()?

> Hello, I'm you super-fantastic-one-size-fits-all string copying API.
> Please carefully fill out the following form, and I'll get back to you ASAP.

I mean, this is kinda what we have already. We just keep exposing too
many knobs. :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21 16:06           ` Jann Horn
@ 2019-02-21 23:14             ` Kees Cook
  0 siblings, 0 replies; 41+ messages in thread
From: Kees Cook @ 2019-02-21 23:14 UTC (permalink / raw)
  To: Jann Horn
  Cc: Tobin C. Harding, Tobin C. Harding, Shuah Khan,
	Alexander Shishkin, Greg Kroah-Hartman, Andy Shevchenko,
	Kernel Hardening, kernel list, Andy Lutomirski, Rasmus Villemoes,
	Daniel Micay, Arnd Bergmann, Stephen Rothwell, Miguel Ojeda,
	Gustavo A. R. Silva

On Thu, Feb 21, 2019 at 8:07 AM Jann Horn <jannh@google.com> wrote:
>
> On Thu, Feb 21, 2019 at 7:02 AM Kees Cook <keescook@chromium.org> wrote:
> > The tree-wide changes will likely take a while (and don't need to be
> > part of this series unless you want to find a couple good examples)
> > since we have to do them case-by-case: it's not always obvious when
> > it's actually a non-string, so getting help from maintainers here will
> > be needed. (And maybe some kind of flow chart added to
> > Documentation/process/deprecated.rst for how to stop using strncpy()
> > and strlcpy().)
>
> Wild brainstorming ahead...
>
> Such non-string character arrays are usually fixed-size, right? We
> could use struct types around such character arrays to hide the actual
> characters (so that people don't accidentally use them with C string
> APIs), and enforce that the length is passed around as part of the
> type... something like:
>
> #define char_array(len) struct { char __ca_data[len]; }

Does this need __packed or will the compiler understand it's
byte-aligned? And it needs __nonstring :)

> #define __CA_UNWRAP(captr) (captr)->__ca_data, sizeof((captr)->__ca_data)
>
> void __str2ca(char *dst, size_t dst_len, char *src) {
>   size_t src_len = strlen(src);
>   if (WARN_ON(src_len > dst_len)) {
>     // or whatever else we think is the safest way to deal with this
>     // without crashing, if BUG() is not an option.
>     memset(dst, 0, dst_len);
>     return;
>   }
>   memcpy(dst, src, src_len);
>   memset(dst + src_len, 0, dst_len - src_len);
> }
> #define str2ca(dst, src) __str2ca(__CA_UNWRAP(dst), (src))
>
> static inline void __ca2ca(char *dst, size_t dst_len, char *src,
> size_t src_len) {
>   BUILD_BUG_ON(dst_len < src_len);
>   memcpy(dst, src, src_len);
>   if (src_len != dst_len) {
>     memset(dst + src_len, 0, dst_len - src_len);
>   }
> }
> #define ca2ca(dst, src) __ca2ca(__CA_UNWRAP(dst), __CA_UNWRAP(src))

Yeah, I was trying to think of ways to do this but I was thinking
mostly about noticing the __nonstring mark. #define-ing this away
might work, but it might also just annoy people more. At least GCC
will yell about __nonstring use in some cases.

> And if you want to do direct assignments instead of using helpers, you
> make a typedef like this (since just assigning between two equivalent
> structs doesn't work):
>
> typedef char_array(20) blah_name;
> blah_name global_name;
> int main(int argc, char **argv) {
>   blah_name local_name;
>   str2ca(&local_name, argv[1]);
>   global_name = local_name;
> }
>
> You'd also have to use a typedef like this if you want to pass
> references to other functions.

Yeah, it might work. I need to go look through ext4 -- that's one
place I know there were "legit" strncpy() uses...

Converting existing non-string cases to this and see if it's worse
would be educational.

> Something like this would also be neat for classic C strings in some
> situations - if you can put bounds in the types, or (if the string is
> dynamically-sized) in the struct, you don't need to messily pass
> around bounds for things like snprint() and so on.

Yeah, I remain annoyed about string pointers having lost their
allocation size meta data. The vast majority of str*() functions could
just be "str, sizeof(str)" like you have for the CA_UNWRAP.

> > So scnprintf() does the right thing (count of non-NUL bytes copied out).
>
> That seems kinda suboptimal. Wouldn't you ideally want to bail out
> with an error, or at least do a WARN(), if you're trying to format a
> string that doesn't fit into its destination? Like, for example, if
> you're assembling a path, and the path doesn't fit into a buffer, and
> so you just use half of it, you might end up in a very different place
> from where you intended to go.

ssprintf() with -E2BIG? Most correct users of snprintf()'s return
value do some version of trying to detect if there wasn't enough space
and then error out:

static inline int usb_make_path(struct usb_device *dev, char *buf, size_t size)
{
        int actual;
        actual = snprintf(buf, size, "usb-%s-%s", dev->bus->bus_name,
                          dev->devpath);
        return (actual >= (int)size) ? -1 : actual;
}

a) this could just be "return ssprintf(..."
b) as far as I see, there are literally 2 callers of usb_make_path()
that check the return value

Anyway, snprintf() is "next". I'd like to get through
strcpy/strncpy/strlcpy removal first... :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21 20:26           ` Stephen Rothwell
@ 2019-02-21 23:16             ` Kees Cook
  0 siblings, 0 replies; 41+ messages in thread
From: Kees Cook @ 2019-02-21 23:16 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Tobin C. Harding, Jann Horn, Tobin C. Harding, Shuah Khan,
	Alexander Shishkin, Greg Kroah-Hartman, Andy Shevchenko,
	Kernel Hardening, kernel list, Andy Lutomirski, Rasmus Villemoes,
	Daniel Micay, Arnd Bergmann, Miguel Ojeda, Gustavo A. R. Silva

On Thu, Feb 21, 2019 at 12:27 PM Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> Hi Kees,
>
> On Wed, 20 Feb 2019 22:02:32 -0800 Kees Cook <keescook@chromium.org> wrote:
> >
> > What I can't quite figure out yet is how to find a way for sfr to flag
> > newly added users of strcpy, strncpy, and strlcpy. We might need to
> > bring back __deprecated, but hide it behind a W=linux-next flag or
> > something crazy. Stephen, in your builds you're already injecting
> > -Wimplicit-fallthrough: do you do W=1 or anything like that? If not, I
> > think we need some W= setting for your linux-next builds that generate
> > the maintainer-nag warnings...
>
> I just have a set of compiler flags that my build scripts explicitly
> enable by setting KCFLAGS.

Okay, so you could include some -D option to enable it. I'll see if I
can cook something up.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user
  2019-02-21 23:03             ` Kees Cook
@ 2019-02-25 15:41               ` Rasmus Villemoes
  0 siblings, 0 replies; 41+ messages in thread
From: Rasmus Villemoes @ 2019-02-25 15:41 UTC (permalink / raw)
  To: Kees Cook, Rasmus Villemoes
  Cc: Tobin C. Harding, Jann Horn, Tobin C. Harding, Shuah Khan,
	Alexander Shishkin, Greg Kroah-Hartman, Andy Shevchenko,
	Kernel Hardening, kernel list, Andy Lutomirski, Daniel Micay,
	Arnd Bergmann, Stephen Rothwell, Miguel Ojeda,
	Gustavo A. R. Silva

On 22/02/2019 00.03, Kees Cook wrote:
> On Thu, Feb 21, 2019 at 6:58 AM Rasmus Villemoes
> <linux@rasmusvillemoes.dk> wrote:
>>
>> On 21/02/2019 07.02, Kees Cook wrote:
>>
> 
>>> ... snprintf returns what it WOULD have written, much like strlcpy
>>> above. At least snprintf has an excuse: it can be used to calculate an
>>> allocation size (called with a NULL dest and 0 size) ... but shouldn't
>>> "how big is this format string going to be?" be a separate function?
>>
>> Why? Sure, you can make a wrapper vhowmuch(const char *fmt, va_args ap),
>> but its implementation would have to be "return vsnprintf(NULL, 0, fmt,
>> ap);", since we really must reuse the exact same logic for computing the
>> length as for doing the actual printf'ing (otherwise they'd get out of
>> sync).
> 
> Well, I'd likely go the opposite directly: make snprintf() a wrapper
> and call vhowmuch(), etc, convert until snprintf could be removed.But
> really the best might be removal of snprintf first, then refactor it
> with vhowmuch() etc. Lots of ways to solve it, I suppose.

I'd really like to see how vsprintf.c would look in a world where the
workhorse (which is vsnprintf() and its tree of helpers) would not
format and measure at the same time.

But dropping
> the unfriendly API would be nice.
> 
>> Please no. snprintf() is standardized, has sane semantics (though one
> 
> No, it doesn't -- it has well-defined semantics, 

We'll just have to disagree on what sane means.

> but using it
> correctly is too hard. It's a regular source of bugs. (Not *nearly* as
> bad as strncpy, so let's stick to dropping one bad API at a time...)
> 
>> sometimes _want_ scnprintf semantics - in which case one can and should
>> use that...), and, importantly, gcc understands those semantics. So we
>> get optimizations and diagnostics that we'd miss if we kill off explicit
>> snprintf and replace with non-standard howmuch+scnprintf.
> 
> Those diagnostics can be had with the __printf() markings already...

No. You're thinking of the type-checking things. Those are important, of
course, but have nothing at all to do with the s or n parts of snprintf
- what I'm thinking of is the fact that gcc knows that buf is a
char-buffer of length len into which snprintf() may/will write, so we
have the Wformat-truncation and Warray-bounds kind of warnings. And the
optimization part is where gcc can replace a snprintf() with a simpler
string op (e.g. a memcpy), or know that an overflow check can be elided
because "%d" does fit in the buf[16], or...

One thing I've had on my wishlist for a long time is a
buffer_size(ptrarg, expr, r/w/rw) attribute that would say that the
function treats the pointer argument ptrarg as a buffer of size given by
the expr (in bytes), and accesses it according to the third parameter.
Then the compiler could automatically check with its builtin_objsize
machinery for mismatched buffer size computations (e.g., using sizeof()
when the interface expects ARRAY_SIZE()) violations or uninitialized
reads, or any number of optional run-time instrumentations in caller or
callee... So

memcpy(void *dst, const void *src, size_t len) __buffer_size(dst, len,
"w") __buffer_size(src, len, "r")

or

int poll(struct pollfd *fds, nfds_t nfds, int timeout)
__buffer_size(fds, nfds*sizeof(struct pollfd), "rw")

the latter being an example of where an arbitrary expression in terms of
the formal parameters is needed - but I don't think gcc's attribute
machinery supports this syntax at all.

Bonus points if one could attach buffer_size to a struct definition as
well, saying that the ->foo member points to ->bar*4  of storage.


Rasmus

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/6] lib/string: Add string copy/zero function
  2019-02-21  0:48   ` Kees Cook
  2019-02-21  5:20     ` Tobin C. Harding
  2019-02-21 12:02     ` Andy Shevchenko
@ 2019-02-25 20:09     ` Tobin C. Harding
  2 siblings, 0 replies; 41+ messages in thread
From: Tobin C. Harding @ 2019-02-25 20:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tobin C. Harding, Shuah Khan, Alexander Shishkin,
	Greg Kroah-Hartman, Andy Shevchenko, Kernel Hardening, LKML,
	Matthew Wilcox, Rasmus Villemoes, Daniel Micay

On Wed, Feb 20, 2019 at 04:48:18PM -0800, Kees Cook wrote:
> On Mon, Feb 18, 2019 at 3:24 PM Tobin C. Harding <tobin@kernel.org> wrote:
> >
> > We have a function to copy strings safely and we have a function to copy
> > strings _and_ zero the tail of the destination (if source string is
> > shorter than destination buffer) but we do not have a function to do
> > both at once.  This means developers must write this themselves if they
> > desire this functionality.  This is a chore, and also leaves us open to
> > off by one errors unnecessarily.
> >
> > Add a function that calls strscpy() then memset()s the tail to zero if
> > the source string is shorter than the destination buffer.
> >
> > Add testing via kselftest.
> >
> > Signed-off-by: Tobin C. Harding <tobin@kernel.org>
> > ---
> >  include/linux/string.h |  4 ++++
> >  lib/Kconfig.debug      |  2 +-
> >  lib/string.c           | 30 ++++++++++++++++++++++++++++--
> >  lib/test_string.c      | 31 +++++++++++++++++++++++++++++++
> >  4 files changed, 64 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/string.h b/include/linux/string.h
> > index 7927b875f80c..695a5e6a31e3 100644
> > --- a/include/linux/string.h
> > +++ b/include/linux/string.h
> > @@ -31,6 +31,10 @@ size_t strlcpy(char *, const char *, size_t);
> >  #ifndef __HAVE_ARCH_STRSCPY
> >  ssize_t strscpy(char *, const char *, size_t);
> >  #endif
> > +
> > +/* Wrapper function, no arch specific code required */
> > +ssize_t strscpy_zeroed(char *dest, const char *src, size_t count);
> 
> bikeshed: I think "pad" is shorter and more descriptive. How about
> something like strspad() strscpy_pad() or strscpy_zero()? (just to
> shorten it slightly)
> 
> Not a blocker, just a TODO: we need a wrapper to do
> CONFIG_FORTIFY_SOURCE checking for strscpy() (and strscpy_zeroed()) to
> check for __builtin_object_size() vs the "size" argument, as done in
> strlcpy() in include/linux/string.h
> 
> > @@ -238,6 +237,33 @@ ssize_t strscpy(char *dest, const char *src, size_t count)
> >  EXPORT_SYMBOL(strscpy);
> >  #endif
> >
> > +/**
> > + * strscopy_zeroed() - Copy a C-string into a sized buffer
> > + * @dest: Where to copy the string to
> > + * @src: Where to copy the string from
> > + * @count: Size of destination buffer
> > + *
> > + * If the source string is shorter than the destination buffer, zeros
> > + * the tail of the destination buffer.
> > + *
> > + * Return: The number of characters copied (not including the trailing
> > + *         NUL) or -E2BIG if the destination buffer wasn't big enough.
> > + */
> > +ssize_t strscpy_zeroed(char *dest, const char *src, size_t count)
> > +{
> > +       ssize_t written;
> > +
> > +       written = strscpy(dest, src, count);
> > +       if (written < 0)
> > +               return written;
> 
> If written < 0 we filled everything (i.e. we wrote "count - 1" bytes).
> If we also exactly wrote "count - 1", then we also don't need the zero
> padding either, since strscpy wrote the trailing NUL.
> 
> so:
> 
> if (written < 0 || (count && written == count - 1))

(I meant to reply yesterday before posting v2).  At this stage we know
count >= 0 otherwise written would be less than 0.  So I removed the
'count' from the second part of this statement, leaving

  if (written < 0 || written == count - 1)

>     return written;
> 
> > +
> > +       if (written < count)
> > +               memset(dest + written, 0, count - written);

I used this :)

thanks,
Tobin.

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2019-02-25 20:09 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-18 23:23 [PATCH 0/6] lib: Add safe string funtions Tobin C. Harding
2019-02-18 23:23 ` [PATCH 1/6] lib/string: Enable string selftesting Tobin C. Harding
2019-02-19 10:55   ` Andy Shevchenko
2019-02-19 21:55     ` Tobin C. Harding
2019-02-20 10:49       ` Andy Shevchenko
2019-02-20 23:58       ` Kees Cook
2019-02-20 23:57   ` Kees Cook
2019-02-21  5:16     ` Tobin C. Harding
2019-02-18 23:23 ` [PATCH 2/6] lib/string: Fix erroneous 'overflow' documentation Tobin C. Harding
2019-02-21  0:02   ` Kees Cook
2019-02-21  5:17     ` Tobin C. Harding
2019-02-18 23:23 ` [PATCH 3/6] lib/string: Use correct docstring format Tobin C. Harding
2019-02-21  0:07   ` Kees Cook
2019-02-21  4:14     ` Randy Dunlap
2019-02-21  5:27       ` Kees Cook
2019-02-18 23:23 ` [PATCH 4/6] lib/string: Add string copy/zero function Tobin C. Harding
2019-02-21  0:48   ` Kees Cook
2019-02-21  5:20     ` Tobin C. Harding
2019-02-21 12:02     ` Andy Shevchenko
2019-02-25 20:09     ` Tobin C. Harding
2019-02-18 23:23 ` [PATCH 5/6] lib: Fix function documentation for strncpy_from_user Tobin C. Harding
2019-02-19  0:51   ` Jann Horn
2019-02-19 21:52     ` Tobin C. Harding
2019-02-21  1:05     ` Kees Cook
2019-02-21  5:24       ` Tobin C. Harding
2019-02-21  6:02         ` Kees Cook
2019-02-21 14:58           ` Rasmus Villemoes
2019-02-21 23:03             ` Kees Cook
2019-02-25 15:41               ` Rasmus Villemoes
2019-02-21 16:06           ` Jann Horn
2019-02-21 23:14             ` Kees Cook
2019-02-21 20:26           ` Stephen Rothwell
2019-02-21 23:16             ` Kees Cook
2019-02-21 14:28       ` Jann Horn
2019-02-21 22:52         ` Kees Cook
2019-02-18 23:23 ` [PATCH 6/6] lib: Add function strscpy_from_user() Tobin C. Harding
2019-02-19  2:09   ` Jann Horn
2019-02-19  2:12   ` Jann Horn
2019-02-19 21:53     ` Tobin C. Harding
2019-02-20 23:31 ` [PATCH 0/6] lib: Add safe string funtions Kees Cook
2019-02-21  5:15   ` Tobin C. Harding

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).