LKML Archive on lore.kernel.org
 help / color / Atom feed
* [RFC 0/4] Switching ARC to optimized generic strncpy_from_user
@ 2020-01-14 20:08 Vineet Gupta
  2020-01-14 20:08 ` [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use Vineet Gupta
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Vineet Gupta @ 2020-01-14 20:08 UTC (permalink / raw)
  To: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, Linus Torvalds
  Cc: linux-snps-arc, linux-kernel, linux-arch, Vineet Gupta

Hi,

This came up when trying to move ARC over to generic word-at-a-time
interface.

 - 1/4 is a trivial fix (and needed for ARC switch)
 - 2/4 is mucking with internals hence the RFC. I could very likely be
   overlooking some possible DoS / exploit issues and apologies in advance
   if thats the case but I felt like sharing it anyways to see what
   others think.
 - 3/4, 4/4 are ARC changes to remove the existing ARC version and
   switch to generic (needs 1/4).

Thx,
-Vineet

Vineet Gupta (4):
  asm-generic/uaccess: don't define inline functions if noinline lib/*
    in use
  lib/strncpy_from_user: Remove redundant user space pointer range check
  ARC: uaccess: remove noinline variants of __strncpy_from_user() and
    friends
  ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user

 arch/arc/Kconfig                      |  2 +
 arch/arc/include/asm/Kbuild           |  1 -
 arch/arc/include/asm/uaccess.h        | 87 ++-------------------------
 arch/arc/include/asm/word-at-a-time.h | 49 +++++++++++++++
 arch/arc/mm/extable.c                 | 23 -------
 include/asm-generic/uaccess.h         |  4 ++
 lib/strncpy_from_user.c               | 36 ++++-------
 lib/strnlen_user.c                    | 28 +++------
 8 files changed, 79 insertions(+), 151 deletions(-)
 create mode 100644 arch/arc/include/asm/word-at-a-time.h

-- 
2.20.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use
  2020-01-14 20:08 [RFC 0/4] Switching ARC to optimized generic strncpy_from_user Vineet Gupta
@ 2020-01-14 20:08 ` Vineet Gupta
  2020-01-14 20:57   ` Arnd Bergmann
  2020-01-14 21:32   ` Linus Torvalds
  2020-01-14 20:08 ` [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check Vineet Gupta
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 21+ messages in thread
From: Vineet Gupta @ 2020-01-14 20:08 UTC (permalink / raw)
  To: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, Linus Torvalds
  Cc: linux-snps-arc, linux-kernel, linux-arch, Vineet Gupta

There are 2 generic varaints of strncpy_from_user() / strnlen_user()
 (1). inline version in asm-generic/uaccess.h
 (2). optimized word-at-a-time version in lib/*

This patch disables #1 if #2 selected. This allows arches to continue
reusing asm-generic/uaccess.h for rest of code

This came up when switching ARC to generic word-at-a-time interface

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 include/asm-generic/uaccess.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index e935318804f8..74c14211377b 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -227,6 +227,7 @@ __strncpy_from_user(char *dst, const char __user *src, long count)
 }
 #endif
 
+#ifndef CONFIG_GENERIC_STRNCPY_FROM_USER
 static inline long
 strncpy_from_user(char *dst, const char __user *src, long count)
 {
@@ -234,6 +235,7 @@ strncpy_from_user(char *dst, const char __user *src, long count)
 		return -EFAULT;
 	return __strncpy_from_user(dst, src, count);
 }
+#endif
 
 /*
  * Return the size of a string (including the ending 0)
@@ -244,6 +246,7 @@ strncpy_from_user(char *dst, const char __user *src, long count)
 #define __strnlen_user(s, n) (strnlen((s), (n)) + 1)
 #endif
 
+#ifndef CONFIG_GENERIC_STRNLEN_USER
 /*
  * Unlike strnlen, strnlen_user includes the nul terminator in
  * its returned count. Callers should check for a returned value
@@ -255,6 +258,7 @@ static inline long strnlen_user(const char __user *src, long n)
 		return 0;
 	return __strnlen_user(src, n);
 }
+#endif
 
 /*
  * Zero Userspace
-- 
2.20.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check
  2020-01-14 20:08 [RFC 0/4] Switching ARC to optimized generic strncpy_from_user Vineet Gupta
  2020-01-14 20:08 ` [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use Vineet Gupta
@ 2020-01-14 20:08 ` Vineet Gupta
  2020-01-14 21:22   ` Linus Torvalds
  2020-01-15 14:42   ` Andrey Konovalov
  2020-01-14 20:08 ` [RFC 3/4] ARC: uaccess: remove noinline variants of __strncpy_from_user() and friends Vineet Gupta
  2020-01-14 20:08 ` [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user Vineet Gupta
  3 siblings, 2 replies; 21+ messages in thread
From: Vineet Gupta @ 2020-01-14 20:08 UTC (permalink / raw)
  To: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, Linus Torvalds
  Cc: linux-snps-arc, linux-kernel, linux-arch, Vineet Gupta

This came up when switching ARC to word-at-a-time interface and using
generic/optimized strncpy_from_user

It seems the existing code checks for user buffer/string range multiple
times and one of tem cn be avoided.

There's an open-coded range check which computes @max off of user_addr_max()
and thus typically way larger than the kernel buffer @count and subsequently
discarded in do_strncpy_from_user()

	if (max > count)
		max = count;

The canonical user_access_begin() => access_ok() follow anyways and even
with @count it should suffice for an intial range check as is true for
any copy_{to,from}_user()

And in case actual user space buffer is smaller than kernel dest pointer
(i.e. @max < @count) the usual string copy, null byte detection would
abort the process early anyways

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 lib/strncpy_from_user.c | 36 +++++++++++-------------------------
 lib/strnlen_user.c      | 28 +++++++---------------------
 2 files changed, 18 insertions(+), 46 deletions(-)

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index dccb95af6003..a1622d71f037 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -21,22 +21,15 @@
 /*
  * Do a strncpy, return length of string without final '\0'.
  * 'count' is the user-supplied count (return 'count' if we
- * hit it), 'max' is the address space maximum (and we return
- * -EFAULT if we hit it).
+ * hit it). If access fails, return -EFAULT.
  */
 static inline long do_strncpy_from_user(char *dst, const char __user *src,
-					unsigned long count, unsigned long max)
+					unsigned long count)
 {
 	const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
+	unsigned long max = count;
 	unsigned long res = 0;
 
-	/*
-	 * Truncate 'max' to the user-specified limit, so that
-	 * we only have one limit we need to check in the loop
-	 */
-	if (max > count)
-		max = count;
-
 	if (IS_UNALIGNED(src, dst))
 		goto byte_at_a_time;
 
@@ -72,7 +65,7 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
 	 * Uhhuh. We hit 'max'. But was that the user-specified maximum
 	 * too? If so, that's ok - we got as much as the user asked for.
 	 */
-	if (res >= count)
+	if (res == count)
 		return res;
 
 	/*
@@ -103,25 +96,18 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
  */
 long strncpy_from_user(char *dst, const char __user *src, long count)
 {
-	unsigned long max_addr, src_addr;
-
 	if (unlikely(count <= 0))
 		return 0;
 
-	max_addr = user_addr_max();
-	src_addr = (unsigned long)untagged_addr(src);
-	if (likely(src_addr < max_addr)) {
-		unsigned long max = max_addr - src_addr;
+	kasan_check_write(dst, count);
+	check_object_size(dst, count, false);
+	if (user_access_begin(src, count)) {
 		long retval;
-
-		kasan_check_write(dst, count);
-		check_object_size(dst, count, false);
-		if (user_access_begin(src, max)) {
-			retval = do_strncpy_from_user(dst, src, count, max);
-			user_access_end();
-			return retval;
-		}
+		retval = do_strncpy_from_user(dst, src, count);
+		user_access_end();
+		return retval;
 	}
+
 	return -EFAULT;
 }
 EXPORT_SYMBOL(strncpy_from_user);
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 6c0005d5dd5c..5ce61f303d6e 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -20,19 +20,13 @@
  * if it fits in a aligned 'long'. The caller needs to check
  * the return value against "> max".
  */
-static inline long do_strnlen_user(const char __user *src, unsigned long count, unsigned long max)
+static inline long do_strnlen_user(const char __user *src, unsigned long count)
 {
 	const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
 	unsigned long align, res = 0;
+	unsigned long max = count;
 	unsigned long c;
 
-	/*
-	 * Truncate 'max' to the user-specified limit, so that
-	 * we only have one limit we need to check in the loop
-	 */
-	if (max > count)
-		max = count;
-
 	/*
 	 * Do everything aligned. But that means that we
 	 * need to also expand the maximum..
@@ -64,7 +58,7 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count,
 	 * Uhhuh. We hit 'max'. But was that the user-specified maximum
 	 * too? If so, return the marker for "too long".
 	 */
-	if (res >= count)
+	if (res == count)
 		return count+1;
 
 	/*
@@ -98,22 +92,14 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count,
  */
 long strnlen_user(const char __user *str, long count)
 {
-	unsigned long max_addr, src_addr;
-
 	if (unlikely(count <= 0))
 		return 0;
 
-	max_addr = user_addr_max();
-	src_addr = (unsigned long)untagged_addr(str);
-	if (likely(src_addr < max_addr)) {
-		unsigned long max = max_addr - src_addr;
+	if (user_access_begin(str, count)) {
 		long retval;
-
-		if (user_access_begin(str, max)) {
-			retval = do_strnlen_user(str, count, max);
-			user_access_end();
-			return retval;
-		}
+		retval = do_strnlen_user(str, count);
+		user_access_end();
+		return retval;
 	}
 	return 0;
 }
-- 
2.20.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC 3/4] ARC: uaccess: remove noinline variants of __strncpy_from_user() and friends
  2020-01-14 20:08 [RFC 0/4] Switching ARC to optimized generic strncpy_from_user Vineet Gupta
  2020-01-14 20:08 ` [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use Vineet Gupta
  2020-01-14 20:08 ` [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check Vineet Gupta
@ 2020-01-14 20:08 ` Vineet Gupta
  2020-01-14 20:08 ` [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user Vineet Gupta
  3 siblings, 0 replies; 21+ messages in thread
From: Vineet Gupta @ 2020-01-14 20:08 UTC (permalink / raw)
  To: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, Linus Torvalds
  Cc: linux-snps-arc, linux-kernel, linux-arch, Vineet Gupta

This helps with subsequent removal of arch specific variants in favour
of optimized generic routines (word vs byte access)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/include/asm/uaccess.h | 26 ++++++--------------------
 arch/arc/mm/extable.c          | 23 -----------------------
 2 files changed, 6 insertions(+), 43 deletions(-)

diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index ea40ec7f6cae..0b34c152086f 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -613,7 +613,7 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 	return res;
 }
 
-static inline unsigned long __arc_clear_user(void __user *to, unsigned long n)
+static inline unsigned long __clear_user(void __user *to, unsigned long n)
 {
 	long res = n;
 	unsigned char *d_char = to;
@@ -656,7 +656,7 @@ static inline unsigned long __arc_clear_user(void __user *to, unsigned long n)
 }
 
 static inline long
-__arc_strncpy_from_user(char *dst, const char __user *src, long count)
+__strncpy_from_user(char *dst, const char __user *src, long count)
 {
 	long res = 0;
 	char val;
@@ -688,7 +688,7 @@ __arc_strncpy_from_user(char *dst, const char __user *src, long count)
 	return res;
 }
 
-static inline long __arc_strnlen_user(const char __user *s, long n)
+static inline long __strnlen_user(const char __user *s, long n)
 {
 	long res, tmp1, cnt;
 	char val;
@@ -718,26 +718,12 @@ static inline long __arc_strnlen_user(const char __user *s, long n)
 	return res;
 }
 
-#ifndef CONFIG_CC_OPTIMIZE_FOR_SIZE
-
 #define INLINE_COPY_TO_USER
 #define INLINE_COPY_FROM_USER
 
-#define __clear_user(d, n)		__arc_clear_user(d, n)
-#define __strncpy_from_user(d, s, n)	__arc_strncpy_from_user(d, s, n)
-#define __strnlen_user(s, n)		__arc_strnlen_user(s, n)
-#else
-extern unsigned long arc_clear_user_noinline(void __user *to,
-		unsigned long n);
-extern long arc_strncpy_from_user_noinline (char *dst, const char __user *src,
-		long count);
-extern long arc_strnlen_user_noinline(const char __user *src, long n);
-
-#define __clear_user(d, n)		arc_clear_user_noinline(d, n)
-#define __strncpy_from_user(d, s, n)	arc_strncpy_from_user_noinline(d, s, n)
-#define __strnlen_user(s, n)		arc_strnlen_user_noinline(s, n)
-
-#endif
+#define __clear_user		__clear_user
+#define __strncpy_from_user	__strncpy_from_user
+#define __strnlen_user		__strnlen_user
 
 #include <asm/segment.h>
 #include <asm-generic/uaccess.h>
diff --git a/arch/arc/mm/extable.c b/arch/arc/mm/extable.c
index b06b09ddf924..88fa3a4d4906 100644
--- a/arch/arc/mm/extable.c
+++ b/arch/arc/mm/extable.c
@@ -22,26 +22,3 @@ int fixup_exception(struct pt_regs *regs)
 
 	return 0;
 }
-
-#ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
-
-unsigned long arc_clear_user_noinline(void __user *to,
-		unsigned long n)
-{
-	return __arc_clear_user(to, n);
-}
-EXPORT_SYMBOL(arc_clear_user_noinline);
-
-long arc_strncpy_from_user_noinline(char *dst, const char __user *src,
-		long count)
-{
-	return __arc_strncpy_from_user(dst, src, count);
-}
-EXPORT_SYMBOL(arc_strncpy_from_user_noinline);
-
-long arc_strnlen_user_noinline(const char __user *src, long n)
-{
-	return __arc_strnlen_user(src, n);
-}
-EXPORT_SYMBOL(arc_strnlen_user_noinline);
-#endif
-- 
2.20.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user
  2020-01-14 20:08 [RFC 0/4] Switching ARC to optimized generic strncpy_from_user Vineet Gupta
                   ` (2 preceding siblings ...)
  2020-01-14 20:08 ` [RFC 3/4] ARC: uaccess: remove noinline variants of __strncpy_from_user() and friends Vineet Gupta
@ 2020-01-14 20:08 ` Vineet Gupta
  2020-01-14 20:42   ` Arnd Bergmann
  3 siblings, 1 reply; 21+ messages in thread
From: Vineet Gupta @ 2020-01-14 20:08 UTC (permalink / raw)
  To: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, Linus Torvalds
  Cc: linux-snps-arc, linux-kernel, linux-arch, Vineet Gupta

These rely on word access rather than byte loop

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/Kconfig                      |  2 +
 arch/arc/include/asm/Kbuild           |  1 -
 arch/arc/include/asm/uaccess.h        | 71 ++-------------------------
 arch/arc/include/asm/word-at-a-time.h | 49 ++++++++++++++++++
 4 files changed, 56 insertions(+), 67 deletions(-)
 create mode 100644 arch/arc/include/asm/word-at-a-time.h

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 26108ea785c2..3b074c4d31fb 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -26,6 +26,8 @@ config ARC
 	select GENERIC_PENDING_IRQ if SMP
 	select GENERIC_SCHED_CLOCK
 	select GENERIC_SMP_IDLE_THREAD
+	select GENERIC_STRNCPY_FROM_USER if MMU
+	select GENERIC_STRNLEN_USER if MMU
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_DEBUG_STACKOVERFLOW
diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild
index 1b505694691e..cb8d459b7f56 100644
--- a/arch/arc/include/asm/Kbuild
+++ b/arch/arc/include/asm/Kbuild
@@ -24,5 +24,4 @@ generic-y += topology.h
 generic-y += trace_clock.h
 generic-y += user.h
 generic-y += vga.h
-generic-y += word-at-a-time.h
 generic-y += xor.h
diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index 0b34c152086f..f579e06447a9 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -23,7 +23,6 @@
 
 #include <linux/string.h>	/* for generic string functions */
 
-
 #define __kernel_ok		(uaccess_kernel())
 
 /*
@@ -52,6 +51,8 @@
 #define __access_ok(addr, sz)	(unlikely(__kernel_ok) || \
 				 likely(__user_ok((addr), (sz))))
 
+#define user_addr_max() 	(uaccess_kernel() ? ~0UL : get_fs())
+
 /*********** Single byte/hword/word copies ******************/
 
 #define __get_user_fn(sz, u, k)					\
@@ -655,75 +656,13 @@ static inline unsigned long __clear_user(void __user *to, unsigned long n)
 	return res;
 }
 
-static inline long
-__strncpy_from_user(char *dst, const char __user *src, long count)
-{
-	long res = 0;
-	char val;
-
-	if (count == 0)
-		return 0;
-
-	__asm__ __volatile__(
-	"	mov	lp_count, %5		\n"
-	"	lp	3f			\n"
-	"1:	ldb.ab  %3, [%2, 1]		\n"
-	"	breq.d	%3, 0, 3f               \n"
-	"	stb.ab  %3, [%1, 1]		\n"
-	"	add	%0, %0, 1	# Num of NON NULL bytes copied	\n"
-	"3:								\n"
-	"	.section .fixup, \"ax\"		\n"
-	"	.align 4			\n"
-	"4:	mov %0, %4		# sets @res as -EFAULT	\n"
-	"	j   3b				\n"
-	"	.previous			\n"
-	"	.section __ex_table, \"a\"	\n"
-	"	.align 4			\n"
-	"	.word   1b, 4b			\n"
-	"	.previous			\n"
-	: "+r"(res), "+r"(dst), "+r"(src), "=r"(val)
-	: "g"(-EFAULT), "r"(count)
-	: "lp_count", "memory");
-
-	return res;
-}
-
-static inline long __strnlen_user(const char __user *s, long n)
-{
-	long res, tmp1, cnt;
-	char val;
-
-	__asm__ __volatile__(
-	"	mov %2, %1			\n"
-	"1:	ldb.ab  %3, [%0, 1]		\n"
-	"	breq.d  %3, 0, 2f		\n"
-	"	sub.f   %2, %2, 1		\n"
-	"	bnz 1b				\n"
-	"	sub %2, %2, 1			\n"
-	"2:	sub %0, %1, %2			\n"
-	"3:	;nop				\n"
-	"	.section .fixup, \"ax\"		\n"
-	"	.align 4			\n"
-	"4:	mov %0, 0			\n"
-	"	j   3b				\n"
-	"	.previous			\n"
-	"	.section __ex_table, \"a\"	\n"
-	"	.align 4			\n"
-	"	.word 1b, 4b			\n"
-	"	.previous			\n"
-	: "=r"(res), "=r"(tmp1), "=r"(cnt), "=r"(val)
-	: "0"(s), "1"(n)
-	: "memory");
-
-	return res;
-}
-
 #define INLINE_COPY_TO_USER
 #define INLINE_COPY_FROM_USER
 
 #define __clear_user		__clear_user
-#define __strncpy_from_user	__strncpy_from_user
-#define __strnlen_user		__strnlen_user
+
+extern long strncpy_from_user(char *dest, const char __user *src, long count);
+extern __must_check long strnlen_user(const char __user *str, long n);
 
 #include <asm/segment.h>
 #include <asm-generic/uaccess.h>
diff --git a/arch/arc/include/asm/word-at-a-time.h b/arch/arc/include/asm/word-at-a-time.h
new file mode 100644
index 000000000000..00e92be70987
--- /dev/null
+++ b/arch/arc/include/asm/word-at-a-time.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 Synopsys Inc.
+ */
+#ifndef __ASM_ARC_WORD_AT_A_TIME_H
+#define __ASM_ARC_WORD_AT_A_TIME_H
+
+#ifdef __LITTLE_ENDIAN__
+
+#include <linux/kernel.h>
+
+struct word_at_a_time {
+	const unsigned long one_bits, high_bits;
+};
+
+#define WORD_AT_A_TIME_CONSTANTS { REPEAT_BYTE(0x01), REPEAT_BYTE(0x80) }
+
+static inline unsigned long has_zero(unsigned long a, unsigned long *bits,
+				     const struct word_at_a_time *c)
+{
+	unsigned long mask = ((a - c->one_bits) & ~a) & c->high_bits;
+	*bits = mask;
+	return mask;
+}
+
+#define prep_zero_mask(a, bits, c) (bits)
+
+static inline unsigned long create_zero_mask(unsigned long bits)
+{
+	bits = (bits - 1) & ~bits;
+	return bits >> 7;
+}
+
+static inline unsigned long find_zero(unsigned long mask)
+{
+#ifdef CONFIG_64BIT
+	return fls64(mask) >> 3;
+#else
+	return fls(mask) >> 3;
+#endif
+}
+
+#define zero_bytemask(mask) (mask)
+
+#else	/* __BIG_ENDIAN__ */
+#include <asm-generic/word-at-a-time.h>
+#endif
+
+#endif /* __ASM_WORD_AT_A_TIME_H */
-- 
2.20.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user
  2020-01-14 20:08 ` [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user Vineet Gupta
@ 2020-01-14 20:42   ` Arnd Bergmann
  2020-01-14 21:36     ` Vineet Gupta
  0 siblings, 1 reply; 21+ messages in thread
From: Arnd Bergmann @ 2020-01-14 20:42 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Khalid Aziz, Andrey Konovalov, Andrew Morton, Peter Zijlstra,
	Christian Brauner, Kees Cook, Ingo Molnar, Aleksa Sarai,
	Linus Torvalds, open list:SYNOPSYS ARC ARCHITECTURE,
	linux-kernel, linux-arch

On Tue, Jan 14, 2020 at 9:08 PM Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:

> diff --git a/arch/arc/include/asm/word-at-a-time.h b/arch/arc/include/asm/word-at-a-time.h
> new file mode 100644
> index 000000000000..00e92be70987
> --- /dev/null
> +++ b/arch/arc/include/asm/word-at-a-time.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2020 Synopsys Inc.
> + */
> +#ifndef __ASM_ARC_WORD_AT_A_TIME_H
> +#define __ASM_ARC_WORD_AT_A_TIME_H
> +
> +#ifdef __LITTLE_ENDIAN__
> +
> +#include <linux/kernel.h>
> +
> +struct word_at_a_time {
> +       const unsigned long one_bits, high_bits;
> +};

What's wrong with the generic version on little-endian? Any
chance you can find a way to make it work as well for you as
this copy?

> +static inline unsigned long find_zero(unsigned long mask)
> +{
> +#ifdef CONFIG_64BIT
> +       return fls64(mask) >> 3;
> +#else
> +       return fls(mask) >> 3;
> +#endif

The CONFIG_64BIT check not be needed, unless you are adding
support for 64-bit ARC really soon.

       Arnd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use
  2020-01-14 20:08 ` [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use Vineet Gupta
@ 2020-01-14 20:57   ` Arnd Bergmann
  2020-01-15 23:01     ` Vineet Gupta
  2020-01-14 21:32   ` Linus Torvalds
  1 sibling, 1 reply; 21+ messages in thread
From: Arnd Bergmann @ 2020-01-14 20:57 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Khalid Aziz, Andrey Konovalov, Andrew Morton, Peter Zijlstra,
	Christian Brauner, Kees Cook, Ingo Molnar, Aleksa Sarai,
	Linus Torvalds, open list:SYNOPSYS ARC ARCHITECTURE,
	linux-kernel, linux-arch

On Tue, Jan 14, 2020 at 9:08 PM Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
>
> There are 2 generic varaints of strncpy_from_user() / strnlen_user()
>  (1). inline version in asm-generic/uaccess.h
>  (2). optimized word-at-a-time version in lib/*
>
> This patch disables #1 if #2 selected. This allows arches to continue
> reusing asm-generic/uaccess.h for rest of code
>
> This came up when switching ARC to generic word-at-a-time interface
>
> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

This looks like a useful change, but I think we can do even better: It
seems that
there are no  callers of __strnlen_user or __strncpy_from_user  in the
kernel today, so these should not be defined either when the Kconfig symbols
are set. Also, I would suggest moving the 'extern' declaration for the two
functions into the #else branch of the conditional so it does not need to be
duplicated.

      Arnd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check
  2020-01-14 20:08 ` [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check Vineet Gupta
@ 2020-01-14 21:22   ` Linus Torvalds
  2020-01-14 21:52     ` Vineet Gupta
  2020-01-14 23:46     ` Al Viro
  2020-01-15 14:42   ` Andrey Konovalov
  1 sibling, 2 replies; 21+ messages in thread
From: Linus Torvalds @ 2020-01-14 21:22 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, linux-snps-arc, Linux Kernel Mailing List,
	linux-arch

On Tue, Jan 14, 2020 at 12:09 PM Vineet Gupta
<Vineet.Gupta1@synopsys.com> wrote:
>
> This came up when switching ARC to word-at-a-time interface and using
> generic/optimized strncpy_from_user
>
> It seems the existing code checks for user buffer/string range multiple
> times and one of tem cn be avoided.

NO!

DO NOT DO THIS.

This is seriously buggy.

>  long strncpy_from_user(char *dst, const char __user *src, long count)
>  {
> -       unsigned long max_addr, src_addr;
> -
>         if (unlikely(count <= 0))
>                 return 0;
>
> -       max_addr = user_addr_max();
> -       src_addr = (unsigned long)untagged_addr(src);
> -       if (likely(src_addr < max_addr)) {
> -               unsigned long max = max_addr - src_addr;
> +       kasan_check_write(dst, count);
> +       check_object_size(dst, count, false);
> +       if (user_access_begin(src, count)) {

You can't do that "user_access_begin(src, count)", because "count" is
the maximum _possible_ length, but it is *NOT* necessarily the actual
length of the string we really get from user space!

Think of this situation:

 - user has a 5-byte string at the end of the address space

 - kernel does a

     n = strncpy_from_user(uaddr, page, PAGE_SIZE)

now your "user_access_begin(src, count)" will _fail_, because "uaddr"
is close to the end of the user address space, and there's not room
for PAGE_SIZE bytes any more.

But "count" isn't actually how many bytes we will access from user
space, it's only the maximum limit on the *target*. IOW, it's about a
kernel buffer size, not about the user access size.

Because we'll only access that 5-byte string, which fits just fine in
the user space, and doing that "user_access_begin(src, count)" gives
the wrong answer.

The fact is, copying a string from user space is *very* different from
copying a fixed number of bytes, and that whole dance with

        max_addr = user_addr_max();

is absolutely required and necessary.

You completely broke string copying.

It is very possible that string copying was horribly broken on ARC
before too - almost nobody ever gets this right, but the generic
routine does.

So the generic routine is not only faster, it is *correct*, and your
change broke it.

Don't touch generic code. If you want to use the generic code, please
do so. But DO NOT TOUCH IT. It is correct, your patch is wrong.

The exact same issue is true in strnlen_user(). Don't break it.

              Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use
  2020-01-14 20:08 ` [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use Vineet Gupta
  2020-01-14 20:57   ` Arnd Bergmann
@ 2020-01-14 21:32   ` Linus Torvalds
  2020-01-15  9:08     ` Arnd Bergmann
  1 sibling, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2020-01-14 21:32 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, linux-snps-arc, Linux Kernel Mailing List,
	linux-arch

On Tue, Jan 14, 2020 at 12:09 PM Vineet Gupta
<Vineet.Gupta1@synopsys.com> wrote:
>
> There are 2 generic varaints of strncpy_from_user() / strnlen_user()
>  (1). inline version in asm-generic/uaccess.h

I think we should get rid of this entirely. It's just a buggy garbage
implementation that nobody should ever actually use.

It does just about everything wrong that you *can* do, wrong,
including doing the NUL-filling termination of standard strncpy() that
"strncpy_from_user()" doesn't actually do.

So:

 - the asm-generic/uaccess.h __strncpy_from_user() function is just
horribly wrong

 - the generic/uaccess.h version of strncpy_from_user() shouldn't be
an inline function either, since the only thing it can do inline is
the bogus one-byte access check that _barely_ makes security work (you
also need to have a guard page to _actually_ make it work, and I'm not
atr all convinced that people do).

the whole thing is just broken and should be removed from a header file.

>  (2). optimized word-at-a-time version in lib/*

That is - outside of the original x86 strncpy_from_user() - the only
copy of this function that historically gets all the corner cases
right. And even those we've gotten wrong occasionally.

I would suggest that anybody who uses asm-generic/uaccess.h needs to
simply use the generic library version.

             Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user
  2020-01-14 20:42   ` Arnd Bergmann
@ 2020-01-14 21:36     ` Vineet Gupta
  2020-01-14 21:49       ` Linus Torvalds
  0 siblings, 1 reply; 21+ messages in thread
From: Vineet Gupta @ 2020-01-14 21:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Khalid Aziz, Andrey Konovalov, Andrew Morton, Peter Zijlstra,
	Christian Brauner, Kees Cook, Ingo Molnar, Aleksa Sarai,
	Linus Torvalds, open list:SYNOPSYS ARC ARCHITECTURE,
	linux-kernel, linux-arch

On 1/14/20 12:42 PM, Arnd Bergmann wrote:
> On Tue, Jan 14, 2020 at 9:08 PM Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
> 
>> diff --git a/arch/arc/include/asm/word-at-a-time.h b/arch/arc/include/asm/word-at-a-time.h
>> new file mode 100644
>> index 000000000000..00e92be70987
>> --- /dev/null
>> +++ b/arch/arc/include/asm/word-at-a-time.h
>> @@ -0,0 +1,49 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (C) 2020 Synopsys Inc.
>> + */
>> +#ifndef __ASM_ARC_WORD_AT_A_TIME_H
>> +#define __ASM_ARC_WORD_AT_A_TIME_H
>> +
>> +#ifdef __LITTLE_ENDIAN__
>> +
>> +#include <linux/kernel.h>
>> +
>> +struct word_at_a_time {
>> +       const unsigned long one_bits, high_bits;
>> +};
> 
> What's wrong with the generic version on little-endian? Any
> chance you can find a way to make it work as well for you as
> this copy?

find_zero() by default doesn't use pop count instructions. I didn't like the copy
either but wasn't sure of the best way to make this 4 API interface reusable. Are
you suggesting we allow partial over-ride starting with #ifndef find_zero ?

>> +static inline unsigned long find_zero(unsigned long mask)
>> +{
>> +#ifdef CONFIG_64BIT
>> +       return fls64(mask) >> 3;
>> +#else
>> +       return fls(mask) >> 3;
>> +#endif
> 
> The CONFIG_64BIT check not be needed, unless you are adding
> support for 64-bit ARC really soon.

:-) Indeed that was the premise !

Thx for the quick review.

-Vineet

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user
  2020-01-14 21:36     ` Vineet Gupta
@ 2020-01-14 21:49       ` Linus Torvalds
  2020-01-14 22:14         ` Vineet Gupta
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2020-01-14 21:49 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, open list:SYNOPSYS ARC ARCHITECTURE, linux-kernel,
	linux-arch

On Tue, Jan 14, 2020 at 1:37 PM Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
>
> On 1/14/20 12:42 PM, Arnd Bergmann wrote:
> >
> > What's wrong with the generic version on little-endian? Any
> > chance you can find a way to make it work as well for you as
> > this copy?
>
> find_zero() by default doesn't use pop count instructions.

Don't you think the generic find_zero() is likely just as fast as the
pop count instruction? On 32-bit, I think it's like a shift and a mask
and a couple of additions.

The 64-bit case has a multiply that is likely expensive unless you
have a good multiplication unit (but what 64-bit architecture
doesn't?), but the generic 32-bit LE code should already be pretty
close to optimal, and it might not be worth it to worry about it.

(The big-endian case is very different, and architectures really can
do much better. But LE allows for bit tricks using the carry chain)

             Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check
  2020-01-14 21:22   ` Linus Torvalds
@ 2020-01-14 21:52     ` Vineet Gupta
  2020-01-14 23:46     ` Al Viro
  1 sibling, 0 replies; 21+ messages in thread
From: Vineet Gupta @ 2020-01-14 21:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, linux-snps-arc, Linux Kernel Mailing List,
	linux-arch

On 1/14/20 1:22 PM, Linus Torvalds wrote:
> On Tue, Jan 14, 2020 at 12:09 PM Vineet Gupta
> <Vineet.Gupta1@synopsys.com> wrote:
>>
>> This came up when switching ARC to word-at-a-time interface and using
>> generic/optimized strncpy_from_user
>>
>> It seems the existing code checks for user buffer/string range multiple
>> times and one of tem cn be avoided.
> 
> NO!
> 
> DO NOT DO THIS.
> 
> This is seriously buggy.
> 
>>  long strncpy_from_user(char *dst, const char __user *src, long count)
>>  {
>> -       unsigned long max_addr, src_addr;
>> -
>>         if (unlikely(count <= 0))
>>                 return 0;
>>
>> -       max_addr = user_addr_max();
>> -       src_addr = (unsigned long)untagged_addr(src);
>> -       if (likely(src_addr < max_addr)) {
>> -               unsigned long max = max_addr - src_addr;
>> +       kasan_check_write(dst, count);
>> +       check_object_size(dst, count, false);
>> +       if (user_access_begin(src, count)) {
> 
> You can't do that "user_access_begin(src, count)", because "count" is
> the maximum _possible_ length, but it is *NOT* necessarily the actual
> length of the string we really get from user space!
> 
> Think of this situation:
> 
>  - user has a 5-byte string at the end of the address space
> 
>  - kernel does a
> 
>      n = strncpy_from_user(uaddr, page, PAGE_SIZE)
> 
> now your "user_access_begin(src, count)" will _fail_, because "uaddr"
> is close to the end of the user address space, and there's not room
> for PAGE_SIZE bytes any more.

Oops indeed that was the case I didn't comprehend. In my initial tests with
debugger, every single hit on strncpy_from_user() had user addresses well into the
address space such that @max was ridiculously large (0xFFFF_FFFF - ptr) compared
to @count.

> But "count" isn't actually how many bytes we will access from user
> space, it's only the maximum limit on the *target*. IOW, it's about a
> kernel buffer size, not about the user access size.

Right I understood all that, but missed the case when user buffer is towards end
of address space and access_ok() will erroneously flag it.

> Because we'll only access that 5-byte string, which fits just fine in
> the user space, and doing that "user_access_begin(src, count)" gives
> the wrong answer.
> 
> The fact is, copying a string from user space is *very* different from
> copying a fixed number of bytes, and that whole dance with
> 
>         max_addr = user_addr_max();
> 
> is absolutely required and necessary.
> 
> You completely broke string copying.

I'm sorry and I wasn't sure to begin with hence the disclaimer in 0/4

> It is very possible that string copying was horribly broken on ARC
> before too - almost nobody ever gets this right, but the generic
> routine does.

No it is not. It is just dog slow since it does byte copy and uses the Zero delay
loops which I'm trying to get rid of. That's when I recalled the word-at-a-time
API which I'd meaning to go back to for last 7 years :-)

-Vineet


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user
  2020-01-14 21:49       ` Linus Torvalds
@ 2020-01-14 22:14         ` Vineet Gupta
  0 siblings, 0 replies; 21+ messages in thread
From: Vineet Gupta @ 2020-01-14 22:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, open list:SYNOPSYS ARC ARCHITECTURE, linux-kernel,
	linux-arch

On 1/14/20 1:49 PM, Linus Torvalds wrote:
> On Tue, Jan 14, 2020 at 1:37 PM Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
>>
>> On 1/14/20 12:42 PM, Arnd Bergmann wrote:
>>>
>>> What's wrong with the generic version on little-endian? Any
>>> chance you can find a way to make it work as well for you as
>>> this copy?
>>
>> find_zero() by default doesn't use pop count instructions.
> 
> Don't you think the generic find_zero() is likely just as fast as the
> pop count instruction? On 32-bit, I think it's like a shift and a mask
> and a couple of additions.

You are right that in grand scheme things it may be less than noise.

ARC pop count version

# 	bits = (bits - 1) & ~bits;
#  	return bits >> 7;

	sub r0,r6,1
	bic r6,r0,r6
	lsr r0,r6,7

# 	return fls(mask) >> 3;

	fls.f	r0, r0
	add.nz	r0, r0, 1
	asr r5,r0,3

	j_s.d [blink]

Generic version

# 	bits = (bits - 1) & ~bits;
#  	return bits >> 7;

	sub r5,r6,1
	bic r6,r5,r6
	lsr r5,r6,7

#  	unsigned long a = (0x0ff0001+mask) >> 23;
# 	return a & mask;

	add r0,r5,0x0ff0001	<-- this is 8 byte instruction though
	lsr_s r0,r0,23
	and r5,r5,r0

	j_s.d [blink]


But its the usual itch/inclination of arch people to try and use the specific
instruction if available.

> 
> The 64-bit case has a multiply that is likely expensive unless you
> have a good multiplication unit (but what 64-bit architecture
> doesn't?), but the generic 32-bit LE code should already be pretty
> close to optimal, and it might not be worth it to worry about it.
> 
> (The big-endian case is very different, and architectures really can
> do much better. But LE allows for bit tricks using the carry chain)

-Vineet

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check
  2020-01-14 21:22   ` Linus Torvalds
  2020-01-14 21:52     ` Vineet Gupta
@ 2020-01-14 23:46     ` Al Viro
  1 sibling, 0 replies; 21+ messages in thread
From: Al Viro @ 2020-01-14 23:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vineet Gupta, Arnd Bergmann, Khalid Aziz, Andrey Konovalov,
	Andrew Morton, Peter Zijlstra, Christian Brauner, Kees Cook,
	Ingo Molnar, Aleksa Sarai, linux-snps-arc,
	Linux Kernel Mailing List, linux-arch

On Tue, Jan 14, 2020 at 01:22:07PM -0800, Linus Torvalds wrote:

> The fact is, copying a string from user space is *very* different from
> copying a fixed number of bytes, and that whole dance with
> 
>         max_addr = user_addr_max();
> 
> is absolutely required and necessary.
> 
> You completely broke string copying.

BTW, a quick grep through the callers has found something odd -
static ssize_t kmemleak_write(struct file *file, const char __user *user_buf,
                              size_t size, loff_t *ppos)
{
        char buf[64];
        int buf_size;
        int ret;

        buf_size = min(size, (sizeof(buf) - 1));
        if (strncpy_from_user(buf, user_buf, buf_size) < 0)
                return -EFAULT;
        buf[buf_size] = 0;

What the hell?  If somebody is calling write(fd, buf, n) they'd
better be ready to see any byte from buf[0] up to buf[n - 1]
fetched, and if something is unmapped - deal with -EFAULT.
Is something really doing that and if so, why does kmemleak
try to accomodate that idiocy?

The same goes for several more ->write() instances - mtrr_write(),
armada_debugfs_crtc_reg_write() and cio_ignore_write(); IMO that's
seriously misguided (and cio one ought use vmemdup_user() instead
of what it's doing)...


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use
  2020-01-14 21:32   ` Linus Torvalds
@ 2020-01-15  9:08     ` Arnd Bergmann
  2020-01-15 14:12       ` Al Viro
  0 siblings, 1 reply; 21+ messages in thread
From: Arnd Bergmann @ 2020-01-15  9:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vineet Gupta, Khalid Aziz, Andrey Konovalov, Andrew Morton,
	Peter Zijlstra, Christian Brauner, Kees Cook, Ingo Molnar,
	Aleksa Sarai, open list:SYNOPSYS ARC ARCHITECTURE,
	Linux Kernel Mailing List, linux-arch

On Tue, Jan 14, 2020 at 10:33 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Tue, Jan 14, 2020 at 12:09 PM Vineet Gupta
> <Vineet.Gupta1@synopsys.com> wrote:
> >
> > There are 2 generic varaints of strncpy_from_user() / strnlen_user()
> >  (1). inline version in asm-generic/uaccess.h
>
> I think we should get rid of this entirely. It's just a buggy garbage
> implementation that nobody should ever actually use.
>
> It does just about everything wrong that you *can* do, wrong,
> including doing the NUL-filling termination of standard strncpy() that
> "strncpy_from_user()" doesn't actually do.
>
> So:
>
>  - the asm-generic/uaccess.h __strncpy_from_user() function is just
> horribly wrong

I checked who is actually using it, and the only ones I found
are c6x and rv32-nommu. It shouldn't be hard to move them over
to the generic version.

>  - the generic/uaccess.h version of strncpy_from_user() shouldn't be
> an inline function either, since the only thing it can do inline is
> the bogus one-byte access check that _barely_ makes security work (you
> also need to have a guard page to _actually_ make it work, and I'm not
> atr all convinced that people do).

That would be arc, hexagon, unicore32, and um. Hexagon already has
the same bug in strncpy_from_user and should be converted to the
generic version as you say. For unicore32 the existing asm imlpementation
may be fine, but it's clearly easier to use the generic code than moving
the range check in there.

I don't know what the arch/um implementation needs, but since it's in C,
moving the access_ok() in there is easy enough.

> I would suggest that anybody who uses asm-generic/uaccess.h needs to
> simply use the generic library version.

Or possibly just everybody altogether: the remaining architectures that
have a custom implementation don't seem to be doing any better either.

     Arnd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use
  2020-01-15  9:08     ` Arnd Bergmann
@ 2020-01-15 14:12       ` Al Viro
  2020-01-15 14:21         ` Arnd Bergmann
  0 siblings, 1 reply; 21+ messages in thread
From: Al Viro @ 2020-01-15 14:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linus Torvalds, Vineet Gupta, Khalid Aziz, Andrey Konovalov,
	Andrew Morton, Peter Zijlstra, Christian Brauner, Kees Cook,
	Ingo Molnar, Aleksa Sarai, open list:SYNOPSYS ARC ARCHITECTURE,
	Linux Kernel Mailing List, linux-arch

On Wed, Jan 15, 2020 at 10:08:31AM +0100, Arnd Bergmann wrote:

> > I would suggest that anybody who uses asm-generic/uaccess.h needs to
> > simply use the generic library version.
> 
> Or possibly just everybody altogether: the remaining architectures that
> have a custom implementation don't seem to be doing any better either.

No go for s390.  There you really want to access userland memory in
larger chunks - it's oriented for block transfers.  IIRC, the insn
they are using has a costly setup phase, independent of the amount
to copy, followed by reasonably fast transfer more or less linear
by the size.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use
  2020-01-15 14:12       ` Al Viro
@ 2020-01-15 14:21         ` Arnd Bergmann
  0 siblings, 0 replies; 21+ messages in thread
From: Arnd Bergmann @ 2020-01-15 14:21 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, Vineet Gupta, Khalid Aziz, Andrey Konovalov,
	Andrew Morton, Peter Zijlstra, Christian Brauner, Kees Cook,
	Ingo Molnar, Aleksa Sarai, open list:SYNOPSYS ARC ARCHITECTURE,
	Linux Kernel Mailing List, linux-arch

On Wed, Jan 15, 2020 at 3:12 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Wed, Jan 15, 2020 at 10:08:31AM +0100, Arnd Bergmann wrote:
>
> > > I would suggest that anybody who uses asm-generic/uaccess.h needs to
> > > simply use the generic library version.
> >
> > Or possibly just everybody altogether: the remaining architectures that
> > have a custom implementation don't seem to be doing any better either.
>
> No go for s390.  There you really want to access userland memory in
> larger chunks - it's oriented for block transfers.  IIRC, the insn
> they are using has a costly setup phase, independent of the amount
> to copy, followed by reasonably fast transfer more or less linear
> by the size.

Right, I missed that one while looking through the remaining
implementations.

     Arnd

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check
  2020-01-14 20:08 ` [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check Vineet Gupta
  2020-01-14 21:22   ` Linus Torvalds
@ 2020-01-15 14:42   ` Andrey Konovalov
  2020-01-15 23:00     ` Vineet Gupta
  1 sibling, 1 reply; 21+ messages in thread
From: Andrey Konovalov @ 2020-01-15 14:42 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Arnd Bergmann, Khalid Aziz, Andrew Morton, Peter Zijlstra,
	Christian Brauner, Kees Cook, Ingo Molnar, Aleksa Sarai,
	Linus Torvalds, linux-snps-arc, LKML, linux-arch

On Tue, Jan 14, 2020 at 9:08 PM Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
>
> This came up when switching ARC to word-at-a-time interface and using
> generic/optimized strncpy_from_user
>
> It seems the existing code checks for user buffer/string range multiple
> times and one of tem cn be avoided.
>
> There's an open-coded range check which computes @max off of user_addr_max()
> and thus typically way larger than the kernel buffer @count and subsequently
> discarded in do_strncpy_from_user()
>
>         if (max > count)
>                 max = count;
>
> The canonical user_access_begin() => access_ok() follow anyways and even
> with @count it should suffice for an intial range check as is true for
> any copy_{to,from}_user()
>
> And in case actual user space buffer is smaller than kernel dest pointer
> (i.e. @max < @count) the usual string copy, null byte detection would
> abort the process early anyways
>
> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
> ---
>  lib/strncpy_from_user.c | 36 +++++++++++-------------------------
>  lib/strnlen_user.c      | 28 +++++++---------------------
>  2 files changed, 18 insertions(+), 46 deletions(-)
>
> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> index dccb95af6003..a1622d71f037 100644
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -21,22 +21,15 @@
>  /*
>   * Do a strncpy, return length of string without final '\0'.
>   * 'count' is the user-supplied count (return 'count' if we
> - * hit it), 'max' is the address space maximum (and we return
> - * -EFAULT if we hit it).
> + * hit it). If access fails, return -EFAULT.
>   */
>  static inline long do_strncpy_from_user(char *dst, const char __user *src,
> -                                       unsigned long count, unsigned long max)
> +                                       unsigned long count)
>  {
>         const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
> +       unsigned long max = count;
>         unsigned long res = 0;
>
> -       /*
> -        * Truncate 'max' to the user-specified limit, so that
> -        * we only have one limit we need to check in the loop
> -        */
> -       if (max > count)
> -               max = count;
> -
>         if (IS_UNALIGNED(src, dst))
>                 goto byte_at_a_time;
>
> @@ -72,7 +65,7 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
>          * Uhhuh. We hit 'max'. But was that the user-specified maximum
>          * too? If so, that's ok - we got as much as the user asked for.
>          */
> -       if (res >= count)
> +       if (res == count)
>                 return res;
>
>         /*
> @@ -103,25 +96,18 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
>   */
>  long strncpy_from_user(char *dst, const char __user *src, long count)
>  {
> -       unsigned long max_addr, src_addr;
> -
>         if (unlikely(count <= 0))
>                 return 0;
>
> -       max_addr = user_addr_max();
> -       src_addr = (unsigned long)untagged_addr(src);

If you end up changing this code, you need to keep the untagged_addr()
logic, otherwise this breaks arm64 tagged address ABI [1].

[1] https://www.kernel.org/doc/html/latest/arm64/tagged-address-abi.html

> -       if (likely(src_addr < max_addr)) {
> -               unsigned long max = max_addr - src_addr;
> +       kasan_check_write(dst, count);
> +       check_object_size(dst, count, false);
> +       if (user_access_begin(src, count)) {
>                 long retval;
> -
> -               kasan_check_write(dst, count);
> -               check_object_size(dst, count, false);
> -               if (user_access_begin(src, max)) {
> -                       retval = do_strncpy_from_user(dst, src, count, max);
> -                       user_access_end();
> -                       return retval;
> -               }
> +               retval = do_strncpy_from_user(dst, src, count);
> +               user_access_end();
> +               return retval;
>         }
> +
>         return -EFAULT;
>  }
>  EXPORT_SYMBOL(strncpy_from_user);
> diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
> index 6c0005d5dd5c..5ce61f303d6e 100644
> --- a/lib/strnlen_user.c
> +++ b/lib/strnlen_user.c
> @@ -20,19 +20,13 @@
>   * if it fits in a aligned 'long'. The caller needs to check
>   * the return value against "> max".
>   */
> -static inline long do_strnlen_user(const char __user *src, unsigned long count, unsigned long max)
> +static inline long do_strnlen_user(const char __user *src, unsigned long count)
>  {
>         const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
>         unsigned long align, res = 0;
> +       unsigned long max = count;
>         unsigned long c;
>
> -       /*
> -        * Truncate 'max' to the user-specified limit, so that
> -        * we only have one limit we need to check in the loop
> -        */
> -       if (max > count)
> -               max = count;
> -
>         /*
>          * Do everything aligned. But that means that we
>          * need to also expand the maximum..
> @@ -64,7 +58,7 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count,
>          * Uhhuh. We hit 'max'. But was that the user-specified maximum
>          * too? If so, return the marker for "too long".
>          */
> -       if (res >= count)
> +       if (res == count)
>                 return count+1;
>
>         /*
> @@ -98,22 +92,14 @@ static inline long do_strnlen_user(const char __user *src, unsigned long count,
>   */
>  long strnlen_user(const char __user *str, long count)
>  {
> -       unsigned long max_addr, src_addr;
> -
>         if (unlikely(count <= 0))
>                 return 0;
>
> -       max_addr = user_addr_max();
> -       src_addr = (unsigned long)untagged_addr(str);
> -       if (likely(src_addr < max_addr)) {
> -               unsigned long max = max_addr - src_addr;
> +       if (user_access_begin(str, count)) {
>                 long retval;
> -
> -               if (user_access_begin(str, max)) {
> -                       retval = do_strnlen_user(str, count, max);
> -                       user_access_end();
> -                       return retval;
> -               }
> +               retval = do_strnlen_user(str, count);
> +               user_access_end();
> +               return retval;
>         }
>         return 0;
>  }
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check
  2020-01-15 14:42   ` Andrey Konovalov
@ 2020-01-15 23:00     ` Vineet Gupta
  0 siblings, 0 replies; 21+ messages in thread
From: Vineet Gupta @ 2020-01-15 23:00 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: linux-arch, Kees Cook, Arnd Bergmann, Peter Zijlstra, LKML,
	Aleksa Sarai, Ingo Molnar, Khalid Aziz, Christian Brauner,
	linux-snps-arc, Linus Torvalds, Andrew Morton

On 1/15/20 6:42 AM, Andrey Konovalov wrote:
>> -       max_addr = user_addr_max();
>> -       src_addr = (unsigned long)untagged_addr(src);
>
> If you end up changing this code, you need to keep the untagged_addr()
> logic, otherwise this breaks arm64 tagged address ABI [1].

It is moot point now, but fwiw untagged_addr() would not have been needed anymore
as it was only needed to compute the pointer difference which my patch got rid of.

> 
> [1] https://www.kernel.org/doc/html/latest/arm64/tagged-address-abi.html
> 
>> -       if (likely(src_addr < max_addr)) {
>> -               unsigned long max = max_addr - src_addr;
>> +       kasan_check_write(dst, count);
>> +       check_object_size(dst, count, false);
>> +       if (user_access_begin(src, count)) {


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use
  2020-01-14 20:57   ` Arnd Bergmann
@ 2020-01-15 23:01     ` Vineet Gupta
  2020-01-16 11:43       ` Arnd Bergmann
  0 siblings, 1 reply; 21+ messages in thread
From: Vineet Gupta @ 2020-01-15 23:01 UTC (permalink / raw)
  To: Arnd Bergmann, Vineet Gupta
  Cc: Khalid Aziz, Andrey Konovalov, Andrew Morton, Peter Zijlstra,
	Christian Brauner, Kees Cook, Ingo Molnar, Aleksa Sarai,
	Linus Torvalds, open list:SYNOPSYS ARC ARCHITECTURE,
	linux-kernel, linux-arch

On 1/14/20 12:57 PM, Arnd Bergmann wrote:
> On Tue, Jan 14, 2020 at 9:08 PM Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
>> There are 2 generic varaints of strncpy_from_user() / strnlen_user()
>>  (1). inline version in asm-generic/uaccess.h
>>  (2). optimized word-at-a-time version in lib/*
>>
>> This patch disables #1 if #2 selected. This allows arches to continue
>> reusing asm-generic/uaccess.h for rest of code
>>
>> This came up when switching ARC to generic word-at-a-time interface
>>
>> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
> This looks like a useful change, but I think we can do even better: It
> seems that
> there are no  callers of __strnlen_user or __strncpy_from_user  in the
> kernel today, so these should not be defined either when the Kconfig symbols
> are set. Also, I would suggest moving the 'extern' declaration for the two
> functions into the #else branch of the conditional so it does not need to be
> duplicated.

Given where things seem to be heading, do u still want an updated patch for this
or do u plan to ditch the inline version in short term anyways.

-Vineet


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use
  2020-01-15 23:01     ` Vineet Gupta
@ 2020-01-16 11:43       ` Arnd Bergmann
  0 siblings, 0 replies; 21+ messages in thread
From: Arnd Bergmann @ 2020-01-16 11:43 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Khalid Aziz, Andrey Konovalov, Andrew Morton, Peter Zijlstra,
	Christian Brauner, Kees Cook, Ingo Molnar, Aleksa Sarai,
	Linus Torvalds, open list:SYNOPSYS ARC ARCHITECTURE,
	linux-kernel, linux-arch

On Thu, Jan 16, 2020 at 12:02 AM Vineet Gupta
<Vineet.Gupta1@synopsys.com> wrote:
> On 1/14/20 12:57 PM, Arnd Bergmann wrote:
> > On Tue, Jan 14, 2020 at 9:08 PM Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
> >> There are 2 generic varaints of strncpy_from_user() / strnlen_user()
> >>  (1). inline version in asm-generic/uaccess.h
> >>  (2). optimized word-at-a-time version in lib/*
> >>
> >> This patch disables #1 if #2 selected. This allows arches to continue
> >> reusing asm-generic/uaccess.h for rest of code
> >>
> >> This came up when switching ARC to generic word-at-a-time interface
> >>
> >> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
> > This looks like a useful change, but I think we can do even better: It
> > seems that
> > there are no  callers of __strnlen_user or __strncpy_from_user  in the
> > kernel today, so these should not be defined either when the Kconfig symbols
> > are set. Also, I would suggest moving the 'extern' declaration for the two
> > functions into the #else branch of the conditional so it does not need to be
> > duplicated.
>
> Given where things seem to be heading, do u still want an updated patch for this
> or do u plan to ditch the inline version in short term anyways.

I'm trying to come up with a cleanup series now that I'll send you.
You can base on top of that if you like.

     Arnd

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, back to index

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-14 20:08 [RFC 0/4] Switching ARC to optimized generic strncpy_from_user Vineet Gupta
2020-01-14 20:08 ` [RFC 1/4] asm-generic/uaccess: don't define inline functions if noinline lib/* in use Vineet Gupta
2020-01-14 20:57   ` Arnd Bergmann
2020-01-15 23:01     ` Vineet Gupta
2020-01-16 11:43       ` Arnd Bergmann
2020-01-14 21:32   ` Linus Torvalds
2020-01-15  9:08     ` Arnd Bergmann
2020-01-15 14:12       ` Al Viro
2020-01-15 14:21         ` Arnd Bergmann
2020-01-14 20:08 ` [RFC 2/4] lib/strncpy_from_user: Remove redundant user space pointer range check Vineet Gupta
2020-01-14 21:22   ` Linus Torvalds
2020-01-14 21:52     ` Vineet Gupta
2020-01-14 23:46     ` Al Viro
2020-01-15 14:42   ` Andrey Konovalov
2020-01-15 23:00     ` Vineet Gupta
2020-01-14 20:08 ` [RFC 3/4] ARC: uaccess: remove noinline variants of __strncpy_from_user() and friends Vineet Gupta
2020-01-14 20:08 ` [RFC 4/4] ARC: uaccess: use optimized generic __strnlen_user/__strncpy_from_user Vineet Gupta
2020-01-14 20:42   ` Arnd Bergmann
2020-01-14 21:36     ` Vineet Gupta
2020-01-14 21:49       ` Linus Torvalds
2020-01-14 22:14         ` Vineet Gupta

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git