* [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good
@ 2022-02-14 16:34 Arnd Bergmann
  2022-02-14 16:34 ` [PATCH 01/14] uaccess: fix integer overflow on access_ok() Arnd Bergmann
                   ` (14 more replies)
  0 siblings, 15 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

Christoph Hellwig and a few others spent a huge effort on removing
set_fs() from most of the important architectures, but about half of
the other architectures were never converted, even though most of them
don't actually use set_fs() at all.

I wrote a patch for microblaze at some point, which turned out to be
fairly generic, and have now ported it to most other architectures, using
new generic implementations of access_ok() and __{get,put}_kernel_nofault().

Three architectures (sparc64, ia64, and sh) needed some extra work,
which I also completed.

The final series contains extra cleanup changes that touch all
architectures. Please review and test these, so we can merge them
for v5.18.

The series is available at
https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/log/?h=set_fs
for testing.

       Arnd

Arnd Bergmann (14):
  uaccess: fix integer overflow on access_ok()
  sparc64: add __{get,put}_kernel_nocheck()
  nds32: fix access_ok() checks in get/put_user
  x86: use more conventional access_ok() definition
  uaccess: add generic __{get,put}_kernel_nofault
  mips: use simpler access_ok()
  uaccess: generalize access_ok()
  arm64: simplify access_ok()
  m68k: drop custom __access_ok()
  uaccess: remove most CONFIG_SET_FS users
  sparc64: remove CONFIG_SET_FS support
  sh: remove CONFIG_SET_FS support
  ia64: remove CONFIG_SET_FS support
  uaccess: drop set_fs leftovers

 arch/Kconfig                              |   3 -
 arch/alpha/Kconfig                        |   1 -
 arch/alpha/include/asm/processor.h        |   4 -
 arch/alpha/include/asm/thread_info.h      |   2 -
 arch/alpha/include/asm/uaccess.h          |  53 +--------
 arch/arc/Kconfig                          |   1 -
 arch/arc/include/asm/segment.h            |  20 ----
 arch/arc/include/asm/thread_info.h        |   3 -
 arch/arc/include/asm/uaccess.h            |  30 -----
 arch/arm/include/asm/uaccess.h            |  22 +---
 arch/arm/kernel/swp_emulate.c             |   2 +-
 arch/arm/kernel/traps.c                   |   2 +-
 arch/arm/lib/uaccess_with_memcpy.c        |  10 --
 arch/arm64/include/asm/uaccess.h          |  29 +----
 arch/csky/Kconfig                         |   1 -
 arch/csky/include/asm/processor.h         |   2 -
 arch/csky/include/asm/segment.h           |  10 --
 arch/csky/include/asm/thread_info.h       |   2 -
 arch/csky/include/asm/uaccess.h           |  12 --
 arch/csky/kernel/asm-offsets.c            |   1 -
 arch/csky/kernel/signal.c                 |   2 +-
 arch/h8300/Kconfig                        |   1 -
 arch/h8300/include/asm/processor.h        |   1 -
 arch/h8300/include/asm/segment.h          |  40 -------
 arch/h8300/include/asm/thread_info.h      |   3 -
 arch/h8300/kernel/entry.S                 |   1 -
 arch/h8300/kernel/head_ram.S              |   1 -
 arch/h8300/mm/init.c                      |   6 -
 arch/h8300/mm/memory.c                    |   1 -
 arch/hexagon/Kconfig                      |   1 -
 arch/hexagon/include/asm/thread_info.h    |   6 -
 arch/hexagon/include/asm/uaccess.h        |  25 ----
 arch/hexagon/kernel/process.c             |   1 -
 arch/ia64/Kconfig                         |   1 -
 arch/ia64/include/asm/processor.h         |   4 -
 arch/ia64/include/asm/thread_info.h       |   2 -
 arch/ia64/include/asm/uaccess.h           |  26 ++---
 arch/ia64/kernel/unaligned.c              |  60 ++++++----
 arch/m68k/include/asm/uaccess.h           |  14 +--
 arch/microblaze/Kconfig                   |   1 -
 arch/microblaze/include/asm/thread_info.h |   6 -
 arch/microblaze/include/asm/uaccess.h     |  43 +------
 arch/microblaze/kernel/asm-offsets.c      |   1 -
 arch/microblaze/kernel/process.c          |   1 -
 arch/mips/include/asm/uaccess.h           |  47 +-------
 arch/nds32/Kconfig                        |   1 -
 arch/nds32/include/asm/thread_info.h      |   4 -
 arch/nds32/include/asm/uaccess.h          |  40 +++----
 arch/nds32/kernel/process.c               |   5 +-
 arch/nds32/mm/alignment.c                 |   3 -
 arch/nios2/Kconfig                        |   1 -
 arch/nios2/include/asm/thread_info.h      |   9 --
 arch/nios2/include/asm/uaccess.h          |  23 +---
 arch/nios2/kernel/signal.c                |  20 ++--
 arch/openrisc/Kconfig                     |   1 -
 arch/openrisc/include/asm/thread_info.h   |   7 --
 arch/openrisc/include/asm/uaccess.h       |  42 +------
 arch/parisc/include/asm/futex.h           |   2 +-
 arch/parisc/include/asm/uaccess.h         |  11 +-
 arch/parisc/lib/memcpy.c                  |   2 +-
 arch/powerpc/include/asm/uaccess.h        |  13 +--
 arch/powerpc/lib/sstep.c                  |   4 +-
 arch/riscv/include/asm/uaccess.h          |  33 +-----
 arch/riscv/kernel/perf_callchain.c        |   2 +-
 arch/s390/include/asm/uaccess.h           |  13 +--
 arch/sh/Kconfig                           |   1 -
 arch/sh/include/asm/processor.h           |   1 -
 arch/sh/include/asm/segment.h             |  33 ------
 arch/sh/include/asm/thread_info.h         |   2 -
 arch/sh/include/asm/uaccess.h             |  24 +---
 arch/sh/kernel/io_trapped.c               |   9 +-
 arch/sh/kernel/process_32.c               |   2 -
 arch/sh/kernel/traps_32.c                 |  30 +++--
 arch/sparc/Kconfig                        |   1 -
 arch/sparc/include/asm/processor_32.h     |   6 -
 arch/sparc/include/asm/processor_64.h     |   4 -
 arch/sparc/include/asm/switch_to_64.h     |   4 +-
 arch/sparc/include/asm/thread_info_64.h   |   4 +-
 arch/sparc/include/asm/uaccess.h          |   3 -
 arch/sparc/include/asm/uaccess_32.h       |  31 +----
 arch/sparc/include/asm/uaccess_64.h       | 135 +++++++++++++---------
 arch/sparc/kernel/process_32.c            |   2 -
 arch/sparc/kernel/process_64.c            |  12 --
 arch/sparc/kernel/signal_32.c             |   2 +-
 arch/sparc/kernel/traps_64.c              |   2 -
 arch/sparc/lib/NGmemcpy.S                 |   3 +-
 arch/sparc/mm/init_64.c                   |   3 -
 arch/um/include/asm/uaccess.h             |   7 +-
 arch/x86/include/asm/uaccess.h            |  44 ++-----
 arch/xtensa/Kconfig                       |   1 -
 arch/xtensa/include/asm/asm-uaccess.h     |  71 ------------
 arch/xtensa/include/asm/processor.h       |   7 --
 arch/xtensa/include/asm/thread_info.h     |   3 -
 arch/xtensa/include/asm/uaccess.h         |  26 +----
 arch/xtensa/kernel/asm-offsets.c          |   3 -
 drivers/hid/uhid.c                        |   2 +-
 drivers/scsi/sg.c                         |   5 -
 fs/exec.c                                 |   6 -
 include/asm-generic/access_ok.h           |  51 ++++++++
 include/asm-generic/uaccess.h             |  46 +-------
 include/linux/syscalls.h                  |   4 -
 include/linux/uaccess.h                   |  59 +++-------
 include/rdma/ib.h                         |   2 +-
 kernel/events/callchain.c                 |   4 -
 kernel/events/core.c                      |   3 -
 kernel/exit.c                             |  14 ---
 kernel/kthread.c                          |   5 -
 kernel/stacktrace.c                       |   3 -
 kernel/trace/bpf_trace.c                  |   4 -
 mm/maccess.c                              | 119 -------------------
 mm/memory.c                               |   8 --
 net/bpfilter/bpfilter_kern.c              |   2 +-
 112 files changed, 315 insertions(+), 1239 deletions(-)
 delete mode 100644 arch/arc/include/asm/segment.h
 delete mode 100644 arch/csky/include/asm/segment.h
 delete mode 100644 arch/h8300/include/asm/segment.h
 delete mode 100644 arch/sh/include/asm/segment.h
 create mode 100644 include/asm-generic/access_ok.h

-- 
2.29.2

Cc: linux@armlinux.org.uk
Cc: will@kernel.org
Cc: guoren@kernel.org
Cc: bcain@codeaurora.org
Cc: geert@linux-m68k.org
Cc: monstr@monstr.eu
Cc: tsbogend@alpha.franken.de
Cc: nickhu@andestech.com
Cc: green.hu@gmail.com
Cc: dinguyen@kernel.org
Cc: shorne@gmail.com
Cc: deller@gmx.de
Cc: mpe@ellerman.id.au
Cc: peterz@infradead.org
Cc: mingo@redhat.com
Cc: mark.rutland@arm.com
Cc: hca@linux.ibm.com
Cc: dalias@libc.org
Cc: davem@davemloft.net
Cc: richard@nod.at
Cc: x86@kernel.org
Cc: jcmvbkbc@gmail.com
Cc: ebiederm@xmission.com
Cc: arnd@arndb.de
Cc: akpm@linux-foundation.org
Cc: ardb@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Cc: linux-mm@kvack.org


* [PATCH 01/14] uaccess: fix integer overflow on access_ok()
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 16:58   ` Christoph Hellwig
  2022-02-14 16:34 ` [PATCH 02/14] sparc64: add __{get,put}_kernel_nocheck() Arnd Bergmann
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, stable, dinguyen,
	David Laight, ebiederm, richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

Three architectures check the end of a user access against the
address limit without taking a possible overflow into account.
Passing a negative length, or any other value that makes the
addition wrap around, returns success when it should not.

Use the most common correct implementation here, which optimizes
for a constant 'size' argument, and turns the common case into a
single comparison.
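
To illustrate the failure mode, here is a minimal user-space sketch (not
kernel code). LIMIT_SKETCH, access_ok_wrapping(), access_ok_fixed() and
access_ok_demo() are hypothetical names used only for this example; the
first check mirrors the old csky-style test, the second the form used in
this patch.

```c
#include <assert.h>

/* Hypothetical address limit, chosen only for illustration. */
#define LIMIT_SKETCH 0x80000000UL

/* Old csky-style check: addr + size can wrap around to a small value. */
static int access_ok_wrapping(unsigned long addr, unsigned long size)
{
	return (addr < LIMIT_SKETCH) && ((addr + size) < LIMIT_SKETCH);
}

/* Check used by this patch: subtract from the limit, never add to addr. */
static int access_ok_fixed(unsigned long addr, unsigned long size)
{
	return (size <= LIMIT_SKETCH) && (addr <= (LIMIT_SKETCH - size));
}

/* Exercise both checks with an ordinary range and a "negative" length. */
static void access_ok_demo(void)
{
	unsigned long huge = (unsigned long)-16L; /* -16 seen as unsigned */

	assert(access_ok_wrapping(0x1000, 64));
	assert(access_ok_fixed(0x1000, 64));

	/* 0x1000 + huge wraps to 0xff0, so the old check wrongly passes. */
	assert(access_ok_wrapping(0x1000, huge));
	assert(!access_ok_fixed(0x1000, huge));
}
```

Because 'size' is compared against the limit first, the subtraction
'limit - size' cannot underflow, which is what makes the second form safe.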

Cc: stable@vger.kernel.org
Fixes: da551281947c ("csky: User access")
Fixes: f663b60f5215 ("microblaze: Fix uaccess_ok macro")
Fixes: 7567746e1c0d ("Hexagon: Add user access functions")
Reported-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/csky/include/asm/uaccess.h       |  7 +++----
 arch/hexagon/include/asm/uaccess.h    | 18 +++++++++---------
 arch/microblaze/include/asm/uaccess.h | 19 ++++---------------
 3 files changed, 16 insertions(+), 28 deletions(-)

diff --git a/arch/csky/include/asm/uaccess.h b/arch/csky/include/asm/uaccess.h
index c40f06ee8d3e..ac5a54f57d40 100644
--- a/arch/csky/include/asm/uaccess.h
+++ b/arch/csky/include/asm/uaccess.h
@@ -3,14 +3,13 @@
 #ifndef __ASM_CSKY_UACCESS_H
 #define __ASM_CSKY_UACCESS_H
 
-#define user_addr_max() \
-	(uaccess_kernel() ? KERNEL_DS.seg : get_fs().seg)
+#define user_addr_max() (current_thread_info()->addr_limit.seg)
 
 static inline int __access_ok(unsigned long addr, unsigned long size)
 {
-	unsigned long limit = current_thread_info()->addr_limit.seg;
+	unsigned long limit = user_addr_max();
 
-	return ((addr < limit) && ((addr + size) < limit));
+	return (size <= limit) && (addr <= (limit - size));
 }
 #define __access_ok __access_ok
 
diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h
index ef5bfef8d490..719ba3f3c45c 100644
--- a/arch/hexagon/include/asm/uaccess.h
+++ b/arch/hexagon/include/asm/uaccess.h
@@ -25,17 +25,17 @@
  * Returns true (nonzero) if the memory block *may* be valid, false (zero)
  * if it is definitely invalid.
  *
- * User address space in Hexagon, like x86, goes to 0xbfffffff, so the
- * simple MSB-based tests used by MIPS won't work.  Some further
- * optimization is probably possible here, but for now, keep it
- * reasonably simple and not *too* slow.  After all, we've got the
- * MMU for backup.
  */
+#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
+#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
 
-#define __access_ok(addr, size) \
-	((get_fs().seg == KERNEL_DS.seg) || \
-	(((unsigned long)addr < get_fs().seg) && \
-	  (unsigned long)size < (get_fs().seg - (unsigned long)addr)))
+static inline int __access_ok(unsigned long addr, unsigned long size)
+{
+	unsigned long limit = TASK_SIZE;
+
+	return (size <= limit) && (addr <= (limit - size));
+}
+#define __access_ok __access_ok
 
 /*
  * When a kernel-mode page fault is taken, the faulting instruction
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index d2a8ef9f8978..5b6e0e7788f4 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -39,24 +39,13 @@
 
 # define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
 
-static inline int access_ok(const void __user *addr, unsigned long size)
+static inline int __access_ok(unsigned long addr, unsigned long size)
 {
-	if (!size)
-		goto ok;
+	unsigned long limit = user_addr_max();
 
-	if ((get_fs().seg < ((unsigned long)addr)) ||
-			(get_fs().seg < ((unsigned long)addr + size - 1))) {
-		pr_devel("ACCESS fail at 0x%08x (size 0x%x), seg 0x%08x\n",
-			(__force u32)addr, (u32)size,
-			(u32)get_fs().seg);
-		return 0;
-	}
-ok:
-	pr_devel("ACCESS OK at 0x%08x (size 0x%x), seg 0x%08x\n",
-			(__force u32)addr, (u32)size,
-			(u32)get_fs().seg);
-	return 1;
+	return (size <= limit) && (addr <= (limit - size));
 }
+#define access_ok(addr, size) __access_ok((unsigned long)addr, size)
 
 # define __FIXUP_SECTION	".section .fixup,\"ax\"\n"
 # define __EX_TABLE_SECTION	".section __ex_table,\"a\"\n"
-- 
2.29.2



* [PATCH 02/14] sparc64: add __{get,put}_kernel_nocheck()
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
  2022-02-14 16:34 ` [PATCH 01/14] uaccess: fix integer overflow on access_ok() Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 16:34 ` [PATCH 03/14] nds32: fix access_ok() checks in get/put_user Arnd Bergmann
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

sparc64 is one of the architectures that use separate address
spaces for kernel and user addresses, so __get_kernel_nofault()
cannot just call into the normal __get_user(), the variant
without the access_ok() check.

Instead duplicate __get_user() and __put_user() into their
in-kernel versions, with minor changes for the calling conventions
and leaving out the address space modifier on the assembler
instruction.

This could surely be written more elegantly, but duplicating it
gets the job done.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/sparc/include/asm/uaccess_64.h | 78 +++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
index 30eb4c6414d1..b283798315b1 100644
--- a/arch/sparc/include/asm/uaccess_64.h
+++ b/arch/sparc/include/asm/uaccess_64.h
@@ -100,6 +100,42 @@ void __retl_efault(void);
 struct __large_struct { unsigned long buf[100]; };
 #define __m(x) ((struct __large_struct *)(x))
 
+#define __put_kernel_nofault(dst, src, type, label)			\
+do {									\
+	type *addr = (type __force *)(dst);				\
+	type data = *(type *)src;					\
+	register int __pu_ret;						\
+	switch (sizeof(type)) {						\
+	case 1: __put_kernel_asm(data, b, addr, __pu_ret); break;	\
+	case 2: __put_kernel_asm(data, h, addr, __pu_ret); break;	\
+	case 4: __put_kernel_asm(data, w, addr, __pu_ret); break;	\
+	case 8: __put_kernel_asm(data, x, addr, __pu_ret); break;	\
+	default: __pu_ret = __put_user_bad(); break;			\
+	}								\
+	if (__pu_ret)							\
+		goto label;						\
+} while (0)
+
+#define __put_kernel_asm(x, size, addr, ret)				\
+__asm__ __volatile__(							\
+		"/* Put kernel asm, inline. */\n"			\
+	"1:\t"	"st"#size " %1, [%2]\n\t"				\
+		"clr	%0\n"						\
+	"2:\n\n\t"							\
+		".section .fixup,#alloc,#execinstr\n\t"			\
+		".align	4\n"						\
+	"3:\n\t"							\
+		"sethi	%%hi(2b), %0\n\t"				\
+		"jmpl	%0 + %%lo(2b), %%g0\n\t"			\
+		" mov	%3, %0\n\n\t"					\
+		".previous\n\t"						\
+		".section __ex_table,\"a\"\n\t"				\
+		".align	4\n\t"						\
+		".word	1b, 3b\n\t"					\
+		".previous\n\n\t"					\
+	       : "=r" (ret) : "r" (x), "r" (__m(addr)),			\
+		 "i" (-EFAULT))
+
 #define __put_user_nocheck(data, addr, size) ({			\
 	register int __pu_ret;					\
 	switch (size) {						\
@@ -134,6 +170,48 @@ __asm__ __volatile__(							\
 
 int __put_user_bad(void);
 
+#define __get_kernel_nofault(dst, src, type, label)			     \
+do {									     \
+	type *addr = (type __force *)(src);		     		     \
+	register int __gu_ret;						     \
+	register unsigned long __gu_val;				     \
+	switch (sizeof(type)) {						     \
+		case 1: __get_kernel_asm(__gu_val, ub, addr, __gu_ret); break; \
+		case 2: __get_kernel_asm(__gu_val, uh, addr, __gu_ret); break; \
+		case 4: __get_kernel_asm(__gu_val, uw, addr, __gu_ret); break; \
+		case 8: __get_kernel_asm(__gu_val, x, addr, __gu_ret); break;  \
+		default:						     \
+			__gu_val = 0;					     \
+			__gu_ret = __get_user_bad();			     \
+			break;						     \
+	} 								     \
+	if (__gu_ret)							     \
+		goto label;						     \
+	*(type *)dst = (__force type) __gu_val;				     \
+} while (0)
+#define __get_kernel_asm(x, size, addr, ret)				\
+__asm__ __volatile__(							\
+		"/* Get kernel asm, inline. */\n"			\
+	"1:\t"	"ld"#size " [%2], %1\n\t"				\
+		"clr	%0\n"						\
+	"2:\n\n\t"							\
+		".section .fixup,#alloc,#execinstr\n\t"			\
+		".align	4\n"						\
+	"3:\n\t"							\
+		"sethi	%%hi(2b), %0\n\t"				\
+		"clr	%1\n\t"						\
+		"jmpl	%0 + %%lo(2b), %%g0\n\t"			\
+		" mov	%3, %0\n\n\t"					\
+		".previous\n\t"						\
+		".section __ex_table,\"a\"\n\t"				\
+		".align	4\n\t"						\
+		".word	1b, 3b\n\n\t"					\
+		".previous\n\t"						\
+	       : "=r" (ret), "=r" (x) : "r" (__m(addr)),		\
+		 "i" (-EFAULT))
+
+#define HAVE_GET_KERNEL_NOFAULT
+
 #define __get_user_nocheck(data, addr, size, type) ({			     \
 	register int __gu_ret;						     \
 	register unsigned long __gu_val;				     \
-- 
2.29.2



* [PATCH 03/14] nds32: fix access_ok() checks in get/put_user
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
  2022-02-14 16:34 ` [PATCH 01/14] uaccess: fix integer overflow on access_ok() Arnd Bergmann
  2022-02-14 16:34 ` [PATCH 02/14] sparc64: add __{get,put}_kernel_nocheck() Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 17:01   ` Christoph Hellwig
  2022-02-14 16:34 ` [PATCH 04/14] x86: use more conventional access_ok() definition Arnd Bergmann
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, stable, dinguyen,
	ebiederm, richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

The get_user()/put_user() functions are meant to check for
access_ok(), while the __get_user()/__put_user() functions
don't.

This broke in 4.19 for nds32, when it gained an extraneous
check in __get_user(), but lost the check it needs in
__put_user().
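
The intended split between the two families can be sketched in plain
user-space C as follows. This is only an illustration of the convention,
not the nds32 code: USER_LIMIT_SKETCH, access_ok_sketch(),
raw_get_sketch() and get_sketch() are hypothetical names, and a byte
array stands in for user memory.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical user address limit for this sketch. */
#define USER_LIMIT_SKETCH 0x1000UL

static int access_ok_sketch(unsigned long addr, unsigned long size)
{
	return (size <= USER_LIMIT_SKETCH) &&
	       (addr <= (USER_LIMIT_SKETCH - size));
}

/* Unchecked variant: the caller must already have validated 'addr'. */
static int raw_get_sketch(unsigned char *val, const unsigned char *mem,
			  unsigned long addr)
{
	*val = mem[addr];
	return 0;
}

/* Checked variant: validate first, then fall through to the raw access. */
static int get_sketch(unsigned char *val, const unsigned char *mem,
		      unsigned long addr)
{
	if (!access_ok_sketch(addr, sizeof(*val)))
		return -EFAULT;
	return raw_get_sketch(val, mem, addr);
}
```

Here get_sketch() plays the role of get_user() and raw_get_sketch() the
role of __get_user().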

Fixes: 487913ab18c2 ("nds32: Extract the checking and getting pointer to a macro")
Cc: stable@vger.kernel.org @ v4.19+
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/nds32/include/asm/uaccess.h | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/nds32/include/asm/uaccess.h b/arch/nds32/include/asm/uaccess.h
index d4cbf069dc22..37a40981deb3 100644
--- a/arch/nds32/include/asm/uaccess.h
+++ b/arch/nds32/include/asm/uaccess.h
@@ -70,9 +70,7 @@ static inline void set_fs(mm_segment_t fs)
  * versions are void (ie, don't return a value as such).
  */
 
-#define get_user	__get_user					\
-
-#define __get_user(x, ptr)						\
+#define get_user(x, ptr)						\
 ({									\
 	long __gu_err = 0;						\
 	__get_user_check((x), (ptr), __gu_err);				\
@@ -85,6 +83,14 @@ static inline void set_fs(mm_segment_t fs)
 	(void)0;							\
 })
 
+#define __get_user(x, ptr)						\
+({									\
+	long __gu_err = 0;						\
+	const __typeof__(*(ptr)) __user *__p = (ptr);			\
+	__get_user_err((x), __p, (__gu_err));				\
+	__gu_err;							\
+})
+
 #define __get_user_check(x, ptr, err)					\
 ({									\
 	const __typeof__(*(ptr)) __user *__p = (ptr);			\
@@ -165,12 +171,18 @@ do {									\
 		: "r"(addr), "i"(-EFAULT)				\
 		: "cc")
 
-#define put_user	__put_user					\
+#define put_user(x, ptr)						\
+({									\
+	long __pu_err = 0;						\
+	__put_user_check((x), (ptr), __pu_err);				\
+	__pu_err;							\
+})
 
 #define __put_user(x, ptr)						\
 ({									\
 	long __pu_err = 0;						\
-	__put_user_err((x), (ptr), __pu_err);				\
+	__typeof__(*(ptr)) __user *__p = (ptr);				\
+	__put_user_err((x), __p, __pu_err);				\
 	__pu_err;							\
 })
 
-- 
2.29.2



* [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (2 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 03/14] nds32: fix access_ok() checks in get/put_user Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 17:02   ` Christoph Hellwig
  2022-02-14 16:34 ` [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault Arnd Bergmann
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

The way that access_ok() is defined on x86 is slightly different from
most other architectures, and a bit more complex.

The generic version tends to result in the best output on all
architectures, as it results in a single comparison against a constant
limit for calls with a known size.
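
As a sanity check that the two formulations accept the same ranges, the
following user-space sketch compares the non-constant path of the removed
__chk_range_not_ok() with the new expression. LIMIT_X86_SKETCH,
range_not_ok_old(), access_ok_new() and compare_forms() are hypothetical
names, and the limit value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Arbitrary limit for illustration; not the real TASK_SIZE_MAX. */
#define LIMIT_X86_SKETCH 0x7ffff000UL

/* Old logic for a non-constant size, following the removed lines. */
static bool range_not_ok_old(unsigned long addr, unsigned long size,
			     unsigned long limit)
{
	addr += size;
	if (addr < size)	/* addr + size wrapped around */
		return true;
	return addr > limit;
}

/* New form: size is checked first, so limit - size cannot underflow. */
static bool access_ok_new(unsigned long addr, unsigned long size,
			  unsigned long limit)
{
	return (size <= limit) && (addr <= (limit - size));
}

/* Check that both forms agree on a handful of edge cases. */
static void compare_forms(void)
{
	const unsigned long cases[][2] = {
		{ 0, 0 }, { 0, LIMIT_X86_SKETCH }, { 1, LIMIT_X86_SKETCH },
		{ LIMIT_X86_SKETCH, 0 }, { LIMIT_X86_SKETCH, 1 },
		{ 0x1000, 8 }, { 0x1000, ~0UL }, { ~0UL, 1 },
	};
	unsigned int i;

	for (i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
		assert(access_ok_new(cases[i][0], cases[i][1],
				     LIMIT_X86_SKETCH) ==
		       !range_not_ok_old(cases[i][0], cases[i][1],
					 LIMIT_X86_SKETCH));
}
```

With a constant 'size' and 'limit', the compiler can fold 'size <= limit'
and 'limit - size' at build time, leaving only the comparison on 'addr'.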

There are a few callers of __range_not_ok(), all of which use TASK_SIZE
as the limit rather than TASK_SIZE_MAX, but I could not see any reason
for picking this. Changing these to call __access_ok() instead uses the
default limit, but keeps the behavior otherwise.

x86 is the only architecture with a WARN_ON_IN_IRQ() checking
access_ok(), but it's probably best to leave that in place.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/x86/include/asm/uaccess.h | 38 +++++++++++-----------------------
 1 file changed, 12 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index ac96f9b2d64b..6956a63291b6 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -16,30 +16,13 @@
  * Test whether a block of memory is a valid user space address.
  * Returns 0 if the range is valid, nonzero otherwise.
  */
-static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, unsigned long limit)
+static inline bool __access_ok(void __user *ptr, unsigned long size)
 {
-	/*
-	 * If we have used "sizeof()" for the size,
-	 * we know it won't overflow the limit (but
-	 * it might overflow the 'addr', so it's
-	 * important to subtract the size from the
-	 * limit, not add it to the address).
-	 */
-	if (__builtin_constant_p(size))
-		return unlikely(addr > limit - size);
-
-	/* Arbitrary sizes? Be careful about overflow */
-	addr += size;
-	if (unlikely(addr < size))
-		return true;
-	return unlikely(addr > limit);
-}
+	unsigned long limit = TASK_SIZE_MAX;
+	unsigned long addr = (unsigned long __force)ptr;
 
-#define __range_not_ok(addr, size, limit)				\
-({									\
-	__chk_user_ptr(addr);						\
-	__chk_range_not_ok((unsigned long __force)(addr), size, limit); \
-})
+	return (size <= limit) && (addr <= (limit - size));
+}
 
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 static inline bool pagefault_disabled(void);
@@ -66,12 +49,15 @@ static inline bool pagefault_disabled(void);
  * Return: true (nonzero) if the memory block may be valid, false (zero)
  * if it is definitely invalid.
  */
-#define access_ok(addr, size)					\
-({									\
-	WARN_ON_IN_IRQ();						\
-	likely(!__range_not_ok(addr, size, TASK_SIZE_MAX));		\
+#define access_ok(addr, size)		\
+({					\
+	WARN_ON_IN_IRQ();		\
+	likely(__access_ok(addr, size));\
 })
 
+#define __range_not_ok(addr, size, limit)	(!__access_ok(addr, size))
+#define __chk_range_not_ok(addr, size, limit)	(!__access_ok((void __user *)addr, size))
+
 extern int __get_user_1(void);
 extern int __get_user_2(void);
 extern int __get_user_4(void);
-- 
2.29.2



* [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (3 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 04/14] x86: use more conventional access_ok() definition Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 17:02   ` Christoph Hellwig
  2022-02-15  0:31   ` Al Viro
  2022-02-14 16:34 ` [PATCH 06/14] mips: use simpler access_ok() Arnd Bergmann
                   ` (9 subsequent siblings)
  14 siblings, 2 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

All architectures that don't provide __{get,put}_kernel_nofault() yet
can implement this on top of __{get,put}_user.

Add a generic version that lets everything use the normal
copy_{from,to}_kernel_nofault() code based on these, removing the last
use of get_fs()/set_fs() from architecture-independent code.
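
The general idea of layering a goto-label accessor on top of an
error-returning one can be sketched in user space as below. This is an
assumption-laden illustration rather than the code added by this patch:
fake_get_bytes() stands in for __get_user(), a NULL source stands in for
a faulting address, and get_nofault_sketch() and read_long_sketch() are
hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Stand-in for __get_user(): returns 0 on success, -EFAULT on failure.
 * A NULL source plays the role of a faulting address in this sketch.
 */
static int fake_get_bytes(void *dst, const void *src, unsigned long len)
{
	if (!src)
		return -EFAULT;
	memcpy(dst, src, len);
	return 0;
}

/* Turn the error-returning accessor into the goto-label convention. */
#define get_nofault_sketch(dst, src, type, label)			\
do {									\
	type data_;							\
	if (fake_get_bytes(&data_, (src), sizeof(type)))		\
		goto label;						\
	*(type *)(dst) = data_;						\
} while (0)

/* Caller in the style of copy_from_kernel_nofault(): one exit label. */
static long read_long_sketch(long *dst, const long *src)
{
	get_nofault_sketch(dst, src, long, Efault);
	return 0;
Efault:
	return -EFAULT;
}
```

The destination is written only after the access succeeded, so a failed
read leaves the caller's buffer untouched.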

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/include/asm/uaccess.h      |   2 -
 arch/arm64/include/asm/uaccess.h    |   2 -
 arch/m68k/include/asm/uaccess.h     |   2 -
 arch/mips/include/asm/uaccess.h     |   2 -
 arch/parisc/include/asm/uaccess.h   |   1 -
 arch/powerpc/include/asm/uaccess.h  |   2 -
 arch/riscv/include/asm/uaccess.h    |   2 -
 arch/s390/include/asm/uaccess.h     |   2 -
 arch/sparc/include/asm/uaccess_64.h |   2 -
 arch/um/include/asm/uaccess.h       |   2 -
 arch/x86/include/asm/uaccess.h      |   2 -
 include/asm-generic/uaccess.h       |   2 -
 include/linux/uaccess.h             |  19 +++++
 mm/maccess.c                        | 108 ----------------------------
 14 files changed, 19 insertions(+), 131 deletions(-)

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 32dbfd81f42a..d20d78c34b94 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -476,8 +476,6 @@ do {									\
 	: "r" (x), "i" (-EFAULT)				\
 	: "cc")
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 #define __get_kernel_nofault(dst, src, type, err_label)			\
 do {									\
 	const type *__pk_ptr = (src);					\
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 3a5ff5e20586..2e20879fe3cf 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -26,8 +26,6 @@
 #include <asm/memory.h>
 #include <asm/extable.h>
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 /*
  * Test whether a block of memory is a valid user space address.
  * Returns 1 if the range is valid, 0 otherwise.
diff --git a/arch/m68k/include/asm/uaccess.h b/arch/m68k/include/asm/uaccess.h
index ba670523885c..79617c0b2f91 100644
--- a/arch/m68k/include/asm/uaccess.h
+++ b/arch/m68k/include/asm/uaccess.h
@@ -390,8 +390,6 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 #define INLINE_COPY_FROM_USER
 #define INLINE_COPY_TO_USER
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 #define __get_kernel_nofault(dst, src, type, err_label)			\
 do {									\
 	type *__gk_dst = (type *)(dst);					\
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index f8f74f9f5883..db9a8e002b62 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -296,8 +296,6 @@ struct __large_struct { unsigned long buf[100]; };
 	(val) = __gu_tmp.t;						\
 }
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 #define __get_kernel_nofault(dst, src, type, err_label)			\
 do {									\
 	int __gu_err;							\
diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h
index ebf8a845b017..0925bbd6db67 100644
--- a/arch/parisc/include/asm/uaccess.h
+++ b/arch/parisc/include/asm/uaccess.h
@@ -95,7 +95,6 @@ struct exception_table_entry {
 	(val) = (__force __typeof__(*(ptr))) __gu_val;	\
 }
 
-#define HAVE_GET_KERNEL_NOFAULT
 #define __get_kernel_nofault(dst, src, type, err_label)	\
 {							\
 	type __z;					\
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 63316100080c..a0032c2e7550 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -467,8 +467,6 @@ do {									\
 		unsafe_put_user(*(u8*)(_src + _i), (u8 __user *)(_dst + _i), e); \
 } while (0)
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 #define __get_kernel_nofault(dst, src, type, err_label)			\
 	__get_user_size_goto(*((type *)(dst)),				\
 		(__force type __user *)(src), sizeof(type), err_label)
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index c701a5e57a2b..4407b9e48d2c 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -346,8 +346,6 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n)
 		__clear_user(to, n) : n;
 }
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 #define __get_kernel_nofault(dst, src, type, err_label)			\
 do {									\
 	long __kr_err;							\
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index d74e26b48604..29332edf46f0 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -282,8 +282,6 @@ static inline unsigned long __must_check clear_user(void __user *to, unsigned lo
 int copy_to_user_real(void __user *dest, void *src, unsigned long count);
 void *s390_kernel_write(void *dst, const void *src, size_t size);
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 int __noreturn __put_kernel_bad(void);
 
 #define __put_kernel_asm(val, to, insn)					\
diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
index b283798315b1..5c12fb46bc61 100644
--- a/arch/sparc/include/asm/uaccess_64.h
+++ b/arch/sparc/include/asm/uaccess_64.h
@@ -210,8 +210,6 @@ __asm__ __volatile__(							\
 	       : "=r" (ret), "=r" (x) : "r" (__m(addr)),		\
 		 "i" (-EFAULT))
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 #define __get_user_nocheck(data, addr, size, type) ({			     \
 	register int __gu_ret;						     \
 	register unsigned long __gu_val;				     \
diff --git a/arch/um/include/asm/uaccess.h b/arch/um/include/asm/uaccess.h
index 17d18cfd82a5..1ecfc96bcc50 100644
--- a/arch/um/include/asm/uaccess.h
+++ b/arch/um/include/asm/uaccess.h
@@ -44,8 +44,6 @@ static inline int __access_ok(unsigned long addr, unsigned long size)
 }
 
 /* no pagefaults for kernel addresses in um */
-#define HAVE_GET_KERNEL_NOFAULT 1
-
 #define __get_kernel_nofault(dst, src, type, err_label)			\
 do {									\
 	*((type *)dst) = get_unaligned((type *)(src));			\
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 6956a63291b6..c6d9dc42724d 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -510,8 +510,6 @@ do {									\
 	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label);	\
 } while (0)
 
-#define HAVE_GET_KERNEL_NOFAULT
-
 #ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
 #define __get_kernel_nofault(dst, src, type, err_label)			\
 	__get_user_size(*((type *)(dst)), (__force type __user *)(src),	\
diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index 10ffa8b5c117..0870fa11a7c5 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -77,8 +77,6 @@ do {									\
 		goto err_label;						\
 } while (0)
 
-#define HAVE_GET_KERNEL_NOFAULT 1
-
 static inline __must_check unsigned long
 raw_copy_from_user(void *to, const void __user * from, unsigned long n)
 {
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index ac0394087f7d..67e9bc94dc40 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -368,6 +368,25 @@ long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr,
 		long count);
 long strnlen_user_nofault(const void __user *unsafe_addr, long count);
 
+#ifndef __get_kernel_nofault
+#define __get_kernel_nofault(dst, src, type, label)	\
+do {							\
+	type __user *p = (type __force __user *)(src);	\
+	type data;					\
+	if (__get_user(data, p))			\
+		goto label;				\
+	*(type *)dst = data;				\
+} while (0)
+
+#define __put_kernel_nofault(dst, src, type, label)	\
+do {							\
+	type __user *p = (type __force __user *)(dst);	\
+	type data = *(type *)src;			\
+	if (__put_user(data, p))			\
+		goto label;				\
+} while (0)
+#endif
+
 /**
  * get_kernel_nofault(): safely attempt to read from a location
  * @val: read into this variable
diff --git a/mm/maccess.c b/mm/maccess.c
index d3f1a1f0b1c1..cbd1b3959af2 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -12,8 +12,6 @@ bool __weak copy_from_kernel_nofault_allowed(const void *unsafe_src,
 	return true;
 }
 
-#ifdef HAVE_GET_KERNEL_NOFAULT
-
 #define copy_from_kernel_nofault_loop(dst, src, len, type, err_label)	\
 	while (len >= sizeof(type)) {					\
 		__get_kernel_nofault(dst, src, type, err_label);		\
@@ -102,112 +100,6 @@ long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, long count)
 	dst[-1] = '\0';
 	return -EFAULT;
 }
-#else /* HAVE_GET_KERNEL_NOFAULT */
-/**
- * copy_from_kernel_nofault(): safely attempt to read from kernel-space
- * @dst: pointer to the buffer that shall take the data
- * @src: address to read from
- * @size: size of the data chunk
- *
- * Safely read from kernel address @src to the buffer at @dst.  If a kernel
- * fault happens, handle that and return -EFAULT.  If @src is not a valid kernel
- * address, return -ERANGE.
- *
- * We ensure that the copy_from_user is executed in atomic context so that
- * do_page_fault() doesn't attempt to take mmap_lock.  This makes
- * copy_from_kernel_nofault() suitable for use within regions where the caller
- * already holds mmap_lock, or other locks which nest inside mmap_lock.
- */
-long copy_from_kernel_nofault(void *dst, const void *src, size_t size)
-{
-	long ret;
-	mm_segment_t old_fs = get_fs();
-
-	if (!copy_from_kernel_nofault_allowed(src, size))
-		return -ERANGE;
-
-	set_fs(KERNEL_DS);
-	pagefault_disable();
-	ret = __copy_from_user_inatomic(dst, (__force const void __user *)src,
-			size);
-	pagefault_enable();
-	set_fs(old_fs);
-
-	if (ret)
-		return -EFAULT;
-	return 0;
-}
-EXPORT_SYMBOL_GPL(copy_from_kernel_nofault);
-
-/**
- * copy_to_kernel_nofault(): safely attempt to write to a location
- * @dst: address to write to
- * @src: pointer to the data that shall be written
- * @size: size of the data chunk
- *
- * Safely write to address @dst from the buffer at @src.  If a kernel fault
- * happens, handle that and return -EFAULT.
- */
-long copy_to_kernel_nofault(void *dst, const void *src, size_t size)
-{
-	long ret;
-	mm_segment_t old_fs = get_fs();
-
-	set_fs(KERNEL_DS);
-	pagefault_disable();
-	ret = __copy_to_user_inatomic((__force void __user *)dst, src, size);
-	pagefault_enable();
-	set_fs(old_fs);
-
-	if (ret)
-		return -EFAULT;
-	return 0;
-}
-
-/**
- * strncpy_from_kernel_nofault: - Copy a NUL terminated string from unsafe
- *				 address.
- * @dst:   Destination address, in kernel space.  This buffer must be at
- *         least @count bytes long.
- * @unsafe_addr: Unsafe address.
- * @count: Maximum number of bytes to copy, including the trailing NUL.
- *
- * Copies a NUL-terminated string from unsafe address to kernel buffer.
- *
- * On success, returns the length of the string INCLUDING the trailing NUL.
- *
- * If access fails, returns -EFAULT (some data may have been copied and the
- * trailing NUL added).  If @unsafe_addr is not a valid kernel address, return
- * -ERANGE.
- *
- * If @count is smaller than the length of the string, copies @count-1 bytes,
- * sets the last byte of @dst buffer to NUL and returns @count.
- */
-long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, long count)
-{
-	mm_segment_t old_fs = get_fs();
-	const void *src = unsafe_addr;
-	long ret;
-
-	if (unlikely(count <= 0))
-		return 0;
-	if (!copy_from_kernel_nofault_allowed(unsafe_addr, count))
-		return -ERANGE;
-
-	set_fs(KERNEL_DS);
-	pagefault_disable();
-
-	do {
-		ret = __get_user(*dst++, (const char __user __force *)src++);
-	} while (dst[-1] && ret == 0 && src - unsafe_addr < count);
-
-	dst[-1] = '\0';
-	pagefault_enable();
-	set_fs(old_fs);
-
-	return ret ? -EFAULT : src - unsafe_addr;
-}
-#endif /* HAVE_GET_KERNEL_NOFAULT */
 
 /**
  * copy_from_user_nofault(): safely attempt to read from a user-space location
-- 
2.29.2



* [PATCH 06/14] mips: use simpler access_ok()
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (4 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 16:34 ` [PATCH 07/14] uaccess: generalize access_ok() Arnd Bergmann
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

Before unifying the mips version of __access_ok() with the generic
code, this converts it to the same algorithm. This is a change in
behavior on mips64, as now any address in the user segment, the lower
2^62 bytes, is taken to be valid, relying on a page fault for
addresses that are within that segment but not valid on that CPU.

The new version should be the most efficient way to do this, but
it gets rid of the special handling for size=0 that most other
architectures ignore as well.
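
As an illustration only, the converted check can be modelled in user
space. This is a minimal sketch, not the kernel code: model_access_ok()
and MODEL_LIMIT are hypothetical names, and the limit is assumed to be
the 32-bit __UA_LIMIT value 0x80000000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TASK_SIZE_MAX on 32-bit mips (0x80000000). */
#define MODEL_LIMIT UINT32_C(0x80000000)

/*
 * User-space model of the new check. Testing size first guarantees that
 * limit - size cannot wrap, so addr + size is never computed and cannot
 * overflow.
 */
static inline bool model_access_ok(uint32_t addr, uint32_t size)
{
	const uint32_t limit = MODEL_LIMIT;

	return size <= limit && addr <= limit - size;
}
```

In this model, model_access_ok(0xfffffff0, 0x20) is rejected even though
the 32-bit sum of the two arguments would wrap around to 0x10. A
zero-sized access at exactly the limit is accepted, which is one place
where the dropped size=0 handling becomes visible.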

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/mips/include/asm/uaccess.h | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index db9a8e002b62..d7c89dc3426c 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -19,6 +19,7 @@
 #ifdef CONFIG_32BIT
 
 #define __UA_LIMIT 0x80000000UL
+#define TASK_SIZE_MAX	__UA_LIMIT
 
 #define __UA_ADDR	".word"
 #define __UA_LA		"la"
@@ -33,6 +34,7 @@
 extern u64 __ua_limit;
 
 #define __UA_LIMIT	__ua_limit
+#define TASK_SIZE_MAX	XKSSEG
 
 #define __UA_ADDR	".dword"
 #define __UA_LA		"dla"
@@ -42,22 +44,6 @@ extern u64 __ua_limit;
 
 #endif /* CONFIG_64BIT */
 
-/*
- * Is a address valid? This does a straightforward calculation rather
- * than tests.
- *
- * Address valid if:
- *  - "addr" doesn't have any high-bits set
- *  - AND "size" doesn't have any high-bits set
- *  - AND "addr+size" doesn't have any high-bits set
- *  - OR we are in kernel mode.
- *
- * __ua_size() is a trick to avoid runtime checking of positive constant
- * sizes; for those we already know at compile time that the size is ok.
- */
-#define __ua_size(size)							\
-	((__builtin_constant_p(size) && (signed long) (size) > 0) ? 0 : (size))
-
 /*
  * access_ok: - Checks if a user space pointer is valid
  * @addr: User space pointer to start of block to check
@@ -79,9 +65,9 @@ extern u64 __ua_limit;
 static inline int __access_ok(const void __user *p, unsigned long size)
 {
 	unsigned long addr = (unsigned long)p;
-	unsigned long end = addr + size - !!size;
+	unsigned long limit = TASK_SIZE_MAX;
 
-	return (__UA_LIMIT & (addr | end | __ua_size(size))) == 0;
+	return (size <= limit) && (addr <= (limit - size));
 }
 
 #define access_ok(addr, size)					\
-- 
2.29.2



* [PATCH 07/14] uaccess: generalize access_ok()
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (5 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 06/14] mips: use simpler access_ok() Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 17:04   ` Christoph Hellwig
                     ` (2 more replies)
  2022-02-14 16:34 ` [PATCH 08/14] arm64: simplify access_ok() Arnd Bergmann
                   ` (7 subsequent siblings)
  14 siblings, 3 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

There are many different ways that access_ok() is defined across
architectures, but in the end, they all just compare against the
user_addr_max() value or they accept anything.

Provide one definition that works for most architectures, checking
against TASK_SIZE_MAX for user processes or skipping the check inside
of uaccess_kernel() sections.

For architectures without CONFIG_SET_FS, this should be the fastest
check, as it comes down to a single comparison of a pointer against a
compile-time constant, while the architecture-specific versions tend to
do something more complex for historic reasons or get something wrong.

Type checking for __user annotations is handled inconsistently across
architectures, but this is easily simplified as well by using an inline
function that takes a 'const void __user *' argument. A handful of
callers need an extra __user annotation for this.

Some architectures had a trick to use 33-bit or 65-bit arithmetic on
the addresses to calculate the overflow. However, this simpler version
uses fewer registers, which means it can produce better object code in
the end despite needing a second (statically predicted) branch.
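
To illustrate the last point, here is a hedged user-space sketch that
compares the wide-arithmetic form with the two-comparison form. It
assumes the generic helper uses the same two comparisons as the mips
version in patch 06. All names and the 0xc0000000 limit are
hypothetical, and the 33-bit sum is emulated with 64-bit integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit user address limit, standing in for TASK_SIZE_MAX. */
#define MODEL_TASK_SIZE_MAX UINT32_C(0xc0000000)

/* Reference form: the 33-bit sum is emulated with 64-bit arithmetic. */
static inline bool range_ok_wide(uint32_t addr, uint32_t size)
{
	return (uint64_t)addr + (uint64_t)size <= (uint64_t)MODEL_TASK_SIZE_MAX;
}

/* Two comparisons in the native word width; no wider type is needed. */
static inline bool range_ok_simple(uint32_t addr, uint32_t size)
{
	const uint32_t limit = MODEL_TASK_SIZE_MAX;

	return size <= limit && addr <= limit - size;
}

/* Returns true if both forms agree on a grid of boundary values. */
static bool range_ok_forms_agree(void)
{
	static const uint32_t edge[] = {
		0, 1, 0x1000, 0x7fffffff, 0x80000000,
		0xbfffffff, 0xc0000000, 0xc0000001, 0xffffffff,
	};
	const unsigned int n = sizeof(edge) / sizeof(edge[0]);

	for (unsigned int i = 0; i < n; i++)
		for (unsigned int j = 0; j < n; j++)
			if (range_ok_wide(edge[i], edge[j]) !=
			    range_ok_simple(edge[i], edge[j]))
				return false;
	return true;
}
```

The two forms are equivalent because a size larger than the limit fails
both, and for any other size the subtraction limit - size cannot wrap.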

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/alpha/include/asm/uaccess.h      | 34 +++------------
 arch/arc/include/asm/uaccess.h        | 29 -------------
 arch/arm/include/asm/uaccess.h        | 20 +--------
 arch/arm/kernel/swp_emulate.c         |  2 +-
 arch/arm/kernel/traps.c               |  2 +-
 arch/arm64/include/asm/uaccess.h      |  5 ++-
 arch/csky/include/asm/uaccess.h       |  8 ----
 arch/csky/kernel/signal.c             |  2 +-
 arch/hexagon/include/asm/uaccess.h    | 25 ------------
 arch/ia64/include/asm/uaccess.h       |  5 +--
 arch/m68k/include/asm/uaccess.h       |  5 ++-
 arch/microblaze/include/asm/uaccess.h |  8 +---
 arch/mips/include/asm/uaccess.h       | 29 +------------
 arch/nds32/include/asm/uaccess.h      |  7 +---
 arch/nios2/include/asm/uaccess.h      | 11 +----
 arch/nios2/kernel/signal.c            | 20 +++++----
 arch/openrisc/include/asm/uaccess.h   | 19 +--------
 arch/parisc/include/asm/uaccess.h     | 10 +++--
 arch/powerpc/include/asm/uaccess.h    | 11 +----
 arch/powerpc/lib/sstep.c              |  4 +-
 arch/riscv/include/asm/uaccess.h      | 31 +-------------
 arch/riscv/kernel/perf_callchain.c    |  2 +-
 arch/s390/include/asm/uaccess.h       | 11 ++---
 arch/sh/include/asm/uaccess.h         | 22 +---------
 arch/sparc/include/asm/uaccess.h      |  3 --
 arch/sparc/include/asm/uaccess_32.h   | 18 ++------
 arch/sparc/include/asm/uaccess_64.h   | 35 ++++------------
 arch/sparc/kernel/signal_32.c         |  2 +-
 arch/um/include/asm/uaccess.h         |  5 ++-
 arch/x86/include/asm/uaccess.h        | 14 +------
 arch/xtensa/include/asm/uaccess.h     | 10 +----
 include/asm-generic/access_ok.h       | 59 +++++++++++++++++++++++++++
 include/asm-generic/uaccess.h         | 21 +---------
 include/linux/uaccess.h               |  7 ----
 34 files changed, 130 insertions(+), 366 deletions(-)
 create mode 100644 include/asm-generic/access_ok.h

diff --git a/arch/alpha/include/asm/uaccess.h b/arch/alpha/include/asm/uaccess.h
index 1b6f25efa247..82c5743fc9cd 100644
--- a/arch/alpha/include/asm/uaccess.h
+++ b/arch/alpha/include/asm/uaccess.h
@@ -20,28 +20,7 @@
 #define get_fs()  (current_thread_info()->addr_limit)
 #define set_fs(x) (current_thread_info()->addr_limit = (x))
 
-#define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
-
-/*
- * Is a address valid? This does a straightforward calculation rather
- * than tests.
- *
- * Address valid if:
- *  - "addr" doesn't have any high-bits set
- *  - AND "size" doesn't have any high-bits set
- *  - AND "addr+size-(size != 0)" doesn't have any high-bits set
- *  - OR we are in kernel mode.
- */
-#define __access_ok(addr, size) ({				\
-	unsigned long __ao_a = (addr), __ao_b = (size);		\
-	unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;	\
-	(get_fs().seg & (__ao_a | __ao_b | __ao_end)) == 0; })
-
-#define access_ok(addr, size)				\
-({							\
-	__chk_user_ptr(addr);				\
-	__access_ok(((unsigned long)(addr)), (size));	\
-})
+#include <asm-generic/access_ok.h>
 
 /*
  * These are the main single-value transfer routines.  They automatically
@@ -105,7 +84,7 @@ extern void __get_user_unknown(void);
 	long __gu_err = -EFAULT;				\
 	unsigned long __gu_val = 0;				\
 	const __typeof__(*(ptr)) __user *__gu_addr = (ptr);	\
-	if (__access_ok((unsigned long)__gu_addr, size)) {	\
+	if (__access_ok(__gu_addr, size)) {			\
 		__gu_err = 0;					\
 		switch (size) {					\
 		  case 1: __get_user_8(__gu_addr); break;	\
@@ -200,7 +179,7 @@ extern void __put_user_unknown(void);
 ({								\
 	long __pu_err = -EFAULT;				\
 	__typeof__(*(ptr)) __user *__pu_addr = (ptr);		\
-	if (__access_ok((unsigned long)__pu_addr, size)) {	\
+	if (__access_ok(__pu_addr, size)) {			\
 		__pu_err = 0;					\
 		switch (size) {					\
 		  case 1: __put_user_8(x, __pu_addr); break;	\
@@ -316,17 +295,14 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long len)
 
 extern long __clear_user(void __user *to, long len);
 
-extern inline long
+static inline long
 clear_user(void __user *to, long len)
 {
-	if (__access_ok((unsigned long)to, len))
+	if (__access_ok(to, len))
 		len = __clear_user(to, len);
 	return len;
 }
 
-#define user_addr_max() \
-        (uaccess_kernel() ? ~0UL : TASK_SIZE)
-
 extern long strncpy_from_user(char *dest, const char __user *src, long count);
 extern __must_check long strnlen_user(const char __user *str, long n);
 
diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index 783bfdb3bfa3..30f80b4be2ab 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -23,35 +23,6 @@
 
 #include <linux/string.h>	/* for generic string functions */
 
-
-#define __kernel_ok		(uaccess_kernel())
-
-/*
- * Algorithmically, for __user_ok() we want do:
- * 	(start < TASK_SIZE) && (start+len < TASK_SIZE)
- * where TASK_SIZE could either be retrieved from thread_info->addr_limit or
- * emitted directly in code.
- *
- * This can however be rewritten as follows:
- *	(len <= TASK_SIZE) && (start+len < TASK_SIZE)
- *
- * Because it essentially checks if buffer end is within limit and @len is
- * non-ngeative, which implies that buffer start will be within limit too.
- *
- * The reason for rewriting being, for majority of cases, @len is generally
- * compile time constant, causing first sub-expression to be compile time
- * subsumed.
- *
- * The second part would generate weird large LIMMs e.g. (0x6000_0000 - 0x10),
- * so we check for TASK_SIZE using get_fs() since the addr_limit load from mem
- * would already have been done at this call site for __kernel_ok()
- *
- */
-#define __user_ok(addr, sz)	(((sz) <= TASK_SIZE) && \
-				 ((addr) <= (get_fs() - (sz))))
-#define __access_ok(addr, sz)	(unlikely(__kernel_ok) || \
-				 likely(__user_ok((addr), (sz))))
-
 /*********** Single byte/hword/word copies ******************/
 
 #define __get_user_fn(sz, u, k)					\
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index d20d78c34b94..2fcbec9c306c 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -55,21 +55,6 @@ extern int __put_user_bad(void);
 
 #ifdef CONFIG_MMU
 
-/*
- * We use 33-bit arithmetic here.  Success returns zero, failure returns
- * addr_limit.  We take advantage that addr_limit will be zero for KERNEL_DS,
- * so this will always return success in that case.
- */
-#define __range_ok(addr, size) ({ \
-	unsigned long flag, roksum; \
-	__chk_user_ptr(addr);	\
-	__asm__(".syntax unified\n" \
-		"adds %1, %2, %3; sbcscc %1, %1, %0; movcc %0, #0" \
-		: "=&r" (flag), "=&r" (roksum) \
-		: "r" (addr), "Ir" (size), "0" (TASK_SIZE) \
-		: "cc"); \
-	flag; })
-
 /*
  * This is a type: either unsigned long, if the argument fits into
  * that type, or otherwise unsigned long long.
@@ -241,15 +226,12 @@ extern int __put_user_8(void *, unsigned long long);
 
 #else /* CONFIG_MMU */
 
-#define __addr_ok(addr)		((void)(addr), 1)
-#define __range_ok(addr, size)	((void)(addr), 0)
-
 #define get_user(x, p)	__get_user(x, p)
 #define __put_user_check __put_user_nocheck
 
 #endif /* CONFIG_MMU */
 
-#define access_ok(addr, size)	(__range_ok(addr, size) == 0)
+#include <asm-generic/access_ok.h>
 
 #ifdef CONFIG_CPU_SPECTRE
 /*
diff --git a/arch/arm/kernel/swp_emulate.c b/arch/arm/kernel/swp_emulate.c
index 6166ba38bf99..b74bfcf94fb1 100644
--- a/arch/arm/kernel/swp_emulate.c
+++ b/arch/arm/kernel/swp_emulate.c
@@ -195,7 +195,7 @@ static int swp_handler(struct pt_regs *regs, unsigned int instr)
 		 destreg, EXTRACT_REG_NUM(instr, RT2_OFFSET), data);
 
 	/* Check access in reasonable access range for both SWP and SWPB */
-	if (!access_ok((address & ~3), 4)) {
+	if (!access_ok((void __user *)(address & ~3), 4)) {
 		pr_debug("SWP{B} emulation: access to %p not allowed!\n",
 			 (void *)address);
 		res = -EFAULT;
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index da04ed85855a..26c8c8276297 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -576,7 +576,7 @@ do_cache_op(unsigned long start, unsigned long end, int flags)
 	if (end < start || flags)
 		return -EINVAL;
 
-	if (!access_ok(start, end - start))
+	if (!access_ok((void __user *)start, end - start))
 		return -EFAULT;
 
 	return __do_cache_op(start, end);
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 2e20879fe3cf..357f7bd9c981 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -33,7 +33,7 @@
  * This is equivalent to the following test:
  * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
  */
-static inline unsigned long __range_ok(const void __user *addr, unsigned long size)
+static inline unsigned long __access_ok(const void __user *addr, unsigned long size)
 {
 	unsigned long ret, limit = TASK_SIZE_MAX - 1;
 
@@ -66,8 +66,9 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
 
 	return ret;
 }
+#define __access_ok __access_ok
 
-#define access_ok(addr, size)	__range_ok(addr, size)
+#include <asm-generic/access_ok.h>
 
 /*
  * User access enabling/disabling.
diff --git a/arch/csky/include/asm/uaccess.h b/arch/csky/include/asm/uaccess.h
index ac5a54f57d40..fec8f77ffc99 100644
--- a/arch/csky/include/asm/uaccess.h
+++ b/arch/csky/include/asm/uaccess.h
@@ -5,14 +5,6 @@
 
 #define user_addr_max() (current_thread_info()->addr_limit.seg)
 
-static inline int __access_ok(unsigned long addr, unsigned long size)
-{
-	unsigned long limit = user_addr_max();
-
-	return (size <= limit) && (addr <= (limit - size));
-}
-#define __access_ok __access_ok
-
 /*
  * __put_user_fn
  */
diff --git a/arch/csky/kernel/signal.c b/arch/csky/kernel/signal.c
index c7b763d2f526..8867ddf3e6c7 100644
--- a/arch/csky/kernel/signal.c
+++ b/arch/csky/kernel/signal.c
@@ -136,7 +136,7 @@ static inline void __user *get_sigframe(struct ksignal *ksig,
 static int
 setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
 {
-	struct rt_sigframe *frame;
+	struct rt_sigframe __user *frame;
 	int err = 0;
 
 	frame = get_sigframe(ksig, regs, sizeof(*frame));
diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h
index 719ba3f3c45c..bff77efc0d9a 100644
--- a/arch/hexagon/include/asm/uaccess.h
+++ b/arch/hexagon/include/asm/uaccess.h
@@ -12,31 +12,6 @@
  */
 #include <asm/sections.h>
 
-/*
- * access_ok: - Checks if a user space pointer is valid
- * @addr: User space pointer to start of block to check
- * @size: Size of block to check
- *
- * Context: User context only. This function may sleep if pagefaults are
- *          enabled.
- *
- * Checks if a pointer to a block of memory in user space is valid.
- *
- * Returns true (nonzero) if the memory block *may* be valid, false (zero)
- * if it is definitely invalid.
- *
- */
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
-
-static inline int __access_ok(unsigned long addr, unsigned long size)
-{
-	unsigned long limit = TASK_SIZE;
-
-	return (size <= limit) && (addr <= (limit - size));
-}
-#define __access_ok __access_ok
-
 /*
  * When a kernel-mode page fault is taken, the faulting instruction
  * address is checked against a table of exception_table_entries.
diff --git a/arch/ia64/include/asm/uaccess.h b/arch/ia64/include/asm/uaccess.h
index e19d2dcc0ced..e242a3cc1330 100644
--- a/arch/ia64/include/asm/uaccess.h
+++ b/arch/ia64/include/asm/uaccess.h
@@ -50,8 +50,6 @@
 #define get_fs()  (current_thread_info()->addr_limit)
 #define set_fs(x) (current_thread_info()->addr_limit = (x))
 
-#define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
-
 /*
  * When accessing user memory, we need to make sure the entire area really is in
  * user-level space.  In order to do this efficiently, we make sure that the page at
@@ -65,7 +63,8 @@ static inline int __access_ok(const void __user *p, unsigned long size)
 	return likely(addr <= seg) &&
 	 (seg == KERNEL_DS.seg || likely(REGION_OFFSET(addr) < RGN_MAP_LIMIT));
 }
-#define access_ok(addr, size)	__access_ok((addr), (size))
+#define __access_ok __access_ok
+#include <asm-generic/access_ok.h>
 
 /*
  * These are the main single-value transfer routines.  They automatically
diff --git a/arch/m68k/include/asm/uaccess.h b/arch/m68k/include/asm/uaccess.h
index 79617c0b2f91..d6bb5720365a 100644
--- a/arch/m68k/include/asm/uaccess.h
+++ b/arch/m68k/include/asm/uaccess.h
@@ -12,15 +12,18 @@
 #include <asm/extable.h>
 
 /* We let the MMU do all checking */
-static inline int access_ok(const void __user *addr,
+static inline int __access_ok(const void __user *addr,
 			    unsigned long size)
 {
 	/*
 	 * XXX: for !CONFIG_CPU_HAS_ADDRESS_SPACES this really needs to check
 	 * for TASK_SIZE!
+	 * Removing this helper is probably sufficient.
 	 */
 	return 1;
 }
+#define __access_ok __access_ok
+#include <asm-generic/access_ok.h>
 
 /*
  * Not all varients of the 68k family support the notion of address spaces.
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index 5b6e0e7788f4..dd82e90adb52 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -39,13 +39,7 @@
 
 # define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
 
-static inline int __access_ok(unsigned long addr, unsigned long size)
-{
-	unsigned long limit = user_addr_max();
-
-	return (size <= limit) && (addr <= (limit - size));
-}
-#define access_ok(addr, size) __access_ok((unsigned long)addr, size)
+#include <asm-generic/access_ok.h>
 
 # define __FIXUP_SECTION	".section .fixup,\"ax\"\n"
 # define __EX_TABLE_SECTION	".section __ex_table,\"a\"\n"
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index d7c89dc3426c..436248652b28 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -44,34 +44,7 @@ extern u64 __ua_limit;
 
 #endif /* CONFIG_64BIT */
 
-/*
- * access_ok: - Checks if a user space pointer is valid
- * @addr: User space pointer to start of block to check
- * @size: Size of block to check
- *
- * Context: User context only. This function may sleep if pagefaults are
- *          enabled.
- *
- * Checks if a pointer to a block of memory in user space is valid.
- *
- * Returns true (nonzero) if the memory block may be valid, false (zero)
- * if it is definitely invalid.
- *
- * Note that, depending on architecture, this function probably just
- * checks that the pointer is in the user space range - after calling
- * this function, memory access functions may still return -EFAULT.
- */
-
-static inline int __access_ok(const void __user *p, unsigned long size)
-{
-	unsigned long addr = (unsigned long)p;
-	unsigned long limit = TASK_SIZE_MAX;
-
-	return (size <= limit) && (addr <= (limit - size));
-}
-
-#define access_ok(addr, size)					\
-	likely(__access_ok((addr), (size)))
+#include <asm-generic/access_ok.h>
 
 /*
  * put_user: - Write a simple value into user space.
diff --git a/arch/nds32/include/asm/uaccess.h b/arch/nds32/include/asm/uaccess.h
index 37a40981deb3..832d642a4068 100644
--- a/arch/nds32/include/asm/uaccess.h
+++ b/arch/nds32/include/asm/uaccess.h
@@ -38,18 +38,15 @@ extern int fixup_exception(struct pt_regs *regs);
 
 #define get_fs()	(current_thread_info()->addr_limit)
 #define user_addr_max	get_fs
+#define uaccess_kernel() (get_fs() == KERNEL_DS)
 
 static inline void set_fs(mm_segment_t fs)
 {
 	current_thread_info()->addr_limit = fs;
 }
 
-#define uaccess_kernel()	(get_fs() == KERNEL_DS)
+#include <asm-generic/access_ok.h>
 
-#define __range_ok(addr, size) (size <= get_fs() && addr <= (get_fs() -size))
-
-#define access_ok(addr, size)	\
-	__range_ok((unsigned long)addr, (unsigned long)size)
 /*
  * Single-value transfer routines.  They automatically use the right
  * size if we just have the right pointer type.  Note that the functions
diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h
index ba9340e96fd4..9a7658df7f8d 100644
--- a/arch/nios2/include/asm/uaccess.h
+++ b/arch/nios2/include/asm/uaccess.h
@@ -30,19 +30,10 @@
 #define get_fs()		(current_thread_info()->addr_limit)
 #define set_fs(seg)		(current_thread_info()->addr_limit = (seg))
 
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-
-#define __access_ok(addr, len)			\
-	(((signed long)(((long)get_fs().seg) &	\
-		((long)(addr) | (((long)(addr)) + (len)) | (len)))) == 0)
-
-#define access_ok(addr, len)		\
-	likely(__access_ok((unsigned long)(addr), (unsigned long)(len)))
+#include <asm-generic/access_ok.h>
 
 # define __EX_TABLE_SECTION	".section __ex_table,\"a\"\n"
 
-#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
-
 /*
  * Zero Userspace
  */
diff --git a/arch/nios2/kernel/signal.c b/arch/nios2/kernel/signal.c
index 2009ae2d3c3b..386e46443b60 100644
--- a/arch/nios2/kernel/signal.c
+++ b/arch/nios2/kernel/signal.c
@@ -36,10 +36,10 @@ struct rt_sigframe {
 
 static inline int rt_restore_ucontext(struct pt_regs *regs,
 					struct switch_stack *sw,
-					struct ucontext *uc, int *pr2)
+					struct ucontext __user *uc, int *pr2)
 {
 	int temp;
-	unsigned long *gregs = uc->uc_mcontext.gregs;
+	unsigned long __user *gregs = uc->uc_mcontext.gregs;
 	int err;
 
 	/* Always make any pending restarted system calls return -EINTR */
@@ -102,10 +102,11 @@ asmlinkage int do_rt_sigreturn(struct switch_stack *sw)
 {
 	struct pt_regs *regs = (struct pt_regs *)(sw + 1);
 	/* Verify, can we follow the stack back */
-	struct rt_sigframe *frame = (struct rt_sigframe *) regs->sp;
+	struct rt_sigframe __user *frame;
 	sigset_t set;
 	int rval;
 
+	frame = (struct rt_sigframe __user *) regs->sp;
 	if (!access_ok(frame, sizeof(*frame)))
 		goto badframe;
 
@@ -124,10 +125,10 @@ asmlinkage int do_rt_sigreturn(struct switch_stack *sw)
 	return 0;
 }
 
-static inline int rt_setup_ucontext(struct ucontext *uc, struct pt_regs *regs)
+static inline int rt_setup_ucontext(struct ucontext __user *uc, struct pt_regs *regs)
 {
 	struct switch_stack *sw = (struct switch_stack *)regs - 1;
-	unsigned long *gregs = uc->uc_mcontext.gregs;
+	unsigned long __user *gregs = uc->uc_mcontext.gregs;
 	int err = 0;
 
 	err |= __put_user(MCONTEXT_VERSION, &uc->uc_mcontext.version);
@@ -162,8 +163,9 @@ static inline int rt_setup_ucontext(struct ucontext *uc, struct pt_regs *regs)
 	return err;
 }
 
-static inline void *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
-				 size_t frame_size)
+static inline void __user *get_sigframe(struct ksignal *ksig,
+					struct pt_regs *regs,
+					size_t frame_size)
 {
 	unsigned long usp;
 
@@ -174,13 +176,13 @@ static inline void *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 	usp = sigsp(usp, ksig);
 
 	/* Verify, is it 32 or 64 bit aligned */
-	return (void *)((usp - frame_size) & -8UL);
+	return (void __user *)((usp - frame_size) & -8UL);
 }
 
 static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
 			  struct pt_regs *regs)
 {
-	struct rt_sigframe *frame;
+	struct rt_sigframe __user *frame;
 	int err = 0;
 
 	frame = get_sigframe(ksig, regs, sizeof(*frame));
diff --git a/arch/openrisc/include/asm/uaccess.h b/arch/openrisc/include/asm/uaccess.h
index 120f5005461b..8f049ec99b3e 100644
--- a/arch/openrisc/include/asm/uaccess.h
+++ b/arch/openrisc/include/asm/uaccess.h
@@ -45,21 +45,7 @@
 
 #define uaccess_kernel()	(get_fs() == KERNEL_DS)
 
-/* Ensure that the range from addr to addr+size is all within the process'
- * address space
- */
-static inline int __range_ok(unsigned long addr, unsigned long size)
-{
-	const mm_segment_t fs = get_fs();
-
-	return size <= fs && addr <= (fs - size);
-}
-
-#define access_ok(addr, size)						\
-({ 									\
-	__chk_user_ptr(addr);						\
-	__range_ok((unsigned long)(addr), (size));			\
-})
+#include <asm-generic/access_ok.h>
 
 /*
  * These are the main single-value transfer routines.  They automatically
@@ -268,9 +254,6 @@ clear_user(void __user *addr, unsigned long size)
 	return size;
 }
 
-#define user_addr_max() \
-	(uaccess_kernel() ? ~0UL : TASK_SIZE)
-
 extern long strncpy_from_user(char *dest, const char __user *src, long count);
 
 extern __must_check long strnlen_user(const char __user *str, long n);
diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h
index 0925bbd6db67..b68f19e11361 100644
--- a/arch/parisc/include/asm/uaccess.h
+++ b/arch/parisc/include/asm/uaccess.h
@@ -17,9 +17,13 @@
  * We just let the page fault handler do the right thing. This also means
  * that put_user is the same as __put_user, etc.
  */
-
-#define access_ok(uaddr, size)	\
-	( (uaddr) == (uaddr) )
+static inline int __access_ok(const void __user *addr, unsigned long size)
+{
+	return 1;
+}
+#define __access_ok __access_ok
+#define TASK_SIZE_MAX DEFAULT_TASK_SIZE
+#include <asm-generic/access_ok.h>
 
 #define put_user __put_user
 #define get_user __get_user
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index a0032c2e7550..2e83217f52de 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -11,18 +11,9 @@
 #ifdef __powerpc64__
 /* We use TASK_SIZE_USER64 as TASK_SIZE is not constant */
 #define TASK_SIZE_MAX		TASK_SIZE_USER64
-#else
-#define TASK_SIZE_MAX		TASK_SIZE
 #endif
 
-static inline bool __access_ok(unsigned long addr, unsigned long size)
-{
-	return addr < TASK_SIZE_MAX && size <= TASK_SIZE_MAX - addr;
-}
-
-#define access_ok(addr, size)		\
-	(__chk_user_ptr(addr),		\
-	 __access_ok((unsigned long)(addr), (size)))
+#include <asm-generic/access_ok.h>
 
 /*
  * These are the main single-value transfer routines.  They automatically
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index a94b0cd0bdc5..022d23ae300b 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -112,9 +112,9 @@ static nokprobe_inline long address_ok(struct pt_regs *regs,
 {
 	if (!user_mode(regs))
 		return 1;
-	if (__access_ok(ea, nb))
+	if (access_ok((void __user *)ea, nb))
 		return 1;
-	if (__access_ok(ea, 1))
+	if (access_ok((void __user *)ea, 1))
 		/* Access overlaps the end of the user region */
 		regs->dar = TASK_SIZE_MAX - 1;
 	else
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 4407b9e48d2c..855450bed9f5 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -21,42 +21,13 @@
 #include <asm/byteorder.h>
 #include <asm/extable.h>
 #include <asm/asm.h>
+#include <asm-generic/access_ok.h>
 
 #define __enable_user_access()							\
 	__asm__ __volatile__ ("csrs sstatus, %0" : : "r" (SR_SUM) : "memory")
 #define __disable_user_access()							\
 	__asm__ __volatile__ ("csrc sstatus, %0" : : "r" (SR_SUM) : "memory")
 
-/**
- * access_ok: - Checks if a user space pointer is valid
- * @addr: User space pointer to start of block to check
- * @size: Size of block to check
- *
- * Context: User context only.  This function may sleep.
- *
- * Checks if a pointer to a block of memory in user space is valid.
- *
- * Returns true (nonzero) if the memory block may be valid, false (zero)
- * if it is definitely invalid.
- *
- * Note that, depending on architecture, this function probably just
- * checks that the pointer is in the user space range - after calling
- * this function, memory access functions may still return -EFAULT.
- */
-#define access_ok(addr, size) ({					\
-	__chk_user_ptr(addr);						\
-	likely(__access_ok((unsigned long __force)(addr), (size)));	\
-})
-
-/*
- * Ensure that the range [addr, addr+size) is within the process's
- * address space
- */
-static inline int __access_ok(unsigned long addr, unsigned long size)
-{
-	return size <= TASK_SIZE && addr <= TASK_SIZE - size;
-}
-
 /*
  * The exception table consists of pairs of addresses: the first is the
  * address of an instruction that is allowed to fault, and the second is
diff --git a/arch/riscv/kernel/perf_callchain.c b/arch/riscv/kernel/perf_callchain.c
index 1fc075b8f764..f0c7bb98119a 100644
--- a/arch/riscv/kernel/perf_callchain.c
+++ b/arch/riscv/kernel/perf_callchain.c
@@ -15,7 +15,7 @@ static unsigned long user_backtrace(struct perf_callchain_entry_ctx *entry,
 {
 	struct stackframe buftail;
 	unsigned long ra = 0;
-	unsigned long *user_frame_tail =
+	unsigned long __user *user_frame_tail =
 			(unsigned long *)(fp - sizeof(struct stackframe));
 
 	/* Check accessibility of one struct frame_tail beyond */
diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index 29332edf46f0..f84d70c8e188 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -20,18 +20,13 @@
 
 void debug_user_asce(int exit);
 
-static inline int __range_ok(unsigned long addr, unsigned long size)
+static inline int __access_ok(const void __user *addr, unsigned long size)
 {
 	return 1;
 }
+#define __access_ok __access_ok
 
-#define __access_ok(addr, size)				\
-({							\
-	__chk_user_ptr(addr);				\
-	__range_ok((unsigned long)(addr), (size));	\
-})
-
-#define access_ok(addr, size) __access_ok(addr, size)
+#include <asm-generic/access_ok.h>
 
 unsigned long __must_check
 raw_copy_from_user(void *to, const void __user *from, unsigned long n);
diff --git a/arch/sh/include/asm/uaccess.h b/arch/sh/include/asm/uaccess.h
index 8867bb04b00e..ccd219d74851 100644
--- a/arch/sh/include/asm/uaccess.h
+++ b/arch/sh/include/asm/uaccess.h
@@ -5,28 +5,10 @@
 #include <asm/segment.h>
 #include <asm/extable.h>
 
-#define __addr_ok(addr) \
-	((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
-
-/*
- * __access_ok: Check if address with size is OK or not.
- *
- * Uhhuh, this needs 33-bit arithmetic. We have a carry..
- *
- * sum := addr + size;  carry? --> flag = true;
- * if (sum >= addr_limit) flag = true;
- */
-#define __access_ok(addr, size)	({				\
-	unsigned long __ao_a = (addr), __ao_b = (size);		\
-	unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;	\
-	__ao_end >= __ao_a && __addr_ok(__ao_end); })
-
-#define access_ok(addr, size)	\
-	(__chk_user_ptr(addr),		\
-	 __access_ok((unsigned long __force)(addr), (size)))
-
 #define user_addr_max()	(current_thread_info()->addr_limit.seg)
 
+#include <asm-generic/access_ok.h>
+
 /*
  * Uh, these should become the main single-value transfer routines ...
  * They automatically use the right size if we just have the right
diff --git a/arch/sparc/include/asm/uaccess.h b/arch/sparc/include/asm/uaccess.h
index 390094200fc4..ee75f69e3fcd 100644
--- a/arch/sparc/include/asm/uaccess.h
+++ b/arch/sparc/include/asm/uaccess.h
@@ -10,9 +10,6 @@
 #include <asm/uaccess_32.h>
 #endif
 
-#define user_addr_max() \
-	(uaccess_kernel() ? ~0UL : TASK_SIZE)
-
 long strncpy_from_user(char *dest, const char __user *src, long count);
 
 #endif
diff --git a/arch/sparc/include/asm/uaccess_32.h b/arch/sparc/include/asm/uaccess_32.h
index 4a12346bb69c..367747116260 100644
--- a/arch/sparc/include/asm/uaccess_32.h
+++ b/arch/sparc/include/asm/uaccess_32.h
@@ -25,17 +25,7 @@
 #define get_fs()	(current->thread.current_ds)
 #define set_fs(val)	((current->thread.current_ds) = (val))
 
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-
-/* We have there a nice not-mapped page at PAGE_OFFSET - PAGE_SIZE, so that this test
- * can be fairly lightweight.
- * No one can read/write anything from userland in the kernel space by setting
- * large size and address near to PAGE_OFFSET - a fault will break his intentions.
- */
-#define __user_ok(addr, size) ({ (void)(size); (addr) < STACK_TOP; })
-#define __kernel_ok (uaccess_kernel())
-#define __access_ok(addr, size) (__user_ok((addr) & get_fs().seg, (size)))
-#define access_ok(addr, size) __access_ok((unsigned long)(addr), size)
+#include <asm-generic/access_ok.h>
 
 /* Uh, these should become the main single-value transfer routines..
  * They automatically use the right size if we just have the right
@@ -47,13 +37,13 @@
  * and hide all the ugliness from the user.
  */
 #define put_user(x, ptr) ({ \
-	unsigned long __pu_addr = (unsigned long)(ptr); \
+	void __user *__pu_addr = (ptr); \
 	__chk_user_ptr(ptr); \
 	__put_user_check((__typeof__(*(ptr)))(x), __pu_addr, sizeof(*(ptr))); \
 })
 
 #define get_user(x, ptr) ({ \
-	unsigned long __gu_addr = (unsigned long)(ptr); \
+	const void __user *__gu_addr = (ptr); \
 	__chk_user_ptr(ptr); \
 	__get_user_check((x), __gu_addr, sizeof(*(ptr)), __typeof__(*(ptr))); \
 })
@@ -232,7 +222,7 @@ static inline unsigned long __clear_user(void __user *addr, unsigned long size)
 
 static inline unsigned long clear_user(void __user *addr, unsigned long n)
 {
-	if (n && __access_ok((unsigned long) addr, n))
+	if (n && __access_ok(addr, n))
 		return __clear_user(addr, n);
 	else
 		return n;
diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
index 5c12fb46bc61..000bac67cf31 100644
--- a/arch/sparc/include/asm/uaccess_64.h
+++ b/arch/sparc/include/asm/uaccess_64.h
@@ -31,7 +31,12 @@
 
 #define get_fs() ((mm_segment_t){(current_thread_info()->current_ds)})
 
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
+static inline int __access_ok(const void __user *addr, unsigned long size)
+{
+	return 1;
+}
+#define __access_ok __access_ok
+#include <asm-generic/access_ok.h>
 
 #define set_fs(val)								\
 do {										\
@@ -43,33 +48,7 @@ do {										\
  * Test whether a block of memory is a valid user space address.
  * Returns 0 if the range is valid, nonzero otherwise.
  */
-static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, unsigned long limit)
-{
-	if (__builtin_constant_p(size))
-		return addr > limit - size;
-
-	addr += size;
-	if (addr < size)
-		return true;
-
-	return addr > limit;
-}
-
-#define __range_not_ok(addr, size, limit)                               \
-({                                                                      \
-	__chk_user_ptr(addr);                                           \
-	__chk_range_not_ok((unsigned long __force)(addr), size, limit); \
-})
-
-static inline int __access_ok(const void __user * addr, unsigned long size)
-{
-	return 1;
-}
-
-static inline int access_ok(const void __user * addr, unsigned long size)
-{
-	return 1;
-}
+#define __range_not_ok(addr, size, limit) (!__access_ok(addr, size))
 
 void __retl_efault(void);
 
diff --git a/arch/sparc/kernel/signal_32.c b/arch/sparc/kernel/signal_32.c
index ffab16369bea..74f80443b195 100644
--- a/arch/sparc/kernel/signal_32.c
+++ b/arch/sparc/kernel/signal_32.c
@@ -65,7 +65,7 @@ struct rt_signal_frame {
  */
 static inline bool invalid_frame_pointer(void __user *fp, int fplen)
 {
-	if ((((unsigned long) fp) & 15) || !__access_ok((unsigned long)fp, fplen))
+	if ((((unsigned long) fp) & 15) || !access_ok(fp, fplen))
 		return true;
 
 	return false;
diff --git a/arch/um/include/asm/uaccess.h b/arch/um/include/asm/uaccess.h
index 1ecfc96bcc50..7d9d60e41e4e 100644
--- a/arch/um/include/asm/uaccess.h
+++ b/arch/um/include/asm/uaccess.h
@@ -25,7 +25,7 @@
 extern unsigned long raw_copy_from_user(void *to, const void __user *from, unsigned long n);
 extern unsigned long raw_copy_to_user(void __user *to, const void *from, unsigned long n);
 extern unsigned long __clear_user(void __user *mem, unsigned long len);
-static inline int __access_ok(unsigned long addr, unsigned long size);
+static inline int __access_ok(const void __user *ptr, unsigned long size);
 
 /* Teach asm-generic/uaccess.h that we have C functions for these. */
 #define __access_ok __access_ok
@@ -36,8 +36,9 @@ static inline int __access_ok(unsigned long addr, unsigned long size);
 
 #include <asm-generic/uaccess.h>
 
-static inline int __access_ok(unsigned long addr, unsigned long size)
+static inline int __access_ok(const void __user *ptr, unsigned long size)
 {
+	unsigned long addr = (unsigned long)ptr;
 	return __addr_range_nowrap(addr, size) &&
 		(__under_task_size(addr, size) ||
 		 __access_ok_vsyscall(addr, size));
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index c6d9dc42724d..c5e4bb7161bc 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -12,18 +12,6 @@
 #include <asm/smap.h>
 #include <asm/extable.h>
 
-/*
- * Test whether a block of memory is a valid user space address.
- * Returns 0 if the range is valid, nonzero otherwise.
- */
-static inline bool __access_ok(void __user *ptr, unsigned long size)
-{
-	unsigned long limit = TASK_SIZE_MAX;
-	unsigned long addr = ptr;
-
-	return (size <= limit) && (addr <= (limit - size));
-}
-
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 static inline bool pagefault_disabled(void);
 # define WARN_ON_IN_IRQ()	\
@@ -55,6 +43,8 @@ static inline bool pagefault_disabled(void);
 	likely(__access_ok(addr, size));\
 })
 
+#include <asm-generic/access_ok.h>
+
 #define __range_not_ok(addr, size, limit)	(!__access_ok(addr, size))
 #define __chk_range_not_ok(addr, size, limit)	(!__access_ok((void __user *)addr, size))
 
diff --git a/arch/xtensa/include/asm/uaccess.h b/arch/xtensa/include/asm/uaccess.h
index 75bd8fbf52ba..0edd9e4b23d0 100644
--- a/arch/xtensa/include/asm/uaccess.h
+++ b/arch/xtensa/include/asm/uaccess.h
@@ -35,15 +35,7 @@
 #define get_fs()	(current->thread.current_ds)
 #define set_fs(val)	(current->thread.current_ds = (val))
 
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-
-#define __kernel_ok (uaccess_kernel())
-#define __user_ok(addr, size) \
-		(((size) <= TASK_SIZE)&&((addr) <= TASK_SIZE-(size)))
-#define __access_ok(addr, size) (__kernel_ok || __user_ok((addr), (size)))
-#define access_ok(addr, size) __access_ok((unsigned long)(addr), (size))
-
-#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
+#include <asm-generic/access_ok.h>
 
 /*
  * These are the main single-value transfer routines.  They
diff --git a/include/asm-generic/access_ok.h b/include/asm-generic/access_ok.h
new file mode 100644
index 000000000000..883b573af5fe
--- /dev/null
+++ b/include/asm-generic/access_ok.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_GENERIC_ACCESS_OK_H__
+#define __ASM_GENERIC_ACCESS_OK_H__
+
+/*
+ * Checking whether a pointer is valid for user space access.
+ * These definitions work on most architectures, but overrides can
+ * be used where necessary.
+ */
+
+/*
+ * architectures with compat tasks have a variable TASK_SIZE and should
+ * override this to a constant.
+ */
+#ifndef TASK_SIZE_MAX
+#define TASK_SIZE_MAX			TASK_SIZE
+#endif
+
+#ifndef uaccess_kernel
+#ifdef CONFIG_SET_FS
+#define uaccess_kernel()		(get_fs().seg == KERNEL_DS.seg)
+#else
+#define uaccess_kernel()		(0)
+#endif
+#endif
+
+#ifndef user_addr_max
+#define user_addr_max()			(uaccess_kernel() ? ~0UL : TASK_SIZE_MAX)
+#endif
+
+#ifndef __access_ok
+/*
+ * 'size' is a compile-time constant for most callers, so optimize for
+ * this case to turn the check into a single comparison against a constant
+ * limit and catch all possible overflows.
+ * On architectures with separate user address space (m68k, s390, parisc,
+ * sparc64) or those without an MMU, this should always return true.
+ *
+ * This version was originally contributed by Jonas Bonn for the
+ * OpenRISC architecture, and was found to be the most efficient
+ * for constant 'size' and 'limit' values.
+ */
+static inline int __access_ok(const void __user *ptr, unsigned long size)
+{
+	unsigned long limit = user_addr_max();
+	unsigned long addr = (unsigned long)ptr;
+
+	if (limit == ULONG_MAX)
+		return true;
+
+	return (size <= limit) && (addr <= (limit - size));
+}
+#endif
+
+#ifndef access_ok
+#define access_ok(addr, size) likely(__access_ok(addr, size))
+#endif
+
+#endif
diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index 0870fa11a7c5..ebc685dc8d74 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -114,28 +114,9 @@ static inline void set_fs(mm_segment_t fs)
 }
 #endif
 
-#ifndef uaccess_kernel
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-#endif
-
-#ifndef user_addr_max
-#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
-#endif
-
 #endif /* CONFIG_SET_FS */
 
-#define access_ok(addr, size) __access_ok((unsigned long)(addr),(size))
-
-/*
- * The architecture should really override this if possible, at least
- * doing a check on the get_fs()
- */
-#ifndef __access_ok
-static inline int __access_ok(unsigned long addr, unsigned long size)
-{
-	return 1;
-}
-#endif
+#include <asm-generic/access_ok.h>
 
 /*
  * These are the main single-value transfer routines.  They automatically
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 67e9bc94dc40..2c31667e62e0 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -33,13 +33,6 @@ typedef struct {
 	/* empty dummy */
 } mm_segment_t;
 
-#ifndef TASK_SIZE_MAX
-#define TASK_SIZE_MAX			TASK_SIZE
-#endif
-
-#define uaccess_kernel()		(false)
-#define user_addr_max()			(TASK_SIZE_MAX)
-
 static inline mm_segment_t force_uaccess_begin(void)
 {
 	return (mm_segment_t) { };
-- 
2.29.2



* [PATCH 08/14] arm64: simplify access_ok()
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (6 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 07/14] uaccess: generalize access_ok() Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 21:06   ` Robin Murphy
                     ` (2 more replies)
  2022-02-14 16:34 ` [PATCH 09/14] m68k: drop custom __access_ok() Arnd Bergmann
                   ` (6 subsequent siblings)
  14 siblings, 3 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

arm64 has an inline asm implementation of access_ok() that is derived from
the 32-bit arm version and optimized for the case where both the limit and
the size are variable. With set_fs() gone, the limit is always constant,
and the size usually is as well, so using the default implementation
reduces the check to a comparison against a constant that the compiler
can schedule freely.

On a defconfig build, this saves over 28KB of .text.
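
As a rough user-space illustration (not part of the patch), the sketch
below models the single-comparison form on 32-bit values and checks it
against a reference that adds in a wider type, mirroring the
"(u65)addr + (u65)size <= (u65)TASK_SIZE_MAX" comment in the arm64 header.
DEMO_TASK_SIZE_MAX and all demo_* names are hypothetical and exist only
for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit for a 32-bit model, not a real TASK_SIZE_MAX. */
#define DEMO_TASK_SIZE_MAX 0xC0000000u

/* Reference check: add in a wider type so the sum cannot wrap. */
static bool demo_range_ok_wide(uint32_t addr, uint32_t size)
{
	return (uint64_t)addr + (uint64_t)size <= (uint64_t)DEMO_TASK_SIZE_MAX;
}

/* Same shape as the generic __access_ok(): no addition, so no overflow. */
static bool demo_access_ok(uint32_t addr, uint32_t size)
{
	const uint32_t limit = DEMO_TASK_SIZE_MAX;

	return size <= limit && addr <= limit - size;
}

/* Assert that both forms agree on a set of boundary values. */
void demo_compare_checks(void)
{
	static const uint32_t v[] = {
		0u, 1u, 0x7fffffffu, 0xbfffffffu, 0xc0000000u,
		0xc0000001u, 0xfffffffeu, 0xffffffffu,
	};
	const size_t n = sizeof(v) / sizeof(v[0]);

	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n; j++)
			assert(demo_access_ok(v[i], v[j]) ==
			       demo_range_ok_wide(v[i], v[j]));
}
```

With a constant size, "limit - size" folds into a constant as well, which
is why the check can collapse to one compare in this model.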

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm64/include/asm/uaccess.h | 28 +++++-----------------------
 1 file changed, 5 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 357f7bd9c981..e8dce0cc5eaa 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -26,6 +26,8 @@
 #include <asm/memory.h>
 #include <asm/extable.h>
 
+static inline int __access_ok(const void __user *ptr, unsigned long size);
+
 /*
  * Test whether a block of memory is a valid user space address.
  * Returns 1 if the range is valid, 0 otherwise.
@@ -33,10 +35,8 @@
  * This is equivalent to the following test:
  * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
  */
-static inline unsigned long __access_ok(const void __user *addr, unsigned long size)
+static inline int access_ok(const void __user *addr, unsigned long size)
 {
-	unsigned long ret, limit = TASK_SIZE_MAX - 1;
-
 	/*
 	 * Asynchronous I/O running in a kernel thread does not have the
 	 * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag
@@ -46,27 +46,9 @@ static inline unsigned long __access_ok(const void __user *addr, unsigned long s
 	    (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
 		addr = untagged_addr(addr);
 
-	__chk_user_ptr(addr);
-	asm volatile(
-	// A + B <= C + 1 for all A,B,C, in four easy steps:
-	// 1: X = A + B; X' = X % 2^64
-	"	adds	%0, %3, %2\n"
-	// 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
-	"	csel	%1, xzr, %1, hi\n"
-	// 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
-	//    to compensate for the carry flag being set in step 4. For
-	//    X > 2^64, X' merely has to remain nonzero, which it does.
-	"	csinv	%0, %0, xzr, cc\n"
-	// 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
-	//    comes from the carry in being clear. Otherwise, we are
-	//    testing X' - C == 0, subject to the previous adjustments.
-	"	sbcs	xzr, %0, %1\n"
-	"	cset	%0, ls\n"
-	: "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
-
-	return ret;
+	return likely(__access_ok(addr, size));
 }
-#define __access_ok __access_ok
+#define access_ok access_ok
 
 #include <asm-generic/access_ok.h>
 
-- 
2.29.2



* [PATCH 09/14] m68k: drop custom __access_ok()
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (7 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 08/14] arm64: simplify access_ok() Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-15  0:37   ` Al Viro
  2022-02-14 16:34 ` [PATCH 10/14] uaccess: remove most CONFIG_SET_FS users Arnd Bergmann
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

While most m68k platforms use separate address spaces for user
and kernel space, at least ColdFire does not, and the others
have a TASK_SIZE that is smaller than the entire 4GB address
range.

Using the generic implementation of __access_ok() stops ColdFire
user space from trivially accessing kernel memory, and is probably
the right thing elsewhere for consistency as well.
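
To illustrate the difference (a simplified user-space model, not the
m68k code itself), the sketch below contrasts a helper that accepts every
range with one bounded by the task size, on a machine where user and
kernel share one address space. DEMO_TASK_SIZE and the demo_* names are
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: user space below this address, kernel above. */
#define DEMO_TASK_SIZE 0x80000000u

/* Model of the removed helper: accept every range. */
static bool demo_access_ok_always(uint32_t addr, uint32_t size)
{
	(void)addr;
	(void)size;
	return true;
}

/* Model of the generic helper: bound the range by the task size. */
static bool demo_access_ok_bounded(uint32_t addr, uint32_t size)
{
	return size <= DEMO_TASK_SIZE && addr <= DEMO_TASK_SIZE - size;
}

/* A "kernel" pointer passes the first check but not the second. */
void demo_kernel_pointer_case(void)
{
	const uint32_t kernel_addr = 0x80001000u;

	assert(demo_access_ok_always(kernel_addr, 4u));
	assert(!demo_access_ok_bounded(kernel_addr, 4u));
	assert(demo_access_ok_bounded(0x00001000u, 4u));
}
```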

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/m68k/include/asm/uaccess.h | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/arch/m68k/include/asm/uaccess.h b/arch/m68k/include/asm/uaccess.h
index d6bb5720365a..64914872a5c9 100644
--- a/arch/m68k/include/asm/uaccess.h
+++ b/arch/m68k/include/asm/uaccess.h
@@ -10,19 +10,6 @@
 #include <linux/compiler.h>
 #include <linux/types.h>
 #include <asm/extable.h>
-
-/* We let the MMU do all checking */
-static inline int __access_ok(const void __user *addr,
-			    unsigned long size)
-{
-	/*
-	 * XXX: for !CONFIG_CPU_HAS_ADDRESS_SPACES this really needs to check
-	 * for TASK_SIZE!
-	 * Removing this helper is probably sufficient.
-	 */
-	return 1;
-}
-#define __access_ok __access_ok
 #include <asm-generic/access_ok.h>
 
 /*
-- 
2.29.2



* [PATCH 10/14] uaccess: remove most CONFIG_SET_FS users
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (8 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 09/14] m68k: drop custom __access_ok() Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 17:06   ` Christoph Hellwig
  2022-02-14 16:34 ` [PATCH 11/14] sparc64: remove CONFIG_SET_FS support Arnd Bergmann
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

On almost all architectures, there are no remaining callers
of set_fs(), so CONFIG_SET_FS can be disabled and the thread_info
field, along with any references to it, removed.

This turns access_ok() into a cheaper check against TASK_SIZE_MAX.
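
A simplified user-space model of the change, for illustration only: with
set_fs() the limit is loaded from per-thread state and may be lifted
entirely, while without it the limit is a compile-time constant. The
struct, the DEMO_* values and the demo_* functions are hypothetical, and
real architectures encode KERNEL_DS in different ways.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TASK_SIZE_MAX 0xC0000000u	/* hypothetical user limit */
#define DEMO_KERNEL_DS     0xFFFFFFFFu	/* hypothetical "no limit" value */

/* Stand-in for the per-thread addr_limit field. */
struct demo_thread_info {
	uint32_t addr_limit;
};

/* set_fs() model: the limit is loaded from per-thread state. */
static bool demo_access_ok_set_fs(const struct demo_thread_info *ti,
				  uint32_t addr, uint32_t size)
{
	const uint32_t limit = ti->addr_limit;

	if (limit == DEMO_KERNEL_DS)
		return true;
	return size <= limit && addr <= limit - size;
}

/* Model without set_fs(): the limit is a compile-time constant. */
static bool demo_access_ok_const(uint32_t addr, uint32_t size)
{
	return size <= DEMO_TASK_SIZE_MAX &&
	       addr <= DEMO_TASK_SIZE_MAX - size;
}

/* Same answer for user ranges; only the set_fs() model can be widened. */
void demo_set_fs_case(void)
{
	const struct demo_thread_info user = { .addr_limit = DEMO_TASK_SIZE_MAX };
	const struct demo_thread_info kern = { .addr_limit = DEMO_KERNEL_DS };
	const uint32_t kernel_addr = 0xC0001000u;

	assert(demo_access_ok_set_fs(&user, 0x1000u, 8u) ==
	       demo_access_ok_const(0x1000u, 8u));
	assert(!demo_access_ok_set_fs(&user, kernel_addr, 8u));
	assert(demo_access_ok_set_fs(&kern, kernel_addr, 8u));
	assert(!demo_access_ok_const(kernel_addr, 8u));
}
```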

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/alpha/Kconfig                        |  1 -
 arch/alpha/include/asm/processor.h        |  4 --
 arch/alpha/include/asm/thread_info.h      |  2 -
 arch/alpha/include/asm/uaccess.h          | 19 ------
 arch/arc/Kconfig                          |  1 -
 arch/arc/include/asm/segment.h            | 20 -------
 arch/arc/include/asm/thread_info.h        |  3 -
 arch/arc/include/asm/uaccess.h            |  1 -
 arch/csky/Kconfig                         |  1 -
 arch/csky/include/asm/processor.h         |  2 -
 arch/csky/include/asm/segment.h           | 10 ----
 arch/csky/include/asm/thread_info.h       |  2 -
 arch/csky/include/asm/uaccess.h           |  3 -
 arch/csky/kernel/asm-offsets.c            |  1 -
 arch/h8300/Kconfig                        |  1 -
 arch/h8300/include/asm/processor.h        |  1 -
 arch/h8300/include/asm/segment.h          | 40 -------------
 arch/h8300/include/asm/thread_info.h      |  3 -
 arch/h8300/kernel/entry.S                 |  1 -
 arch/h8300/kernel/head_ram.S              |  1 -
 arch/h8300/mm/init.c                      |  6 --
 arch/h8300/mm/memory.c                    |  1 -
 arch/hexagon/Kconfig                      |  1 -
 arch/hexagon/include/asm/thread_info.h    |  6 --
 arch/hexagon/kernel/process.c             |  1 -
 arch/microblaze/Kconfig                   |  1 -
 arch/microblaze/include/asm/thread_info.h |  6 --
 arch/microblaze/include/asm/uaccess.h     | 24 --------
 arch/microblaze/kernel/asm-offsets.c      |  1 -
 arch/microblaze/kernel/process.c          |  1 -
 arch/nds32/Kconfig                        |  1 -
 arch/nds32/include/asm/thread_info.h      |  4 --
 arch/nds32/include/asm/uaccess.h          | 15 +----
 arch/nds32/mm/alignment.c                 |  3 -
 arch/nios2/Kconfig                        |  1 -
 arch/nios2/include/asm/thread_info.h      |  9 ---
 arch/nios2/include/asm/uaccess.h          | 12 ----
 arch/openrisc/Kconfig                     |  1 -
 arch/openrisc/include/asm/thread_info.h   |  7 ---
 arch/openrisc/include/asm/uaccess.h       | 23 --------
 arch/sparc/Kconfig                        |  2 +-
 arch/sparc/include/asm/processor_32.h     |  6 --
 arch/sparc/include/asm/uaccess_32.h       | 13 -----
 arch/sparc/kernel/process_32.c            |  2 -
 arch/xtensa/Kconfig                       |  1 -
 arch/xtensa/include/asm/asm-uaccess.h     | 71 -----------------------
 arch/xtensa/include/asm/processor.h       |  7 ---
 arch/xtensa/include/asm/thread_info.h     |  3 -
 arch/xtensa/include/asm/uaccess.h         | 16 -----
 arch/xtensa/kernel/asm-offsets.c          |  3 -
 include/asm-generic/uaccess.h             | 25 +-------
 51 files changed, 3 insertions(+), 387 deletions(-)
 delete mode 100644 arch/arc/include/asm/segment.h
 delete mode 100644 arch/csky/include/asm/segment.h
 delete mode 100644 arch/h8300/include/asm/segment.h

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index 4e87783c90ad..eee8b5b0a58b 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -35,7 +35,6 @@ config ALPHA
 	select OLD_SIGSUSPEND
 	select CPU_NO_EFFICIENT_FFS if !ALPHA_EV67
 	select MMU_GATHER_NO_RANGE
-	select SET_FS
 	select SPARSEMEM_EXTREME if SPARSEMEM
 	select ZONE_DMA
 	help
diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
index 090499c99c1c..43e234c518b1 100644
--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -26,10 +26,6 @@
 #define TASK_UNMAPPED_BASE \
   ((current->personality & ADDR_LIMIT_32BIT) ? 0x40000000 : TASK_SIZE / 2)
 
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
 /* This is dead.  Everything has been moved to thread_info.  */
 struct thread_struct { };
 #define INIT_THREAD  { }
diff --git a/arch/alpha/include/asm/thread_info.h b/arch/alpha/include/asm/thread_info.h
index 2592356e3215..fdc485d7787a 100644
--- a/arch/alpha/include/asm/thread_info.h
+++ b/arch/alpha/include/asm/thread_info.h
@@ -19,7 +19,6 @@ struct thread_info {
 	unsigned int		flags;		/* low level flags */
 	unsigned int		ieee_state;	/* see fpu.h */
 
-	mm_segment_t		addr_limit;	/* thread address space */
 	unsigned		cpu;		/* current CPU */
 	int			preempt_count; /* 0 => preemptable, <0 => BUG */
 	unsigned int		status;		/* thread-synchronous flags */
@@ -35,7 +34,6 @@ struct thread_info {
 #define INIT_THREAD_INFO(tsk)			\
 {						\
 	.task		= &tsk,			\
-	.addr_limit	= KERNEL_DS,		\
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
 }
 
diff --git a/arch/alpha/include/asm/uaccess.h b/arch/alpha/include/asm/uaccess.h
index 82c5743fc9cd..c32c2584c0b7 100644
--- a/arch/alpha/include/asm/uaccess.h
+++ b/arch/alpha/include/asm/uaccess.h
@@ -2,26 +2,7 @@
 #ifndef __ALPHA_UACCESS_H
 #define __ALPHA_UACCESS_H
 
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not.  If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * Or at least it did once upon a time.  Nowadays it is a mask that
- * defines which bits of the address space are off limits.  This is a
- * wee bit faster than the above.
- *
- * For historical reasons, these macros are grossly misnamed.
- */
-
-#define KERNEL_DS	((mm_segment_t) { 0UL })
-#define USER_DS		((mm_segment_t) { -0x40000000000UL })
-
-#define get_fs()  (current_thread_info()->addr_limit)
-#define set_fs(x) (current_thread_info()->addr_limit = (x))
-
 #include <asm-generic/access_ok.h>
-
 /*
  * These are the main single-value transfer routines.  They automatically
  * use the right size if we just have the right pointer type.
diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 3c2a4753d09b..e0a60a27e14d 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -45,7 +45,6 @@ config ARC
 	select PCI_SYSCALL if PCI
 	select PERF_USE_VMALLOC if ARC_CACHE_VIPT_ALIASING
 	select HAVE_ARCH_JUMP_LABEL if ISA_ARCV2 && !CPU_ENDIAN_BE32
-	select SET_FS
 	select TRACE_IRQFLAGS_SUPPORT
 
 config LOCKDEP_SUPPORT
diff --git a/arch/arc/include/asm/segment.h b/arch/arc/include/asm/segment.h
deleted file mode 100644
index 871f8ab11bfd..000000000000
--- a/arch/arc/include/asm/segment.h
+++ /dev/null
@@ -1,20 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2004, 2007-2010, 2011-2012 Synopsys, Inc. (www.synopsys.com)
- */
-
-#ifndef __ASMARC_SEGMENT_H
-#define __ASMARC_SEGMENT_H
-
-#ifndef __ASSEMBLY__
-
-typedef unsigned long mm_segment_t;
-
-#define MAKE_MM_SEG(s)	((mm_segment_t) { (s) })
-
-#define KERNEL_DS		MAKE_MM_SEG(0)
-#define USER_DS			MAKE_MM_SEG(TASK_SIZE)
-#define uaccess_kernel()	(get_fs() == KERNEL_DS)
-
-#endif /* __ASSEMBLY__ */
-#endif /* __ASMARC_SEGMENT_H */
diff --git a/arch/arc/include/asm/thread_info.h b/arch/arc/include/asm/thread_info.h
index d36863e34bfc..1e0b2e3914d5 100644
--- a/arch/arc/include/asm/thread_info.h
+++ b/arch/arc/include/asm/thread_info.h
@@ -27,7 +27,6 @@
 #ifndef __ASSEMBLY__
 
 #include <linux/thread_info.h>
-#include <asm/segment.h>
 
 /*
  * low level task data that entry.S needs immediate access to
@@ -40,7 +39,6 @@ struct thread_info {
 	unsigned long flags;		/* low level flags */
 	int preempt_count;		/* 0 => preemptable, <0 => BUG */
 	struct task_struct *task;	/* main task structure */
-	mm_segment_t addr_limit;	/* thread address space */
 	__u32 cpu;			/* current CPU */
 	unsigned long thr_ptr;		/* TLS ptr */
 };
@@ -56,7 +54,6 @@ struct thread_info {
 	.flags      = 0,			\
 	.cpu        = 0,			\
 	.preempt_count  = INIT_PREEMPT_COUNT,	\
-	.addr_limit = KERNEL_DS,		\
 }
 
 static inline __attribute_const__ struct thread_info *current_thread_info(void)
diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index 30f80b4be2ab..99712471c96a 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -638,7 +638,6 @@ extern unsigned long arc_clear_user_noinline(void __user *to,
 #define __clear_user(d, n)		arc_clear_user_noinline(d, n)
 #endif
 
-#include <asm/segment.h>
 #include <asm-generic/uaccess.h>
 
 #endif
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index 132f43f12dd8..75ef86605d69 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -79,7 +79,6 @@ config CSKY
 	select PCI_DOMAINS_GENERIC if PCI
 	select PCI_SYSCALL if PCI
 	select PCI_MSI if PCI
-	select SET_FS
 	select TRACE_IRQFLAGS_SUPPORT
 
 config LOCKDEP_SUPPORT
diff --git a/arch/csky/include/asm/processor.h b/arch/csky/include/asm/processor.h
index 817dd60ff152..688c7548b559 100644
--- a/arch/csky/include/asm/processor.h
+++ b/arch/csky/include/asm/processor.h
@@ -4,7 +4,6 @@
 #define __ASM_CSKY_PROCESSOR_H
 
 #include <linux/bitops.h>
-#include <asm/segment.h>
 #include <asm/ptrace.h>
 #include <asm/current.h>
 #include <asm/cache.h>
@@ -59,7 +58,6 @@ struct thread_struct {
  */
 #define start_thread(_regs, _pc, _usp)					\
 do {									\
-	set_fs(USER_DS); /* reads from user space */			\
 	(_regs)->pc = (_pc);						\
 	(_regs)->regs[1] = 0; /* ABIV1 is R7, uClibc_main rtdl arg */	\
 	(_regs)->regs[2] = 0;						\
diff --git a/arch/csky/include/asm/segment.h b/arch/csky/include/asm/segment.h
deleted file mode 100644
index 5bc1cc62b87f..000000000000
--- a/arch/csky/include/asm/segment.h
+++ /dev/null
@@ -1,10 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-
-#ifndef __ASM_CSKY_SEGMENT_H
-#define __ASM_CSKY_SEGMENT_H
-
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
-#endif /* __ASM_CSKY_SEGMENT_H */
diff --git a/arch/csky/include/asm/thread_info.h b/arch/csky/include/asm/thread_info.h
index 8c349a8f904d..b5ed788f0c68 100644
--- a/arch/csky/include/asm/thread_info.h
+++ b/arch/csky/include/asm/thread_info.h
@@ -16,7 +16,6 @@ struct thread_info {
 	unsigned long		flags;
 	int			preempt_count;
 	unsigned long		tp_value;
-	mm_segment_t		addr_limit;
 	struct restart_block	restart_block;
 	struct pt_regs		*regs;
 	unsigned int		cpu;
@@ -26,7 +25,6 @@ struct thread_info {
 {						\
 	.task		= &tsk,			\
 	.preempt_count  = INIT_PREEMPT_COUNT,	\
-	.addr_limit     = KERNEL_DS,		\
 	.cpu		= 0,			\
 	.restart_block = {			\
 		.fn = do_no_restart_syscall,	\
diff --git a/arch/csky/include/asm/uaccess.h b/arch/csky/include/asm/uaccess.h
index fec8f77ffc99..2e927c21d8a1 100644
--- a/arch/csky/include/asm/uaccess.h
+++ b/arch/csky/include/asm/uaccess.h
@@ -3,8 +3,6 @@
 #ifndef __ASM_CSKY_UACCESS_H
 #define __ASM_CSKY_UACCESS_H
 
-#define user_addr_max() (current_thread_info()->addr_limit.seg)
-
 /*
  * __put_user_fn
  */
@@ -200,7 +198,6 @@ unsigned long raw_copy_to_user(void *to, const void *from, unsigned long n);
 unsigned long __clear_user(void __user *to, unsigned long n);
 #define __clear_user __clear_user
 
-#include <asm/segment.h>
 #include <asm-generic/uaccess.h>
 
 #endif /* __ASM_CSKY_UACCESS_H */
diff --git a/arch/csky/kernel/asm-offsets.c b/arch/csky/kernel/asm-offsets.c
index 1cbcba4b0dd1..d1e903579473 100644
--- a/arch/csky/kernel/asm-offsets.c
+++ b/arch/csky/kernel/asm-offsets.c
@@ -25,7 +25,6 @@ int main(void)
 	/* offsets into the thread_info struct */
 	DEFINE(TINFO_FLAGS,       offsetof(struct thread_info, flags));
 	DEFINE(TINFO_PREEMPT,     offsetof(struct thread_info, preempt_count));
-	DEFINE(TINFO_ADDR_LIMIT,  offsetof(struct thread_info, addr_limit));
 	DEFINE(TINFO_TP_VALUE,   offsetof(struct thread_info, tp_value));
 	DEFINE(TINFO_TASK,        offsetof(struct thread_info, task));
 
diff --git a/arch/h8300/Kconfig b/arch/h8300/Kconfig
index 3e3e0f16f7e0..fe48c4f26cc8 100644
--- a/arch/h8300/Kconfig
+++ b/arch/h8300/Kconfig
@@ -24,7 +24,6 @@ config H8300
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_HASH
 	select CPU_NO_EFFICIENT_FFS
-	select SET_FS
 	select UACCESS_MEMCPY
 
 config CPU_BIG_ENDIAN
diff --git a/arch/h8300/include/asm/processor.h b/arch/h8300/include/asm/processor.h
index 141a23eb62b7..ba171aa4dacb 100644
--- a/arch/h8300/include/asm/processor.h
+++ b/arch/h8300/include/asm/processor.h
@@ -13,7 +13,6 @@
 #define __ASM_H8300_PROCESSOR_H
 
 #include <linux/compiler.h>
-#include <asm/segment.h>
 #include <asm/ptrace.h>
 #include <asm/current.h>
 
diff --git a/arch/h8300/include/asm/segment.h b/arch/h8300/include/asm/segment.h
deleted file mode 100644
index 37950725d9b9..000000000000
--- a/arch/h8300/include/asm/segment.h
+++ /dev/null
@@ -1,40 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _H8300_SEGMENT_H
-#define _H8300_SEGMENT_H
-
-/* define constants */
-#define USER_DATA     (1)
-#ifndef __USER_DS
-#define __USER_DS     (USER_DATA)
-#endif
-#define USER_PROGRAM  (2)
-#define SUPER_DATA    (3)
-#ifndef __KERNEL_DS
-#define __KERNEL_DS   (SUPER_DATA)
-#endif
-#define SUPER_PROGRAM (4)
-
-#ifndef __ASSEMBLY__
-
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
-#define MAKE_MM_SEG(s)	((mm_segment_t) { (s) })
-#define USER_DS		MAKE_MM_SEG(__USER_DS)
-#define KERNEL_DS	MAKE_MM_SEG(__KERNEL_DS)
-
-/*
- * Get/set the SFC/DFC registers for MOVES instructions
- */
-
-static inline mm_segment_t get_fs(void)
-{
-	return USER_DS;
-}
-
-#define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
-
-#endif /* __ASSEMBLY__ */
-
-#endif /* _H8300_SEGMENT_H */
diff --git a/arch/h8300/include/asm/thread_info.h b/arch/h8300/include/asm/thread_info.h
index a518214d4ddd..ff2d873749a4 100644
--- a/arch/h8300/include/asm/thread_info.h
+++ b/arch/h8300/include/asm/thread_info.h
@@ -10,7 +10,6 @@
 #define _ASM_THREAD_INFO_H
 
 #include <asm/page.h>
-#include <asm/segment.h>
 
 #ifdef __KERNEL__
 
@@ -31,7 +30,6 @@ struct thread_info {
 	unsigned long	   flags;		/* low level flags */
 	int		   cpu;			/* cpu we're on */
 	int		   preempt_count;	/* 0 => preemptable, <0 => BUG */
-	mm_segment_t		addr_limit;
 };
 
 /*
@@ -43,7 +41,6 @@ struct thread_info {
 	.flags =	0,			\
 	.cpu =		0,			\
 	.preempt_count = INIT_PREEMPT_COUNT,	\
-	.addr_limit	= KERNEL_DS,		\
 }
 
 /* how to get the thread information struct from C */
diff --git a/arch/h8300/kernel/entry.S b/arch/h8300/kernel/entry.S
index c6e289b5f1f2..42db87c17917 100644
--- a/arch/h8300/kernel/entry.S
+++ b/arch/h8300/kernel/entry.S
@@ -17,7 +17,6 @@
 #include <linux/sys.h>
 #include <asm/unistd.h>
 #include <asm/setup.h>
-#include <asm/segment.h>
 #include <asm/linkage.h>
 #include <asm/asm-offsets.h>
 #include <asm/thread_info.h>
diff --git a/arch/h8300/kernel/head_ram.S b/arch/h8300/kernel/head_ram.S
index dbf8429f5fab..489462f0ee57 100644
--- a/arch/h8300/kernel/head_ram.S
+++ b/arch/h8300/kernel/head_ram.S
@@ -4,7 +4,6 @@
 #include <linux/init.h>
 #include <asm/unistd.h>
 #include <asm/setup.h>
-#include <asm/segment.h>
 #include <asm/linkage.h>
 #include <asm/asm-offsets.h>
 #include <asm/thread_info.h>
diff --git a/arch/h8300/mm/init.c b/arch/h8300/mm/init.c
index f7bf4693e3b2..9fa13312720a 100644
--- a/arch/h8300/mm/init.c
+++ b/arch/h8300/mm/init.c
@@ -34,7 +34,6 @@
 #include <linux/gfp.h>
 
 #include <asm/setup.h>
-#include <asm/segment.h>
 #include <asm/page.h>
 #include <asm/sections.h>
 
@@ -71,11 +70,6 @@ void __init paging_init(void)
 		panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
 		      __func__, PAGE_SIZE, PAGE_SIZE);
 
-	/*
-	 * Set up SFC/DFC registers (user data space).
-	 */
-	set_fs(USER_DS);
-
 	pr_debug("before free_area_init\n");
 
 	pr_debug("free_area_init -> start_mem is %#lx\nvirtual_end is %#lx\n",
diff --git a/arch/h8300/mm/memory.c b/arch/h8300/mm/memory.c
index 4a60e2b5eb96..c950571064d2 100644
--- a/arch/h8300/mm/memory.c
+++ b/arch/h8300/mm/memory.c
@@ -24,7 +24,6 @@
 #include <linux/types.h>
 
 #include <asm/setup.h>
-#include <asm/segment.h>
 #include <asm/page.h>
 #include <asm/traps.h>
 #include <asm/io.h>
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 15dd8f38b698..54eadf265178 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -30,7 +30,6 @@ config HEXAGON
 	select GENERIC_CLOCKEVENTS_BROADCAST
 	select MODULES_USE_ELF_RELA
 	select GENERIC_CPU_DEVICES
-	select SET_FS
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select TRACE_IRQFLAGS_SUPPORT
 	help
diff --git a/arch/hexagon/include/asm/thread_info.h b/arch/hexagon/include/asm/thread_info.h
index 535976665bf0..e90f280b9ce3 100644
--- a/arch/hexagon/include/asm/thread_info.h
+++ b/arch/hexagon/include/asm/thread_info.h
@@ -22,10 +22,6 @@
 
 #ifndef __ASSEMBLY__
 
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
 /*
  * This is union'd with the "bottom" of the kernel stack.
  * It keeps track of thread info which is handy for routines
@@ -37,7 +33,6 @@ struct thread_info {
 	unsigned long		flags;          /* low level flags */
 	__u32                   cpu;            /* current cpu */
 	int                     preempt_count;  /* 0=>preemptible,<0=>BUG */
-	mm_segment_t            addr_limit;     /* segmentation sux */
 	/*
 	 * used for syscalls somehow;
 	 * seems to have a function pointer and four arguments
@@ -66,7 +61,6 @@ struct thread_info {
 	.flags          = 0,                    \
 	.cpu            = 0,                    \
 	.preempt_count  = 1,                    \
-	.addr_limit     = KERNEL_DS,            \
 	.sp = 0,				\
 	.regs = NULL,			\
 }
diff --git a/arch/hexagon/kernel/process.c b/arch/hexagon/kernel/process.c
index 232dfd8956aa..dfa6b2757c05 100644
--- a/arch/hexagon/kernel/process.c
+++ b/arch/hexagon/kernel/process.c
@@ -105,7 +105,6 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
 	/*
 	 * Parent sees new pid -- not necessary, not even possible at
 	 * this point in the fork process
-	 * Might also want to set things like ti->addr_limit
 	 */
 
 	return 0;
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index 59798e43cdb0..1fb1cec087b7 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -42,7 +42,6 @@ config MICROBLAZE
 	select CPU_NO_EFFICIENT_FFS
 	select MMU_GATHER_NO_RANGE
 	select SPARSE_IRQ
-	select SET_FS
 	select ZONE_DMA
 	select TRACE_IRQFLAGS_SUPPORT
 
diff --git a/arch/microblaze/include/asm/thread_info.h b/arch/microblaze/include/asm/thread_info.h
index 44f5ca331862..a0ddd2a36fb9 100644
--- a/arch/microblaze/include/asm/thread_info.h
+++ b/arch/microblaze/include/asm/thread_info.h
@@ -56,17 +56,12 @@ struct cpu_context {
 	__u32	fsr;
 };
 
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
 struct thread_info {
 	struct task_struct	*task; /* main task structure */
 	unsigned long		flags; /* low level flags */
 	unsigned long		status; /* thread-synchronous flags */
 	__u32			cpu; /* current CPU */
 	__s32			preempt_count; /* 0 => preemptable,< 0 => BUG*/
-	mm_segment_t		addr_limit; /* thread address space */
 
 	struct cpu_context	cpu_context;
 };
@@ -80,7 +75,6 @@ struct thread_info {
 	.flags		= 0,			\
 	.cpu		= 0,			\
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
-	.addr_limit	= KERNEL_DS,		\
 }
 
 /* how to get the thread information struct from C */
diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
index dd82e90adb52..ea0c1f11035f 100644
--- a/arch/microblaze/include/asm/uaccess.h
+++ b/arch/microblaze/include/asm/uaccess.h
@@ -15,30 +15,6 @@
 #include <linux/pgtable.h>
 #include <asm/extable.h>
 #include <linux/string.h>
-
-/*
- * On Microblaze the fs value is actually the top of the corresponding
- * address space.
- *
- * The fs value determines whether argument validity checking should be
- * performed or not. If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- *
- * For non-MMU arch like Microblaze, KERNEL_DS and USER_DS is equal.
- */
-# define MAKE_MM_SEG(s)       ((mm_segment_t) { (s) })
-
-#  define KERNEL_DS	MAKE_MM_SEG(0xFFFFFFFF)
-#  define USER_DS	MAKE_MM_SEG(TASK_SIZE - 1)
-
-# define get_fs()	(current_thread_info()->addr_limit)
-# define set_fs(val)	(current_thread_info()->addr_limit = (val))
-# define user_addr_max() get_fs().seg
-
-# define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
-
 #include <asm-generic/access_ok.h>
 
 # define __FIXUP_SECTION	".section .fixup,\"ax\"\n"
diff --git a/arch/microblaze/kernel/asm-offsets.c b/arch/microblaze/kernel/asm-offsets.c
index b77dd188dec4..47ee409508b1 100644
--- a/arch/microblaze/kernel/asm-offsets.c
+++ b/arch/microblaze/kernel/asm-offsets.c
@@ -86,7 +86,6 @@ int main(int argc, char *argv[])
 	/* struct thread_info */
 	DEFINE(TI_TASK, offsetof(struct thread_info, task));
 	DEFINE(TI_FLAGS, offsetof(struct thread_info, flags));
-	DEFINE(TI_ADDR_LIMIT, offsetof(struct thread_info, addr_limit));
 	DEFINE(TI_CPU_CONTEXT, offsetof(struct thread_info, cpu_context));
 	DEFINE(TI_PREEMPT_COUNT, offsetof(struct thread_info, preempt_count));
 	BLANK();
diff --git a/arch/microblaze/kernel/process.c b/arch/microblaze/kernel/process.c
index 5e2b91c1e8ce..1b944d319d73 100644
--- a/arch/microblaze/kernel/process.c
+++ b/arch/microblaze/kernel/process.c
@@ -18,7 +18,6 @@
 #include <linux/tick.h>
 #include <linux/bitops.h>
 #include <linux/ptrace.h>
-#include <linux/uaccess.h> /* for USER_DS macros */
 #include <asm/cacheflush.h>
 
 void show_regs(struct pt_regs *regs)
diff --git a/arch/nds32/Kconfig b/arch/nds32/Kconfig
index 4d1421b18734..013249430fa3 100644
--- a/arch/nds32/Kconfig
+++ b/arch/nds32/Kconfig
@@ -44,7 +44,6 @@ config NDS32
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_DYNAMIC_FTRACE
-	select SET_FS
 	select TRACE_IRQFLAGS_SUPPORT
 	help
 	  Andes(nds32) Linux support.
diff --git a/arch/nds32/include/asm/thread_info.h b/arch/nds32/include/asm/thread_info.h
index d3967ad184f0..bd8f81cf2ce5 100644
--- a/arch/nds32/include/asm/thread_info.h
+++ b/arch/nds32/include/asm/thread_info.h
@@ -16,8 +16,6 @@ struct task_struct;
 #include <asm/ptrace.h>
 #include <asm/types.h>
 
-typedef unsigned long mm_segment_t;
-
 /*
  * low level task data that entry.S needs immediate access to.
  * __switch_to() assumes cpu_context follows immediately after cpu_domain.
@@ -25,12 +23,10 @@ typedef unsigned long mm_segment_t;
 struct thread_info {
 	unsigned long flags;	/* low level flags */
 	__s32 preempt_count;	/* 0 => preemptable, <0 => bug */
-	mm_segment_t addr_limit;	/* address limit */
 };
 #define INIT_THREAD_INFO(tsk)						\
 {									\
 	.preempt_count	= INIT_PREEMPT_COUNT,				\
-	.addr_limit	= KERNEL_DS,					\
 }
 #define thread_saved_pc(tsk) ((unsigned long)(tsk->thread.cpu_context.pc))
 #define thread_saved_fp(tsk) ((unsigned long)(tsk->thread.cpu_context.fp))
diff --git a/arch/nds32/include/asm/uaccess.h b/arch/nds32/include/asm/uaccess.h
index 832d642a4068..377548d4451a 100644
--- a/arch/nds32/include/asm/uaccess.h
+++ b/arch/nds32/include/asm/uaccess.h
@@ -11,6 +11,7 @@
 #include <asm/errno.h>
 #include <asm/memory.h>
 #include <asm/types.h>
+#include <asm-generic/access_ok.h>
 
 #define __asmeq(x, y)  ".ifnc " x "," y " ; .err ; .endif\n\t"
 
@@ -33,20 +34,6 @@ struct exception_table_entry {
 
 extern int fixup_exception(struct pt_regs *regs);
 
-#define KERNEL_DS 	((mm_segment_t) { ~0UL })
-#define USER_DS		((mm_segment_t) {TASK_SIZE - 1})
-
-#define get_fs()	(current_thread_info()->addr_limit)
-#define user_addr_max	get_fs
-#define uaccess_kernel() (get_fs() == KERNEL_DS)
-
-static inline void set_fs(mm_segment_t fs)
-{
-	current_thread_info()->addr_limit = fs;
-}
-
-#include <asm-generic/access_ok.h>
-
 /*
  * Single-value transfer routines.  They automatically use the right
  * size if we just have the right pointer type.  Note that the functions
diff --git a/arch/nds32/mm/alignment.c b/arch/nds32/mm/alignment.c
index 1eb7ded6992b..9c2c0a454da8 100644
--- a/arch/nds32/mm/alignment.c
+++ b/arch/nds32/mm/alignment.c
@@ -512,7 +512,6 @@ int do_unaligned_access(unsigned long addr, struct pt_regs *regs)
 {
 	unsigned long inst;
 	int ret = -EFAULT;
-	mm_segment_t seg;
 
 	inst = get_inst(regs->ipc);
 
@@ -520,12 +519,10 @@ int do_unaligned_access(unsigned long addr, struct pt_regs *regs)
 	      "Faulting addr: 0x%08lx, pc: 0x%08lx [inst: 0x%08lx ]\n", addr,
 	      regs->ipc, inst);
 
-	seg = force_uaccess_begin();
 	if (inst & NDS32_16BIT_INSTRUCTION)
 		ret = do_16((inst >> 16) & 0xffff, regs);
 	else
 		ret = do_32(inst, regs);
-	force_uaccess_end(seg);
 
 	return ret;
 }
diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
index 33fd06f5fa41..4167f1eb4cd8 100644
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -24,7 +24,6 @@ config NIOS2
 	select USB_ARCH_HAS_HCD if USB_SUPPORT
 	select CPU_NO_EFFICIENT_FFS
 	select MMU_GATHER_NO_RANGE if MMU
-	select SET_FS
 
 config GENERIC_CSUM
 	def_bool y
diff --git a/arch/nios2/include/asm/thread_info.h b/arch/nios2/include/asm/thread_info.h
index 272d2c72a727..bcc0e9915ebd 100644
--- a/arch/nios2/include/asm/thread_info.h
+++ b/arch/nios2/include/asm/thread_info.h
@@ -26,10 +26,6 @@
 
 #ifndef __ASSEMBLY__
 
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
 /*
  * low level task data that entry.S needs immediate access to
  * - this struct should fit entirely inside of one cache line
@@ -42,10 +38,6 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	__u32			cpu;		/* current CPU */
 	int			preempt_count;	/* 0 => preemptable,<0 => BUG */
-	mm_segment_t		addr_limit;	/* thread address space:
-						  0-0x7FFFFFFF for user-thead
-						  0-0xFFFFFFFF for kernel-thread
-						*/
 	struct pt_regs		*regs;
 };
 
@@ -60,7 +52,6 @@ struct thread_info {
 	.flags		= 0,			\
 	.cpu		= 0,			\
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
-	.addr_limit	= KERNEL_DS,		\
 }
 
 /* how to get the thread information struct from C */
diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h
index 9a7658df7f8d..6d364e459458 100644
--- a/arch/nios2/include/asm/uaccess.h
+++ b/arch/nios2/include/asm/uaccess.h
@@ -18,18 +18,6 @@
 #include <asm/page.h>
 
 #include <asm/extable.h>
-
-/*
- * Segment stuff
- */
-#define MAKE_MM_SEG(s)		((mm_segment_t) { (s) })
-#define USER_DS			MAKE_MM_SEG(0x80000000UL)
-#define KERNEL_DS		MAKE_MM_SEG(0)
-
-
-#define get_fs()		(current_thread_info()->addr_limit)
-#define set_fs(seg)		(current_thread_info()->addr_limit = (seg))
-
 #include <asm-generic/access_ok.h>
 
 # define __EX_TABLE_SECTION	".section __ex_table,\"a\"\n"
diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index f724b3f1aeed..0d68adf6e02b 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -36,7 +36,6 @@ config OPENRISC
 	select ARCH_WANT_FRAME_POINTERS
 	select GENERIC_IRQ_MULTI_HANDLER
 	select MMU_GATHER_NO_RANGE if MMU
-	select SET_FS
 	select TRACE_IRQFLAGS_SUPPORT
 
 config CPU_BIG_ENDIAN
diff --git a/arch/openrisc/include/asm/thread_info.h b/arch/openrisc/include/asm/thread_info.h
index 659834ab87fa..4af3049c34c2 100644
--- a/arch/openrisc/include/asm/thread_info.h
+++ b/arch/openrisc/include/asm/thread_info.h
@@ -40,18 +40,12 @@
  */
 #ifndef __ASSEMBLY__
 
-typedef unsigned long mm_segment_t;
-
 struct thread_info {
 	struct task_struct	*task;		/* main task structure */
 	unsigned long		flags;		/* low level flags */
 	__u32			cpu;		/* current CPU */
 	__s32			preempt_count; /* 0 => preemptable, <0 => BUG */
 
-	mm_segment_t		addr_limit; /* thread address space:
-					       0-0x7FFFFFFF for user-thead
-					       0-0xFFFFFFFF for kernel-thread
-					     */
 	__u8			supervisor_stack[0];
 
 	/* saved context data */
@@ -71,7 +65,6 @@ struct thread_info {
 	.flags		= 0,				\
 	.cpu		= 0,				\
 	.preempt_count	= INIT_PREEMPT_COUNT,		\
-	.addr_limit	= KERNEL_DS,			\
 	.ksp            = 0,                            \
 }
 
diff --git a/arch/openrisc/include/asm/uaccess.h b/arch/openrisc/include/asm/uaccess.h
index 8f049ec99b3e..d6500a374e18 100644
--- a/arch/openrisc/include/asm/uaccess.h
+++ b/arch/openrisc/include/asm/uaccess.h
@@ -22,29 +22,6 @@
 #include <linux/string.h>
 #include <asm/page.h>
 #include <asm/extable.h>
-
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not.  If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- */
-
-/* addr_limit is the maximum accessible address for the task. we misuse
- * the KERNEL_DS and USER_DS values to both assign and compare the
- * addr_limit values through the equally misnamed get/set_fs macros.
- * (see above)
- */
-
-#define KERNEL_DS	(~0UL)
-
-#define USER_DS		(TASK_SIZE)
-#define get_fs()	(current_thread_info()->addr_limit)
-#define set_fs(x)	(current_thread_info()->addr_limit = (x))
-
-#define uaccess_kernel()	(get_fs() == KERNEL_DS)
-
 #include <asm-generic/access_ok.h>
 
 /*
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 1cab1b284f1a..875388835a58 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -46,7 +46,6 @@ config SPARC
 	select LOCKDEP_SMALL if LOCKDEP
 	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
-	select SET_FS
 	select TRACE_IRQFLAGS_SUPPORT
 
 config SPARC32
@@ -100,6 +99,7 @@ config SPARC64
 	select HAVE_SETUP_PER_CPU_AREA
 	select NEED_PER_CPU_EMBED_FIRST_CHUNK
 	select NEED_PER_CPU_PAGE_FIRST_CHUNK
+	select SET_FS
 
 config ARCH_PROC_KCORE_TEXT
 	def_bool y
diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h
index 647bf0ac7beb..b26c35336b51 100644
--- a/arch/sparc/include/asm/processor_32.h
+++ b/arch/sparc/include/asm/processor_32.h
@@ -32,10 +32,6 @@ struct fpq {
 };
 #endif
 
-typedef struct {
-	int seg;
-} mm_segment_t;
-
 /* The Sparc processor specific thread struct. */
 struct thread_struct {
 	struct pt_regs *kregs;
@@ -50,11 +46,9 @@ struct thread_struct {
 	unsigned long   fsr;
 	unsigned long   fpqdepth;
 	struct fpq	fpqueue[16];
-	mm_segment_t current_ds;
 };
 
 #define INIT_THREAD  { \
-	.current_ds = KERNEL_DS, \
 	.kregs = (struct pt_regs *)(init_stack+THREAD_SIZE)-1 \
 }
 
diff --git a/arch/sparc/include/asm/uaccess_32.h b/arch/sparc/include/asm/uaccess_32.h
index 367747116260..9fd6c53644b6 100644
--- a/arch/sparc/include/asm/uaccess_32.h
+++ b/arch/sparc/include/asm/uaccess_32.h
@@ -12,19 +12,6 @@
 #include <linux/string.h>
 
 #include <asm/processor.h>
-
-/* Sparc is not segmented, however we need to be able to fool access_ok()
- * when doing system calls from kernel mode legitimately.
- *
- * "For historical reasons, these macros are grossly misnamed." -Linus
- */
-
-#define KERNEL_DS   ((mm_segment_t) { 0 })
-#define USER_DS     ((mm_segment_t) { -1 })
-
-#define get_fs()	(current->thread.current_ds)
-#define set_fs(val)	((current->thread.current_ds) = (val))
-
 #include <asm-generic/access_ok.h>
 
 /* Uh, these should become the main single-value transfer routines..
diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
index 2dc0bf9fe62e..88c0c14aaff0 100644
--- a/arch/sparc/kernel/process_32.c
+++ b/arch/sparc/kernel/process_32.c
@@ -300,7 +300,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
 		extern int nwindows;
 		unsigned long psr;
 		memset(new_stack, 0, STACKFRAME_SZ + TRACEREG_SZ);
-		p->thread.current_ds = KERNEL_DS;
 		ti->kpc = (((unsigned long) ret_from_kernel_thread) - 0x8);
 		childregs->u_regs[UREG_G1] = sp; /* function */
 		childregs->u_regs[UREG_G2] = arg;
@@ -311,7 +310,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
 	}
 	memcpy(new_stack, (char *)regs - STACKFRAME_SZ, STACKFRAME_SZ + TRACEREG_SZ);
 	childregs->u_regs[UREG_FP] = sp;
-	p->thread.current_ds = USER_DS;
 	ti->kpc = (((unsigned long) ret_from_fork) - 0x8);
 	ti->kpsr = current->thread.fork_kpsr | PSR_PIL;
 	ti->kwim = current->thread.fork_kwim;
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index 8ac599aa6d99..09f7616a0b46 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -40,7 +40,6 @@ config XTENSA
 	select IRQ_DOMAIN
 	select MODULES_USE_ELF_RELA
 	select PERF_USE_VMALLOC
-	select SET_FS
 	select TRACE_IRQFLAGS_SUPPORT
 	select VIRT_TO_BUS
 	help
diff --git a/arch/xtensa/include/asm/asm-uaccess.h b/arch/xtensa/include/asm/asm-uaccess.h
index 7f6cf4151843..7cec869136e3 100644
--- a/arch/xtensa/include/asm/asm-uaccess.h
+++ b/arch/xtensa/include/asm/asm-uaccess.h
@@ -23,76 +23,6 @@
 #include <asm/asm-offsets.h>
 #include <asm/processor.h>
 
-/*
- * These assembly macros mirror the C macros in asm/uaccess.h.  They
- * should always have identical functionality.  See
- * arch/xtensa/kernel/sys.S for usage.
- */
-
-#define KERNEL_DS	0
-#define USER_DS		1
-
-/*
- * get_fs reads current->thread.current_ds into a register.
- * On Entry:
- * 	<ad>	anything
- * 	<sp>	stack
- * On Exit:
- * 	<ad>	contains current->thread.current_ds
- */
-	.macro	get_fs	ad, sp
-	GET_CURRENT(\ad,\sp)
-#if THREAD_CURRENT_DS > 1020
-	addi	\ad, \ad, TASK_THREAD
-	l32i	\ad, \ad, THREAD_CURRENT_DS - TASK_THREAD
-#else
-	l32i	\ad, \ad, THREAD_CURRENT_DS
-#endif
-	.endm
-
-/*
- * set_fs sets current->thread.current_ds to some value.
- * On Entry:
- *	<at>	anything (temp register)
- *	<av>	value to write
- *	<sp>	stack
- * On Exit:
- *	<at>	destroyed (actually, current)
- *	<av>	preserved, value to write
- */
-	.macro	set_fs	at, av, sp
-	GET_CURRENT(\at,\sp)
-	s32i	\av, \at, THREAD_CURRENT_DS
-	.endm
-
-/*
- * kernel_ok determines whether we should bypass addr/size checking.
- * See the equivalent C-macro version below for clarity.
- * On success, kernel_ok branches to a label indicated by parameter
- * <success>.  This implies that the macro falls through to the next
- * insruction on an error.
- *
- * Note that while this macro can be used independently, we designed
- * in for optimal use in the access_ok macro below (i.e., we fall
- * through on error).
- *
- * On Entry:
- * 	<at>		anything (temp register)
- * 	<success>	label to branch to on success; implies
- * 			fall-through macro on error
- * 	<sp>		stack pointer
- * On Exit:
- * 	<at>		destroyed (actually, current->thread.current_ds)
- */
-
-#if ((KERNEL_DS != 0) || (USER_DS == 0))
-# error Assembly macro kernel_ok fails
-#endif
-	.macro	kernel_ok  at, sp, success
-	get_fs	\at, \sp
-	beqz	\at, \success
-	.endm
-
 /*
  * user_ok determines whether the access to user-space memory is allowed.
  * See the equivalent C-macro version below for clarity.
@@ -147,7 +77,6 @@
  * 	<at>	destroyed
  */
 	.macro	access_ok  aa, as, at, sp, error
-	kernel_ok  \at, \sp, .Laccess_ok_\@
 	user_ok    \aa, \as, \at, \error
 .Laccess_ok_\@:
 	.endm
diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h
index 37d3e9887fe7..abad7c3df46f 100644
--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -152,18 +152,12 @@
  */
 #define SPILL_SLOT_CALL12(sp, reg) (*(((unsigned long *)(sp)) - 16 + (reg)))
 
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
 struct thread_struct {
 
 	/* kernel's return address and stack pointer for context switching */
 	unsigned long ra; /* kernel's a0: return address and window call size */
 	unsigned long sp; /* kernel's a1: stack pointer */
 
-	mm_segment_t current_ds;    /* see uaccess.h for example uses */
-
 	/* struct xtensa_cpuinfo info; */
 
 	unsigned long bad_vaddr; /* last user fault */
@@ -186,7 +180,6 @@ struct thread_struct {
 {									\
 	ra:		0, 						\
 	sp:		sizeof(init_stack) + (long) &init_stack,	\
-	current_ds:	{0},						\
 	/*info:		{0}, */						\
 	bad_vaddr:	0,						\
 	bad_uaddr:	0,						\
diff --git a/arch/xtensa/include/asm/thread_info.h b/arch/xtensa/include/asm/thread_info.h
index a312333a9add..f6fcbba1d02f 100644
--- a/arch/xtensa/include/asm/thread_info.h
+++ b/arch/xtensa/include/asm/thread_info.h
@@ -52,8 +52,6 @@ struct thread_info {
 	__u32			cpu;		/* current CPU */
 	__s32			preempt_count;	/* 0 => preemptable,< 0 => BUG*/
 
-	mm_segment_t		addr_limit;	/* thread address space */
-
 	unsigned long		cpenable;
 #if XCHAL_HAVE_EXCLUSIVE
 	/* result of the most recent exclusive store */
@@ -81,7 +79,6 @@ struct thread_info {
 	.flags		= 0,			\
 	.cpu		= 0,			\
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
-	.addr_limit	= KERNEL_DS,		\
 }
 
 /* how to get the thread information struct from C */
diff --git a/arch/xtensa/include/asm/uaccess.h b/arch/xtensa/include/asm/uaccess.h
index 0edd9e4b23d0..56aec6d504fe 100644
--- a/arch/xtensa/include/asm/uaccess.h
+++ b/arch/xtensa/include/asm/uaccess.h
@@ -19,22 +19,6 @@
 #include <linux/prefetch.h>
 #include <asm/types.h>
 #include <asm/extable.h>
-
-/*
- * The fs value determines whether argument validity checking should
- * be performed or not.  If get_fs() == USER_DS, checking is
- * performed, with get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons (Data Segment Register?), these macros are
- * grossly misnamed.
- */
-
-#define KERNEL_DS	((mm_segment_t) { 0 })
-#define USER_DS		((mm_segment_t) { 1 })
-
-#define get_fs()	(current->thread.current_ds)
-#define set_fs(val)	(current->thread.current_ds = (val))
-
 #include <asm-generic/access_ok.h>
 
 /*
diff --git a/arch/xtensa/kernel/asm-offsets.c b/arch/xtensa/kernel/asm-offsets.c
index dc5c83cad9be..f1fd1390d069 100644
--- a/arch/xtensa/kernel/asm-offsets.c
+++ b/arch/xtensa/kernel/asm-offsets.c
@@ -87,7 +87,6 @@ int main(void)
 	OFFSET(TI_STSTUS, thread_info, status);
 	OFFSET(TI_CPU, thread_info, cpu);
 	OFFSET(TI_PRE_COUNT, thread_info, preempt_count);
-	OFFSET(TI_ADDR_LIMIT, thread_info, addr_limit);
 
 	/* struct thread_info (offset from start_struct) */
 	DEFINE(THREAD_RA, offsetof (struct task_struct, thread.ra));
@@ -108,8 +107,6 @@ int main(void)
 #endif
 	DEFINE(THREAD_XTREGS_USER, offsetof (struct thread_info, xtregs_user));
 	DEFINE(XTREGS_USER_SIZE, sizeof(xtregs_user_t));
-	DEFINE(THREAD_CURRENT_DS, offsetof (struct task_struct, \
-	       thread.current_ds));
 
 	/* struct mm_struct */
 	DEFINE(MM_USERS, offsetof(struct mm_struct, mm_users));
diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index ebc685dc8d74..a5be9e61a2a2 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -8,6 +8,7 @@
  * address space, e.g. all NOMMU machines.
  */
 #include <linux/string.h>
+#include <asm-generic/access_ok.h>
 
 #ifdef CONFIG_UACCESS_MEMCPY
 #include <asm/unaligned.h>
@@ -94,30 +95,6 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 #define INLINE_COPY_TO_USER
 #endif /* CONFIG_UACCESS_MEMCPY */
 
-#ifdef CONFIG_SET_FS
-#define MAKE_MM_SEG(s)	((mm_segment_t) { (s) })
-
-#ifndef KERNEL_DS
-#define KERNEL_DS	MAKE_MM_SEG(~0UL)
-#endif
-
-#ifndef USER_DS
-#define USER_DS		MAKE_MM_SEG(TASK_SIZE - 1)
-#endif
-
-#ifndef get_fs
-#define get_fs()	(current_thread_info()->addr_limit)
-
-static inline void set_fs(mm_segment_t fs)
-{
-	current_thread_info()->addr_limit = fs;
-}
-#endif
-
-#endif /* CONFIG_SET_FS */
-
-#include <asm-generic/access_ok.h>
-
 /*
  * These are the main single-value transfer routines.  They automatically
  * use the right size if we just have the right pointer type.
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 11/14] sparc64: remove CONFIG_SET_FS support
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (9 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 10/14] uaccess: remove most CONFIG_SET_FS users Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 17:06   ` Christoph Hellwig
  2022-02-15  0:48   ` Al Viro
  2022-02-14 16:34 ` [PATCH 12/14] sh: " Arnd Bergmann
                   ` (3 subsequent siblings)
  14 siblings, 2 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

sparc64 uses address space identifiers to differentiate between kernel
and user space, using ASI_P for kernel threads but ASI_AIUS for normal
user space, with the option of changing between them.

As nothing really changes the ASI any more, just hardcode ASI_AIUS
everywhere. Kernel threads are not allowed to access __user pointers
anyway.
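
To illustrate the effect on the context switch path, here is a minimal
user-space model, not the kernel code. The asi_reg variable, struct
thread_model and both switch_to_*() helpers are hypothetical stand-ins,
and the ASI encodings are assumed to match the sparc64 headers.

```c
#include <stdint.h>

/* Assumed ASI encodings; treat them as illustrative values. */
#define ASI_AIUS	0x11	/* "as if user, secondary": user address space */
#define ASI_P		0x80	/* primary: kernel address space */

/* Hypothetical stand-in for the %asi register. */
static uint8_t asi_reg;

/* Hypothetical per-thread state; only the old scheme reads current_ds. */
struct thread_model {
	uint8_t current_ds;
};

/* Old scheme: every context switch reloads the per-thread ASI value. */
static void switch_to_old(const struct thread_model *next)
{
	asi_reg = next->current_ds;
}

/* New scheme: the ASI is a constant, so no per-thread field is needed. */
static void switch_to_new(const struct thread_model *next)
{
	(void)next;
	asi_reg = ASI_AIUS;
}

/* Read back the modelled register. */
static uint8_t current_asi(void)
{
	return asi_reg;
}
```

In this model a kernel thread ends up with ASI_AIUS as well, which is
harmless only under the assumption stated above that kernel threads
never dereference __user pointers.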

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/sparc/Kconfig                      |  1 -
 arch/sparc/include/asm/processor_64.h   |  4 ----
 arch/sparc/include/asm/switch_to_64.h   |  4 +---
 arch/sparc/include/asm/thread_info_64.h |  4 +---
 arch/sparc/include/asm/uaccess_64.h     | 24 ------------------------
 arch/sparc/kernel/process_64.c          | 12 ------------
 arch/sparc/kernel/traps_64.c            |  2 --
 arch/sparc/lib/NGmemcpy.S               |  3 +--
 arch/sparc/mm/init_64.c                 |  3 ---
 9 files changed, 3 insertions(+), 54 deletions(-)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 875388835a58..5f08e4d16ad8 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -99,7 +99,6 @@ config SPARC64
 	select HAVE_SETUP_PER_CPU_AREA
 	select NEED_PER_CPU_EMBED_FIRST_CHUNK
 	select NEED_PER_CPU_PAGE_FIRST_CHUNK
-	select SET_FS
 
 config ARCH_PROC_KCORE_TEXT
 	def_bool y
diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h
index ae851e8fce4c..89850dff6b03 100644
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -47,10 +47,6 @@
 
 #ifndef __ASSEMBLY__
 
-typedef struct {
-	unsigned char seg;
-} mm_segment_t;
-
 /* The Sparc processor specific thread struct. */
 /* XXX This should die, everything can go into thread_info now. */
 struct thread_struct {
diff --git a/arch/sparc/include/asm/switch_to_64.h b/arch/sparc/include/asm/switch_to_64.h
index b1d4e2e3210f..14f3c49bfdbc 100644
--- a/arch/sparc/include/asm/switch_to_64.h
+++ b/arch/sparc/include/asm/switch_to_64.h
@@ -20,10 +20,8 @@ do {						\
 	 */
 #define switch_to(prev, next, last)					\
 do {	save_and_clear_fpu();						\
-	/* If you are tempted to conditionalize the following */	\
-	/* so that ASI is only written if it changes, think again. */	\
 	__asm__ __volatile__("wr %%g0, %0, %%asi"			\
-	: : "r" (task_thread_info(next)->current_ds));\
+	: : "r" (ASI_AIUS));						\
 	trap_block[current_thread_info()->cpu].thread =			\
 		task_thread_info(next);					\
 	__asm__ __volatile__(						\
diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h
index 8047a9caab2f..1a44372e2bc0 100644
--- a/arch/sparc/include/asm/thread_info_64.h
+++ b/arch/sparc/include/asm/thread_info_64.h
@@ -46,7 +46,7 @@ struct thread_info {
 	struct pt_regs		*kregs;
 	int			preempt_count;	/* 0 => preemptable, <0 => BUG */
 	__u8			new_child;
-	__u8			current_ds;
+	__u8			__pad;
 	__u16			cpu;
 
 	unsigned long		*utraps;
@@ -81,7 +81,6 @@ struct thread_info {
 #define TI_KREGS	0x00000028
 #define TI_PRE_COUNT	0x00000030
 #define TI_NEW_CHILD	0x00000034
-#define TI_CURRENT_DS	0x00000035
 #define TI_CPU		0x00000036
 #define TI_UTRAPS	0x00000038
 #define TI_REG_WINDOW	0x00000040
@@ -116,7 +115,6 @@ struct thread_info {
 #define INIT_THREAD_INFO(tsk)				\
 {							\
 	.task		=	&tsk,			\
-	.current_ds	=	ASI_P,			\
 	.preempt_count	=	INIT_PREEMPT_COUNT,	\
 	.kregs		=	(struct pt_regs *)(init_stack+THREAD_SIZE)-1 \
 }
diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
index 000bac67cf31..617a462d1f56 100644
--- a/arch/sparc/include/asm/uaccess_64.h
+++ b/arch/sparc/include/asm/uaccess_64.h
@@ -13,24 +13,6 @@
 
 #include <asm/processor.h>
 
-/*
- * Sparc64 is segmented, though more like the M68K than the I386.
- * We use the secondary ASI to address user memory, which references a
- * completely different VM map, thus there is zero chance of the user
- * doing something queer and tricking us into poking kernel memory.
- *
- * What is left here is basically what is needed for the other parts of
- * the kernel that expect to be able to manipulate, erum, "segments".
- * Or perhaps more properly, permissions.
- *
- * "For historical reasons, these macros are grossly misnamed." -Linus
- */
-
-#define KERNEL_DS   ((mm_segment_t) { ASI_P })
-#define USER_DS     ((mm_segment_t) { ASI_AIUS })	/* har har har */
-
-#define get_fs() ((mm_segment_t){(current_thread_info()->current_ds)})
-
 static inline int __access_ok(const void __user *addr, unsigned long size)
 {
 	return 1;
@@ -38,12 +20,6 @@ static inline int __access_ok(const void __user *addr, unsigned long size)
 #define __access_ok __access_ok
 #include <asm-generic/access_ok.h>
 
-#define set_fs(val)								\
-do {										\
-	current_thread_info()->current_ds = (val).seg;				\
-	__asm__ __volatile__ ("wr %%g0, %0, %%asi" : : "r" ((val).seg));	\
-} while(0)
-
 /*
  * Test whether a block of memory is a valid user space address.
  * Returns 0 if the range is valid, nonzero otherwise.
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index f5b2cac8669f..9a2ceb080ac9 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -106,18 +106,13 @@ static void show_regwindow32(struct pt_regs *regs)
 {
 	struct reg_window32 __user *rw;
 	struct reg_window32 r_w;
-	mm_segment_t old_fs;
 	
 	__asm__ __volatile__ ("flushw");
 	rw = compat_ptr((unsigned int)regs->u_regs[14]);
-	old_fs = get_fs();
-	set_fs (USER_DS);
 	if (copy_from_user (&r_w, rw, sizeof(r_w))) {
-		set_fs (old_fs);
 		return;
 	}
 
-	set_fs (old_fs);			
 	printk("l0: %08x l1: %08x l2: %08x l3: %08x "
 	       "l4: %08x l5: %08x l6: %08x l7: %08x\n",
 	       r_w.locals[0], r_w.locals[1], r_w.locals[2], r_w.locals[3],
@@ -136,7 +131,6 @@ static void show_regwindow(struct pt_regs *regs)
 	struct reg_window __user *rw;
 	struct reg_window *rwk;
 	struct reg_window r_w;
-	mm_segment_t old_fs;
 
 	if ((regs->tstate & TSTATE_PRIV) || !(test_thread_flag(TIF_32BIT))) {
 		__asm__ __volatile__ ("flushw");
@@ -145,14 +139,10 @@ static void show_regwindow(struct pt_regs *regs)
 		rwk = (struct reg_window *)
 			(regs->u_regs[14] + STACK_BIAS);
 		if (!(regs->tstate & TSTATE_PRIV)) {
-			old_fs = get_fs();
-			set_fs (USER_DS);
 			if (copy_from_user (&r_w, rw, sizeof(r_w))) {
-				set_fs (old_fs);
 				return;
 			}
 			rwk = &r_w;
-			set_fs (old_fs);			
 		}
 	} else {
 		show_regwindow32(regs);
@@ -598,7 +588,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
 		memset(child_trap_frame, 0, child_stack_sz);
 		__thread_flag_byte_ptr(t)[TI_FLAG_BYTE_CWP] = 
 			(current_pt_regs()->tstate + 1) & TSTATE_CWP;
-		t->current_ds = ASI_P;
 		t->kregs->u_regs[UREG_G1] = sp; /* function */
 		t->kregs->u_regs[UREG_G2] = arg;
 		return 0;
@@ -613,7 +602,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
 	t->kregs->u_regs[UREG_FP] = sp;
 	__thread_flag_byte_ptr(t)[TI_FLAG_BYTE_CWP] = 
 		(regs->tstate + 1) & TSTATE_CWP;
-	t->current_ds = ASI_AIUS;
 	if (sp != regs->u_regs[UREG_FP]) {
 		unsigned long csp;
 
diff --git a/arch/sparc/kernel/traps_64.c b/arch/sparc/kernel/traps_64.c
index 21077821f427..5b4de4a89dec 100644
--- a/arch/sparc/kernel/traps_64.c
+++ b/arch/sparc/kernel/traps_64.c
@@ -2857,8 +2857,6 @@ void __init trap_init(void)
 		     TI_PRE_COUNT != offsetof(struct thread_info,
 					      preempt_count) ||
 		     TI_NEW_CHILD != offsetof(struct thread_info, new_child) ||
-		     TI_CURRENT_DS != offsetof(struct thread_info,
-						current_ds) ||
 		     TI_KUNA_REGS != offsetof(struct thread_info,
 					      kern_una_regs) ||
 		     TI_KUNA_INSN != offsetof(struct thread_info,
diff --git a/arch/sparc/lib/NGmemcpy.S b/arch/sparc/lib/NGmemcpy.S
index 8e4d22a6ba0b..ee51c1230689 100644
--- a/arch/sparc/lib/NGmemcpy.S
+++ b/arch/sparc/lib/NGmemcpy.S
@@ -10,8 +10,7 @@
 #include <asm/thread_info.h>
 #define GLOBAL_SPARE	%g7
 #define RESTORE_ASI(TMP)	\
-	ldub	[%g6 + TI_CURRENT_DS], TMP;  \
-	wr	TMP, 0x0, %asi;
+	wr	%g0, ASI_AIUS, %asi
 #else
 #define GLOBAL_SPARE	%g5
 #define RESTORE_ASI(TMP)	\
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 1b23639e2fcd..ee08c279d67c 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -709,9 +709,6 @@ static void __init inherit_prom_mappings(void)
 
 void prom_world(int enter)
 {
-	if (!enter)
-		set_fs(get_fs());
-
 	__asm__ __volatile__("flushw");
 }
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 12/14] sh: remove CONFIG_SET_FS support
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (10 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 11/14] sparc64: remove CONFIG_SET_FS support Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 16:34 ` [PATCH 13/14] ia64: " Arnd Bergmann
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

sh only uses set_fs/get_fs in two files, io_trapped.c and
traps_32.c, to handle address errors in both user and kernel
memory.

It already has an abstraction to differentiate between I/O
and memory, so adding a third class for kernel memory fits
into the same scheme and lets us kill off CONFIG_SET_FS.
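
As a rough user-space sketch of that scheme (not the sh code itself):
the emulation routine takes a table of copy callbacks, so the caller
picks the memory class instead of changing a global address limit.
The model_*() helpers and emulate_unaligned_load() are hypothetical,
and the callbacks are assumed to return 0 on success and nonzero on
failure.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of struct mem_access: one callback per direction. */
struct mem_access {
	unsigned long (*from)(void *dst, const void *src, unsigned long cnt);
	unsigned long (*to)(void *dst, const void *src, unsigned long cnt);
};

/* Hypothetical stand-in for copy_{from,to}_user(): bytes not copied. */
static unsigned long model_copy_user(void *dst, const void *src,
				     unsigned long cnt)
{
	if (!dst || !src)
		return cnt;
	memcpy(dst, src, cnt);
	return 0;
}

/* Hypothetical stand-in for copy_{from,to}_kernel_nofault(). */
static unsigned long model_copy_kernel(void *dst, const void *src,
				       unsigned long cnt)
{
	if (!dst || !src)
		return 1;	/* stands in for a negative errno */
	memcpy(dst, src, cnt);
	return 0;
}

static const struct mem_access user_mem_access = {
	model_copy_user,
	model_copy_user,
};

static const struct mem_access kernel_mem_access = {
	model_copy_kernel,
	model_copy_kernel,
};

/* Emulate a load of cnt bytes through whichever class the caller chose. */
static int emulate_unaligned_load(const struct mem_access *ma, void *reg,
				  const void *addr, unsigned long cnt)
{
	return ma->from(reg, addr, cnt) ? -1 : 0;
}
```

A fault in kernel context would pass &kernel_mem_access and a fault in
user context &user_mem_access; the emulation code itself does not need
to know which one it got.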

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/sh/Kconfig                   |  1 -
 arch/sh/include/asm/processor.h   |  1 -
 arch/sh/include/asm/segment.h     | 33 -------------------------------
 arch/sh/include/asm/thread_info.h |  2 --
 arch/sh/include/asm/uaccess.h     |  4 ----
 arch/sh/kernel/io_trapped.c       |  9 ++-------
 arch/sh/kernel/process_32.c       |  2 --
 arch/sh/kernel/traps_32.c         | 30 +++++++++++++++++-----------
 8 files changed, 21 insertions(+), 61 deletions(-)
 delete mode 100644 arch/sh/include/asm/segment.h

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 2474a04ceac4..f676e92b7d5b 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -65,7 +65,6 @@ config SUPERH
 	select PERF_EVENTS
 	select PERF_USE_VMALLOC
 	select RTC_LIB
-	select SET_FS
 	select SPARSE_IRQ
 	select TRACE_IRQFLAGS_SUPPORT
 	help
diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h
index 3820d698846e..85a6c1c3c16e 100644
--- a/arch/sh/include/asm/processor.h
+++ b/arch/sh/include/asm/processor.h
@@ -3,7 +3,6 @@
 #define __ASM_SH_PROCESSOR_H
 
 #include <asm/cpu-features.h>
-#include <asm/segment.h>
 #include <asm/cache.h>
 
 #ifndef __ASSEMBLY__
diff --git a/arch/sh/include/asm/segment.h b/arch/sh/include/asm/segment.h
deleted file mode 100644
index 02e54a3335d6..000000000000
--- a/arch/sh/include/asm/segment.h
+++ /dev/null
@@ -1,33 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __ASM_SH_SEGMENT_H
-#define __ASM_SH_SEGMENT_H
-
-#ifndef __ASSEMBLY__
-
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
-#define MAKE_MM_SEG(s)	((mm_segment_t) { (s) })
-
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not.  If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- */
-#define KERNEL_DS	MAKE_MM_SEG(0xFFFFFFFFUL)
-#ifdef CONFIG_MMU
-#define USER_DS		MAKE_MM_SEG(PAGE_OFFSET)
-#else
-#define USER_DS		KERNEL_DS
-#endif
-
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-
-#define get_fs()	(current_thread_info()->addr_limit)
-#define set_fs(x)	(current_thread_info()->addr_limit = (x))
-
-#endif /* __ASSEMBLY__ */
-#endif /* __ASM_SH_SEGMENT_H */
diff --git a/arch/sh/include/asm/thread_info.h b/arch/sh/include/asm/thread_info.h
index 598d0184ffea..b119b859a0a3 100644
--- a/arch/sh/include/asm/thread_info.h
+++ b/arch/sh/include/asm/thread_info.h
@@ -30,7 +30,6 @@ struct thread_info {
 	__u32			status;		/* thread synchronous flags */
 	__u32			cpu;
 	int			preempt_count; /* 0 => preemptable, <0 => BUG */
-	mm_segment_t		addr_limit;	/* thread address space */
 	unsigned long		previous_sp;	/* sp of previous stack in case
 						   of nested IRQ stacks */
 	__u8			supervisor_stack[0];
@@ -58,7 +57,6 @@ struct thread_info {
 	.status		= 0,			\
 	.cpu		= 0,			\
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
-	.addr_limit	= KERNEL_DS,		\
 }
 
 /* how to get the current stack pointer from C */
diff --git a/arch/sh/include/asm/uaccess.h b/arch/sh/include/asm/uaccess.h
index ccd219d74851..a79609eb14be 100644
--- a/arch/sh/include/asm/uaccess.h
+++ b/arch/sh/include/asm/uaccess.h
@@ -2,11 +2,7 @@
 #ifndef __ASM_SH_UACCESS_H
 #define __ASM_SH_UACCESS_H
 
-#include <asm/segment.h>
 #include <asm/extable.h>
-
-#define user_addr_max()	(current_thread_info()->addr_limit.seg)
-
 #include <asm-generic/access_ok.h>
 
 /*
diff --git a/arch/sh/kernel/io_trapped.c b/arch/sh/kernel/io_trapped.c
index 004ad0130b10..e803b14ef12e 100644
--- a/arch/sh/kernel/io_trapped.c
+++ b/arch/sh/kernel/io_trapped.c
@@ -270,7 +270,6 @@ static struct mem_access trapped_io_access = {
 
 int handle_trapped_io(struct pt_regs *regs, unsigned long address)
 {
-	mm_segment_t oldfs;
 	insn_size_t instruction;
 	int tmp;
 
@@ -281,16 +280,12 @@ int handle_trapped_io(struct pt_regs *regs, unsigned long address)
 
 	WARN_ON(user_mode(regs));
 
-	oldfs = get_fs();
-	set_fs(KERNEL_DS);
-	if (copy_from_user(&instruction, (void *)(regs->pc),
-			   sizeof(instruction))) {
-		set_fs(oldfs);
+	if (copy_from_kernel_nofault(&instruction, (void *)(regs->pc),
+				     sizeof(instruction))) {
 		return 0;
 	}
 
 	tmp = handle_unaligned_access(instruction, regs,
 				      &trapped_io_access, 1, address);
-	set_fs(oldfs);
 	return tmp == 0;
 }
diff --git a/arch/sh/kernel/process_32.c b/arch/sh/kernel/process_32.c
index 1c28e3cddb60..ca01286a0610 100644
--- a/arch/sh/kernel/process_32.c
+++ b/arch/sh/kernel/process_32.c
@@ -123,7 +123,6 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
 #if defined(CONFIG_SH_FPU)
 		childregs->sr |= SR_FD;
 #endif
-		ti->addr_limit = KERNEL_DS;
 		ti->status &= ~TS_USEDFPU;
 		p->thread.fpu_counter = 0;
 		return 0;
@@ -132,7 +131,6 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
 
 	if (usp)
 		childregs->regs[15] = usp;
-	ti->addr_limit = USER_DS;
 
 	if (clone_flags & CLONE_SETTLS)
 		childregs->gbr = tls;
diff --git a/arch/sh/kernel/traps_32.c b/arch/sh/kernel/traps_32.c
index b3c715bc254b..6cdda3a621a1 100644
--- a/arch/sh/kernel/traps_32.c
+++ b/arch/sh/kernel/traps_32.c
@@ -75,6 +75,23 @@ static struct mem_access user_mem_access = {
 	copy_to_user,
 };
 
+static unsigned long copy_from_kernel_wrapper(void *dst, const void __user *src,
+					      unsigned long cnt)
+{
+	return copy_from_kernel_nofault(dst, (const void __force *)src, cnt);
+}
+
+static unsigned long copy_to_kernel_wrapper(void __user *dst, const void *src,
+					    unsigned long cnt)
+{
+	return copy_to_kernel_nofault((void __force *)dst, src, cnt);
+}
+
+static struct mem_access kernel_mem_access = {
+	copy_from_kernel_wrapper,
+	copy_to_kernel_wrapper,
+};
+
 /*
  * handle an instruction that does an unaligned memory access by emulating the
  * desired behaviour
@@ -473,7 +490,6 @@ asmlinkage void do_address_error(struct pt_regs *regs,
 				 unsigned long address)
 {
 	unsigned long error_code = 0;
-	mm_segment_t oldfs;
 	insn_size_t instruction;
 	int tmp;
 
@@ -489,13 +505,10 @@ asmlinkage void do_address_error(struct pt_regs *regs,
 		local_irq_enable();
 		inc_unaligned_user_access();
 
-		oldfs = force_uaccess_begin();
 		if (copy_from_user(&instruction, (insn_size_t __user *)(regs->pc & ~1),
 				   sizeof(instruction))) {
-			force_uaccess_end(oldfs);
 			goto uspace_segv;
 		}
-		force_uaccess_end(oldfs);
 
 		/* shout about userspace fixups */
 		unaligned_fixups_notify(current, instruction, regs);
@@ -518,11 +531,9 @@ asmlinkage void do_address_error(struct pt_regs *regs,
 			goto uspace_segv;
 		}
 
-		oldfs = force_uaccess_begin();
 		tmp = handle_unaligned_access(instruction, regs,
 					      &user_mem_access, 0,
 					      address);
-		force_uaccess_end(oldfs);
 
 		if (tmp == 0)
 			return; /* sorted */
@@ -538,21 +549,18 @@ asmlinkage void do_address_error(struct pt_regs *regs,
 		if (regs->pc & 1)
 			die("unaligned program counter", regs, error_code);
 
-		set_fs(KERNEL_DS);
-		if (copy_from_user(&instruction, (void __user *)(regs->pc),
+		if (copy_from_kernel_nofault(&instruction, (void *)(regs->pc),
 				   sizeof(instruction))) {
 			/* Argh. Fault on the instruction itself.
 			   This should never happen non-SMP
 			*/
-			set_fs(oldfs);
 			die("insn faulting in do_address_error", regs, 0);
 		}
 
 		unaligned_fixups_notify(current, instruction, regs);
 
-		handle_unaligned_access(instruction, regs, &user_mem_access,
+		handle_unaligned_access(instruction, regs, &kernel_mem_access,
 					0, address);
-		set_fs(oldfs);
 	}
 }
 
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 13/14] ia64: remove CONFIG_SET_FS support
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (11 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 12/14] sh: " Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-14 16:34 ` [PATCH 14/14] uaccess: drop set_fs leftovers Arnd Bergmann
  2022-02-14 17:35 ` [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Linus Torvalds
  14 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

ia64 only uses set_fs() in one file, to handle unaligned accesses
from both user space and kernel instructions. Rewrite this to pass
an explicit flag that says which of the two is being handled, and
drop CONFIG_SET_FS from the architecture.
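
The flag-passing pattern can be sketched in user space as follows.
This is only a model under stated assumptions: mem[], USER_LIMIT,
range_ok() and the model_*() helpers are hypothetical, and the flat
array stands in for an address space whose lower part is user memory.

```c
#include <stdbool.h>
#include <string.h>

#define MEM_SIZE	64UL	/* hypothetical flat address space */
#define USER_LIMIT	32UL	/* addresses below this count as user space */

static unsigned char mem[MEM_SIZE];

/* Overflow-safe check that [addr, addr + len) lies within [0, limit). */
static bool range_ok(unsigned long addr, unsigned long len,
		     unsigned long limit)
{
	return len <= limit && addr <= limit - len;
}

/* Stand-in for copy_from_user(): returns the number of bytes not copied. */
static long model_copy_from_user(void *dst, unsigned long addr,
				 unsigned long len)
{
	if (!range_ok(addr, len, USER_LIMIT))
		return (long)len;
	memcpy(dst, &mem[addr], len);
	return 0;
}

/* Stand-in for copy_from_kernel_nofault(): 0 or a negative error. */
static long model_copy_from_kernel_nofault(void *dst, unsigned long addr,
					   unsigned long len)
{
	if (!range_ok(addr, len, MEM_SIZE))
		return -1;
	memcpy(dst, &mem[addr], len);
	return 0;
}

/* The caller states which kind of access it wants via kernel_mode. */
static long emulate_load(void *val, unsigned long ifa, unsigned long len,
			 bool kernel_mode)
{
	if (kernel_mode)
		return model_copy_from_kernel_nofault(val, ifa, len);

	return model_copy_from_user(val, ifa, len);
}
```

With the flag threaded through every emulate_*() helper, a user-mode
fault can no longer reach kernel addresses just because some earlier
code path forgot to restore the address limit.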

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/ia64/Kconfig                   |  1 -
 arch/ia64/include/asm/processor.h   |  4 --
 arch/ia64/include/asm/thread_info.h |  2 -
 arch/ia64/include/asm/uaccess.h     | 21 +++-------
 arch/ia64/kernel/unaligned.c        | 60 +++++++++++++++++++----------
 5 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index a7e01573abd8..6b6a35b3d959 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -61,7 +61,6 @@ config IA64
 	select NEED_SG_DMA_LENGTH
 	select NUMA if !FLATMEM
 	select PCI_MSI_ARCH_FALLBACKS if PCI_MSI
-	select SET_FS
 	select ZONE_DMA32
 	default y
 	help
diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index 45365c2ef598..7cbce290f4e5 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -243,10 +243,6 @@ DECLARE_PER_CPU(struct cpuinfo_ia64, ia64_cpu_info);
 
 extern void print_cpu_info (struct cpuinfo_ia64 *);
 
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
 #define SET_UNALIGN_CTL(task,value)								\
 ({												\
 	(task)->thread.flags = (((task)->thread.flags & ~IA64_THREAD_UAC_MASK)			\
diff --git a/arch/ia64/include/asm/thread_info.h b/arch/ia64/include/asm/thread_info.h
index 51d20cb37706..ef83493e6778 100644
--- a/arch/ia64/include/asm/thread_info.h
+++ b/arch/ia64/include/asm/thread_info.h
@@ -27,7 +27,6 @@ struct thread_info {
 	__u32 cpu;			/* current CPU */
 	__u32 last_cpu;			/* Last CPU thread ran on */
 	__u32 status;			/* Thread synchronous flags */
-	mm_segment_t addr_limit;	/* user-level address space limit */
 	int preempt_count;		/* 0=premptable, <0=BUG; will also serve as bh-counter */
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 	__u64 utime;
@@ -48,7 +47,6 @@ struct thread_info {
 	.task		= &tsk,			\
 	.flags		= 0,			\
 	.cpu		= 0,			\
-	.addr_limit	= KERNEL_DS,		\
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
 }
 
diff --git a/arch/ia64/include/asm/uaccess.h b/arch/ia64/include/asm/uaccess.h
index e242a3cc1330..60adadeb3e9e 100644
--- a/arch/ia64/include/asm/uaccess.h
+++ b/arch/ia64/include/asm/uaccess.h
@@ -42,26 +42,17 @@
 #include <asm/extable.h>
 
 /*
- * For historical reasons, the following macros are grossly misnamed:
- */
-#define KERNEL_DS	((mm_segment_t) { ~0UL })		/* cf. access_ok() */
-#define USER_DS		((mm_segment_t) { TASK_SIZE-1 })	/* cf. access_ok() */
-
-#define get_fs()  (current_thread_info()->addr_limit)
-#define set_fs(x) (current_thread_info()->addr_limit = (x))
-
-/*
- * When accessing user memory, we need to make sure the entire area really is in
- * user-level space.  In order to do this efficiently, we make sure that the page at
- * address TASK_SIZE is never valid.  We also need to make sure that the address doesn't
+ * When accessing user memory, we need to make sure the entire area really is
+ * in user-level space.  We also need to make sure that the address doesn't
  * point inside the virtually mapped linear page table.
  */
 static inline int __access_ok(const void __user *p, unsigned long size)
 {
+	unsigned long limit = TASK_SIZE;
 	unsigned long addr = (unsigned long)p;
-	unsigned long seg = get_fs().seg;
-	return likely(addr <= seg) &&
-	 (seg == KERNEL_DS.seg || likely(REGION_OFFSET(addr) < RGN_MAP_LIMIT));
+
+	return likely((size <= limit) && (addr <= (limit - size)) &&
+		 likely(REGION_OFFSET(addr) < RGN_MAP_LIMIT));
 }
 #define __access_ok __access_ok
 #include <asm-generic/access_ok.h>
diff --git a/arch/ia64/kernel/unaligned.c b/arch/ia64/kernel/unaligned.c
index 6c1a8951dfbb..0acb5a0cd7ab 100644
--- a/arch/ia64/kernel/unaligned.c
+++ b/arch/ia64/kernel/unaligned.c
@@ -749,9 +749,25 @@ emulate_load_updates (update_t type, load_store_t ld, struct pt_regs *regs, unsi
 	}
 }
 
+static int emulate_store(unsigned long ifa, void *val, int len, bool kernel_mode)
+{
+	if (kernel_mode)
+		return copy_to_kernel_nofault((void *)ifa, val, len);
+
+	return copy_to_user((void __user *)ifa, val, len);
+}
+
+static int emulate_load(void *val, unsigned long ifa, int len, bool kernel_mode)
+{
+	if (kernel_mode)
+		return copy_from_kernel_nofault(val, (void *)ifa, len);
+
+	return copy_from_user(val, (void __user *)ifa, len);
+}
 
 static int
-emulate_load_int (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
+emulate_load_int (unsigned long ifa, load_store_t ld, struct pt_regs *regs,
+		  bool kernel_mode)
 {
 	unsigned int len = 1 << ld.x6_sz;
 	unsigned long val = 0;
@@ -774,7 +790,7 @@ emulate_load_int (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
 		return -1;
 	}
 	/* this assumes little-endian byte-order: */
-	if (copy_from_user(&val, (void __user *) ifa, len))
+	if (emulate_load(&val, ifa, len, kernel_mode))
 		return -1;
 	setreg(ld.r1, val, 0, regs);
 
@@ -872,7 +888,8 @@ emulate_load_int (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
 }
 
 static int
-emulate_store_int (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
+emulate_store_int (unsigned long ifa, load_store_t ld, struct pt_regs *regs,
+		   bool kernel_mode)
 {
 	unsigned long r2;
 	unsigned int len = 1 << ld.x6_sz;
@@ -901,7 +918,7 @@ emulate_store_int (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
 	}
 
 	/* this assumes little-endian byte-order: */
-	if (copy_to_user((void __user *) ifa, &r2, len))
+	if (emulate_store(ifa, &r2, len, kernel_mode))
 		return -1;
 
 	/*
@@ -1021,7 +1038,7 @@ float2mem_double (struct ia64_fpreg *init, struct ia64_fpreg *final)
 }
 
 static int
-emulate_load_floatpair (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
+emulate_load_floatpair (unsigned long ifa, load_store_t ld, struct pt_regs *regs, bool kernel_mode)
 {
 	struct ia64_fpreg fpr_init[2];
 	struct ia64_fpreg fpr_final[2];
@@ -1050,8 +1067,8 @@ emulate_load_floatpair (unsigned long ifa, load_store_t ld, struct pt_regs *regs
 		 * This assumes little-endian byte-order.  Note that there is no "ldfpe"
 		 * instruction:
 		 */
-		if (copy_from_user(&fpr_init[0], (void __user *) ifa, len)
-		    || copy_from_user(&fpr_init[1], (void __user *) (ifa + len), len))
+		if (emulate_load(&fpr_init[0], ifa, len, kernel_mode)
+		    || emulate_load(&fpr_init[1], (ifa + len), len, kernel_mode))
 			return -1;
 
 		DPRINT("ld.r1=%d ld.imm=%d x6_sz=%d\n", ld.r1, ld.imm, ld.x6_sz);
@@ -1126,7 +1143,8 @@ emulate_load_floatpair (unsigned long ifa, load_store_t ld, struct pt_regs *regs
 
 
 static int
-emulate_load_float (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
+emulate_load_float (unsigned long ifa, load_store_t ld, struct pt_regs *regs,
+	            bool kernel_mode)
 {
 	struct ia64_fpreg fpr_init;
 	struct ia64_fpreg fpr_final;
@@ -1152,7 +1170,7 @@ emulate_load_float (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
 	 * See comments in ldX for descriptions on how the various loads are handled.
 	 */
 	if (ld.x6_op != 0x2) {
-		if (copy_from_user(&fpr_init, (void __user *) ifa, len))
+		if (emulate_load(&fpr_init, ifa, len, kernel_mode))
 			return -1;
 
 		DPRINT("ld.r1=%d x6_sz=%d\n", ld.r1, ld.x6_sz);
@@ -1202,7 +1220,8 @@ emulate_load_float (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
 
 
 static int
-emulate_store_float (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
+emulate_store_float (unsigned long ifa, load_store_t ld, struct pt_regs *regs,
+		     bool kernel_mode)
 {
 	struct ia64_fpreg fpr_init;
 	struct ia64_fpreg fpr_final;
@@ -1244,7 +1263,7 @@ emulate_store_float (unsigned long ifa, load_store_t ld, struct pt_regs *regs)
 	DDUMP("fpr_init =", &fpr_init, len);
 	DDUMP("fpr_final =", &fpr_final, len);
 
-	if (copy_to_user((void __user *) ifa, &fpr_final, len))
+	if (emulate_store(ifa, &fpr_final, len, kernel_mode))
 		return -1;
 
 	/*
@@ -1295,7 +1314,6 @@ void
 ia64_handle_unaligned (unsigned long ifa, struct pt_regs *regs)
 {
 	struct ia64_psr *ipsr = ia64_psr(regs);
-	mm_segment_t old_fs = get_fs();
 	unsigned long bundle[2];
 	unsigned long opcode;
 	const struct exception_table_entry *eh = NULL;
@@ -1304,6 +1322,7 @@ ia64_handle_unaligned (unsigned long ifa, struct pt_regs *regs)
 		load_store_t insn;
 	} u;
 	int ret = -1;
+	bool kernel_mode = false;
 
 	if (ia64_psr(regs)->be) {
 		/* we don't support big-endian accesses */
@@ -1367,13 +1386,13 @@ ia64_handle_unaligned (unsigned long ifa, struct pt_regs *regs)
 			if (unaligned_dump_stack)
 				dump_stack();
 		}
-		set_fs(KERNEL_DS);
+		kernel_mode = true;
 	}
 
 	DPRINT("iip=%lx ifa=%lx isr=%lx (ei=%d, sp=%d)\n",
 	       regs->cr_iip, ifa, regs->cr_ipsr, ipsr->ri, ipsr->it);
 
-	if (__copy_from_user(bundle, (void __user *) regs->cr_iip, 16))
+	if (emulate_load(bundle, regs->cr_iip, 16, kernel_mode))
 		goto failure;
 
 	/*
@@ -1467,7 +1486,7 @@ ia64_handle_unaligned (unsigned long ifa, struct pt_regs *regs)
 	      case LDCCLR_IMM_OP:
 	      case LDCNC_IMM_OP:
 	      case LDCCLRACQ_IMM_OP:
-		ret = emulate_load_int(ifa, u.insn, regs);
+		ret = emulate_load_int(ifa, u.insn, regs, kernel_mode);
 		break;
 
 	      case ST_OP:
@@ -1478,7 +1497,7 @@ ia64_handle_unaligned (unsigned long ifa, struct pt_regs *regs)
 		fallthrough;
 	      case ST_IMM_OP:
 	      case STREL_IMM_OP:
-		ret = emulate_store_int(ifa, u.insn, regs);
+		ret = emulate_store_int(ifa, u.insn, regs, kernel_mode);
 		break;
 
 	      case LDF_OP:
@@ -1486,21 +1505,21 @@ ia64_handle_unaligned (unsigned long ifa, struct pt_regs *regs)
 	      case LDFCCLR_OP:
 	      case LDFCNC_OP:
 		if (u.insn.x)
-			ret = emulate_load_floatpair(ifa, u.insn, regs);
+			ret = emulate_load_floatpair(ifa, u.insn, regs, kernel_mode);
 		else
-			ret = emulate_load_float(ifa, u.insn, regs);
+			ret = emulate_load_float(ifa, u.insn, regs, kernel_mode);
 		break;
 
 	      case LDF_IMM_OP:
 	      case LDFA_IMM_OP:
 	      case LDFCCLR_IMM_OP:
 	      case LDFCNC_IMM_OP:
-		ret = emulate_load_float(ifa, u.insn, regs);
+		ret = emulate_load_float(ifa, u.insn, regs, kernel_mode);
 		break;
 
 	      case STF_OP:
 	      case STF_IMM_OP:
-		ret = emulate_store_float(ifa, u.insn, regs);
+		ret = emulate_store_float(ifa, u.insn, regs, kernel_mode);
 		break;
 
 	      default:
@@ -1521,7 +1540,6 @@ ia64_handle_unaligned (unsigned long ifa, struct pt_regs *regs)
 
 	DPRINT("ipsr->ri=%d iip=%lx\n", ipsr->ri, regs->cr_iip);
   done:
-	set_fs(old_fs);		/* restore original address limit */
 	return;
 
   failure:
-- 
2.29.2


* [PATCH 14/14] uaccess: drop set_fs leftovers
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (12 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 13/14] ia64: " Arnd Bergmann
@ 2022-02-14 16:34 ` Arnd Bergmann
  2022-02-15  3:03   ` Al Viro
  2022-02-14 17:35 ` [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Linus Torvalds
  14 siblings, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 16:34 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig, linux-arch, linux-mm,
	linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

From: Arnd Bergmann <arnd@arndb.de>

There are no more users of CONFIG_SET_FS left, so drop all
remaining references to set_fs()/get_fs(), mm_segment_t
and uaccess_kernel().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/Kconfig                       |  3 ---
 arch/arm/lib/uaccess_with_memcpy.c | 10 ---------
 arch/nds32/kernel/process.c        |  5 ++---
 arch/parisc/include/asm/futex.h    |  2 +-
 arch/parisc/lib/memcpy.c           |  2 +-
 drivers/hid/uhid.c                 |  2 +-
 drivers/scsi/sg.c                  |  5 -----
 fs/exec.c                          |  6 ------
 include/asm-generic/access_ok.h    | 10 +--------
 include/linux/syscalls.h           |  4 ----
 include/linux/uaccess.h            | 33 ------------------------------
 include/rdma/ib.h                  |  2 +-
 kernel/events/callchain.c          |  4 ----
 kernel/events/core.c               |  3 ---
 kernel/exit.c                      | 14 -------------
 kernel/kthread.c                   |  5 -----
 kernel/stacktrace.c                |  3 ---
 kernel/trace/bpf_trace.c           |  4 ----
 mm/maccess.c                       | 11 ----------
 mm/memory.c                        |  8 --------
 net/bpfilter/bpfilter_kern.c       |  2 +-
 21 files changed, 8 insertions(+), 130 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 678a80713b21..96075a12c720 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -24,9 +24,6 @@ config KEXEC_ELF
 config HAVE_IMA_KEXEC
 	bool
 
-config SET_FS
-	bool
-
 config HOTPLUG_SMT
 	bool
 
diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c
index 106f83a5ea6d..c30b689bec2e 100644
--- a/arch/arm/lib/uaccess_with_memcpy.c
+++ b/arch/arm/lib/uaccess_with_memcpy.c
@@ -92,11 +92,6 @@ __copy_to_user_memcpy(void __user *to, const void *from, unsigned long n)
 	unsigned long ua_flags;
 	int atomic;
 
-	if (uaccess_kernel()) {
-		memcpy((void *)to, from, n);
-		return 0;
-	}
-
 	/* the mmap semaphore is taken only if not in an atomic context */
 	atomic = faulthandler_disabled();
 
@@ -165,11 +160,6 @@ __clear_user_memset(void __user *addr, unsigned long n)
 {
 	unsigned long ua_flags;
 
-	if (uaccess_kernel()) {
-		memset((void *)addr, 0, n);
-		return 0;
-	}
-
 	mmap_read_lock(current->mm);
 	while (n) {
 		pte_t *pte;
diff --git a/arch/nds32/kernel/process.c b/arch/nds32/kernel/process.c
index 49fab9e39cbf..d35c1f63fa11 100644
--- a/arch/nds32/kernel/process.c
+++ b/arch/nds32/kernel/process.c
@@ -119,9 +119,8 @@ void show_regs(struct pt_regs *regs)
 		regs->uregs[7], regs->uregs[6], regs->uregs[5], regs->uregs[4]);
 	pr_info("r3 : %08lx  r2 : %08lx  r1 : %08lx  r0 : %08lx\n",
 		regs->uregs[3], regs->uregs[2], regs->uregs[1], regs->uregs[0]);
-	pr_info("  IRQs o%s  Segment %s\n",
-		interrupts_enabled(regs) ? "n" : "ff",
-		uaccess_kernel() ? "kernel" : "user");
+	pr_info("  IRQs o%s  Segment user\n",
+		interrupts_enabled(regs) ? "n" : "ff");
 }
 
 EXPORT_SYMBOL(show_regs);
diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
index b5835325d44b..2f4a1b1ef387 100644
--- a/arch/parisc/include/asm/futex.h
+++ b/arch/parisc/include/asm/futex.h
@@ -99,7 +99,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	/* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
 	 * our gateway page, and causes no end of trouble...
 	 */
-	if (uaccess_kernel() && !uaddr)
+	if (!uaddr)
 		return -EFAULT;
 
 	if (!access_ok(uaddr, sizeof(u32)))
diff --git a/arch/parisc/lib/memcpy.c b/arch/parisc/lib/memcpy.c
index ea70a0e08321..468704ce8a1c 100644
--- a/arch/parisc/lib/memcpy.c
+++ b/arch/parisc/lib/memcpy.c
@@ -13,7 +13,7 @@
 #include <linux/compiler.h>
 #include <linux/uaccess.h>
 
-#define get_user_space() (uaccess_kernel() ? 0 : mfsp(3))
+#define get_user_space() (mfsp(3))
 #define get_kernel_space() (0)
 
 /* Returns 0 for success, otherwise, returns number of bytes not transferred. */
diff --git a/drivers/hid/uhid.c b/drivers/hid/uhid.c
index 614adb510dbd..2a918aeb0af1 100644
--- a/drivers/hid/uhid.c
+++ b/drivers/hid/uhid.c
@@ -747,7 +747,7 @@ static ssize_t uhid_char_write(struct file *file, const char __user *buffer,
 		 * copied from, so it's unsafe to allow this with elevated
 		 * privileges (e.g. from a setuid binary) or via kernel_write().
 		 */
-		if (file->f_cred != current_cred() || uaccess_kernel()) {
+		if (file->f_cred != current_cred()) {
 			pr_err_once("UHID_CREATE from different security context by process %d (%s), this is not allowed.\n",
 				    task_tgid_vnr(current), current->comm);
 			ret = -EACCES;
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 6b43e97bd417..aaa2376b9d34 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -224,11 +224,6 @@ static int sg_check_file_access(struct file *filp, const char *caller)
 			caller, task_tgid_vnr(current), current->comm);
 		return -EPERM;
 	}
-	if (uaccess_kernel()) {
-		pr_err_once("%s: process %d (%s) called from kernel context, this is not allowed.\n",
-			caller, task_tgid_vnr(current), current->comm);
-		return -EACCES;
-	}
 	return 0;
 }
 
diff --git a/fs/exec.c b/fs/exec.c
index 79f2c9483302..bc68a0c089ac 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1303,12 +1303,6 @@ int begin_new_exec(struct linux_binprm * bprm)
 	if (retval)
 		goto out_unlock;
 
-	/*
-	 * Ensure that the uaccess routines can actually operate on userspace
-	 * pointers:
-	 */
-	force_uaccess_begin();
-
 	if (me->flags & PF_KTHREAD)
 		free_kthread_struct(me);
 	me->flags &= ~(PF_RANDOMIZE | PF_FORKNOEXEC | PF_KTHREAD |
diff --git a/include/asm-generic/access_ok.h b/include/asm-generic/access_ok.h
index 883b573af5fe..725647ba8ea9 100644
--- a/include/asm-generic/access_ok.h
+++ b/include/asm-generic/access_ok.h
@@ -16,16 +16,8 @@
 #define TASK_SIZE_MAX			TASK_SIZE
 #endif
 
-#ifndef uaccess_kernel
-#ifdef CONFIG_SET_FS
-#define uaccess_kernel()		(get_fs().seg == KERNEL_DS.seg)
-#else
-#define uaccess_kernel()		(0)
-#endif
-#endif
-
 #ifndef user_addr_max
-#define user_addr_max()			(uaccess_kernel() ? ~0UL : TASK_SIZE_MAX)
+#define user_addr_max()			TASK_SIZE_MAX
 #endif
 
 #ifndef __access_ok
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 819c0cb00b6d..a34b0f9a9972 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -290,10 +290,6 @@ static inline void addr_limit_user_check(void)
 		return;
 #endif
 
-	if (CHECK_DATA_CORRUPTION(uaccess_kernel(),
-				  "Invalid address limit on user-mode return"))
-		force_sig(SIGKILL);
-
 #ifdef TIF_FSCHECK
 	clear_thread_flag(TIF_FSCHECK);
 #endif
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 2c31667e62e0..2421a41f3a8e 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -10,39 +10,6 @@
 
 #include <asm/uaccess.h>
 
-#ifdef CONFIG_SET_FS
-/*
- * Force the uaccess routines to be wired up for actual userspace access,
- * overriding any possible set_fs(KERNEL_DS) still lingering around.  Undone
- * using force_uaccess_end below.
- */
-static inline mm_segment_t force_uaccess_begin(void)
-{
-	mm_segment_t fs = get_fs();
-
-	set_fs(USER_DS);
-	return fs;
-}
-
-static inline void force_uaccess_end(mm_segment_t oldfs)
-{
-	set_fs(oldfs);
-}
-#else /* CONFIG_SET_FS */
-typedef struct {
-	/* empty dummy */
-} mm_segment_t;
-
-static inline mm_segment_t force_uaccess_begin(void)
-{
-	return (mm_segment_t) { };
-}
-
-static inline void force_uaccess_end(mm_segment_t oldfs)
-{
-}
-#endif /* CONFIG_SET_FS */
-
 /*
  * Architectures should provide two primitives (raw_copy_{to,from}_user())
  * and get rid of their private instances of copy_{to,from}_user() and
diff --git a/include/rdma/ib.h b/include/rdma/ib.h
index 83139b9ce409..f7c185ff7a11 100644
--- a/include/rdma/ib.h
+++ b/include/rdma/ib.h
@@ -75,7 +75,7 @@ struct sockaddr_ib {
  */
 static inline bool ib_safe_file_access(struct file *filp)
 {
-	return filp->f_cred == current_cred() && !uaccess_kernel();
+	return filp->f_cred == current_cred();
 }
 
 #endif /* _RDMA_IB_H */
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 58cbe357fb2b..1273be84392c 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -209,17 +209,13 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
 		}
 
 		if (regs) {
-			mm_segment_t fs;
-
 			if (crosstask)
 				goto exit_put;
 
 			if (add_mark)
 				perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
 
-			fs = force_uaccess_begin();
 			perf_callchain_user(&ctx, regs);
-			force_uaccess_end(fs);
 		}
 	}
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 57c7197838db..11ca7303d6df 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6746,7 +6746,6 @@ perf_output_sample_ustack(struct perf_output_handle *handle, u64 dump_size,
 		unsigned long sp;
 		unsigned int rem;
 		u64 dyn_size;
-		mm_segment_t fs;
 
 		/*
 		 * We dump:
@@ -6764,9 +6763,7 @@ perf_output_sample_ustack(struct perf_output_handle *handle, u64 dump_size,
 
 		/* Data. */
 		sp = perf_user_stack_pointer(regs);
-		fs = force_uaccess_begin();
 		rem = __output_copy_user(handle, (void *) sp, dump_size);
-		force_uaccess_end(fs);
 		dyn_size = dump_size - rem;
 
 		perf_output_skip(handle, rem);
diff --git a/kernel/exit.c b/kernel/exit.c
index b00a25bb4ab9..0884a75bc2f8 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -737,20 +737,6 @@ void __noreturn do_exit(long code)
 
 	WARN_ON(blk_needs_flush_plug(tsk));
 
-	/*
-	 * If do_dead is called because this processes oopsed, it's possible
-	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
-	 * continuing. Amongst other possible reasons, this is to prevent
-	 * mm_release()->clear_child_tid() from writing to a user-controlled
-	 * kernel address.
-	 *
-	 * On uptodate architectures force_uaccess_begin is a noop.  On
-	 * architectures that still have set_fs/get_fs in addition to handling
-	 * oopses handles kernel threads that run as set_fs(KERNEL_DS) by
-	 * default.
-	 */
-	force_uaccess_begin();
-
 	kcov_task_exit(tsk);
 
 	coredump_task_exit(tsk);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 38c6dd822da8..16c2275d4b50 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -55,7 +55,6 @@ struct kthread {
 	int result;
 	int (*threadfn)(void *);
 	void *data;
-	mm_segment_t oldfs;
 	struct completion parked;
 	struct completion exited;
 #ifdef CONFIG_BLK_CGROUP
@@ -1441,8 +1440,6 @@ void kthread_use_mm(struct mm_struct *mm)
 		mmdrop(active_mm);
 	else
 		smp_mb();
-
-	to_kthread(tsk)->oldfs = force_uaccess_begin();
 }
 EXPORT_SYMBOL_GPL(kthread_use_mm);
 
@@ -1457,8 +1454,6 @@ void kthread_unuse_mm(struct mm_struct *mm)
 	WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
 	WARN_ON_ONCE(!tsk->mm);
 
-	force_uaccess_end(to_kthread(tsk)->oldfs);
-
 	task_lock(tsk);
 	/*
 	 * When a kthread stops operating on an address space, the loop
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index 9c625257023d..9ed5ce989415 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -226,15 +226,12 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size)
 		.store	= store,
 		.size	= size,
 	};
-	mm_segment_t fs;
 
 	/* Trace user stack if not a kernel thread */
 	if (current->flags & PF_KTHREAD)
 		return 0;
 
-	fs = force_uaccess_begin();
 	arch_stack_walk_user(consume_entry, &c, task_pt_regs(current));
-	force_uaccess_end(fs);
 
 	return c.len;
 }
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 21aa30644219..8115fff17018 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -332,8 +332,6 @@ BPF_CALL_3(bpf_probe_write_user, void __user *, unsafe_ptr, const void *, src,
 	if (unlikely(in_interrupt() ||
 		     current->flags & (PF_KTHREAD | PF_EXITING)))
 		return -EPERM;
-	if (unlikely(uaccess_kernel()))
-		return -EPERM;
 	if (unlikely(!nmi_uaccess_okay()))
 		return -EPERM;
 
@@ -835,8 +833,6 @@ static int bpf_send_signal_common(u32 sig, enum pid_type type)
 	 */
 	if (unlikely(current->flags & (PF_KTHREAD | PF_EXITING)))
 		return -EPERM;
-	if (unlikely(uaccess_kernel()))
-		return -EPERM;
 	if (unlikely(!nmi_uaccess_okay()))
 		return -EPERM;
 
diff --git a/mm/maccess.c b/mm/maccess.c
index cbd1b3959af2..106820b33a2b 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -113,14 +113,11 @@ long strncpy_from_kernel_nofault(char *dst, const void *unsafe_addr, long count)
 long copy_from_user_nofault(void *dst, const void __user *src, size_t size)
 {
 	long ret = -EFAULT;
-	mm_segment_t old_fs = force_uaccess_begin();
-
 	if (access_ok(src, size)) {
 		pagefault_disable();
 		ret = __copy_from_user_inatomic(dst, src, size);
 		pagefault_enable();
 	}
-	force_uaccess_end(old_fs);
 
 	if (ret)
 		return -EFAULT;
@@ -140,14 +137,12 @@ EXPORT_SYMBOL_GPL(copy_from_user_nofault);
 long copy_to_user_nofault(void __user *dst, const void *src, size_t size)
 {
 	long ret = -EFAULT;
-	mm_segment_t old_fs = force_uaccess_begin();
 
 	if (access_ok(dst, size)) {
 		pagefault_disable();
 		ret = __copy_to_user_inatomic(dst, src, size);
 		pagefault_enable();
 	}
-	force_uaccess_end(old_fs);
 
 	if (ret)
 		return -EFAULT;
@@ -176,17 +171,14 @@ EXPORT_SYMBOL_GPL(copy_to_user_nofault);
 long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr,
 			      long count)
 {
-	mm_segment_t old_fs;
 	long ret;
 
 	if (unlikely(count <= 0))
 		return 0;
 
-	old_fs = force_uaccess_begin();
 	pagefault_disable();
 	ret = strncpy_from_user(dst, unsafe_addr, count);
 	pagefault_enable();
-	force_uaccess_end(old_fs);
 
 	if (ret >= count) {
 		ret = count;
@@ -216,14 +208,11 @@ long strncpy_from_user_nofault(char *dst, const void __user *unsafe_addr,
  */
 long strnlen_user_nofault(const void __user *unsafe_addr, long count)
 {
-	mm_segment_t old_fs;
 	int ret;
 
-	old_fs = force_uaccess_begin();
 	pagefault_disable();
 	ret = strnlen_user(unsafe_addr, count);
 	pagefault_enable();
-	force_uaccess_end(old_fs);
 
 	return ret;
 }
diff --git a/mm/memory.c b/mm/memory.c
index c125c4969913..9a6ebf68a846 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5256,14 +5256,6 @@ void print_vma_addr(char *prefix, unsigned long ip)
 #if defined(CONFIG_PROVE_LOCKING) || defined(CONFIG_DEBUG_ATOMIC_SLEEP)
 void __might_fault(const char *file, int line)
 {
-	/*
-	 * Some code (nfs/sunrpc) uses socket ops on kernel memory while
-	 * holding the mmap_lock, this is safe because kernel memory doesn't
-	 * get paged out, therefore we'll never actually fault, and the
-	 * below annotations will generate false positives.
-	 */
-	if (uaccess_kernel())
-		return;
 	if (pagefault_disabled())
 		return;
 	__might_sleep(file, line);
diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index 51a941b56ec3..422ec6e7ccff 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -70,7 +70,7 @@ static int bpfilter_process_sockopt(struct sock *sk, int optname,
 		.addr		= (uintptr_t)optval.user,
 		.len		= optlen,
 	};
-	if (uaccess_kernel() || sockptr_is_kernel(optval)) {
+	if (sockptr_is_kernel(optval)) {
 		pr_err("kernel access not supported\n");
 		return -EFAULT;
 	}
-- 
2.29.2


* Re: [PATCH 01/14] uaccess: fix integer overflow on access_ok()
  2022-02-14 16:34 ` [PATCH 01/14] uaccess: fix integer overflow on access_ok() Arnd Bergmann
@ 2022-02-14 16:58   ` Christoph Hellwig
  0 siblings, 0 replies; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-14 16:58 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	jcmvbkbc, guoren, sparclinux, linux-riscv, will, ardb,
	linux-arch, linux-s390, bcain, linux-hexagon, deller, x86, linux,
	linux-csky, Christoph Hellwig, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-um, linuxppc-dev, richard,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, nickhu, linux-parisc, linux-mm, linux-api,
	linux-kernel, stable, dinguyen, David Laight, ebiederm,
	linux-alpha, akpm, Linus Torvalds, davem

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 03/14] nds32: fix access_ok() checks in get/put_user
  2022-02-14 16:34 ` [PATCH 03/14] nds32: fix access_ok() checks in get/put_user Arnd Bergmann
@ 2022-02-14 17:01   ` Christoph Hellwig
  2022-02-14 17:10     ` David Laight
  2022-02-15  9:18     ` Arnd Bergmann
  0 siblings, 2 replies; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-14 17:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	jcmvbkbc, guoren, sparclinux, linux-riscv, will, ardb,
	linux-arch, linux-s390, bcain, linux-hexagon, deller, x86, linux,
	linux-csky, Christoph Hellwig, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-um, linuxppc-dev, richard,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, nickhu, linux-parisc, linux-mm, linux-api,
	linux-kernel, stable, dinguyen, ebiederm, linux-alpha, akpm,
	Linus Torvalds, davem

On Mon, Feb 14, 2022 at 05:34:41PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> The get_user()/put_user() functions are meant to check for
> access_ok(), while the __get_user()/__put_user() functions
> don't.
> 
> This broke in 4.19 for nds32, when it gained an extraneous
> check in __get_user(), but lost the check it needs in
> __put_user().

Can we follow the lead of MIPS (which this was originally copied
from I think) and kill the pointless __get/put_user_check wrappers
that just obfuscate the code?
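
For reference, the checked/unchecked split described in the quoted commit
message can be sketched in standalone C. This is only an illustration, not
the nds32 code: the simulated user memory, the 0x1000 limit and every
sketch_* name are assumptions made up for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: a tiny simulated "user" address space of 0x1000 bytes. */
#define SKETCH_USER_LIMIT UINT64_C(0x1000)
static unsigned char sketch_user_mem[0x1000];

/* Range check written so that addr + size is never computed. */
static bool sketch_access_ok(uint64_t addr, uint64_t size)
{
	return size <= SKETCH_USER_LIMIT && addr <= SKETCH_USER_LIMIT - size;
}

/*
 * Counterpart of __get_user(): performs no range check, so the caller
 * must already have validated addr.
 */
static int sketch_get_user_unchecked(uint32_t *dst, uint64_t addr)
{
	memcpy(dst, &sketch_user_mem[addr], sizeof(*dst));
	return 0;
}

/* Counterpart of get_user(): validates the range, then does the access. */
static int sketch_get_user_u32(uint32_t *dst, uint64_t addr)
{
	if (!sketch_access_ok(addr, sizeof(*dst)))
		return -EFAULT;
	return sketch_get_user_unchecked(dst, addr);
}
```

With the check living only in the outer function, an extra wrapper layer
between the two adds nothing, which is the point of the question above.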

* Re: [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 16:34 ` [PATCH 04/14] x86: use more conventional access_ok() definition Arnd Bergmann
@ 2022-02-14 17:02   ` Christoph Hellwig
  2022-02-14 19:45     ` Arnd Bergmann
  0 siblings, 1 reply; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-14 17:02 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	jcmvbkbc, guoren, sparclinux, linux-riscv, will, ardb,
	linux-arch, linux-s390, bcain, linux-hexagon, deller, x86, linux,
	linux-csky, Christoph Hellwig, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-um, linuxppc-dev, richard,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, nickhu, linux-parisc, linux-mm, linux-api,
	linux-kernel, dinguyen, ebiederm, linux-alpha, akpm,
	Linus Torvalds, davem

On Mon, Feb 14, 2022 at 05:34:42PM +0100, Arnd Bergmann wrote:
> +#define __range_not_ok(addr, size, limit)	(!__access_ok(addr, size))
> +#define __chk_range_not_ok(addr, size, limit)	(!__access_ok((void __user *)addr, size))

Can we just kill these off instead of letting them obfuscate the code?

* Re: [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault
  2022-02-14 16:34 ` [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault Arnd Bergmann
@ 2022-02-14 17:02   ` Christoph Hellwig
  2022-02-15  0:31   ` Al Viro
  1 sibling, 0 replies; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-14 17:02 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	jcmvbkbc, guoren, sparclinux, linux-riscv, will, ardb,
	linux-arch, linux-s390, bcain, linux-hexagon, deller, x86, linux,
	linux-csky, Christoph Hellwig, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-um, linuxppc-dev, richard,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, nickhu, linux-parisc, linux-mm, linux-api,
	linux-kernel, dinguyen, ebiederm, linux-alpha, akpm,
	Linus Torvalds, davem

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 07/14] uaccess: generalize access_ok()
  2022-02-14 16:34 ` [PATCH 07/14] uaccess: generalize access_ok() Arnd Bergmann
@ 2022-02-14 17:04   ` Christoph Hellwig
  2022-02-14 17:15   ` Al Viro
  2022-02-15 10:58   ` Mark Rutland
  2 siblings, 0 replies; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-14 17:04 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	jcmvbkbc, guoren, sparclinux, linux-riscv, will, ardb,
	linux-arch, linux-s390, bcain, linux-hexagon, deller, x86, linux,
	linux-csky, Christoph Hellwig, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-um, linuxppc-dev, richard,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, nickhu, linux-parisc, linux-mm, linux-api,
	linux-kernel, dinguyen, ebiederm, linux-alpha, akpm,
	Linus Torvalds, davem

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH 10/14] uaccess: remove most CONFIG_SET_FS users
  2022-02-14 16:34 ` [PATCH 10/14] uaccess: remove most CONFIG_SET_FS users Arnd Bergmann
@ 2022-02-14 17:06   ` Christoph Hellwig
  2022-02-14 19:40     ` Arnd Bergmann
  0 siblings, 1 reply; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-14 17:06 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	jcmvbkbc, guoren, sparclinux, linux-riscv, will, ardb,
	linux-arch, linux-s390, bcain, linux-hexagon, deller, x86, linux,
	linux-csky, Christoph Hellwig, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-um, linuxppc-dev, richard,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, nickhu, linux-parisc, linux-mm, linux-api,
	linux-kernel, dinguyen, ebiederm, linux-alpha, akpm,
	Linus Torvalds, davem

On Mon, Feb 14, 2022 at 05:34:48PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> On almost all architectures, there are no remaining callers
> of set_fs(), so CONFIG_SET_FS can be disabled, along with
> removing the thread_info field and any references to it.
> 
> This turns access_ok() into a cheaper check against TASK_SIZE_MAX.

Wouldn't it make more sense to just merge this into the last patch?

* Re: [PATCH 11/14] sparc64: remove CONFIG_SET_FS support
  2022-02-14 16:34 ` [PATCH 11/14] sparc64: remove CONFIG_SET_FS support Arnd Bergmann
@ 2022-02-14 17:06   ` Christoph Hellwig
  2022-02-16 13:06     ` Arnd Bergmann
  2022-02-15  0:48   ` Al Viro
  1 sibling, 1 reply; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-14 17:06 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	jcmvbkbc, guoren, sparclinux, linux-riscv, will, ardb,
	linux-arch, linux-s390, bcain, linux-hexagon, deller, x86, linux,
	linux-csky, Christoph Hellwig, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-um, linuxppc-dev, richard,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, nickhu, linux-parisc, linux-mm, linux-api,
	linux-kernel, dinguyen, ebiederm, linux-alpha, akpm,
	Linus Torvalds, davem

>  void prom_world(int enter)
>  {
> -	if (!enter)
> -		set_fs(get_fs());
> -
>  	__asm__ __volatile__("flushw");
>  }

The enter argument is now unused.

* RE: [PATCH 03/14] nds32: fix access_ok() checks in get/put_user
  2022-02-14 17:01   ` Christoph Hellwig
@ 2022-02-14 17:10     ` David Laight
  2022-02-15  9:18     ` Arnd Bergmann
  1 sibling, 0 replies; 61+ messages in thread
From: David Laight @ 2022-02-14 17:10 UTC (permalink / raw)
  To: 'Christoph Hellwig', Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	jcmvbkbc, guoren, sparclinux, linux-riscv, will, ardb,
	linux-arch, linux-s390, bcain, linux-hexagon, deller, x86, linux,
	linux-csky, Christoph Hellwig, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-um, linuxppc-dev, richard,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, nickhu, linux-parisc, linux-mm, linux-api,
	linux-kernel, stable, dinguyen, ebiederm, linux-alpha, akpm,
	Linus Torvalds, davem

From: Christoph Hellwig
> Sent: 14 February 2022 17:01
> 
> On Mon, Feb 14, 2022 at 05:34:41PM +0100, Arnd Bergmann wrote:
> > From: Arnd Bergmann <arnd@arndb.de>
> >
> > The get_user()/put_user() functions are meant to check for
> > access_ok(), while the __get_user()/__put_user() functions
> > don't.
> >
> > This broke in 4.19 for nds32, when it gained an extraneous
> > check in __get_user(), but lost the check it needs in
> > __put_user().
> 
> Can we follow the lead of MIPS (which this was originally copied
> from I think) and kill the pointless __get/put_user_check wrappers
> that just obfuscate the code?

Is it possible to make all these architectures fall back to
a common definition somewhere?

Maybe they need to define ACCESS_OK_USER_LIMIT - which can be
different from TASK_SIZE.

There'll be a few special cases, but most architectures have
kernel addresses above userspace ones.
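
A rough standalone sketch of that idea, assuming a common header that
falls back to TASK_SIZE unless the architecture supplies its own limit.
ACCESS_OK_USER_LIMIT is the name proposed above; SKETCH_TASK_SIZE, the
one-page offset and sketch_common_access_ok() are invented for the
example and are not taken from any kernel tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one architecture's TASK_SIZE. */
#define SKETCH_TASK_SIZE UINT64_C(0x0000800000000000)

/*
 * Hypothetical per-architecture override: this pretend architecture wants
 * a limit one page below TASK_SIZE. An architecture that is happy with
 * the default would simply not define ACCESS_OK_USER_LIMIT.
 */
#define ACCESS_OK_USER_LIMIT (SKETCH_TASK_SIZE - UINT64_C(0x1000))

/* ---- what the common definition could look like ---- */
#ifndef ACCESS_OK_USER_LIMIT
#define ACCESS_OK_USER_LIMIT SKETCH_TASK_SIZE
#endif

static inline bool sketch_common_access_ok(uint64_t addr, uint64_t size)
{
	const uint64_t limit = ACCESS_OK_USER_LIMIT;

	/* Compare against limit - size so that addr + size is never formed. */
	return size <= limit && addr <= limit - size;
}
```

Architectures whose kernel addresses do not sit above the user ones
would still need their own definition instead of this fallback.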

	David



* Re: [PATCH 07/14] uaccess: generalize access_ok()
  2022-02-14 16:34 ` [PATCH 07/14] uaccess: generalize access_ok() Arnd Bergmann
  2022-02-14 17:04   ` Christoph Hellwig
@ 2022-02-14 17:15   ` Al Viro
  2022-02-14 19:25     ` Arnd Bergmann
  2022-02-15 10:58   ` Mark Rutland
  2 siblings, 1 reply; 61+ messages in thread
From: Al Viro @ 2022-02-14 17:15 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	linux-mm, guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, deller, x86,
	linux, linux-csky, ardb, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-alpha, linux-um, linuxppc-dev,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, jcmvbkbc, linux-api,
	linux-kernel, dinguyen, ebiederm, richard, akpm, Linus Torvalds,
	davem

On Mon, Feb 14, 2022 at 05:34:45PM +0100, Arnd Bergmann wrote:

> diff --git a/arch/csky/kernel/signal.c b/arch/csky/kernel/signal.c
> index c7b763d2f526..8867ddf3e6c7 100644
> --- a/arch/csky/kernel/signal.c
> +++ b/arch/csky/kernel/signal.c
> @@ -136,7 +136,7 @@ static inline void __user *get_sigframe(struct ksignal *ksig,
>  static int
>  setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
>  {
> -	struct rt_sigframe *frame;
> +	struct rt_sigframe __user *frame;
>  	int err = 0;
>  
>  	frame = get_sigframe(ksig, regs, sizeof(*frame));

Minor nit: might make sense to separate annotations (here, on nios2, etc.) from the rest...

This, OTOH,

> diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
> index 5c12fb46bc61..000bac67cf31 100644
> --- a/arch/sparc/include/asm/uaccess_64.h
> +++ b/arch/sparc/include/asm/uaccess_64.h
...
> -static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, unsigned long limit)
> -{
> -	if (__builtin_constant_p(size))
> -		return addr > limit - size;
> -
> -	addr += size;
> -	if (addr < size)
> -		return true;
> -
> -	return addr > limit;
> -}
> -
> -#define __range_not_ok(addr, size, limit)                               \
> -({                                                                      \
> -	__chk_user_ptr(addr);                                           \
> -	__chk_range_not_ok((unsigned long __force)(addr), size, limit); \
> -})
> -
> -static inline int __access_ok(const void __user * addr, unsigned long size)
> -{
> -	return 1;
> -}
> -
> -static inline int access_ok(const void __user * addr, unsigned long size)
> -{
> -	return 1;
> -}
> +#define __range_not_ok(addr, size, limit) (!__access_ok(addr, size))

is really wrong.  For sparc64, access_ok() should always be true.
This __range_not_ok() thing is used *only* for valid_user_frame() in
arch/sparc/kernel/perf_event.c - it's not a part of normal access_ok()
there.

sparc64 has separate address spaces for kernel and for userland; access_ok()
had never been useful there.  
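
To spell out the difference in standalone C (a sketch only, not the
sparc64 or generic kernel code; the 47-bit limit and both sketch_*
helpers are assumptions chosen for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit standing in for TASK_SIZE_MAX on an architecture
 * where user and kernel mappings share one address space. */
#define SKETCH_TASK_SIZE_MAX (UINT64_C(1) << 47)

/*
 * Shared address space: the user range must end at or below the limit.
 * The comparison is arranged so that addr + size is never computed and
 * therefore cannot wrap around.
 */
static inline bool sketch_access_ok_shared(uint64_t addr, uint64_t size)
{
	return size <= SKETCH_TASK_SIZE_MAX &&
	       addr <= SKETCH_TASK_SIZE_MAX - size;
}

/*
 * Separate address spaces (the sparc64 situation described above): a
 * user access can never reach kernel memory, so every range is accepted.
 */
static inline bool sketch_access_ok_separate(uint64_t addr, uint64_t size)
{
	(void)addr;
	(void)size;
	return true;
}
```

Routing the second kind of architecture through the first kind of check
would start rejecting user addresses that are perfectly valid there.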

* Re: [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good
  2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
                   ` (13 preceding siblings ...)
  2022-02-14 16:34 ` [PATCH 14/14] uaccess: drop set_fs leftovers Arnd Bergmann
@ 2022-02-14 17:35 ` Linus Torvalds
  14 siblings, 0 replies; 61+ messages in thread
From: Linus Torvalds @ 2022-02-14 17:35 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Linux-MM,
	Guo Ren, linux-sparc, linux-hexagon, linux-riscv, Will Deacon,
	Christoph Hellwig, linux-arch, linux-s390, Brian Cain,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Ard Biesheuvel, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE, linux-xtensa, Arnd Bergmann,
	Heiko Carstens, alpha, linux-um, linux-m68k, openrisc,
	Greentime Hu, Stafford Horne, Linux ARM, Michal Simek,
	Thomas Bogendoerfer, linux-parisc, Nick Hu, Max Filippov,
	Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W. Biederman, Richard Weinberger, Andrew Morton,
	linuxppc-dev, David Miller

On Mon, Feb 14, 2022 at 8:35 AM Arnd Bergmann <arnd@kernel.org> wrote:
>
> I did a patch for microblaze at some point, which turned out to be fairly
> generic, and now ported it to most other architectures, using new generic
> implementations of access_ok() and __{get,put}_kernel_nocheck().

Thanks for doing this.

Apart from the sparc64 issue with completely separate address spaces
(so access_ok() should always return true like Al pointed out), this
looks excellent to me.

Somebody should check that there aren't other cases like sparc64, but
let's merge this asap other than that.

              Linus

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 07/14] uaccess: generalize access_ok()
  2022-02-14 17:15   ` Al Viro
@ 2022-02-14 19:25     ` Arnd Bergmann
  0 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 19:25 UTC (permalink / raw)
  To: Al Viro
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Linux-MM,
	Guo Ren, sparclinux, open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	linux-s390, Brian Cain, Helge Deller, the arch/x86 maintainers,
	Russell King - ARM Linux, linux-csky, Ard Biesheuvel,
	Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um, linuxppc-dev,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Parisc List, Nick Hu,
	Max Filippov, Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W . Biederman, Richard Weinberger, Andrew Morton,
	Linus Torvalds, David Miller

On Mon, Feb 14, 2022 at 6:15 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Mon, Feb 14, 2022 at 05:34:45PM +0100, Arnd Bergmann wrote:
>
> > diff --git a/arch/csky/kernel/signal.c b/arch/csky/kernel/signal.c
> > index c7b763d2f526..8867ddf3e6c7 100644
> > --- a/arch/csky/kernel/signal.c
> > +++ b/arch/csky/kernel/signal.c
> > @@ -136,7 +136,7 @@ static inline void __user *get_sigframe(struct ksignal *ksig,
> >  static int
> >  setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
> >  {
> > -     struct rt_sigframe *frame;
> > +     struct rt_sigframe __user *frame;
> >       int err = 0;
> >
> >       frame = get_sigframe(ksig, regs, sizeof(*frame));
>
> Minor nit: might make sense to separate annotations (here, on nios2, etc.) from the rest...

Done.

> > -}
> > -
> > -static inline int access_ok(const void __user * addr, unsigned long size)
> > -{
> > -     return 1;
> > -}
> > +#define __range_not_ok(addr, size, limit) (!__access_ok(addr, size))
>
> is really wrong.  For sparc64, access_ok() should always be true.
> This __range_not_ok() thing is used *only* for valid_user_frame() in
> arch/sparc/kernel/perf_event.c - it's not a part of normal access_ok()
> there.
>
> sparc64 has separate address spaces for kernel and for userland; access_ok()
> had never been useful there.

Ok, fixed as well now. I had the access_ok() bit right, the definition just
moved around here so it comes before the #include, but I missed the
bit about __range_not_ok(), which I have now reverted back to the
correct version in my tree.

        Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 10/14] uaccess: remove most CONFIG_SET_FS users
  2022-02-14 17:06   ` Christoph Hellwig
@ 2022-02-14 19:40     ` Arnd Bergmann
  0 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 19:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, linuxppc-dev,
	Richard Weinberger, linux-m68k, Openrisc, Greentime Hu,
	Stafford Horne, Linux ARM, Michal Simek, Thomas Bogendoerfer,
	Nick Hu, Parisc List, Linux-MM, Linux API,
	Linux Kernel Mailing List, Dinh Nguyen, Eric W . Biederman,
	alpha, Andrew Morton, Linus Torvalds, David Miller

On Mon, Feb 14, 2022 at 6:06 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Mon, Feb 14, 2022 at 05:34:48PM +0100, Arnd Bergmann wrote:
> > From: Arnd Bergmann <arnd@arndb.de>
> >
> > On almost all architectures, there are no remaining callers
> > of set_fs(), so CONFIG_SET_FS can be disabled, along with
> > removing the thread_info field and any references to it.
> >
> > This turns access_ok() into a cheaper check against TASK_SIZE_MAX.
>
> Wouldn't it make more sense to just merge this into the last patch?

Yes, sounds good. I wasn't sure at first if there is enough buy-in to get
all architectures cleaned up, and I hadn't done the ia64 patch, so it
seemed more important to do this part early, but now it seems that it
will all go in at the same time, so doing this as part of a big removal
at the end makes sense.

        Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 17:02   ` Christoph Hellwig
@ 2022-02-14 19:45     ` Arnd Bergmann
  2022-02-14 20:00       ` Christoph Hellwig
  2022-02-14 20:01       ` Linus Torvalds
  0 siblings, 2 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-14 19:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, linuxppc-dev,
	Richard Weinberger, linux-m68k, Openrisc, Greentime Hu,
	Stafford Horne, Linux ARM, Michal Simek, Thomas Bogendoerfer,
	Nick Hu, Parisc List, Linux-MM, Linux API,
	Linux Kernel Mailing List, Dinh Nguyen, Eric W . Biederman,
	alpha, Andrew Morton, Linus Torvalds, David Miller, Al Viro

On Mon, Feb 14, 2022 at 6:02 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Mon, Feb 14, 2022 at 05:34:42PM +0100, Arnd Bergmann wrote:
> > +#define __range_not_ok(addr, size, limit)    (!__access_ok(addr, size))
> > +#define __chk_range_not_ok(addr, size, limit)        (!__access_ok((void __user *)addr, size))
>
> Can we just kill these off instead of letting them obfuscate the code?

As Al pointed out, they turned out to be necessary on sparc64, but the only
definitions are on sparc64 and x86, so it's possible that they serve a similar
purpose here, in which case changing the limit from TASK_SIZE to
TASK_SIZE_MAX is probably wrong as well.

So either I need to revert the original definition as I did on sparc64, or
they can be removed completely. Hopefully Al or the x86 maintainers
can clarify.

         Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 19:45     ` Arnd Bergmann
@ 2022-02-14 20:00       ` Christoph Hellwig
  2022-02-14 20:01       ` Linus Torvalds
  1 sibling, 0 replies; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-14 20:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Christoph Hellwig, Ingo Molnar,
	Geert Uytterhoeven, open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, linuxppc-dev,
	Richard Weinberger, linux-m68k, Openrisc, Greentime Hu,
	Stafford Horne, Linux ARM, Michal Simek, Thomas Bogendoerfer,
	Nick Hu, Parisc List, Linux-MM, Linux API,
	Linux Kernel Mailing List, Dinh Nguyen, Eric W . Biederman,
	alpha, Andrew Morton, Linus Torvalds, David Miller, Al Viro

On Mon, Feb 14, 2022 at 08:45:52PM +0100, Arnd Bergmann wrote:
> As Al pointed out, they turned out to be necessary on sparc64, but the only
> definitions are on sparc64 and x86, so it's possible that they serve a similar
> purpose here, in which case changing the limit from TASK_SIZE to
> TASK_SIZE_MAX is probably wrong as well.
> 
> So either I need to revert the original definition as I did on sparc64, or
> they can be removed completely. Hopefully Al or the x86 maintainers
> can clarify.

Looking at the x86 users I think:

 - valid_user_frame should go away and the caller should use get_user
   instead of __get_user
 - the one in copy_code can just go away, as there is another check
   in copy_from_user_nmi
 - copy_stack_frame should just use access_ok
 - as does copy_from_user_nmi

but yes, having someone who actually knows this code look over it
would be very helpful.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 19:45     ` Arnd Bergmann
  2022-02-14 20:00       ` Christoph Hellwig
@ 2022-02-14 20:01       ` Linus Torvalds
  2022-02-14 20:17         ` Al Viro
  2022-02-14 20:24         ` Linus Torvalds
  1 sibling, 2 replies; 61+ messages in thread
From: Linus Torvalds @ 2022-02-14 20:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Christoph Hellwig, Ingo Molnar,
	Geert Uytterhoeven, open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, Richard Weinberger,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Nick Hu, Parisc List,
	Linux-MM, Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W . Biederman, alpha, Andrew Morton, linuxppc-dev,
	David Miller, Al Viro

On Mon, Feb 14, 2022 at 11:46 AM Arnd Bergmann <arnd@kernel.org> wrote:
>
> As Al pointed out, they turned out to be necessary on sparc64, but the only
> definitions are on sparc64 and x86, so it's possible that they serve a similar
> purpose here, in which case changing the limit from TASK_SIZE to
> TASK_SIZE_MAX is probably wrong as well.

x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
get_user() assembler implementation does the same.

I think any __range_not_ok() users that use TASK_SIZE are entirely
historical, and should be just fixed.

                 Linus

(*) And by "always" I mean "as far back as I bothered to go". In the
2.6.12 git import, we had

    #define USER_DS          MAKE_MM_SEG(PAGE_OFFSET)

so the user access limit was actually not really TASK_SIZE_MAX at all,
but the beginning of the kernel mapping, which on x86-64 is much much
higher.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 20:01       ` Linus Torvalds
@ 2022-02-14 20:17         ` Al Viro
  2022-02-15  2:47           ` Al Viro
  2022-02-14 20:24         ` Linus Torvalds
  1 sibling, 1 reply; 61+ messages in thread
From: Al Viro @ 2022-02-14 20:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Christoph Hellwig, Ingo Molnar,
	Geert Uytterhoeven, open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, Richard Weinberger,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Arnd Bergmann, Michal Simek, Thomas Bogendoerfer, Nick Hu,
	Parisc List, Linux-MM, Linux API, Linux Kernel Mailing List,
	Dinh Nguyen, Eric W . Biederman, alpha, Andrew Morton,
	linuxppc-dev, David Miller

On Mon, Feb 14, 2022 at 12:01:05PM -0800, Linus Torvalds wrote:
> On Mon, Feb 14, 2022 at 11:46 AM Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > As Al pointed out, they turned out to be necessary on sparc64, but the only
> > definitions are on sparc64 and x86, so it's possible that they serve a similar
> > purpose here, in which case changing the limit from TASK_SIZE to
> > TASK_SIZE_MAX is probably wrong as well.
> 
> x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
> get_user() assembler implementation does the same.
> 
> I think any __range_not_ok() users that use TASK_SIZE are entirely
> historical, and should be just fixed.

IIRC, that was mostly userland stack trace collection in perf.
I'll try to dig through the archives and see what shows up - it's been
a while...

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 20:01       ` Linus Torvalds
  2022-02-14 20:17         ` Al Viro
@ 2022-02-14 20:24         ` Linus Torvalds
  2022-02-14 22:13           ` David Laight
  1 sibling, 1 reply; 61+ messages in thread
From: Linus Torvalds @ 2022-02-14 20:24 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Christoph Hellwig, Ingo Molnar,
	Geert Uytterhoeven, open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, Richard Weinberger,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Nick Hu, Parisc List,
	Linux-MM, Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W . Biederman, alpha, Andrew Morton, linuxppc-dev,
	David Miller, Al Viro

On Mon, Feb 14, 2022 at 12:01 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
> get_user() assembler implementation does the same.

Side note: we could just check the sign bit instead, and avoid big
constants that way.

Right now we actually have this complexity in the x86-64 user access code:

  #ifdef CONFIG_X86_5LEVEL
  #define LOAD_TASK_SIZE_MINUS_N(n) \
        ALTERNATIVE __stringify(mov $((1 << 47) - 4096 - (n)),%rdx), \
                    __stringify(mov $((1 << 56) - 4096 - (n)),%rdx), X86_FEATURE_LA57
  #else
  #define LOAD_TASK_SIZE_MINUS_N(n) \
          mov $(TASK_SIZE_MAX - (n)),%_ASM_DX
  #endif

just because the code tries to get that TASK_SIZE_MAX boundary just right.

And getting that boundary just right is important on 32-bit x86, but
it's *much* less important on x86-64.

There's still a (weak) reason to do it even for 64-bit code: page
faults outside the valid user space range don't actually cause a #PF
fault - they cause #GP - and then we have the #GP handler warn about
"this address hasn't been checked".

Which is nice and useful for doing syzbot kind of randomization loads
(ie user accesses that didn't go through access_ok() will stand out
nicely), but maybe it's not worth this. syzbot would be fine with only
the "sign bit set" case warning for the same thing.

So on x86-64, we could just check the sign of the address instead, and
simplify and shrink those get/put_user() code sequences (but
array_index_mask_nospec() currently uses the carry flag computation
too, so we'd have to change that part as well, maybe not worth it).

                  Linus
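
A rough user-space sketch of the two checks compared in the message above,
for illustration only. DEMO_TASK_SIZE_MAX and both helper names are made up
here; the constant mirrors the 4-level "(1 << 47) - 4096" value from the
quoted macro, and none of this is the kernel's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TASK_SIZE_MAX with 4-level paging. */
#define DEMO_TASK_SIZE_MAX ((UINT64_C(1) << 47) - 4096)

/* Exact check: the address must lie below the user-space limit. */
static inline bool demo_addr_below_limit(uint64_t addr)
{
	return addr < DEMO_TASK_SIZE_MAX;
}

/* Cheaper check: only reject addresses that have the top (sign) bit set. */
static inline bool demo_addr_sign_clear(uint64_t addr)
{
	return (addr >> 63) == 0;
}
```

The second helper needs no large constant, but it also accepts addresses
between the limit and 1 << 63. Per the message, an access to such an address
would raise #GP rather than #PF, which is the trade-off being weighed.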

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-14 16:34 ` [PATCH 08/14] arm64: simplify access_ok() Arnd Bergmann
@ 2022-02-14 21:06   ` Robin Murphy
  2022-02-15  8:17   ` Ard Biesheuvel
  2022-02-15 11:07   ` Mark Rutland
  2 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2022-02-14 21:06 UTC (permalink / raw)
  To: Arnd Bergmann, Linus Torvalds, Christoph Hellwig, linux-arch,
	linux-mm, linux-api, arnd, linux-kernel
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, jcmvbkbc,
	guoren, sparclinux, linux-hexagon, linux-riscv, will, ardb,
	linux-s390, bcain, deller, x86, linux, linux-csky, mingo, geert,
	linux-snps-arc, linux-xtensa, hca, linux-alpha, linux-um,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, linux-mips, dinguyen, ebiederm,
	richard, akpm, linuxppc-dev, davem

On 2022-02-14 16:34, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> arm64 has an inline asm implementation of access_ok() that is derived from
> the 32-bit arm version and optimized for the case that both the limit and
> the size are variable. With set_fs() gone, the limit is always constant,
> and the size usually is as well, so just using the default implementation
> reduces the check into a comparison against a constant that can be
> scheduled by the compiler.

Aww, I still vividly remember the birth of this madness, sat with my 
phone on a Saturday morning waiting for my bike to be MOT'd, staring at 
the 7-instruction sequence that Mark and I had come up with and certain 
that it could be shortened still. Kinda sad to see it go, but at the 
same time, glad that it can.

Acked-by: Robin Murphy <robin.murphy@arm.com>

> On a defconfig build, this saves over 28KB of .text.

Not to mention saving those "WTF is going on there... oh yeah, 
access_ok()" moments when looking through disassembly :)

Cheers,
Robin.

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>   arch/arm64/include/asm/uaccess.h | 28 +++++-----------------------
>   1 file changed, 5 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 357f7bd9c981..e8dce0cc5eaa 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -26,6 +26,8 @@
>   #include <asm/memory.h>
>   #include <asm/extable.h>
>   
> +static inline int __access_ok(const void __user *ptr, unsigned long size);
> +
>   /*
>    * Test whether a block of memory is a valid user space address.
>    * Returns 1 if the range is valid, 0 otherwise.
> @@ -33,10 +35,8 @@
>    * This is equivalent to the following test:
>    * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
>    */
> -static inline unsigned long __access_ok(const void __user *addr, unsigned long size)
> +static inline int access_ok(const void __user *addr, unsigned long size)
>   {
> -	unsigned long ret, limit = TASK_SIZE_MAX - 1;
> -
>   	/*
>   	 * Asynchronous I/O running in a kernel thread does not have the
>   	 * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag
> @@ -46,27 +46,9 @@ static inline unsigned long __access_ok(const void __user *addr, unsigned long s
>   	    (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
>   		addr = untagged_addr(addr);
>   
> -	__chk_user_ptr(addr);
> -	asm volatile(
> -	// A + B <= C + 1 for all A,B,C, in four easy steps:
> -	// 1: X = A + B; X' = X % 2^64
> -	"	adds	%0, %3, %2\n"
> -	// 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
> -	"	csel	%1, xzr, %1, hi\n"
> -	// 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
> -	//    to compensate for the carry flag being set in step 4. For
> -	//    X > 2^64, X' merely has to remain nonzero, which it does.
> -	"	csinv	%0, %0, xzr, cc\n"
> -	// 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
> -	//    comes from the carry in being clear. Otherwise, we are
> -	//    testing X' - C == 0, subject to the previous adjustments.
> -	"	sbcs	xzr, %0, %1\n"
> -	"	cset	%0, ls\n"
> -	: "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
> -
> -	return ret;
> +	return likely(__access_ok(addr, size));
>   }
> -#define __access_ok __access_ok
> +#define access_ok access_ok
>   
>   #include <asm-generic/access_ok.h>
>   
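
The comment in the quoted diff describes the check as
"(u65)addr + (u65)size <= (u65)TASK_SIZE_MAX". A minimal user-space sketch
of one common way to express that against a constant limit without 65-bit
arithmetic follows. DEMO_LIMIT and demo_range_ok() are hypothetical names,
and this is not copied from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit; the real TASK_SIZE_MAX is architecture-specific. */
#define DEMO_LIMIT (UINT64_C(1) << 48)

/*
 * Same result as (u65)addr + (u65)size <= (u65)DEMO_LIMIT, written so that
 * no addition is needed and the subtraction cannot wrap: size is compared
 * against the limit before it is subtracted from it.
 */
static inline bool demo_range_ok(uint64_t addr, uint64_t size)
{
	return size <= DEMO_LIMIT && addr <= DEMO_LIMIT - size;
}
```

With a constant limit and a constant size, "DEMO_LIMIT - size" folds to a
constant as well, so what remains is a single comparison of addr against a
constant, which is the effect the commit message describes.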

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 20:24         ` Linus Torvalds
@ 2022-02-14 22:13           ` David Laight
  0 siblings, 0 replies; 61+ messages in thread
From: David Laight @ 2022-02-14 22:13 UTC (permalink / raw)
  To: 'Linus Torvalds', Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, Linux Kernel Mailing List, Max Filippov, Guo Ren,
	sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel, linux-arch,
	linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Christoph Hellwig, Ingo Molnar,
	Geert Uytterhoeven, open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um, linux-m68k,
	Openrisc, Greentime Hu, Stafford Horne, Linux ARM, Michal Simek,
	Thomas Bogendoerfer, Parisc List, Nick Hu, Linux-MM, Linux API,
	open list:BROADCOM NVRAM DRIVER, Dinh Nguyen, Eric W . Biederman,
	Richard Weinberger, Andrew Morton, linuxppc-dev, David Miller,
	Al Viro

From: Linus Torvalds
> Sent: 14 February 2022 20:24
> >
> > x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
> > get_user() assembler implementation does the same.
> 
> Side note: we could just check the sign bit instead, and avoid big
> constants that way.

The cheap test for most 64-bit architectures is (addr | size) >> 62 != 0.

I did some tests last week and the compilers correctly optimise
out constant size.

Doesn't sparc64 still need a wrap test?
Or is that assumed because there is always an unmapped page
and transfers are 'adequately' done on increasing addresses?

	David
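
For reference, the cheap test from the message above written out as a small
user-space function. The name demo_cheap_range_bad() is made up for this
sketch, and it assumes 64-bit addresses and sizes.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the range must be rejected, i.e. when either addr or
 * size has bit 62 or bit 63 set. If both values are below 2^62, their sum
 * is below 2^63, so addr + size cannot wrap around.
 */
static inline bool demo_cheap_range_bad(uint64_t addr, uint64_t size)
{
	return ((addr | size) >> 62) != 0;
}
```

The test involves no limit constant at all, so it only makes sense where
every address below 2^63 is either valid user space or guaranteed to fault.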


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault
  2022-02-14 16:34 ` [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault Arnd Bergmann
  2022-02-14 17:02   ` Christoph Hellwig
@ 2022-02-15  0:31   ` Al Viro
  2022-02-15 13:16     ` Arnd Bergmann
  1 sibling, 1 reply; 61+ messages in thread
From: Al Viro @ 2022-02-15  0:31 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	linux-mm, guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, deller, x86,
	linux, linux-csky, ardb, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-alpha, linux-um, linuxppc-dev,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, jcmvbkbc, linux-api,
	linux-kernel, dinguyen, ebiederm, richard, akpm, Linus Torvalds,
	davem

On Mon, Feb 14, 2022 at 05:34:43PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> All architectures that don't provide __{get,put}_kernel_nofault() yet
> can implement this on top of __{get,put}_user.
> 
> Add a generic version that lets everything use the normal
> copy_{from,to}_kernel_nofault() code based on these, removing the last
> use of get_fs()/set_fs() from architecture-independent code.

I'd put the list of those architectures (AFAICS, that's alpha, ia64,
microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa) into commit
message - it's not that hard to find out, but...

And AFAICS, you've missed nios2 - see
#define __put_user(x, ptr) put_user(x, ptr)
in there.  nds32 oddities are dealt with earlier in the series, this
one is not...

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 09/14] m68k: drop custom __access_ok()
  2022-02-14 16:34 ` [PATCH 09/14] m68k: drop custom __access_ok() Arnd Bergmann
@ 2022-02-15  0:37   ` Al Viro
  2022-02-15  6:29     ` Christoph Hellwig
  0 siblings, 1 reply; 61+ messages in thread
From: Al Viro @ 2022-02-15  0:37 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	linux-mm, guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, deller, x86,
	linux, linux-csky, ardb, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-alpha, linux-um, linuxppc-dev,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, jcmvbkbc, linux-api,
	linux-kernel, dinguyen, ebiederm, richard, akpm, Linus Torvalds,
	davem

On Mon, Feb 14, 2022 at 05:34:47PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> While most m68k platforms use separate address spaces for user
> and kernel space, at least coldfire does not, and the other
> ones have a TASK_SIZE that is less than the entire 4GB address
> range.
> 
> Using the generic implementation of __access_ok() stops coldfire
> user space from trivially accessing kernel memory, and is probably
> the right thing elsewhere for consistency as well.

Perhaps simply wrap that sucker into #ifdef CONFIG_CPU_HAS_ADDRESS_SPACES
(and trim the comment down to "coldfire and 68000 will pick generic
variant")?
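
A user-space sketch of the shape suggested above, for illustration only.
DEMO_CPU_HAS_ADDRESS_SPACES stands in for CONFIG_CPU_HAS_ADDRESS_SPACES,
and DEMO_TASK_SIZE and demo_access_ok() are made-up names; the fallback
branch is an assumed bounds check, not the kernel's generic helper.

```c
#include <stdbool.h>

/* Hypothetical user-space limit, for illustration only. */
#define DEMO_TASK_SIZE 0xC0000000UL

#ifdef DEMO_CPU_HAS_ADDRESS_SPACES
/* Separate user and kernel address spaces: let the MMU do all checking. */
static inline bool demo_access_ok(unsigned long addr, unsigned long size)
{
	(void)addr;
	(void)size;
	return true;
}
#else
/* coldfire and 68000 pick the generic variant: a plain bounds check. */
static inline bool demo_access_ok(unsigned long addr, unsigned long size)
{
	return size <= DEMO_TASK_SIZE && addr <= DEMO_TASK_SIZE - size;
}
#endif
```

Built without DEMO_CPU_HAS_ADDRESS_SPACES, the check rejects anything at or
above the limit; built with it, every range is accepted and protection is
left to the hardware.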

> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/m68k/include/asm/uaccess.h | 13 -------------
>  1 file changed, 13 deletions(-)
> 
> diff --git a/arch/m68k/include/asm/uaccess.h b/arch/m68k/include/asm/uaccess.h
> index d6bb5720365a..64914872a5c9 100644
> --- a/arch/m68k/include/asm/uaccess.h
> +++ b/arch/m68k/include/asm/uaccess.h
> @@ -10,19 +10,6 @@
>  #include <linux/compiler.h>
>  #include <linux/types.h>
>  #include <asm/extable.h>
> -
> -/* We let the MMU do all checking */
> -static inline int __access_ok(const void __user *addr,
> -			    unsigned long size)
> -{
> -	/*
> -	 * XXX: for !CONFIG_CPU_HAS_ADDRESS_SPACES this really needs to check
> -	 * for TASK_SIZE!
> -	 * Removing this helper is probably sufficient.
> -	 */
> -	return 1;
> -}
> -#define __access_ok __access_ok
>  #include <asm-generic/access_ok.h>
>  
>  /*
> -- 
> 2.29.2
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 11/14] sparc64: remove CONFIG_SET_FS support
  2022-02-14 16:34 ` [PATCH 11/14] sparc64: remove CONFIG_SET_FS support Arnd Bergmann
  2022-02-14 17:06   ` Christoph Hellwig
@ 2022-02-15  0:48   ` Al Viro
  2022-02-16 13:07     ` Arnd Bergmann
  1 sibling, 1 reply; 61+ messages in thread
From: Al Viro @ 2022-02-15  0:48 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	linux-mm, guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, deller, x86,
	linux, linux-csky, ardb, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-alpha, linux-um, linuxppc-dev,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, jcmvbkbc, linux-api,
	linux-kernel, dinguyen, ebiederm, richard, akpm, Linus Torvalds,
	davem

On Mon, Feb 14, 2022 at 05:34:49PM +0100, Arnd Bergmann wrote:

> -/*
> - * Sparc64 is segmented, though more like the M68K than the I386.
> - * We use the secondary ASI to address user memory, which references a
> - * completely different VM map, thus there is zero chance of the user
> - * doing something queer and tricking us into poking kernel memory.

Actually, this part of comment probably ought to stay - it is relevant
for understanding what's going on (e.g. why is access_ok() always true, etc.)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 04/14] x86: use more conventional access_ok() definition
  2022-02-14 20:17         ` Al Viro
@ 2022-02-15  2:47           ` Al Viro
  0 siblings, 0 replies; 61+ messages in thread
From: Al Viro @ 2022-02-15  2:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Christoph Hellwig, Ingo Molnar,
	Geert Uytterhoeven, open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, Richard Weinberger,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Arnd Bergmann, Michal Simek, Thomas Bogendoerfer, Nick Hu,
	Parisc List, Linux-MM, Linux API, Linux Kernel Mailing List,
	Dinh Nguyen, Eric W . Biederman, alpha, Andrew Morton,
	linuxppc-dev, David Miller

On Mon, Feb 14, 2022 at 08:17:07PM +0000, Al Viro wrote:
> On Mon, Feb 14, 2022 at 12:01:05PM -0800, Linus Torvalds wrote:
> > On Mon, Feb 14, 2022 at 11:46 AM Arnd Bergmann <arnd@kernel.org> wrote:
> > >
> > > As Al pointed out, they turned out to be necessary on sparc64, but the only
> > > definitions are on sparc64 and x86, so it's possible that they serve a similar
> > > purpose here, in which case changing the limit from TASK_SIZE to
> > > TASK_SIZE_MAX is probably wrong as well.
> > 
> > x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
> > get_user() assembler implementation does the same.
> > 
> > I think any __range_not_ok() users that use TASK_SIZE are entirely
> > historical, and should be just fixed.
> 
> IIRC, that was mostly userland stack trace collection in perf.
> I'll try to dig through the archives and see what shows up - it's been
> a while...

After some digging:

	access_ok() needs only to make sure that MMU won't go anywhere near
the kernel page tables; address limit for 32bit threads is none of its
concern, so TASK_SIZE_MAX is right for it.

	valid_user_frame() in arch/x86/events/core.c: used while walking
the userland call chain.  The reason it's not access_ok() is only that
perf_callchain_user() might've been called from interrupt that came while
we'd been under KERNEL_DS.
	That had been back in 2015 and it had been obsoleted since 2017, commit
88b0193d9418 (perf/callchain: Force USER_DS when invoking perf_callchain_user()).
We had been guaranteed USER_DS ever since.
	IOW, it could've reverted to use of access_ok() at any point after that.
TASK_SIZE vs TASK_SIZE_MAX is pretty much an accident there - might've been
TASK_SIZE_MAX from the very beginning.

	copy_stack_frame() in arch/x86/kernel/stacktrace.c: similar story,
except the commit that made sure callers will have USER_DS - cac9b9a4b083
(stacktrace: Force USER_DS for stack_trace_save_user()) in this case.
Also could've been using access_ok() just fine.  Amusingly, access_ok()
used to be there, until it had been replaced with explicit check on
Jul 22 2019 - 4 days after that had been made useless by fix in the caller...

	copy_from_user_nmi().  That one is a bit more interesting.
We have a call chain from perf_output_sample_ustack() (covered by
force_uaccess_begin() these days, not that it mattered for x86 now),
there's something odd in dumpstack.c:copy_code() (with explicit check
for TASK_SIZE_MAX in the caller) and there's a couple of callers in
Intel PMU code.
	AFAICS, there's no reason whatsoever to use TASK_SIZE
in that one - the point is to prevent copyin from the kernel
memory, and in that respect TASK_SIZE_MAX isn't any worse.
The check in copy_code() probably should go.

	So all of those guys should be simply switched to access_ok().
Might be worth making that a preliminary patch - it's independent
from everything else and there's no point folding it into any of the
patches in the series.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 14/14] uaccess: drop set_fs leftovers
  2022-02-14 16:34 ` [PATCH 14/14] uaccess: drop set_fs leftovers Arnd Bergmann
@ 2022-02-15  3:03   ` Al Viro
  2022-02-15  7:46     ` Helge Deller
  0 siblings, 1 reply; 61+ messages in thread
From: Al Viro @ 2022-02-15  3:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	linux-mm, guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, deller, x86,
	linux, linux-csky, ardb, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-alpha, linux-um, linuxppc-dev,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, jcmvbkbc, linux-api,
	linux-kernel, dinguyen, ebiederm, richard, akpm, Linus Torvalds,
	davem

On Mon, Feb 14, 2022 at 05:34:52PM +0100, Arnd Bergmann wrote:
> diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
> index b5835325d44b..2f4a1b1ef387 100644
> --- a/arch/parisc/include/asm/futex.h
> +++ b/arch/parisc/include/asm/futex.h
> @@ -99,7 +99,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
>  	/* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
>  	 * our gateway page, and causes no end of trouble...
>  	 */
> -	if (uaccess_kernel() && !uaddr)
> +	if (!uaddr)
>  		return -EFAULT;

	Huh?  uaccess_kernel() is removed since it becomes always false now,
so this looks odd.

	AFAICS, the comment above that check refers to futex_detect_cmpxchg()
-> cmpxchg_futex_value_locked() -> futex_atomic_cmpxchg_inatomic() call chain.
Which had been gone since commit 3297481d688a (futex: Remove futex_cmpxchg
detection).  The comment *and* the check should've been killed off back
then.
	Let's make sure to get both now...

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 09/14] m68k: drop custom __access_ok()
  2022-02-15  0:37   ` Al Viro
@ 2022-02-15  6:29     ` Christoph Hellwig
  2022-02-15  7:13       ` Al Viro
  0 siblings, 1 reply; 61+ messages in thread
From: Christoph Hellwig @ 2022-02-15  6:29 UTC (permalink / raw)
  To: Al Viro
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	linux-mm, guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, deller, x86,
	linux, linux-csky, ardb, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-alpha, linux-um, linuxppc-dev,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel,
	Arnd Bergmann, monstr, tsbogend, linux-parisc, nickhu, jcmvbkbc,
	linux-api, linux-kernel, dinguyen, ebiederm, richard, akpm,
	Linus Torvalds, davem

On Tue, Feb 15, 2022 at 12:37:41AM +0000, Al Viro wrote:
> Perhaps simply wrap that sucker into #ifdef CONFIG_CPU_HAS_ADDRESS_SPACES
> (and trim the comment down to "coldfire and 68000 will pick generic
> variant")?

I wonder if we should invert CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE,
select the separate address space config for s390, sparc64, non-coldfire
m68k and mips with EVA and then just have one single access_ok for
overlapping address space (as added by Arnd) and non-overlapping ones
(always return true).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 09/14] m68k: drop custom __access_ok()
  2022-02-15  6:29     ` Christoph Hellwig
@ 2022-02-15  7:13       ` Al Viro
  2022-02-15 10:02         ` Arnd Bergmann
  0 siblings, 1 reply; 61+ messages in thread
From: Al Viro @ 2022-02-15  7:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	linux-mm, guoren, sparclinux, linux-hexagon, linux-riscv, will,
	ardb, linux-arch, linux-s390, bcain, deller, x86, linux,
	linux-csky, mingo, geert, linux-snps-arc, linux-xtensa, arnd,
	hca, linux-alpha, linux-um, linuxppc-dev, linux-m68k, openrisc,
	green.hu, shorne, linux-arm-kernel, Arnd Bergmann, monstr,
	tsbogend, linux-parisc, nickhu, jcmvbkbc, linux-api,
	linux-kernel, dinguyen, ebiederm, richard, akpm, Linus Torvalds,
	davem

On Tue, Feb 15, 2022 at 07:29:42AM +0100, Christoph Hellwig wrote:
> On Tue, Feb 15, 2022 at 12:37:41AM +0000, Al Viro wrote:
> > Perhaps simply wrap that sucker into #ifdef CONFIG_CPU_HAS_ADDRESS_SPACES
> > (and trim the comment down to "coldfire and 68000 will pick generic
> > variant")?
> 
> I wonder if we should invert CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE,
> select the separate address space config for s390, sparc64, non-coldfire
> m68k and mips with EVA and then just have one single access_ok for
> overlapping address space (as added by Arnd) and non-overlapping ones
> (always return true).

parisc is also such...  How about

	select ALTERNATE_SPACE_USERLAND

for that bunch?  While we are at it, how many unusual access_ok() instances are
left after this series?  arm64, itanic, um, anything else?

FWIW, sparc32 has a slightly unusual instance (see uaccess_32.h there); it's
obviously cheaper than generic and I wonder if the trick is legitimate (and
applicable elsewhere, perhaps)...

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 14/14] uaccess: drop set_fs leftovers
  2022-02-15  3:03   ` Al Viro
@ 2022-02-15  7:46     ` Helge Deller
  2022-02-15  8:10       ` Arnd Bergmann
  0 siblings, 1 reply; 61+ messages in thread
From: Helge Deller @ 2022-02-15  7:46 UTC (permalink / raw)
  To: Al Viro, Arnd Bergmann
  Cc: mark.rutland, dalias, linux-ia64, linux-sh, peterz, linux-mips,
	linux-mm, guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, x86, linux,
	linux-csky, ardb, mingo, geert, linux-snps-arc, linux-xtensa,
	arnd, hca, linux-alpha, linux-um, linuxppc-dev, linux-m68k,
	openrisc, green.hu, shorne, linux-arm-kernel, monstr, tsbogend,
	linux-parisc, nickhu, jcmvbkbc, linux-api, linux-kernel,
	dinguyen, ebiederm, richard, akpm, Linus Torvalds, davem

On 2/15/22 04:03, Al Viro wrote:
> On Mon, Feb 14, 2022 at 05:34:52PM +0100, Arnd Bergmann wrote:
>> diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
>> index b5835325d44b..2f4a1b1ef387 100644
>> --- a/arch/parisc/include/asm/futex.h
>> +++ b/arch/parisc/include/asm/futex.h
>> @@ -99,7 +99,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
>>  	/* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
>>  	 * our gateway page, and causes no end of trouble...
>>  	 */
>> -	if (uaccess_kernel() && !uaddr)
>> +	if (!uaddr)
>>  		return -EFAULT;
>
> 	Huh?  uaccess_kernel() is removed since it becomes always false now,
> so this looks odd.
>
> 	AFAICS, the comment above that check refers to futex_detect_cmpxchg()
> -> cmpxchg_futex_value_locked() -> futex_atomic_cmpxchg_inatomic() call chain.
> Which had been gone since commit 3297481d688a (futex: Remove futex_cmpxchg
> detection).  The comment *and* the check should've been killed off back
> then.
> 	Let's make sure to get both now...

Right. Arnd, can you drop this if() and the comment above it?

Thanks,
Helge

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 14/14] uaccess: drop set_fs leftovers
  2022-02-15  7:46     ` Helge Deller
@ 2022-02-15  8:10       ` Arnd Bergmann
  0 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-15  8:10 UTC (permalink / raw)
  To: Helge Deller
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Linux-MM,
	Guo Ren, sparclinux, open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	linux-s390, Brian Cain, the arch/x86 maintainers,
	Russell King - ARM Linux, linux-csky, Ard Biesheuvel,
	Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um, linuxppc-dev,
	linux-m68k, Openrisc, Al Viro, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Parisc List, Nick Hu,
	Max Filippov, Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W . Biederman, Richard Weinberger, Andrew Morton,
	Linus Torvalds, David Miller, Greentime Hu

On Tue, Feb 15, 2022 at 8:46 AM Helge Deller <deller@gmx.de> wrote:
>
> On 2/15/22 04:03, Al Viro wrote:
> > On Mon, Feb 14, 2022 at 05:34:52PM +0100, Arnd Bergmann wrote:
> >> diff --git a/arch/parisc/include/asm/futex.h b/arch/parisc/include/asm/futex.h
> >> index b5835325d44b..2f4a1b1ef387 100644
> >> --- a/arch/parisc/include/asm/futex.h
> >> +++ b/arch/parisc/include/asm/futex.h
> >> @@ -99,7 +99,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
> >>      /* futex.c wants to do a cmpxchg_inatomic on kernel NULL, which is
> >>       * our gateway page, and causes no end of trouble...
> >>       */
> >> -    if (uaccess_kernel() && !uaddr)
> >> +    if (!uaddr)
> >>              return -EFAULT;
> >
> >       Huh?  uaccess_kernel() is removed since it becomes always false now,
> > so this looks odd.
> >
> >       AFAICS, the comment above that check refers to futex_detect_cmpxchg()
> > -> cmpxchg_futex_value_locked() -> futex_atomic_cmpxchg_inatomic() call chain.
> > Which had been gone since commit 3297481d688a (futex: Remove futex_cmpxchg
> > detection).  The comment *and* the check should've been killed off back
> > then.
> >       Let's make sure to get both now...
>
> Right. Arnd, can you drop this if() and the comment above it?

Done.

       Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-14 16:34 ` [PATCH 08/14] arm64: simplify access_ok() Arnd Bergmann
  2022-02-14 21:06   ` Robin Murphy
@ 2022-02-15  8:17   ` Ard Biesheuvel
  2022-02-15  9:12     ` Arnd Bergmann
  2022-02-15  9:30     ` David Laight
  2022-02-15 11:07   ` Mark Rutland
  2 siblings, 2 replies; 61+ messages in thread
From: Ard Biesheuvel @ 2022-02-15  8:17 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, linux-sh, Peter Zijlstra,
	open list:MIPS, Linux Memory Management List, Guo Ren,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	linux-hexagon, linux-riscv, Will Deacon, Christoph Hellwig,
	linux-arch, open list:S390, Brian Cain, Helge Deller, X86 ML,
	Russell King, linux-csky, Ingo Molnar, Geert Uytterhoeven,
	linux-snps-arc, Robin Murphy,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-m68k, openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	monstr, Thomas Bogendoerfer, open list:PARISC ARCHITECTURE,
	Nick Hu, Max Filippov, linux-api, Linux Kernel Mailing List,
	dinguyen, Eric W. Biederman, Richard Weinberger, Andrew Morton,
	Linus Torvalds, David S. Miller

On Mon, 14 Feb 2022 at 17:37, Arnd Bergmann <arnd@kernel.org> wrote:
>
> From: Arnd Bergmann <arnd@arndb.de>
>
> arm64 has an inline asm implementation of access_ok() that is derived from
> the 32-bit arm version and optimized for the case that both the limit and
> the size are variable. With set_fs() gone, the limit is always constant,
> and the size usually is as well, so just using the default implementation
> reduces the check into a comparison against a constant that can be
> scheduled by the compiler.
>
> On a defconfig build, this saves over 28KB of .text.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/arm64/include/asm/uaccess.h | 28 +++++-----------------------
>  1 file changed, 5 insertions(+), 23 deletions(-)
>
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 357f7bd9c981..e8dce0cc5eaa 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -26,6 +26,8 @@
>  #include <asm/memory.h>
>  #include <asm/extable.h>
>
> +static inline int __access_ok(const void __user *ptr, unsigned long size);
> +
>  /*
>   * Test whether a block of memory is a valid user space address.
>   * Returns 1 if the range is valid, 0 otherwise.
> @@ -33,10 +35,8 @@
>   * This is equivalent to the following test:
>   * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
>   */
> -static inline unsigned long __access_ok(const void __user *addr, unsigned long size)
> +static inline int access_ok(const void __user *addr, unsigned long size)
>  {
> -       unsigned long ret, limit = TASK_SIZE_MAX - 1;
> -
>         /*
>          * Asynchronous I/O running in a kernel thread does not have the
>          * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag
> @@ -46,27 +46,9 @@ static inline unsigned long __access_ok(const void __user *addr, unsigned long s
>             (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
>                 addr = untagged_addr(addr);
>
> -       __chk_user_ptr(addr);
> -       asm volatile(
> -       // A + B <= C + 1 for all A,B,C, in four easy steps:
> -       // 1: X = A + B; X' = X % 2^64
> -       "       adds    %0, %3, %2\n"
> -       // 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
> -       "       csel    %1, xzr, %1, hi\n"
> -       // 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
> -       //    to compensate for the carry flag being set in step 4. For
> -       //    X > 2^64, X' merely has to remain nonzero, which it does.
> -       "       csinv   %0, %0, xzr, cc\n"
> -       // 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
> -       //    comes from the carry in being clear. Otherwise, we are
> -       //    testing X' - C == 0, subject to the previous adjustments.
> -       "       sbcs    xzr, %0, %1\n"
> -       "       cset    %0, ls\n"
> -       : "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
> -
> -       return ret;
> +       return likely(__access_ok(addr, size));
>  }
> -#define __access_ok __access_ok
> +#define access_ok access_ok
>
>  #include <asm-generic/access_ok.h>
>
> --
> 2.29.2
>

With set_fs() out of the picture, wouldn't it be sufficient to check
that bit #55 is clear? (the bit that selects between TTBR0 and TTBR1)
That would also remove the need to strip the tag from the address.

Something like

    asm goto("tbnz  %0, #55, %2     \n"
             "tbnz  %1, #55, %2     \n"
             :: "r"(addr), "r"(addr + size - 1) :: notok);
    return 1;
notok:
    return 0;

with an additional sanity check on the size which the compiler could
eliminate for compile-time constant values.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-15  8:17   ` Ard Biesheuvel
@ 2022-02-15  9:12     ` Arnd Bergmann
  2022-02-15  9:21       ` Ard Biesheuvel
  2022-02-16 19:43       ` Christophe Leroy
  2022-02-15  9:30     ` David Laight
  1 sibling, 2 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-15  9:12 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:MIPS, Linux Memory Management List,
	Guo Ren, open list:SPARC + UltraSPARC (sparc/sparc64),
	open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	open list:S390, Brian Cain, Helge Deller, X86 ML, Russell King,
	linux-csky, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE, Robin Murphy,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, open list:PARISC ARCHITECTURE,
	Nick Hu, Max Filippov, Linux API, Linux Kernel Mailing List,
	Dinh Nguyen, Eric W. Biederman, Richard Weinberger,
	Andrew Morton, Linus Torvalds, David S. Miller

On Tue, Feb 15, 2022 at 9:17 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> On Mon, 14 Feb 2022 at 17:37, Arnd Bergmann <arnd@kernel.org> wrote:
> > From: Arnd Bergmann <arnd@arndb.de>
> >
>
> With set_fs() out of the picture, wouldn't it be sufficient to check
> that bit #55 is clear? (the bit that selects between TTBR0 and TTBR1)
> That would also remove the need to strip the tag from the address.
>
> Something like
>
>     asm goto("tbnz  %0, #55, %2     \n"
>              "tbnz  %1, #55, %2     \n"
>              :: "r"(addr), "r"(addr + size - 1) :: notok);
>     return 1;
> notok:
>     return 0;
>
> with an additional sanity check on the size which the compiler could
> eliminate for compile-time constant values.

That should work, but I don't see it as a clear enough advantage to
have a custom implementation. For the constant-size case, it probably
isn't better than a compiler-scheduled comparison against a
constant limit, but it does hurt maintainability when the next person
wants to change the behavior of access_ok() globally.
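
A minimal userspace sketch of what "a comparison against a constant
limit" can look like, and why the order of operations matters. This is a
model for illustration, not the kernel's implementation: the names
model_range_ok(), model_range_ok_naive() and MODEL_TASK_SIZE_MAX are
hypothetical, and the limit value is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TASK_SIZE_MAX; the real value is per-architecture. */
#define MODEL_TASK_SIZE_MAX (UINT64_C(1) << 47)

/* Naive form: addr + size can wrap around and slip under the limit. */
static bool model_range_ok_naive(uint64_t addr, uint64_t size)
{
	return addr + size <= MODEL_TASK_SIZE_MAX;
}

/* Overflow-safe form: the subtraction cannot wrap once size <= limit holds. */
static bool model_range_ok(uint64_t addr, uint64_t size)
{
	const uint64_t limit = MODEL_TASK_SIZE_MAX;

	return size <= limit && addr <= limit - size;
}
```

With a constant size, limit - size folds to a constant at compile time,
so the second form reduces to one compare against an immediate.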

If we want to get into micro-optimizing uaccess, I think a better target
would be a CONFIG_CC_HAS_ASM_GOTO_OUTPUT version
of __get_user()/__put_user as we have on x86 and powerpc.

         Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 03/14] nds32: fix access_ok() checks in get/put_user
  2022-02-14 17:01   ` Christoph Hellwig
  2022-02-14 17:10     ` David Laight
@ 2022-02-15  9:18     ` Arnd Bergmann
  2022-02-15 10:25       ` Greg KH
  1 sibling, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-15  9:18 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, linuxppc-dev,
	Richard Weinberger, linux-m68k, Openrisc, Greentime Hu,
	Stafford Horne, Linux ARM, Michal Simek, Thomas Bogendoerfer,
	Nick Hu, Parisc List, Linux-MM, Linux API,
	Linux Kernel Mailing List, # 3.4.x, Dinh Nguyen,
	Eric W . Biederman, alpha, Andrew Morton, Linus Torvalds,
	David Miller

On Mon, Feb 14, 2022 at 6:01 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Mon, Feb 14, 2022 at 05:34:41PM +0100, Arnd Bergmann wrote:
> > From: Arnd Bergmann <arnd@arndb.de>
> >
> > The get_user()/put_user() functions are meant to check for
> > access_ok(), while the __get_user()/__put_user() functions
> > don't.
> >
> > This broke in 4.19 for nds32, when it gained an extraneous
> > check in __get_user(), but lost the check it needs in
> > __put_user().
>
> Can we follow the lead of MIPS (which this was originally copied
> from I think) and kill the pointless __get/put_user_check wrapper
> that just obfuscates the code?

I had another look, but I think that would be a bigger change than
I want to have in a fix for stable backports, as nds32 also uses
the _check versions in __{get,put}_user_error.

If we instead clean it up in a separate patch, it should be done for
all eight architectures that do the same thing, but at that point,
the time seems better spent at coming up with a new set of
calling conventions that work with asm-goto.

         Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-15  9:12     ` Arnd Bergmann
@ 2022-02-15  9:21       ` Ard Biesheuvel
  2022-02-15  9:39         ` Arnd Bergmann
  2022-02-15 10:37         ` Mark Rutland
  2022-02-16 19:43       ` Christophe Leroy
  1 sibling, 2 replies; 61+ messages in thread
From: Ard Biesheuvel @ 2022-02-15  9:21 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:MIPS, Linux Memory Management List,
	Guo Ren, open list:SPARC + UltraSPARC (sparc/sparc64),
	open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	open list:S390, Brian Cain, Helge Deller, X86 ML, Russell King,
	linux-csky, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE, Robin Murphy,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, open list:PARISC ARCHITECTURE,
	Nick Hu, Max Filippov, Linux API, Linux Kernel Mailing List,
	Dinh Nguyen, Eric W. Biederman, Richard Weinberger,
	Andrew Morton, Linus Torvalds, David S. Miller

On Tue, 15 Feb 2022 at 10:13, Arnd Bergmann <arnd@kernel.org> wrote:
>
> On Tue, Feb 15, 2022 at 9:17 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > On Mon, 14 Feb 2022 at 17:37, Arnd Bergmann <arnd@kernel.org> wrote:
> > > From: Arnd Bergmann <arnd@arndb.de>
> > >
> >
> > With set_fs() out of the picture, wouldn't it be sufficient to check
> > that bit #55 is clear? (the bit that selects between TTBR0 and TTBR1)
> > That would also remove the need to strip the tag from the address.
> >
> > Something like
> >
> >     asm goto("tbnz  %0, #55, %2     \n"
> >              "tbnz  %1, #55, %2     \n"
> >              :: "r"(addr), "r"(addr + size - 1) :: notok);
> >     return 1;
> > notok:
> >     return 0;
> >
> > with an additional sanity check on the size which the compiler could
> > eliminate for compile-time constant values.
>
> That should work, but I don't see it as a clear enough advantage to
> have a custom implementation. For the constant-size case, it probably
> isn't better than a compiler-scheduled comparison against a
> constant limit, but it does hurt maintainability when the next person
> wants to change the behavior of access_ok() globally.
>

arm64 also has this leading up to the range check, and I think we'd no
longer need it:

    if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
        (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
            addr = untagged_addr(addr);

> If we want to get into micro-optimizing uaccess, I think a better target
> would be a CONFIG_CC_HAS_ASM_GOTO_OUTPUT version
> of __get_user()/__put_user as we have on x86 and powerpc.
>
>          Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-15  8:17   ` Ard Biesheuvel
  2022-02-15  9:12     ` Arnd Bergmann
@ 2022-02-15  9:30     ` David Laight
  2022-02-15 11:24       ` Mark Rutland
  1 sibling, 1 reply; 61+ messages in thread
From: David Laight @ 2022-02-15  9:30 UTC (permalink / raw)
  To: 'Ard Biesheuvel', Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, linux-sh, Peter Zijlstra,
	Linux Kernel Mailing List, Linux Memory Management List, Guo Ren,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	linux-riscv, linux-api, Will Deacon, Christoph Hellwig,
	linux-arch, open list:S390, Brian Cain, linux-hexagon,
	Helge Deller, X86 ML, Russell King, linux-csky, Linus Torvalds,
	Ingo Molnar, Geert Uytterhoeven, linux-snps-arc,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, Richard Weinberger,
	linux-m68k, openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	monstr, Thomas Bogendoerfer, Nick Hu,
	open list:PARISC ARCHITECTURE, Max Filippov,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	open list:MIPS, dinguyen, Eric W. Biederman, alpha,
	Andrew Morton, Robin Murphy, David S. Miller

From: Ard Biesheuvel
> Sent: 15 February 2022 08:18
> 
> On Mon, 14 Feb 2022 at 17:37, Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > From: Arnd Bergmann <arnd@arndb.de>
> >
> > arm64 has an inline asm implementation of access_ok() that is derived from
> > the 32-bit arm version and optimized for the case that both the limit and
> > the size are variable. With set_fs() gone, the limit is always constant,
> > and the size usually is as well, so just using the default implementation
> > reduces the check into a comparison against a constant that can be
> > scheduled by the compiler.
> >
> > On a defconfig build, this saves over 28KB of .text.
> >
> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> > ---
> >  arch/arm64/include/asm/uaccess.h | 28 +++++-----------------------
> >  1 file changed, 5 insertions(+), 23 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > index 357f7bd9c981..e8dce0cc5eaa 100644
> > --- a/arch/arm64/include/asm/uaccess.h
> > +++ b/arch/arm64/include/asm/uaccess.h
> > @@ -26,6 +26,8 @@
> >  #include <asm/memory.h>
> >  #include <asm/extable.h>
> >
> > +static inline int __access_ok(const void __user *ptr, unsigned long size);
> > +
> >  /*
> >   * Test whether a block of memory is a valid user space address.
> >   * Returns 1 if the range is valid, 0 otherwise.
> > @@ -33,10 +35,8 @@
> >   * This is equivalent to the following test:
> >   * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
> >   */
> > -static inline unsigned long __access_ok(const void __user *addr, unsigned long size)
> > +static inline int access_ok(const void __user *addr, unsigned long size)
> >  {
> > -       unsigned long ret, limit = TASK_SIZE_MAX - 1;
> > -
> >         /*
> >          * Asynchronous I/O running in a kernel thread does not have the
> >          * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag
> > @@ -46,27 +46,9 @@ static inline unsigned long __access_ok(const void __user *addr, unsigned long s
> >             (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
> >                 addr = untagged_addr(addr);
> >
> > -       __chk_user_ptr(addr);
> > -       asm volatile(
> > -       // A + B <= C + 1 for all A,B,C, in four easy steps:
> > -       // 1: X = A + B; X' = X % 2^64
> > -       "       adds    %0, %3, %2\n"
> > -       // 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
> > -       "       csel    %1, xzr, %1, hi\n"
> > -       // 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
> > -       //    to compensate for the carry flag being set in step 4. For
> > -       //    X > 2^64, X' merely has to remain nonzero, which it does.
> > -       "       csinv   %0, %0, xzr, cc\n"
> > -       // 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
> > -       //    comes from the carry in being clear. Otherwise, we are
> > -       //    testing X' - C == 0, subject to the previous adjustments.
> > -       "       sbcs    xzr, %0, %1\n"
> > -       "       cset    %0, ls\n"
> > -       : "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
> > -
> > -       return ret;
> > +       return likely(__access_ok(addr, size));
> >  }
> > -#define __access_ok __access_ok
> > +#define access_ok access_ok
> >
> >  #include <asm-generic/access_ok.h>
> >
> > --
> > 2.29.2
> >
> 
> With set_fs() out of the picture, wouldn't it be sufficient to check
> that bit #55 is clear? (the bit that selects between TTBR0 and TTBR1)
> That would also remove the need to strip the tag from the address.
> 
> Something like
> 
>     asm goto("tbnz  %0, #55, %2     \n"
>              "tbnz  %1, #55, %2     \n"
>              :: "r"(addr), "r"(addr + size - 1) :: notok);
>     return 1;
> notok:
>     return 0;
> 
> with an additional sanity check on the size which the compiler could
> eliminate for compile-time constant values.

Is there a reason not to just use:
	size < 1ul << 48 && !((addr | (addr + size - 1)) & 1ul << 55)

(The -1 can be removed if the last user page is never mapped)
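
A minimal userspace sketch of the expression above, for illustration
only. The function and macro names are hypothetical, the 48-bit size
bound is taken from the expression as written, and the constants are
spelled as 64-bit values so the shifts are well defined in portable C.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SIZE_BITS 48  /* size bound from the "<< 48" in the expression */
#define MODEL_TTBR_BIT  55  /* bit selecting TTBR0 (clear) or TTBR1 (set) */

static bool model_bit55_range_ok(uint64_t addr, uint64_t size)
{
	/* Last byte of the range; unsigned wrap-around is well defined. */
	uint64_t last = addr + size - 1;

	/*
	 * Accept only if the size is bounded and neither end of the range
	 * has bit 55 set. Note that addr == 0 with size == 0 wraps 'last'
	 * to all-ones and is rejected by this model.
	 */
	return size < (UINT64_C(1) << MODEL_SIZE_BITS) &&
	       !((addr | last) & (UINT64_C(1) << MODEL_TTBR_BIT));
}
```

In this model a pointer with tag bits in the top byte passes without
being untagged first, since only bit 55 is inspected.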

Ugg, is arm64 addressing as horrid as it looks - with the 'kernel'
bit in the middle of the virtual address space?
It seems to be:
	<zero:4><tag:4><kernel:1><ignored:7><address:48>
Although I found some references to 44 bit VA and to code using the
'ignored' bits as tags - relying on the hardware ignoring them.
There might be some feature that uses the top 4 bits as well.

Another option is assuming that accesses are 'reasonably sequential',
removing the length check and ensuring there is an unmapped page
between valid user and kernel addresses.
That probably requires an unmapped page at the bottom of kernel space
which may not be achievable.

	David

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-15  9:21       ` Ard Biesheuvel
@ 2022-02-15  9:39         ` Arnd Bergmann
  2022-02-15 10:39           ` Mark Rutland
  2022-02-15 10:37         ` Mark Rutland
  1 sibling, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-15  9:39 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:MIPS, Linux Memory Management List,
	Guo Ren, open list:SPARC + UltraSPARC (sparc/sparc64),
	open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	open list:S390, Brian Cain, Helge Deller, X86 ML, Russell King,
	linux-csky, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE, Robin Murphy,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, open list:PARISC ARCHITECTURE,
	Nick Hu, Max Filippov, Linux API, Linux Kernel Mailing List,
	Dinh Nguyen, Eric W. Biederman, Richard Weinberger,
	Andrew Morton, Linus Torvalds, David S. Miller

On Tue, Feb 15, 2022 at 10:21 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> On Tue, 15 Feb 2022 at 10:13, Arnd Bergmann <arnd@kernel.org> wrote:
>
> arm64 also has this leading up to the range check, and I think we'd no
> longer need it:
>
>     if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
>         (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
>             addr = untagged_addr(addr);

I suspect the expensive part here is checking the two flags, as untagged_addr()
seems to always just add a sbfx instruction. Would this work?

#ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
#define access_ok(ptr, size) __access_ok(untagged_addr(ptr), (size))
#else // the else path is the default, this can be left out.
#define access_ok(ptr, size) __access_ok((ptr), (size))
#endif
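
A userspace sketch of the "untag, then range-check" shape of that macro.
It assumes that untagging amounts to sign-extending the address from
bit 55, which is consistent with the sbfx remark above but is not taken
from the arm64 headers. All names and the limit value are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_USER_LIMIT (UINT64_C(1) << 48)  /* hypothetical address limit */

/* Copy bit 55 into bits 56..63, i.e. sign-extend from bit 55. */
static uint64_t model_untagged_addr(uint64_t addr)
{
	const uint64_t top_byte = UINT64_C(0xff) << 56;

	if (addr & (UINT64_C(1) << 55))
		return addr | top_byte;   /* kernel half: top byte becomes 0xff */
	return addr & ~top_byte;          /* user half: tag bits are cleared */
}

/* Overflow-safe comparison against the constant limit. */
static bool model_limit_ok(uint64_t addr, uint64_t size)
{
	return size <= MODEL_USER_LIMIT && addr <= MODEL_USER_LIMIT - size;
}

/* Unconditional untag followed by the range check, no flag tests. */
static bool model_access_ok(uint64_t ptr, uint64_t size)
{
	return model_limit_ok(model_untagged_addr(ptr), size);
}
```

A tagged user pointer passes after its top byte is cleared, while any
pointer with bit 55 set ends up far above the limit and is rejected.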

       Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 09/14] m68k: drop custom __access_ok()
  2022-02-15  7:13       ` Al Viro
@ 2022-02-15 10:02         ` Arnd Bergmann
  2022-02-15 13:28           ` David Laight
  0 siblings, 1 reply; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-15 10:02 UTC (permalink / raw)
  To: Al Viro
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Linux-MM,
	Guo Ren, sparclinux, open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	linux-s390, Brian Cain, Helge Deller, the arch/x86 maintainers,
	Russell King - ARM Linux, linux-csky, Ard Biesheuvel,
	Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um, linuxppc-dev,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Parisc List, Nick Hu,
	Max Filippov, Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W . Biederman, Richard Weinberger, Andrew Morton,
	Linus Torvalds, David Miller

On Tue, Feb 15, 2022 at 8:13 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Tue, Feb 15, 2022 at 07:29:42AM +0100, Christoph Hellwig wrote:
> > On Tue, Feb 15, 2022 at 12:37:41AM +0000, Al Viro wrote:
> > > Perhaps simply wrap that sucker into #ifdef CONFIG_CPU_HAS_ADDRESS_SPACES
> > > (and trim the comment down to "coldfire and 68000 will pick generic
> > > variant")?
> >
> > I wonder if we should invert CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE,
> > select the separate address space config for s390, sparc64, non-coldfire
> > m68k and mips with EVA and then just have one single access_ok for
> > overlapping address space (as added by Arnd) and non-overlapping ones
> > (always return true).
>
> parisc is also such...  How about
>
>         select ALTERNATE_SPACE_USERLAND
>
> for that bunch?

Either of those works for me. My current version has this keyed off
TASK_SIZE_MAX==ULONG_MAX, but a CONFIG_ symbol does
look more descriptive.
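
As a rough sketch of what keying the check off the limit looks like
(userspace C, with the limit passed as a parameter so both cases can be
exercised; sketch_access_ok() and its task_size_max argument are
illustrative names, not the code in the series):

```c
#include <assert.h>	/* only for the checks one might run against this sketch */
#include <limits.h>
#include <stdbool.h>

static inline bool sketch_access_ok(unsigned long addr, unsigned long size,
				    unsigned long task_size_max)
{
	/* Separate user/kernel address spaces: nothing to check here. */
	if (task_size_max == ULONG_MAX)
		return true;

	/* Shared address space: the whole range must sit below the limit. */
	return size <= task_size_max && addr <= task_size_max - size;
}
```

In the kernel the limit would be a compile-time constant, so the compiler
can drop whichever branch does not apply; a CONFIG_ symbol would express
the same choice by name rather than by a magic value.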

>  While we are at it, how many unusual access_ok() instances are
> left after this series?  arm64, itanic, um, anything else?

x86 adds a WARN_ON_IN_IRQ() check in there. This could be
made generic, but it's not obvious what exactly the exceptions are
that other architectures need. The arm64 tagged pointers could
probably also get integrated into the generic version.

> FWIW, sparc32 has a slightly unusual instance (see uaccess_32.h there); it's
> obviously cheaper than generic and I wonder if the trick is legitimate (and
> applicable elsewhere, perhaps)...

Right, a few others have the same, but I wasn't convinced that this
is actually safe for all possible cases: it's trivial to construct a caller
that works on other architectures but not this one, if you pass a large
enough size value and don't access the contents in sequence.

Also, like the ((addr | (addr + size)) & MASK) check on some other
architectures, it is less portable because it makes assumptions about
the actual layout beyond a fixed address limit.
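
To make the difference concrete, here is a minimal userspace sketch (not
kernel code) contrasting three shapes of the check: the plain range
comparison, a start-only check, and the mask form quoted above.
SKETCH_LIMIT, range_ok(), start_only_ok() and mask_ok() are made-up names
for illustration, and the start-only variant is an assumed stand-in for
the kind of shortcut under discussion, not a copy of the sparc32 code.

```c
#include <assert.h>	/* only for the checks one might run against this sketch */
#include <stdbool.h>

/* Hypothetical user address limit, a power of two so the mask form applies. */
#define SKETCH_LIMIT 0x80000000UL

/* Range comparison: cannot overflow, works for any limit value. */
static inline bool range_ok(unsigned long addr, unsigned long size)
{
	return size <= SKETCH_LIMIT && addr <= SKETCH_LIMIT - size;
}

/*
 * Start-only check (assumed shortcut): cheaper, but it ignores the size,
 * so it is only sound if the caller touches the buffer sequentially and
 * something else stops the access before it leaves user space.
 */
static inline bool start_only_ok(unsigned long addr, unsigned long size)
{
	(void)size;
	return addr < SKETCH_LIMIT;
}

/*
 * Mask check: assumes every address at or above the limit has at least
 * one bit of the mask set, i.e. a specific address space layout.
 */
static inline bool mask_ok(unsigned long addr, unsigned long size)
{
	unsigned long mask = ~(SKETCH_LIMIT - 1);

	return ((addr | (addr + size)) & mask) == 0;
}
```

With these definitions, start_only_ok(SKETCH_LIMIT - 16, 1UL << 20)
accepts a range that range_ok() rejects, which is the large-size,
non-sequential case mentioned above.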

        Arnd

* Re: [PATCH 03/14] nds32: fix access_ok() checks in get/put_user
  2022-02-15  9:18     ` Arnd Bergmann
@ 2022-02-15 10:25       ` Greg KH
  0 siblings, 0 replies; 61+ messages in thread
From: Greg KH @ 2022-02-15 10:25 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Christoph Hellwig, Ingo Molnar,
	Geert Uytterhoeven, open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, linuxppc-dev,
	Richard Weinberger, linux-m68k, Openrisc, Greentime Hu,
	Stafford Horne, Linux ARM, Michal Simek, Thomas Bogendoerfer,
	Nick Hu, Parisc List, Linux-MM, Linux API,
	Linux Kernel Mailing List, # 3.4.x, Dinh Nguyen,
	Eric W . Biederman, alpha, Andrew Morton, Linus Torvalds,
	David Miller

On Tue, Feb 15, 2022 at 10:18:15AM +0100, Arnd Bergmann wrote:
> On Mon, Feb 14, 2022 at 6:01 PM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Mon, Feb 14, 2022 at 05:34:41PM +0100, Arnd Bergmann wrote:
> > > From: Arnd Bergmann <arnd@arndb.de>
> > >
> > > The get_user()/put_user() functions are meant to check for
> > > access_ok(), while the __get_user()/__put_user() functions
> > > don't.
> > >
> > > This broke in 4.19 for nds32, when it gained an extraneous
> > > check in __get_user(), but lost the check it needs in
> > > __put_user().
> >
> > Can we follow the lead of MIPS (which this was originally copied
> > from I think) and kill the pointless __get/put_user_check wrappers
> > that just obfuscate the code?
> 
> I had another look, but I think that would be a bigger change than
> I want to have in a fix for stable backports, as nds32 also uses
> the _check versions in __{get,put}_user_error.

Don't worry about stable backports first, get it correct and merged and
then worry about them if you really have to.

If someone cares about nds32 for stable kernels, they can do the
backport work :)

thanks,

greg k-h

* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-15  9:21       ` Ard Biesheuvel
  2022-02-15  9:39         ` Arnd Bergmann
@ 2022-02-15 10:37         ` Mark Rutland
  1 sibling, 0 replies; 61+ messages in thread
From: Mark Rutland @ 2022-02-15 10:37 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Rich Felker, linux-ia64, Linux-sh list, Peter Zijlstra,
	open list:MIPS, Linux Memory Management List, Guo Ren,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	open list:S390, Brian Cain, Helge Deller, X86 ML, Russell King,
	linux-csky, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE, Robin Murphy,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Arnd Bergmann, Michal Simek, Thomas Bogendoerfer,
	open list:PARISC ARCHITECTURE, Nick Hu, Max Filippov, Linux API,
	Linux Kernel Mailing List, Dinh Nguyen, Eric W. Biederman,
	Richard Weinberger, Andrew Morton, Linus Torvalds,
	David S. Miller

On Tue, Feb 15, 2022 at 10:21:16AM +0100, Ard Biesheuvel wrote:
> On Tue, 15 Feb 2022 at 10:13, Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > On Tue, Feb 15, 2022 at 9:17 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > > On Mon, 14 Feb 2022 at 17:37, Arnd Bergmann <arnd@kernel.org> wrote:
> > > > From: Arnd Bergmann <arnd@arndb.de>
> > > >
> > >
> > > With set_fs() out of the picture, wouldn't it be sufficient to check
> > > that bit #55 is clear? (the bit that selects between TTBR0 and TTBR1)
> > > That would also remove the need to strip the tag from the address.
> > >
> > > Something like
> > >
> > >     asm goto("tbnz  %0, #55, %2     \n"
> > >              "tbnz  %1, #55, %2     \n"
> > >              :: "r"(addr), "r"(addr + size - 1) :: notok);
> > >     return 1;
> > > notok:
> > >     return 0;
> > >
> > > with an additional sanity check on the size which the compiler could
> > > eliminate for compile-time constant values.
> >
> > That should work, but I don't see it as a clear enough advantage to
> > have a custom implementation. For the constant-size case, it probably
> > isn't better than a compiler-scheduled comparison against a
> > constant limit, but it does hurt maintainability when the next person
> > wants to change the behavior of access_ok() globally.
> >
> 
> arm64 also has this leading up to the range check, and I think we'd no
> longer need it:
> 
>     if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
>         (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
>             addr = untagged_addr(addr);
> 

ABI-wise, we aim to *reject* tagged pointers unless the task is using the
tagged addr ABI, so we need to retain both the untagging logic and the full
pointer check (to actually check the tag bits) unless we relax that ABI
decision generally (or go context-switch the TCR_EL1.TBI* bits).

Since that has subtle ABI implications, I don't think we should change that
within this series.

If we *did* relax things, we could just check bit 55 here, and unconditionally
clear that in uaccess_mask_ptr(), since LDTR/STTR should fault on kernel memory.
On parts with meltdown those might not fault until committed, and so we need
masking to avoid speculative access to a kernel pointer, and that requires the
prior explicit check.
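
The trade-off can be sketched in plain userspace C. This is a simplified
model, not the arm64 implementation: the 48-bit SKETCH_TASK_SIZE_MAX, the
helper names, and the tagged_abi flag (standing in for the PF_KTHREAD /
TIF_TAGGED_ADDR test) are all assumptions for illustration.

```c
#include <assert.h>	/* only for the checks one might run against this sketch */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limit: 48-bit user VA. The real value is TASK_SIZE_MAX. */
#define SKETCH_TASK_SIZE_MAX	(UINT64_C(1) << 48)
#define SKETCH_BIT55		(UINT64_C(1) << 55)
#define SKETCH_TAG_MASK		(UINT64_C(0xff) << 56)

/* Strip the tag byte by copying bit 55 into bits 63..56 (sign extension). */
static inline uint64_t sketch_untag(uint64_t addr)
{
	return (addr & SKETCH_BIT55) ? (addr | SKETCH_TAG_MASK)
				     : (addr & ~SKETCH_TAG_MASK);
}

/* Strict policy: untag only for tasks that opted in, then check the range. */
static inline bool sketch_strict_ok(uint64_t addr, uint64_t size,
				    bool tagged_abi)
{
	if (tagged_abi)
		addr = sketch_untag(addr);
	return size <= SKETCH_TASK_SIZE_MAX &&
	       addr <= SKETCH_TASK_SIZE_MAX - size;
}

/* Relaxed variant: only bit 55 of the first and last byte is examined. */
static inline bool sketch_bit55_ok(uint64_t addr, uint64_t size)
{
	uint64_t last = size ? addr + size - 1 : addr;

	if (size > SKETCH_BIT55)	/* sanity check on the size */
		return false;
	return ((addr | last) & SKETCH_BIT55) == 0;
}
```

A pointer with a nonzero tag byte fails sketch_strict_ok() unless
tagged_abi is set, while sketch_bit55_ok() accepts it regardless, which is
the ABI relaxation in question.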

Thanks,
Mark.

* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-15  9:39         ` Arnd Bergmann
@ 2022-02-15 10:39           ` Mark Rutland
  0 siblings, 0 replies; 61+ messages in thread
From: Mark Rutland @ 2022-02-15 10:39 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Rich Felker, linux-ia64, Linux-sh list, Peter Zijlstra,
	open list:MIPS, Linux Memory Management List, Guo Ren,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	open list:S390, Brian Cain, Helge Deller, X86 ML, Russell King,
	linux-csky, Ard Biesheuvel, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE, Robin Murphy,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, open list:PARISC ARCHITECTURE,
	Nick Hu, Max Filippov, Linux API, Linux Kernel Mailing List,
	Dinh Nguyen, Eric W. Biederman, Richard Weinberger,
	Andrew Morton, Linus Torvalds, David S. Miller

On Tue, Feb 15, 2022 at 10:39:46AM +0100, Arnd Bergmann wrote:
> On Tue, Feb 15, 2022 at 10:21 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > On Tue, 15 Feb 2022 at 10:13, Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > arm64 also has this leading up to the range check, and I think we'd no
> > longer need it:
> >
> >     if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
> >         (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
> >             addr = untagged_addr(addr);
> 
> I suspect the expensive part here is checking the two flags, as untagged_addr()
> seems to always just add a sbfx instruction. Would this work?
> 
> #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
> #define access_ok(ptr, size) __access_ok(untagged_addr(ptr), (size))
> #else // the else path is the default, this can be left out.
> #define access_ok(ptr, size) __access_ok((ptr), (size))
> #endif

This would be an ABI change, e.g. for tasks without TIF_TAGGED_ADDR.

I don't think we should change this as part of this series.

Thanks,
Mark.

* Re: [PATCH 07/14] uaccess: generalize access_ok()
  2022-02-14 16:34 ` [PATCH 07/14] uaccess: generalize access_ok() Arnd Bergmann
  2022-02-14 17:04   ` Christoph Hellwig
  2022-02-14 17:15   ` Al Viro
@ 2022-02-15 10:58   ` Mark Rutland
  2 siblings, 0 replies; 61+ messages in thread
From: Mark Rutland @ 2022-02-15 10:58 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: dalias, linux-ia64, linux-sh, peterz, linux-mips, linux-mm,
	guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, deller, x86,
	linux, linux-csky, ardb, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-alpha, linux-um, linuxppc-dev,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, jcmvbkbc, linux-api,
	linux-kernel, dinguyen, ebiederm, richard, akpm, Linus Torvalds,
	davem

On Mon, Feb 14, 2022 at 05:34:45PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> There are many different ways that access_ok() is defined across
> architectures, but in the end, they all just compare against the
> user_addr_max() value or they accept anything.
> 
> Provide one definition that works for most architectures, checking
> against TASK_SIZE_MAX for user processes or skipping the check inside
> of uaccess_kernel() sections.
> 
> For architectures without CONFIG_SET_FS(), this should be the fastest
> check, as it comes down to a single comparison of a pointer against a
> compile-time constant, while the architecture specific versions tend to
> do something more complex for historic reasons or get something wrong.
> 
> Type checking for __user annotations is handled inconsistently across
> architectures, but this is easily simplified as well by using an inline
> function that takes a 'const void __user *' argument. A handful of
> callers need an extra __user annotation for this.
> 
> Some architectures had tricks to use 33-bit or 65-bit arithmetic on the
> addresses to calculate the overflow, however this simpler version uses
> fewer registers, which means it can produce better object code in the
> end despite needing a second (statically predicted) branch.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

As discussed over IRC, the generic sequence looks good to me, and likewise for
the arm64 change, so:

Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64, asm-generic]

Thanks,
Mark.

> ---
>  arch/alpha/include/asm/uaccess.h      | 34 +++------------
>  arch/arc/include/asm/uaccess.h        | 29 -------------
>  arch/arm/include/asm/uaccess.h        | 20 +--------
>  arch/arm/kernel/swp_emulate.c         |  2 +-
>  arch/arm/kernel/traps.c               |  2 +-
>  arch/arm64/include/asm/uaccess.h      |  5 ++-
>  arch/csky/include/asm/uaccess.h       |  8 ----
>  arch/csky/kernel/signal.c             |  2 +-
>  arch/hexagon/include/asm/uaccess.h    | 25 ------------
>  arch/ia64/include/asm/uaccess.h       |  5 +--
>  arch/m68k/include/asm/uaccess.h       |  5 ++-
>  arch/microblaze/include/asm/uaccess.h |  8 +---
>  arch/mips/include/asm/uaccess.h       | 29 +------------
>  arch/nds32/include/asm/uaccess.h      |  7 +---
>  arch/nios2/include/asm/uaccess.h      | 11 +----
>  arch/nios2/kernel/signal.c            | 20 +++++----
>  arch/openrisc/include/asm/uaccess.h   | 19 +--------
>  arch/parisc/include/asm/uaccess.h     | 10 +++--
>  arch/powerpc/include/asm/uaccess.h    | 11 +----
>  arch/powerpc/lib/sstep.c              |  4 +-
>  arch/riscv/include/asm/uaccess.h      | 31 +-------------
>  arch/riscv/kernel/perf_callchain.c    |  2 +-
>  arch/s390/include/asm/uaccess.h       | 11 ++---
>  arch/sh/include/asm/uaccess.h         | 22 +---------
>  arch/sparc/include/asm/uaccess.h      |  3 --
>  arch/sparc/include/asm/uaccess_32.h   | 18 ++------
>  arch/sparc/include/asm/uaccess_64.h   | 35 ++++------------
>  arch/sparc/kernel/signal_32.c         |  2 +-
>  arch/um/include/asm/uaccess.h         |  5 ++-
>  arch/x86/include/asm/uaccess.h        | 14 +------
>  arch/xtensa/include/asm/uaccess.h     | 10 +----
>  include/asm-generic/access_ok.h       | 59 +++++++++++++++++++++++++++
>  include/asm-generic/uaccess.h         | 21 +---------
>  include/linux/uaccess.h               |  7 ----
>  34 files changed, 130 insertions(+), 366 deletions(-)
>  create mode 100644 include/asm-generic/access_ok.h
> 
> diff --git a/arch/alpha/include/asm/uaccess.h b/arch/alpha/include/asm/uaccess.h
> index 1b6f25efa247..82c5743fc9cd 100644
> --- a/arch/alpha/include/asm/uaccess.h
> +++ b/arch/alpha/include/asm/uaccess.h
> @@ -20,28 +20,7 @@
>  #define get_fs()  (current_thread_info()->addr_limit)
>  #define set_fs(x) (current_thread_info()->addr_limit = (x))
>  
> -#define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
> -
> -/*
> - * Is a address valid? This does a straightforward calculation rather
> - * than tests.
> - *
> - * Address valid if:
> - *  - "addr" doesn't have any high-bits set
> - *  - AND "size" doesn't have any high-bits set
> - *  - AND "addr+size-(size != 0)" doesn't have any high-bits set
> - *  - OR we are in kernel mode.
> - */
> -#define __access_ok(addr, size) ({				\
> -	unsigned long __ao_a = (addr), __ao_b = (size);		\
> -	unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;	\
> -	(get_fs().seg & (__ao_a | __ao_b | __ao_end)) == 0; })
> -
> -#define access_ok(addr, size)				\
> -({							\
> -	__chk_user_ptr(addr);				\
> -	__access_ok(((unsigned long)(addr)), (size));	\
> -})
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * These are the main single-value transfer routines.  They automatically
> @@ -105,7 +84,7 @@ extern void __get_user_unknown(void);
>  	long __gu_err = -EFAULT;				\
>  	unsigned long __gu_val = 0;				\
>  	const __typeof__(*(ptr)) __user *__gu_addr = (ptr);	\
> -	if (__access_ok((unsigned long)__gu_addr, size)) {	\
> +	if (__access_ok(__gu_addr, size)) {			\
>  		__gu_err = 0;					\
>  		switch (size) {					\
>  		  case 1: __get_user_8(__gu_addr); break;	\
> @@ -200,7 +179,7 @@ extern void __put_user_unknown(void);
>  ({								\
>  	long __pu_err = -EFAULT;				\
>  	__typeof__(*(ptr)) __user *__pu_addr = (ptr);		\
> -	if (__access_ok((unsigned long)__pu_addr, size)) {	\
> +	if (__access_ok(__pu_addr, size)) {			\
>  		__pu_err = 0;					\
>  		switch (size) {					\
>  		  case 1: __put_user_8(x, __pu_addr); break;	\
> @@ -316,17 +295,14 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long len)
>  
>  extern long __clear_user(void __user *to, long len);
>  
> -extern inline long
> +static inline long
>  clear_user(void __user *to, long len)
>  {
> -	if (__access_ok((unsigned long)to, len))
> +	if (__access_ok(to, len))
>  		len = __clear_user(to, len);
>  	return len;
>  }
>  
> -#define user_addr_max() \
> -        (uaccess_kernel() ? ~0UL : TASK_SIZE)
> -
>  extern long strncpy_from_user(char *dest, const char __user *src, long count);
>  extern __must_check long strnlen_user(const char __user *str, long n);
>  
> diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
> index 783bfdb3bfa3..30f80b4be2ab 100644
> --- a/arch/arc/include/asm/uaccess.h
> +++ b/arch/arc/include/asm/uaccess.h
> @@ -23,35 +23,6 @@
>  
>  #include <linux/string.h>	/* for generic string functions */
>  
> -
> -#define __kernel_ok		(uaccess_kernel())
> -
> -/*
> - * Algorithmically, for __user_ok() we want do:
> - * 	(start < TASK_SIZE) && (start+len < TASK_SIZE)
> - * where TASK_SIZE could either be retrieved from thread_info->addr_limit or
> - * emitted directly in code.
> - *
> - * This can however be rewritten as follows:
> - *	(len <= TASK_SIZE) && (start+len < TASK_SIZE)
> - *
> - * Because it essentially checks if buffer end is within limit and @len is
> - * non-ngeative, which implies that buffer start will be within limit too.
> - *
> - * The reason for rewriting being, for majority of cases, @len is generally
> - * compile time constant, causing first sub-expression to be compile time
> - * subsumed.
> - *
> - * The second part would generate weird large LIMMs e.g. (0x6000_0000 - 0x10),
> - * so we check for TASK_SIZE using get_fs() since the addr_limit load from mem
> - * would already have been done at this call site for __kernel_ok()
> - *
> - */
> -#define __user_ok(addr, sz)	(((sz) <= TASK_SIZE) && \
> -				 ((addr) <= (get_fs() - (sz))))
> -#define __access_ok(addr, sz)	(unlikely(__kernel_ok) || \
> -				 likely(__user_ok((addr), (sz))))
> -
>  /*********** Single byte/hword/word copies ******************/
>  
>  #define __get_user_fn(sz, u, k)					\
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index d20d78c34b94..2fcbec9c306c 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -55,21 +55,6 @@ extern int __put_user_bad(void);
>  
>  #ifdef CONFIG_MMU
>  
> -/*
> - * We use 33-bit arithmetic here.  Success returns zero, failure returns
> - * addr_limit.  We take advantage that addr_limit will be zero for KERNEL_DS,
> - * so this will always return success in that case.
> - */
> -#define __range_ok(addr, size) ({ \
> -	unsigned long flag, roksum; \
> -	__chk_user_ptr(addr);	\
> -	__asm__(".syntax unified\n" \
> -		"adds %1, %2, %3; sbcscc %1, %1, %0; movcc %0, #0" \
> -		: "=&r" (flag), "=&r" (roksum) \
> -		: "r" (addr), "Ir" (size), "0" (TASK_SIZE) \
> -		: "cc"); \
> -	flag; })
> -
>  /*
>   * This is a type: either unsigned long, if the argument fits into
>   * that type, or otherwise unsigned long long.
> @@ -241,15 +226,12 @@ extern int __put_user_8(void *, unsigned long long);
>  
>  #else /* CONFIG_MMU */
>  
> -#define __addr_ok(addr)		((void)(addr), 1)
> -#define __range_ok(addr, size)	((void)(addr), 0)
> -
>  #define get_user(x, p)	__get_user(x, p)
>  #define __put_user_check __put_user_nocheck
>  
>  #endif /* CONFIG_MMU */
>  
> -#define access_ok(addr, size)	(__range_ok(addr, size) == 0)
> +#include <asm-generic/access_ok.h>
>  
>  #ifdef CONFIG_CPU_SPECTRE
>  /*
> diff --git a/arch/arm/kernel/swp_emulate.c b/arch/arm/kernel/swp_emulate.c
> index 6166ba38bf99..b74bfcf94fb1 100644
> --- a/arch/arm/kernel/swp_emulate.c
> +++ b/arch/arm/kernel/swp_emulate.c
> @@ -195,7 +195,7 @@ static int swp_handler(struct pt_regs *regs, unsigned int instr)
>  		 destreg, EXTRACT_REG_NUM(instr, RT2_OFFSET), data);
>  
>  	/* Check access in reasonable access range for both SWP and SWPB */
> -	if (!access_ok((address & ~3), 4)) {
> +	if (!access_ok((void __user *)(address & ~3), 4)) {
>  		pr_debug("SWP{B} emulation: access to %p not allowed!\n",
>  			 (void *)address);
>  		res = -EFAULT;
> diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
> index da04ed85855a..26c8c8276297 100644
> --- a/arch/arm/kernel/traps.c
> +++ b/arch/arm/kernel/traps.c
> @@ -576,7 +576,7 @@ do_cache_op(unsigned long start, unsigned long end, int flags)
>  	if (end < start || flags)
>  		return -EINVAL;
>  
> -	if (!access_ok(start, end - start))
> +	if (!access_ok((void __user *)start, end - start))
>  		return -EFAULT;
>  
>  	return __do_cache_op(start, end);
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 2e20879fe3cf..357f7bd9c981 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -33,7 +33,7 @@
>   * This is equivalent to the following test:
>   * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
>   */
> -static inline unsigned long __range_ok(const void __user *addr, unsigned long size)
> +static inline unsigned long __access_ok(const void __user *addr, unsigned long size)
>  {
>  	unsigned long ret, limit = TASK_SIZE_MAX - 1;
>  
> @@ -66,8 +66,9 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
>  
>  	return ret;
>  }
> +#define __access_ok __access_ok
>  
> -#define access_ok(addr, size)	__range_ok(addr, size)
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * User access enabling/disabling.
> diff --git a/arch/csky/include/asm/uaccess.h b/arch/csky/include/asm/uaccess.h
> index ac5a54f57d40..fec8f77ffc99 100644
> --- a/arch/csky/include/asm/uaccess.h
> +++ b/arch/csky/include/asm/uaccess.h
> @@ -5,14 +5,6 @@
>  
>  #define user_addr_max() (current_thread_info()->addr_limit.seg)
>  
> -static inline int __access_ok(unsigned long addr, unsigned long size)
> -{
> -	unsigned long limit = user_addr_max();
> -
> -	return (size <= limit) && (addr <= (limit - size));
> -}
> -#define __access_ok __access_ok
> -
>  /*
>   * __put_user_fn
>   */
> diff --git a/arch/csky/kernel/signal.c b/arch/csky/kernel/signal.c
> index c7b763d2f526..8867ddf3e6c7 100644
> --- a/arch/csky/kernel/signal.c
> +++ b/arch/csky/kernel/signal.c
> @@ -136,7 +136,7 @@ static inline void __user *get_sigframe(struct ksignal *ksig,
>  static int
>  setup_rt_frame(struct ksignal *ksig, sigset_t *set, struct pt_regs *regs)
>  {
> -	struct rt_sigframe *frame;
> +	struct rt_sigframe __user *frame;
>  	int err = 0;
>  
>  	frame = get_sigframe(ksig, regs, sizeof(*frame));
> diff --git a/arch/hexagon/include/asm/uaccess.h b/arch/hexagon/include/asm/uaccess.h
> index 719ba3f3c45c..bff77efc0d9a 100644
> --- a/arch/hexagon/include/asm/uaccess.h
> +++ b/arch/hexagon/include/asm/uaccess.h
> @@ -12,31 +12,6 @@
>   */
>  #include <asm/sections.h>
>  
> -/*
> - * access_ok: - Checks if a user space pointer is valid
> - * @addr: User space pointer to start of block to check
> - * @size: Size of block to check
> - *
> - * Context: User context only. This function may sleep if pagefaults are
> - *          enabled.
> - *
> - * Checks if a pointer to a block of memory in user space is valid.
> - *
> - * Returns true (nonzero) if the memory block *may* be valid, false (zero)
> - * if it is definitely invalid.
> - *
> - */
> -#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
> -#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
> -
> -static inline int __access_ok(unsigned long addr, unsigned long size)
> -{
> -	unsigned long limit = TASK_SIZE;
> -
> -	return (size <= limit) && (addr <= (limit - size));
> -}
> -#define __access_ok __access_ok
> -
>  /*
>   * When a kernel-mode page fault is taken, the faulting instruction
>   * address is checked against a table of exception_table_entries.
> diff --git a/arch/ia64/include/asm/uaccess.h b/arch/ia64/include/asm/uaccess.h
> index e19d2dcc0ced..e242a3cc1330 100644
> --- a/arch/ia64/include/asm/uaccess.h
> +++ b/arch/ia64/include/asm/uaccess.h
> @@ -50,8 +50,6 @@
>  #define get_fs()  (current_thread_info()->addr_limit)
>  #define set_fs(x) (current_thread_info()->addr_limit = (x))
>  
> -#define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
> -
>  /*
>   * When accessing user memory, we need to make sure the entire area really is in
>   * user-level space.  In order to do this efficiently, we make sure that the page at
> @@ -65,7 +63,8 @@ static inline int __access_ok(const void __user *p, unsigned long size)
>  	return likely(addr <= seg) &&
>  	 (seg == KERNEL_DS.seg || likely(REGION_OFFSET(addr) < RGN_MAP_LIMIT));
>  }
> -#define access_ok(addr, size)	__access_ok((addr), (size))
> +#define __access_ok __access_ok
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * These are the main single-value transfer routines.  They automatically
> diff --git a/arch/m68k/include/asm/uaccess.h b/arch/m68k/include/asm/uaccess.h
> index 79617c0b2f91..d6bb5720365a 100644
> --- a/arch/m68k/include/asm/uaccess.h
> +++ b/arch/m68k/include/asm/uaccess.h
> @@ -12,15 +12,18 @@
>  #include <asm/extable.h>
>  
>  /* We let the MMU do all checking */
> -static inline int access_ok(const void __user *addr,
> +static inline int __access_ok(const void __user *addr,
>  			    unsigned long size)
>  {
>  	/*
>  	 * XXX: for !CONFIG_CPU_HAS_ADDRESS_SPACES this really needs to check
>  	 * for TASK_SIZE!
> +	 * Removing this helper is probably sufficient.
>  	 */
>  	return 1;
>  }
> +#define __access_ok __access_ok
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * Not all varients of the 68k family support the notion of address spaces.
> diff --git a/arch/microblaze/include/asm/uaccess.h b/arch/microblaze/include/asm/uaccess.h
> index 5b6e0e7788f4..dd82e90adb52 100644
> --- a/arch/microblaze/include/asm/uaccess.h
> +++ b/arch/microblaze/include/asm/uaccess.h
> @@ -39,13 +39,7 @@
>  
>  # define uaccess_kernel()	(get_fs().seg == KERNEL_DS.seg)
>  
> -static inline int __access_ok(unsigned long addr, unsigned long size)
> -{
> -	unsigned long limit = user_addr_max();
> -
> -	return (size <= limit) && (addr <= (limit - size));
> -}
> -#define access_ok(addr, size) __access_ok((unsigned long)addr, size)
> +#include <asm-generic/access_ok.h>
>  
>  # define __FIXUP_SECTION	".section .fixup,\"ax\"\n"
>  # define __EX_TABLE_SECTION	".section __ex_table,\"a\"\n"
> diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
> index d7c89dc3426c..436248652b28 100644
> --- a/arch/mips/include/asm/uaccess.h
> +++ b/arch/mips/include/asm/uaccess.h
> @@ -44,34 +44,7 @@ extern u64 __ua_limit;
>  
>  #endif /* CONFIG_64BIT */
>  
> -/*
> - * access_ok: - Checks if a user space pointer is valid
> - * @addr: User space pointer to start of block to check
> - * @size: Size of block to check
> - *
> - * Context: User context only. This function may sleep if pagefaults are
> - *          enabled.
> - *
> - * Checks if a pointer to a block of memory in user space is valid.
> - *
> - * Returns true (nonzero) if the memory block may be valid, false (zero)
> - * if it is definitely invalid.
> - *
> - * Note that, depending on architecture, this function probably just
> - * checks that the pointer is in the user space range - after calling
> - * this function, memory access functions may still return -EFAULT.
> - */
> -
> -static inline int __access_ok(const void __user *p, unsigned long size)
> -{
> -	unsigned long addr = (unsigned long)p;
> -	unsigned long limit = TASK_SIZE_MAX;
> -
> -	return (size <= limit) && (addr <= (limit - size));
> -}
> -
> -#define access_ok(addr, size)					\
> -	likely(__access_ok((addr), (size)))
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * put_user: - Write a simple value into user space.
> diff --git a/arch/nds32/include/asm/uaccess.h b/arch/nds32/include/asm/uaccess.h
> index 37a40981deb3..832d642a4068 100644
> --- a/arch/nds32/include/asm/uaccess.h
> +++ b/arch/nds32/include/asm/uaccess.h
> @@ -38,18 +38,15 @@ extern int fixup_exception(struct pt_regs *regs);
>  
>  #define get_fs()	(current_thread_info()->addr_limit)
>  #define user_addr_max	get_fs
> +#define uaccess_kernel() (get_fs() == KERNEL_DS)
>  
>  static inline void set_fs(mm_segment_t fs)
>  {
>  	current_thread_info()->addr_limit = fs;
>  }
>  
> -#define uaccess_kernel()	(get_fs() == KERNEL_DS)
> +#include <asm-generic/access_ok.h>
>  
> -#define __range_ok(addr, size) (size <= get_fs() && addr <= (get_fs() -size))
> -
> -#define access_ok(addr, size)	\
> -	__range_ok((unsigned long)addr, (unsigned long)size)
>  /*
>   * Single-value transfer routines.  They automatically use the right
>   * size if we just have the right pointer type.  Note that the functions
> diff --git a/arch/nios2/include/asm/uaccess.h b/arch/nios2/include/asm/uaccess.h
> index ba9340e96fd4..9a7658df7f8d 100644
> --- a/arch/nios2/include/asm/uaccess.h
> +++ b/arch/nios2/include/asm/uaccess.h
> @@ -30,19 +30,10 @@
>  #define get_fs()		(current_thread_info()->addr_limit)
>  #define set_fs(seg)		(current_thread_info()->addr_limit = (seg))
>  
> -#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
> -
> -#define __access_ok(addr, len)			\
> -	(((signed long)(((long)get_fs().seg) &	\
> -		((long)(addr) | (((long)(addr)) + (len)) | (len)))) == 0)
> -
> -#define access_ok(addr, len)		\
> -	likely(__access_ok((unsigned long)(addr), (unsigned long)(len)))
> +#include <asm-generic/access_ok.h>
>  
>  # define __EX_TABLE_SECTION	".section __ex_table,\"a\"\n"
>  
> -#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
> -
>  /*
>   * Zero Userspace
>   */
> diff --git a/arch/nios2/kernel/signal.c b/arch/nios2/kernel/signal.c
> index 2009ae2d3c3b..386e46443b60 100644
> --- a/arch/nios2/kernel/signal.c
> +++ b/arch/nios2/kernel/signal.c
> @@ -36,10 +36,10 @@ struct rt_sigframe {
>  
>  static inline int rt_restore_ucontext(struct pt_regs *regs,
>  					struct switch_stack *sw,
> -					struct ucontext *uc, int *pr2)
> +					struct ucontext __user *uc, int *pr2)
>  {
>  	int temp;
> -	unsigned long *gregs = uc->uc_mcontext.gregs;
> +	unsigned long __user *gregs = uc->uc_mcontext.gregs;
>  	int err;
>  
>  	/* Always make any pending restarted system calls return -EINTR */
> @@ -102,10 +102,11 @@ asmlinkage int do_rt_sigreturn(struct switch_stack *sw)
>  {
>  	struct pt_regs *regs = (struct pt_regs *)(sw + 1);
>  	/* Verify, can we follow the stack back */
> -	struct rt_sigframe *frame = (struct rt_sigframe *) regs->sp;
> +	struct rt_sigframe __user *frame;
>  	sigset_t set;
>  	int rval;
>  
> +	frame = (struct rt_sigframe __user *) regs->sp;
>  	if (!access_ok(frame, sizeof(*frame)))
>  		goto badframe;
>  
> @@ -124,10 +125,10 @@ asmlinkage int do_rt_sigreturn(struct switch_stack *sw)
>  	return 0;
>  }
>  
> -static inline int rt_setup_ucontext(struct ucontext *uc, struct pt_regs *regs)
> +static inline int rt_setup_ucontext(struct ucontext __user *uc, struct pt_regs *regs)
>  {
>  	struct switch_stack *sw = (struct switch_stack *)regs - 1;
> -	unsigned long *gregs = uc->uc_mcontext.gregs;
> +	unsigned long __user *gregs = uc->uc_mcontext.gregs;
>  	int err = 0;
>  
>  	err |= __put_user(MCONTEXT_VERSION, &uc->uc_mcontext.version);
> @@ -162,8 +163,9 @@ static inline int rt_setup_ucontext(struct ucontext *uc, struct pt_regs *regs)
>  	return err;
>  }
>  
> -static inline void *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
> -				 size_t frame_size)
> +static inline void __user *get_sigframe(struct ksignal *ksig,
> +					struct pt_regs *regs,
> +					size_t frame_size)
>  {
>  	unsigned long usp;
>  
> @@ -174,13 +176,13 @@ static inline void *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
>  	usp = sigsp(usp, ksig);
>  
>  	/* Verify, is it 32 or 64 bit aligned */
> -	return (void *)((usp - frame_size) & -8UL);
> +	return (void __user *)((usp - frame_size) & -8UL);
>  }
>  
>  static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
>  			  struct pt_regs *regs)
>  {
> -	struct rt_sigframe *frame;
> +	struct rt_sigframe __user *frame;
>  	int err = 0;
>  
>  	frame = get_sigframe(ksig, regs, sizeof(*frame));
> diff --git a/arch/openrisc/include/asm/uaccess.h b/arch/openrisc/include/asm/uaccess.h
> index 120f5005461b..8f049ec99b3e 100644
> --- a/arch/openrisc/include/asm/uaccess.h
> +++ b/arch/openrisc/include/asm/uaccess.h
> @@ -45,21 +45,7 @@
>  
>  #define uaccess_kernel()	(get_fs() == KERNEL_DS)
>  
> -/* Ensure that the range from addr to addr+size is all within the process'
> - * address space
> - */
> -static inline int __range_ok(unsigned long addr, unsigned long size)
> -{
> -	const mm_segment_t fs = get_fs();
> -
> -	return size <= fs && addr <= (fs - size);
> -}
> -
> -#define access_ok(addr, size)						\
> -({ 									\
> -	__chk_user_ptr(addr);						\
> -	__range_ok((unsigned long)(addr), (size));			\
> -})
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * These are the main single-value transfer routines.  They automatically
> @@ -268,9 +254,6 @@ clear_user(void __user *addr, unsigned long size)
>  	return size;
>  }
>  
> -#define user_addr_max() \
> -	(uaccess_kernel() ? ~0UL : TASK_SIZE)
> -
>  extern long strncpy_from_user(char *dest, const char __user *src, long count);
>  
>  extern __must_check long strnlen_user(const char __user *str, long n);
> diff --git a/arch/parisc/include/asm/uaccess.h b/arch/parisc/include/asm/uaccess.h
> index 0925bbd6db67..b68f19e11361 100644
> --- a/arch/parisc/include/asm/uaccess.h
> +++ b/arch/parisc/include/asm/uaccess.h
> @@ -17,9 +17,13 @@
>   * We just let the page fault handler do the right thing. This also means
>   * that put_user is the same as __put_user, etc.
>   */
> -
> -#define access_ok(uaddr, size)	\
> -	( (uaddr) == (uaddr) )
> +static inline int __access_ok(const void __user *addr, unsigned long size)
> +{
> +	return 1;
> +}
> +#define __access_ok __access_ok
> +#define TASK_SIZE_MAX DEFAULT_TASK_SIZE
> +#include <asm-generic/access_ok.h>
>  
>  #define put_user __put_user
>  #define get_user __get_user
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index a0032c2e7550..2e83217f52de 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -11,18 +11,9 @@
>  #ifdef __powerpc64__
>  /* We use TASK_SIZE_USER64 as TASK_SIZE is not constant */
>  #define TASK_SIZE_MAX		TASK_SIZE_USER64
> -#else
> -#define TASK_SIZE_MAX		TASK_SIZE
>  #endif
>  
> -static inline bool __access_ok(unsigned long addr, unsigned long size)
> -{
> -	return addr < TASK_SIZE_MAX && size <= TASK_SIZE_MAX - addr;
> -}
> -
> -#define access_ok(addr, size)		\
> -	(__chk_user_ptr(addr),		\
> -	 __access_ok((unsigned long)(addr), (size)))
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * These are the main single-value transfer routines.  They automatically
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index a94b0cd0bdc5..022d23ae300b 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -112,9 +112,9 @@ static nokprobe_inline long address_ok(struct pt_regs *regs,
>  {
>  	if (!user_mode(regs))
>  		return 1;
> -	if (__access_ok(ea, nb))
> +	if (access_ok((void __user *)ea, nb))
>  		return 1;
> -	if (__access_ok(ea, 1))
> +	if (access_ok((void __user *)ea, 1))
>  		/* Access overlaps the end of the user region */
>  		regs->dar = TASK_SIZE_MAX - 1;
>  	else
> diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
> index 4407b9e48d2c..855450bed9f5 100644
> --- a/arch/riscv/include/asm/uaccess.h
> +++ b/arch/riscv/include/asm/uaccess.h
> @@ -21,42 +21,13 @@
>  #include <asm/byteorder.h>
>  #include <asm/extable.h>
>  #include <asm/asm.h>
> +#include <asm-generic/access_ok.h>
>  
>  #define __enable_user_access()							\
>  	__asm__ __volatile__ ("csrs sstatus, %0" : : "r" (SR_SUM) : "memory")
>  #define __disable_user_access()							\
>  	__asm__ __volatile__ ("csrc sstatus, %0" : : "r" (SR_SUM) : "memory")
>  
> -/**
> - * access_ok: - Checks if a user space pointer is valid
> - * @addr: User space pointer to start of block to check
> - * @size: Size of block to check
> - *
> - * Context: User context only.  This function may sleep.
> - *
> - * Checks if a pointer to a block of memory in user space is valid.
> - *
> - * Returns true (nonzero) if the memory block may be valid, false (zero)
> - * if it is definitely invalid.
> - *
> - * Note that, depending on architecture, this function probably just
> - * checks that the pointer is in the user space range - after calling
> - * this function, memory access functions may still return -EFAULT.
> - */
> -#define access_ok(addr, size) ({					\
> -	__chk_user_ptr(addr);						\
> -	likely(__access_ok((unsigned long __force)(addr), (size)));	\
> -})
> -
> -/*
> - * Ensure that the range [addr, addr+size) is within the process's
> - * address space
> - */
> -static inline int __access_ok(unsigned long addr, unsigned long size)
> -{
> -	return size <= TASK_SIZE && addr <= TASK_SIZE - size;
> -}
> -
>  /*
>   * The exception table consists of pairs of addresses: the first is the
>   * address of an instruction that is allowed to fault, and the second is
> diff --git a/arch/riscv/kernel/perf_callchain.c b/arch/riscv/kernel/perf_callchain.c
> index 1fc075b8f764..f0c7bb98119a 100644
> --- a/arch/riscv/kernel/perf_callchain.c
> +++ b/arch/riscv/kernel/perf_callchain.c
> @@ -15,7 +15,7 @@ static unsigned long user_backtrace(struct perf_callchain_entry_ctx *entry,
>  {
>  	struct stackframe buftail;
>  	unsigned long ra = 0;
> -	unsigned long *user_frame_tail =
> +	unsigned long __user *user_frame_tail =
>  			(unsigned long *)(fp - sizeof(struct stackframe));
>  
>  	/* Check accessibility of one struct frame_tail beyond */
> diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
> index 29332edf46f0..f84d70c8e188 100644
> --- a/arch/s390/include/asm/uaccess.h
> +++ b/arch/s390/include/asm/uaccess.h
> @@ -20,18 +20,13 @@
>  
>  void debug_user_asce(int exit);
>  
> -static inline int __range_ok(unsigned long addr, unsigned long size)
> +static inline int __access_ok(const void __user *addr, unsigned long size)
>  {
>  	return 1;
>  }
> +#define __access_ok __access_ok
>  
> -#define __access_ok(addr, size)				\
> -({							\
> -	__chk_user_ptr(addr);				\
> -	__range_ok((unsigned long)(addr), (size));	\
> -})
> -
> -#define access_ok(addr, size) __access_ok(addr, size)
> +#include <asm-generic/access_ok.h>
>  
>  unsigned long __must_check
>  raw_copy_from_user(void *to, const void __user *from, unsigned long n);
> diff --git a/arch/sh/include/asm/uaccess.h b/arch/sh/include/asm/uaccess.h
> index 8867bb04b00e..ccd219d74851 100644
> --- a/arch/sh/include/asm/uaccess.h
> +++ b/arch/sh/include/asm/uaccess.h
> @@ -5,28 +5,10 @@
>  #include <asm/segment.h>
>  #include <asm/extable.h>
>  
> -#define __addr_ok(addr) \
> -	((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
> -
> -/*
> - * __access_ok: Check if address with size is OK or not.
> - *
> - * Uhhuh, this needs 33-bit arithmetic. We have a carry..
> - *
> - * sum := addr + size;  carry? --> flag = true;
> - * if (sum >= addr_limit) flag = true;
> - */
> -#define __access_ok(addr, size)	({				\
> -	unsigned long __ao_a = (addr), __ao_b = (size);		\
> -	unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;	\
> -	__ao_end >= __ao_a && __addr_ok(__ao_end); })
> -
> -#define access_ok(addr, size)	\
> -	(__chk_user_ptr(addr),		\
> -	 __access_ok((unsigned long __force)(addr), (size)))
> -
>  #define user_addr_max()	(current_thread_info()->addr_limit.seg)
>  
> +#include <asm-generic/access_ok.h>
> +
>  /*
>   * Uh, these should become the main single-value transfer routines ...
>   * They automatically use the right size if we just have the right
> diff --git a/arch/sparc/include/asm/uaccess.h b/arch/sparc/include/asm/uaccess.h
> index 390094200fc4..ee75f69e3fcd 100644
> --- a/arch/sparc/include/asm/uaccess.h
> +++ b/arch/sparc/include/asm/uaccess.h
> @@ -10,9 +10,6 @@
>  #include <asm/uaccess_32.h>
>  #endif
>  
> -#define user_addr_max() \
> -	(uaccess_kernel() ? ~0UL : TASK_SIZE)
> -
>  long strncpy_from_user(char *dest, const char __user *src, long count);
>  
>  #endif
> diff --git a/arch/sparc/include/asm/uaccess_32.h b/arch/sparc/include/asm/uaccess_32.h
> index 4a12346bb69c..367747116260 100644
> --- a/arch/sparc/include/asm/uaccess_32.h
> +++ b/arch/sparc/include/asm/uaccess_32.h
> @@ -25,17 +25,7 @@
>  #define get_fs()	(current->thread.current_ds)
>  #define set_fs(val)	((current->thread.current_ds) = (val))
>  
> -#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
> -
> -/* We have there a nice not-mapped page at PAGE_OFFSET - PAGE_SIZE, so that this test
> - * can be fairly lightweight.
> - * No one can read/write anything from userland in the kernel space by setting
> - * large size and address near to PAGE_OFFSET - a fault will break his intentions.
> - */
> -#define __user_ok(addr, size) ({ (void)(size); (addr) < STACK_TOP; })
> -#define __kernel_ok (uaccess_kernel())
> -#define __access_ok(addr, size) (__user_ok((addr) & get_fs().seg, (size)))
> -#define access_ok(addr, size) __access_ok((unsigned long)(addr), size)
> +#include <asm-generic/access_ok.h>
>  
>  /* Uh, these should become the main single-value transfer routines..
>   * They automatically use the right size if we just have the right
> @@ -47,13 +37,13 @@
>   * and hide all the ugliness from the user.
>   */
>  #define put_user(x, ptr) ({ \
> -	unsigned long __pu_addr = (unsigned long)(ptr); \
> +	void __user *__pu_addr = (ptr); \
>  	__chk_user_ptr(ptr); \
>  	__put_user_check((__typeof__(*(ptr)))(x), __pu_addr, sizeof(*(ptr))); \
>  })
>  
>  #define get_user(x, ptr) ({ \
> -	unsigned long __gu_addr = (unsigned long)(ptr); \
> +	const void __user *__gu_addr = (ptr); \
>  	__chk_user_ptr(ptr); \
>  	__get_user_check((x), __gu_addr, sizeof(*(ptr)), __typeof__(*(ptr))); \
>  })
> @@ -232,7 +222,7 @@ static inline unsigned long __clear_user(void __user *addr, unsigned long size)
>  
>  static inline unsigned long clear_user(void __user *addr, unsigned long n)
>  {
> -	if (n && __access_ok((unsigned long) addr, n))
> +	if (n && __access_ok(addr, n))
>  		return __clear_user(addr, n);
>  	else
>  		return n;
> diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
> index 5c12fb46bc61..000bac67cf31 100644
> --- a/arch/sparc/include/asm/uaccess_64.h
> +++ b/arch/sparc/include/asm/uaccess_64.h
> @@ -31,7 +31,12 @@
>  
>  #define get_fs() ((mm_segment_t){(current_thread_info()->current_ds)})
>  
> -#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
> +static inline int __access_ok(const void __user *addr, unsigned long size)
> +{
> +	return 1;
> +}
> +#define __access_ok __access_ok
> +#include <asm-generic/access_ok.h>
>  
>  #define set_fs(val)								\
>  do {										\
> @@ -43,33 +48,7 @@ do {										\
>   * Test whether a block of memory is a valid user space address.
>   * Returns 0 if the range is valid, nonzero otherwise.
>   */
> -static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, unsigned long limit)
> -{
> -	if (__builtin_constant_p(size))
> -		return addr > limit - size;
> -
> -	addr += size;
> -	if (addr < size)
> -		return true;
> -
> -	return addr > limit;
> -}
> -
> -#define __range_not_ok(addr, size, limit)                               \
> -({                                                                      \
> -	__chk_user_ptr(addr);                                           \
> -	__chk_range_not_ok((unsigned long __force)(addr), size, limit); \
> -})
> -
> -static inline int __access_ok(const void __user * addr, unsigned long size)
> -{
> -	return 1;
> -}
> -
> -static inline int access_ok(const void __user * addr, unsigned long size)
> -{
> -	return 1;
> -}
> +#define __range_not_ok(addr, size, limit) (!__access_ok(addr, size))
>  
>  void __retl_efault(void);
>  
> diff --git a/arch/sparc/kernel/signal_32.c b/arch/sparc/kernel/signal_32.c
> index ffab16369bea..74f80443b195 100644
> --- a/arch/sparc/kernel/signal_32.c
> +++ b/arch/sparc/kernel/signal_32.c
> @@ -65,7 +65,7 @@ struct rt_signal_frame {
>   */
>  static inline bool invalid_frame_pointer(void __user *fp, int fplen)
>  {
> -	if ((((unsigned long) fp) & 15) || !__access_ok((unsigned long)fp, fplen))
> +	if ((((unsigned long) fp) & 15) || !access_ok(fp, fplen))
>  		return true;
>  
>  	return false;
> diff --git a/arch/um/include/asm/uaccess.h b/arch/um/include/asm/uaccess.h
> index 1ecfc96bcc50..7d9d60e41e4e 100644
> --- a/arch/um/include/asm/uaccess.h
> +++ b/arch/um/include/asm/uaccess.h
> @@ -25,7 +25,7 @@
>  extern unsigned long raw_copy_from_user(void *to, const void __user *from, unsigned long n);
>  extern unsigned long raw_copy_to_user(void __user *to, const void *from, unsigned long n);
>  extern unsigned long __clear_user(void __user *mem, unsigned long len);
> -static inline int __access_ok(unsigned long addr, unsigned long size);
> +static inline int __access_ok(const void __user *ptr, unsigned long size);
>  
>  /* Teach asm-generic/uaccess.h that we have C functions for these. */
>  #define __access_ok __access_ok
> @@ -36,8 +36,9 @@ static inline int __access_ok(unsigned long addr, unsigned long size);
>  
>  #include <asm-generic/uaccess.h>
>  
> -static inline int __access_ok(unsigned long addr, unsigned long size)
> +static inline int __access_ok(const void __user *ptr, unsigned long size)
>  {
> +	unsigned long addr = (unsigned long)ptr;
>  	return __addr_range_nowrap(addr, size) &&
>  		(__under_task_size(addr, size) ||
>  		 __access_ok_vsyscall(addr, size));
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index c6d9dc42724d..c5e4bb7161bc 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -12,18 +12,6 @@
>  #include <asm/smap.h>
>  #include <asm/extable.h>
>  
> -/*
> - * Test whether a block of memory is a valid user space address.
> - * Returns 0 if the range is valid, nonzero otherwise.
> - */
> -static inline bool __access_ok(void __user *ptr, unsigned long size)
> -{
> -	unsigned long limit = TASK_SIZE_MAX;
> -	unsigned long addr = ptr;
> -
> -	return (size <= limit) && (addr <= (limit - size));
> -}
> -
>  #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
>  static inline bool pagefault_disabled(void);
>  # define WARN_ON_IN_IRQ()	\
> @@ -55,6 +43,8 @@ static inline bool pagefault_disabled(void);
>  	likely(__access_ok(addr, size));\
>  })
>  
> +#include <asm-generic/access_ok.h>
> +
>  #define __range_not_ok(addr, size, limit)	(!__access_ok(addr, size))
>  #define __chk_range_not_ok(addr, size, limit)	(!__access_ok((void __user *)addr, size))
>  
> diff --git a/arch/xtensa/include/asm/uaccess.h b/arch/xtensa/include/asm/uaccess.h
> index 75bd8fbf52ba..0edd9e4b23d0 100644
> --- a/arch/xtensa/include/asm/uaccess.h
> +++ b/arch/xtensa/include/asm/uaccess.h
> @@ -35,15 +35,7 @@
>  #define get_fs()	(current->thread.current_ds)
>  #define set_fs(val)	(current->thread.current_ds = (val))
>  
> -#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
> -
> -#define __kernel_ok (uaccess_kernel())
> -#define __user_ok(addr, size) \
> -		(((size) <= TASK_SIZE)&&((addr) <= TASK_SIZE-(size)))
> -#define __access_ok(addr, size) (__kernel_ok || __user_ok((addr), (size)))
> -#define access_ok(addr, size) __access_ok((unsigned long)(addr), (size))
> -
> -#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * These are the main single-value transfer routines.  They
> diff --git a/include/asm-generic/access_ok.h b/include/asm-generic/access_ok.h
> new file mode 100644
> index 000000000000..883b573af5fe
> --- /dev/null
> +++ b/include/asm-generic/access_ok.h
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_GENERIC_ACCESS_OK_H__
> +#define __ASM_GENERIC_ACCESS_OK_H__
> +
> +/*
> + * Checking whether a pointer is valid for user space access.
> + * These definitions work on most architectures, but overrides can
> + * be used where necessary.
> + */
> +
> +/*
> + * architectures with compat tasks have a variable TASK_SIZE and should
> + * override this to a constant.
> + */
> +#ifndef TASK_SIZE_MAX
> +#define TASK_SIZE_MAX			TASK_SIZE
> +#endif
> +
> +#ifndef uaccess_kernel
> +#ifdef CONFIG_SET_FS
> +#define uaccess_kernel()		(get_fs().seg == KERNEL_DS.seg)
> +#else
> +#define uaccess_kernel()		(0)
> +#endif
> +#endif
> +
> +#ifndef user_addr_max
> +#define user_addr_max()			(uaccess_kernel() ? ~0UL : TASK_SIZE_MAX)
> +#endif
> +
> +#ifndef __access_ok
> +/*
> + * 'size' is a compile-time constant for most callers, so optimize for
> + * this case to turn the check into a single comparison against a constant
> + * limit and catch all possible overflows.
> + * On architectures with separate user address space (m68k, s390, parisc,
> + * sparc64) or those without an MMU, this should always return true.
> + *
> + * This version was originally contributed by Jonas Bonn for the
> + * OpenRISC architecture, and was found to be the most efficient
> + * for constant 'size' and 'limit' values.
> + */
> +static inline int __access_ok(const void __user *ptr, unsigned long size)
> +{
> +	unsigned long limit = user_addr_max();
> +	unsigned long addr = (unsigned long)ptr;
> +
> +	if (limit == ULONG_MAX)
> +		return true;
> +
> +	return (size <= limit) && (addr <= (limit - size));
> +}
> +#endif
> +
> +#ifndef access_ok
> +#define access_ok(addr, size) likely(__access_ok(addr, size))
> +#endif
> +
> +#endif
> diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
> index 0870fa11a7c5..ebc685dc8d74 100644
> --- a/include/asm-generic/uaccess.h
> +++ b/include/asm-generic/uaccess.h
> @@ -114,28 +114,9 @@ static inline void set_fs(mm_segment_t fs)
>  }
>  #endif
>  
> -#ifndef uaccess_kernel
> -#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
> -#endif
> -
> -#ifndef user_addr_max
> -#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
> -#endif
> -
>  #endif /* CONFIG_SET_FS */
>  
> -#define access_ok(addr, size) __access_ok((unsigned long)(addr),(size))
> -
> -/*
> - * The architecture should really override this if possible, at least
> - * doing a check on the get_fs()
> - */
> -#ifndef __access_ok
> -static inline int __access_ok(unsigned long addr, unsigned long size)
> -{
> -	return 1;
> -}
> -#endif
> +#include <asm-generic/access_ok.h>
>  
>  /*
>   * These are the main single-value transfer routines.  They automatically
> diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
> index 67e9bc94dc40..2c31667e62e0 100644
> --- a/include/linux/uaccess.h
> +++ b/include/linux/uaccess.h
> @@ -33,13 +33,6 @@ typedef struct {
>  	/* empty dummy */
>  } mm_segment_t;
>  
> -#ifndef TASK_SIZE_MAX
> -#define TASK_SIZE_MAX			TASK_SIZE
> -#endif
> -
> -#define uaccess_kernel()		(false)
> -#define user_addr_max()			(TASK_SIZE_MAX)
> -
>  static inline mm_segment_t force_uaccess_begin(void)
>  {
>  	return (mm_segment_t) { };
> -- 
> 2.29.2
> 


* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-14 16:34 ` [PATCH 08/14] arm64: simplify access_ok() Arnd Bergmann
  2022-02-14 21:06   ` Robin Murphy
  2022-02-15  8:17   ` Ard Biesheuvel
@ 2022-02-15 11:07   ` Mark Rutland
  2 siblings, 0 replies; 61+ messages in thread
From: Mark Rutland @ 2022-02-15 11:07 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: dalias, linux-ia64, linux-sh, peterz, linux-mips, linux-mm,
	guoren, sparclinux, linux-hexagon, linux-riscv, will,
	Christoph Hellwig, linux-arch, linux-s390, bcain, deller, x86,
	linux, linux-csky, ardb, mingo, geert, linux-snps-arc,
	linux-xtensa, arnd, hca, linux-alpha, linux-um, linuxppc-dev,
	linux-m68k, openrisc, green.hu, shorne, linux-arm-kernel, monstr,
	tsbogend, linux-parisc, nickhu, jcmvbkbc, linux-api,
	linux-kernel, dinguyen, ebiederm, richard, akpm, Linus Torvalds,
	davem

On Mon, Feb 14, 2022 at 05:34:46PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
> 
> arm64 has an inline asm implementation of access_ok() that is derived from
> the 32-bit arm version and optimized for the case that both the limit and
> the size are variable. With set_fs() gone, the limit is always constant,
> and the size usually is as well, so just using the default implementation
> reduces the check into a comparison against a constant that can be
> scheduled by the compiler.
> 
> On a defconfig build, this saves over 28KB of .text.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

I had a play around with this and a number of alternative options that had
previously been discussed (e.g. using uint128_t for the check to allow the
compiler to use the carry flag), and:

* Any sequences which were significantly simpler involved an ABI change (e.g. not
  checking tags for tasks not using the relaxed tag ABI), or didn't interact
  well with the uaccess pointer masking we do for speculation hardening.

* For all constant-size cases, this was joint-best for codegen.

* For variable-size cases the difference between options (which did not change
  ABI or break pointer masking) fell in the noise and really depended on what
  you were optimizing for.

This patch itself is clear, I believe the logic is sound and does not result in
a behavioural change, so for this as-is:

Acked-by: Mark Rutland <mark.rutland@arm.com>

As on other replies, I think that if we want to make further changes to this,
we should do that as follow-ups, since there are a number of subtleties in this
area w.r.t. tag management and speculation with potential ABI implications.
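
For readers following along, the comparison between the generic check and
the uint128_t alternative mentioned above can be sketched in plain
user-space C. This is only an illustration under stated assumptions: LIMIT
is a hypothetical stand-in for TASK_SIZE_MAX, the function names are made
up for this note, and unsigned __int128 is a GCC/Clang extension rather
than what any of the patches use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for TASK_SIZE_MAX; the real value is per-arch. */
#define LIMIT (UINT64_C(1) << 48)

/* Naive form: addr + size can wrap around and falsely pass. */
static bool range_ok_naive(uint64_t addr, uint64_t size)
{
	return addr + size <= LIMIT;
}

/*
 * Shape of the generic check: no addition is performed, so nothing can
 * wrap. With constant size and LIMIT, "LIMIT - size" folds to a constant
 * and only the second comparison remains at run time.
 */
static bool range_ok(uint64_t addr, uint64_t size)
{
	return size <= LIMIT && addr <= LIMIT - size;
}

/* Wide-arithmetic reference: the sum is computed in 128 bits. */
static bool range_ok_wide(uint64_t addr, uint64_t size)
{
	return (unsigned __int128)addr + size <= LIMIT;
}
```

The two non-naive forms should agree for every input; the naive one
accepts, for example, addr = UINT64_MAX with size = 2, because the sum
wraps to 1.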

Thanks,
Mark.

> ---
>  arch/arm64/include/asm/uaccess.h | 28 +++++-----------------------
>  1 file changed, 5 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 357f7bd9c981..e8dce0cc5eaa 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -26,6 +26,8 @@
>  #include <asm/memory.h>
>  #include <asm/extable.h>
>  
> +static inline int __access_ok(const void __user *ptr, unsigned long size);
> +
>  /*
>   * Test whether a block of memory is a valid user space address.
>   * Returns 1 if the range is valid, 0 otherwise.
> @@ -33,10 +35,8 @@
>   * This is equivalent to the following test:
>   * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
>   */
> -static inline unsigned long __access_ok(const void __user *addr, unsigned long size)
> +static inline int access_ok(const void __user *addr, unsigned long size)
>  {
> -	unsigned long ret, limit = TASK_SIZE_MAX - 1;
> -
>  	/*
>  	 * Asynchronous I/O running in a kernel thread does not have the
>  	 * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag
> @@ -46,27 +46,9 @@ static inline unsigned long __access_ok(const void __user *addr, unsigned long s
>  	    (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
>  		addr = untagged_addr(addr);
>  
> -	__chk_user_ptr(addr);
> -	asm volatile(
> -	// A + B <= C + 1 for all A,B,C, in four easy steps:
> -	// 1: X = A + B; X' = X % 2^64
> -	"	adds	%0, %3, %2\n"
> -	// 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
> -	"	csel	%1, xzr, %1, hi\n"
> -	// 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
> -	//    to compensate for the carry flag being set in step 4. For
> -	//    X > 2^64, X' merely has to remain nonzero, which it does.
> -	"	csinv	%0, %0, xzr, cc\n"
> -	// 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
> -	//    comes from the carry in being clear. Otherwise, we are
> -	//    testing X' - C == 0, subject to the previous adjustments.
> -	"	sbcs	xzr, %0, %1\n"
> -	"	cset	%0, ls\n"
> -	: "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
> -
> -	return ret;
> +	return likely(__access_ok(addr, size));
>  }
> -#define __access_ok __access_ok
> +#define access_ok access_ok
>  
>  #include <asm-generic/access_ok.h>
>  
> -- 
> 2.29.2
> 


* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-15  9:30     ` David Laight
@ 2022-02-15 11:24       ` Mark Rutland
  0 siblings, 0 replies; 61+ messages in thread
From: Mark Rutland @ 2022-02-15 11:24 UTC (permalink / raw)
  To: David Laight
  Cc: Rich Felker, linux-ia64, linux-sh, Peter Zijlstra,
	Linux Kernel Mailing List, Linux Memory Management List, Guo Ren,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	linux-riscv, linux-api, Will Deacon, Christoph Hellwig,
	linux-arch, open list:S390, Brian Cain, linux-hexagon,
	Helge Deller, X86 ML, Russell King, linux-csky,
	'Ard Biesheuvel',
	Linus Torvalds, Ingo Molnar, Geert Uytterhoeven, linux-snps-arc,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, Richard Weinberger,
	linux-m68k, openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Arnd Bergmann, monstr, Thomas Bogendoerfer, Nick Hu,
	open list:PARISC ARCHITECTURE, Max Filippov,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	open list:MIPS, dinguyen, Eric W. Biederman, alpha,
	Andrew Morton, Robin Murphy, David S. Miller

On Tue, Feb 15, 2022 at 09:30:41AM +0000, David Laight wrote:
> From: Ard Biesheuvel
> > Sent: 15 February 2022 08:18
> > 
> > On Mon, 14 Feb 2022 at 17:37, Arnd Bergmann <arnd@kernel.org> wrote:
> > >
> > > From: Arnd Bergmann <arnd@arndb.de>
> > >
> > > arm64 has an inline asm implementation of access_ok() that is derived from
> > > the 32-bit arm version and optimized for the case that both the limit and
> > > the size are variable. With set_fs() gone, the limit is always constant,
> > > and the size usually is as well, so just using the default implementation
> > > reduces the check into a comparison against a constant that can be
> > > scheduled by the compiler.
> > >
> > > On a defconfig build, this saves over 28KB of .text.
> > >
> > > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> > > ---
> > >  arch/arm64/include/asm/uaccess.h | 28 +++++-----------------------
> > >  1 file changed, 5 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > > index 357f7bd9c981..e8dce0cc5eaa 100644
> > > --- a/arch/arm64/include/asm/uaccess.h
> > > +++ b/arch/arm64/include/asm/uaccess.h
> > > @@ -26,6 +26,8 @@
> > >  #include <asm/memory.h>
> > >  #include <asm/extable.h>
> > >
> > > +static inline int __access_ok(const void __user *ptr, unsigned long size);
> > > +
> > >  /*
> > >   * Test whether a block of memory is a valid user space address.
> > >   * Returns 1 if the range is valid, 0 otherwise.
> > > @@ -33,10 +35,8 @@
> > >   * This is equivalent to the following test:
> > >   * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
> > >   */
> > > -static inline unsigned long __access_ok(const void __user *addr, unsigned long size)
> > > +static inline int access_ok(const void __user *addr, unsigned long size)
> > >  {
> > > -       unsigned long ret, limit = TASK_SIZE_MAX - 1;
> > > -
> > >         /*
> > >          * Asynchronous I/O running in a kernel thread does not have the
> > >          * TIF_TAGGED_ADDR flag of the process owning the mm, so always untag
> > > @@ -46,27 +46,9 @@ static inline unsigned long __access_ok(const void __user *addr, unsigned long s
> > >             (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
> > >                 addr = untagged_addr(addr);
> > >
> > > -       __chk_user_ptr(addr);
> > > -       asm volatile(
> > > -       // A + B <= C + 1 for all A,B,C, in four easy steps:
> > > -       // 1: X = A + B; X' = X % 2^64
> > > -       "       adds    %0, %3, %2\n"
> > > -       // 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
> > > -       "       csel    %1, xzr, %1, hi\n"
> > > -       // 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
> > > -       //    to compensate for the carry flag being set in step 4. For
> > > -       //    X > 2^64, X' merely has to remain nonzero, which it does.
> > > -       "       csinv   %0, %0, xzr, cc\n"
> > > -       // 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
> > > -       //    comes from the carry in being clear. Otherwise, we are
> > > -       //    testing X' - C == 0, subject to the previous adjustments.
> > > -       "       sbcs    xzr, %0, %1\n"
> > > -       "       cset    %0, ls\n"
> > > -       : "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
> > > -
> > > -       return ret;
> > > +       return likely(__access_ok(addr, size));
> > >  }
> > > -#define __access_ok __access_ok
> > > +#define access_ok access_ok
> > >
> > >  #include <asm-generic/access_ok.h>
> > >
> > > --
> > > 2.29.2
> > >
> > 
> > With set_fs() out of the picture, wouldn't it be sufficient to check
> > that bit #55 is clear? (the bit that selects between TTBR0 and TTBR1)
> > That would also remove the need to strip the tag from the address.
> > 
> > Something like
> > 
> >     asm goto("tbnz  %0, #55, %2     \n"
> >              "tbnz  %1, #55, %2     \n"
> >              :: "r"(addr), "r"(addr + size - 1) :: notok);
> >     return 1;
> > notok:
> >     return 0;
> > 
> > with an additional sanity check on the size which the compiler could
> > eliminate for compile-time constant values.
> 
> Is there a reason not to just use:
> 	size < 1u << 48 && !((addr | (addr + size - 1)) & 1u << 55)

That has a few problems, including being an ABI change for tasks not using the
relaxed tag ABI and not working for 52-bit VAs.

If we really want to relax the tag checking aspect, there are simpler options,
including variations on Ard's approach above.

> Ugg, is arm64 addressing as horrid as it looks - with the 'kernel'
> bit in the middle of the virtual address space?

It's just sign-extension/canonical addressing, except bits [63:56] are
configurable between a few uses, so the architecture says bit 55 is the one to
look at in all configurations to figure out if an address is high/low (in
addition to checking the remaining bits are canonical).
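
As a rough user-space sketch of the two tests described above (not the
kernel's implementation): the helper names and the va_bits parameter are
hypothetical, and bits [63:56] are simply ignored here as if they were
always a tag, which is a simplification of the configurable uses
mentioned above. The constants are written with UINT64_C() so that the
shifts are done in 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define VA_SELECT_BIT 55	/* clear: low (TTBR0) half, set: high (TTBR1) half */

/* Hypothetical helper: true if the address selects the low half. */
static bool addr_is_low_half(uint64_t addr)
{
	return !(addr & (UINT64_C(1) << VA_SELECT_BIT));
}

/*
 * Hypothetical helper: bits [55:va_bits] must all equal bit 55, i.e.
 * they are either all zero or all one. Bits [63:56] are not examined.
 */
static bool addr_is_canonical(uint64_t addr, unsigned int va_bits)
{
	uint64_t low_mask = (UINT64_C(1) << va_bits) - 1;
	uint64_t mask = ((UINT64_C(1) << 56) - 1) & ~low_mask;
	uint64_t bits = addr & mask;

	return bits == 0 || bits == mask;
}
```

Passing va_bits = 52 instead of 48 accepts addresses with bits [51:48]
set, which is the case a fixed "1 << 48" size bound would not cover.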

Thanks,
Mark.


* Re: [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault
  2022-02-15  0:31   ` Al Viro
@ 2022-02-15 13:16     ` Arnd Bergmann
  0 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-15 13:16 UTC (permalink / raw)
  To: Al Viro
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Linux-MM,
	Guo Ren, sparclinux, open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	linux-s390, Brian Cain, Helge Deller, the arch/x86 maintainers,
	Russell King - ARM Linux, linux-csky, Ard Biesheuvel,
	Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um, linuxppc-dev,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Parisc List, Nick Hu,
	Max Filippov, Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W . Biederman, Richard Weinberger, Andrew Morton,
	Linus Torvalds, David Miller

On Tue, Feb 15, 2022 at 1:31 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Mon, Feb 14, 2022 at 05:34:43PM +0100, Arnd Bergmann wrote:
> > From: Arnd Bergmann <arnd@arndb.de>
> >
> > All architectures that don't provide __{get,put}_kernel_nofault() yet
> > can implement this on top of __{get,put}_user.
> >
> > Add a generic version that lets everything use the normal
> > copy_{from,to}_kernel_nofault() code based on these, removing the last
> > use of get_fs()/set_fs() from architecture-independent code.
>
> I'd put the list of those architectures (AFAICS, that's alpha, ia64,
> microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa) into commit
> message - it's not that hard to find out, but...

done.

> And AFAICS, you've missed nios2 - see
> #define __put_user(x, ptr) put_user(x, ptr)
> in there.  nds32 oddities are dealt with earlier in the series, this
> one is not...

Ok, I have now fixed my bug in nios2 __put_user() as well. This one is not
nearly as bad as nds32: at least without my patches it should work as expected.

Unfortunately I also noticed that __get_user() on microblaze and nios2
is completely broken for 64-bit arguments, where these copy eight bytes
into a four byte buffer. I'll try to come up with a fix for this as well then.
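
To make the failure mode concrete, here is a small user-space sketch, not
the actual fix: memcpy() stands in for the raw user copy, and
sketch_get_user() is a hypothetical name. The point is only that the
temporary is sized for the widest supported access, so an eight-byte load
cannot overrun it even where unsigned long is four bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __get_user(); returns 0 on success, -1 on a bad size. */
static int sketch_get_user(void *dst, const void *src, size_t size)
{
	/* Eight bytes wide, independent of sizeof(unsigned long). */
	union {
		uint8_t  b;
		uint16_t h;
		uint32_t w;
		uint64_t d;
	} tmp;

	if (size != 1 && size != 2 && size != 4 && size != 8)
		return -1;	/* unsupported access size */

	assert(size <= sizeof(tmp));
	memcpy(&tmp, src, size);	/* stands in for the raw user copy */
	memcpy(dst, &tmp, size);
	return 0;
}
```

Called as sketch_get_user(&val, ptr, sizeof(*ptr)), a 64-bit argument goes
through the same path as the smaller sizes.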

         Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* RE: [PATCH 09/14] m68k: drop custom __access_ok()
  2022-02-15 10:02         ` Arnd Bergmann
@ 2022-02-15 13:28           ` David Laight
  0 siblings, 0 replies; 61+ messages in thread
From: David Laight @ 2022-02-15 13:28 UTC (permalink / raw)
  To: 'Arnd Bergmann', Al Viro
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Linux-MM,
	Guo Ren, sparclinux, open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	linux-s390, Brian Cain, Helge Deller, the arch/x86 maintainers,
	Russell King - ARM Linux, linux-csky, Ard Biesheuvel,
	Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um, linuxppc-dev,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Parisc List, Nick Hu,
	Max Filippov, Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W . Biederman, Richard Weinberger, Andrew Morton,
	Linus Torvalds, David Miller

From: Arnd Bergmann
> Sent: 15 February 2022 10:02
> 
> On Tue, Feb 15, 2022 at 8:13 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Tue, Feb 15, 2022 at 07:29:42AM +0100, Christoph Hellwig wrote:
> > > On Tue, Feb 15, 2022 at 12:37:41AM +0000, Al Viro wrote:
> > > > Perhaps simply wrap that sucker into #ifdef CONFIG_CPU_HAS_ADDRESS_SPACES
> > > > (and trim the comment down to "coldfire and 68000 will pick generic
> > > > variant")?
> > >
> > > I wonder if we should invert CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE,
> > > select the separate address space config for s390, sparc64, non-coldfire
> > > m68k and mips with EVA and then just have one single access_ok for
> > > overlapping address space (as added by Arnd) and non-overlapping ones
> > > (always return true).
> >
> > parisc is also such...  How about
> >
> >         select ALTERNATE_SPACE_USERLAND
> >
> > for that bunch?
> 
> Either of those works for me. My current version has this keyed off
> TASK_SIZE_MAX==ULONG_MAX, but a CONFIG_ symbol does
> look more descriptive.
> 
> >  While we are at it, how many unusual access_ok() instances are
> > left after this series?  arm64, itanic, um, anything else?
> 
> x86 adds a WARN_ON_IN_IRQ() check in there.

It is a no-op unless CONFIG_DEBUG_ATOMIC_SLEEP is set.
I doubt that is often enabled.

> This could be
> made generic, but it's not obvious what exactly the exceptions are
> that other architectures need. The arm64 tagged pointers could
> probably also get integrated into the generic version.
> 
> > FWIW, sparc32 has a slightly unusual instance (see uaccess_32.h there); it's
> > obviously cheaper than generic and I wonder if the trick is legitimate (and
> > applicable elsewhere, perhaps)...
> 
> Right, a few others have the same, but I wasn't convinced that this
> is actually safe for all possible cases: it's trivial to construct a caller
> that works on other architectures but not this one, if you pass a large
> enough size value and don't access the contents in sequence.

You'd need code that did an access_ok() check and then read from
a large offset from the address - unlikely.
It's not like the access_ok() check for read/write is done on syscall
entry and then everything underneath assumes it is valid.

Hasn't (almost) everything been checked for function calls between
user_access_begin() and the actual accesses?
And access_ok() is done by/at the same time as user_access_begin()?

You do need an unmapped page above the address that is tested.

> Also, like the ((addr | (addr + size)) & MASK) check on some other
> architectures, it is less portable because it makes assumptions about
> the actual layout beyond a fixed address limit.

Isn't that test broken without a separate bound check on size?
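
A minimal sketch of the concern, assuming a 32-bit layout where user space
is the low 2 GiB. SKETCH_KERNEL_MASK and both function names are invented
for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: user space is the low 2 GiB of a 32-bit address space. */
#define SKETCH_KERNEL_MASK 0x80000000u

/* Mask-only check: tests the start address and the address past the end. */
static bool mask_check(uint32_t addr, uint32_t size)
{
	return ((addr | (addr + size)) & SKETCH_KERNEL_MASK) == 0;
}

/* Same check with a separate bound on size, so addr + size cannot wrap. */
static bool mask_check_bounded(uint32_t addr, uint32_t size)
{
	return size < SKETCH_KERNEL_MASK && mask_check(addr, size);
}
```

With addr = 0x1000 and size = 0xfffff000 the sum wraps to 0, so
mask_check() accepts a range that covers kernel addresses, while
mask_check_bounded() rejects it.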

I also seem to remember that access_ok(xxx, 0) is always 'ok'
and some of the 'fast' tests give a false negative if the user
buffer ends with the last byte of user address space.

So you may need:
	size < TASK_SIZE && (addr < (TASK_SIZE - size - 1) || !size)
(sprinkled with [un]likely())
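
For experimenting with the edge cases, here is a user-space sketch that
transcribes the expression above as written and puts a subtraction-based
form next to it. SKETCH_TASK_SIZE and the function names are assumptions
made for this example, with user space taken to be [0, SKETCH_TASK_SIZE)
on a 32-bit machine:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit for illustration: user space is [0, SKETCH_TASK_SIZE). */
#define SKETCH_TASK_SIZE 0x80000000u

/* The expression from the message, transcribed as written. */
static bool range_ok_as_posted(uint32_t addr, uint32_t size)
{
	return size < SKETCH_TASK_SIZE &&
	       (addr < (SKETCH_TASK_SIZE - size - 1) || !size);
}

/* Subtraction-based form: there is no addition, so nothing can wrap. */
static bool range_ok_subtract(uint32_t addr, uint32_t size)
{
	return size <= SKETCH_TASK_SIZE && addr <= SKETCH_TASK_SIZE - size;
}
```

Both reject an oversized length such as 0xfffff000. They differ for a
buffer that ends exactly at the limit (addr = 0x7ffff000, size = 0x1000):
the subtraction form accepts it and the transcribed expression does not.
They also differ for size 0, which the transcribed expression accepts for
any address.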

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 11/14] sparc64: remove CONFIG_SET_FS support
  2022-02-14 17:06   ` Christoph Hellwig
@ 2022-02-16 13:06     ` Arnd Bergmann
  0 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-16 13:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Max Filippov,
	Guo Ren, sparclinux, linux-riscv, Will Deacon, Ard Biesheuvel,
	linux-arch, linux-s390, Brian Cain, open list:QUALCOMM HEXAGON...,
	Helge Deller, the arch/x86 maintainers, Russell King - ARM Linux,
	linux-csky, Christoph Hellwig, Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, linuxppc-dev,
	Richard Weinberger, linux-m68k, Openrisc, Greentime Hu,
	Stafford Horne, Linux ARM, Michal Simek, Thomas Bogendoerfer,
	Nick Hu, Parisc List, Linux-MM, Linux API,
	Linux Kernel Mailing List, Dinh Nguyen, Eric W . Biederman,
	alpha, Andrew Morton, Linus Torvalds, David Miller

On Mon, Feb 14, 2022 at 6:06 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> >  void prom_world(int enter)
> >  {
> > -     if (!enter)
> > -             set_fs(get_fs());
> > -
> >       __asm__ __volatile__("flushw");
> >  }
>
> The enter argument is now unused.

Right, good point. I'll add a comment, but I think I will leave the argument
in place, as changing the callers in assembly code seems too hard. If any
sparc64 developer wants to clean that up, I'm happy to integrate the cleanup
patch in my series.

         Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 11/14] sparc64: remove CONFIG_SET_FS support
  2022-02-15  0:48   ` Al Viro
@ 2022-02-16 13:07     ` Arnd Bergmann
  0 siblings, 0 replies; 61+ messages in thread
From: Arnd Bergmann @ 2022-02-16 13:07 UTC (permalink / raw)
  To: Al Viro
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, open list:BROADCOM NVRAM DRIVER, Linux-MM,
	Guo Ren, sparclinux, open list:QUALCOMM HEXAGON...,
	linux-riscv, Will Deacon, Christoph Hellwig, linux-arch,
	linux-s390, Brian Cain, Helge Deller, the arch/x86 maintainers,
	Russell King - ARM Linux, linux-csky, Ard Biesheuvel,
	Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, alpha, linux-um, linuxppc-dev,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Parisc List, Nick Hu,
	Max Filippov, Linux API, Linux Kernel Mailing List, Dinh Nguyen,
	Eric W . Biederman, Richard Weinberger, Andrew Morton,
	Linus Torvalds, David Miller

On Tue, Feb 15, 2022 at 1:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Mon, Feb 14, 2022 at 05:34:49PM +0100, Arnd Bergmann wrote:
>
> > -/*
> > - * Sparc64 is segmented, though more like the M68K than the I386.
> > - * We use the secondary ASI to address user memory, which references a
> > - * completely different VM map, thus there is zero chance of the user
> > - * doing something queer and tricking us into poking kernel memory.
>
> Actually, this part of comment probably ought to stay - it is relevant
> for understanding what's going on (e.g. why is access_ok() always true, etc.)

Ok, I've put it back now.

       Arnd

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 08/14] arm64: simplify access_ok()
  2022-02-15  9:12     ` Arnd Bergmann
  2022-02-15  9:21       ` Ard Biesheuvel
@ 2022-02-16 19:43       ` Christophe Leroy
  1 sibling, 0 replies; 61+ messages in thread
From: Christophe Leroy @ 2022-02-16 19:43 UTC (permalink / raw)
  To: Arnd Bergmann, Ard Biesheuvel
  Cc: Mark Rutland, Rich Felker, linux-ia64, Linux-sh list,
	Peter Zijlstra, Linux Kernel Mailing List,
	Linux Memory Management List, Guo Ren,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	linux-riscv, Linux API, Will Deacon, Christoph Hellwig,
	linux-arch, open list:S390, Brian Cain,
	open list:QUALCOMM HEXAGON...,
	Helge Deller, X86 ML, Russell King, linux-csky, Linus Torvalds,
	Ingo Molnar, Geert Uytterhoeven,
	open list:SYNOPSYS ARC ARCHITECTURE,
	open list:TENSILICA XTENSA PORT (xtensa),
	Arnd Bergmann, Heiko Carstens, linux-um, Richard Weinberger,
	linux-m68k, Openrisc, Greentime Hu, Stafford Horne, Linux ARM,
	Michal Simek, Thomas Bogendoerfer, Nick Hu,
	open list:PARISC ARCHITECTURE, Max Filippov,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	open list:MIPS, Dinh Nguyen, Eric W. Biederman, alpha,
	Andrew Morton, Robin Murphy, David S. Miller



Le 15/02/2022 à 10:12, Arnd Bergmann a écrit :
> On Tue, Feb 15, 2022 at 9:17 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>> On Mon, 14 Feb 2022 at 17:37, Arnd Bergmann <arnd@kernel.org> wrote:
>>> From: Arnd Bergmann <arnd@arndb.de>
>>>
>>
>> With set_fs() out of the picture, wouldn't it be sufficient to check
>> that bit #55 is clear? (the bit that selects between TTBR0 and TTBR1)
>> That would also remove the need to strip the tag from the address.
>>
>> Something like
>>
>>      asm goto("tbnz  %0, #55, %2     \n"
>>               "tbnz  %1, #55, %2     \n"
>>               :: "r"(addr), "r"(addr + size - 1) :: notok);
>>      return 1;
>> notok:
>>      return 0;
>>
>> with an additional sanity check on the size which the compiler could
>> eliminate for compile-time constant values.
> 
> That should work, but I don't see it as a clear enough advantage to
> have a custom implementation. For the constant-size case, it probably
> isn't better than a compiler-scheduled comparison against a
> constant limit, but it does hurt maintainability when the next person
> wants to change the behavior of access_ok() globally.
> 
> If we want to get into micro-optimizing uaccess, I think a better target
> would be a CONFIG_CC_HAS_ASM_GOTO_OUTPUT version
> of __get_user()/__put_user as we have on x86 and powerpc.
> 

There are also the user block accesses with
user_access_begin()/user_access_end() together with unsafe_put_user()
and unsafe_get_user(), which allowed us to optimise user accesses on
powerpc, especially in the signal code.
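
The shape of that pattern can be sketched in user space. Everything
prefixed mock_ below is a hypothetical stand-in written for this example,
not the kernel helpers themselves; the mock only imitates the control
flow of one check up front, several unchecked stores, and a single shared
fault label:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock: one range check up front (the real helper also opens user access). */
static bool mock_user_access_begin(const void *ptr, size_t len)
{
	return ptr != NULL && len != 0;
}

/* Mock: the real helper closes the user access window again. */
static void mock_user_access_end(void)
{
}

/* Mock store that "faults" (returns false) on a NULL destination. */
static bool mock_store_u32(uint32_t *dst, uint32_t val)
{
	if (!dst)
		return false;
	*dst = val;
	return true;
}

/* Same shape as unsafe_put_user(x, ptr, label): jump to label on a fault. */
#define mock_unsafe_put_user(val, ptr, label)		\
	do {						\
		if (!mock_store_u32((ptr), (val)))	\
			goto label;			\
	} while (0)

struct mock_frame {
	uint32_t pc;
	uint32_t sp;
	uint32_t flags;
};

/* One check, several unchecked stores, one shared error path. */
static int mock_fill_frame(struct mock_frame *uf, uint32_t pc, uint32_t sp)
{
	if (!mock_user_access_begin(uf, sizeof(*uf)))
		return -1;

	mock_unsafe_put_user(pc, &uf->pc, fault);
	mock_unsafe_put_user(sp, &uf->sp, fault);
	mock_unsafe_put_user(0, &uf->flags, fault);

	mock_user_access_end();
	return 0;

fault:
	mock_user_access_end();
	return -1;
}
```

The saving comes from paying for the range check and the access window
once per block rather than once per field, which is why code that writes
many adjacent fields, such as signal frame setup, benefits most.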

Christophe

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2022-02-16 19:44 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-14 16:34 [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Arnd Bergmann
2022-02-14 16:34 ` [PATCH 01/14] uaccess: fix integer overflow on access_ok() Arnd Bergmann
2022-02-14 16:58   ` Christoph Hellwig
2022-02-14 16:34 ` [PATCH 02/14] sparc64: add __{get,put}_kernel_nocheck() Arnd Bergmann
2022-02-14 16:34 ` [PATCH 03/14] nds32: fix access_ok() checks in get/put_user Arnd Bergmann
2022-02-14 17:01   ` Christoph Hellwig
2022-02-14 17:10     ` David Laight
2022-02-15  9:18     ` Arnd Bergmann
2022-02-15 10:25       ` Greg KH
2022-02-14 16:34 ` [PATCH 04/14] x86: use more conventional access_ok() definition Arnd Bergmann
2022-02-14 17:02   ` Christoph Hellwig
2022-02-14 19:45     ` Arnd Bergmann
2022-02-14 20:00       ` Christoph Hellwig
2022-02-14 20:01       ` Linus Torvalds
2022-02-14 20:17         ` Al Viro
2022-02-15  2:47           ` Al Viro
2022-02-14 20:24         ` Linus Torvalds
2022-02-14 22:13           ` David Laight
2022-02-14 16:34 ` [PATCH 05/14] uaccess: add generic __{get,put}_kernel_nofault Arnd Bergmann
2022-02-14 17:02   ` Christoph Hellwig
2022-02-15  0:31   ` Al Viro
2022-02-15 13:16     ` Arnd Bergmann
2022-02-14 16:34 ` [PATCH 06/14] mips: use simpler access_ok() Arnd Bergmann
2022-02-14 16:34 ` [PATCH 07/14] uaccess: generalize access_ok() Arnd Bergmann
2022-02-14 17:04   ` Christoph Hellwig
2022-02-14 17:15   ` Al Viro
2022-02-14 19:25     ` Arnd Bergmann
2022-02-15 10:58   ` Mark Rutland
2022-02-14 16:34 ` [PATCH 08/14] arm64: simplify access_ok() Arnd Bergmann
2022-02-14 21:06   ` Robin Murphy
2022-02-15  8:17   ` Ard Biesheuvel
2022-02-15  9:12     ` Arnd Bergmann
2022-02-15  9:21       ` Ard Biesheuvel
2022-02-15  9:39         ` Arnd Bergmann
2022-02-15 10:39           ` Mark Rutland
2022-02-15 10:37         ` Mark Rutland
2022-02-16 19:43       ` Christophe Leroy
2022-02-15  9:30     ` David Laight
2022-02-15 11:24       ` Mark Rutland
2022-02-15 11:07   ` Mark Rutland
2022-02-14 16:34 ` [PATCH 09/14] m68k: drop custom __access_ok() Arnd Bergmann
2022-02-15  0:37   ` Al Viro
2022-02-15  6:29     ` Christoph Hellwig
2022-02-15  7:13       ` Al Viro
2022-02-15 10:02         ` Arnd Bergmann
2022-02-15 13:28           ` David Laight
2022-02-14 16:34 ` [PATCH 10/14] uaccess: remove most CONFIG_SET_FS users Arnd Bergmann
2022-02-14 17:06   ` Christoph Hellwig
2022-02-14 19:40     ` Arnd Bergmann
2022-02-14 16:34 ` [PATCH 11/14] sparc64: remove CONFIG_SET_FS support Arnd Bergmann
2022-02-14 17:06   ` Christoph Hellwig
2022-02-16 13:06     ` Arnd Bergmann
2022-02-15  0:48   ` Al Viro
2022-02-16 13:07     ` Arnd Bergmann
2022-02-14 16:34 ` [PATCH 12/14] sh: " Arnd Bergmann
2022-02-14 16:34 ` [PATCH 13/14] ia64: " Arnd Bergmann
2022-02-14 16:34 ` [PATCH 14/14] uaccess: drop set_fs leftovers Arnd Bergmann
2022-02-15  3:03   ` Al Viro
2022-02-15  7:46     ` Helge Deller
2022-02-15  8:10       ` Arnd Bergmann
2022-02-14 17:35 ` [PATCH 00/14] clean up asm/uaccess.h, kill set_fs for good Linus Torvalds
