linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/9] ARM: remove set_fs callers and implementation
@ 2020-09-18 12:46 Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 1/9] mm/maccess: fix unaligned copy_{from,to}_kernel_nofault Arnd Bergmann
                   ` (10 more replies)
  0 siblings, 11 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, Arnd Bergmann

Hi Christoph, Russell,

Here is an updated series for removing set_fs() from arch/arm,
based on the previous feedback.

I have tested the oabi-compat changes using the LTP tests for the three
modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
and I have lightly tested the get_kernel_nofault infrastructure by
loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.

     Arnd

Arnd Bergmann (9):
  mm/maccess: fix unaligned copy_{from,to}_kernel_nofault
  ARM: traps: use get_kernel_nofault instead of set_fs()
  ARM: oabi-compat: add epoll_pwait handler
  ARM: syscall: always store thread_info->syscall
  ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation
  ARM: oabi-compat: rework sys_semtimedop emulation
  ARM: oabi-compat: rework fcntl64() emulation
  ARM: uaccess: add __{get,put}_kernel_nofault
  ARM: uaccess: remove set_fs() implementation

 arch/arm/Kconfig                   |   1 -
 arch/arm/include/asm/ptrace.h      |   1 -
 arch/arm/include/asm/syscall.h     |  16 ++-
 arch/arm/include/asm/thread_info.h |   4 -
 arch/arm/include/asm/uaccess-asm.h |   6 -
 arch/arm/include/asm/uaccess.h     | 169 ++++++++++++++-------------
 arch/arm/kernel/asm-offsets.c      |   3 +-
 arch/arm/kernel/entry-common.S     |  16 +--
 arch/arm/kernel/process.c          |   7 +-
 arch/arm/kernel/ptrace.c           |   4 +-
 arch/arm/kernel/signal.c           |   8 --
 arch/arm/kernel/sys_oabi-compat.c  | 181 ++++++++++++++++-------------
 arch/arm/kernel/traps.c            |  47 +++-----
 arch/arm/lib/copy_from_user.S      |   3 +-
 arch/arm/lib/copy_to_user.S        |   3 +-
 arch/arm/tools/syscall.tbl         |   2 +-
 fs/eventpoll.c                     |   5 +-
 include/linux/eventpoll.h          |  18 +++
 include/linux/syscalls.h           |   3 +
 ipc/sem.c                          |  84 ++++++++-----
 mm/maccess.c                       |  28 ++++-
 21 files changed, 328 insertions(+), 281 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 1/9] mm/maccess: fix unaligned copy_{from,to}_kernel_nofault
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 2/9] ARM: traps: use get_kernel_nofault instead of set_fs() Arnd Bergmann
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm,
	Arnd Bergmann, Christoph Hellwig

On machines such as ARMv5 that trap unaligned accesses, these
two functions can be slow when each access needs to be emulated,
or they might not work at all.

Change them so that each loop is only used when both the src
and dst pointers are naturally aligned.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 mm/maccess.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/mm/maccess.c b/mm/maccess.c
index 3bd70405f2d8..d3f1a1f0b1c1 100644
--- a/mm/maccess.c
+++ b/mm/maccess.c
@@ -24,13 +24,21 @@ bool __weak copy_from_kernel_nofault_allowed(const void *unsafe_src,
 
 long copy_from_kernel_nofault(void *dst, const void *src, size_t size)
 {
+	unsigned long align = 0;
+
+	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
+		align = (unsigned long)dst | (unsigned long)src;
+
 	if (!copy_from_kernel_nofault_allowed(src, size))
 		return -ERANGE;
 
 	pagefault_disable();
-	copy_from_kernel_nofault_loop(dst, src, size, u64, Efault);
-	copy_from_kernel_nofault_loop(dst, src, size, u32, Efault);
-	copy_from_kernel_nofault_loop(dst, src, size, u16, Efault);
+	if (!(align & 7))
+		copy_from_kernel_nofault_loop(dst, src, size, u64, Efault);
+	if (!(align & 3))
+		copy_from_kernel_nofault_loop(dst, src, size, u32, Efault);
+	if (!(align & 1))
+		copy_from_kernel_nofault_loop(dst, src, size, u16, Efault);
 	copy_from_kernel_nofault_loop(dst, src, size, u8, Efault);
 	pagefault_enable();
 	return 0;
@@ -50,10 +58,18 @@ EXPORT_SYMBOL_GPL(copy_from_kernel_nofault);
 
 long copy_to_kernel_nofault(void *dst, const void *src, size_t size)
 {
+	unsigned long align = 0;
+
+	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
+		align = (unsigned long)dst | (unsigned long)src;
+
 	pagefault_disable();
-	copy_to_kernel_nofault_loop(dst, src, size, u64, Efault);
-	copy_to_kernel_nofault_loop(dst, src, size, u32, Efault);
-	copy_to_kernel_nofault_loop(dst, src, size, u16, Efault);
+	if (!(align & 7))
+		copy_to_kernel_nofault_loop(dst, src, size, u64, Efault);
+	if (!(align & 3))
+		copy_to_kernel_nofault_loop(dst, src, size, u32, Efault);
+	if (!(align & 1))
+		copy_to_kernel_nofault_loop(dst, src, size, u16, Efault);
 	copy_to_kernel_nofault_loop(dst, src, size, u8, Efault);
 	pagefault_enable();
 	return 0;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 2/9] ARM: traps: use get_kernel_nofault instead of set_fs()
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 1/9] mm/maccess: fix unaligned copy_{from,to}_kernel_nofault Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-19  5:27   ` Christoph Hellwig
  2020-09-18 12:46 ` [PATCH v2 3/9] ARM: oabi-compat: add epoll_pwait handler Arnd Bergmann
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, Arnd Bergmann

ARM uses set_fs() and __get_user() to allow the stack dumping code to
access possibly invalid pointers carefully. These can be changed to the
simpler get_kernel_nofault(), and allow the eventual removal of set_fs().

dump_instr() will print either kernel or user space pointers,
depending on how it was called. For dump_mem(), I assume we are only
interested in kernel pointers, and the only time that this is called
with user_mode(regs)==true is when the regs themselves are unreliable
as a result of the condition that caused the trap.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/kernel/traps.c | 47 ++++++++++++++---------------------------
 1 file changed, 16 insertions(+), 31 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 17d5a785df28..c3964a283b63 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -122,17 +122,8 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		     unsigned long top)
 {
 	unsigned long first;
-	mm_segment_t fs;
 	int i;
 
-	/*
-	 * We need to switch to kernel mode so that we can use __get_user
-	 * to safely read from kernel space.  Note that we now dump the
-	 * code first, just in case the backtrace kills us.
-	 */
-	fs = get_fs();
-	set_fs(KERNEL_DS);
-
 	printk("%s%s(0x%08lx to 0x%08lx)\n", lvl, str, bottom, top);
 
 	for (first = bottom & ~31; first < top; first += 32) {
@@ -145,7 +136,7 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		for (p = first, i = 0; i < 8 && p < top; i++, p += 4) {
 			if (p >= bottom && p < top) {
 				unsigned long val;
-				if (__get_user(val, (unsigned long *)p) == 0)
+				if (get_kernel_nofault(val, (unsigned long *)p))
 					sprintf(str + i * 9, " %08lx", val);
 				else
 					sprintf(str + i * 9, " ????????");
@@ -153,11 +144,9 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
 		}
 		printk("%s%04lx:%s\n", lvl, first & 0xffff, str);
 	}
-
-	set_fs(fs);
 }
 
-static void __dump_instr(const char *lvl, struct pt_regs *regs)
+static void dump_instr(const char *lvl, struct pt_regs *regs)
 {
 	unsigned long addr = instruction_pointer(regs);
 	const int thumb = thumb_mode(regs);
@@ -173,10 +162,20 @@ static void __dump_instr(const char *lvl, struct pt_regs *regs)
 	for (i = -4; i < 1 + !!thumb; i++) {
 		unsigned int val, bad;
 
-		if (thumb)
-			bad = get_user(val, &((u16 *)addr)[i]);
-		else
-			bad = get_user(val, &((u32 *)addr)[i]);
+		if (!user_mode(regs)) {
+			if (thumb) {
+				u16 val16;
+				bad = get_kernel_nofault(val16, &((u16 *)addr)[i]);
+				val = val16;
+			} else {
+				bad = get_kernel_nofault(val, &((u32 *)addr)[i]);
+			}
+		} else {
+			if (thumb)
+				bad = get_user(val, &((u16 *)addr)[i]);
+			else
+				bad = get_user(val, &((u32 *)addr)[i]);
+		}
 
 		if (!bad)
 			p += sprintf(p, i == 0 ? "(%0*x) " : "%0*x ",
@@ -189,20 +188,6 @@ static void __dump_instr(const char *lvl, struct pt_regs *regs)
 	printk("%sCode: %s\n", lvl, str);
 }
 
-static void dump_instr(const char *lvl, struct pt_regs *regs)
-{
-	mm_segment_t fs;
-
-	if (!user_mode(regs)) {
-		fs = get_fs();
-		set_fs(KERNEL_DS);
-		__dump_instr(lvl, regs);
-		set_fs(fs);
-	} else {
-		__dump_instr(lvl, regs);
-	}
-}
-
 #ifdef CONFIG_ARM_UNWIND
 static inline void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
 				  const char *loglvl)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 3/9] ARM: oabi-compat: add epoll_pwait handler
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 1/9] mm/maccess: fix unaligned copy_{from,to}_kernel_nofault Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 2/9] ARM: traps: use get_kernel_nofault instead of set_fs() Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-21 12:54   ` Sasha Levin
  2020-09-18 12:46 ` [PATCH v2 4/9] ARM: syscall: always store thread_info->syscall Arnd Bergmann
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm,
	Arnd Bergmann, stable, Christoph Hellwig

The epoll_wait() syscall has a special version for OABI compat
mode to convert the arguments to the EABI structure layout
of the kernel. However, the later epoll_pwait() syscall was
added in arch/arm in linux-2.6.32 without this conversion.

Use the same kind of handler for both.

Fixes: 369842658a36 ("ARM: 5677/1: ARM support for TIF_RESTORE_SIGMASK/pselect6/ppoll/epoll_pwait")
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/kernel/sys_oabi-compat.c | 37 ++++++++++++++++++++++++++++---
 arch/arm/tools/syscall.tbl        |  2 +-
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c
index 0203e545bbc8..a2b1ae01e5bf 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -264,9 +264,8 @@ asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int fd,
 	return do_epoll_ctl(epfd, op, fd, &kernel, false);
 }
 
-asmlinkage long sys_oabi_epoll_wait(int epfd,
-				    struct oabi_epoll_event __user *events,
-				    int maxevents, int timeout)
+static long do_oabi_epoll_wait(int epfd, struct oabi_epoll_event __user *events,
+			       int maxevents, int timeout)
 {
 	struct epoll_event *kbuf;
 	struct oabi_epoll_event e;
@@ -299,6 +298,38 @@ asmlinkage long sys_oabi_epoll_wait(int epfd,
 	return err ? -EFAULT : ret;
 }
 
+SYSCALL_DEFINE4(oabi_epoll_wait, int, epfd,
+		struct oabi_epoll_event __user *, events,
+		int, maxevents, int, timeout)
+{
+	return do_oabi_epoll_wait(epfd, events, maxevents, timeout);
+}
+
+/*
+ * Implement the event wait interface for the eventpoll file. It is the kernel
+ * part of the user space epoll_pwait(2).
+ */
+SYSCALL_DEFINE6(oabi_epoll_pwait, int, epfd,
+		struct oabi_epoll_event __user *, events, int, maxevents,
+		int, timeout, const sigset_t __user *, sigmask,
+		size_t, sigsetsize)
+{
+	int error;
+
+	/*
+	 * If the caller wants a certain signal mask to be set during the wait,
+	 * we apply it here.
+	 */
+	error = set_user_sigmask(sigmask, sigsetsize);
+	if (error)
+		return error;
+
+	error = do_oabi_epoll_wait(epfd, events, maxevents, timeout);
+	restore_saved_sigmask_unless(error == -EINTR);
+
+	return error;
+}
+
 struct oabi_sembuf {
 	unsigned short	sem_num;
 	short		sem_op;
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..39a24bee7df8 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -360,7 +360,7 @@
 343	common	vmsplice		sys_vmsplice
 344	common	move_pages		sys_move_pages
 345	common	getcpu			sys_getcpu
-346	common	epoll_pwait		sys_epoll_pwait
+346	common	epoll_pwait		sys_epoll_pwait		sys_oabi_epoll_pwait
 347	common	kexec_load		sys_kexec_load
 348	common	utimensat		sys_utimensat_time32
 349	common	signalfd		sys_signalfd
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 4/9] ARM: syscall: always store thread_info->syscall
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
                   ` (2 preceding siblings ...)
  2020-09-18 12:46 ` [PATCH v2 3/9] ARM: oabi-compat: add epoll_pwait handler Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 5/9] ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation Arnd Bergmann
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, Arnd Bergmann

The system call number is used in a a couple of places, in particular
ptrace, seccomp and /proc/<pid>/syscall.

The last one apparently never worked reliably on ARM for tasks
that are not currently getting traced.

Storing the syscall number in the normal entry path makes it work,
as well as allowing us to see if the current system call is for
OABI compat mode, which is the next thing I want to hook into.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/include/asm/syscall.h | 5 ++++-
 arch/arm/kernel/asm-offsets.c  | 1 +
 arch/arm/kernel/entry-common.S | 7 +++++--
 arch/arm/kernel/ptrace.c       | 4 ++--
 4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index fd02761ba06c..855aa7cc9b8e 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -22,7 +22,10 @@ extern const unsigned long sys_call_table[];
 static inline int syscall_get_nr(struct task_struct *task,
 				 struct pt_regs *regs)
 {
-	return task_thread_info(task)->syscall;
+	if (!IS_ENABLED(CONFIG_OABI_COMPAT))
+		return task_thread_info(task)->syscall;
+
+	return task_thread_info(task)->syscall & ~__NR_OABI_SYSCALL_BASE;
 }
 
 static inline void syscall_rollback(struct task_struct *task,
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index a1570c8bab25..97af6735172b 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -46,6 +46,7 @@ int main(void)
   DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
   DEFINE(TI_CPU_DOMAIN,		offsetof(struct thread_info, cpu_domain));
   DEFINE(TI_CPU_SAVE,		offsetof(struct thread_info, cpu_context));
+  DEFINE(TI_SYSCALL,		offsetof(struct thread_info, syscall));
   DEFINE(TI_USED_CP,		offsetof(struct thread_info, used_cp));
   DEFINE(TI_TP_VALUE,		offsetof(struct thread_info, tp_value));
   DEFINE(TI_FPSTATE,		offsetof(struct thread_info, fpstate));
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 271cb8a1eba1..2ea3a1989fed 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -223,6 +223,7 @@ ENTRY(vector_swi)
 	/* saved_psr and saved_pc are now dead */
 
 	uaccess_disable tbl
+	get_thread_info tsk
 
 	adr	tbl, sys_call_table		@ load syscall table pointer
 
@@ -234,13 +235,16 @@ ENTRY(vector_swi)
 	 * get the old ABI syscall table address.
 	 */
 	bics	r10, r10, #0xff000000
+	str	r10, [tsk, #TI_SYSCALL]
 	eorne	scno, r10, #__NR_OABI_SYSCALL_BASE
 	ldrne	tbl, =sys_oabi_call_table
 #elif !defined(CONFIG_AEABI)
 	bic	scno, scno, #0xff000000		@ mask off SWI op-code
+	str	scno, [tsk, #TI_SYSCALL]
 	eor	scno, scno, #__NR_SYSCALL_BASE	@ check OS number
+#else
+	str	scno, [tsk, #TI_SYSCALL]
 #endif
-	get_thread_info tsk
 	/*
 	 * Reload the registers that may have been corrupted on entry to
 	 * the syscall assembly (by tracing or context tracking.)
@@ -285,7 +289,6 @@ ENDPROC(vector_swi)
 	 * context switches, and waiting for our parent to respond.
 	 */
 __sys_trace:
-	mov	r1, scno
 	add	r0, sp, #S_OFF
 	bl	syscall_trace_enter
 	mov	scno, r0
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 2771e682220b..252060663b00 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -885,9 +885,9 @@ static void tracehook_report_syscall(struct pt_regs *regs,
 	regs->ARM_ip = ip;
 }
 
-asmlinkage int syscall_trace_enter(struct pt_regs *regs, int scno)
+asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
-	current_thread_info()->syscall = scno;
+	int scno;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE))
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 5/9] ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
                   ` (3 preceding siblings ...)
  2020-09-18 12:46 ` [PATCH v2 4/9] ARM: syscall: always store thread_info->syscall Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-19  5:32   ` Christoph Hellwig
  2020-09-18 12:46 ` [PATCH v2 6/9] ARM: oabi-compat: rework sys_semtimedop emulation Arnd Bergmann
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, Arnd Bergmann

The epoll_wait() system call wrapper is one of the remaining users of
the set_fs() infrasturcture for Arm. Changing it to not require set_fs()
is rather complex unfortunately.

The approach I'm taking here is to allow architectures to override
the code that copies the output to user space, and let the oabi-compat
implementation check whether it is getting called from an EABI or OABI
system call based on the thread_info->syscall value.

The in_oabi_syscall() check here mirrors the in_compat_syscall() and
in_x32_syscall() helpers for 32-bit compat implementations on other
architectures.

Overall, the amount of code goes down, at least with the newly added
sys_oabi_epoll_pwait() helper getting removed again. The downside
is added complexity in the source code for the native implementation.
There should be no difference in runtime performance except for Arm
kernels with CONFIG_OABI_COMPAT enabled that now have to go through
an external function call to check which of the two variants to use.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/include/asm/syscall.h    | 11 +++++
 arch/arm/kernel/sys_oabi-compat.c | 75 +++++++------------------------
 arch/arm/tools/syscall.tbl        |  4 +-
 fs/eventpoll.c                    |  5 +--
 include/linux/eventpoll.h         | 18 ++++++++
 5 files changed, 49 insertions(+), 64 deletions(-)

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index 855aa7cc9b8e..156880943c16 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -28,6 +28,17 @@ static inline int syscall_get_nr(struct task_struct *task,
 	return task_thread_info(task)->syscall & ~__NR_OABI_SYSCALL_BASE;
 }
 
+static inline bool __in_oabi_syscall(struct task_struct *task)
+{
+	return IS_ENABLED(CONFIG_OABI_COMPAT) &&
+		(task_thread_info(task)->syscall & __NR_OABI_SYSCALL_BASE);
+}
+
+static inline bool in_oabi_syscall(void)
+{
+	return __in_oabi_syscall(current);
+}
+
 static inline void syscall_rollback(struct task_struct *task,
 				    struct pt_regs *regs)
 {
diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c
index a2b1ae01e5bf..f9d8e5be6ba0 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -83,6 +83,8 @@
 #include <linux/uaccess.h>
 #include <linux/slab.h>
 
+#include <asm/syscall.h>
+
 struct oldabi_stat64 {
 	unsigned long long st_dev;
 	unsigned int	__pad1;
@@ -264,70 +266,25 @@ asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int fd,
 	return do_epoll_ctl(epfd, op, fd, &kernel, false);
 }
 
-static long do_oabi_epoll_wait(int epfd, struct oabi_epoll_event __user *events,
-			       int maxevents, int timeout)
+struct epoll_event __user *
+epoll_put_uevent(__poll_t revents, __u64 data,
+		 struct epoll_event __user *uevent)
 {
-	struct epoll_event *kbuf;
-	struct oabi_epoll_event e;
-	mm_segment_t fs;
-	long ret, err, i;
+	if (in_oabi_syscall()) {
+		struct oabi_epoll_event __user *oevent = (void __user *)uevent;
 
-	if (maxevents <= 0 ||
-			maxevents > (INT_MAX/sizeof(*kbuf)) ||
-			maxevents > (INT_MAX/sizeof(*events)))
-		return -EINVAL;
-	if (!access_ok(events, sizeof(*events) * maxevents))
-		return -EFAULT;
-	kbuf = kmalloc_array(maxevents, sizeof(*kbuf), GFP_KERNEL);
-	if (!kbuf)
-		return -ENOMEM;
-	fs = get_fs();
-	set_fs(KERNEL_DS);
-	ret = sys_epoll_wait(epfd, kbuf, maxevents, timeout);
-	set_fs(fs);
-	err = 0;
-	for (i = 0; i < ret; i++) {
-		e.events = kbuf[i].events;
-		e.data = kbuf[i].data;
-		err = __copy_to_user(events, &e, sizeof(e));
-		if (err)
-			break;
-		events++;
-	}
-	kfree(kbuf);
-	return err ? -EFAULT : ret;
-}
+		if (__put_user(revents, &oevent->events) ||
+		    __put_user(data, &oevent->data))
+			return NULL;
 
-SYSCALL_DEFINE4(oabi_epoll_wait, int, epfd,
-		struct oabi_epoll_event __user *, events,
-		int, maxevents, int, timeout)
-{
-	return do_oabi_epoll_wait(epfd, events, maxevents, timeout);
-}
-
-/*
- * Implement the event wait interface for the eventpoll file. It is the kernel
- * part of the user space epoll_pwait(2).
- */
-SYSCALL_DEFINE6(oabi_epoll_pwait, int, epfd,
-		struct oabi_epoll_event __user *, events, int, maxevents,
-		int, timeout, const sigset_t __user *, sigmask,
-		size_t, sigsetsize)
-{
-	int error;
-
-	/*
-	 * If the caller wants a certain signal mask to be set during the wait,
-	 * we apply it here.
-	 */
-	error = set_user_sigmask(sigmask, sigsetsize);
-	if (error)
-		return error;
+		return (void __user *)(oevent+1);
+	}
 
-	error = do_oabi_epoll_wait(epfd, events, maxevents, timeout);
-	restore_saved_sigmask_unless(error == -EINTR);
+	if (__put_user(revents, &uevent->events) ||
+	    __put_user(data, &uevent->data))
+		return NULL;
 
-	return error;
+	return uevent+1;
 }
 
 struct oabi_sembuf {
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 39a24bee7df8..fe5cd48fed91 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -266,7 +266,7 @@
 249	common	lookup_dcookie		sys_lookup_dcookie
 250	common	epoll_create		sys_epoll_create
 251	common	epoll_ctl		sys_epoll_ctl		sys_oabi_epoll_ctl
-252	common	epoll_wait		sys_epoll_wait		sys_oabi_epoll_wait
+252	common	epoll_wait		sys_epoll_wait
 253	common	remap_file_pages	sys_remap_file_pages
 # 254 for set_thread_area
 # 255 for get_thread_area
@@ -360,7 +360,7 @@
 343	common	vmsplice		sys_vmsplice
 344	common	move_pages		sys_move_pages
 345	common	getcpu			sys_getcpu
-346	common	epoll_pwait		sys_epoll_pwait		sys_oabi_epoll_pwait
+346	common	epoll_pwait		sys_epoll_pwait
 347	common	kexec_load		sys_kexec_load
 348	common	utimensat		sys_utimensat_time32
 349	common	signalfd		sys_signalfd
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 12eebcdea9c8..796d9e72dc96 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1745,8 +1745,8 @@ static __poll_t ep_send_events_proc(struct eventpoll *ep, struct list_head *head
 		if (!revents)
 			continue;
 
-		if (__put_user(revents, &uevent->events) ||
-		    __put_user(epi->event.data, &uevent->data)) {
+		uevent = epoll_put_uevent(revents, epi->event.data, uevent);
+		if (!uevent) {
 			list_add(&epi->rdllink, head);
 			ep_pm_stay_awake(epi);
 			if (!esed->res)
@@ -1754,7 +1754,6 @@ static __poll_t ep_send_events_proc(struct eventpoll *ep, struct list_head *head
 			return 0;
 		}
 		esed->res++;
-		uevent++;
 		if (epi->event.events & EPOLLONESHOT)
 			epi->event.events &= EP_PRIVATE_BITS;
 		else if (!(epi->event.events & EPOLLET)) {
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 8f000fada5a4..315d28a1cf1b 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -77,4 +77,22 @@ static inline void eventpoll_release(struct file *file) {}
 
 #endif
 
+#if !defined(CONFIG_ARM) || !defined(CONFIG_OABI_COMPAT)
+/* ARM OABI has an incompatible struct layout and needs a special handler */
+static inline struct epoll_event __user *
+epoll_put_uevent(__poll_t revents, __u64 data,
+		 struct epoll_event __user *uevent)
+{
+	if (__put_user(revents, &uevent->events) ||
+	    __put_user(data, &uevent->data))
+		return NULL;
+
+	return uevent+1;
+}
+#else
+struct epoll_event __user *
+epoll_put_uevent(__poll_t revents, __u64 data,
+		 struct epoll_event __user *uevent);
+#endif
+
 #endif /* #ifndef _LINUX_EVENTPOLL_H */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 6/9] ARM: oabi-compat: rework sys_semtimedop emulation
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
                   ` (4 preceding siblings ...)
  2020-09-18 12:46 ` [PATCH v2 5/9] ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 7/9] ARM: oabi-compat: rework fcntl64() emulation Arnd Bergmann
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, Arnd Bergmann

sys_oabi_semtimedop() is one of the last users of set_fs() on Arm. To
remove this one, expose the internal code of the actual implementation
that operates on a kernel pointer and call it directly after copying.

There should be no measurable impact on the normal execution of this
function, and it makes the overly long function a little shorter, which
may help readability.

While reworking the oabi version, make it behave a little more like
the native one, using kvmalloc_array() and restructure the code
flow in a similar way.

The naming of __do_semtimedop() is not very good, I hope someone can
come up with a better name.

One regression was spotted by kernel test robot <rong.a.chen@intel.com>
and fixed before the first mailing list submission.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/kernel/sys_oabi-compat.c | 38 ++++++++------
 include/linux/syscalls.h          |  3 ++
 ipc/sem.c                         | 84 +++++++++++++++++++------------
 3 files changed, 77 insertions(+), 48 deletions(-)

diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c
index f9d8e5be6ba0..c3e63b73b6ae 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -80,6 +80,7 @@
 #include <linux/socket.h>
 #include <linux/net.h>
 #include <linux/ipc.h>
+#include <linux/ipc_namespace.h>
 #include <linux/uaccess.h>
 #include <linux/slab.h>
 
@@ -294,46 +295,51 @@ struct oabi_sembuf {
 	unsigned short	__pad;
 };
 
+#define sc_semopm     sem_ctls[2]
+
 asmlinkage long sys_oabi_semtimedop(int semid,
 				    struct oabi_sembuf __user *tsops,
 				    unsigned nsops,
 				    const struct old_timespec32 __user *timeout)
 {
+	struct ipc_namespace *ns;
 	struct sembuf *sops;
-	struct old_timespec32 local_timeout;
 	long err;
 	int i;
 
+	ns = current->nsproxy->ipc_ns;
+	if (nsops > ns->sc_semopm)
+		return -E2BIG;
 	if (nsops < 1 || nsops > SEMOPM)
 		return -EINVAL;
-	if (!access_ok(tsops, sizeof(*tsops) * nsops))
-		return -EFAULT;
-	sops = kmalloc_array(nsops, sizeof(*sops), GFP_KERNEL);
+	sops = kvmalloc_array(nsops, sizeof(*sops), GFP_KERNEL);
 	if (!sops)
 		return -ENOMEM;
 	err = 0;
 	for (i = 0; i < nsops; i++) {
 		struct oabi_sembuf osb;
-		err |= __copy_from_user(&osb, tsops, sizeof(osb));
+		err |= copy_from_user(&osb, tsops, sizeof(osb));
 		sops[i].sem_num = osb.sem_num;
 		sops[i].sem_op = osb.sem_op;
 		sops[i].sem_flg = osb.sem_flg;
 		tsops++;
 	}
-	if (timeout) {
-		/* copy this as well before changing domain protection */
-		err |= copy_from_user(&local_timeout, timeout, sizeof(*timeout));
-		timeout = &local_timeout;
-	}
 	if (err) {
 		err = -EFAULT;
-	} else {
-		mm_segment_t fs = get_fs();
-		set_fs(KERNEL_DS);
-		err = sys_semtimedop_time32(semid, sops, nsops, timeout);
-		set_fs(fs);
+		goto out;
+	}
+
+	if (timeout) {
+		struct timespec64 ts;
+		err = get_old_timespec32(&ts, timeout);
+		if (err)
+			goto out;
+		err = __do_semtimedop(semid, sops, nsops, &ts, ns);
+		goto out;
 	}
-	kfree(sops);
+	err = __do_semtimedop(semid, sops, nsops, NULL, ns);
+out:
+	kvfree(sops);
 	return err;
 }
 
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 75ac7f8ae93c..bb5ed3a712a5 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1340,6 +1340,9 @@ long ksys_old_shmctl(int shmid, int cmd, struct shmid_ds __user *buf);
 long compat_ksys_semtimedop(int semid, struct sembuf __user *tsems,
 			    unsigned int nsops,
 			    const struct old_timespec32 __user *timeout);
+long __do_semtimedop(int semid, struct sembuf *tsems, unsigned int nsops,
+		     const struct timespec64 *timeout,
+		     struct ipc_namespace *ns);
 
 int __sys_getsockopt(int fd, int level, int optname, char __user *optval,
 		int __user *optlen);
diff --git a/ipc/sem.c b/ipc/sem.c
index 8c0244e0365e..515a39a67534 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1978,46 +1978,34 @@ static struct sem_undo *find_alloc_undo(struct ipc_namespace *ns, int semid)
 	return un;
 }
 
-static long do_semtimedop(int semid, struct sembuf __user *tsops,
-		unsigned nsops, const struct timespec64 *timeout)
+long __do_semtimedop(int semid, struct sembuf *sops,
+		unsigned nsops, const struct timespec64 *timeout,
+		struct ipc_namespace *ns)
 {
 	int error = -EINVAL;
 	struct sem_array *sma;
-	struct sembuf fast_sops[SEMOPM_FAST];
-	struct sembuf *sops = fast_sops, *sop;
+	struct sembuf *sop;
 	struct sem_undo *un;
 	int max, locknum;
 	bool undos = false, alter = false, dupsop = false;
 	struct sem_queue queue;
 	unsigned long dup = 0, jiffies_left = 0;
-	struct ipc_namespace *ns;
-
-	ns = current->nsproxy->ipc_ns;
 
 	if (nsops < 1 || semid < 0)
 		return -EINVAL;
 	if (nsops > ns->sc_semopm)
 		return -E2BIG;
-	if (nsops > SEMOPM_FAST) {
-		sops = kvmalloc_array(nsops, sizeof(*sops), GFP_KERNEL);
-		if (sops == NULL)
-			return -ENOMEM;
-	}
-
-	if (copy_from_user(sops, tsops, nsops * sizeof(*tsops))) {
-		error =  -EFAULT;
-		goto out_free;
-	}
 
 	if (timeout) {
 		if (timeout->tv_sec < 0 || timeout->tv_nsec < 0 ||
 			timeout->tv_nsec >= 1000000000L) {
 			error = -EINVAL;
-			goto out_free;
+			goto out;
 		}
 		jiffies_left = timespec64_to_jiffies(timeout);
 	}
 
+
 	max = 0;
 	for (sop = sops; sop < sops + nsops; sop++) {
 		unsigned long mask = 1ULL << ((sop->sem_num) % BITS_PER_LONG);
@@ -2046,7 +2034,7 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 		un = find_alloc_undo(ns, semid);
 		if (IS_ERR(un)) {
 			error = PTR_ERR(un);
-			goto out_free;
+			goto out;
 		}
 	} else {
 		un = NULL;
@@ -2057,25 +2045,25 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 	if (IS_ERR(sma)) {
 		rcu_read_unlock();
 		error = PTR_ERR(sma);
-		goto out_free;
+		goto out;
 	}
 
 	error = -EFBIG;
 	if (max >= sma->sem_nsems) {
 		rcu_read_unlock();
-		goto out_free;
+		goto out;
 	}
 
 	error = -EACCES;
 	if (ipcperms(ns, &sma->sem_perm, alter ? S_IWUGO : S_IRUGO)) {
 		rcu_read_unlock();
-		goto out_free;
+		goto out;
 	}
 
 	error = security_sem_semop(&sma->sem_perm, sops, nsops, alter);
 	if (error) {
 		rcu_read_unlock();
-		goto out_free;
+		goto out;
 	}
 
 	error = -EIDRM;
@@ -2089,7 +2077,7 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 	 * entangled here and why it's RMID race safe on comments at sem_lock()
 	 */
 	if (!ipc_valid_object(&sma->sem_perm))
-		goto out_unlock_free;
+		goto out_unlock;
 	/*
 	 * semid identifiers are not unique - find_alloc_undo may have
 	 * allocated an undo structure, it was invalidated by an RMID
@@ -2098,7 +2086,7 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 	 * "un" itself is guaranteed by rcu.
 	 */
 	if (un && un->semid == -1)
-		goto out_unlock_free;
+		goto out_unlock;
 
 	queue.sops = sops;
 	queue.nsops = nsops;
@@ -2124,10 +2112,10 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 		rcu_read_unlock();
 		wake_up_q(&wake_q);
 
-		goto out_free;
+		goto out;
 	}
 	if (error < 0) /* non-blocking error path */
-		goto out_unlock_free;
+		goto out_unlock;
 
 	/*
 	 * We need to sleep on this operation, so we put the current
@@ -2192,14 +2180,14 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 		if (error != -EINTR) {
 			/* see SEM_BARRIER_2 for purpose/pairing */
 			smp_acquire__after_ctrl_dep();
-			goto out_free;
+			goto out;
 		}
 
 		rcu_read_lock();
 		locknum = sem_lock(sma, sops, nsops);
 
 		if (!ipc_valid_object(&sma->sem_perm))
-			goto out_unlock_free;
+			goto out_unlock;
 
 		/*
 		 * No necessity for any barrier: We are protect by sem_lock()
@@ -2211,7 +2199,7 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 		 * Leave without unlink_queue(), but with sem_unlock().
 		 */
 		if (error != -EINTR)
-			goto out_unlock_free;
+			goto out_unlock;
 
 		/*
 		 * If an interrupt occurred we have to clean up the queue.
@@ -2222,13 +2210,45 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 
 	unlink_queue(sma, &queue);
 
-out_unlock_free:
+out_unlock:
 	sem_unlock(sma, locknum);
 	rcu_read_unlock();
+out:
+	return error;
+}
+
+static long do_semtimedop(int semid, struct sembuf __user *tsops,
+		unsigned nsops, const struct timespec64 *timeout)
+{
+	struct sembuf fast_sops[SEMOPM_FAST];
+	struct sembuf *sops = fast_sops;
+	struct ipc_namespace *ns;
+	int ret;
+
+	ns = current->nsproxy->ipc_ns;
+	if (nsops > ns->sc_semopm)
+		return -E2BIG;
+	if (nsops < 1)
+		return -EINVAL;
+
+	if (nsops > SEMOPM_FAST) {
+		sops = kvmalloc_array(nsops, sizeof(*sops), GFP_KERNEL);
+		if (sops == NULL)
+			return -ENOMEM;
+	}
+
+	if (copy_from_user(sops, tsops, nsops * sizeof(*tsops))) {
+		ret =  -EFAULT;
+		goto out_free;
+	}
+
+	ret = __do_semtimedop(semid, sops, nsops, timeout, ns);
+
 out_free:
 	if (sops != fast_sops)
 		kvfree(sops);
-	return error;
+
+	return ret;
 }
 
 long ksys_semtimedop(int semid, struct sembuf __user *tsops,
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 7/9] ARM: oabi-compat: rework fcntl64() emulation
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
                   ` (5 preceding siblings ...)
  2020-09-18 12:46 ` [PATCH v2 6/9] ARM: oabi-compat: rework sys_semtimedop emulation Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 8/9] ARM: uaccess: add __{get,put}_kernel_nofault Arnd Bergmann
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, Arnd Bergmann

This is one of the last users of get_fs(), and this is fairly easy to
change, since the infrastructure for it is already there.

The replacement here is essentially a copy of the existing fcntl64()
syscall entry function.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/kernel/sys_oabi-compat.c | 93 ++++++++++++++++++++-----------
 1 file changed, 60 insertions(+), 33 deletions(-)

diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c
index c3e63b73b6ae..3449e163ea88 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -194,56 +194,83 @@ struct oabi_flock64 {
 	pid_t	l_pid;
 } __attribute__ ((packed,aligned(4)));
 
-static long do_locks(unsigned int fd, unsigned int cmd,
-				 unsigned long arg)
+static int get_oabi_flock(struct flock64 *kernel, struct oabi_flock64 __user *arg)
 {
-	struct flock64 kernel;
 	struct oabi_flock64 user;
-	mm_segment_t fs;
-	long ret;
 
 	if (copy_from_user(&user, (struct oabi_flock64 __user *)arg,
 			   sizeof(user)))
 		return -EFAULT;
-	kernel.l_type	= user.l_type;
-	kernel.l_whence	= user.l_whence;
-	kernel.l_start	= user.l_start;
-	kernel.l_len	= user.l_len;
-	kernel.l_pid	= user.l_pid;
-
-	fs = get_fs();
-	set_fs(KERNEL_DS);
-	ret = sys_fcntl64(fd, cmd, (unsigned long)&kernel);
-	set_fs(fs);
-
-	if (!ret && (cmd == F_GETLK64 || cmd == F_OFD_GETLK)) {
-		user.l_type	= kernel.l_type;
-		user.l_whence	= kernel.l_whence;
-		user.l_start	= kernel.l_start;
-		user.l_len	= kernel.l_len;
-		user.l_pid	= kernel.l_pid;
-		if (copy_to_user((struct oabi_flock64 __user *)arg,
-				 &user, sizeof(user)))
-			ret = -EFAULT;
-	}
-	return ret;
+
+	kernel->l_type	 = user.l_type;
+	kernel->l_whence = user.l_whence;
+	kernel->l_start	 = user.l_start;
+	kernel->l_len	 = user.l_len;
+	kernel->l_pid	 = user.l_pid;
+
+	return 0;
+}
+
+static int put_oabi_flock(struct flock64 *kernel, struct oabi_flock64 __user *arg)
+{
+	struct oabi_flock64 user;
+
+	user.l_type	= kernel->l_type;
+	user.l_whence	= kernel->l_whence;
+	user.l_start	= kernel->l_start;
+	user.l_len	= kernel->l_len;
+	user.l_pid	= kernel->l_pid;
+
+	if (copy_to_user((struct oabi_flock64 __user *)arg,
+			 &user, sizeof(user)))
+		return -EFAULT;
+
+	return 0;
 }
 
 asmlinkage long sys_oabi_fcntl64(unsigned int fd, unsigned int cmd,
 				 unsigned long arg)
 {
+	void __user *argp = (void __user *)arg;
+	struct fd f = fdget_raw(fd);
+	struct flock64 flock;
+	long err = -EBADF;
+
+	if (!f.file)
+		goto out;
+
 	switch (cmd) {
-	case F_OFD_GETLK:
-	case F_OFD_SETLK:
-	case F_OFD_SETLKW:
 	case F_GETLK64:
+	case F_OFD_GETLK:
+		err = security_file_fcntl(f.file, cmd, arg);
+		if (err)
+			break;
+		err = get_oabi_flock(&flock, argp);
+		if (err)
+			break;
+		err = fcntl_getlk64(f.file, cmd, &flock);
+		if (!err)
+		       err = put_oabi_flock(&flock, argp);
+		break;
 	case F_SETLK64:
 	case F_SETLKW64:
-		return do_locks(fd, cmd, arg);
-
+	case F_OFD_SETLK:
+	case F_OFD_SETLKW:
+		err = security_file_fcntl(f.file, cmd, arg);
+		if (err)
+			break;
+		err = get_oabi_flock(&flock, argp);
+		if (err)
+			break;
+		err = fcntl_setlk64(fd, f.file, cmd, &flock);
+		break;
 	default:
-		return sys_fcntl64(fd, cmd, arg);
+		err = sys_fcntl64(fd, cmd, arg);
+		break;
 	}
+	fdput(f);
+out:
+	return err;
 }
 
 struct oabi_epoll_event {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 8/9] ARM: uaccess: add __{get,put}_kernel_nofault
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
                   ` (6 preceding siblings ...)
  2020-09-18 12:46 ` [PATCH v2 7/9] ARM: oabi-compat: rework fcntl64() emulation Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-18 12:46 ` [PATCH v2 9/9] ARM: uaccess: remove set_fs() implementation Arnd Bergmann
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, Arnd Bergmann

These mimic the behavior of get_user and put_user, except
for domain switching, address limit checking and handling
of mismatched sizes, none of which are relevant here.

To work with pre-Armv6 kernels, this has to avoid TUSER()
inside of the new macros, the new approach passes the "t"
string along with the opcode, which is a bit uglier but
avoids duplicating more code.

As there is no __get_user_asm_dword(), I work around it
by copying 32 bit at a time, which is possible because
the output size is known.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/include/asm/uaccess.h | 123 ++++++++++++++++++++++-----------
 1 file changed, 83 insertions(+), 40 deletions(-)

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index a13d90206472..4f60638755c4 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -308,11 +308,11 @@ static inline void set_fs(mm_segment_t fs)
 #define __get_user(x, ptr)						\
 ({									\
 	long __gu_err = 0;						\
-	__get_user_err((x), (ptr), __gu_err);				\
+	__get_user_err((x), (ptr), __gu_err, TUSER());			\
 	__gu_err;							\
 })
 
-#define __get_user_err(x, ptr, err)					\
+#define __get_user_err(x, ptr, err, __t)				\
 do {									\
 	unsigned long __gu_addr = (unsigned long)(ptr);			\
 	unsigned long __gu_val;						\
@@ -321,18 +321,19 @@ do {									\
 	might_fault();							\
 	__ua_flags = uaccess_save_and_enable();				\
 	switch (sizeof(*(ptr))) {					\
-	case 1:	__get_user_asm_byte(__gu_val, __gu_addr, err);	break;	\
-	case 2:	__get_user_asm_half(__gu_val, __gu_addr, err);	break;	\
-	case 4:	__get_user_asm_word(__gu_val, __gu_addr, err);	break;	\
+	case 1:	__get_user_asm_byte(__gu_val, __gu_addr, err, __t); break;	\
+	case 2:	__get_user_asm_half(__gu_val, __gu_addr, err, __t); break;	\
+	case 4:	__get_user_asm_word(__gu_val, __gu_addr, err, __t); break;	\
 	default: (__gu_val) = __get_user_bad();				\
 	}								\
 	uaccess_restore(__ua_flags);					\
 	(x) = (__typeof__(*(ptr)))__gu_val;				\
 } while (0)
+#endif
 
 #define __get_user_asm(x, addr, err, instr)			\
 	__asm__ __volatile__(					\
-	"1:	" TUSER(instr) " %1, [%2], #0\n"		\
+	"1:	" instr " %1, [%2], #0\n"			\
 	"2:\n"							\
 	"	.pushsection .text.fixup,\"ax\"\n"		\
 	"	.align	2\n"					\
@@ -348,40 +349,38 @@ do {									\
 	: "r" (addr), "i" (-EFAULT)				\
 	: "cc")
 
-#define __get_user_asm_byte(x, addr, err)			\
-	__get_user_asm(x, addr, err, ldrb)
+#define __get_user_asm_byte(x, addr, err, __t)			\
+	__get_user_asm(x, addr, err, "ldrb" __t)
 
 #if __LINUX_ARM_ARCH__ >= 6
 
-#define __get_user_asm_half(x, addr, err)			\
-	__get_user_asm(x, addr, err, ldrh)
+#define __get_user_asm_half(x, addr, err, __t)			\
+	__get_user_asm(x, addr, err, "ldrh" __t)
 
 #else
 
 #ifndef __ARMEB__
-#define __get_user_asm_half(x, __gu_addr, err)			\
+#define __get_user_asm_half(x, __gu_addr, err, __t)		\
 ({								\
 	unsigned long __b1, __b2;				\
-	__get_user_asm_byte(__b1, __gu_addr, err);		\
-	__get_user_asm_byte(__b2, __gu_addr + 1, err);		\
+	__get_user_asm_byte(__b1, __gu_addr, err, __t);		\
+	__get_user_asm_byte(__b2, __gu_addr + 1, err, __t);	\
 	(x) = __b1 | (__b2 << 8);				\
 })
 #else
-#define __get_user_asm_half(x, __gu_addr, err)			\
+#define __get_user_asm_half(x, __gu_addr, err, __t)		\
 ({								\
 	unsigned long __b1, __b2;				\
-	__get_user_asm_byte(__b1, __gu_addr, err);		\
-	__get_user_asm_byte(__b2, __gu_addr + 1, err);		\
+	__get_user_asm_byte(__b1, __gu_addr, err, __t);		\
+	__get_user_asm_byte(__b2, __gu_addr + 1, err, __t);	\
 	(x) = (__b1 << 8) | __b2;				\
 })
 #endif
 
 #endif /* __LINUX_ARM_ARCH__ >= 6 */
 
-#define __get_user_asm_word(x, addr, err)			\
-	__get_user_asm(x, addr, err, ldr)
-#endif
-
+#define __get_user_asm_word(x, addr, err, __t)			\
+	__get_user_asm(x, addr, err, "ldr" __t)
 
 #define __put_user_switch(x, ptr, __err, __fn)				\
 	do {								\
@@ -425,7 +424,7 @@ do {									\
 #define __put_user_nocheck(x, __pu_ptr, __err, __size)			\
 	do {								\
 		unsigned long __pu_addr = (unsigned long)__pu_ptr;	\
-		__put_user_nocheck_##__size(x, __pu_addr, __err);	\
+		__put_user_nocheck_##__size(x, __pu_addr, __err, TUSER());\
 	} while (0)
 
 #define __put_user_nocheck_1 __put_user_asm_byte
@@ -433,9 +432,11 @@ do {									\
 #define __put_user_nocheck_4 __put_user_asm_word
 #define __put_user_nocheck_8 __put_user_asm_dword
 
+#endif /* !CONFIG_CPU_SPECTRE */
+
 #define __put_user_asm(x, __pu_addr, err, instr)		\
 	__asm__ __volatile__(					\
-	"1:	" TUSER(instr) " %1, [%2], #0\n"		\
+	"1:	" instr " %1, [%2], #0\n"		\
 	"2:\n"							\
 	"	.pushsection .text.fixup,\"ax\"\n"		\
 	"	.align	2\n"					\
@@ -450,36 +451,36 @@ do {									\
 	: "r" (x), "r" (__pu_addr), "i" (-EFAULT)		\
 	: "cc")
 
-#define __put_user_asm_byte(x, __pu_addr, err)			\
-	__put_user_asm(x, __pu_addr, err, strb)
+#define __put_user_asm_byte(x, __pu_addr, err, __t)		\
+	__put_user_asm(x, __pu_addr, err, "strb" __t)
 
 #if __LINUX_ARM_ARCH__ >= 6
 
-#define __put_user_asm_half(x, __pu_addr, err)			\
-	__put_user_asm(x, __pu_addr, err, strh)
+#define __put_user_asm_half(x, __pu_addr, err, __t)		\
+	__put_user_asm(x, __pu_addr, err, "strh" __t)
 
 #else
 
 #ifndef __ARMEB__
-#define __put_user_asm_half(x, __pu_addr, err)			\
+#define __put_user_asm_half(x, __pu_addr, err, __t)		\
 ({								\
 	unsigned long __temp = (__force unsigned long)(x);	\
-	__put_user_asm_byte(__temp, __pu_addr, err);		\
-	__put_user_asm_byte(__temp >> 8, __pu_addr + 1, err);	\
+	__put_user_asm_byte(__temp, __pu_addr, err, __t);	\
+	__put_user_asm_byte(__temp >> 8, __pu_addr + 1, err, __t);\
 })
 #else
-#define __put_user_asm_half(x, __pu_addr, err)			\
+#define __put_user_asm_half(x, __pu_addr, err, __t)		\
 ({								\
 	unsigned long __temp = (__force unsigned long)(x);	\
-	__put_user_asm_byte(__temp >> 8, __pu_addr, err);	\
-	__put_user_asm_byte(__temp, __pu_addr + 1, err);	\
+	__put_user_asm_byte(__temp >> 8, __pu_addr, err, __t);	\
+	__put_user_asm_byte(__temp, __pu_addr + 1, err, __t);	\
 })
 #endif
 
 #endif /* __LINUX_ARM_ARCH__ >= 6 */
 
-#define __put_user_asm_word(x, __pu_addr, err)			\
-	__put_user_asm(x, __pu_addr, err, str)
+#define __put_user_asm_word(x, __pu_addr, err, __t)		\
+	__put_user_asm(x, __pu_addr, err, "str" __t)
 
 #ifndef __ARMEB__
 #define	__reg_oper0	"%R2"
@@ -489,12 +490,12 @@ do {									\
 #define	__reg_oper1	"%R2"
 #endif
 
-#define __put_user_asm_dword(x, __pu_addr, err)			\
+#define __put_user_asm_dword(x, __pu_addr, err, __t)		\
 	__asm__ __volatile__(					\
- ARM(	"1:	" TUSER(str) "	" __reg_oper1 ", [%1], #4\n"	) \
- ARM(	"2:	" TUSER(str) "	" __reg_oper0 ", [%1]\n"	) \
- THUMB(	"1:	" TUSER(str) "	" __reg_oper1 ", [%1]\n"	) \
- THUMB(	"2:	" TUSER(str) "	" __reg_oper0 ", [%1, #4]\n"	) \
+ ARM(	"1:	str" __t "	" __reg_oper1 ", [%1], #4\n"  ) \
+ ARM(	"2:	str" __t "	" __reg_oper0 ", [%1]\n"      ) \
+ THUMB(	"1:	str" __t "	" __reg_oper1 ", [%1]\n"      ) \
+ THUMB(	"2:	str" __t "	" __reg_oper0 ", [%1, #4]\n"  ) \
 	"3:\n"							\
 	"	.pushsection .text.fixup,\"ax\"\n"		\
 	"	.align	2\n"					\
@@ -510,7 +511,49 @@ do {									\
 	: "r" (x), "i" (-EFAULT)				\
 	: "cc")
 
-#endif /* !CONFIG_CPU_SPECTRE */
+#define HAVE_GET_KERNEL_NOFAULT
+
+#define __get_kernel_nofault(dst, src, type, err_label)			\
+do {									\
+	const type *__pk_ptr = (src);					\
+	unsigned long __src = (unsigned long)(__pk_ptr);		\
+	type __val;							\
+	int __err = 0;							\
+	switch (sizeof(type)) {						\
+	case 1:	__get_user_asm_byte(__val, __src, __err, ""); break;	\
+	case 2: __get_user_asm_half(__val, __src, __err, ""); break;	\
+	case 4: __get_user_asm_word(__val, __src, __err, ""); break;	\
+	case 8: {							\
+		u32 *__v32 = (u32*)&__val;				\
+		__get_user_asm_word(__v32[0], __src, __err, "");	\
+		if (__err)						\
+			break;						\
+		__get_user_asm_word(__v32[1], __src+4, __err, "");	\
+		break;							\
+	}								\
+	default: __err = __get_user_bad(); break;			\
+	}								\
+	*(type *)(dst) = __val;						\
+	if (__err)							\
+		goto err_label;						\
+} while (0)
+
+#define __put_kernel_nofault(dst, src, type, err_label)			\
+do {									\
+	const type *__pk_ptr = (dst);					\
+	unsigned long __dst = (unsigned long)__pk_ptr;			\
+	int __err = 0;							\
+	type __val = *(type *)src;					\
+	switch (sizeof(type)) {						\
+	case 1: __put_user_asm_byte(__val, __dst, __err, ""); break;	\
+	case 2:	__put_user_asm_half(__val, __dst, __err, ""); break;	\
+	case 4:	__put_user_asm_word(__val, __dst, __err, ""); break;	\
+	case 8:	__put_user_asm_dword(__val, __dst, __err, ""); break;	\
+	default: __err = __put_user_bad(); break;			\
+	}								\
+	if (__err)							\
+		goto err_label;						\
+} while (0)
 
 #ifdef CONFIG_MMU
 extern unsigned long __must_check
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 9/9] ARM: uaccess: remove set_fs() implementation
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
                   ` (7 preceding siblings ...)
  2020-09-18 12:46 ` [PATCH v2 8/9] ARM: uaccess: add __{get,put}_kernel_nofault Arnd Bergmann
@ 2020-09-18 12:46 ` Arnd Bergmann
  2020-09-19  5:27 ` [PATCH v2 0/9] ARM: remove set_fs callers and implementation Christoph Hellwig
  2020-09-19  8:19 ` Russell King - ARM Linux admin
  10 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-18 12:46 UTC (permalink / raw)
  To: Christoph Hellwig, Russell King, Alexander Viro
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, Arnd Bergmann

There are no remaining callers of set_fs(), so just remove it
along with all associated code that operates on
thread_info->addr_limit.

There are still further optimizations that can be done:

- In get_user(), the address check could be moved entirely
  into the out of line code, rather than passing a constant
  as an argument,

- I assume the DACR handling can be simplified as we now
  only change it during user access when CONFIG_CPU_SW_DOMAIN_PAN
  is set, but not during set_fs().

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm/Kconfig                   |  1 -
 arch/arm/include/asm/ptrace.h      |  1 -
 arch/arm/include/asm/thread_info.h |  4 ---
 arch/arm/include/asm/uaccess-asm.h |  6 ----
 arch/arm/include/asm/uaccess.h     | 46 +++---------------------------
 arch/arm/kernel/asm-offsets.c      |  2 --
 arch/arm/kernel/entry-common.S     |  9 ------
 arch/arm/kernel/process.c          |  7 +----
 arch/arm/kernel/signal.c           |  8 ------
 arch/arm/lib/copy_from_user.S      |  3 +-
 arch/arm/lib/copy_to_user.S        |  3 +-
 11 files changed, 7 insertions(+), 83 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 87e1478a42dc..e00d94b16658 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -118,7 +118,6 @@ config ARM
 	select PCI_SYSCALL if PCI
 	select PERF_USE_VMALLOC
 	select RTC_LIB
-	select SET_FS
 	select SYS_SUPPORTS_APM_EMULATION
 	# Above selects are sorted alphabetically; please add new ones
 	# according to that.  Thanks.
diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
index 91d6b7856be4..93051e2f402c 100644
--- a/arch/arm/include/asm/ptrace.h
+++ b/arch/arm/include/asm/ptrace.h
@@ -19,7 +19,6 @@ struct pt_regs {
 struct svc_pt_regs {
 	struct pt_regs regs;
 	u32 dacr;
-	u32 addr_limit;
 };
 
 #define to_svc_pt_regs(r) container_of(r, struct svc_pt_regs, regs)
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 536b6b979f63..8b705f611216 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -23,8 +23,6 @@ struct task_struct;
 
 #include <asm/types.h>
 
-typedef unsigned long mm_segment_t;
-
 struct cpu_context_save {
 	__u32	r4;
 	__u32	r5;
@@ -46,7 +44,6 @@ struct cpu_context_save {
 struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	int			preempt_count;	/* 0 => preemptable, <0 => bug */
-	mm_segment_t		addr_limit;	/* address limit */
 	struct task_struct	*task;		/* main task structure */
 	__u32			cpu;		/* cpu */
 	__u32			cpu_domain;	/* cpu domain */
@@ -72,7 +69,6 @@ struct thread_info {
 	.task		= &tsk,						\
 	.flags		= 0,						\
 	.preempt_count	= INIT_PREEMPT_COUNT,				\
-	.addr_limit	= KERNEL_DS,					\
 }
 
 /*
diff --git a/arch/arm/include/asm/uaccess-asm.h b/arch/arm/include/asm/uaccess-asm.h
index 907571fd05c6..6451a433912c 100644
--- a/arch/arm/include/asm/uaccess-asm.h
+++ b/arch/arm/include/asm/uaccess-asm.h
@@ -84,12 +84,8 @@
 	 * if \disable is set.
 	 */
 	.macro	uaccess_entry, tsk, tmp0, tmp1, tmp2, disable
-	ldr	\tmp1, [\tsk, #TI_ADDR_LIMIT]
-	mov	\tmp2, #TASK_SIZE
-	str	\tmp2, [\tsk, #TI_ADDR_LIMIT]
  DACR(	mrc	p15, 0, \tmp0, c3, c0, 0)
  DACR(	str	\tmp0, [sp, #SVC_DACR])
-	str	\tmp1, [sp, #SVC_ADDR_LIMIT]
 	.if \disable && IS_ENABLED(CONFIG_CPU_SW_DOMAIN_PAN)
 	/* kernel=client, user=no access */
 	mov	\tmp2, #DACR_UACCESS_DISABLE
@@ -106,9 +102,7 @@
 
 	/* Restore the user access state previously saved by uaccess_entry */
 	.macro	uaccess_exit, tsk, tmp0, tmp1
-	ldr	\tmp1, [sp, #SVC_ADDR_LIMIT]
  DACR(	ldr	\tmp0, [sp, #SVC_DACR])
-	str	\tmp1, [\tsk, #TI_ADDR_LIMIT]
  DACR(	mcr	p15, 0, \tmp0, c3, c0, 0)
 	.endm
 
diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 4f60638755c4..084d1c07c2d0 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -52,32 +52,8 @@ static __always_inline void uaccess_restore(unsigned int flags)
 extern int __get_user_bad(void);
 extern int __put_user_bad(void);
 
-/*
- * Note that this is actually 0x1,0000,0000
- */
-#define KERNEL_DS	0x00000000
-
 #ifdef CONFIG_MMU
 
-#define USER_DS		TASK_SIZE
-#define get_fs()	(current_thread_info()->addr_limit)
-
-static inline void set_fs(mm_segment_t fs)
-{
-	current_thread_info()->addr_limit = fs;
-
-	/*
-	 * Prevent a mispredicted conditional call to set_fs from forwarding
-	 * the wrong address limit to access_ok under speculation.
-	 */
-	dsb(nsh);
-	isb();
-
-	modify_domain(DOMAIN_KERNEL, fs ? DOMAIN_CLIENT : DOMAIN_MANAGER);
-}
-
-#define uaccess_kernel()	(get_fs() == KERNEL_DS)
-
 /*
  * We use 33-bit arithmetic here.  Success returns zero, failure returns
  * addr_limit.  We take advantage that addr_limit will be zero for KERNEL_DS,
@@ -89,7 +65,7 @@ static inline void set_fs(mm_segment_t fs)
 	__asm__(".syntax unified\n" \
 		"adds %1, %2, %3; sbcscc %1, %1, %0; movcc %0, #0" \
 		: "=&r" (flag), "=&r" (roksum) \
-		: "r" (addr), "Ir" (size), "0" (current_thread_info()->addr_limit) \
+		: "r" (addr), "Ir" (size), "0" (TASK_SIZE) \
 		: "cc"); \
 	flag; })
 
@@ -120,7 +96,7 @@ static inline void __user *__uaccess_mask_range_ptr(const void __user *ptr,
 	"	subshs	%1, %1, %2\n"
 	"	movlo	%0, #0\n"
 	: "+r" (safe_ptr), "=&r" (tmp)
-	: "r" (size), "r" (current_thread_info()->addr_limit)
+	: "r" (size), "r" (TASK_SIZE)
 	: "cc");
 
 	csdb();
@@ -194,7 +170,7 @@ extern int __get_user_64t_4(void *);
 
 #define __get_user_check(x, p)						\
 	({								\
-		unsigned long __limit = current_thread_info()->addr_limit - 1; \
+		unsigned long __limit = TASK_SIZE - 1; \
 		register typeof(*(p)) __user *__p asm("r0") = (p);	\
 		register __inttype(x) __r2 asm("r2");			\
 		register unsigned long __l asm("r1") = __limit;		\
@@ -245,7 +221,7 @@ extern int __put_user_8(void *, unsigned long long);
 
 #define __put_user_check(__pu_val, __ptr, __err, __s)			\
 	({								\
-		unsigned long __limit = current_thread_info()->addr_limit - 1; \
+		unsigned long __limit = TASK_SIZE - 1; \
 		register typeof(__pu_val) __r2 asm("r2") = __pu_val;	\
 		register const void __user *__p asm("r0") = __ptr;	\
 		register unsigned long __l asm("r1") = __limit;		\
@@ -262,19 +238,8 @@ extern int __put_user_8(void *, unsigned long long);
 
 #else /* CONFIG_MMU */
 
-/*
- * uClinux has only one addr space, so has simplified address limits.
- */
-#define USER_DS			KERNEL_DS
-
-#define uaccess_kernel()	(true)
 #define __addr_ok(addr)		((void)(addr), 1)
 #define __range_ok(addr, size)	((void)(addr), 0)
-#define get_fs()		(KERNEL_DS)
-
-static inline void set_fs(mm_segment_t fs)
-{
-}
 
 #define get_user(x, p)	__get_user(x, p)
 #define __put_user_check __put_user_nocheck
@@ -283,9 +248,6 @@ static inline void set_fs(mm_segment_t fs)
 
 #define access_ok(addr, size)	(__range_ok(addr, size) == 0)
 
-#define user_addr_max() \
-	(uaccess_kernel() ? ~0UL : get_fs())
-
 #ifdef CONFIG_CPU_SPECTRE
 /*
  * When mitigating Spectre variant 1, it is not worth fixing the non-
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 97af6735172b..78f0a25baf2d 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -41,7 +41,6 @@ int main(void)
   BLANK();
   DEFINE(TI_FLAGS,		offsetof(struct thread_info, flags));
   DEFINE(TI_PREEMPT,		offsetof(struct thread_info, preempt_count));
-  DEFINE(TI_ADDR_LIMIT,		offsetof(struct thread_info, addr_limit));
   DEFINE(TI_TASK,		offsetof(struct thread_info, task));
   DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
   DEFINE(TI_CPU_DOMAIN,		offsetof(struct thread_info, cpu_domain));
@@ -90,7 +89,6 @@ int main(void)
   DEFINE(S_OLD_R0,		offsetof(struct pt_regs, ARM_ORIG_r0));
   DEFINE(PT_REGS_SIZE,		sizeof(struct pt_regs));
   DEFINE(SVC_DACR,		offsetof(struct svc_pt_regs, dacr));
-  DEFINE(SVC_ADDR_LIMIT,	offsetof(struct svc_pt_regs, addr_limit));
   DEFINE(SVC_REGS_SIZE,		sizeof(struct svc_pt_regs));
   BLANK();
   DEFINE(SIGFRAME_RC3_OFFSET,	offsetof(struct sigframe, retcode[3]));
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 2ea3a1989fed..610e32273c81 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -49,9 +49,6 @@ __ret_fast_syscall:
  UNWIND(.fnstart	)
  UNWIND(.cantunwind	)
 	disable_irq_notrace			@ disable interrupts
-	ldr	r2, [tsk, #TI_ADDR_LIMIT]
-	cmp	r2, #TASK_SIZE
-	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
 	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
 	bne	fast_work_pending
@@ -86,9 +83,6 @@ __ret_fast_syscall:
 	bl	do_rseq_syscall
 #endif
 	disable_irq_notrace			@ disable interrupts
-	ldr	r2, [tsk, #TI_ADDR_LIMIT]
-	cmp	r2, #TASK_SIZE
-	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
 	tst	r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
 	beq	no_work_pending
@@ -127,9 +121,6 @@ ret_slow_syscall:
 #endif
 	disable_irq_notrace			@ disable interrupts
 ENTRY(ret_to_user_from_irq)
-	ldr	r2, [tsk, #TI_ADDR_LIMIT]
-	cmp	r2, #TASK_SIZE
-	blne	addr_limit_check_failed
 	ldr	r1, [tsk, #TI_FLAGS]
 	tst	r1, #_TIF_WORK_MASK
 	bne	slow_work_pending
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 8e6ace03e960..28a1a4a9dd77 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -97,7 +97,7 @@ void __show_regs(struct pt_regs *regs)
 	unsigned long flags;
 	char buf[64];
 #ifndef CONFIG_CPU_V7M
-	unsigned int domain, fs;
+	unsigned int domain;
 #ifdef CONFIG_CPU_SW_DOMAIN_PAN
 	/*
 	 * Get the domain register for the parent context. In user
@@ -106,14 +106,11 @@ void __show_regs(struct pt_regs *regs)
 	 */
 	if (user_mode(regs)) {
 		domain = DACR_UACCESS_ENABLE;
-		fs = get_fs();
 	} else {
 		domain = to_svc_pt_regs(regs)->dacr;
-		fs = to_svc_pt_regs(regs)->addr_limit;
 	}
 #else
 	domain = get_domain();
-	fs = get_fs();
 #endif
 #endif
 
@@ -149,8 +146,6 @@ void __show_regs(struct pt_regs *regs)
 		if ((domain & domain_mask(DOMAIN_USER)) ==
 		    domain_val(DOMAIN_USER, DOMAIN_NOACCESS))
 			segment = "none";
-		else if (fs == KERNEL_DS)
-			segment = "kernel";
 		else
 			segment = "user";
 
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index c9dc912b83f0..618b5d938317 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -710,14 +710,6 @@ struct page *get_signal_page(void)
 	return page;
 }
 
-/* Defer to generic check */
-asmlinkage void addr_limit_check_failed(void)
-{
-#ifdef CONFIG_MMU
-	addr_limit_user_check();
-#endif
-}
-
 #ifdef CONFIG_DEBUG_RSEQ
 asmlinkage void do_rseq_syscall(struct pt_regs *regs)
 {
diff --git a/arch/arm/lib/copy_from_user.S b/arch/arm/lib/copy_from_user.S
index f8016e3db65d..f481ef789a93 100644
--- a/arch/arm/lib/copy_from_user.S
+++ b/arch/arm/lib/copy_from_user.S
@@ -109,8 +109,7 @@
 
 ENTRY(arm_copy_from_user)
 #ifdef CONFIG_CPU_SPECTRE
-	get_thread_info r3
-	ldr	r3, [r3, #TI_ADDR_LIMIT]
+	mov	r3, #TASK_SIZE
 	uaccess_mask_range_ptr r1, r2, r3, ip
 #endif
 
diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S
index ebfe4cb3d912..215da16c7d6e 100644
--- a/arch/arm/lib/copy_to_user.S
+++ b/arch/arm/lib/copy_to_user.S
@@ -109,8 +109,7 @@
 ENTRY(__copy_to_user_std)
 WEAK(arm_copy_to_user)
 #ifdef CONFIG_CPU_SPECTRE
-	get_thread_info r3
-	ldr	r3, [r3, #TI_ADDR_LIMIT]
+	mov	r3, #TASK_SIZE
 	uaccess_mask_range_ptr r0, r2, r3, ip
 #endif
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/9] ARM: remove set_fs callers and implementation
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
                   ` (8 preceding siblings ...)
  2020-09-18 12:46 ` [PATCH v2 9/9] ARM: uaccess: remove set_fs() implementation Arnd Bergmann
@ 2020-09-19  5:27 ` Christoph Hellwig
  2020-09-25 13:40   ` Arnd Bergmann
  2020-09-19  8:19 ` Russell King - ARM Linux admin
  10 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2020-09-19  5:27 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Russell King, Alexander Viro, linux-kernel,
	linux-arm-kernel, linux-arch, linux-mm

On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> Hi Christoph, Russell,
> 
> Here is an updated series for removing set_fs() from arch/arm,
> based on the previous feedback.
> 
> I have tested the oabi-compat changes using the LTP tests for the three
> modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> and I have lightly tested the get_kernel_nofault infrastructure by
> loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.

What is the base line?  Just the base.set_fs branch in Als tree, or do
you need anything from my RISC-V series?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 2/9] ARM: traps: use get_kernel_nofault instead of set_fs()
  2020-09-18 12:46 ` [PATCH v2 2/9] ARM: traps: use get_kernel_nofault instead of set_fs() Arnd Bergmann
@ 2020-09-19  5:27   ` Christoph Hellwig
  0 siblings, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2020-09-19  5:27 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Russell King, Alexander Viro, linux-kernel,
	linux-arm-kernel, linux-arch, linux-mm

On Fri, Sep 18, 2020 at 02:46:17PM +0200, Arnd Bergmann wrote:
> ARM uses set_fs() and __get_user() to allow the stack dumping code to
> access possibly invalid pointers carefully. These can be changed to the
> simpler get_kernel_nofault(), and allow the eventual removal of set_fs().
> 
> dump_instr() will print either kernel or user space pointers,
> depending on how it was called. For dump_mem(), I assume we are only
> interested in kernel pointers, and the only time that this is called
> with user_mode(regs)==true is when the regs themselves are unreliable
> as a result of the condition that caused the trap.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/9] ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation
  2020-09-18 12:46 ` [PATCH v2 5/9] ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation Arnd Bergmann
@ 2020-09-19  5:32   ` Christoph Hellwig
  2020-09-26 18:30     ` Arnd Bergmann
  0 siblings, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2020-09-19  5:32 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Russell King, Alexander Viro, linux-kernel,
	linux-arm-kernel, linux-arch, linux-mm

> index 855aa7cc9b8e..156880943c16 100644
> --- a/arch/arm/include/asm/syscall.h
> +++ b/arch/arm/include/asm/syscall.h
> @@ -28,6 +28,17 @@ static inline int syscall_get_nr(struct task_struct *task,
>  	return task_thread_info(task)->syscall & ~__NR_OABI_SYSCALL_BASE;
>  }
>  
> +static inline bool __in_oabi_syscall(struct task_struct *task)
> +{
> +	return IS_ENABLED(CONFIG_OABI_COMPAT) &&
> +		(task_thread_info(task)->syscall & __NR_OABI_SYSCALL_BASE);
> +}
> +
> +static inline bool in_oabi_syscall(void)
> +{
> +	return __in_oabi_syscall(current);
> +}
> +

Maybe split these infrastructure additions into a separate helper?

> +#if !defined(CONFIG_ARM) || !defined(CONFIG_OABI_COMPAT)
> +/* ARM OABI has an incompatible struct layout and needs a special handler */
> +static inline struct epoll_event __user *
> +epoll_put_uevent(__poll_t revents, __u64 data,
> +		 struct epoll_event __user *uevent)
> +{
> +	if (__put_user(revents, &uevent->events) ||
> +	    __put_user(data, &uevent->data))
> +		return NULL;
> +
> +	return uevent+1;
> +}
> +#else
> +struct epoll_event __user *
> +epoll_put_uevent(__poll_t revents, __u64 data,
> +		 struct epoll_event __user *uevent);
> +#endif

So after you argued for this variant I still have minor nitpicks:

I alway find positive ifdefs better where possible, e.g.

#if defined(CONFIG_ARM) && defined(CONFIG_OABI_COMPAT)
external declaration here
#else
the real thing
#endif

but I still find the fact that the native case goes into the arch
helper a little weird.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/9] ARM: remove set_fs callers and implementation
  2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
                   ` (9 preceding siblings ...)
  2020-09-19  5:27 ` [PATCH v2 0/9] ARM: remove set_fs callers and implementation Christoph Hellwig
@ 2020-09-19  8:19 ` Russell King - ARM Linux admin
  2020-09-25 14:08   ` Arnd Bergmann
  10 siblings, 1 reply; 23+ messages in thread
From: Russell King - ARM Linux admin @ 2020-09-19  8:19 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Alexander Viro, linux-kernel,
	linux-arm-kernel, linux-arch, linux-mm

On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> Hi Christoph, Russell,
> 
> Here is an updated series for removing set_fs() from arch/arm,
> based on the previous feedback.
> 
> I have tested the oabi-compat changes using the LTP tests for the three
> modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> and I have lightly tested the get_kernel_nofault infrastructure by
> loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.

I'm not too keen on always saving the syscall number, but for the gain
of getting rid of set_fs() I think it's worth it. However...

I think there are some things to check - what value do you end up
with as the first number in /proc/self/syscall when you do:

strace cat /proc/self/syscall

?

It should be 3, not 0x900003. I suspect you're getting the latter
with these changes.  IIRC, task_thread_info(task)->syscall needs to
be the value _without_ the offset, otherwise tracing will break.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/9] ARM: oabi-compat: add epoll_pwait handler
  2020-09-18 12:46 ` [PATCH v2 3/9] ARM: oabi-compat: add epoll_pwait handler Arnd Bergmann
@ 2020-09-21 12:54   ` Sasha Levin
  2020-09-21 13:27     ` Arnd Bergmann
  0 siblings, 1 reply; 23+ messages in thread
From: Sasha Levin @ 2020-09-21 12:54 UTC (permalink / raw)
  To: Sasha Levin, Arnd Bergmann, Christoph Hellwig
  Cc: linux-kernel, stable, stable

Hi

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag
fixing commit: 369842658a36 ("ARM: 5677/1: ARM support for TIF_RESTORE_SIGMASK/pselect6/ppoll/epoll_pwait").

The bot has tested the following trees: v5.8.10, v5.4.66, v4.19.146, v4.14.198, v4.9.236, v4.4.236.

v5.8.10: Build OK!
v5.4.66: Build OK!
v4.19.146: Build OK!
v4.14.198: Build OK!
v4.9.236: Failed to apply! Possible dependencies:
    00bf25d693e7 ("y2038: use time32 syscall names on 32-bit")
    17435e5f8cf3 ("time: Introduce CONFIG_COMPAT_32BIT_TIME")
    338035edc9b9 ("arm: Wire up restartable sequences system call")
    4e2648db9c5f ("ARM: remove indirection of asm/mach-types.h")
    73aeb2cbcdc9 ("ARM: 8787/1: wire up io_pgetevents syscall")
    78594b95998f ("ARM: add migrate_pages() system call")
    96a8fae0fe09 ("ARM: convert to generated system call tables")
    a1016e94cce9 ("ARM: wire up statx syscall")
    c281634c8652 ("ARM: compat: remove KERNEL_DS usage in sys_oabi_epoll_ctl()")
    d4703ddafd1e ("time: Introduce CONFIG_64BIT_TIME in architectures")

v4.4.236: Failed to apply! Possible dependencies:
    00bf25d693e7 ("y2038: use time32 syscall names on 32-bit")
    03590cb56d5d ("ARM: wire up copy_file_range() syscall")
    0d4a619b64ba ("dma-mapping: make the generic coherent dma mmap implementation optional")
    17435e5f8cf3 ("time: Introduce CONFIG_COMPAT_32BIT_TIME")
    4e2648db9c5f ("ARM: remove indirection of asm/mach-types.h")
    96a8fae0fe09 ("ARM: convert to generated system call tables")
    c281634c8652 ("ARM: compat: remove KERNEL_DS usage in sys_oabi_epoll_ctl()")
    d4703ddafd1e ("time: Introduce CONFIG_64BIT_TIME in architectures")
    f2335a2a0a59 ("ARM: wire up preadv2 and pwritev2 syscalls")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

-- 
Thanks
Sasha

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/9] ARM: oabi-compat: add epoll_pwait handler
  2020-09-21 12:54   ` Sasha Levin
@ 2020-09-21 13:27     ` Arnd Bergmann
  0 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-21 13:27 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Christoph Hellwig, linux-kernel, # 3.4.x

On Mon, Sep 21, 2020 at 2:54 PM Sasha Levin <sashal@kernel.org> wrote:
>
> Hi
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag
> fixing commit: 369842658a36 ("ARM: 5677/1: ARM support for TIF_RESTORE_SIGMASK/pselect6/ppoll/epoll_pwait").
>
> The bot has tested the following trees: v5.8.10, v5.4.66, v4.19.146, v4.14.198, v4.9.236, v4.4.236.
>
> v5.8.10: Build OK!
> v5.4.66: Build OK!
> v4.19.146: Build OK!
> v4.14.198: Build OK!
> v4.9.236: Failed to apply! Possible dependencies:
>     00bf25d693e7 ("y2038: use time32 syscall names on 32-bit")
>     17435e5f8cf3 ("time: Introduce CONFIG_COMPAT_32BIT_TIME")
>     338035edc9b9 ("arm: Wire up restartable sequences system call")
>     4e2648db9c5f ("ARM: remove indirection of asm/mach-types.h")
>     73aeb2cbcdc9 ("ARM: 8787/1: wire up io_pgetevents syscall")
>     78594b95998f ("ARM: add migrate_pages() system call")
>     96a8fae0fe09 ("ARM: convert to generated system call tables")
>     a1016e94cce9 ("ARM: wire up statx syscall")
>     c281634c8652 ("ARM: compat: remove KERNEL_DS usage in sys_oabi_epoll_ctl()")
>     d4703ddafd1e ("time: Introduce CONFIG_64BIT_TIME in architectures")
>
> v4.4.236: Failed to apply! Possible dependencies:
>     00bf25d693e7 ("y2038: use time32 syscall names on 32-bit")
>     03590cb56d5d ("ARM: wire up copy_file_range() syscall")
>     0d4a619b64ba ("dma-mapping: make the generic coherent dma mmap implementation optional")
>     17435e5f8cf3 ("time: Introduce CONFIG_COMPAT_32BIT_TIME")
>     4e2648db9c5f ("ARM: remove indirection of asm/mach-types.h")
>     96a8fae0fe09 ("ARM: convert to generated system call tables")
>     c281634c8652 ("ARM: compat: remove KERNEL_DS usage in sys_oabi_epoll_ctl()")
>     d4703ddafd1e ("time: Introduce CONFIG_64BIT_TIME in architectures")
>     f2335a2a0a59 ("ARM: wire up preadv2 and pwritev2 syscalls")
>
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?

I wouldn't worry too much about the failed backport in this case, as I
don't think there are any actual users of this code on older stable
kernels, and even if there are they are unlikely to start using
epoll_pwait.

      Arnd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/9] ARM: remove set_fs callers and implementation
  2020-09-19  5:27 ` [PATCH v2 0/9] ARM: remove set_fs callers and implementation Christoph Hellwig
@ 2020-09-25 13:40   ` Arnd Bergmann
  2020-09-26  6:49     ` Christoph Hellwig
  2021-07-05  6:01     ` Christoph Hellwig
  0 siblings, 2 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-25 13:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Russell King, Alexander Viro, linux-kernel, Linux ARM,
	linux-arch, Linux-MM

On Sat, Sep 19, 2020 at 7:27 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> > Hi Christoph, Russell,
> >
> > Here is an updated series for removing set_fs() from arch/arm,
> > based on the previous feedback.
> >
> > I have tested the oabi-compat changes using the LTP tests for the three
> > modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> > and I have lightly tested the get_kernel_nofault infrastructure by
> > loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.
>
> What is the base line?  Just the base.set_fs branch in Als tree, or do
> you need anything from my RISC-V series?

I imported these additional patches from you:

e0d17576790e quota: simplify the quotactl compat handling
b0f8a0c4046f compat: add a compat_need_64bit_alignment_fixup() helper
ed8af9335e19 compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
ce526c75bbe2 uaccess: provide a generic TASK_SIZE_MAX definition

I think I only actually needed the last one of those for the Arm
patches, the other ones are dependencies for my other patches
I have on the same branch.

      Arnd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/9] ARM: remove set_fs callers and implementation
  2020-09-19  8:19 ` Russell King - ARM Linux admin
@ 2020-09-25 14:08   ` Arnd Bergmann
  2020-09-25 15:30     ` Arnd Bergmann
  0 siblings, 1 reply; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-25 14:08 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Christoph Hellwig, Alexander Viro, linux-kernel, Linux ARM,
	linux-arch, Linux-MM

On Sat, Sep 19, 2020 at 10:19 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> > Hi Christoph, Russell,
> >
> > Here is an updated series for removing set_fs() from arch/arm,
> > based on the previous feedback.
> >
> > I have tested the oabi-compat changes using the LTP tests for the three
> > modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> > and I have lightly tested the get_kernel_nofault infrastructure by
> > loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.
>
> I'm not too keen on always saving the syscall number, but for the gain
> of getting rid of set_fs() I think it's worth it. However...
>
> I think there are some things to check - what value do you end up
> with as the first number in /proc/self/syscall when you do:
>
> strace cat /proc/self/syscall
>
> ?

> It should be 3, not 0x900003. I suspect you're getting the latter
> with these changes.  IIRC, task_thread_info(task)->syscall needs to
> be the value _without_ the offset, otherwise tracing will break.

It seems broken in different ways, depending on the combination
of kernel and userland:

1. EABI armv5-versatile kernel, EABI Debian 5:
$ cat /proc/self/syscall
0 0x1500000000003 0x1500000000400 0x1500000000400 0x60000013c7800480
0xc0008668c0112f8c 0xc0112d14c68e1f68 0xbeab06f8 0xb6e80d4c
$ strace -f cat /proc/self/syscall
execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) =
-1 EINTR (Interrupted system call)
dup(2)                                  = -1 EINTR (Interrupted system call)
write(2, "strace: exec: Interrupted system "..., 38) = -1 EINTR
(Interrupted system call)
exit_group(1)                           = ?

2. EABI kernel, OABI Debian 5:
$ cat /proc/self/syscall
3 0x1500000000003 0x13ccc00000400 0x1500000000400 0x60000013c7800480
0xc0008de0c0112f8c 0xc0112d14c7313f68 0xbeed27d0 0xb6eab324
$ strace cat /proc/self/syscall
execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) = -1090648236
--- SIGILL (Illegal instruction) @ 0 (0) ---
+++ killed by SIGILL +++

3. OABI kernel, OABI Debian 5:
 cat /proc/self/syscall
9437187 0x1500000000003 0x13ccc00000400 0x1500000000400 0x100060000013
0x15000c72cff6c 0xc72cfe9000000000 0xbece27d0 0xb6f2f324
$ strace cat /proc/self/syscall
execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) = -1095141548
--- SIGILL (Illegal instruction) @ 0 (0) ---
+++ killed by SIGILL +++

I suspect the OABI strace in Debian is broken since it crashes on
both kernels. I'll look into fixing the output without strace first then.

       Arnd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/9] ARM: remove set_fs callers and implementation
  2020-09-25 14:08   ` Arnd Bergmann
@ 2020-09-25 15:30     ` Arnd Bergmann
  0 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-25 15:30 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Christoph Hellwig, Alexander Viro, linux-kernel, Linux ARM,
	linux-arch, Linux-MM

On Fri, Sep 25, 2020 at 4:08 PM Arnd Bergmann <arnd@arndb.de> wrote:
> On Sat, Sep 19, 2020 at 10:19 AM Russell King - ARM Linux admin
> <linux@armlinux.org.uk> wrote:
> >
> > On Fri, Sep 18, 2020 at 02:46:15PM +0200, Arnd Bergmann wrote:
> > > Hi Christoph, Russell,
> > >
> > > Here is an updated series for removing set_fs() from arch/arm,
> > > based on the previous feedback.
> > >
> > > I have tested the oabi-compat changes using the LTP tests for the three
> > > modified syscalls using an Armv7 kernel and a Debian 5 OABI user space,
> > > and I have lightly tested the get_kernel_nofault infrastructure by
> > > loading the test_lockup.ko module after setting CONFIG_DEBUG_SPINLOCK.
> >
> > I'm not too keen on always saving the syscall number, but for the gain
> > of getting rid of set_fs() I think it's worth it. However...
> >
> > I think there are some things to check - what value do you end up
> > with as the first number in /proc/self/syscall when you do:
> >
> > strace cat /proc/self/syscall
> >
> > ?
>
> > It should be 3, not 0x900003. I suspect you're getting the latter
> > with these changes.  IIRC, task_thread_info(task)->syscall needs to
> > be the value _without_ the offset, otherwise tracing will break.
>
> It seems broken in different ways, depending on the combination
> of kernel and userland:
>
> 1. EABI armv5-versatile kernel, EABI Debian 5:
> $ cat /proc/self/syscall
> 0 0x1500000000003 0x1500000000400 0x1500000000400 0x60000013c7800480
> 0xc0008668c0112f8c 0xc0112d14c68e1f68 0xbeab06f8 0xb6e80d4c
> $ strace -f cat /proc/self/syscall
> execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) =
> -1 EINTR (Interrupted system call)
> dup(2)                                  = -1 EINTR (Interrupted system call)
> write(2, "strace: exec: Interrupted system "..., 38) = -1 EINTR
> (Interrupted system call)
> exit_group(1)                           = ?

Both the missing number and the broken strace are fixed with

diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 610e32273c81..2c0bde14fba6 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -226,7 +226,8 @@ ENTRY(vector_swi)
         * get the old ABI syscall table address.
         */
        bics    r10, r10, #0xff000000
-       str     r10, [tsk, #TI_SYSCALL]
+       strne   r10, [tsk, #TI_SYSCALL]
+       streq   scno, [tsk, #TI_SYSCALL]
        eorne   scno, r10, #__NR_OABI_SYSCALL_BASE
        ldrne   tbl, =sys_oabi_call_table
 #elif !defined(CONFIG_AEABI)

It was already working with CONFIG_AEABI=y and
CONFIG_OABI_COMPAT=n

> 2. EABI kernel, OABI Debian 5:
> $ cat /proc/self/syscall
> 3 0x1500000000003 0x13ccc00000400 0x1500000000400 0x60000013c7800480
> 0xc0008de0c0112f8c 0xc0112d14c7313f68 0xbeed27d0 0xb6eab324
> $ strace cat /proc/self/syscall
> execve("/bin/cat", ["cat", "/proc/self/syscall"], [/* 16 vars */]) = -1090648236
> --- SIGILL (Illegal instruction) @ 0 (0) ---
> +++ killed by SIGILL +++

This was caused by me after all, here is my fix:

--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -25,6 +25,7 @@
 #include <linux/tracehook.h>
 #include <linux/unistd.h>

+#include <asm/syscall.h>
 #include <asm/traps.h>

 #define CREATE_TRACE_POINTS
@@ -898,11 +899,11 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
                return -1;
 #else
        /* XXX: remove this once OABI gets fixed */
-       secure_computing_strict(current_thread_info()->syscall);
+       secure_computing_strict(syscall_get_nr(current, regs));
 #endif

        /* Tracer or seccomp may have changed syscall. */
-       scno = current_thread_info()->syscall;
+       scno = syscall_get_nr(current, regs);

        if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
                trace_sys_enter(regs, scno);

> 3. OABI kernel, OABI Debian 5:
>  cat /proc/self/syscall
> 9437187 0x1500000000003 0x13ccc00000400 0x1500000000400 0x100060000013
> 0x15000c72cff6c 0xc72cfe9000000000 0xbece27d0 0xb6f2f324

This one is fixed by

--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -22,7 +22,7 @@ extern const unsigned long sys_call_table[];
 static inline int syscall_get_nr(struct task_struct *task,
                                 struct pt_regs *regs)
 {
-       if (!IS_ENABLED(CONFIG_OABI_COMPAT))
+       if (IS_ENABLED(CONFIG_AEABI) && !IS_ENABLED(CONFIG_OABI_COMPAT))
                return task_thread_info(task)->syscall;

        return task_thread_info(task)->syscall & ~__NR_OABI_SYSCALL_BASE;



I'll send an updated patch once I've addressed Christoph's comments.

      Arnd

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/9] ARM: remove set_fs callers and implementation
  2020-09-25 13:40   ` Arnd Bergmann
@ 2020-09-26  6:49     ` Christoph Hellwig
  2021-07-05  6:01     ` Christoph Hellwig
  1 sibling, 0 replies; 23+ messages in thread
From: Christoph Hellwig @ 2020-09-26  6:49 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Russell King, Alexander Viro, linux-kernel,
	Linux ARM, linux-arch, Linux-MM

On Fri, Sep 25, 2020 at 03:40:06PM +0200, Arnd Bergmann wrote:
> e0d17576790e quota: simplify the quotactl compat handling
> b0f8a0c4046f compat: add a compat_need_64bit_alignment_fixup() helper
> ed8af9335e19 compat: lift compat_s64 and compat_u64 to <asm-generic/compat.h>
> ce526c75bbe2 uaccess: provide a generic TASK_SIZE_MAX definition
> 
> I think I only actually needed the last one of those for the Arm
> patches, the other ones are dependencies for my other patches
> I have on the same branch.

I still haven't gotten anyone to pick up the TASK_SIZE_MAX one.
So if you want to minimize dependencies just define TASK_SIZE_MAX
in the arm code for now, and we can remove redundant definitions later.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/9] ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation
  2020-09-19  5:32   ` Christoph Hellwig
@ 2020-09-26 18:30     ` Arnd Bergmann
  0 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2020-09-26 18:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Russell King, Alexander Viro, linux-kernel, Linux ARM,
	linux-arch, Linux-MM

On Sat, Sep 19, 2020 at 7:32 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> > index 855aa7cc9b8e..156880943c16 100644
> > --- a/arch/arm/include/asm/syscall.h
> > +++ b/arch/arm/include/asm/syscall.h
> > @@ -28,6 +28,17 @@ static inline int syscall_get_nr(struct task_struct *task,
> >       return task_thread_info(task)->syscall & ~__NR_OABI_SYSCALL_BASE;
> >  }
> >
> > +static inline bool __in_oabi_syscall(struct task_struct *task)
> > +{
> > +     return IS_ENABLED(CONFIG_OABI_COMPAT) &&
> > +             (task_thread_info(task)->syscall & __NR_OABI_SYSCALL_BASE);
> > +}
> > +
> > +static inline bool in_oabi_syscall(void)
> > +{
> > +     return __in_oabi_syscall(current);
> > +}
> > +
>
> Maybe split these infrastructure additions into a separate helper?

Sorry, I'm not following what you mean by this. Both of the above
are pretty minimal helpers already, in what way could they be split
further?

> So after you argued for this variant I still have minor nitpicks:
>
> I alway find positive ifdefs better where possible, e.g.
>
> #if defined(CONFIG_ARM) && defined(CONFIG_OABI_COMPAT)
> external declaration here
> #else
> the real thing
> #endif

Ok.

> but I still find the fact that the native case goes into the arch
> helper a little weird.

Would you prefer something like this:

static inline struct epoll_event __user *
epoll_put_uevent(__poll_t revents, __u64 data,
                 struct epoll_event __user *uevent)
{
#if defined(CONFIG_ARM) && defined(CONFIG_OABI_COMPAT)
        /* ARM OABI has an incompatible struct layout and needs a
special handler */
        extern struct epoll_event __user *
        epoll_oabi_put_uevent(__poll_t revents, __u64 data,
                              struct epoll_event __user *uevent);

        if (in_oabi_syscall())
                return epoll_oabi_put_uevent(revents, data, uevent);
#endif
        if (__put_user(revents, &uevent->events) ||
            __put_user(data, &uevent->data))
                return NULL;

        return uevent+1;
}

That would keep the native case in one place, but also mix in
more architecture specific stuff into the common source location,
which again seems worse to me.

     Arnd

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/9] ARM: remove set_fs callers and implementation
  2020-09-25 13:40   ` Arnd Bergmann
  2020-09-26  6:49     ` Christoph Hellwig
@ 2021-07-05  6:01     ` Christoph Hellwig
  2021-07-22 17:27       ` Arnd Bergmann
  1 sibling, 1 reply; 23+ messages in thread
From: Christoph Hellwig @ 2021-07-05  6:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Russell King, Alexander Viro, linux-kernel,
	Linux ARM, linux-arch, Linux-MM, torvalds

Just a ping if we're making some progress on this.  The m68knommu
set_fs implementation is actively misleading, which lead to a problem
this merge window so I'd really like to see this set merged.

Also your improvements to the copy_from_kernel_nofaul loop would be

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/9] ARM: remove set_fs callers and implementation
  2021-07-05  6:01     ` Christoph Hellwig
@ 2021-07-22 17:27       ` Arnd Bergmann
  0 siblings, 0 replies; 23+ messages in thread
From: Arnd Bergmann @ 2021-07-22 17:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Russell King, Alexander Viro, linux-kernel, Linux ARM,
	linux-arch, Linux-MM, Linus Torvalds

On Mon, Jul 5, 2021 at 8:01 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> Just a ping if we're making some progress on this.  The m68knommu
> set_fs implementation is actively misleading, which lead to a problem
> this merge window so I'd really like to see this set merged.
>
> Also your improvements to the copy_from_kernel_nofaul loop would be

I've rebased the series now, and rearranged a little in order to address
Russell's concerns. I still have one regression that I need to fix before
sending the new version.

        Arnd

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-07-22 17:27 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-18 12:46 [PATCH v2 0/9] ARM: remove set_fs callers and implementation Arnd Bergmann
2020-09-18 12:46 ` [PATCH v2 1/9] mm/maccess: fix unaligned copy_{from,to}_kernel_nofault Arnd Bergmann
2020-09-18 12:46 ` [PATCH v2 2/9] ARM: traps: use get_kernel_nofault instead of set_fs() Arnd Bergmann
2020-09-19  5:27   ` Christoph Hellwig
2020-09-18 12:46 ` [PATCH v2 3/9] ARM: oabi-compat: add epoll_pwait handler Arnd Bergmann
2020-09-21 12:54   ` Sasha Levin
2020-09-21 13:27     ` Arnd Bergmann
2020-09-18 12:46 ` [PATCH v2 4/9] ARM: syscall: always store thread_info->syscall Arnd Bergmann
2020-09-18 12:46 ` [PATCH v2 5/9] ARM: oabi-compat: rework epoll_wait/epoll_pwait emulation Arnd Bergmann
2020-09-19  5:32   ` Christoph Hellwig
2020-09-26 18:30     ` Arnd Bergmann
2020-09-18 12:46 ` [PATCH v2 6/9] ARM: oabi-compat: rework sys_semtimedop emulation Arnd Bergmann
2020-09-18 12:46 ` [PATCH v2 7/9] ARM: oabi-compat: rework fcntl64() emulation Arnd Bergmann
2020-09-18 12:46 ` [PATCH v2 8/9] ARM: uaccess: add __{get,put}_kernel_nofault Arnd Bergmann
2020-09-18 12:46 ` [PATCH v2 9/9] ARM: uaccess: remove set_fs() implementation Arnd Bergmann
2020-09-19  5:27 ` [PATCH v2 0/9] ARM: remove set_fs callers and implementation Christoph Hellwig
2020-09-25 13:40   ` Arnd Bergmann
2020-09-26  6:49     ` Christoph Hellwig
2021-07-05  6:01     ` Christoph Hellwig
2021-07-22 17:27       ` Arnd Bergmann
2020-09-19  8:19 ` Russell King - ARM Linux admin
2020-09-25 14:08   ` Arnd Bergmann
2020-09-25 15:30     ` Arnd Bergmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).