linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] syscalls: remove compat_alloc_user_space callers
@ 2020-09-18 13:24 Arnd Bergmann
  2020-09-18 13:24 ` [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro Arnd Bergmann
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Arnd Bergmann @ 2020-09-18 13:24 UTC
  To: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec,
	Arnd Bergmann

Going through compat_alloc_user_space() to convert indirect system call
arguments tends to add complexity compared to handling the native and
compat logic in the same code.

I have patches for all other uses of compat_alloc_user_space() as well,
and would expect that we can subsequently remove the interface itself.

      Arnd

Arnd Bergmann (4):
  x86: add __X32_COND_SYSCALL() macro
  kexec: remove compat_sys_kexec_load syscall
  mm: remove compat_sys_move_pages
  mm: remove compat numa syscalls

 arch/arm64/include/asm/unistd32.h         |  12 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |  12 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |  12 +-
 arch/parisc/kernel/syscalls/syscall.tbl   |  10 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |  12 +-
 arch/s390/kernel/syscalls/syscall.tbl     |  12 +-
 arch/sparc/kernel/syscalls/syscall.tbl    |  12 +-
 arch/x86/entry/syscalls/syscall_32.tbl    |   6 +-
 arch/x86/entry/syscalls/syscall_64.tbl    |   4 +-
 arch/x86/include/asm/syscall_wrapper.h    |   5 +
 include/linux/compat.h                    |  26 ---
 include/uapi/asm-generic/unistd.h         |  12 +-
 kernel/kexec.c                            |  77 +++------
 kernel/sys_ni.c                           |   5 -
 mm/mempolicy.c                            | 193 +++++-----------------
 mm/migrate.c                              |  45 +++--
 16 files changed, 143 insertions(+), 312 deletions(-)

-- 
2.27.0



* [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro
  2020-09-18 13:24 [PATCH 0/4] syscalls: remove compat_alloc_user_space callers Arnd Bergmann
@ 2020-09-18 13:24 ` Arnd Bergmann
  2020-09-19  5:35   ` Christoph Hellwig
  2020-09-18 13:24 ` [PATCH 2/4] kexec: remove compat_sys_kexec_load syscall Arnd Bergmann
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2020-09-18 13:24 UTC
  To: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec,
	Arnd Bergmann

sys_move_pages() is an optional syscall, and once we remove
the compat version of it in favor of the native one with an
in_compat_syscall() check, the x32 syscall table refers to
a __x32_sys_move_pages symbol that may not exist when the
syscall is disabled.

Change the COND_SYSCALL() definition on x86 to also include
the redirection for x32.
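
As a rough userspace illustration of the idea (not the kernel macro
itself), a conditional entry point can be emitted as a weak stub that
returns -ENOSYS and is overridden whenever a real implementation is
linked in. The names COND_ENTRY, ni_syscall, native_move_pages and
x32_move_pages are made up for this sketch, and it assumes GCC or Clang
for the weak attribute:

```c
#include <errno.h>

/* Fallback for any entry point whose implementation is not built. */
static long ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Hypothetical macro: emit a weak stub for "name". A non-weak
 * definition of the same symbol in another object file would take
 * precedence at link time.
 */
#define COND_ENTRY(name)					\
	__attribute__((weak)) long name(void)			\
	{							\
		return ni_syscall();				\
	}

/* One stub per ABI-specific symbol that a syscall table may reference. */
COND_ENTRY(native_move_pages)
COND_ENTRY(x32_move_pages)
```

With both stubs present, a table entry that points at x32_move_pages
still links when the optional implementation is left out.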

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/x86/include/asm/syscall_wrapper.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
index a84333adeef2..5eacd35a7f97 100644
--- a/arch/x86/include/asm/syscall_wrapper.h
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -171,12 +171,16 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
 	__SYS_STUBx(x32, compat_sys##name,				\
 		    SC_X86_64_REGS_TO_ARGS(x, __VA_ARGS__))
 
+#define __X32_COND_SYSCALL(name)					\
+	__COND_SYSCALL(x32, sys_##name)
+
 #define __X32_COMPAT_COND_SYSCALL(name)					\
 	__COND_SYSCALL(x32, compat_sys_##name)
 
 #define __X32_COMPAT_SYS_NI(name)					\
 	__SYS_NI(x32, compat_sys_##name)
 #else /* CONFIG_X86_X32 */
+#define __X32_COND_SYSCALL(name)
 #define __X32_COMPAT_SYS_STUB0(name)
 #define __X32_COMPAT_SYS_STUBx(x, name, ...)
 #define __X32_COMPAT_COND_SYSCALL(name)
@@ -253,6 +257,7 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
 	static long __do_sys_##sname(const struct pt_regs *__unused)
 
 #define COND_SYSCALL(name)						\
+	__X32_COND_SYSCALL(name)					\
 	__X64_COND_SYSCALL(name)					\
 	__IA32_COND_SYSCALL(name)
 
-- 
2.27.0



* [PATCH 2/4] kexec: remove compat_sys_kexec_load syscall
  2020-09-18 13:24 [PATCH 0/4] syscalls: remove compat_alloc_user_space callers Arnd Bergmann
  2020-09-18 13:24 ` [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro Arnd Bergmann
@ 2020-09-18 13:24 ` Arnd Bergmann
  2020-09-19  5:37   ` Christoph Hellwig
  2020-09-18 13:24 ` [PATCH 3/4] mm: remove compat_sys_move_pages Arnd Bergmann
  2020-09-18 13:24 ` [PATCH 4/4] mm: remove compat numa syscalls Arnd Bergmann
  3 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2020-09-18 13:24 UTC
  To: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec,
	Arnd Bergmann

The compat version of sys_kexec_load() uses compat_alloc_user_space() to
convert the user-provided arguments into the native format.

Move the conversion into the regular implementation with
an in_compat_syscall() check to simplify it and avoid the
compat_alloc_user_space() call.
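
The shape of that conversion can be sketched in plain userspace C. This
is only an illustration under simplifying assumptions: the struct
layouts, widen_ptr() and the is_compat flag are stand-ins invented here
for the kernel's compat_kexec_segment, compat_ptr() and
in_compat_syscall(), and memcpy() takes the place of copy_from_user():

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout as seen by a 32-bit caller. */
struct compat_segment {
	uint32_t buf;		/* 32-bit user pointer */
	uint32_t bufsz;
	uint32_t mem;
	uint32_t memsz;
};

/* Hypothetical native layout used internally. */
struct native_segment {
	const void *buf;
	size_t bufsz;
	unsigned long mem;
	size_t memsz;
};

/* Widen a 32-bit pointer value to a native pointer. */
static const void *widen_ptr(uint32_t p)
{
	return (const void *)(uintptr_t)p;
}

/*
 * Fill dst with nr segments read from src. In compat mode, src is an
 * array of struct compat_segment and every field is widened; otherwise
 * the layouts match and a single bulk copy is enough.
 */
static void copy_segments(struct native_segment *dst, const void *src,
			  size_t nr, int is_compat)
{
	const struct compat_segment *cs = src;
	size_t i;

	if (!is_compat) {
		memcpy(dst, src, nr * sizeof(*dst));
		return;
	}

	for (i = 0; i < nr; i++) {
		dst[i] = (struct native_segment) {
			.buf   = widen_ptr(cs[i].buf),
			.bufsz = cs[i].bufsz,
			.mem   = cs[i].mem,
			.memsz = cs[i].memsz,
		};
	}
}
```

Because the conversion writes straight into the destination array, no
temporary copy in the caller's address space is needed.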

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm64/include/asm/unistd32.h         |  2 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |  2 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |  2 +-
 arch/parisc/kernel/syscalls/syscall.tbl   |  2 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |  2 +-
 arch/s390/kernel/syscalls/syscall.tbl     |  2 +-
 arch/sparc/kernel/syscalls/syscall.tbl    |  2 +-
 arch/x86/entry/syscalls/syscall_32.tbl    |  2 +-
 arch/x86/entry/syscalls/syscall_64.tbl    |  2 +-
 include/linux/compat.h                    |  6 --
 include/uapi/asm-generic/unistd.h         |  2 +-
 kernel/kexec.c                            | 75 ++++++-----------------
 12 files changed, 29 insertions(+), 72 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9..b6517df74037 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -705,7 +705,7 @@ __SYSCALL(__NR_getcpu, sys_getcpu)
 #define __NR_epoll_pwait 346
 __SYSCALL(__NR_epoll_pwait, compat_sys_epoll_pwait)
 #define __NR_kexec_load 347
-__SYSCALL(__NR_kexec_load, compat_sys_kexec_load)
+__SYSCALL(__NR_kexec_load, sys_kexec_load)
 #define __NR_utimensat 348
 __SYSCALL(__NR_utimensat, sys_utimensat_time32)
 #define __NR_signalfd 349
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index f9df9edb67a4..ad157aab4c09 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -282,7 +282,7 @@
 271	n32	move_pages			compat_sys_move_pages
 272	n32	set_robust_list			compat_sys_set_robust_list
 273	n32	get_robust_list			compat_sys_get_robust_list
-274	n32	kexec_load			compat_sys_kexec_load
+274	n32	kexec_load			sys_kexec_load
 275	n32	getcpu				sys_getcpu
 276	n32	epoll_pwait			compat_sys_epoll_pwait
 277	n32	ioprio_set			sys_ioprio_set
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 195b43cf27c8..57baf6c8008f 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -322,7 +322,7 @@
 308	o32	move_pages			sys_move_pages			compat_sys_move_pages
 309	o32	set_robust_list			sys_set_robust_list		compat_sys_set_robust_list
 310	o32	get_robust_list			sys_get_robust_list		compat_sys_get_robust_list
-311	o32	kexec_load			sys_kexec_load			compat_sys_kexec_load
+311	o32	kexec_load			sys_kexec_load
 312	o32	getcpu				sys_getcpu
 313	o32	epoll_pwait			sys_epoll_pwait			compat_sys_epoll_pwait
 314	o32	ioprio_set			sys_ioprio_set
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index def64d221cd4..778bf166d7bd 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -336,7 +336,7 @@
 297	common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
 298	common	statfs64		sys_statfs64			compat_sys_statfs64
 299	common	fstatfs64		sys_fstatfs64			compat_sys_fstatfs64
-300	common	kexec_load		sys_kexec_load			compat_sys_kexec_load
+300	common	kexec_load		sys_kexec_load
 301	32	utimensat		sys_utimensat_time32
 301	64	utimensat		sys_utimensat
 302	common	signalfd		sys_signalfd			compat_sys_signalfd
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index c2d737ff2e7b..f128ba8b9a71 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -350,7 +350,7 @@
 265	64	mq_timedreceive			sys_mq_timedreceive
 266	nospu	mq_notify			sys_mq_notify			compat_sys_mq_notify
 267	nospu	mq_getsetattr			sys_mq_getsetattr		compat_sys_mq_getsetattr
-268	nospu	kexec_load			sys_kexec_load			compat_sys_kexec_load
+268	nospu	kexec_load			sys_kexec_load
 269	nospu	add_key				sys_add_key
 270	nospu	request_key			sys_request_key
 271	nospu	keyctl				sys_keyctl			compat_sys_keyctl
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 10456bc936fb..d45952058be2 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -283,7 +283,7 @@
 274  common	mq_timedreceive		sys_mq_timedreceive		sys_mq_timedreceive_time32
 275  common	mq_notify		sys_mq_notify			compat_sys_mq_notify
 276  common	mq_getsetattr		sys_mq_getsetattr		compat_sys_mq_getsetattr
-277  common	kexec_load		sys_kexec_load			compat_sys_kexec_load
+277  common	kexec_load		sys_kexec_load			sys_kexec_load
 278  common	add_key			sys_add_key			sys_add_key
 279  common	request_key		sys_request_key			sys_request_key
 280  common	keyctl			sys_keyctl			compat_sys_keyctl
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 4af114e84f20..a46edcdd950d 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -369,7 +369,7 @@
 303	common	mbind			sys_mbind			compat_sys_mbind
 304	common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
 305	common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
-306	common	kexec_load		sys_kexec_load			compat_sys_kexec_load
+306	common	kexec_load		sys_kexec_load			sys_kexec_load
 307	common	move_pages		sys_move_pages			compat_sys_move_pages
 308	common	getcpu			sys_getcpu
 309	common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 3db3d8823dc8..7e4140b78aad 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -294,7 +294,7 @@
 280	i386	mq_timedreceive		sys_mq_timedreceive_time32
 281	i386	mq_notify		sys_mq_notify			compat_sys_mq_notify
 282	i386	mq_getsetattr		sys_mq_getsetattr		compat_sys_mq_getsetattr
-283	i386	kexec_load		sys_kexec_load			compat_sys_kexec_load
+283	i386	kexec_load		sys_kexec_load			sys_kexec_load
 284	i386	waitid			sys_waitid			compat_sys_waitid
 # 285 sys_setaltroot
 286	i386	add_key			sys_add_key
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index f30d6ae9a688..9986f5f08278 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -384,7 +384,7 @@
 525	x32	sigaltstack		compat_sys_sigaltstack
 526	x32	timer_create		compat_sys_timer_create
 527	x32	mq_notify		compat_sys_mq_notify
-528	x32	kexec_load		compat_sys_kexec_load
+528	x32	kexec_load		sys_kexec_load
 529	x32	waitid			compat_sys_waitid
 530	x32	set_robust_list		compat_sys_set_robust_list
 531	x32	get_robust_list		compat_sys_get_robust_list
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 3d96a841bd49..a7a5a0ff59ef 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -643,12 +643,6 @@ asmlinkage long compat_sys_setitimer(int which,
 				     struct old_itimerval32 __user *in,
 				     struct old_itimerval32 __user *out);
 
-/* kernel/kexec.c */
-asmlinkage long compat_sys_kexec_load(compat_ulong_t entry,
-				      compat_ulong_t nr_segments,
-				      struct compat_kexec_segment __user *,
-				      compat_ulong_t flags);
-
 /* kernel/posix-timers.c */
 asmlinkage long compat_sys_timer_create(clockid_t which_clock,
 			struct compat_sigevent __user *timer_event_spec,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 995b36c2ea7d..83f1fc7fd3d7 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -342,7 +342,7 @@ __SC_COMP(__NR_setitimer, sys_setitimer, compat_sys_setitimer)
 
 /* kernel/kexec.c */
 #define __NR_kexec_load 104
-__SC_COMP(__NR_kexec_load, sys_kexec_load, compat_sys_kexec_load)
+__SYSCALL(__NR_kexec_load, sys_kexec_load)
 
 /* kernel/module.c */
 #define __NR_init_module 105
diff --git a/kernel/kexec.c b/kernel/kexec.c
index f977786fe498..1ef7d3dc906f 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -29,7 +29,25 @@ static int copy_user_segment_list(struct kimage *image,
 	/* Read in the segments */
 	image->nr_segments = nr_segments;
 	segment_bytes = nr_segments * sizeof(*segments);
-	ret = copy_from_user(image->segment, segments, segment_bytes);
+	if (in_compat_syscall()) {
+		struct compat_kexec_segment __user *cs = (void __user *)segments;
+		struct compat_kexec_segment segment;
+		int i;
+		for (i = 0; i < nr_segments; i++) {
+			ret = copy_from_user(&segment, &cs[i], sizeof(segment));
+			if (ret)
+				break;
+
+			image->segment[i] = (struct kexec_segment) {
+				.buf   = compat_ptr(segment.buf),
+				.bufsz = segment.bufsz,
+				.mem   = segment.mem,
+				.memsz = segment.memsz,
+			};
+		}
+	} else {
+		ret = copy_from_user(image->segment, segments, segment_bytes);
+	}
 	if (ret)
 		ret = -EFAULT;
 
@@ -264,58 +282,3 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
 
 	return result;
 }
-
-#ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE4(kexec_load, compat_ulong_t, entry,
-		       compat_ulong_t, nr_segments,
-		       struct compat_kexec_segment __user *, segments,
-		       compat_ulong_t, flags)
-{
-	struct compat_kexec_segment in;
-	struct kexec_segment out, __user *ksegments;
-	unsigned long i, result;
-
-	result = kexec_load_check(nr_segments, flags);
-	if (result)
-		return result;
-
-	/* Don't allow clients that don't understand the native
-	 * architecture to do anything.
-	 */
-	if ((flags & KEXEC_ARCH_MASK) == KEXEC_ARCH_DEFAULT)
-		return -EINVAL;
-
-	ksegments = compat_alloc_user_space(nr_segments * sizeof(out));
-	for (i = 0; i < nr_segments; i++) {
-		result = copy_from_user(&in, &segments[i], sizeof(in));
-		if (result)
-			return -EFAULT;
-
-		out.buf   = compat_ptr(in.buf);
-		out.bufsz = in.bufsz;
-		out.mem   = in.mem;
-		out.memsz = in.memsz;
-
-		result = copy_to_user(&ksegments[i], &out, sizeof(out));
-		if (result)
-			return -EFAULT;
-	}
-
-	/* Because we write directly to the reserved memory
-	 * region when loading crash kernels we need a mutex here to
-	 * prevent multiple crash  kernels from attempting to load
-	 * simultaneously, and to prevent a crash kernel from loading
-	 * over the top of a in use crash kernel.
-	 *
-	 * KISS: always take the mutex.
-	 */
-	if (!mutex_trylock(&kexec_mutex))
-		return -EBUSY;
-
-	result = do_kexec_load(entry, nr_segments, ksegments, flags);
-
-	mutex_unlock(&kexec_mutex);
-
-	return result;
-}
-#endif
-- 
2.27.0



* [PATCH 3/4] mm: remove compat_sys_move_pages
  2020-09-18 13:24 [PATCH 0/4] syscalls: remove compat_alloc_user_space callers Arnd Bergmann
  2020-09-18 13:24 ` [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro Arnd Bergmann
  2020-09-18 13:24 ` [PATCH 2/4] kexec: remove compat_sys_kexec_load syscall Arnd Bergmann
@ 2020-09-18 13:24 ` Arnd Bergmann
  2020-09-19  5:38   ` Christoph Hellwig
  2020-09-18 13:24 ` [PATCH 4/4] mm: remove compat numa syscalls Arnd Bergmann
  3 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2020-09-18 13:24 UTC
  To: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec,
	Arnd Bergmann

The compat move_pages() implementation uses compat_alloc_user_space()
for converting the pointer array. Moving the compat handling into
the function itself is a bit simpler and lets us avoid the
compat_alloc_user_space() call.
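
A minimal userspace sketch of the pointer-array handling, assuming a
caller-supplied is_compat flag in place of in_compat_syscall() and
plain memory reads in place of get_user()/copy_from_user(); the
function name get_pages_chunk is made up for the example:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read nr page addresses from user_array into chunk. A compat caller
 * passes an array of 32-bit values, so each entry is widened on its
 * own; a native caller's array can be copied in one go.
 */
static void get_pages_chunk(const void **chunk, const void *user_array,
			    size_t nr, int is_compat)
{
	const uint32_t *pages32 = user_array;
	size_t i;

	if (!is_compat) {
		memcpy(chunk, user_array, nr * sizeof(*chunk));
		return;
	}

	for (i = 0; i < nr; i++)
		chunk[i] = (const void *)(uintptr_t)pages32[i];
}
```

Converting one chunk at a time keeps the widened pointers in a small
kernel-side buffer instead of a second user-space array.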

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm64/include/asm/unistd32.h         |  2 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |  2 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |  2 +-
 arch/parisc/kernel/syscalls/syscall.tbl   |  2 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |  2 +-
 arch/s390/kernel/syscalls/syscall.tbl     |  2 +-
 arch/sparc/kernel/syscalls/syscall.tbl    |  2 +-
 arch/x86/entry/syscalls/syscall_32.tbl    |  2 +-
 arch/x86/entry/syscalls/syscall_64.tbl    |  2 +-
 include/linux/compat.h                    |  5 ---
 include/uapi/asm-generic/unistd.h         |  2 +-
 kernel/sys_ni.c                           |  1 -
 mm/migrate.c                              | 45 +++++++++++------------
 13 files changed, 32 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index b6517df74037..af793775ba98 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -699,7 +699,7 @@ __SYSCALL(__NR_tee, sys_tee)
 #define __NR_vmsplice 343
 __SYSCALL(__NR_vmsplice, compat_sys_vmsplice)
 #define __NR_move_pages 344
-__SYSCALL(__NR_move_pages, compat_sys_move_pages)
+__SYSCALL(__NR_move_pages, sys_move_pages)
 #define __NR_getcpu 345
 __SYSCALL(__NR_getcpu, sys_getcpu)
 #define __NR_epoll_pwait 346
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index ad157aab4c09..7fa1ca45e44c 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -279,7 +279,7 @@
 268	n32	sync_file_range			sys_sync_file_range
 269	n32	tee				sys_tee
 270	n32	vmsplice			compat_sys_vmsplice
-271	n32	move_pages			compat_sys_move_pages
+271	n32	move_pages			sys_move_pages
 272	n32	set_robust_list			compat_sys_set_robust_list
 273	n32	get_robust_list			compat_sys_get_robust_list
 274	n32	kexec_load			sys_kexec_load
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 57baf6c8008f..194c7fbeedf7 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -319,7 +319,7 @@
 305	o32	sync_file_range			sys_sync_file_range		sys32_sync_file_range
 306	o32	tee				sys_tee
 307	o32	vmsplice			sys_vmsplice			compat_sys_vmsplice
-308	o32	move_pages			sys_move_pages			compat_sys_move_pages
+308	o32	move_pages			sys_move_pages
 309	o32	set_robust_list			sys_set_robust_list		compat_sys_set_robust_list
 310	o32	get_robust_list			sys_get_robust_list		compat_sys_get_robust_list
 311	o32	kexec_load			sys_kexec_load
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 778bf166d7bd..5c17edaffe70 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -331,7 +331,7 @@
 292	64	sync_file_range		sys_sync_file_range
 293	common	tee			sys_tee
 294	common	vmsplice		sys_vmsplice			compat_sys_vmsplice
-295	common	move_pages		sys_move_pages			compat_sys_move_pages
+295	common	move_pages		sys_move_pages
 296	common	getcpu			sys_getcpu
 297	common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
 298	common	statfs64		sys_statfs64			compat_sys_statfs64
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index f128ba8b9a71..04fb42d7b377 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -389,7 +389,7 @@
 298	common	faccessat			sys_faccessat
 299	common	get_robust_list			sys_get_robust_list		compat_sys_get_robust_list
 300	common	set_robust_list			sys_set_robust_list		compat_sys_set_robust_list
-301	common	move_pages			sys_move_pages			compat_sys_move_pages
+301	common	move_pages			sys_move_pages
 302	common	getcpu				sys_getcpu
 303	nospu	epoll_pwait			sys_epoll_pwait			compat_sys_epoll_pwait
 304	32	utimensat			sys_utimensat_time32
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index d45952058be2..3197965d45e9 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -317,7 +317,7 @@
 307  common	sync_file_range		sys_sync_file_range		compat_sys_s390_sync_file_range
 308  common	tee			sys_tee				sys_tee
 309  common	vmsplice		sys_vmsplice			compat_sys_vmsplice
-310  common	move_pages		sys_move_pages			compat_sys_move_pages
+310  common	move_pages		sys_move_pages
 311  common	getcpu			sys_getcpu			sys_getcpu
 312  common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
 313  common	utimes			sys_utimes			sys_utimes_time32
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index a46edcdd950d..e36ac364e61a 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -370,7 +370,7 @@
 304	common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
 305	common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
 306	common	kexec_load		sys_kexec_load			sys_kexec_load
-307	common	move_pages		sys_move_pages			compat_sys_move_pages
+307	common	move_pages		sys_move_pages
 308	common	getcpu			sys_getcpu
 309	common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
 310	32	utimensat		sys_utimensat_time32
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 7e4140b78aad..b3263b8b2eae 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -328,7 +328,7 @@
 314	i386	sync_file_range		sys_ia32_sync_file_range
 315	i386	tee			sys_tee
 316	i386	vmsplice		sys_vmsplice			compat_sys_vmsplice
-317	i386	move_pages		sys_move_pages			compat_sys_move_pages
+317	i386	move_pages		sys_move_pages
 318	i386	getcpu			sys_getcpu
 319	i386	epoll_pwait		sys_epoll_pwait
 320	i386	utimensat		sys_utimensat_time32
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 9986f5f08278..4a997a0cbf47 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -389,7 +389,7 @@
 530	x32	set_robust_list		compat_sys_set_robust_list
 531	x32	get_robust_list		compat_sys_get_robust_list
 532	x32	vmsplice		compat_sys_vmsplice
-533	x32	move_pages		compat_sys_move_pages
+533	x32	move_pages		sys_move_pages
 534	x32	preadv			compat_sys_preadv64
 535	x32	pwritev			compat_sys_pwritev64
 536	x32	rt_tgsigqueueinfo	compat_sys_rt_tgsigqueueinfo
diff --git a/include/linux/compat.h b/include/linux/compat.h
index a7a5a0ff59ef..db1d7ac2c9e0 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -763,11 +763,6 @@ asmlinkage long compat_sys_set_mempolicy(int mode, compat_ulong_t __user *nmask,
 asmlinkage long compat_sys_migrate_pages(compat_pid_t pid,
 		compat_ulong_t maxnode, const compat_ulong_t __user *old_nodes,
 		const compat_ulong_t __user *new_nodes);
-asmlinkage long compat_sys_move_pages(pid_t pid, compat_ulong_t nr_pages,
-				      __u32 __user *pages,
-				      const int __user *nodes,
-				      int __user *status,
-				      int flags);
 
 asmlinkage long compat_sys_rt_tgsigqueueinfo(compat_pid_t tgid,
 					compat_pid_t pid, int sig,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 83f1fc7fd3d7..4da51702fb21 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -681,7 +681,7 @@ __SC_COMP(__NR_set_mempolicy, sys_set_mempolicy, compat_sys_set_mempolicy)
 #define __NR_migrate_pages 238
 __SC_COMP(__NR_migrate_pages, sys_migrate_pages, compat_sys_migrate_pages)
 #define __NR_move_pages 239
-__SC_COMP(__NR_move_pages, sys_move_pages, compat_sys_move_pages)
+__SYSCALL(__NR_move_pages, sys_move_pages)
 #endif
 
 #define __NR_rt_tgsigqueueinfo 240
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index c925d1e1777e..783a24ceee88 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -290,7 +290,6 @@ COND_SYSCALL_COMPAT(set_mempolicy);
 COND_SYSCALL(migrate_pages);
 COND_SYSCALL_COMPAT(migrate_pages);
 COND_SYSCALL(move_pages);
-COND_SYSCALL_COMPAT(move_pages);
 
 COND_SYSCALL(perf_event_open);
 COND_SYSCALL(accept4);
diff --git a/mm/migrate.c b/mm/migrate.c
index 34a842a8eb6a..e9dfbde5f12c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1835,6 +1835,27 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 	mmap_read_unlock(mm);
 }
 
+static int put_pages_array(const void __user *chunk_pages[],
+			   const void __user * __user *pages,
+			   unsigned long chunk_nr)
+{
+	compat_uptr_t __user *pages32 = (compat_uptr_t __user *)pages;
+	compat_uptr_t p;
+	int i;
+
+	if (!in_compat_syscall())
+		return copy_from_user(chunk_pages, pages,
+				      chunk_nr * sizeof(*chunk_pages));
+
+	for (i = 0; i < chunk_nr; i++) {
+		if (get_user(p, pages32 + i))
+			return -EFAULT;
+		chunk_pages[i] = compat_ptr(p);
+	}
+
+	return 0;
+}
+
 /*
  * Determine the nodes of a user array of pages and store it in
  * a user array of status.
@@ -1854,7 +1875,7 @@ static int do_pages_stat(struct mm_struct *mm, unsigned long nr_pages,
 		if (chunk_nr > DO_PAGES_STAT_CHUNK_NR)
 			chunk_nr = DO_PAGES_STAT_CHUNK_NR;
 
-		if (copy_from_user(chunk_pages, pages, chunk_nr * sizeof(*chunk_pages)))
+		if (put_pages_array(chunk_pages, pages, chunk_nr))
 			break;
 
 		do_pages_stat_array(mm, chunk_nr, chunk_pages, chunk_status);
@@ -1943,28 +1964,6 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
 	return kernel_move_pages(pid, nr_pages, pages, nodes, status, flags);
 }
 
-#ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE6(move_pages, pid_t, pid, compat_ulong_t, nr_pages,
-		       compat_uptr_t __user *, pages32,
-		       const int __user *, nodes,
-		       int __user *, status,
-		       int, flags)
-{
-	const void __user * __user *pages;
-	int i;
-
-	pages = compat_alloc_user_space(nr_pages * sizeof(void *));
-	for (i = 0; i < nr_pages; i++) {
-		compat_uptr_t p;
-
-		if (get_user(p, pages32 + i) ||
-			put_user(compat_ptr(p), pages + i))
-			return -EFAULT;
-	}
-	return kernel_move_pages(pid, nr_pages, pages, nodes, status, flags);
-}
-#endif /* CONFIG_COMPAT */
-
 #ifdef CONFIG_NUMA_BALANCING
 /*
  * Returns true if this is a safe migration target node for misplaced NUMA
-- 
2.27.0



* [PATCH 4/4] mm: remove compat numa syscalls
  2020-09-18 13:24 [PATCH 0/4] syscalls: remove compat_alloc_user_space callers Arnd Bergmann
                   ` (2 preceding siblings ...)
  2020-09-18 13:24 ` [PATCH 3/4] mm: remove compat_sys_move_pages Arnd Bergmann
@ 2020-09-18 13:24 ` Arnd Bergmann
  2020-09-19  5:41   ` Christoph Hellwig
  3 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2020-09-18 13:24 UTC
  To: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton
  Cc: linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec,
	Arnd Bergmann

The compat implementations for mbind, get_mempolicy, set_mempolicy
and migrate_pages are just there to handle the subtly different
layout of bitmaps on 32-bit hosts.

The compat implementation however lacks some of the checks that
are present in the native one, in particular for checking that
the extra bits are all zero when user space has a larger mask
size than the kernel. Worse, those extra bits do not get cleared
when copying in or out of the kernel, which can lead to incorrect
data as well.

Unify the implementation to handle the compat bitmap layout directly
in the get_nodes() and copy_nodes_to_user() helpers.  Splitting out
the get_bitmap() helper from get_nodes() also helps readability of the
native case.
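
To make the layout difference concrete, here is a small userspace
sketch of packing a bitmap supplied as 32-bit words into 64-bit words.
It is not the get_bitmap() helper from this patch: the name
pack_compat_bitmap is invented, and the word order (even-numbered
32-bit word in the low half) is an assumption of the sketch. It also
clears everything past nbits, which is the kind of cleanup the old
compat path was missing:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pack nbits bits from src (32-bit words) into dst (64-bit words).
 * src[2*i] becomes the low half of dst[i] and src[2*i + 1] the high
 * half. With an odd number of 32-bit words, the missing high half is
 * treated as zero rather than read past the end of src.
 */
static void pack_compat_bitmap(uint64_t *dst, const uint32_t *src,
			       size_t nbits)
{
	size_t nwords32 = (nbits + 31) / 32;
	size_t nwords64 = (nbits + 63) / 64;
	size_t i;

	for (i = 0; i < nwords64; i++) {
		uint64_t lo = src[2 * i];
		uint64_t hi = (2 * i + 1 < nwords32) ? src[2 * i + 1] : 0;

		dst[i] = lo | (hi << 32);
	}

	/* Clear any bits beyond nbits in the final 64-bit word. */
	if (nbits % 64)
		dst[nwords64 - 1] &= (UINT64_C(1) << (nbits % 64)) - 1;
}
```

Reading only the 32-bit words that the caller actually supplied also
avoids touching memory past the end of the source array.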

On x86, two additional problems are addressed by this: compat tasks can
pass a bitmap at the end of a mapping, causing a fault when reading
across the page boundary for a 64-bit word. x32 tasks might also run
into problems with get_mempolicy corrupting data when an odd number of
32-bit words gets passed.

On parisc the migrate_pages() system call apparently had the wrong
calling convention, as big-endian architectures expect the words
inside of a bitmap to be swapped. This is not a problem though
since parisc has no NUMA support.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/arm64/include/asm/unistd32.h         |   8 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |   8 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |   8 +-
 arch/parisc/kernel/syscalls/syscall.tbl   |   6 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |   8 +-
 arch/s390/kernel/syscalls/syscall.tbl     |   8 +-
 arch/sparc/kernel/syscalls/syscall.tbl    |   8 +-
 arch/x86/entry/syscalls/syscall_32.tbl    |   2 +-
 include/linux/compat.h                    |  15 --
 include/uapi/asm-generic/unistd.h         |   8 +-
 kernel/kexec.c                            |   6 +-
 kernel/sys_ni.c                           |   4 -
 mm/mempolicy.c                            | 193 +++++-----------------
 13 files changed, 79 insertions(+), 203 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index af793775ba98..31479f7120a0 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -649,11 +649,11 @@ __SYSCALL(__NR_inotify_add_watch, sys_inotify_add_watch)
 #define __NR_inotify_rm_watch 318
 __SYSCALL(__NR_inotify_rm_watch, sys_inotify_rm_watch)
 #define __NR_mbind 319
-__SYSCALL(__NR_mbind, compat_sys_mbind)
+__SYSCALL(__NR_mbind, sys_mbind)
 #define __NR_get_mempolicy 320
-__SYSCALL(__NR_get_mempolicy, compat_sys_get_mempolicy)
+__SYSCALL(__NR_get_mempolicy, sys_get_mempolicy)
 #define __NR_set_mempolicy 321
-__SYSCALL(__NR_set_mempolicy, compat_sys_set_mempolicy)
+__SYSCALL(__NR_set_mempolicy, sys_set_mempolicy)
 #define __NR_openat 322
 __SYSCALL(__NR_openat, compat_sys_openat)
 #define __NR_mkdirat 323
@@ -811,7 +811,7 @@ __SYSCALL(__NR_rseq, sys_rseq)
 #define __NR_io_pgetevents 399
 __SYSCALL(__NR_io_pgetevents, compat_sys_io_pgetevents)
 #define __NR_migrate_pages 400
-__SYSCALL(__NR_migrate_pages, compat_sys_migrate_pages)
+__SYSCALL(__NR_migrate_pages, sys_migrate_pages)
 #define __NR_kexec_file_load 401
 __SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)
 /* 402 is unused */
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 7fa1ca45e44c..15fda882d07e 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -239,9 +239,9 @@
 228	n32	clock_nanosleep			sys_clock_nanosleep_time32
 229	n32	tgkill				sys_tgkill
 230	n32	utimes				sys_utimes_time32
-231	n32	mbind				compat_sys_mbind
-232	n32	get_mempolicy			compat_sys_get_mempolicy
-233	n32	set_mempolicy			compat_sys_set_mempolicy
+231	n32	mbind				sys_mbind
+232	n32	get_mempolicy			sys_get_mempolicy
+233	n32	set_mempolicy			sys_set_mempolicy
 234	n32	mq_open				compat_sys_mq_open
 235	n32	mq_unlink			sys_mq_unlink
 236	n32	mq_timedsend			sys_mq_timedsend_time32
@@ -258,7 +258,7 @@
 247	n32	inotify_init			sys_inotify_init
 248	n32	inotify_add_watch		sys_inotify_add_watch
 249	n32	inotify_rm_watch		sys_inotify_rm_watch
-250	n32	migrate_pages			compat_sys_migrate_pages
+250	n32	migrate_pages			sys_migrate_pages
 251	n32	openat				sys_openat
 252	n32	mkdirat				sys_mkdirat
 253	n32	mknodat				sys_mknodat
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 194c7fbeedf7..6591388a9d88 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -279,9 +279,9 @@
 265	o32	clock_nanosleep			sys_clock_nanosleep_time32
 266	o32	tgkill				sys_tgkill
 267	o32	utimes				sys_utimes_time32
-268	o32	mbind				sys_mbind			compat_sys_mbind
-269	o32	get_mempolicy			sys_get_mempolicy		compat_sys_get_mempolicy
-270	o32	set_mempolicy			sys_set_mempolicy		compat_sys_set_mempolicy
+268	o32	mbind				sys_mbind
+269	o32	get_mempolicy			sys_get_mempolicy
+270	o32	set_mempolicy			sys_set_mempolicy
 271	o32	mq_open				sys_mq_open			compat_sys_mq_open
 272	o32	mq_unlink			sys_mq_unlink
 273	o32	mq_timedsend			sys_mq_timedsend_time32
@@ -298,7 +298,7 @@
 284	o32	inotify_init			sys_inotify_init
 285	o32	inotify_add_watch		sys_inotify_add_watch
 286	o32	inotify_rm_watch		sys_inotify_rm_watch
-287	o32	migrate_pages			sys_migrate_pages		compat_sys_migrate_pages
+287	o32	migrate_pages			sys_migrate_pages
 288	o32	openat				sys_openat			compat_sys_openat
 289	o32	mkdirat				sys_mkdirat
 290	o32	mknodat				sys_mknodat
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 5c17edaffe70..30f3c0146abf 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -292,9 +292,9 @@
 258	32	clock_nanosleep		sys_clock_nanosleep_time32
 258	64	clock_nanosleep		sys_clock_nanosleep
 259	common	tgkill			sys_tgkill
-260	common	mbind			sys_mbind			compat_sys_mbind
-261	common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
-262	common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
+260	common	mbind			sys_mbind
+261	common	get_mempolicy		sys_get_mempolicy
+262	common	set_mempolicy		sys_set_mempolicy
 # 263 was vserver
 264	common	add_key			sys_add_key
 265	common	request_key		sys_request_key
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 04fb42d7b377..4f5216320721 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -338,10 +338,10 @@
 256	64	sys_debug_setcontext		sys_ni_syscall
 256	spu	sys_debug_setcontext		sys_ni_syscall
 # 257 reserved for vserver
-258	nospu	migrate_pages			sys_migrate_pages		compat_sys_migrate_pages
-259	nospu	mbind				sys_mbind			compat_sys_mbind
-260	nospu	get_mempolicy			sys_get_mempolicy		compat_sys_get_mempolicy
-261	nospu	set_mempolicy			sys_set_mempolicy		compat_sys_set_mempolicy
+258	nospu	migrate_pages			sys_migrate_pages
+259	nospu	mbind				sys_mbind
+260	nospu	get_mempolicy			sys_get_mempolicy
+261	nospu	set_mempolicy			sys_set_mempolicy
 262	nospu	mq_open				sys_mq_open			compat_sys_mq_open
 263	nospu	mq_unlink			sys_mq_unlink
 264	32	mq_timedsend			sys_mq_timedsend_time32
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 3197965d45e9..70c0b830d14f 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -274,9 +274,9 @@
 265  common	statfs64		sys_statfs64			compat_sys_statfs64
 266  common	fstatfs64		sys_fstatfs64			compat_sys_fstatfs64
 267  common	remap_file_pages	sys_remap_file_pages		sys_remap_file_pages
-268  common	mbind			sys_mbind			compat_sys_mbind
-269  common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
-270  common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
+268  common	mbind			sys_mbind			sys_mbind
+269  common	get_mempolicy		sys_get_mempolicy		sys_get_mempolicy
+270  common	set_mempolicy		sys_set_mempolicy		sys_set_mempolicy
 271  common	mq_open			sys_mq_open			compat_sys_mq_open
 272  common	mq_unlink		sys_mq_unlink			sys_mq_unlink
 273  common	mq_timedsend		sys_mq_timedsend		sys_mq_timedsend_time32
@@ -293,7 +293,7 @@
 284  common	inotify_init		sys_inotify_init		sys_inotify_init
 285  common	inotify_add_watch	sys_inotify_add_watch		sys_inotify_add_watch
 286  common	inotify_rm_watch	sys_inotify_rm_watch		sys_inotify_rm_watch
-287  common	migrate_pages		sys_migrate_pages		compat_sys_migrate_pages
+287  common	migrate_pages		sys_migrate_pages		sys_migrate_pages
 288  common	openat			sys_openat			compat_sys_openat
 289  common	mkdirat			sys_mkdirat			sys_mkdirat
 290  common	mknodat			sys_mknodat			sys_mknodat
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index e36ac364e61a..50ff839a2661 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -365,10 +365,10 @@
 299	common	unshare			sys_unshare
 300	common	set_robust_list		sys_set_robust_list		compat_sys_set_robust_list
 301	common	get_robust_list		sys_get_robust_list		compat_sys_get_robust_list
-302	common	migrate_pages		sys_migrate_pages		compat_sys_migrate_pages
-303	common	mbind			sys_mbind			compat_sys_mbind
-304	common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
-305	common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
+302	common	migrate_pages		sys_migrate_pages
+303	common	mbind			sys_mbind
+304	common	get_mempolicy		sys_get_mempolicy
+305	common	set_mempolicy		sys_set_mempolicy
 306	common	kexec_load		sys_kexec_load			sys_kexec_load
 307	common	move_pages		sys_move_pages
 308	common	getcpu			sys_getcpu
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index b3263b8b2eae..d07c3fbd4697 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -286,7 +286,7 @@
 272	i386	fadvise64_64		sys_ia32_fadvise64_64
 273	i386	vserver
 274	i386	mbind			sys_mbind
-275	i386	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
+275	i386	get_mempolicy		sys_get_mempolicy
 276	i386	set_mempolicy		sys_set_mempolicy
 277	i386	mq_open			sys_mq_open			compat_sys_mq_open
 278	i386	mq_unlink		sys_mq_unlink
diff --git a/include/linux/compat.h b/include/linux/compat.h
index db1d7ac2c9e0..be06367b336c 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -749,21 +749,6 @@ asmlinkage long compat_sys_execve(const char __user *filename, const compat_uptr
 /* mm/fadvise.c: No generic prototype for fadvise64_64 */
 
 /* mm/, CONFIG_MMU only */
-asmlinkage long compat_sys_mbind(compat_ulong_t start, compat_ulong_t len,
-				 compat_ulong_t mode,
-				 compat_ulong_t __user *nmask,
-				 compat_ulong_t maxnode, compat_ulong_t flags);
-asmlinkage long compat_sys_get_mempolicy(int __user *policy,
-					 compat_ulong_t __user *nmask,
-					 compat_ulong_t maxnode,
-					 compat_ulong_t addr,
-					 compat_ulong_t flags);
-asmlinkage long compat_sys_set_mempolicy(int mode, compat_ulong_t __user *nmask,
-					 compat_ulong_t maxnode);
-asmlinkage long compat_sys_migrate_pages(compat_pid_t pid,
-		compat_ulong_t maxnode, const compat_ulong_t __user *old_nodes,
-		const compat_ulong_t __user *new_nodes);
-
 asmlinkage long compat_sys_rt_tgsigqueueinfo(compat_pid_t tgid,
 					compat_pid_t pid, int sig,
 					struct compat_siginfo __user *uinfo);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 4da51702fb21..4e31f9b68a8f 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -673,13 +673,13 @@ __SYSCALL(__NR_madvise, sys_madvise)
 #define __NR_remap_file_pages 234
 __SYSCALL(__NR_remap_file_pages, sys_remap_file_pages)
 #define __NR_mbind 235
-__SC_COMP(__NR_mbind, sys_mbind, compat_sys_mbind)
+__SYSCALL(__NR_mbind, sys_mbind)
 #define __NR_get_mempolicy 236
-__SC_COMP(__NR_get_mempolicy, sys_get_mempolicy, compat_sys_get_mempolicy)
+__SYSCALL(__NR_get_mempolicy, sys_get_mempolicy)
 #define __NR_set_mempolicy 237
-__SC_COMP(__NR_set_mempolicy, sys_set_mempolicy, compat_sys_set_mempolicy)
+__SYSCALL(__NR_set_mempolicy, sys_set_mempolicy)
 #define __NR_migrate_pages 238
-__SC_COMP(__NR_migrate_pages, sys_migrate_pages, compat_sys_migrate_pages)
+__SYSCALL(__NR_migrate_pages, sys_migrate_pages)
 #define __NR_move_pages 239
 __SYSCALL(__NR_move_pages, sys_move_pages)
 #endif
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 1ef7d3dc906f..0fecf2370be1 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -30,11 +30,13 @@ static int copy_user_segment_list(struct kimage *image,
 	image->nr_segments = nr_segments;
 	segment_bytes = nr_segments * sizeof(*segments);
 	if (in_compat_syscall()) {
-		struct compat_kexec_segment __user *cs = (void __user *)segments;
+		struct compat_kexec_segment __user *cs;
 		struct compat_kexec_segment segment;
 		int i;
+
+		cs = (struct compat_kexec_segment __user *)segments;
 		for (i=0; i< nr_segments; i++) {
-			copy_from_user(&segment, &cs[i], sizeof(segment));
+			ret = copy_from_user(&segment, &cs[i], sizeof(segment));
 			if (ret)
 				break;
 
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 783a24ceee88..0850111f888e 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -282,13 +282,9 @@ COND_SYSCALL(mincore);
 COND_SYSCALL(madvise);
 COND_SYSCALL(remap_file_pages);
 COND_SYSCALL(mbind);
-COND_SYSCALL_COMPAT(mbind);
 COND_SYSCALL(get_mempolicy);
-COND_SYSCALL_COMPAT(get_mempolicy);
 COND_SYSCALL(set_mempolicy);
-COND_SYSCALL_COMPAT(set_mempolicy);
 COND_SYSCALL(migrate_pages);
-COND_SYSCALL_COMPAT(migrate_pages);
 COND_SYSCALL(move_pages);
 
 COND_SYSCALL(perf_event_open);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index eddbe4e56c73..2e1b90143b2c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1374,16 +1374,30 @@ static long do_mbind(unsigned long start, unsigned long len,
 /*
  * User space interface with variable sized bitmaps for nodelists.
  */
+static int get_bitmap(unsigned long *mask, const unsigned long __user *nmask,
+		      unsigned long maxnode)
+{
+	unsigned long nlongs = BITS_TO_LONGS(maxnode);
+	int ret;
+
+	if (in_compat_syscall())
+		ret = compat_get_bitmap(mask, (void __user *)nmask, maxnode);
+	else
+		ret = copy_from_user(mask, nmask, nlongs*sizeof(unsigned long));
+
+	if (ret)
+		return -EFAULT;
+
+	if (maxnode % BITS_PER_LONG)
+		mask[nlongs-1] &= (1UL << (maxnode % BITS_PER_LONG)) - 1;
+
+	return 0;
+}
 
 /* Copy a node mask from user space. */
 static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 		     unsigned long maxnode)
 {
-	unsigned long k;
-	unsigned long t;
-	unsigned long nlongs;
-	unsigned long endmask;
-
 	--maxnode;
 	nodes_clear(*nodes);
 	if (maxnode == 0 || !nmask)
@@ -1391,49 +1405,29 @@ static int get_nodes(nodemask_t *nodes, const unsigned long __user *nmask,
 	if (maxnode > PAGE_SIZE*BITS_PER_BYTE)
 		return -EINVAL;
 
-	nlongs = BITS_TO_LONGS(maxnode);
-	if ((maxnode % BITS_PER_LONG) == 0)
-		endmask = ~0UL;
-	else
-		endmask = (1UL << (maxnode % BITS_PER_LONG)) - 1;
-
 	/*
 	 * When the user specified more nodes than supported just check
-	 * if the non supported part is all zero.
-	 *
-	 * If maxnode have more longs than MAX_NUMNODES, check
-	 * the bits in that area first. And then go through to
-	 * check the rest bits which equal or bigger than MAX_NUMNODES.
-	 * Otherwise, just check bits [MAX_NUMNODES, maxnode).
+	 * if the non supported part is all zero, one word at a time,
+	 * starting at the end.
 	 */
-	if (nlongs > BITS_TO_LONGS(MAX_NUMNODES)) {
-		for (k = BITS_TO_LONGS(MAX_NUMNODES); k < nlongs; k++) {
-			if (get_user(t, nmask + k))
-				return -EFAULT;
-			if (k == nlongs - 1) {
-				if (t & endmask)
-					return -EINVAL;
-			} else if (t)
-				return -EINVAL;
-		}
-		nlongs = BITS_TO_LONGS(MAX_NUMNODES);
-		endmask = ~0UL;
-	}
-
-	if (maxnode > MAX_NUMNODES && MAX_NUMNODES % BITS_PER_LONG != 0) {
-		unsigned long valid_mask = endmask;
+	while (maxnode > MAX_NUMNODES) {
+		unsigned long bits = min_t(unsigned long, maxnode, BITS_PER_LONG);
+		unsigned long t;
 
-		valid_mask &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
-		if (get_user(t, nmask + nlongs - 1))
+		if (get_bitmap(&t, &nmask[maxnode / BITS_PER_LONG], bits))
 			return -EFAULT;
-		if (t & valid_mask)
+
+		if (maxnode - bits >= MAX_NUMNODES) {
+			maxnode -= bits;
+		} else {
+			maxnode = MAX_NUMNODES;
+			t &= ~((1UL << (MAX_NUMNODES % BITS_PER_LONG)) - 1);
+		}
+		if (t)
 			return -EINVAL;
 	}
 
-	if (copy_from_user(nodes_addr(*nodes), nmask, nlongs*sizeof(unsigned long)))
-		return -EFAULT;
-	nodes_addr(*nodes)[nlongs-1] &= endmask;
-	return 0;
+	return get_bitmap(nodes_addr(*nodes), nmask, maxnode);
 }
 
 /* Copy a kernel node mask to user space */
@@ -1442,6 +1436,10 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
 {
 	unsigned long copy = ALIGN(maxnode-1, 64) / 8;
 	unsigned int nbytes = BITS_TO_LONGS(nr_node_ids) * sizeof(long);
+	bool compat = in_compat_syscall();
+
+	if (compat)
+		nbytes = BITS_TO_COMPAT_LONGS(nr_node_ids) * sizeof(compat_long_t);
 
 	if (copy > nbytes) {
 		if (copy > PAGE_SIZE)
@@ -1450,6 +1448,11 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
 			return -EFAULT;
 		copy = nbytes;
 	}
+
+	if (compat)
+		return compat_put_bitmap((compat_ulong_t __user *)mask,
+					 nodes_addr(*nodes), maxnode);
+
 	return copy_to_user(mask, nodes_addr(*nodes), copy) ? -EFAULT : 0;
 }
 
@@ -1641,116 +1644,6 @@ SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
 	return kernel_get_mempolicy(policy, nmask, maxnode, addr, flags);
 }
 
-#ifdef CONFIG_COMPAT
-
-COMPAT_SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
-		       compat_ulong_t __user *, nmask,
-		       compat_ulong_t, maxnode,
-		       compat_ulong_t, addr, compat_ulong_t, flags)
-{
-	long err;
-	unsigned long __user *nm = NULL;
-	unsigned long nr_bits, alloc_size;
-	DECLARE_BITMAP(bm, MAX_NUMNODES);
-
-	nr_bits = min_t(unsigned long, maxnode-1, nr_node_ids);
-	alloc_size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
-
-	if (nmask)
-		nm = compat_alloc_user_space(alloc_size);
-
-	err = kernel_get_mempolicy(policy, nm, nr_bits+1, addr, flags);
-
-	if (!err && nmask) {
-		unsigned long copy_size;
-		copy_size = min_t(unsigned long, sizeof(bm), alloc_size);
-		err = copy_from_user(bm, nm, copy_size);
-		/* ensure entire bitmap is zeroed */
-		err |= clear_user(nmask, ALIGN(maxnode-1, 8) / 8);
-		err |= compat_put_bitmap(nmask, bm, nr_bits);
-	}
-
-	return err;
-}
-
-COMPAT_SYSCALL_DEFINE3(set_mempolicy, int, mode, compat_ulong_t __user *, nmask,
-		       compat_ulong_t, maxnode)
-{
-	unsigned long __user *nm = NULL;
-	unsigned long nr_bits, alloc_size;
-	DECLARE_BITMAP(bm, MAX_NUMNODES);
-
-	nr_bits = min_t(unsigned long, maxnode-1, MAX_NUMNODES);
-	alloc_size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
-
-	if (nmask) {
-		if (compat_get_bitmap(bm, nmask, nr_bits))
-			return -EFAULT;
-		nm = compat_alloc_user_space(alloc_size);
-		if (copy_to_user(nm, bm, alloc_size))
-			return -EFAULT;
-	}
-
-	return kernel_set_mempolicy(mode, nm, nr_bits+1);
-}
-
-COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len,
-		       compat_ulong_t, mode, compat_ulong_t __user *, nmask,
-		       compat_ulong_t, maxnode, compat_ulong_t, flags)
-{
-	unsigned long __user *nm = NULL;
-	unsigned long nr_bits, alloc_size;
-	nodemask_t bm;
-
-	nr_bits = min_t(unsigned long, maxnode-1, MAX_NUMNODES);
-	alloc_size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
-
-	if (nmask) {
-		if (compat_get_bitmap(nodes_addr(bm), nmask, nr_bits))
-			return -EFAULT;
-		nm = compat_alloc_user_space(alloc_size);
-		if (copy_to_user(nm, nodes_addr(bm), alloc_size))
-			return -EFAULT;
-	}
-
-	return kernel_mbind(start, len, mode, nm, nr_bits+1, flags);
-}
-
-COMPAT_SYSCALL_DEFINE4(migrate_pages, compat_pid_t, pid,
-		       compat_ulong_t, maxnode,
-		       const compat_ulong_t __user *, old_nodes,
-		       const compat_ulong_t __user *, new_nodes)
-{
-	unsigned long __user *old = NULL;
-	unsigned long __user *new = NULL;
-	nodemask_t tmp_mask;
-	unsigned long nr_bits;
-	unsigned long size;
-
-	nr_bits = min_t(unsigned long, maxnode - 1, MAX_NUMNODES);
-	size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
-	if (old_nodes) {
-		if (compat_get_bitmap(nodes_addr(tmp_mask), old_nodes, nr_bits))
-			return -EFAULT;
-		old = compat_alloc_user_space(new_nodes ? size * 2 : size);
-		if (new_nodes)
-			new = old + size / sizeof(unsigned long);
-		if (copy_to_user(old, nodes_addr(tmp_mask), size))
-			return -EFAULT;
-	}
-	if (new_nodes) {
-		if (compat_get_bitmap(nodes_addr(tmp_mask), new_nodes, nr_bits))
-			return -EFAULT;
-		if (new == NULL)
-			new = compat_alloc_user_space(size);
-		if (copy_to_user(new, nodes_addr(tmp_mask), size))
-			return -EFAULT;
-	}
-	return kernel_migrate_pages(pid, nr_bits + 1, old, new);
-}
-
-#endif /* CONFIG_COMPAT */
-
 bool vma_migratable(struct vm_area_struct *vma)
 {
 	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
-- 
2.27.0
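
The tail handling in get_bitmap() above can be tried out in user space.
What follows is only a minimal sketch, assuming a little-endian layout
and 64-bit native words. widen_bitmap() and bits_to_words() are
hypothetical names rather than kernel functions, and no user-pointer
checking is modelled. It widens a bitmap stored as 32-bit words into
64-bit words, roughly what compat_get_bitmap() is used for in the patch,
and then clears the bits at or above nbits in the last word:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITS_PER_WORD 64u

/* Number of 64-bit words needed to hold nbits bits. */
static size_t bits_to_words(size_t nbits)
{
	return (nbits + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD;
}

/*
 * Widen a bitmap made of 32-bit words into 64-bit words (low half
 * first, i.e. little-endian word order), then clear every bit at or
 * above nbits in the final word. src must hold at least
 * (nbits + 31) / 32 entries and dst at least bits_to_words(nbits).
 */
static void widen_bitmap(uint64_t *dst, const uint32_t *src, size_t nbits)
{
	size_t nwords = bits_to_words(nbits);
	size_t n32 = (nbits + 31) / 32;
	size_t i;

	assert(nbits == 0 || (dst != NULL && src != NULL));

	for (i = 0; i < nwords; i++) {
		uint64_t lo = src[2 * i];
		/* The last 64-bit word may be backed by one 32-bit word only. */
		uint64_t hi = (2 * i + 1 < n32) ? src[2 * i + 1] : 0;

		dst[i] = lo | (hi << 32);
	}

	/* Same idea as the "maxnode % BITS_PER_LONG" mask in get_bitmap(). */
	if (nbits % DEMO_BITS_PER_WORD)
		dst[nwords - 1] &=
			(UINT64_C(1) << (nbits % DEMO_BITS_PER_WORD)) - 1;
}
```

With nbits = 70 and an all-ones input, the second output word keeps only
its low six bits; with a multiple of 64 no masking takes place.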


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro
  2020-09-18 13:24 ` [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro Arnd Bergmann
@ 2020-09-19  5:35   ` Christoph Hellwig
  2020-09-19 16:23     ` Andy Lutomirski
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-19  5:35 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton,
	linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin,
	Andy Lutomirski, Christoph Hellwig, Arnd Bergmann, Brian Gerst

On Fri, Sep 18, 2020 at 03:24:36PM +0200, Arnd Bergmann wrote:
> sys_move_pages() is an optional syscall, and once we remove
> the compat version of it in favor of the native one with an
> in_compat_syscall() check, the x32 syscall table refers to
> a __x32_sys_move_pages symbol that may not exist when the
> syscall is disabled.
> 
> Change the COND_SYSCALL() definition on x86 to also include
> the redirection for x32.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Adding the x86 maintainers and Brian Gerst.  Brian proposed another
solution to the mess that is most of the compat syscall handlers used
by x32 here:

   https://lkml.org/lkml/2020/6/16/664

hpa didn't particularly like it, but with your and my pending series
we'll soon use more native than compat syscalls for x32, so something
will need to change.

> ---
>  arch/x86/include/asm/syscall_wrapper.h | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
> index a84333adeef2..5eacd35a7f97 100644
> --- a/arch/x86/include/asm/syscall_wrapper.h
> +++ b/arch/x86/include/asm/syscall_wrapper.h
> @@ -171,12 +171,16 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
>  	__SYS_STUBx(x32, compat_sys##name,				\
>  		    SC_X86_64_REGS_TO_ARGS(x, __VA_ARGS__))
>  
> +#define __X32_COND_SYSCALL(name)					\
> +	__COND_SYSCALL(x32, sys_##name)
> +
>  #define __X32_COMPAT_COND_SYSCALL(name)					\
>  	__COND_SYSCALL(x32, compat_sys_##name)
>  
>  #define __X32_COMPAT_SYS_NI(name)					\
>  	__SYS_NI(x32, compat_sys_##name)
>  #else /* CONFIG_X86_X32 */
> +#define __X32_COND_SYSCALL(name)
>  #define __X32_COMPAT_SYS_STUB0(name)
>  #define __X32_COMPAT_SYS_STUBx(x, name, ...)
>  #define __X32_COMPAT_COND_SYSCALL(name)
> @@ -253,6 +257,7 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
>  	static long __do_sys_##sname(const struct pt_regs *__unused)
>  
>  #define COND_SYSCALL(name)						\
> +	__X32_COND_SYSCALL(name)					\
>  	__X64_COND_SYSCALL(name)					\
>  	__IA32_COND_SYSCALL(name)
>  
> -- 
> 2.27.0
> 
---end quoted text---
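
For readers unfamiliar with the mechanism in the quoted patch: a
COND_SYSCALL()-style macro provides a weak fallback so that a syscall
table entry still links when the real implementation is configured out.
A minimal user-space sketch of that idea follows, assuming GCC or Clang.
All demo_* names are hypothetical and the macro is a simplification, not
the x86 definition:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for sys_ni_syscall(): "not implemented". */
static long demo_ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Emit a weak fallback definition. If another object file provides a
 * strong definition of the same symbol, the linker uses that instead.
 */
#define DEMO_COND_SYSCALL(abi, name)				\
	__attribute__((weak)) long demo_##abi##_##name(void)	\
	{							\
		return demo_ni_syscall();			\
	}

/* One fallback per ABI whose table references the symbol. */
DEMO_COND_SYSCALL(x64, sys_move_pages)
DEMO_COND_SYSCALL(x32, sys_move_pages)

typedef long (*demo_syscall_fn)(void);

/* The x32 table can now name the native symbol unconditionally. */
static const demo_syscall_fn demo_x32_table[] = {
	demo_x32_sys_move_pages,
};

static long demo_x32_dispatch(size_t nr)
{
	assert(nr < sizeof(demo_x32_table) / sizeof(demo_x32_table[0]));
	return demo_x32_table[nr]();
}
```

Without the x32 line the table would reference an undefined symbol
whenever the syscall is disabled, which is the link failure the patch
description refers to.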

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] kexec: remove compat_sys_kexec_load syscall
  2020-09-18 13:24 ` [PATCH 2/4] kexec: remove compat_sys_kexec_load syscall Arnd Bergmann
@ 2020-09-19  5:37   ` Christoph Hellwig
  2020-09-26 21:10     ` Arnd Bergmann
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-19  5:37 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton,
	linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec

On Fri, Sep 18, 2020 at 03:24:37PM +0200, Arnd Bergmann wrote:
> The compat version of sys_kexec_load() uses compat_alloc_user_space to
> convert the user-provided arguments into the native format.
> 
> Move the conversion into the regular implementation with
> an in_compat_syscall() check to simplify it and avoid the
> compat_alloc_user_space() call.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/arm64/include/asm/unistd32.h         |  2 +-
>  arch/mips/kernel/syscalls/syscall_n32.tbl |  2 +-
>  arch/mips/kernel/syscalls/syscall_o32.tbl |  2 +-
>  arch/parisc/kernel/syscalls/syscall.tbl   |  2 +-
>  arch/powerpc/kernel/syscalls/syscall.tbl  |  2 +-
>  arch/s390/kernel/syscalls/syscall.tbl     |  2 +-
>  arch/sparc/kernel/syscalls/syscall.tbl    |  2 +-
>  arch/x86/entry/syscalls/syscall_32.tbl    |  2 +-
>  arch/x86/entry/syscalls/syscall_64.tbl    |  2 +-
>  include/linux/compat.h                    |  6 --
>  include/uapi/asm-generic/unistd.h         |  2 +-
>  kernel/kexec.c                            | 75 ++++++-----------------
>  12 files changed, 29 insertions(+), 72 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
> index 734860ac7cf9..b6517df74037 100644
> --- a/arch/arm64/include/asm/unistd32.h
> +++ b/arch/arm64/include/asm/unistd32.h
> @@ -705,7 +705,7 @@ __SYSCALL(__NR_getcpu, sys_getcpu)
>  #define __NR_epoll_pwait 346
>  __SYSCALL(__NR_epoll_pwait, compat_sys_epoll_pwait)
>  #define __NR_kexec_load 347
> -__SYSCALL(__NR_kexec_load, compat_sys_kexec_load)
> +__SYSCALL(__NR_kexec_load, sys_kexec_load)
>  #define __NR_utimensat 348
>  __SYSCALL(__NR_utimensat, sys_utimensat_time32)
>  #define __NR_signalfd 349
> diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
> index f9df9edb67a4..ad157aab4c09 100644
> --- a/arch/mips/kernel/syscalls/syscall_n32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
> @@ -282,7 +282,7 @@
>  271	n32	move_pages			compat_sys_move_pages
>  272	n32	set_robust_list			compat_sys_set_robust_list
>  273	n32	get_robust_list			compat_sys_get_robust_list
> -274	n32	kexec_load			compat_sys_kexec_load
> +274	n32	kexec_load			sys_kexec_load
>  275	n32	getcpu				sys_getcpu
>  276	n32	epoll_pwait			compat_sys_epoll_pwait
>  277	n32	ioprio_set			sys_ioprio_set
> diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
> index 195b43cf27c8..57baf6c8008f 100644
> --- a/arch/mips/kernel/syscalls/syscall_o32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
> @@ -322,7 +322,7 @@
>  308	o32	move_pages			sys_move_pages			compat_sys_move_pages
>  309	o32	set_robust_list			sys_set_robust_list		compat_sys_set_robust_list
>  310	o32	get_robust_list			sys_get_robust_list		compat_sys_get_robust_list
> -311	o32	kexec_load			sys_kexec_load			compat_sys_kexec_load
> +311	o32	kexec_load			sys_kexec_load
>  312	o32	getcpu				sys_getcpu
>  313	o32	epoll_pwait			sys_epoll_pwait			compat_sys_epoll_pwait
>  314	o32	ioprio_set			sys_ioprio_set
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
> index def64d221cd4..778bf166d7bd 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -336,7 +336,7 @@
>  297	common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
>  298	common	statfs64		sys_statfs64			compat_sys_statfs64
>  299	common	fstatfs64		sys_fstatfs64			compat_sys_fstatfs64
> -300	common	kexec_load		sys_kexec_load			compat_sys_kexec_load
> +300	common	kexec_load		sys_kexec_load
>  301	32	utimensat		sys_utimensat_time32
>  301	64	utimensat		sys_utimensat
>  302	common	signalfd		sys_signalfd			compat_sys_signalfd
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index c2d737ff2e7b..f128ba8b9a71 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -350,7 +350,7 @@
>  265	64	mq_timedreceive			sys_mq_timedreceive
>  266	nospu	mq_notify			sys_mq_notify			compat_sys_mq_notify
>  267	nospu	mq_getsetattr			sys_mq_getsetattr		compat_sys_mq_getsetattr
> -268	nospu	kexec_load			sys_kexec_load			compat_sys_kexec_load
> +268	nospu	kexec_load			sys_kexec_load
>  269	nospu	add_key				sys_add_key
>  270	nospu	request_key			sys_request_key
>  271	nospu	keyctl				sys_keyctl			compat_sys_keyctl
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index 10456bc936fb..d45952058be2 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -283,7 +283,7 @@
>  274  common	mq_timedreceive		sys_mq_timedreceive		sys_mq_timedreceive_time32
>  275  common	mq_notify		sys_mq_notify			compat_sys_mq_notify
>  276  common	mq_getsetattr		sys_mq_getsetattr		compat_sys_mq_getsetattr
> -277  common	kexec_load		sys_kexec_load			compat_sys_kexec_load
> +277  common	kexec_load		sys_kexec_load			sys_kexec_load
>  278  common	add_key			sys_add_key			sys_add_key
>  279  common	request_key		sys_request_key			sys_request_key
>  280  common	keyctl			sys_keyctl			compat_sys_keyctl
> diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
> index 4af114e84f20..a46edcdd950d 100644
> --- a/arch/sparc/kernel/syscalls/syscall.tbl
> +++ b/arch/sparc/kernel/syscalls/syscall.tbl
> @@ -369,7 +369,7 @@
>  303	common	mbind			sys_mbind			compat_sys_mbind
>  304	common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
>  305	common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
> -306	common	kexec_load		sys_kexec_load			compat_sys_kexec_load
> +306	common	kexec_load		sys_kexec_load			sys_kexec_load
>  307	common	move_pages		sys_move_pages			compat_sys_move_pages
>  308	common	getcpu			sys_getcpu
>  309	common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index 3db3d8823dc8..7e4140b78aad 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -294,7 +294,7 @@
>  280	i386	mq_timedreceive		sys_mq_timedreceive_time32
>  281	i386	mq_notify		sys_mq_notify			compat_sys_mq_notify
>  282	i386	mq_getsetattr		sys_mq_getsetattr		compat_sys_mq_getsetattr
> -283	i386	kexec_load		sys_kexec_load			compat_sys_kexec_load
> +283	i386	kexec_load		sys_kexec_load			sys_kexec_load
>  284	i386	waitid			sys_waitid			compat_sys_waitid
>  # 285 sys_setaltroot
>  286	i386	add_key			sys_add_key
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index f30d6ae9a688..9986f5f08278 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -384,7 +384,7 @@
>  525	x32	sigaltstack		compat_sys_sigaltstack
>  526	x32	timer_create		compat_sys_timer_create
>  527	x32	mq_notify		compat_sys_mq_notify
> -528	x32	kexec_load		compat_sys_kexec_load
> +528	x32	kexec_load		sys_kexec_load
>  529	x32	waitid			compat_sys_waitid
>  530	x32	set_robust_list		compat_sys_set_robust_list
>  531	x32	get_robust_list		compat_sys_get_robust_list
> diff --git a/include/linux/compat.h b/include/linux/compat.h
> index 3d96a841bd49..a7a5a0ff59ef 100644
> --- a/include/linux/compat.h
> +++ b/include/linux/compat.h
> @@ -643,12 +643,6 @@ asmlinkage long compat_sys_setitimer(int which,
>  				     struct old_itimerval32 __user *in,
>  				     struct old_itimerval32 __user *out);
>  
> -/* kernel/kexec.c */
> -asmlinkage long compat_sys_kexec_load(compat_ulong_t entry,
> -				      compat_ulong_t nr_segments,
> -				      struct compat_kexec_segment __user *,
> -				      compat_ulong_t flags);
> -
>  /* kernel/posix-timers.c */
>  asmlinkage long compat_sys_timer_create(clockid_t which_clock,
>  			struct compat_sigevent __user *timer_event_spec,
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index 995b36c2ea7d..83f1fc7fd3d7 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -342,7 +342,7 @@ __SC_COMP(__NR_setitimer, sys_setitimer, compat_sys_setitimer)
>  
>  /* kernel/kexec.c */
>  #define __NR_kexec_load 104
> -__SC_COMP(__NR_kexec_load, sys_kexec_load, compat_sys_kexec_load)
> +__SYSCALL(__NR_kexec_load, sys_kexec_load)
>  
>  /* kernel/module.c */
>  #define __NR_init_module 105
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index f977786fe498..1ef7d3dc906f 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -29,7 +29,25 @@ static int copy_user_segment_list(struct kimage *image,
>  	/* Read in the segments */
>  	image->nr_segments = nr_segments;
>  	segment_bytes = nr_segments * sizeof(*segments);
> -	ret = copy_from_user(image->segment, segments, segment_bytes);
> +	if (in_compat_syscall()) {
> +		struct compat_kexec_segment __user *cs = (void __user *)segments;
> +		struct compat_kexec_segment segment;
> +		int i;
> +		for (i=0; i< nr_segments; i++) {

Missing empty line after the variable declarations and really strange
indentation.

> +			copy_from_user(&segment, &cs[i], sizeof(segment));

Missing return value check.

> +			if (ret)
> +				break;
> +
> +			image->segment[i] = (struct kexec_segment) {
> +				.buf   = compat_ptr(segment.buf),
> +				.bufsz = segment.bufsz,
> +				.mem   = segment.mem,
> +				.memsz = segment.memsz,
> +			};
> +		}

I'd split the whole compat handling into a helper, and I'd probably
use the unsafe_get_user()/unsafe_put_user() helpers to optimize it a
little more.

> +	} else {
> +		ret = copy_from_user(image->segment, segments, segment_bytes);
> +	}
>  	if (ret)
>  		ret = -EFAULT;

Why not just

		if (copy_from_user(image->segment, segments, segment_bytes))
			ret = -EFAULT;

?
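
To make the review points above concrete, here is a rough user-space
sketch of a separate compat conversion helper that checks the result of
every copy. It is an illustration under stated assumptions, not the
kernel code: the demo_* structures only mirror the field names used in
the patch, demo_fetch() is a hypothetical stand-in for copy_from_user()
(it returns the number of bytes not copied), and a NULL source models a
faulting user pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit layout as seen by a compat task (field names from the patch). */
struct demo_compat_segment {
	uint32_t buf;
	uint32_t bufsz;
	uint32_t mem;
	uint32_t memsz;
};

/* Native 64-bit layout, simplified to plain integers. */
struct demo_segment {
	uint64_t buf;
	uint64_t bufsz;
	uint64_t mem;
	uint64_t memsz;
};

/* Returns the number of bytes NOT copied, so 0 means success. */
static size_t demo_fetch(struct demo_compat_segment *out,
			 const struct demo_compat_segment *src, size_t i)
{
	if (!src)
		return sizeof(*out);
	*out = src[i];
	return 0;
}

/* Convert nr compat segments to the native layout, or fail with -EFAULT. */
static int demo_copy_compat_segments(struct demo_segment *dst,
				     const struct demo_compat_segment *src,
				     size_t nr)
{
	size_t i;

	assert(dst != NULL);

	for (i = 0; i < nr; i++) {
		struct demo_compat_segment seg;

		/* The return value is checked on every iteration. */
		if (demo_fetch(&seg, src, i))
			return -EFAULT;

		dst[i] = (struct demo_segment) {
			.buf   = seg.buf,
			.bufsz = seg.bufsz,
			.mem   = seg.mem,
			.memsz = seg.memsz,
		};
	}

	return 0;
}
```

Keeping the loop in its own function leaves the native path as a single
copy with a direct -EFAULT check, along the lines suggested above.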

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] mm: remove compat_sys_move_pages
  2020-09-18 13:24 ` [PATCH 3/4] mm: remove compat_sys_move_pages Arnd Bergmann
@ 2020-09-19  5:38   ` Christoph Hellwig
  2020-09-26 15:21     ` Arnd Bergmann
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-19  5:38 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton,
	linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec

On Fri, Sep 18, 2020 at 03:24:38PM +0200, Arnd Bergmann wrote:
> The compat move_pages() implementation uses compat_alloc_user_space()
> for converting the pointer array. Moving the compat handling into
> the function itself is a bit simpler and lets us avoid the
> compat_alloc_user_space() call.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/arm64/include/asm/unistd32.h         |  2 +-
>  arch/mips/kernel/syscalls/syscall_n32.tbl |  2 +-
>  arch/mips/kernel/syscalls/syscall_o32.tbl |  2 +-
>  arch/parisc/kernel/syscalls/syscall.tbl   |  2 +-
>  arch/powerpc/kernel/syscalls/syscall.tbl  |  2 +-
>  arch/s390/kernel/syscalls/syscall.tbl     |  2 +-
>  arch/sparc/kernel/syscalls/syscall.tbl    |  2 +-
>  arch/x86/entry/syscalls/syscall_32.tbl    |  2 +-
>  arch/x86/entry/syscalls/syscall_64.tbl    |  2 +-
>  include/linux/compat.h                    |  5 ---
>  include/uapi/asm-generic/unistd.h         |  2 +-
>  kernel/sys_ni.c                           |  1 -
>  mm/migrate.c                              | 45 +++++++++++------------
>  13 files changed, 32 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
> index b6517df74037..af793775ba98 100644
> --- a/arch/arm64/include/asm/unistd32.h
> +++ b/arch/arm64/include/asm/unistd32.h
> @@ -699,7 +699,7 @@ __SYSCALL(__NR_tee, sys_tee)
>  #define __NR_vmsplice 343
>  __SYSCALL(__NR_vmsplice, compat_sys_vmsplice)
>  #define __NR_move_pages 344
> -__SYSCALL(__NR_move_pages, compat_sys_move_pages)
> +__SYSCALL(__NR_move_pages, sys_move_pages)
>  #define __NR_getcpu 345
>  __SYSCALL(__NR_getcpu, sys_getcpu)
>  #define __NR_epoll_pwait 346
> diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
> index ad157aab4c09..7fa1ca45e44c 100644
> --- a/arch/mips/kernel/syscalls/syscall_n32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
> @@ -279,7 +279,7 @@
>  268	n32	sync_file_range			sys_sync_file_range
>  269	n32	tee				sys_tee
>  270	n32	vmsplice			compat_sys_vmsplice
> -271	n32	move_pages			compat_sys_move_pages
> +271	n32	move_pages			sys_move_pages
>  272	n32	set_robust_list			compat_sys_set_robust_list
>  273	n32	get_robust_list			compat_sys_get_robust_list
>  274	n32	kexec_load			sys_kexec_load
> diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
> index 57baf6c8008f..194c7fbeedf7 100644
> --- a/arch/mips/kernel/syscalls/syscall_o32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
> @@ -319,7 +319,7 @@
>  305	o32	sync_file_range			sys_sync_file_range		sys32_sync_file_range
>  306	o32	tee				sys_tee
>  307	o32	vmsplice			sys_vmsplice			compat_sys_vmsplice
> -308	o32	move_pages			sys_move_pages			compat_sys_move_pages
> +308	o32	move_pages			sys_move_pages
>  309	o32	set_robust_list			sys_set_robust_list		compat_sys_set_robust_list
>  310	o32	get_robust_list			sys_get_robust_list		compat_sys_get_robust_list
>  311	o32	kexec_load			sys_kexec_load
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
> index 778bf166d7bd..5c17edaffe70 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -331,7 +331,7 @@
>  292	64	sync_file_range		sys_sync_file_range
>  293	common	tee			sys_tee
>  294	common	vmsplice		sys_vmsplice			compat_sys_vmsplice
> -295	common	move_pages		sys_move_pages			compat_sys_move_pages
> +295	common	move_pages		sys_move_pages
>  296	common	getcpu			sys_getcpu
>  297	common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
>  298	common	statfs64		sys_statfs64			compat_sys_statfs64
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index f128ba8b9a71..04fb42d7b377 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -389,7 +389,7 @@
>  298	common	faccessat			sys_faccessat
>  299	common	get_robust_list			sys_get_robust_list		compat_sys_get_robust_list
>  300	common	set_robust_list			sys_set_robust_list		compat_sys_set_robust_list
> -301	common	move_pages			sys_move_pages			compat_sys_move_pages
> +301	common	move_pages			sys_move_pages
>  302	common	getcpu				sys_getcpu
>  303	nospu	epoll_pwait			sys_epoll_pwait			compat_sys_epoll_pwait
>  304	32	utimensat			sys_utimensat_time32
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index d45952058be2..3197965d45e9 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -317,7 +317,7 @@
>  307  common	sync_file_range		sys_sync_file_range		compat_sys_s390_sync_file_range
>  308  common	tee			sys_tee				sys_tee
>  309  common	vmsplice		sys_vmsplice			compat_sys_vmsplice
> -310  common	move_pages		sys_move_pages			compat_sys_move_pages
> +310  common	move_pages		sys_move_pages
>  311  common	getcpu			sys_getcpu			sys_getcpu
>  312  common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
>  313  common	utimes			sys_utimes			sys_utimes_time32
> diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
> index a46edcdd950d..e36ac364e61a 100644
> --- a/arch/sparc/kernel/syscalls/syscall.tbl
> +++ b/arch/sparc/kernel/syscalls/syscall.tbl
> @@ -370,7 +370,7 @@
>  304	common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
>  305	common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
>  306	common	kexec_load		sys_kexec_load			sys_kexec_load
> -307	common	move_pages		sys_move_pages			compat_sys_move_pages
> +307	common	move_pages		sys_move_pages
>  308	common	getcpu			sys_getcpu
>  309	common	epoll_pwait		sys_epoll_pwait			compat_sys_epoll_pwait
>  310	32	utimensat		sys_utimensat_time32
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index 7e4140b78aad..b3263b8b2eae 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -328,7 +328,7 @@
>  314	i386	sync_file_range		sys_ia32_sync_file_range
>  315	i386	tee			sys_tee
>  316	i386	vmsplice		sys_vmsplice			compat_sys_vmsplice
> -317	i386	move_pages		sys_move_pages			compat_sys_move_pages
> +317	i386	move_pages		sys_move_pages
>  318	i386	getcpu			sys_getcpu
>  319	i386	epoll_pwait		sys_epoll_pwait
>  320	i386	utimensat		sys_utimensat_time32
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index 9986f5f08278..4a997a0cbf47 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -389,7 +389,7 @@
>  530	x32	set_robust_list		compat_sys_set_robust_list
>  531	x32	get_robust_list		compat_sys_get_robust_list
>  532	x32	vmsplice		compat_sys_vmsplice
> -533	x32	move_pages		compat_sys_move_pages
> +533	x32	move_pages		sys_move_pages
>  534	x32	preadv			compat_sys_preadv64
>  535	x32	pwritev			compat_sys_pwritev64
>  536	x32	rt_tgsigqueueinfo	compat_sys_rt_tgsigqueueinfo
> diff --git a/include/linux/compat.h b/include/linux/compat.h
> index a7a5a0ff59ef..db1d7ac2c9e0 100644
> --- a/include/linux/compat.h
> +++ b/include/linux/compat.h
> @@ -763,11 +763,6 @@ asmlinkage long compat_sys_set_mempolicy(int mode, compat_ulong_t __user *nmask,
>  asmlinkage long compat_sys_migrate_pages(compat_pid_t pid,
>  		compat_ulong_t maxnode, const compat_ulong_t __user *old_nodes,
>  		const compat_ulong_t __user *new_nodes);
> -asmlinkage long compat_sys_move_pages(pid_t pid, compat_ulong_t nr_pages,
> -				      __u32 __user *pages,
> -				      const int __user *nodes,
> -				      int __user *status,
> -				      int flags);
>  
>  asmlinkage long compat_sys_rt_tgsigqueueinfo(compat_pid_t tgid,
>  					compat_pid_t pid, int sig,
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index 83f1fc7fd3d7..4da51702fb21 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -681,7 +681,7 @@ __SC_COMP(__NR_set_mempolicy, sys_set_mempolicy, compat_sys_set_mempolicy)
>  #define __NR_migrate_pages 238
>  __SC_COMP(__NR_migrate_pages, sys_migrate_pages, compat_sys_migrate_pages)
>  #define __NR_move_pages 239
> -__SC_COMP(__NR_move_pages, sys_move_pages, compat_sys_move_pages)
> +__SYSCALL(__NR_move_pages, sys_move_pages)
>  #endif
>  
>  #define __NR_rt_tgsigqueueinfo 240
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index c925d1e1777e..783a24ceee88 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -290,7 +290,6 @@ COND_SYSCALL_COMPAT(set_mempolicy);
>  COND_SYSCALL(migrate_pages);
>  COND_SYSCALL_COMPAT(migrate_pages);
>  COND_SYSCALL(move_pages);
> -COND_SYSCALL_COMPAT(move_pages);
>  
>  COND_SYSCALL(perf_event_open);
>  COND_SYSCALL(accept4);
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 34a842a8eb6a..e9dfbde5f12c 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1835,6 +1835,27 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
>  	mmap_read_unlock(mm);
>  }
>  
> +static int put_pages_array(const void __user *chunk_pages[],
> +			   const void __user * __user *pages,
> +			   unsigned long chunk_nr)
> +{
> +	compat_uptr_t __user *pages32 = (compat_uptr_t __user *)pages;
> +	compat_uptr_t p;
> +	int i;
> +
> +	if (!in_compat_syscall())
> +		return copy_from_user(chunk_pages, pages,
> +				      chunk_nr * sizeof(*chunk_pages));
> +
> +	for (i = 0; i < chunk_nr; i++) {
> +		if (get_user(p, pages32 + i))
> +			return -EFAULT;
> +		chunk_pages[i] = compat_ptr(p);
> +	}
> +
> +	return 0;

I'd just keep the native version inline and have the compat one in
a helper, but that is just a minor detail.
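
As a rough illustration of that split, here is a userspace sketch, not the
kernel code. The compat_mode flag stands in for in_compat_syscall(), plain
array reads stand in for get_user()/compat_ptr(), memcpy() stands in for
copy_from_user(), and the names get_compat_pages_array() and
fetch_pages_chunk() are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A 32-bit user pointer must fit into a native pointer for the widening below. */
static_assert(sizeof(uint32_t) <= sizeof(uintptr_t), "compat pointer too wide");

/* Stand-in for in_compat_syscall(); a plain flag in this sketch. */
static bool compat_mode;

/* Hypothetical compat helper: widen an array of 32-bit pointers one by one. */
static int get_compat_pages_array(const void **chunk_pages,
				  const uint32_t *pages32, size_t chunk_nr)
{
	for (size_t i = 0; i < chunk_nr; i++)
		chunk_pages[i] = (const void *)(uintptr_t)pages32[i];
	return 0;
}

/* Hypothetical caller: the native copy stays inline, compat uses the helper. */
static int fetch_pages_chunk(const void **chunk_pages, const void *pages,
			     size_t chunk_nr)
{
	if (compat_mode)
		return get_compat_pages_array(chunk_pages, pages, chunk_nr);

	/*
	 * In the kernel this would be copy_from_user(), which returns the
	 * number of bytes left uncopied, so the caller has to turn a
	 * non-zero result into -EFAULT itself.
	 */
	memcpy(chunk_pages, pages, chunk_nr * sizeof(*chunk_pages));
	return 0;
}
```

With this shape the common native path reads straight through in the caller,
and only compat tasks take the per-element loop.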


* Re: [PATCH 4/4] mm: remove compat numa syscalls
  2020-09-18 13:24 ` [PATCH 4/4] mm: remove compat numa syscalls Arnd Bergmann
@ 2020-09-19  5:41   ` Christoph Hellwig
  2020-09-26 15:14     ` Arnd Bergmann
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2020-09-19  5:41 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Alexander Viro, Eric Biederman, Andrew Morton,
	linux-kernel, linux-arm-kernel, linux-arch, linux-mm, kexec

On Fri, Sep 18, 2020 at 03:24:39PM +0200, Arnd Bergmann wrote:
> The compat implementations for mbind, get_mempolicy, set_mempolicy
> and migrate_pages are just there to handle the subtly different
> layout of bitmaps on 32-bit hosts.
> 
> The compat implementation however lacks some of the checks that
> are present in the native one, in particular for checking that
> the extra bits are all zero when user space has a larger mask
> size than the kernel. Worse, those extra bits do not get cleared
> when copying in or out of the kernel, which can lead to incorrect
> data as well.
> 
> Unify the implementation to handle the compat bitmap layout directly
> in the get_nodes() and copy_nodes_to_user() helpers.  Splitting out
> the get_bitmap() helper from get_nodes() also helps readability of the
> native case.
> 
> On x86, two additional problems are addressed by this: compat tasks can
> pass a bitmap at the end of a mapping, causing a fault when reading
> across the page boundary for a 64-bit word. x32 tasks might also run
> into problems with get_mempolicy corrupting data when an odd number of
> 32-bit words gets passed.
> 
> On parisc the migrate_pages() system call apparently had the wrong
> calling convention, as big-endian architectures expect the words
> inside of a bitmap to be swapped. This is not a problem though
> since parisc has no NUMA support.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  arch/arm64/include/asm/unistd32.h         |   8 +-
>  arch/mips/kernel/syscalls/syscall_n32.tbl |   8 +-
>  arch/mips/kernel/syscalls/syscall_o32.tbl |   8 +-
>  arch/parisc/kernel/syscalls/syscall.tbl   |   6 +-
>  arch/powerpc/kernel/syscalls/syscall.tbl  |   8 +-
>  arch/s390/kernel/syscalls/syscall.tbl     |   8 +-
>  arch/sparc/kernel/syscalls/syscall.tbl    |   8 +-
>  arch/x86/entry/syscalls/syscall_32.tbl    |   2 +-
>  include/linux/compat.h                    |  15 --
>  include/uapi/asm-generic/unistd.h         |   8 +-
>  kernel/kexec.c                            |   6 +-
>  kernel/sys_ni.c                           |   4 -
>  mm/mempolicy.c                            | 193 +++++-----------------
>  13 files changed, 79 insertions(+), 203 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
> index af793775ba98..31479f7120a0 100644
> --- a/arch/arm64/include/asm/unistd32.h
> +++ b/arch/arm64/include/asm/unistd32.h
> @@ -649,11 +649,11 @@ __SYSCALL(__NR_inotify_add_watch, sys_inotify_add_watch)
>  #define __NR_inotify_rm_watch 318
>  __SYSCALL(__NR_inotify_rm_watch, sys_inotify_rm_watch)
>  #define __NR_mbind 319
> -__SYSCALL(__NR_mbind, compat_sys_mbind)
> +__SYSCALL(__NR_mbind, sys_mbind)
>  #define __NR_get_mempolicy 320
> -__SYSCALL(__NR_get_mempolicy, compat_sys_get_mempolicy)
> +__SYSCALL(__NR_get_mempolicy, sys_get_mempolicy)
>  #define __NR_set_mempolicy 321
> -__SYSCALL(__NR_set_mempolicy, compat_sys_set_mempolicy)
> +__SYSCALL(__NR_set_mempolicy, sys_set_mempolicy)
>  #define __NR_openat 322
>  __SYSCALL(__NR_openat, compat_sys_openat)
>  #define __NR_mkdirat 323
> @@ -811,7 +811,7 @@ __SYSCALL(__NR_rseq, sys_rseq)
>  #define __NR_io_pgetevents 399
>  __SYSCALL(__NR_io_pgetevents, compat_sys_io_pgetevents)
>  #define __NR_migrate_pages 400
> -__SYSCALL(__NR_migrate_pages, compat_sys_migrate_pages)
> +__SYSCALL(__NR_migrate_pages, sys_migrate_pages)
>  #define __NR_kexec_file_load 401
>  __SYSCALL(__NR_kexec_file_load, sys_kexec_file_load)
>  /* 402 is unused */
> diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
> index 7fa1ca45e44c..15fda882d07e 100644
> --- a/arch/mips/kernel/syscalls/syscall_n32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
> @@ -239,9 +239,9 @@
>  228	n32	clock_nanosleep			sys_clock_nanosleep_time32
>  229	n32	tgkill				sys_tgkill
>  230	n32	utimes				sys_utimes_time32
> -231	n32	mbind				compat_sys_mbind
> -232	n32	get_mempolicy			compat_sys_get_mempolicy
> -233	n32	set_mempolicy			compat_sys_set_mempolicy
> +231	n32	mbind				sys_mbind
> +232	n32	get_mempolicy			sys_get_mempolicy
> +233	n32	set_mempolicy			sys_set_mempolicy
>  234	n32	mq_open				compat_sys_mq_open
>  235	n32	mq_unlink			sys_mq_unlink
>  236	n32	mq_timedsend			sys_mq_timedsend_time32
> @@ -258,7 +258,7 @@
>  247	n32	inotify_init			sys_inotify_init
>  248	n32	inotify_add_watch		sys_inotify_add_watch
>  249	n32	inotify_rm_watch		sys_inotify_rm_watch
> -250	n32	migrate_pages			compat_sys_migrate_pages
> +250	n32	migrate_pages			sys_migrate_pages
>  251	n32	openat				sys_openat
>  252	n32	mkdirat				sys_mkdirat
>  253	n32	mknodat				sys_mknodat
> diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
> index 194c7fbeedf7..6591388a9d88 100644
> --- a/arch/mips/kernel/syscalls/syscall_o32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
> @@ -279,9 +279,9 @@
>  265	o32	clock_nanosleep			sys_clock_nanosleep_time32
>  266	o32	tgkill				sys_tgkill
>  267	o32	utimes				sys_utimes_time32
> -268	o32	mbind				sys_mbind			compat_sys_mbind
> -269	o32	get_mempolicy			sys_get_mempolicy		compat_sys_get_mempolicy
> -270	o32	set_mempolicy			sys_set_mempolicy		compat_sys_set_mempolicy
> +268	o32	mbind				sys_mbind
> +269	o32	get_mempolicy			sys_get_mempolicy
> +270	o32	set_mempolicy			sys_set_mempolicy
>  271	o32	mq_open				sys_mq_open			compat_sys_mq_open
>  272	o32	mq_unlink			sys_mq_unlink
>  273	o32	mq_timedsend			sys_mq_timedsend_time32
> @@ -298,7 +298,7 @@
>  284	o32	inotify_init			sys_inotify_init
>  285	o32	inotify_add_watch		sys_inotify_add_watch
>  286	o32	inotify_rm_watch		sys_inotify_rm_watch
> -287	o32	migrate_pages			sys_migrate_pages		compat_sys_migrate_pages
> +287	o32	migrate_pages			sys_migrate_pages
>  288	o32	openat				sys_openat			compat_sys_openat
>  289	o32	mkdirat				sys_mkdirat
>  290	o32	mknodat				sys_mknodat
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
> index 5c17edaffe70..30f3c0146abf 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -292,9 +292,9 @@
>  258	32	clock_nanosleep		sys_clock_nanosleep_time32
>  258	64	clock_nanosleep		sys_clock_nanosleep
>  259	common	tgkill			sys_tgkill
> -260	common	mbind			sys_mbind			compat_sys_mbind
> -261	common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
> -262	common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
> +260	common	mbind			sys_mbind
> +261	common	get_mempolicy		sys_get_mempolicy
> +262	common	set_mempolicy		sys_set_mempolicy
>  # 263 was vserver
>  264	common	add_key			sys_add_key
>  265	common	request_key		sys_request_key
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index 04fb42d7b377..4f5216320721 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -338,10 +338,10 @@
>  256	64	sys_debug_setcontext		sys_ni_syscall
>  256	spu	sys_debug_setcontext		sys_ni_syscall
>  # 257 reserved for vserver
> -258	nospu	migrate_pages			sys_migrate_pages		compat_sys_migrate_pages
> -259	nospu	mbind				sys_mbind			compat_sys_mbind
> -260	nospu	get_mempolicy			sys_get_mempolicy		compat_sys_get_mempolicy
> -261	nospu	set_mempolicy			sys_set_mempolicy		compat_sys_set_mempolicy
> +258	nospu	migrate_pages			sys_migrate_pages
> +259	nospu	mbind				sys_mbind
> +260	nospu	get_mempolicy			sys_get_mempolicy
> +261	nospu	set_mempolicy			sys_set_mempolicy
>  262	nospu	mq_open				sys_mq_open			compat_sys_mq_open
>  263	nospu	mq_unlink			sys_mq_unlink
>  264	32	mq_timedsend			sys_mq_timedsend_time32
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index 3197965d45e9..70c0b830d14f 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -274,9 +274,9 @@
>  265  common	statfs64		sys_statfs64			compat_sys_statfs64
>  266  common	fstatfs64		sys_fstatfs64			compat_sys_fstatfs64
>  267  common	remap_file_pages	sys_remap_file_pages		sys_remap_file_pages
> -268  common	mbind			sys_mbind			compat_sys_mbind
> -269  common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
> -270  common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
> +268  common	mbind			sys_mbind			sys_mbind
> +269  common	get_mempolicy		sys_get_mempolicy		sys_get_mempolicy
> +270  common	set_mempolicy		sys_set_mempolicy		sys_set_mempolicy
>  271  common	mq_open			sys_mq_open			compat_sys_mq_open
>  272  common	mq_unlink		sys_mq_unlink			sys_mq_unlink
>  273  common	mq_timedsend		sys_mq_timedsend		sys_mq_timedsend_time32
> @@ -293,7 +293,7 @@
>  284  common	inotify_init		sys_inotify_init		sys_inotify_init
>  285  common	inotify_add_watch	sys_inotify_add_watch		sys_inotify_add_watch
>  286  common	inotify_rm_watch	sys_inotify_rm_watch		sys_inotify_rm_watch
> -287  common	migrate_pages		sys_migrate_pages		compat_sys_migrate_pages
> +287  common	migrate_pages		sys_migrate_pages		sys_migrate_pages
>  288  common	openat			sys_openat			compat_sys_openat
>  289  common	mkdirat			sys_mkdirat			sys_mkdirat
>  290  common	mknodat			sys_mknodat			sys_mknodat
> diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
> index e36ac364e61a..50ff839a2661 100644
> --- a/arch/sparc/kernel/syscalls/syscall.tbl
> +++ b/arch/sparc/kernel/syscalls/syscall.tbl
> @@ -365,10 +365,10 @@
>  299	common	unshare			sys_unshare
>  300	common	set_robust_list		sys_set_robust_list		compat_sys_set_robust_list
>  301	common	get_robust_list		sys_get_robust_list		compat_sys_get_robust_list
> -302	common	migrate_pages		sys_migrate_pages		compat_sys_migrate_pages
> -303	common	mbind			sys_mbind			compat_sys_mbind
> -304	common	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
> -305	common	set_mempolicy		sys_set_mempolicy		compat_sys_set_mempolicy
> +302	common	migrate_pages		sys_migrate_pages
> +303	common	mbind			sys_mbind
> +304	common	get_mempolicy		sys_get_mempolicy
> +305	common	set_mempolicy		sys_set_mempolicy
>  306	common	kexec_load		sys_kexec_load			sys_kexec_load
>  307	common	move_pages		sys_move_pages
>  308	common	getcpu			sys_getcpu
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index b3263b8b2eae..d07c3fbd4697 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -286,7 +286,7 @@
>  272	i386	fadvise64_64		sys_ia32_fadvise64_64
>  273	i386	vserver
>  274	i386	mbind			sys_mbind
> -275	i386	get_mempolicy		sys_get_mempolicy		compat_sys_get_mempolicy
> +275	i386	get_mempolicy		sys_get_mempolicy
>  276	i386	set_mempolicy		sys_set_mempolicy
>  277	i386	mq_open			sys_mq_open			compat_sys_mq_open
>  278	i386	mq_unlink		sys_mq_unlink
> diff --git a/include/linux/compat.h b/include/linux/compat.h
> index db1d7ac2c9e0..be06367b336c 100644
> --- a/include/linux/compat.h
> +++ b/include/linux/compat.h
> @@ -749,21 +749,6 @@ asmlinkage long compat_sys_execve(const char __user *filename, const compat_uptr
>  /* mm/fadvise.c: No generic prototype for fadvise64_64 */
>  
>  /* mm/, CONFIG_MMU only */
> -asmlinkage long compat_sys_mbind(compat_ulong_t start, compat_ulong_t len,
> -				 compat_ulong_t mode,
> -				 compat_ulong_t __user *nmask,
> -				 compat_ulong_t maxnode, compat_ulong_t flags);
> -asmlinkage long compat_sys_get_mempolicy(int __user *policy,
> -					 compat_ulong_t __user *nmask,
> -					 compat_ulong_t maxnode,
> -					 compat_ulong_t addr,
> -					 compat_ulong_t flags);
> -asmlinkage long compat_sys_set_mempolicy(int mode, compat_ulong_t __user *nmask,
> -					 compat_ulong_t maxnode);
> -asmlinkage long compat_sys_migrate_pages(compat_pid_t pid,
> -		compat_ulong_t maxnode, const compat_ulong_t __user *old_nodes,
> -		const compat_ulong_t __user *new_nodes);
> -
>  asmlinkage long compat_sys_rt_tgsigqueueinfo(compat_pid_t tgid,
>  					compat_pid_t pid, int sig,
>  					struct compat_siginfo __user *uinfo);
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index 4da51702fb21..4e31f9b68a8f 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -673,13 +673,13 @@ __SYSCALL(__NR_madvise, sys_madvise)
>  #define __NR_remap_file_pages 234
>  __SYSCALL(__NR_remap_file_pages, sys_remap_file_pages)
>  #define __NR_mbind 235
> -__SC_COMP(__NR_mbind, sys_mbind, compat_sys_mbind)
> +__SYSCALL(__NR_mbind, sys_mbind)
>  #define __NR_get_mempolicy 236
> -__SC_COMP(__NR_get_mempolicy, sys_get_mempolicy, compat_sys_get_mempolicy)
> +__SYSCALL(__NR_get_mempolicy, sys_get_mempolicy)
>  #define __NR_set_mempolicy 237
> -__SC_COMP(__NR_set_mempolicy, sys_set_mempolicy, compat_sys_set_mempolicy)
> +__SYSCALL(__NR_set_mempolicy, sys_set_mempolicy)
>  #define __NR_migrate_pages 238
> -__SC_COMP(__NR_migrate_pages, sys_migrate_pages, compat_sys_migrate_pages)
> +__SYSCALL(__NR_migrate_pages, sys_migrate_pages)
>  #define __NR_move_pages 239
>  __SYSCALL(__NR_move_pages, sys_move_pages)
>  #endif
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 1ef7d3dc906f..0fecf2370be1 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -30,11 +30,13 @@ static int copy_user_segment_list(struct kimage *image,
>  	image->nr_segments = nr_segments;
>  	segment_bytes = nr_segments * sizeof(*segments);
>  	if (in_compat_syscall()) {
> -		struct compat_kexec_segment __user *cs = (void __user *)segments;
> +		struct compat_kexec_segment __user *cs;
>  		struct compat_kexec_segment segment;
>  		int i;
> +
> +		cs = (struct compat_kexec_segment __user *)segments;
>  		for (i=0; i< nr_segments; i++) {
> -			copy_from_user(&segment, &cs[i], sizeof(segment));
> +			ret = copy_from_user(&segment, &cs[i], sizeof(segment));
>  			if (ret)
>  				break;
>  
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index 783a24ceee88..0850111f888e 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -282,13 +282,9 @@ COND_SYSCALL(mincore);
>  COND_SYSCALL(madvise);
>  COND_SYSCALL(remap_file_pages);
>  COND_SYSCALL(mbind);
> -COND_SYSCALL_COMPAT(mbind);
>  COND_SYSCALL(get_mempolicy);
> -COND_SYSCALL_COMPAT(get_mempolicy);
>  COND_SYSCALL(set_mempolicy);
> -COND_SYSCALL_COMPAT(set_mempolicy);
>  COND_SYSCALL(migrate_pages);
> -COND_SYSCALL_COMPAT(migrate_pages);
>  COND_SYSCALL(move_pages);
>  
>  COND_SYSCALL(perf_event_open);
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index eddbe4e56c73..2e1b90143b2c 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1374,16 +1374,30 @@ static long do_mbind(unsigned long start, unsigned long len,
>  /*
>   * User space interface with variable sized bitmaps for nodelists.
>   */
> +static int get_bitmap(unsigned long *mask, const unsigned long __user *nmask,
> +		      unsigned long maxnode)
> +{
> +	unsigned long nlongs = BITS_TO_LONGS(maxnode);
> +	int ret;
> +
> +	if (in_compat_syscall())
> +		ret = compat_get_bitmap(mask, (void __user *)nmask, maxnode);

I'd either pass void __user all the way, or do an explicit cast from
the native to the compat version in the compat handler.

> +	else
> +		ret = copy_from_user(mask, nmask, nlongs*sizeof(unsigned long));

That whole BITS_TO_LONGS(b) * sizeof(unsigned long) pattern is
duplicated in various places including the checking of compat vs native
and probably wants a helper that includes the in_compat_syscall() check.
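
A minimal userspace sketch of such a helper, assuming a 32-bit compat word.
uint32_t stands in for compat_ulong_t, the compat_mode flag stands in for
in_compat_syscall(), and the name user_bitmap_bytes() is hypothetical:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The compat word is assumed to be exactly 32 bits wide. */
static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "compat word must be 32 bits");

/* Stand-in for in_compat_syscall(); a plain flag in this sketch. */
static bool compat_mode;

#define BITS_PER_TYPE(t)	(sizeof(t) * CHAR_BIT)
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: size in bytes of a user-space bitmap holding
 * nbits bits, rounded up to whole words of the calling task's ABI.
 */
static size_t user_bitmap_bytes(unsigned long nbits)
{
	if (compat_mode)
		return DIV_ROUND_UP(nbits, BITS_PER_TYPE(uint32_t)) *
		       sizeof(uint32_t);

	return DIV_ROUND_UP(nbits, BITS_PER_TYPE(unsigned long)) *
	       sizeof(unsigned long);
}
```

For example, a 33-bit mask occupies 8 bytes for a compat task (two 32-bit
words) and one unsigned long for a native 64-bit task.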


* Re: [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro
  2020-09-19  5:35   ` Christoph Hellwig
@ 2020-09-19 16:23     ` Andy Lutomirski
  2020-09-19 17:14       ` hpa
  0 siblings, 1 reply; 16+ messages in thread
From: Andy Lutomirski @ 2020-09-19 16:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Alexander Viro, Eric Biederman, Andrew Morton,
	LKML, linux-arm-kernel, linux-arch, Linux-MM, kexec,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin,
	Andy Lutomirski, Christoph Hellwig, Brian Gerst

On Fri, Sep 18, 2020 at 10:35 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Fri, Sep 18, 2020 at 03:24:36PM +0200, Arnd Bergmann wrote:
> > sys_move_pages() is an optional syscall, and once we remove
> > the compat version of it in favor of the native one with an
> > in_compat_syscall() check, the x32 syscall table refers to
> > a __x32_sys_move_pages symbol that may not exist when the
> > syscall is disabled.
> >
> > Change the COND_SYSCALL() definition on x86 to also include
> > the redirection for x32.
> >
> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>
> Adding the x86 maintainers and Brian Gerst.  Brian proposed another
> problem to the mess that most of the compat syscall handlers used by
> x32 here:
>
>    https://lkml.org/lkml/2020/6/16/664
>
> hpa didn't particularly like it, but with your and my pending series
> we'll soon use more native than compat syscalls for x32, so something
> will need to change..

I'm fine with either solution.


* Re: [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro
  2020-09-19 16:23     ` Andy Lutomirski
@ 2020-09-19 17:14       ` hpa
  2020-09-19 17:39         ` Andy Lutomirski
  2020-09-19 17:45         ` Brian Gerst
  0 siblings, 2 replies; 16+ messages in thread
From: hpa @ 2020-09-19 17:14 UTC (permalink / raw)
  To: Andy Lutomirski, Christoph Hellwig
  Cc: Arnd Bergmann, Alexander Viro, Eric Biederman, Andrew Morton,
	LKML, linux-arm-kernel, linux-arch, Linux-MM, kexec,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Christoph Hellwig,
	Brian Gerst

On September 19, 2020 9:23:22 AM PDT, Andy Lutomirski <luto@kernel.org> wrote:
>On Fri, Sep 18, 2020 at 10:35 PM Christoph Hellwig <hch@infradead.org>
>wrote:
>>
>> On Fri, Sep 18, 2020 at 03:24:36PM +0200, Arnd Bergmann wrote:
>> > sys_move_pages() is an optional syscall, and once we remove
>> > the compat version of it in favor of the native one with an
>> > in_compat_syscall() check, the x32 syscall table refers to
>> > a __x32_sys_move_pages symbol that may not exist when the
>> > syscall is disabled.
>> >
>> > Change the COND_SYSCALL() definition on x86 to also include
>> > the redirection for x32.
>> >
>> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>>
>> Adding the x86 maintainers and Brian Gerst.  Brian proposed another
>> problem to the mess that most of the compat syscall handlers used by
>> x32 here:
>>
>>    https://lkml.org/lkml/2020/6/16/664
>>
>> hpa didn't particularly like it, but with your and my pending series
>> we'll soon use more native than compat syscalls for x32, so something
>> will need to change..
>
>I'm fine with either solution.

My main objection was naming. x64 is a widely used synonym for x86-64, and so that is confusing.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


* Re: [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro
  2020-09-19 17:14       ` hpa
@ 2020-09-19 17:39         ` Andy Lutomirski
  2020-09-19 17:45         ` Brian Gerst
  1 sibling, 0 replies; 16+ messages in thread
From: Andy Lutomirski @ 2020-09-19 17:39 UTC (permalink / raw)
  To: hpa
  Cc: Andy Lutomirski, Christoph Hellwig, Arnd Bergmann,
	Alexander Viro, Eric Biederman, Andrew Morton, LKML,
	linux-arm-kernel, linux-arch, Linux-MM, kexec, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Christoph Hellwig, Brian Gerst


> On Sep 19, 2020, at 10:14 AM, hpa@zytor.com wrote:
> 
> On September 19, 2020 9:23:22 AM PDT, Andy Lutomirski <luto@kernel.org> wrote:
>>> On Fri, Sep 18, 2020 at 10:35 PM Christoph Hellwig <hch@infradead.org>
>>> wrote:
>>> 
>>> On Fri, Sep 18, 2020 at 03:24:36PM +0200, Arnd Bergmann wrote:
>>>> sys_move_pages() is an optional syscall, and once we remove
>>>> the compat version of it in favor of the native one with an
>>>> in_compat_syscall() check, the x32 syscall table refers to
>>>> a __x32_sys_move_pages symbol that may not exist when the
>>>> syscall is disabled.
>>>> 
>>>> Change the COND_SYSCALL() definition on x86 to also include
>>>> the redirection for x32.
>>>> 
>>>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>>> 
>>> Adding the x86 maintainers and Brian Gerst.  Brian proposed another
>>> problem to the mess that most of the compat syscall handlers used by
>>> x32 here:
>>> 
>>>   https://lkml.org/lkml/2020/6/16/664
>>> 
>>> hpa didn't particularly like it, but with your and my pending series
>>> we'll soon use more native than compat syscalls for x32, so something
>>> will need to change..
>> 
>> I'm fine with either solution.
> 
> My main objection was naming. x64 is a widely used synonym for x86-64, and so that is confusing.
> 
> 

The way I deal with the syscall wrappers is that I assume the naming makes no sense whatsoever, and I go from there. With this perspective, the patches are neither an improvement nor a worsening of the current situation.

(Similarly, the last column of the tables is useless garbage.  My last attempt to fix that stalled.)


* Re: [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro
  2020-09-19 17:14       ` hpa
  2020-09-19 17:39         ` Andy Lutomirski
@ 2020-09-19 17:45         ` Brian Gerst
  1 sibling, 0 replies; 16+ messages in thread
From: Brian Gerst @ 2020-09-19 17:45 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andy Lutomirski, Christoph Hellwig, Arnd Bergmann,
	Alexander Viro, Eric Biederman, Andrew Morton, LKML,
	linux-arm-kernel, linux-arch, Linux-MM, kexec, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Christoph Hellwig

An alternative to the patch I proposed earlier would be to use aliases
with the __x32_ prefix for the common syscalls.
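
Roughly, and only as a userspace sketch: the alias idea can be shown with
the GCC/Clang alias attribute on an ELF target. The names __x64_sys_demo and
__x32_sys_demo are hypothetical, and the real wrappers take a struct pt_regs
pointer rather than a plain long:

```c
#include <assert.h>

/* Hypothetical native entry point; it only adds one to its argument. */
long __x64_sys_demo(long arg)
{
	return arg + 1;
}

/*
 * The x32 name is a second symbol for the same function body, so the
 * x32 table can reference it without a separate implementation.
 */
long __x32_sys_demo(long arg) __attribute__((alias("__x64_sys_demo")));

/* Both names must reach the same code. */
void check_alias(void)
{
	assert(__x32_sys_demo(41) == __x64_sys_demo(41));
}
```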

--
Brian Gerst

On Sat, Sep 19, 2020 at 1:14 PM <hpa@zytor.com> wrote:
>
> On September 19, 2020 9:23:22 AM PDT, Andy Lutomirski <luto@kernel.org> wrote:
> >On Fri, Sep 18, 2020 at 10:35 PM Christoph Hellwig <hch@infradead.org>
> >wrote:
> >>
> >> On Fri, Sep 18, 2020 at 03:24:36PM +0200, Arnd Bergmann wrote:
> >> > sys_move_pages() is an optional syscall, and once we remove
> >> > the compat version of it in favor of the native one with an
> >> > in_compat_syscall() check, the x32 syscall table refers to
> >> > a __x32_sys_move_pages symbol that may not exist when the
> >> > syscall is disabled.
> >> >
> >> > Change the COND_SYSCALL() definition on x86 to also include
> >> > the redirection for x32.
> >> >
> >> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> >>
> >> Adding the x86 maintainers and Brian Gerst.  Brian proposed another
> >> problem to the mess that most of the compat syscall handlers used by
> >> x32 here:
> >>
> >>    https://lkml.org/lkml/2020/6/16/664
> >>
> >> hpa didn't particularly like it, but with your and my pending series
> >> we'll soon use more native than compat syscalls for x32, so something
> >> will need to change..
> >
> >I'm fine with either solution.
>
> My main objection was naming. x64 is a widely used synonym for x86-64, and so that is confusing.
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.

* Re: [PATCH 4/4] mm: remove compat numa syscalls
  2020-09-19  5:41   ` Christoph Hellwig
@ 2020-09-26 15:14     ` Arnd Bergmann
  0 siblings, 0 replies; 16+ messages in thread
From: Arnd Bergmann @ 2020-09-26 15:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alexander Viro, Eric Biederman, Andrew Morton, linux-kernel,
	Linux ARM, linux-arch, Linux-MM, kexec

On Sat, Sep 19, 2020 at 7:41 AM Christoph Hellwig <hch@infradead.org> wrote:
> On Fri, Sep 18, 2020 at 03:24:39PM +0200, Arnd Bergmann wrote:

> > +static int get_bitmap(unsigned long *mask, const unsigned long __user *nmask,
> > +                   unsigned long maxnode)
> > +{
> > +     unsigned long nlongs = BITS_TO_LONGS(maxnode);
> > +     int ret;
> > +
> > +     if (in_compat_syscall())
> > +             ret = compat_get_bitmap(mask, (void __user *)nmask, maxnode);
>
> I'd either pass void __user all the way, or do an explicit cast from
> the native to the compat version in the compat handler.

Changed to

        if (in_compat_syscall())
                ret = compat_get_bitmap(mask,
                                (const compat_ulong_t __user *)nmask,
                                maxnode);

> > +     else
> > +             ret = copy_from_user(mask, nmask, nlongs*sizeof(unsigned long));
>
> That whole BITS_TO_LONGS(b) * sizeof(unsigned long) pattern is
> duplicated in various places, including the checking of compat vs native,
> and probably wants a helper that includes the in_compat_syscall() check.

I don't see what you mean here. I can see how having the helper would
simplify copy_nodes_to_user(), but not how it can be shared with the
use in get_bitmap()/get_nodes().
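
For reference, a userspace sketch of the two operations under discussion, under stated assumptions: the compat layout is taken to be an array of 32-bit words, and every `demo_` name is hypothetical. `demo_bitmap_bytes()` is one possible reading of the size helper suggested above, and `demo_widen_bitmap()` illustrates the kind of conversion compat_get_bitmap() has to perform; neither is the kernel implementation:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) (((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)
#define DEMO_BITS_TO_U32S(n)  (((n) + 31) / 32)

/* Bytes occupied by a bitmap of nbits in the caller's layout. */
size_t demo_bitmap_bytes(size_t nbits, int compat)
{
	if (compat)
		return DEMO_BITS_TO_U32S(nbits) * sizeof(uint32_t);
	return DEMO_BITS_TO_LONGS(nbits) * sizeof(unsigned long);
}

/* Widen a bitmap stored as 32-bit words into native unsigned longs. */
void demo_widen_bitmap(unsigned long *dst, const uint32_t *src, size_t nbits)
{
	size_t nwords = DEMO_BITS_TO_U32S(nbits);
	size_t nlongs = DEMO_BITS_TO_LONGS(nbits);
	size_t i;

	for (i = 0; i < nlongs; i++)
		dst[i] = 0;

	for (i = 0; i < nwords; i++) {
		size_t bit = i * 32;	/* index of this word's lowest bit */

		dst[bit / DEMO_BITS_PER_LONG] |=
			(unsigned long)src[i] << (bit % DEMO_BITS_PER_LONG);
	}
}
```

The size calculation is the only part the native and compat paths have in common; the copy itself still differs, since the native case is a flat copy while the compat case has to repack words.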

      Arnd

* Re: [PATCH 3/4] mm: remove compat_sys_move_pages
  2020-09-19  5:38   ` Christoph Hellwig
@ 2020-09-26 15:21     ` Arnd Bergmann
  0 siblings, 0 replies; 16+ messages in thread
From: Arnd Bergmann @ 2020-09-26 15:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alexander Viro, Eric Biederman, Andrew Morton, linux-kernel,
	Linux ARM, linux-arch, Linux-MM, kexec

On Sat, Sep 19, 2020 at 7:38 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> I'd just keep the native version inline and have the compat one in
> a helper, but that is just a minor detail.

Folded in this change:

diff --git a/mm/migrate.c b/mm/migrate.c
index e9dfbde5f12c..d3fa3f4bf653 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1835,18 +1835,14 @@ static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
        mmap_read_unlock(mm);
 }

-static int put_pages_array(const void __user *chunk_pages[],
-                          const void __user * __user *pages,
-                          unsigned long chunk_nr)
+static int put_compat_pages_array(const void __user *chunk_pages[],
+                                 const void __user * __user *pages,
+                                 unsigned long chunk_nr)
 {
        compat_uptr_t __user *pages32 = (compat_uptr_t __user *)pages;
        compat_uptr_t p;
        int i;

-       if (!in_compat_syscall())
-               return copy_from_user(chunk_pages, pages,
-                                     chunk_nr * sizeof(*chunk_pages));
-
        for (i = 0; i < chunk_nr; i++) {
                if (get_user(p, pages32 + i))
                        return -EFAULT;
@@ -1875,8 +1871,15 @@ static int do_pages_stat(struct mm_struct *mm, unsigned long nr_pages,
                if (chunk_nr > DO_PAGES_STAT_CHUNK_NR)
                        chunk_nr = DO_PAGES_STAT_CHUNK_NR;

-               if (put_pages_array(chunk_pages, pages, chunk_nr))
-                       break;
+               if (in_compat_syscall()) {
+                       if (put_compat_pages_array(chunk_pages, pages,
+                                                  chunk_nr))
+                               break;
+               } else {
+                       if (copy_from_user(chunk_pages, pages,
+                                     chunk_nr * sizeof(*chunk_pages)))
+                               break;
+               }

                do_pages_stat_array(mm, chunk_nr, chunk_pages, chunk_status);

It does make the separation cleaner but it's also more code, which is
why I had it in the combined function before.
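
The shape of that split can be sketched in plain userspace C. This is only an illustration: the `demo_` names are hypothetical, the `compat` flag stands in for in_compat_syscall(), and ordinary memory reads replace get_user()/copy_from_user(), so the fault handling of the real code is not modeled:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t demo_compat_uptr_t;	/* 32-bit user pointer, as in a compat ABI */

/* Widen a 32-bit user pointer value to a native pointer. */
static const void *demo_compat_ptr(demo_compat_uptr_t p)
{
	return (const void *)(uintptr_t)p;
}

/* Compat helper: widen each 32-bit entry into the native array. */
void demo_get_compat_pages_array(const void *chunk_pages[],
				 const demo_compat_uptr_t *pages32,
				 size_t chunk_nr)
{
	size_t i;

	for (i = 0; i < chunk_nr; i++)
		chunk_pages[i] = demo_compat_ptr(pages32[i]);
}

/* Caller: native copy kept inline, compat conversion in the helper. */
void demo_get_pages_chunk(const void *chunk_pages[], const void *pages,
			  size_t chunk_nr, int compat)
{
	if (compat)
		demo_get_compat_pages_array(chunk_pages, pages, chunk_nr);
	else
		memcpy(chunk_pages, pages, chunk_nr * sizeof(*chunk_pages));
}
```

The native path is a single flat copy, which is why it reads naturally inline, while the compat path needs a per-element loop.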

      Arnd

* Re: [PATCH 2/4] kexec: remove compat_sys_kexec_load syscall
  2020-09-19  5:37   ` Christoph Hellwig
@ 2020-09-26 21:10     ` Arnd Bergmann
  0 siblings, 0 replies; 16+ messages in thread
From: Arnd Bergmann @ 2020-09-26 21:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Alexander Viro, Eric Biederman, Andrew Morton, linux-kernel,
	Linux ARM, linux-arch, Linux-MM, kexec

On Sat, Sep 19, 2020 at 7:37 AM Christoph Hellwig <hch@infradead.org> wrote:

> > +             struct compat_kexec_segment __user *cs = (void __user *)segments;
> > +             struct compat_kexec_segment segment;
> > +             int i;
> > +             for (i=0; i< nr_segments; i++) {
>
> Missing empty line after the variable declarations and really strange
> indentation.
>
> > +                     copy_from_user(&segment, &cs[i], sizeof(segment));
>
> Missing return value check.
>
> > +                     if (ret)
> > +                             break;
> > +
> > +                     image->segment[i] = (struct kexec_segment) {
> > +                             .buf   = compat_ptr(segment.buf),
> > +                             .bufsz = segment.bufsz,
> > +                             .mem   = segment.mem,
> > +                             .memsz = segment.memsz,
> > +                     };
> > +             }
>
> I'd split the whole compat handling into a helper, and I'd probably
> use the unsafe_get/put user to optimize it a little more.
>
> > +     } else {
> > +             ret = copy_from_user(image->segment, segments, segment_bytes);
> > +     }
> >       if (ret)
> >               ret = -EFAULT;
>
> Why not just
>
>                 if (copy_from_user(image->segment, segments, segment_bytes))
>                         ret = -EFAULT;
>
> ?

Addressed all of these now, thanks for the suggestions!

I had already fixed the missing error handling after the kbuild bot
pointed that out. The separate function does improve the error
handling.

I ended up not using unsafe_get/put since I find the copy_from_user
based loop more readable and it should lead to smaller object code in
most cases as well. kexec is not performance critical, so readability
seems more important here.
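
A minimal userspace sketch of such a separate helper with the return value checked on every iteration. It is not the code that was folded into the patch: the `demo_` structs only mirror the four fields quoted above, and `demo_copy_from_user()` is a hypothetical stand-in that, like copy_from_user(), returns the number of bytes it could not copy:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_compat_kexec_segment {
	uint32_t buf;		/* 32-bit user pointer */
	uint32_t bufsz;
	uint32_t mem;
	uint32_t memsz;
};

struct demo_kexec_segment {
	void *buf;
	size_t bufsz;
	unsigned long mem;
	size_t memsz;
};

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied. */
static unsigned long demo_copy_from_user(void *dst, const void *src,
					 unsigned long n)
{
	if (!src)
		return n;	/* simulate a faulting user pointer */
	memcpy(dst, src, n);
	return 0;
}

/* Convert compat segments one at a time, stopping at the first fault. */
int demo_copy_compat_segments(struct demo_kexec_segment *out,
			      const struct demo_compat_kexec_segment *cs,
			      unsigned long nr_segments)
{
	struct demo_compat_kexec_segment segment;
	unsigned long i;

	for (i = 0; i < nr_segments; i++) {
		if (demo_copy_from_user(&segment, cs ? &cs[i] : NULL,
					sizeof(segment)))
			return -EFAULT;

		out[i] = (struct demo_kexec_segment) {
			.buf   = (void *)(uintptr_t)segment.buf,
			.bufsz = segment.bufsz,
			.mem   = segment.mem,
			.memsz = segment.memsz,
		};
	}

	return 0;
}
```

Returning -EFAULT directly from the helper avoids carrying a `ret` variable across both branches of the caller.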

      Arnd

Thread overview: 16+ messages
2020-09-18 13:24 [PATCH 0/4] syscalls: remove compat_alloc_user_space callers Arnd Bergmann
2020-09-18 13:24 ` [PATCH 1/4] x86: add __X32_COND_SYSCALL() macro Arnd Bergmann
2020-09-19  5:35   ` Christoph Hellwig
2020-09-19 16:23     ` Andy Lutomirski
2020-09-19 17:14       ` hpa
2020-09-19 17:39         ` Andy Lutomirski
2020-09-19 17:45         ` Brian Gerst
2020-09-18 13:24 ` [PATCH 2/4] kexec: remove compat_sys_kexec_load syscall Arnd Bergmann
2020-09-19  5:37   ` Christoph Hellwig
2020-09-26 21:10     ` Arnd Bergmann
2020-09-18 13:24 ` [PATCH 3/4] mm: remove compat_sys_move_pages Arnd Bergmann
2020-09-19  5:38   ` Christoph Hellwig
2020-09-26 15:21     ` Arnd Bergmann
2020-09-18 13:24 ` [PATCH 4/4] mm: remove compat numa syscalls Arnd Bergmann
2020-09-19  5:41   ` Christoph Hellwig
2020-09-26 15:14     ` Arnd Bergmann
