* [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/)
@ 2018-03-22  9:00 Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 01/45] fs: add ksys_getdents64() helper; remove in-kernel calls to sys_getdents64() Dominik Brodowski
                   ` (45 more replies)
  0 siblings, 46 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC
  To: linux-kernel, torvalds, viro, arnd, linux-arch

Here is a third series of patches which reduce the number of syscall
invocations from within the kernel. Once the long-term goal of removing all
such invocations is achieved, the syscall entry path can be streamlined.

This series builds on top of

- part1 (random bits and pieces) 
  http://lkml.kernel.org/r/20180315190529.20943-1-linux@dominikbrodowski.net

- part2 (net)
  http://lkml.kernel.org/r/20180316170614.5392-1-linux@dominikbrodowski.net

and replaces the RFC of a subset of this series. Most of the patches are
just "mindless" conversions and helpers. If wrappers or helpers are limited
to one subsystem, I have named them do_*(), kern_*() or __sys_*(), depending
on what was used by the subsystem and/or what was still available. Otherwise,
I have used ksys_*() to reflect that this is meant as a drop-in replacement
for sys_*() within the kernel.
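
The naming convention above can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not kernel code:
ksys_example(), sys_example() and in_kernel_caller() are hypothetical names,
and the ordinary function sys_example() merely stands in for what
SYSCALL_DEFINEx() would generate.

```c
/*
 * Userspace sketch of the helper/wrapper split. All names are
 * hypothetical; nothing here is taken from the kernel tree.
 */

/* The helper carries the actual logic and uses the syscall's argument list. */
long ksys_example(unsigned int fd, unsigned long arg)
{
	return (long)fd + (long)arg;
}

/* Stand-in for SYSCALL_DEFINE2(example, ...): a thin wrapper, nothing more. */
long sys_example(unsigned int fd, unsigned long arg)
{
	return ksys_example(fd, arg);
}

/* In-kernel users call the helper and never the syscall entry point. */
long in_kernel_caller(void)
{
	return ksys_example(3, 4);
}
```

With this split, the entry point has exactly one kind of caller (the syscall
table), which is what makes later changes to its calling convention feasible.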

With this third series, *all* in-kernel callers of sys_*() and compat_sys_*()
outside of arch/ are converted.[*] On top of this, three things may be
attempted now:

- ptregs system call conversion for x86-64.

- re-work initramfs etc. code to not use in-kernel equivalents of
  syscalls, but operate on the VFS level instead.

- re-work SYSCALL_DEFINEx() / COMPAT_SYSCALL_DEFINEx() to do the right
  thing depending on arch-specific requirements on padding, long long handling,
  etc. (Al Viro).

Also thrown in are a patch by Michael Tautschnig to use proper
SYSCALL_DEFINE0() macros on x86 and a patch by Howard McLauchlan to whitelist
all syscalls for error injection.

The whole series, including part 1 and part 2, can be found at

        https://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux.git syscalls-next

Thanks,
	Dominik

[*] Within arch/, only x86 is fully covered by these three series.

Dominik Brodowski (43):
  fs: add ksys_getdents64() helper; remove in-kernel calls to
    sys_getdents64()
  fs: add ksys_ioctl() helper; remove in-kernel calls to sys_ioctl()
  fs: add ksys_lseek() helper; remove in-kernel calls to sys_lseek()
  fs: add ksys_read() helper; remove in-kernel calls to sys_read()
  fs: add ksys_sync() helper; remove in-kernel calls to sys_sync()
  fs: add do_lookup_dcookie() helper; remove in-kernel call to syscall
  fs: add do_vmsplice() helper; remove in-kernel call to syscall
  fs: add kern_select() helper; remove in-kernel call to sys_select()
  fs: add ksys_truncate() wrapper; remove in-kernel calls to
    sys_truncate()
  fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to
    syscalls
  fs: add ksys_fallocate() wrapper; remove in-kernel calls to
    sys_fallocate()
  fs: add do_compat_fcntl64() helper; remove in-kernel call to compat
    syscall
  fs: add do_compat_select() helper; remove in-kernel call to compat
    syscall
  fs: add do_compat_signalfd4() helper; remove in-kernel call to compat
    syscall
  fs: add do_compat_futimesat() helper; remove in-kernel call to compat
    syscall
  inotify: add do_inotify_init() helper; remove in-kernel call to
    syscall
  fanotify: add do_fanotify_mark() helper; remove in-kernel call to
    syscall
  fs/quota: add kernel_quotactl() helper; remove in-kernel call to
    syscall
  fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl()
  kernel: add do_compat_sigaltstack() helper; remove in-kernel call to
    compat syscall
  kernel: add ksys_setsid() helper; remove in-kernel call to
    sys_setsid()
  kernel: provide ksys_*() wrappers for syscalls called by
    kernel/uid16.c
  sched: add do_sched_yield() helper; remove in-kernel call to
    sched_yield()
  kexec: call do_kexec_load() in compat syscall directly
  mm: add kernel_migrate_pages() helper, move compat syscall to
    mm/mempolicy.c
  mm: add kernel_move_pages() helper, move compat syscall to
    mm/migrate.c
  mm: add kernel_mbind() helper; remove in-kernel call to syscall
  mm: add kernel_[sg]et_mempolicy() helpers; remove in-kernel calls to
    syscalls
  mm: add ksys_readahead() helper; remove in-kernel calls to
    sys_readahead()
  ipc: add semtimedop syscall/compat_syscall wrappers
  ipc: add semget syscall wrapper
  ipc: add semctl syscall/compat_syscall wrappers
  ipc: add msgget syscall wrapper
  ipc: add shmget syscall wrapper
  ipc: add shmdt syscall wrapper
  ipc: add shmctl syscall/compat_syscall wrappers
  ipc: add msgctl syscall/compat_syscall wrappers
  ipc: add msgrcv syscall/compat_syscall wrappers
  ipc: add msgsnd syscall/compat_syscall wrappers
  x86: use _do_fork() in compat_sys_x86_clone()
  x86: remove compat_sys_x86_waitpid()
  x86: fix sys_sigreturn() return type to be long, not unsigned long
  kernel/sys_ni: sort cond_syscall() entries

Howard McLauchlan (1):
  bpf: whitelist all syscalls for error injection

Tautschnig, Michael (1):
  x86/sigreturn: use SYSCALL_DEFINE0

 arch/mips/kernel/linux32.c             |  12 +-
 arch/parisc/kernel/sys_parisc.c        |  16 +-
 arch/powerpc/kernel/sys_ppc32.c        |  10 +-
 arch/s390/kernel/compat_linux.c        |  14 +-
 arch/sh/kernel/sys_sh32.c              |   4 +-
 arch/sparc/kernel/setup_32.c           |   2 +-
 arch/sparc/kernel/sys_sparc32.c        |  12 +-
 arch/x86/entry/syscalls/syscall_32.tbl |   4 +-
 arch/x86/ia32/sys_ia32.c               |  26 +-
 arch/x86/include/asm/sys_ia32.h        |   3 -
 arch/x86/include/asm/syscalls.h        |   2 +-
 arch/x86/kernel/signal.c               |   5 +-
 drivers/tty/sysrq.c                    |   2 +-
 fs/dcookies.c                          |  11 +-
 fs/fcntl.c                             |  12 +-
 fs/ioctl.c                             |   7 +-
 fs/notify/fanotify/fanotify_user.c     |  14 +-
 fs/notify/inotify/inotify_user.c       |   9 +-
 fs/open.c                              |   9 +-
 fs/quota/compat.c                      |  13 +-
 fs/quota/quota.c                       |  10 +-
 fs/read_write.c                        |  36 ++-
 fs/readdir.c                           |  11 +-
 fs/select.c                            |  29 +-
 fs/signalfd.c                          |  17 +-
 fs/splice.c                            |  12 +-
 fs/sync.c                              |   7 +-
 fs/utimes.c                            |  12 +-
 include/linux/compat.h                 |   6 +
 include/linux/quotaops.h               |   3 +
 include/linux/syscalls.h               |  27 +-
 init/do_mounts.c                       |  10 +-
 init/do_mounts_initrd.c                |   4 +-
 init/do_mounts_md.c                    |  15 +-
 init/do_mounts_rd.c                    |  22 +-
 init/initramfs.c                       |   4 +-
 ipc/msg.c                              |  60 +++-
 ipc/sem.c                              |  44 ++-
 ipc/shm.c                              |  28 +-
 ipc/syscall.c                          |  58 ++--
 ipc/util.h                             |  31 ++
 kernel/compat.c                        |  55 ----
 kernel/kexec.c                         |  50 +++-
 kernel/power/hibernate.c               |   2 +-
 kernel/power/suspend.c                 |   2 +-
 kernel/power/user.c                    |   2 +-
 kernel/sched/core.c                    |   8 +-
 kernel/signal.c                        |  14 +-
 kernel/sys.c                           |  65 ++++-
 kernel/sys_ni.c                        | 506 +++++++++++++++++++++------------
 kernel/uid16.c                         |  19 +-
 kernel/uid16.h                         |  14 +
 mm/mempolicy.c                         |  92 +++++-
 mm/migrate.c                           |  39 ++-
 mm/readahead.c                         |   7 +-
 55 files changed, 1030 insertions(+), 478 deletions(-)
 create mode 100644 kernel/uid16.h

-- 
2.16.2


* [PATCH 01/45] fs: add ksys_getdents64() helper; remove in-kernel calls to sys_getdents64()
From: Dominik Brodowski @ 2018-03-22  9:00 UTC
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Alexander Viro

Using this helper allows us to avoid the in-kernel calls to the
sys_getdents64() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_getdents64().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/readdir.c             | 11 +++++++++--
 include/linux/syscalls.h |  2 ++
 init/initramfs.c         |  4 ++--
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/readdir.c b/fs/readdir.c
index 1b83b0ad183b..d97f548e6323 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -292,8 +292,8 @@ static int filldir64(struct dir_context *ctx, const char *name, int namlen,
 	return -EFAULT;
 }
 
-SYSCALL_DEFINE3(getdents64, unsigned int, fd,
-		struct linux_dirent64 __user *, dirent, unsigned int, count)
+int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
+		    unsigned int count)
 {
 	struct fd f;
 	struct linux_dirent64 __user * lastdirent;
@@ -326,6 +326,13 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd,
 	return error;
 }
 
+
+SYSCALL_DEFINE3(getdents64, unsigned int, fd,
+		struct linux_dirent64 __user *, dirent, unsigned int, count)
+{
+	return ksys_getdents64(fd, dirent, count);
+}
+
 #ifdef CONFIG_COMPAT
 struct compat_old_linux_dirent {
 	compat_ulong_t	d_ino;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 6976c9e140db..3f834dc0b456 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -972,6 +972,8 @@ int ksys_sync_file_range(int fd, loff_t offset, loff_t nbytes,
 			 unsigned int flags);
 int ksys_fchmod(unsigned int fd, umode_t mode);
 int ksys_fchown(unsigned int fd, uid_t user, gid_t group);
+int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
+		    unsigned int count);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
diff --git a/init/initramfs.c b/init/initramfs.c
index 5f2ff1d2370e..13643c46ebab 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -579,7 +579,7 @@ static void __init clean_rootfs(void)
 	}
 
 	dirp = buf;
-	num = sys_getdents64(fd, dirp, BUF_SIZE);
+	num = ksys_getdents64(fd, dirp, BUF_SIZE);
 	while (num > 0) {
 		while (num > 0) {
 			struct kstat st;
@@ -599,7 +599,7 @@ static void __init clean_rootfs(void)
 		}
 		dirp = buf;
 		memset(buf, 0, BUF_SIZE);
-		num = sys_getdents64(fd, dirp, BUF_SIZE);
+		num = ksys_getdents64(fd, dirp, BUF_SIZE);
 	}
 
 	ksys_close(fd);
-- 
2.16.2


* [PATCH 02/45] fs: add ksys_ioctl() helper; remove in-kernel calls to sys_ioctl()
From: Dominik Brodowski @ 2018-03-22  9:00 UTC
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Alexander Viro

Using this helper allows us to avoid the in-kernel calls to the
sys_ioctl() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_ioctl().

Subject to careful review, at least some of these callers could be
converted to call do_vfs_ioctl() directly in the future.
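
To illustrate the layering involved, here is a minimal userspace sketch. It
assumes a mock file table in place of the real fd handling; struct mock_file,
mock_fdget(), mock_vfs_ioctl() and mock_ksys_ioctl() are hypothetical
stand-ins and do not mirror the real kernel signatures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file; it only records the last command. */
struct mock_file {
	unsigned int last_cmd;
};

#define MOCK_NR_FDS 4
struct mock_file mock_fd_table[MOCK_NR_FDS];

/* File-level operation: the caller already holds the file object. */
int mock_vfs_ioctl(struct mock_file *filp, unsigned int cmd, unsigned long arg)
{
	(void)arg;		/* unused in this sketch */
	filp->last_cmd = cmd;
	return 0;
}

/* Mock of the fd -> file lookup; returns NULL for an invalid descriptor. */
struct mock_file *mock_fdget(unsigned int fd)
{
	return fd < MOCK_NR_FDS ? &mock_fd_table[fd] : NULL;
}

/* fd-level helper: resolve the descriptor, then defer to the file level. */
int mock_ksys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
{
	struct mock_file *filp = mock_fdget(fd);

	if (!filp)
		return -EBADF;
	return mock_vfs_ioctl(filp, cmd, arg);
}
```

A caller that already has the file object at hand could use the file-level
function and skip the descriptor lookup, which is the kind of conversion
hinted at above.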

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/ioctl.c               |  7 ++++++-
 include/linux/syscalls.h |  1 +
 init/do_mounts.c         |  8 ++++----
 init/do_mounts_initrd.c  |  2 +-
 init/do_mounts_md.c      | 15 ++++++++-------
 init/do_mounts_rd.c      |  4 ++--
 6 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 5ace7efb0d04..4823431d1c9d 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -689,7 +689,7 @@ int do_vfs_ioctl(struct file *filp, unsigned int fd, unsigned int cmd,
 	return error;
 }
 
-SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
+int ksys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
 {
 	int error;
 	struct fd f = fdget(fd);
@@ -702,3 +702,8 @@ SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
 	fdput(f);
 	return error;
 }
+
+SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd, unsigned long, arg)
+{
+	return ksys_ioctl(fd, cmd, arg);
+}
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 3f834dc0b456..f1d341d0972f 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -974,6 +974,7 @@ int ksys_fchmod(unsigned int fd, umode_t mode);
 int ksys_fchown(unsigned int fd, uid_t user, gid_t group);
 int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
 		    unsigned int count);
+int ksys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
diff --git a/init/do_mounts.c b/init/do_mounts.c
index cc1103477071..b17e0095eb4e 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -491,18 +491,18 @@ void __init change_floppy(char *fmt, ...)
 	va_end(args);
 	fd = ksys_open("/dev/root", O_RDWR | O_NDELAY, 0);
 	if (fd >= 0) {
-		sys_ioctl(fd, FDEJECT, 0);
+		ksys_ioctl(fd, FDEJECT, 0);
 		ksys_close(fd);
 	}
 	printk(KERN_NOTICE "VFS: Insert %s and press ENTER\n", buf);
 	fd = ksys_open("/dev/console", O_RDWR, 0);
 	if (fd >= 0) {
-		sys_ioctl(fd, TCGETS, (long)&termios);
+		ksys_ioctl(fd, TCGETS, (long)&termios);
 		termios.c_lflag &= ~ICANON;
-		sys_ioctl(fd, TCSETSF, (long)&termios);
+		ksys_ioctl(fd, TCSETSF, (long)&termios);
 		sys_read(fd, &c, 1);
 		termios.c_lflag |= ICANON;
-		sys_ioctl(fd, TCSETSF, (long)&termios);
+		ksys_ioctl(fd, TCSETSF, (long)&termios);
 		ksys_close(fd);
 	}
 }
diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c
index 1e3b469fb54b..d1d3e53bdeef 100644
--- a/init/do_mounts_initrd.c
+++ b/init/do_mounts_initrd.c
@@ -110,7 +110,7 @@ static void __init handle_initrd(void)
 		if (fd < 0) {
 			error = fd;
 		} else {
-			error = sys_ioctl(fd, BLKFLSBUF, 0);
+			error = ksys_ioctl(fd, BLKFLSBUF, 0);
 			ksys_close(fd);
 		}
 		printk(!error ? "okay\n" : "failed\n");
diff --git a/init/do_mounts_md.c b/init/do_mounts_md.c
index 76dcfaada3ed..7d85d172bc7e 100644
--- a/init/do_mounts_md.c
+++ b/init/do_mounts_md.c
@@ -187,7 +187,7 @@ static void __init md_setup_drive(void)
 					"array %s\n", name);
 			continue;
 		}
-		if (sys_ioctl(fd, SET_ARRAY_INFO, 0) == -EBUSY) {
+		if (ksys_ioctl(fd, SET_ARRAY_INFO, 0) == -EBUSY) {
 			printk(KERN_WARNING
 			       "md: Ignoring md=%d, already autodetected. (Use raid=noautodetect)\n",
 			       minor);
@@ -210,7 +210,7 @@ static void __init md_setup_drive(void)
 			ainfo.state = (1 << MD_SB_CLEAN);
 			ainfo.layout = 0;
 			ainfo.chunk_size = md_setup_args[ent].chunk;
-			err = sys_ioctl(fd, SET_ARRAY_INFO, (long)&ainfo);
+			err = ksys_ioctl(fd, SET_ARRAY_INFO, (long)&ainfo);
 			for (i = 0; !err && i <= MD_SB_DISKS; i++) {
 				dev = devices[i];
 				if (!dev)
@@ -220,7 +220,8 @@ static void __init md_setup_drive(void)
 				dinfo.state = (1<<MD_DISK_ACTIVE)|(1<<MD_DISK_SYNC);
 				dinfo.major = MAJOR(dev);
 				dinfo.minor = MINOR(dev);
-				err = sys_ioctl(fd, ADD_NEW_DISK, (long)&dinfo);
+				err = ksys_ioctl(fd, ADD_NEW_DISK,
+						 (long)&dinfo);
 			}
 		} else {
 			/* persistent */
@@ -230,11 +231,11 @@ static void __init md_setup_drive(void)
 					break;
 				dinfo.major = MAJOR(dev);
 				dinfo.minor = MINOR(dev);
-				sys_ioctl(fd, ADD_NEW_DISK, (long)&dinfo);
+				ksys_ioctl(fd, ADD_NEW_DISK, (long)&dinfo);
 			}
 		}
 		if (!err)
-			err = sys_ioctl(fd, RUN_ARRAY, 0);
+			err = ksys_ioctl(fd, RUN_ARRAY, 0);
 		if (err)
 			printk(KERN_WARNING "md: starting md%d failed\n", minor);
 		else {
@@ -245,7 +246,7 @@ static void __init md_setup_drive(void)
 			 */
 			ksys_close(fd);
 			fd = ksys_open(name, 0, 0);
-			sys_ioctl(fd, BLKRRPART, 0);
+			ksys_ioctl(fd, BLKRRPART, 0);
 		}
 		ksys_close(fd);
 	}
@@ -296,7 +297,7 @@ static void __init autodetect_raid(void)
 
 	fd = ksys_open("/dev/md0", 0, 0);
 	if (fd >= 0) {
-		sys_ioctl(fd, RAID_AUTORUN, raid_autopart);
+		ksys_ioctl(fd, RAID_AUTORUN, raid_autopart);
 		ksys_close(fd);
 	}
 }
diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
index a6706314baa7..4dafaed5736f 100644
--- a/init/do_mounts_rd.c
+++ b/init/do_mounts_rd.c
@@ -218,7 +218,7 @@ int __init rd_load_image(char *from)
 	 * NOTE NOTE: nblocks is not actually blocks but
 	 * the number of kibibytes of data to load into a ramdisk.
 	 */
-	if (sys_ioctl(out_fd, BLKGETSIZE, (unsigned long)&rd_blocks) < 0)
+	if (ksys_ioctl(out_fd, BLKGETSIZE, (unsigned long)&rd_blocks) < 0)
 		rd_blocks = 0;
 	else
 		rd_blocks >>= 1;
@@ -232,7 +232,7 @@ int __init rd_load_image(char *from)
 	/*
 	 * OK, time to copy in the data
 	 */
-	if (sys_ioctl(in_fd, BLKGETSIZE, (unsigned long)&devblocks) < 0)
+	if (ksys_ioctl(in_fd, BLKGETSIZE, (unsigned long)&devblocks) < 0)
 		devblocks = 0;
 	else
 		devblocks >>= 1;
-- 
2.16.2


* [PATCH 03/45] fs: add ksys_lseek() helper; remove in-kernel calls to sys_lseek()
From: Dominik Brodowski @ 2018-03-22  9:00 UTC
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Alexander Viro

Using this helper allows us to avoid the in-kernel calls to the
sys_lseek() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_lseek().
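
The compat entry point for lseek also ends up in the helper, passing a
narrower offset type. A small userspace sketch of why that is safe for
negative offsets follows. It assumes a signed 32-bit compat offset and a
signed 64-bit native offset; the mock_* names are hypothetical, and only a
SEEK_CUR-style relative seek on a single global position is modelled.

```c
#include <stdint.h>

/* Assumed widths: 32-bit signed compat offset, 64-bit signed native offset. */
typedef int32_t mock_compat_off_t;
typedef int64_t mock_off_t;

/* Single global file position, starting at 100 for the sake of the example. */
mock_off_t mock_pos = 100;

/* Native helper: relative seek only, returns the new position. */
mock_off_t mock_ksys_lseek(mock_off_t offset)
{
	mock_pos += offset;
	return mock_pos;
}

/*
 * Compat wrapper: the 32-bit offset is converted to the 64-bit type at the
 * call, so a negative value is sign-extended and stays negative.
 */
mock_off_t mock_compat_sys_lseek(mock_compat_off_t offset)
{
	return mock_ksys_lseek(offset);
}
```

Because the conversion happens through the helper's typed parameter, the
wrapper needs no explicit cast.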

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/read_write.c          | 9 +++++++--
 include/linux/syscalls.h | 1 +
 init/do_mounts_rd.c      | 8 ++++----
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 8e8f0b4f52e2..b38b008a078e 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -301,7 +301,7 @@ loff_t vfs_llseek(struct file *file, loff_t offset, int whence)
 }
 EXPORT_SYMBOL(vfs_llseek);
 
-SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
+off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence)
 {
 	off_t retval;
 	struct fd f = fdget_pos(fd);
@@ -319,10 +319,15 @@ SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
 	return retval;
 }
 
+SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
+{
+	return ksys_lseek(fd, offset, whence);
+}
+
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE3(lseek, unsigned int, fd, compat_off_t, offset, unsigned int, whence)
 {
-	return sys_lseek(fd, offset, whence);
+	return ksys_lseek(fd, offset, whence);
 }
 #endif
 
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f1d341d0972f..d11d06ab546e 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -975,6 +975,7 @@ int ksys_fchown(unsigned int fd, uid_t user, gid_t group);
 int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
 		    unsigned int count);
 int ksys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg);
+off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
index 4dafaed5736f..13e54148c0e0 100644
--- a/init/do_mounts_rd.c
+++ b/init/do_mounts_rd.c
@@ -90,7 +90,7 @@ identify_ramdisk_image(int fd, int start_block, decompress_fn *decompressor)
 	/*
 	 * Read block 0 to test for compressed kernel
 	 */
-	sys_lseek(fd, start_block * BLOCK_SIZE, 0);
+	ksys_lseek(fd, start_block * BLOCK_SIZE, 0);
 	sys_read(fd, buf, size);
 
 	*decompressor = decompress_method(buf, size, &compress_name);
@@ -136,7 +136,7 @@ identify_ramdisk_image(int fd, int start_block, decompress_fn *decompressor)
 	/*
 	 * Read 512 bytes further to check if cramfs is padded
 	 */
-	sys_lseek(fd, start_block * BLOCK_SIZE + 0x200, 0);
+	ksys_lseek(fd, start_block * BLOCK_SIZE + 0x200, 0);
 	sys_read(fd, buf, size);
 
 	if (cramfsb->magic == CRAMFS_MAGIC) {
@@ -150,7 +150,7 @@ identify_ramdisk_image(int fd, int start_block, decompress_fn *decompressor)
 	/*
 	 * Read block 1 to test for minix and ext2 superblock
 	 */
-	sys_lseek(fd, (start_block+1) * BLOCK_SIZE, 0);
+	ksys_lseek(fd, (start_block+1) * BLOCK_SIZE, 0);
 	sys_read(fd, buf, size);
 
 	/* Try minix */
@@ -178,7 +178,7 @@ identify_ramdisk_image(int fd, int start_block, decompress_fn *decompressor)
 	       start_block);
 
 done:
-	sys_lseek(fd, start_block * BLOCK_SIZE, 0);
+	ksys_lseek(fd, start_block * BLOCK_SIZE, 0);
 	kfree(buf);
 	return nblocks;
 }
-- 
2.16.2


* [PATCH 04/45] fs: add ksys_read() helper; remove in-kernel calls to sys_read()
From: Dominik Brodowski @ 2018-03-22  9:00 UTC
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Alexander Viro

Using this helper allows us to avoid the in-kernel calls to the
sys_read() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_read().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/s390/kernel/compat_linux.c |  2 +-
 fs/read_write.c                 |  7 ++++++-
 include/linux/syscalls.h        |  1 +
 init/do_mounts.c                |  2 +-
 init/do_mounts_rd.c             | 10 +++++-----
 5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/s390/kernel/compat_linux.c b/arch/s390/kernel/compat_linux.c
index a1fa8051fe63..fb902b591bb4 100644
--- a/arch/s390/kernel/compat_linux.c
+++ b/arch/s390/kernel/compat_linux.c
@@ -460,7 +460,7 @@ COMPAT_SYSCALL_DEFINE3(s390_read, unsigned int, fd, char __user *, buf, compat_s
 	if ((compat_ssize_t) count < 0)
 		return -EINVAL; 
 
-	return sys_read(fd, buf, count);
+	return ksys_read(fd, buf, count);
 }
 
 COMPAT_SYSCALL_DEFINE3(s390_write, unsigned int, fd, const char __user *, buf, compat_size_t, count)
diff --git a/fs/read_write.c b/fs/read_write.c
index b38b008a078e..fc441e1ac683 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -568,7 +568,7 @@ static inline void file_pos_write(struct file *file, loff_t pos)
 	file->f_pos = pos;
 }
 
-SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
+ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count)
 {
 	struct fd f = fdget_pos(fd);
 	ssize_t ret = -EBADF;
@@ -583,6 +583,11 @@ SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
 	return ret;
 }
 
+SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)
+{
+	return ksys_read(fd, buf, count);
+}
+
 ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count)
 {
 	struct fd f = fdget_pos(fd);
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index d11d06ab546e..e3a52c0bbf52 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -976,6 +976,7 @@ int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
 		    unsigned int count);
 int ksys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg);
 off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence);
+ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
diff --git a/init/do_mounts.c b/init/do_mounts.c
index b17e0095eb4e..2c71dabe5626 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -500,7 +500,7 @@ void __init change_floppy(char *fmt, ...)
 		ksys_ioctl(fd, TCGETS, (long)&termios);
 		termios.c_lflag &= ~ICANON;
 		ksys_ioctl(fd, TCSETSF, (long)&termios);
-		sys_read(fd, &c, 1);
+		ksys_read(fd, &c, 1);
 		termios.c_lflag |= ICANON;
 		ksys_ioctl(fd, TCSETSF, (long)&termios);
 		ksys_close(fd);
diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
index 13e54148c0e0..12c159824c7b 100644
--- a/init/do_mounts_rd.c
+++ b/init/do_mounts_rd.c
@@ -91,7 +91,7 @@ identify_ramdisk_image(int fd, int start_block, decompress_fn *decompressor)
 	 * Read block 0 to test for compressed kernel
 	 */
 	ksys_lseek(fd, start_block * BLOCK_SIZE, 0);
-	sys_read(fd, buf, size);
+	ksys_read(fd, buf, size);
 
 	*decompressor = decompress_method(buf, size, &compress_name);
 	if (compress_name) {
@@ -137,7 +137,7 @@ identify_ramdisk_image(int fd, int start_block, decompress_fn *decompressor)
 	 * Read 512 bytes further to check if cramfs is padded
 	 */
 	ksys_lseek(fd, start_block * BLOCK_SIZE + 0x200, 0);
-	sys_read(fd, buf, size);
+	ksys_read(fd, buf, size);
 
 	if (cramfsb->magic == CRAMFS_MAGIC) {
 		printk(KERN_NOTICE
@@ -151,7 +151,7 @@ identify_ramdisk_image(int fd, int start_block, decompress_fn *decompressor)
 	 * Read block 1 to test for minix and ext2 superblock
 	 */
 	ksys_lseek(fd, (start_block+1) * BLOCK_SIZE, 0);
-	sys_read(fd, buf, size);
+	ksys_read(fd, buf, size);
 
 	/* Try minix */
 	if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
@@ -269,7 +269,7 @@ int __init rd_load_image(char *from)
 			}
 			printk("Loading disk #%d... ", disk);
 		}
-		sys_read(in_fd, buf, BLOCK_SIZE);
+		ksys_read(in_fd, buf, BLOCK_SIZE);
 		ksys_write(out_fd, buf, BLOCK_SIZE);
 #if !defined(CONFIG_S390)
 		if (!(i % 16)) {
@@ -307,7 +307,7 @@ static int crd_infd, crd_outfd;
 
 static long __init compr_fill(void *buf, unsigned long len)
 {
-	long r = sys_read(crd_infd, buf, len);
+	long r = ksys_read(crd_infd, buf, len);
 	if (r < 0)
 		printk(KERN_ERR "RAMDISK: error while reading compressed data");
 	else if (r == 0)
-- 
2.16.2


* [PATCH 05/45] fs: add ksys_sync() helper; remove in-kernel calls to sys_sync()
From: Dominik Brodowski @ 2018-03-22  9:00 UTC
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Alexander Viro

Using this helper allows us to avoid the in-kernel calls to the
sys_sync() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. Like sys_sync(),
it takes no arguments; as the syscall always returns 0, the helper
returns void and only the syscall wrapper returns a value.
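
The shape of this particular split can be sketched in userspace C as follows.
This is an illustration only: mock_ksys_sync(), mock_sys_sync() and the
counter are hypothetical, and the counter merely stands in for the actual
writeback work.

```c
/* Counts how often the mock "sync" work has run. */
int mock_sync_count;

/* Helper for in-kernel users: nothing to report, so it returns void. */
void mock_ksys_sync(void)
{
	mock_sync_count++;
}

/* Stand-in for SYSCALL_DEFINE0(sync): only the entry point returns 0. */
long mock_sys_sync(void)
{
	mock_ksys_sync();
	return 0;
}
```

In-kernel callers that previously ignored the return value of the syscall
lose nothing by calling the void helper instead.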

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/sparc/kernel/setup_32.c | 2 +-
 drivers/tty/sysrq.c          | 2 +-
 fs/sync.c                    | 7 ++++++-
 include/linux/syscalls.h     | 1 +
 kernel/power/hibernate.c     | 2 +-
 kernel/power/suspend.c       | 2 +-
 kernel/power/user.c          | 2 +-
 7 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c
index 2e3a3e203061..13664c377196 100644
--- a/arch/sparc/kernel/setup_32.c
+++ b/arch/sparc/kernel/setup_32.c
@@ -86,7 +86,7 @@ static void prom_sync_me(void)
 	show_free_areas(0, NULL);
 	if (!is_idle_task(current)) {
 		local_irq_enable();
-		sys_sync();
+		ksys_sync();
 		local_irq_disable();
 	}
 	prom_printf("Returning to prom\n");
diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index b674793be478..6364890575ec 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -660,7 +660,7 @@ static void sysrq_do_reset(struct timer_list *t)
 
 	state->reset_requested = true;
 
-	sys_sync();
+	ksys_sync();
 	kernel_restart(NULL);
 }
 
diff --git a/fs/sync.c b/fs/sync.c
index ff947c30a6c0..9908a114d506 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -105,7 +105,7 @@ static void fdatawait_one_bdev(struct block_device *bdev, void *arg)
  * just write metadata (such as inodes or bitmaps) to block device page cache
  * and do not sync it on their own in ->sync_fs().
  */
-SYSCALL_DEFINE0(sync)
+void ksys_sync(void)
 {
 	int nowait = 0, wait = 1;
 
@@ -117,6 +117,11 @@ SYSCALL_DEFINE0(sync)
 	iterate_bdevs(fdatawait_one_bdev, NULL);
 	if (unlikely(laptop_mode))
 		laptop_sync_completion();
+}
+
+SYSCALL_DEFINE0(sync)
+{
+	ksys_sync();
 	return 0;
 }
 
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e3a52c0bbf52..1970c6817289 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -977,6 +977,7 @@ int ksys_getdents64(unsigned int fd, struct linux_dirent64 __user *dirent,
 int ksys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg);
 off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence);
 ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count);
+void ksys_sync(void);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index a5c36e9c56a6..4710f1b142fc 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -701,7 +701,7 @@ int hibernate(void)
 	}
 
 	pr_info("Syncing filesystems ... \n");
-	sys_sync();
+	ksys_sync();
 	pr_info("done.\n");
 
 	error = freeze_processes();
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 0685c4499431..4c10be0f4843 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -560,7 +560,7 @@ static int enter_state(suspend_state_t state)
 #ifndef CONFIG_SUSPEND_SKIP_SYNC
 	trace_suspend_resume(TPS("sync_filesystems"), 0, true);
 	pr_info("Syncing filesystems ... ");
-	sys_sync();
+	ksys_sync();
 	pr_cont("done.\n");
 	trace_suspend_resume(TPS("sync_filesystems"), 0, false);
 #endif
diff --git a/kernel/power/user.c b/kernel/power/user.c
index 22df9f7ff672..75c959de4b29 100644
--- a/kernel/power/user.c
+++ b/kernel/power/user.c
@@ -224,7 +224,7 @@ static long snapshot_ioctl(struct file *filp, unsigned int cmd,
 			break;
 
 		printk("Syncing filesystems ... ");
-		sys_sync();
+		ksys_sync();
 		printk("done.\n");
 
 		error = freeze_processes();
-- 
2.16.2

* [PATCH 06/45] fs: add do_lookup_dcookie() helper; remove in-kernel call to syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (4 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 05/45] fs: add ksys_sync() helper; remove in-kernel calls to sys_sync() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 07/45] fs: add do_vmsplice() " Dominik Brodowski
                   ` (39 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the fs-internal do_lookup_dcookie() helper allows us to get rid of
fs-internal calls to the sys_lookup_dcookie() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.
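
As a rough illustration of why the compat entry point keeps its own
wrapper: a 32-bit caller passes the 64-bit cookie as two 32-bit words, and
which word carries the upper half depends on endianness. The userspace
sketch below mirrors that merge. merge_cookie_words() and its big_endian
flag are hypothetical names used only for illustration; they are not part
of fs/dcookies.c, where the branch is selected at compile time via
#ifdef __BIG_ENDIAN.

```c
#include <stdint.h>

/*
 * Hypothetical userspace sketch, not kernel code: rebuild a 64-bit
 * cookie from the two 32-bit words a compat caller would pass.
 * With big_endian set, the first word (w0) carries the upper half;
 * otherwise the second word (w1) does.
 */
static uint64_t merge_cookie_words(uint32_t w0, uint32_t w1, int big_endian)
{
	/* Widen to 64 bits before shifting: a 32-bit shift by 32 is undefined. */
	if (big_endian)
		return ((uint64_t)w0 << 32) | w1;
	return ((uint64_t)w1 << 32) | w0;
}
```

Once the words are merged, both the native and the compat entry point can
hand the same u64 to one shared helper, which is the shape this patch
gives to do_lookup_dcookie().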

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/dcookies.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/dcookies.c b/fs/dcookies.c
index 0d0461cf2431..57bc96435feb 100644
--- a/fs/dcookies.c
+++ b/fs/dcookies.c
@@ -146,7 +146,7 @@ int get_dcookie(const struct path *path, unsigned long *cookie)
 /* And here is where the userspace process can look up the cookie value
  * to retrieve the path.
  */
-SYSCALL_DEFINE3(lookup_dcookie, u64, cookie64, char __user *, buf, size_t, len)
+static int do_lookup_dcookie(u64 cookie64, char __user *buf, size_t len)
 {
 	unsigned long cookie = (unsigned long)cookie64;
 	int err = -EINVAL;
@@ -203,13 +203,18 @@ SYSCALL_DEFINE3(lookup_dcookie, u64, cookie64, char __user *, buf, size_t, len)
 	return err;
 }
 
+SYSCALL_DEFINE3(lookup_dcookie, u64, cookie64, char __user *, buf, size_t, len)
+{
+	return do_lookup_dcookie(cookie64, buf, len);
+}
+
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE4(lookup_dcookie, u32, w0, u32, w1, char __user *, buf, compat_size_t, len)
 {
 #ifdef __BIG_ENDIAN
-	return sys_lookup_dcookie(((u64)w0 << 32) | w1, buf, len);
+	return do_lookup_dcookie(((u64)w0 << 32) | w1, buf, len);
 #else
-	return sys_lookup_dcookie(((u64)w1 << 32) | w0, buf, len);
+	return do_lookup_dcookie(((u64)w1 << 32) | w0, buf, len);
 #endif
 }
 #endif
-- 
2.16.2

* [PATCH 07/45] fs: add do_vmsplice() helper; remove in-kernel call to syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (5 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 06/45] fs: add do_lookup_dcookie() helper; remove in-kernel call to syscall Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 08/45] fs: add kern_select() helper; remove in-kernel call to sys_select() Dominik Brodowski
                   ` (38 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the fs-internal do_vmsplice() helper allows us to get rid of the
fs-internal call to the sys_vmsplice() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.
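
The split between a shared helper and two thin entry points might be
pictured with the following userspace sketch. Everything prefixed with
sketch_ or SKETCH_ is a hypothetical stand-in, not the code in
fs/splice.c: the "helper" merely sums segment lengths, and the compat
path widens 32-bit segments into a local array before calling it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SEGS 16	/* arbitrary cap chosen for this sketch */

/* Hypothetical stand-ins for the 32-bit and native iovec layouts. */
struct sketch_compat_iovec {
	uint32_t iov_base;	/* 32-bit user address */
	uint32_t iov_len;
};

struct sketch_iovec {
	uintptr_t iov_base;
	size_t iov_len;
};

/* Shared worker: placeholder logic that sums the segment lengths. */
static long sketch_do_vmsplice(const struct sketch_iovec *iov,
			       unsigned long nr_segs)
{
	long total = 0;
	unsigned long i;

	for (i = 0; i < nr_segs; i++)
		total += (long)iov[i].iov_len;
	return total;
}

/* Native entry point: nothing but a call into the helper. */
static long sketch_sys_vmsplice(const struct sketch_iovec *iov,
				unsigned long nr_segs)
{
	return sketch_do_vmsplice(iov, nr_segs);
}

/* Compat entry point: widen each segment, then call the same helper. */
static long sketch_compat_sys_vmsplice(const struct sketch_compat_iovec *iov32,
				       unsigned int nr_segs)
{
	struct sketch_iovec iov[SKETCH_MAX_SEGS];
	unsigned int i;

	if (nr_segs > SKETCH_MAX_SEGS)
		return -EINVAL;
	for (i = 0; i < nr_segs; i++) {
		iov[i].iov_base = iov32[i].iov_base;
		iov[i].iov_len = iov32[i].iov_len;
	}
	return sketch_do_vmsplice(iov, nr_segs);
}
```

Neither entry point calls the other; both reach the worker directly,
which is what removes the in-kernel call to the syscall itself.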

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/splice.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 39e2dc01ac12..005d09cf3fa8 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1331,8 +1331,8 @@ static long vmsplice_to_pipe(struct file *file, const struct iovec __user *uiov,
  * Currently we punt and implement it as a normal copy, see pipe_to_user().
  *
  */
-SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, iov,
-		unsigned long, nr_segs, unsigned int, flags)
+static long do_vmsplice(int fd, const struct iovec __user *iov,
+			unsigned long nr_segs, unsigned int flags)
 {
 	struct fd f;
 	long error;
@@ -1358,6 +1358,12 @@ SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, iov,
 	return error;
 }
 
+SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, iov,
+		unsigned long, nr_segs, unsigned int, flags)
+{
+	return do_vmsplice(fd, iov, nr_segs, flags);
+}
+
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE4(vmsplice, int, fd, const struct compat_iovec __user *, iov32,
 		    unsigned int, nr_segs, unsigned int, flags)
@@ -1375,7 +1381,7 @@ COMPAT_SYSCALL_DEFINE4(vmsplice, int, fd, const struct compat_iovec __user *, io
 		    put_user(v.iov_len, &iov[i].iov_len))
 			return -EFAULT;
 	}
-	return sys_vmsplice(fd, iov, nr_segs, flags);
+	return do_vmsplice(fd, iov, nr_segs, flags);
 }
 #endif
 
-- 
2.16.2

* [PATCH 08/45] fs: add kern_select() helper; remove in-kernel call to sys_select()
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (6 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 07/45] fs: add do_vmsplice() " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 09/45] fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate() Dominik Brodowski
                   ` (37 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the fs-internal kern_select() helper allows us to get rid of the
fs-internal call to the sys_select() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.
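
A minimal userspace sketch of the resulting call structure, under the
assumption that the legacy entry point only unpacks an argument block and
forwards it. All sketch_ names are hypothetical; memcpy() stands in for
copy_from_user(), and the worker's body is a placeholder rather than the
real select logic.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the argument block old_select() receives. */
struct sketch_sel_args {
	int n;
	unsigned long *inp, *outp, *excp;
};

/* Shared worker; placeholder that counts how many sets were supplied. */
static int sketch_kern_select(int n, unsigned long *inp,
			      unsigned long *outp, unsigned long *excp)
{
	if (n < 0)
		return -EINVAL;
	return (inp != NULL) + (outp != NULL) + (excp != NULL);
}

/* Modern entry point: a thin wrapper around the worker. */
static int sketch_sys_select(int n, unsigned long *inp,
			     unsigned long *outp, unsigned long *excp)
{
	return sketch_kern_select(n, inp, outp, excp);
}

/* Legacy entry point: unpack the block, then call the worker directly. */
static int sketch_old_select(const void *arg)
{
	struct sketch_sel_args a;

	if (arg == NULL)
		return -EFAULT;	/* stands in for a failed copy_from_user() */
	memcpy(&a, arg, sizeof(a));
	return sketch_kern_select(a.n, a.inp, a.outp, a.excp);
}
```

The point of the arrangement is that the legacy path no longer goes
through the modern syscall entry point; both are siblings on top of the
same static helper.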

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/select.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index b6c36254028a..b5df01c4587d 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -675,8 +675,8 @@ int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
 	return ret;
 }
 
-SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
-		fd_set __user *, exp, struct timeval __user *, tvp)
+static int kern_select(int n, fd_set __user *inp, fd_set __user *outp,
+		       fd_set __user *exp, struct timeval __user *tvp)
 {
 	struct timespec64 end_time, *to = NULL;
 	struct timeval tv;
@@ -699,6 +699,12 @@ SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
 	return ret;
 }
 
+SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
+		fd_set __user *, exp, struct timeval __user *, tvp)
+{
+	return kern_select(n, inp, outp, exp, tvp);
+}
+
 static long do_pselect(int n, fd_set __user *inp, fd_set __user *outp,
 		       fd_set __user *exp, struct timespec __user *tsp,
 		       const sigset_t __user *sigmask, size_t sigsetsize)
@@ -784,7 +790,7 @@ SYSCALL_DEFINE1(old_select, struct sel_arg_struct __user *, arg)
 
 	if (copy_from_user(&a, arg, sizeof(a)))
 		return -EFAULT;
-	return sys_select(a.n, a.inp, a.outp, a.exp, a.tvp);
+	return kern_select(a.n, a.inp, a.outp, a.exp, a.tvp);
 }
 #endif
 
-- 
2.16.2

* [PATCH 09/45] fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate()
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (7 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 08/45] fs: add kern_select() helper; remove in-kernel call to sys_select() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 10/45] fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls Dominik Brodowski
                   ` (36 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the ksys_truncate() wrapper allows us to get rid of in-kernel
calls to the sys_truncate() syscall. The ksys_ prefix denotes that this
function is meant as a drop-in replacement for the syscall. In
particular, it uses the same calling convention as sys_truncate().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.
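
To make the layering concrete, here is a small userspace sketch of an
inline wrapper that forwards to a worker, plus a 32-bit style entry point
that assembles the 64-bit length from two halves. The sketch_ names and
the sketch_loff_t typedef are hypothetical, and the worker models only
the rejection of negative lengths; it does not touch any file.

```c
#include <errno.h>
#include <stdint.h>

typedef int64_t sketch_loff_t;	/* stand-in for the kernel's loff_t */

/* Worker stand-in: only the negative-length check is modelled. */
static long sketch_do_sys_truncate(const char *pathname, sketch_loff_t length)
{
	(void)pathname;
	if (length < 0)
		return -EINVAL;
	return 0;
}

/* Drop-in wrapper with the same arguments as the syscall would take. */
static inline long sketch_ksys_truncate(const char *pathname,
					sketch_loff_t length)
{
	return sketch_do_sys_truncate(pathname, length);
}

/* 32-bit style entry point: merge the halves, then use the wrapper. */
static long sketch_truncate64(const char *pathname, uint32_t high,
			      uint32_t low)
{
	/* Merge in an unsigned 64-bit type, then convert to the signed offset. */
	uint64_t merged = ((uint64_t)high << 32) | low;

	return sketch_ksys_truncate(pathname, (sketch_loff_t)merged);
}
```

Because the wrapper is a static inline in a header, callers in arch code
pay no extra call overhead compared with calling the worker directly.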

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/mips/kernel/linux32.c      | 2 +-
 arch/parisc/kernel/sys_parisc.c | 6 +++---
 arch/powerpc/kernel/sys_ppc32.c | 2 +-
 arch/s390/kernel/compat_linux.c | 2 +-
 arch/sparc/kernel/sys_sparc32.c | 2 +-
 arch/x86/ia32/sys_ia32.c        | 3 ++-
 fs/open.c                       | 2 +-
 include/linux/syscalls.h        | 7 +++++++
 8 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/mips/kernel/linux32.c b/arch/mips/kernel/linux32.c
index 740eee40c668..12421ba4318a 100644
--- a/arch/mips/kernel/linux32.c
+++ b/arch/mips/kernel/linux32.c
@@ -82,7 +82,7 @@ struct rlimit32 {
 SYSCALL_DEFINE4(32_truncate64, const char __user *, path,
 	unsigned long, __dummy, unsigned long, a2, unsigned long, a3)
 {
-	return sys_truncate(path, merge_64(a2, a3));
+	return ksys_truncate(path, merge_64(a2, a3));
 }
 
 SYSCALL_DEFINE4(32_ftruncate64, unsigned long, fd, unsigned long, __dummy,
diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index 59b315d6d194..3c1d788ab3ed 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -292,7 +292,7 @@ asmlinkage unsigned long sys_mmap(unsigned long addr, unsigned long len,
 asmlinkage long parisc_truncate64(const char __user * path,
 					unsigned int high, unsigned int low)
 {
-	return sys_truncate(path, (long)high << 32 | low);
+	return ksys_truncate(path, (long)high << 32 | low);
 }
 
 asmlinkage long parisc_ftruncate64(unsigned int fd,
@@ -305,7 +305,7 @@ asmlinkage long parisc_ftruncate64(unsigned int fd,
  * are identical on LP64 */
 asmlinkage long sys_truncate64(const char __user * path, unsigned long length)
 {
-	return sys_truncate(path, length);
+	return ksys_truncate(path, length);
 }
 asmlinkage long sys_ftruncate64(unsigned int fd, unsigned long length)
 {
@@ -320,7 +320,7 @@ asmlinkage long sys_fcntl64(unsigned int fd, unsigned int cmd, unsigned long arg
 asmlinkage long parisc_truncate64(const char __user * path,
 					unsigned int high, unsigned int low)
 {
-	return sys_truncate64(path, (loff_t)high << 32 | low);
+	return ksys_truncate(path, (loff_t)high << 32 | low);
 }
 
 asmlinkage long parisc_ftruncate64(unsigned int fd,
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index f41cb34c84c8..eb79138bd44d 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -94,7 +94,7 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offhi, u32 offlo, u32 co
 asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4,
 				unsigned long high, unsigned long low)
 {
-	return sys_truncate(path, (high << 32) | low);
+	return ksys_truncate(path, (high << 32) | low);
 }
 
 asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offhi, u32 offlo,
diff --git a/arch/s390/kernel/compat_linux.c b/arch/s390/kernel/compat_linux.c
index fb902b591bb4..0788df0443ba 100644
--- a/arch/s390/kernel/compat_linux.c
+++ b/arch/s390/kernel/compat_linux.c
@@ -302,7 +302,7 @@ COMPAT_SYSCALL_DEFINE5(s390_ipc, uint, call, int, first, compat_ulong_t, second,
 
 COMPAT_SYSCALL_DEFINE3(s390_truncate64, const char __user *, path, u32, high, u32, low)
 {
-	return sys_truncate(path, (unsigned long)high << 32 | low);
+	return ksys_truncate(path, (unsigned long)high << 32 | low);
 }
 
 COMPAT_SYSCALL_DEFINE3(s390_ftruncate64, unsigned int, fd, u32, high, u32, low)
diff --git a/arch/sparc/kernel/sys_sparc32.c b/arch/sparc/kernel/sys_sparc32.c
index f8d357540748..4ceb2e591688 100644
--- a/arch/sparc/kernel/sys_sparc32.c
+++ b/arch/sparc/kernel/sys_sparc32.c
@@ -57,7 +57,7 @@ asmlinkage long sys32_truncate64(const char __user * path, unsigned long high, u
 	if ((int)high < 0)
 		return -EINVAL;
 	else
-		return sys_truncate(path, (high << 32) | low);
+		return ksys_truncate(path, (high << 32) | low);
 }
 
 asmlinkage long sys32_ftruncate64(unsigned int fd, unsigned long high, unsigned long low)
diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index 9f5c25093e7a..91ed2c256dac 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -54,7 +54,8 @@
 COMPAT_SYSCALL_DEFINE3(x86_truncate64, const char __user *, filename,
 		       unsigned long, offset_low, unsigned long, offset_high)
 {
-       return sys_truncate(filename, ((loff_t) offset_high << 32) | offset_low);
+	return ksys_truncate(filename,
+			    ((loff_t) offset_high << 32) | offset_low);
 }
 
 COMPAT_SYSCALL_DEFINE3(x86_ftruncate64, unsigned int, fd,
diff --git a/fs/open.c b/fs/open.c
index 8a42a2961130..2e816fc7bd56 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -128,7 +128,7 @@ long vfs_truncate(const struct path *path, loff_t length)
 }
 EXPORT_SYMBOL_GPL(vfs_truncate);
 
-static long do_sys_truncate(const char __user *pathname, loff_t length)
+long do_sys_truncate(const char __user *pathname, loff_t length)
 {
 	unsigned int lookup_flags = LOOKUP_FOLLOW;
 	struct path path;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 1970c6817289..535cc3cf516a 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1045,6 +1045,13 @@ static inline long ksys_access(const char __user *filename, int mode)
 	return do_faccessat(AT_FDCWD, filename, mode);
 }
 
+extern long do_sys_truncate(const char __user *pathname, loff_t length);
+
+static inline long ksys_truncate(const char __user *pathname, loff_t length)
+{
+	return do_sys_truncate(pathname, length);
+}
+
 extern long do_sys_ftruncate(unsigned int fd, loff_t length, int small);
 
 static inline long ksys_ftruncate(unsigned int fd, unsigned long length)
-- 
2.16.2

* [PATCH 10/45] fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (8 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 09/45] fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 11/45] fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate() Dominik Brodowski
                   ` (35 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the ksys_p{read,write}64() wrappers allows us to get rid of
in-kernel calls to the sys_pread64() and sys_pwrite64() syscalls.
The ksys_ prefix denotes that these functions are meant as drop-in
replacements for the syscalls. In particular, they use the same calling
convention as sys_p{read,write}64().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.
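
For readers less familiar with the calling convention being preserved
(fd, buffer, count, explicit position), the following userspace sketch
exercises pread(2) from the outside. It is only an illustration of the
syscall's visible semantics, not of the kernel helper itself, and
sketch_pread_keeps_offset() is a hypothetical name.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pread(2) under strict -std= modes */
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write a short string to a temporary file, read three bytes back with
 * pread(2) at an explicit position, and check that the descriptor's own
 * file offset was left untouched.
 * Returns 1 if so, 0 if not, and -1 on any I/O error.
 */
static int sketch_pread_keeps_offset(void)
{
	static const char msg[] = "0123456789";
	char buf[4] = { 0 };
	FILE *tmp = tmpfile();
	off_t before, after;
	int fd, ok = -1;

	if (!tmp)
		return -1;
	fd = fileno(tmp);
	if (write(fd, msg, strlen(msg)) != (ssize_t)strlen(msg))
		goto out;
	before = lseek(fd, 0, SEEK_CUR);
	if (pread(fd, buf, 3, 4) != 3)	/* read "456" at position 4 */
		goto out;
	after = lseek(fd, 0, SEEK_CUR);
	ok = (before == after) && memcmp(buf, "456", 3) == 0;
out:
	fclose(tmp);
	return ok;
}
```

The position travels as a separate 64-bit argument, which is why the
32-bit arch wrappers in this patch have to reassemble it from two halves
before calling ksys_pread64() or ksys_pwrite64().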

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/mips/kernel/linux32.c      |  4 ++--
 arch/parisc/kernel/sys_parisc.c |  4 ++--
 arch/powerpc/kernel/sys_ppc32.c |  4 ++--
 arch/s390/kernel/compat_linux.c |  4 ++--
 arch/sh/kernel/sys_sh32.c       |  4 ++--
 arch/sparc/kernel/sys_sparc32.c |  4 ++--
 arch/x86/ia32/sys_ia32.c        |  8 ++++----
 fs/read_write.c                 | 20 ++++++++++++++++----
 include/linux/syscalls.h        |  4 ++++
 9 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/arch/mips/kernel/linux32.c b/arch/mips/kernel/linux32.c
index 12421ba4318a..944f0ff8e00b 100644
--- a/arch/mips/kernel/linux32.c
+++ b/arch/mips/kernel/linux32.c
@@ -105,13 +105,13 @@ SYSCALL_DEFINE5(32_llseek, unsigned int, fd, unsigned int, offset_high,
 SYSCALL_DEFINE6(32_pread, unsigned long, fd, char __user *, buf, size_t, count,
 	unsigned long, unused, unsigned long, a4, unsigned long, a5)
 {
-	return sys_pread64(fd, buf, count, merge_64(a4, a5));
+	return ksys_pread64(fd, buf, count, merge_64(a4, a5));
 }
 
 SYSCALL_DEFINE6(32_pwrite, unsigned int, fd, const char __user *, buf,
 	size_t, count, u32, unused, u64, a4, u64, a5)
 {
-	return sys_pwrite64(fd, buf, count, merge_64(a4, a5));
+	return ksys_pwrite64(fd, buf, count, merge_64(a4, a5));
 }
 
 SYSCALL_DEFINE1(32_personality, unsigned long, personality)
diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index 3c1d788ab3ed..21179358926c 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -333,13 +333,13 @@ asmlinkage long parisc_ftruncate64(unsigned int fd,
 asmlinkage ssize_t parisc_pread64(unsigned int fd, char __user *buf, size_t count,
 					unsigned int high, unsigned int low)
 {
-	return sys_pread64(fd, buf, count, (loff_t)high << 32 | low);
+	return ksys_pread64(fd, buf, count, (loff_t)high << 32 | low);
 }
 
 asmlinkage ssize_t parisc_pwrite64(unsigned int fd, const char __user *buf,
 			size_t count, unsigned int high, unsigned int low)
 {
-	return sys_pwrite64(fd, buf, count, (loff_t)high << 32 | low);
+	return ksys_pwrite64(fd, buf, count, (loff_t)high << 32 | low);
 }
 
 asmlinkage ssize_t parisc_readahead(int fd, unsigned int high, unsigned int low,
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index eb79138bd44d..bad1f3e891a4 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -77,13 +77,13 @@ unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
 compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count,
 			     u32 reg6, u32 poshi, u32 poslo)
 {
-	return sys_pread64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
+	return ksys_pread64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
 }
 
 compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count,
 			      u32 reg6, u32 poshi, u32 poslo)
 {
-	return sys_pwrite64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
+	return ksys_pwrite64(fd, ubuf, count, ((loff_t)poshi << 32) | poslo);
 }
 
 compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offhi, u32 offlo, u32 count)
diff --git a/arch/s390/kernel/compat_linux.c b/arch/s390/kernel/compat_linux.c
index 0788df0443ba..a3b5adbd683f 100644
--- a/arch/s390/kernel/compat_linux.c
+++ b/arch/s390/kernel/compat_linux.c
@@ -315,7 +315,7 @@ COMPAT_SYSCALL_DEFINE5(s390_pread64, unsigned int, fd, char __user *, ubuf,
 {
 	if ((compat_ssize_t) count < 0)
 		return -EINVAL;
-	return sys_pread64(fd, ubuf, count, (unsigned long)high << 32 | low);
+	return ksys_pread64(fd, ubuf, count, (unsigned long)high << 32 | low);
 }
 
 COMPAT_SYSCALL_DEFINE5(s390_pwrite64, unsigned int, fd, const char __user *, ubuf,
@@ -323,7 +323,7 @@ COMPAT_SYSCALL_DEFINE5(s390_pwrite64, unsigned int, fd, const char __user *, ubu
 {
 	if ((compat_ssize_t) count < 0)
 		return -EINVAL;
-	return sys_pwrite64(fd, ubuf, count, (unsigned long)high << 32 | low);
+	return ksys_pwrite64(fd, ubuf, count, (unsigned long)high << 32 | low);
 }
 
 COMPAT_SYSCALL_DEFINE4(s390_readahead, int, fd, u32, high, u32, low, s32, count)
diff --git a/arch/sh/kernel/sys_sh32.c b/arch/sh/kernel/sys_sh32.c
index 4d55318e0899..9dca568509a5 100644
--- a/arch/sh/kernel/sys_sh32.c
+++ b/arch/sh/kernel/sys_sh32.c
@@ -39,13 +39,13 @@ asmlinkage int sys_sh_pipe(void)
 asmlinkage ssize_t sys_pread_wrapper(unsigned int fd, char __user *buf,
 			     size_t count, long dummy, loff_t pos)
 {
-	return sys_pread64(fd, buf, count, pos);
+	return ksys_pread64(fd, buf, count, pos);
 }
 
 asmlinkage ssize_t sys_pwrite_wrapper(unsigned int fd, const char __user *buf,
 			      size_t count, long dummy, loff_t pos)
 {
-	return sys_pwrite64(fd, buf, count, pos);
+	return ksys_pwrite64(fd, buf, count, pos);
 }
 
 asmlinkage int sys_fadvise64_64_wrapper(int fd, u32 offset0, u32 offset1,
diff --git a/arch/sparc/kernel/sys_sparc32.c b/arch/sparc/kernel/sys_sparc32.c
index 4ceb2e591688..dc8c3f0fe3e8 100644
--- a/arch/sparc/kernel/sys_sparc32.c
+++ b/arch/sparc/kernel/sys_sparc32.c
@@ -200,7 +200,7 @@ asmlinkage compat_ssize_t sys32_pread64(unsigned int fd,
 					unsigned long poshi,
 					unsigned long poslo)
 {
-	return sys_pread64(fd, ubuf, count, (poshi << 32) | poslo);
+	return ksys_pread64(fd, ubuf, count, (poshi << 32) | poslo);
 }
 
 asmlinkage compat_ssize_t sys32_pwrite64(unsigned int fd,
@@ -209,7 +209,7 @@ asmlinkage compat_ssize_t sys32_pwrite64(unsigned int fd,
 					 unsigned long poshi,
 					 unsigned long poslo)
 {
-	return sys_pwrite64(fd, ubuf, count, (poshi << 32) | poslo);
+	return ksys_pwrite64(fd, ubuf, count, (poshi << 32) | poslo);
 }
 
 asmlinkage long compat_sys_readahead(int fd,
diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index 91ed2c256dac..9d09812a40b9 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -179,15 +179,15 @@ COMPAT_SYSCALL_DEFINE3(x86_waitpid, compat_pid_t, pid, unsigned int __user *,
 COMPAT_SYSCALL_DEFINE5(x86_pread, unsigned int, fd, char __user *, ubuf,
 		       u32, count, u32, poslo, u32, poshi)
 {
-	return sys_pread64(fd, ubuf, count,
-			 ((loff_t)AA(poshi) << 32) | AA(poslo));
+	return ksys_pread64(fd, ubuf, count,
+			    ((loff_t)AA(poshi) << 32) | AA(poslo));
 }
 
 COMPAT_SYSCALL_DEFINE5(x86_pwrite, unsigned int, fd, const char __user *, ubuf,
 		       u32, count, u32, poslo, u32, poshi)
 {
-	return sys_pwrite64(fd, ubuf, count,
-			  ((loff_t)AA(poshi) << 32) | AA(poslo));
+	return ksys_pwrite64(fd, ubuf, count,
+			     ((loff_t)AA(poshi) << 32) | AA(poslo));
 }
 
 
diff --git a/fs/read_write.c b/fs/read_write.c
index fc441e1ac683..c4eabbfc90df 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -610,8 +610,8 @@ SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
 	return ksys_write(fd, buf, count);
 }
 
-SYSCALL_DEFINE4(pread64, unsigned int, fd, char __user *, buf,
-			size_t, count, loff_t, pos)
+ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
+		     loff_t pos)
 {
 	struct fd f;
 	ssize_t ret = -EBADF;
@@ -630,8 +630,14 @@ SYSCALL_DEFINE4(pread64, unsigned int, fd, char __user *, buf,
 	return ret;
 }
 
-SYSCALL_DEFINE4(pwrite64, unsigned int, fd, const char __user *, buf,
-			 size_t, count, loff_t, pos)
+SYSCALL_DEFINE4(pread64, unsigned int, fd, char __user *, buf,
+			size_t, count, loff_t, pos)
+{
+	return ksys_pread64(fd, buf, count, pos);
+}
+
+ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf,
+		      size_t count, loff_t pos)
 {
 	struct fd f;
 	ssize_t ret = -EBADF;
@@ -650,6 +656,12 @@ SYSCALL_DEFINE4(pwrite64, unsigned int, fd, const char __user *, buf,
 	return ret;
 }
 
+SYSCALL_DEFINE4(pwrite64, unsigned int, fd, const char __user *, buf,
+			 size_t, count, loff_t, pos)
+{
+	return ksys_pwrite64(fd, buf, count, pos);
+}
+
 static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
 		loff_t *ppos, int type, rwf_t flags)
 {
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 535cc3cf516a..73a55d8982b6 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -978,6 +978,10 @@ int ksys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg);
 off_t ksys_lseek(unsigned int fd, off_t offset, unsigned int whence);
 ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count);
 void ksys_sync(void);
+ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
+		     loff_t pos);
+ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf,
+		      size_t count, loff_t pos);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
-- 
2.16.2

* [PATCH 11/45] fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate()
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (9 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 10/45] fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 12/45] fs: add do_compat_fcntl64() helper; remove in-kernel call to compat syscall Dominik Brodowski
                   ` (34 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the ksys_fallocate() wrapper allows us to get rid of in-kernel
calls to the sys_fallocate() syscall. The ksys_ prefix denotes that this
function is meant as a drop-in replacement for the syscall. In
particular, it uses the same calling convention as sys_fallocate().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/mips/kernel/linux32.c      | 4 ++--
 arch/parisc/kernel/sys_parisc.c | 4 ++--
 arch/powerpc/kernel/sys_ppc32.c | 2 +-
 arch/s390/kernel/compat_linux.c | 4 ++--
 arch/sparc/kernel/sys_sparc32.c | 4 ++--
 arch/x86/ia32/sys_ia32.c        | 4 ++--
 fs/open.c                       | 7 ++++++-
 include/linux/syscalls.h        | 1 +
 8 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/mips/kernel/linux32.c b/arch/mips/kernel/linux32.c
index 944f0ff8e00b..0571ab7b68b0 100644
--- a/arch/mips/kernel/linux32.c
+++ b/arch/mips/kernel/linux32.c
@@ -157,6 +157,6 @@ asmlinkage long sys32_fadvise64_64(int fd, int __pad,
 asmlinkage long sys32_fallocate(int fd, int mode, unsigned offset_a2,
 	unsigned offset_a3, unsigned len_a4, unsigned len_a5)
 {
-	return sys_fallocate(fd, mode, merge_64(offset_a2, offset_a3),
-			     merge_64(len_a4, len_a5));
+	return ksys_fallocate(fd, mode, merge_64(offset_a2, offset_a3),
+			      merge_64(len_a4, len_a5));
 }
diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index 21179358926c..080d566654ea 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -367,8 +367,8 @@ asmlinkage long parisc_sync_file_range(int fd,
 asmlinkage long parisc_fallocate(int fd, int mode, u32 offhi, u32 offlo,
 				u32 lenhi, u32 lenlo)
 {
-        return sys_fallocate(fd, mode, ((u64)offhi << 32) | offlo,
-                             ((u64)lenhi << 32) | lenlo);
+	return ksys_fallocate(fd, mode, ((u64)offhi << 32) | offlo,
+			      ((u64)lenhi << 32) | lenlo);
 }
 
 long parisc_personality(unsigned long personality)
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index bad1f3e891a4..0b95fa13307f 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -100,7 +100,7 @@ asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4,
 asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offhi, u32 offlo,
 				     u32 lenhi, u32 lenlo)
 {
-	return sys_fallocate(fd, mode, ((loff_t)offhi << 32) | offlo,
+	return ksys_fallocate(fd, mode, ((loff_t)offhi << 32) | offlo,
 			     ((loff_t)lenhi << 32) | lenlo);
 }
 
diff --git a/arch/s390/kernel/compat_linux.c b/arch/s390/kernel/compat_linux.c
index a3b5adbd683f..da5ef7718254 100644
--- a/arch/s390/kernel/compat_linux.c
+++ b/arch/s390/kernel/compat_linux.c
@@ -517,6 +517,6 @@ COMPAT_SYSCALL_DEFINE6(s390_sync_file_range, int, fd, u32, offhigh, u32, offlow,
 COMPAT_SYSCALL_DEFINE6(s390_fallocate, int, fd, int, mode, u32, offhigh, u32, offlow,
 		       u32, lenhigh, u32, lenlow)
 {
-	return sys_fallocate(fd, mode, ((loff_t)offhigh << 32) + offlow,
-			     ((u64)lenhigh << 32) + lenlow);
+	return ksys_fallocate(fd, mode, ((loff_t)offhigh << 32) + offlow,
+			      ((u64)lenhigh << 32) + lenlow);
 }
diff --git a/arch/sparc/kernel/sys_sparc32.c b/arch/sparc/kernel/sys_sparc32.c
index dc8c3f0fe3e8..4da66aed50b4 100644
--- a/arch/sparc/kernel/sys_sparc32.c
+++ b/arch/sparc/kernel/sys_sparc32.c
@@ -250,6 +250,6 @@ long sys32_sync_file_range(unsigned int fd, unsigned long off_high, unsigned lon
 asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offhi, u32 offlo,
 				     u32 lenhi, u32 lenlo)
 {
-	return sys_fallocate(fd, mode, ((loff_t)offhi << 32) | offlo,
-			     ((loff_t)lenhi << 32) | lenlo);
+	return ksys_fallocate(fd, mode, ((loff_t)offhi << 32) | offlo,
+			      ((loff_t)lenhi << 32) | lenlo);
 }
diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index 9d09812a40b9..bf4e8dbd65e7 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -231,8 +231,8 @@ COMPAT_SYSCALL_DEFINE6(x86_fallocate, int, fd, int, mode,
 		       unsigned int, offset_lo, unsigned int, offset_hi,
 		       unsigned int, len_lo, unsigned int, len_hi)
 {
-	return sys_fallocate(fd, mode, ((u64)offset_hi << 32) | offset_lo,
-			     ((u64)len_hi << 32) | len_lo);
+	return ksys_fallocate(fd, mode, ((u64)offset_hi << 32) | offset_lo,
+			      ((u64)len_hi << 32) | len_lo);
 }
 
 /*
diff --git a/fs/open.c b/fs/open.c
index 2e816fc7bd56..d0e955b558ad 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -333,7 +333,7 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 }
 EXPORT_SYMBOL_GPL(vfs_fallocate);
 
-SYSCALL_DEFINE4(fallocate, int, fd, int, mode, loff_t, offset, loff_t, len)
+int ksys_fallocate(int fd, int mode, loff_t offset, loff_t len)
 {
 	struct fd f = fdget(fd);
 	int error = -EBADF;
@@ -345,6 +345,11 @@ SYSCALL_DEFINE4(fallocate, int, fd, int, mode, loff_t, offset, loff_t, len)
 	return error;
 }
 
+SYSCALL_DEFINE4(fallocate, int, fd, int, mode, loff_t, offset, loff_t, len)
+{
+	return ksys_fallocate(fd, mode, offset, len);
+}
+
 /*
  * access() needs to use the real uid/gid, not the effective uid/gid.
  * We do this by temporarily clearing all FS-related capabilities and
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 73a55d8982b6..f30083190ae4 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -982,6 +982,7 @@ ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
 		     loff_t pos);
 ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf,
 		      size_t count, loff_t pos);
+int ksys_fallocate(int fd, int mode, loff_t offset, loff_t len);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
-- 
2.16.2

* [PATCH 12/45] fs: add do_compat_fcntl64() helper; remove in-kernel call to compat syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (10 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 11/45] fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 13/45] fs: add do_compat_select() " Dominik Brodowski
                   ` (33 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the fs-internal do_compat_fcntl64() helper allows us to get rid of
the fs-internal call to the compat_sys_fcntl64() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/fcntl.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 1e97f1fda90c..d737ff082472 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -607,8 +607,8 @@ static int fixup_compat_flock(struct flock *flock)
 	return 0;
 }
 
-COMPAT_SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
-		       compat_ulong_t, arg)
+static long do_compat_fcntl64(unsigned int fd, unsigned int cmd,
+			     compat_ulong_t arg)
 {
 	struct fd f = fdget_raw(fd);
 	struct flock flock;
@@ -672,6 +672,12 @@ COMPAT_SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
 	return err;
 }
 
+COMPAT_SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
+		       compat_ulong_t, arg)
+{
+	return do_compat_fcntl64(fd, cmd, arg);
+}
+
 COMPAT_SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned int, cmd,
 		       compat_ulong_t, arg)
 {
@@ -684,7 +690,7 @@ COMPAT_SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned int, cmd,
 	case F_OFD_SETLKW:
 		return -EINVAL;
 	}
-	return compat_sys_fcntl64(fd, cmd, arg);
+	return do_compat_fcntl64(fd, cmd, arg);
 }
 #endif
 
-- 
2.16.2

* [PATCH 13/45] fs: add do_compat_select() helper; remove in-kernel call to compat syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (11 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 12/45] fs: add do_compat_fcntl64() helper; remove in-kernel call to compat syscall Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 14/45] fs: add do_compat_signalfd4() " Dominik Brodowski
                   ` (32 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the fs-internal do_compat_select() helper allows us to get rid of
the fs-internal call to the compat_sys_select() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/select.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index b5df01c4587d..ba879c51288f 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -1265,9 +1265,9 @@ static int compat_core_sys_select(int n, compat_ulong_t __user *inp,
 	return ret;
 }
 
-COMPAT_SYSCALL_DEFINE5(select, int, n, compat_ulong_t __user *, inp,
-	compat_ulong_t __user *, outp, compat_ulong_t __user *, exp,
-	struct compat_timeval __user *, tvp)
+static int do_compat_select(int n, compat_ulong_t __user *inp,
+	compat_ulong_t __user *outp, compat_ulong_t __user *exp,
+	struct compat_timeval __user *tvp)
 {
 	struct timespec64 end_time, *to = NULL;
 	struct compat_timeval tv;
@@ -1290,6 +1290,13 @@ COMPAT_SYSCALL_DEFINE5(select, int, n, compat_ulong_t __user *, inp,
 	return ret;
 }
 
+COMPAT_SYSCALL_DEFINE5(select, int, n, compat_ulong_t __user *, inp,
+	compat_ulong_t __user *, outp, compat_ulong_t __user *, exp,
+	struct compat_timeval __user *, tvp)
+{
+	return do_compat_select(n, inp, outp, exp, tvp);
+}
+
 struct compat_sel_arg_struct {
 	compat_ulong_t n;
 	compat_uptr_t inp;
@@ -1304,8 +1311,8 @@ COMPAT_SYSCALL_DEFINE1(old_select, struct compat_sel_arg_struct __user *, arg)
 
 	if (copy_from_user(&a, arg, sizeof(a)))
 		return -EFAULT;
-	return compat_sys_select(a.n, compat_ptr(a.inp), compat_ptr(a.outp),
-				 compat_ptr(a.exp), compat_ptr(a.tvp));
+	return do_compat_select(a.n, compat_ptr(a.inp), compat_ptr(a.outp),
+				compat_ptr(a.exp), compat_ptr(a.tvp));
 }
 
 static long do_compat_pselect(int n, compat_ulong_t __user *inp,
-- 
2.16.2

* [PATCH 14/45] fs: add do_compat_signalfd4() helper; remove in-kernel call to compat syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (12 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 13/45] fs: add do_compat_select() " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 15/45] fs: add do_compat_futimesat() " Dominik Brodowski
                   ` (31 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the fs-internal do_compat_signalfd4() helper allows us to get rid of
the fs-internal call to the compat_sys_signalfd4() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/signalfd.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/signalfd.c b/fs/signalfd.c
index 501c41f3351f..d2187a813376 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -323,10 +323,9 @@ SYSCALL_DEFINE3(signalfd, int, ufd, sigset_t __user *, user_mask,
 }
 
 #ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE4(signalfd4, int, ufd,
-		     const compat_sigset_t __user *,sigmask,
-		     compat_size_t, sigsetsize,
-		     int, flags)
+static long do_compat_signalfd4(int ufd,
+			const compat_sigset_t __user *sigmask,
+			compat_size_t sigsetsize, int flags)
 {
 	sigset_t tmp;
 	sigset_t __user *ksigmask;
@@ -342,10 +341,18 @@ COMPAT_SYSCALL_DEFINE4(signalfd4, int, ufd,
 	return do_signalfd4(ufd, ksigmask, sizeof(sigset_t), flags);
 }
 
+COMPAT_SYSCALL_DEFINE4(signalfd4, int, ufd,
+		     const compat_sigset_t __user *, sigmask,
+		     compat_size_t, sigsetsize,
+		     int, flags)
+{
+	return do_compat_signalfd4(ufd, sigmask, sigsetsize, flags);
+}
+
 COMPAT_SYSCALL_DEFINE3(signalfd, int, ufd,
 		     const compat_sigset_t __user *,sigmask,
 		     compat_size_t, sigsetsize)
 {
-	return compat_sys_signalfd4(ufd, sigmask, sigsetsize, 0);
+	return do_compat_signalfd4(ufd, sigmask, sigsetsize, 0);
 }
 #endif
-- 
2.16.2

* [PATCH 15/45] fs: add do_compat_futimesat() helper; remove in-kernel call to compat syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (13 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 14/45] fs: add do_compat_signalfd4() " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 16/45] inotify: add do_inotify_init() helper; remove in-kernel call to syscall Dominik Brodowski
                   ` (30 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Using the fs-internal do_compat_futimesat() helper allows us to get rid of
the fs-internal call to the compat_sys_futimesat() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/utimes.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/utimes.c b/fs/utimes.c
index 5be035ed26c0..69d4b6ba1bfb 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -260,7 +260,8 @@ COMPAT_SYSCALL_DEFINE4(utimensat, unsigned int, dfd, const char __user *, filena
 	return do_utimes(dfd, filename, t ? tv : NULL, flags);
 }
 
-COMPAT_SYSCALL_DEFINE3(futimesat, unsigned int, dfd, const char __user *, filename, struct compat_timeval __user *, t)
+static long do_compat_futimesat(unsigned int dfd, const char __user *filename,
+				struct compat_timeval __user *t)
 {
 	struct timespec64 tv[2];
 
@@ -279,8 +280,15 @@ COMPAT_SYSCALL_DEFINE3(futimesat, unsigned int, dfd, const char __user *, filena
 	return do_utimes(dfd, filename, t ? tv : NULL, 0);
 }
 
+COMPAT_SYSCALL_DEFINE3(futimesat, unsigned int, dfd,
+		       const char __user *, filename,
+		       struct compat_timeval __user *, t)
+{
+	return do_compat_futimesat(dfd, filename, t);
+}
+
 COMPAT_SYSCALL_DEFINE2(utimes, const char __user *, filename, struct compat_timeval __user *, t)
 {
-	return compat_sys_futimesat(AT_FDCWD, filename, t);
+	return do_compat_futimesat(AT_FDCWD, filename, t);
 }
 #endif
-- 
2.16.2

* [PATCH 16/45] inotify: add do_inotify_init() helper; remove in-kernel call to syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (14 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 15/45] fs: add do_compat_futimesat() " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-26 12:25   ` Jan Kara
  2018-03-22  9:00 ` [PATCH 17/45] fanotify: add do_fanotify_mark() " Dominik Brodowski
                   ` (29 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Jan Kara, Amir Goldstein, linux-fsdevel

Using the inotify-internal do_inotify_init() helper allows us to get rid
of the in-kernel call to the sys_inotify_init1() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/notify/inotify/inotify_user.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 2c908b31d6c9..43c23653ce2e 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -635,7 +635,7 @@ static struct fsnotify_group *inotify_new_group(unsigned int max_events)
 
 
 /* inotify syscalls */
-SYSCALL_DEFINE1(inotify_init1, int, flags)
+static int do_inotify_init(int flags)
 {
 	struct fsnotify_group *group;
 	int ret;
@@ -660,9 +660,14 @@ SYSCALL_DEFINE1(inotify_init1, int, flags)
 	return ret;
 }
 
+SYSCALL_DEFINE1(inotify_init1, int, flags)
+{
+	return do_inotify_init(flags);
+}
+
 SYSCALL_DEFINE0(inotify_init)
 {
-	return sys_inotify_init1(0);
+	return do_inotify_init(0);
 }
 
 SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
-- 
2.16.2

* [PATCH 17/45] fanotify: add do_fanotify_mark() helper; remove in-kernel call to syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (15 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 16/45] inotify: add do_inotify_init() helper; remove in-kernel call to syscall Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-26 12:25   ` Jan Kara
  2018-03-22  9:00 ` [PATCH 18/45] fs/quota: add kernel_quotactl() " Dominik Brodowski
                   ` (28 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Jan Kara, Amir Goldstein

Using the fs-internal do_fanotify_mark() helper allows us to get rid of
the fs-internal call to the sys_fanotify_mark() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/notify/fanotify/fanotify_user.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index c07eb3d655ea..fa803a58a605 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -820,9 +820,8 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 	return fd;
 }
 
-SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
-			      __u64, mask, int, dfd,
-			      const char  __user *, pathname)
+static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
+			    int dfd, const char  __user *pathname)
 {
 	struct inode *inode = NULL;
 	struct vfsmount *mnt = NULL;
@@ -928,13 +927,20 @@ SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
 	return ret;
 }
 
+SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
+			      __u64, mask, int, dfd,
+			      const char  __user *, pathname)
+{
+	return do_fanotify_mark(fanotify_fd, flags, mask, dfd, pathname);
+}
+
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE6(fanotify_mark,
 				int, fanotify_fd, unsigned int, flags,
 				__u32, mask0, __u32, mask1, int, dfd,
 				const char  __user *, pathname)
 {
-	return sys_fanotify_mark(fanotify_fd, flags,
+	return do_fanotify_mark(fanotify_fd, flags,
 #ifdef __BIG_ENDIAN
 				((__u64)mask0 << 32) | mask1,
 #else
-- 
2.16.2

* [PATCH 18/45] fs/quota: add kernel_quotactl() helper; remove in-kernel call to syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (16 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 17/45] fanotify: add do_fanotify_mark() " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-26 12:26   ` Jan Kara
  2018-03-22  9:00 ` [PATCH 19/45] fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl() Dominik Brodowski
                   ` (27 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Jan Kara

Using the fs-internal kernel_quotactl() helper allows us to get rid of
the fs-internal call to the sys_quotactl() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Jan Kara <jack@suse.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 fs/quota/compat.c        |  8 ++++----
 fs/quota/quota.c         | 10 ++++++++--
 include/linux/quotaops.h |  3 +++
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/quota/compat.c b/fs/quota/compat.c
index 779caed4f078..1577a2fd51f4 100644
--- a/fs/quota/compat.c
+++ b/fs/quota/compat.c
@@ -59,7 +59,7 @@ asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
 	case Q_GETQUOTA:
 		dqblk = compat_alloc_user_space(sizeof(struct if_dqblk));
 		compat_dqblk = addr;
-		ret = sys_quotactl(cmd, special, id, dqblk);
+		ret = kernel_quotactl(cmd, special, id, dqblk);
 		if (ret)
 			break;
 		if (copy_in_user(compat_dqblk, dqblk, sizeof(*compat_dqblk)) ||
@@ -75,12 +75,12 @@ asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
 			get_user(data, &compat_dqblk->dqb_valid) ||
 			put_user(data, &dqblk->dqb_valid))
 			break;
-		ret = sys_quotactl(cmd, special, id, dqblk);
+		ret = kernel_quotactl(cmd, special, id, dqblk);
 		break;
 	case Q_XGETQSTAT:
 		fsqstat = compat_alloc_user_space(sizeof(struct fs_quota_stat));
 		compat_fsqstat = addr;
-		ret = sys_quotactl(cmd, special, id, fsqstat);
+		ret = kernel_quotactl(cmd, special, id, fsqstat);
 		if (ret)
 			break;
 		ret = -EFAULT;
@@ -113,7 +113,7 @@ asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
 		ret = 0;
 		break;
 	default:
-		ret = sys_quotactl(cmd, special, id, addr);
+		ret = kernel_quotactl(cmd, special, id, addr);
 	}
 	return ret;
 }
diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 43612e2a73af..860bfbe7a07a 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -833,8 +833,8 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
  * calls. Maybe we need to add the process quotas etc. in the future,
  * but we probably should use rlimits for that.
  */
-SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
-		qid_t, id, void __user *, addr)
+int kernel_quotactl(unsigned int cmd, const char __user *special,
+		    qid_t id, void __user *addr)
 {
 	uint cmds, type;
 	struct super_block *sb = NULL;
@@ -885,3 +885,9 @@ SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
 		path_put(pathp);
 	return ret;
 }
+
+SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
+		qid_t, id, void __user *, addr)
+{
+	return kernel_quotactl(cmd, special, id, addr);
+}
diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
index 2fb6fb11132e..ff63eac16a79 100644
--- a/include/linux/quotaops.h
+++ b/include/linux/quotaops.h
@@ -105,6 +105,9 @@ int dquot_set_dqblk(struct super_block *sb, struct kqid id,
 int __dquot_transfer(struct inode *inode, struct dquot **transfer_to);
 int dquot_transfer(struct inode *inode, struct iattr *iattr);
 
+int kernel_quotactl(unsigned int cmd, const char __user *special,
+		    qid_t id, void __user *addr);
+
 static inline struct mem_dqinfo *sb_dqinfo(struct super_block *sb, int type)
 {
 	return sb_dqopt(sb)->info + type;
-- 
2.16.2

* [PATCH 19/45] fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl()
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (17 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 18/45] fs/quota: add kernel_quotactl() " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-26 12:33   ` Jan Kara
  2018-03-22  9:00 ` [PATCH 20/45] kernel: add do_compat_sigaltstack() helper; remove in-kernel call to compat syscall Dominik Brodowski
                   ` (26 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Jan Kara, Christoph Hellwig

While sys32_quotactl() is only needed on x86, it can use the recommended
COMPAT_SYSCALL_DEFINEx() machinery for its setup.

Cc: Jan Kara <jack@suse.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/entry/syscalls/syscall_32.tbl | 2 +-
 fs/quota/compat.c                      | 5 +++--
 include/linux/compat.h                 | 3 +++
 include/linux/syscalls.h               | 4 +---
 kernel/sys_ni.c                        | 2 +-
 5 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 2a5e99cff859..09338dd2bd94 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -137,7 +137,7 @@
 128	i386	init_module		sys_init_module
 129	i386	delete_module		sys_delete_module
 130	i386	get_kernel_syms
-131	i386	quotactl		sys_quotactl			sys32_quotactl
+131	i386	quotactl		sys_quotactl			compat_sys_quotactl32
 132	i386	getpgid			sys_getpgid
 133	i386	fchdir			sys_fchdir
 134	i386	bdflush			sys_bdflush
diff --git a/fs/quota/compat.c b/fs/quota/compat.c
index 1577a2fd51f4..c30572857619 100644
--- a/fs/quota/compat.c
+++ b/fs/quota/compat.c
@@ -41,8 +41,9 @@ struct compat_fs_quota_stat {
 	__u16		qs_iwarnlimit;
 };
 
-asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
-						qid_t id, void __user *addr)
+COMPAT_SYSCALL_DEFINE4(quotactl32, unsigned int, cmd,
+		       const char __user *, special, qid_t, id,
+		       void __user *, addr)
 {
 	unsigned int cmds;
 	struct if_dqblk __user *dqblk;
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 16c3027074a2..f1649a5e6716 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -461,6 +461,9 @@ asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
 		const struct compat_iovec __user *vec,
 		compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
 
+asmlinkage long compat_sys_quotactl32(unsigned int cmd,
+		const char __user *special, qid_t id, void __user *addr);
+
 #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
 asmlinkage long compat_sys_preadv64(unsigned long fd,
 		const struct compat_iovec __user *vec,
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f30083190ae4..a3f42f5f3b08 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -241,8 +241,6 @@ static inline void addr_limit_user_check(void)
 #endif
 }
 
-asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
-			       qid_t id, void __user *addr);
 asmlinkage long sys_time(time_t __user *tloc);
 asmlinkage long sys_stime(time_t __user *tptr);
 asmlinkage long sys_gettimeofday(struct timeval __user *tv,
@@ -625,7 +623,7 @@ asmlinkage long sys_chdir(const char __user *filename);
 asmlinkage long sys_fchdir(unsigned int fd);
 asmlinkage long sys_rmdir(const char __user *pathname);
 asmlinkage long sys_lookup_dcookie(u64 cookie64, char __user *buf, size_t len);
-asmlinkage long sys_quotactl(unsigned int cmd, const char __user *special,
+asmlinkage long sys_quotactl32(unsigned int cmd, const char __user *special,
 				qid_t id, void __user *addr);
 asmlinkage long sys_getdents(unsigned int fd,
 				struct linux_dirent __user *dirent,
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index b5189762d275..951dbda5c2b4 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -18,7 +18,7 @@ asmlinkage long sys_ni_syscall(void)
 }
 
 cond_syscall(sys_quotactl);
-cond_syscall(sys32_quotactl);
+cond_syscall(compat_sys_quotactl32);
 cond_syscall(sys_acct);
 cond_syscall(sys_lookup_dcookie);
 cond_syscall(compat_sys_lookup_dcookie);
-- 
2.16.2

* [PATCH 20/45] kernel: add do_compat_sigaltstack() helper; remove in-kernel call to compat syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (18 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 19/45] fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 21/45] kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid() Dominik Brodowski
                   ` (25 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Eric W. Biederman, Al Viro, Andrew Morton

Using this helper allows us to avoid the in-kernel call to the
compat_sys_sigaltstack() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 kernel/signal.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 985c61749bcf..f04466655238 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -3573,9 +3573,8 @@ int __save_altstack(stack_t __user *uss, unsigned long sp)
 }
 
 #ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE2(sigaltstack,
-			const compat_stack_t __user *, uss_ptr,
-			compat_stack_t __user *, uoss_ptr)
+static int do_compat_sigaltstack(const compat_stack_t __user *uss_ptr,
+				 compat_stack_t __user *uoss_ptr)
 {
 	stack_t uss, uoss;
 	int ret;
@@ -3602,9 +3601,16 @@ COMPAT_SYSCALL_DEFINE2(sigaltstack,
 	return ret;
 }
 
+COMPAT_SYSCALL_DEFINE2(sigaltstack,
+			const compat_stack_t __user *, uss_ptr,
+			compat_stack_t __user *, uoss_ptr)
+{
+	return do_compat_sigaltstack(uss_ptr, uoss_ptr);
+}
+
 int compat_restore_altstack(const compat_stack_t __user *uss)
 {
-	int err = compat_sys_sigaltstack(uss, NULL);
+	int err = do_compat_sigaltstack(uss, NULL);
 	/* squash all but -EFAULT for now */
 	return err == -EFAULT ? err : 0;
 }
-- 
2.16.2

* [PATCH 21/45] kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid()
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (19 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 20/45] kernel: add do_compat_sigaltstack() helper; remove in-kernel call to compat syscall Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c Dominik Brodowski
                   ` (24 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro

Using this helper allows us to avoid the in-kernel call to the
sys_setsid() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_setsid().
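
As a rough illustration of the "drop-in replacement" idea, the following
userspace sketch mirrors the shape of the change. All demo_* names and the
session flag are hypothetical stand-ins, not kernel code: the body moves
into a plain function, the syscall entry point becomes a thin wrapper, and
other callers use the plain function directly.

```c
#include <errno.h>

/* Hypothetical stand-in for per-task session state (assumption). */
static int demo_is_session_leader;

/* Plays the role of ksys_setsid(): holds the real logic, plain C signature. */
int demo_ksys_setsid(void)
{
	if (demo_is_session_leader)
		return -EPERM;	/* already a session leader */
	demo_is_session_leader = 1;
	return 0;
}

/* Plays the role of SYSCALL_DEFINE0(setsid): only forwards to the helper. */
long demo_sys_setsid(void)
{
	return demo_ksys_setsid();
}

/* Plays the role of an in-kernel caller: it never touches the entry point. */
int demo_init_linuxrc(void)
{
	return demo_ksys_setsid();
}
```

Because only the wrapper is reachable through the entry path in this
sketch, the wrapper's calling convention could change later without
touching demo_init_linuxrc().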

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 include/linux/syscalls.h | 1 +
 init/do_mounts_initrd.c  | 2 +-
 kernel/sys.c             | 7 ++++++-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a3f42f5f3b08..c509459ce9d5 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -981,6 +981,7 @@ ssize_t ksys_pread64(unsigned int fd, char __user *buf, size_t count,
 ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf,
 		      size_t count, loff_t pos);
 int ksys_fallocate(int fd, int mode, loff_t offset, loff_t len);
+int ksys_setsid(void);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
diff --git a/init/do_mounts_initrd.c b/init/do_mounts_initrd.c
index d1d3e53bdeef..5a91aefa7305 100644
--- a/init/do_mounts_initrd.c
+++ b/init/do_mounts_initrd.c
@@ -45,7 +45,7 @@ static int init_linuxrc(struct subprocess_info *info, struct cred *new)
 	ksys_chdir("/root");
 	ksys_mount(".", "/", NULL, MS_MOVE, NULL);
 	ksys_chroot(".");
-	sys_setsid();
+	ksys_setsid();
 	return 0;
 }
 
diff --git a/kernel/sys.c b/kernel/sys.c
index ebb138b841c8..8eda25dcbbd4 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1108,7 +1108,7 @@ static void set_special_pids(struct pid *pid)
 		change_pid(curr, PIDTYPE_PGID, pid);
 }
 
-SYSCALL_DEFINE0(setsid)
+int ksys_setsid(void)
 {
 	struct task_struct *group_leader = current->group_leader;
 	struct pid *sid = task_pid(group_leader);
@@ -1141,6 +1141,11 @@ SYSCALL_DEFINE0(setsid)
 	return err;
 }
 
+SYSCALL_DEFINE0(setsid)
+{
+	return ksys_setsid();
+}
+
 DECLARE_RWSEM(uts_sem);
 
 #ifdef COMPAT_UTS_MACHINE
-- 
2.16.2

* [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (20 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 21/45] kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22 10:21   ` Any chance that kernel/uid16.c can go? [Was: [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c] Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield() Dominik Brodowski
                   ` (23 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Eric W . Biederman, Andrew Morton

Using these helpers allows us to avoid the in-kernel calls to these
syscalls: sys_setregid(), sys_setgid(), sys_setreuid(), sys_setuid(),
sys_setresuid(), sys_setresgid(), sys_setfsuid(), and sys_setfsgid().

The ksys_ prefix denotes that these functions are meant as drop-in
replacements for the syscalls. In particular, they use the same calling
convention.
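
A minimal userspace sketch of how a 16-bit uid entry point can share such
a helper with the native one, assuming the usual rule that a 16-bit -1 is
widened to a 32-bit -1 ("leave unchanged"). The demo_* names, types and
globals are hypothetical and only model the call structure, not the
kernel's credential handling.

```c
#include <stdint.h>

typedef uint32_t demo_uid_t;		/* stand-in for uid_t */
typedef uint16_t demo_old_uid_t;	/* stand-in for old_uid_t */

/* Hypothetical stand-ins for the current task's real/effective uid. */
demo_uid_t demo_ruid = 1000;
demo_uid_t demo_euid = 1000;

/* Widen a 16-bit id; the 16-bit "unchanged" marker maps to the 32-bit one. */
static demo_uid_t demo_low2highuid(demo_old_uid_t uid)
{
	return uid == (demo_old_uid_t)-1 ? (demo_uid_t)-1 : (demo_uid_t)uid;
}

/* Shared helper: same arguments as the native entry point. */
long demo_setreuid_helper(demo_uid_t ruid, demo_uid_t euid)
{
	if (ruid != (demo_uid_t)-1)
		demo_ruid = ruid;
	if (euid != (demo_uid_t)-1)
		demo_euid = euid;
	return 0;
}

/* Native entry point: thin wrapper around the helper. */
long demo_sys_setreuid(demo_uid_t ruid, demo_uid_t euid)
{
	return demo_setreuid_helper(ruid, euid);
}

/* 16-bit entry point: widens its arguments, then calls the same helper. */
long demo_sys_setreuid16(demo_old_uid_t ruid, demo_old_uid_t euid)
{
	return demo_setreuid_helper(demo_low2highuid(ruid),
				    demo_low2highuid(euid));
}
```

The 16-bit variant calls the helper rather than demo_sys_setreuid(), so
neither entry point depends on the other.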

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 kernel/sys.c   | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++--------
 kernel/uid16.c | 19 ++++++++++---------
 kernel/uid16.h | 14 ++++++++++++++
 3 files changed, 74 insertions(+), 17 deletions(-)
 create mode 100644 kernel/uid16.h

diff --git a/kernel/sys.c b/kernel/sys.c
index 8eda25dcbbd4..ad692183dfe9 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -69,6 +69,8 @@
 #include <asm/io.h>
 #include <asm/unistd.h>
 
+#include "uid16.h"
+
 #ifndef SET_UNALIGN_CTL
 # define SET_UNALIGN_CTL(a, b)	(-EINVAL)
 #endif
@@ -340,7 +342,7 @@ SYSCALL_DEFINE2(getpriority, int, which, int, who)
  *      operations (as far as semantic preservation is concerned).
  */
 #ifdef CONFIG_MULTIUSER
-SYSCALL_DEFINE2(setregid, gid_t, rgid, gid_t, egid)
+long __sys_setregid(gid_t rgid, gid_t egid)
 {
 	struct user_namespace *ns = current_user_ns();
 	const struct cred *old;
@@ -392,12 +394,17 @@ SYSCALL_DEFINE2(setregid, gid_t, rgid, gid_t, egid)
 	return retval;
 }
 
+SYSCALL_DEFINE2(setregid, gid_t, rgid, gid_t, egid)
+{
+	return __sys_setregid(rgid, egid);
+}
+
 /*
  * setgid() is implemented like SysV w/ SAVED_IDS
  *
  * SMP: Same implicit races as above.
  */
-SYSCALL_DEFINE1(setgid, gid_t, gid)
+long __sys_setgid(gid_t gid)
 {
 	struct user_namespace *ns = current_user_ns();
 	const struct cred *old;
@@ -429,6 +436,11 @@ SYSCALL_DEFINE1(setgid, gid_t, gid)
 	return retval;
 }
 
+SYSCALL_DEFINE1(setgid, gid_t, gid)
+{
+	return __sys_setgid(gid);
+}
+
 /*
  * change the user struct in a credentials set to match the new UID
  */
@@ -473,7 +485,7 @@ static int set_user(struct cred *new)
  * 100% compatible with BSD.  A program which uses just setuid() will be
  * 100% compatible with POSIX with saved IDs.
  */
-SYSCALL_DEFINE2(setreuid, uid_t, ruid, uid_t, euid)
+long __sys_setreuid(uid_t ruid, uid_t euid)
 {
 	struct user_namespace *ns = current_user_ns();
 	const struct cred *old;
@@ -533,6 +545,11 @@ SYSCALL_DEFINE2(setreuid, uid_t, ruid, uid_t, euid)
 	return retval;
 }
 
+SYSCALL_DEFINE2(setreuid, uid_t, ruid, uid_t, euid)
+{
+	return __sys_setreuid(ruid, euid);
+}
+
 /*
  * setuid() is implemented like SysV with SAVED_IDS
  *
@@ -544,7 +561,7 @@ SYSCALL_DEFINE2(setreuid, uid_t, ruid, uid_t, euid)
  * will allow a root program to temporarily drop privileges and be able to
  * regain them by swapping the real and effective uid.
  */
-SYSCALL_DEFINE1(setuid, uid_t, uid)
+long __sys_setuid(uid_t uid)
 {
 	struct user_namespace *ns = current_user_ns();
 	const struct cred *old;
@@ -586,12 +603,17 @@ SYSCALL_DEFINE1(setuid, uid_t, uid)
 	return retval;
 }
 
+SYSCALL_DEFINE1(setuid, uid_t, uid)
+{
+	return __sys_setuid(uid);
+}
+
 
 /*
  * This function implements a generic ability to update ruid, euid,
  * and suid.  This allows you to implement the 4.4 compatible seteuid().
  */
-SYSCALL_DEFINE3(setresuid, uid_t, ruid, uid_t, euid, uid_t, suid)
+long __sys_setresuid(uid_t ruid, uid_t euid, uid_t suid)
 {
 	struct user_namespace *ns = current_user_ns();
 	const struct cred *old;
@@ -656,6 +678,11 @@ SYSCALL_DEFINE3(setresuid, uid_t, ruid, uid_t, euid, uid_t, suid)
 	return retval;
 }
 
+SYSCALL_DEFINE3(setresuid, uid_t, ruid, uid_t, euid, uid_t, suid)
+{
+	return __sys_setresuid(ruid, euid, suid);
+}
+
 SYSCALL_DEFINE3(getresuid, uid_t __user *, ruidp, uid_t __user *, euidp, uid_t __user *, suidp)
 {
 	const struct cred *cred = current_cred();
@@ -678,7 +705,7 @@ SYSCALL_DEFINE3(getresuid, uid_t __user *, ruidp, uid_t __user *, euidp, uid_t _
 /*
  * Same as above, but for rgid, egid, sgid.
  */
-SYSCALL_DEFINE3(setresgid, gid_t, rgid, gid_t, egid, gid_t, sgid)
+long __sys_setresgid(gid_t rgid, gid_t egid, gid_t sgid)
 {
 	struct user_namespace *ns = current_user_ns();
 	const struct cred *old;
@@ -730,6 +757,11 @@ SYSCALL_DEFINE3(setresgid, gid_t, rgid, gid_t, egid, gid_t, sgid)
 	return retval;
 }
 
+SYSCALL_DEFINE3(setresgid, gid_t, rgid, gid_t, egid, gid_t, sgid)
+{
+	return __sys_setresgid(rgid, egid, sgid);
+}
+
 SYSCALL_DEFINE3(getresgid, gid_t __user *, rgidp, gid_t __user *, egidp, gid_t __user *, sgidp)
 {
 	const struct cred *cred = current_cred();
@@ -757,7 +789,7 @@ SYSCALL_DEFINE3(getresgid, gid_t __user *, rgidp, gid_t __user *, egidp, gid_t _
  * whatever uid it wants to). It normally shadows "euid", except when
  * explicitly set by setfsuid() or for access..
  */
-SYSCALL_DEFINE1(setfsuid, uid_t, uid)
+long __sys_setfsuid(uid_t uid)
 {
 	const struct cred *old;
 	struct cred *new;
@@ -793,10 +825,15 @@ SYSCALL_DEFINE1(setfsuid, uid_t, uid)
 	return old_fsuid;
 }
 
+SYSCALL_DEFINE1(setfsuid, uid_t, uid)
+{
+	return __sys_setfsuid(uid);
+}
+
 /*
  * Samma på svenska..
  */
-SYSCALL_DEFINE1(setfsgid, gid_t, gid)
+long __sys_setfsgid(gid_t gid)
 {
 	const struct cred *old;
 	struct cred *new;
@@ -830,6 +867,11 @@ SYSCALL_DEFINE1(setfsgid, gid_t, gid)
 	commit_creds(new);
 	return old_fsgid;
 }
+
+SYSCALL_DEFINE1(setfsgid, gid_t, gid)
+{
+	return __sys_setfsgid(gid);
+}
 #endif /* CONFIG_MULTIUSER */
 
 /**
diff --git a/kernel/uid16.c b/kernel/uid16.c
index ea3cf87ff000..af6925d8599b 100644
--- a/kernel/uid16.c
+++ b/kernel/uid16.c
@@ -18,6 +18,8 @@
 
 #include <linux/uaccess.h>
 
+#include "uid16.h"
+
 SYSCALL_DEFINE3(chown16, const char __user *, filename, old_uid_t, user, old_gid_t, group)
 {
 	return ksys_chown(filename, low2highuid(user), low2highgid(group));
@@ -35,27 +37,27 @@ SYSCALL_DEFINE3(fchown16, unsigned int, fd, old_uid_t, user, old_gid_t, group)
 
 SYSCALL_DEFINE2(setregid16, old_gid_t, rgid, old_gid_t, egid)
 {
-	return sys_setregid(low2highgid(rgid), low2highgid(egid));
+	return __sys_setregid(low2highgid(rgid), low2highgid(egid));
 }
 
 SYSCALL_DEFINE1(setgid16, old_gid_t, gid)
 {
-	return sys_setgid(low2highgid(gid));
+	return __sys_setgid(low2highgid(gid));
 }
 
 SYSCALL_DEFINE2(setreuid16, old_uid_t, ruid, old_uid_t, euid)
 {
-	return sys_setreuid(low2highuid(ruid), low2highuid(euid));
+	return __sys_setreuid(low2highuid(ruid), low2highuid(euid));
 }
 
 SYSCALL_DEFINE1(setuid16, old_uid_t, uid)
 {
-	return sys_setuid(low2highuid(uid));
+	return __sys_setuid(low2highuid(uid));
 }
 
 SYSCALL_DEFINE3(setresuid16, old_uid_t, ruid, old_uid_t, euid, old_uid_t, suid)
 {
-	return sys_setresuid(low2highuid(ruid), low2highuid(euid),
+	return __sys_setresuid(low2highuid(ruid), low2highuid(euid),
 				 low2highuid(suid));
 }
 
@@ -78,11 +80,10 @@ SYSCALL_DEFINE3(getresuid16, old_uid_t __user *, ruidp, old_uid_t __user *, euid
 
 SYSCALL_DEFINE3(setresgid16, old_gid_t, rgid, old_gid_t, egid, old_gid_t, sgid)
 {
-	return sys_setresgid(low2highgid(rgid), low2highgid(egid),
+	return __sys_setresgid(low2highgid(rgid), low2highgid(egid),
 				 low2highgid(sgid));
 }
 
-
 SYSCALL_DEFINE3(getresgid16, old_gid_t __user *, rgidp, old_gid_t __user *, egidp, old_gid_t __user *, sgidp)
 {
 	const struct cred *cred = current_cred();
@@ -102,12 +103,12 @@ SYSCALL_DEFINE3(getresgid16, old_gid_t __user *, rgidp, old_gid_t __user *, egid
 
 SYSCALL_DEFINE1(setfsuid16, old_uid_t, uid)
 {
-	return sys_setfsuid(low2highuid(uid));
+	return __sys_setfsuid(low2highuid(uid));
 }
 
 SYSCALL_DEFINE1(setfsgid16, old_gid_t, gid)
 {
-	return sys_setfsgid(low2highgid(gid));
+	return __sys_setfsgid(low2highgid(gid));
 }
 
 static int groups16_to_user(old_gid_t __user *grouplist,
diff --git a/kernel/uid16.h b/kernel/uid16.h
new file mode 100644
index 000000000000..cdca040f7602
--- /dev/null
+++ b/kernel/uid16.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef LINUX_UID16_H
+#define LINUX_UID16_H
+
+long __sys_setuid(uid_t uid);
+long __sys_setgid(gid_t gid);
+long __sys_setreuid(uid_t ruid, uid_t euid);
+long __sys_setregid(gid_t rgid, gid_t egid);
+long __sys_setresuid(uid_t ruid, uid_t euid, uid_t suid);
+long __sys_setresgid(gid_t rgid, gid_t egid, gid_t sgid);
+long __sys_setfsuid(uid_t uid);
+long __sys_setfsgid(gid_t gid);
+
+#endif /* LINUX_UID16_H */
-- 
2.16.2


* [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield()
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (21 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22 17:29   ` Peter Zijlstra
  2018-03-22  9:00 ` [PATCH 24/45] kexec: call do_kexec_load() in compat syscall directly Dominik Brodowski
                   ` (22 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Ingo Molnar, Peter Zijlstra

Using the sched-internal do_sched_yield() helper allows us to get rid of
the sched-internal call to the sys_sched_yield() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 kernel/sched/core.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e7c535eee0a6..8de4919c889a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4892,7 +4892,7 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len,
  *
  * Return: 0.
  */
-SYSCALL_DEFINE0(sched_yield)
+static void do_sched_yield(void)
 {
 	struct rq_flags rf;
 	struct rq *rq;
@@ -4913,7 +4913,11 @@ SYSCALL_DEFINE0(sched_yield)
 	sched_preempt_enable_no_resched();
 
 	schedule();
+}
 
+SYSCALL_DEFINE0(sched_yield)
+{
+	do_sched_yield();
 	return 0;
 }
 
@@ -4997,7 +5001,7 @@ EXPORT_SYMBOL(__cond_resched_softirq);
 void __sched yield(void)
 {
 	set_current_state(TASK_RUNNING);
-	sys_sched_yield();
+	do_sched_yield();
 }
 EXPORT_SYMBOL(yield);
 
-- 
2.16.2


* [PATCH 24/45] kexec: call do_kexec_load() in compat syscall directly
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (22 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 25/45] mm: add kernel_migrate_pages() helper, move compat syscall to mm/mempolicy.c Dominik Brodowski
                   ` (21 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Eric Biederman, kexec

do_kexec_load() can be called directly by compat_sys_kexec_load() as long
as the same parameter checks are completed which are currently handled
(also) by sys_kexec_load(). Therefore, move those to kexec_load_check(),
call that newly introduced helper function from both sys_kexec_load() and
compat_sys_kexec_load(), and duplicate the remaining code from
sys_kexec_load() in compat_sys_kexec_load().
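
The restructuring can be pictured with a small userspace sketch: one
shared check function, two entry points that each keep their own
architecture check, and a try-lock taken around the actual work. This
is only an analogue under stated assumptions: every toy_* name and
constant is invented here, and a C11 atomic_flag stands in for
mutex_trylock()/mutex_unlock().

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define TOY_SEGMENT_MAX   16UL
#define TOY_FLAGS         0x00000003UL	/* known non-arch flag bits */
#define TOY_ARCH_MASK     0xffff0000UL
#define TOY_ARCH_DEFAULT  0x00000000UL
#define TOY_ARCH_NATIVE   0x00030000UL

atomic_flag toy_load_lock = ATOMIC_FLAG_INIT;

/* Checks common to both entry points live in one place. */
static int toy_load_check(unsigned long nr_segments, unsigned long flags)
{
	if ((flags & TOY_FLAGS) != (flags & ~TOY_ARCH_MASK))
		return -EINVAL;
	if (nr_segments > TOY_SEGMENT_MAX)
		return -EINVAL;
	return 0;
}

/* Stand-in for the real work; just echoes the segment count. */
static long toy_do_load(unsigned long nr_segments)
{
	return (long)nr_segments;
}

long toy_sys_load(unsigned long nr_segments, unsigned long flags)
{
	long result = toy_load_check(nr_segments, flags);

	if (result)
		return result;
	/* Native entry: accept the native or the default architecture. */
	if ((flags & TOY_ARCH_MASK) != TOY_ARCH_NATIVE &&
	    (flags & TOY_ARCH_MASK) != TOY_ARCH_DEFAULT)
		return -EINVAL;
	if (atomic_flag_test_and_set(&toy_load_lock))
		return -EBUSY;
	result = toy_do_load(nr_segments);
	atomic_flag_clear(&toy_load_lock);
	return result;
}

long toy_compat_sys_load(uint32_t nr_segments, uint32_t flags)
{
	long result = toy_load_check(nr_segments, flags);

	if (result)
		return result;
	/* Compat entry: the default architecture is rejected. */
	if ((flags & TOY_ARCH_MASK) == TOY_ARCH_DEFAULT)
		return -EINVAL;
	if (atomic_flag_test_and_set(&toy_load_lock))
		return -EBUSY;
	result = toy_do_load(nr_segments);
	atomic_flag_clear(&toy_load_lock);
	return result;
}
```

The locking is duplicated on purpose, mirroring the patch; the price
of not calling the native entry point is that the compat entry point
must take and release the lock, and return the result, itself.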

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Eric Biederman <ebiederm@xmission.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 kernel/kexec.c | 50 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index e62ec4dc6620..d959dc2c5587 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -192,11 +192,9 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments,
  * that to happen you need to do that yourself.
  */
 
-SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
-		struct kexec_segment __user *, segments, unsigned long, flags)
+static inline int kexec_load_check(unsigned long nr_segments,
+				   unsigned long flags)
 {
-	int result;
-
 	/* We only trust the superuser with rebooting the system. */
 	if (!capable(CAP_SYS_BOOT) || kexec_load_disabled)
 		return -EPERM;
@@ -208,17 +206,29 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
 	if ((flags & KEXEC_FLAGS) != (flags & ~KEXEC_ARCH_MASK))
 		return -EINVAL;
 
-	/* Verify we are on the appropriate architecture */
-	if (((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH) &&
-		((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH_DEFAULT))
-		return -EINVAL;
-
 	/* Put an artificial cap on the number
 	 * of segments passed to kexec_load.
 	 */
 	if (nr_segments > KEXEC_SEGMENT_MAX)
 		return -EINVAL;
 
+	return 0;
+}
+
+SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
+		struct kexec_segment __user *, segments, unsigned long, flags)
+{
+	int result;
+
+	result = kexec_load_check(nr_segments, flags);
+	if (result)
+		return result;
+
+	/* Verify we are on the appropriate architecture */
+	if (((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH) &&
+		((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH_DEFAULT))
+		return -EINVAL;
+
 	/* Because we write directly to the reserved memory
 	 * region when loading crash kernels we need a mutex here to
 	 * prevent multiple crash  kernels from attempting to load
@@ -247,15 +257,16 @@ COMPAT_SYSCALL_DEFINE4(kexec_load, compat_ulong_t, entry,
 	struct kexec_segment out, __user *ksegments;
 	unsigned long i, result;
 
+	result = kexec_load_check(nr_segments, flags);
+	if (result)
+		return result;
+
 	/* Don't allow clients that don't understand the native
 	 * architecture to do anything.
 	 */
 	if ((flags & KEXEC_ARCH_MASK) == KEXEC_ARCH_DEFAULT)
 		return -EINVAL;
 
-	if (nr_segments > KEXEC_SEGMENT_MAX)
-		return -EINVAL;
-
 	ksegments = compat_alloc_user_space(nr_segments * sizeof(out));
 	for (i = 0; i < nr_segments; i++) {
 		result = copy_from_user(&in, &segments[i], sizeof(in));
@@ -272,6 +283,21 @@ COMPAT_SYSCALL_DEFINE4(kexec_load, compat_ulong_t, entry,
 			return -EFAULT;
 	}
 
-	return sys_kexec_load(entry, nr_segments, ksegments, flags);
+	/* Because we write directly to the reserved memory
+	 * region when loading crash kernels we need a mutex here to
+	 * prevent multiple crash  kernels from attempting to load
+	 * simultaneously, and to prevent a crash kernel from loading
+	 * over the top of a in use crash kernel.
+	 *
+	 * KISS: always take the mutex.
+	 */
+	if (!mutex_trylock(&kexec_mutex))
+		return -EBUSY;
+
+	result = do_kexec_load(entry, nr_segments, ksegments, flags);
+
+	mutex_unlock(&kexec_mutex);
+
+	return result;
 }
 #endif
-- 
2.16.2


* [PATCH 25/45] mm: add kernel_migrate_pages() helper, move compat syscall to mm/mempolicy.c
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (23 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 24/45] kexec: call do_kexec_load() in compat syscall directly Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 26/45] mm: add kernel_move_pages() helper, move compat syscall to mm/migrate.c Dominik Brodowski
                   ` (20 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Al Viro, linux-mm, Andrew Morton

Move compat_sys_migrate_pages() to mm/mempolicy.c and make it call a newly
introduced helper -- kernel_migrate_pages() -- instead of the syscall.
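
For readers unfamiliar with what the compat handler does before it
reaches the helper: it sizes a scratch buffer in whole unsigned longs
and re-packs a node mask made of 32-bit words into native words. The
following userspace sketch shows that arithmetic only; it performs no
user-space copies, and all toy_* names are assumptions of this sketch
rather than kernel interfaces.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for nr_bits, rounded up to whole unsigned longs. */
size_t toy_bitmap_bytes(size_t nr_bits)
{
	size_t longs = (nr_bits + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}

/* Re-pack 32-bit words into native words; bit n stays bit n. */
void toy_widen_bitmap(unsigned long *dst, const uint32_t *src, size_t nr_bits)
{
	size_t bit;

	memset(dst, 0, toy_bitmap_bytes(nr_bits));
	for (bit = 0; bit < nr_bits; bit++) {
		if (src[bit / 32] & (UINT32_C(1) << (bit % 32)))
			dst[bit / TOY_BITS_PER_LONG] |=
				1UL << (bit % TOY_BITS_PER_LONG);
	}
}

/* Return 1 if the given bit is set in a native-word bitmap, else 0. */
int toy_test_bit(const unsigned long *map, size_t bit)
{
	return (map[bit / TOY_BITS_PER_LONG] >> (bit % TOY_BITS_PER_LONG)) & 1UL;
}
```

Working bit by bit keeps the sketch independent of word size and byte
order, at the cost of speed; it is meant to show the layout, not to be
efficient.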

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 kernel/compat.c | 33 ---------------------------------
 mm/mempolicy.c  | 48 ++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 44 insertions(+), 37 deletions(-)

diff --git a/kernel/compat.c b/kernel/compat.c
index 3f5fa8902e7d..51bdf1808943 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -508,39 +508,6 @@ COMPAT_SYSCALL_DEFINE6(move_pages, pid_t, pid, compat_ulong_t, nr_pages,
 	}
 	return sys_move_pages(pid, nr_pages, pages, nodes, status, flags);
 }
-
-COMPAT_SYSCALL_DEFINE4(migrate_pages, compat_pid_t, pid,
-		       compat_ulong_t, maxnode,
-		       const compat_ulong_t __user *, old_nodes,
-		       const compat_ulong_t __user *, new_nodes)
-{
-	unsigned long __user *old = NULL;
-	unsigned long __user *new = NULL;
-	nodemask_t tmp_mask;
-	unsigned long nr_bits;
-	unsigned long size;
-
-	nr_bits = min_t(unsigned long, maxnode - 1, MAX_NUMNODES);
-	size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
-	if (old_nodes) {
-		if (compat_get_bitmap(nodes_addr(tmp_mask), old_nodes, nr_bits))
-			return -EFAULT;
-		old = compat_alloc_user_space(new_nodes ? size * 2 : size);
-		if (new_nodes)
-			new = old + size / sizeof(unsigned long);
-		if (copy_to_user(old, nodes_addr(tmp_mask), size))
-			return -EFAULT;
-	}
-	if (new_nodes) {
-		if (compat_get_bitmap(nodes_addr(tmp_mask), new_nodes, nr_bits))
-			return -EFAULT;
-		if (new == NULL)
-			new = compat_alloc_user_space(size);
-		if (copy_to_user(new, nodes_addr(tmp_mask), size))
-			return -EFAULT;
-	}
-	return sys_migrate_pages(pid, nr_bits + 1, old, new);
-}
 #endif
 
 /*
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d879f1d8a44a..7399ede02b5f 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1377,9 +1377,9 @@ SYSCALL_DEFINE3(set_mempolicy, int, mode, const unsigned long __user *, nmask,
 	return do_set_mempolicy(mode, flags, &nodes);
 }
 
-SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned long, maxnode,
-		const unsigned long __user *, old_nodes,
-		const unsigned long __user *, new_nodes)
+static int kernel_migrate_pages(pid_t pid, unsigned long maxnode,
+				const unsigned long __user *old_nodes,
+				const unsigned long __user *new_nodes)
 {
 	struct mm_struct *mm = NULL;
 	struct task_struct *task;
@@ -1469,6 +1469,13 @@ SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned long, maxnode,
 
 }
 
+SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned long, maxnode,
+		const unsigned long __user *, old_nodes,
+		const unsigned long __user *, new_nodes)
+{
+	return kernel_migrate_pages(pid, maxnode, old_nodes, new_nodes);
+}
+
 
 /* Retrieve NUMA policy */
 SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
@@ -1571,7 +1578,40 @@ COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len,
 	return sys_mbind(start, len, mode, nm, nr_bits+1, flags);
 }
 
-#endif
+COMPAT_SYSCALL_DEFINE4(migrate_pages, compat_pid_t, pid,
+		       compat_ulong_t, maxnode,
+		       const compat_ulong_t __user *, old_nodes,
+		       const compat_ulong_t __user *, new_nodes)
+{
+	unsigned long __user *old = NULL;
+	unsigned long __user *new = NULL;
+	nodemask_t tmp_mask;
+	unsigned long nr_bits;
+	unsigned long size;
+
+	nr_bits = min_t(unsigned long, maxnode - 1, MAX_NUMNODES);
+	size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
+	if (old_nodes) {
+		if (compat_get_bitmap(nodes_addr(tmp_mask), old_nodes, nr_bits))
+			return -EFAULT;
+		old = compat_alloc_user_space(new_nodes ? size * 2 : size);
+		if (new_nodes)
+			new = old + size / sizeof(unsigned long);
+		if (copy_to_user(old, nodes_addr(tmp_mask), size))
+			return -EFAULT;
+	}
+	if (new_nodes) {
+		if (compat_get_bitmap(nodes_addr(tmp_mask), new_nodes, nr_bits))
+			return -EFAULT;
+		if (new == NULL)
+			new = compat_alloc_user_space(size);
+		if (copy_to_user(new, nodes_addr(tmp_mask), size))
+			return -EFAULT;
+	}
+	return kernel_migrate_pages(pid, nr_bits + 1, old, new);
+}
+
+#endif /* CONFIG_COMPAT */
 
 struct mempolicy *__get_vma_policy(struct vm_area_struct *vma,
 						unsigned long addr)
-- 
2.16.2


* [PATCH 26/45] mm: add kernel_move_pages() helper, move compat syscall to mm/migrate.c
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (24 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 25/45] mm: add kernel_migrate_pages() helper, move compat syscall to mm/mempolicy.c Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 27/45] mm: add kernel_mbind() helper; remove in-kernel call to syscall Dominik Brodowski
                   ` (19 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Al Viro, linux-mm, Andrew Morton

Move compat_sys_move_pages() to mm/migrate.c and make it call a newly
introduced helper -- kernel_move_pages() -- instead of the syscall.
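
The only real work in the compat entry point is widening an array of
32-bit user pointers into native-width pointers before handing it to
the helper. A minimal userspace sketch of that step follows; it works
on ordinary arrays instead of user memory, so the get_user()/put_user()
error handling of the real code has no counterpart here, and the toy_*
names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Widen one 32-bit pointer value to a native pointer. */
void *toy_compat_ptr(uint32_t p)
{
	return (void *)(uintptr_t)p;
}

/* Widen a whole array; the caller provides room for nr_pages pointers. */
void toy_widen_pages(const void **pages, const uint32_t *pages32,
		     size_t nr_pages)
{
	size_t i;

	for (i = 0; i < nr_pages; i++)
		pages[i] = toy_compat_ptr(pages32[i]);
}
```

Once the array has native layout, the compat entry point can pass it
to the same helper as the native one, which is what the patch relies
on.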

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 kernel/compat.c | 22 ----------------------
 mm/migrate.c    | 39 +++++++++++++++++++++++++++++++++++----
 2 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/kernel/compat.c b/kernel/compat.c
index 51bdf1808943..6d21894806b4 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -488,28 +488,6 @@ get_compat_sigset(sigset_t *set, const compat_sigset_t __user *compat)
 }
 EXPORT_SYMBOL_GPL(get_compat_sigset);
 
-#ifdef CONFIG_NUMA
-COMPAT_SYSCALL_DEFINE6(move_pages, pid_t, pid, compat_ulong_t, nr_pages,
-		       compat_uptr_t __user *, pages32,
-		       const int __user *, nodes,
-		       int __user *, status,
-		       int, flags)
-{
-	const void __user * __user *pages;
-	int i;
-
-	pages = compat_alloc_user_space(nr_pages * sizeof(void *));
-	for (i = 0; i < nr_pages; i++) {
-		compat_uptr_t p;
-
-		if (get_user(p, pages32 + i) ||
-			put_user(compat_ptr(p), pages + i))
-			return -EFAULT;
-	}
-	return sys_move_pages(pid, nr_pages, pages, nodes, status, flags);
-}
-#endif
-
 /*
  * Allocate user-space memory for the duration of a single system call,
  * in order to marshall parameters inside a compat thunk.
diff --git a/mm/migrate.c b/mm/migrate.c
index 1e5525a25691..003886606a22 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -34,6 +34,7 @@
 #include <linux/backing-dev.h>
 #include <linux/compaction.h>
 #include <linux/syscalls.h>
+#include <linux/compat.h>
 #include <linux/hugetlb.h>
 #include <linux/hugetlb_cgroup.h>
 #include <linux/gfp.h>
@@ -1745,10 +1746,10 @@ static int do_pages_stat(struct mm_struct *mm, unsigned long nr_pages,
  * Move a list of pages in the address space of the currently executing
  * process.
  */
-SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
-		const void __user * __user *, pages,
-		const int __user *, nodes,
-		int __user *, status, int, flags)
+static int kernel_move_pages(pid_t pid, unsigned long nr_pages,
+			     const void __user * __user *pages,
+			     const int __user *nodes,
+			     int __user *status, int flags)
 {
 	struct task_struct *task;
 	struct mm_struct *mm;
@@ -1807,6 +1808,36 @@ SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
 	return err;
 }
 
+SYSCALL_DEFINE6(move_pages, pid_t, pid, unsigned long, nr_pages,
+		const void __user * __user *, pages,
+		const int __user *, nodes,
+		int __user *, status, int, flags)
+{
+	return kernel_move_pages(pid, nr_pages, pages, nodes, status, flags);
+}
+
+#ifdef CONFIG_COMPAT
+COMPAT_SYSCALL_DEFINE6(move_pages, pid_t, pid, compat_ulong_t, nr_pages,
+		       compat_uptr_t __user *, pages32,
+		       const int __user *, nodes,
+		       int __user *, status,
+		       int, flags)
+{
+	const void __user * __user *pages;
+	int i;
+
+	pages = compat_alloc_user_space(nr_pages * sizeof(void *));
+	for (i = 0; i < nr_pages; i++) {
+		compat_uptr_t p;
+
+		if (get_user(p, pages32 + i) ||
+			put_user(compat_ptr(p), pages + i))
+			return -EFAULT;
+	}
+	return kernel_move_pages(pid, nr_pages, pages, nodes, status, flags);
+}
+#endif /* CONFIG_COMPAT */
+
 #ifdef CONFIG_NUMA_BALANCING
 /*
  * Returns true if this is a safe migration target node for misplaced NUMA
-- 
2.16.2


* [PATCH 27/45] mm: add kernel_mbind() helper; remove in-kernel call to syscall
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (25 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 26/45] mm: add kernel_move_pages() helper, move compat syscall to mm/migrate.c Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 28/45] mm: add kernel_[sg]et_mempolicy() helpers; remove in-kernel calls to syscalls Dominik Brodowski
                   ` (18 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Al Viro, linux-mm, Andrew Morton

Using the mm-internal kernel_mbind() helper allows us to get rid of the
mm-internal call to the sys_mbind() syscall.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 mm/mempolicy.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 7399ede02b5f..e4d7d4c0b253 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1336,9 +1336,9 @@ static int copy_nodes_to_user(unsigned long __user *mask, unsigned long maxnode,
 	return copy_to_user(mask, nodes_addr(*nodes), copy) ? -EFAULT : 0;
 }
 
-SYSCALL_DEFINE6(mbind, unsigned long, start, unsigned long, len,
-		unsigned long, mode, const unsigned long __user *, nmask,
-		unsigned long, maxnode, unsigned, flags)
+static long kernel_mbind(unsigned long start, unsigned long len,
+			 unsigned long mode, const unsigned long __user *nmask,
+			 unsigned long maxnode, unsigned int flags)
 {
 	nodemask_t nodes;
 	int err;
@@ -1357,6 +1357,13 @@ SYSCALL_DEFINE6(mbind, unsigned long, start, unsigned long, len,
 	return do_mbind(start, len, mode, mode_flags, &nodes, flags);
 }
 
+SYSCALL_DEFINE6(mbind, unsigned long, start, unsigned long, len,
+		unsigned long, mode, const unsigned long __user *, nmask,
+		unsigned long, maxnode, unsigned int, flags)
+{
+	return kernel_mbind(start, len, mode, nmask, maxnode, flags);
+}
+
 /* Set the process memory policy */
 SYSCALL_DEFINE3(set_mempolicy, int, mode, const unsigned long __user *, nmask,
 		unsigned long, maxnode)
@@ -1575,7 +1582,7 @@ COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len,
 			return -EFAULT;
 	}
 
-	return sys_mbind(start, len, mode, nm, nr_bits+1, flags);
+	return kernel_mbind(start, len, mode, nm, nr_bits+1, flags);
 }
 
 COMPAT_SYSCALL_DEFINE4(migrate_pages, compat_pid_t, pid,
-- 
2.16.2


* [PATCH 28/45] mm: add kernel_[sg]et_mempolicy() helpers; remove in-kernel calls to syscalls
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (26 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 27/45] mm: add kernel_mbind() helper; remove in-kernel call to syscall Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 29/45] mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead() Dominik Brodowski
                   ` (17 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Al Viro, linux-mm, Andrew Morton

Using the mm-internal kernel_[sg]et_mempolicy() helpers allows us to get
rid of the mm-internal calls to the sys_[sg]et_mempolicy() syscalls.

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 mm/mempolicy.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index e4d7d4c0b253..ca817e768d0e 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1365,8 +1365,8 @@ SYSCALL_DEFINE6(mbind, unsigned long, start, unsigned long, len,
 }
 
 /* Set the process memory policy */
-SYSCALL_DEFINE3(set_mempolicy, int, mode, const unsigned long __user *, nmask,
-		unsigned long, maxnode)
+static long kernel_set_mempolicy(int mode, const unsigned long __user *nmask,
+				 unsigned long maxnode)
 {
 	int err;
 	nodemask_t nodes;
@@ -1384,6 +1384,12 @@ SYSCALL_DEFINE3(set_mempolicy, int, mode, const unsigned long __user *, nmask,
 	return do_set_mempolicy(mode, flags, &nodes);
 }
 
+SYSCALL_DEFINE3(set_mempolicy, int, mode, const unsigned long __user *, nmask,
+		unsigned long, maxnode)
+{
+	return kernel_set_mempolicy(mode, nmask, maxnode);
+}
+
 static int kernel_migrate_pages(pid_t pid, unsigned long maxnode,
 				const unsigned long __user *old_nodes,
 				const unsigned long __user *new_nodes)
@@ -1485,9 +1491,11 @@ SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned long, maxnode,
 
 
 /* Retrieve NUMA policy */
-SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
-		unsigned long __user *, nmask, unsigned long, maxnode,
-		unsigned long, addr, unsigned long, flags)
+static int kernel_get_mempolicy(int __user *policy,
+				unsigned long __user *nmask,
+				unsigned long maxnode,
+				unsigned long addr,
+				unsigned long flags)
 {
 	int err;
 	int uninitialized_var(pval);
@@ -1510,6 +1518,13 @@ SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
 	return err;
 }
 
+SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
+		unsigned long __user *, nmask, unsigned long, maxnode,
+		unsigned long, addr, unsigned long, flags)
+{
+	return kernel_get_mempolicy(policy, nmask, maxnode, addr, flags);
+}
+
 #ifdef CONFIG_COMPAT
 
 COMPAT_SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
@@ -1528,7 +1543,7 @@ COMPAT_SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
 	if (nmask)
 		nm = compat_alloc_user_space(alloc_size);
 
-	err = sys_get_mempolicy(policy, nm, nr_bits+1, addr, flags);
+	err = kernel_get_mempolicy(policy, nm, nr_bits+1, addr, flags);
 
 	if (!err && nmask) {
 		unsigned long copy_size;
@@ -1560,7 +1575,7 @@ COMPAT_SYSCALL_DEFINE3(set_mempolicy, int, mode, compat_ulong_t __user *, nmask,
 			return -EFAULT;
 	}
 
-	return sys_set_mempolicy(mode, nm, nr_bits+1);
+	return kernel_set_mempolicy(mode, nm, nr_bits+1);
 }
 
 COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len,
-- 
2.16.2


* [PATCH 29/45] mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead()
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Andrew Morton, linux-mm

Using this helper allows us to avoid the in-kernel calls to the
sys_readahead() syscall. The ksys_ prefix denotes that this function is
meant as a drop-in replacement for the syscall. In particular, it uses the
same calling convention as sys_readahead().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. Once no such callers remain, the syscall entry path can be
streamlined.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/mips/kernel/linux32.c      | 2 +-
 arch/parisc/kernel/sys_parisc.c | 2 +-
 arch/powerpc/kernel/sys_ppc32.c | 2 +-
 arch/s390/kernel/compat_linux.c | 2 +-
 arch/sparc/kernel/sys_sparc32.c | 2 +-
 arch/x86/ia32/sys_ia32.c        | 2 +-
 include/linux/syscalls.h        | 1 +
 mm/readahead.c                  | 7 ++++++-
 8 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/mips/kernel/linux32.c b/arch/mips/kernel/linux32.c
index 0571ab7b68b0..318f1c05c5b3 100644
--- a/arch/mips/kernel/linux32.c
+++ b/arch/mips/kernel/linux32.c
@@ -131,7 +131,7 @@ SYSCALL_DEFINE1(32_personality, unsigned long, personality)
 asmlinkage ssize_t sys32_readahead(int fd, u32 pad0, u64 a2, u64 a3,
 				   size_t count)
 {
-	return sys_readahead(fd, merge_64(a2, a3), count);
+	return ksys_readahead(fd, merge_64(a2, a3), count);
 }
 
 asmlinkage long sys32_sync_file_range(int fd, int __pad,
diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index 080d566654ea..8c99ebbe2bac 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -345,7 +345,7 @@ asmlinkage ssize_t parisc_pwrite64(unsigned int fd, const char __user *buf,
 asmlinkage ssize_t parisc_readahead(int fd, unsigned int high, unsigned int low,
 		                    size_t count)
 {
-	return sys_readahead(fd, (loff_t)high << 32 | low, count);
+	return ksys_readahead(fd, (loff_t)high << 32 | low, count);
 }
 
 asmlinkage long parisc_fadvise64_64(int fd,
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index 0b95fa13307f..c11c73373691 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -88,7 +88,7 @@ compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, com
 
 compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offhi, u32 offlo, u32 count)
 {
-	return sys_readahead(fd, ((loff_t)offhi << 32) | offlo, count);
+	return ksys_readahead(fd, ((loff_t)offhi << 32) | offlo, count);
 }
 
 asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4,
diff --git a/arch/s390/kernel/compat_linux.c b/arch/s390/kernel/compat_linux.c
index da5ef7718254..8ac38d51ed7d 100644
--- a/arch/s390/kernel/compat_linux.c
+++ b/arch/s390/kernel/compat_linux.c
@@ -328,7 +328,7 @@ COMPAT_SYSCALL_DEFINE5(s390_pwrite64, unsigned int, fd, const char __user *, ubu
 
 COMPAT_SYSCALL_DEFINE4(s390_readahead, int, fd, u32, high, u32, low, s32, count)
 {
-	return sys_readahead(fd, (unsigned long)high << 32 | low, count);
+	return ksys_readahead(fd, (unsigned long)high << 32 | low, count);
 }
 
 struct stat64_emu31 {
diff --git a/arch/sparc/kernel/sys_sparc32.c b/arch/sparc/kernel/sys_sparc32.c
index 4da66aed50b4..f166e5bbf506 100644
--- a/arch/sparc/kernel/sys_sparc32.c
+++ b/arch/sparc/kernel/sys_sparc32.c
@@ -217,7 +217,7 @@ asmlinkage long compat_sys_readahead(int fd,
 				     unsigned long offlo,
 				     compat_size_t count)
 {
-	return sys_readahead(fd, (offhi << 32) | offlo, count);
+	return ksys_readahead(fd, (offhi << 32) | offlo, count);
 }
 
 long compat_sys_fadvise64(int fd,
diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index bf4e8dbd65e7..064b76598a2e 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -208,7 +208,7 @@ COMPAT_SYSCALL_DEFINE6(x86_fadvise64_64, int, fd, __u32, offset_low,
 COMPAT_SYSCALL_DEFINE4(x86_readahead, int, fd, unsigned int, off_lo,
 		       unsigned int, off_hi, size_t, count)
 {
-	return sys_readahead(fd, ((u64)off_hi << 32) | off_lo, count);
+	return ksys_readahead(fd, ((u64)off_hi << 32) | off_lo, count);
 }
 
 COMPAT_SYSCALL_DEFINE6(x86_sync_file_range, int, fd, unsigned int, off_low,
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index c509459ce9d5..3591c4af33d8 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -982,6 +982,7 @@ ssize_t ksys_pwrite64(unsigned int fd, const char __user *buf,
 		      size_t count, loff_t pos);
 int ksys_fallocate(int fd, int mode, loff_t offset, loff_t len);
 int ksys_setsid(void);
+ssize_t ksys_readahead(int fd, loff_t offset, size_t count);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
diff --git a/mm/readahead.c b/mm/readahead.c
index c4ca70239233..4d57b4644f98 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -573,7 +573,7 @@ do_readahead(struct address_space *mapping, struct file *filp,
 	return force_page_cache_readahead(mapping, filp, index, nr);
 }
 
-SYSCALL_DEFINE3(readahead, int, fd, loff_t, offset, size_t, count)
+ssize_t ksys_readahead(int fd, loff_t offset, size_t count)
 {
 	ssize_t ret;
 	struct fd f;
@@ -592,3 +592,8 @@ SYSCALL_DEFINE3(readahead, int, fd, loff_t, offset, size_t, count)
 	}
 	return ret;
 }
+
+SYSCALL_DEFINE3(readahead, int, fd, loff_t, offset, size_t, count)
+{
+	return ksys_readahead(fd, offset, count);
+}
-- 
2.16.2
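
For illustration only, here is a minimal user-space sketch of the pattern
the readahead patch applies: the body lives in a plain C helper, the syscall
entry point becomes a one-line wrapper, and a 32-bit compat entry point
merges the two halves of the offset before calling the same helper. Every
sketch_* name, merge_offset() and last_offset are hypothetical stand-ins and
not kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Last offset seen by the stand-in helper, so a caller can inspect it. */
static int64_t last_offset;

/* Hypothetical stand-in for ksys_readahead(): records the offset only. */
static long sketch_ksys_readahead(int fd, int64_t offset, size_t count)
{
	(void)fd;
	(void)count;
	last_offset = offset;
	return 0;
}

/* Stand-in for the native syscall entry point: a thin wrapper. */
static long sketch_sys_readahead(int fd, int64_t offset, size_t count)
{
	return sketch_ksys_readahead(fd, offset, count);
}

/*
 * Merge two 32-bit halves into one 64-bit offset. The high half is
 * widened before the shift, because shifting a 32-bit value by 32 bits
 * is undefined in C.
 */
static int64_t merge_offset(uint32_t hi, uint32_t lo)
{
	return (int64_t)(((uint64_t)hi << 32) | lo);
}

/* Stand-in for a 32-bit compat entry point that receives a split offset. */
static long sketch_compat_readahead(int fd, uint32_t off_hi, uint32_t off_lo,
				    size_t count)
{
	return sketch_ksys_readahead(fd, merge_offset(off_hi, off_lo), count);
}
```

Both entry points reach the same helper, so neither calls the other's
syscall symbol. This is the property the series relies on.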

* [PATCH 30/45] ipc: add semtimedop syscall/compat_syscall wrappers
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_semtimedop() and compat_ksys_semtimedop() wrappers to avoid
in-kernel calls to these syscalls. The ksys_ prefix denotes that these
functions are meant as drop-in replacements for the syscalls. In
particular, they use the same calling conventions as sys_semtimedop() and
compat_sys_semtimedop().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. Once no such callers remain, the syscall entry path can be
streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/sem.c     | 23 ++++++++++++++++++-----
 ipc/syscall.c | 17 ++++++++++-------
 ipc/util.h    | 13 +++++++++++++
 3 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index a4af04979fd2..e21ceb8b4af1 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2120,8 +2120,8 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
 	return error;
 }
 
-SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
-		unsigned, nsops, const struct timespec __user *, timeout)
+long ksys_semtimedop(int semid, struct sembuf __user *tsops,
+		     unsigned int nsops, const struct timespec __user *timeout)
 {
 	if (timeout) {
 		struct timespec64 ts;
@@ -2132,10 +2132,16 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
 	return do_semtimedop(semid, tsops, nsops, NULL);
 }
 
+SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
+		unsigned int, nsops, const struct timespec __user *, timeout)
+{
+	return ksys_semtimedop(semid, tsops, nsops, timeout);
+}
+
 #ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsems,
-		       unsigned, nsops,
-		       const struct compat_timespec __user *, timeout)
+long compat_ksys_semtimedop(int semid, struct sembuf __user *tsems,
+			    unsigned int nsops,
+			    const struct compat_timespec __user *timeout)
 {
 	if (timeout) {
 		struct timespec64 ts;
@@ -2145,6 +2151,13 @@ COMPAT_SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsems,
 	}
 	return do_semtimedop(semid, tsems, nsops, NULL);
 }
+
+COMPAT_SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsems,
+		       unsigned int, nsops,
+		       const struct compat_timespec __user *, timeout)
+{
+	return compat_ksys_semtimedop(semid, tsems, nsops, timeout);
+}
 #endif
 
 SYSCALL_DEFINE3(semop, int, semid, struct sembuf __user *, tsops,
diff --git a/ipc/syscall.c b/ipc/syscall.c
index 3763b4293b74..84d6a7691baa 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -7,6 +7,9 @@
  */
 #include <linux/unistd.h>
 #include <linux/syscalls.h>
+#include <linux/security.h>
+#include <linux/ipc_namespace.h>
+#include "util.h"
 
 #ifdef __ARCH_WANT_SYS_IPC
 #include <linux/errno.h>
@@ -24,12 +27,12 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 
 	switch (call) {
 	case SEMOP:
-		return sys_semtimedop(first, (struct sembuf __user *)ptr,
-				      second, NULL);
+		return ksys_semtimedop(first, (struct sembuf __user *)ptr,
+				       second, NULL);
 	case SEMTIMEDOP:
-		return sys_semtimedop(first, (struct sembuf __user *)ptr,
-				      second,
-				      (const struct timespec __user *)fifth);
+		return ksys_semtimedop(first, (struct sembuf __user *)ptr,
+				       second,
+				       (const struct timespec __user *)fifth);
 
 	case SEMGET:
 		return sys_semget(first, second, third);
@@ -124,9 +127,9 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 	switch (call) {
 	case SEMOP:
 		/* struct sembuf is the same on 32 and 64bit :)) */
-		return sys_semtimedop(first, compat_ptr(ptr), second, NULL);
+		return ksys_semtimedop(first, compat_ptr(ptr), second, NULL);
 	case SEMTIMEDOP:
-		return compat_sys_semtimedop(first, compat_ptr(ptr), second,
+		return compat_ksys_semtimedop(first, compat_ptr(ptr), second,
 						compat_ptr(fifth));
 	case SEMGET:
 		return sys_semget(first, second, third);
diff --git a/ipc/util.h b/ipc/util.h
index 89b8ec176fc4..6deadf77547e 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -235,4 +235,17 @@ static inline int compat_ipc_parse_version(int *cmd)
 #endif
 }
 #endif
+
+/* for __ARCH_WANT_SYS_IPC */
+long ksys_semtimedop(int semid, struct sembuf __user *tsops,
+		     unsigned int nsops,
+		     const struct timespec __user *timeout);
+
+/* for CONFIG_ARCH_WANT_OLD_COMPAT_IPC */
+#ifdef CONFIG_COMPAT
+long compat_ksys_semtimedop(int semid, struct sembuf __user *tsems,
+			    unsigned int nsops,
+			    const struct compat_timespec __user *timeout);
+#endif /* CONFIG_COMPAT */
+
 #endif
-- 
2.16.2
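
As a rough user-space sketch of what the ipc/syscall.c hunks do, the
multiplexer below dispatches to a helper instead of to the syscall entry
point, and treats SEMOP as SEMTIMEDOP with a NULL timeout. All sketch_* and
SKETCH_* names, the simplified argument types and the helper's return values
are assumptions made for the example; the real helpers operate on __user
pointers and on actual semaphore state.

```c
#include <errno.h>
#include <stddef.h>

/* Illustrative call numbers for the multiplexer. */
enum sketch_ipc_call { SKETCH_SEMOP = 1, SKETCH_SEMTIMEDOP = 4 };

struct sketch_timespec {
	long tv_sec;
	long tv_nsec;
};

/*
 * Hypothetical stand-in for ksys_semtimedop(): returns 1 when a timeout
 * was supplied and 0 otherwise, so the dispatch path is observable.
 */
static long sketch_ksys_semtimedop(int semid, void *tsops, unsigned int nsops,
				   const struct sketch_timespec *timeout)
{
	(void)semid;
	(void)tsops;
	(void)nsops;
	return timeout ? 1 : 0;
}

/* Stand-in for the syscall entry point: only forwards to the helper. */
static long sketch_sys_semtimedop(int semid, void *tsops, unsigned int nsops,
				  const struct sketch_timespec *timeout)
{
	return sketch_ksys_semtimedop(semid, tsops, nsops, timeout);
}

/* Stand-in for the ipc() multiplexer: calls the helper, never the entry point. */
static long sketch_ipc(unsigned int call, int first, unsigned long second,
		       void *ptr, const void *fifth)
{
	switch (call) {
	case SKETCH_SEMOP:
		/* SEMOP is SEMTIMEDOP without a timeout. */
		return sketch_ksys_semtimedop(first, ptr,
					      (unsigned int)second, NULL);
	case SKETCH_SEMTIMEDOP:
		return sketch_ksys_semtimedop(first, ptr, (unsigned int)second,
					      (const struct sketch_timespec *)fifth);
	default:
		return -ENOSYS;
	}
}
```

With this split, the entry point and the multiplexer are siblings: both are
callers of the helper, and the entry point has no in-kernel callers left.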

* [PATCH 31/45] ipc: add semget syscall wrapper
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_semget() wrapper to avoid in-kernel calls to this syscall.
The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_semget().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. Once no such callers remain, the syscall entry path can be
streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/sem.c     | 7 ++++++-
 ipc/syscall.c | 4 ++--
 ipc/util.h    | 1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index e21ceb8b4af1..2e5f7ec7a7db 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -556,7 +556,7 @@ static inline int sem_more_checks(struct kern_ipc_perm *ipcp,
 	return 0;
 }
 
-SYSCALL_DEFINE3(semget, key_t, key, int, nsems, int, semflg)
+long ksys_semget(key_t key, int nsems, int semflg)
 {
 	struct ipc_namespace *ns;
 	static const struct ipc_ops sem_ops = {
@@ -578,6 +578,11 @@ SYSCALL_DEFINE3(semget, key_t, key, int, nsems, int, semflg)
 	return ipcget(ns, &sem_ids(ns), &sem_ops, &sem_params);
 }
 
+SYSCALL_DEFINE3(semget, key_t, key, int, nsems, int, semflg)
+{
+	return ksys_semget(key, nsems, semflg);
+}
+
 /**
  * perform_atomic_semop[_slow] - Attempt to perform semaphore
  *                               operations on a given array.
diff --git a/ipc/syscall.c b/ipc/syscall.c
index 84d6a7691baa..21fcdf0b4836 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -35,7 +35,7 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 				       (const struct timespec __user *)fifth);
 
 	case SEMGET:
-		return sys_semget(first, second, third);
+		return ksys_semget(first, second, third);
 	case SEMCTL: {
 		unsigned long arg;
 		if (!ptr)
@@ -132,7 +132,7 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 		return compat_ksys_semtimedop(first, compat_ptr(ptr), second,
 						compat_ptr(fifth));
 	case SEMGET:
-		return sys_semget(first, second, third);
+		return ksys_semget(first, second, third);
 	case SEMCTL:
 		if (!ptr)
 			return -EINVAL;
diff --git a/ipc/util.h b/ipc/util.h
index 6deadf77547e..0f07056e5a73 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -240,6 +240,7 @@ static inline int compat_ipc_parse_version(int *cmd)
 long ksys_semtimedop(int semid, struct sembuf __user *tsops,
 		     unsigned int nsops,
 		     const struct timespec __user *timeout);
+long ksys_semget(key_t key, int nsems, int semflg);
 
 /* for CONFIG_ARCH_WANT_OLD_COMPAT_IPC */
 #ifdef CONFIG_COMPAT
-- 
2.16.2

* [PATCH 32/45] ipc: add semctl syscall/compat_syscall wrappers
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_semctl() and compat_ksys_semctl() wrappers to avoid in-kernel
calls to these syscalls. The ksys_ prefix denotes that these functions are
meant as drop-in replacements for the syscalls. In particular, they use
the same calling conventions as sys_semctl() and compat_sys_semctl().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. Once no such callers remain, the syscall entry path can be
streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/sem.c     | 14 ++++++++++++--
 ipc/syscall.c |  4 ++--
 ipc/util.h    |  2 ++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index 2e5f7ec7a7db..1cf56279a84c 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1581,7 +1581,7 @@ static int semctl_down(struct ipc_namespace *ns, int semid,
 	return err;
 }
 
-SYSCALL_DEFINE4(semctl, int, semid, int, semnum, int, cmd, unsigned long, arg)
+long ksys_semctl(int semid, int semnum, int cmd, unsigned long arg)
 {
 	int version;
 	struct ipc_namespace *ns;
@@ -1635,6 +1635,11 @@ SYSCALL_DEFINE4(semctl, int, semid, int, semnum, int, cmd, unsigned long, arg)
 	}
 }
 
+SYSCALL_DEFINE4(semctl, int, semid, int, semnum, int, cmd, unsigned long, arg)
+{
+	return ksys_semctl(semid, semnum, cmd, arg);
+}
+
 #ifdef CONFIG_COMPAT
 
 struct compat_semid_ds {
@@ -1683,7 +1688,7 @@ static int copy_compat_semid_to_user(void __user *buf, struct semid64_ds *in,
 	}
 }
 
-COMPAT_SYSCALL_DEFINE4(semctl, int, semid, int, semnum, int, cmd, int, arg)
+long compat_ksys_semctl(int semid, int semnum, int cmd, int arg)
 {
 	void __user *p = compat_ptr(arg);
 	struct ipc_namespace *ns;
@@ -1727,6 +1732,11 @@ COMPAT_SYSCALL_DEFINE4(semctl, int, semid, int, semnum, int, cmd, int, arg)
 		return -EINVAL;
 	}
 }
+
+COMPAT_SYSCALL_DEFINE4(semctl, int, semid, int, semnum, int, cmd, int, arg)
+{
+	return compat_ksys_semctl(semid, semnum, cmd, arg);
+}
 #endif
 
 /* If the task doesn't already have a undo_list, then allocate one
diff --git a/ipc/syscall.c b/ipc/syscall.c
index 21fcdf0b4836..a536cca37661 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -42,7 +42,7 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 			return -EINVAL;
 		if (get_user(arg, (unsigned long __user *) ptr))
 			return -EFAULT;
-		return sys_semctl(first, second, third, arg);
+		return ksys_semctl(first, second, third, arg);
 	}
 
 	case MSGSND:
@@ -138,7 +138,7 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 			return -EINVAL;
 		if (get_user(pad, (u32 __user *) compat_ptr(ptr)))
 			return -EFAULT;
-		return compat_sys_semctl(first, second, third, pad);
+		return compat_ksys_semctl(first, second, third, pad);
 
 	case MSGSND:
 		return compat_sys_msgsnd(first, ptr, second, third);
diff --git a/ipc/util.h b/ipc/util.h
index 0f07056e5a73..1f1109b83437 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -241,12 +241,14 @@ long ksys_semtimedop(int semid, struct sembuf __user *tsops,
 		     unsigned int nsops,
 		     const struct timespec __user *timeout);
 long ksys_semget(key_t key, int nsems, int semflg);
+long ksys_semctl(int semid, int semnum, int cmd, unsigned long arg);
 
 /* for CONFIG_ARCH_WANT_OLD_COMPAT_IPC */
 #ifdef CONFIG_COMPAT
 long compat_ksys_semtimedop(int semid, struct sembuf __user *tsems,
 			    unsigned int nsops,
 			    const struct compat_timespec __user *timeout);
+long compat_ksys_semctl(int semid, int semnum, int cmd, int arg);
 #endif /* CONFIG_COMPAT */
 
 #endif
-- 
2.16.2
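
The compat helper above takes its last argument as a plain int and turns it
into a user pointer with compat_ptr(). The user-space sketch below shows the
widening this implies, assuming compat_ptr() behaves as a zero-extending
cast of a 32-bit value (that assumption may not hold on every architecture).
sketch_compat_ptr() and sketch_semctl_arg_to_addr() are hypothetical names
introduced only for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for compat_ptr(): zero-extend a 32-bit address. */
static void *sketch_compat_ptr(uint32_t uptr)
{
	return (void *)(uintptr_t)uptr;
}

/*
 * Mimics the first step of the compat semctl helper: the int argument is
 * converted to an unsigned 32-bit value and then widened, so a negative
 * int does not sign-extend into the upper half of a 64-bit address.
 */
static uintptr_t sketch_semctl_arg_to_addr(int arg)
{
	return (uintptr_t)sketch_compat_ptr((uint32_t)arg);
}
```

Under that assumption, an argument of -1 maps to address 0xffffffff rather
than to an all-ones 64-bit value.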

* [PATCH 33/45] ipc: add msgget syscall wrapper
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_msgget() wrapper to avoid in-kernel calls to this syscall.
The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_msgget().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. Once no such callers remain, the syscall entry path can be
streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/msg.c     | 7 ++++++-
 ipc/syscall.c | 4 ++--
 ipc/util.h    | 1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 0dcc6699dc53..64e8276be164 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -263,7 +263,7 @@ static inline int msg_security(struct kern_ipc_perm *ipcp, int msgflg)
 	return security_msg_queue_associate(msq, msgflg);
 }
 
-SYSCALL_DEFINE2(msgget, key_t, key, int, msgflg)
+long ksys_msgget(key_t key, int msgflg)
 {
 	struct ipc_namespace *ns;
 	static const struct ipc_ops msg_ops = {
@@ -280,6 +280,11 @@ SYSCALL_DEFINE2(msgget, key_t, key, int, msgflg)
 	return ipcget(ns, &msg_ids(ns), &msg_ops, &msg_params);
 }
 
+SYSCALL_DEFINE2(msgget, key_t, key, int, msgflg)
+{
+	return ksys_msgget(key, msgflg);
+}
+
 static inline unsigned long
 copy_msqid_to_user(void __user *buf, struct msqid64_ds *in, int version)
 {
diff --git a/ipc/syscall.c b/ipc/syscall.c
index a536cca37661..355c4a644bbf 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -68,7 +68,7 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 					   second, fifth, third);
 		}
 	case MSGGET:
-		return sys_msgget((key_t) first, second);
+		return ksys_msgget((key_t) first, second);
 	case MSGCTL:
 		return sys_msgctl(first, second, (struct msqid_ds __user *)ptr);
 
@@ -161,7 +161,7 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 		return compat_sys_msgrcv(first, ptr, second, fifth, third);
 	}
 	case MSGGET:
-		return sys_msgget(first, second);
+		return ksys_msgget(first, second);
 	case MSGCTL:
 		return compat_sys_msgctl(first, second, compat_ptr(ptr));
 
diff --git a/ipc/util.h b/ipc/util.h
index 1f1109b83437..b35c0dfe3bc3 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -242,6 +242,7 @@ long ksys_semtimedop(int semid, struct sembuf __user *tsops,
 		     const struct timespec __user *timeout);
 long ksys_semget(key_t key, int nsems, int semflg);
 long ksys_semctl(int semid, int semnum, int cmd, unsigned long arg);
+long ksys_msgget(key_t key, int msgflg);
 
 /* for CONFIG_ARCH_WANT_OLD_COMPAT_IPC */
 #ifdef CONFIG_COMPAT
-- 
2.16.2

* [PATCH 34/45] ipc: add shmget syscall wrapper
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_shmget() wrapper to avoid in-kernel calls to this syscall.
The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_shmget().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. Once no such callers remain, the syscall entry path can be
streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/shm.c     | 7 ++++++-
 ipc/syscall.c | 4 ++--
 ipc/util.h    | 1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 4643865e9171..9f3cdb259a51 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -656,7 +656,7 @@ static inline int shm_more_checks(struct kern_ipc_perm *ipcp,
 	return 0;
 }
 
-SYSCALL_DEFINE3(shmget, key_t, key, size_t, size, int, shmflg)
+long ksys_shmget(key_t key, size_t size, int shmflg)
 {
 	struct ipc_namespace *ns;
 	static const struct ipc_ops shm_ops = {
@@ -675,6 +675,11 @@ SYSCALL_DEFINE3(shmget, key_t, key, size_t, size, int, shmflg)
 	return ipcget(ns, &shm_ids(ns), &shm_ops, &shm_params);
 }
 
+SYSCALL_DEFINE3(shmget, key_t, key, size_t, size, int, shmflg)
+{
+	return ksys_shmget(key, size, shmflg);
+}
+
 static inline unsigned long copy_shmid_to_user(void __user *buf, struct shmid64_ds *in, int version)
 {
 	switch (version) {
diff --git a/ipc/syscall.c b/ipc/syscall.c
index 355c4a644bbf..60bceb19b6f0 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -92,7 +92,7 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 	case SHMDT:
 		return sys_shmdt((char __user *)ptr);
 	case SHMGET:
-		return sys_shmget(first, second, third);
+		return ksys_shmget(first, second, third);
 	case SHMCTL:
 		return sys_shmctl(first, second,
 				   (struct shmid_ds __user *) ptr);
@@ -180,7 +180,7 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 	case SHMDT:
 		return sys_shmdt(compat_ptr(ptr));
 	case SHMGET:
-		return sys_shmget(first, (unsigned)second, third);
+		return ksys_shmget(first, (unsigned int)second, third);
 	case SHMCTL:
 		return compat_sys_shmctl(first, second, compat_ptr(ptr));
 	}
diff --git a/ipc/util.h b/ipc/util.h
index b35c0dfe3bc3..51002c0b2a21 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -243,6 +243,7 @@ long ksys_semtimedop(int semid, struct sembuf __user *tsops,
 long ksys_semget(key_t key, int nsems, int semflg);
 long ksys_semctl(int semid, int semnum, int cmd, unsigned long arg);
 long ksys_msgget(key_t key, int msgflg);
+long ksys_shmget(key_t key, size_t size, int shmflg);
 
 /* for CONFIG_ARCH_WANT_OLD_COMPAT_IPC */
 #ifdef CONFIG_COMPAT
-- 
2.16.2

* [PATCH 35/45] ipc: add shmdt syscall wrapper
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_shmdt() wrapper to avoid in-kernel calls to this syscall.
The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_shmdt().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. Once no such callers remain, the syscall entry path can be
streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/shm.c     | 7 ++++++-
 ipc/syscall.c | 4 ++--
 ipc/util.h    | 1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 9f3cdb259a51..e5838e3328dc 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1481,7 +1481,7 @@ COMPAT_SYSCALL_DEFINE3(shmat, int, shmid, compat_uptr_t, shmaddr, int, shmflg)
  * detach and kill segment if marked destroyed.
  * The work is done in shm_close.
  */
-SYSCALL_DEFINE1(shmdt, char __user *, shmaddr)
+long ksys_shmdt(char __user *shmaddr)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
@@ -1588,6 +1588,11 @@ SYSCALL_DEFINE1(shmdt, char __user *, shmaddr)
 	return retval;
 }
 
+SYSCALL_DEFINE1(shmdt, char __user *, shmaddr)
+{
+	return ksys_shmdt(shmaddr);
+}
+
 #ifdef CONFIG_PROC_FS
 static int sysvipc_shm_proc_show(struct seq_file *s, void *it)
 {
diff --git a/ipc/syscall.c b/ipc/syscall.c
index 60bceb19b6f0..b3aa71564815 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -90,7 +90,7 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 			return -EINVAL;
 		}
 	case SHMDT:
-		return sys_shmdt((char __user *)ptr);
+		return ksys_shmdt((char __user *)ptr);
 	case SHMGET:
 		return ksys_shmget(first, second, third);
 	case SHMCTL:
@@ -178,7 +178,7 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 		return put_user(raddr, (compat_ulong_t __user *)compat_ptr(third));
 	}
 	case SHMDT:
-		return sys_shmdt(compat_ptr(ptr));
+		return ksys_shmdt(compat_ptr(ptr));
 	case SHMGET:
 		return ksys_shmget(first, (unsigned int)second, third);
 	case SHMCTL:
diff --git a/ipc/util.h b/ipc/util.h
index 51002c0b2a21..7770bcad1168 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -244,6 +244,7 @@ long ksys_semget(key_t key, int nsems, int semflg);
 long ksys_semctl(int semid, int semnum, int cmd, unsigned long arg);
 long ksys_msgget(key_t key, int msgflg);
 long ksys_shmget(key_t key, size_t size, int shmflg);
+long ksys_shmdt(char __user *shmaddr);
 
 /* for CONFIG_ARCH_WANT_OLD_COMPAT_IPC */
 #ifdef CONFIG_COMPAT
-- 
2.16.2

* [PATCH 36/45] ipc: add shmctl syscall/compat_syscall wrappers
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_shmctl() and compat_ksys_shmctl() wrappers to avoid in-kernel
calls to these syscalls. The ksys_ prefix denotes that these functions are
meant as drop-in replacements for the syscalls. In particular, they use
the same calling conventions as sys_shmctl() and compat_sys_shmctl().

This patch is part of a series which tries to remove in-kernel calls to
syscalls. Once no such callers remain, the syscall entry path can be
streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/shm.c     | 14 ++++++++++++--
 ipc/syscall.c |  4 ++--
 ipc/util.h    |  2 ++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index e5838e3328dc..0aae3e55bc56 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1045,7 +1045,7 @@ static int shmctl_do_lock(struct ipc_namespace *ns, int shmid, int cmd)
 	return err;
 }
 
-SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf)
+long ksys_shmctl(int shmid, int cmd, struct shmid_ds __user *buf)
 {
 	int err, version;
 	struct ipc_namespace *ns;
@@ -1099,6 +1099,11 @@ SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf)
 	}
 }
 
+SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, struct shmid_ds __user *, buf)
+{
+	return ksys_shmctl(shmid, cmd, buf);
+}
+
 #ifdef CONFIG_COMPAT
 
 struct compat_shmid_ds {
@@ -1218,7 +1223,7 @@ static int copy_compat_shmid_from_user(struct shmid64_ds *out, void __user *buf,
 	}
 }
 
-COMPAT_SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, void __user *, uptr)
+long compat_ksys_shmctl(int shmid, int cmd, void __user *uptr)
 {
 	struct ipc_namespace *ns;
 	struct shmid64_ds sem64;
@@ -1273,6 +1278,11 @@ COMPAT_SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, void __user *, uptr)
 	}
 	return err;
 }
+
+COMPAT_SYSCALL_DEFINE3(shmctl, int, shmid, int, cmd, void __user *, uptr)
+{
+	return compat_ksys_shmctl(shmid, cmd, uptr);
+}
 #endif
 
 /*
diff --git a/ipc/syscall.c b/ipc/syscall.c
index b3aa71564815..34bbabc9e672 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -94,7 +94,7 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 	case SHMGET:
 		return ksys_shmget(first, second, third);
 	case SHMCTL:
-		return sys_shmctl(first, second,
+		return ksys_shmctl(first, second,
 				   (struct shmid_ds __user *) ptr);
 	default:
 		return -ENOSYS;
@@ -182,7 +182,7 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 	case SHMGET:
 		return ksys_shmget(first, (unsigned int)second, third);
 	case SHMCTL:
-		return compat_sys_shmctl(first, second, compat_ptr(ptr));
+		return compat_ksys_shmctl(first, second, compat_ptr(ptr));
 	}
 
 	return -ENOSYS;
diff --git a/ipc/util.h b/ipc/util.h
index 7770bcad1168..16e8b5b8c416 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -245,6 +245,7 @@ long ksys_semctl(int semid, int semnum, int cmd, unsigned long arg);
 long ksys_msgget(key_t key, int msgflg);
 long ksys_shmget(key_t key, size_t size, int shmflg);
 long ksys_shmdt(char __user *shmaddr);
+long ksys_shmctl(int shmid, int cmd, struct shmid_ds __user *buf);
 
 /* for CONFIG_ARCH_WANT_OLD_COMPAT_IPC */
 #ifdef CONFIG_COMPAT
@@ -252,6 +253,7 @@ long compat_ksys_semtimedop(int semid, struct sembuf __user *tsems,
 			    unsigned int nsops,
 			    const struct compat_timespec __user *timeout);
 long compat_ksys_semctl(int semid, int semnum, int cmd, int arg);
+long compat_ksys_shmctl(int shmid, int cmd, void __user *uptr);
 #endif /* CONFIG_COMPAT */
 
 #endif
-- 
2.16.2

* [PATCH 37/45] ipc: add msgctl syscall/compat_syscall wrappers
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (35 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 36/45] ipc: add shmctl syscall/compat_syscall wrappers Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 38/45] ipc: add msgrcv " Dominik Brodowski
                   ` (8 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_msgctl() and compat_ksys_msgctl() wrappers to avoid in-kernel
calls to these syscalls. The ksys_ prefix denotes that these functions are
meant as a drop-in replacement for the syscalls. In particular, they use
the same calling convention as sys_msgctl() and compat_sys_msgctl().
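
The in-kernel caller converted by this patch is the ipc() multiplexer in
ipc/syscall.c. A minimal userspace sketch of that dispatch, assuming
made-up demo_* helpers in place of ksys_msgget() and ksys_msgctl(), might
look as follows. The call numbers mirror include/uapi/linux/ipc.h; the
helper bodies are placeholders.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Call numbers as in include/uapi/linux/ipc.h. */
#define DEMO_MSGGET 13
#define DEMO_MSGCTL 14

/* Hypothetical helpers standing in for ksys_msgget() and ksys_msgctl(). */
static long demo_ksys_msgget(int key, int msgflg)
{
	(void)msgflg;
	return key + 100;	/* made-up queue id */
}

static long demo_ksys_msgctl(int msqid, int cmd, void *buf)
{
	(void)cmd;
	(void)buf;
	return msqid < 0 ? -EINVAL : 0;
}

/* The multiplexer only decodes the call number and forwards to helpers. */
long demo_ipc(unsigned int call, int first, unsigned long second, void *ptr)
{
	call &= 0xffff;	/* the upper 16 bits carry a version number */

	switch (call) {
	case DEMO_MSGGET:
		return demo_ksys_msgget(first, (int)second);
	case DEMO_MSGCTL:
		return demo_ksys_msgctl(first, (int)second, ptr);
	default:
		return -ENOSYS;
	}
}
```

Because the dispatcher calls helpers rather than syscall entry points, the
entry points themselves are free to change their calling convention later.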

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/msg.c     | 14 ++++++++++++--
 ipc/syscall.c |  5 +++--
 ipc/util.h    |  2 ++
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 64e8276be164..5b026868df07 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -538,7 +538,7 @@ static int msgctl_stat(struct ipc_namespace *ns, int msqid,
 	return err;
 }
 
-SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, struct msqid_ds __user *, buf)
+long ksys_msgctl(int msqid, int cmd, struct msqid_ds __user *buf)
 {
 	int version;
 	struct ipc_namespace *ns;
@@ -581,6 +581,11 @@ SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, struct msqid_ds __user *, buf)
 	}
 }
 
+SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, struct msqid_ds __user *, buf)
+{
+	return ksys_msgctl(msqid, cmd, buf);
+}
+
 #ifdef CONFIG_COMPAT
 
 struct compat_msqid_ds {
@@ -651,7 +656,7 @@ static int copy_compat_msqid_to_user(void __user *buf, struct msqid64_ds *in,
 	}
 }
 
-COMPAT_SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, void __user *, uptr)
+long compat_ksys_msgctl(int msqid, int cmd, void __user *uptr)
 {
 	struct ipc_namespace *ns;
 	int err;
@@ -692,6 +697,11 @@ COMPAT_SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, void __user *, uptr)
 		return -EINVAL;
 	}
 }
+
+COMPAT_SYSCALL_DEFINE3(msgctl, int, msqid, int, cmd, void __user *, uptr)
+{
+	return compat_ksys_msgctl(msqid, cmd, uptr);
+}
 #endif
 
 static int testmsg(struct msg_msg *msg, long type, int mode)
diff --git a/ipc/syscall.c b/ipc/syscall.c
index 34bbabc9e672..aa29b0802e26 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -70,7 +70,8 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 	case MSGGET:
 		return ksys_msgget((key_t) first, second);
 	case MSGCTL:
-		return sys_msgctl(first, second, (struct msqid_ds __user *)ptr);
+		return ksys_msgctl(first, second,
+				   (struct msqid_ds __user *)ptr);
 
 	case SHMAT:
 		switch (version) {
@@ -163,7 +164,7 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 	case MSGGET:
 		return ksys_msgget(first, second);
 	case MSGCTL:
-		return compat_sys_msgctl(first, second, compat_ptr(ptr));
+		return compat_ksys_msgctl(first, second, compat_ptr(ptr));
 
 	case SHMAT: {
 		int err;
diff --git a/ipc/util.h b/ipc/util.h
index 16e8b5b8c416..47837b4af3f2 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -243,6 +243,7 @@ long ksys_semtimedop(int semid, struct sembuf __user *tsops,
 long ksys_semget(key_t key, int nsems, int semflg);
 long ksys_semctl(int semid, int semnum, int cmd, unsigned long arg);
 long ksys_msgget(key_t key, int msgflg);
+long ksys_msgctl(int msqid, int cmd, struct msqid_ds __user *buf);
 long ksys_shmget(key_t key, size_t size, int shmflg);
 long ksys_shmdt(char __user *shmaddr);
 long ksys_shmctl(int shmid, int cmd, struct shmid_ds __user *buf);
@@ -253,6 +254,7 @@ long compat_ksys_semtimedop(int semid, struct sembuf __user *tsems,
 			    unsigned int nsops,
 			    const struct compat_timespec __user *timeout);
 long compat_ksys_semctl(int semid, int semnum, int cmd, int arg);
+long compat_ksys_msgctl(int msqid, int cmd, void __user *uptr);
 long compat_ksys_shmctl(int shmid, int cmd, void __user *uptr);
 #endif /* CONFIG_COMPAT */
 
-- 
2.16.2

* [PATCH 38/45] ipc: add msgrcv syscall/compat_syscall wrappers
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (36 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 37/45] ipc: add msgctl " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 39/45] ipc: add msgsnd " Dominik Brodowski
                   ` (7 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_msgrcv() and compat_ksys_msgrcv() wrappers to avoid in-kernel
calls to these syscalls. The ksys_ prefix denotes that these functions are
meant as a drop-in replacement for the syscalls. In particular, they use
the same calling convention as sys_msgrcv() and compat_sys_msgrcv().
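
Both helpers end up in do_msgrcv() and differ only in the fill callback
they pass. A userspace sketch of that arrangement, with hypothetical
demo_* types and a single hard-coded message in place of a real queue,
could read:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical queued message, standing in for struct msg_msg. */
struct demo_msg {
	long type;
	const char *text;
	size_t len;
};

/* Native and 32-bit user buffers differ in the width of mtype. */
struct demo_msgbuf {
	long mtype;
	char mtext[16];
};

struct demo_compat_msgbuf {
	int32_t mtype;
	char mtext[16];
};

typedef long (*demo_fill_fn)(void *dest, const struct demo_msg *msg,
			     size_t bufsz);

/* Never copy more than the message, the request, or the demo buffer. */
static size_t demo_clamp(size_t len, size_t bufsz)
{
	size_t n = len < bufsz ? len : bufsz;

	return n < 16 ? n : 16;
}

static long demo_msg_fill(void *dest, const struct demo_msg *msg,
			  size_t bufsz)
{
	struct demo_msgbuf *out = dest;
	size_t n = demo_clamp(msg->len, bufsz);

	out->mtype = msg->type;
	memcpy(out->mtext, msg->text, n);
	return (long)n;
}

static long demo_compat_msg_fill(void *dest, const struct demo_msg *msg,
				 size_t bufsz)
{
	struct demo_compat_msgbuf *out = dest;
	size_t n = demo_clamp(msg->len, bufsz);

	out->mtype = (int32_t)msg->type;
	memcpy(out->mtext, msg->text, n);
	return (long)n;
}

/* Shared core: only the final copy-out depends on the buffer layout. */
static long demo_do_msgrcv(void *buf, size_t bufsz, demo_fill_fn fill)
{
	static const struct demo_msg queued = { 7, "hello", 5 };

	return fill(buf, &queued, bufsz);
}

long demo_ksys_msgrcv(struct demo_msgbuf *msgp, size_t msgsz)
{
	return demo_do_msgrcv(msgp, msgsz, demo_msg_fill);
}

long demo_compat_ksys_msgrcv(struct demo_compat_msgbuf *msgp, size_t msgsz)
{
	return demo_do_msgrcv(msgp, msgsz, demo_compat_msg_fill);
}
```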

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/msg.c     | 19 ++++++++++++++++---
 ipc/syscall.c |  8 ++++----
 ipc/util.h    |  4 ++++
 3 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 5b026868df07..abc5826270a6 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -1150,10 +1150,16 @@ static long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, in
 	return bufsz;
 }
 
+long ksys_msgrcv(int msqid, struct msgbuf __user *msgp, size_t msgsz,
+		 long msgtyp, int msgflg)
+{
+	return do_msgrcv(msqid, msgp, msgsz, msgtyp, msgflg, do_msg_fill);
+}
+
 SYSCALL_DEFINE5(msgrcv, int, msqid, struct msgbuf __user *, msgp, size_t, msgsz,
 		long, msgtyp, int, msgflg)
 {
-	return do_msgrcv(msqid, msgp, msgsz, msgtyp, msgflg, do_msg_fill);
+	return ksys_msgrcv(msqid, msgp, msgsz, msgtyp, msgflg);
 }
 
 #ifdef CONFIG_COMPAT
@@ -1171,12 +1177,19 @@ static long compat_do_msg_fill(void __user *dest, struct msg_msg *msg, size_t bu
 	return msgsz;
 }
 
-COMPAT_SYSCALL_DEFINE5(msgrcv, int, msqid, compat_uptr_t, msgp,
-		       compat_ssize_t, msgsz, compat_long_t, msgtyp, int, msgflg)
+long compat_ksys_msgrcv(int msqid, compat_uptr_t msgp, compat_ssize_t msgsz,
+			compat_long_t msgtyp, int msgflg)
 {
 	return do_msgrcv(msqid, compat_ptr(msgp), (ssize_t)msgsz, (long)msgtyp,
 			 msgflg, compat_do_msg_fill);
 }
+
+COMPAT_SYSCALL_DEFINE5(msgrcv, int, msqid, compat_uptr_t, msgp,
+		       compat_ssize_t, msgsz, compat_long_t, msgtyp,
+		       int, msgflg)
+{
+	return compat_ksys_msgrcv(msqid, msgp, msgsz, msgtyp, msgflg);
+}
 #endif
 
 int msg_init_ns(struct ipc_namespace *ns)
diff --git a/ipc/syscall.c b/ipc/syscall.c
index aa29b0802e26..0228c7afd882 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -59,11 +59,11 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 					   (struct ipc_kludge __user *) ptr,
 					   sizeof(tmp)))
 				return -EFAULT;
-			return sys_msgrcv(first, tmp.msgp, second,
+			return ksys_msgrcv(first, tmp.msgp, second,
 					   tmp.msgtyp, third);
 		}
 		default:
-			return sys_msgrcv(first,
+			return ksys_msgrcv(first,
 					   (struct msgbuf __user *) ptr,
 					   second, fifth, third);
 		}
@@ -156,10 +156,10 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 				return -EINVAL;
 			if (copy_from_user(&ipck, uptr, sizeof(ipck)))
 				return -EFAULT;
-			return compat_sys_msgrcv(first, ipck.msgp, second,
+			return compat_ksys_msgrcv(first, ipck.msgp, second,
 						 ipck.msgtyp, third);
 		}
-		return compat_sys_msgrcv(first, ptr, second, fifth, third);
+		return compat_ksys_msgrcv(first, ptr, second, fifth, third);
 	}
 	case MSGGET:
 		return ksys_msgget(first, second);
diff --git a/ipc/util.h b/ipc/util.h
index 47837b4af3f2..c16aceb1bdec 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -244,6 +244,8 @@ long ksys_semget(key_t key, int nsems, int semflg);
 long ksys_semctl(int semid, int semnum, int cmd, unsigned long arg);
 long ksys_msgget(key_t key, int msgflg);
 long ksys_msgctl(int msqid, int cmd, struct msqid_ds __user *buf);
+long ksys_msgrcv(int msqid, struct msgbuf __user *msgp, size_t msgsz,
+		 long msgtyp, int msgflg);
 long ksys_shmget(key_t key, size_t size, int shmflg);
 long ksys_shmdt(char __user *shmaddr);
 long ksys_shmctl(int shmid, int cmd, struct shmid_ds __user *buf);
@@ -255,6 +257,8 @@ long compat_ksys_semtimedop(int semid, struct sembuf __user *tsems,
 			    const struct compat_timespec __user *timeout);
 long compat_ksys_semctl(int semid, int semnum, int cmd, int arg);
 long compat_ksys_msgctl(int msqid, int cmd, void __user *uptr);
+long compat_ksys_msgrcv(int msqid, compat_uptr_t msgp, compat_ssize_t msgsz,
+			compat_long_t msgtyp, int msgflg);
 long compat_ksys_shmctl(int shmid, int cmd, void __user *uptr);
 #endif /* CONFIG_COMPAT */
 
-- 
2.16.2

* [PATCH 39/45] ipc: add msgsnd syscall/compat_syscall wrappers
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (37 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 38/45] ipc: add msgrcv " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 40/45] x86: use _do_fork() in compat_sys_x86_clone() Dominik Brodowski
                   ` (6 subsequent siblings)
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Al Viro, Andrew Morton

Provide ksys_msgsnd() and compat_ksys_msgsnd() wrappers to avoid in-kernel
calls to these syscalls. The ksys_ prefix denotes that these functions are
meant as a drop-in replacement for the syscalls. In particular, they use
the same calling convention as sys_msgsnd() and compat_sys_msgsnd().
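
The compat helper differs from the native one mainly in how it reads the
user buffer: the pointer arrives as a 32-bit value and mtype is a 32-bit
long. The following userspace sketch models just that widening step. The
demo_* typedefs and functions are assumptions for illustration, and
demo_do_msgsnd() keeps only the message type check of do_msgsnd().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for compat_uptr_t and compat_long_t. */
typedef uint32_t demo_compat_uptr_t;
typedef int32_t demo_compat_long_t;

struct demo_compat_msgbuf {
	demo_compat_long_t mtype;
	char mtext[1];
};

/* Widen a 32-bit user pointer value to a native pointer. */
void *demo_compat_ptr(demo_compat_uptr_t uptr)
{
	return (void *)(uintptr_t)uptr;
}

/* Stand-in for do_msgsnd(): only the message type check is modelled. */
static long demo_do_msgsnd(long mtype)
{
	return mtype < 1 ? -EINVAL : 0;
}

/* The 32-bit mtype is read first, then widened to the native long. */
long demo_compat_ksys_msgsnd(const struct demo_compat_msgbuf *up)
{
	demo_compat_long_t mtype = up->mtype;

	return demo_do_msgsnd(mtype);
}
```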

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 ipc/msg.c     | 20 ++++++++++++++++----
 ipc/syscall.c |  4 ++--
 ipc/util.h    |  4 ++++
 3 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index abc5826270a6..9de48065c1ac 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -867,8 +867,8 @@ static long do_msgsnd(int msqid, long mtype, void __user *mtext,
 	return err;
 }
 
-SYSCALL_DEFINE4(msgsnd, int, msqid, struct msgbuf __user *, msgp, size_t, msgsz,
-		int, msgflg)
+long ksys_msgsnd(int msqid, struct msgbuf __user *msgp, size_t msgsz,
+		 int msgflg)
 {
 	long mtype;
 
@@ -877,6 +877,12 @@ SYSCALL_DEFINE4(msgsnd, int, msqid, struct msgbuf __user *, msgp, size_t, msgsz,
 	return do_msgsnd(msqid, mtype, msgp->mtext, msgsz, msgflg);
 }
 
+SYSCALL_DEFINE4(msgsnd, int, msqid, struct msgbuf __user *, msgp, size_t, msgsz,
+		int, msgflg)
+{
+	return ksys_msgsnd(msqid, msgp, msgsz, msgflg);
+}
+
 #ifdef CONFIG_COMPAT
 
 struct compat_msgbuf {
@@ -884,8 +890,8 @@ struct compat_msgbuf {
 	char mtext[1];
 };
 
-COMPAT_SYSCALL_DEFINE4(msgsnd, int, msqid, compat_uptr_t, msgp,
-		       compat_ssize_t, msgsz, int, msgflg)
+long compat_ksys_msgsnd(int msqid, compat_uptr_t msgp,
+		       compat_ssize_t msgsz, int msgflg)
 {
 	struct compat_msgbuf __user *up = compat_ptr(msgp);
 	compat_long_t mtype;
@@ -894,6 +900,12 @@ COMPAT_SYSCALL_DEFINE4(msgsnd, int, msqid, compat_uptr_t, msgp,
 		return -EFAULT;
 	return do_msgsnd(msqid, mtype, up->mtext, (ssize_t)msgsz, msgflg);
 }
+
+COMPAT_SYSCALL_DEFINE4(msgsnd, int, msqid, compat_uptr_t, msgp,
+		       compat_ssize_t, msgsz, int, msgflg)
+{
+	return compat_ksys_msgsnd(msqid, msgp, msgsz, msgflg);
+}
 #endif
 
 static inline int convert_mode(long *msgtyp, int msgflg)
diff --git a/ipc/syscall.c b/ipc/syscall.c
index 0228c7afd882..77a883ef2eca 100644
--- a/ipc/syscall.c
+++ b/ipc/syscall.c
@@ -46,7 +46,7 @@ SYSCALL_DEFINE6(ipc, unsigned int, call, int, first, unsigned long, second,
 	}
 
 	case MSGSND:
-		return sys_msgsnd(first, (struct msgbuf __user *) ptr,
+		return ksys_msgsnd(first, (struct msgbuf __user *) ptr,
 				  second, third);
 	case MSGRCV:
 		switch (version) {
@@ -142,7 +142,7 @@ COMPAT_SYSCALL_DEFINE6(ipc, u32, call, int, first, int, second,
 		return compat_ksys_semctl(first, second, third, pad);
 
 	case MSGSND:
-		return compat_sys_msgsnd(first, ptr, second, third);
+		return compat_ksys_msgsnd(first, ptr, second, third);
 
 	case MSGRCV: {
 		void __user *uptr = compat_ptr(ptr);
diff --git a/ipc/util.h b/ipc/util.h
index c16aceb1bdec..51853dc2f340 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -246,6 +246,8 @@ long ksys_msgget(key_t key, int msgflg);
 long ksys_msgctl(int msqid, int cmd, struct msqid_ds __user *buf);
 long ksys_msgrcv(int msqid, struct msgbuf __user *msgp, size_t msgsz,
 		 long msgtyp, int msgflg);
+long ksys_msgsnd(int msqid, struct msgbuf __user *msgp, size_t msgsz,
+		 int msgflg);
 long ksys_shmget(key_t key, size_t size, int shmflg);
 long ksys_shmdt(char __user *shmaddr);
 long ksys_shmctl(int shmid, int cmd, struct shmid_ds __user *buf);
@@ -259,6 +261,8 @@ long compat_ksys_semctl(int semid, int semnum, int cmd, int arg);
 long compat_ksys_msgctl(int msqid, int cmd, void __user *uptr);
 long compat_ksys_msgrcv(int msqid, compat_uptr_t msgp, compat_ssize_t msgsz,
 			compat_long_t msgtyp, int msgflg);
+long compat_ksys_msgsnd(int msqid, compat_uptr_t msgp,
+		       compat_ssize_t msgsz, int msgflg);
 long compat_ksys_shmctl(int shmid, int cmd, void __user *uptr);
 #endif /* CONFIG_COMPAT */
 
-- 
2.16.2

* [PATCH 40/45] x86: use _do_fork() in compat_sys_x86_clone()
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (38 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 39/45] ipc: add msgsnd " Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:26   ` Thomas Gleixner
  2018-03-22  9:00 ` [PATCH 41/45] x86: remove compat_sys_x86_waitpid() Dominik Brodowski
                   ` (5 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Ingo Molnar, Jiri Slaby, x86

It is trivial to directly call _do_fork() instead of the sys_clone()
syscall in compat_sys_x86_clone().
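
The only subtle part is the argument order: the compat entry point takes
tls_val before child_tidptr, while _do_fork() takes an extra stack size
(0 here) and the TLS value last. A userspace sketch of that mapping, with
a made-up demo_do_fork() that merely records what it was given:

```c
#include <assert.h>

/* Records the arguments of the last demo_do_fork() call. */
struct demo_fork_args {
	unsigned long clone_flags;
	unsigned long stack_start;
	unsigned long stack_size;
	int *parent_tidptr;
	int *child_tidptr;
	unsigned long tls;
};

static struct demo_fork_args demo_last;

/* Hypothetical stand-in with the same parameter order as _do_fork(). */
static long demo_do_fork(unsigned long clone_flags,
			 unsigned long stack_start,
			 unsigned long stack_size,
			 int *parent_tidptr, int *child_tidptr,
			 unsigned long tls)
{
	demo_last.clone_flags = clone_flags;
	demo_last.stack_start = stack_start;
	demo_last.stack_size = stack_size;
	demo_last.parent_tidptr = parent_tidptr;
	demo_last.child_tidptr = child_tidptr;
	demo_last.tls = tls;
	return 1234;	/* made-up child pid */
}

/* Same parameter order as compat_sys_x86_clone(). */
long demo_compat_x86_clone(unsigned long clone_flags, unsigned long newsp,
			   int *parent_tidptr, unsigned long tls_val,
			   int *child_tidptr)
{
	return demo_do_fork(clone_flags, newsp, 0, parent_tidptr,
			    child_tidptr, tls_val);
}
```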

This patch is part of a series which tries to remove in-kernel calls to
syscalls. On this basis, the syscall entry path can be streamlined.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/ia32/sys_ia32.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index 064b76598a2e..b9d4d8abc3f7 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -41,6 +41,7 @@
 #include <linux/highuid.h>
 #include <linux/sysctl.h>
 #include <linux/slab.h>
+#include <linux/sched/task.h>
 #include <asm/mman.h>
 #include <asm/types.h>
 #include <linux/uaccess.h>
@@ -242,6 +243,6 @@ COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, clone_flags,
 		       unsigned long, newsp, int __user *, parent_tidptr,
 		       unsigned long, tls_val, int __user *, child_tidptr)
 {
-	return sys_clone(clone_flags, newsp, parent_tidptr, child_tidptr,
+	return _do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr,
 			tls_val);
 }
-- 
2.16.2

* [PATCH 41/45] x86: remove compat_sys_x86_waitpid()
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (39 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 40/45] x86: use _do_fork() in compat_sys_x86_clone() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:27   ` Thomas Gleixner
  2018-03-22  9:00 ` [PATCH 42/45] x86: fix sys_sigreturn() return type to be long, not unsigned long Dominik Brodowski
                   ` (4 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Ingo Molnar, Jiri Slaby, x86

compat_sys_x86_waitpid() is not needed, as it takes the same parameters
(int, int *, int) as the native syscall.
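
For reference, the removed wrapper merely forwarded to the wait4 variant
with a NULL rusage pointer. The same relationship can be observed from
userspace; the sketch below assumes a Linux/glibc environment, and the
demo_* function names are made up for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* waitpid() expressed through wait4() without an rusage buffer. */
pid_t demo_waitpid(pid_t pid, int *stat_addr, int options)
{
	return wait4(pid, stat_addr, options, NULL);
}

/*
 * Fork a child that exits with 'code' and return the exit code the
 * parent observes, or -1 on any failure.
 */
int demo_child_exit_code(int code)
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);
	if (demo_waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```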

Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/entry/syscalls/syscall_32.tbl | 2 +-
 arch/x86/ia32/sys_ia32.c               | 6 ------
 arch/x86/include/asm/sys_ia32.h        | 3 ---
 3 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 09338dd2bd94..c58f75b088c5 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -13,7 +13,7 @@
 4	i386	write			sys_write
 5	i386	open			sys_open			compat_sys_open
 6	i386	close			sys_close
-7	i386	waitpid			sys_waitpid			compat_sys_x86_waitpid
+7	i386	waitpid			sys_waitpid
 8	i386	creat			sys_creat
 9	i386	link			sys_link
 10	i386	unlink			sys_unlink
diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index b9d4d8abc3f7..bd8a7020b9a7 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -170,12 +170,6 @@ COMPAT_SYSCALL_DEFINE1(x86_mmap, struct mmap_arg_struct32 __user *, arg)
 			       a.offset>>PAGE_SHIFT);
 }
 
-COMPAT_SYSCALL_DEFINE3(x86_waitpid, compat_pid_t, pid, unsigned int __user *,
-		       stat_addr, int, options)
-{
-	return compat_sys_wait4(pid, stat_addr, options, NULL);
-}
-
 /* warning: next two assume little endian */
 COMPAT_SYSCALL_DEFINE5(x86_pread, unsigned int, fd, char __user *, ubuf,
 		       u32, count, u32, poslo, u32, poshi)
diff --git a/arch/x86/include/asm/sys_ia32.h b/arch/x86/include/asm/sys_ia32.h
index 906794aa034e..2ee6e3b96656 100644
--- a/arch/x86/include/asm/sys_ia32.h
+++ b/arch/x86/include/asm/sys_ia32.h
@@ -35,9 +35,6 @@ asmlinkage long compat_sys_x86_fstatat(unsigned int, const char __user *,
 struct mmap_arg_struct32;
 asmlinkage long compat_sys_x86_mmap(struct mmap_arg_struct32 __user *);
 
-asmlinkage long compat_sys_x86_waitpid(compat_pid_t, unsigned int __user *,
-				       int);
-
 asmlinkage long compat_sys_x86_pread(unsigned int, char __user *, u32, u32,
 				     u32);
 asmlinkage long compat_sys_x86_pwrite(unsigned int, const char __user *, u32,
-- 
2.16.2

* [PATCH 42/45] x86: fix sys_sigreturn() return type to be long, not unsigned long
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (40 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 41/45] x86: remove compat_sys_x86_waitpid() Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:27   ` Thomas Gleixner
  2018-03-22  9:00 ` [PATCH 43/45] x86/sigreturn: use SYSCALL_DEFINE0 Dominik Brodowski
                   ` (3 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Andi Kleen, Ingo Molnar, Jiri Slaby, x86, Michael Tautschnig

As with other system calls, sys_sigreturn() should return a value
of type long, not unsigned long. This also matches the behaviour for
IA32_EMULATION, see sys32_sigreturn() in arch/x86/ia32/ia32_signal.c.

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: x86@kernel.org
Cc: Michael Tautschnig <tautschn@amazon.co.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/include/asm/syscalls.h | 2 +-
 arch/x86/kernel/signal.c        | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 1c0bebbd039e..ae6e05fdc24b 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -35,7 +35,7 @@ asmlinkage long sys_get_thread_area(struct user_desc __user *);
 #ifdef CONFIG_X86_32
 
 /* kernel/signal.c */
-asmlinkage unsigned long sys_sigreturn(void);
+asmlinkage long sys_sigreturn(void);
 
 /* kernel/vm86_32.c */
 struct vm86_struct;
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 4cdc0b27ec82..83a26726b689 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -601,7 +601,7 @@ static int x32_setup_rt_frame(struct ksignal *ksig,
  * Do a signal return; undo the signal stack.
  */
 #ifdef CONFIG_X86_32
-asmlinkage unsigned long sys_sigreturn(void)
+asmlinkage long sys_sigreturn(void)
 {
 	struct pt_regs *regs = current_pt_regs();
 	struct sigframe __user *frame;
-- 
2.16.2

* [PATCH 43/45] x86/sigreturn: use SYSCALL_DEFINE0
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (41 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 42/45] x86: fix sys_sigreturn() return type to be long, not unsigned long Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:27   ` Thomas Gleixner
  2018-03-22  9:00 ` [PATCH 44/45] kernel/sys_ni: sort cond_syscall() entries Dominik Brodowski
                   ` (2 subsequent siblings)
  45 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Tautschnig, Michael, Michael Tautschnig, Thomas Gleixner,
	Ingo Molnar, H . Peter Anvin, Jaswinder Singh, Andi Kleen, x86

From: "Tautschnig, Michael" <tautschn@amazon.co.uk>

All syscall definitions in x86, except for those patched here, already
use the appropriate SYSCALL_DEFINE*() macro.
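
For readers unfamiliar with the macro, a deliberately simplified stand-in
is sketched below. It is an assumption-laden illustration: the real
SYSCALL_DEFINE0() in include/linux/syscalls.h also emits syscall metadata
and marks the function asmlinkage, and the demo_* names and the saved
register value are made up.

```c
#include <assert.h>

/* Simplified stand-in: only generates the sys_ prefixed prototype. */
#define DEMO_SYSCALL_DEFINE0(name) long demo_sys_##name(void)

/* Made-up value playing the role of the restored return register. */
static long demo_saved_ax = 42;

/* Expands to: long demo_sys_rt_sigreturn(void) */
DEMO_SYSCALL_DEFINE0(rt_sigreturn)
{
	return demo_saved_ax;
}
```

Using the macro keeps the name, return type and linkage of every entry
point in one place, so they can be changed tree-wide later.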

Signed-off-by: Michael Tautschnig <tautschn@amazon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jaswinder Singh <jaswinder@infradead.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: x86@kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/kernel/signal.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 83a26726b689..da270b95fe4d 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -25,6 +25,7 @@
 #include <linux/user-return-notifier.h>
 #include <linux/uprobes.h>
 #include <linux/context_tracking.h>
+#include <linux/syscalls.h>
 
 #include <asm/processor.h>
 #include <asm/ucontext.h>
@@ -601,7 +602,7 @@ static int x32_setup_rt_frame(struct ksignal *ksig,
  * Do a signal return; undo the signal stack.
  */
 #ifdef CONFIG_X86_32
-asmlinkage long sys_sigreturn(void)
+SYSCALL_DEFINE0(sigreturn)
 {
 	struct pt_regs *regs = current_pt_regs();
 	struct sigframe __user *frame;
@@ -633,7 +634,7 @@ asmlinkage long sys_sigreturn(void)
 }
 #endif /* CONFIG_X86_32 */
 
-asmlinkage long sys_rt_sigreturn(void)
+SYSCALL_DEFINE0(rt_sigreturn)
 {
 	struct pt_regs *regs = current_pt_regs();
 	struct rt_sigframe __user *frame;
-- 
2.16.2

* [PATCH 44/45] kernel/sys_ni: sort cond_syscall() entries
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (42 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 43/45] x86/sigreturn: use SYSCALL_DEFINE0 Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22  9:00 ` [PATCH 45/45] bpf: whitelist all syscalls for error injection Dominik Brodowski
  2018-03-22 20:29 ` [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Linus Torvalds
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch

Shuffle the cond_syscall() entries in kernel/sys_ni.c around so that they
are kept in the same order as in include/uapi/asm-generic/unistd.h. For
better structuring, add the same comments as in that file, but keep a few
additional comments and extend the commentary where it seems useful.
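
As background, each cond_syscall() entry provides a weak fallback that
resolves to sys_ni_syscall() when the real implementation is not built.
A userspace approximation using the GCC/Clang weak alias attribute on an
ELF target is sketched below; the demo_* names are hypothetical and the
kernel's macro is implemented differently (via an asm directive).

```c
#include <assert.h>
#include <errno.h>

/* Fallback used for every syscall that is not built in. */
long demo_sys_ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Weak alias: the symbol resolves to demo_sys_ni_syscall() unless a
 * strong definition is provided by another object file at link time.
 */
#define demo_cond_syscall(x) \
	long x(void) __attribute__((weak, alias("demo_sys_ni_syscall")))

/* No other object file defines these, so both fall back to -ENOSYS. */
demo_cond_syscall(demo_sys_acct);
demo_cond_syscall(demo_sys_quotactl);
```

Since every entry is an independent declaration, reordering the list as
done in this patch does not change behaviour.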

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 kernel/sys_ni.c | 506 +++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 332 insertions(+), 174 deletions(-)

diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 951dbda5c2b4..0c1538f5a559 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -17,245 +17,403 @@ asmlinkage long sys_ni_syscall(void)
 	return -ENOSYS;
 }
 
-cond_syscall(sys_quotactl);
-cond_syscall(compat_sys_quotactl32);
-cond_syscall(sys_acct);
+/*
+ * This list is kept in the same order as include/uapi/asm-generic/unistd.h.
+ * Architecture specific entries go below, followed by deprecated or obsolete
+ * system calls.
+ */
+
+cond_syscall(sys_io_setup);
+cond_syscall(compat_sys_io_setup);
+cond_syscall(sys_io_destroy);
+cond_syscall(sys_io_submit);
+cond_syscall(compat_sys_io_submit);
+cond_syscall(sys_io_cancel);
+cond_syscall(sys_io_getevents);
+cond_syscall(compat_sys_io_getevents);
+
+/* fs/xattr.c */
+
+/* fs/dcache.c */
+
+/* fs/cookies.c */
 cond_syscall(sys_lookup_dcookie);
 cond_syscall(compat_sys_lookup_dcookie);
-cond_syscall(sys_swapon);
-cond_syscall(sys_swapoff);
+
+/* fs/eventfd.c */
+cond_syscall(sys_eventfd2);
+
+/* fs/eventfd.c */
+cond_syscall(sys_epoll_create1);
+cond_syscall(sys_epoll_ctl);
+cond_syscall(sys_epoll_pwait);
+cond_syscall(compat_sys_epoll_pwait);
+
+/* fs/fcntl.c */
+
+/* fs/inotify_user.c */
+cond_syscall(sys_inotify_init1);
+cond_syscall(sys_inotify_add_watch);
+cond_syscall(sys_inotify_rm_watch);
+
+/* fs/ioctl.c */
+
+/* fs/ioprio.c */
+cond_syscall(sys_ioprio_set);
+cond_syscall(sys_ioprio_get);
+
+/* fs/locks.c */
+cond_syscall(sys_flock);
+
+/* fs/namei.c */
+
+/* fs/namespace.c */
+
+/* fs/nfsctl.c */
+
+/* fs/open.c */
+
+/* fs/pipe.c */
+
+/* fs/quota.c */
+cond_syscall(sys_quotactl);
+
+/* fs/readdir.c */
+
+/* fs/read_write.c */
+
+/* fs/sendfile.c */
+
+/* fs/select.c */
+
+/* fs/signalfd.c */
+cond_syscall(sys_signalfd4);
+cond_syscall(compat_sys_signalfd4);
+
+/* fs/splice.c */
+
+/* fs/stat.c */
+
+/* fs/sync.c */
+
+/* fs/timerfd.c */
+cond_syscall(sys_timerfd_create);
+cond_syscall(sys_timerfd_settime);
+cond_syscall(compat_sys_timerfd_settime);
+cond_syscall(sys_timerfd_gettime);
+cond_syscall(compat_sys_timerfd_gettime);
+
+/* fs/utimes.c */
+
+/* kernel/acct.c */
+cond_syscall(sys_acct);
+
+/* kernel/capability.c */
+cond_syscall(sys_capget);
+cond_syscall(sys_capset);
+
+/* kernel/exec_domain.c */
+
+/* kernel/exit.c */
+
+/* kernel/fork.c */
+
+/* kernel/futex.c */
+cond_syscall(sys_futex);
+cond_syscall(compat_sys_futex);
+cond_syscall(sys_set_robust_list);
+cond_syscall(compat_sys_set_robust_list);
+cond_syscall(sys_get_robust_list);
+cond_syscall(compat_sys_get_robust_list);
+
+/* kernel/hrtimer.c */
+
+/* kernel/itimer.c */
+
+/* kernel/kexec.c */
 cond_syscall(sys_kexec_load);
 cond_syscall(compat_sys_kexec_load);
-cond_syscall(sys_kexec_file_load);
+
+/* kernel/module.c */
 cond_syscall(sys_init_module);
-cond_syscall(sys_finit_module);
 cond_syscall(sys_delete_module);
+
+/* kernel/posix-timers.c */
+
+/* kernel/printk.c */
+cond_syscall(sys_syslog);
+
+/* kernel/ptrace.c */
+
+/* kernel/sched/core.c */
+
+/* kernel/signal.c */
+
+/* kernel/sys.c */
+cond_syscall(sys_setregid);
+cond_syscall(sys_setgid);
+cond_syscall(sys_setreuid);
+cond_syscall(sys_setuid);
+cond_syscall(sys_setresuid);
+cond_syscall(sys_getresuid);
+cond_syscall(sys_setresgid);
+cond_syscall(sys_getresgid);
+cond_syscall(sys_setfsuid);
+cond_syscall(sys_setfsgid);
+cond_syscall(sys_setgroups);
+cond_syscall(sys_getgroups);
+
+/* kernel/time.c */
+
+/* kernel/timer.c */
+
+/* ipc/mqueue.c */
+cond_syscall(sys_mq_open);
+cond_syscall(compat_sys_mq_open);
+cond_syscall(sys_mq_unlink);
+cond_syscall(sys_mq_timedsend);
+cond_syscall(compat_sys_mq_timedsend);
+cond_syscall(sys_mq_timedreceive);
+cond_syscall(compat_sys_mq_timedreceive);
+cond_syscall(sys_mq_notify);
+cond_syscall(compat_sys_mq_notify);
+cond_syscall(sys_mq_getsetattr);
+cond_syscall(compat_sys_mq_getsetattr);
+
+/* ipc/msg.c */
+cond_syscall(sys_msgget);
+cond_syscall(sys_msgctl);
+cond_syscall(compat_sys_msgctl);
+cond_syscall(sys_msgrcv);
+cond_syscall(compat_sys_msgrcv);
+cond_syscall(sys_msgsnd);
+cond_syscall(compat_sys_msgsnd);
+
+/* ipc/sem.c */
+cond_syscall(sys_semget);
+cond_syscall(sys_semctl);
+cond_syscall(compat_sys_semctl);
+cond_syscall(sys_semtimedop);
+cond_syscall(compat_sys_semtimedop);
+cond_syscall(sys_semop);
+
+/* ipc/shm.c */
+cond_syscall(sys_shmget);
+cond_syscall(sys_shmctl);
+cond_syscall(compat_sys_shmctl);
+cond_syscall(sys_shmat);
+cond_syscall(compat_sys_shmat);
+cond_syscall(sys_shmdt);
+
+/* net/socket.c */
+cond_syscall(sys_socket);
 cond_syscall(sys_socketpair);
 cond_syscall(sys_bind);
 cond_syscall(sys_listen);
 cond_syscall(sys_accept);
-cond_syscall(sys_accept4);
 cond_syscall(sys_connect);
 cond_syscall(sys_getsockname);
 cond_syscall(sys_getpeername);
-cond_syscall(sys_sendto);
-cond_syscall(sys_send);
-cond_syscall(sys_recvfrom);
-cond_syscall(sys_recv);
-cond_syscall(sys_socket);
 cond_syscall(sys_setsockopt);
 cond_syscall(compat_sys_setsockopt);
 cond_syscall(sys_getsockopt);
 cond_syscall(compat_sys_getsockopt);
+cond_syscall(sys_sendto);
 cond_syscall(sys_shutdown);
+cond_syscall(sys_recvfrom);
+cond_syscall(compat_sys_recvfrom);
 cond_syscall(sys_sendmsg);
-cond_syscall(sys_sendmmsg);
 cond_syscall(compat_sys_sendmsg);
-cond_syscall(compat_sys_sendmmsg);
 cond_syscall(sys_recvmsg);
-cond_syscall(sys_recvmmsg);
 cond_syscall(compat_sys_recvmsg);
-cond_syscall(compat_sys_recv);
-cond_syscall(compat_sys_recvfrom);
-cond_syscall(compat_sys_recvmmsg);
-cond_syscall(sys_socketcall);
-cond_syscall(sys_futex);
-cond_syscall(compat_sys_futex);
-cond_syscall(sys_set_robust_list);
-cond_syscall(compat_sys_set_robust_list);
-cond_syscall(sys_get_robust_list);
-cond_syscall(compat_sys_get_robust_list);
-cond_syscall(sys_epoll_create);
-cond_syscall(sys_epoll_create1);
-cond_syscall(sys_epoll_ctl);
-cond_syscall(sys_epoll_wait);
-cond_syscall(sys_epoll_pwait);
-cond_syscall(compat_sys_epoll_pwait);
-cond_syscall(sys_semget);
-cond_syscall(sys_semop);
-cond_syscall(sys_semtimedop);
-cond_syscall(compat_sys_semtimedop);
-cond_syscall(sys_semctl);
-cond_syscall(compat_sys_semctl);
-cond_syscall(sys_msgget);
-cond_syscall(sys_msgsnd);
-cond_syscall(compat_sys_msgsnd);
-cond_syscall(sys_msgrcv);
-cond_syscall(compat_sys_msgrcv);
-cond_syscall(sys_msgctl);
-cond_syscall(compat_sys_msgctl);
-cond_syscall(sys_shmget);
-cond_syscall(sys_shmat);
-cond_syscall(compat_sys_shmat);
-cond_syscall(sys_shmdt);
-cond_syscall(sys_shmctl);
-cond_syscall(compat_sys_shmctl);
-cond_syscall(sys_mq_open);
-cond_syscall(sys_mq_unlink);
-cond_syscall(sys_mq_timedsend);
-cond_syscall(sys_mq_timedreceive);
-cond_syscall(sys_mq_notify);
-cond_syscall(sys_mq_getsetattr);
-cond_syscall(compat_sys_mq_open);
-cond_syscall(compat_sys_mq_timedsend);
-cond_syscall(compat_sys_mq_timedreceive);
-cond_syscall(compat_sys_mq_notify);
-cond_syscall(compat_sys_mq_getsetattr);
-cond_syscall(sys_mbind);
-cond_syscall(sys_get_mempolicy);
-cond_syscall(sys_set_mempolicy);
-cond_syscall(compat_sys_mbind);
-cond_syscall(compat_sys_get_mempolicy);
-cond_syscall(compat_sys_set_mempolicy);
+
+/* mm/filemap.c */
+
+/* mm/nommu.c, also with MMU */
+cond_syscall(sys_mremap);
+
+/* security/keys/keyctl.c */
 cond_syscall(sys_add_key);
 cond_syscall(sys_request_key);
 cond_syscall(sys_keyctl);
 cond_syscall(compat_sys_keyctl);
-cond_syscall(compat_sys_socketcall);
-cond_syscall(sys_inotify_init);
-cond_syscall(sys_inotify_init1);
-cond_syscall(sys_inotify_add_watch);
-cond_syscall(sys_inotify_rm_watch);
-cond_syscall(sys_migrate_pages);
-cond_syscall(sys_move_pages);
-cond_syscall(sys_chown16);
-cond_syscall(sys_fchown16);
-cond_syscall(sys_getegid16);
-cond_syscall(sys_geteuid16);
-cond_syscall(sys_getgid16);
-cond_syscall(sys_getgroups16);
-cond_syscall(sys_getresgid16);
-cond_syscall(sys_getresuid16);
-cond_syscall(sys_getuid16);
-cond_syscall(sys_lchown16);
-cond_syscall(sys_setfsgid16);
-cond_syscall(sys_setfsuid16);
-cond_syscall(sys_setgid16);
-cond_syscall(sys_setgroups16);
-cond_syscall(sys_setregid16);
-cond_syscall(sys_setresgid16);
-cond_syscall(sys_setresuid16);
-cond_syscall(sys_setreuid16);
-cond_syscall(sys_setuid16);
-cond_syscall(sys_sgetmask);
-cond_syscall(sys_ssetmask);
-cond_syscall(sys_vm86old);
-cond_syscall(sys_vm86);
-cond_syscall(sys_modify_ldt);
-cond_syscall(sys_ipc);
-cond_syscall(compat_sys_ipc);
-cond_syscall(compat_sys_sysctl);
-cond_syscall(sys_flock);
-cond_syscall(sys_io_setup);
-cond_syscall(sys_io_destroy);
-cond_syscall(sys_io_submit);
-cond_syscall(sys_io_cancel);
-cond_syscall(sys_io_getevents);
-cond_syscall(compat_sys_io_setup);
-cond_syscall(compat_sys_io_submit);
-cond_syscall(compat_sys_io_getevents);
-cond_syscall(sys_sysfs);
-cond_syscall(sys_syslog);
-cond_syscall(sys_process_vm_readv);
-cond_syscall(sys_process_vm_writev);
-cond_syscall(compat_sys_process_vm_readv);
-cond_syscall(compat_sys_process_vm_writev);
-cond_syscall(sys_uselib);
-cond_syscall(sys_fadvise64);
-cond_syscall(sys_fadvise64_64);
-cond_syscall(sys_madvise);
-cond_syscall(sys_setuid);
-cond_syscall(sys_setregid);
-cond_syscall(sys_setgid);
-cond_syscall(sys_setreuid);
-cond_syscall(sys_setresuid);
-cond_syscall(sys_getresuid);
-cond_syscall(sys_setresgid);
-cond_syscall(sys_getresgid);
-cond_syscall(sys_setgroups);
-cond_syscall(sys_getgroups);
-cond_syscall(sys_setfsuid);
-cond_syscall(sys_setfsgid);
-cond_syscall(sys_capget);
-cond_syscall(sys_capset);
-cond_syscall(sys_copy_file_range);
 
-/* arch-specific weak syscall entries */
-cond_syscall(sys_pciconfig_read);
-cond_syscall(sys_pciconfig_write);
-cond_syscall(sys_pciconfig_iobase);
-cond_syscall(compat_sys_s390_ipc);
-cond_syscall(ppc_rtas);
-cond_syscall(sys_spu_run);
-cond_syscall(sys_spu_create);
-cond_syscall(sys_subpage_prot);
-cond_syscall(sys_s390_pci_mmio_read);
-cond_syscall(sys_s390_pci_mmio_write);
+/* arch/example/kernel/sys_example.c */
 
-/* mmu depending weak syscall entries */
+/* mm/fadvise.c */
+cond_syscall(sys_fadvise64_64);
+
+/* mm/, CONFIG_MMU only */
+cond_syscall(sys_swapon);
+cond_syscall(sys_swapoff);
 cond_syscall(sys_mprotect);
 cond_syscall(sys_msync);
 cond_syscall(sys_mlock);
 cond_syscall(sys_munlock);
 cond_syscall(sys_mlockall);
 cond_syscall(sys_munlockall);
-cond_syscall(sys_mlock2);
 cond_syscall(sys_mincore);
 cond_syscall(sys_madvise);
-cond_syscall(sys_mremap);
 cond_syscall(sys_remap_file_pages);
-cond_syscall(compat_sys_move_pages);
+cond_syscall(sys_mbind);
+cond_syscall(compat_sys_mbind);
+cond_syscall(sys_get_mempolicy);
+cond_syscall(compat_sys_get_mempolicy);
+cond_syscall(sys_set_mempolicy);
+cond_syscall(compat_sys_set_mempolicy);
+cond_syscall(sys_migrate_pages);
 cond_syscall(compat_sys_migrate_pages);
+cond_syscall(sys_move_pages);
+cond_syscall(compat_sys_move_pages);
 
-/* block-layer dependent */
-cond_syscall(sys_bdflush);
-cond_syscall(sys_ioprio_set);
-cond_syscall(sys_ioprio_get);
-
-/* New file descriptors */
-cond_syscall(sys_signalfd);
-cond_syscall(sys_signalfd4);
-cond_syscall(compat_sys_signalfd);
-cond_syscall(compat_sys_signalfd4);
-cond_syscall(sys_timerfd_create);
-cond_syscall(sys_timerfd_settime);
-cond_syscall(sys_timerfd_gettime);
-cond_syscall(compat_sys_timerfd_settime);
-cond_syscall(compat_sys_timerfd_gettime);
-cond_syscall(sys_eventfd);
-cond_syscall(sys_eventfd2);
-cond_syscall(sys_memfd_create);
-cond_syscall(sys_userfaultfd);
-
-/* performance counters: */
 cond_syscall(sys_perf_event_open);
+cond_syscall(sys_accept4);
+cond_syscall(sys_recvmmsg);
+cond_syscall(compat_sys_recvmmsg);
+
+/*
+ * Architecture specific syscalls: see further below
+ */
 
-/* fanotify! */
+/* fanotify */
 cond_syscall(sys_fanotify_init);
 cond_syscall(sys_fanotify_mark);
-cond_syscall(compat_sys_fanotify_mark);
 
 /* open by handle */
 cond_syscall(sys_name_to_handle_at);
 cond_syscall(sys_open_by_handle_at);
 cond_syscall(compat_sys_open_by_handle_at);
 
+cond_syscall(sys_sendmmsg);
+cond_syscall(compat_sys_sendmmsg);
+cond_syscall(sys_process_vm_readv);
+cond_syscall(compat_sys_process_vm_readv);
+cond_syscall(sys_process_vm_writev);
+cond_syscall(compat_sys_process_vm_writev);
+
 /* compare kernel pointers */
 cond_syscall(sys_kcmp);
 
+cond_syscall(sys_finit_module);
+
 /* operate on Secure Computing state */
 cond_syscall(sys_seccomp);
 
+cond_syscall(sys_memfd_create);
+
 /* access BPF programs and maps */
 cond_syscall(sys_bpf);
 
 /* execveat */
 cond_syscall(sys_execveat);
 
+cond_syscall(sys_userfaultfd);
+
 /* membarrier */
 cond_syscall(sys_membarrier);
 
+cond_syscall(sys_mlock2);
+
+cond_syscall(sys_copy_file_range);
+
 /* memory protection keys */
 cond_syscall(sys_pkey_mprotect);
 cond_syscall(sys_pkey_alloc);
 cond_syscall(sys_pkey_free);
+
+
+/*
+ * Architecture specific weak syscall entries.
+ */
+
+/* pciconfig: alpha, arm, arm64, ia64, sparc */
+cond_syscall(sys_pciconfig_read);
+cond_syscall(sys_pciconfig_write);
+cond_syscall(sys_pciconfig_iobase);
+
+/* sys_socketcall: arm, mips, x86, ... */
+cond_syscall(sys_socketcall);
+cond_syscall(compat_sys_socketcall);
+
+/* compat syscalls for arm64, x86, ... */
+cond_syscall(compat_sys_sysctl);
+cond_syscall(compat_sys_fanotify_mark);
+
+/* x86 */
+cond_syscall(sys_vm86old);
+cond_syscall(sys_modify_ldt);
+cond_syscall(compat_sys_quotactl32);
+cond_syscall(sys_vm86);
+cond_syscall(sys_kexec_file_load);
+
+/* s390 */
+cond_syscall(sys_s390_pci_mmio_read);
+cond_syscall(sys_s390_pci_mmio_write);
+cond_syscall(compat_sys_s390_ipc);
+
+/* powerpc */
+cond_syscall(ppc_rtas);
+cond_syscall(sys_spu_run);
+cond_syscall(sys_spu_create);
+cond_syscall(sys_subpage_prot);
+
+
+/*
+ * Deprecated system calls which are still defined in
+ * include/uapi/asm-generic/unistd.h and wanted by >= 1 arch
+ */
+
+/* __ARCH_WANT_SYSCALL_NO_FLAGS */
+cond_syscall(sys_epoll_create);
+cond_syscall(sys_inotify_init);
+cond_syscall(sys_eventfd);
+cond_syscall(sys_signalfd);
+cond_syscall(compat_sys_signalfd);
+
+/* __ARCH_WANT_SYSCALL_OFF_T */
+cond_syscall(sys_fadvise64);
+
+/* __ARCH_WANT_SYSCALL_DEPRECATED */
+cond_syscall(sys_epoll_wait);
+cond_syscall(sys_recv);
+cond_syscall(compat_sys_recv);
+cond_syscall(sys_send);
+cond_syscall(sys_bdflush);
+cond_syscall(sys_uselib);
+
+
+/*
+ * The syscalls below are not found in include/uapi/asm-generic/unistd.h
+ */
+
+/* obsolete: SGETMASK_SYSCALL */
+cond_syscall(sys_sgetmask);
+cond_syscall(sys_ssetmask);
+
+/* obsolete: SYSFS_SYSCALL */
+cond_syscall(sys_sysfs);
+
+/* obsolete: __ARCH_WANT_SYS_IPC */
+cond_syscall(sys_ipc);
+cond_syscall(compat_sys_ipc);
+
+/* obsolete: UID16 */
+cond_syscall(sys_chown16);
+cond_syscall(sys_fchown16);
+cond_syscall(sys_getegid16);
+cond_syscall(sys_geteuid16);
+cond_syscall(sys_getgid16);
+cond_syscall(sys_getgroups16);
+cond_syscall(sys_getresgid16);
+cond_syscall(sys_getresuid16);
+cond_syscall(sys_getuid16);
+cond_syscall(sys_lchown16);
+cond_syscall(sys_setfsgid16);
+cond_syscall(sys_setfsuid16);
+cond_syscall(sys_setgid16);
+cond_syscall(sys_setgroups16);
+cond_syscall(sys_setregid16);
+cond_syscall(sys_setresgid16);
+cond_syscall(sys_setresuid16);
+cond_syscall(sys_setreuid16);
+cond_syscall(sys_setuid16);
-- 
2.16.2


* [PATCH 45/45] bpf: whitelist all syscalls for error injection
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (43 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 44/45] kernel/sys_ni: sort cond_syscall() entries Dominik Brodowski
@ 2018-03-22  9:00 ` Dominik Brodowski
  2018-03-22 20:29 ` [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Linus Torvalds
  45 siblings, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22  9:00 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch; +Cc: Howard McLauchlan

From: Howard McLauchlan <hmclauchlan@fb.com>

Error injection is a useful mechanism to fail arbitrary kernel
functions. However, it is often hard to guarantee an error propagates
appropriately to user space programs. By injecting into syscalls, we can
return arbitrary values to user space directly; this increases
flexibility and robustness in testing, allowing us to test user space
error paths effectively.

The following script, for example, fails calls to sys_open() from a
given pid:

from bcc import BPF
from sys import argv

pid = argv[1]

prog = r"""

int kprobe__SyS_open(struct pt_regs *ctx, const char *pathname, int flags)
{
    u32 pid = bpf_get_current_pid_tgid();
    if (pid == %s)
        bpf_override_return(ctx, -ENOMEM);
    return 0;
}
""" % pid

b = BPF(text=prog)
while 1:
    b.perf_buffer_poll()

This patch whitelists all syscalls defined with SYSCALL_DEFINE and
COMPAT_SYSCALL_DEFINE for error injection. These changes are not
intended to be considered stable, and would normally be configured off.

Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 include/linux/compat.h   | 3 +++
 include/linux/syscalls.h | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index f1649a5e6716..57eb263a3bc9 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -33,6 +33,8 @@
 #endif
 
 #define COMPAT_SYSCALL_DEFINE0(name) \
+	asmlinkage long compat_sys_##name(void); \
+	ALLOW_ERROR_INJECTION(compat_sys_##name, ERRNO); \
 	asmlinkage long compat_sys_##name(void)
 
 #define COMPAT_SYSCALL_DEFINE1(name, ...) \
@@ -51,6 +53,7 @@
 #define COMPAT_SYSCALL_DEFINEx(x, name, ...)				\
 	asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\
 		__attribute__((alias(__stringify(compat_SyS##name))));  \
+	ALLOW_ERROR_INJECTION(compat_sys##name, ERRNO);	\
 	static inline long C_SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
 	asmlinkage long compat_SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__));\
 	asmlinkage long compat_SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__))\
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 3591c4af33d8..cc6fcd7d5b3c 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
 
 #define SYSCALL_DEFINE0(sname)					\
 	SYSCALL_METADATA(_##sname, 0);				\
+	asmlinkage long sys_##sname(void);			\
+	ALLOW_ERROR_INJECTION(sys_##sname, ERRNO);		\
 	asmlinkage long sys_##sname(void)
 
 #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
@@ -210,6 +212,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
 #define __SYSCALL_DEFINEx(x, name, ...)					\
 	asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))	\
 		__attribute__((alias(__stringify(SyS##name))));		\
+	ALLOW_ERROR_INJECTION(sys##name, ERRNO);			\
 	static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__));	\
 	asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__));	\
 	asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__))	\
-- 
2.16.2


* Re: [PATCH 40/45] x86: use _do_fork() in compat_sys_x86_clone()
  2018-03-22  9:00 ` [PATCH 40/45] x86: use _do_fork() in compat_sys_x86_clone() Dominik Brodowski
@ 2018-03-22  9:26   ` Thomas Gleixner
  0 siblings, 0 replies; 62+ messages in thread
From: Thomas Gleixner @ 2018-03-22  9:26 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Ingo Molnar,
	Jiri Slaby, x86

On Thu, 22 Mar 2018, Dominik Brodowski wrote:

> It is trivial to directly call _do_fork() instead of the sys_clone()
> syscall in compat_sys_x86_clone().
> 
> This patch is part of a series which tries to remove in-kernel calls to
> syscalls. On this basis, the syscall entry path can be streamlined.
> 
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Jiri Slaby <jslaby@suse.com>
> Cc: x86@kernel.org
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>


* Re: [PATCH 41/45] x86: remove compat_sys_x86_waitpid()
  2018-03-22  9:00 ` [PATCH 41/45] x86: remove compat_sys_x86_waitpid() Dominik Brodowski
@ 2018-03-22  9:27   ` Thomas Gleixner
  0 siblings, 0 replies; 62+ messages in thread
From: Thomas Gleixner @ 2018-03-22  9:27 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Ingo Molnar,
	Jiri Slaby, x86

On Thu, 22 Mar 2018, Dominik Brodowski wrote:

> compat_sys_x86_waitpid() is not needed, as it takes the same parameters
> (int, int *, int) as the native syscall.
> 
> Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Jiri Slaby <jslaby@suse.com>
> Cc: x86@kernel.org
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>


* Re: [PATCH 42/45] x86: fix sys_sigreturn() return type to be long, not unsigned long
  2018-03-22  9:00 ` [PATCH 42/45] x86: fix sys_sigreturn() return type to be long, not unsigned long Dominik Brodowski
@ 2018-03-22  9:27   ` Thomas Gleixner
  0 siblings, 0 replies; 62+ messages in thread
From: Thomas Gleixner @ 2018-03-22  9:27 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Andi Kleen,
	Ingo Molnar, Jiri Slaby, x86, Michael Tautschnig

On Thu, 22 Mar 2018, Dominik Brodowski wrote:

> Same as with other system calls, sys_sigreturn() should return a value
> of type long, not unsigned long. This also matches the behaviour for
> IA32_EMULATION, see sys32_sigreturn() in arch/x86/ia32/ia32_signal.c .
> 
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Jiri Slaby <jslaby@suse.com>
> Cc: x86@kernel.org
> Cc: Michael Tautschnig <tautschn@amazon.co.uk>
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>


* Re: [PATCH 43/45] x86/sigreturn: use SYSCALL_DEFINE0
  2018-03-22  9:00 ` [PATCH 43/45] x86/sigreturn: use SYSCALL_DEFINE0 Dominik Brodowski
@ 2018-03-22  9:27   ` Thomas Gleixner
  0 siblings, 0 replies; 62+ messages in thread
From: Thomas Gleixner @ 2018-03-22  9:27 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Tautschnig,
	Michael, Michael Tautschnig, Ingo Molnar, H . Peter Anvin,
	Jaswinder Singh, Andi Kleen, x86

On Thu, 22 Mar 2018, Dominik Brodowski wrote:

> From: "Tautschnig, Michael" <tautschn@amazon.co.uk>
> 
> All definitions of syscalls in x86 except for those patched here have
> already been using the appropriate SYSCALL_DEFINE*.
> 
> Signed-off-by: Michael Tautschnig <tautschn@amazon.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Jaswinder Singh <jaswinder@infradead.org>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: x86@kernel.org
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>


* Any chance that kernel/uid16.c can go? [Was: [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c]
  2018-03-22  9:00 ` [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c Dominik Brodowski
@ 2018-03-22 10:21   ` Dominik Brodowski
  2018-03-22 17:57     ` Linus Torvalds
  0 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22 10:21 UTC (permalink / raw)
  To: linux-kernel, torvalds, viro, arnd, linux-arch
  Cc: Eric W . Biederman, Andrew Morton

On Thu, Mar 22, 2018 at 10:00:36AM +0100, Dominik Brodowski wrote:
> Using these helpers allows us to avoid the in-kernel calls to these
> syscalls: sys_setregid(), sys_setgid(), sys_setreuid(), sys_setuid(),
> sys_setresuid(), sys_setresgid(), sys_setfsuid(), and sys_setfsgid().
> 
> The ksys_ prefix denotes that these functions are meant as a drop-in
> replacement for the syscall. In particular, they use the same calling
> convention.
> 
> This patch is part of a series which tries to remove in-kernel calls to
> syscalls. On this basis, the syscall entry path can be streamlined.
> 
> Cc: Al Viro <viro@ZenIV.linux.org.uk>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
> ---
>  kernel/sys.c   | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++--------
>  kernel/uid16.c | 19 ++++++++++---------

In its header, kernel/uid16.c says, since 2.3.39 was released in January
2000:

 *      Wrapper functions for 16bit uid back compatibility. All nicely tied
 *      together in the faint hope we can take the out in five years time.

Are we any closer to removing these wrappers?

Thanks,
	Dominik


* Re: [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield()
  2018-03-22  9:00 ` [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield() Dominik Brodowski
@ 2018-03-22 17:29   ` Peter Zijlstra
  2018-03-22 17:41     ` Dominik Brodowski
  2018-03-22 17:44     ` Linus Torvalds
  0 siblings, 2 replies; 62+ messages in thread
From: Peter Zijlstra @ 2018-03-22 17:29 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Ingo Molnar

On Thu, Mar 22, 2018 at 10:00:37AM +0100, Dominik Brodowski wrote:
> Using the sched-internal do_sched_yield() helper allows us to get rid of
> the sched-internal call to the sys_sched_yield() syscall.
> 
> This patch is part of a series which tries to remove in-kernel calls to
> syscalls. On this basis, the syscall entry path can be streamlined.

But why !? Either Cc me on more of the series such that the whole makes
sense, or better yet, write a proper Changelog.


* Re: [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield()
  2018-03-22 17:29   ` Peter Zijlstra
@ 2018-03-22 17:41     ` Dominik Brodowski
  2018-03-22 17:44     ` Linus Torvalds
  1 sibling, 0 replies; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-22 17:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Ingo Molnar

On Thu, Mar 22, 2018 at 06:29:59PM +0100, Peter Zijlstra wrote:
> On Thu, Mar 22, 2018 at 10:00:37AM +0100, Dominik Brodowski wrote:
> > Using the sched-internal do_sched_yield() helper allows us to get rid of
> > the sched-internal call to the sys_sched_yield() syscall.
> > 
> > This patch is part of a series which tries to remove in-kernel calls to
> > syscalls. On this basis, the syscall entry path can be streamlined.
> 
> But why !? Either Cc me on more of the series such that the whole makes
> sense, or better yet, write a proper Changelog.

Well, the summary is right there in the changelog: Kernel code simply should
not pretend to be userspace and call a syscall function. For a more
detailed description, see, for instance, Linus' explanation in
http://lkml.kernel.org/r/CA+55aFwo7yA1gm8AUYMEQA8ZNY-9GGF8Oup09jJFvEa4J7C+jA@mail.gmail.com :

| On x86-64, we'd like to just pass the 'struct pt_regs *' pointer, and
| have the sys_xyz() function itself just pick out the arguments it
| needs from there.
|
| That has a few reasons for it:
| 
| - we can clear all registers at system call entry, which helps defeat
| some of the "pass seldom used register with user-controlled value that
| survives deep into the callchain" things that people used to leak
| information
| 
|  - we can streamline the low-level system call code, which needs to
| pass around 'struct pt_regs *' anyway, and the system call only picks
| up the values it actually needs

I can add such a long description to all these patches, but that seems to
be a bit... longwinded.
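
For what it's worth, the conversion pattern can be sketched in a few lines of
plain C, modelled loosely on the inotify patch in this series. The names
below (do_example_init, sys_example_init, sys_example_init1,
EXAMPLE_NONBLOCK) and the returned value are made up for illustration; this
is a user-space sketch, not kernel code.

```c
#include <errno.h>

/* Hypothetical flag, for illustration only. */
#define EXAMPLE_NONBLOCK 0x1

/*
 * Subsystem-internal helper holding the real logic. It takes ordinary C
 * arguments, so any in-kernel caller can use it directly.
 */
long do_example_init(int flags)
{
	if (flags & ~EXAMPLE_NONBLOCK)
		return -EINVAL;	/* reject unknown flags */
	return 3;		/* pretend this is a new descriptor */
}

/* Syscall entry point with flags: a thin wrapper around the helper. */
long sys_example_init1(int flags)
{
	return do_example_init(flags);
}

/*
 * Legacy entry point without flags. Before the conversion it would have
 * called sys_example_init1(0); now it calls the helper instead.
 */
long sys_example_init(void)
{
	return do_example_init(0);
}
```

Neither entry point calls the other any more, so both are free to change
their calling convention without affecting in-kernel users.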

Thanks,
	Dominik


* Re: [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield()
  2018-03-22 17:29   ` Peter Zijlstra
  2018-03-22 17:41     ` Dominik Brodowski
@ 2018-03-22 17:44     ` Linus Torvalds
  2018-03-23  7:38       ` git send-email and sending the cover-letter to all cc addresses found in a patch series Dominik Brodowski
  1 sibling, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2018-03-22 17:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dominik Brodowski, Linux Kernel Mailing List, Al Viro,
	Arnd Bergmann, linux-arch, Ingo Molnar

On Thu, Mar 22, 2018 at 10:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> But why !? Either Cc me on more of the series such that the whole makes
> sense, or better yet, write a proper Changelog.

This is a common issue. We should encourage people to always send at
least the cover-page to everybody who gets cc'd, even if they don't
get the whole series.

Anyway, to repeat: the calling convention for x86-64 system call
wrappers will be to just pass in "struct pt_regs", and the system call
wrapper itself will take the arguments from there.

That means that we won't have random user space contents in registers
that can leak deep down the call chain. The registers are cleared at
system call entry, and only the actual real arguments are reloaded.

(It also makes do_syscall_64() generate better code, natch).

Anyway, that means that you *CANNOT* call sys_xyz() from kernel code.
Not that you really should have anyway, but there are tons of
historical reasons why we do. But now it fundamentally won't work,
because you'd need to literally do

    {
        struct pt_regs regs;
        regs.rdi = (unsigned long) firstarg;
        regs.rsi = (unsigned long) second;
        ...
        sys_xyz(&regs);
    }

to do it on x86-64.
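
Spelled out as a compilable user-space sketch, under assumptions: the
struct below is a hypothetical two-field stand-in for the register frame
(the real x86-64 struct pt_regs has more fields and its own field names),
and call_sys_xyz_from_kernel() is an invented name for the kind of caller
this convention rules out.

```c
/*
 * Hypothetical stand-in for the register frame; field names follow the
 * snippet above, not the real structure.
 */
struct example_regs {
	unsigned long rdi;	/* carries the first argument */
	unsigned long rsi;	/* carries the second argument */
};

/*
 * Syscall stub in the described convention: it receives only the frame
 * and picks out the arguments it needs.
 */
long sys_xyz(const struct example_regs *regs)
{
	int first = (int)regs->rdi;
	unsigned long second = regs->rsi;

	return (long)first + (long)second;
}

/*
 * What an in-kernel caller would be forced to do: build a fake register
 * frame just to pass two ordinary values.
 */
long call_sys_xyz_from_kernel(int firstarg, unsigned long second)
{
	struct example_regs regs = { 0 };

	regs.rdi = (unsigned long)firstarg;
	regs.rsi = second;
	return sys_xyz(&regs);
}
```

The marshalling in the second function is pure overhead for an in-kernel
caller, which is one way to see why a helper taking plain arguments is the
better interface there.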

Anyway, there's a longer discussion about why this is the case
elsewhere, and why we want to do it, but just take it as granted: you
will not be able to call sys_xyz() directly, and that's just a fact.

Making people able to do it would make real system calls (that are a
hell of a lot more important) slower. So it's simply not going to be
allowed.

               Linus


* Re: Any chance that kernel/uid16.c can go? [Was: [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c]
  2018-03-22 10:21   ` Any chance that kernel/uid16.c can go? [Was: [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c] Dominik Brodowski
@ 2018-03-22 17:57     ` Linus Torvalds
  0 siblings, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2018-03-22 17:57 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: Linux Kernel Mailing List, Al Viro, Arnd Bergmann, linux-arch,
	Eric W . Biederman, Andrew Morton

On Thu, Mar 22, 2018 at 3:21 AM, Dominik Brodowski
<linux@dominikbrodowski.net> wrote:
>
> In its header, kernel/uid16.c says, since 2.3.39 was released in January
> 2000:
>
>  *      Wrapper functions for 16bit uid back compatibility. All nicely tied
>  *      together in the faint hope we can take the out in five years time.
>
> Are we any closer to removing these wrappers?

Honestly, I don't see any real upside to getting rid of them.

We used to still run some of the _original_ binaries from the old
floppy disk distributions just a few years ago. I honestly hope we
still do. And those old uid system calls would be very much part of
it.

Sadly, I don't know where those old binaries are. Anybody know where
the bash binary from 1991 is? There was a "bash.Z" as part of the 0.01
release.

(Ok, that one is almost certainly broken, but Alan Cox reported
running some really old binaries from the early times long ago before
he turned to even *older* retrocomputing and started concentrating on
the old 8-bit machines ;)

              Linus


* Re: [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/)
  2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
                   ` (44 preceding siblings ...)
  2018-03-22  9:00 ` [PATCH 45/45] bpf: whitelist all syscalls for error injection Dominik Brodowski
@ 2018-03-22 20:29 ` Linus Torvalds
  45 siblings, 0 replies; 62+ messages in thread
From: Linus Torvalds @ 2018-03-22 20:29 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: Linux Kernel Mailing List, Al Viro, Arnd Bergmann, linux-arch

On Thu, Mar 22, 2018 at 2:00 AM, Dominik Brodowski
<linux@dominikbrodowski.net> wrote:
> Here is a third series of patches which reduce the number of syscall
> invocations from within the kernel. Once this long-term goal is achieved,
> the syscall entry path can be streamlined.

Looks good to me.

              Linus


* git send-email and sending the cover-letter to all cc addresses found in a patch series
  2018-03-22 17:44     ` Linus Torvalds
@ 2018-03-23  7:38       ` Dominik Brodowski
  2018-03-23  7:49         ` Joe Perches
  0 siblings, 1 reply; 62+ messages in thread
From: Dominik Brodowski @ 2018-03-23  7:38 UTC (permalink / raw)
  To: gitster; +Cc: Peter Zijlstra, Linux Kernel Mailing List, git, Linus Torvalds

On Thu, Mar 22, 2018 at 10:44:54AM -0700, Linus Torvalds wrote:
> On Thu, Mar 22, 2018 at 10:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > But why !? Either Cc me on more of the series such that the whole makes
> > sense, or better yet, write a proper Changelog.
> 
> This is a common issue. We should encourage people to always send at
> least the cover-page to everybody who gets cc'd, even if they don't
> get the whole series.

Will try to do that in future. Does git send-email have such an option? Or
do I have to specify all cc addresses in the cover letter manually? I found
some reference to an unresolved discussion on git@ of that topic in 2016, so
I might not be the only one who could make use of that feature...

Thanks,
	Dominik


* Re: git send-email and sending the cover-letter to all cc addresses found in a patch series
  2018-03-23  7:38       ` git send-email and sending the cover-letter to all cc addresses found in a patch series Dominik Brodowski
@ 2018-03-23  7:49         ` Joe Perches
  0 siblings, 0 replies; 62+ messages in thread
From: Joe Perches @ 2018-03-23  7:49 UTC (permalink / raw)
  To: Dominik Brodowski, gitster
  Cc: Peter Zijlstra, Linux Kernel Mailing List, git, Linus Torvalds

On Fri, 2018-03-23 at 08:38 +0100, Dominik Brodowski wrote:
> On Thu, Mar 22, 2018 at 10:44:54AM -0700, Linus Torvalds wrote:
> > On Thu, Mar 22, 2018 at 10:29 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > > 
> > > But why !? Either Cc me on more of the series such that the whole makes
> > > sense, or better yet, write a proper Changelog.
> > 
> > This is a common issue. We should encourage people to always send at
> > least the cover-page to everybody who gets cc'd, even if they don't
> > get the whole series.
> 
> Will try to do that in future. Does git send-email have such an option? Or
> do I have to specify all cc addresses in the cover letter manually? I found
> some reference to an unresolved discussion on git@ of that topic in 2016, so
> I might not be the only one who could make use of that feature...

The main problem might be the quantity of recipients.

Many spam filters look at how many recipients an email has
to eliminate likely spam.

kernel.org mailing lists have a maximum email header size.
Emails with too many recipients aren't delivered to the lists.

Using BCC could work, but replies then don't go to all
of the recipients.

Generally, I send the 0/n patch only to lists
and individual patches to maintainers.


* Re: [PATCH 16/45] inotify: add do_inotify_init() helper; remove in-kernel call to syscall
  2018-03-22  9:00 ` [PATCH 16/45] inotify: add do_inotify_init() helper; remove in-kernel call to syscall Dominik Brodowski
@ 2018-03-26 12:25   ` Jan Kara
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Kara @ 2018-03-26 12:25 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Jan Kara,
	Amir Goldstein, linux-fsdevel

On Thu 22-03-18 10:00:30, Dominik Brodowski wrote:
> Using the inotify-internal do_inotify_init() helper allows us to get rid
> of the in-kernel call to sys_inotify_init1() syscall.
> 
> This patch is part of a series which tries to remove in-kernel calls to
> syscalls. On this basis, the syscall entry path can be streamlined.
> 
> Cc: Jan Kara <jack@suse.cz>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Cc: linux-fsdevel@vger.kernel.org
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

Looks good. Feel free to add:

Acked-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/notify/inotify/inotify_user.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
> index 2c908b31d6c9..43c23653ce2e 100644
> --- a/fs/notify/inotify/inotify_user.c
> +++ b/fs/notify/inotify/inotify_user.c
> @@ -635,7 +635,7 @@ static struct fsnotify_group *inotify_new_group(unsigned int max_events)
>  
>  
>  /* inotify syscalls */
> -SYSCALL_DEFINE1(inotify_init1, int, flags)
> +static int do_inotify_init(int flags)
>  {
>  	struct fsnotify_group *group;
>  	int ret;
> @@ -660,9 +660,14 @@ SYSCALL_DEFINE1(inotify_init1, int, flags)
>  	return ret;
>  }
>  
> +SYSCALL_DEFINE1(inotify_init1, int, flags)
> +{
> +	return do_inotify_init(flags);
> +}
> +
>  SYSCALL_DEFINE0(inotify_init)
>  {
> -	return sys_inotify_init1(0);
> +	return do_inotify_init(0);
>  }
>  
>  SYSCALL_DEFINE3(inotify_add_watch, int, fd, const char __user *, pathname,
> -- 
> 2.16.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
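
The helper pattern in this patch can be tried out in userspace. The
following is only a minimal sketch, not kernel code: the names
do_demo_init(), demo_init1(), demo_init() and the DEMO_* flag values are
hypothetical stand-ins, and plain functions take the place of the
SYSCALL_DEFINEx() entry points.

```c
#include <errno.h>

/* Hypothetical flag bits, standing in for the real inotify flags. */
#define DEMO_CLOEXEC  0x1
#define DEMO_NONBLOCK 0x2

/* Shared helper: the logic lives here once, for every entry point. */
static int do_demo_init(int flags)
{
	/* Reject any bit that is not a known flag. */
	if (flags & ~(DEMO_CLOEXEC | DEMO_NONBLOCK))
		return -EINVAL;
	return 0;	/* stand-in for successful setup */
}

/* Stand-in for SYSCALL_DEFINE1(inotify_init1, int, flags). */
long demo_init1(int flags)
{
	return do_demo_init(flags);
}

/* Stand-in for SYSCALL_DEFINE0(inotify_init): calls the helper, not demo_init1(). */
long demo_init(void)
{
	return do_demo_init(0);
}
```

Both entry points stay thin wrappers, so neither one has to call the
other through its syscall-style signature.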

* Re: [PATCH 17/45] fanotify: add do_fanotify_mark() helper; remove in-kernel call to syscall
  2018-03-22  9:00 ` [PATCH 17/45] fanotify: add do_fanotify_mark() " Dominik Brodowski
@ 2018-03-26 12:25   ` Jan Kara
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Kara @ 2018-03-26 12:25 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Jan Kara, Amir Goldstein

On Thu 22-03-18 10:00:31, Dominik Brodowski wrote:
> Using the fs-internal do_fanotify_mark() helper allows us to get rid of
> the fs-internal call to the sys_fanotify_mark() syscall.
> 
> This patch is part of a series which tries to remove in-kernel calls to
> syscalls. On this basis, the syscall entry path can be streamlined.
> 
> Cc: Jan Kara <jack@suse.cz>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

Looks good. Feel free to add:

Acked-by: Jan Kara <jack@suse.cz>

								Honza
> ---
>  fs/notify/fanotify/fanotify_user.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index c07eb3d655ea..fa803a58a605 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -820,9 +820,8 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  	return fd;
>  }
>  
> -SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
> -			      __u64, mask, int, dfd,
> -			      const char  __user *, pathname)
> +static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
> +			    int dfd, const char  __user *pathname)
>  {
>  	struct inode *inode = NULL;
>  	struct vfsmount *mnt = NULL;
> @@ -928,13 +927,20 @@ SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
>  	return ret;
>  }
>  
> +SYSCALL_DEFINE5(fanotify_mark, int, fanotify_fd, unsigned int, flags,
> +			      __u64, mask, int, dfd,
> +			      const char  __user *, pathname)
> +{
> +	return do_fanotify_mark(fanotify_fd, flags, mask, dfd, pathname);
> +}
> +
>  #ifdef CONFIG_COMPAT
>  COMPAT_SYSCALL_DEFINE6(fanotify_mark,
>  				int, fanotify_fd, unsigned int, flags,
>  				__u32, mask0, __u32, mask1, int, dfd,
>  				const char  __user *, pathname)
>  {
> -	return sys_fanotify_mark(fanotify_fd, flags,
> +	return do_fanotify_mark(fanotify_fd, flags,
>  #ifdef __BIG_ENDIAN
>  				((__u64)mask0 << 32) | mask1,
>  #else
> -- 
> 2.16.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
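
The compat entry point above receives the 64-bit mask as two 32-bit
halves and joins them before calling the helper. The following is a
rough userspace sketch of that step. The function name join_mask() and
the run-time big_endian parameter are hypothetical (the kernel selects
the variant at compile time with #ifdef __BIG_ENDIAN), and the
little-endian branch is not visible in the quoted hunk, so it is assumed
here to mirror the big-endian one.

```c
#include <stdint.h>

/*
 * Join two 32-bit halves into one 64-bit mask.
 * big_endian != 0: mask0 carries the upper 32 bits (as in the quoted hunk).
 * big_endian == 0: assumed mirror image, mask1 carries the upper 32 bits.
 */
static uint64_t join_mask(uint32_t mask0, uint32_t mask1, int big_endian)
{
	if (big_endian)
		return ((uint64_t)mask0 << 32) | mask1;
	return ((uint64_t)mask1 << 32) | mask0;
}
```

The cast to a 64-bit type before the shift matters: shifting a 32-bit
value left by 32 would be undefined behaviour in C.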

* Re: [PATCH 18/45] fs/quota: add kernel_quotactl() helper; remove in-kernel call to syscall
  2018-03-22  9:00 ` [PATCH 18/45] fs/quota: add kernel_quotactl() " Dominik Brodowski
@ 2018-03-26 12:26   ` Jan Kara
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Kara @ 2018-03-26 12:26 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Jan Kara

On Thu 22-03-18 10:00:32, Dominik Brodowski wrote:
> Using the fs-internal kernel_quotactl() helper allows us to get rid of
> the fs-internal call to the sys_quotactl() syscall.
> 
> This patch is part of a series which tries to remove in-kernel calls to
> syscalls. On this basis, the syscall entry path can be streamlined.
> 
> Cc: Jan Kara <jack@suse.com>
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

Looks good. Feel free to add:

Acked-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/quota/compat.c        |  8 ++++----
>  fs/quota/quota.c         | 10 ++++++++--
>  include/linux/quotaops.h |  3 +++
>  3 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/quota/compat.c b/fs/quota/compat.c
> index 779caed4f078..1577a2fd51f4 100644
> --- a/fs/quota/compat.c
> +++ b/fs/quota/compat.c
> @@ -59,7 +59,7 @@ asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
>  	case Q_GETQUOTA:
>  		dqblk = compat_alloc_user_space(sizeof(struct if_dqblk));
>  		compat_dqblk = addr;
> -		ret = sys_quotactl(cmd, special, id, dqblk);
> +		ret = kernel_quotactl(cmd, special, id, dqblk);
>  		if (ret)
>  			break;
>  		if (copy_in_user(compat_dqblk, dqblk, sizeof(*compat_dqblk)) ||
> @@ -75,12 +75,12 @@ asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
>  			get_user(data, &compat_dqblk->dqb_valid) ||
>  			put_user(data, &dqblk->dqb_valid))
>  			break;
> -		ret = sys_quotactl(cmd, special, id, dqblk);
> +		ret = kernel_quotactl(cmd, special, id, dqblk);
>  		break;
>  	case Q_XGETQSTAT:
>  		fsqstat = compat_alloc_user_space(sizeof(struct fs_quota_stat));
>  		compat_fsqstat = addr;
> -		ret = sys_quotactl(cmd, special, id, fsqstat);
> +		ret = kernel_quotactl(cmd, special, id, fsqstat);
>  		if (ret)
>  			break;
>  		ret = -EFAULT;
> @@ -113,7 +113,7 @@ asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
>  		ret = 0;
>  		break;
>  	default:
> -		ret = sys_quotactl(cmd, special, id, addr);
> +		ret = kernel_quotactl(cmd, special, id, addr);
>  	}
>  	return ret;
>  }
> diff --git a/fs/quota/quota.c b/fs/quota/quota.c
> index 43612e2a73af..860bfbe7a07a 100644
> --- a/fs/quota/quota.c
> +++ b/fs/quota/quota.c
> @@ -833,8 +833,8 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
>   * calls. Maybe we need to add the process quotas etc. in the future,
>   * but we probably should use rlimits for that.
>   */
> -SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
> -		qid_t, id, void __user *, addr)
> +int kernel_quotactl(unsigned int cmd, const char __user *special,
> +		    qid_t id, void __user *addr)
>  {
>  	uint cmds, type;
>  	struct super_block *sb = NULL;
> @@ -885,3 +885,9 @@ SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
>  		path_put(pathp);
>  	return ret;
>  }
> +
> +SYSCALL_DEFINE4(quotactl, unsigned int, cmd, const char __user *, special,
> +		qid_t, id, void __user *, addr)
> +{
> +	return kernel_quotactl(cmd, special, id, addr);
> +}
> diff --git a/include/linux/quotaops.h b/include/linux/quotaops.h
> index 2fb6fb11132e..ff63eac16a79 100644
> --- a/include/linux/quotaops.h
> +++ b/include/linux/quotaops.h
> @@ -105,6 +105,9 @@ int dquot_set_dqblk(struct super_block *sb, struct kqid id,
>  int __dquot_transfer(struct inode *inode, struct dquot **transfer_to);
>  int dquot_transfer(struct inode *inode, struct iattr *iattr);
>  
> +int kernel_quotactl(unsigned int cmd, const char __user *special,
> +		    qid_t id, void __user *addr);
> +
>  static inline struct mem_dqinfo *sb_dqinfo(struct super_block *sb, int type)
>  {
>  	return sb_dqopt(sb)->info + type;
> -- 
> 2.16.2
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [PATCH 19/45] fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl()
  2018-03-22  9:00 ` [PATCH 19/45] fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl() Dominik Brodowski
@ 2018-03-26 12:33   ` Jan Kara
  0 siblings, 0 replies; 62+ messages in thread
From: Jan Kara @ 2018-03-26 12:33 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, torvalds, viro, arnd, linux-arch, Jan Kara,
	Christoph Hellwig

On Thu 22-03-18 10:00:33, Dominik Brodowski wrote:
> While sys32_quotactl() is only needed on x86, it can use the recommended
> COMPAT_SYSCALL_DEFINEx() machinery for its setup.
> 
> Cc: Jan Kara <jack@suse.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

Looks good AFAICT. You can add:

Acked-by: Jan Kara <jack@suse.cz>

								Honza


> ---
>  arch/x86/entry/syscalls/syscall_32.tbl | 2 +-
>  fs/quota/compat.c                      | 5 +++--
>  include/linux/compat.h                 | 3 +++
>  include/linux/syscalls.h               | 4 +---
>  kernel/sys_ni.c                        | 2 +-
>  5 files changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index 2a5e99cff859..09338dd2bd94 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -137,7 +137,7 @@
>  128	i386	init_module		sys_init_module
>  129	i386	delete_module		sys_delete_module
>  130	i386	get_kernel_syms
> -131	i386	quotactl		sys_quotactl			sys32_quotactl
> +131	i386	quotactl		sys_quotactl			compat_sys_quotactl32
>  132	i386	getpgid			sys_getpgid
>  133	i386	fchdir			sys_fchdir
>  134	i386	bdflush			sys_bdflush
> diff --git a/fs/quota/compat.c b/fs/quota/compat.c
> index 1577a2fd51f4..c30572857619 100644
> --- a/fs/quota/compat.c
> +++ b/fs/quota/compat.c
> @@ -41,8 +41,9 @@ struct compat_fs_quota_stat {
>  	__u16		qs_iwarnlimit;
>  };
>  
> -asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
> -						qid_t id, void __user *addr)
> +COMPAT_SYSCALL_DEFINE4(quotactl32, unsigned int, cmd,
> +		       const char __user *, special, qid_t, id,
> +		       void __user *, addr)
>  {
>  	unsigned int cmds;
>  	struct if_dqblk __user *dqblk;
> diff --git a/include/linux/compat.h b/include/linux/compat.h
> index 16c3027074a2..f1649a5e6716 100644
> --- a/include/linux/compat.h
> +++ b/include/linux/compat.h
> @@ -461,6 +461,9 @@ asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
>  		const struct compat_iovec __user *vec,
>  		compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
>  
> +asmlinkage long compat_sys_quotactl32(unsigned int cmd,
> +		const char __user *special, qid_t id, void __user *addr);
> +
>  #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
>  asmlinkage long compat_sys_preadv64(unsigned long fd,
>  		const struct compat_iovec __user *vec,
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index f30083190ae4..a3f42f5f3b08 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -241,8 +241,6 @@ static inline void addr_limit_user_check(void)
>  #endif
>  }
>  
> -asmlinkage long sys32_quotactl(unsigned int cmd, const char __user *special,
> -			       qid_t id, void __user *addr);
>  asmlinkage long sys_time(time_t __user *tloc);
>  asmlinkage long sys_stime(time_t __user *tptr);
>  asmlinkage long sys_gettimeofday(struct timeval __user *tv,
> @@ -625,7 +623,7 @@ asmlinkage long sys_chdir(const char __user *filename);
>  asmlinkage long sys_fchdir(unsigned int fd);
>  asmlinkage long sys_rmdir(const char __user *pathname);
>  asmlinkage long sys_lookup_dcookie(u64 cookie64, char __user *buf, size_t len);
> -asmlinkage long sys_quotactl(unsigned int cmd, const char __user *special,
> +asmlinkage long sys_quotactl32(unsigned int cmd, const char __user *special,
>  				qid_t id, void __user *addr);
>  asmlinkage long sys_getdents(unsigned int fd,
>  				struct linux_dirent __user *dirent,
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index b5189762d275..951dbda5c2b4 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -18,7 +18,7 @@ asmlinkage long sys_ni_syscall(void)
>  }
>  
>  cond_syscall(sys_quotactl);
> -cond_syscall(sys32_quotactl);
> +cond_syscall(compat_sys_quotactl32);
>  cond_syscall(sys_acct);
>  cond_syscall(sys_lookup_dcookie);
>  cond_syscall(compat_sys_lookup_dcookie);
> -- 
> 2.16.2
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
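
The name compat_sys_quotactl32 used in syscall_32.tbl and sys_ni.c
follows from the name passed to COMPAT_SYSCALL_DEFINE4(). The following
sketch models only that naming step with token pasting. The DEMO_*
macros and demo_entry_name() are hypothetical, and the real macro also
handles argument types and conversion, which is left out here.

```c
/* Simplified stand-in for COMPAT_SYSCALL_DEFINEx(): models the naming only. */
#define DEMO_COMPAT_SYSCALL_DEFINE0(name) long compat_sys_##name(void)

/* Helpers to turn the pasted identifier into a string for inspection. */
#define DEMO_STR(x) #x
#define DEMO_COMPAT_NAME(name) DEMO_STR(compat_sys_##name)

/* Expands to: long compat_sys_quotactl32(void) */
DEMO_COMPAT_SYSCALL_DEFINE0(quotactl32)
{
	return 0;
}

/* Returns the identifier the macro generated, as a string. */
const char *demo_entry_name(void)
{
	return DEMO_COMPAT_NAME(quotactl32);
}
```

Because the macro always adds the compat_sys_ prefix, passing
"quotactl32" is what gives the entry point a name distinct from the
native sys_quotactl.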

Thread overview: 62+ messages
2018-03-22  9:00 [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Dominik Brodowski
2018-03-22  9:00 ` [PATCH 01/45] fs: add ksys_getdents64() helper; remove in-kernel calls to sys_getdents64() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 02/45] fs: add ksys_ioctl() helper; remove in-kernel calls to sys_ioctl() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 03/45] fs: add ksys_lseek() helper; remove in-kernel calls to sys_lseek() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 04/45] fs: add ksys_read() helper; remove in-kernel calls to sys_read() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 05/45] fs: add ksys_sync() helper; remove in-kernel calls to sys_sync() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 06/45] fs: add do_lookup_dcookie() helper; remove in-kernel call to syscall Dominik Brodowski
2018-03-22  9:00 ` [PATCH 07/45] fs: add do_vmsplice() " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 08/45] fs: add kern_select() helper; remove in-kernel call to sys_select() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 09/45] fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 10/45] fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls Dominik Brodowski
2018-03-22  9:00 ` [PATCH 11/45] fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 12/45] fs: add do_compat_fcntl64() helper; remove in-kernel call to compat syscall Dominik Brodowski
2018-03-22  9:00 ` [PATCH 13/45] fs: add do_compat_select() " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 14/45] fs: add do_compat_signalfd4() " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 15/45] fs: add do_compat_futimesat() " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 16/45] inotify: add do_inotify_init() helper; remove in-kernel call to syscall Dominik Brodowski
2018-03-26 12:25   ` Jan Kara
2018-03-22  9:00 ` [PATCH 17/45] fanotify: add do_fanotify_mark() " Dominik Brodowski
2018-03-26 12:25   ` Jan Kara
2018-03-22  9:00 ` [PATCH 18/45] fs/quota: add kernel_quotactl() " Dominik Brodowski
2018-03-26 12:26   ` Jan Kara
2018-03-22  9:00 ` [PATCH 19/45] fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl() Dominik Brodowski
2018-03-26 12:33   ` Jan Kara
2018-03-22  9:00 ` [PATCH 20/45] kernel: add do_compat_sigaltstack() helper; remove in-kernel call to compat syscall Dominik Brodowski
2018-03-22  9:00 ` [PATCH 21/45] kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c Dominik Brodowski
2018-03-22 10:21   ` Any chance that kernel/uid16.c can go? [Was: [PATCH 22/45] kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c] Dominik Brodowski
2018-03-22 17:57     ` Linus Torvalds
2018-03-22  9:00 ` [PATCH 23/45] sched: add do_sched_yield() helper; remove in-kernel call to sched_yield() Dominik Brodowski
2018-03-22 17:29   ` Peter Zijlstra
2018-03-22 17:41     ` Dominik Brodowski
2018-03-22 17:44     ` Linus Torvalds
2018-03-23  7:38       ` git send-email and sending the cover-letter to all cc addresses found in a patch series Dominik Brodowski
2018-03-23  7:49         ` Joe Perches
2018-03-22  9:00 ` [PATCH 24/45] kexec: call do_kexec_load() in compat syscall directly Dominik Brodowski
2018-03-22  9:00 ` [PATCH 25/45] mm: add kernel_migrate_pages() helper, move compat syscall to mm/mempolicy.c Dominik Brodowski
2018-03-22  9:00 ` [PATCH 26/45] mm: add kernel_move_pages() helper, move compat syscall to mm/migrate.c Dominik Brodowski
2018-03-22  9:00 ` [PATCH 27/45] mm: add kernel_mbind() helper; remove in-kernel call to syscall Dominik Brodowski
2018-03-22  9:00 ` [PATCH 28/45] mm: add kernel_[sg]et_mempolicy() helpers; remove in-kernel calls to syscalls Dominik Brodowski
2018-03-22  9:00 ` [PATCH 29/45] mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead() Dominik Brodowski
2018-03-22  9:00 ` [PATCH 30/45] ipc: add semtimedop syscall/compat_syscall wrappers Dominik Brodowski
2018-03-22  9:00 ` [PATCH 31/45] ipc: add semget syscall wrapper Dominik Brodowski
2018-03-22  9:00 ` [PATCH 32/45] ipc: add semctl syscall/compat_syscall wrappers Dominik Brodowski
2018-03-22  9:00 ` [PATCH 33/45] ipc: add msgget syscall wrapper Dominik Brodowski
2018-03-22  9:00 ` [PATCH 34/45] ipc: add shmget " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 35/45] ipc: add shmdt " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 36/45] ipc: add shmctl syscall/compat_syscall wrappers Dominik Brodowski
2018-03-22  9:00 ` [PATCH 37/45] ipc: add msgctl " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 38/45] ipc: add msgrcv " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 39/45] ipc: add msgsnd " Dominik Brodowski
2018-03-22  9:00 ` [PATCH 40/45] x86: use _do_fork() in compat_sys_x86_clone() Dominik Brodowski
2018-03-22  9:26   ` Thomas Gleixner
2018-03-22  9:00 ` [PATCH 41/45] x86: remove compat_sys_x86_waitpid() Dominik Brodowski
2018-03-22  9:27   ` Thomas Gleixner
2018-03-22  9:00 ` [PATCH 42/45] x86: fix sys_sigreturn() return type to be long, not unsigned long Dominik Brodowski
2018-03-22  9:27   ` Thomas Gleixner
2018-03-22  9:00 ` [PATCH 43/45] x86/sigreturn: use SYSCALL_DEFINE0 Dominik Brodowski
2018-03-22  9:27   ` Thomas Gleixner
2018-03-22  9:00 ` [PATCH 44/45] kernel/sys_ni: sort cond_syscall() entries Dominik Brodowski
2018-03-22  9:00 ` [PATCH 45/45] bpf: whitelist all syscalls for error injection Dominik Brodowski
2018-03-22 20:29 ` [PATCH 00/45] remove in-kernel syscall invocations (part 3 == remainder outside arch/) Linus Torvalds
