* [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
From: Pieter Smith @ 2014-11-24 23:00 UTC
  To: pieter
  Cc: Josh Triplett, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

REPO: https://github.com/smipi1/linux-tinification.git

BRANCH: tiny/config-syscall-splice

BACKGROUND: This patch-set forms part of the Linux Kernel Tinification effort (
  https://tiny.wiki.kernel.org/).

GOAL: Support compiling out the splice family of syscalls (splice, vmsplice,
  tee and sendfile) along with all supporting infrastructure if not needed.
  Many embedded systems will not need the splice-family syscalls. Omitting them
  saves space.

HISTORY:
  PATCH v4:
    - Drops __splice_p()
    - Let nfsd fall back to non-splice support when splice is compiled out
    - Style fixes
  
  PATCH v3:
    - Fixup commit logs so that they are consistent with patch strategy
    - Style fixes
  
  PATCH v2:
    - Avoid the ifdef mess introduced in PATCH v1 by mocking out exported splice
      functions.

STRATEGY:
a. With the goal of eventually compiling out fs/splice.c, several functions
   that are only used in support of the splice family of syscalls are moved
   into fs/splice.c from fs/read_write.c. The kernel_write function, which is
   not used to support the splice syscalls, is moved to fs/read_write.c.

b. Introduce an EXPERT kernel configuration option, CONFIG_SYSCALL_SPLICE, to
   compile out the splice family of syscalls. This removes all userspace uses
   of the splice infrastructure.

c. Splice exports an operations struct, nosteal_pipe_buf_ops. Eliminate the 
   use of this struct when CONFIG_SYSCALL_SPLICE is undefined, so that splice
   can later be compiled out.

d. Let nfsd fall back to non-splice support when splice is compiled out.

e. Compile out fs/splice.c. Functions exported by fs/splice are mocked out with
   failing static inlines. This all but eliminates the maintenance burden on
   file-system drivers, which need no changes at their call sites.
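
For illustration only, the mocking approach in (e) can be sketched in plain
userspace C. This is not code from the series: DEMO_CONFIG_SPLICE,
demo_splice_read() and demo_fs_read() are hypothetical names, and -EINVAL as
the failure code is an assumption made for the sketch.

```c
#include <errno.h>
#include <stddef.h>

/* DEMO_CONFIG_SPLICE is a hypothetical stand-in for CONFIG_SYSCALL_SPLICE. */
#ifdef DEMO_CONFIG_SPLICE
/* Option enabled: the real implementation lives in another translation unit. */
long demo_splice_read(size_t len);
#else
/* Option disabled: callers still build, but the call always fails. */
static inline long demo_splice_read(size_t len)
{
	(void)len;
	return -EINVAL;	/* assumed error code for this sketch */
}
#endif

/* A hypothetical file-system driver: no #ifdef is needed at the call site. */
static inline long demo_fs_read(size_t len)
{
	return demo_splice_read(len);
}
```

With the option left undefined, the stub is inlined into the caller and the
compiler can usually discard the dead path, which is where the size saving in
a pattern like this comes from.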

RESULTS: A tinyconfig bloat-o-meter score for the entire patch-set:

add/remove: 0/41 grow/shrink: 5/7 up/down: 23/-8422 (-8399)
function                                     old     new   delta
sys_pwritev                                  115     122      +7
sys_preadv                                   115     122      +7
fdput_pos                                     29      36      +7
sys_pwrite64                                 115     116      +1
sys_pread64                                  115     116      +1
pipe_to_null                                   4       -      -4
generic_pipe_buf_nosteal                       6       -      -6
spd_release_page                              10       -     -10
fdput                                         11       -     -11
PageUptodate                                  22      11     -11
lock_page                                     36      24     -12
signal_pending                                39      26     -13
fdget                                         56      42     -14
page_cache_pipe_buf_release                   16       -     -16
user_page_pipe_buf_ops                        20       -     -20
splice_write_null                             24       4     -20
page_cache_pipe_buf_ops                       20       -     -20
nosteal_pipe_buf_ops                          20       -     -20
default_pipe_buf_ops                          20       -     -20
generic_splice_sendpage                       24       -     -24
user_page_pipe_buf_steal                      25       -     -25
splice_shrink_spd                             27       -     -27
pipe_to_user                                  43       -     -43
direct_splice_actor                           47       -     -47
default_file_splice_write                     49       -     -49
wakeup_pipe_writers                           54       -     -54
wakeup_pipe_readers                           54       -     -54
write_pipe_buf                                71       -     -71
page_cache_pipe_buf_confirm                   80       -     -80
splice_grow_spd                               87       -     -87
do_splice_to                                  87       -     -87
ipipe_prep.part                               92       -     -92
splice_from_pipe                              93       -     -93
splice_from_pipe_next                        107       -    -107
pipe_to_sendpage                             109       -    -109
page_cache_pipe_buf_steal                    114       -    -114
opipe_prep.part                              119       -    -119
sys_sendfile                                 122       -    -122
generic_file_splice_read                     131       8    -123
sys_sendfile64                               126       -    -126
sys_vmsplice                                 137       -    -137
do_splice_direct                             148       -    -148
vmsplice_to_user                             205       -    -205
__splice_from_pipe                           246       -    -246
splice_direct_to_actor                       348       -    -348
splice_to_pipe                               371       -    -371
do_sendfile                                  492       -    -492
sys_tee                                      497       -    -497
vmsplice_to_pipe                             558       -    -558
default_file_splice_read                     688       -    -688
iter_file_splice_write                       702       4    -698
sys_splice                                  1075       -   -1075
__generic_file_splice_read                  1109       -   -1109

Pieter Smith (7):
  fs: move sendfile syscall into fs/splice
  fs: moved kernel_write to fs/read_write
  fs/splice: support compiling out splice-family syscalls
  fs/fuse: support compiling out splice
  fs/nfsd: support compiling out splice
  net/core: support compiling out splice
  fs/splice: full support for compiling out splice

 fs/Makefile            |   3 +-
 fs/fuse/dev.c          |   9 ++-
 fs/read_write.c        | 181 +++------------------------------------------
 fs/splice.c            | 194 +++++++++++++++++++++++++++++++++++++++++++++----
 include/linux/fs.h     |  26 +++++++
 include/linux/skbuff.h |  10 +++
 include/linux/splice.h |  42 +++++++++++
 init/Kconfig           |  10 +++
 kernel/sys_ni.c        |   8 ++
 net/core/skbuff.c      |  11 ++-
 net/sunrpc/svc.c       |   2 +-
 11 files changed, 302 insertions(+), 194 deletions(-)

-- 
2.1.0



* [PATCH v4 1/7] fs: move sendfile syscall into fs/splice
From: Pieter Smith @ 2014-11-24 23:01 UTC
  To: pieter
  Cc: Josh Triplett, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

sendfile functionally forms part of the splice group of syscalls (splice,
vmsplice and tee). Grouping sendfile with splice paves the way to compiling out
the splice group of syscalls for embedded systems that do not need these.

add/remove: 0/0 grow/shrink: 7/2 up/down: 86/-61 (25)
function                                     old     new   delta
file_start_write                              34      68     +34
file_end_write                                29      58     +29
sys_pwritev                                  115     122      +7
sys_preadv                                   115     122      +7
fdput_pos                                     29      36      +7
sys_pwrite64                                 115     116      +1
sys_pread64                                  115     116      +1
sys_tee                                      497     491      -6
sys_splice                                  1075    1020     -55

Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
 fs/read_write.c | 175 -------------------------------------------------------
 fs/splice.c     | 178 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 178 insertions(+), 175 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 7d9318c..d9451ba 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1191,178 +1191,3 @@ COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
 }
 #endif
 
-static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
-		  	   size_t count, loff_t max)
-{
-	struct fd in, out;
-	struct inode *in_inode, *out_inode;
-	loff_t pos;
-	loff_t out_pos;
-	ssize_t retval;
-	int fl;
-
-	/*
-	 * Get input file, and verify that it is ok..
-	 */
-	retval = -EBADF;
-	in = fdget(in_fd);
-	if (!in.file)
-		goto out;
-	if (!(in.file->f_mode & FMODE_READ))
-		goto fput_in;
-	retval = -ESPIPE;
-	if (!ppos) {
-		pos = in.file->f_pos;
-	} else {
-		pos = *ppos;
-		if (!(in.file->f_mode & FMODE_PREAD))
-			goto fput_in;
-	}
-	retval = rw_verify_area(READ, in.file, &pos, count);
-	if (retval < 0)
-		goto fput_in;
-	count = retval;
-
-	/*
-	 * Get output file, and verify that it is ok..
-	 */
-	retval = -EBADF;
-	out = fdget(out_fd);
-	if (!out.file)
-		goto fput_in;
-	if (!(out.file->f_mode & FMODE_WRITE))
-		goto fput_out;
-	retval = -EINVAL;
-	in_inode = file_inode(in.file);
-	out_inode = file_inode(out.file);
-	out_pos = out.file->f_pos;
-	retval = rw_verify_area(WRITE, out.file, &out_pos, count);
-	if (retval < 0)
-		goto fput_out;
-	count = retval;
-
-	if (!max)
-		max = min(in_inode->i_sb->s_maxbytes, out_inode->i_sb->s_maxbytes);
-
-	if (unlikely(pos + count > max)) {
-		retval = -EOVERFLOW;
-		if (pos >= max)
-			goto fput_out;
-		count = max - pos;
-	}
-
-	fl = 0;
-#if 0
-	/*
-	 * We need to debate whether we can enable this or not. The
-	 * man page documents EAGAIN return for the output at least,
-	 * and the application is arguably buggy if it doesn't expect
-	 * EAGAIN on a non-blocking file descriptor.
-	 */
-	if (in.file->f_flags & O_NONBLOCK)
-		fl = SPLICE_F_NONBLOCK;
-#endif
-	file_start_write(out.file);
-	retval = do_splice_direct(in.file, &pos, out.file, &out_pos, count, fl);
-	file_end_write(out.file);
-
-	if (retval > 0) {
-		add_rchar(current, retval);
-		add_wchar(current, retval);
-		fsnotify_access(in.file);
-		fsnotify_modify(out.file);
-		out.file->f_pos = out_pos;
-		if (ppos)
-			*ppos = pos;
-		else
-			in.file->f_pos = pos;
-	}
-
-	inc_syscr(current);
-	inc_syscw(current);
-	if (pos > max)
-		retval = -EOVERFLOW;
-
-fput_out:
-	fdput(out);
-fput_in:
-	fdput(in);
-out:
-	return retval;
-}
-
-SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd, off_t __user *, offset, size_t, count)
-{
-	loff_t pos;
-	off_t off;
-	ssize_t ret;
-
-	if (offset) {
-		if (unlikely(get_user(off, offset)))
-			return -EFAULT;
-		pos = off;
-		ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS);
-		if (unlikely(put_user(pos, offset)))
-			return -EFAULT;
-		return ret;
-	}
-
-	return do_sendfile(out_fd, in_fd, NULL, count, 0);
-}
-
-SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd, loff_t __user *, offset, size_t, count)
-{
-	loff_t pos;
-	ssize_t ret;
-
-	if (offset) {
-		if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
-			return -EFAULT;
-		ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
-		if (unlikely(put_user(pos, offset)))
-			return -EFAULT;
-		return ret;
-	}
-
-	return do_sendfile(out_fd, in_fd, NULL, count, 0);
-}
-
-#ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd,
-		compat_off_t __user *, offset, compat_size_t, count)
-{
-	loff_t pos;
-	off_t off;
-	ssize_t ret;
-
-	if (offset) {
-		if (unlikely(get_user(off, offset)))
-			return -EFAULT;
-		pos = off;
-		ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS);
-		if (unlikely(put_user(pos, offset)))
-			return -EFAULT;
-		return ret;
-	}
-
-	return do_sendfile(out_fd, in_fd, NULL, count, 0);
-}
-
-COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
-		compat_loff_t __user *, offset, compat_size_t, count)
-{
-	loff_t pos;
-	ssize_t ret;
-
-	if (offset) {
-		if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
-			return -EFAULT;
-		ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
-		if (unlikely(put_user(pos, offset)))
-			return -EFAULT;
-		return ret;
-	}
-
-	return do_sendfile(out_fd, in_fd, NULL, count, 0);
-}
-#endif
diff --git a/fs/splice.c b/fs/splice.c
index f5cb9ba..c1a2861 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -28,6 +28,7 @@
 #include <linux/export.h>
 #include <linux/syscalls.h>
 #include <linux/uio.h>
+#include <linux/fsnotify.h>
 #include <linux/security.h>
 #include <linux/gfp.h>
 #include <linux/socket.h>
@@ -2039,3 +2040,180 @@ SYSCALL_DEFINE4(tee, int, fdin, int, fdout, size_t, len, unsigned int, flags)
 
 	return error;
 }
+
+static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
+			   size_t count, loff_t max)
+{
+	struct fd in, out;
+	struct inode *in_inode, *out_inode;
+	loff_t pos;
+	loff_t out_pos;
+	ssize_t retval;
+	int fl;
+
+	/*
+	 * Get input file, and verify that it is ok..
+	 */
+	retval = -EBADF;
+	in = fdget(in_fd);
+	if (!in.file)
+		goto out;
+	if (!(in.file->f_mode & FMODE_READ))
+		goto fput_in;
+	retval = -ESPIPE;
+	if (!ppos) {
+		pos = in.file->f_pos;
+	} else {
+		pos = *ppos;
+		if (!(in.file->f_mode & FMODE_PREAD))
+			goto fput_in;
+	}
+	retval = rw_verify_area(READ, in.file, &pos, count);
+	if (retval < 0)
+		goto fput_in;
+	count = retval;
+
+	/*
+	 * Get output file, and verify that it is ok..
+	 */
+	retval = -EBADF;
+	out = fdget(out_fd);
+	if (!out.file)
+		goto fput_in;
+	if (!(out.file->f_mode & FMODE_WRITE))
+		goto fput_out;
+	retval = -EINVAL;
+	in_inode = file_inode(in.file);
+	out_inode = file_inode(out.file);
+	out_pos = out.file->f_pos;
+	retval = rw_verify_area(WRITE, out.file, &out_pos, count);
+	if (retval < 0)
+		goto fput_out;
+	count = retval;
+
+	if (!max)
+		max = min(in_inode->i_sb->s_maxbytes, out_inode->i_sb->s_maxbytes);
+
+	if (unlikely(pos + count > max)) {
+		retval = -EOVERFLOW;
+		if (pos >= max)
+			goto fput_out;
+		count = max - pos;
+	}
+
+	fl = 0;
+#if 0
+	/*
+	 * We need to debate whether we can enable this or not. The
+	 * man page documents EAGAIN return for the output at least,
+	 * and the application is arguably buggy if it doesn't expect
+	 * EAGAIN on a non-blocking file descriptor.
+	 */
+	if (in.file->f_flags & O_NONBLOCK)
+		fl = SPLICE_F_NONBLOCK;
+#endif
+	file_start_write(out.file);
+	retval = do_splice_direct(in.file, &pos, out.file, &out_pos, count, fl);
+	file_end_write(out.file);
+
+	if (retval > 0) {
+		add_rchar(current, retval);
+		add_wchar(current, retval);
+		fsnotify_access(in.file);
+		fsnotify_modify(out.file);
+		out.file->f_pos = out_pos;
+		if (ppos)
+			*ppos = pos;
+		else
+			in.file->f_pos = pos;
+	}
+
+	inc_syscr(current);
+	inc_syscw(current);
+	if (pos > max)
+		retval = -EOVERFLOW;
+
+fput_out:
+	fdput(out);
+fput_in:
+	fdput(in);
+out:
+	return retval;
+}
+
+SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd, off_t __user *, offset, size_t, count)
+{
+	loff_t pos;
+	off_t off;
+	ssize_t ret;
+
+	if (offset) {
+		if (unlikely(get_user(off, offset)))
+			return -EFAULT;
+		pos = off;
+		ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS);
+		if (unlikely(put_user(pos, offset)))
+			return -EFAULT;
+		return ret;
+	}
+
+	return do_sendfile(out_fd, in_fd, NULL, count, 0);
+}
+
+SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd, loff_t __user *, offset, size_t, count)
+{
+	loff_t pos;
+	ssize_t ret;
+
+	if (offset) {
+		if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
+			return -EFAULT;
+		ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
+		if (unlikely(put_user(pos, offset)))
+			return -EFAULT;
+		return ret;
+	}
+
+	return do_sendfile(out_fd, in_fd, NULL, count, 0);
+}
+
+#ifdef CONFIG_COMPAT
+COMPAT_SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd,
+		compat_off_t __user *, offset, compat_size_t, count)
+{
+	loff_t pos;
+	off_t off;
+	ssize_t ret;
+
+	if (offset) {
+		if (unlikely(get_user(off, offset)))
+			return -EFAULT;
+		pos = off;
+		ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS);
+		if (unlikely(put_user(pos, offset)))
+			return -EFAULT;
+		return ret;
+	}
+
+	return do_sendfile(out_fd, in_fd, NULL, count, 0);
+}
+
+COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
+		compat_loff_t __user *, offset, compat_size_t, count)
+{
+	loff_t pos;
+	ssize_t ret;
+
+	if (offset) {
+		if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
+			return -EFAULT;
+		ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
+		if (unlikely(put_user(pos, offset)))
+			return -EFAULT;
+		return ret;
+	}
+
+	return do_sendfile(out_fd, in_fd, NULL, count, 0);
+}
+#endif
+
-- 
2.1.0



* [PATCH v4 2/7] fs: moved kernel_write to fs/read_write
From: Pieter Smith @ 2014-11-24 23:01 UTC
  To: pieter
  Cc: Josh Triplett, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

kernel_write shares infrastructure with the read_write translation unit but not
with the splice translation unit. Grouping kernel_write with the read_write
translation unit is more logical. It also paves the way to compiling out the
splice group of syscalls for embedded systems that do not need them.

Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
 fs/read_write.c | 16 ++++++++++++++++
 fs/splice.c     | 16 ----------------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index d9451ba..f4c8d8b 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1191,3 +1191,19 @@ COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
 }
 #endif
 
+ssize_t kernel_write(struct file *file, const char *buf, size_t count,
+			    loff_t pos)
+{
+	mm_segment_t old_fs;
+	ssize_t res;
+
+	old_fs = get_fs();
+	set_fs(get_ds());
+	/* The cast to a user pointer is valid due to the set_fs() */
+	res = vfs_write(file, (__force const char __user *)buf, count, &pos);
+	set_fs(old_fs);
+
+	return res;
+}
+EXPORT_SYMBOL(kernel_write);
+
diff --git a/fs/splice.c b/fs/splice.c
index c1a2861..44b201b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -583,22 +583,6 @@ static ssize_t kernel_readv(struct file *file, const struct iovec *vec,
 	return res;
 }
 
-ssize_t kernel_write(struct file *file, const char *buf, size_t count,
-			    loff_t pos)
-{
-	mm_segment_t old_fs;
-	ssize_t res;
-
-	old_fs = get_fs();
-	set_fs(get_ds());
-	/* The cast to a user pointer is valid due to the set_fs() */
-	res = vfs_write(file, (__force const char __user *)buf, count, &pos);
-	set_fs(old_fs);
-
-	return res;
-}
-EXPORT_SYMBOL(kernel_write);
-
 ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
 				 struct pipe_inode_info *pipe, size_t len,
 				 unsigned int flags)
-- 
2.1.0



* [PATCH v4 3/7] fs/splice: support compiling out splice-family syscalls
From: Pieter Smith @ 2014-11-24 23:01 UTC
  To: pieter
  Cc: Josh Triplett, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

Many embedded systems will not need the splice-family syscalls (splice,
vmsplice, tee and sendfile). Omitting them saves space.  This adds a new EXPERT
config option CONFIG_SYSCALL_SPLICE (default y) to support compiling them out.

The goal is to completely compile out fs/splice along with the syscalls. To
achieve this, the remainder of the patch-set deals with fs/splice exports. As
far as possible, the impact on other device drivers will be minimized so as to
reduce the overall maintenance burden of CONFIG_SYSCALL_SPLICE.

The use of exported functions will be solved by transparently mocking them out
with static inlines. Uses of the exported pipe_buf_operations struct however
require direct modification in fs/fuse and net/core. Later patches in this
series deal with this by compiling out the affected splice callbacks when
CONFIG_SYSCALL_SPLICE is undefined.

Once all exports are solved, fs/splice can be compiled out.
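
For readers unfamiliar with kernel/sys_ni.c: the cond_syscall() entries added
by this patch make each listed syscall fall back to a "not implemented"
handler when no real definition is linked in, so userspace sees -ENOSYS. The
mechanism can be sketched in userspace with a weak alias. This is a simplified
illustration, not the kernel's macro: demo_ni_syscall() and demo_sys_tee() are
hypothetical names, and the attribute syntax assumes GCC or Clang on an ELF
target.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's "not implemented" syscall handler. */
long demo_ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Weak alias: if no other object file provides a strong definition of
 * demo_sys_tee(), calls resolve to demo_ni_syscall(). A strong definition
 * elsewhere would win at link time.
 */
long demo_sys_tee(void) __attribute__((weak, alias("demo_ni_syscall")));
```

Calling demo_sys_tee() in a program built from this file alone returns
-ENOSYS, which mirrors what an application should expect from the splice
family on a kernel built with the option disabled.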

The bloat benefit of this patch given a tinyconfig is:

add/remove: 0/16 grow/shrink: 2/5 up/down: 114/-3693 (-3579)
function                                     old     new   delta
splice_direct_to_actor                       348     416     +68
splice_to_pipe                               371     417     +46
splice_from_pipe_next                        107     106      -1
fdput                                         11       -     -11
signal_pending                                39      26     -13
fdget                                         56      42     -14
user_page_pipe_buf_ops                        20       -     -20
user_page_pipe_buf_steal                      25       -     -25
file_end_write                                58      29     -29
file_start_write                              68      34     -34
pipe_to_user                                  43       -     -43
wakeup_pipe_readers                           54       -     -54
do_splice_to                                  87       -     -87
ipipe_prep.part                               92       -     -92
opipe_prep.part                              119       -    -119
sys_sendfile                                 122       -    -122
sys_sendfile64                               126       -    -126
sys_vmsplice                                 137       -    -137
vmsplice_to_user                             205       -    -205
sys_tee                                      491       -    -491
do_sendfile                                  492       -    -492
vmsplice_to_pipe                             558       -    -558
sys_splice                                  1020       -   -1020

Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
 fs/splice.c     |  2 ++
 init/Kconfig    | 10 ++++++++++
 kernel/sys_ni.c |  8 ++++++++
 3 files changed, 20 insertions(+)

diff --git a/fs/splice.c b/fs/splice.c
index 44b201b..7c4c695 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1316,6 +1316,7 @@ long do_splice_direct(struct file *in, loff_t *ppos, struct file *out,
 	return ret;
 }
 
+#ifdef CONFIG_SYSCALL_SPLICE
 static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe,
 			       struct pipe_inode_info *opipe,
 			       size_t len, unsigned int flags);
@@ -2200,4 +2201,5 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
 	return do_sendfile(out_fd, in_fd, NULL, count, 0);
 }
 #endif
+#endif
 
diff --git a/init/Kconfig b/init/Kconfig
index d811d5f..dec9819 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1571,6 +1571,16 @@ config NTP
 	  system clock to an NTP server, you can disable this option to save
 	  space.
 
+config SYSCALL_SPLICE
+	bool "Enable splice/vmsplice/tee/sendfile syscalls" if EXPERT
+	default y
+	help
+	  This option enables the splice, vmsplice, tee and sendfile syscalls. These
+	  are used by applications to: move data between buffers and arbitrary file
+	  descriptors; "copy" data between buffers; or copy data from userspace into
+	  buffers. If building an embedded system where no applications use these
+	  syscalls, you can disable this option to save space.
+
 config PCI_QUIRKS
 	default y
 	bool "Enable PCI quirk workarounds" if EXPERT
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index d2f5b00..25d5551 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -170,6 +170,14 @@ cond_syscall(sys_fstat);
 cond_syscall(sys_stat);
 cond_syscall(sys_uname);
 cond_syscall(sys_olduname);
+cond_syscall(sys_vmsplice);
+cond_syscall(sys_splice);
+cond_syscall(sys_tee);
+cond_syscall(sys_sendfile);
+cond_syscall(sys_sendfile64);
+cond_syscall(compat_sys_vmsplice);
+cond_syscall(compat_sys_sendfile);
+cond_syscall(compat_sys_sendfile64);
 
 /* arch-specific weak syscall entries */
 cond_syscall(sys_pciconfig_read);
-- 
2.1.0



* [PATCH v4 4/7] fs/fuse: support compiling out splice
From: Pieter Smith @ 2014-11-24 23:01 UTC
  To: pieter
  Cc: Josh Triplett, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

To implement splice support, fs/fuse makes use of nosteal_pipe_buf_ops. This
struct is exported by fs/splice. The goal of the larger patch set is to
completely compile out fs/splice, so uses of the exported struct need to be
compiled out along with fs/splice.

This patch therefore compiles out splice support in fs/fuse when
CONFIG_SYSCALL_SPLICE is undefined.
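
The pattern used here (define the callback name to NULL when the feature is
compiled out, so the ops-table initializer stays unchanged) can be sketched in
userspace C. All demo_* names and DEMO_CONFIG_SPLICE are hypothetical, and the
-EINVAL returned for a missing callback is an assumption of this sketch rather
than a statement about the VFS.

```c
#include <errno.h>
#include <stddef.h>

struct demo_file_ops {
	long (*read)(size_t len);
	long (*splice_read)(size_t len);	/* optional; may be NULL */
};

static long demo_read(size_t len)
{
	return (long)len;
}

#ifdef DEMO_CONFIG_SPLICE
static long demo_splice_read(size_t len)
{
	return (long)len;
}
#else
/* Splice compiled out: the initializer below stores a NULL callback. */
#define demo_splice_read NULL
#endif

/* The initializer is identical in both configurations. */
static const struct demo_file_ops demo_ops = {
	.read		= demo_read,
	.splice_read	= demo_splice_read,
};

/* Hypothetical caller: a missing callback is reported as -EINVAL. */
static long demo_do_splice(const struct demo_file_ops *ops, size_t len)
{
	if (!ops->splice_read)
		return -EINVAL;
	return ops->splice_read(len);
}
```

The design choice is that only the definition site carries an #ifdef; the ops
table and every caller stay free of conditional compilation.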

Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
 fs/fuse/dev.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index ca88731..e984302 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1191,8 +1191,9 @@ __releases(fc->lock)
  * request_end().  Otherwise add it to the processing list, and set
  * the 'sent' flag.
  */
-static ssize_t fuse_dev_do_read(struct fuse_conn *fc, struct file *file,
-				struct fuse_copy_state *cs, size_t nbytes)
+static ssize_t __maybe_unused
+fuse_dev_do_read(struct fuse_conn *fc, struct file *file,
+		 struct fuse_copy_state *cs, size_t nbytes)
 {
 	int err;
 	struct fuse_req *req;
@@ -1291,6 +1292,7 @@ static ssize_t fuse_dev_read(struct kiocb *iocb, const struct iovec *iov,
 	return fuse_dev_do_read(fc, file, &cs, iov_length(iov, nr_segs));
 }
 
+#ifdef CONFIG_SYSCALL_SPLICE
 static ssize_t fuse_dev_splice_read(struct file *in, loff_t *ppos,
 				    struct pipe_inode_info *pipe,
 				    size_t len, unsigned int flags)
@@ -1368,6 +1370,9 @@ out:
 	kfree(bufs);
 	return ret;
 }
+#else /* CONFIG_SYSCALL_SPLICE */
+#define fuse_dev_splice_read NULL
+#endif
 
 static int fuse_notify_poll(struct fuse_conn *fc, unsigned int size,
 			    struct fuse_copy_state *cs)
-- 
2.1.0



* [PATCH v4 5/7] fs/nfsd: support compiling out splice
From: Pieter Smith @ 2014-11-24 23:01 UTC
  To: pieter
  Cc: Josh Triplett, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

The goal of the larger patch set is to completely compile out fs/splice, and
as a result, splice support for all file-systems. This patch ensures that
fs/nfsd falls back to non-splice fs support when CONFIG_SYSCALL_SPLICE is
undefined.
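
The fallback depends on the config test folding to a compile-time 0 or 1. A
rough userspace sketch of that idea follows. IS_DEFINED() is a simplified
imitation of the kernel's IS_ENABLED(), and struct rqst_sketch and the
sketch_* helpers are hypothetical stand-ins, not the real sunrpc or nfsd code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: models CONFIG_SYSCALL_SPLICE=y; delete this line to model =n. */
#define CONFIG_SYSCALL_SPLICE 1

/*
 * Simplified imitation of the kernel's IS_ENABLED() for built-in options:
 * expands to 1 when the symbol is defined as 1, and to 0 otherwise.
 */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED(x) IS_DEFINED_1(x)
#define IS_DEFINED_1(val) IS_DEFINED_2(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_2(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)

/* An unknown (for example misspelled) symbol quietly evaluates to 0. */
static_assert(IS_DEFINED(CONFIG_NO_SUCH_OPTION) == 0, "undefined symbol is 0");

/* Hypothetical stand-in for the per-request state. */
struct rqst_sketch {
	bool rq_splice_ok;
};

enum read_path { READ_PATH_COPY, READ_PATH_SPLICE };

/* Mirrors the one-line change: splice is allowed only if it is built in. */
static void sketch_init_request(struct rqst_sketch *rqstp)
{
	rqstp->rq_splice_ok = IS_DEFINED(CONFIG_SYSCALL_SPLICE);
}

/* A reader that honours the flag and otherwise takes the copying path. */
static enum read_path sketch_choose_read_path(const struct rqst_sketch *rqstp)
{
	return rqstp->rq_splice_ok ? READ_PATH_SPLICE : READ_PATH_COPY;
}
```

Because the test is an ordinary expression rather than an #ifdef, both paths
stay visible to the compiler, and the symbol name has to be spelled exactly as
in Kconfig for the splice path to be selected.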

Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
 net/sunrpc/svc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index ca8a795..6cacc37 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1084,7 +1084,7 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
 		goto err_short_len;
 
 	/* Will be turned off only in gss privacy case: */
-	rqstp->rq_splice_ok = true;
+	rqstp->rq_splice_ok = IS_ENABLED(CONFIG_SYSCALL_SPLICE);
 	/* Will be turned off only when NFSv4 Sessions are used */
 	rqstp->rq_usedeferral = true;
 	rqstp->rq_dropme = false;
-- 
2.1.0


* [PATCH v4 6/7] net/core: support compiling out splice
  2014-11-24 23:00 [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile) Pieter Smith
                   ` (4 preceding siblings ...)
  2014-11-24 23:01 ` [PATCH v4 5/7] fs/nfsd: " Pieter Smith
@ 2014-11-24 23:01 ` Pieter Smith
  2014-11-24 23:01 ` [PATCH v4 7/7] fs/splice: full support for " Pieter Smith
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Pieter Smith @ 2014-11-24 23:01 UTC (permalink / raw)
  To: pieter
  Cc: Josh Triplett, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

To implement splice support, net/core makes use of nosteal_pipe_buf_ops. This
struct is exported by fs/splice. The goal of the larger patch set is to
completely compile out fs/splice, so uses of the exported struct need to be
compiled out along with fs/splice.

This patch therefore compiles out splice support in net/core when
CONFIG_SYSCALL_SPLICE is undefined. The compiled out function skb_splice_bits
is transparently mocked out with a static inline. The greater patch set removes
userspace splice support so it cannot be called anyway.
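
As a minimal userspace sketch of the mocking pattern described above (the
SKETCH_SYSCALL_SPLICE switch and all sketch_* names are hypothetical and only
stand in for CONFIG_SYSCALL_SPLICE, skb_splice_bits and its argument types):

```c
#include <errno.h>

/* Assumption: leave undefined to model a kernel built without splice. */
/* #define SKETCH_SYSCALL_SPLICE 1 */

struct sketch_buf { unsigned int len; };    /* stand-in for struct sk_buff */
struct sketch_pipe { unsigned int used; };  /* stand-in for struct pipe_inode_info */

#ifdef SKETCH_SYSCALL_SPLICE
/* Toy "real" implementation: account the spliced bytes in the pipe. */
static int sketch_splice_bits(struct sketch_buf *buf, unsigned int offset,
			      struct sketch_pipe *pipe, unsigned int len)
{
	unsigned int avail = offset < buf->len ? buf->len - offset : 0;
	unsigned int n = len < avail ? len : avail;

	pipe->used += n;
	return (int)n;
}
#else
/* Header-style stub: same prototype, no body worth keeping in the image. */
static inline int sketch_splice_bits(struct sketch_buf *buf, unsigned int offset,
				     struct sketch_pipe *pipe, unsigned int len)
{
	(void)buf;
	(void)offset;
	(void)pipe;
	(void)len;
	return -EPERM;
}
#endif

/* The caller is written once and compiles under either configuration. */
static int sketch_recv_to_pipe(struct sketch_buf *buf, struct sketch_pipe *pipe)
{
	return sketch_splice_bits(buf, 0, pipe, buf->len);
}
```

The caller needs no #ifdef of its own, which is the point of doing the
mocking in the header.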

Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
 include/linux/skbuff.h | 10 ++++++++++
 net/core/skbuff.c      | 11 +++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a59d934..5cd636b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2640,9 +2640,19 @@ int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len);
 int skb_store_bits(struct sk_buff *skb, int offset, const void *from, int len);
 __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset, u8 *to,
 			      int len, __wsum csum);
+#ifdef CONFIG_SYSCALL_SPLICE
 int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
 		    struct pipe_inode_info *pipe, unsigned int len,
 		    unsigned int flags);
+#else
+static inline int
+skb_splice_bits(struct sk_buff *skb, unsigned int offset,
+		struct pipe_inode_info *pipe, unsigned int len,
+		unsigned int flags)
+{
+	return -EPERM;
+}
+#endif
 void skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to);
 unsigned int skb_zerocopy_headlen(const struct sk_buff *from);
 int skb_zerocopy(struct sk_buff *to, struct sk_buff *from,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 61059a0..bb426d9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1678,7 +1678,8 @@ EXPORT_SYMBOL(skb_copy_bits);
  * Callback from splice_to_pipe(), if we need to release some pages
  * at the end of the spd in case we error'ed out in filling the pipe.
  */
-static void sock_spd_release(struct splice_pipe_desc *spd, unsigned int i)
+static void __maybe_unused sock_spd_release(struct splice_pipe_desc *spd,
+					    unsigned int i)
 {
 	put_page(spd->pages[i]);
 }
@@ -1781,9 +1782,9 @@ static bool __splice_segment(struct page *page, unsigned int poff,
  * Map linear and fragment data from the skb to spd. It reports true if the
  * pipe is full or if we already spliced the requested length.
  */
-static bool __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
-			      unsigned int *offset, unsigned int *len,
-			      struct splice_pipe_desc *spd, struct sock *sk)
+static bool __maybe_unused __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
+					     unsigned int *offset, unsigned int *len,
+					     struct splice_pipe_desc *spd, struct sock *sk)
 {
 	int seg;
 
@@ -1821,6 +1822,7 @@ static bool __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
  * the frag list, if such a thing exists. We'd probably need to recurse to
  * handle that cleanly.
  */
+#ifdef CONFIG_SYSCALL_SPLICE
 int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
 		    struct pipe_inode_info *pipe, unsigned int tlen,
 		    unsigned int flags)
@@ -1876,6 +1878,7 @@ done:
 
 	return ret;
 }
+#endif /* CONFIG_SYSCALL_SPLICE */
 
 /**
  *	skb_store_bits - store bits from kernel buffer to skb
-- 
2.1.0


* [PATCH v4 7/7] fs/splice: full support for compiling out splice
  2014-11-24 23:00 [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile) Pieter Smith
                   ` (5 preceding siblings ...)
  2014-11-24 23:01 ` [PATCH v4 6/7] net/core: " Pieter Smith
@ 2014-11-24 23:01 ` Pieter Smith
  2014-11-25  0:52 ` [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile) Josh Triplett
       [not found] ` <5474ABB6.3030400@infradead.org>
  8 siblings, 0 replies; 23+ messages in thread
From: Pieter Smith @ 2014-11-24 23:01 UTC (permalink / raw)
  To: pieter
  Cc: Josh Triplett, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

Entirely compile out the splice translation unit when the system is configured
without the splice family of syscalls (i.e. CONFIG_SYSCALL_SPLICE is undefined).

Exported fs/splice functions are transparently mocked out with static inlines.
Because userspace support for splice has already been removed by this
patch-set, the exported functions cannot be called anyway. Mocking them out
prevents a maintenance burden on file system drivers.
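
A hedged userspace sketch of why a file system driver can stay untouched: a
static inline stub can still be referenced by address from an ops table. All
sketch_* names are hypothetical, and the -EINVAL returned for a NULL hook is
an assumption of this sketch rather than the kernel's behaviour.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*sketch_splice_read_t)(void *file, long *ppos, size_t len);

/* Header-style stub standing in for a mocked-out exported helper. */
static inline long sketch_generic_splice_read(void *file, long *ppos, size_t len)
{
	(void)file;
	(void)ppos;
	(void)len;
	return -EPERM;
}

/* Stand-in for the splice-related part of struct file_operations. */
struct sketch_file_ops {
	sketch_splice_read_t splice_read;
};

/* Driver A keeps pointing at the generic helper; its source is unchanged. */
static const struct sketch_file_ops sketch_driver_a_ops = {
	.splice_read = sketch_generic_splice_read,
};

/* Driver B has a private implementation, so it NULLs the hook out instead. */
#ifdef SKETCH_SYSCALL_SPLICE
static long sketch_driver_b_splice_read(void *file, long *ppos, size_t len)
{
	(void)file;
	*ppos += (long)len;
	return (long)len;
}
#else
#define sketch_driver_b_splice_read NULL
#endif

static const struct sketch_file_ops sketch_driver_b_ops = {
	.splice_read = sketch_driver_b_splice_read,
};

/* Hypothetical dispatcher: a missing hook is reported as -EINVAL. */
static long sketch_do_splice_read(const struct sketch_file_ops *ops,
				  long *ppos, size_t len)
{
	if (!ops->splice_read)
		return -EINVAL;
	return ops->splice_read(NULL, ppos, len);
}
```

Taking the address of a stub forces the compiler to emit an out-of-line copy
of it, which is presumably why a few helpers in the tables below shrink to a
handful of bytes instead of disappearing.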

The bloat score resulting from this patch given a tinyconfig is:
add/remove: 0/25 grow/shrink: 0/5 up/down: 0/-4845 (-4845)
function                                     old     new   delta
pipe_to_null                                   4       -      -4
generic_pipe_buf_nosteal                       6       -      -6
spd_release_page                              10       -     -10
PageUptodate                                  22      11     -11
lock_page                                     36      24     -12
page_cache_pipe_buf_release                   16       -     -16
splice_write_null                             24       4     -20
page_cache_pipe_buf_ops                       20       -     -20
nosteal_pipe_buf_ops                          20       -     -20
default_pipe_buf_ops                          20       -     -20
generic_splice_sendpage                       24       -     -24
splice_shrink_spd                             27       -     -27
direct_splice_actor                           47       -     -47
default_file_splice_write                     49       -     -49
wakeup_pipe_writers                           54       -     -54
write_pipe_buf                                71       -     -71
page_cache_pipe_buf_confirm                   80       -     -80
splice_grow_spd                               87       -     -87
splice_from_pipe                              93       -     -93
splice_from_pipe_next                        106       -    -106
pipe_to_sendpage                             109       -    -109
page_cache_pipe_buf_steal                    114       -    -114
generic_file_splice_read                     131       8    -123
do_splice_direct                             148       -    -148
__splice_from_pipe                           246       -    -246
splice_direct_to_actor                       416       -    -416
splice_to_pipe                               417       -    -417
default_file_splice_read                     688       -    -688
iter_file_splice_write                       702       4    -698
__generic_file_splice_read                  1109       -   -1109

The bloat score for the entire CONFIG_SYSCALL_SPLICE patch-set is:
add/remove: 0/41 grow/shrink: 5/7 up/down: 23/-8422 (-8399)
function                                     old     new   delta
sys_pwritev                                  115     122      +7
sys_preadv                                   115     122      +7
fdput_pos                                     29      36      +7
sys_pwrite64                                 115     116      +1
sys_pread64                                  115     116      +1
pipe_to_null                                   4       -      -4
generic_pipe_buf_nosteal                       6       -      -6
spd_release_page                              10       -     -10
fdput                                         11       -     -11
PageUptodate                                  22      11     -11
lock_page                                     36      24     -12
signal_pending                                39      26     -13
fdget                                         56      42     -14
page_cache_pipe_buf_release                   16       -     -16
user_page_pipe_buf_ops                        20       -     -20
splice_write_null                             24       4     -20
page_cache_pipe_buf_ops                       20       -     -20
nosteal_pipe_buf_ops                          20       -     -20
default_pipe_buf_ops                          20       -     -20
generic_splice_sendpage                       24       -     -24
user_page_pipe_buf_steal                      25       -     -25
splice_shrink_spd                             27       -     -27
pipe_to_user                                  43       -     -43
direct_splice_actor                           47       -     -47
default_file_splice_write                     49       -     -49
wakeup_pipe_writers                           54       -     -54
wakeup_pipe_readers                           54       -     -54
write_pipe_buf                                71       -     -71
page_cache_pipe_buf_confirm                   80       -     -80
splice_grow_spd                               87       -     -87
do_splice_to                                  87       -     -87
ipipe_prep.part                               92       -     -92
splice_from_pipe                              93       -     -93
splice_from_pipe_next                        107       -    -107
pipe_to_sendpage                             109       -    -109
page_cache_pipe_buf_steal                    114       -    -114
opipe_prep.part                              119       -    -119
sys_sendfile                                 122       -    -122
generic_file_splice_read                     131       8    -123
sys_sendfile64                               126       -    -126
sys_vmsplice                                 137       -    -137
do_splice_direct                             148       -    -148
vmsplice_to_user                             205       -    -205
__splice_from_pipe                           246       -    -246
splice_direct_to_actor                       348       -    -348
splice_to_pipe                               371       -    -371
do_sendfile                                  492       -    -492
sys_tee                                      497       -    -497
vmsplice_to_pipe                             558       -    -558
default_file_splice_read                     688       -    -688
iter_file_splice_write                       702       4    -698
sys_splice                                  1075       -   -1075
__generic_file_splice_read                  1109       -   -1109

Signed-off-by: Pieter Smith <pieter@boesman.nl>
---
 fs/Makefile            |  3 ++-
 fs/splice.c            |  2 --
 include/linux/fs.h     | 26 ++++++++++++++++++++++++++
 include/linux/splice.h | 42 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/fs/Makefile b/fs/Makefile
index fb7646e..9395622 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -10,7 +10,7 @@ obj-y :=	open.o read_write.o file_table.o super.o \
 		ioctl.o readdir.o select.o dcache.o inode.o \
 		attr.o bad_inode.o file.o filesystems.o namespace.o \
 		seq_file.o xattr.o libfs.o fs-writeback.o \
-		pnode.o splice.o sync.o utimes.o \
+		pnode.o sync.o utimes.o \
 		stack.o fs_struct.o statfs.o fs_pin.o
 
 ifeq ($(CONFIG_BLOCK),y)
@@ -22,6 +22,7 @@ endif
 obj-$(CONFIG_PROC_FS) += proc_namespace.o
 
 obj-$(CONFIG_FSNOTIFY)		+= notify/
+obj-$(CONFIG_SYSCALL_SPLICE)	+= splice.o
 obj-$(CONFIG_EPOLL)		+= eventpoll.o
 obj-$(CONFIG_ANON_INODES)	+= anon_inodes.o
 obj-$(CONFIG_SIGNALFD)		+= signalfd.o
diff --git a/fs/splice.c b/fs/splice.c
index 7c4c695..44b201b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1316,7 +1316,6 @@ long do_splice_direct(struct file *in, loff_t *ppos, struct file *out,
 	return ret;
 }
 
-#ifdef CONFIG_SYSCALL_SPLICE
 static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe,
 			       struct pipe_inode_info *opipe,
 			       size_t len, unsigned int flags);
@@ -2201,5 +2200,4 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
 	return do_sendfile(out_fd, in_fd, NULL, count, 0);
 }
 #endif
-#endif
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a957d43..138107e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2444,6 +2444,7 @@ extern int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
 extern void block_sync_page(struct page *page);
 
 /* fs/splice.c */
+#ifdef CONFIG_SYSCALL_SPLICE
 extern ssize_t generic_file_splice_read(struct file *, loff_t *,
 		struct pipe_inode_info *, size_t, unsigned int);
 extern ssize_t default_file_splice_read(struct file *, loff_t *,
@@ -2452,6 +2453,31 @@ extern ssize_t iter_file_splice_write(struct pipe_inode_info *,
 		struct file *, loff_t *, size_t, unsigned int);
 extern ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe,
 		struct file *out, loff_t *, size_t len, unsigned int flags);
+#else
+static inline ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
+		struct pipe_inode_info *pipe, size_t len, unsigned int flags)
+{
+	return -EPERM;
+}
+
+static inline ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
+		struct pipe_inode_info *pipe, size_t len, unsigned int flags)
+{
+	return -EPERM;
+}
+
+static inline ssize_t iter_file_splice_write(struct pipe_inode_info *pipe,
+		struct file *out, loff_t *ppos, size_t len, unsigned int flags)
+{
+	return -EPERM;
+}
+
+static inline ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe,
+		struct file *out, loff_t *ppos, size_t len, unsigned int flags)
+{
+	return -EPERM;
+}
+#endif
 
 extern void
 file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping);
diff --git a/include/linux/splice.h b/include/linux/splice.h
index da2751d..34570d8 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -65,6 +65,7 @@ typedef int (splice_actor)(struct pipe_inode_info *, struct pipe_buffer *,
 typedef int (splice_direct_actor)(struct pipe_inode_info *,
 				  struct splice_desc *);
 
+#ifdef CONFIG_SYSCALL_SPLICE
 extern ssize_t splice_from_pipe(struct pipe_inode_info *, struct file *,
 				loff_t *, size_t, unsigned int,
 				splice_actor *);
@@ -74,13 +75,54 @@ extern ssize_t splice_to_pipe(struct pipe_inode_info *,
 			      struct splice_pipe_desc *);
 extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
 				      splice_direct_actor *);
+#else
+static inline ssize_t splice_from_pipe(struct pipe_inode_info *pipe, struct file *out,
+			 loff_t *ppos, size_t len, unsigned int flags,
+			 splice_actor *actor)
+{
+	return -EPERM;
+}
+
+static inline ssize_t __splice_from_pipe(struct pipe_inode_info *pipe, struct splice_desc *sd,
+			   splice_actor *actor)
+{
+	return -EPERM;
+}
+
+static inline ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
+		       struct splice_pipe_desc *spd)
+{
+	return -EPERM;
+}
+
+static inline ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
+			       splice_direct_actor *actor)
+{
+	return -EPERM;
+}
+#endif
 
 /*
  * for dynamic pipe sizing
  */
+#ifdef CONFIG_SYSCALL_SPLICE
 extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *);
 extern void splice_shrink_spd(struct splice_pipe_desc *);
 extern void spd_release_page(struct splice_pipe_desc *, unsigned int);
+#else
+static inline int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
+{
+	return -EPERM;
+}
+
+static inline void splice_shrink_spd(struct splice_pipe_desc *spd)
+{
+}
+
+static inline void spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
+{
+}
+#endif
 
 extern const struct pipe_buf_operations page_cache_pipe_buf_ops;
 #endif
-- 
2.1.0


* Re: [PATCH v4 3/7] fs/splice: support compiling out splice-family syscalls
  2014-11-24 23:01 ` [PATCH v4 3/7] fs/splice: support compiling out splice-family syscalls Pieter Smith
@ 2014-11-25  0:49   ` Josh Triplett
  0 siblings, 0 replies; 23+ messages in thread
From: Josh Triplett @ 2014-11-25  0:49 UTC (permalink / raw)
  To: Pieter Smith
  Cc: Alexander Duyck, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Bertrand Jacquin, Catalina Mocanu,
	Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

On Tue, Nov 25, 2014 at 12:01:02AM +0100, Pieter Smith wrote:
> Many embedded systems will not need the splice-family syscalls (splice,
> vmsplice, tee and sendfile). Omitting them saves space.  This adds a new EXPERT
> config option CONFIG_SYSCALL_SPLICE (default y) to support compiling them out.
> 
> The goal is to completely compile out fs/splice along with the syscalls. To
> achieve this, the remaining patch-set will deal with fs/splice exports. As far
> as possible, the impact on other device drivers will be minimized so as to
> reduce the overall maintenance burden of CONFIG_SYSCALL_SPLICE.
> 
> The use of exported functions will be solved by transparently mocking them out
> with static inlines. Uses of the exported pipe_buf_operations struct however
> require direct modification in fs/fuse and net/core. The next two patches will
> deal with this. A macro is defined that will assist with NULL'ing out callbacks
> when CONFIG_SYSCALL_SPLICE is undefined: __splice_p().

This message needs updating, since the patch series doesn't introduce or
use __splice_p anymore.

> Once all exports are solved, fs/splice can be compiled out.
> 
> The bloat benefit of this patch given a tinyconfig is:
> 
> add/remove: 0/16 grow/shrink: 2/5 up/down: 114/-3693 (-3579)
> function                                     old     new   delta
> splice_direct_to_actor                       348     416     +68
> splice_to_pipe                               371     417     +46
> splice_from_pipe_next                        107     106      -1
> fdput                                         11       -     -11
> signal_pending                                39      26     -13
> fdget                                         56      42     -14
> user_page_pipe_buf_ops                        20       -     -20
> user_page_pipe_buf_steal                      25       -     -25
> file_end_write                                58      29     -29
> file_start_write                              68      34     -34
> pipe_to_user                                  43       -     -43
> wakeup_pipe_readers                           54       -     -54
> do_splice_to                                  87       -     -87
> ipipe_prep.part                               92       -     -92
> opipe_prep.part                              119       -    -119
> sys_sendfile                                 122       -    -122
> sys_sendfile64                               126       -    -126
> sys_vmsplice                                 137       -    -137
> vmsplice_to_user                             205       -    -205
> sys_tee                                      491       -    -491
> do_sendfile                                  492       -    -492
> vmsplice_to_pipe                             558       -    -558
> sys_splice                                  1020       -   -1020
> 
> Signed-off-by: Pieter Smith <pieter@boesman.nl>
> ---
>  fs/splice.c     |  2 ++
>  init/Kconfig    | 10 ++++++++++
>  kernel/sys_ni.c |  8 ++++++++
>  3 files changed, 20 insertions(+)
> 
> diff --git a/fs/splice.c b/fs/splice.c
> index 44b201b..7c4c695 100644
> --- a/fs/splice.c
> +++ b/fs/splice.c
> @@ -1316,6 +1316,7 @@ long do_splice_direct(struct file *in, loff_t *ppos, struct file *out,
>  	return ret;
>  }
>  
> +#ifdef CONFIG_SYSCALL_SPLICE
>  static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe,
>  			       struct pipe_inode_info *opipe,
>  			       size_t len, unsigned int flags);
> @@ -2200,4 +2201,5 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
>  	return do_sendfile(out_fd, in_fd, NULL, count, 0);
>  }
>  #endif
> +#endif
>  
> diff --git a/init/Kconfig b/init/Kconfig
> index d811d5f..dec9819 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1571,6 +1571,16 @@ config NTP
>  	  system clock to an NTP server, you can disable this option to save
>  	  space.
>  
> +config SYSCALL_SPLICE
> +	bool "Enable splice/vmsplice/tee/sendfile syscalls" if EXPERT
> +	default y
> +	help
> +	  This option enables the splice, vmsplice, tee and sendfile syscalls. These
> +	  are used by applications to: move data between buffers and arbitrary file
> +	  descriptors; "copy" data between buffers; or copy data from userspace into
> +	  buffers. If building an embedded system where no applications use these
> +	  syscalls, you can disable this option to save space.
> +
>  config PCI_QUIRKS
>  	default y
>  	bool "Enable PCI quirk workarounds" if EXPERT
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index d2f5b00..25d5551 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -170,6 +170,14 @@ cond_syscall(sys_fstat);
>  cond_syscall(sys_stat);
>  cond_syscall(sys_uname);
>  cond_syscall(sys_olduname);
> +cond_syscall(sys_vmsplice);
> +cond_syscall(sys_splice);
> +cond_syscall(sys_tee);
> +cond_syscall(sys_sendfile);
> +cond_syscall(sys_sendfile64);
> +cond_syscall(compat_sys_vmsplice);
> +cond_syscall(compat_sys_sendfile);
> +cond_syscall(compat_sys_sendfile64);
>  
>  /* arch-specific weak syscall entries */
>  cond_syscall(sys_pciconfig_read);
> -- 
> 2.1.0
> 

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-24 23:00 [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile) Pieter Smith
                   ` (6 preceding siblings ...)
  2014-11-24 23:01 ` [PATCH v4 7/7] fs/splice: full support for " Pieter Smith
@ 2014-11-25  0:52 ` Josh Triplett
       [not found] ` <5474ABB6.3030400@infradead.org>
  8 siblings, 0 replies; 23+ messages in thread
From: Josh Triplett @ 2014-11-25  0:52 UTC (permalink / raw)
  To: Pieter Smith
  Cc: Alexander Duyck, Alexander Viro, Alexei Starovoitov,
	Andrew Morton, Bertrand Jacquin, Catalina Mocanu,
	Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, open list:ABI/API, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong, 蔡正龙

On Tue, Nov 25, 2014 at 12:00:59AM +0100, Pieter Smith wrote:
> REPO: https://github.com/smipi1/linux-tinification.git
> 
> BRANCH: tiny/config-syscall-splice
> 
> BACKGROUND: This patch-set forms part of the Linux Kernel Tinification effort (
>   https://tiny.wiki.kernel.org/).
> 
> GOAL: Support compiling out the splice family of syscalls (splice, vmsplice,
>   tee and sendfile) along with all supporting infrastructure if not needed.
>   Many embedded systems will not need the splice-family syscalls. Omitting them
>   saves space.
> 
> HISTORY:
>   PATCH v4:
>     - Drops __splice_p()
>     - Let nfsd fall back to non-splice support when splice is compiled out
>     - Style fixes
[...]
> RESULTS: A tinyconfig bloat-o-meter score for the entire patch-set:
> 
> add/remove: 0/41 grow/shrink: 5/7 up/down: 23/-8422 (-8399)

I replied to one patch with a minor nit in the commit message.  Other
than that, I don't see any obvious issues with this.

- Josh Triplett

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
       [not found] ` <5474ABB6.3030400@infradead.org>
@ 2014-11-25 17:13   ` David Miller
  2014-11-25 18:10     ` Paul E. McKenney
  2014-11-25 18:53     ` josh
  2014-11-25 22:08   ` josh
  1 sibling, 2 replies; 23+ messages in thread
From: David Miller @ 2014-11-25 17:13 UTC (permalink / raw)
  To: rdunlap
  Cc: pieter, josh, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, ebiederm, fabf, fuse-devel,
	geert, hughd, iulia.manda21, JBeulich, bfields, jlayton,
	linux-api, linux-fsdevel, linux-kernel, linux-nfs, mcgrof,
	mattst88, mgorman, mst, miklos, netdev, oleg, Paul.Durrant,
	paulmck, pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

From: Randy Dunlap <rdunlap@infradead.org>
Date: Tue, 25 Nov 2014 08:17:58 -0800

> Is the splice family of syscalls the only one that tiny has identified
> for optional building or can we expect similar treatment for other
> syscalls?
> 
> Why will many embedded systems not need these syscalls?  You know
> exactly what apps they run and you are positive that those apps do
> not use splice?

I think starting to compile out system calls is a very slippery
slope we should not begin the journey down.

This changes the forward facing interface to userspace.

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 17:13   ` David Miller
@ 2014-11-25 18:10     ` Paul E. McKenney
  2014-11-25 18:24       ` David Miller
  2014-11-25 18:53     ` josh
  1 sibling, 1 reply; 23+ messages in thread
From: Paul E. McKenney @ 2014-11-25 18:10 UTC (permalink / raw)
  To: David Miller
  Cc: rdunlap, pieter, josh, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, ebiederm, fabf, fuse-devel,
	geert, hughd, iulia.manda21, JBeulich, bfields, jlayton,
	linux-api, linux-fsdevel, linux-kernel, linux-nfs, mcgrof,
	mattst88, mgorman, mst, miklos, netdev, oleg, Paul.Durrant,
	pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

On Tue, Nov 25, 2014 at 12:13:05PM -0500, David Miller wrote:
> From: Randy Dunlap <rdunlap@infradead.org>
> Date: Tue, 25 Nov 2014 08:17:58 -0800
> 
> > Is the splice family of syscalls the only one that tiny has identified
> > for optional building or can we expect similar treatment for other
> > syscalls?
> > 
> > Why will many embedded systems not need these syscalls?  You know
> > exactly what apps they run and you are positive that those apps do
> > not use splice?
> 
> I think starting to compile out system calls is a very slippery
> slope we should not begin the journey down.
> 
> This changes the forward facing interface to userspace.

I certainly sympathize with this concern, given the importance of software
portability.  However, the tiny-hardware alternative appears to be some sort
of special-purpose embedded OS, which most definitely will suffer from
software compatibility issues.  I guess that the good news is that much
of the tiny hardware that used to be 8 or 16 bits is now 32 bits, which
means that it has at least some chance of running some form of Linux.  ;-)

							Thanx, Paul


* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 18:10     ` Paul E. McKenney
@ 2014-11-25 18:24       ` David Miller
  2014-11-25 18:58         ` Theodore Ts'o
  0 siblings, 1 reply; 23+ messages in thread
From: David Miller @ 2014-11-25 18:24 UTC (permalink / raw)
  To: paulmck
  Cc: rdunlap, pieter, josh, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, ebiederm, fabf, fuse-devel,
	geert, hughd, iulia.manda21, JBeulich, bfields, jlayton,
	linux-api, linux-fsdevel, linux-kernel, linux-nfs, mcgrof,
	mattst88, mgorman, mst, miklos, netdev, oleg, Paul.Durrant,
	pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Tue, 25 Nov 2014 10:10:32 -0800

> I certainly sympathize with this concern, given the importance of software
> portability.  However, the tiny-hardware alternative appears to be some sort
> of special-purpose embedded OS, which most definitely will suffer from
> software compatibility issues.  I guess that the good news is that much
> of the tiny hardware that used to be 8 or 16 bits is now 32 bits, which
> means that it has at least some chance of running some form of Linux.  ;-)

And then if some fundamental part of userland (glibc, klibc, etc.) finds
a useful way to use splice for a fundamental operation, we're back to
square one.

I simply do not agree with modifying the user facing interface, especially
one with decades of precedent.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 17:13   ` David Miller
  2014-11-25 18:10     ` Paul E. McKenney
@ 2014-11-25 18:53     ` josh
  2014-11-25 19:04       ` David Miller
  1 sibling, 1 reply; 23+ messages in thread
From: josh @ 2014-11-25 18:53 UTC (permalink / raw)
  To: David Miller
  Cc: rdunlap, pieter, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, ebiederm, fabf, fuse-devel,
	geert, hughd, iulia.manda21, JBeulich, bfields, jlayton,
	linux-api, linux-fsdevel, linux-kernel, linux-nfs, mcgrof,
	mattst88, mgorman, mst, miklos, netdev, oleg, Paul.Durrant,
	paulmck, pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

On Tue, Nov 25, 2014 at 12:13:05PM -0500, David Miller wrote:
> From: Randy Dunlap <rdunlap@infradead.org>
> Date: Tue, 25 Nov 2014 08:17:58 -0800
> 
> > Is the splice family of syscalls the only one that tiny has identified
> > for optional building or can we expect similar treatment for other
> > syscalls?
> > 
> > Why will many embedded systems not need these syscalls?  You know
> > exactly what apps they run and you are positive that those apps do
> > not use splice?
> 
> I think starting to compile out system calls is a very slippery
> slope we should not begin the journey down.
> 
> This changes the forward facing interface to userspace.

It's not a "slippery slope"; it's been our standard practice for ages.
We started down that road long, long ago, when we first introduced
Kconfig and optional/modular features.  /dev/* are user-facing
interfaces, yet you can compile them out or make them modular.  /sys/*
and /proc/* are user-facing interfaces, yet you can compile part or all
of them out.  Filesystem names passed to mount are user-facing
interfaces, yet you can compile them out.  (Not just things like ext4;
think FUSE or overlayfs, which some applications will build upon and
require.)  Some prctls are optional, new syscalls like BPF or inotify or
process_vm_{read,write}v are optional, hardware interfaces are optional,
control groups are optional, containers and namespaces are optional,
checkpoint/restart is optional, KVM is optional, kprobes are optional,
kmsg is optional, /dev/port is optional, ACL support is optional, USB
support (as used by libusb) is optional, sound interfaces are optional,
GPU interfaces are optional, even futexes are optional.

For every single one of those, userspace programs or libraries may
depend on that functionality, and summarily exit if it doesn't exist,
perhaps with a warning that you need to enable options in your kernel,
or perhaps with a simple "Function not implemented" or "No such file or
directory".
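
As a rough illustration of the failure mode described above, a userspace
program could probe for splice at startup.  This is a minimal sketch,
assuming a Linux libc that provides syscall() and SYS_splice; the helper
name splice_supported() is made up for the example and does not come from
the patch set.  The probe passes deliberately invalid descriptors, so a
kernel that implements the call rejects them (normally with EBADF), while
a kernel built without it fails with ENOSYS ("Function not implemented").

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch set): returns 1 if the running
 * kernel implements splice(2), 0 if the call fails with ENOSYS.
 */
static int splice_supported(void)
{
	/*
	 * Both descriptors are deliberately invalid and the length is
	 * non-zero, so an implemented syscall has to reject the request.
	 * A kernel without the syscall fails with ENOSYS instead.
	 */
	long ret = syscall(SYS_splice, -1L, NULL, -1L, NULL, 1L, 0L);

	if (ret == -1 && errno == ENOSYS)
		return 0;
	return 1;
}
```

A program that cannot work without splice would call this once at startup
and exit with strerror(ENOSYS) if it returns 0; one that can cope would
switch to a read()/write() path instead.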

Out of the entire list above and the many more where that came from,
what makes syscalls unique?  What's wildly different between
open("/dev/foo", ...) returning an error and sys_foo returning an error?
What makes syscalls so special out of the entire list above?  We're not
breaking the ability to run old userspace on a new kernel, which *must*
be supported, and that includes not just syscalls but all user-facing
interfaces; we don't break userspace.  But we've *never* guaranteed that
you can run old userspace on a new *allnoconfig* kernel.

All of these features will remain behind CONFIG_EXPERT, and all of them
warn that you can only use them if your userspace can cope.

I've actually been thinking of introducing a new CONFIG_ALL_SYSCALLS,
under which all the "enable support for foo syscall" can live, rather
than just piling all of them directly under CONFIG_EXPERT; that option
would then repeat in very clear terms the warning that if you disable
that option and then disable specific syscalls, you need to know exactly
what your target userspace uses.  That would group together this whole
family of options, and make it clearer what the implications are.

- Josh Triplett

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 18:24       ` David Miller
@ 2014-11-25 18:58         ` Theodore Ts'o
  2014-11-25 19:05           ` David Miller
  0 siblings, 1 reply; 23+ messages in thread
From: Theodore Ts'o @ 2014-11-25 18:58 UTC (permalink / raw)
  To: David Miller
  Cc: paulmck, rdunlap, pieter, josh, alexander.h.duyck, viro, ast,
	akpm, beber, catalina.mocanu, dborkman, edumazet, ebiederm, fabf,
	fuse-devel, geert, hughd, iulia.manda21, JBeulich, bfields,
	jlayton, linux-api, linux-fsdevel, linux-kernel, linux-nfs,
	mcgrof, mattst88, mgorman, mst, miklos, netdev, oleg,
	Paul.Durrant, pefoley2, tgraf, therbert, trond.myklebust,
	willemb, xiaoguangrong, zhenglong.cai

On Tue, Nov 25, 2014 at 01:24:45PM -0500, David Miller wrote:
> 
> And then if some fundamental part of userland (glibc, klibc, etc.) finds
> a useful way to use splice for a fundamental operation, we're back to
> square one.

I'll note that the applications for these super-tiny kernels are
places where it's not likely they would be using glibc at all; think
very tiny embedded systems.  The userspace tends to be highly
restricted for the same space reasons why there is an effort to make
the kernel as small as possible.

In these places, they are using Linux already, but they're using a 2.2
or 2.4 kernel because 3.0 is just too damned big.  So the goal is to
try to provide them an alternative which allows them to use a modern,
but stripped down kernel.  If glibc or klibc isn't going to work
without splice, then it's not going to work on a pre-2.6 kernel
either, so things are no worse for these systems.

After all, if we can get these systems to use a 3.x kernel w/o
splice, that's surely better than their using a 2.2 or 2.4 kernel w/o
the splice system, isn't it?

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 18:53     ` josh
@ 2014-11-25 19:04       ` David Miller
  2014-11-25 19:16         ` Eric W. Biederman
                           ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: David Miller @ 2014-11-25 19:04 UTC (permalink / raw)
  To: josh
  Cc: rdunlap, pieter, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, ebiederm, fabf, fuse-devel,
	geert, hughd, iulia.manda21, JBeulich, bfields, jlayton,
	linux-api, linux-fsdevel, linux-kernel, linux-nfs, mcgrof,
	mattst88, mgorman, mst, miklos, netdev, oleg, Paul.Durrant,
	paulmck, pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

From: josh@joshtriplett.org
Date: Tue, 25 Nov 2014 10:53:10 -0800

> It's not a "slippery slope"; it's been our standard practice for ages.

We've never put an entire class of generic system calls behind
a config option.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 18:58         ` Theodore Ts'o
@ 2014-11-25 19:05           ` David Miller
  0 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2014-11-25 19:05 UTC (permalink / raw)
  To: tytso
  Cc: paulmck, rdunlap, pieter, josh, alexander.h.duyck, viro, ast,
	akpm, beber, catalina.mocanu, dborkman, edumazet, ebiederm, fabf,
	fuse-devel, geert, hughd, iulia.manda21, JBeulich, bfields,
	jlayton, linux-api, linux-fsdevel, linux-kernel, linux-nfs,
	mcgrof, mattst88, mgorman, mst, miklos, netdev, oleg,
	Paul.Durrant, pefoley2, tgraf, therbert, trond.myklebust,
	willemb, xiaoguangrong, zhenglong.cai

From: Theodore Ts'o <tytso@mit.edu>
Date: Tue, 25 Nov 2014 13:58:06 -0500

> On Tue, Nov 25, 2014 at 01:24:45PM -0500, David Miller wrote:
>> 
>> And then if some fundamental part of userland (glibc, klibc, etc.) finds
>> a useful way to use splice for a fundamental operation, we're back to
>> square one.
> 
> I'll note that the applications for these super-tiny kernels are
> places where it's not likely they would be using glibc at all; think
> very tiny embedded systems.  The userspace tends to be highly
> restricted for the same space reasons why there is an effort to make
> the kernel as small as possible.

This is why I mentioned klibc, in order to avoid replies like yours;
it seems I have failed.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 19:04       ` David Miller
@ 2014-11-25 19:16         ` Eric W. Biederman
  2014-11-25 19:27           ` David Miller
  2014-11-25 20:11         ` Pieter Smith
  2014-11-26 12:19         ` One Thousand Gnomes
  2 siblings, 1 reply; 23+ messages in thread
From: Eric W. Biederman @ 2014-11-25 19:16 UTC (permalink / raw)
  To: David Miller
  Cc: josh, rdunlap, pieter, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, fabf, fuse-devel, geert,
	hughd, iulia.manda21, JBeulich, bfields, jlayton, linux-api,
	linux-fsdevel, linux-kernel, linux-nfs, mcgrof, mattst88,
	mgorman, mst, miklos, netdev, oleg, Paul.Durrant, paulmck,
	pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

David Miller <davem@davemloft.net> writes:

> From: josh@joshtriplett.org
> Date: Tue, 25 Nov 2014 10:53:10 -0800
>
>> It's not a "slippery slope"; it's been our standard practice for ages.
>
> We've never put an entire class of generic system calls behind
> a config option.

CONFIG_SYSVIPC has been in the kernel as long as I can remember.

I seem to remember a plan to remove that code once userspace had
finished migrating to more unixy interfaces to ipc.  But in 20 years
that migration does not seem to have finished, or even look
like it ever will.

But if we started a slippery slope it was long long ago.

Eric

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 19:16         ` Eric W. Biederman
@ 2014-11-25 19:27           ` David Miller
  2014-11-25 20:01             ` Eric W. Biederman
  0 siblings, 1 reply; 23+ messages in thread
From: David Miller @ 2014-11-25 19:27 UTC (permalink / raw)
  To: ebiederm
  Cc: josh, rdunlap, pieter, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, fabf, fuse-devel, geert,
	hughd, iulia.manda21, JBeulich, bfields, jlayton, linux-api,
	linux-fsdevel, linux-kernel, linux-nfs, mcgrof, mattst88,
	mgorman, mst, miklos, netdev, oleg, Paul.Durrant, paulmck,
	pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

From: ebiederm@xmission.com (Eric W. Biederman)
Date: Tue, 25 Nov 2014 13:16:44 -0600

> David Miller <davem@davemloft.net> writes:
> 
>> From: josh@joshtriplett.org
>> Date: Tue, 25 Nov 2014 10:53:10 -0800
>>
>>> It's not a "slippery slope"; it's been our standard practice for ages.
>>
>> We've never put an entire class of generic system calls behind
>> a config option.
> 
> CONFIG_SYSVIPC has been in the kernel as long as I can remember.
> 
> I seem to remember a plan to remove that code once userspace had
> finished migrating to more unixy interfaces to ipc.  But in 20 years
> that migration does not seem to have finished, or even look
> like it ever will.
> 
> But if we started a slippery slope it was long long ago.

Fair enough.

Would be amusing if these tiny systems have it enabled.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 19:27           ` David Miller
@ 2014-11-25 20:01             ` Eric W. Biederman
  0 siblings, 0 replies; 23+ messages in thread
From: Eric W. Biederman @ 2014-11-25 20:01 UTC (permalink / raw)
  To: David Miller
  Cc: josh, rdunlap, pieter, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, fabf, fuse-devel, geert,
	hughd, iulia.manda21, JBeulich, bfields, jlayton, linux-api,
	linux-fsdevel, linux-kernel, linux-nfs, mcgrof, mattst88,
	mgorman, mst, miklos, netdev, oleg, Paul.Durrant, paulmck,
	pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

David Miller <davem@davemloft.net> writes:

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 25 Nov 2014 13:16:44 -0600
>
>> David Miller <davem@davemloft.net> writes:
>> 
>>> From: josh@joshtriplett.org
>>> Date: Tue, 25 Nov 2014 10:53:10 -0800
>>>
>>>> It's not a "slippery slope"; it's been our standard practice for ages.
>>>
>>> We've never put an entire class of generic system calls behind
>>> a config option.
>> 
>> CONFIG_SYSVIPC has been in the kernel as long as I can remember.
>> 
>> I seem to remember a plan to remove that code once userspace had
>> finished migrating to more unixy interfaces to ipc.  But in 20 years
>> that migration does not seem to have finished, or even look
>> like it ever will.
>> 
>> But if we started a slippery slope it was long long ago.
>
> Fair enough.
>
> Would be amusing if these tiny systems have it enabled.

It would.

In practice when I was playing in that space I had a hard time
justifying CONFIG_NET and CONFIG_INET.  Despite writing a network
bootloader to use with kexec.

Eric

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 19:04       ` David Miller
  2014-11-25 19:16         ` Eric W. Biederman
@ 2014-11-25 20:11         ` Pieter Smith
  2014-11-26 12:19         ` One Thousand Gnomes
  2 siblings, 0 replies; 23+ messages in thread
From: Pieter Smith @ 2014-11-25 20:11 UTC (permalink / raw)
  To: David Miller
  Cc: josh, rdunlap, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, ebiederm, fabf, fuse-devel,
	geert, hughd, iulia.manda21, JBeulich, bfields, jlayton,
	linux-api, linux-fsdevel, linux-kernel, linux-nfs, mcgrof,
	mattst88, mgorman, mst, miklos, netdev, oleg, Paul.Durrant,
	paulmck, pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

On Tue, Nov 25, 2014 at 02:04:41PM -0500, David Miller wrote:
> From: josh@joshtriplett.org
> Date: Tue, 25 Nov 2014 10:53:10 -0800
> 
> > It's not a "slippery slope"; it's been our standard practice for ages.
> 
> We've never put an entire class of generic system calls behind
> a config option.

I would have loved to make them optional individually, but they all are
semantic variations of the same thing: Moving data between fd's without that
data passing through userspace. It therefore isn't surprising that these
syscalls share an underlying entanglement of code (which is where the bulk of
the space saving is to be had).

What a tiny product developer should be asking himself is: "Do I really need
to efficiently move data between file descriptors?"  If the answer is no, he can
disable CONFIG_SYSCALL_SPLICE to squeeze an extra 8KB out of his kernel.
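
To make that trade-off concrete, here is a minimal userspace sketch,
assuming Linux's sendfile(2) from <sys/sendfile.h>.  The helper names
copy_fd() and copy_rw() are invented for this example and are not part of
the patch set.  It tries the in-kernel copy first and falls back to a
read()/write() loop when the kernel reports ENOSYS (syscall not
implemented) or EINVAL (descriptor type not supported by sendfile).

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy up to len bytes through a userspace buffer; works on any kernel. */
static ssize_t copy_rw(int out_fd, int in_fd, size_t len)
{
	char buf[4096];
	size_t done = 0;

	while (done < len) {
		size_t want = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t n = read(in_fd, buf, want);

		if (n == 0)
			break;			/* end of input */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry the read */
			return -1;
		}
		/* write() may be partial, so loop until the chunk is out */
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(out_fd, buf + off, n - off);

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		done += n;
	}
	return (ssize_t)done;
}

/* Prefer the in-kernel copy; fall back to copy_rw() if it is unavailable. */
static ssize_t copy_fd(int out_fd, int in_fd, size_t len)
{
	size_t done = 0;

	while (done < len) {
		/* NULL offset: sendfile() uses and advances in_fd's position */
		ssize_t n = sendfile(out_fd, in_fd, NULL, len - done);

		if (n > 0) {
			done += n;
			continue;
		}
		if (n == 0)
			break;			/* end of input */
		if (errno == EINTR)
			continue;
		if (errno == ENOSYS || errno == EINVAL) {
			/* no usable sendfile: finish the rest via userspace */
			ssize_t r = copy_rw(out_fd, in_fd, len - done);

			return r < 0 ? -1 : (ssize_t)(done + r);
		}
		return -1;
	}
	return (ssize_t)done;
}
```

On a kernel built with CONFIG_SYSCALL_SPLICE disabled, the first
sendfile() call would be expected to fail with ENOSYS, so all data would
take the buffered path at the cost of an extra copy through userspace.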

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
       [not found] ` <5474ABB6.3030400@infradead.org>
  2014-11-25 17:13   ` David Miller
@ 2014-11-25 22:08   ` josh
  1 sibling, 0 replies; 23+ messages in thread
From: josh @ 2014-11-25 22:08 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Pieter Smith, Alexander Duyck, Alexander Viro,
	Alexei Starovoitov, Andrew Morton, Bertrand Jacquin,
	Catalina Mocanu, Daniel Borkmann, David S. Miller, Eric Dumazet,
	Eric W. Biederman, Fabian Frederick,
	open list:FUSE: FILESYSTEM...,
	Geert Uytterhoeven, Hugh Dickins, Iulia Manda, Jan Beulich,
	J. Bruce Fields, Jeff Layton, linux-api, linux-fsdevel,
	open list, open list:KERNEL NFSD, SUNR...,
	Luis R. Rodriguez, Matt Turner, Mel Gorman, Michael S. Tsirkin,
	Miklos Szeredi, open list:NETWORKING [GENERAL],
	Oleg Nesterov, Paul Durrant, Paul E. McKenney, Peter Foley,
	Thomas Graf, Tom Herbert, Trond Myklebust, Willem de Bruijn,
	Xiao Guangrong

[Resending this mail due to some email encoding brokenness that
prevented it from reaching LKML the first time; sorry to anyone who
receives two copies.]

On Tue, Nov 25, 2014 at 08:17:58AM -0800, Randy Dunlap wrote:
> On 11/24/2014 03:00 PM, Pieter Smith wrote:
> >REPO: https://github.com/smipi1/linux-tinification.git
> >
> >BRANCH: tiny/config-syscall-splice
> >
> >BACKGROUND: This patch-set forms part of the Linux Kernel Tinification effort (
> >   https://tiny.wiki.kernel.org/).
> >
> >GOAL: Support compiling out the splice family of syscalls (splice, vmsplice,
> >   tee and sendfile) along with all supporting infrastructure if not needed.
> >   Many embedded systems will not need the splice-family syscalls. Omitting them
> >   saves space.
> 
> Hi,
> 
> Is the splice family of syscalls the only one that tiny has identified
> for optional building or can we expect similar treatment for other
> syscalls?

Pretty much any system call that you could conceive of writing a
userspace without.

There's a partial project list at https://tiny.wiki.kernel.org/projects.

> Why will many embedded systems not need these syscalls?  You know
> exactly what apps they run and you are positive that those apps do
> not use splice?

Yes, precisely.  We're talking about embedded systems small enough that
you're booting with init=/your/app and don't even call fork(), where you
know exactly what code you're putting in and what libraries you use.
And they're almost certainly not running glibc.

> >RESULTS: A tinyconfig bloat-o-meter score for the entire patch-set:
> >
> >add/remove: 0/41 grow/shrink: 5/7 up/down: 23/-8422 (-8399)
> 
> The summary is that this patch saves around 8 KB of code space --
> is that correct?

Right.  For reference, we're talking about kernels where the *total*
size is a few hundred kB.

> How much storage space do embedded systems have nowadays?

For the embedded systems we're targeting for the tinification effort, in
a first pass: 512k-2M of storage (often for an *uncompressed* kernel, to
support execute-in-place), and 128k-512k of memory.  We've successfully
built useful kernels and userspaces for such environments, and we'd like
to go even smaller.

- Josh Triplett

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile)
  2014-11-25 19:04       ` David Miller
  2014-11-25 19:16         ` Eric W. Biederman
  2014-11-25 20:11         ` Pieter Smith
@ 2014-11-26 12:19         ` One Thousand Gnomes
  2 siblings, 0 replies; 23+ messages in thread
From: One Thousand Gnomes @ 2014-11-26 12:19 UTC (permalink / raw)
  To: David Miller
  Cc: josh, rdunlap, pieter, alexander.h.duyck, viro, ast, akpm, beber,
	catalina.mocanu, dborkman, edumazet, ebiederm, fabf, fuse-devel,
	geert, hughd, iulia.manda21, JBeulich, bfields, jlayton,
	linux-api, linux-fsdevel, linux-kernel, linux-nfs, mcgrof,
	mattst88, mgorman, mst, miklos, netdev, oleg, Paul.Durrant,
	paulmck, pefoley2, tgraf, therbert, trond.myklebust, willemb,
	xiaoguangrong, zhenglong.cai

On Tue, 25 Nov 2014 14:04:41 -0500 (EST)
David Miller <davem@davemloft.net> wrote:

> From: josh@joshtriplett.org
> Date: Tue, 25 Nov 2014 10:53:10 -0800
> 
> > It's not a "slippery slope"; it's been our standard practice for ages.
> 
> We've never put an entire class of generic system calls behind
> a config option.

Try running an original MCC Linux binary and C lib on a current kernel.

We've put *entire binary formats* behind a config option. We've put older
syscalls behind it, we've put sysfs behind it, sysctl behind it, the
older microcode interfaces behind it, ISA bus as a concept behind
options. VDSO, IPC, even 32bit support ... the list goes on and on.

I'd say those were far more generic on the whole than splice/sendfile.

Alan

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-11-26 12:24 UTC | newest]

Thread overview: 23+ messages
2014-11-24 23:00 [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile) Pieter Smith
2014-11-24 23:01 ` [PATCH v4 1/7] fs: move sendfile syscall into fs/splice Pieter Smith
2014-11-24 23:01 ` [PATCH v4 2/7] fs: moved kernel_write to fs/read_write Pieter Smith
2014-11-24 23:01 ` [PATCH v4 3/7] fs/splice: support compiling out splice-family syscalls Pieter Smith
2014-11-25  0:49   ` Josh Triplett
2014-11-24 23:01 ` [PATCH v4 4/7] fs/fuse: support compiling out splice Pieter Smith
2014-11-24 23:01 ` [PATCH v4 5/7] fs/nfsd: " Pieter Smith
2014-11-24 23:01 ` [PATCH v4 6/7] net/core: " Pieter Smith
2014-11-24 23:01 ` [PATCH v4 7/7] fs/splice: full support for " Pieter Smith
2014-11-25  0:52 ` [PATCH v4 0/7] kernel tinification: optionally compile out splice family of syscalls (splice, vmsplice, tee and sendfile) Josh Triplett
     [not found] ` <5474ABB6.3030400@infradead.org>
2014-11-25 17:13   ` David Miller
2014-11-25 18:10     ` Paul E. McKenney
2014-11-25 18:24       ` David Miller
2014-11-25 18:58         ` Theodore Ts'o
2014-11-25 19:05           ` David Miller
2014-11-25 18:53     ` josh
2014-11-25 19:04       ` David Miller
2014-11-25 19:16         ` Eric W. Biederman
2014-11-25 19:27           ` David Miller
2014-11-25 20:01             ` Eric W. Biederman
2014-11-25 20:11         ` Pieter Smith
2014-11-26 12:19         ` One Thousand Gnomes
2014-11-25 22:08   ` josh
