* [PATCH v3 0/4] vm: add a syscall to map a process memory into a pipe
From: Mike Rapoport @ 2017-11-22 19:36 UTC (permalink / raw)
  To: Andrew Morton, Alexander Viro
  Cc: linux-mm, linux-fsdevel, linux-kernel, linux-api, criu,
	Arnd Bergmann, Pavel Emelyanov, Michael Kerrisk, Thomas Gleixner,
	Josh Triplett, Jann Horn, Yossi Kuperman

From: Yossi Kuperman <yossiku@il.ibm.com>

Hi,

This patch set introduces a new process_vmsplice system call that combines
the functionality of process_vm_readv and vmsplice.

It maps the memory of another process into a pipe, similarly to what
vmsplice does for the caller's own address space.

Patch 2/4 ("vm: add a syscall to map a process memory into a pipe")
adds the new system call and describes it in detail.

The patch set is against the -mm tree.

v3: minor refactoring to reduce code duplication
v2: move this syscall under CONFIG_CROSS_MEMORY_ATTACH
    give correct flags to get_user_pages_remote()

Andrei Vagin (3):
  vm: add a syscall to map a process memory into a pipe
  x86: wire up the process_vmsplice syscall
  test: add a test for the process_vmsplice syscall

Mike Rapoport (1):
  fs/splice: introduce pages_to_pipe helper

 arch/x86/entry/syscalls/syscall_32.tbl             |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl             |   2 +
 fs/splice.c                                        | 262 +++++++++++++++++++--
 include/linux/compat.h                             |   3 +
 include/linux/syscalls.h                           |   4 +
 include/uapi/asm-generic/unistd.h                  |   5 +-
 kernel/sys_ni.c                                    |   2 +
 tools/testing/selftests/process_vmsplice/Makefile  |   5 +
 .../process_vmsplice/process_vmsplice_test.c       | 188 +++++++++++++++
 9 files changed, 450 insertions(+), 22 deletions(-)
 create mode 100644 tools/testing/selftests/process_vmsplice/Makefile
 create mode 100644 tools/testing/selftests/process_vmsplice/process_vmsplice_test.c

-- 
2.7.4


* [PATCH v3 1/4] fs/splice: introduce pages_to_pipe helper
From: Mike Rapoport @ 2017-11-22 19:36 UTC (permalink / raw)
  To: Andrew Morton, Alexander Viro
  Cc: linux-mm, linux-fsdevel, linux-kernel, linux-api, criu,
	Arnd Bergmann, Pavel Emelyanov, Michael Kerrisk, Thomas Gleixner,
	Josh Triplett, Jann Horn, Mike Rapoport

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 fs/splice.c | 57 ++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 39e2dc0..7f1ffc5 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1185,6 +1185,36 @@ static long do_splice(struct file *in, loff_t __user *off_in,
 	return -EINVAL;
 }
 
+static int pages_to_pipe(struct page **pages, struct pipe_inode_info *pipe,
+			 struct pipe_buffer *buf, size_t *total,
+			 ssize_t copied, size_t start)
+{
+	bool failed = false;
+	size_t len = 0;
+	int ret = 0;
+	int n;
+
+	for (n = 0; copied; n++, start = 0) {
+		int size = min_t(int, copied, PAGE_SIZE - start);
+		if (!failed) {
+			buf->page = pages[n];
+			buf->offset = start;
+			buf->len = size;
+			ret = add_to_pipe(pipe, buf);
+			if (unlikely(ret < 0))
+				failed = true;
+			else
+				len += ret;
+		} else {
+			put_page(pages[n]);
+		}
+		copied -= size;
+	}
+
+	*total += len;
+	return failed ? ret : len;
+}
+
 static int iter_to_pipe(struct iov_iter *from,
 			struct pipe_inode_info *pipe,
 			unsigned flags)
@@ -1195,13 +1225,11 @@ static int iter_to_pipe(struct iov_iter *from,
 	};
 	size_t total = 0;
 	int ret = 0;
-	bool failed = false;
 
-	while (iov_iter_count(from) && !failed) {
+	while (iov_iter_count(from)) {
 		struct page *pages[16];
 		ssize_t copied;
 		size_t start;
-		int n;
 
 		copied = iov_iter_get_pages(from, pages, ~0UL, 16, &start);
 		if (copied <= 0) {
@@ -1209,24 +1237,11 @@ static int iter_to_pipe(struct iov_iter *from,
 			break;
 		}
 
-		for (n = 0; copied; n++, start = 0) {
-			int size = min_t(int, copied, PAGE_SIZE - start);
-			if (!failed) {
-				buf.page = pages[n];
-				buf.offset = start;
-				buf.len = size;
-				ret = add_to_pipe(pipe, &buf);
-				if (unlikely(ret < 0)) {
-					failed = true;
-				} else {
-					iov_iter_advance(from, ret);
-					total += ret;
-				}
-			} else {
-				put_page(pages[n]);
-			}
-			copied -= size;
-		}
+		ret = pages_to_pipe(pages, pipe, &buf, &total, copied, start);
+		if (unlikely(ret < 0))
+			break;
+
+		iov_iter_advance(from, ret);
 	}
 	return total ? total : ret;
 }
-- 
2.7.4
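
Patch 1/4 carries no description, so the following is only an illustration of
what the loop in pages_to_pipe() appears to do: it spreads a byte count over
consecutive pages, where only the first page may start at a non-zero offset.
This is a user-space model, not kernel code. The names struct chunk and
split_into_page_chunks() are hypothetical, and page_size stands in for
PAGE_SIZE.

```c
#include <stddef.h>

/* One pipe buffer's worth of data: which page, where it starts, how long. */
struct chunk {
    size_t page;
    size_t offset;
    size_t len;
};

/*
 * Split `copied` bytes that begin `start` bytes into the first page into
 * per-page chunks, mirroring the size = min(copied, PAGE_SIZE - start)
 * step of the helper. Returns the number of chunks written to `out`.
 */
static size_t split_into_page_chunks(size_t start, size_t copied,
                                     size_t page_size,
                                     struct chunk *out, size_t max)
{
    size_t n = 0;

    for (; copied && n < max; n++, start = 0) {
        size_t room = page_size - start;
        size_t size = copied < room ? copied : room;

        out[n] = (struct chunk){ .page = n, .offset = start, .len = size };
        copied -= size;
    }
    return n;
}
```

In the kernel helper each such chunk becomes one add_to_pipe() call; once a
call fails, the remaining pages are released with put_page() instead.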


* [PATCH v3 2/4] vm: add a syscall to map a process memory into a pipe
From: Mike Rapoport @ 2017-11-22 19:36 UTC (permalink / raw)
  To: Andrew Morton, Alexander Viro
  Cc: linux-mm, linux-fsdevel, linux-kernel, linux-api, criu,
	Arnd Bergmann, Pavel Emelyanov, Michael Kerrisk, Thomas Gleixner,
	Josh Triplett, Jann Horn, Andrei Vagin, Mike Rapoport

From: Andrei Vagin <avagin@virtuozzo.com>

This is a hybrid of process_vm_readv() and vmsplice().

vmsplice() can map memory from the current address space into a pipe.
process_vm_readv() can read the memory of another process.

The new system call can map the memory of another process into a pipe.

ssize_t process_vmsplice(pid_t pid, int fd, const struct iovec *iov,
                        unsigned long nr_segs, unsigned int flags)

All arguments are identical to those of vmsplice(), except for pid, which
specifies the target process.
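
For illustration, a user-space caller could reach the proposed syscall
through syscall(2), since no libc binding exists for it. This is a minimal
sketch under assumptions: the wrapper is hypothetical, and the syscall number
is passed in by the caller because it depends on the architecture and on the
patched kernel's syscall table. On a kernel without this series the number
may be unassigned or may belong to a different syscall.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical wrapper for the proposed process_vmsplice() syscall.
 * `nr` is the syscall number taken from the patched kernel's headers.
 * Returns the number of bytes mapped into the pipe, or -1 with errno set.
 */
static ssize_t process_vmsplice(long nr, pid_t pid, int fd,
                                const struct iovec *iov,
                                unsigned long nr_segs, unsigned int flags)
{
    return syscall(nr, pid, fd, iov, nr_segs, flags);
}
```

Here fd would be the write end of a pipe and iov would describe address
ranges in the target process rather than in the caller.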

Currently, if we want to dump a process's memory to a file or to a socket,
we can use process_vm_readv() + write(), but this is slow because the data
is copied into a temporary user-space buffer.
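
The first approach can be sketched as follows. This is an illustrative
helper, not code from the series: dump_range_copy() and its 4096-byte bounce
buffer are assumptions chosen for the example. It shows the two copies per
chunk that make this path slow.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Copy `len` bytes at `remote_addr` in process `pid` to `out_fd`,
 * bouncing every chunk through a user-space buffer.
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t dump_range_copy(pid_t pid, void *remote_addr, size_t len,
                               int out_fd)
{
    char bounce[4096];
    size_t done = 0;

    while (done < len) {
        size_t chunk = len - done < sizeof(bounce) ? len - done
                                                   : sizeof(bounce);
        struct iovec local = { .iov_base = bounce, .iov_len = chunk };
        struct iovec remote = { .iov_base = (char *)remote_addr + done,
                                .iov_len = chunk };

        /* Copy 1: target memory -> our bounce buffer. */
        ssize_t got = process_vm_readv(pid, &local, 1, &remote, 1, 0);
        if (got < 0)
            return -1;
        if (got == 0)
            break;

        /* Copy 2: bounce buffer -> file or socket. */
        for (ssize_t off = 0; off < got; ) {
            ssize_t put = write(out_fd, bounce + off, got - off);
            if (put < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += put;
        }
        done += got;
    }
    return (ssize_t)done;
}
```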

A second way is to use vmsplice() + splice(). It is more efficient
because the data is not copied into a temporary buffer, but it has another
problem: vmsplice() works on the current address space, so it can be
used only if we inject our code into the target process.

The second way suffers from a few other issues:
* the process has to be stopped to run the parasite code
* the number of pipes is limited, so it may be impossible to dump all
  memory in one iteration, and we have to stop the process and inject our
  code several times.
* pages in pipes are unreclaimable, so it isn't good to hold a lot of
  memory in pipes.

The new syscall makes it possible to use the second way without injecting
any code into the target process.
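
For comparison, the second way looks roughly like this when run inside the
process that owns the memory. This is a sketch under assumptions:
dump_range_splice() is a hypothetical helper, and it uses only the existing
vmsplice() and splice() calls. With the proposed syscall, the vmsplice() step
would instead be issued from the dumping process against the target's pid.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send `len` bytes at `addr` in the calling process to `out_fd` through
 * the pipe `pipefd`, without a user-space bounce buffer.
 * Returns the number of bytes spliced out, or -1 on error.
 */
static ssize_t dump_range_splice(void *addr, size_t len,
                                 int pipefd[2], int out_fd)
{
    size_t done = 0;

    while (done < len) {
        struct iovec iov = { .iov_base = (char *)addr + done,
                             .iov_len = len - done };

        /* Map our pages into the pipe; this may accept less than asked. */
        ssize_t in = vmsplice(pipefd[1], &iov, 1, 0);
        if (in < 0)
            return -1;
        if (in == 0)
            break;

        /* Drain the pipe into the destination descriptor. */
        for (ssize_t left = in; left > 0; ) {
            ssize_t out = splice(pipefd[0], NULL, out_fd, NULL, left, 0);
            if (out <= 0)
                return -1;
            left -= out;
        }
        done += in;
    }
    return (ssize_t)done;
}
```

The pipe is drained after every vmsplice() call, which keeps the amount of
memory pinned in the pipe small.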

My experiments show that process_vmsplice() + splice() runs two times
faster than process_vm_readv() + write().

It is particularly useful in the pre-dump stage. In this stage we enable a
memory tracker and then dump the process's memory while the process
continues to run. On the first iteration we dump all memory; on each
later iteration we dump only the memory modified since the previous
iteration. After a few pre-dump operations, the process is stopped and
dumped for the final time. The pre-dump operations significantly decrease
the process's downtime when it is migrated to another host.
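
One step of such a pre-dump iteration is turning the tracker's result into
the iovec array that the syscall takes. The sketch below assumes the tracker
yields one dirty flag per page. How that map is obtained is outside this
description, and dirty_pages_to_iovecs() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Turn a per-page dirty map for the range starting at `base` into iovecs,
 * merging runs of adjacent dirty pages into a single segment.
 * Returns the number of iovecs written (at most `max_iov`).
 */
static size_t dirty_pages_to_iovecs(uintptr_t base, size_t page_size,
                                    const bool *dirty, size_t nr_pages,
                                    struct iovec *iov, size_t max_iov)
{
    size_t n = 0;

    for (size_t i = 0; i < nr_pages; i++) {
        if (!dirty[i])
            continue;

        uintptr_t addr = base + i * page_size;

        /* Extend the previous segment if this page directly follows it. */
        if (n > 0 &&
            (uintptr_t)iov[n - 1].iov_base + iov[n - 1].iov_len == addr) {
            iov[n - 1].iov_len += page_size;
            continue;
        }
        if (n == max_iov)
            break;
        iov[n].iov_base = (void *)addr;
        iov[n].iov_len = page_size;
        n++;
    }
    return n;
}
```

Fewer, longer segments mean fewer entries to validate per call, and the
segment count has to stay within the nr_segs limit (UIO_MAXIOV in the patch).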

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 fs/splice.c                       | 205 ++++++++++++++++++++++++++++++++++++++
 include/linux/compat.h            |   3 +
 include/linux/syscalls.h          |   4 +
 include/uapi/asm-generic/unistd.h |   5 +-
 kernel/sys_ni.c                   |   2 +
 5 files changed, 218 insertions(+), 1 deletion(-)

diff --git a/fs/splice.c b/fs/splice.c
index 7f1ffc5..72397d2 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -34,6 +34,7 @@
 #include <linux/socket.h>
 #include <linux/compat.h>
 #include <linux/sched/signal.h>
+#include <linux/sched/mm.h>
 
 #include "internal.h"
 
@@ -1373,6 +1374,210 @@ SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, iov,
 	return error;
 }
 
+#ifdef CONFIG_CROSS_MEMORY_ATTACH
+/*
+ * Map pages from a specified task into a pipe
+ */
+static int remote_single_vec_to_pipe(struct task_struct *task,
+			struct mm_struct *mm,
+			const struct iovec *rvec,
+			struct pipe_inode_info *pipe,
+			unsigned int flags,
+			size_t *total)
+{
+	struct pipe_buffer buf = {
+		.ops = &user_page_pipe_buf_ops,
+		.flags = flags
+	};
+	unsigned long addr = (unsigned long) rvec->iov_base;
+	unsigned long pa = addr & PAGE_MASK;
+	unsigned long start_offset = addr - pa;
+	unsigned long nr_pages;
+	ssize_t len = rvec->iov_len;
+	struct page *process_pages[16];
+	bool failed = false;
+	int ret = 0;
+
+	nr_pages = (addr + len - 1) / PAGE_SIZE - addr / PAGE_SIZE + 1;
+	while (nr_pages) {
+		long pages = min(nr_pages, 16UL);
+		int locked = 1;
+		ssize_t copied;
+
+		/*
+		 * Get the pages we're interested in.  We must
+		 * access remotely because task/mm might not be
+		 * current/current->mm
+		 */
+		down_read(&mm->mmap_sem);
+		pages = get_user_pages_remote(task, mm, pa, pages, 0,
+					      process_pages, NULL, &locked);
+		if (locked)
+			up_read(&mm->mmap_sem);
+		if (pages <= 0) {
+			failed = true;
+			ret = -EFAULT;
+			break;
+		}
+
+		copied = pages * PAGE_SIZE - start_offset;
+		if (copied > len)
+			copied = len;
+		len -= copied;
+
+		ret = pages_to_pipe(process_pages, pipe, &buf, total, copied,
+				    start_offset);
+		if (unlikely(ret < 0))
+			break;
+
+		start_offset = 0;
+		nr_pages -= pages;
+		pa += pages * PAGE_SIZE;
+	}
+	return ret < 0 ? ret : 0;
+}
+
+static ssize_t remote_iovec_to_pipe(struct task_struct *task,
+			struct mm_struct *mm,
+			const struct iovec *rvec,
+			unsigned long riovcnt,
+			struct pipe_inode_info *pipe,
+			unsigned int flags)
+{
+	size_t total = 0;
+	int ret = 0, i;
+
+	for (i = 0; i < riovcnt; i++) {
+		/* Work out address and page range required */
+		if (rvec[i].iov_len == 0)
+			continue;
+
+		ret = remote_single_vec_to_pipe(
+				task, mm, &rvec[i], pipe, flags, &total);
+		if (ret < 0)
+			break;
+	}
+	return total ? total : ret;
+}
+
+static long process_vmsplice_to_pipe(struct task_struct *task,
+				struct mm_struct *mm, struct file *file,
+				const struct iovec __user *uiov,
+				unsigned long nr_segs, unsigned int flags)
+{
+	struct pipe_inode_info *pipe;
+	struct iovec iovstack[UIO_FASTIOV];
+	struct iovec *iov = iovstack;
+	unsigned int buf_flag = 0;
+	long ret;
+
+	if (flags & SPLICE_F_GIFT)
+		buf_flag = PIPE_BUF_FLAG_GIFT;
+
+	pipe = get_pipe_info(file);
+	if (!pipe)
+		return -EBADF;
+
+	ret = rw_copy_check_uvector(CHECK_IOVEC_ONLY, uiov, nr_segs,
+					UIO_FASTIOV, iovstack, &iov);
+	if (ret < 0)
+		return ret;
+
+	pipe_lock(pipe);
+	ret = wait_for_space(pipe, flags);
+	if (!ret)
+		ret = remote_iovec_to_pipe(task, mm, iov,
+						nr_segs, pipe, buf_flag);
+	pipe_unlock(pipe);
+	if (ret > 0)
+		wakeup_pipe_readers(pipe);
+
+	if (iov != iovstack)
+		kfree(iov);
+	return ret;
+}
+
+/* process_vmsplice splices a process address range into a pipe. */
+SYSCALL_DEFINE5(process_vmsplice, int, pid, int, fd,
+		const struct iovec __user *, iov,
+		unsigned long, nr_segs, unsigned int, flags)
+{
+	struct task_struct *task;
+	struct mm_struct *mm;
+	struct fd f;
+	long ret;
+
+	if (unlikely(flags & ~SPLICE_F_ALL))
+		return -EINVAL;
+	if (unlikely(nr_segs > UIO_MAXIOV))
+		return -EINVAL;
+	else if (unlikely(!nr_segs))
+		return 0;
+
+	f = fdget(fd);
+	if (!f.file)
+		return -EBADF;
+
+	/* Get process information */
+	task = find_get_task_by_vpid(pid);
+	if (!task) {
+		ret = -ESRCH;
+		goto out_fput;
+	}
+
+	mm = mm_access(task, PTRACE_MODE_ATTACH_REALCREDS);
+	if (!mm || IS_ERR(mm)) {
+		ret = IS_ERR(mm) ? PTR_ERR(mm) : -ESRCH;
+		/*
+		 * Explicitly map EACCES to EPERM as EPERM is a more
+		 * appropriate error code for process_vm_readv/writev
+		 */
+		if (ret == -EACCES)
+			ret = -EPERM;
+		goto put_task_struct;
+	}
+
+	ret = -EBADF;
+	if (f.file->f_mode & FMODE_WRITE)
+		ret = process_vmsplice_to_pipe(task, mm, f.file,
+						iov, nr_segs, flags);
+	mmput(mm);
+
+put_task_struct:
+	put_task_struct(task);
+
+out_fput:
+	fdput(f);
+
+	return ret;
+}
+
+#ifdef CONFIG_COMPAT
+COMPAT_SYSCALL_DEFINE5(process_vmsplice, pid_t, pid, int, fd,
+			const struct compat_iovec __user *, iov32,
+			unsigned int, nr_segs, unsigned int, flags)
+{
+	struct iovec __user *iov;
+	unsigned int i;
+
+	if (nr_segs > UIO_MAXIOV)
+		return -EINVAL;
+
+	iov = compat_alloc_user_space(nr_segs * sizeof(struct iovec));
+	for (i = 0; i < nr_segs; i++) {
+		struct compat_iovec v;
+
+		if (get_user(v.iov_base, &iov32[i].iov_base) ||
+		    get_user(v.iov_len, &iov32[i].iov_len) ||
+		    put_user(compat_ptr(v.iov_base), &iov[i].iov_base) ||
+		    put_user(v.iov_len, &iov[i].iov_len))
+			return -EFAULT;
+	}
+	return sys_process_vmsplice(pid, fd, iov, nr_segs, flags);
+}
+#endif
+#endif /* CONFIG_CROSS_MEMORY_ATTACH */
+
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE4(vmsplice, int, fd, const struct compat_iovec __user *, iov32,
 		    unsigned int, nr_segs, unsigned int, flags)
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 0fc3640..11b3753 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -550,6 +550,9 @@ asmlinkage long compat_sys_getdents(unsigned int fd,
 				    unsigned int count);
 asmlinkage long compat_sys_vmsplice(int fd, const struct compat_iovec __user *,
 				    unsigned int nr_segs, unsigned int flags);
+asmlinkage long compat_sys_process_vmsplice(pid_t pid, int fd,
+				    const struct compat_iovec __user *,
+				    unsigned int nr_segs, unsigned int flags);
 asmlinkage long compat_sys_open(const char __user *filename, int flags,
 				umode_t mode);
 asmlinkage long compat_sys_openat(int dfd, const char __user *filename,
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a78186d..4ba9333 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -941,4 +941,8 @@ asmlinkage long sys_pkey_free(int pkey);
 asmlinkage long sys_statx(int dfd, const char __user *path, unsigned flags,
 			  unsigned mask, struct statx __user *buffer);
 
+asmlinkage long sys_process_vmsplice(pid_t pid,
+			int fd, const struct iovec __user *iov,
+			unsigned long nr_segs, unsigned int flags);
+
 #endif
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 8b87de0..37f1832 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -732,9 +732,12 @@ __SYSCALL(__NR_pkey_alloc,    sys_pkey_alloc)
 __SYSCALL(__NR_pkey_free,     sys_pkey_free)
 #define __NR_statx 291
 __SYSCALL(__NR_statx,     sys_statx)
+#define __NR_process_vmsplice 292
+__SC_COMP(__NR_process_vmsplice, sys_process_vmsplice,
+	  compat_sys_process_vmsplice)
 
 #undef __NR_syscalls
-#define __NR_syscalls 292
+#define __NR_syscalls 293
 
 /*
  * All syscalls below here should go away really,
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index b518976..a939fbb 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -158,8 +158,10 @@ cond_syscall(sys_sysfs);
 cond_syscall(sys_syslog);
 cond_syscall(sys_process_vm_readv);
 cond_syscall(sys_process_vm_writev);
+cond_syscall(sys_process_vmsplice);
 cond_syscall(compat_sys_process_vm_readv);
 cond_syscall(compat_sys_process_vm_writev);
+cond_syscall(compat_sys_process_vmsplice);
 cond_syscall(sys_uselib);
 cond_syscall(sys_fadvise64);
 cond_syscall(sys_fadvise64_64);
-- 
2.7.4
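
As an aid to reading remote_single_vec_to_pipe(), the page-count and batching
arithmetic can be modelled in user space. This is a sketch only:
walk_remote_range() is a hypothetical name, page_size stands in for
PAGE_SIZE, and the model assumes that every page lookup succeeds in full and
that len is non-zero (the caller in the patch skips empty segments).

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_PAGES 16  /* matches the process_pages[16] array in the patch */

/*
 * Walk [addr, addr + len) in batches of up to BATCH_PAGES pages and return
 * the total number of bytes that would be handed to pages_to_pipe().
 * *batches receives the number of loop iterations.
 */
static size_t walk_remote_range(uintptr_t addr, size_t len, size_t page_size,
                                size_t *batches)
{
    uintptr_t pa = addr & ~(uintptr_t)(page_size - 1);
    size_t start_offset = addr - pa;
    size_t nr_pages = (addr + len - 1) / page_size - addr / page_size + 1;
    size_t total = 0;

    *batches = 0;
    while (nr_pages) {
        size_t pages = nr_pages < BATCH_PAGES ? nr_pages : BATCH_PAGES;
        size_t copied = pages * page_size - start_offset;

        /* The last batch usually ends before the end of its last page. */
        if (copied > len)
            copied = len;
        len -= copied;
        total += copied;

        start_offset = 0;   /* only the first page starts mid-page */
        nr_pages -= pages;
        pa += pages * page_size;
        (*batches)++;
    }
    return total;
}
```

A 32-byte range that straddles a page boundary therefore needs two pages but
still contributes only 32 bytes.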

@@ -158,8 +158,10 @@ cond_syscall(sys_sysfs);
 cond_syscall(sys_syslog);
 cond_syscall(sys_process_vm_readv);
 cond_syscall(sys_process_vm_writev);
+cond_syscall(sys_process_vmsplice);
 cond_syscall(compat_sys_process_vm_readv);
 cond_syscall(compat_sys_process_vm_writev);
+cond_syscall(compat_sys_process_vmsplice);
 cond_syscall(sys_uselib);
 cond_syscall(sys_fadvise64);
 cond_syscall(sys_fadvise64_64);
-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 3/4] x86: wire up the process_vmsplice syscall
  2017-11-22 19:36 ` Mike Rapoport
@ 2017-11-22 19:36   ` Mike Rapoport
  -1 siblings, 0 replies; 20+ messages in thread
From: Mike Rapoport @ 2017-11-22 19:36 UTC (permalink / raw)
  To: Andrew Morton, Alexander Viro
  Cc: linux-mm, linux-fsdevel, linux-kernel, linux-api, criu,
	Arnd Bergmann, Pavel Emelyanov, Michael Kerrisk, Thomas Gleixner,
	Josh Triplett, Jann Horn, Andrei Vagin

From: Andrei Vagin <avagin@openvz.org>

Signed-off-by: Andrei Vagin <avagin@openvz.org>
---
 arch/x86/entry/syscalls/syscall_32.tbl | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl | 2 ++
 2 files changed, 3 insertions(+)
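
A note for anyone calling this from userspace before the libc headers know
about it: the number differs per ABI, and the selftest in 4/4 falls back to
333, which is the x86_64 entry only. Below is a minimal sketch of the mapping,
taken from the table entries in this patch. The enum, the helper name and
X32_SYSCALL_BIT are illustrative names, not part of the series (the kernel's
own name for the bit is __X32_SYSCALL_BIT), and the numbers are only what this
series proposes.

```c
/* ABIs covered by the two x86 syscall tables touched in this patch. */
enum x86_abi { ABI_I386, ABI_X86_64, ABI_X32 };

/* x32 callers OR this bit into the table number when invoking syscall(2). */
#define X32_SYSCALL_BIT 0x40000000L

/*
 * Return the process_vmsplice number proposed by this patch for an ABI.
 * The values are copied from the tables in this series only; they are
 * an assumption for any kernel that does not carry these patches.
 */
static long process_vmsplice_nr(enum x86_abi abi)
{
	switch (abi) {
	case ABI_I386:
		return 385;			/* syscall_32.tbl */
	case ABI_X86_64:
		return 333;			/* syscall_64.tbl, "64" entry */
	case ABI_X32:
		return 548 | X32_SYSCALL_BIT;	/* syscall_64.tbl, "x32" entry */
	}
	return -1;				/* unknown ABI */
}
```

A caller would pass the result to syscall(2) together with pid, fd, iov,
nr_segs and flags, as the wrapper in the selftest does.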

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 448ac21..dc64bf5 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -391,3 +391,4 @@
 382	i386	pkey_free		sys_pkey_free
 383	i386	statx			sys_statx
 384	i386	arch_prctl		sys_arch_prctl			compat_sys_arch_prctl
+385	i386	process_vmsplice	sys_process_vmsplice		compat_sys_process_vmsplice
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 5aef183..d2f916c 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -339,6 +339,7 @@
 330	common	pkey_alloc		sys_pkey_alloc
 331	common	pkey_free		sys_pkey_free
 332	common	statx			sys_statx
+333	64	process_vmsplice	sys_process_vmsplice
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
@@ -380,3 +381,4 @@
 545	x32	execveat		compat_sys_execveat/ptregs
 546	x32	preadv2			compat_sys_preadv64v2
 547	x32	pwritev2		compat_sys_pwritev64v2
+548	x32	process_vmsplice	compat_sys_process_vmsplice
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v3 4/4] test: add a test for the process_vmsplice syscall
  2017-11-22 19:36 ` Mike Rapoport
@ 2017-11-22 19:36   ` Mike Rapoport
  -1 siblings, 0 replies; 20+ messages in thread
From: Mike Rapoport @ 2017-11-22 19:36 UTC (permalink / raw)
  To: Andrew Morton, Alexander Viro
  Cc: linux-mm, linux-fsdevel, linux-kernel, linux-api, criu,
	Arnd Bergmann, Pavel Emelyanov, Michael Kerrisk, Thomas Gleixner,
	Josh Triplett, Jann Horn, Andrei Vagin

From: Andrei Vagin <avagin@openvz.org>

This test checks that process_vmsplice() can splice pages from a remote
process and that it returns EFAULT when asked to splice pages from an
inaccessible address.

Signed-off-by: Andrei Vagin <avagin@openvz.org>
---
 tools/testing/selftests/process_vmsplice/Makefile  |   5 +
 .../process_vmsplice/process_vmsplice_test.c       | 188 +++++++++++++++++++++
 2 files changed, 193 insertions(+)
 create mode 100644 tools/testing/selftests/process_vmsplice/Makefile
 create mode 100644 tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
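
The last case in the test expects exactly 4096 * 15 + 1 bytes from a
non-blocking splice. The arithmetic behind that number can be sketched as
follows, assuming every segment starts on a page boundary, every touched page
takes one pipe buffer, and the pipe has its default 16 buffers. The helper
name and PAGE_SZ are illustrative only and do not appear in the test.

```c
#include <stddef.h>

#define PAGE_SZ 4096UL

/*
 * Bytes accepted when page-aligned segments are spliced into a pipe with
 * nr_bufs free buffers, under the assumption that each touched page
 * consumes one buffer and splicing stops once the buffers run out.
 */
static size_t expected_spliced(const size_t *lens, size_t nr, size_t nr_bufs)
{
	size_t total = 0;

	for (size_t i = 0; i < nr && nr_bufs > 0; i++) {
		size_t left = lens[i];

		while (left > 0 && nr_bufs > 0) {
			size_t chunk = left < PAGE_SZ ? left : PAGE_SZ;

			total += chunk;
			left -= chunk;
			nr_bufs--;	/* one pipe buffer per page */
		}
	}
	return total;
}
```

With segment lengths { 1, 4096 * 100 } and 16 buffers, the one-byte segment
takes the first buffer and the remaining 15 buffers each take a full page,
which gives the 4096 * 15 + 1 the test compares against.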

diff --git a/tools/testing/selftests/process_vmsplice/Makefile b/tools/testing/selftests/process_vmsplice/Makefile
new file mode 100644
index 0000000..246d5a7
--- /dev/null
+++ b/tools/testing/selftests/process_vmsplice/Makefile
@@ -0,0 +1,5 @@
+CFLAGS += -I../../../../usr/include/
+
+TEST_GEN_PROGS := process_vmsplice_test
+
+include ../lib.mk
diff --git a/tools/testing/selftests/process_vmsplice/process_vmsplice_test.c b/tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
new file mode 100644
index 0000000..8abf59b
--- /dev/null
+++ b/tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
@@ -0,0 +1,188 @@
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+#include <fcntl.h>
+#include <sys/uio.h>
+#include <errno.h>
+#include <signal.h>
+#include <sys/prctl.h>
+#include <sys/wait.h>
+
+#include "../kselftest.h"
+
+#ifndef __NR_process_vmsplice
+#define __NR_process_vmsplice 333
+#endif
+
+#define pr_err(fmt, ...) \
+		({ \
+			fprintf(stderr, "%s:%d:" fmt, \
+				__func__, __LINE__, ##__VA_ARGS__); \
+			KSFT_FAIL; \
+		})
+#define pr_perror(fmt, ...) pr_err(fmt ": %m\n", ##__VA_ARGS__)
+#define fail(fmt, ...) pr_err("FAIL:" fmt, ##__VA_ARGS__)
+
+static ssize_t process_vmsplice(pid_t pid, int fd, const struct iovec *iov,
+			unsigned long nr_segs, unsigned int flags)
+{
+	return syscall(__NR_process_vmsplice, pid, fd, iov, nr_segs, flags);
+
+}
+
+#define MEM_SIZE (4096 * 100)
+#define MEM_WRONLY_SIZE (4096 * 10)
+
+int main(int argc, char **argv)
+{
+	char *addr, *addr_wronly;
+	int p[2];
+	struct iovec iov[2];
+	char buf[4096];
+	int status, ret;
+	pid_t pid;
+
+	ksft_print_header();
+
+	addr = mmap(0, MEM_SIZE, PROT_READ | PROT_WRITE,
+					MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+	if (addr == MAP_FAILED)
+		return pr_perror("Unable to create a mapping");
+
+	addr_wronly = mmap(0, MEM_WRONLY_SIZE, PROT_WRITE,
+				MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
+	if (addr_wronly == MAP_FAILED)
+		return pr_perror("Unable to create a write-only mapping");
+
+	if (pipe(p))
+		return pr_perror("Unable to create a pipe");
+
+	pid = fork();
+	if (pid < 0)
+		return pr_perror("Unable to fork");
+
+	if (pid == 0) {
+		addr[0] = 'C';
+		addr[4096 + 128] = 'A';
+		addr[4096 + 128 + 4096 - 1] = 'B';
+
+		if (prctl(PR_SET_PDEATHSIG, SIGKILL))
+			return pr_perror("Unable to set PR_SET_PDEATHSIG");
+		if (write(p[1], "c", 1) != 1)
+			return pr_perror("Unable to write data into pipe");
+
+		while (1)
+			sleep(1);
+		return 1;
+	}
+	if (read(p[0], buf, 1) != 1) {
+		pr_perror("Unable to read data from pipe");
+		kill(pid, SIGKILL);
+		wait(&status);
+		return 1;
+	}
+
+	munmap(addr, MEM_SIZE);
+	munmap(addr_wronly, MEM_WRONLY_SIZE);
+
+	iov[0].iov_base = addr;
+	iov[0].iov_len = 1;
+
+	iov[1].iov_base = addr + 4096 + 128;
+	iov[1].iov_len = 4096;
+
+	/* check one iovec */
+	if (process_vmsplice(pid, p[1], iov, 1, SPLICE_F_GIFT) != 1)
+		return pr_perror("Unable to splice pages");
+
+	if (read(p[0], buf, 1) != 1)
+		return pr_perror("Unable to read from pipe");
+
+	if (buf[0] != 'C')
+		ksft_test_result_fail("Get wrong data\n");
+	else
+		ksft_test_result_pass("Check process_vmsplice with one vec\n");
+
+	/* check two iovec-s */
+	if (process_vmsplice(pid, p[1], iov, 2, SPLICE_F_GIFT) != 4097)
+		return pr_perror("Unable to splice pages");
+
+	if (read(p[0], buf, 1) != 1)
+		return pr_perror("Unable to read from pipe");
+
+	if (buf[0] != 'C')
+		ksft_test_result_fail("Get wrong data\n");
+
+	if (read(p[0], buf, 4096) != 4096)
+		return pr_perror("Unable to read from pipe");
+
+	if (buf[0] != 'A' || buf[4095] != 'B')
+		ksft_test_result_fail("Get wrong data\n");
+	else
+		ksft_test_result_pass("check process_vmsplice with two vecs\n");
+
+	/* check how an unreadable region in a second vec is handled */
+	iov[0].iov_base = addr;
+	iov[0].iov_len = 1;
+
+	iov[1].iov_base = addr_wronly + 5;
+	iov[1].iov_len = 1;
+
+	if (process_vmsplice(pid, p[1], iov, 2, SPLICE_F_GIFT) != 1)
+		return pr_perror("Unable to splice data");
+
+	if (read(p[0], buf, 1) != 1)
+		return pr_perror("Unable to read from pipe");
+
+	if (buf[0] != 'C')
+		ksft_test_result_fail("Get wrong data\n");
+	else
+		ksft_test_result_pass("unreadable region in a second vec\n");
+
+	/* check how an unreadable region in a first vec is handled */
+	errno = 0;
+	if (process_vmsplice(pid, p[1], iov + 1, 1, SPLICE_F_GIFT) != -1 ||
+	    errno != EFAULT)
+		ksft_test_result_fail("Got unexpected errno %d\n", errno);
+	else
+		ksft_test_result_pass("unreadable region in a first vec\n");
+
+	iov[0].iov_base = addr;
+	iov[0].iov_len = 1;
+
+	iov[1].iov_base = addr;
+	iov[1].iov_len = MEM_SIZE;
+
+	/* splice as much as possible */
+	ret = process_vmsplice(pid, p[1], iov, 2,
+				SPLICE_F_GIFT | SPLICE_F_NONBLOCK);
+	if (ret != 4096 * 15 + 1) /* by default a pipe can fit 16 pages */
+		return pr_perror("Unable to splice pages");
+
+	while (ret > 0) {
+		int len;
+
+		len = read(p[0], buf, 4096);
+		if (len < 0)
+			return pr_perror("Unable to read data");
+		if (len > ret)
+			return pr_err("Read more than expected\n");
+		ret -= len;
+	}
+	ksft_test_result_pass("splice as much as possible\n");
+
+	if (kill(pid, SIGTERM))
+		return pr_perror("Unable to kill a child process");
+	status = -1;
+	if (wait(&status) < 0)
+		return pr_perror("Unable to wait for a child process");
+	if (!WIFSIGNALED(status) || WTERMSIG(status) != SIGTERM)
+		return pr_err("The child exited with an unexpected code %d\n",
+									status);
+
+	if (ksft_get_fail_cnt())
+		return ksft_exit_fail();
+	return ksft_exit_pass();
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] vm: add a syscall to map a process memory into a pipe
  2017-11-22 19:36 ` Mike Rapoport
@ 2017-11-22 20:43   ` Michael Kerrisk (man-pages)
  -1 siblings, 0 replies; 20+ messages in thread
From: Michael Kerrisk (man-pages) @ 2017-11-22 20:43 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Alexander Viro, linux-mm, linux-fsdevel, lkml,
	Linux API, criu, Arnd Bergmann, Pavel Emelyanov, Thomas Gleixner,
	Josh Triplett, Jann Horn, Yossi Kuperman

Hi Mike,

On 22 November 2017 at 20:36, Mike Rapoport <rppt@linux.vnet.ibm.com> wrote:
> From: Yossi Kuperman <yossiku@il.ibm.com>
>
> Hi,
>
> This patches introduces new process_vmsplice system call that combines
> functionality of process_vm_read and vmsplice.
>
> It allows to map the memory of another process into a pipe, similarly to
> what vmsplice does for its own address space.
>
> The patch 2/4 ("vm: add a syscall to map a process memory into a pipe")
> actually adds the new system call and provides its elaborate description.

Where is the man page for this new syscall?

Cheers,

Michael

> The patchset is against -mm tree.
>
> v3: minor refactoring to reduce code duplication
> v2: move this syscall under CONFIG_CROSS_MEMORY_ATTACH
>     give correct flags to get_user_pages_remote()
>
> Andrei Vagin (3):
>   vm: add a syscall to map a process memory into a pipe
>   x86: wire up the process_vmsplice syscall
>   test: add a test for the process_vmsplice syscall
>
> Mike Rapoport (1):
>   fs/splice: introduce pages_to_pipe helper
>
>  arch/x86/entry/syscalls/syscall_32.tbl             |   1 +
>  arch/x86/entry/syscalls/syscall_64.tbl             |   2 +
>  fs/splice.c                                        | 262 +++++++++++++++++++--
>  include/linux/compat.h                             |   3 +
>  include/linux/syscalls.h                           |   4 +
>  include/uapi/asm-generic/unistd.h                  |   5 +-
>  kernel/sys_ni.c                                    |   2 +
>  tools/testing/selftests/process_vmsplice/Makefile  |   5 +
>  .../process_vmsplice/process_vmsplice_test.c       | 188 +++++++++++++++
>  9 files changed, 450 insertions(+), 22 deletions(-)
>  create mode 100644 tools/testing/selftests/process_vmsplice/Makefile
>  create mode 100644 tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
>
> --
> 2.7.4
>



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 0/4] vm: add a syscall to map a process memory into a pipe
  2017-11-22 20:43   ` Michael Kerrisk (man-pages)
@ 2017-11-23  6:29     ` Mike Rapoport
  -1 siblings, 0 replies; 20+ messages in thread
From: Mike Rapoport @ 2017-11-23  6:29 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Andrew Morton, Alexander Viro, linux-mm, linux-fsdevel, lkml,
	Linux API, criu, Arnd Bergmann, Pavel Emelyanov, Thomas Gleixner,
	Josh Triplett, Jann Horn, Yossi Kuperman

On Wed, Nov 22, 2017 at 09:43:31PM +0100, Michael Kerrisk (man-pages) wrote:
> Hi Mike,
> 
> On 22 November 2017 at 20:36, Mike Rapoport <rppt@linux.vnet.ibm.com> wrote:
> > Hi,
> >
> > This patches introduces new process_vmsplice system call that combines
> > functionality of process_vm_read and vmsplice.
> >
> > It allows to map the memory of another process into a pipe, similarly to
> > what vmsplice does for its own address space.
> >
> > The patch 2/4 ("vm: add a syscall to map a process memory into a pipe")
> > actually adds the new system call and provides its elaborate description.
> 
> Where is the man page for this new syscall?

It's still WIP, I'll send it out soon.
 
> Cheers,
> 
> Michael
> 
> > The patchset is against -mm tree.
> >
> > v3: minor refactoring to reduce code duplication
> > v2: move this syscall under CONFIG_CROSS_MEMORY_ATTACH
> >     give correct flags to get_user_pages_remote()
> >
> > Andrei Vagin (3):
> >   vm: add a syscall to map a process memory into a pipe
> >   x86: wire up the process_vmsplice syscall
> >   test: add a test for the process_vmsplice syscall
> >
> > Mike Rapoport (1):
> >   fs/splice: introduce pages_to_pipe helper
> >
> >  arch/x86/entry/syscalls/syscall_32.tbl             |   1 +
> >  arch/x86/entry/syscalls/syscall_64.tbl             |   2 +
> >  fs/splice.c                                        | 262 +++++++++++++++++++--
> >  include/linux/compat.h                             |   3 +
> >  include/linux/syscalls.h                           |   4 +
> >  include/uapi/asm-generic/unistd.h                  |   5 +-
> >  kernel/sys_ni.c                                    |   2 +
> >  tools/testing/selftests/process_vmsplice/Makefile  |   5 +
> >  .../process_vmsplice/process_vmsplice_test.c       | 188 +++++++++++++++
> >  9 files changed, 450 insertions(+), 22 deletions(-)
> >  create mode 100644 tools/testing/selftests/process_vmsplice/Makefile
> >  create mode 100644 tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
> >
> > --
> > 2.7.4
> >
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 4/4] test: add a test for the process_vmsplice syscall
  2017-11-22 19:36   ` Mike Rapoport
  (?)
@ 2017-11-23  8:01     ` Greg KH
  -1 siblings, 0 replies; 20+ messages in thread
From: Greg KH @ 2017-11-23  8:01 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Alexander Viro, linux-mm, linux-fsdevel,
	linux-kernel, linux-api, criu, Arnd Bergmann, Pavel Emelyanov,
	Michael Kerrisk, Thomas Gleixner, Josh Triplett, Jann Horn,
	Andrei Vagin

On Wed, Nov 22, 2017 at 09:36:31PM +0200, Mike Rapoport wrote:
> From: Andrei Vagin <avagin@openvz.org>
> 
> This test checks that process_vmsplice() can splice pages from a remote
> process and that it returns EFAULT if asked to splice pages from an
> inaccessible address.
> 
> Signed-off-by: Andrei Vagin <avagin@openvz.org>
> ---
>  tools/testing/selftests/process_vmsplice/Makefile  |   5 +
>  .../process_vmsplice/process_vmsplice_test.c       | 188 +++++++++++++++++++++
>  2 files changed, 193 insertions(+)
>  create mode 100644 tools/testing/selftests/process_vmsplice/Makefile
>  create mode 100644 tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
> 
> diff --git a/tools/testing/selftests/process_vmsplice/Makefile b/tools/testing/selftests/process_vmsplice/Makefile
> new file mode 100644
> index 0000000..246d5a7
> --- /dev/null
> +++ b/tools/testing/selftests/process_vmsplice/Makefile
> @@ -0,0 +1,5 @@
> +CFLAGS += -I../../../../usr/include/
> +
> +TEST_GEN_PROGS := process_vmsplice_test
> +
> +include ../lib.mk
> diff --git a/tools/testing/selftests/process_vmsplice/process_vmsplice_test.c b/tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
> new file mode 100644
> index 0000000..8abf59b
> --- /dev/null
> +++ b/tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
> @@ -0,0 +1,188 @@
> +#define _GNU_SOURCE
> +#include <stdio.h>
> +#include <unistd.h>
> +#include <sys/mman.h>
> +#include <sys/syscall.h>
> +#include <fcntl.h>
> +#include <sys/uio.h>
> +#include <errno.h>
> +#include <signal.h>
> +#include <sys/prctl.h>
> +#include <sys/wait.h>
> +
> +#include "../kselftest.h"
> +
> +#ifndef __NR_process_vmsplice
> +#define __NR_process_vmsplice 333
> +#endif
> +
> +#define pr_err(fmt, ...) \
> +		({ \
> +			fprintf(stderr, "%s:%d:" fmt, \
> +				__func__, __LINE__, ##__VA_ARGS__); \
> +			KSFT_FAIL; \
> +		})
> +#define pr_perror(fmt, ...) pr_err(fmt ": %m\n", ##__VA_ARGS__)
> +#define fail(fmt, ...) pr_err("FAIL:" fmt, ##__VA_ARGS__)
> +
> +static ssize_t process_vmsplice(pid_t pid, int fd, const struct iovec *iov,
> +			unsigned long nr_segs, unsigned int flags)
> +{
> +	return syscall(__NR_process_vmsplice, pid, fd, iov, nr_segs, flags);
> +
> +}
> +
> +#define MEM_SIZE (4096 * 100)
> +#define MEM_WRONLY_SIZE (4096 * 10)
> +
> +int main(int argc, char **argv)
> +{
> +	char *addr, *addr_wronly;
> +	int p[2];
> +	struct iovec iov[2];
> +	char buf[4096];
> +	int status, ret;
> +	pid_t pid;
> +
> +	ksft_print_header();
> +
> +	addr = mmap(0, MEM_SIZE, PROT_READ | PROT_WRITE,
> +					MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> +	if (addr == MAP_FAILED)
> +		return pr_perror("Unable to create a mapping");
> +
> +	addr_wronly = mmap(0, MEM_WRONLY_SIZE, PROT_WRITE,
> +				MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> +	if (addr_wronly == MAP_FAILED)
> +		return pr_perror("Unable to create a write-only mapping");
> +
> +	if (pipe(p))
> +		return pr_perror("Unable to create a pipe");
> +
> +	pid = fork();
> +	if (pid < 0)
> +		return pr_perror("Unable to fork");
> +
> +	if (pid == 0) {
> +		addr[0] = 'C';
> +		addr[4096 + 128] = 'A';
> +		addr[4096 + 128 + 4096 - 1] = 'B';
> +
> +		if (prctl(PR_SET_PDEATHSIG, SIGKILL))
> +			return pr_perror("Unable to set PR_SET_PDEATHSIG");
> +		if (write(p[1], "c", 1) != 1)
> +			return pr_perror("Unable to write data into pipe");
> +
> +		while (1)
> +			sleep(1);
> +		return 1;
> +	}
> +	if (read(p[0], buf, 1) != 1) {
> +		pr_perror("Unable to read data from pipe");
> +		kill(pid, SIGKILL);
> +		wait(&status);
> +		return 1;
> +	}
> +
> +	munmap(addr, MEM_SIZE);
> +	munmap(addr_wronly, MEM_WRONLY_SIZE);
> +
> +	iov[0].iov_base = addr;
> +	iov[0].iov_len = 1;
> +
> +	iov[1].iov_base = addr + 4096 + 128;
> +	iov[1].iov_len = 4096;
> +
> +	/* check one iovec */
> +	if (process_vmsplice(pid, p[1], iov, 1, SPLICE_F_GIFT) != 1)
> +		return pr_perror("Unable to splice pages");

Shouldn't you check to see if the syscall is even present?  You should
not error if it is not, as this test will then "fail" on kernels/arches
without the syscall enabled, which isn't the nicest.
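
As a rough sketch of the kind of check meant here (the helper names and
the all-zero probe arguments are assumptions for illustration, not part
of the patch), the test could probe for ENOSYS first and report a skip
instead of a failure:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Same value as KSFT_SKIP in kselftest.h; redefined so the sketch stands alone. */
#define SKIP_EXIT_CODE 4

/*
 * Hypothetical helper: probe syscall number nr with all-zero arguments.
 * Returns 0 if the kernel answers ENOSYS (the syscall is not wired up)
 * and 1 for any other outcome, including errors such as EINVAL or EBADF.
 * Only use this on syscalls for which an all-zero call is harmless.
 */
static int syscall_present(long nr)
{
	errno = 0;
	return !(syscall(nr, 0, 0, 0, 0, 0) == -1 && errno == ENOSYS);
}

/*
 * Hypothetical helper: exit status the test should use when nr is
 * unavailable, or -1 to carry on with the real checks.
 */
static int skip_code_if_missing(long nr)
{
	return syscall_present(nr) ? -1 : SKIP_EXIT_CODE;
}
```

In the test itself this would sit near the top of main(), before the
fork, probing __NR_process_vmsplice and exiting through the kselftest
skip path when the probe fails.  Note that the hard-coded fallback
number is only meaningful on an arch where it really is
process_vmsplice, so the probe is a best-effort check.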

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 4/4] test: add a test for the process_vmsplice syscall
  2017-11-23  8:01     ` Greg KH
@ 2017-11-23 14:07       ` Mike Rapoport
  -1 siblings, 0 replies; 20+ messages in thread
From: Mike Rapoport @ 2017-11-23 14:07 UTC (permalink / raw)
  To: Greg KH
  Cc: Andrew Morton, Alexander Viro, linux-mm, linux-fsdevel,
	linux-kernel, linux-api, criu, Arnd Bergmann, Pavel Emelyanov,
	Michael Kerrisk, Thomas Gleixner, Josh Triplett, Jann Horn,
	Andrei Vagin

On Thu, Nov 23, 2017 at 09:01:03AM +0100, Greg KH wrote:
> On Wed, Nov 22, 2017 at 09:36:31PM +0200, Mike Rapoport wrote:
> > From: Andrei Vagin <avagin@openvz.org>
> > 
> > This test checks that process_vmsplice() can splice pages from a remote
> > process and that it returns EFAULT if asked to splice pages from an
> > inaccessible address.
> > 
> > Signed-off-by: Andrei Vagin <avagin@openvz.org>
> > ---
> >  tools/testing/selftests/process_vmsplice/Makefile  |   5 +
> >  .../process_vmsplice/process_vmsplice_test.c       | 188 +++++++++++++++++++++
> >  2 files changed, 193 insertions(+)
> >  create mode 100644 tools/testing/selftests/process_vmsplice/Makefile
> >  create mode 100644 tools/testing/selftests/process_vmsplice/process_vmsplice_test.c
> > 

[ ... ]

> 
> Shouldn't you check to see if the syscall is even present?  You should
> not error if it is not, as this test will then "fail" on kernels/arches
> without the syscall enabled, which isn't the nicest.

Sure, will fix.

> thanks,
> 
> greg k-h
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2017-11-23 14:07 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-22 19:36 [PATCH v3 0/4] vm: add a syscall to map a process memory into a pipe Mike Rapoport
2017-11-22 19:36 ` Mike Rapoport
2017-11-22 19:36 ` [PATCH v3 1/4] fs/splice: introduce pages_to_pipe helper Mike Rapoport
2017-11-22 19:36   ` Mike Rapoport
2017-11-22 19:36   ` Mike Rapoport
2017-11-22 19:36 ` [PATCH v3 2/4] vm: add a syscall to map a process memory into a pipe Mike Rapoport
2017-11-22 19:36   ` Mike Rapoport
2017-11-22 19:36 ` [PATCH v3 3/4] x86: wire up the process_vmsplice syscall Mike Rapoport
2017-11-22 19:36   ` Mike Rapoport
2017-11-22 19:36 ` [PATCH v3 4/4] test: add a test for " Mike Rapoport
2017-11-22 19:36   ` Mike Rapoport
2017-11-23  8:01   ` Greg KH
2017-11-23  8:01     ` Greg KH
2017-11-23  8:01     ` Greg KH
2017-11-23 14:07     ` Mike Rapoport
2017-11-23 14:07       ` Mike Rapoport
2017-11-22 20:43 ` [PATCH v3 0/4] vm: add a syscall to map a process memory into a pipe Michael Kerrisk (man-pages)
2017-11-22 20:43   ` Michael Kerrisk (man-pages)
2017-11-23  6:29   ` Mike Rapoport
2017-11-23  6:29     ` Mike Rapoport
