linux-kernel.vger.kernel.org archive mirror
* remove the last set_fs() in common code, and remove it for x86 and powerpc
@ 2020-08-17  7:32 Christoph Hellwig
  2020-08-17  7:32 ` [PATCH 01/11] mem: remove duplicate ops for /dev/zero and /dev/null Christoph Hellwig
                   ` (12 more replies)
  0 siblings, 13 replies; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

Hi all,

this series removes the last set_fs() used to force a kernel address
space for the uaccess code in the kernel read/write/splice code, and then
stops implementing the address space overrides entirely for x86 and
powerpc.
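
As background, the mechanism being removed can be modelled in userspace
roughly as below.  This is a simplified sketch and not kernel code:
model_get_fs(), model_set_fs() and model_access_ok() are hypothetical
stand-ins for get_fs(), set_fs() and access_ok(), and the two limit values
are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits; the real USER_DS/KERNEL_DS values are arch specific. */
#define USER_LIMIT   ((uintptr_t)0x7fffffffu)
#define KERNEL_LIMIT (UINTPTR_MAX)

/* Models the per-task address limit that set_fs() changes. */
static uintptr_t addr_limit = USER_LIMIT;

uintptr_t model_get_fs(void)
{
	return addr_limit;
}

void model_set_fs(uintptr_t limit)
{
	addr_limit = limit;
}

/* Models access_ok(): the whole range must lie at or below the limit. */
bool model_access_ok(uintptr_t addr, size_t len)
{
	return len <= addr_limit && addr <= addr_limit - len;
}

/* Usage: a "kernel" address only passes while the limit is raised. */
void address_limit_demo(void)
{
	uintptr_t kaddr = (uintptr_t)0xc0000000u;
	uintptr_t old_fs = model_get_fs();

	assert(!model_access_ok(kaddr, 16));
	model_set_fs(KERNEL_LIMIT);
	assert(model_access_ok(kaddr, 16));
	model_set_fs(old_fs);	/* forgetting this leaves the check disabled */
	assert(!model_access_ok(kaddr, 16));
}
```

The point of the model is the restore step: any path that returns without
it leaves the range check disabled for whatever runs next.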

The file system part has been posted a few times, and the read/write side
has been pretty much unchanged.  For splice this series drops the
conversion of the seq_file and sysctl code to the iter ops, and thus loses
the splice support for them.  The reason for that is that it caused a lot
of churn for not much use - splice for these small files really isn't much
of a win, even if existing userspace uses it.  All callers I found do the
proper fallback, but if this turns out to be an issue the conversion can
be resurrected.
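
The fallback mentioned above could look roughly like the following in a
userspace caller.  This is only a sketch: copy_with_fallback() is a
hypothetical helper, it moves at most one buffer's worth of data in the
fallback path, and it assumes that EINVAL from splice(2) is the signal to
fall back.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Move up to len bytes from in_fd to out_fd.  Tries splice(2) first and
 * falls back to read(2)/write(2) if the kernel rejects the splice with
 * EINVAL.  Returns the number of bytes moved, or -1 with errno set.
 */
ssize_t copy_with_fallback(int in_fd, int out_fd, size_t len)
{
	char buf[4096];
	ssize_t n, done;

	/* Passing an invalid descriptor is a caller bug. */
	assert(in_fd >= 0 && out_fd >= 0);

	n = splice(in_fd, NULL, out_fd, NULL, len, 0);
	if (n >= 0 || errno != EINVAL)
		return n;

	/* Fallback: one bounded read, then write everything that was read. */
	if (len > sizeof(buf))
		len = sizeof(buf);
	n = read(in_fd, buf, len);
	if (n <= 0)
		return n;
	for (done = 0; done < n; ) {
		ssize_t w = write(out_fd, buf + done, (size_t)(n - done));

		if (w < 0)
			return -1;
		done += w;
	}
	return done;
}
```

A caller written this way keeps working on files that lose splice support,
at the cost of an extra copy through the userspace buffer.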

Besides x86 and powerpc I plan to eventually convert all other
architectures, although this will be a slow process, starting with the
easier ones once the infrastructure is merged.  The process to convert
architectures is roughly:

 - ensure there is no set_fs(KERNEL_DS) left in arch specific code
 - implement __get_kernel_nofault and __put_kernel_nofault
 - remove the arch specific address limitation functionality

Diffstat:
 arch/Kconfig                           |    3 
 arch/alpha/Kconfig                     |    1 
 arch/arc/Kconfig                       |    1 
 arch/arm/Kconfig                       |    1 
 arch/arm64/Kconfig                     |    1 
 arch/c6x/Kconfig                       |    1 
 arch/csky/Kconfig                      |    1 
 arch/h8300/Kconfig                     |    1 
 arch/hexagon/Kconfig                   |    1 
 arch/ia64/Kconfig                      |    1 
 arch/m68k/Kconfig                      |    1 
 arch/microblaze/Kconfig                |    1 
 arch/mips/Kconfig                      |    1 
 arch/nds32/Kconfig                     |    1 
 arch/nios2/Kconfig                     |    1 
 arch/openrisc/Kconfig                  |    1 
 arch/parisc/Kconfig                    |    1 
 arch/powerpc/include/asm/processor.h   |    7 -
 arch/powerpc/include/asm/thread_info.h |    5 -
 arch/powerpc/include/asm/uaccess.h     |   78 ++++++++-----------
 arch/powerpc/kernel/signal.c           |    3 
 arch/powerpc/lib/sstep.c               |    6 -
 arch/riscv/Kconfig                     |    1 
 arch/s390/Kconfig                      |    1 
 arch/sh/Kconfig                        |    1 
 arch/sparc/Kconfig                     |    1 
 arch/um/Kconfig                        |    1 
 arch/x86/ia32/ia32_aout.c              |    1 
 arch/x86/include/asm/page_32_types.h   |   11 ++
 arch/x86/include/asm/page_64_types.h   |   38 +++++++++
 arch/x86/include/asm/processor.h       |   60 ---------------
 arch/x86/include/asm/thread_info.h     |    2 
 arch/x86/include/asm/uaccess.h         |   26 ------
 arch/x86/kernel/asm-offsets.c          |    3 
 arch/x86/lib/getuser.S                 |   28 ++++---
 arch/x86/lib/putuser.S                 |   21 +++--
 arch/xtensa/Kconfig                    |    1 
 drivers/char/mem.c                     |   16 ----
 drivers/misc/lkdtm/bugs.c              |    2 
 drivers/misc/lkdtm/core.c              |    4 +
 drivers/misc/lkdtm/usercopy.c          |    2 
 fs/read_write.c                        |   69 ++++++++++-------
 fs/splice.c                            |  130 +++------------------------------
 include/linux/fs.h                     |    2 
 include/linux/uaccess.h                |   18 ++++
 lib/test_bitmap.c                      |   10 ++
 46 files changed, 235 insertions(+), 332 deletions(-)


* [PATCH 01/11] mem: remove duplicate ops for /dev/zero and /dev/null
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-18 19:33   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 02/11] fs: don't allow kernel reads and writes without iter ops Christoph Hellwig
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

There is no good reason to implement both the traditional ->read and
->write as well as the iter based ops.  So implement just the iter
based ones.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/char/mem.c | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 687d4af6945d36..14851f36787372 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -670,18 +670,6 @@ static ssize_t write_port(struct file *file, const char __user *buf,
 	return tmp-buf;
 }
 
-static ssize_t read_null(struct file *file, char __user *buf,
-			 size_t count, loff_t *ppos)
-{
-	return 0;
-}
-
-static ssize_t write_null(struct file *file, const char __user *buf,
-			  size_t count, loff_t *ppos)
-{
-	return count;
-}
-
 static ssize_t read_iter_null(struct kiocb *iocb, struct iov_iter *to)
 {
 	return 0;
@@ -872,7 +860,6 @@ static int open_port(struct inode *inode, struct file *filp)
 
 #define zero_lseek	null_lseek
 #define full_lseek      null_lseek
-#define write_zero	write_null
 #define write_iter_zero	write_iter_null
 #define open_mem	open_port
 #define open_kmem	open_mem
@@ -903,8 +890,6 @@ static const struct file_operations __maybe_unused kmem_fops = {
 
 static const struct file_operations null_fops = {
 	.llseek		= null_lseek,
-	.read		= read_null,
-	.write		= write_null,
 	.read_iter	= read_iter_null,
 	.write_iter	= write_iter_null,
 	.splice_write	= splice_write_null,
@@ -919,7 +904,6 @@ static const struct file_operations __maybe_unused port_fops = {
 
 static const struct file_operations zero_fops = {
 	.llseek		= zero_lseek,
-	.write		= write_zero,
 	.read_iter	= read_iter_zero,
 	.write_iter	= write_iter_zero,
 	.mmap		= mmap_zero,
-- 
2.28.0



* [PATCH 02/11] fs: don't allow kernel reads and writes without iter ops
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
  2020-08-17  7:32 ` [PATCH 01/11] mem: remove duplicate ops for /dev/zero and /dev/null Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-18 19:34   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 03/11] fs: don't allow splice read/write without explicit ops Christoph Hellwig
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

Don't allow calling ->read or ->write with set_fs as a preparation for
killing off set_fs.  All the instances that we use kernel_read/write on
are using the iter ops already.

If a file has both the regular ->read/->write methods and the iter
variants, those could have different semantics for messed up enough
drivers.  Also fail the kernel access to them in that case.
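
The rule can be sketched in userspace as follows.  All names here
(struct model_fops, struct model_file, model_read_iter(),
model_kernel_read()) are hypothetical, and the ->read_iter stand-in takes a
plain buffer instead of a kiocb and iov_iter to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct model_file;

/* Hypothetical stand-in for the read half of struct file_operations. */
struct model_fops {
	ssize_t (*read)(struct model_file *f, char *buf, size_t count);
	ssize_t (*read_iter)(struct model_file *f, char *buf, size_t count);
};

struct model_file {
	const struct model_fops *f_op;
	const char *data;	/* backing contents */
	size_t size;
	size_t pos;
};

/* Simplified ->read_iter: copy from the backing data and advance pos. */
ssize_t model_read_iter(struct model_file *f, char *buf, size_t count)
{
	size_t left = f->size - f->pos;

	if (count > left)
		count = left;
	memcpy(buf, f->data + f->pos, count);
	f->pos += count;
	return (ssize_t)count;
}

/* Mirrors the new check: only files with ->read_iter and no ->read pass. */
ssize_t model_kernel_read(struct model_file *f, char *buf, size_t count)
{
	assert(f && f->f_op);

	if (!f->f_op->read_iter || f->f_op->read)
		return -EINVAL;
	return f->f_op->read_iter(f, buf, count);
}
```

In this model a file that wires up only ->read, or both methods, gets
-EINVAL and its position is left untouched.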

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 67 +++++++++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 25 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 5db58b8c78d0dd..702c4301d9eb6b 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -419,27 +419,41 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 	return ret;
 }
 
+static int warn_unsupported(struct file *file, const char *op)
+{
+	pr_warn_ratelimited(
+		"kernel %s not supported for file %pD4 (pid: %d comm: %.20s)\n",
+		op, file, current->pid, current->comm);
+	return -EINVAL;
+}
+
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs = get_fs();
+	struct kvec iov = {
+		.iov_base	= buf,
+		.iov_len	= min_t(size_t, count, MAX_RW_COUNT),
+	};
+	struct kiocb kiocb;
+	struct iov_iter iter;
 	ssize_t ret;
 
 	if (WARN_ON_ONCE(!(file->f_mode & FMODE_READ)))
 		return -EINVAL;
 	if (!(file->f_mode & FMODE_CAN_READ))
 		return -EINVAL;
+	/*
+	 * Also fail if ->read_iter and ->read are both wired up as that
+	 * implies very convoluted semantics.
+	 */
+	if (unlikely(!file->f_op->read_iter || file->f_op->read))
+		return warn_unsupported(file, "read");
 
-	if (count > MAX_RW_COUNT)
-		count =  MAX_RW_COUNT;
-	set_fs(KERNEL_DS);
-	if (file->f_op->read)
-		ret = file->f_op->read(file, (void __user *)buf, count, pos);
-	else if (file->f_op->read_iter)
-		ret = new_sync_read(file, (void __user *)buf, count, pos);
-	else
-		ret = -EINVAL;
-	set_fs(old_fs);
+	init_sync_kiocb(&kiocb, file);
+	kiocb.ki_pos = *pos;
+	iov_iter_kvec(&iter, READ, &iov, 1, iov.iov_len);
+	ret = file->f_op->read_iter(&kiocb, &iter);
 	if (ret > 0) {
+		*pos = kiocb.ki_pos;
 		fsnotify_access(file);
 		add_rchar(current, ret);
 	}
@@ -510,28 +524,31 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 /* caller is responsible for file_start_write/file_end_write */
 ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs;
-	const char __user *p;
+	struct kvec iov = {
+		.iov_base	= (void *)buf,
+		.iov_len	= min_t(size_t, count, MAX_RW_COUNT),
+	};
+	struct kiocb kiocb;
+	struct iov_iter iter;
 	ssize_t ret;
 
 	if (WARN_ON_ONCE(!(file->f_mode & FMODE_WRITE)))
 		return -EBADF;
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
+	/*
+	 * Also fail if ->write_iter and ->write are both wired up as that
+	 * implies very convoluted semantics.
+	 */
+	if (unlikely(!file->f_op->write_iter || file->f_op->write))
+		return warn_unsupported(file, "write");
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	p = (__force const char __user *)buf;
-	if (count > MAX_RW_COUNT)
-		count =  MAX_RW_COUNT;
-	if (file->f_op->write)
-		ret = file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		ret = new_sync_write(file, p, count, pos);
-	else
-		ret = -EINVAL;
-	set_fs(old_fs);
+	init_sync_kiocb(&kiocb, file);
+	kiocb.ki_pos = *pos;
+	iov_iter_kvec(&iter, WRITE, &iov, 1, iov.iov_len);
+	ret = file->f_op->write_iter(&kiocb, &iter);
 	if (ret > 0) {
+		*pos = kiocb.ki_pos;
 		fsnotify_modify(file);
 		add_wchar(current, ret);
 	}
-- 
2.28.0



* [PATCH 03/11] fs: don't allow splice read/write without explicit ops
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
  2020-08-17  7:32 ` [PATCH 01/11] mem: remove duplicate ops for /dev/zero and /dev/null Christoph Hellwig
  2020-08-17  7:32 ` [PATCH 02/11] fs: don't allow kernel reads and writes without iter ops Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-18 19:39   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 04/11] uaccess: add infrastructure for kernel builds with set_fs() Christoph Hellwig
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

default_file_splice_write is the last piece of generic code that uses
set_fs to make the uaccess routines operate on kernel pointers.  It
implements a "fallback loop" for splicing from files that do not actually
provide a proper splice_read method.  The usual file systems and other
high bandwidth instances all provide a ->splice_read, so this just removes
support for various device drivers and procfs/debugfs files.  If splice
support for any of those turns out to be important it can be added back
by switching them to the iter ops and using generic_file_splice_read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c    |   2 +-
 fs/splice.c        | 130 +++++----------------------------------------
 include/linux/fs.h |   2 -
 3 files changed, 15 insertions(+), 119 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 702c4301d9eb6b..8c61f67453e3d3 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1077,7 +1077,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
 }
 EXPORT_SYMBOL(vfs_iter_write);
 
-ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
+static ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
 		  unsigned long vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
diff --git a/fs/splice.c b/fs/splice.c
index d7c8a7c4db07ff..412df7b48f9eb7 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -342,89 +342,6 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = {
 };
 EXPORT_SYMBOL(nosteal_pipe_buf_ops);
 
-static ssize_t kernel_readv(struct file *file, const struct kvec *vec,
-			    unsigned long vlen, loff_t offset)
-{
-	mm_segment_t old_fs;
-	loff_t pos = offset;
-	ssize_t res;
-
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	res = vfs_readv(file, (const struct iovec __user *)vec, vlen, &pos, 0);
-	set_fs(old_fs);
-
-	return res;
-}
-
-static ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
-				 struct pipe_inode_info *pipe, size_t len,
-				 unsigned int flags)
-{
-	struct kvec *vec, __vec[PIPE_DEF_BUFFERS];
-	struct iov_iter to;
-	struct page **pages;
-	unsigned int nr_pages;
-	unsigned int mask;
-	size_t offset, base, copied = 0;
-	ssize_t res;
-	int i;
-
-	if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
-		return -EAGAIN;
-
-	/*
-	 * Try to keep page boundaries matching to source pagecache ones -
-	 * it probably won't be much help, but...
-	 */
-	offset = *ppos & ~PAGE_MASK;
-
-	iov_iter_pipe(&to, READ, pipe, len + offset);
-
-	res = iov_iter_get_pages_alloc(&to, &pages, len + offset, &base);
-	if (res <= 0)
-		return -ENOMEM;
-
-	nr_pages = DIV_ROUND_UP(res + base, PAGE_SIZE);
-
-	vec = __vec;
-	if (nr_pages > PIPE_DEF_BUFFERS) {
-		vec = kmalloc_array(nr_pages, sizeof(struct kvec), GFP_KERNEL);
-		if (unlikely(!vec)) {
-			res = -ENOMEM;
-			goto out;
-		}
-	}
-
-	mask = pipe->ring_size - 1;
-	pipe->bufs[to.head & mask].offset = offset;
-	pipe->bufs[to.head & mask].len -= offset;
-
-	for (i = 0; i < nr_pages; i++) {
-		size_t this_len = min_t(size_t, len, PAGE_SIZE - offset);
-		vec[i].iov_base = page_address(pages[i]) + offset;
-		vec[i].iov_len = this_len;
-		len -= this_len;
-		offset = 0;
-	}
-
-	res = kernel_readv(in, vec, nr_pages, *ppos);
-	if (res > 0) {
-		copied = res;
-		*ppos += res;
-	}
-
-	if (vec != __vec)
-		kfree(vec);
-out:
-	for (i = 0; i < nr_pages; i++)
-		put_page(pages[i]);
-	kvfree(pages);
-	iov_iter_advance(&to, copied);	/* truncates and discards */
-	return res;
-}
-
 /*
  * Send 'sd->len' bytes to socket from 'sd->file' at position 'sd->pos'
  * using sendpage(). Return the number of bytes sent.
@@ -788,33 +705,6 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out,
 
 EXPORT_SYMBOL(iter_file_splice_write);
 
-static int write_pipe_buf(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
-			  struct splice_desc *sd)
-{
-	int ret;
-	void *data;
-	loff_t tmp = sd->pos;
-
-	data = kmap(buf->page);
-	ret = __kernel_write(sd->u.file, data + buf->offset, sd->len, &tmp);
-	kunmap(buf->page);
-
-	return ret;
-}
-
-static ssize_t default_file_splice_write(struct pipe_inode_info *pipe,
-					 struct file *out, loff_t *ppos,
-					 size_t len, unsigned int flags)
-{
-	ssize_t ret;
-
-	ret = splice_from_pipe(pipe, out, ppos, len, flags, write_pipe_buf);
-	if (ret > 0)
-		*ppos += ret;
-
-	return ret;
-}
-
 /**
  * generic_splice_sendpage - splice data from a pipe to a socket
  * @pipe:	pipe to splice from
@@ -836,15 +726,23 @@ ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe, struct file *out,
 
 EXPORT_SYMBOL(generic_splice_sendpage);
 
+static int warn_unsupported(struct file *file, const char *op)
+{
+	pr_debug_ratelimited(
+		"splice %s not supported for file %pD4 (pid: %d comm: %.20s)\n",
+		op, file, current->pid, current->comm);
+	return -EINVAL;
+}
+
 /*
  * Attempt to initiate a splice from pipe to file.
  */
 static long do_splice_from(struct pipe_inode_info *pipe, struct file *out,
 			   loff_t *ppos, size_t len, unsigned int flags)
 {
-	if (out->f_op->splice_write)
-		return out->f_op->splice_write(pipe, out, ppos, len, flags);
-	return default_file_splice_write(pipe, out, ppos, len, flags);
+	if (unlikely(!out->f_op->splice_write))
+		return warn_unsupported(out, "write");
+	return out->f_op->splice_write(pipe, out, ppos, len, flags);
 }
 
 /*
@@ -866,9 +764,9 @@ static long do_splice_to(struct file *in, loff_t *ppos,
 	if (unlikely(len > MAX_RW_COUNT))
 		len = MAX_RW_COUNT;
 
-	if (in->f_op->splice_read)
-		return in->f_op->splice_read(in, ppos, pipe, len, flags);
-	return default_file_splice_read(in, ppos, pipe, len, flags);
+	if (unlikely(!in->f_op->splice_read))
+		return warn_unsupported(in, "read");
+	return in->f_op->splice_read(in, ppos, pipe, len, flags);
 }
 
 /**
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e019ea2f1347e6..d33cc3e8ed410b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1894,8 +1894,6 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
 
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
-extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
-		unsigned long, loff_t *, rwf_t);
 extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
 				   loff_t, size_t, unsigned int);
 extern ssize_t generic_copy_file_range(struct file *file_in, loff_t pos_in,
-- 
2.28.0



* [PATCH 04/11] uaccess: add infrastructure for kernel builds with set_fs()
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (2 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 03/11] fs: don't allow splice read/write without explicit ops Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-18 19:40   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS Christoph Hellwig
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

Add a CONFIG_SET_FS option that is selected by architectures that
implement set_fs, which is all of them initially.  If the option is not
set, stubs for routines related to overriding the address space are
provided so that architectures can start to opt out of providing set_fs.
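
The effect on callers can be tried out in a small userspace model.  This is
a sketch under stated assumptions: the CONFIG_SET_FS branch below uses
made-up segment values and a global instead of per-task state, while the
other branch follows the shape of the stubs in this patch.  The empty
struct and the empty initializer rely on GNU C extensions.

```c
#include <assert.h>
#include <stdbool.h>

/* Build with -DCONFIG_SET_FS to model an architecture that keeps set_fs(). */
#ifdef CONFIG_SET_FS
typedef struct {
	unsigned long seg;
} mm_segment_t;

/* Hypothetical segment values, for illustration only. */
#define USER_DS		((mm_segment_t) { 1UL })
#define KERNEL_DS	((mm_segment_t) { ~0UL })

static mm_segment_t current_fs = { 1UL };

static inline mm_segment_t get_fs(void)
{
	return current_fs;
}

static inline void set_fs(mm_segment_t fs)
{
	current_fs = fs;
}

static inline bool uaccess_kernel(void)
{
	return current_fs.seg == ~0UL;
}

static inline mm_segment_t force_uaccess_begin(void)
{
	mm_segment_t fs = get_fs();

	set_fs(USER_DS);
	return fs;
}

static inline void force_uaccess_end(mm_segment_t oldfs)
{
	set_fs(oldfs);
}
#else /* CONFIG_SET_FS */
typedef struct {
	/* empty dummy */
} mm_segment_t;

#define uaccess_kernel()	(false)

static inline mm_segment_t force_uaccess_begin(void)
{
	return (mm_segment_t) { };
}

static inline void force_uaccess_end(mm_segment_t oldfs)
{
	(void)oldfs;
}
#endif /* CONFIG_SET_FS */

/* A caller that compiles unchanged in both configurations. */
void run_with_user_access(void)
{
	mm_segment_t old_fs = force_uaccess_begin();

	/* Between begin and end, uaccess must target userspace. */
	assert(!uaccess_kernel());
	force_uaccess_end(old_fs);
}
```

Because mm_segment_t stays a real type in both branches, callers such as
run_with_user_access() need no #ifdef of their own.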

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/Kconfig            |  3 +++
 arch/alpha/Kconfig      |  1 +
 arch/arc/Kconfig        |  1 +
 arch/arm/Kconfig        |  1 +
 arch/arm64/Kconfig      |  1 +
 arch/c6x/Kconfig        |  1 +
 arch/csky/Kconfig       |  1 +
 arch/h8300/Kconfig      |  1 +
 arch/hexagon/Kconfig    |  1 +
 arch/ia64/Kconfig       |  1 +
 arch/m68k/Kconfig       |  1 +
 arch/microblaze/Kconfig |  1 +
 arch/mips/Kconfig       |  1 +
 arch/nds32/Kconfig      |  1 +
 arch/nios2/Kconfig      |  1 +
 arch/openrisc/Kconfig   |  1 +
 arch/parisc/Kconfig     |  1 +
 arch/powerpc/Kconfig    |  1 +
 arch/riscv/Kconfig      |  1 +
 arch/s390/Kconfig       |  1 +
 arch/sh/Kconfig         |  1 +
 arch/sparc/Kconfig      |  1 +
 arch/um/Kconfig         |  1 +
 arch/x86/Kconfig        |  1 +
 arch/xtensa/Kconfig     |  1 +
 include/linux/uaccess.h | 18 ++++++++++++++++++
 26 files changed, 45 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index af14a567b493fc..3fab619a6aa51a 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -24,6 +24,9 @@ config KEXEC_ELF
 config HAVE_IMA_KEXEC
 	bool
 
+config SET_FS
+	bool
+
 config HOTPLUG_SMT
 	bool
 
diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index 9c5f06e8eb9bc0..d6e9fc7a7b19e2 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -39,6 +39,7 @@ config ALPHA
 	select OLD_SIGSUSPEND
 	select CPU_NO_EFFICIENT_FFS if !ALPHA_EV67
 	select MMU_GATHER_NO_RANGE
+	select SET_FS
 	help
 	  The Alpha is a 64-bit general-purpose processor designed and
 	  marketed by the Digital Equipment Corporation of blessed memory,
diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index ba00c4e1e1c271..c49f5754a11e40 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -48,6 +48,7 @@ config ARC
 	select PCI_SYSCALL if PCI
 	select PERF_USE_VMALLOC if ARC_CACHE_VIPT_ALIASING
 	select HAVE_ARCH_JUMP_LABEL if ISA_ARCV2 && !CPU_ENDIAN_BE32
+	select SET_FS
 
 config ARCH_HAS_CACHE_LINE_SIZE
 	def_bool y
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e00d94b1665876..87e1478a42dc4f 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -118,6 +118,7 @@ config ARM
 	select PCI_SYSCALL if PCI
 	select PERF_USE_VMALLOC
 	select RTC_LIB
+	select SET_FS
 	select SYS_SUPPORTS_APM_EMULATION
 	# Above selects are sorted alphabetically; please add new ones
 	# according to that.  Thanks.
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6d232837cbeee8..fbd9e35bef096f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -192,6 +192,7 @@ config ARM64
 	select PCI_SYSCALL if PCI
 	select POWER_RESET
 	select POWER_SUPPLY
+	select SET_FS
 	select SPARSE_IRQ
 	select SWIOTLB
 	select SYSCTL_EXCEPTION_TRACE
diff --git a/arch/c6x/Kconfig b/arch/c6x/Kconfig
index 6444ebfd06a665..48d66bf0465d68 100644
--- a/arch/c6x/Kconfig
+++ b/arch/c6x/Kconfig
@@ -22,6 +22,7 @@ config C6X
 	select GENERIC_CLOCKEVENTS
 	select MODULES_USE_ELF_RELA
 	select MMU_GATHER_NO_RANGE if MMU
+	select SET_FS
 
 config MMU
 	def_bool n
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index 3d5afb5f568543..2836f6e76fdb2d 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -78,6 +78,7 @@ config CSKY
 	select PCI_DOMAINS_GENERIC if PCI
 	select PCI_SYSCALL if PCI
 	select PCI_MSI if PCI
+	select SET_FS
 
 config LOCKDEP_SUPPORT
 	def_bool y
diff --git a/arch/h8300/Kconfig b/arch/h8300/Kconfig
index d11666d538fea8..7945de067e9fcc 100644
--- a/arch/h8300/Kconfig
+++ b/arch/h8300/Kconfig
@@ -25,6 +25,7 @@ config H8300
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_HASH
 	select CPU_NO_EFFICIENT_FFS
+	select SET_FS
 	select UACCESS_MEMCPY
 
 config CPU_BIG_ENDIAN
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 667cfc511cf999..f2afabbadd430e 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -31,6 +31,7 @@ config HEXAGON
 	select GENERIC_CLOCKEVENTS_BROADCAST
 	select MODULES_USE_ELF_RELA
 	select GENERIC_CPU_DEVICES
+	select SET_FS
 	help
 	  Qualcomm Hexagon is a processor architecture designed for high
 	  performance and low power across a wide variety of applications.
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 5b4ec80bf5863a..22a6853840e235 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -56,6 +56,7 @@ config IA64
 	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
 	select NUMA if !FLATMEM
+	select SET_FS
 	default y
 	help
 	  The Itanium Processor Family is Intel's 64-bit successor to
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 6f2f38d05772ab..dcf4ae8c9b215f 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -32,6 +32,7 @@ config M68K
 	select OLD_SIGSUSPEND3
 	select OLD_SIGACTION
 	select MMU_GATHER_NO_RANGE if MMU
+	select SET_FS
 
 config CPU_BIG_ENDIAN
 	def_bool y
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index d262ac0c8714bd..7e3d4583abf3e6 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -46,6 +46,7 @@ config MICROBLAZE
 	select CPU_NO_EFFICIENT_FFS
 	select MMU_GATHER_NO_RANGE if MMU
 	select SPARSE_IRQ
+	select SET_FS
 
 # Endianness selection
 choice
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index c95fa3a2484cf0..fbc26391b588f8 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -87,6 +87,7 @@ config MIPS
 	select MODULES_USE_ELF_RELA if MODULES && 64BIT
 	select PERF_USE_VMALLOC
 	select RTC_LIB
+	select SET_FS
 	select SYSCTL_EXCEPTION_TRACE
 	select VIRT_TO_BUS
 
diff --git a/arch/nds32/Kconfig b/arch/nds32/Kconfig
index e30298e99e1bdf..e8e541fd2267d0 100644
--- a/arch/nds32/Kconfig
+++ b/arch/nds32/Kconfig
@@ -48,6 +48,7 @@ config NDS32
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_DYNAMIC_FTRACE
+	select SET_FS
 	help
 	  Andes(nds32) Linux support.
 
diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
index c6645141bb2a88..c7c6ba6bec9dfc 100644
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -27,6 +27,7 @@ config NIOS2
 	select USB_ARCH_HAS_HCD if USB_SUPPORT
 	select CPU_NO_EFFICIENT_FFS
 	select MMU_GATHER_NO_RANGE if MMU
+	select SET_FS
 
 config GENERIC_CSUM
 	def_bool y
diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index 7e94fe37cb2fdf..6233c62931803f 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -39,6 +39,7 @@ config OPENRISC
 	select ARCH_WANT_FRAME_POINTERS
 	select GENERIC_IRQ_MULTI_HANDLER
 	select MMU_GATHER_NO_RANGE if MMU
+	select SET_FS
 
 config CPU_BIG_ENDIAN
 	def_bool y
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 3b0f53dd70bc9b..be70af482b5a9a 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -63,6 +63,7 @@ config PARISC
 	select HAVE_FTRACE_MCOUNT_RECORD if HAVE_DYNAMIC_FTRACE
 	select HAVE_KPROBES_ON_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
+	select SET_FS
 
 	help
 	  The PA-RISC microprocessor is designed by Hewlett-Packard and used
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1f48bbfb3ce99d..3f09d6fdf89405 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -249,6 +249,7 @@ config PPC
 	select PCI_SYSCALL			if PCI
 	select PPC_DAWR				if PPC64
 	select RTC_LIB
+	select SET_FS
 	select SPARSE_IRQ
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 7b590552914664..6e05a94f808a5a 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -82,6 +82,7 @@ config RISCV
 	select PCI_MSI if PCI
 	select RISCV_INTC
 	select RISCV_TIMER
+	select SET_FS
 	select SPARSEMEM_STATIC if 32BIT
 	select SPARSE_IRQ
 	select SYSCTL_EXCEPTION_TRACE
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 3d86e12e8e3c21..fd81385a7787cb 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -185,6 +185,7 @@ config S390
 	select OLD_SIGSUSPEND3
 	select PCI_DOMAINS		if PCI
 	select PCI_MSI			if PCI
+	select SET_FS
 	select SPARSE_IRQ
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index d20927128fce05..2bd1653f3b3fea 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -71,6 +71,7 @@ config SUPERH
 	select PERF_EVENTS
 	select PERF_USE_VMALLOC
 	select RTC_LIB
+	select SET_FS
 	select SPARSE_IRQ
 	help
 	  The SuperH is a RISC processor targeted for use in embedded systems
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index efeff2c896a544..3e0cf0319a278a 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -49,6 +49,7 @@ config SPARC
 	select LOCKDEP_SMALL if LOCKDEP
 	select NEED_DMA_MAP_STATE
 	select NEED_SG_DMA_LENGTH
+	select SET_FS
 
 config SPARC32
 	def_bool !64BIT
diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index eb51fec759484a..3aefcd81566809 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -19,6 +19,7 @@ config UML
 	select GENERIC_CPU_DEVICES
 	select GENERIC_CLOCKEVENTS
 	select HAVE_GCC_PLUGINS
+	select SET_FS
 	select TTY # Needed for line.c
 
 config MMU
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 7101ac64bb209d..f85c13355732fe 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -237,6 +237,7 @@ config X86
 	select HAVE_ARCH_KCSAN			if X86_64
 	select X86_FEATURE_NAMES		if PROC_FS
 	select PROC_PID_ARCH_STATUS		if PROC_FS
+	select SET_FS
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT    if EFI
 
 config INSTRUCTION_DECODER
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index e997e0119c0251..94bad4d66b4bde 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -41,6 +41,7 @@ config XTENSA
 	select IRQ_DOMAIN
 	select MODULES_USE_ELF_RELA
 	select PERF_USE_VMALLOC
+	select SET_FS
 	select VIRT_TO_BUS
 	help
 	  Xtensa processors are 32-bit RISC machines designed by Tensilica
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 94b28541165929..70073c802b48ed 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -8,6 +8,7 @@
 
 #include <asm/uaccess.h>
 
+#ifdef CONFIG_SET_FS
 /*
  * Force the uaccess routines to be wired up for actual userspace access,
  * overriding any possible set_fs(KERNEL_DS) still lingering around.  Undone
@@ -25,6 +26,23 @@ static inline void force_uaccess_end(mm_segment_t oldfs)
 {
 	set_fs(oldfs);
 }
+#else /* CONFIG_SET_FS */
+typedef struct {
+	/* empty dummy */
+} mm_segment_t;
+
+#define uaccess_kernel()		(false)
+#define user_addr_max()			(TASK_SIZE_MAX)
+
+static inline mm_segment_t force_uaccess_begin(void)
+{
+	return (mm_segment_t) { };
+}
+
+static inline void force_uaccess_end(mm_segment_t oldfs)
+{
+}
+#endif /* CONFIG_SET_FS */
 
 /*
  * Architectures should provide two primitives (raw_copy_{to,from}_user())
-- 
2.28.0



* [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (3 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 04/11] uaccess: add infrastructure for kernel builds with set_fs() Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-17  7:50   ` Christophe Leroy
  2020-08-18 19:43   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 06/11] lkdtm: disable set_fs-based " Christoph Hellwig
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

We can't run the tests for userspace bitmap parsing if set_fs() doesn't
exist.
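
The preprocessor pattern used in the diff below can be reduced to the
following sketch.  enum parse_path and run_parse() are hypothetical and
only report which branch was taken; they stand in for the calls to the
user and kernel parsers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which parser a call ended up using. */
enum parse_path {
	PATH_KERNEL,
	PATH_USER,
};

enum parse_path run_parse(bool is_user, const char *in)
{
	enum parse_path path;

	assert(in != NULL);
	(void)is_user;		/* unused when CONFIG_SET_FS is not defined */

#ifdef CONFIG_SET_FS
	if (is_user) {
		/* set_fs(KERNEL_DS) and the _user parser would go here */
		path = PATH_USER;
	} else
#endif /* CONFIG_SET_FS */
	{
		/* plain kernel-pointer parser */
		path = PATH_KERNEL;
	}
	return path;
}
```

Note that when CONFIG_SET_FS is not defined, a call with is_user set does
not skip anything in this sketch: it falls through to the kernel path.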

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 lib/test_bitmap.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index df903c53952bb9..49b1d25fbaf546 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -365,6 +365,7 @@ static void __init __test_bitmap_parselist(int is_user)
 	for (i = 0; i < ARRAY_SIZE(parselist_tests); i++) {
 #define ptest parselist_tests[i]
 
+#ifdef CONFIG_SET_FS
 		if (is_user) {
 			mm_segment_t orig_fs = get_fs();
 			size_t len = strlen(ptest.in);
@@ -375,7 +376,9 @@ static void __init __test_bitmap_parselist(int is_user)
 						    bmap, ptest.nbits);
 			time = ktime_get() - time;
 			set_fs(orig_fs);
-		} else {
+		} else
+#endif /* CONFIG_SET_FS */
+		{
 			time = ktime_get();
 			err = bitmap_parselist(ptest.in, bmap, ptest.nbits);
 			time = ktime_get() - time;
@@ -454,6 +457,7 @@ static void __init __test_bitmap_parse(int is_user)
 	for (i = 0; i < ARRAY_SIZE(parse_tests); i++) {
 		struct test_bitmap_parselist test = parse_tests[i];
 
+#ifdef CONFIG_SET_FS
 		if (is_user) {
 			size_t len = strlen(test.in);
 			mm_segment_t orig_fs = get_fs();
@@ -464,7 +468,9 @@ static void __init __test_bitmap_parse(int is_user)
 						bmap, test.nbits);
 			time = ktime_get() - time;
 			set_fs(orig_fs);
-		} else {
+		} else
+#endif /* CONFIG_SET_FS */
+		{
 			size_t len = test.flags & NO_LEN ?
 				UINT_MAX : strlen(test.in);
 			time = ktime_get();
-- 
2.28.0



* [PATCH 06/11] lkdtm: disable set_fs-based tests for !CONFIG_SET_FS
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (4 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-18 19:32   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 07/11] x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h Christoph Hellwig
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

Once we can't manipulate the address limit, we also can't test what
happens when the manipulation is abused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/misc/lkdtm/bugs.c     | 2 ++
 drivers/misc/lkdtm/core.c     | 4 ++++
 drivers/misc/lkdtm/usercopy.c | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
index 4dfbfd51bdf774..66f1800b1cb82d 100644
--- a/drivers/misc/lkdtm/bugs.c
+++ b/drivers/misc/lkdtm/bugs.c
@@ -312,6 +312,7 @@ void lkdtm_CORRUPT_LIST_DEL(void)
 		pr_err("list_del() corruption not detected!\n");
 }
 
+#ifdef CONFIG_SET_FS
 /* Test if unbalanced set_fs(KERNEL_DS)/set_fs(USER_DS) check exists. */
 void lkdtm_CORRUPT_USER_DS(void)
 {
@@ -321,6 +322,7 @@ void lkdtm_CORRUPT_USER_DS(void)
 	/* Make sure we do not keep running with a KERNEL_DS! */
 	force_sig(SIGKILL);
 }
+#endif
 
 /* Test that VMAP_STACK is actually allocating with a leading guard page */
 void lkdtm_STACK_GUARD_PAGE_LEADING(void)
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index a5e344df916632..aae08b33a7ee2a 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -112,7 +112,9 @@ static const struct crashtype crashtypes[] = {
 	CRASHTYPE(CORRUPT_STACK_STRONG),
 	CRASHTYPE(CORRUPT_LIST_ADD),
 	CRASHTYPE(CORRUPT_LIST_DEL),
+#ifdef CONFIG_SET_FS
 	CRASHTYPE(CORRUPT_USER_DS),
+#endif
 	CRASHTYPE(STACK_GUARD_PAGE_LEADING),
 	CRASHTYPE(STACK_GUARD_PAGE_TRAILING),
 	CRASHTYPE(UNSET_SMEP),
@@ -172,7 +174,9 @@ static const struct crashtype crashtypes[] = {
 	CRASHTYPE(USERCOPY_STACK_FRAME_FROM),
 	CRASHTYPE(USERCOPY_STACK_BEYOND),
 	CRASHTYPE(USERCOPY_KERNEL),
+#ifdef CONFIG_SET_FS
 	CRASHTYPE(USERCOPY_KERNEL_DS),
+#endif
 	CRASHTYPE(STACKLEAK_ERASING),
 	CRASHTYPE(CFI_FORWARD_PROTO),
 #ifdef CONFIG_X86_32
diff --git a/drivers/misc/lkdtm/usercopy.c b/drivers/misc/lkdtm/usercopy.c
index b833367a45d053..4b632fe79ab6bb 100644
--- a/drivers/misc/lkdtm/usercopy.c
+++ b/drivers/misc/lkdtm/usercopy.c
@@ -325,6 +325,7 @@ void lkdtm_USERCOPY_KERNEL(void)
 	vm_munmap(user_addr, PAGE_SIZE);
 }
 
+#ifdef CONFIG_SET_FS
 void lkdtm_USERCOPY_KERNEL_DS(void)
 {
 	char __user *user_ptr =
@@ -339,6 +340,7 @@ void lkdtm_USERCOPY_KERNEL_DS(void)
 		pr_err("copy_to_user() to noncanonical address succeeded!?\n");
 	set_fs(old_fs);
 }
+#endif
 
 void __init lkdtm_usercopy_init(void)
 {
-- 
2.28.0



* [PATCH 07/11] x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (5 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 06/11] lkdtm: disable set_fs-based " Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-18 19:27   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code Christoph Hellwig
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

At least for 64-bit this moves them closer to some of the defines
they are based on, and it prepares for using the TASK_SIZE_MAX
definition from assembly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 arch/x86/include/asm/page_32_types.h | 11 +++++++
 arch/x86/include/asm/page_64_types.h | 38 +++++++++++++++++++++
 arch/x86/include/asm/processor.h     | 49 ----------------------------
 3 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/arch/x86/include/asm/page_32_types.h b/arch/x86/include/asm/page_32_types.h
index 565ad755c785e2..26236925fb2c36 100644
--- a/arch/x86/include/asm/page_32_types.h
+++ b/arch/x86/include/asm/page_32_types.h
@@ -41,6 +41,17 @@
 #define __VIRTUAL_MASK_SHIFT	32
 #endif	/* CONFIG_X86_PAE */
 
+/*
+ * User space process size: 3GB (default).
+ */
+#define IA32_PAGE_OFFSET	PAGE_OFFSET
+#define TASK_SIZE		PAGE_OFFSET
+#define TASK_SIZE_LOW		TASK_SIZE
+#define TASK_SIZE_MAX		TASK_SIZE
+#define DEFAULT_MAP_WINDOW	TASK_SIZE
+#define STACK_TOP		TASK_SIZE
+#define STACK_TOP_MAX		STACK_TOP
+
 /*
  * Kernel image size is limited to 512 MB (see in arch/x86/kernel/head_32.S)
  */
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 288b065955b729..996595c9897e0a 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -58,6 +58,44 @@
 #define __VIRTUAL_MASK_SHIFT	47
 #endif
 
+/*
+ * User space process size.  This is the first address outside the user range.
+ * There are a few constraints that determine this:
+ *
+ * On Intel CPUs, if a SYSCALL instruction is at the highest canonical
+ * address, then that syscall will enter the kernel with a
+ * non-canonical return address, and SYSRET will explode dangerously.
+ * We avoid this particular problem by preventing anything executable
+ * from being mapped at the maximum canonical address.
+ *
+ * On AMD CPUs in the Ryzen family, there's a nasty bug in which the
+ * CPUs malfunction if they execute code from the highest canonical page.
+ * They'll speculate right off the end of the canonical space, and
+ * bad things happen.  This is worked around in the same way as the
+ * Intel problem.
+ *
+ * With page table isolation enabled, we map the LDT in ... [stay tuned]
+ */
+#define TASK_SIZE_MAX	((1UL << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
+
+#define DEFAULT_MAP_WINDOW	((1UL << 47) - PAGE_SIZE)
+
+/* This decides where the kernel will search for a free chunk of vm
+ * space during mmap's.
+ */
+#define IA32_PAGE_OFFSET	((current->personality & ADDR_LIMIT_3GB) ? \
+					0xc0000000 : 0xFFFFe000)
+
+#define TASK_SIZE_LOW		(test_thread_flag(TIF_ADDR32) ? \
+					IA32_PAGE_OFFSET : DEFAULT_MAP_WINDOW)
+#define TASK_SIZE		(test_thread_flag(TIF_ADDR32) ? \
+					IA32_PAGE_OFFSET : TASK_SIZE_MAX)
+#define TASK_SIZE_OF(child)	((test_tsk_thread_flag(child, TIF_ADDR32)) ? \
+					IA32_PAGE_OFFSET : TASK_SIZE_MAX)
+
+#define STACK_TOP		TASK_SIZE_LOW
+#define STACK_TOP_MAX		TASK_SIZE_MAX
+
 /*
  * Maximum kernel image size is limited to 1 GiB, due to the fixmap living
  * in the next 1 GiB (see level2_kernel_pgt in arch/x86/kernel/head_64.S).
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 97143d87994c24..1618eeb08361a9 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -782,17 +782,6 @@ static inline void spin_lock_prefetch(const void *x)
 })
 
 #ifdef CONFIG_X86_32
-/*
- * User space process size: 3GB (default).
- */
-#define IA32_PAGE_OFFSET	PAGE_OFFSET
-#define TASK_SIZE		PAGE_OFFSET
-#define TASK_SIZE_LOW		TASK_SIZE
-#define TASK_SIZE_MAX		TASK_SIZE
-#define DEFAULT_MAP_WINDOW	TASK_SIZE
-#define STACK_TOP		TASK_SIZE
-#define STACK_TOP_MAX		STACK_TOP
-
 #define INIT_THREAD  {							  \
 	.sp0			= TOP_OF_INIT_STACK,			  \
 	.sysenter_cs		= __KERNEL_CS,				  \
@@ -802,44 +791,6 @@ static inline void spin_lock_prefetch(const void *x)
 #define KSTK_ESP(task)		(task_pt_regs(task)->sp)
 
 #else
-/*
- * User space process size.  This is the first address outside the user range.
- * There are a few constraints that determine this:
- *
- * On Intel CPUs, if a SYSCALL instruction is at the highest canonical
- * address, then that syscall will enter the kernel with a
- * non-canonical return address, and SYSRET will explode dangerously.
- * We avoid this particular problem by preventing anything executable
- * from being mapped at the maximum canonical address.
- *
- * On AMD CPUs in the Ryzen family, there's a nasty bug in which the
- * CPUs malfunction if they execute code from the highest canonical page.
- * They'll speculate right off the end of the canonical space, and
- * bad things happen.  This is worked around in the same way as the
- * Intel problem.
- *
- * With page table isolation enabled, we map the LDT in ... [stay tuned]
- */
-#define TASK_SIZE_MAX	((1UL << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
-
-#define DEFAULT_MAP_WINDOW	((1UL << 47) - PAGE_SIZE)
-
-/* This decides where the kernel will search for a free chunk of vm
- * space during mmap's.
- */
-#define IA32_PAGE_OFFSET	((current->personality & ADDR_LIMIT_3GB) ? \
-					0xc0000000 : 0xFFFFe000)
-
-#define TASK_SIZE_LOW		(test_thread_flag(TIF_ADDR32) ? \
-					IA32_PAGE_OFFSET : DEFAULT_MAP_WINDOW)
-#define TASK_SIZE		(test_thread_flag(TIF_ADDR32) ? \
-					IA32_PAGE_OFFSET : TASK_SIZE_MAX)
-#define TASK_SIZE_OF(child)	((test_tsk_thread_flag(child, TIF_ADDR32)) ? \
-					IA32_PAGE_OFFSET : TASK_SIZE_MAX)
-
-#define STACK_TOP		TASK_SIZE_LOW
-#define STACK_TOP_MAX		TASK_SIZE_MAX
-
 #define INIT_THREAD  {						\
 	.addr_limit		= KERNEL_DS,			\
 }
-- 
2.28.0



* [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (6 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 07/11] x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-18 19:44   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 09/11] x86: remove address space overrides using set_fs() Christoph Hellwig
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

For 64-bit the only thing missing was a strategic _AC, and for 32-bit we
need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
definition to escape the explicit unsigned long cast.  This just works
because __PAGE_OFFSET is defined using _AC itself and thus never needs
the cast anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
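
A minimal user-space sketch, not part of the patch, of why the _AC()
wrapper matters.  _AC(X, Y) from include/uapi/linux/const.h pastes the
type suffix onto the constant when compiling C and drops it under
__ASSEMBLY__, because the assembler does not understand a UL suffix.
All DEMO_* names are made up for illustration, and ULL is used instead
of UL only so that the arithmetic also holds on hosts with a 32-bit long.

```c
#include <assert.h>

/* Same shape as _AC() in include/uapi/linux/const.h. */
#ifdef __ASSEMBLY__
#define DEMO_AC(X, Y)	X		/* assembler: bare constant */
#else
#define DEMO__AC(X, Y)	(X##Y)		/* C: paste the type suffix on */
#define DEMO_AC(X, Y)	DEMO__AC(X, Y)
#endif

/* Hypothetical values: 4-level paging with 4 KiB pages. */
#define DEMO_PAGE_SHIFT		12
#define DEMO_PAGE_SIZE		(DEMO_AC(1, ULL) << DEMO_PAGE_SHIFT)
#define DEMO_VIRTUAL_MASK_SHIFT	47

/* Usable from both C and assembly because no bare suffix is left. */
#define DEMO_TASK_SIZE_MAX \
	((DEMO_AC(1, ULL) << DEMO_VIRTUAL_MASK_SHIFT) - DEMO_PAGE_SIZE)

static_assert(sizeof(unsigned long long) >= 8, "need a 64-bit type");

/* Returns the first address past the (hypothetical) user range. */
static unsigned long long demo_task_size_max(void)
{
	return DEMO_TASK_SIZE_MAX;
}
```

With the values above the macro evaluates to 0x7ffffffff000, which is
one page below the 47-bit boundary.
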
 arch/x86/include/asm/page_32_types.h | 4 ++--
 arch/x86/include/asm/page_64_types.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/page_32_types.h b/arch/x86/include/asm/page_32_types.h
index 26236925fb2c36..f462895a33e452 100644
--- a/arch/x86/include/asm/page_32_types.h
+++ b/arch/x86/include/asm/page_32_types.h
@@ -44,8 +44,8 @@
 /*
  * User space process size: 3GB (default).
  */
-#define IA32_PAGE_OFFSET	PAGE_OFFSET
-#define TASK_SIZE		PAGE_OFFSET
+#define IA32_PAGE_OFFSET	__PAGE_OFFSET
+#define TASK_SIZE		__PAGE_OFFSET
 #define TASK_SIZE_LOW		TASK_SIZE
 #define TASK_SIZE_MAX		TASK_SIZE
 #define DEFAULT_MAP_WINDOW	TASK_SIZE
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 996595c9897e0a..838515daf87b36 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -76,7 +76,7 @@
  *
  * With page table isolation enabled, we map the LDT in ... [stay tuned]
  */
-#define TASK_SIZE_MAX	((1UL << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
+#define TASK_SIZE_MAX	((_AC(1,UL) << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
 
 #define DEFAULT_MAP_WINDOW	((1UL << 47) - PAGE_SIZE)
 
-- 
2.28.0



* [PATCH 09/11] x86: remove address space overrides using set_fs()
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (7 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-17  8:23   ` David Laight
  2020-08-18 19:46   ` Kees Cook
  2020-08-17  7:32 ` [PATCH 10/11] powerpc: use non-set_fs based maccess routines Christoph Hellwig
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

Stop providing the possibility to override the address space using
set_fs() now that there is no need for that any more.  To properly
handle the TASK_SIZE_MAX checking for 4 vs 5-level page tables on
x86 a new alternative is introduced, which just like the one in
entry_64.S has to use the hardcoded virtual address bits to escape
the fact that TASK_SIZE_MAX isn't actually a constant when 5-level
page tables are enabled.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
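
A rough user-space model, not part of the patch, of the check that is
left once the per-task address limit is gone: the limit becomes a fixed
value that depends only on whether 4-level (47 virtual address bits) or
5-level (56 bits) page tables are in use.  The demo_* names are
hypothetical, and the mask helper only models the arithmetic of the
cmp/sbb/and sequence in getuser.S; a C compiler is free to emit a branch
for it, so it makes no speculation-safety claim of its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE	4096ULL

/*
 * First address past the user range: one page below the top of the
 * lower canonical half.  Only 47 (4-level) and 56 (5-level) are valid.
 */
static uint64_t demo_task_size_max(unsigned int va_bits)
{
	assert(va_bits == 47 || va_bits == 56);
	return (1ULL << va_bits) - DEMO_PAGE_SIZE;
}

/* True if [addr, addr + size) lies entirely below the limit. */
static bool demo_access_ok(uint64_t addr, uint64_t size, unsigned int va_bits)
{
	uint64_t limit = demo_task_size_max(va_bits);
	uint64_t end = addr + size;

	if (end < addr)		/* addr + size wrapped around */
		return false;
	return end <= limit;
}

/*
 * Models "cmp limit,addr; sbb reg,reg; and reg,addr": the mask is all
 * ones when addr is below the limit and zero otherwise, so an
 * out-of-range address collapses to 0 instead of being used as is.
 */
static uint64_t demo_mask_user_addr(uint64_t addr, unsigned int va_bits)
{
	uint64_t limit = demo_task_size_max(va_bits);
	uint64_t mask = (addr < limit) ? ~0ULL : 0;

	return addr & mask;
}
```

The two limits work out to 0x00007ffffffff000 and 0x00fffffffffff000,
matching the immediates hardcoded in the ALTERNATIVE below.
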
 arch/x86/Kconfig                   |  1 -
 arch/x86/ia32/ia32_aout.c          |  1 -
 arch/x86/include/asm/processor.h   | 11 +----------
 arch/x86/include/asm/thread_info.h |  2 --
 arch/x86/include/asm/uaccess.h     | 26 +-------------------------
 arch/x86/kernel/asm-offsets.c      |  3 ---
 arch/x86/lib/getuser.S             | 28 ++++++++++++++++++----------
 arch/x86/lib/putuser.S             | 21 ++++++++++++---------
 8 files changed, 32 insertions(+), 61 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f85c13355732fe..7101ac64bb209d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -237,7 +237,6 @@ config X86
 	select HAVE_ARCH_KCSAN			if X86_64
 	select X86_FEATURE_NAMES		if PROC_FS
 	select PROC_PID_ARCH_STATUS		if PROC_FS
-	select SET_FS
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT    if EFI
 
 config INSTRUCTION_DECODER
diff --git a/arch/x86/ia32/ia32_aout.c b/arch/x86/ia32/ia32_aout.c
index ca8a657edf5977..a09fc37ead9d47 100644
--- a/arch/x86/ia32/ia32_aout.c
+++ b/arch/x86/ia32/ia32_aout.c
@@ -239,7 +239,6 @@ static int load_aout_binary(struct linux_binprm *bprm)
 	(regs)->ss = __USER32_DS;
 	regs->r8 = regs->r9 = regs->r10 = regs->r11 =
 	regs->r12 = regs->r13 = regs->r14 = regs->r15 = 0;
-	set_fs(USER_DS);
 	return 0;
 }
 
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 1618eeb08361a9..189573d95c3af6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -482,10 +482,6 @@ extern unsigned int fpu_user_xstate_size;
 
 struct perf_event;
 
-typedef struct {
-	unsigned long		seg;
-} mm_segment_t;
-
 struct thread_struct {
 	/* Cached TLS descriptors: */
 	struct desc_struct	tls_array[GDT_ENTRY_TLS_ENTRIES];
@@ -538,8 +534,6 @@ struct thread_struct {
 	 */
 	unsigned long		iopl_emul;
 
-	mm_segment_t		addr_limit;
-
 	unsigned int		sig_on_uaccess_err:1;
 
 	/* Floating point and extended processor state */
@@ -785,15 +779,12 @@ static inline void spin_lock_prefetch(const void *x)
 #define INIT_THREAD  {							  \
 	.sp0			= TOP_OF_INIT_STACK,			  \
 	.sysenter_cs		= __KERNEL_CS,				  \
-	.addr_limit		= KERNEL_DS,				  \
 }
 
 #define KSTK_ESP(task)		(task_pt_regs(task)->sp)
 
 #else
-#define INIT_THREAD  {						\
-	.addr_limit		= KERNEL_DS,			\
-}
+#define INIT_THREAD { }
 
 extern unsigned long KSTK_ESP(struct task_struct *task);
 
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 267701ae3d86dd..44733a4bfc4294 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -102,7 +102,6 @@ struct thread_info {
 #define TIF_SYSCALL_TRACEPOINT	28	/* syscall tracepoint instrumentation */
 #define TIF_ADDR32		29	/* 32-bit address space on 64 bits */
 #define TIF_X32			30	/* 32-bit native x86-64 binary */
-#define TIF_FSCHECK		31	/* Check FS is USER_DS on return */
 
 #define _TIF_SYSCALL_TRACE	(1 << TIF_SYSCALL_TRACE)
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
@@ -131,7 +130,6 @@ struct thread_info {
 #define _TIF_SYSCALL_TRACEPOINT	(1 << TIF_SYSCALL_TRACEPOINT)
 #define _TIF_ADDR32		(1 << TIF_ADDR32)
 #define _TIF_X32		(1 << TIF_X32)
-#define _TIF_FSCHECK		(1 << TIF_FSCHECK)
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW_BASE					\
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index ecefaffd15d4c8..a4ceda0510ea87 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -12,30 +12,6 @@
 #include <asm/smap.h>
 #include <asm/extable.h>
 
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not.  If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- */
-
-#define MAKE_MM_SEG(s)	((mm_segment_t) { (s) })
-
-#define KERNEL_DS	MAKE_MM_SEG(-1UL)
-#define USER_DS 	MAKE_MM_SEG(TASK_SIZE_MAX)
-
-#define get_fs()	(current->thread.addr_limit)
-static inline void set_fs(mm_segment_t fs)
-{
-	current->thread.addr_limit = fs;
-	/* On user-mode return, check fs is correct */
-	set_thread_flag(TIF_FSCHECK);
-}
-
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-#define user_addr_max() (current->thread.addr_limit.seg)
-
 /*
  * Test whether a block of memory is a valid user space address.
  * Returns 0 if the range is valid, nonzero otherwise.
@@ -93,7 +69,7 @@ static inline bool pagefault_disabled(void);
 #define access_ok(addr, size)					\
 ({									\
 	WARN_ON_IN_IRQ();						\
-	likely(!__range_not_ok(addr, size, user_addr_max()));		\
+	likely(!__range_not_ok(addr, size, TASK_SIZE_MAX));		\
 })
 
 /*
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 3ca07ad552ae0c..70b7154f4bdd62 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -37,9 +37,6 @@ static void __used common(void)
 	OFFSET(TASK_stack_canary, task_struct, stack_canary);
 #endif
 
-	BLANK();
-	OFFSET(TASK_addr_limit, task_struct, thread.addr_limit);
-
 	BLANK();
 	OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);
 
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index c8a85b512796e1..ccc9808c66420a 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -35,10 +35,18 @@
 #include <asm/smap.h>
 #include <asm/export.h>
 
+#ifdef CONFIG_X86_5LEVEL
+#define LOAD_TASK_SIZE_MAX \
+	ALTERNATIVE "mov $((1 << 47) - 4096),%rdx", \
+		    "mov $((1 << 56) - 4096),%rdx", X86_FEATURE_LA57
+#else
+#define LOAD_TASK_SIZE_MAX	mov $TASK_SIZE_MAX,%_ASM_DX
+#endif
+
 	.text
 SYM_FUNC_START(__get_user_1)
-	mov PER_CPU_VAR(current_task), %_ASM_DX
-	cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
+	LOAD_TASK_SIZE_MAX
+	cmp %_ASM_DX,%_ASM_AX
 	jae bad_get_user
 	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
 	and %_ASM_DX, %_ASM_AX
@@ -53,8 +61,8 @@ EXPORT_SYMBOL(__get_user_1)
 SYM_FUNC_START(__get_user_2)
 	add $1,%_ASM_AX
 	jc bad_get_user
-	mov PER_CPU_VAR(current_task), %_ASM_DX
-	cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
+	LOAD_TASK_SIZE_MAX
+	cmp %_ASM_DX,%_ASM_AX
 	jae bad_get_user
 	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
 	and %_ASM_DX, %_ASM_AX
@@ -69,8 +77,8 @@ EXPORT_SYMBOL(__get_user_2)
 SYM_FUNC_START(__get_user_4)
 	add $3,%_ASM_AX
 	jc bad_get_user
-	mov PER_CPU_VAR(current_task), %_ASM_DX
-	cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
+	LOAD_TASK_SIZE_MAX
+	cmp %_ASM_DX,%_ASM_AX
 	jae bad_get_user
 	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
 	and %_ASM_DX, %_ASM_AX
@@ -86,8 +94,8 @@ SYM_FUNC_START(__get_user_8)
 #ifdef CONFIG_X86_64
 	add $7,%_ASM_AX
 	jc bad_get_user
-	mov PER_CPU_VAR(current_task), %_ASM_DX
-	cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
+	LOAD_TASK_SIZE_MAX
+	cmp %_ASM_DX,%_ASM_AX
 	jae bad_get_user
 	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
 	and %_ASM_DX, %_ASM_AX
@@ -99,8 +107,8 @@ SYM_FUNC_START(__get_user_8)
 #else
 	add $7,%_ASM_AX
 	jc bad_get_user_8
-	mov PER_CPU_VAR(current_task), %_ASM_DX
-	cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
+	LOAD_TASK_SIZE_MAX
+	cmp %_ASM_DX,%_ASM_AX
 	jae bad_get_user_8
 	sbb %_ASM_DX, %_ASM_DX		/* array_index_mask_nospec() */
 	and %_ASM_DX, %_ASM_AX
diff --git a/arch/x86/lib/putuser.S b/arch/x86/lib/putuser.S
index 7c7c92db8497af..f5a56394985875 100644
--- a/arch/x86/lib/putuser.S
+++ b/arch/x86/lib/putuser.S
@@ -31,12 +31,18 @@
  * as they get called from within inline assembly.
  */
 
-#define ENTER	mov PER_CPU_VAR(current_task), %_ASM_BX
+#ifdef CONFIG_X86_5LEVEL
+#define LOAD_TASK_SIZE_MAX \
+	ALTERNATIVE "mov $((1 << 47) - 4096),%rbx", \
+		    "mov $((1 << 56) - 4096),%rbx", X86_FEATURE_LA57
+#else
+#define LOAD_TASK_SIZE_MAX	mov $TASK_SIZE_MAX,%_ASM_BX
+#endif
 
 .text
 SYM_FUNC_START(__put_user_1)
-	ENTER
-	cmp TASK_addr_limit(%_ASM_BX),%_ASM_CX
+	LOAD_TASK_SIZE_MAX
+	cmp %_ASM_BX,%_ASM_CX
 	jae .Lbad_put_user
 	ASM_STAC
 1:	movb %al,(%_ASM_CX)
@@ -47,8 +53,7 @@ SYM_FUNC_END(__put_user_1)
 EXPORT_SYMBOL(__put_user_1)
 
 SYM_FUNC_START(__put_user_2)
-	ENTER
-	mov TASK_addr_limit(%_ASM_BX),%_ASM_BX
+	LOAD_TASK_SIZE_MAX
 	sub $1,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae .Lbad_put_user
@@ -61,8 +66,7 @@ SYM_FUNC_END(__put_user_2)
 EXPORT_SYMBOL(__put_user_2)
 
 SYM_FUNC_START(__put_user_4)
-	ENTER
-	mov TASK_addr_limit(%_ASM_BX),%_ASM_BX
+	LOAD_TASK_SIZE_MAX
 	sub $3,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae .Lbad_put_user
@@ -75,8 +79,7 @@ SYM_FUNC_END(__put_user_4)
 EXPORT_SYMBOL(__put_user_4)
 
 SYM_FUNC_START(__put_user_8)
-	ENTER
-	mov TASK_addr_limit(%_ASM_BX),%_ASM_BX
+	LOAD_TASK_SIZE_MAX
 	sub $7,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae .Lbad_put_user
-- 
2.28.0



* [PATCH 10/11] powerpc: use non-set_fs based maccess routines
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (8 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 09/11] x86: remove address space overrides using set_fs() Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-17 15:47   ` Christophe Leroy
  2020-08-17  7:32 ` [PATCH 11/11] powerpc: remove address space overrides using set_fs() Christoph Hellwig
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

Provide __get_kernel_nofault and __put_kernel_nofault routines to
implement the maccess routines without messing with set_fs and without
opening up access to user space.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
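
A user-space sketch, not part of the patch, of the goto-label calling
convention that these macros follow, and of roughly how a generic
caller can walk a buffer in progressively smaller chunks.  Everything
named demo_* is hypothetical.  In particular, a NULL source pointer
stands in for a faulting address here; the real macros rely on
exception table fixups, which cannot be reproduced in a plain C program.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy one object of the given type, or jump to err_label on "fault". */
#define demo_get_nofault(dst, src, type, err_label)			\
do {									\
	if ((src) == NULL)	/* NULL models a faulting address */	\
		goto err_label;						\
	memcpy((dst), (src), sizeof(type));				\
} while (0)

/* Consume as many type-sized chunks as still fit into len. */
#define demo_copy_loop(dst, src, len, type, err_label)			\
	while ((len) >= sizeof(type)) {					\
		demo_get_nofault(dst, src, type, err_label);		\
		(dst) += sizeof(type);					\
		(src) += sizeof(type);					\
		(len) -= sizeof(type);					\
	}

/* Returns 0 on success and -EFAULT if any access "faulted". */
static long demo_copy_nofault(void *dst, const void *src, size_t size)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	assert(dst != NULL);	/* only the source may "fault" here */

	demo_copy_loop(d, s, size, uint64_t, fault);
	demo_copy_loop(d, s, size, uint32_t, fault);
	demo_copy_loop(d, s, size, uint16_t, fault);
	demo_copy_loop(d, s, size, uint8_t, fault);
	return 0;
fault:
	return -EFAULT;
}
```

Passing a label instead of returning an error code keeps the success
path free of per-access error checks, which is the point of the
err_label parameter in the macros below.
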
 arch/powerpc/include/asm/uaccess.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 00699903f1efca..a31de40ac00b62 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -623,4 +623,20 @@ do {									\
 		__put_user_goto(*(u8*)(_src + _i), (u8 __user *)(_dst + _i), e);\
 } while (0)
 
+#define HAVE_GET_KERNEL_NOFAULT
+
+#define __get_kernel_nofault(dst, src, type, err_label)			\
+do {									\
+	int __kr_err;							\
+									\
+	__get_user_size(*((type *)(dst)), (__force type __user *)(src),	\
+			sizeof(type), __kr_err);			\
+	if (unlikely(__kr_err))						\
+		goto err_label;						\
+} while (0)
+
+#define __put_kernel_nofault(dst, src, type, err_label)			\
+	__put_user_size_goto(*((type *)(src)),				\
+		(__force type __user *)(dst), sizeof(type), err_label)
+
 #endif	/* _ARCH_POWERPC_UACCESS_H */
-- 
2.28.0



* [PATCH 11/11] powerpc: remove address space overrides using set_fs()
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (9 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 10/11] powerpc: use non-set_fs based maccess routines Christoph Hellwig
@ 2020-08-17  7:32 ` Christoph Hellwig
  2020-08-17  7:39 ` remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
  2020-08-18 17:46 ` Christophe Leroy
  12 siblings, 0 replies; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:32 UTC (permalink / raw)
  To: Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

Stop providing the possibility to override the address space using
set_fs() now that there is no need for that any more.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
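
A user-space model, not part of the patch, of the two __access_ok()
variants as posted in the uaccess.h hunk below, which makes the edge
cases easy to poke at.  The limits are hypothetical placeholders; the
real TASK_SIZE and TASK_SIZE_USER64 depend on the kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits, chosen only for illustration. */
#define DEMO_TASK_SIZE_MAX_32	UINT32_C(0xc0000000)
#define DEMO_TASK_SIZE_MAX_64	(UINT64_C(1) << 52)

/*
 * The 64-bit check never computes addr + size.  That is only sound if
 * the sum cannot wrap, i.e. if there is a large gap above the user range.
 */
static_assert(DEMO_TASK_SIZE_MAX_64 <= UINT64_MAX / 2,
	      "64-bit check relies on a gap above the user range");

/* 32-bit flavour: exact range check against the limit. */
static bool demo_access_ok_32(uint32_t addr, uint32_t size)
{
	if (addr >= DEMO_TASK_SIZE_MAX_32)
		return false;
	if (size == 0)
		return false;
	return size <= DEMO_TASK_SIZE_MAX_32 - addr;
}

/* 64-bit flavour: address and size are bounded independently. */
static bool demo_access_ok_64(uint64_t addr, uint64_t size)
{
	if (addr >= DEMO_TASK_SIZE_MAX_64)
		return false;
	return size <= DEMO_TASK_SIZE_MAX_64;
}
```

One thing the model makes visible: as posted, the 32-bit variant
rejects a zero-length range, while the helper being removed accepted
size == 0 for an in-range address, and the 64-bit variant still does.
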
 arch/powerpc/Kconfig                   |  1 -
 arch/powerpc/include/asm/processor.h   |  7 ---
 arch/powerpc/include/asm/thread_info.h |  5 +--
 arch/powerpc/include/asm/uaccess.h     | 62 ++++++++------------------
 arch/powerpc/kernel/signal.c           |  3 --
 arch/powerpc/lib/sstep.c               |  6 +--
 6 files changed, 22 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 3f09d6fdf89405..1f48bbfb3ce99d 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -249,7 +249,6 @@ config PPC
 	select PCI_SYSCALL			if PCI
 	select PPC_DAWR				if PPC64
 	select RTC_LIB
-	select SET_FS
 	select SPARSE_IRQ
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index ed0d633ab5aa42..f01e4d650c520a 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -83,10 +83,6 @@ struct task_struct;
 void start_thread(struct pt_regs *regs, unsigned long fdptr, unsigned long sp);
 void release_thread(struct task_struct *);
 
-typedef struct {
-	unsigned long seg;
-} mm_segment_t;
-
 #define TS_FPR(i) fp_state.fpr[i][TS_FPROFFSET]
 #define TS_CKFPR(i) ckfp_state.fpr[i][TS_FPROFFSET]
 
@@ -148,7 +144,6 @@ struct thread_struct {
 	unsigned long	ksp_vsid;
 #endif
 	struct pt_regs	*regs;		/* Pointer to saved register state */
-	mm_segment_t	addr_limit;	/* for get_fs() validation */
 #ifdef CONFIG_BOOKE
 	/* BookE base exception scratch space; align on cacheline */
 	unsigned long	normsave[8] ____cacheline_aligned;
@@ -295,7 +290,6 @@ struct thread_struct {
 #define INIT_THREAD { \
 	.ksp = INIT_SP, \
 	.ksp_limit = INIT_SP_LIMIT, \
-	.addr_limit = KERNEL_DS, \
 	.pgdir = swapper_pg_dir, \
 	.fpexc_mode = MSR_FE0 | MSR_FE1, \
 	SPEFSCR_INIT \
@@ -303,7 +297,6 @@ struct thread_struct {
 #else
 #define INIT_THREAD  { \
 	.ksp = INIT_SP, \
-	.addr_limit = KERNEL_DS, \
 	.fpexc_mode = 0, \
 }
 #endif
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index ca6c9702570494..46a210b03d2b80 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -90,7 +90,6 @@ void arch_setup_new_exec(void);
 #define TIF_SYSCALL_TRACE	0	/* syscall trace active */
 #define TIF_SIGPENDING		1	/* signal pending */
 #define TIF_NEED_RESCHED	2	/* rescheduling necessary */
-#define TIF_FSCHECK		3	/* Check FS is USER_DS on return */
 #define TIF_SYSCALL_EMU		4	/* syscall emulation active */
 #define TIF_RESTORE_TM		5	/* need to restore TM FP/VEC/VSX */
 #define TIF_PATCH_PENDING	6	/* pending live patching update */
@@ -130,7 +129,6 @@ void arch_setup_new_exec(void);
 #define _TIF_SYSCALL_TRACEPOINT	(1<<TIF_SYSCALL_TRACEPOINT)
 #define _TIF_EMULATE_STACK_STORE	(1<<TIF_EMULATE_STACK_STORE)
 #define _TIF_NOHZ		(1<<TIF_NOHZ)
-#define _TIF_FSCHECK		(1<<TIF_FSCHECK)
 #define _TIF_SYSCALL_EMU	(1<<TIF_SYSCALL_EMU)
 #define _TIF_SYSCALL_DOTRACE	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SECCOMP | _TIF_SYSCALL_TRACEPOINT | \
@@ -138,8 +136,7 @@ void arch_setup_new_exec(void);
 
 #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
 				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
-				 _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
-				 _TIF_FSCHECK)
+				 _TIF_RESTORE_TM | _TIF_PATCH_PENDING)
 #define _TIF_PERSYSCALL_MASK	(_TIF_RESTOREALL|_TIF_NOERROR)
 
 /* Bits in local_flags */
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index a31de40ac00b62..38ed408bee6cb2 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -8,62 +8,36 @@
 #include <asm/extable.h>
 #include <asm/kup.h>
 
-/*
- * The fs value determines whether argument validity checking should be
- * performed or not.  If get_fs() == USER_DS, checking is performed, with
- * get_fs() == KERNEL_DS, checking is bypassed.
- *
- * For historical reasons, these macros are grossly misnamed.
- *
- * The fs/ds values are now the highest legal address in the "segment".
- * This simplifies the checking in the routines below.
- */
-
-#define MAKE_MM_SEG(s)  ((mm_segment_t) { (s) })
-
-#define KERNEL_DS	MAKE_MM_SEG(~0UL)
 #ifdef __powerpc64__
 /* We use TASK_SIZE_USER64 as TASK_SIZE is not constant */
-#define USER_DS		MAKE_MM_SEG(TASK_SIZE_USER64 - 1)
-#else
-#define USER_DS		MAKE_MM_SEG(TASK_SIZE - 1)
-#endif
-
-#define get_fs()	(current->thread.addr_limit)
+#define TASK_SIZE_MAX		TASK_SIZE_USER64
 
-static inline void set_fs(mm_segment_t fs)
+static inline bool __access_ok(unsigned long addr, unsigned long size)
 {
-	current->thread.addr_limit = fs;
-	/* On user-mode return check addr_limit (fs) is correct */
-	set_thread_flag(TIF_FSCHECK);
+	if (addr >= TASK_SIZE_MAX)
+		return false;
+	/*
+	 * This check is sufficient because there is a large enough gap between
+	 * user addresses and the kernel addresses.
+	 */
+	return size <= TASK_SIZE_MAX;
 }
-
-#define uaccess_kernel() (get_fs().seg == KERNEL_DS.seg)
-#define user_addr_max()	(get_fs().seg)
-
-#ifdef __powerpc64__
-/*
- * This check is sufficient because there is a large enough
- * gap between user addresses and the kernel addresses
- */
-#define __access_ok(addr, size, segment)	\
-	(((addr) <= (segment).seg) && ((size) <= (segment).seg))
-
 #else
+#define TASK_SIZE_MAX		TASK_SIZE
 
-static inline int __access_ok(unsigned long addr, unsigned long size,
-			mm_segment_t seg)
+static inline bool __access_ok(unsigned long addr, unsigned long size)
 {
-	if (addr > seg.seg)
-		return 0;
-	return (size == 0 || size - 1 <= seg.seg - addr);
+	if (addr >= TASK_SIZE_MAX)
+		return false;
+	if (size == 0)
+		return false;
+	return size <= TASK_SIZE_MAX - addr;
 }
-
-#endif
+#endif /* __powerpc64__ */
 
 #define access_ok(addr, size)		\
 	(__chk_user_ptr(addr),		\
-	 __access_ok((__force unsigned long)(addr), (size), get_fs()))
+	 __access_ok((unsigned long)(addr), (size)))
 
 /*
  * These are the main single-value transfer routines.  They automatically
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
index d15a98c758b8b4..df547d8e31e49c 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -312,9 +312,6 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
 {
 	user_exit();
 
-	/* Check valid addr_limit, TIF check is done there */
-	addr_limit_user_check();
-
 	if (thread_info_flags & _TIF_UPROBE)
 		uprobe_notify_resume(regs);
 
diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index caee8cc77e1954..8342188ea1acd0 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -108,11 +108,11 @@ static nokprobe_inline long address_ok(struct pt_regs *regs,
 {
 	if (!user_mode(regs))
 		return 1;
-	if (__access_ok(ea, nb, USER_DS))
+	if (__access_ok(ea, nb))
 		return 1;
-	if (__access_ok(ea, 1, USER_DS))
+	if (__access_ok(ea, 1))
 		/* Access overlaps the end of the user region */
-		regs->dar = USER_DS.seg;
+		regs->dar = TASK_SIZE_MAX - 1;
 	else
 		regs->dar = ea;
 	return 0;
-- 
2.28.0
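
For readers following the range checks in the uaccess.h hunk above, here is a minimal user-space sketch of the two new `__access_ok()` variants. The `sketch_` names and the explicit `limit` parameter are assumptions made for illustration; the patch itself compares against `TASK_SIZE_MAX` directly.

```c
#include <stdbool.h>

/*
 * Sketch of the 64-bit variant: reject addresses at or above the limit,
 * then only bound the size.  Per the comment in the patch, this relies on
 * a large gap between user and kernel addresses.
 */
static bool sketch_access_ok_64(unsigned long addr, unsigned long size,
				unsigned long limit)
{
	if (addr >= limit)
		return false;
	return size <= limit;
}

/*
 * Sketch of the 32-bit variant: the range must end at or below the limit.
 * As posted, a zero size is rejected here.
 */
static bool sketch_access_ok_32(unsigned long addr, unsigned long size,
				unsigned long limit)
{
	if (addr >= limit)
		return false;
	if (size == 0)
		return false;
	return size <= limit - addr;
}
```

Note that the removed 32-bit helper returned true for `size == 0` whenever the address was in range, while the version in this patch returns false, so the two are not strictly equivalent for zero-length accesses.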



* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (10 preceding siblings ...)
  2020-08-17  7:32 ` [PATCH 11/11] powerpc: remove address space overrides using set_fs() Christoph Hellwig
@ 2020-08-17  7:39 ` Christoph Hellwig
  2020-08-18 17:46 ` Christophe Leroy
  12 siblings, 0 replies; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Al Viro, Michael Ellerman, x86, Kees Cook, linux-kernel,
	linux-fsdevel, linux-arch, linuxppc-dev

Adding Linus as I forgot to add him to the patch bomb, sorry..

On Mon, Aug 17, 2020 at 09:32:01AM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series removes the last set_fs() used to force a kernel address
> space for the uaccess code in the kernel read/write/splice code, and then
> stops implementing the address space overrides entirely for x86 and
> powerpc.
> 
> The file system part has been posted a few times, and the read/write side
> has been pretty much unchanged.  For splice this series drops the
> conversion of the seq_file and sysctl code to the iter ops, and thus loses
> the splice support for them.  The reason for that is that it caused a lot
> of churn for not much use - splice for these small files really isn't much
> of a win, even if existing userspace uses it.  All callers I found do the
> proper fallback, but if this turns out to be an issue the conversion can
> be resurrected.
> 
> Besides x86 and powerpc I plan to eventually convert all other
> architectures, although this will be a slow process, starting with the
> easier ones once the infrastructure is merged.  The process to convert
> architectures is roughly:
> 
>  - ensure there is no set_fs(KERNEL_DS) left in arch specific code
>  - implement __get_kernel_nofault and __put_kernel_nofault
>  - remove the arch specific address limitation functionality
> 
> Diffstat:
>  arch/Kconfig                           |    3 
>  arch/alpha/Kconfig                     |    1 
>  arch/arc/Kconfig                       |    1 
>  arch/arm/Kconfig                       |    1 
>  arch/arm64/Kconfig                     |    1 
>  arch/c6x/Kconfig                       |    1 
>  arch/csky/Kconfig                      |    1 
>  arch/h8300/Kconfig                     |    1 
>  arch/hexagon/Kconfig                   |    1 
>  arch/ia64/Kconfig                      |    1 
>  arch/m68k/Kconfig                      |    1 
>  arch/microblaze/Kconfig                |    1 
>  arch/mips/Kconfig                      |    1 
>  arch/nds32/Kconfig                     |    1 
>  arch/nios2/Kconfig                     |    1 
>  arch/openrisc/Kconfig                  |    1 
>  arch/parisc/Kconfig                    |    1 
>  arch/powerpc/include/asm/processor.h   |    7 -
>  arch/powerpc/include/asm/thread_info.h |    5 -
>  arch/powerpc/include/asm/uaccess.h     |   78 ++++++++-----------
>  arch/powerpc/kernel/signal.c           |    3 
>  arch/powerpc/lib/sstep.c               |    6 -
>  arch/riscv/Kconfig                     |    1 
>  arch/s390/Kconfig                      |    1 
>  arch/sh/Kconfig                        |    1 
>  arch/sparc/Kconfig                     |    1 
>  arch/um/Kconfig                        |    1 
>  arch/x86/ia32/ia32_aout.c              |    1 
>  arch/x86/include/asm/page_32_types.h   |   11 ++
>  arch/x86/include/asm/page_64_types.h   |   38 +++++++++
>  arch/x86/include/asm/processor.h       |   60 ---------------
>  arch/x86/include/asm/thread_info.h     |    2 
>  arch/x86/include/asm/uaccess.h         |   26 ------
>  arch/x86/kernel/asm-offsets.c          |    3 
>  arch/x86/lib/getuser.S                 |   28 ++++---
>  arch/x86/lib/putuser.S                 |   21 +++--
>  arch/xtensa/Kconfig                    |    1 
>  drivers/char/mem.c                     |   16 ----
>  drivers/misc/lkdtm/bugs.c              |    2 
>  drivers/misc/lkdtm/core.c              |    4 +
>  drivers/misc/lkdtm/usercopy.c          |    2 
>  fs/read_write.c                        |   69 ++++++++++-------
>  fs/splice.c                            |  130 +++------------------------------
>  include/linux/fs.h                     |    2 
>  include/linux/uaccess.h                |   18 ++++
>  lib/test_bitmap.c                      |   10 ++
>  46 files changed, 235 insertions(+), 332 deletions(-)
---end quoted text---


* Re: [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS
  2020-08-17  7:32 ` [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS Christoph Hellwig
@ 2020-08-17  7:50   ` Christophe Leroy
  2020-08-17  7:52     ` Christoph Hellwig
  2020-08-18 19:43   ` Kees Cook
  1 sibling, 1 reply; 39+ messages in thread
From: Christophe Leroy @ 2020-08-17  7:50 UTC (permalink / raw)
  To: Christoph Hellwig, Al Viro, Michael Ellerman, x86
  Cc: linux-fsdevel, linux-arch, linuxppc-dev, Kees Cook, linux-kernel



Le 17/08/2020 à 09:32, Christoph Hellwig a écrit :
> We can't run the tests for userspace bitmap parsing if set_fs() doesn't
> exist.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   lib/test_bitmap.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> index df903c53952bb9..49b1d25fbaf546 100644
> --- a/lib/test_bitmap.c
> +++ b/lib/test_bitmap.c
> @@ -365,6 +365,7 @@ static void __init __test_bitmap_parselist(int is_user)
>   	for (i = 0; i < ARRAY_SIZE(parselist_tests); i++) {
>   #define ptest parselist_tests[i]
>   
> +#ifdef CONFIG_SET_FS

get_fs() and set_fs() have stubs for when an arch doesn't define them, 
so I think it would be cleaner if you were using 
'if (IS_ENABLED(CONFIG_SET_FS) && is_user)' instead of an ifdefery in the 
middle of the if/else.
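
A minimal user-space sketch of the suggested pattern, assuming a hypothetical compile-time constant `SKETCH_SET_FS` that stands in for `IS_ENABLED(CONFIG_SET_FS)`. Both branches are always compiled and the dead one is left for the compiler to discard, which is why this style only works when stubs for get_fs()/set_fs() exist.

```c
#include <stdbool.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_SET_FS); 0 means "not set". */
#ifndef SKETCH_SET_FS
#define SKETCH_SET_FS 0
#endif

enum sketch_path { SKETCH_KERNEL_BUF, SKETCH_USER_BUF };

/*
 * Pick the test path with a plain C condition instead of an #ifdef
 * wedged into the middle of the if/else.
 */
static enum sketch_path sketch_choose_path(bool is_user)
{
	if (SKETCH_SET_FS && is_user)
		return SKETCH_USER_BUF;	/* would use get_fs()/set_fs() */
	return SKETCH_KERNEL_BUF;	/* plain kernel-buffer parsing */
}
```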

Christophe


>   		if (is_user) {
>   			mm_segment_t orig_fs = get_fs();
>   			size_t len = strlen(ptest.in);
> @@ -375,7 +376,9 @@ static void __init __test_bitmap_parselist(int is_user)
>   						    bmap, ptest.nbits);
>   			time = ktime_get() - time;
>   			set_fs(orig_fs);
> -		} else {
> +		} else
> +#endif /* CONFIG_SET_FS */
> +		{
>   			time = ktime_get();
>   			err = bitmap_parselist(ptest.in, bmap, ptest.nbits);
>   			time = ktime_get() - time;
> @@ -454,6 +457,7 @@ static void __init __test_bitmap_parse(int is_user)
>   	for (i = 0; i < ARRAY_SIZE(parse_tests); i++) {
>   		struct test_bitmap_parselist test = parse_tests[i];
>   
> +#ifdef CONFIG_SET_FS
>   		if (is_user) {
>   			size_t len = strlen(test.in);
>   			mm_segment_t orig_fs = get_fs();
> @@ -464,7 +468,9 @@ static void __init __test_bitmap_parse(int is_user)
>   						bmap, test.nbits);
>   			time = ktime_get() - time;
>   			set_fs(orig_fs);
> -		} else {
> +		} else
> +#endif /* CONFIG_SET_FS */
> +		{
>   			size_t len = test.flags & NO_LEN ?
>   				UINT_MAX : strlen(test.in);
>   			time = ktime_get();
> 


* Re: [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS
  2020-08-17  7:50   ` Christophe Leroy
@ 2020-08-17  7:52     ` Christoph Hellwig
  0 siblings, 0 replies; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-17  7:52 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Christoph Hellwig, Al Viro, Michael Ellerman, x86, linux-fsdevel,
	linux-arch, linuxppc-dev, Kees Cook, linux-kernel

On Mon, Aug 17, 2020 at 09:50:05AM +0200, Christophe Leroy wrote:
>
>
> Le 17/08/2020 à 09:32, Christoph Hellwig a écrit :
>> We can't run the tests for userspace bitmap parsing if set_fs() doesn't
>> exist.
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> ---
>>   lib/test_bitmap.c | 10 ++++++++--
>>   1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
>> index df903c53952bb9..49b1d25fbaf546 100644
>> --- a/lib/test_bitmap.c
>> +++ b/lib/test_bitmap.c
>> @@ -365,6 +365,7 @@ static void __init __test_bitmap_parselist(int is_user)
>>   	for (i = 0; i < ARRAY_SIZE(parselist_tests); i++) {
>>   #define ptest parselist_tests[i]
>>   +#ifdef CONFIG_SET_FS
>
> get_fs() and set_fs() have stubs for when an arch doesn't define them, so I 
> think it would be cleaner if you were using 'if (IS_ENABLED(CONFIG_SET_FS) 
> && is_user)' instead of an ifdefery in the middle of the if/else.

No, I don't provide stubs in the prep patch, and that has been intentional
as I don't want this to spread much.  test_bitmap would be the only place
where they are somewhat useful, and I just hope this test is eventually
getting rewritten to run in a normal user space context where the
uaccess tests can be resurrected.


* RE: [PATCH 09/11] x86: remove address space overrides using set_fs()
  2020-08-17  7:32 ` [PATCH 09/11] x86: remove address space overrides using set_fs() Christoph Hellwig
@ 2020-08-17  8:23   ` David Laight
  2020-08-27  9:37     ` 'Christoph Hellwig'
  2020-08-18 19:46   ` Kees Cook
  1 sibling, 1 reply; 39+ messages in thread
From: David Laight @ 2020-08-17  8:23 UTC (permalink / raw)
  To: 'Christoph Hellwig', Al Viro, Michael Ellerman, x86
  Cc: Kees Cook, linux-kernel, linux-fsdevel, linux-arch, linuxppc-dev

From: Christoph Hellwig
> Sent: 17 August 2020 08:32
>
> Stop providing the possibility to override the address space using
> set_fs() now that there is no need for that any more.  To properly
> handle the TASK_SIZE_MAX checking for 4 vs 5-level page tables on
> x86 a new alternative is introduced, which just like the one in
> entry_64.S has to use the hardcoded virtual address bits to escape
> the fact that TASK_SIZE_MAX isn't actually a constant when 5-level
> page tables are enabled.
....
> @@ -93,7 +69,7 @@ static inline bool pagefault_disabled(void);
>  #define access_ok(addr, size)					\
>  ({									\
>  	WARN_ON_IN_IRQ();						\
> -	likely(!__range_not_ok(addr, size, user_addr_max()));		\
> +	likely(!__range_not_ok(addr, size, TASK_SIZE_MAX));		\
>  })

Can't that always compare against a constant even when 5-level
page tables are enabled on x86-64?

On x86-64 it can (probably) reduce to (addr | (addr + size)) < 0.
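
A minimal user-space sketch of that idea, assuming 64-bit addresses and a hypothetical helper name. It tests the top bit of `addr | (addr + size)`, which is the same condition as the signed comparison `< 0` but avoids an unsigned-to-signed conversion in portable C.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when either the start or the end of the range has the
 * top address bit set, i.e. points into the upper half of the 64-bit
 * address space.  The name is hypothetical.
 */
static bool sketch_range_not_ok(uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;	/* unsigned wraparound is well defined */

	return ((addr | end) >> 63) != 0;
}
```

One caveat of this sketch: a size with its own top bit set can wrap `addr + size` back into the lower half, so a real implementation would presumably need to account for that case as well.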

	David




* Re: [PATCH 10/11] powerpc: use non-set_fs based maccess routines
  2020-08-17  7:32 ` [PATCH 10/11] powerpc: use non-set_fs based maccess routines Christoph Hellwig
@ 2020-08-17 15:47   ` Christophe Leroy
  0 siblings, 0 replies; 39+ messages in thread
From: Christophe Leroy @ 2020-08-17 15:47 UTC (permalink / raw)
  To: Christoph Hellwig, Al Viro, Michael Ellerman, x86
  Cc: linux-fsdevel, linux-arch, linuxppc-dev, Kees Cook, linux-kernel



Le 17/08/2020 à 09:32, Christoph Hellwig a écrit :
> Provide __get_kernel_nofault and __put_kernel_nofault routines to
> implement the maccess routines without messing with set_fs and without
> opening up access to user space.

__get_user_size() opens access to user space. You have to use 
__get_user_size_allowed() when user access is already allowed (or when 
not needed to allow it).

Christophe

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   arch/powerpc/include/asm/uaccess.h | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 00699903f1efca..a31de40ac00b62 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -623,4 +623,20 @@ do {									\
>   		__put_user_goto(*(u8*)(_src + _i), (u8 __user *)(_dst + _i), e);\
>   } while (0)
>   
> +#define HAVE_GET_KERNEL_NOFAULT
> +
> +#define __get_kernel_nofault(dst, src, type, err_label)			\
> +do {									\
> +	int __kr_err;							\
> +									\
> +	__get_user_size(*((type *)(dst)), (__force type __user *)(src),	\
> +			sizeof(type), __kr_err);			\
> +	if (unlikely(__kr_err))						\
> +		goto err_label;						\
> +} while (0)
> +
> +#define __put_kernel_nofault(dst, src, type, err_label)			\
> +	__put_user_size_goto(*((type *)(src)),				\
> +		(__force type __user *)(dst), sizeof(type), err_label)
> +
>   #endif	/* _ARCH_POWERPC_UACCESS_H */
> 


* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc
  2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
                   ` (11 preceding siblings ...)
  2020-08-17  7:39 ` remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
@ 2020-08-18 17:46 ` Christophe Leroy
  2020-08-18 18:05   ` Christoph Hellwig
  12 siblings, 1 reply; 39+ messages in thread
From: Christophe Leroy @ 2020-08-18 17:46 UTC (permalink / raw)
  To: Christoph Hellwig, Al Viro, Michael Ellerman, x86
  Cc: linux-fsdevel, linux-arch, linuxppc-dev, Kees Cook, linux-kernel



Le 17/08/2020 à 09:32, Christoph Hellwig a écrit :
> Hi all,
> 
> this series removes the last set_fs() used to force a kernel address
> space for the uaccess code in the kernel read/write/splice code, and then
> stops implementing the address space overrides entirely for x86 and
> powerpc.
> 
> The file system part has been posted a few times, and the read/write side
> has been pretty much unchanged.  For splice this series drops the
> conversion of the seq_file and sysctl code to the iter ops, and thus loses
> the splice support for them.  The reason for that is that it caused a lot
> of churn for not much use - splice for these small files really isn't much
> of a win, even if existing userspace uses it.  All callers I found do the
> proper fallback, but if this turns out to be an issue the conversion can
> be resurrected.

I like this series.

I gave it a go on my powerpc mpc832x. I tested it on top of my newest 
series that reworks the 32-bit signal handlers (see 
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=196278) 
with the microbenchmark test used in that series.

With KUAP activated, on top of the signal32 rework, performance is boosted 
as system time for the microbenchmark goes from 1.73s down to 1.56s, 
that is 10% quicker.

Surprisingly, with the kernel as it is today without my signal series, 
your series degrades performance slightly (from 2.55s to 2.64s, i.e. 3.5% 
slower).


I also observe, in both cases, a degradation on

	dd if=/dev/zero of=/dev/null count=1M

Without your series, it runs in 5.29 seconds.
With your series, it runs in 5.82 seconds, that is 10% more time.
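
The percentages quoted above can be reproduced with a small helper; this is only a worked-arithmetic sketch, and the function name is hypothetical.

```c
/*
 * Relative change of a timing in percent: positive means slower,
 * negative means faster.
 */
static double sketch_percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

With the figures from this mail, 1.73s to 1.56s comes out at roughly -9.8%, 2.55s to 2.64s at roughly +3.5%, and 5.29s to 5.82s at roughly +10.0%.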

Christophe


> 
> Besides x86 and powerpc I plan to eventually convert all other
> architectures, although this will be a slow process, starting with the
> easier ones once the infrastructure is merged.  The process to convert
> architectures is roughly:
> 
>   - ensure there is no set_fs(KERNEL_DS) left in arch specific code
>   - implement __get_kernel_nofault and __put_kernel_nofault
>   - remove the arch specific address limitation functionality
> 
> Diffstat:
>   arch/Kconfig                           |    3
>   arch/alpha/Kconfig                     |    1
>   arch/arc/Kconfig                       |    1
>   arch/arm/Kconfig                       |    1
>   arch/arm64/Kconfig                     |    1
>   arch/c6x/Kconfig                       |    1
>   arch/csky/Kconfig                      |    1
>   arch/h8300/Kconfig                     |    1
>   arch/hexagon/Kconfig                   |    1
>   arch/ia64/Kconfig                      |    1
>   arch/m68k/Kconfig                      |    1
>   arch/microblaze/Kconfig                |    1
>   arch/mips/Kconfig                      |    1
>   arch/nds32/Kconfig                     |    1
>   arch/nios2/Kconfig                     |    1
>   arch/openrisc/Kconfig                  |    1
>   arch/parisc/Kconfig                    |    1
>   arch/powerpc/include/asm/processor.h   |    7 -
>   arch/powerpc/include/asm/thread_info.h |    5 -
>   arch/powerpc/include/asm/uaccess.h     |   78 ++++++++-----------
>   arch/powerpc/kernel/signal.c           |    3
>   arch/powerpc/lib/sstep.c               |    6 -
>   arch/riscv/Kconfig                     |    1
>   arch/s390/Kconfig                      |    1
>   arch/sh/Kconfig                        |    1
>   arch/sparc/Kconfig                     |    1
>   arch/um/Kconfig                        |    1
>   arch/x86/ia32/ia32_aout.c              |    1
>   arch/x86/include/asm/page_32_types.h   |   11 ++
>   arch/x86/include/asm/page_64_types.h   |   38 +++++++++
>   arch/x86/include/asm/processor.h       |   60 ---------------
>   arch/x86/include/asm/thread_info.h     |    2
>   arch/x86/include/asm/uaccess.h         |   26 ------
>   arch/x86/kernel/asm-offsets.c          |    3
>   arch/x86/lib/getuser.S                 |   28 ++++---
>   arch/x86/lib/putuser.S                 |   21 +++--
>   arch/xtensa/Kconfig                    |    1
>   drivers/char/mem.c                     |   16 ----
>   drivers/misc/lkdtm/bugs.c              |    2
>   drivers/misc/lkdtm/core.c              |    4 +
>   drivers/misc/lkdtm/usercopy.c          |    2
>   fs/read_write.c                        |   69 ++++++++++-------
>   fs/splice.c                            |  130 +++------------------------------
>   include/linux/fs.h                     |    2
>   include/linux/uaccess.h                |   18 ++++
>   lib/test_bitmap.c                      |   10 ++
>   46 files changed, 235 insertions(+), 332 deletions(-)
> 


* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc
  2020-08-18 17:46 ` Christophe Leroy
@ 2020-08-18 18:05   ` Christoph Hellwig
  2020-08-18 18:23     ` Christophe Leroy
  0 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-18 18:05 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Christoph Hellwig, Al Viro, Michael Ellerman, x86, linux-fsdevel,
	linux-arch, linuxppc-dev, Kees Cook, linux-kernel

On Tue, Aug 18, 2020 at 07:46:22PM +0200, Christophe Leroy wrote:
> I gave it a go on my powerpc mpc832x. I tested it on top of my newest 
> series that reworks the 32-bit signal handlers (see 
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=196278) with 
> the microbenchmark test used in that series.
>
> With KUAP activated, on top of the signal32 rework, performance is boosted as 
> system time for the microbenchmark goes from 1.73s down to 1.56s, that is 
> 10% quicker.
>
> Surprisingly, with the kernel as it is today without my signal series, your 
> series degrades performance slightly (from 2.55s to 2.64s, i.e. 3.5% slower).
>
>
> I also observe, in both cases, a degradation on
>
> 	dd if=/dev/zero of=/dev/null count=1M
>
> Without your series, it runs in 5.29 seconds.
> With your series, it runs in 5.82 seconds, that is 10% more time.

That's pretty strange, I wonder if some kernel text cache line
effects come into play here?

The kernel access side is only used in slow path code, so it should
not make a difference, and the uaccess code is simplified and should be
(marginally) faster.

Btw, was this with the __{get,put}_user_allowed cockup that you noticed
fixed?


* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc
  2020-08-18 18:05   ` Christoph Hellwig
@ 2020-08-18 18:23     ` Christophe Leroy
  2020-08-19  7:16       ` Christophe Leroy
  0 siblings, 1 reply; 39+ messages in thread
From: Christophe Leroy @ 2020-08-18 18:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-fsdevel, linux-arch,
	linuxppc-dev, Kees Cook, linux-kernel



Le 18/08/2020 à 20:05, Christoph Hellwig a écrit :
> On Tue, Aug 18, 2020 at 07:46:22PM +0200, Christophe Leroy wrote:
>> I gave it a go on my powerpc mpc832x. I tested it on top of my newest
>> series that reworks the 32-bit signal handlers (see
>> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=196278) with
>> the microbenchmark test used in that series.
>>
>> With KUAP activated, on top of the signal32 rework, performance is boosted as
>> system time for the microbenchmark goes from 1.73s down to 1.56s, that is
>> 10% quicker.
>>
>> Surprisingly, with the kernel as it is today without my signal series, your
>> series degrades performance slightly (from 2.55s to 2.64s, i.e. 3.5% slower).
>>
>>
>> I also observe, in both cases, a degradation on
>>
>> 	dd if=/dev/zero of=/dev/null count=1M
>>
>> Without your series, it runs in 5.29 seconds.
>> With your series, it runs in 5.82 seconds, that is 10% more time.
> 
> That's pretty strange, I wonder if some kernel text cache line
> effects come into play here?
> 
> The kernel access side is only used in slow path code, so it should
> not make a difference, and the uaccess code is simplified and should be
> (marginally) faster.
> 
> Btw, was this with the __{get,put}_user_allowed cockup that you noticed
> fixed?
> 

Yes it is with the __get_user_size() replaced by __get_user_size_allowed().

Christophe


* Re: [PATCH 07/11] x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h
  2020-08-17  7:32 ` [PATCH 07/11] x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h Christoph Hellwig
@ 2020-08-18 19:27   ` Kees Cook
  0 siblings, 0 replies; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:08AM +0200, Christoph Hellwig wrote:
> At least for 64-bit this moves them closer to some of the defines
> they are based on, and it prepares for using the TASK_SIZE_MAX
> definition from assembly.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook


* Re: [PATCH 06/11] lkdtm: disable set_fs-based tests for !CONFIG_SET_FS
  2020-08-17  7:32 ` [PATCH 06/11] lkdtm: disable set_fs-based " Christoph Hellwig
@ 2020-08-18 19:32   ` Kees Cook
  0 siblings, 0 replies; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:07AM +0200, Christoph Hellwig wrote:
> Once we can't manipulate the address limit, we also can't test what
> happens when the manipulation is abused.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/misc/lkdtm/bugs.c     | 2 ++
>  drivers/misc/lkdtm/core.c     | 4 ++++
>  drivers/misc/lkdtm/usercopy.c | 2 ++
>  3 files changed, 8 insertions(+)
> 
> diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
> index 4dfbfd51bdf774..66f1800b1cb82d 100644
> --- a/drivers/misc/lkdtm/bugs.c
> +++ b/drivers/misc/lkdtm/bugs.c
> @@ -312,6 +312,7 @@ void lkdtm_CORRUPT_LIST_DEL(void)
>  		pr_err("list_del() corruption not detected!\n");
>  }
>  
> +#ifdef CONFIG_SET_FS
>  /* Test if unbalanced set_fs(KERNEL_DS)/set_fs(USER_DS) check exists. */
>  void lkdtm_CORRUPT_USER_DS(void)
>  {
> @@ -321,6 +322,7 @@ void lkdtm_CORRUPT_USER_DS(void)
>  	/* Make sure we do not keep running with a KERNEL_DS! */
>  	force_sig(SIGKILL);
>  }
> +#endif

Please leave the test defined, but it should XFAIL with a message about
the CONFIG (see similar ifdefs in lkdtm).
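
A minimal user-space sketch of that shape, with hypothetical names: the function is always defined, and only its body depends on the option. `SKETCH_SET_FS` stands in for CONFIG_SET_FS and a caller-supplied buffer stands in for the kernel log, so this is not the actual lkdtm code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_SET_FS; 0 means the option is off. */
#ifndef SKETCH_SET_FS
#define SKETCH_SET_FS 0
#endif

/*
 * Always-defined test entry point.  Returns 1 if the real test body ran,
 * 0 if the test is an expected failure because the option is disabled.
 */
static int sketch_corrupt_user_ds(char *msg, size_t len)
{
#if SKETCH_SET_FS
	snprintf(msg, len, "setting bad address limit");
	return 1;
#else
	snprintf(msg, len, "XFAIL: this test requires CONFIG_SET_FS");
	return 0;
#endif
}
```

With this shape the crashtype table needs no conditional entries, since the symbol exists in every configuration.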

>  /* Test that VMAP_STACK is actually allocating with a leading guard page */
>  void lkdtm_STACK_GUARD_PAGE_LEADING(void)
> diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
> index a5e344df916632..aae08b33a7ee2a 100644
> --- a/drivers/misc/lkdtm/core.c
> +++ b/drivers/misc/lkdtm/core.c
> @@ -112,7 +112,9 @@ static const struct crashtype crashtypes[] = {
>  	CRASHTYPE(CORRUPT_STACK_STRONG),
>  	CRASHTYPE(CORRUPT_LIST_ADD),
>  	CRASHTYPE(CORRUPT_LIST_DEL),
> +#ifdef CONFIG_SET_FS
>  	CRASHTYPE(CORRUPT_USER_DS),
> +#endif
>  	CRASHTYPE(STACK_GUARD_PAGE_LEADING),
>  	CRASHTYPE(STACK_GUARD_PAGE_TRAILING),
>  	CRASHTYPE(UNSET_SMEP),
> @@ -172,7 +174,9 @@ static const struct crashtype crashtypes[] = {
>  	CRASHTYPE(USERCOPY_STACK_FRAME_FROM),
>  	CRASHTYPE(USERCOPY_STACK_BEYOND),
>  	CRASHTYPE(USERCOPY_KERNEL),
> +#ifdef CONFIG_SET_FS
>  	CRASHTYPE(USERCOPY_KERNEL_DS),
> +#endif
>  	CRASHTYPE(STACKLEAK_ERASING),
>  	CRASHTYPE(CFI_FORWARD_PROTO),

Then none of these are needed.

>  #ifdef CONFIG_X86_32

Hmpf, this ifdef was missed in ae56942c1474 ("lkdtm: Make arch-specific
tests always available"). I will fix that.

> diff --git a/drivers/misc/lkdtm/usercopy.c b/drivers/misc/lkdtm/usercopy.c
> index b833367a45d053..4b632fe79ab6bb 100644
> --- a/drivers/misc/lkdtm/usercopy.c
> +++ b/drivers/misc/lkdtm/usercopy.c
> @@ -325,6 +325,7 @@ void lkdtm_USERCOPY_KERNEL(void)
>  	vm_munmap(user_addr, PAGE_SIZE);
>  }
>  
> +#ifdef CONFIG_SET_FS
>  void lkdtm_USERCOPY_KERNEL_DS(void)
>  {
>  	char __user *user_ptr =
> @@ -339,6 +340,7 @@ void lkdtm_USERCOPY_KERNEL_DS(void)
>  		pr_err("copy_to_user() to noncanonical address succeeded!?\n");
>  	set_fs(old_fs);
>  }
> +#endif

(Same here, please.)

>  
>  void __init lkdtm_usercopy_init(void)
>  {
> -- 
> 2.28.0
> 

-- 
Kees Cook


* Re: [PATCH 01/11] mem: remove duplicate ops for /dev/zero and /dev/null
  2020-08-17  7:32 ` [PATCH 01/11] mem: remove duplicate ops for /dev/zero and /dev/null Christoph Hellwig
@ 2020-08-18 19:33   ` Kees Cook
  0 siblings, 0 replies; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:02AM +0200, Christoph Hellwig wrote:
> There is no good reason to implement both the traditional ->read and
> ->write as well as the iter based ops.  So implement just the iter
> based ones.
> 
> Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook


* Re: [PATCH 02/11] fs: don't allow kernel reads and writes without iter ops
  2020-08-17  7:32 ` [PATCH 02/11] fs: don't allow kernel reads and writes without iter ops Christoph Hellwig
@ 2020-08-18 19:34   ` Kees Cook
  0 siblings, 0 replies; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:03AM +0200, Christoph Hellwig wrote:
> Don't allow calling ->read or ->write with set_fs as a preparation for
> killing off set_fs.  All the instances that we use kernel_read/write on
> are using the iter ops already.
> 
> If a file has both the regular ->read/->write methods and the iter
> variants those could have different semantics for messed up enough
> drivers.  Also fail the kernel access to them in that case.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook


* Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
  2020-08-17  7:32 ` [PATCH 03/11] fs: don't allow splice read/write without explicit ops Christoph Hellwig
@ 2020-08-18 19:39   ` Kees Cook
  2020-08-18 19:54     ` Christoph Hellwig
  0 siblings, 1 reply; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> default_file_splice_write is the last piece of generic code that uses
> set_fs to make the uaccess routines operate on kernel pointers.  It
> implements a "fallback loop" for splicing from files that do not actually
> provide a proper splice_read method.  The usual file systems and other
> high bandwidth instances all provide a ->splice_read, so this just removes
> support for various device drivers and procfs/debugfs files.  If splice
> support for any of those turns out to be important it can be added back
> by switching them to the iter ops and using generic_file_splice_read.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

This seems a bit disruptive? I feel like this is going to make fuzzers
really noisy (e.g. trinity likes to splice random stuff out of /sys and
/proc).

Conceptually, though:

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook


* Re: [PATCH 04/11] uaccess: add infrastructure for kernel builds with set_fs()
  2020-08-17  7:32 ` [PATCH 04/11] uaccess: add infrastructure for kernel builds with set_fs() Christoph Hellwig
@ 2020-08-18 19:40   ` Kees Cook
  0 siblings, 0 replies; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:05AM +0200, Christoph Hellwig wrote:
> Add a CONFIG_SET_FS option that is selected by architectures that
> implement set_fs, which is all of them initially.  If the option is not
> set stubs for routines related to overriding the address space are
> provided so that architectures can start to opt out of providing set_fs.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook


* Re: [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS
  2020-08-17  7:32 ` [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS Christoph Hellwig
  2020-08-17  7:50   ` Christophe Leroy
@ 2020-08-18 19:43   ` Kees Cook
  1 sibling, 0 replies; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:06AM +0200, Christoph Hellwig wrote:
> We can't run the tests for userspace bitmap parsing if set_fs() doesn't
> exist.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook


* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
  2020-08-17  7:32 ` [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code Christoph Hellwig
@ 2020-08-18 19:44   ` Kees Cook
  2020-08-18 19:55     ` Christoph Hellwig
  0 siblings, 1 reply; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:09AM +0200, Christoph Hellwig wrote:
> For 64-bit the only hing missing was a strategic _AC, and for 32-bit we

typo: thing

> need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
> definition to escape the explicit unsigned long cast.  This just works
> because __PAGE_OFFSET is defined using _AC itself and thus never needs
> the cast anyway.

Shouldn't this be folded into the prior patch so there's no bisection
problem?

-Kees

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  arch/x86/include/asm/page_32_types.h | 4 ++--
>  arch/x86/include/asm/page_64_types.h | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/page_32_types.h b/arch/x86/include/asm/page_32_types.h
> index 26236925fb2c36..f462895a33e452 100644
> --- a/arch/x86/include/asm/page_32_types.h
> +++ b/arch/x86/include/asm/page_32_types.h
> @@ -44,8 +44,8 @@
>  /*
>   * User space process size: 3GB (default).
>   */
> -#define IA32_PAGE_OFFSET	PAGE_OFFSET
> -#define TASK_SIZE		PAGE_OFFSET
> +#define IA32_PAGE_OFFSET	__PAGE_OFFSET
> +#define TASK_SIZE		__PAGE_OFFSET
>  #define TASK_SIZE_LOW		TASK_SIZE
>  #define TASK_SIZE_MAX		TASK_SIZE
>  #define DEFAULT_MAP_WINDOW	TASK_SIZE
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index 996595c9897e0a..838515daf87b36 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -76,7 +76,7 @@
>   *
>   * With page table isolation enabled, we map the LDT in ... [stay tuned]
>   */
> -#define TASK_SIZE_MAX	((1UL << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
> +#define TASK_SIZE_MAX	((_AC(1,UL) << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
>  
>  #define DEFAULT_MAP_WINDOW	((1UL << 47) - PAGE_SIZE)
>  
> -- 
> 2.28.0
> 

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 09/11] x86: remove address space overrides using set_fs()
  2020-08-17  7:32 ` [PATCH 09/11] x86: remove address space overrides using set_fs() Christoph Hellwig
  2020-08-17  8:23   ` David Laight
@ 2020-08-18 19:46   ` Kees Cook
  1 sibling, 0 replies; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 09:32:10AM +0200, Christoph Hellwig wrote:
> Stop providing the possibility to override the address space using
> set_fs() now that there is no need for that any more.  To properly
> handle the TASK_SIZE_MAX checking for 4 vs 5-level page tables on
> x86 a new alternative is introduced, which just like the one in
> entry_64.S has to use the hardcoded virtual address bits to escape
> the fact that TASK_SIZE_MAX isn't actually a constant when 5-level
> page tables are enabled.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Awesome. :)

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
  2020-08-18 19:39   ` Kees Cook
@ 2020-08-18 19:54     ` Christoph Hellwig
  2020-08-18 19:58       ` Kees Cook
  0 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-18 19:54 UTC (permalink / raw)
  To: Kees Cook
  Cc: Christoph Hellwig, Al Viro, Michael Ellerman, x86, linux-kernel,
	linux-fsdevel, linux-arch, linuxppc-dev

On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > default_file_splice_write is the last piece of generic code that uses
> > set_fs to make the uaccess routines operate on kernel pointers.  It
> > implements a "fallback loop" for splicing from files that do not actually
> > provide a proper splice_read method.  The usual file systems and other
> > high bandwidth instances all provide a ->splice_read, so this just removes
> > support for various device drivers and procfs/debugfs files.  If splice
> > support for any of those turns out to be important it can be added back
> > by switching them to the iter ops and using generic_file_splice_read.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> This seems a bit disruptive? I feel like this is going to make fuzzers
> really noisy (e.g. trinity likes to splice random stuff out of /sys and
> /proc).

Noisy in the sense of triggering the pr_debug or because they can't
handle -EINVAL?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
  2020-08-18 19:44   ` Kees Cook
@ 2020-08-18 19:55     ` Christoph Hellwig
  2020-08-18 19:59       ` Kees Cook
  0 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-18 19:55 UTC (permalink / raw)
  To: Kees Cook
  Cc: Christoph Hellwig, Al Viro, Michael Ellerman, x86, linux-kernel,
	linux-fsdevel, linux-arch, linuxppc-dev

On Tue, Aug 18, 2020 at 12:44:49PM -0700, Kees Cook wrote:
> On Mon, Aug 17, 2020 at 09:32:09AM +0200, Christoph Hellwig wrote:
> > For 64-bit the only hing missing was a strategic _AC, and for 32-bit we
> 
> typo: thing
> 
> > need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
> > definition to escape the explicit unsigned long cast.  This just works
> > because __PAGE_OFFSET is defined using _AC itself and thus never needs
> > the cast anyway.
> 
> Shouldn't this be folded into the prior patch so there's no bisection
> problem?

I didn't see a problem bisecting, do you have something particular in
mind?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
  2020-08-18 19:54     ` Christoph Hellwig
@ 2020-08-18 19:58       ` Kees Cook
  2020-08-18 20:07         ` Christoph Hellwig
  0 siblings, 1 reply; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Tue, Aug 18, 2020 at 09:54:46PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> > On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > > default_file_splice_write is the last piece of generic code that uses
> > > set_fs to make the uaccess routines operate on kernel pointers.  It
> > > implements a "fallback loop" for splicing from files that do not actually
> > > provide a proper splice_read method.  The usual file systems and other
> > > high bandwidth instances all provide a ->splice_read, so this just removes
> > > support for various device drivers and procfs/debugfs files.  If splice
> > > support for any of those turns out to be important it can be added back
> > > by switching them to the iter ops and using generic_file_splice_read.
> > > 
> > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > 
> > This seems a bit disruptive? I feel like this is going to make fuzzers
> > really noisy (e.g. trinity likes to splice random stuff out of /sys and
> > /proc).
> 
> Noisy in the sense of triggering the pr_debug or because they can't
> handle -EINVAL?

Well, maybe both? I doubt much _expects_ to be using splice, so I'm fine
with that, but it seems weird not to have a fall-back, especially if
something would like to splice a file out of there. But, I'm not opposed
to the change, it just seems like it might cause pain down the road.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
  2020-08-18 19:55     ` Christoph Hellwig
@ 2020-08-18 19:59       ` Kees Cook
  2020-08-18 20:00         ` Christoph Hellwig
  0 siblings, 1 reply; 39+ messages in thread
From: Kees Cook @ 2020-08-18 19:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Tue, Aug 18, 2020 at 09:55:39PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:44:49PM -0700, Kees Cook wrote:
> > On Mon, Aug 17, 2020 at 09:32:09AM +0200, Christoph Hellwig wrote:
> > > For 64-bit the only hing missing was a strategic _AC, and for 32-bit we
> > 
> > typo: thing
> > 
> > > need to use __PAGE_OFFSET instead of PAGE_OFFSET in the TASK_SIZE
> > > definition to escape the explicit unsigned long cast.  This just works
> > > because __PAGE_OFFSET is defined using _AC itself and thus never needs
> > > the cast anyway.
> > 
> > Shouldn't this be folded into the prior patch so there's no bisection
> > problem?
> 
> I didn't see a problem bisecting, do you have something particular in
> mind?

Oh, I misunderstood this patch to be a fix for compilation. Is this just
a correctness fix?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
  2020-08-18 19:59       ` Kees Cook
@ 2020-08-18 20:00         ` Christoph Hellwig
  2020-08-18 20:08           ` Kees Cook
  0 siblings, 1 reply; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-18 20:00 UTC (permalink / raw)
  To: Kees Cook
  Cc: Christoph Hellwig, Al Viro, Michael Ellerman, x86, linux-kernel,
	linux-fsdevel, linux-arch, linuxppc-dev

On Tue, Aug 18, 2020 at 12:59:05PM -0700, Kees Cook wrote:
> > I didn't see a problem bisecting, do you have something particular in
> > mind?
> 
> Oh, I misunderstood this patch to be a fix for compilation. Is this just
> a correctness fix?

It prepares for using the definition from assembly, which is done in
the next patch.
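
For anyone following along, here is a rough sketch of why the _AC wrapper
matters for assembly.  It only mimics the idea behind _AC() in
include/uapi/linux/const.h; the MY_* names, the 4096-byte page size and the
shift of 47 (4-level paging) are assumptions made for this illustration, and
it presumes a 64-bit unsigned long.

```c
/*
 * Illustration only: MY_AC() imitates the kernel's _AC() helper.
 * The assembler does not understand a "UL" suffix, so in assembly the
 * macro yields the bare number; in C it pastes the suffix on.
 */
#ifdef __ASSEMBLY__
#define MY_AC(X, Y)	X
#else
#define MY__AC(X, Y)	(X##Y)
#define MY_AC(X, Y)	MY__AC(X, Y)
#endif

/* Hypothetical stand-ins for PAGE_SIZE and __VIRTUAL_MASK_SHIFT. */
#define MY_PAGE_SIZE		MY_AC(4096, UL)
#define MY_VIRTUAL_MASK_SHIFT	47

/* Same shape as the new TASK_SIZE_MAX definition in the patch. */
#define MY_TASK_SIZE_MAX \
	((MY_AC(1, UL) << MY_VIRTUAL_MASK_SHIFT) - MY_PAGE_SIZE)

/* C-side accessor so the value can be inspected from a test. */
unsigned long my_task_size_max(void)
{
	return MY_TASK_SIZE_MAX;
}
```

With a plain 1UL in the definition the C side is unchanged, but an assembly
file including the header would see the literal "1UL" and fail to assemble.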

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 03/11] fs: don't allow splice read/write without explicit ops
  2020-08-18 19:58       ` Kees Cook
@ 2020-08-18 20:07         ` Christoph Hellwig
  0 siblings, 0 replies; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-18 20:07 UTC (permalink / raw)
  To: Kees Cook
  Cc: Christoph Hellwig, Al Viro, Michael Ellerman, x86, linux-kernel,
	linux-fsdevel, linux-arch, linuxppc-dev

On Tue, Aug 18, 2020 at 12:58:07PM -0700, Kees Cook wrote:
> On Tue, Aug 18, 2020 at 09:54:46PM +0200, Christoph Hellwig wrote:
> > On Tue, Aug 18, 2020 at 12:39:34PM -0700, Kees Cook wrote:
> > > On Mon, Aug 17, 2020 at 09:32:04AM +0200, Christoph Hellwig wrote:
> > > > default_file_splice_write is the last piece of generic code that uses
> > > > set_fs to make the uaccess routines operate on kernel pointers.  It
> > > > implements a "fallback loop" for splicing from files that do not actually
> > > > provide a proper splice_read method.  The usual file systems and other
> > > > high bandwidth instances all provide a ->splice_read, so this just removes
> > > > support for various device drivers and procfs/debugfs files.  If splice
> > > > support for any of those turns out to be important it can be added back
> > > > by switching them to the iter ops and using generic_file_splice_read.
> > > > 
> > > > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > 
> > > This seems a bit disruptive? I feel like this is going to make fuzzers
> > > really noisy (e.g. trinity likes to splice random stuff out of /sys and
> > > /proc).
> > 
> > Noisy in the sense of triggering the pr_debug or because they can't
> > handle -EINVAL?
> 
> Well, maybe both? I doubt much _expects_ to be using splice, so I'm fine
> with that, but it seems weird not to have a fall-back, especially if
> something would like to splice a file out of there. But, I'm not opposed
> to the change, it just seems like it might cause pain down the road.

The problem is that without pretending a buffer is in user space when
it actually isn't, we can't have a generic fallback.  So we'll have to
have specific support - I wrote generic support for seq_file, and
willy did for /proc/sys, but at least the first caused a few problems
and a fair amount of churn, so I'd rather see first if we can get
away without it.
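
As a rough userspace illustration of the fallback a caller can do when
splice() returns -EINVAL for such a file: the helper name splice_or_copy()
and the 4096-byte bounce buffer are made up for this sketch, and it is not
taken from any existing tool.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to len bytes from in_fd into the write end
 * of a pipe.  Try splice() first; if the file has no splice support
 * (EINVAL), fall back to a single read() + write() through a bounce buffer.
 * Returns the number of bytes moved, 0 on EOF, or -1 with errno set.
 */
ssize_t splice_or_copy(int in_fd, int pipe_wr, size_t len)
{
	char buf[4096];
	ssize_t n;

	n = splice(in_fd, NULL, pipe_wr, NULL, len, 0);
	if (n >= 0 || errno != EINVAL)
		return n;

	/* Fallback path: copy through userspace instead. */
	if (len > sizeof(buf))
		len = sizeof(buf);
	n = read(in_fd, buf, len);
	if (n <= 0)
		return n;

	/* A short write is possible; a real caller would loop here. */
	return write(pipe_wr, buf, (size_t)n);
}
```

A caller that already loops on short transfers only needs the extra EINVAL
branch, which is why dropping the generic kernel fallback is mostly
invisible to well-behaved userspace.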

> 
> -- 
> Kees Cook
---end quoted text---

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code
  2020-08-18 20:00         ` Christoph Hellwig
@ 2020-08-18 20:08           ` Kees Cook
  0 siblings, 0 replies; 39+ messages in thread
From: Kees Cook @ 2020-08-18 20:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, Michael Ellerman, x86, linux-kernel, linux-fsdevel,
	linux-arch, linuxppc-dev

On Tue, Aug 18, 2020 at 10:00:16PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:59:05PM -0700, Kees Cook wrote:
> > > I didn't see a problem bisecting, do you have something particular in
> > > mind?
> > 
> > Oh, I misunderstood this patch to be a fix for compilation. Is this just
> > a correctness fix?
> 
> It prepares for using the definition from assembly, which is done in
> the next patch.

Ah! Okay; thanks.

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: remove the last set_fs() in common code, and remove it for x86 and powerpc
  2020-08-18 18:23     ` Christophe Leroy
@ 2020-08-19  7:16       ` Christophe Leroy
  2020-08-19  7:22         ` iter and normal ops on /dev/zero & co, was " Christoph Hellwig
  0 siblings, 1 reply; 39+ messages in thread
From: Christophe Leroy @ 2020-08-19  7:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-arch, Kees Cook, x86, linux-kernel, Al Viro, linux-fsdevel,
	linuxppc-dev



Le 18/08/2020 à 20:23, Christophe Leroy a écrit :
> 
> 
> Le 18/08/2020 à 20:05, Christoph Hellwig a écrit :
>> On Tue, Aug 18, 2020 at 07:46:22PM +0200, Christophe Leroy wrote:
>>> I gave it a go on my powerpc mpc832x. I tested it on top of my newest
>>> series that reworks the 32-bit signal handlers (see
>>> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=196278)
>>> with the microbenchmark test used in that series.
>>>
>>> With KUAP activated, on top of signal32 rework, performance is boosted as
>>> system time for the microbenchmark goes from 1.73s down to 1.56s, that is
>>> 10% quicker.
>>>
>>> Surprisingly, with the kernel as is today without my signal series, your
>>> series degrades performance slightly (from 2.55s to 2.64s, i.e. 3.5%
>>> slower).
>>>
>>>
>>> I also observe, in both cases, a degradation on
>>>
>>>     dd if=/dev/zero of=/dev/null count=1M
>>>
>>> Without your series, it runs in 5.29 seconds.
>>> With your series, it runs in 5.82 seconds, that is 10% more time.
>>
>> That's pretty strange, I wonder if some kernel text cache line
>> effects come into play here?
>>
>> The kernel access side is only used in slow path code, so it should
>> not make a difference, and the uaccess code is simplified and should be
>> (marginally) faster.
>>
>> Btw, was this with the __{get,put}_user_allowed cockup that you noticed
>> fixed?
>>
> 
> Yes it is with the __get_user_size() replaced by __get_user_size_allowed().

I made a test with only the first patch of your series: That's
definitely the culprit. With only that patch applied, the duration is
6.64 seconds, that's a 25% degradation.

A perf record provides the following without the patch:
     41.91%  dd       [kernel.kallsyms]  [k] __arch_clear_user
      7.02%  dd       [kernel.kallsyms]  [k] vfs_read
      6.86%  dd       [kernel.kallsyms]  [k] new_sync_read
      6.68%  dd       [kernel.kallsyms]  [k] iov_iter_zero
      6.03%  dd       [kernel.kallsyms]  [k] transfer_to_syscall
      3.39%  dd       [kernel.kallsyms]  [k] memset
      3.07%  dd       [kernel.kallsyms]  [k] __fsnotify_parent
      2.68%  dd       [kernel.kallsyms]  [k] ksys_read
      2.09%  dd       [kernel.kallsyms]  [k] read_iter_zero
      2.01%  dd       [kernel.kallsyms]  [k] __fget_light
      1.84%  dd       [kernel.kallsyms]  [k] __fdget_pos
      1.35%  dd       [kernel.kallsyms]  [k] rw_verify_area
      1.32%  dd       libc-2.23.so       [.] __GI___libc_write
      1.21%  dd       [kernel.kallsyms]  [k] vfs_write
...
      0.03%  dd       [kernel.kallsyms]  [k] write_null

And the following with the patch:

     15.54%  dd       [kernel.kallsyms]  [k] __arch_clear_user
      9.17%  dd       [kernel.kallsyms]  [k] vfs_read
      6.54%  dd       [kernel.kallsyms]  [k] new_sync_write
      6.31%  dd       [kernel.kallsyms]  [k] transfer_to_syscall
      6.29%  dd       [kernel.kallsyms]  [k] __fsnotify_parent
      6.20%  dd       [kernel.kallsyms]  [k] new_sync_read
      5.47%  dd       [kernel.kallsyms]  [k] memset
      5.13%  dd       [kernel.kallsyms]  [k] vfs_write
      4.44%  dd       [kernel.kallsyms]  [k] iov_iter_zero
      2.95%  dd       [kernel.kallsyms]  [k] write_iter_null
      2.82%  dd       [kernel.kallsyms]  [k] ksys_read
      2.46%  dd       [kernel.kallsyms]  [k] __fget_light
      2.34%  dd       libc-2.23.so       [.] __GI___libc_read
      1.89%  dd       [kernel.kallsyms]  [k] iov_iter_advance
      1.76%  dd       [kernel.kallsyms]  [k] __fdget_pos
      1.65%  dd       [kernel.kallsyms]  [k] rw_verify_area
      1.63%  dd       [kernel.kallsyms]  [k] read_iter_zero
      1.60%  dd       [kernel.kallsyms]  [k] iov_iter_init
      1.22%  dd       [kernel.kallsyms]  [k] ksys_write
      1.14%  dd       libc-2.23.so       [.] __GI___libc_write

Christophe

> 
> Christophe

^ permalink raw reply	[flat|nested] 39+ messages in thread

* iter and normal ops on /dev/zero & co, was Re: remove the last set_fs() in common code, and remove it for x86 and powerpc
  2020-08-19  7:16       ` Christophe Leroy
@ 2020-08-19  7:22         ` Christoph Hellwig
  0 siblings, 0 replies; 39+ messages in thread
From: Christoph Hellwig @ 2020-08-19  7:22 UTC (permalink / raw)
  To: Christophe Leroy, Al Viro
  Cc: Christoph Hellwig, linux-arch, Kees Cook, x86, linux-kernel,
	linux-fsdevel, linuxppc-dev

On Wed, Aug 19, 2020 at 09:16:59AM +0200, Christophe Leroy wrote:
> I made a test with only the first patch of your series: That's definitely
> the culprit. With only that patch applied, the duration is 6.64 seconds,
> that's a 25% degradation.

For the record: the first patch is:

     mem: remove duplicate ops for /dev/zero and /dev/null

So these micro-optimizations matter at least for some popular
benchmarks.  It would be easy to drop, but that means we either:

 - can't support kernel_read/write on these files, which should not
   matter

or
 
 - have to drop the check for both ops being present

Al, what do you think?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 09/11] x86: remove address space overrides using set_fs()
  2020-08-17  8:23   ` David Laight
@ 2020-08-27  9:37     ` 'Christoph Hellwig'
  0 siblings, 0 replies; 39+ messages in thread
From: 'Christoph Hellwig' @ 2020-08-27  9:37 UTC (permalink / raw)
  To: David Laight
  Cc: 'Christoph Hellwig',
	Al Viro, Michael Ellerman, x86, Kees Cook, linux-kernel,
	linux-fsdevel, linux-arch, linuxppc-dev

On Mon, Aug 17, 2020 at 08:23:11AM +0000, David Laight wrote:
> From: Christoph Hellwig
> > Sent: 17 August 2020 08:32
> >
> > Stop providing the possibility to override the address space using
> > set_fs() now that there is no need for that any more.  To properly
> > handle the TASK_SIZE_MAX checking for 4 vs 5-level page tables on
> > x86 a new alternative is introduced, which just like the one in
> > entry_64.S has to use the hardcoded virtual address bits to escape
> > the fact that TASK_SIZE_MAX isn't actually a constant when 5-level
> > page tables are enabled.
> ....
> > @@ -93,7 +69,7 @@ static inline bool pagefault_disabled(void);
> >  #define access_ok(addr, size)					\
> >  ({									\
> >  	WARN_ON_IN_IRQ();						\
> > -	likely(!__range_not_ok(addr, size, user_addr_max()));		\
> > +	likely(!__range_not_ok(addr, size, TASK_SIZE_MAX));		\
> >  })
> 
> Can't that always compare against a constant even when 5-level
> page tables are enabled on x86-64?
> 
> On x86-64 it can (probably) reduce to (addr | (addr + size)) < 0.

I'll leave that to the x86 maintainers as a future cleanup if wanted.
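
For reference, the expression David suggests could be sketched roughly as
below.  This is only an illustration of the idea, not what the patch or the
kernel does; the function name is hypothetical, and it assumes 64-bit
addresses and a size with the top bit clear (otherwise addr + size can wrap
past the sign bit and slip through).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: a range is acceptable only if neither its start nor
 * its end has the top (sign) bit set.  Testing bit 63 of the OR is the
 * unsigned equivalent of "(long)(addr | (addr + size)) < 0" meaning reject.
 * Addresses in the non-canonical hole would pass here and have to be
 * caught by the fault handling instead.
 */
bool user_range_looks_ok(uint64_t addr, uint64_t size)
{
	return ((addr | (addr + size)) >> 63) == 0;
}
```

The attraction is that the comparison no longer depends on TASK_SIZE_MAX at
all, so nothing needs patching for 4 vs 5-level page tables.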

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2020-08-27  9:37 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-17  7:32 remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
2020-08-17  7:32 ` [PATCH 01/11] mem: remove duplicate ops for /dev/zero and /dev/null Christoph Hellwig
2020-08-18 19:33   ` Kees Cook
2020-08-17  7:32 ` [PATCH 02/11] fs: don't allow kernel reads and writes without iter ops Christoph Hellwig
2020-08-18 19:34   ` Kees Cook
2020-08-17  7:32 ` [PATCH 03/11] fs: don't allow splice read/write without explicit ops Christoph Hellwig
2020-08-18 19:39   ` Kees Cook
2020-08-18 19:54     ` Christoph Hellwig
2020-08-18 19:58       ` Kees Cook
2020-08-18 20:07         ` Christoph Hellwig
2020-08-17  7:32 ` [PATCH 04/11] uaccess: add infrastructure for kernel builds with set_fs() Christoph Hellwig
2020-08-18 19:40   ` Kees Cook
2020-08-17  7:32 ` [PATCH 05/11] test_bitmap: skip user bitmap tests for !CONFIG_SET_FS Christoph Hellwig
2020-08-17  7:50   ` Christophe Leroy
2020-08-17  7:52     ` Christoph Hellwig
2020-08-18 19:43   ` Kees Cook
2020-08-17  7:32 ` [PATCH 06/11] lkdtm: disable set_fs-based " Christoph Hellwig
2020-08-18 19:32   ` Kees Cook
2020-08-17  7:32 ` [PATCH 07/11] x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h Christoph Hellwig
2020-08-18 19:27   ` Kees Cook
2020-08-17  7:32 ` [PATCH 08/11] x86: make TASK_SIZE_MAX usable from assembly code Christoph Hellwig
2020-08-18 19:44   ` Kees Cook
2020-08-18 19:55     ` Christoph Hellwig
2020-08-18 19:59       ` Kees Cook
2020-08-18 20:00         ` Christoph Hellwig
2020-08-18 20:08           ` Kees Cook
2020-08-17  7:32 ` [PATCH 09/11] x86: remove address space overrides using set_fs() Christoph Hellwig
2020-08-17  8:23   ` David Laight
2020-08-27  9:37     ` 'Christoph Hellwig'
2020-08-18 19:46   ` Kees Cook
2020-08-17  7:32 ` [PATCH 10/11] powerpc: use non-set_fs based maccess routines Christoph Hellwig
2020-08-17 15:47   ` Christophe Leroy
2020-08-17  7:32 ` [PATCH 11/11] powerpc: remove address space overrides using set_fs() Christoph Hellwig
2020-08-17  7:39 ` remove the last set_fs() in common code, and remove it for x86 and powerpc Christoph Hellwig
2020-08-18 17:46 ` Christophe Leroy
2020-08-18 18:05   ` Christoph Hellwig
2020-08-18 18:23     ` Christophe Leroy
2020-08-19  7:16       ` Christophe Leroy
2020-08-19  7:22         ` iter and normal ops on /dev/zero & co, was " Christoph Hellwig
