linux-fsdevel.vger.kernel.org archive mirror
* clean up kernel_{read,write} & friends
@ 2020-05-08  9:22 Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 01/11] fs: call file_{start,end}_write from __kernel_write Christoph Hellwig
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

Hi Al,

this series fixes a few issues and cleans up the helpers that read from
or write to kernel space buffers, and ensures that we don't change the
address limit if we are using the ->read_iter and ->write_iter methods
that don't need the changed address limit.


* [PATCH 01/11] fs: call file_{start,end}_write from __kernel_write
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 02/11] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

We always need to take a reference on the file system we are writing
to.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index bbfa9b12b15eb..d5aaf3a4198b9 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -508,6 +508,7 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
+	file_start_write(file);
 	old_fs = get_fs();
 	set_fs(KERNEL_DS);
 	p = (__force const char __user *)buf;
@@ -520,6 +521,7 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 		add_wchar(current, ret);
 	}
 	inc_syscw(current);
+	file_end_write(file);
 	return ret;
 }
 EXPORT_SYMBOL(__kernel_write);
-- 
2.26.2
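
As a rough userspace illustration of the bracketing this patch adds (a sketch only, not kernel code): the early error return happens before the start call, and the end call runs whether or not the write method succeeds. The names struct model_sb, model_bracketed_write and the two demo callbacks are hypothetical, and the counter merely stands in for whatever file_start_write()/file_end_write() really track.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a superblock: counts writers in flight. */
struct model_sb {
	int active_writers;
};

typedef long (*model_write_fn)(const void *buf, size_t count);

/* Demo callback that "writes" everything it is given. */
static long model_ok_write(const void *buf, size_t count)
{
	(void)buf;
	return (long)count;
}

/* Demo callback that always fails. */
static long model_failing_write(const void *buf, size_t count)
{
	(void)buf;
	(void)count;
	return -EIO;
}

long model_bracketed_write(struct model_sb *sb, model_write_fn fn,
			   const void *buf, size_t count)
{
	long ret;

	if (!fn)
		return -EINVAL;		/* bail out before the bracket opens */

	sb->active_writers++;		/* models file_start_write() */
	ret = fn(buf, count);
	sb->active_writers--;		/* models file_end_write() */
	return ret;
}

void model_bracket_example(void)
{
	struct model_sb sb = { 0 };

	assert(model_bracketed_write(&sb, model_ok_write, "abc", 3) == 3);
	assert(sb.active_writers == 0);
	assert(model_bracketed_write(&sb, model_failing_write, "abc", 3) == -EIO);
	assert(sb.active_writers == 0);	/* balanced on the error path too */
}
```

The point of the sketch is only that every path that opens the bracket also closes it.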



* [PATCH 02/11] fs: check FMODE_WRITE in __kernel_write
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 01/11] fs: call file_{start,end}_write from __kernel_write Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 03/11] fs: remove the call_{read,write}_iter functions Christoph Hellwig
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

We still need to check that the file is open for writing, even for the
low-level helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index d5aaf3a4198b9..d5c754080e5a5 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -505,6 +505,8 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	const char __user *p;
 	ssize_t ret;
 
+	if (!(file->f_mode & FMODE_WRITE))
+		return -EBADF;
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
-- 
2.26.2
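
The order of the two checks after this patch can be modelled in userspace roughly as follows. This is a sketch under assumptions: the MODEL_* flag values and the function name are made up for illustration and are not the kernel's definitions; only the error codes and the order of the tests follow the diff above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for this userspace model; not the kernel's. */
#define MODEL_FMODE_WRITE	0x1u	/* file was opened for writing */
#define MODEL_FMODE_CAN_WRITE	0x2u	/* file has a usable write method */

/* Mirrors the order of the two checks at the top of __kernel_write(). */
int model_write_allowed(unsigned int f_mode)
{
	if (!(f_mode & MODEL_FMODE_WRITE))
		return -EBADF;		/* not opened for writing */
	if (!(f_mode & MODEL_FMODE_CAN_WRITE))
		return -EINVAL;		/* no write method available */
	return 0;
}

void model_mode_example(void)
{
	/* The open-mode check wins when both bits are missing. */
	assert(model_write_allowed(0) == -EBADF);
	assert(model_write_allowed(MODEL_FMODE_CAN_WRITE) == -EBADF);
	assert(model_write_allowed(MODEL_FMODE_WRITE) == -EINVAL);
	assert(model_write_allowed(MODEL_FMODE_WRITE |
				   MODEL_FMODE_CAN_WRITE) == 0);
}
```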



* [PATCH 03/11] fs: remove the call_{read,write}_iter functions
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 01/11] fs: call file_{start,end}_write from __kernel_write Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 02/11] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 04/11] fs: implement kernel_write using __kernel_write Christoph Hellwig
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

Just open coding the method calls is a lot easier to follow.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/loop.c              |  4 ++--
 drivers/target/target_core_file.c |  4 ++--
 fs/aio.c                          |  4 ++--
 fs/io_uring.c                     |  4 ++--
 fs/read_write.c                   | 12 ++++++------
 fs/splice.c                       |  2 +-
 include/linux/fs.h                | 12 ------------
 7 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index da693e6a834e5..ad167050a4ec4 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -572,9 +572,9 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
 		kthread_associate_blkcg(cmd->css);
 
 	if (rw == WRITE)
-		ret = call_write_iter(file, &cmd->iocb, &iter);
+		ret = file->f_op->write_iter(&cmd->iocb, &iter);
 	else
-		ret = call_read_iter(file, &cmd->iocb, &iter);
+		ret = file->f_op->read_iter(&cmd->iocb, &iter);
 
 	lo_rw_aio_do_completion(cmd);
 	kthread_associate_blkcg(NULL);
diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c
index 7143d03f0e027..79f0707877917 100644
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -303,9 +303,9 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents,
 		aio_cmd->iocb.ki_flags |= IOCB_DSYNC;
 
 	if (is_write)
-		ret = call_write_iter(file, &aio_cmd->iocb, &iter);
+		ret = file->f_op->write_iter(&aio_cmd->iocb, &iter);
 	else
-		ret = call_read_iter(file, &aio_cmd->iocb, &iter);
+		ret = file->f_op->read_iter(&aio_cmd->iocb, &iter);
 
 	kfree(bvec);
 
diff --git a/fs/aio.c b/fs/aio.c
index 5f3d3d8149287..1ccc0efdc357d 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1540,7 +1540,7 @@ static int aio_read(struct kiocb *req, const struct iocb *iocb,
 		return ret;
 	ret = rw_verify_area(READ, file, &req->ki_pos, iov_iter_count(&iter));
 	if (!ret)
-		aio_rw_done(req, call_read_iter(file, req, &iter));
+		aio_rw_done(req, file->f_op->read_iter(req, &iter));
 	kfree(iovec);
 	return ret;
 }
@@ -1580,7 +1580,7 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb,
 			__sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE);
 		}
 		req->ki_flags |= IOCB_WRITE;
-		aio_rw_done(req, call_write_iter(file, req, &iter));
+		aio_rw_done(req, file->f_op->write_iter(req, &iter));
 	}
 	kfree(iovec);
 	return ret;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0b91b06311735..ecad3dba1b23f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2590,7 +2590,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
 		ssize_t ret2;
 
 		if (req->file->f_op->read_iter)
-			ret2 = call_read_iter(req->file, kiocb, &iter);
+			ret2 = req->file->f_op->read_iter(kiocb, &iter);
 		else
 			ret2 = loop_rw_iter(READ, req->file, kiocb, &iter);
 
@@ -2705,7 +2705,7 @@ static int io_write(struct io_kiocb *req, bool force_nonblock)
 			current->signal->rlim[RLIMIT_FSIZE].rlim_cur = req->fsize;
 
 		if (req->file->f_op->write_iter)
-			ret2 = call_write_iter(req->file, kiocb, &iter);
+			ret2 = req->file->f_op->write_iter(kiocb, &iter);
 		else
 			ret2 = loop_rw_iter(WRITE, req->file, kiocb, &iter);
 
diff --git a/fs/read_write.c b/fs/read_write.c
index d5c754080e5a5..d91fe7ff6cc55 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -412,7 +412,7 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 	kiocb.ki_pos = (ppos ? *ppos : 0);
 	iov_iter_init(&iter, READ, &iov, 1, len);
 
-	ret = call_read_iter(filp, &kiocb, &iter);
+	ret = filp->f_op->read_iter(&kiocb, &iter);
 	BUG_ON(ret == -EIOCBQUEUED);
 	if (ppos)
 		*ppos = kiocb.ki_pos;
@@ -481,7 +481,7 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 	kiocb.ki_pos = (ppos ? *ppos : 0);
 	iov_iter_init(&iter, WRITE, &iov, 1, len);
 
-	ret = call_write_iter(filp, &kiocb, &iter);
+	ret = filp->f_op->write_iter(&kiocb, &iter);
 	BUG_ON(ret == -EIOCBQUEUED);
 	if (ret > 0 && ppos)
 		*ppos = kiocb.ki_pos;
@@ -693,9 +693,9 @@ static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
 	kiocb.ki_pos = (ppos ? *ppos : 0);
 
 	if (type == READ)
-		ret = call_read_iter(filp, &kiocb, iter);
+		ret = filp->f_op->read_iter(&kiocb, iter);
 	else
-		ret = call_write_iter(filp, &kiocb, iter);
+		ret = filp->f_op->write_iter(&kiocb, iter);
 	BUG_ON(ret == -EIOCBQUEUED);
 	if (ppos)
 		*ppos = kiocb.ki_pos;
@@ -964,7 +964,7 @@ ssize_t vfs_iocb_iter_read(struct file *file, struct kiocb *iocb,
 	if (ret < 0)
 		return ret;
 
-	ret = call_read_iter(file, iocb, iter);
+	ret = file->f_op->read_iter(iocb, iter);
 out:
 	if (ret >= 0)
 		fsnotify_access(file);
@@ -1028,7 +1028,7 @@ ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb,
 	if (ret < 0)
 		return ret;
 
-	ret = call_write_iter(file, iocb, iter);
+	ret = file->f_op->write_iter(iocb, iter);
 	if (ret > 0)
 		fsnotify_modify(file);
 
diff --git a/fs/splice.c b/fs/splice.c
index 4735defc46ee6..05f52b02320b4 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -310,7 +310,7 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
 	i_head = to.head;
 	init_sync_kiocb(&kiocb, in);
 	kiocb.ki_pos = *ppos;
-	ret = call_read_iter(in, &kiocb, &to);
+	ret = in->f_op->read_iter(&kiocb, &to);
 	if (ret > 0) {
 		*ppos = kiocb.ki_pos;
 		file_accessed(in);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 45cc10cdf6ddd..21f126957c2cf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1895,18 +1895,6 @@ struct inode_operations {
 	int (*set_acl)(struct inode *, struct posix_acl *, int);
 } ____cacheline_aligned;
 
-static inline ssize_t call_read_iter(struct file *file, struct kiocb *kio,
-				     struct iov_iter *iter)
-{
-	return file->f_op->read_iter(kio, iter);
-}
-
-static inline ssize_t call_write_iter(struct file *file, struct kiocb *kio,
-				      struct iov_iter *iter)
-{
-	return file->f_op->write_iter(kio, iter);
-}
-
 static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	return file->f_op->mmap(file, vma);
-- 
2.26.2



* [PATCH 04/11] fs: implement kernel_write using __kernel_write
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
                   ` (2 preceding siblings ...)
  2020-05-08  9:22 ` [PATCH 03/11] fs: remove the call_{read,write}_iter functions Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 05/11] fs: remove __vfs_write Christoph Hellwig
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

Consolidate the two in-kernel write helpers to make upcoming changes
easier.  The only differences are the call to rw_verify_area, which is
missing from __kernel_write and is therefore made by kernel_write
itself, and an access_ok check that doesn't make sense for kernel
buffers to start with.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index d91fe7ff6cc55..6b456a257b31c 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -531,16 +531,13 @@ EXPORT_SYMBOL(__kernel_write);
 ssize_t kernel_write(struct file *file, const void *buf, size_t count,
 			    loff_t *pos)
 {
-	mm_segment_t old_fs;
-	ssize_t res;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	res = vfs_write(file, (__force const char __user *)buf, count, pos);
-	set_fs(old_fs);
+	ssize_t ret;
 
-	return res;
+	ret = rw_verify_area(WRITE, file, pos, count);
+	if (ret)
+		return ret;
+	return __kernel_write(file, buf, count, pos);
 }
 EXPORT_SYMBOL(kernel_write);
 
-- 
2.26.2
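
The layering that results from this patch (a checked helper that validates first and then reuses the low-level one) might be sketched in userspace like this. Everything here is hypothetical: struct model_file is an in-memory stand-in, and model_verify_area() only does a simple bounds check, which is an assumption made for the sketch and not what rw_verify_area() actually verifies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory "file" for this sketch. */
struct model_file {
	char data[64];
};

/* Simplified stand-in for rw_verify_area(): bounds checks only. */
static int model_verify_area(const struct model_file *f, long pos,
			     size_t count)
{
	if (pos < 0)
		return -EINVAL;
	if ((size_t)pos > sizeof(f->data) ||
	    count > sizeof(f->data) - (size_t)pos)
		return -EINVAL;
	return 0;
}

/* Low-level helper: trusts its caller, like __kernel_write(). */
long model_raw_write(struct model_file *f, const void *buf, size_t count,
		     long *pos)
{
	memcpy(f->data + *pos, buf, count);
	*pos += (long)count;
	return (long)count;
}

/* Checked helper: verify first, then reuse the low-level one. */
long model_checked_write(struct model_file *f, const void *buf, size_t count,
			 long *pos)
{
	int ret = model_verify_area(f, *pos, count);

	if (ret)
		return ret;
	return model_raw_write(f, buf, count, pos);
}

void model_layering_example(void)
{
	struct model_file f = { { 0 } };
	long pos = 0;

	assert(model_checked_write(&f, "hi", 2, &pos) == 2);
	assert(pos == 2);
	pos = 63;	/* only one byte left: rejected, position untouched */
	assert(model_checked_write(&f, "hi", 2, &pos) == -EINVAL);
	assert(pos == 63);
}
```

Callers that have already validated their arguments can keep using the raw helper; everyone else goes through the checked one.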



* [PATCH 05/11] fs: remove __vfs_write
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
                   ` (3 preceding siblings ...)
  2020-05-08  9:22 ` [PATCH 04/11] fs: implement kernel_write using __kernel_write Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 06/11] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

Fold it into the two callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 6b456a257b31c..67a035782874b 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -488,17 +488,6 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 	return ret;
 }
 
-static ssize_t __vfs_write(struct file *file, const char __user *p,
-			   size_t count, loff_t *pos)
-{
-	if (file->f_op->write)
-		return file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		return new_sync_write(file, p, count, pos);
-	else
-		return -EINVAL;
-}
-
 ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs;
@@ -516,7 +505,12 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	p = (__force const char __user *)buf;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	ret = __vfs_write(file, p, count, pos);
+	if (file->f_op->write)
+		ret = file->f_op->write(file, p, count, pos);
+	else if (file->f_op->write_iter)
+		ret = new_sync_write(file, p, count, pos);
+	else
+		ret = -EINVAL;
 	set_fs(old_fs);
 	if (ret > 0) {
 		fsnotify_modify(file);
@@ -553,19 +547,23 @@ ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_
 		return -EFAULT;
 
 	ret = rw_verify_area(WRITE, file, pos, count);
-	if (!ret) {
-		if (count > MAX_RW_COUNT)
-			count =  MAX_RW_COUNT;
-		file_start_write(file);
-		ret = __vfs_write(file, buf, count, pos);
-		if (ret > 0) {
-			fsnotify_modify(file);
-			add_wchar(current, ret);
-		}
-		inc_syscw(current);
-		file_end_write(file);
+	if (ret)
+		return ret;
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
+	file_start_write(file);
+	if (file->f_op->write)
+		ret = file->f_op->write(file, buf, count, pos);
+	else if (file->f_op->write_iter)
+		ret = new_sync_write(file, buf, count, pos);
+	else
+		ret = -EINVAL;
+	if (ret > 0) {
+		fsnotify_modify(file);
+		add_wchar(current, ret);
 	}
-
+	inc_syscw(current);
+	file_end_write(file);
 	return ret;
 }
 
-- 
2.26.2



* [PATCH 06/11] fs: don't change the address limit for ->write_iter in __kernel_write
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
                   ` (4 preceding siblings ...)
  2020-05-08  9:22 ` [PATCH 05/11] fs: remove __vfs_write Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 07/11] fs: add a __kernel_read helper Christoph Hellwig
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

If we write to a file that implements ->write_iter there is no need
to change the address limit if we send a kvec down.  Implement that
case, and prefer it over using plain ->write with a changed address
limit if available.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 67a035782874b..8a55e81bd9ac7 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -488,10 +488,9 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t
 	return ret;
 }
 
-ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
+ssize_t __kernel_write(struct file *file, const void *buf, size_t count,
+		loff_t *pos)
 {
-	mm_segment_t old_fs;
-	const char __user *p;
 	ssize_t ret;
 
 	if (!(file->f_mode & FMODE_WRITE))
@@ -500,18 +499,29 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 		return -EINVAL;
 
 	file_start_write(file);
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	p = (__force const char __user *)buf;
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	if (file->f_op->write)
-		ret = file->f_op->write(file, p, count, pos);
-	else if (file->f_op->write_iter)
-		ret = new_sync_write(file, p, count, pos);
-	else
+	if (file->f_op->write_iter) {
+		struct kvec iov = { .iov_base = (void *)buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, WRITE, &iov, 1, count);
+		ret = file->f_op->write_iter(&kiocb, &iter);
+		if (ret > 0)
+			*pos = kiocb.ki_pos;
+	} else if (file->f_op->write) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
+		ret = file->f_op->write(file, (__force const char __user *)buf,
+				count, pos);
+		set_fs(old_fs);
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_modify(file);
 		add_wchar(current, ret);
-- 
2.26.2
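
The dispatch order introduced here (prefer ->write_iter with a kernel vector, fall back to plain ->write, otherwise fail) can be modelled in userspace roughly as below. This is a sketch, not the kernel implementation: struct model_kvec, struct model_ops, the demo callbacks and the last_path marker are all hypothetical, and the set_fs() dance around the fallback is only indicated by a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified analogue of struct kvec. */
struct model_kvec {
	const void *base;
	size_t len;
};

/* Hypothetical method table with the two optional write paths. */
struct model_ops {
	long (*write)(const void *buf, size_t count, long *pos);
	long (*write_iter)(const struct model_kvec *vec, long *pos);
};

enum { PATH_NONE, PATH_ITER, PATH_LEGACY };
static int last_path = PATH_NONE;	/* records which path ran */

static long demo_write(const void *buf, size_t count, long *pos)
{
	(void)buf;
	last_path = PATH_LEGACY;
	*pos += (long)count;
	return (long)count;
}

static long demo_write_iter(const struct model_kvec *vec, long *pos)
{
	last_path = PATH_ITER;
	*pos += (long)vec->len;
	return (long)vec->len;
}

long model_kernel_write(const struct model_ops *ops, const void *buf,
			size_t count, long *pos)
{
	if (ops->write_iter) {
		struct model_kvec vec = { .base = buf, .len = count };
		long kpos = *pos;
		long ret = ops->write_iter(&vec, &kpos);

		if (ret > 0)
			*pos = kpos;	/* publish the position on success */
		return ret;
	}
	if (ops->write)
		/* in the kernel this call sits between set_fs() calls */
		return ops->write(buf, count, pos);
	return -EINVAL;
}

void model_dispatch_example(void)
{
	struct model_ops both = { .write = demo_write,
				  .write_iter = demo_write_iter };
	long pos = 0;

	assert(model_kernel_write(&both, "abcd", 4, &pos) == 4);
	assert(last_path == PATH_ITER);	/* iter path wins when both exist */
}
```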



* [PATCH 07/11] fs: add a __kernel_read helper
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
                   ` (5 preceding siblings ...)
  2020-05-08  9:22 ` [PATCH 06/11] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 08/11] integrity/ima: switch to using __kernel_read Christoph Hellwig
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

This is the counterpart to __kernel_write, and skips the rw_verify_area
call compared to kernel_read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c    | 21 +++++++++++++++++++++
 include/linux/fs.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index 8a55e81bd9ac7..93f5724b4837d 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -430,6 +430,27 @@ ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
 		return -EINVAL;
 }
 
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
+{
+	mm_segment_t old_fs = get_fs();
+	ssize_t ret;
+
+	if (!(file->f_mode & FMODE_CAN_READ))
+		return -EINVAL;
+
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
+	set_fs(KERNEL_DS);
+	ret = __vfs_read(file, (void __user *)buf, count, pos);
+	set_fs(old_fs);
+	if (ret > 0) {
+		fsnotify_access(file);
+		add_rchar(current, ret);
+	}
+	inc_syscr(current);
+	return ret;
+}
+
 ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21f126957c2cf..6441aaa25f8f2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3011,6 +3011,7 @@ extern int kernel_read_file_from_path_initns(const char *, void **, loff_t *, lo
 extern int kernel_read_file_from_fd(int, void **, loff_t *, loff_t,
 				    enum kernel_read_file_id);
 extern ssize_t kernel_read(struct file *, void *, size_t, loff_t *);
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos);
 extern ssize_t kernel_write(struct file *, const void *, size_t, loff_t *);
 extern ssize_t __kernel_write(struct file *, const void *, size_t, loff_t *);
 extern struct file * open_exec(const char *);
-- 
2.26.2
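
The "count > MAX_RW_COUNT" clamp that the new helper shares with the other read/write paths can be illustrated in userspace as follows. This is a sketch under stated assumptions: the MODEL_* names are hypothetical, a 4 KiB page size and a 32-bit int are assumed, and the macro follows the "INT_MAX rounded down to a page boundary" shape of the kernel's MAX_RW_COUNT.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define MODEL_PAGE_SIZE		4096UL
#define MODEL_PAGE_MASK		(~(MODEL_PAGE_SIZE - 1))
/* INT_MAX rounded down to a page boundary. */
#define MODEL_MAX_RW_COUNT	((size_t)(INT_MAX & MODEL_PAGE_MASK))

/* Requests larger than the limit are shortened, not rejected. */
size_t model_clamp_count(size_t count)
{
	if (count > MODEL_MAX_RW_COUNT)
		count = MODEL_MAX_RW_COUNT;
	return count;
}

void model_clamp_example(void)
{
	assert(model_clamp_count(5) == 5);
	assert(model_clamp_count((size_t)-1) == MODEL_MAX_RW_COUNT);
}
```

A caller that asks for more than the limit therefore sees a short read and has to loop if it really needs the remainder.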



* [PATCH 08/11] integrity/ima: switch to using __kernel_read
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
                   ` (6 preceding siblings ...)
  2020-05-08  9:22 ` [PATCH 07/11] fs: add a __kernel_read helper Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 09/11] fs: implement kernel_read " Christoph Hellwig
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

__kernel_read has a bunch of additional sanity checks, and this moves
the set_fs out of non-core code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 security/integrity/iint.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/security/integrity/iint.c b/security/integrity/iint.c
index e12c4900510f6..1d20003243c3f 100644
--- a/security/integrity/iint.c
+++ b/security/integrity/iint.c
@@ -188,19 +188,7 @@ DEFINE_LSM(integrity) = {
 int integrity_kernel_read(struct file *file, loff_t offset,
 			  void *addr, unsigned long count)
 {
-	mm_segment_t old_fs;
-	char __user *buf = (char __user *)addr;
-	ssize_t ret;
-
-	if (!(file->f_mode & FMODE_READ))
-		return -EBADF;
-
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	ret = __vfs_read(file, buf, count, &offset);
-	set_fs(old_fs);
-
-	return ret;
+	return __kernel_read(file, addr, count, &offset);
 }
 
 /*
-- 
2.26.2



* [PATCH 09/11] fs: implement kernel_read using __kernel_read
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
                   ` (7 preceding siblings ...)
  2020-05-08  9:22 ` [PATCH 08/11] integrity/ima: switch to using __kernel_read Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 10/11] fs: remove __vfs_read Christoph Hellwig
  2020-05-08  9:22 ` [PATCH 11/11] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

Consolidate the two in-kernel read helpers to make upcoming changes
easier.  The only differences are the call to rw_verify_area, which is
missing from __kernel_read and is therefore made by kernel_read itself,
and an access_ok check that doesn't make sense for kernel buffers to
start with.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 93f5724b4837d..0ffbed5fd8136 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -453,15 +453,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs;
-	ssize_t result;
+	ssize_t ret;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	result = vfs_read(file, (void __user *)buf, count, pos);
-	set_fs(old_fs);
-	return result;
+	ret = rw_verify_area(READ, file, pos, count);
+	if (ret)
+		return ret;
+	return __kernel_read(file, buf, count, pos);
 }
 EXPORT_SYMBOL(kernel_read);
 
-- 
2.26.2



* [PATCH 10/11] fs: remove __vfs_read
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
                   ` (8 preceding siblings ...)
  2020-05-08  9:22 ` [PATCH 09/11] fs: implement kernel_read " Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  2020-05-08 15:45   ` Ira Weiny
  2020-05-08  9:22 ` [PATCH 11/11] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
  10 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

Fold it into the two callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c    | 43 +++++++++++++++++++++----------------------
 include/linux/fs.h |  1 -
 2 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 0ffbed5fd8136..f0009b506014c 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -419,17 +419,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 	return ret;
 }
 
-ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
-		   loff_t *pos)
-{
-	if (file->f_op->read)
-		return file->f_op->read(file, buf, count, pos);
-	else if (file->f_op->read_iter)
-		return new_sync_read(file, buf, count, pos);
-	else
-		return -EINVAL;
-}
-
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs = get_fs();
@@ -441,7 +430,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
 	set_fs(KERNEL_DS);
-	ret = __vfs_read(file, (void __user *)buf, count, pos);
+	if (file->f_op->read)
+		ret = file->f_op->read(file, (void __user *)buf, count, pos);
+	else if (file->f_op->read_iter)
+		ret = new_sync_read(file, (void __user *)buf, count, pos);
+	else
+		ret = -EINVAL;
 	set_fs(old_fs);
 	if (ret > 0) {
 		fsnotify_access(file);
@@ -474,17 +468,22 @@ ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
 		return -EFAULT;
 
 	ret = rw_verify_area(READ, file, pos, count);
-	if (!ret) {
-		if (count > MAX_RW_COUNT)
-			count =  MAX_RW_COUNT;
-		ret = __vfs_read(file, buf, count, pos);
-		if (ret > 0) {
-			fsnotify_access(file);
-			add_rchar(current, ret);
-		}
-		inc_syscr(current);
-	}
+	if (ret)
+		return ret;
+	if (count > MAX_RW_COUNT)
+		count =  MAX_RW_COUNT;
 
+	if (file->f_op->read)
+		ret = file->f_op->read(file, buf, count, pos);
+	else if (file->f_op->read_iter)
+		ret = new_sync_read(file, buf, count, pos);
+	else
+		ret = -EINVAL;
+	if (ret > 0) {
+		fsnotify_access(file);
+		add_rchar(current, ret);
+	}
+	inc_syscr(current);
 	return ret;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6441aaa25f8f2..4c10a07a36178 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1905,7 +1905,6 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
 			      struct iovec *fast_pointer,
 			      struct iovec **ret_pointer);
 
-extern ssize_t __vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
-- 
2.26.2



* [PATCH 11/11] fs: don't change the address limit for ->read_iter in __kernel_read
  2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
                   ` (9 preceding siblings ...)
  2020-05-08  9:22 ` [PATCH 10/11] fs: remove __vfs_read Christoph Hellwig
@ 2020-05-08  9:22 ` Christoph Hellwig
  10 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08  9:22 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-security-module

If we read from a file that implements ->read_iter there is no need
to change the address limit if we send a kvec down.  Implement that
case, and prefer it over using plain ->read with a changed address
limit if available.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/read_write.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index f0009b506014c..70715a0e2375d 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -421,7 +421,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs = get_fs();
 	ssize_t ret;
 
 	if (!(file->f_mode & FMODE_CAN_READ))
@@ -429,14 +428,25 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 	if (count > MAX_RW_COUNT)
 		count =  MAX_RW_COUNT;
-	set_fs(KERNEL_DS);
-	if (file->f_op->read)
+	if (file->f_op->read_iter) {
+		struct kvec iov = { .iov_base = buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, READ, &iov, 1, count);
+		ret = file->f_op->read_iter(&kiocb, &iter);
+		*pos = kiocb.ki_pos;
+	} else if (file->f_op->read) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
 		ret = file->f_op->read(file, (void __user *)buf, count, pos);
-	else if (file->f_op->read_iter)
-		ret = new_sync_read(file, (void __user *)buf, count, pos);
-	else
+		set_fs(old_fs);
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_access(file);
 		add_rchar(current, ret);
-- 
2.26.2



* Re: [PATCH 10/11] fs: remove __vfs_read
  2020-05-08  9:22 ` [PATCH 10/11] fs: remove __vfs_read Christoph Hellwig
@ 2020-05-08 15:45   ` Ira Weiny
  2020-05-08 15:46     ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Ira Weiny @ 2020-05-08 15:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Al Viro, linux-kernel, linux-fsdevel, linux-security-module

On Fri, May 08, 2020 at 11:22:21AM +0200, Christoph Hellwig wrote:
> Fold it into the two callers.

In 5.7-rc4, it looks like __vfs_read() is called from
security/integrity/iint.c

Was that removed somewhere prior to this patch?

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/read_write.c    | 43 +++++++++++++++++++++----------------------
>  include/linux/fs.h |  1 -
>  2 files changed, 21 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 0ffbed5fd8136..f0009b506014c 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -419,17 +419,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
>  	return ret;
>  }
>  
> -ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
> -		   loff_t *pos)
> -{
> -	if (file->f_op->read)
> -		return file->f_op->read(file, buf, count, pos);
> -	else if (file->f_op->read_iter)
> -		return new_sync_read(file, buf, count, pos);
> -	else
> -		return -EINVAL;
> -}
> -
>  ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
>  {
>  	mm_segment_t old_fs = get_fs();
> @@ -441,7 +430,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
>  	if (count > MAX_RW_COUNT)
>  		count =  MAX_RW_COUNT;
>  	set_fs(KERNEL_DS);
> -	ret = __vfs_read(file, (void __user *)buf, count, pos);
> +	if (file->f_op->read)
> +		ret = file->f_op->read(file, (void __user *)buf, count, pos);
> +	else if (file->f_op->read_iter)
> +		ret = new_sync_read(file, (void __user *)buf, count, pos);
> +	else
> +		ret = -EINVAL;
>  	set_fs(old_fs);
>  	if (ret > 0) {
>  		fsnotify_access(file);
> @@ -474,17 +468,22 @@ ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
>  		return -EFAULT;
>  
>  	ret = rw_verify_area(READ, file, pos, count);
> -	if (!ret) {
> -		if (count > MAX_RW_COUNT)
> -			count =  MAX_RW_COUNT;
> -		ret = __vfs_read(file, buf, count, pos);
> -		if (ret > 0) {
> -			fsnotify_access(file);
> -			add_rchar(current, ret);
> -		}
> -		inc_syscr(current);
> -	}
> +	if (ret)
> +		return ret;
> +	if (count > MAX_RW_COUNT)
> +		count =  MAX_RW_COUNT;

Couldn't this clean up still happen while keeping __vfs_read()?

Ira

> +	if (file->f_op->read)
> +		ret = file->f_op->read(file, buf, count, pos);
> +	else if (file->f_op->read_iter)
> +		ret = new_sync_read(file, buf, count, pos);
> +	else
> +		ret = -EINVAL;
> +	if (ret > 0) {
> +		fsnotify_access(file);
> +		add_rchar(current, ret);
> +	}
> +	inc_syscr(current);
>  	return ret;
>  }
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 6441aaa25f8f2..4c10a07a36178 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1905,7 +1905,6 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
>  			      struct iovec *fast_pointer,
>  			      struct iovec **ret_pointer);
>  
> -extern ssize_t __vfs_read(struct file *, char __user *, size_t, loff_t *);
>  extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
>  extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
>  extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
> -- 
> 2.26.2
> 


* Re: [PATCH 10/11] fs: remove __vfs_read
  2020-05-08 15:45   ` Ira Weiny
@ 2020-05-08 15:46     ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2020-05-08 15:46 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Christoph Hellwig, Al Viro, linux-kernel, linux-fsdevel,
	linux-security-module

On Fri, May 08, 2020 at 08:45:53AM -0700, Ira Weiny wrote:
> On Fri, May 08, 2020 at 11:22:21AM +0200, Christoph Hellwig wrote:
> > Fold it into the two callers.
> 
> In 5.7-rc4, it looks like __vfs_read() is called from
> security/integrity/iint.c
> 
> Was that removed somewhere prior to this patch?

[PATCH 08/11] integrity/ima: switch to using __kernel_read


Thread overview: 14+ messages
2020-05-08  9:22 clean up kernel_{read,write} & friends Christoph Hellwig
2020-05-08  9:22 ` [PATCH 01/11] fs: call file_{start,end}_write from __kernel_write Christoph Hellwig
2020-05-08  9:22 ` [PATCH 02/11] fs: check FMODE_WRITE in __kernel_write Christoph Hellwig
2020-05-08  9:22 ` [PATCH 03/11] fs: remove the call_{read,write}_iter functions Christoph Hellwig
2020-05-08  9:22 ` [PATCH 04/11] fs: implement kernel_write using __kernel_write Christoph Hellwig
2020-05-08  9:22 ` [PATCH 05/11] fs: remove __vfs_write Christoph Hellwig
2020-05-08  9:22 ` [PATCH 06/11] fs: don't change the address limit for ->write_iter in __kernel_write Christoph Hellwig
2020-05-08  9:22 ` [PATCH 07/11] fs: add a __kernel_read helper Christoph Hellwig
2020-05-08  9:22 ` [PATCH 08/11] integrity/ima: switch to using __kernel_read Christoph Hellwig
2020-05-08  9:22 ` [PATCH 09/11] fs: implement kernel_read " Christoph Hellwig
2020-05-08  9:22 ` [PATCH 10/11] fs: remove __vfs_read Christoph Hellwig
2020-05-08 15:45   ` Ira Weiny
2020-05-08 15:46     ` Christoph Hellwig
2020-05-08  9:22 ` [PATCH 11/11] fs: don't change the address limit for ->read_iter in __kernel_read Christoph Hellwig
