All of lore.kernel.org
 help / color / mirror / Atom feed
* iomap_dio_rw ->end_io improvements
@ 2019-09-03 13:03 Christoph Hellwig
  2019-09-03 13:03 ` [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io Christoph Hellwig
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-03 13:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-xfs; +Cc: Goldwyn Rodrigues, Matthew Bobrowski

Hi all,

this series contains two updates to the end_io handling for the iomap
direct I/O code.  The first patch is from Matthew and passes the size and
error separately, and has been taken from his series to convert ext4 to
use iomap for direct I/O.  The second one moves the end_io handler into a
separate ops structure.  This should help with Goldwyns series to use the
iomap code in btrfs, but as-is already ensures that we don't store a
function pointer in a mutable data structure.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io
  2019-09-03 13:03 iomap_dio_rw ->end_io improvements Christoph Hellwig
@ 2019-09-03 13:03 ` Christoph Hellwig
  2019-09-03 14:44   ` Darrick J. Wong
  2019-09-03 15:24   ` Matthew Wilcox
  2019-09-03 13:03 ` [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure Christoph Hellwig
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-03 13:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-xfs; +Cc: Goldwyn Rodrigues, Matthew Bobrowski

From: Matthew Bobrowski <mbobrowski@mbobrowski.org>

Modify the calling convention for the iomap_dio_rw ->end_io() callback.
Rather than passing either dio->error or dio->size as the 'size' argument,
instead pass both the dio->error and the dio->size value separately.

In the instance that an error occurred during a write, we currently cannot
determine whether any blocks have been allocated beyond the current EOF and
data has subsequently been written to these blocks within the ->end_io()
callback. As a result, we cannot judge whether we should take the truncate
failed write path. Having both dio->error and dio->size will allow us to
perform such checks within this callback.

Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
[hch: minor cleanups]
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/direct-io.c  | 9 +++------
 fs/xfs/xfs_file.c     | 8 +++++---
 include/linux/iomap.h | 4 ++--
 3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 10517cea9682..2ccf1c6460d4 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -77,13 +77,10 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 	loff_t offset = iocb->ki_pos;
 	ssize_t ret;
 
-	if (dio->end_io) {
-		ret = dio->end_io(iocb,
-				dio->error ? dio->error : dio->size,
-				dio->flags);
-	} else {
+	if (dio->end_io)
+		ret = dio->end_io(iocb, dio->size, dio->error, dio->flags);
+	else
 		ret = dio->error;
-	}
 
 	if (likely(!ret)) {
 		ret = dio->size;
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index d952d5962e93..3d8e6db9ef77 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -370,21 +370,23 @@ static int
 xfs_dio_write_end_io(
 	struct kiocb		*iocb,
 	ssize_t			size,
+	int			error,
 	unsigned		flags)
 {
 	struct inode		*inode = file_inode(iocb->ki_filp);
 	struct xfs_inode	*ip = XFS_I(inode);
 	loff_t			offset = iocb->ki_pos;
 	unsigned int		nofs_flag;
-	int			error = 0;
 
 	trace_xfs_end_io_direct_write(ip, offset, size);
 
 	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
 		return -EIO;
 
-	if (size <= 0)
-		return size;
+	if (error)
+		return error;
+	if (!size)
+		return 0;
 
 	/*
 	 * Capture amount written on completion as we can't reliably account
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index bc499ceae392..50bb746d2216 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -188,8 +188,8 @@ sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
  */
 #define IOMAP_DIO_UNWRITTEN	(1 << 0)	/* covers unwritten extent(s) */
 #define IOMAP_DIO_COW		(1 << 1)	/* covers COW extent(s) */
-typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t ret,
-		unsigned flags);
+typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t size, int error,
+				 unsigned int flags);
 ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 		const struct iomap_ops *ops, iomap_dio_end_io_t end_io);
 int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure
  2019-09-03 13:03 iomap_dio_rw ->end_io improvements Christoph Hellwig
  2019-09-03 13:03 ` [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io Christoph Hellwig
@ 2019-09-03 13:03 ` Christoph Hellwig
  2019-09-03 14:46   ` Darrick J. Wong
  2019-09-03 16:14   ` Matthew Wilcox
  2019-09-03 21:32 ` iomap_dio_rw ->end_io improvements Matthew Bobrowski
  2019-09-03 22:16 ` Darrick J. Wong
  3 siblings, 2 replies; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-03 13:03 UTC (permalink / raw)
  To: linux-fsdevel, linux-xfs; +Cc: Goldwyn Rodrigues, Matthew Bobrowski

Add a new iomap_dio_ops structure that for now just contains the end_io
handler.  This avoid storing the function pointer in a mutable structure,
which is a possible exploit vector for kernel code execution, and prepares
for adding a submit_io handler that btrfs needs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/direct-io.c  | 21 ++++++++++-----------
 fs/xfs/xfs_file.c     |  6 +++++-
 include/linux/iomap.h | 10 +++++++---
 3 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 2ccf1c6460d4..1fc28c2da279 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -24,7 +24,7 @@
 
 struct iomap_dio {
 	struct kiocb		*iocb;
-	iomap_dio_end_io_t	*end_io;
+	const struct iomap_dio_ops *dops;
 	loff_t			i_size;
 	loff_t			size;
 	atomic_t		ref;
@@ -72,15 +72,14 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
 
 static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 {
+	const struct iomap_dio_ops *dops = dio->dops;
 	struct kiocb *iocb = dio->iocb;
 	struct inode *inode = file_inode(iocb->ki_filp);
 	loff_t offset = iocb->ki_pos;
-	ssize_t ret;
+	ssize_t ret = dio->error;
 
-	if (dio->end_io)
-		ret = dio->end_io(iocb, dio->size, dio->error, dio->flags);
-	else
-		ret = dio->error;
+	if (dops && dops->end_io)
+		ret = dops->end_io(iocb, dio->size, ret, dio->flags);
 
 	if (likely(!ret)) {
 		ret = dio->size;
@@ -98,9 +97,9 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio)
 	 * one is a pretty crazy thing to do, so we don't support it 100%.  If
 	 * this invalidation fails, tough, the write still worked...
 	 *
-	 * And this page cache invalidation has to be after dio->end_io(), as
-	 * some filesystems convert unwritten extents to real allocations in
-	 * end_io() when necessary, otherwise a racing buffer read would cache
+	 * And this page cache invalidation has to be after ->end_io(), as some
+	 * filesystems convert unwritten extents to real allocations in
+	 * ->end_io() when necessary, otherwise a racing buffer read would cache
 	 * zeros from unwritten extents.
 	 */
 	if (!dio->error &&
@@ -393,7 +392,7 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
  */
 ssize_t
 iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
-		const struct iomap_ops *ops, iomap_dio_end_io_t end_io)
+		const struct iomap_ops *ops, const struct iomap_dio_ops *dops)
 {
 	struct address_space *mapping = iocb->ki_filp->f_mapping;
 	struct inode *inode = file_inode(iocb->ki_filp);
@@ -418,7 +417,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	atomic_set(&dio->ref, 1);
 	dio->size = 0;
 	dio->i_size = i_size_read(inode);
-	dio->end_io = end_io;
+	dio->dops = dops;
 	dio->error = 0;
 	dio->flags = 0;
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 3d8e6db9ef77..1ffb179f35d2 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -443,6 +443,10 @@ xfs_dio_write_end_io(
 	return error;
 }
 
+static const struct iomap_dio_ops xfs_dio_write_ops = {
+	.end_io		= xfs_dio_write_end_io,
+};
+
 /*
  * xfs_file_dio_aio_write - handle direct IO writes
  *
@@ -543,7 +547,7 @@ xfs_file_dio_aio_write(
 	}
 
 	trace_xfs_file_direct_write(ip, count, iocb->ki_pos);
-	ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, xfs_dio_write_end_io);
+	ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, &xfs_dio_write_ops);
 
 	/*
 	 * If unaligned, this is the only IO in-flight. If it has not yet
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 50bb746d2216..7aa5d6117936 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -188,10 +188,14 @@ sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
  */
 #define IOMAP_DIO_UNWRITTEN	(1 << 0)	/* covers unwritten extent(s) */
 #define IOMAP_DIO_COW		(1 << 1)	/* covers COW extent(s) */
-typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t size, int error,
-				 unsigned int flags);
+
+struct iomap_dio_ops {
+	int (*end_io)(struct kiocb *iocb, ssize_t size, int error,
+		      unsigned flags);
+};
+
 ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
-		const struct iomap_ops *ops, iomap_dio_end_io_t end_io);
+		const struct iomap_ops *ops, const struct iomap_dio_ops *dops);
 int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
 
 #ifdef CONFIG_SWAP
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io
  2019-09-03 13:03 ` [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io Christoph Hellwig
@ 2019-09-03 14:44   ` Darrick J. Wong
  2019-09-03 15:24   ` Matthew Wilcox
  1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-09-03 14:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel, linux-xfs, Goldwyn Rodrigues, Matthew Bobrowski

On Tue, Sep 03, 2019 at 03:03:26PM +0200, Christoph Hellwig wrote:
> From: Matthew Bobrowski <mbobrowski@mbobrowski.org>
> 
> Modify the calling convention for the iomap_dio_rw ->end_io() callback.
> Rather than passing either dio->error or dio->size as the 'size' argument,
> instead pass both the dio->error and the dio->size value separately.
> 
> In the instance that an error occurred during a write, we currently cannot
> determine whether any blocks have been allocated beyond the current EOF and
> data has subsequently been written to these blocks within the ->end_io()
> callback. As a result, we cannot judge whether we should take the truncate
> failed write path. Having both dio->error and dio->size will allow us to
> perform such checks within this callback.
> 
> Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
> [hch: minor cleanups]
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/iomap/direct-io.c  | 9 +++------
>  fs/xfs/xfs_file.c     | 8 +++++---
>  include/linux/iomap.h | 4 ++--
>  3 files changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 10517cea9682..2ccf1c6460d4 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -77,13 +77,10 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio)
>  	loff_t offset = iocb->ki_pos;
>  	ssize_t ret;
>  
> -	if (dio->end_io) {
> -		ret = dio->end_io(iocb,
> -				dio->error ? dio->error : dio->size,
> -				dio->flags);
> -	} else {
> +	if (dio->end_io)
> +		ret = dio->end_io(iocb, dio->size, dio->error, dio->flags);
> +	else
>  		ret = dio->error;
> -	}
>  
>  	if (likely(!ret)) {
>  		ret = dio->size;
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index d952d5962e93..3d8e6db9ef77 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -370,21 +370,23 @@ static int
>  xfs_dio_write_end_io(
>  	struct kiocb		*iocb,
>  	ssize_t			size,
> +	int			error,
>  	unsigned		flags)
>  {
>  	struct inode		*inode = file_inode(iocb->ki_filp);
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	loff_t			offset = iocb->ki_pos;
>  	unsigned int		nofs_flag;
> -	int			error = 0;
>  
>  	trace_xfs_end_io_direct_write(ip, offset, size);
>  
>  	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
>  		return -EIO;
>  
> -	if (size <= 0)
> -		return size;
> +	if (error)
> +		return error;
> +	if (!size)
> +		return 0;
>  
>  	/*
>  	 * Capture amount written on completion as we can't reliably account
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index bc499ceae392..50bb746d2216 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -188,8 +188,8 @@ sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
>   */
>  #define IOMAP_DIO_UNWRITTEN	(1 << 0)	/* covers unwritten extent(s) */
>  #define IOMAP_DIO_COW		(1 << 1)	/* covers COW extent(s) */
> -typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t ret,
> -		unsigned flags);
> +typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t size, int error,
> +				 unsigned int flags);
>  ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  		const struct iomap_ops *ops, iomap_dio_end_io_t end_io);
>  int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure
  2019-09-03 13:03 ` [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure Christoph Hellwig
@ 2019-09-03 14:46   ` Darrick J. Wong
  2019-09-03 16:14   ` Matthew Wilcox
  1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-09-03 14:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel, linux-xfs, Goldwyn Rodrigues, Matthew Bobrowski

On Tue, Sep 03, 2019 at 03:03:27PM +0200, Christoph Hellwig wrote:
> Add a new iomap_dio_ops structure that for now just contains the end_io
> handler.  This avoid storing the function pointer in a mutable structure,
> which is a possible exploit vector for kernel code execution, and prepares
> for adding a submit_io handler that btrfs needs.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/iomap/direct-io.c  | 21 ++++++++++-----------
>  fs/xfs/xfs_file.c     |  6 +++++-
>  include/linux/iomap.h | 10 +++++++---
>  3 files changed, 22 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 2ccf1c6460d4..1fc28c2da279 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -24,7 +24,7 @@
>  
>  struct iomap_dio {
>  	struct kiocb		*iocb;
> -	iomap_dio_end_io_t	*end_io;
> +	const struct iomap_dio_ops *dops;
>  	loff_t			i_size;
>  	loff_t			size;
>  	atomic_t		ref;
> @@ -72,15 +72,14 @@ static void iomap_dio_submit_bio(struct iomap_dio *dio, struct iomap *iomap,
>  
>  static ssize_t iomap_dio_complete(struct iomap_dio *dio)
>  {
> +	const struct iomap_dio_ops *dops = dio->dops;
>  	struct kiocb *iocb = dio->iocb;
>  	struct inode *inode = file_inode(iocb->ki_filp);
>  	loff_t offset = iocb->ki_pos;
> -	ssize_t ret;
> +	ssize_t ret = dio->error;
>  
> -	if (dio->end_io)
> -		ret = dio->end_io(iocb, dio->size, dio->error, dio->flags);
> -	else
> -		ret = dio->error;
> +	if (dops && dops->end_io)
> +		ret = dops->end_io(iocb, dio->size, ret, dio->flags);
>  
>  	if (likely(!ret)) {
>  		ret = dio->size;
> @@ -98,9 +97,9 @@ static ssize_t iomap_dio_complete(struct iomap_dio *dio)
>  	 * one is a pretty crazy thing to do, so we don't support it 100%.  If
>  	 * this invalidation fails, tough, the write still worked...
>  	 *
> -	 * And this page cache invalidation has to be after dio->end_io(), as
> -	 * some filesystems convert unwritten extents to real allocations in
> -	 * end_io() when necessary, otherwise a racing buffer read would cache
> +	 * And this page cache invalidation has to be after ->end_io(), as some
> +	 * filesystems convert unwritten extents to real allocations in
> +	 * ->end_io() when necessary, otherwise a racing buffer read would cache
>  	 * zeros from unwritten extents.
>  	 */
>  	if (!dio->error &&
> @@ -393,7 +392,7 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
>   */
>  ssize_t
>  iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> -		const struct iomap_ops *ops, iomap_dio_end_io_t end_io)
> +		const struct iomap_ops *ops, const struct iomap_dio_ops *dops)
>  {
>  	struct address_space *mapping = iocb->ki_filp->f_mapping;
>  	struct inode *inode = file_inode(iocb->ki_filp);
> @@ -418,7 +417,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  	atomic_set(&dio->ref, 1);
>  	dio->size = 0;
>  	dio->i_size = i_size_read(inode);
> -	dio->end_io = end_io;
> +	dio->dops = dops;
>  	dio->error = 0;
>  	dio->flags = 0;
>  
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 3d8e6db9ef77..1ffb179f35d2 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -443,6 +443,10 @@ xfs_dio_write_end_io(
>  	return error;
>  }
>  
> +static const struct iomap_dio_ops xfs_dio_write_ops = {
> +	.end_io		= xfs_dio_write_end_io,
> +};
> +
>  /*
>   * xfs_file_dio_aio_write - handle direct IO writes
>   *
> @@ -543,7 +547,7 @@ xfs_file_dio_aio_write(
>  	}
>  
>  	trace_xfs_file_direct_write(ip, count, iocb->ki_pos);
> -	ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, xfs_dio_write_end_io);
> +	ret = iomap_dio_rw(iocb, from, &xfs_iomap_ops, &xfs_dio_write_ops);
>  
>  	/*
>  	 * If unaligned, this is the only IO in-flight. If it has not yet
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 50bb746d2216..7aa5d6117936 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -188,10 +188,14 @@ sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
>   */
>  #define IOMAP_DIO_UNWRITTEN	(1 << 0)	/* covers unwritten extent(s) */
>  #define IOMAP_DIO_COW		(1 << 1)	/* covers COW extent(s) */
> -typedef int (iomap_dio_end_io_t)(struct kiocb *iocb, ssize_t size, int error,
> -				 unsigned int flags);
> +
> +struct iomap_dio_ops {
> +	int (*end_io)(struct kiocb *iocb, ssize_t size, int error,
> +		      unsigned flags);
> +};
> +
>  ssize_t iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
> -		const struct iomap_ops *ops, iomap_dio_end_io_t end_io);
> +		const struct iomap_ops *ops, const struct iomap_dio_ops *dops);
>  int iomap_dio_iopoll(struct kiocb *kiocb, bool spin);
>  
>  #ifdef CONFIG_SWAP
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io
  2019-09-03 13:03 ` [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io Christoph Hellwig
  2019-09-03 14:44   ` Darrick J. Wong
@ 2019-09-03 15:24   ` Matthew Wilcox
  1 sibling, 0 replies; 14+ messages in thread
From: Matthew Wilcox @ 2019-09-03 15:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel, linux-xfs, Goldwyn Rodrigues, Matthew Bobrowski

On Tue, Sep 03, 2019 at 03:03:26PM +0200, Christoph Hellwig wrote:
> From: Matthew Bobrowski <mbobrowski@mbobrowski.org>
> 
> Modify the calling convention for the iomap_dio_rw ->end_io() callback.
> Rather than passing either dio->error or dio->size as the 'size' argument,
> instead pass both the dio->error and the dio->size value separately.
> 
> In the instance that an error occurred during a write, we currently cannot
> determine whether any blocks have been allocated beyond the current EOF and
> data has subsequently been written to these blocks within the ->end_io()
> callback. As a result, we cannot judge whether we should take the truncate
> failed write path. Having both dio->error and dio->size will allow us to
> perform such checks within this callback.
> 
> Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
> [hch: minor cleanups]
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure
  2019-09-03 13:03 ` [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure Christoph Hellwig
  2019-09-03 14:46   ` Darrick J. Wong
@ 2019-09-03 16:14   ` Matthew Wilcox
  2019-09-04 12:51     ` Christoph Hellwig
  1 sibling, 1 reply; 14+ messages in thread
From: Matthew Wilcox @ 2019-09-03 16:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel, linux-xfs, Goldwyn Rodrigues, Matthew Bobrowski

On Tue, Sep 03, 2019 at 03:03:27PM +0200, Christoph Hellwig wrote:
> Add a new iomap_dio_ops structure that for now just contains the end_io
> handler.  This avoid storing the function pointer in a mutable structure,
> which is a possible exploit vector for kernel code execution, and prepares
> for adding a submit_io handler that btrfs needs.

Is it really a security win?  If I can overwrite dio->end_io, I can as
well overwrite dio->dops.

The patch itself looks sane, but I'm not sure about this particular reason.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: iomap_dio_rw ->end_io improvements
  2019-09-03 13:03 iomap_dio_rw ->end_io improvements Christoph Hellwig
  2019-09-03 13:03 ` [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io Christoph Hellwig
  2019-09-03 13:03 ` [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure Christoph Hellwig
@ 2019-09-03 21:32 ` Matthew Bobrowski
  2019-09-03 22:16 ` Darrick J. Wong
  3 siblings, 0 replies; 14+ messages in thread
From: Matthew Bobrowski @ 2019-09-03 21:32 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-fsdevel, linux-xfs, Goldwyn Rodrigues

Hey Christoph!

On Tue, Sep 03, 2019 at 03:03:25PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series contains two updates to the end_io handling for the iomap
> direct I/O code.  The first patch is from Matthew and passes the size and
> error separately, and has been taken from his series to convert ext4 to
> use iomap for direct I/O.

Great, looks good and thank you for expediting this change for
me. I'll make sure to drop these changes in the series that I'll be
posting through very shortly.

--M


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: iomap_dio_rw ->end_io improvements
  2019-09-03 13:03 iomap_dio_rw ->end_io improvements Christoph Hellwig
                   ` (2 preceding siblings ...)
  2019-09-03 21:32 ` iomap_dio_rw ->end_io improvements Matthew Bobrowski
@ 2019-09-03 22:16 ` Darrick J. Wong
  2019-09-04  2:19   ` Damien Le Moal
  2019-09-04  5:12   ` Christoph Hellwig
  3 siblings, 2 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-09-03 22:16 UTC (permalink / raw)
  To: Christoph Hellwig, agruenba, Damien.LeMoal, Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-xfs, Matthew Bobrowski

On Tue, Sep 03, 2019 at 03:03:25PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> this series contains two updates to the end_io handling for the iomap
> direct I/O code.  The first patch is from Matthew and passes the size and
> error separately, and has been taken from his series to convert ext4 to
> use iomap for direct I/O.  The second one moves the end_io handler into a
> separate ops structure.  This should help with Goldwyns series to use the
> iomap code in btrfs, but as-is already ensures that we don't store a
> function pointer in a mutable data structure.

The biggest problem with merging these patches (and while we're at it,
Goldwyn's patch adding a srcmap parameter to ->iomap_begin) for 5.4 is
that they'll break whatever Andreas and Damien have been preparing for
gfs2 and zonefs (respectively) based off the iomap-writeback work branch
that I created off of 5.3-rc2 a month ago.

Digging through the gfs2 and zonefs code, it doesn't look like it would
be difficult to adapt them to the changes, but forcing a rebase at this
point would (a) poke holes in the idea of creating stable work branches
and (b) shoot holes in all the regression testing they've done so far.
I do not have the hardware to test either in detail.

So the question is: Are all three (xfs/gfs2/zonefs?) downstream users of
iomap ok with a rebase a week and a half before the 5.4 merge window
opens?  I'm still inclined to push all these patches (iomap cow and the
directio improvements) into a work branch for 5.5, but if someone wants
this for 5.4 badly enough to persuade everyone else to start their
testing again, then I could see trying to make this happen (no later
than 5pm Pacific on Thursday).  Bear in mind I'm on vacation starting
Friday and going until the 15th...

Once iomap accumulates more users (ext4, btrfs) then this sort of thing
will never scale and will likely never happen again.

Thoughts?  Flames? :)

--D

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: iomap_dio_rw ->end_io improvements
  2019-09-03 22:16 ` Darrick J. Wong
@ 2019-09-04  2:19   ` Damien Le Moal
  2019-09-04  5:12   ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: Damien Le Moal @ 2019-09-04  2:19 UTC (permalink / raw)
  To: Darrick J. Wong, Christoph Hellwig, agruenba, Goldwyn Rodrigues
  Cc: linux-fsdevel, linux-xfs, Matthew Bobrowski

On 2019/09/04 7:16, Darrick J. Wong wrote:
> On Tue, Sep 03, 2019 at 03:03:25PM +0200, Christoph Hellwig wrote:
>> Hi all,
>>
>> this series contains two updates to the end_io handling for the iomap
>> direct I/O code.  The first patch is from Matthew and passes the size and
>> error separately, and has been taken from his series to convert ext4 to
>> use iomap for direct I/O.  The second one moves the end_io handler into a
>> separate ops structure.  This should help with Goldwyns series to use the
>> iomap code in btrfs, but as-is already ensures that we don't store a
>> function pointer in a mutable data structure.
> 
> The biggest problem with merging these patches (and while we're at it,
> Goldwyn's patch adding a srcmap parameter to ->iomap_begin) for 5.4 is
> that they'll break whatever Andreas and Damien have been preparing for
> gfs2 and zonefs (respectively) based off the iomap-writeback work branch
> that I created off of 5.3-rc2 a month ago.
> 
> Digging through the gfs2 and zonefs code, it doesn't look like it would
> be difficult to adapt them to the changes, but forcing a rebase at this
> point would (a) poke holes in the idea of creating stable work branches
> and (b) shoot holes in all the regression testing they've done so far.
> I do not have the hardware to test either in detail.

For zonefs, the changes are not that big (thanks for sending them :)) and
testing does not take long given the lower amount of functionalities compared to
a regular FS. So regression testing with changes to iomap will not be a huge
problem for me. I can do it if needed.

> So the question is: Are all three (xfs/gfs2/zonefs?) downstream users of
> iomap ok with a rebase a week and a half before the 5.4 merge window
> opens?  I'm still inclined to push all these patches (iomap cow and the
> directio improvements) into a work branch for 5.5, but if someone wants
> this for 5.4 badly enough to persuade everyone else to start their
> testing again, then I could see trying to make this happen (no later
> than 5pm Pacific on Thursday).  Bear in mind I'm on vacation starting
> Friday and going until the 15th...

No strong opinion either way. I will adjust to what you decide.

> 
> Once iomap accumulates more users (ext4, btrfs) then this sort of thing
> will never scale and will likely never happen again.
> 
> Thoughts?  Flames? :)
> 
> --D
> 


-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: iomap_dio_rw ->end_io improvements
  2019-09-03 22:16 ` Darrick J. Wong
  2019-09-04  2:19   ` Damien Le Moal
@ 2019-09-04  5:12   ` Christoph Hellwig
  2019-09-04 11:46     ` Andreas Gruenbacher
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-04  5:12 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, agruenba, Damien.LeMoal, Goldwyn Rodrigues,
	linux-fsdevel, linux-xfs, Matthew Bobrowski

On Tue, Sep 03, 2019 at 03:16:21PM -0700, Darrick J. Wong wrote:
> The biggest problem with merging these patches (and while we're at it,
> Goldwyn's patch adding a srcmap parameter to ->iomap_begin) for 5.4 is
> that they'll break whatever Andreas and Damien have been preparing for
> gfs2 and zonefs (respectively) based off the iomap-writeback work branch
> that I created off of 5.3-rc2 a month ago.

Does Andreas have changes pending that actually pass an end_io call
back to gfs2?  So far it just passed NULL so nothing should change.
If my memory serves me correctly zonefs uses ->end_io, but then again
Damien is asking you to queue it up with the iomap tree, so doing
that trivial rebase shouldn't be an issue.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: iomap_dio_rw ->end_io improvements
  2019-09-04  5:12   ` Christoph Hellwig
@ 2019-09-04 11:46     ` Andreas Gruenbacher
  2019-09-04 17:45       ` Darrick J. Wong
  0 siblings, 1 reply; 14+ messages in thread
From: Andreas Gruenbacher @ 2019-09-04 11:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J. Wong, Damien Le Moal, Goldwyn Rodrigues,
	linux-fsdevel, linux-xfs, Matthew Bobrowski

On Wed, Sep 4, 2019 at 7:12 AM Christoph Hellwig <hch@lst.de> wrote:
> On Tue, Sep 03, 2019 at 03:16:21PM -0700, Darrick J. Wong wrote:
> > The biggest problem with merging these patches (and while we're at it,
> > Goldwyn's patch adding a srcmap parameter to ->iomap_begin) for 5.4 is
> > that they'll break whatever Andreas and Damien have been preparing for
> > gfs2 and zonefs (respectively) based off the iomap-writeback work branch
> > that I created off of 5.3-rc2 a month ago.
>
> Does Andreas have changes pending that actually pass an end_io call
> back to gfs2?  So far it just passed NULL so nothing should change.

Right, we don't currently use that and there's nothing in queue in
that direction.

Andreas

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure
  2019-09-03 16:14   ` Matthew Wilcox
@ 2019-09-04 12:51     ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-04 12:51 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, linux-fsdevel, linux-xfs, Goldwyn Rodrigues,
	Matthew Bobrowski

On Tue, Sep 03, 2019 at 09:14:46AM -0700, Matthew Wilcox wrote:
> On Tue, Sep 03, 2019 at 03:03:27PM +0200, Christoph Hellwig wrote:
> > Add a new iomap_dio_ops structure that for now just contains the end_io
> > handler.  This avoid storing the function pointer in a mutable structure,
> > which is a possible exploit vector for kernel code execution, and prepares
> > for adding a submit_io handler that btrfs needs.
> 
> Is it really a security win?  If I can overwrite dio->end_io, I can as
> well overwrite dio->dops.

Which you'd then need to point to another place where you can stuff
function pointer.  Not impossible, but just another hoop to jump
through.  At least until we add run-time checks that ops structures
are in read-only memory, which sounds more sensible than some of the
other security hardening patches floating around.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: iomap_dio_rw ->end_io improvements
  2019-09-04 11:46     ` Andreas Gruenbacher
@ 2019-09-04 17:45       ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-09-04 17:45 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Christoph Hellwig, Damien Le Moal, Goldwyn Rodrigues,
	linux-fsdevel, linux-xfs, Matthew Bobrowski

On Wed, Sep 04, 2019 at 01:46:23PM +0200, Andreas Gruenbacher wrote:
> On Wed, Sep 4, 2019 at 7:12 AM Christoph Hellwig <hch@lst.de> wrote:
> > On Tue, Sep 03, 2019 at 03:16:21PM -0700, Darrick J. Wong wrote:
> > > The biggest problem with merging these patches (and while we're at it,
> > > Goldwyn's patch adding a srcmap parameter to ->iomap_begin) for 5.4 is
> > > that they'll break whatever Andreas and Damien have been preparing for
> > > gfs2 and zonefs (respectively) based off the iomap-writeback work branch
> > > that I created off of 5.3-rc2 a month ago.
> >
> > Does Andreas have changes pending that actually pass an end_io call
> > back to gfs2?  So far it just passed NULL so nothing should change.
> 
> Right, we don't currently use that and there's nothing in queue in
> that direction.

Ok, sounds good.  I'll pull these two changes into the iomap tree for
5.4 then.

--D

> Andreas

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-09-04 17:46 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-03 13:03 iomap_dio_rw ->end_io improvements Christoph Hellwig
2019-09-03 13:03 ` [PATCH 1/2] iomap: split size and error for iomap_dio_rw ->end_io Christoph Hellwig
2019-09-03 14:44   ` Darrick J. Wong
2019-09-03 15:24   ` Matthew Wilcox
2019-09-03 13:03 ` [PATCH 2/2] iomap: move the iomap_dio_rw ->end_io callback into a structure Christoph Hellwig
2019-09-03 14:46   ` Darrick J. Wong
2019-09-03 16:14   ` Matthew Wilcox
2019-09-04 12:51     ` Christoph Hellwig
2019-09-03 21:32 ` iomap_dio_rw ->end_io improvements Matthew Bobrowski
2019-09-03 22:16 ` Darrick J. Wong
2019-09-04  2:19   ` Damien Le Moal
2019-09-04  5:12   ` Christoph Hellwig
2019-09-04 11:46     ` Andreas Gruenbacher
2019-09-04 17:45       ` Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.