linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] block: remove plugging at buffered write time
@ 2012-02-08 11:01 Wu Fengguang
  2012-02-08 23:27 ` Dave Chinner
  2012-02-09  1:14 ` Shaohua Li
  0 siblings, 2 replies; 13+ messages in thread
From: Wu Fengguang @ 2012-02-08 11:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-fsdevel, LKML, Jens Axboe, Li Shaohua

Buffered write(2) is not directly tied to IO, so it's not suitable to
handle plug in generic_file_aio_write().

Also remove the redundant plug calls around ->direct_IO(), since we now
do unplugging in the lower layer do_blockdev_direct_IO().

CC: Jens Axboe <axboe@kernel.dk>
CC: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/direct-io.c |    4 ++++
 mm/filemap.c   |    7 -------
 2 files changed, 4 insertions(+), 7 deletions(-)

--- linux-next.orig/mm/filemap.c	2012-02-08 18:51:50.000000000 +0800
+++ linux-next/mm/filemap.c	2012-02-08 18:52:19.000000000 +0800
@@ -1421,12 +1421,8 @@ generic_file_aio_read(struct kiocb *iocb
 			retval = filemap_write_and_wait_range(mapping, pos,
 					pos + iov_length(iov, nr_segs) - 1);
 			if (!retval) {
-				struct blk_plug plug;
-
-				blk_start_plug(&plug);
 				retval = mapping->a_ops->direct_IO(READ, iocb,
 							iov, pos, nr_segs);
-				blk_finish_plug(&plug);
 			}
 			if (retval > 0) {
 				*ppos = pos + retval;
@@ -2610,13 +2606,11 @@ ssize_t generic_file_aio_write(struct ki
 {
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = file->f_mapping->host;
-	struct blk_plug plug;
 	ssize_t ret;
 
 	BUG_ON(iocb->ki_pos != pos);
 
 	mutex_lock(&inode->i_mutex);
-	blk_start_plug(&plug);
 	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
 	mutex_unlock(&inode->i_mutex);
 
@@ -2627,7 +2621,6 @@ ssize_t generic_file_aio_write(struct ki
 		if (err < 0 && ret > 0)
 			ret = err;
 	}
-	blk_finish_plug(&plug);
 	return ret;
 }
 EXPORT_SYMBOL(generic_file_aio_write);
--- linux-next.orig/fs/direct-io.c	2012-02-08 18:51:37.000000000 +0800
+++ linux-next/fs/direct-io.c	2012-02-08 18:52:19.000000000 +0800
@@ -1106,6 +1106,7 @@ do_blockdev_direct_IO(int rw, struct kio
 	unsigned long user_addr;
 	size_t bytes;
 	struct buffer_head map_bh = { 0, };
+	struct blk_plug plug;
 
 	if (rw & WRITE)
 		rw = WRITE_ODIRECT;
@@ -1221,6 +1222,8 @@ do_blockdev_direct_IO(int rw, struct kio
 				PAGE_SIZE - user_addr / PAGE_SIZE);
 	}
 
+	blk_start_plug(&plug);
+
 	for (seg = 0; seg < nr_segs; seg++) {
 		user_addr = (unsigned long)iov[seg].iov_base;
 		sdio.size += bytes = iov[seg].iov_len;
@@ -1314,6 +1317,7 @@ do_blockdev_direct_IO(int rw, struct kio
 	} else
 		BUG_ON(retval != -EIOCBQUEUED);
 
+	blk_finish_plug(&plug);
 out:
 	return retval;
 }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-08 11:01 [PATCH] block: remove plugging at buffered write time Wu Fengguang
@ 2012-02-08 23:27 ` Dave Chinner
  2012-02-09  8:02   ` Wu Fengguang
  2012-02-09  1:14 ` Shaohua Li
  1 sibling, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2012-02-08 23:27 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Andrew Morton, linux-fsdevel, LKML, Jens Axboe, Li Shaohua

On Wed, Feb 08, 2012 at 07:01:44PM +0800, Wu Fengguang wrote:
> Buffered write(2) is not directly tied to IO, so it's not suitable to
> handle plug in generic_file_aio_write().

But generic_sync_write() does issue IO for O_SYNC writes, so unless
there is plugging at a lower layer in the writeback code then it
appears to me that plugging is still necessary (at least inside the
sync branch)....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-08 11:01 [PATCH] block: remove plugging at buffered write time Wu Fengguang
  2012-02-08 23:27 ` Dave Chinner
@ 2012-02-09  1:14 ` Shaohua Li
  2012-02-09  8:07   ` Wu Fengguang
  1 sibling, 1 reply; 13+ messages in thread
From: Shaohua Li @ 2012-02-09  1:14 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Andrew Morton, linux-fsdevel, LKML, Jens Axboe

On Wed, 2012-02-08 at 19:01 +0800, Wu Fengguang wrote:
> Buffered write(2) is not directly tied to IO, so it's not suitable to
> handle plug in generic_file_aio_write().
> 
> Also remove the redundant plug calls around ->direct_IO(), since we now
> do unplugging in the lower layer do_blockdev_direct_IO().
> 
> CC: Jens Axboe <axboe@kernel.dk>
> CC: Li Shaohua <shaohua.li@intel.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
>  fs/direct-io.c |    4 ++++
>  mm/filemap.c   |    7 -------
>  2 files changed, 4 insertions(+), 7 deletions(-)
> 
> --- linux-next.orig/mm/filemap.c	2012-02-08 18:51:50.000000000 +0800
> +++ linux-next/mm/filemap.c	2012-02-08 18:52:19.000000000 +0800
> @@ -1421,12 +1421,8 @@ generic_file_aio_read(struct kiocb *iocb
>  			retval = filemap_write_and_wait_range(mapping, pos,
>  					pos + iov_length(iov, nr_segs) - 1);
>  			if (!retval) {
> -				struct blk_plug plug;
> -
> -				blk_start_plug(&plug);
>  				retval = mapping->a_ops->direct_IO(READ, iocb,
>  							iov, pos, nr_segs);
> -				blk_finish_plug(&plug);
>  			}
>  			if (retval > 0) {
>  				*ppos = pos + retval;
> @@ -2610,13 +2606,11 @@ ssize_t generic_file_aio_write(struct ki
>  {
>  	struct file *file = iocb->ki_filp;
>  	struct inode *inode = file->f_mapping->host;
> -	struct blk_plug plug;
>  	ssize_t ret;
>  
>  	BUG_ON(iocb->ki_pos != pos);
>  
>  	mutex_lock(&inode->i_mutex);
> -	blk_start_plug(&plug);
>  	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
>  	mutex_unlock(&inode->i_mutex);
>  
> @@ -2627,7 +2621,6 @@ ssize_t generic_file_aio_write(struct ki
>  		if (err < 0 && ret > 0)
>  			ret = err;
>  	}
> -	blk_finish_plug(&plug);
>  	return ret;
>  }
>  EXPORT_SYMBOL(generic_file_aio_write);
> --- linux-next.orig/fs/direct-io.c	2012-02-08 18:51:37.000000000 +0800
> +++ linux-next/fs/direct-io.c	2012-02-08 18:52:19.000000000 +0800
> @@ -1106,6 +1106,7 @@ do_blockdev_direct_IO(int rw, struct kio
>  	unsigned long user_addr;
>  	size_t bytes;
>  	struct buffer_head map_bh = { 0, };
> +	struct blk_plug plug;
>  
>  	if (rw & WRITE)
>  		rw = WRITE_ODIRECT;
> @@ -1221,6 +1222,8 @@ do_blockdev_direct_IO(int rw, struct kio
>  				PAGE_SIZE - user_addr / PAGE_SIZE);
>  	}
>  
> +	blk_start_plug(&plug);
> +
>  	for (seg = 0; seg < nr_segs; seg++) {
>  		user_addr = (unsigned long)iov[seg].iov_base;
>  		sdio.size += bytes = iov[seg].iov_len;
> @@ -1314,6 +1317,7 @@ do_blockdev_direct_IO(int rw, struct kio
>  	} else
>  		BUG_ON(retval != -EIOCBQUEUED);
>  
> +	blk_finish_plug(&plug);
This one can be moved up a little bit (before dio_cleanup)

Thanks,
Shaohua

>  out:
>  	return retval;
>  }



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-08 23:27 ` Dave Chinner
@ 2012-02-09  8:02   ` Wu Fengguang
  2012-02-09 18:06     ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Wu Fengguang @ 2012-02-09  8:02 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Andrew Morton, linux-fsdevel, LKML, Jens Axboe, Li Shaohua

On Thu, Feb 09, 2012 at 10:27:19AM +1100, Dave Chinner wrote:
> On Wed, Feb 08, 2012 at 07:01:44PM +0800, Wu Fengguang wrote:
> > Buffered write(2) is not directly tied to IO, so it's not suitable to
> > handle plug in generic_file_aio_write().
> 
> But generic_sync_write() does issue IO for O_SYNC writes, so unless
> there is plugging at a lower layer in the writeback code then it
> appears to me that plugging is still necessary (at least inside the
> sync branch)....

Good catch! It looks that generic_write_sync() eventually calls into
vfs_fsync_range() which further calls ->fsync(). We may add plugging
around it:

--- linux-next.orig/fs/sync.c   2012-02-09 15:59:52.000000000 +0800
+++ linux-next/fs/sync.c        2012-02-09 16:01:02.000000000 +0800
@@ -164,9 +164,17 @@ SYSCALL_DEFINE1(syncfs, int, fd)
  */
 int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync)
 {
+       struct blk_plug plug;
+       int ret;
+
        if (!file->f_op || !file->f_op->fsync)
                return -EINVAL;
-       return file->f_op->fsync(file, start, end, datasync);
+
+       blk_start_plug(&plug);
+       ret = file->f_op->fsync(file, start, end, datasync);
+       blk_finish_plug(&plug);
+
+       return ret;
 }
 EXPORT_SYMBOL(vfs_fsync_range);


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-09  1:14 ` Shaohua Li
@ 2012-02-09  8:07   ` Wu Fengguang
  2012-02-09  9:25     ` Damien Wyart
  0 siblings, 1 reply; 13+ messages in thread
From: Wu Fengguang @ 2012-02-09  8:07 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Andrew Morton, linux-fsdevel, LKML, Jens Axboe

> > @@ -1221,6 +1222,8 @@ do_blockdev_direct_IO(int rw, struct kio
> >  				PAGE_SIZE - user_addr / PAGE_SIZE);
> >  	}
> >  
> > +	blk_start_plug(&plug);
> > +
> >  	for (seg = 0; seg < nr_segs; seg++) {
> >  		user_addr = (unsigned long)iov[seg].iov_base;
> >  		sdio.size += bytes = iov[seg].iov_len;
> > @@ -1314,6 +1317,7 @@ do_blockdev_direct_IO(int rw, struct kio
> >  	} else
> >  		BUG_ON(retval != -EIOCBQUEUED);
> >  
> > +	blk_finish_plug(&plug);
> This one can be moved up a little bit (before dio_cleanup)

Done, thanks! Here is the updated patch with O_SYNC write fix.

Thanks,
Fengguang
---
Subject: block: remove plugging at buffered write time 
Date: Tue Jan 31 18:25:48 CST 2012

Buffered write(2) is not directly tied to IO, so it's not suitable to
handle plug in generic_file_aio_write().

Also moves unplugging to lower layers:
- for direct I/O, from around ->direct_IO() to do_blockdev_direct_IO()
- for O_SYNC writes, to around ->fsync()

CC: Jens Axboe <axboe@kernel.dk>
CC: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/direct-io.c |    5 +++++
 fs/sync.c      |   10 +++++++++-
 mm/filemap.c   |    7 -------
 3 files changed, 14 insertions(+), 8 deletions(-)

--- linux-next.orig/mm/filemap.c	2012-02-08 19:33:29.000000000 +0800
+++ linux-next/mm/filemap.c	2012-02-09 15:59:47.000000000 +0800
@@ -1421,12 +1421,8 @@ generic_file_aio_read(struct kiocb *iocb
 			retval = filemap_write_and_wait_range(mapping, pos,
 					pos + iov_length(iov, nr_segs) - 1);
 			if (!retval) {
-				struct blk_plug plug;
-
-				blk_start_plug(&plug);
 				retval = mapping->a_ops->direct_IO(READ, iocb,
 							iov, pos, nr_segs);
-				blk_finish_plug(&plug);
 			}
 			if (retval > 0) {
 				*ppos = pos + retval;
@@ -2610,13 +2606,11 @@ ssize_t generic_file_aio_write(struct ki
 {
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = file->f_mapping->host;
-	struct blk_plug plug;
 	ssize_t ret;
 
 	BUG_ON(iocb->ki_pos != pos);
 
 	mutex_lock(&inode->i_mutex);
-	blk_start_plug(&plug);
 	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
 	mutex_unlock(&inode->i_mutex);
 
@@ -2627,7 +2621,6 @@ ssize_t generic_file_aio_write(struct ki
 		if (err < 0 && ret > 0)
 			ret = err;
 	}
-	blk_finish_plug(&plug);
 	return ret;
 }
 EXPORT_SYMBOL(generic_file_aio_write);
--- linux-next.orig/fs/direct-io.c	2012-02-08 19:33:29.000000000 +0800
+++ linux-next/fs/direct-io.c	2012-02-09 16:03:04.000000000 +0800
@@ -1106,6 +1106,7 @@ do_blockdev_direct_IO(int rw, struct kio
 	unsigned long user_addr;
 	size_t bytes;
 	struct buffer_head map_bh = { 0, };
+	struct blk_plug plug;
 
 	if (rw & WRITE)
 		rw = WRITE_ODIRECT;
@@ -1221,6 +1222,8 @@ do_blockdev_direct_IO(int rw, struct kio
 				PAGE_SIZE - user_addr / PAGE_SIZE);
 	}
 
+	blk_start_plug(&plug);
+
 	for (seg = 0; seg < nr_segs; seg++) {
 		user_addr = (unsigned long)iov[seg].iov_base;
 		sdio.size += bytes = iov[seg].iov_len;
@@ -1279,6 +1282,8 @@ do_blockdev_direct_IO(int rw, struct kio
 	if (sdio.bio)
 		dio_bio_submit(dio, &sdio);
 
+	blk_finish_plug(&plug);
+
 	/*
 	 * It is possible that, we return short IO due to end of file.
 	 * In that case, we need to release all the pages we got hold on.
--- linux-next.orig/fs/sync.c	2012-02-09 15:59:52.000000000 +0800
+++ linux-next/fs/sync.c	2012-02-09 16:01:02.000000000 +0800
@@ -164,9 +164,17 @@ SYSCALL_DEFINE1(syncfs, int, fd)
  */
 int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync)
 {
+	struct blk_plug plug;
+	int ret;
+
 	if (!file->f_op || !file->f_op->fsync)
 		return -EINVAL;
-	return file->f_op->fsync(file, start, end, datasync);
+
+	blk_start_plug(&plug);
+	ret = file->f_op->fsync(file, start, end, datasync);
+	blk_finish_plug(&plug);
+
+	return ret;
 }
 EXPORT_SYMBOL(vfs_fsync_range);
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-09  8:07   ` Wu Fengguang
@ 2012-02-09  9:25     ` Damien Wyart
  2012-02-09  9:40       ` Damien Wyart
  0 siblings, 1 reply; 13+ messages in thread
From: Damien Wyart @ 2012-02-09  9:25 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Shaohua Li, Andrew Morton, linux-fsdevel, LKML, Jens Axboe

Hi,

* Wu Fengguang <fengguang.wu@intel.com> [2012-02-09 16:07]:
> Done, thanks! Here is the updated patch with O_SYNC write fix.

Got these errors with the updated patch on top of 3.3-rc3:

fs/sync.c: In function 'vfs_fsync_range':
fs/sync.c:167:18: error: storage size of 'plug' isn't known
fs/sync.c:173:2: error: implicit declaration of function 'blk_start_plug' [-Werror=implicit-function-declaration]
fs/sync.c:175:2: error: implicit declaration of function 'blk_finish_plug' [-Werror=implicit-function-declaration]
fs/sync.c:167:18: warning: unused variable 'plug' [-Wunused-variable]

-- 
Damien

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-09  9:25     ` Damien Wyart
@ 2012-02-09  9:40       ` Damien Wyart
  2012-02-09  9:49         ` Wu Fengguang
  0 siblings, 1 reply; 13+ messages in thread
From: Damien Wyart @ 2012-02-09  9:40 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Shaohua Li, Andrew Morton, linux-fsdevel, LKML, Jens Axboe

> > Done, thanks! Here is the updated patch with O_SYNC write fix.

> Got these errors with the updated patch on top of 3.3-rc3:

> fs/sync.c: In function 'vfs_fsync_range':
> fs/sync.c:167:18: error: storage size of 'plug' isn't known
> fs/sync.c:173:2: error: implicit declaration of function 'blk_start_plug' [-Werror=implicit-function-declaration]
> fs/sync.c:175:2: error: implicit declaration of function 'blk_finish_plug' [-Werror=implicit-function-declaration]
> fs/sync.c:167:18: warning: unused variable 'plug' [-Wunused-variable]

Adding a #include <linux/blkdev.h> at the top of fs/sync.c solved the
problem. Of course, this needs approval of a kernel developper before
being added to the patch and resubmitted because I know adding #includes
can have unwanted consequences sometimes or be seen as bloat...

-- 
Damien Wyart

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-09  9:40       ` Damien Wyart
@ 2012-02-09  9:49         ` Wu Fengguang
  0 siblings, 0 replies; 13+ messages in thread
From: Wu Fengguang @ 2012-02-09  9:49 UTC (permalink / raw)
  To: Damien Wyart; +Cc: Shaohua Li, Andrew Morton, linux-fsdevel, LKML, Jens Axboe

On Thu, Feb 09, 2012 at 10:40:19AM +0100, Damien Wyart wrote:
> > > Done, thanks! Here is the updated patch with O_SYNC write fix.
> 
> > Got these errors with the updated patch on top of 3.3-rc3:
> 
> > fs/sync.c: In function 'vfs_fsync_range':
> > fs/sync.c:167:18: error: storage size of 'plug' isn't known
> > fs/sync.c:173:2: error: implicit declaration of function 'blk_start_plug' [-Werror=implicit-function-declaration]
> > fs/sync.c:175:2: error: implicit declaration of function 'blk_finish_plug' [-Werror=implicit-function-declaration]
> > fs/sync.c:167:18: warning: unused variable 'plug' [-Wunused-variable]
> 
> Adding a #include <linux/blkdev.h> at the top of fs/sync.c solved the
> problem. Of course, this needs approval of a kernel developper before
> being added to the patch and resubmitted because I know adding #includes
> can have unwanted consequences sometimes or be seen as bloat...

Thank you! This includes the necessary blkdev.h and it compiles here too:

Subject: block: remove plugging at buffered write time 
Date: Tue Jan 31 18:25:48 CST 2012

Buffered write(2) is not directly tied to IO, so it's not suitable to
handle plug in generic_file_aio_write().

Also moves unplugging to lower layers:
- for direct I/O, from around ->direct_IO() to do_blockdev_direct_IO()
- for O_SYNC writes, to around ->fsync()

CC: Jens Axboe <axboe@kernel.dk>
CC: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 fs/direct-io.c |    5 +++++
 fs/sync.c      |   11 ++++++++++-
 mm/filemap.c   |    7 -------
 3 files changed, 15 insertions(+), 8 deletions(-)

--- linux-next.orig/mm/filemap.c	2012-02-08 19:33:29.000000000 +0800
+++ linux-next/mm/filemap.c	2012-02-09 15:59:47.000000000 +0800
@@ -1421,12 +1421,8 @@ generic_file_aio_read(struct kiocb *iocb
 			retval = filemap_write_and_wait_range(mapping, pos,
 					pos + iov_length(iov, nr_segs) - 1);
 			if (!retval) {
-				struct blk_plug plug;
-
-				blk_start_plug(&plug);
 				retval = mapping->a_ops->direct_IO(READ, iocb,
 							iov, pos, nr_segs);
-				blk_finish_plug(&plug);
 			}
 			if (retval > 0) {
 				*ppos = pos + retval;
@@ -2610,13 +2606,11 @@ ssize_t generic_file_aio_write(struct ki
 {
 	struct file *file = iocb->ki_filp;
 	struct inode *inode = file->f_mapping->host;
-	struct blk_plug plug;
 	ssize_t ret;
 
 	BUG_ON(iocb->ki_pos != pos);
 
 	mutex_lock(&inode->i_mutex);
-	blk_start_plug(&plug);
 	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
 	mutex_unlock(&inode->i_mutex);
 
@@ -2627,7 +2621,6 @@ ssize_t generic_file_aio_write(struct ki
 		if (err < 0 && ret > 0)
 			ret = err;
 	}
-	blk_finish_plug(&plug);
 	return ret;
 }
 EXPORT_SYMBOL(generic_file_aio_write);
--- linux-next.orig/fs/direct-io.c	2012-02-08 19:33:29.000000000 +0800
+++ linux-next/fs/direct-io.c	2012-02-09 16:03:04.000000000 +0800
@@ -1106,6 +1106,7 @@ do_blockdev_direct_IO(int rw, struct kio
 	unsigned long user_addr;
 	size_t bytes;
 	struct buffer_head map_bh = { 0, };
+	struct blk_plug plug;
 
 	if (rw & WRITE)
 		rw = WRITE_ODIRECT;
@@ -1221,6 +1222,8 @@ do_blockdev_direct_IO(int rw, struct kio
 				PAGE_SIZE - user_addr / PAGE_SIZE);
 	}
 
+	blk_start_plug(&plug);
+
 	for (seg = 0; seg < nr_segs; seg++) {
 		user_addr = (unsigned long)iov[seg].iov_base;
 		sdio.size += bytes = iov[seg].iov_len;
@@ -1279,6 +1282,8 @@ do_blockdev_direct_IO(int rw, struct kio
 	if (sdio.bio)
 		dio_bio_submit(dio, &sdio);
 
+	blk_finish_plug(&plug);
+
 	/*
 	 * It is possible that, we return short IO due to end of file.
 	 * In that case, we need to release all the pages we got hold on.
--- linux-next.orig/fs/sync.c	2012-02-09 15:59:52.000000000 +0800
+++ linux-next/fs/sync.c	2012-02-09 17:39:02.000000000 +0800
@@ -15,6 +15,7 @@
 #include <linux/pagemap.h>
 #include <linux/quotaops.h>
 #include <linux/backing-dev.h>
+#include <linux/blkdev.h>
 #include "internal.h"
 
 #define VALID_FLAGS (SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE| \
@@ -164,9 +165,17 @@ SYSCALL_DEFINE1(syncfs, int, fd)
  */
 int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync)
 {
+	struct blk_plug plug;
+	int ret;
+
 	if (!file->f_op || !file->f_op->fsync)
 		return -EINVAL;
-	return file->f_op->fsync(file, start, end, datasync);
+
+	blk_start_plug(&plug);
+	ret = file->f_op->fsync(file, start, end, datasync);
+	blk_finish_plug(&plug);
+
+	return ret;
 }
 EXPORT_SYMBOL(vfs_fsync_range);
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-09  8:02   ` Wu Fengguang
@ 2012-02-09 18:06     ` Christoph Hellwig
  2012-02-09 18:30       ` Chris Mason
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2012-02-09 18:06 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Dave Chinner, Andrew Morton, linux-fsdevel, LKML, Jens Axboe, Li Shaohua

On Thu, Feb 09, 2012 at 04:02:24PM +0800, Wu Fengguang wrote:
> On Thu, Feb 09, 2012 at 10:27:19AM +1100, Dave Chinner wrote:
> > On Wed, Feb 08, 2012 at 07:01:44PM +0800, Wu Fengguang wrote:
> > > Buffered write(2) is not directly tied to IO, so it's not suitable to
> > > handle plug in generic_file_aio_write().
> > 
> > But generic_sync_write() does issue IO for O_SYNC writes, so unless
> > there is plugging at a lower layer in the writeback code then it
> > appears to me that plugging is still necessary (at least inside the
> > sync branch)....
> 
> Good catch! It looks that generic_write_sync() eventually calls into
> vfs_fsync_range() which further calls ->fsync(). We may add plugging
> around it:


NAK, please keep the plugging down in the fs, or the libraries used but
not common VFS code.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-09 18:06     ` Christoph Hellwig
@ 2012-02-09 18:30       ` Chris Mason
  2012-02-10  1:52         ` Wu Fengguang
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Mason @ 2012-02-09 18:30 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Wu Fengguang, Dave Chinner, Andrew Morton, linux-fsdevel, LKML,
	Jens Axboe, Li Shaohua

On Thu, Feb 09, 2012 at 01:06:35PM -0500, Christoph Hellwig wrote:
> On Thu, Feb 09, 2012 at 04:02:24PM +0800, Wu Fengguang wrote:
> > On Thu, Feb 09, 2012 at 10:27:19AM +1100, Dave Chinner wrote:
> > > On Wed, Feb 08, 2012 at 07:01:44PM +0800, Wu Fengguang wrote:
> > > > Buffered write(2) is not directly tied to IO, so it's not suitable to
> > > > handle plug in generic_file_aio_write().
> > > 
> > > But generic_sync_write() does issue IO for O_SYNC writes, so unless
> > > there is plugging at a lower layer in the writeback code then it
> > > appears to me that plugging is still necessary (at least inside the
> > > sync branch)....
> > 
> > Good catch! It looks that generic_write_sync() eventually calls into
> > vfs_fsync_range() which further calls ->fsync(). We may add plugging
> > around it:
> 
> 
> NAK, please keep the plugging down in the fs, or the libraries used but
> not common VFS code.

Please, what Christoph said.  At least for btrfs plugging here is wrong.

-chris


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-09 18:30       ` Chris Mason
@ 2012-02-10  1:52         ` Wu Fengguang
  2012-02-10  2:47           ` Wu Fengguang
  0 siblings, 1 reply; 13+ messages in thread
From: Wu Fengguang @ 2012-02-10  1:52 UTC (permalink / raw)
  To: Chris Mason, Christoph Hellwig, Dave Chinner, Andrew Morton,
	linux-fsdevel, LKML, Jens Axboe, Li Shaohua

On Thu, Feb 09, 2012 at 01:30:27PM -0500, Chris Mason wrote:
> On Thu, Feb 09, 2012 at 01:06:35PM -0500, Christoph Hellwig wrote:
> > On Thu, Feb 09, 2012 at 04:02:24PM +0800, Wu Fengguang wrote:
> > > On Thu, Feb 09, 2012 at 10:27:19AM +1100, Dave Chinner wrote:
> > > > On Wed, Feb 08, 2012 at 07:01:44PM +0800, Wu Fengguang wrote:
> > > > > Buffered write(2) is not directly tied to IO, so it's not suitable to
> > > > > handle plug in generic_file_aio_write().
> > > > 
> > > > But generic_sync_write() does issue IO for O_SYNC writes, so unless
> > > > there is plugging at a lower layer in the writeback code then it
> > > > appears to me that plugging is still necessary (at least inside the
> > > > sync branch)....
> > > 
> > > Good catch! It looks that generic_write_sync() eventually calls into
> > > vfs_fsync_range() which further calls ->fsync(). We may add plugging
> > > around it:
> > 
> > 
> > NAK, please keep the plugging down in the fs, or the libraries used but
> > not common VFS code.
> 
> Please, what Christoph said.  At least for btrfs plugging here is wrong.

OK, I get the point: the fs knows best when to unplug. Since any
higher level plug nesting will turn such low level efforts into no-op,
it's highly undesirable to do it in the high level.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-10  1:52         ` Wu Fengguang
@ 2012-02-10  2:47           ` Wu Fengguang
  2012-02-10  9:41             ` Jan Kara
  0 siblings, 1 reply; 13+ messages in thread
From: Wu Fengguang @ 2012-02-10  2:47 UTC (permalink / raw)
  To: Chris Mason, Christoph Hellwig, Dave Chinner, Andrew Morton,
	linux-fsdevel, LKML, Jens Axboe, Li Shaohua
  Cc: Jan Kara

On Fri, Feb 10, 2012 at 09:52:18AM +0800, Wu Fengguang wrote:
> On Thu, Feb 09, 2012 at 01:30:27PM -0500, Chris Mason wrote:
> > On Thu, Feb 09, 2012 at 01:06:35PM -0500, Christoph Hellwig wrote:
> > > On Thu, Feb 09, 2012 at 04:02:24PM +0800, Wu Fengguang wrote:
> > > > On Thu, Feb 09, 2012 at 10:27:19AM +1100, Dave Chinner wrote:
> > > > > On Wed, Feb 08, 2012 at 07:01:44PM +0800, Wu Fengguang wrote:
> > > > > > Buffered write(2) is not directly tied to IO, so it's not suitable to
> > > > > > handle plug in generic_file_aio_write().
> > > > > 
> > > > > But generic_sync_write() does issue IO for O_SYNC writes, so unless
> > > > > there is plugging at a lower layer in the writeback code then it
> > > > > appears to me that plugging is still necessary (at least inside the
> > > > > sync branch)....
> > > > 
> > > > Good catch! It looks that generic_write_sync() eventually calls into
> > > > vfs_fsync_range() which further calls ->fsync(). We may add plugging
> > > > around it:
> > > 
> > > 
> > > NAK, please keep the plugging down in the fs, or the libraries used but
> > > not common VFS code.
> > 
> > Please, what Christoph said.  At least for btrfs plugging here is wrong.
> 
> OK, I get the point: the fs knows best when to unplug. Since any
> higher level plug nesting will turn such low level efforts into no-op,
> it's highly undesirable to do it in the high level.

It's actually wrong to do plugging around vfs_fsync_range().

Because these call paths

        write() with O_SYNC
          generic_write_sync()
            vfs_fsync_range()
              ->fsync()
              generic_file_fsync()

        fsync()
          do_fsync()
            vfs_fsync()
              vfs_fsync_range()

pass arbitrary @size arguments, which may be much larger than the
preferable I/O size, or may cross extent/device boundaries.

generic_file_fsync() starts with a filemap_write_and_wait_range()
call, which already has proper plugging somewhere underneath. Then
followed by metadata writes, which has plugging inside
fsync_buffers_list(). At last, sync_inode_metadata() calls into
->write_inode() which may or may not care plugging.

The other fs specific ->fsync() do similar steps, varying in the
metadata and fs specific housekeeping part.

I'll just drop this code. Shall the fs specific metadata I/O be
plugged accordingly? I'm afraid this is beyond my knowledge base...

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] block: remove plugging at buffered write time
  2012-02-10  2:47           ` Wu Fengguang
@ 2012-02-10  9:41             ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2012-02-10  9:41 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Chris Mason, Christoph Hellwig, Dave Chinner, Andrew Morton,
	linux-fsdevel, LKML, Jens Axboe, Li Shaohua, Jan Kara

On Fri 10-02-12 10:47:16, Wu Fengguang wrote:
> On Fri, Feb 10, 2012 at 09:52:18AM +0800, Wu Fengguang wrote:
> > On Thu, Feb 09, 2012 at 01:30:27PM -0500, Chris Mason wrote:
> > > On Thu, Feb 09, 2012 at 01:06:35PM -0500, Christoph Hellwig wrote:
> > > > On Thu, Feb 09, 2012 at 04:02:24PM +0800, Wu Fengguang wrote:
> > > > > On Thu, Feb 09, 2012 at 10:27:19AM +1100, Dave Chinner wrote:
> > > > > > On Wed, Feb 08, 2012 at 07:01:44PM +0800, Wu Fengguang wrote:
> > > > > > > Buffered write(2) is not directly tied to IO, so it's not suitable to
> > > > > > > handle plug in generic_file_aio_write().
> > > > > > 
> > > > > > But generic_sync_write() does issue IO for O_SYNC writes, so unless
> > > > > > there is plugging at a lower layer in the writeback code then it
> > > > > > appears to me that plugging is still necessary (at least inside the
> > > > > > sync branch)....
> > > > > 
> > > > > Good catch! It looks that generic_write_sync() eventually calls into
> > > > > vfs_fsync_range() which further calls ->fsync(). We may add plugging
> > > > > around it:
> > > > 
> > > > 
> > > > NAK, please keep the plugging down in the fs, or the libraries used but
> > > > not common VFS code.
> > > 
> > > Please, what Christoph said.  At least for btrfs plugging here is wrong.
> > 
> > OK, I get the point: the fs knows best when to unplug. Since any
> > higher level plug nesting will turn such low level efforts into no-op,
> > it's highly undesirable to do it in the high level.
> 
> It's actually wrong to do plugging around vfs_fsync_range().
> 
> Because these call paths
> 
>         write() with O_SYNC
>           generic_write_sync()
>             vfs_fsync_range()
>               ->fsync()
>               generic_file_fsync()
> 
>         fsync()
>           do_fsync()
>             vfs_fsync()
>               vfs_fsync_range()
> 
> pass arbitrary @size arguments, which may be much larger than the
> preferable I/O size, or may cross extent/device boundaries.
> 
> generic_file_fsync() starts with a filemap_write_and_wait_range()
> call, which already has proper plugging somewhere underneath. Then
> followed by metadata writes, which has plugging inside
> fsync_buffers_list(). At last, sync_inode_metadata() calls into
> ->write_inode() which may or may not care plugging.
> 
> The other fs specific ->fsync() do similar steps, varying in the
> metadata and fs specific housekeeping part.
> 
> I'll just drop this code. Shall the fs specific metadata I/O be
> plugged accordingly? I'm afraid this is beyond my knowledge base...
  The filesystems I know (ext?, ocfs2, reiserfs, udf) either don't do any
metadata io from ->fsync (it happens from a journalling thread) or the io
is random so plugging is not desirable anyway AFAIU (well,
mpage_writepages() is clever enough to submit metadata which is interleaved
with data in one sequential stream together with the data so metadata that
remain are mostly random).

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-02-10  9:41 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-08 11:01 [PATCH] block: remove plugging at buffered write time Wu Fengguang
2012-02-08 23:27 ` Dave Chinner
2012-02-09  8:02   ` Wu Fengguang
2012-02-09 18:06     ` Christoph Hellwig
2012-02-09 18:30       ` Chris Mason
2012-02-10  1:52         ` Wu Fengguang
2012-02-10  2:47           ` Wu Fengguang
2012-02-10  9:41             ` Jan Kara
2012-02-09  1:14 ` Shaohua Li
2012-02-09  8:07   ` Wu Fengguang
2012-02-09  9:25     ` Damien Wyart
2012-02-09  9:40       ` Damien Wyart
2012-02-09  9:49         ` Wu Fengguang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).