From: Jan Kara <jack@suse.cz>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: Jan Kara <jack@suse.cz>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	axboe@kernel.dk, tytso@mit.edu, david@fromorbit.com, bpm@sgi.com,
	viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
	hch@infradead.org, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH 2/9] ext4: honor the O_SYNC flag for aysnchronous direct I/O requests
Date: Wed, 21 Nov 2012 01:56:26 +0100
Message-ID: <20121121005626.GC10507@quack.suse.cz>
In-Reply-To: <x49sj84hwl4.fsf@segfault.boston.devel.redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2231 bytes --]

On Tue 20-11-12 15:02:15, Jeff Moyer wrote:
> Jan Kara <jack@suse.cz> writes:
> 
> >> @@ -1279,6 +1280,9 @@ struct ext4_sb_info {
> >>  	/* workqueue for dio unwritten */
> >>  	struct workqueue_struct *dio_unwritten_wq;
> >>  
> >> +	/* workqueue for aio+dio+o_sync disk cache flushing */
> >> +	struct workqueue_struct *aio_dio_flush_wq;
> >> +
> >   Umm, I'm not completely decided whether we really need a separate
> > workqueue. But it doesn't cost too much so I guess it makes some sense -
> > fsync() is rather heavy so syncing won't starve extent conversion...
> 
> I'm assuming you'd like me to convert the names from flush to fsync,
> yes?
  Would be nicer, yes.

> >> +
> >> +	/*
> >> +	 * If we are running in nojournal mode, just flush the disk
> >> +	 * cache and return.
> >> +	 */
> >> +	if (!journal)
> >> +		return blkdev_issue_flush(inode->i_sb->s_bdev, GFP_NOIO, NULL);
> >   And this is wrong as well - you need to do work similar to what
> > ext4_sync_file() does. Actually it would be *much* better if these two
> > sites used the same helper function. Which also poses an interesting
> > question about locking - do we need i_mutex or not? Forcing a transaction
> > commit is definitely OK without it, similarly as grabbing transaction ids
> > from inode or ext4_should_journal_data() test. __sync_inode() call seems
> > to be OK without i_mutex as well so I believe we can just get rid of it
> > (getting i_mutex from the workqueue is a locking nightmare we don't want to
> > return to).
> 
> Just to be clear, are you saying you would like me to remove the
> mutex_lock/unlock pair from ext4_sync_file?  (I had already factored out
> the common code between this new code path and the fsync path in my tree.)
  Yes, after some thinking I came to that conclusion. We actually need to
keep i_mutex around ext4_flush_unwritten_io() to avoid livelocks, but the
rest doesn't need it. The change should definitely be a separate patch, just
in case there's something subtle I missed and we need to bisect in the
future... I've attached a patch for that so that blame for bugs goes my way
;) Compile tested only so far. I'll give it some more testing overnight.
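
To illustrate the shape of the change outside the kernel, here is a minimal
user-space sketch of the same idea: take the lock only around the one step
that needs it and run everything else unlocked. It is not the attached patch.
Every name in it (struct demo_inode, demo_flush_pending(),
demo_force_commit(), demo_sync()) is hypothetical and only stands in for
i_mutex, ext4_flush_unwritten_io(), the transaction commit and
ext4_sync_file() respectively. It assumes POSIX threads and C11 atomics.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; NOT the kernel's struct inode. */
struct demo_inode {
	pthread_mutex_t lock;	/* plays the role of i_mutex */
	bool read_only;		/* plays the role of the MS_RDONLY check */
	int pending_io;		/* plays the role of unwritten I/O to flush */
	atomic_int commits;	/* counts forced "commits"; has its own sync */
};

/* The one step that must run under the lock in this sketch. */
static int demo_flush_pending(struct demo_inode *inode)
{
	inode->pending_io = 0;
	return 0;
}

/* Needs no inode lock: it only touches state with its own synchronization. */
static int demo_force_commit(struct demo_inode *inode)
{
	atomic_fetch_add(&inode->commits, 1);
	return 0;
}

/* Lock span reduced to the flush step; the lock is never held on return. */
int demo_sync(struct demo_inode *inode)
{
	int ret;

	assert(inode != NULL);

	if (inode->read_only)
		return 0;

	pthread_mutex_lock(&inode->lock);
	ret = demo_flush_pending(inode);
	pthread_mutex_unlock(&inode->lock);
	if (ret < 0)
		return ret;

	return demo_force_commit(inode);
}
```

A caller would initialize the lock with PTHREAD_MUTEX_INITIALIZER and call
demo_sync(&inode). Because the lock is dropped before the early-exit checks
that follow the flush, no error path has to remember to unlock, which is the
same reason the patch can drop the unlock at the out: label.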

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

[-- Attachment #2: 0001-ext4-Reduce-i_mutex-usage-in-ext4_file_sync.patch --]
[-- Type: text/x-patch, Size: 1862 bytes --]

From 98f02e76b90e278e9688b4311a8889cec7095601 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 21 Nov 2012 01:46:51 +0100
Subject: [PATCH] ext4: Reduce i_mutex usage in ext4_sync_file()

ext4_sync_file() needs i_mutex only to avoid livelocks in
ext4_flush_unwritten_io(); the rest of the code doesn't need it. In particular,
syncing of inode & metadata in the non-journal case is safe (writeback doesn't
hold i_mutex either) and forcing transaction commits doesn't need i_mutex
either (there's nothing inode-specific in that code apart from grabbing
transaction ids from the inode). So shorten the span where i_mutex is held.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/fsync.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c
index be1d89f..2268114 100644
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -113,8 +113,6 @@ static int __sync_inode(struct inode *inode, int datasync)
  *
  * What we do is just kick off a commit and wait on it.  This will snapshot the
  * inode to disk.
- *
- * i_mutex lock is held when entering and exiting this function
  */
 
 int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
@@ -133,12 +131,13 @@ int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
 	if (ret)
 		return ret;
-	mutex_lock(&inode->i_mutex);
 
 	if (inode->i_sb->s_flags & MS_RDONLY)
 		goto out;
 
+	mutex_lock(&inode->i_mutex);
 	ret = ext4_flush_unwritten_io(inode);
+	mutex_unlock(&inode->i_mutex);
 	if (ret < 0)
 		goto out;
 
@@ -180,7 +179,6 @@ int ext4_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 			ret = err;
 	}
  out:
-	mutex_unlock(&inode->i_mutex);
 	trace_ext4_sync_file_exit(inode, ret);
 	return ret;
 }
-- 
1.7.1

