* avoid taking the iolock in fsync unless actually needed
@ 2021-01-11 16:15 Christoph Hellwig
2021-01-11 16:15 ` [PATCH 1/2] xfs: refactor xfs_file_fsync Christoph Hellwig
2021-01-11 16:15 ` [PATCH 2/2] xfs: reduce ilock acquisitions in xfs_file_fsync Christoph Hellwig
0 siblings, 2 replies; 8+ messages in thread
From: Christoph Hellwig @ 2021-01-11 16:15 UTC
To: linux-xfs
Hi all,
this series avoids taking the iolock in fsync if there is no dirty
metadata.
* [PATCH 1/2] xfs: refactor xfs_file_fsync
2021-01-11 16:15 avoid taking the iolock in fsync unless actually needed Christoph Hellwig
@ 2021-01-11 16:15 ` Christoph Hellwig
2021-01-12 15:33 ` Brian Foster
2021-01-11 16:15 ` [PATCH 2/2] xfs: reduce ilock acquisitions in xfs_file_fsync Christoph Hellwig
1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-01-11 16:15 UTC
To: linux-xfs
Factor out the log syncing logic into two helpers to make the code easier
to read and more maintainable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 81 +++++++++++++++++++++++++++++------------------
1 file changed, 50 insertions(+), 31 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5b0f93f738372d..414d856e2e755a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -118,6 +118,54 @@ xfs_dir_fsync(
return xfs_log_force_inode(ip);
}
+static xfs_lsn_t
+xfs_fsync_lsn(
+ struct xfs_inode *ip,
+ bool datasync)
+{
+ if (!xfs_ipincount(ip))
+ return 0;
+ if (datasync && !(ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
+ return 0;
+ return ip->i_itemp->ili_last_lsn;
+}
+
+/*
+ * All metadata updates are logged, which means that we just have to flush the
+ * log up to the latest LSN that touched the inode.
+ *
+ * If we have concurrent fsync/fdatasync() calls, we need them to all block on
+ * the log force before we clear the ili_fsync_fields field. This ensures that
+ * we don't get a racing sync operation that does not wait for the metadata to
+ * hit the journal before returning. If we race with clearing ili_fsync_fields,
+ * then all that will happen is the log force will do nothing as the lsn will
+ * already be on disk. We can't race with setting ili_fsync_fields because that
+ * is done under XFS_ILOCK_EXCL, and that can't happen because we hold the lock
+ * shared until after the ili_fsync_fields is cleared.
+ */
+static int
+xfs_fsync_flush_log(
+ struct xfs_inode *ip,
+ bool datasync,
+ int *log_flushed)
+{
+ int error = 0;
+ xfs_lsn_t lsn;
+
+ xfs_ilock(ip, XFS_ILOCK_SHARED);
+ lsn = xfs_fsync_lsn(ip, datasync);
+ if (lsn) {
+ error = xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC,
+ log_flushed);
+
+ spin_lock(&ip->i_itemp->ili_lock);
+ ip->i_itemp->ili_fsync_fields = 0;
+ spin_unlock(&ip->i_itemp->ili_lock);
+ }
+ xfs_iunlock(ip, XFS_ILOCK_SHARED);
+ return error;
+}
+
STATIC int
xfs_file_fsync(
struct file *file,
@@ -125,13 +173,10 @@ xfs_file_fsync(
loff_t end,
int datasync)
{
- struct inode *inode = file->f_mapping->host;
- struct xfs_inode *ip = XFS_I(inode);
- struct xfs_inode_log_item *iip = ip->i_itemp;
+ struct xfs_inode *ip = XFS_I(file->f_mapping->host);
struct xfs_mount *mp = ip->i_mount;
int error = 0;
int log_flushed = 0;
- xfs_lsn_t lsn = 0;
trace_xfs_file_fsync(ip);
@@ -155,33 +200,7 @@ xfs_file_fsync(
else if (mp->m_logdev_targp != mp->m_ddev_targp)
xfs_blkdev_issue_flush(mp->m_ddev_targp);
- /*
- * All metadata updates are logged, which means that we just have to
- * flush the log up to the latest LSN that touched the inode. If we have
- * concurrent fsync/fdatasync() calls, we need them to all block on the
- * log force before we clear the ili_fsync_fields field. This ensures
- * that we don't get a racing sync operation that does not wait for the
- * metadata to hit the journal before returning. If we race with
- * clearing the ili_fsync_fields, then all that will happen is the log
- * force will do nothing as the lsn will already be on disk. We can't
- * race with setting ili_fsync_fields because that is done under
- * XFS_ILOCK_EXCL, and that can't happen because we hold the lock shared
- * until after the ili_fsync_fields is cleared.
- */
- xfs_ilock(ip, XFS_ILOCK_SHARED);
- if (xfs_ipincount(ip)) {
- if (!datasync ||
- (iip->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
- lsn = iip->ili_last_lsn;
- }
-
- if (lsn) {
- error = xfs_log_force_lsn(mp, lsn, XFS_LOG_SYNC, &log_flushed);
- spin_lock(&iip->ili_lock);
- iip->ili_fsync_fields = 0;
- spin_unlock(&iip->ili_lock);
- }
- xfs_iunlock(ip, XFS_ILOCK_SHARED);
+ error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
/*
* If we only have a single device, and the log force about was
--
2.29.2
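The decision made by the new xfs_fsync_lsn() helper can be tried outside the kernel. The following is a minimal userspace sketch, not the kernel code: the struct, the function name and the MODEL_ILOG_* flag values are hypothetical stand-ins for xfs_inode, xfs_fsync_lsn() and the XFS_ILOG_* flags, and locking is left out entirely. It only shows which combinations of pin count, dirty fields and datasync yield an LSN to force.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real XFS_ILOG_* constants differ. */
#define MODEL_ILOG_CORE      0x01u  /* some non-timestamp inode field */
#define MODEL_ILOG_TIMESTAMP 0x02u  /* timestamp-only update */

/* Hypothetical stand-in for the parts of xfs_inode the helper reads. */
struct model_inode {
	int      pincount;      /* models xfs_ipincount(ip) */
	unsigned fsync_fields;  /* models ip->i_itemp->ili_fsync_fields */
	uint64_t last_lsn;      /* models ip->i_itemp->ili_last_lsn */
};

/*
 * Return the LSN the log must be forced to, or 0 if no force is needed:
 *  - an unpinned inode has nothing pending in the log;
 *  - fdatasync can skip the force when only timestamps are dirty.
 */
static uint64_t model_fsync_lsn(const struct model_inode *ip, bool datasync)
{
	if (!ip->pincount)
		return 0;
	if (datasync && !(ip->fsync_fields & ~MODEL_ILOG_TIMESTAMP))
		return 0;
	return ip->last_lsn;
}
```

Under these assumptions, a pinned inode with only MODEL_ILOG_TIMESTAMP set returns 0 for datasync but last_lsn for a full fsync, which matches the branch structure of the original open-coded check.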
* [PATCH 2/2] xfs: reduce ilock acquisitions in xfs_file_fsync
2021-01-11 16:15 avoid taking the iolock in fsync unless actually needed Christoph Hellwig
2021-01-11 16:15 ` [PATCH 1/2] xfs: refactor xfs_file_fsync Christoph Hellwig
@ 2021-01-11 16:15 ` Christoph Hellwig
2021-01-12 15:34 ` Brian Foster
1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-01-11 16:15 UTC
To: linux-xfs
If the inode is not pinned by the time fsync is called we don't need the
ilock to protect against concurrent clearing of ili_fsync_fields as the
inode won't need a log flush or clearing of these fields. Not taking
the iolock allows for full concurrency of fsync and thus O_DSYNC
completions with io_uring/aio write submissions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 414d856e2e755a..ba02780dee6439 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -200,7 +200,8 @@ xfs_file_fsync(
else if (mp->m_logdev_targp != mp->m_ddev_targp)
xfs_blkdev_issue_flush(mp->m_ddev_targp);
- error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
+ if (xfs_ipincount(ip))
+ error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
/*
* If we only have a single device, and the log force about was
--
2.29.2
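The effect of the added xfs_ipincount() check can be illustrated with a single-threaded userspace sketch. Everything here is an assumption made for illustration: the sketch_* names and flag values are hypothetical, the "lock" is only a counter that records how often the shared ilock would have been taken, and the log force is reduced to a counter. It does not model concurrency or the race discussed in review; it only shows that an unpinned inode never reaches the locked path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values; the real XFS_ILOG_* constants differ. */
#define SKETCH_ILOG_CORE      0x01u
#define SKETCH_ILOG_TIMESTAMP 0x02u

struct sketch_inode {
	int      pincount;            /* models xfs_ipincount(ip) */
	unsigned fsync_fields;        /* models ili_fsync_fields */
	uint64_t last_lsn;            /* models ili_last_lsn */
	int      ilock_acquisitions;  /* instrumentation: shared ilock takes */
	int      log_forces;          /* instrumentation: log forces issued */
};

/* Models xfs_fsync_flush_log(): always "takes" the ilock. */
static void sketch_flush_log(struct sketch_inode *ip, bool datasync)
{
	uint64_t lsn = 0;

	ip->ilock_acquisitions++;  /* stands in for xfs_ilock(ip, XFS_ILOCK_SHARED) */
	if (ip->pincount &&
	    (!datasync || (ip->fsync_fields & ~SKETCH_ILOG_TIMESTAMP)))
		lsn = ip->last_lsn;
	if (lsn) {
		ip->log_forces++;      /* stands in for xfs_log_force_lsn() */
		ip->fsync_fields = 0;  /* cleared only after the force */
	}
	/* the matching xfs_iunlock() would go here */
}

/* Models the patched call site: skip the locked path when unpinned. */
static void sketch_fsync(struct sketch_inode *ip, bool datasync)
{
	if (ip->pincount)
		sketch_flush_log(ip, datasync);
}
```

Calling sketch_fsync() on an inode with pincount 0 leaves ilock_acquisitions at 0; once pincount is nonzero, the same call takes the lock once, issues one force and clears fsync_fields.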
* Re: [PATCH 1/2] xfs: refactor xfs_file_fsync
2021-01-11 16:15 ` [PATCH 1/2] xfs: refactor xfs_file_fsync Christoph Hellwig
@ 2021-01-12 15:33 ` Brian Foster
2021-01-12 17:12 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Brian Foster @ 2021-01-12 15:33 UTC
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Jan 11, 2021 at 05:15:43PM +0100, Christoph Hellwig wrote:
> Factor out the log syncing logic into two helpers to make the code easier
> to read and more maintainable.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
Looks fine, though it might be nice to find some commonality with
xfs_log_force_inode():
Reviewed-by: Brian Foster <bfoster@redhat.com>
> fs/xfs/xfs_file.c | 81 +++++++++++++++++++++++++++++------------------
> 1 file changed, 50 insertions(+), 31 deletions(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 5b0f93f738372d..414d856e2e755a 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -118,6 +118,54 @@ xfs_dir_fsync(
> return xfs_log_force_inode(ip);
> }
>
> +static xfs_lsn_t
> +xfs_fsync_lsn(
> + struct xfs_inode *ip,
> + bool datasync)
> +{
> + if (!xfs_ipincount(ip))
> + return 0;
> + if (datasync && !(ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> + return 0;
> + return ip->i_itemp->ili_last_lsn;
> +}
> +
> +/*
> + * All metadata updates are logged, which means that we just have to flush the
> + * log up to the latest LSN that touched the inode.
> + *
> + * If we have concurrent fsync/fdatasync() calls, we need them to all block on
> + * the log force before we clear the ili_fsync_fields field. This ensures that
> + * we don't get a racing sync operation that does not wait for the metadata to
> + * hit the journal before returning. If we race with clearing ili_fsync_fields,
> + * then all that will happen is the log force will do nothing as the lsn will
> + * already be on disk. We can't race with setting ili_fsync_fields because that
> + * is done under XFS_ILOCK_EXCL, and that can't happen because we hold the lock
> + * shared until after the ili_fsync_fields is cleared.
> + */
> +static int
> +xfs_fsync_flush_log(
> + struct xfs_inode *ip,
> + bool datasync,
> + int *log_flushed)
> +{
> + int error = 0;
> + xfs_lsn_t lsn;
> +
> + xfs_ilock(ip, XFS_ILOCK_SHARED);
> + lsn = xfs_fsync_lsn(ip, datasync);
> + if (lsn) {
> + error = xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC,
> + log_flushed);
> +
> + spin_lock(&ip->i_itemp->ili_lock);
> + ip->i_itemp->ili_fsync_fields = 0;
> + spin_unlock(&ip->i_itemp->ili_lock);
> + }
> + xfs_iunlock(ip, XFS_ILOCK_SHARED);
> + return error;
> +}
> +
> STATIC int
> xfs_file_fsync(
> struct file *file,
> @@ -125,13 +173,10 @@ xfs_file_fsync(
> loff_t end,
> int datasync)
> {
> - struct inode *inode = file->f_mapping->host;
> - struct xfs_inode *ip = XFS_I(inode);
> - struct xfs_inode_log_item *iip = ip->i_itemp;
> + struct xfs_inode *ip = XFS_I(file->f_mapping->host);
> struct xfs_mount *mp = ip->i_mount;
> int error = 0;
> int log_flushed = 0;
> - xfs_lsn_t lsn = 0;
>
> trace_xfs_file_fsync(ip);
>
> @@ -155,33 +200,7 @@ xfs_file_fsync(
> else if (mp->m_logdev_targp != mp->m_ddev_targp)
> xfs_blkdev_issue_flush(mp->m_ddev_targp);
>
> - /*
> - * All metadata updates are logged, which means that we just have to
> - * flush the log up to the latest LSN that touched the inode. If we have
> - * concurrent fsync/fdatasync() calls, we need them to all block on the
> - * log force before we clear the ili_fsync_fields field. This ensures
> - * that we don't get a racing sync operation that does not wait for the
> - * metadata to hit the journal before returning. If we race with
> - * clearing the ili_fsync_fields, then all that will happen is the log
> - * force will do nothing as the lsn will already be on disk. We can't
> - * race with setting ili_fsync_fields because that is done under
> - * XFS_ILOCK_EXCL, and that can't happen because we hold the lock shared
> - * until after the ili_fsync_fields is cleared.
> - */
> - xfs_ilock(ip, XFS_ILOCK_SHARED);
> - if (xfs_ipincount(ip)) {
> - if (!datasync ||
> - (iip->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
> - lsn = iip->ili_last_lsn;
> - }
> -
> - if (lsn) {
> - error = xfs_log_force_lsn(mp, lsn, XFS_LOG_SYNC, &log_flushed);
> - spin_lock(&iip->ili_lock);
> - iip->ili_fsync_fields = 0;
> - spin_unlock(&iip->ili_lock);
> - }
> - xfs_iunlock(ip, XFS_ILOCK_SHARED);
> + error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
>
> /*
> * If we only have a single device, and the log force about was
> --
> 2.29.2
>
* Re: [PATCH 2/2] xfs: reduce ilock acquisitions in xfs_file_fsync
2021-01-11 16:15 ` [PATCH 2/2] xfs: reduce ilock acquisitions in xfs_file_fsync Christoph Hellwig
@ 2021-01-12 15:34 ` Brian Foster
0 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2021-01-12 15:34 UTC
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Jan 11, 2021 at 05:15:44PM +0100, Christoph Hellwig wrote:
> If the inode is not pinned by the time fsync is called we don't need the
> ilock to protect against concurrent clearing of ili_fsync_fields as the
> inode won't need a log flush or clearing of these fields. Not taking
> the iolock allows for full concurrency of fsync and thus O_DSYNC
> completions with io_uring/aio write submissions.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
So this changes fsync semantics for when a concurrent modification might
already be in progress (but not yet complete) to essentially skip the
log force rather than serialize/wait and force. This seems... reasonable,
I suppose, since nothing has committed at that point, but I feel like this
could use more documentation and justification around that and why this
might be acceptable behavior.
Brian
> fs/xfs/xfs_file.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 414d856e2e755a..ba02780dee6439 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -200,7 +200,8 @@ xfs_file_fsync(
> else if (mp->m_logdev_targp != mp->m_ddev_targp)
> xfs_blkdev_issue_flush(mp->m_ddev_targp);
>
> - error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
> + if (xfs_ipincount(ip))
> + error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
>
> /*
> * If we only have a single device, and the log force about was
> --
> 2.29.2
>
* Re: [PATCH 1/2] xfs: refactor xfs_file_fsync
2021-01-12 15:33 ` Brian Foster
@ 2021-01-12 17:12 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2021-01-12 17:12 UTC
To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs
On Tue, Jan 12, 2021 at 10:33:47AM -0500, Brian Foster wrote:
> Looks fine, though it might be nice to find some commonality with
> xfs_log_force_inode():
The common logic is called xfs_log_force_lsn :)
The fact that fsync checks and modifies ili_fsync_fields makes it rather
impractical to share more code, unfortunately.
* Re: [PATCH 1/2] xfs: refactor xfs_file_fsync
2021-01-22 16:46 ` [PATCH 1/2] xfs: refactor xfs_file_fsync Christoph Hellwig
@ 2021-01-22 21:08 ` Dave Chinner
0 siblings, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2021-01-22 21:08 UTC
To: Christoph Hellwig; +Cc: linux-xfs, Brian Foster
On Fri, Jan 22, 2021 at 05:46:42PM +0100, Christoph Hellwig wrote:
> Factor out the log syncing logic into two helpers to make the code easier
> to read and more maintainable.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Brian Foster <bfoster@redhat.com>
LGTM.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
* [PATCH 1/2] xfs: refactor xfs_file_fsync
2021-01-22 16:46 avoid taking the iolock in fsync unless actually needed v2 Christoph Hellwig
@ 2021-01-22 16:46 ` Christoph Hellwig
2021-01-22 21:08 ` Dave Chinner
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2021-01-22 16:46 UTC
To: linux-xfs; +Cc: Brian Foster
Factor out the log syncing logic into two helpers to make the code easier
to read and more maintainable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
fs/xfs/xfs_file.c | 81 +++++++++++++++++++++++++++++------------------
1 file changed, 50 insertions(+), 31 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 39695b59dfcc92..588232c77f11e0 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -118,6 +118,54 @@ xfs_dir_fsync(
return xfs_log_force_inode(ip);
}
+static xfs_lsn_t
+xfs_fsync_lsn(
+ struct xfs_inode *ip,
+ bool datasync)
+{
+ if (!xfs_ipincount(ip))
+ return 0;
+ if (datasync && !(ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
+ return 0;
+ return ip->i_itemp->ili_last_lsn;
+}
+
+/*
+ * All metadata updates are logged, which means that we just have to flush the
+ * log up to the latest LSN that touched the inode.
+ *
+ * If we have concurrent fsync/fdatasync() calls, we need them to all block on
+ * the log force before we clear the ili_fsync_fields field. This ensures that
+ * we don't get a racing sync operation that does not wait for the metadata to
+ * hit the journal before returning. If we race with clearing ili_fsync_fields,
+ * then all that will happen is the log force will do nothing as the lsn will
+ * already be on disk. We can't race with setting ili_fsync_fields because that
+ * is done under XFS_ILOCK_EXCL, and that can't happen because we hold the lock
+ * shared until after the ili_fsync_fields is cleared.
+ */
+static int
+xfs_fsync_flush_log(
+ struct xfs_inode *ip,
+ bool datasync,
+ int *log_flushed)
+{
+ int error = 0;
+ xfs_lsn_t lsn;
+
+ xfs_ilock(ip, XFS_ILOCK_SHARED);
+ lsn = xfs_fsync_lsn(ip, datasync);
+ if (lsn) {
+ error = xfs_log_force_lsn(ip->i_mount, lsn, XFS_LOG_SYNC,
+ log_flushed);
+
+ spin_lock(&ip->i_itemp->ili_lock);
+ ip->i_itemp->ili_fsync_fields = 0;
+ spin_unlock(&ip->i_itemp->ili_lock);
+ }
+ xfs_iunlock(ip, XFS_ILOCK_SHARED);
+ return error;
+}
+
STATIC int
xfs_file_fsync(
struct file *file,
@@ -125,13 +173,10 @@ xfs_file_fsync(
loff_t end,
int datasync)
{
- struct inode *inode = file->f_mapping->host;
- struct xfs_inode *ip = XFS_I(inode);
- struct xfs_inode_log_item *iip = ip->i_itemp;
+ struct xfs_inode *ip = XFS_I(file->f_mapping->host);
struct xfs_mount *mp = ip->i_mount;
int error = 0;
int log_flushed = 0;
- xfs_lsn_t lsn = 0;
trace_xfs_file_fsync(ip);
@@ -155,33 +200,7 @@ xfs_file_fsync(
else if (mp->m_logdev_targp != mp->m_ddev_targp)
xfs_blkdev_issue_flush(mp->m_ddev_targp);
- /*
- * All metadata updates are logged, which means that we just have to
- * flush the log up to the latest LSN that touched the inode. If we have
- * concurrent fsync/fdatasync() calls, we need them to all block on the
- * log force before we clear the ili_fsync_fields field. This ensures
- * that we don't get a racing sync operation that does not wait for the
- * metadata to hit the journal before returning. If we race with
- * clearing the ili_fsync_fields, then all that will happen is the log
- * force will do nothing as the lsn will already be on disk. We can't
- * race with setting ili_fsync_fields because that is done under
- * XFS_ILOCK_EXCL, and that can't happen because we hold the lock shared
- * until after the ili_fsync_fields is cleared.
- */
- xfs_ilock(ip, XFS_ILOCK_SHARED);
- if (xfs_ipincount(ip)) {
- if (!datasync ||
- (iip->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
- lsn = iip->ili_last_lsn;
- }
-
- if (lsn) {
- error = xfs_log_force_lsn(mp, lsn, XFS_LOG_SYNC, &log_flushed);
- spin_lock(&iip->ili_lock);
- iip->ili_fsync_fields = 0;
- spin_unlock(&iip->ili_lock);
- }
- xfs_iunlock(ip, XFS_ILOCK_SHARED);
+ error = xfs_fsync_flush_log(ip, datasync, &log_flushed);
/*
* If we only have a single device, and the log force about was
--
2.29.2
Thread overview: 8+ messages
2021-01-11 16:15 avoid taking the iolock in fsync unless actually needed Christoph Hellwig
2021-01-11 16:15 ` [PATCH 1/2] xfs: refactor xfs_file_fsync Christoph Hellwig
2021-01-12 15:33 ` Brian Foster
2021-01-12 17:12 ` Christoph Hellwig
2021-01-11 16:15 ` [PATCH 2/2] xfs: reduce ilock acquisitions in xfs_file_fsync Christoph Hellwig
2021-01-12 15:34 ` Brian Foster
2021-01-22 16:46 avoid taking the iolock in fsync unless actually needed v2 Christoph Hellwig
2021-01-22 16:46 ` [PATCH 1/2] xfs: refactor xfs_file_fsync Christoph Hellwig
2021-01-22 21:08 ` Dave Chinner