linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Btrfs: check for the full sync flag while holding the inode lock during fsync
@ 2019-10-16 15:28 fdmanana
  2019-10-17 18:40 ` David Sterba
  0 siblings, 1 reply; 2+ messages in thread
From: fdmanana @ 2019-10-16 15:28 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

We were checking for the full fsync flag in the inode before locking the
inode, which is racy, since at that that time it might not be set but
after we acquire the inode lock some other task set it. One case where
this can happen is on a system low on memory and some concurrent task
failed to allocate an extent map and therefore set the full sync flag on
the inode, to force the next fsync to work in full mode.

A consequence of missing the full fsync flag set is hitting the problems
fixed by commit 0c713cbab620 ("Btrfs: fix race between ranged fsync and
writeback of adjacent ranges"), BUG_ON() when dropping extents from a log
tree, hitting assertion failures at tree-log.c:copy_items() or all sorts
of weird inconsistencies after replaying a log due to file extents items
representing ranges that overlap.

So just move the check such that it's done after locking the inode and
before starting writeback again.

Fixes: 0c713cbab620 ("Btrfs: fix race between ranged fsync and writeback of adjacent ranges")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/file.c | 36 +++++++++++++++++-------------------
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 352928b45d2a..d6df61121dbb 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2068,25 +2068,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	struct btrfs_trans_handle *trans;
 	struct btrfs_log_ctx ctx;
 	int ret = 0, err;
-	u64 len;
 
-	/*
-	 * If the inode needs a full sync, make sure we use a full range to
-	 * avoid log tree corruption, due to hole detection racing with ordered
-	 * extent completion for adjacent ranges, and assertion failures during
-	 * hole detection.
-	 */
-	if (test_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
-		     &BTRFS_I(inode)->runtime_flags)) {
-		start = 0;
-		end = LLONG_MAX;
-	}
-
-	/*
-	 * The range length can be represented by u64, we have to do the typecasts
-	 * to avoid signed overflow if it's [0, LLONG_MAX] eg. from fsync()
-	 */
-	len = (u64)end - (u64)start + 1;
 	trace_btrfs_sync_file(file, datasync);
 
 	btrfs_init_log_ctx(&ctx, inode);
@@ -2113,6 +2095,19 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	atomic_inc(&root->log_batch);
 
 	/*
+	 * If the inode needs a full sync, make sure we use a full range to
+	 * avoid log tree corruption, due to hole detection racing with ordered
+	 * extent completion for adjacent ranges, and assertion failures during
+	 * hole detection. Do this while holding the inode lock, to avoid races
+	 * with other tasks.
+	 */
+	if (test_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
+		     &BTRFS_I(inode)->runtime_flags)) {
+		start = 0;
+		end = LLONG_MAX;
+	}
+
+	/*
 	 * Before we acquired the inode's lock, someone may have dirtied more
 	 * pages in the target range. We need to make sure that writeback for
 	 * any such pages does not start while we are logging the inode, because
@@ -2139,8 +2134,11 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	/*
 	 * We have to do this here to avoid the priority inversion of waiting on
 	 * IO of a lower priority task while holding a transaction open.
+	 *
+	 * Also, the range length can be represented by u64, we have to do the
+	 * typecasts to avoid signed overflow if it's [0, LLONG_MAX].
 	 */
-	ret = btrfs_wait_ordered_range(inode, start, len);
+	ret = btrfs_wait_ordered_range(inode, start, (u64)end - (u64)start + 1);
 	if (ret) {
 		up_write(&BTRFS_I(inode)->dio_sem);
 		inode_unlock(inode);
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] Btrfs: check for the full sync flag while holding the inode lock during fsync
  2019-10-16 15:28 [PATCH] Btrfs: check for the full sync flag while holding the inode lock during fsync fdmanana
@ 2019-10-17 18:40 ` David Sterba
  0 siblings, 0 replies; 2+ messages in thread
From: David Sterba @ 2019-10-17 18:40 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Wed, Oct 16, 2019 at 04:28:52PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> We were checking for the full fsync flag in the inode before locking the
> inode, which is racy, since at that that time it might not be set but
> after we acquire the inode lock some other task set it. One case where
> this can happen is on a system low on memory and some concurrent task
> failed to allocate an extent map and therefore set the full sync flag on
> the inode, to force the next fsync to work in full mode.
> 
> A consequence of missing the full fsync flag set is hitting the problems
> fixed by commit 0c713cbab620 ("Btrfs: fix race between ranged fsync and
> writeback of adjacent ranges"), BUG_ON() when dropping extents from a log
> tree, hitting assertion failures at tree-log.c:copy_items() or all sorts
> of weird inconsistencies after replaying a log due to file extents items
> representing ranges that overlap.
> 
> So just move the check such that it's done after locking the inode and
> before starting writeback again.
> 
> Fixes: 0c713cbab620 ("Btrfs: fix race between ranged fsync and writeback of adjacent ranges")
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Oh the joys of fsync and tree-log, thanks for the fix, queued for 5.4-rc.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-10-17 18:40 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-16 15:28 [PATCH] Btrfs: check for the full sync flag while holding the inode lock during fsync fdmanana
2019-10-17 18:40 ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).