From: fdmanana@kernel.org
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v2 4/7] btrfs: remove ordered extent check and wait during fallocate
Date: Tue, 15 Mar 2022 12:18:51 +0000 [thread overview]
Message-ID: <cb7153055d8808fca7cfa7711a37a78770064e34.1647346287.git.fdmanana@suse.com> (raw)
In-Reply-To: <cover.1647346287.git.fdmanana@suse.com>
From: Filipe Manana <fdmanana@suse.com>
For fallocate() we have this loop that checks if we have ordered extents
after locking the file range, and if so unlock the range, wait for ordered
extents, and retry until we don't find more ordered extents.
This logic was needed in the past because:
1) Direct IO writes within the i_size boundary did not take the inode's
VFS lock. This was because that lock used to be a mutex, then some
years ago it was switched to a rw semaphore (commit 9902af79c01a8e
("parallel lookups: actual switch to rwsem")), and then btrfs was
changed to take the VFS inode's lock in shared mode for writes that
don't cross the i_size boundary (commit e9adabb9712ef9 ("btrfs: use
shared lock for direct writes within EOF"));
2) We could race with memory mapped writes, because memory mapped writes
don't acquire the inode's VFS lock. We don't have that race anymore,
as we have a rw semaphore to synchronize memory mapped writes with
fallocate (and reflinking too). That change happened with commit
8d9b4a162a37ce ("btrfs: exclude mmap from happening during all
fallocate operations").
So stop looking for ordered extents after locking the file range when
doing a plain fallocate.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/file.c | 42 ++++++++----------------------------------
1 file changed, 8 insertions(+), 34 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 2f57f7d9d9cb..a7fd05c1d52f 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -3474,8 +3474,12 @@ static long btrfs_fallocate(struct file *file, int mode,
}
/*
- * wait for ordered IO before we have any locks. We'll loop again
- * below with the locks held.
+ * We have locked the inode at the VFS level (in exclusive mode) and we
+ * have locked the i_mmap_lock lock (in exclusive mode). Now before
+ * locking the file range, flush all dealloc in the range and wait for
+ * all ordered extents in the range to complete. After this we can lock
+ * the file range and, due to the previous locking we did, we know there
+ * can't be more delalloc or ordered extents in the range.
*/
ret = btrfs_wait_ordered_range(inode, alloc_start,
alloc_end - alloc_start);
@@ -3489,38 +3493,8 @@ static long btrfs_fallocate(struct file *file, int mode,
}
locked_end = alloc_end - 1;
- while (1) {
- struct btrfs_ordered_extent *ordered;
-
- /* the extent lock is ordered inside the running
- * transaction
- */
- lock_extent_bits(&BTRFS_I(inode)->io_tree, alloc_start,
- locked_end, &cached_state);
- ordered = btrfs_lookup_first_ordered_extent(BTRFS_I(inode),
- locked_end);
-
- if (ordered &&
- ordered->file_offset + ordered->num_bytes > alloc_start &&
- ordered->file_offset < alloc_end) {
- btrfs_put_ordered_extent(ordered);
- unlock_extent_cached(&BTRFS_I(inode)->io_tree,
- alloc_start, locked_end,
- &cached_state);
- /*
- * we can't wait on the range with the transaction
- * running or with the extent lock held
- */
- ret = btrfs_wait_ordered_range(inode, alloc_start,
- alloc_end - alloc_start);
- if (ret)
- goto out;
- } else {
- if (ordered)
- btrfs_put_ordered_extent(ordered);
- break;
- }
- }
+ lock_extent_bits(&BTRFS_I(inode)->io_tree, alloc_start, locked_end,
+ &cached_state);
/* First, check if we exceed the qgroup limit */
INIT_LIST_HEAD(&reserve_list);
--
2.33.0
next prev parent reply other threads:[~2022-03-15 12:19 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-15 10:47 [PATCH 0/6] btrfs: one fallocate fix and removing outdated code for fallocate and reflinks fdmanana
2022-03-15 10:47 ` [PATCH 1/6] btrfs: only reserve the needed data space amount during fallocate fdmanana
2022-03-15 10:47 ` [PATCH 2/6] btrfs: remove useless dio wait call when doing fallocate zero range fdmanana
2022-03-15 10:47 ` [PATCH 3/6] btrfs: remove inode_dio_wait() calls when starting reflink operations fdmanana
2022-03-15 10:47 ` [PATCH 4/6] btrfs: remove ordered extent check and wait during fallocate fdmanana
2022-03-15 10:47 ` [PATCH 5/6] btrfs: remove ordered extent check and wait during hole punching and zero range fdmanana
2022-03-15 10:47 ` [PATCH 6/6] btrfs: add and use helper to assert an inode range is clean fdmanana
2022-03-15 12:18 ` [PATCH v2 0/7] btrfs: one fallocate fix and removing outdated code for fallocate and reflinks fdmanana
2022-03-15 12:18 ` [PATCH v2 1/7] btrfs: only reserve the needed data space amount during fallocate fdmanana
2022-03-15 13:16 ` Wang Yugui
2022-03-15 13:45 ` Filipe Manana
2022-03-15 12:18 ` [PATCH v2 2/7] btrfs: remove useless dio wait call when doing fallocate zero range fdmanana
2022-03-15 12:18 ` [PATCH v2 3/7] btrfs: remove inode_dio_wait() calls when starting reflink operations fdmanana
2022-03-15 12:18 ` fdmanana [this message]
2022-03-15 12:18 ` [PATCH v2 5/7] btrfs: lock the inode first before flushing range when punching hole fdmanana
2022-03-15 12:18 ` [PATCH v2 6/7] btrfs: remove ordered extent check and wait during hole punching and zero range fdmanana
2022-03-15 12:18 ` [PATCH v2 7/7] btrfs: add and use helper to assert an inode range is clean fdmanana
2022-03-15 15:22 ` [PATCH v3 0/7] btrfs: one fallocate fix and removing outdated code for fallocate and reflinks fdmanana
2022-03-15 15:22 ` [PATCH v3 1/7] btrfs: only reserve the needed data space amount during fallocate fdmanana
2022-03-15 15:22 ` [PATCH v3 2/7] btrfs: remove useless dio wait call when doing fallocate zero range fdmanana
2022-03-15 15:22 ` [PATCH v3 3/7] btrfs: remove inode_dio_wait() calls when starting reflink operations fdmanana
2022-03-15 15:22 ` [PATCH v3 4/7] btrfs: remove ordered extent check and wait during fallocate fdmanana
2022-03-15 15:22 ` [PATCH v3 5/7] btrfs: lock the inode first before flushing range when punching hole fdmanana
2022-03-15 15:22 ` [PATCH v3 6/7] btrfs: remove ordered extent check and wait during hole punching and zero range fdmanana
2022-03-15 15:22 ` [PATCH v3 7/7] btrfs: add and use helper to assert an inode range is clean fdmanana
2022-03-16 16:29 ` [PATCH v3 0/7] btrfs: one fallocate fix and removing outdated code for fallocate and reflinks David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cb7153055d8808fca7cfa7711a37a78770064e34.1647346287.git.fdmanana@suse.com \
--to=fdmanana@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.