All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Subject: [Patch v2 12/42] btrfs: refactor btrfs_invalidatepage()
Date: Wed, 28 Apr 2021 07:03:19 +0800	[thread overview]
Message-ID: <20210427230349.369603-13-wqu@suse.com> (raw)
In-Reply-To: <20210427230349.369603-1-wqu@suse.com>

This patch will refactor btrfs_invalidatepage() for the incoming subpage
support.

The involved modifications are:
- Use while() loop instead of "goto again;"
- Use single variable to determine whether to delete extent states
  Each branch will also have comments why we can or cannot delete the
  extent states
- Do qgroup free and extent states deletion per-loop
  Current code can only work for PAGE_SIZE == sectorsize case.

This refactor also makes it clear what we do for different sectors:
- Sectors without ordered extent
  We're completely safe to remove all extent states for the sector(s)

- Sectors with ordered extent, but no Private2 bit
  This means the endio has already been executed, we can't remove all
  extent states for the sector(s).

- Sectors with ordere extent, still has Private2 bit
  This means we need to decrease the ordered extent accounting.
  And then it comes to two different variants:
  * We have finished and removed the ordered extent
    Then it's the same as "sectors without ordered extent"
  * We didn't finished the ordered extent
    We can remove some extent states, but not all.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/inode.c | 167 ++++++++++++++++++++++++++---------------------
 1 file changed, 93 insertions(+), 74 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index afcdca27a42f..e1c248fdf592 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8290,15 +8290,11 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 {
 	struct btrfs_inode *inode = BTRFS_I(page->mapping->host);
 	struct extent_io_tree *tree = &inode->io_tree;
-	struct btrfs_ordered_extent *ordered;
 	struct extent_state *cached_state = NULL;
 	u64 page_start = page_offset(page);
 	u64 page_end = page_start + PAGE_SIZE - 1;
-	u64 start;
-	u64 end;
+	u64 cur;
 	int inode_evicting = inode->vfs_inode.i_state & I_FREEING;
-	bool found_ordered = false;
-	bool completed_ordered = false;
 
 	/*
 	 * We have page locked so no new ordered extent can be created on
@@ -8322,93 +8318,116 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 	if (!inode_evicting)
 		lock_extent_bits(tree, page_start, page_end, &cached_state);
 
-	start = page_start;
-again:
-	ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1);
-	if (ordered) {
-		found_ordered = true;
-		end = min(page_end,
-			  ordered->file_offset + ordered->num_bytes - 1);
+	cur = page_start;
+	while (cur < page_end) {
+		struct btrfs_ordered_extent *ordered;
+		bool delete_states;
+		u64 range_end;
+
+		ordered = btrfs_lookup_first_ordered_range(inode, cur, page_end + 1 - cur);
+		if (!ordered) {
+			range_end = page_end;
+			/*
+			 * No ordered extent covering this range, we are safe
+			 * to delete all extent states in the range.
+			 */
+			delete_states = true;
+			goto next;
+		}
+		if (ordered->file_offset > cur) {
+			/*
+			 * There is a range between [cur, oe->file_offset) not
+			 * covered by any OE.
+			 * We are safe to delete all extent states, and handle
+			 * the OE in next iteration.
+			 */
+			range_end = ordered->file_offset - 1;
+			delete_states = true;
+			goto next;
+		}
+
+		range_end = min(ordered->file_offset + ordered->num_bytes - 1,
+				page_end);
+		if (!PagePrivate2(page)) {
+			/*
+			 * If Private2 is cleared, it means endio has already
+			 * been executed for the range.
+			 * We can't delete the extent states as
+			 * btrfs_finish_ordered_io() may still use some of them.
+			 */
+			delete_states = false;
+			goto next;
+		}
+		ClearPagePrivate2(page);
+
 		/*
 		 * IO on this page will never be started, so we need to account
 		 * for any ordered extents now. Don't clear EXTENT_DELALLOC_NEW
 		 * here, must leave that up for the ordered extent completion.
+		 *
+		 * This will also unlock the range for incoming
+		 * btrfs_finish_ordered_io().
 		 */
 		if (!inode_evicting)
-			clear_extent_bit(tree, start, end,
+			clear_extent_bit(tree, cur, range_end,
 					 EXTENT_DELALLOC |
 					 EXTENT_LOCKED | EXTENT_DO_ACCOUNTING |
 					 EXTENT_DEFRAG, 1, 0, &cached_state);
+
+		spin_lock_irq(&inode->ordered_tree.lock);
+		set_bit(BTRFS_ORDERED_TRUNCATED, &ordered->flags);
+		ordered->truncated_len = min(ordered->truncated_len,
+					     cur - ordered->file_offset);
+		spin_unlock_irq(&inode->ordered_tree.lock);
+
+		if (btrfs_dec_test_ordered_pending(inode, &ordered,
+					cur, range_end + 1 - cur, 1)) {
+			btrfs_finish_ordered_io(ordered);
+			/*
+			 * The ordered extent has finished, now we're again
+			 * safe to delete all extent states of the range.
+			 */
+			delete_states = true;
+		} else {
+			/*
+			 * btrfs_finish_ordered_io() will get executed by endio of
+			 * other pages, thus we can't delete extent states any more
+			 */
+			delete_states = false;
+		}
+next:
+		if (ordered)
+			btrfs_put_ordered_extent(ordered);
 		/*
-		 * A page with Private2 bit means no bio has submitted covering
-		 * the page, thus we have to manually do the ordered extent
-		 * accounting.
+		 * Qgroup reserved space handler
+		 * Sector(s) here will be either
+		 * 1) Already written to disk or bio already finished
+		 *    Then its QGROUP_RESERVED bit in io_tree is already cleaned.
+		 *    Qgroup will be handled by its qgroup_record then.
+		 *    btrfs_qgroup_free_data() call will do nothing here.
 		 *
-		 * For page without Private2, the ordered extent accounting is
-		 * done in its endio function of the submitted bio.
+		 * 2) Not written to disk yet
+		 *    Then btrfs_qgroup_free_data() call will clear the
+		 *    QGROUP_RESERVED bit of its io_tree, and free the qgroup
+		 *    reserved data space.
+		 *    Since the IO will never happen for this page.
 		 */
-		if (TestClearPagePrivate2(page)) {
-			spin_lock_irq(&inode->ordered_tree.lock);
-			set_bit(BTRFS_ORDERED_TRUNCATED, &ordered->flags);
-			ordered->truncated_len = min(ordered->truncated_len,
-						     start - ordered->file_offset);
-			spin_unlock_irq(&inode->ordered_tree.lock);
-
-			if (btrfs_dec_test_ordered_pending(inode, &ordered,
-							   start,
-							   end - start + 1, 1)) {
-				btrfs_finish_ordered_io(ordered);
-				completed_ordered = true;
-			}
-		}
-		btrfs_put_ordered_extent(ordered);
+		btrfs_qgroup_free_data(inode, NULL, cur, range_end + 1 - cur);
 		if (!inode_evicting) {
-			cached_state = NULL;
-			lock_extent_bits(tree, start, end,
-					 &cached_state);
+			clear_extent_bit(tree, cur, range_end, EXTENT_LOCKED |
+				 EXTENT_DELALLOC | EXTENT_UPTODATE |
+				 EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 1,
+				 delete_states, &cached_state);
 		}
-
-		start = end + 1;
-		if (start < page_end)
-			goto again;
+		cur = range_end + 1;
 	}
-
 	/*
-	 * Qgroup reserved space handler
-	 * Page here will be either
-	 * 1) Already written to disk or ordered extent already submitted
-	 *    Then its QGROUP_RESERVED bit in io_tree is already cleaned.
-	 *    Qgroup will be handled by its qgroup_record then.
-	 *    btrfs_qgroup_free_data() call will do nothing here.
-	 *
-	 * 2) Not written to disk yet
-	 *    Then btrfs_qgroup_free_data() call will clear the QGROUP_RESERVED
-	 *    bit of its io_tree, and free the qgroup reserved data space.
-	 *    Since the IO will never happen for this page.
+	 * We have iterated through all OEs of the page, the page should not
+	 * have Private2 anymore, or the above iteration has something wrong.
 	 */
-	btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
-	if (!inode_evicting) {
-		bool delete = true;
-
-		/*
-		 * If there's an ordered extent for this range and we have not
-		 * finished it ourselves, we must leave EXTENT_DELALLOC_NEW set
-		 * in the range for the ordered extent completion. We must also
-		 * not delete the range, otherwise we would lose that bit (and
-		 * any other bits set in the range). Make sure EXTENT_UPTODATE
-		 * is cleared if we don't delete, otherwise it can lead to
-		 * corruptions if the i_size is extented later.
-		 */
-		if (found_ordered && !completed_ordered)
-			delete = false;
-		clear_extent_bit(tree, page_start, page_end, EXTENT_LOCKED |
-				 EXTENT_DELALLOC | EXTENT_UPTODATE |
-				 EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 1,
-				 delete, &cached_state);
-
+	ASSERT(!PagePrivate2(page));
+	if (!inode_evicting)
 		__btrfs_releasepage(page, GFP_NOFS);
-	}
-
 	ClearPageChecked(page);
 	clear_page_extent_mapped(page);
 }
-- 
2.31.1


  parent reply	other threads:[~2021-04-27 23:04 UTC|newest]

Thread overview: 117+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-27 23:03 [Patch v2 00/42] btrfs: add data write support for subpage Qu Wenruo
2021-04-27 23:03 ` [Patch v2 01/42] btrfs: scrub: fix subpage scrub repair error caused by hardcoded PAGE_SIZE Qu Wenruo
2021-05-13 22:57   ` David Sterba
2021-05-13 23:32     ` Qu Wenruo
2021-04-27 23:03 ` [Patch v2 02/42] btrfs: make free space cache size consistent across different PAGE_SIZE Qu Wenruo
2021-04-27 23:03 ` [Patch v2 03/42] btrfs: remove the unused parameter @len for btrfs_bio_fits_in_stripe() Qu Wenruo
2021-05-13 22:58   ` David Sterba
2021-05-13 23:07   ` David Sterba
2021-04-27 23:03 ` [Patch v2 04/42] btrfs: allow btrfs_bio_fits_in_stripe() to accept bio without any page Qu Wenruo
2021-04-27 23:03 ` [Patch v2 05/42] btrfs: refactor submit_extent_page() to make bio and its flag tracing easier Qu Wenruo
2021-05-13 23:03   ` David Sterba
2021-05-21 11:06   ` Johannes Thumshirn
2021-05-21 11:26     ` Qu Wenruo
2021-05-21 13:30       ` David Sterba
2021-04-27 23:03 ` [Patch v2 06/42] btrfs: make subpage metadata write path to call its own endio functions Qu Wenruo
2021-04-27 23:03 ` [Patch v2 07/42] btrfs: pass btrfs_inode into btrfs_writepage_endio_finish_ordered() Qu Wenruo
2021-05-13 23:06   ` David Sterba
2021-05-13 23:35     ` Qu Wenruo
2021-05-21 14:27   ` Josef Bacik
2021-05-21 20:22     ` David Sterba
2021-05-22  0:24     ` Qu Wenruo
2021-05-23  7:40       ` Qu Wenruo
2021-05-23 13:43         ` Josef Bacik
2021-05-23 13:50           ` Qu Wenruo
2021-05-23 14:08             ` Josef Bacik
2021-04-27 23:03 ` [Patch v2 08/42] btrfs: make Private2 lifespan more consistent Qu Wenruo
2021-04-27 23:03 ` [Patch v2 09/42] btrfs: refactor how we finish ordered extent io for endio functions Qu Wenruo
2021-05-13 23:11   ` David Sterba
2021-04-27 23:03 ` [Patch v2 10/42] btrfs: update the comments in btrfs_invalidatepage() Qu Wenruo
2021-04-27 23:03 ` [Patch v2 11/42] btrfs: introduce btrfs_lookup_first_ordered_range() Qu Wenruo
2021-05-13 23:13   ` David Sterba
2021-04-27 23:03 ` Qu Wenruo [this message]
2021-04-27 23:03 ` [Patch v2 13/42] btrfs: rename PagePrivate2 to PageOrdered inside btrfs Qu Wenruo
2021-04-27 23:03 ` [Patch v2 14/42] btrfs: pass bytenr directly to __process_pages_contig() Qu Wenruo
2021-04-27 23:03 ` [Patch v2 15/42] btrfs: refactor the page status update into process_one_page() Qu Wenruo
2021-04-27 23:03 ` [Patch v2 16/42] btrfs: provide btrfs_page_clamp_*() helpers Qu Wenruo
2021-04-27 23:03 ` [Patch v2 17/42] btrfs: only require sector size alignment for end_bio_extent_writepage() Qu Wenruo
2021-04-27 23:03 ` [Patch v2 18/42] btrfs: make btrfs_dirty_pages() to be subpage compatible Qu Wenruo
2021-04-27 23:03 ` [Patch v2 19/42] btrfs: make __process_pages_contig() to handle subpage dirty/error/writeback status Qu Wenruo
2021-04-27 23:03 ` [Patch v2 20/42] btrfs: make end_bio_extent_writepage() to be subpage compatible Qu Wenruo
2021-04-27 23:03 ` [Patch v2 21/42] btrfs: make process_one_page() to handle subpage locking Qu Wenruo
2021-04-27 23:03 ` [Patch v2 22/42] btrfs: introduce helpers for subpage ordered status Qu Wenruo
2021-04-27 23:03 ` [Patch v2 23/42] btrfs: make page Ordered bit to be subpage compatible Qu Wenruo
2021-04-27 23:03 ` [Patch v2 24/42] btrfs: update locked page dirty/writeback/error bits in __process_pages_contig Qu Wenruo
2021-04-27 23:03 ` [Patch v2 25/42] btrfs: prevent extent_clear_unlock_delalloc() to unlock page not locked by __process_pages_contig() Qu Wenruo
2021-04-27 23:03 ` [Patch v2 26/42] btrfs: make btrfs_set_range_writeback() subpage compatible Qu Wenruo
2021-04-27 23:03 ` [Patch v2 27/42] btrfs: make __extent_writepage_io() only submit dirty range for subpage Qu Wenruo
2021-04-27 23:03 ` [Patch v2 28/42] btrfs: make btrfs_truncate_block() to be subpage compatible Qu Wenruo
2021-04-27 23:03 ` [Patch v2 29/42] btrfs: make btrfs_page_mkwrite() " Qu Wenruo
2021-04-27 23:03 ` [Patch v2 30/42] btrfs: reflink: make copy_inline_to_page() " Qu Wenruo
2021-04-27 23:03 ` [Patch v2 31/42] btrfs: fix the filemap_range_has_page() call in btrfs_punch_hole_lock_range() Qu Wenruo
2021-04-27 23:03 ` [Patch v2 32/42] btrfs: don't clear page extent mapped if we're not invalidating the full page Qu Wenruo
2021-04-27 23:03 ` [Patch v2 33/42] btrfs: extract relocation page read and dirty part into its own function Qu Wenruo
2021-04-27 23:03 ` [Patch v2 34/42] btrfs: make relocate_one_page() to handle subpage case Qu Wenruo
2021-04-27 23:03 ` [Patch v2 35/42] btrfs: fix wild subpage writeback which does not have ordered extent Qu Wenruo
2021-04-27 23:03 ` [Patch v2 36/42] btrfs: disable inline extent creation for subpage Qu Wenruo
2021-05-04  4:28   ` Qu Wenruo
2021-04-27 23:03 ` [Patch v2 37/42] btrfs: skip validation for subpage read repair Qu Wenruo
2021-04-27 23:03 ` [Patch v2 38/42] btrfs: allow submit_extent_page() to do bio split for subpage Qu Wenruo
2021-04-27 23:03 ` [Patch v2 39/42] btrfs: reject raid5/6 fs " Qu Wenruo
2021-04-28 14:22   ` Neal Gompa
2021-04-28 23:11     ` Qu Wenruo
2021-05-12 22:04       ` David Sterba
2021-04-27 23:03 ` [Patch v2 40/42] btrfs: fix a crash caused by race between prepare_pages() and btrfs_releasepage() Qu Wenruo
2021-04-28 10:56   ` Filipe Manana
2021-04-27 23:03 ` [Patch v2 41/42] btrfs: fix the use-after-free bug in writeback subpage helper Qu Wenruo
2021-05-06 23:46   ` Qu Wenruo
2021-05-07  4:57     ` Ritesh Harjani
2021-05-07  5:14       ` Qu Wenruo
2021-05-10  8:38         ` Qu Wenruo
2021-05-10 12:29           ` Ritesh Harjani
2021-05-10 13:10             ` Qu Wenruo
2021-05-11 10:48               ` Ritesh Harjani
2021-05-11 11:15                 ` Qu Wenruo
2021-05-12  1:49                   ` Qu Wenruo
2021-05-12  7:09                     ` Ritesh Harjani
2021-05-13 16:33                       ` Ritesh Harjani
2021-05-13 21:36                         ` Ritesh Harjani
2021-05-13 23:41                           ` Qu Wenruo
2021-05-14 15:08                             ` Ritesh Harjani
2021-05-14 17:53                               ` Ritesh Harjani
2021-05-14 22:22                                 ` Qu Wenruo
2021-05-15  9:59                                   ` Ritesh Harjani
2021-05-15 10:15                                     ` Qu Wenruo
2021-05-25  4:43                                       ` Ritesh Harjani
2021-05-25  5:52                                         ` Qu Wenruo
2021-05-25  6:14                                           ` Qu Wenruo
2021-05-25  9:23                                             ` Ritesh Harjani
2021-05-25  9:45                                               ` Qu Wenruo
2021-05-25  9:49                                                 ` Qu Wenruo
2021-05-25 10:20                                                   ` Ritesh Harjani
2021-05-25 11:41                                                     ` Qu Wenruo
2021-05-25 13:02                                                       ` Ritesh Harjani
2021-05-26  5:29                                                         ` Ritesh Harjani
2021-05-26  5:58                                                           ` Qu Wenruo
2021-05-26 13:45                                                             ` Ritesh Harjani
2021-05-28  8:26                                                               ` Qu Wenruo
2021-05-28  8:59                                                                 ` Ritesh Harjani
2021-05-28 10:25                                                                   ` Qu Wenruo
2021-05-30  1:50                                                                     ` Qu Wenruo
2021-04-27 23:03 ` [Patch v2 42/42] btrfs: allow read-write for 4K sectorsize on 64K page size systems Qu Wenruo
2021-05-12 22:18 ` [Patch v2 00/42] btrfs: add data write support for subpage David Sterba
2021-05-12 23:48   ` Qu Wenruo
2021-05-13  2:21     ` Qu Wenruo
2021-05-13 22:54       ` David Sterba
2021-05-14  1:41         ` Qu Wenruo
2021-05-14  2:26           ` riteshh
2021-05-14 10:28             ` riteshh
2021-05-14 11:28               ` David Sterba
2021-05-14 14:38                 ` riteshh
2021-05-14 11:30       ` David Sterba
2021-05-14 22:25         ` David Sterba
2021-05-14 22:45         ` Qu Wenruo
2021-05-14 23:05           ` David Sterba
2021-05-14 23:17             ` Qu Wenruo
2021-05-17 13:22               ` David Sterba
2021-05-17 23:20                 ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210427230349.369603-13-wqu@suse.com \
    --to=wqu@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.