From: Anand Jain <anand.jain@oracle.com>
To: Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v3 04/13] btrfs: refactor how we iterate ordered extent in btrfs_invalidatepage()
Date: Fri, 2 Apr 2021 09:15:50 +0800
Message-ID: <09d9c11c-02ff-8109-5244-021fac5c3626@oracle.com>
In-Reply-To: <20210325071445.90896-5-wqu@suse.com>

On 25/03/2021 15:14, Qu Wenruo wrote:
> In btrfs_invalidatepage(), we need to iterate through all ordered
> extents and finish them.
> 
> This involved a loop to exhaust all ordered extents, but that loop is
> implemented using again: label and goto.
> 
> Refactor the code by:
> - Use a while() loop

Just an observation.
At a minimum, the while loop does two iterations before breaking,
whereas the label-and-goto version can finish without reaching the
goto at all for the same value of %length. So the label-and-goto
approach is still faster.
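
For reference, a pseudo-C sketch of the two control-flow shapes being
compared, distilled from the diff below (lookup(), process() and
ordered_end() are hypothetical stand-ins for the real calls, with
ordered_end() returning the exclusive end of the ordered extent):

	/* Old shape: label + goto */
	start = page_start;
again:
	ordered = lookup(start, page_end - start + 1);
	if (ordered) {
		process(ordered);
		start = min(page_end, ordered_end(ordered) - 1) + 1;
		if (start < page_end)
			goto again;
	}

	/*
	 * New shape: while loop, which exits either on the range check
	 * or on a lookup that returns NULL.
	 */
	cur = page_start;
	while (cur < page_end) {
		ordered = lookup(cur, page_end - cur + 1);
		if (!ordered)
			break;
		process(ordered);
		cur = ordered_end(ordered);
	}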

A question below.

> - Extract the code to finish/dec an ordered extent into its own function
>    The new function, invalidate_ordered_extent(), will handle the
>    extent locking, extent bit update, and to finish/dec ordered extent.
> 
> In fact, for regular sectorsize == PAGE_SIZE case, there can only be at
> most one ordered extent for one page, thus the code is from ancient
> subpage preparation patchset.
> 
> But there is a bug hidden inside the ordered extent finish/dec part.
> 
> This patch will remove the ability to handle multiple ordered extent,
> and add extra ASSERT() to make sure for regular sectorsize we won't have
> anything wrong.
> 
> For the proper subpage support, it will be added in later patches.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>   fs/btrfs/inode.c | 122 +++++++++++++++++++++++++++++------------------
>   1 file changed, 75 insertions(+), 47 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index d777f67d366b..99dcadd31870 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -8355,17 +8355,72 @@ static int btrfs_migratepage(struct address_space *mapping,
>   }
>   #endif
>   
> +/*
> + * Helper to finish/dec one ordered extent for btrfs_invalidatepage().
> + *
> + * Return true if the ordered extent is finished.
> + * Return false otherwise
> + */
> +static bool invalidate_ordered_extent(struct btrfs_inode *inode,
> +				      struct btrfs_ordered_extent *ordered,
> +				      struct page *page,
> +				      struct extent_state **cached_state,
> +				      bool inode_evicting)
> +{
> +	u64 start = page_offset(page);
> +	u64 end = page_offset(page) + PAGE_SIZE - 1;
> +	u32 len = PAGE_SIZE;
> +	bool completed_ordered = false;
> +
> +	/*
> +	 * For regular sectorsize == PAGE_SIZE, if the ordered extent covers
> +	 * the page, then it must cover the full page.
> +	 */
> +	ASSERT(ordered->file_offset <= start &&
> +	       ordered->file_offset + ordered->num_bytes > end);
> +	/*
> +	 * IO on this page will never be started, so we need to account
> +	 * for any ordered extents now. Don't clear EXTENT_DELALLOC_NEW
> +	 * here, must leave that up for the ordered extent completion.
> +	 */
> +	if (!inode_evicting)
> +		clear_extent_bit(&inode->io_tree, start, end,
> +				 EXTENT_DELALLOC | EXTENT_LOCKED |
> +				 EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 1, 0,
> +				 cached_state);
> +	/*
> +	 * Whoever cleared the private bit is responsible for the
> +	 * finish_ordered_io
> +	 */
> +	if (TestClearPagePrivate2(page)) {
> +		spin_lock_irq(&inode->ordered_tree.lock);
> +		set_bit(BTRFS_ORDERED_TRUNCATED, &ordered->flags);
> +		ordered->truncated_len = min(ordered->truncated_len,
> +					     start - ordered->file_offset);
> +		spin_unlock_irq(&inode->ordered_tree.lock);
> +
> +		if (btrfs_dec_test_ordered_pending(inode, &ordered, start, len, 1)) {
> +			btrfs_finish_ordered_io(ordered);
> +			completed_ordered = true;
> +		}
> +	}
> +	btrfs_put_ordered_extent(ordered);
> +	if (!inode_evicting) {
> +		*cached_state = NULL;
> +		lock_extent_bits(&inode->io_tree, start, end, cached_state);
> +	}
> +	return completed_ordered;
> +}
> +
>   static void btrfs_invalidatepage(struct page *page, unsigned int offset,
>   				 unsigned int length)
>   {
>   	struct btrfs_inode *inode = BTRFS_I(page->mapping->host);
>   	struct extent_io_tree *tree = &inode->io_tree;
> -	struct btrfs_ordered_extent *ordered;
>   	struct extent_state *cached_state = NULL;
>   	u64 page_start = page_offset(page);
>   	u64 page_end = page_start + PAGE_SIZE - 1;
> -	u64 start;
> -	u64 end;
> +	u64 cur;
>   	int inode_evicting = inode->vfs_inode.i_state & I_FREEING;
>   	bool found_ordered = false;
>   	bool completed_ordered = false;
> @@ -8387,51 +8442,24 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
>   	if (!inode_evicting)
>   		lock_extent_bits(tree, page_start, page_end, &cached_state);
>   
> -	start = page_start;
> -again:
> -	ordered = btrfs_lookup_ordered_range(inode, start, page_end - start + 1);
> -	if (ordered) {
> -		found_ordered = true;
> -		end = min(page_end,
> -			  ordered->file_offset + ordered->num_bytes - 1);
> -		/*
> -		 * IO on this page will never be started, so we need to account
> -		 * for any ordered extents now. Don't clear EXTENT_DELALLOC_NEW
> -		 * here, must leave that up for the ordered extent completion.
> -		 */
> -		if (!inode_evicting)
> -			clear_extent_bit(tree, start, end,
> -					 EXTENT_DELALLOC |
> -					 EXTENT_LOCKED | EXTENT_DO_ACCOUNTING |
> -					 EXTENT_DEFRAG, 1, 0, &cached_state);
> -		/*
> -		 * whoever cleared the private bit is responsible
> -		 * for the finish_ordered_io
> -		 */
> -		if (TestClearPagePrivate2(page)) {
> -			spin_lock_irq(&inode->ordered_tree.lock);
> -			set_bit(BTRFS_ORDERED_TRUNCATED, &ordered->flags);
> -			ordered->truncated_len = min(ordered->truncated_len,
> -					start - ordered->file_offset);
> -			spin_unlock_irq(&inode->ordered_tree.lock);
> -
> -			if (btrfs_dec_test_ordered_pending(inode, &ordered,
> -							   start,
> -							   end - start + 1, 1)) {
> -				btrfs_finish_ordered_io(ordered);
> -				completed_ordered = true;
> -			}
> -		}
> -		btrfs_put_ordered_extent(ordered);
> -		if (!inode_evicting) {
> -			cached_state = NULL;
> -			lock_extent_bits(tree, start, end,
> -					 &cached_state);
> -		}
> +	cur = page_start;
> +	/* Iterate through all the ordered extents covering the page */
> +	while (cur < page_end) {
> +		struct btrfs_ordered_extent *ordered;
>   
> -		start = end + 1;
> -		if (start < page_end)
> -			goto again;

> +		ordered = btrfs_lookup_ordered_range(inode, cur,
> +				page_end - cur + 1);


  This part is confusing to me, and I hope you can clarify.
  btrfs_lookup_ordered_range() also does

                node = tree_search(tree, file_offset + len);

  so the second argument to tree_search() ends up being %page_end + 1
  here.

  Wouldn't that end up calling invalidate_ordered_extent() for a range
  beyond %offset + %length?
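
  To make the arithmetic concrete, a tiny userspace sketch with purely
  illustrative numbers (a hypothetical 4K page at file offset 0):

#include <stdio.h>

int main(void)
{
	/* Illustrative values only: a 4K page starting at file offset 0. */
	unsigned long long page_start = 0;
	unsigned long long page_end = page_start + 4096 - 1;	/* 4095 */
	unsigned long long cur = page_start;
	unsigned long long len = page_end - cur + 1;		/* 4096 */

	/* This is the offset tree_search() would see inside the lookup. */
	printf("file_offset + len = %llu (page_end + 1 = %llu)\n",
	       cur + len, page_end + 1);
	return 0;
}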

Thanks, Anand


> +		if (ordered) {
> +			cur = ordered->file_offset + ordered->num_bytes;
> +
> +			found_ordered = true;
> +			completed_ordered = invalidate_ordered_extent(inode,
> +					ordered, page, &cached_state,
> +					inode_evicting);
> +		} else {
> +			/* Exhausted all ordered extents */
> +			break;
> +		}
>   	}
>   
>   	/*
> 

