From: Josef Bacik <josef@toxicpanda.com>
To: Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Cc: David Sterba <dsterba@suse.com>
Subject: Re: [PATCH v5 06/18] btrfs: support subpage for extent buffer page release
Date: Wed, 27 Jan 2021 11:21:08 -0500
Message-ID: <8522b133-9bdf-c130-1a3c-15114755c47a@toxicpanda.com>
In-Reply-To: <20210126083402.142577-7-wqu@suse.com>

On 1/26/21 3:33 AM, Qu Wenruo wrote:
> In btrfs_release_extent_buffer_pages(), we need to add extra handling
> for subpage.
> 
> Introduce a helper, detach_extent_buffer_page(), to do different
> handling for regular and subpage cases.
> 
> For subpage case, handle detaching page private.
> 
> For unmapped (dummy or cloned) ebs, we can detach the page private
> immediately as the page can only be attached to one unmapped eb.
> 
> For mapped ebs, we have to ensure there are no other ebs in the page
> range before we detach the page private, as page->private is shared
> between all ebs in the same page.
> 
> But there is a subpage specific race, where we can race with extent
> buffer allocation, and clear the page private while the new eb is still
> in use, like this:
> 
>    Extent buffer A is the new extent buffer which will be allocated,
>    while extent buffer B is the last existing extent buffer of the page.
> 
>    		T1 (eb A) 	 |		T2 (eb B)
>    -------------------------------+------------------------------
>    alloc_extent_buffer()		 | btrfs_release_extent_buffer_pages()
>    |- p = find_or_create_page()   | |
>    |- attach_extent_buffer_page() | |
>    |				 | |- detach_extent_buffer_page()
>    |				 |    |- if (!page_range_has_eb())
>    |				 |    |  No new eb in the page range yet
>    |				 |    |  As new eb A hasn't yet been
>    |				 |    |  inserted into radix tree.
>    |				 |    |- btrfs_detach_subpage()
>    |				 |       |- detach_page_private();
>    |- radix_tree_insert()	 |
> 
>    Then we have a metadata eb whose page has no private bit.
> 
> To avoid such race, we introduce a subpage metadata-specific member,
> btrfs_subpage::eb_refs.
> 
> In alloc_extent_buffer() we increase eb_refs in the critical section of
> private_lock.  Then page_range_has_eb() will return true for
> detach_extent_buffer_page(), and will not detach page private.
> 
> The section is marked by:
> 
> - btrfs_page_inc_eb_refs()
> - btrfs_page_dec_eb_refs()
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> Reviewed-by: David Sterba <dsterba@suse.com>
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---
>   fs/btrfs/extent_io.c | 94 +++++++++++++++++++++++++++++++++++++-------
>   fs/btrfs/subpage.c   | 42 ++++++++++++++++++++
>   fs/btrfs/subpage.h   | 13 +++++-
>   3 files changed, 133 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 16a29f63cfd1..118874926179 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4993,25 +4993,39 @@ int extent_buffer_under_io(const struct extent_buffer *eb)
>   		test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
>   }
>   
> -/*
> - * Release all pages attached to the extent buffer.
> - */
> -static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
> +static bool page_range_has_eb(struct btrfs_fs_info *fs_info, struct page *page)
>   {
> -	int i;
> -	int num_pages;
> -	int mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
> +	struct btrfs_subpage *subpage;
>   
> -	BUG_ON(extent_buffer_under_io(eb));
> +	lockdep_assert_held(&page->mapping->private_lock);
>   
> -	num_pages = num_extent_pages(eb);
> -	for (i = 0; i < num_pages; i++) {
> -		struct page *page = eb->pages[i];
> +	if (PagePrivate(page)) {
> +		subpage = (struct btrfs_subpage *)page->private;
> +		if (atomic_read(&subpage->eb_refs))
> +			return true;
> +	}
> +	return false;
> +}
>   
> -		if (!page)
> -			continue;
> +static void detach_extent_buffer_page(struct extent_buffer *eb, struct page *page)
> +{
> +	struct btrfs_fs_info *fs_info = eb->fs_info;
> +	const bool mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
> +
> +	/*
> +	 * For mapped eb, we're going to change the page private, which should
> +	 * be done under the private_lock.
> +	 */
> +	if (mapped)
> +		spin_lock(&page->mapping->private_lock);
> +
> +	if (!PagePrivate(page)) {
>   		if (mapped)
> -			spin_lock(&page->mapping->private_lock);
> +			spin_unlock(&page->mapping->private_lock);
> +		return;
> +	}
> +
> +	if (fs_info->sectorsize == PAGE_SIZE) {
>   		/*
>   		 * We do this since we'll remove the pages after we've
>   		 * removed the eb from the radix tree, so we could race
> @@ -5030,9 +5044,49 @@ static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
>   			 */
>   			detach_page_private(page);
>   		}
> -
>   		if (mapped)
>   			spin_unlock(&page->mapping->private_lock);
> +		return;
> +	}
> +
> +	/*
> +	 * For subpage, we can have dummy eb with page private.  In this case,
> +	 * we can directly detach the private as such page is only attached to
> +	 * one dummy eb, no sharing.
> +	 */
> +	if (!mapped) {
> +		btrfs_detach_subpage(fs_info, page);
> +		return;
> +	}
> +
> +	btrfs_page_dec_eb_refs(fs_info, page);
> +
> +	/*
> +	 * We can only detach the page private if there are no other ebs in the
> +	 * page range.
> +	 */
> +	if (!page_range_has_eb(fs_info, page))
> +		btrfs_detach_subpage(fs_info, page);
> +
> +	spin_unlock(&page->mapping->private_lock);
> +}
> +
> +/* Release all pages attached to the extent buffer */
> +static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
> +{
> +	int i;
> +	int num_pages;
> +
> +	ASSERT(!extent_buffer_under_io(eb));
> +
> +	num_pages = num_extent_pages(eb);
> +	for (i = 0; i < num_pages; i++) {
> +		struct page *page = eb->pages[i];
> +
> +		if (!page)
> +			continue;
> +
> +		detach_extent_buffer_page(eb, page);
>   
>   		/* One for when we allocated the page */
>   		put_page(page);
> @@ -5392,6 +5446,16 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
>   		/* Should not fail, as we have preallocated the memory */
>   		ret = attach_extent_buffer_page(eb, p, prealloc);
>   		ASSERT(!ret);
> +		/*
> +		 * To inform we have extra eb under allocation, so that
> +		 * detach_extent_buffer_page() won't release the page private
> +		 * when the eb hasn't yet been inserted into radix tree.
> +		 *
> +		 * The ref will be decreased when the eb released the page, in
> +		 * detach_extent_buffer_page().
> +		 * Thus needs no special handling in error path.
> +		 */
> +		btrfs_page_inc_eb_refs(fs_info, p);
>   		spin_unlock(&mapping->private_lock);
>   
>   		WARN_ON(PageDirty(p));
> diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
> index 61b28dfca20c..a2a21fa0ea35 100644
> --- a/fs/btrfs/subpage.c
> +++ b/fs/btrfs/subpage.c
> @@ -52,6 +52,8 @@ int btrfs_alloc_subpage(const struct btrfs_fs_info *fs_info,
>   	if (!*ret)
>   		return -ENOMEM;
>   	spin_lock_init(&(*ret)->lock);
> +	if (type == BTRFS_SUBPAGE_METADATA)
> +		atomic_set(&(*ret)->eb_refs, 0);
>   	return 0;
>   }
>   
> @@ -59,3 +61,43 @@ void btrfs_free_subpage(struct btrfs_subpage *subpage)
>   {
>   	kfree(subpage);
>   }
> +
> +/*
> + * Increase the eb_refs of current subpage.
> + *
> + * This is important for eb allocation, to prevent race with last eb freeing
> + * of the same page.
> + * With the eb_refs increased before the eb inserted into radix tree,
> + * detach_extent_buffer_page() won't detach the page private while we're still
> + * allocating the extent buffer.
> + */
> +void btrfs_page_inc_eb_refs(const struct btrfs_fs_info *fs_info,
> +			    struct page *page)
> +{
> +	struct btrfs_subpage *subpage;
> +
> +	if (fs_info->sectorsize == PAGE_SIZE)
> +		return;
> +
> +	ASSERT(PagePrivate(page) && page->mapping);
> +	lockdep_assert_held(&page->mapping->private_lock);
> +
> +	subpage = (struct btrfs_subpage *)page->private;
> +	atomic_inc(&subpage->eb_refs);
> +}
> +
> +void btrfs_page_dec_eb_refs(const struct btrfs_fs_info *fs_info,
> +			    struct page *page)
> +{
> +	struct btrfs_subpage *subpage;
> +
> +	if (fs_info->sectorsize == PAGE_SIZE)
> +		return;
> +
> +	ASSERT(PagePrivate(page) && page->mapping);
> +	lockdep_assert_held(&page->mapping->private_lock);
> +
> +	subpage = (struct btrfs_subpage *)page->private;
> +	ASSERT(atomic_read(&subpage->eb_refs));
> +	atomic_dec(&subpage->eb_refs);
> +}
> diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h
> index 7ba544bcc9c6..eef2ecae77e0 100644
> --- a/fs/btrfs/subpage.h
> +++ b/fs/btrfs/subpage.h
> @@ -4,6 +4,7 @@
>   #define BTRFS_SUBPAGE_H
>   
>   #include <linux/spinlock.h>
> +#include <linux/refcount.h>

I made this comment elsewhere, but the patch finally showed up in my email after
I refreshed (???? thunderbird wtf??).  Anyway, you include refcount.h here but
don't actually use refcount_t; eb_refs is a plain atomic_t.  Please use
refcount_t, so we get the benefit of the debugging from the helpers.  Thanks,

Josef
