From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 04/18] btrfs: make attach_extent_buffer_page() to handle subpage case
Date: Wed, 20 Jan 2021 08:27:17 +0800 [thread overview]
Message-ID: <a58c8366-f3b5-a152-92be-c7252891a7c6@gmx.com> (raw)
In-Reply-To: <5a6223fc-9937-3bd6-ecd0-d6c5939f59a7@toxicpanda.com>
On 2021/1/20 上午5:54, Josef Bacik wrote:
> On 1/16/21 2:15 AM, Qu Wenruo wrote:
>> For subpage case, we need to allocate new memory for each metadata page.
>>
>> So we need to:
>> - Allow attach_extent_buffer_page() to return int
>> To indicate allocation failure
>>
>> - Prealloc btrfs_subpage structure for alloc_extent_buffer()
>> We don't want to call memory allocation with spinlock hold, so
>> do preallocation before we acquire mapping->private_lock.
>>
>> - Handle subpage and regular case differently in
>> attach_extent_buffer_page()
>> For regular case, just do the usual thing.
>> For subpage case, allocate new memory or use the preallocated memory.
>>
>> For future subpage metadata, we will make more usage of radix tree to
>> grab extnet buffer.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/extent_io.c | 75 ++++++++++++++++++++++++++++++++++++++------
>> fs/btrfs/subpage.h | 17 ++++++++++
>> 2 files changed, 82 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index a816ba4a8537..320731487ac0 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -24,6 +24,7 @@
>> #include "rcu-string.h"
>> #include "backref.h"
>> #include "disk-io.h"
>> +#include "subpage.h"
>> static struct kmem_cache *extent_state_cache;
>> static struct kmem_cache *extent_buffer_cache;
>> @@ -3140,9 +3141,13 @@ static int submit_extent_page(unsigned int opf,
>> return ret;
>> }
>> -static void attach_extent_buffer_page(struct extent_buffer *eb,
>> - struct page *page)
>> +static int attach_extent_buffer_page(struct extent_buffer *eb,
>> + struct page *page,
>> + struct btrfs_subpage *prealloc)
>> {
>> + struct btrfs_fs_info *fs_info = eb->fs_info;
>> + int ret;
>
> int ret = 0;
>
>> +
>> /*
>> * If the page is mapped to btree inode, we should hold the private
>> * lock to prevent race.
>> @@ -3152,10 +3157,32 @@ static void attach_extent_buffer_page(struct
>> extent_buffer *eb,
>> if (page->mapping)
>> lockdep_assert_held(&page->mapping->private_lock);
>> - if (!PagePrivate(page))
>> - attach_page_private(page, eb);
>> - else
>> - WARN_ON(page->private != (unsigned long)eb);
>> + if (fs_info->sectorsize == PAGE_SIZE) {
>> + if (!PagePrivate(page))
>> + attach_page_private(page, eb);
>> + else
>> + WARN_ON(page->private != (unsigned long)eb);
>> + return 0;
>> + }
>> +
>> + /* Already mapped, just free prealloc */
>> + if (PagePrivate(page)) {
>> + kfree(prealloc);
>> + return 0;
>> + }
>> +
>> + if (prealloc) {
>> + /* Has preallocated memory for subpage */
>> + spin_lock_init(&prealloc->lock);
>> + attach_page_private(page, prealloc);
>> + } else {
>> + /* Do new allocation to attach subpage */
>> + ret = btrfs_attach_subpage(fs_info, page);
>> + if (ret < 0)
>> + return ret;
>
> Delete the above 2 lines.
>
>> + }
>> +
>> + return 0;
>
> return ret;
>
>> }
>> void set_page_extent_mapped(struct page *page)
>> @@ -5062,21 +5089,29 @@ struct extent_buffer
>> *btrfs_clone_extent_buffer(const struct extent_buffer *src)
>> if (new == NULL)
>> return NULL;
>> + set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags);
>> + set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags);
>> +
>
> Why are you doing this here? It seems unrelated? Looking at the code
> it appears there's a reason for this later, but I had to go look to make
> sure I wasn't crazy, so at the very least it needs to be done in a more
> relevant patch.
This is to handle case where we allocated a page but failed to allocate
subpage structure.
In that case, btrfs_release_extent_buffer() will go different routine to
free the eb.
Without UNMAPPED bit, it just go wrong without knowing it's a unmapped eb.
This change is mostly due to the extra failure pattern introduced by the
subpage memory allocation.
>
>> for (i = 0; i < num_pages; i++) {
>> + int ret;
>> +
>> p = alloc_page(GFP_NOFS);
>> if (!p) {
>> btrfs_release_extent_buffer(new);
>> return NULL;
>> }
>> - attach_extent_buffer_page(new, p);
>> + ret = attach_extent_buffer_page(new, p, NULL);
>> + if (ret < 0) {
>> + put_page(p);
>> + btrfs_release_extent_buffer(new);
>> + return NULL;
>> + }
>> WARN_ON(PageDirty(p));
>> SetPageUptodate(p);
>> new->pages[i] = p;
>> copy_page(page_address(p), page_address(src->pages[i]));
>> }
>> - set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags);
>> - set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags);
>> return new;
>> }
>> @@ -5308,12 +5343,28 @@ struct extent_buffer
>> *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
>> num_pages = num_extent_pages(eb);
>> for (i = 0; i < num_pages; i++, index++) {
>> + struct btrfs_subpage *prealloc = NULL;
>> +
>> p = find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL);
>> if (!p) {
>> exists = ERR_PTR(-ENOMEM);
>> goto free_eb;
>> }
>> + /*
>> + * Preallocate page->private for subpage case, so that
>> + * we won't allocate memory with private_lock hold.
>> + * The memory will be freed by attach_extent_buffer_page() or
>> + * freed manually if exit earlier.
>> + */
>> + ret = btrfs_alloc_subpage(fs_info, &prealloc);
>> + if (ret < 0) {
>> + unlock_page(p);
>> + put_page(p);
>> + exists = ERR_PTR(ret);
>> + goto free_eb;
>> + }
>> +
>
> I realize that for subpage sectorsize we'll only have 1 page, but I'd
> still rather see this outside of the for loop, just for clarity sake.
This is the trade-off.
Either we do every separately, sharing the minimal amount of code (and
need extra for loop for future 16K pages), or using the same loop
sacrifice a little readability.
Here I'd say sharing more code is not that a big deal.
>
>> spin_lock(&mapping->private_lock);
>> exists = grab_extent_buffer(p);
>> if (exists) {
>> @@ -5321,10 +5372,14 @@ struct extent_buffer
>> *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
>> unlock_page(p);
>> put_page(p);
>> mark_extent_buffer_accessed(exists, p);
>> + kfree(prealloc);
>> goto free_eb;
>> }
>> - attach_extent_buffer_page(eb, p);
>> + /* Should not fail, as we have preallocated the memory */
>> + ret = attach_extent_buffer_page(eb, p, prealloc);
>> + ASSERT(!ret);
>> spin_unlock(&mapping->private_lock);
>> +
>> WARN_ON(PageDirty(p));
>> eb->pages[i] = p;
>> if (!PageUptodate(p))
>> diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h
>> index 96f3b226913e..f701256dd1e2 100644
>> --- a/fs/btrfs/subpage.h
>> +++ b/fs/btrfs/subpage.h
>> @@ -23,8 +23,25 @@
>> struct btrfs_subpage {
>> /* Common members for both data and metadata pages */
>> spinlock_t lock;
>> + union {
>> + /* Structures only used by metadata */
>> + /* Structures only used by data */
>> + };
>> };
>> +/* For rare cases where we need to pre-allocate a btrfs_subpage
>> structure */
>> +static inline int btrfs_alloc_subpage(struct btrfs_fs_info *fs_info,
>> + struct btrfs_subpage **ret)
>> +{
>> + if (fs_info->sectorsize == PAGE_SIZE)
>> + return 0;
>> +
>> + *ret = kzalloc(sizeof(struct btrfs_subpage), GFP_NOFS);
>> + if (!*ret)
>> + return -ENOMEM;
>> + return 0;
>> +}
>
> We're allocating these for every metadata page, that deserves a
> dedicated kmem_cache. Thanks,
That makes sense, especially it will go both data and metadata for subpage.
Thanks,
Qu
>
> Josef
next prev parent reply other threads:[~2021-01-20 0:29 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-16 7:15 [PATCH v4 00/18] btrfs: add read-only support for subpage sector size Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 01/18] btrfs: update locked page dirty/writeback/error bits in __process_pages_contig() Qu Wenruo
2021-01-19 21:41 ` Josef Bacik
2021-01-21 6:32 ` Qu Wenruo
2021-01-21 6:51 ` Qu Wenruo
2021-01-23 19:13 ` David Sterba
2021-01-24 0:35 ` Qu Wenruo
2021-01-24 11:49 ` David Sterba
2021-01-16 7:15 ` [PATCH v4 02/18] btrfs: merge PAGE_CLEAR_DIRTY and PAGE_SET_WRITEBACK into PAGE_START_WRITEBACK Qu Wenruo
2021-01-19 21:43 ` Josef Bacik
2021-01-19 21:45 ` Josef Bacik
2021-01-16 7:15 ` [PATCH v4 03/18] btrfs: introduce the skeleton of btrfs_subpage structure Qu Wenruo
2021-01-18 22:46 ` David Sterba
2021-01-18 22:54 ` Qu Wenruo
2021-01-19 15:51 ` David Sterba
2021-01-19 16:06 ` David Sterba
2021-01-20 0:19 ` Qu Wenruo
2021-01-23 19:37 ` David Sterba
2021-01-24 0:24 ` Qu Wenruo
2021-01-18 23:01 ` David Sterba
2021-01-16 7:15 ` [PATCH v4 04/18] btrfs: make attach_extent_buffer_page() to handle subpage case Qu Wenruo
2021-01-18 22:51 ` David Sterba
2021-01-19 21:54 ` Josef Bacik
2021-01-19 22:35 ` David Sterba
2021-01-26 7:29 ` Qu Wenruo
2021-01-27 19:58 ` David Sterba
2021-01-20 0:27 ` Qu Wenruo [this message]
2021-01-20 14:22 ` Josef Bacik
2021-01-21 1:20 ` Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 05/18] btrfs: make grab_extent_buffer_from_page() " Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 06/18] btrfs: support subpage for extent buffer page release Qu Wenruo
2021-01-20 14:44 ` Josef Bacik
2021-01-21 0:45 ` Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 07/18] btrfs: attach private to dummy extent buffer pages Qu Wenruo
2021-01-20 14:48 ` Josef Bacik
2021-01-21 0:47 ` Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 08/18] btrfs: introduce helper for subpage uptodate status Qu Wenruo
2021-01-19 19:45 ` David Sterba
2021-01-20 14:55 ` Josef Bacik
2021-01-26 7:21 ` Qu Wenruo
2021-01-20 15:00 ` Josef Bacik
2021-01-21 0:49 ` Qu Wenruo
2021-01-21 1:28 ` Josef Bacik
2021-01-21 1:38 ` Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 09/18] btrfs: introduce helper for subpage error status Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 10/18] btrfs: make set/clear_extent_buffer_uptodate() to support subpage size Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 11/18] btrfs: make btrfs_clone_extent_buffer() to be subpage compatible Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 12/18] btrfs: implement try_release_extent_buffer() for subpage metadata support Qu Wenruo
2021-01-20 15:05 ` Josef Bacik
2021-01-21 0:51 ` Qu Wenruo
2021-01-23 20:36 ` David Sterba
2021-01-25 20:02 ` Josef Bacik
2021-01-16 7:15 ` [PATCH v4 13/18] btrfs: introduce read_extent_buffer_subpage() Qu Wenruo
2021-01-20 15:08 ` Josef Bacik
2021-01-16 7:15 ` [PATCH v4 14/18] btrfs: extent_io: make endio_readpage_update_page_status() to handle subpage case Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 15/18] btrfs: disk-io: introduce subpage metadata validation check Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 16/18] btrfs: introduce btrfs_subpage for data inodes Qu Wenruo
2021-01-19 20:48 ` David Sterba
2021-01-20 15:28 ` Josef Bacik
2021-01-26 7:05 ` Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 17/18] btrfs: integrate page status update for data read path into begin/end_page_read() Qu Wenruo
2021-01-20 15:41 ` Josef Bacik
2021-01-21 1:05 ` Qu Wenruo
2021-01-16 7:15 ` [PATCH v4 18/18] btrfs: allow RO mount of 4K sector size fs on 64K page system Qu Wenruo
2021-01-18 23:17 ` [PATCH v4 00/18] btrfs: add read-only support for subpage sector size David Sterba
2021-01-18 23:26 ` Qu Wenruo
2021-01-24 12:29 ` David Sterba
2021-01-25 1:19 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a58c8366-f3b5-a152-92be-c7252891a7c6@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).