linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Josef Bacik <josef@toxicpanda.com>,
	Qu Wenruo <quwenruo.btrfs@gmx.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 04/18] btrfs: make attach_extent_buffer_page() to handle subpage case
Date: Thu, 21 Jan 2021 09:20:33 +0800	[thread overview]
Message-ID: <05a4af18-cc90-6695-6e92-6fba9d8aa44d@suse.com> (raw)
In-Reply-To: <8a1f085c-4701-c32f-37e6-72c99fa2074c@toxicpanda.com>



On 2021/1/20 下午10:22, Josef Bacik wrote:
> On 1/19/21 7:27 PM, Qu Wenruo wrote:
>>
>>
>> On 2021/1/20 上午5:54, Josef Bacik wrote:
>>> On 1/16/21 2:15 AM, Qu Wenruo wrote:
>>>> For subpage case, we need to allocate new memory for each metadata 
>>>> page.
>>>>
>>>> So we need to:
>>>> - Allow attach_extent_buffer_page() to return int
>>>>    To indicate allocation failure
>>>>
>>>> - Prealloc btrfs_subpage structure for alloc_extent_buffer()
>>>>    We don't want to call memory allocation with spinlock hold, so
>>>>    do preallocation before we acquire mapping->private_lock.
>>>>
>>>> - Handle subpage and regular case differently in
>>>>    attach_extent_buffer_page()
>>>>    For regular case, just do the usual thing.
>>>>    For subpage case, allocate new memory or use the preallocated 
>>>> memory.
>>>>
>>>> For future subpage metadata, we will make more usage of radix tree to
>>>> grab extnet buffer.
>>>>
>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>> ---
>>>>   fs/btrfs/extent_io.c | 75 
>>>> ++++++++++++++++++++++++++++++++++++++------
>>>>   fs/btrfs/subpage.h   | 17 ++++++++++
>>>>   2 files changed, 82 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>>>> index a816ba4a8537..320731487ac0 100644
>>>> --- a/fs/btrfs/extent_io.c
>>>> +++ b/fs/btrfs/extent_io.c
>>>> @@ -24,6 +24,7 @@
>>>>   #include "rcu-string.h"
>>>>   #include "backref.h"
>>>>   #include "disk-io.h"
>>>> +#include "subpage.h"
>>>>   static struct kmem_cache *extent_state_cache;
>>>>   static struct kmem_cache *extent_buffer_cache;
>>>> @@ -3140,9 +3141,13 @@ static int submit_extent_page(unsigned int opf,
>>>>       return ret;
>>>>   }
>>>> -static void attach_extent_buffer_page(struct extent_buffer *eb,
>>>> -                      struct page *page)
>>>> +static int attach_extent_buffer_page(struct extent_buffer *eb,
>>>> +                      struct page *page,
>>>> +                      struct btrfs_subpage *prealloc)
>>>>   {
>>>> +    struct btrfs_fs_info *fs_info = eb->fs_info;
>>>> +    int ret;
>>>
>>> int ret = 0;
>>>
>>>> +
>>>>       /*
>>>>        * If the page is mapped to btree inode, we should hold the 
>>>> private
>>>>        * lock to prevent race.
>>>> @@ -3152,10 +3157,32 @@ static void attach_extent_buffer_page(struct 
>>>> extent_buffer *eb,
>>>>       if (page->mapping)
>>>>           lockdep_assert_held(&page->mapping->private_lock);
>>>> -    if (!PagePrivate(page))
>>>> -        attach_page_private(page, eb);
>>>> -    else
>>>> -        WARN_ON(page->private != (unsigned long)eb);
>>>> +    if (fs_info->sectorsize == PAGE_SIZE) {
>>>> +        if (!PagePrivate(page))
>>>> +            attach_page_private(page, eb);
>>>> +        else
>>>> +            WARN_ON(page->private != (unsigned long)eb);
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    /* Already mapped, just free prealloc */
>>>> +    if (PagePrivate(page)) {
>>>> +        kfree(prealloc);
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    if (prealloc) {
>>>> +        /* Has preallocated memory for subpage */
>>>> +        spin_lock_init(&prealloc->lock);
>>>> +        attach_page_private(page, prealloc);
>>>> +    } else {
>>>> +        /* Do new allocation to attach subpage */
>>>> +        ret = btrfs_attach_subpage(fs_info, page);
>>>> +        if (ret < 0)
>>>> +            return ret;
>>>
>>> Delete the above 2 lines.
>>>
>>>> +    }
>>>> +
>>>> +    return 0;
>>>
>>> return ret;
>>>
>>>>   }
>>>>   void set_page_extent_mapped(struct page *page)
>>>> @@ -5062,21 +5089,29 @@ struct extent_buffer 
>>>> *btrfs_clone_extent_buffer(const struct extent_buffer *src)
>>>>       if (new == NULL)
>>>>           return NULL;
>>>> +    set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags);
>>>> +    set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags);
>>>> +
>>>
>>> Why are you doing this here?  It seems unrelated?  Looking at the 
>>> code it appears there's a reason for this later, but I had to go look 
>>> to make sure I wasn't crazy, so at the very least it needs to be done 
>>> in a more relevant patch.
>>
>> This is to handle case where we allocated a page but failed to 
>> allocate subpage structure.
>>
>> In that case, btrfs_release_extent_buffer() will go different routine 
>> to free the eb.
>>
>> Without UNMAPPED bit, it just go wrong without knowing it's a unmapped 
>> eb.
>>
>> This change is mostly due to the extra failure pattern introduced by 
>> the subpage memory allocation.
>>
> 
> Yes, but my point is it's unrelated to this change, and in fact the 
> problem exists outside of your changes, so it needs to be addressed in 
> its own patch with its own changelog.

OK, that makes sense.

But it needs be determined after determining how to handle dummy extent 
buffer first.
> 
>>>
>>>>       for (i = 0; i < num_pages; i++) {
>>>> +        int ret;
>>>> +
>>>>           p = alloc_page(GFP_NOFS);
>>>>           if (!p) {
>>>>               btrfs_release_extent_buffer(new);
>>>>               return NULL;
>>>>           }
>>>> -        attach_extent_buffer_page(new, p);
>>>> +        ret = attach_extent_buffer_page(new, p, NULL);
>>>> +        if (ret < 0) {
>>>> +            put_page(p);
>>>> +            btrfs_release_extent_buffer(new);
>>>> +            return NULL;
>>>> +        }
>>>>           WARN_ON(PageDirty(p));
>>>>           SetPageUptodate(p);
>>>>           new->pages[i] = p;
>>>>           copy_page(page_address(p), page_address(src->pages[i]));
>>>>       }
>>>> -    set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags);
>>>> -    set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags);
>>>>       return new;
>>>>   }
>>>> @@ -5308,12 +5343,28 @@ struct extent_buffer 
>>>> *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
>>>>       num_pages = num_extent_pages(eb);
>>>>       for (i = 0; i < num_pages; i++, index++) {
>>>> +        struct btrfs_subpage *prealloc = NULL;
>>>> +
>>>>           p = find_or_create_page(mapping, index, 
>>>> GFP_NOFS|__GFP_NOFAIL);
>>>>           if (!p) {
>>>>               exists = ERR_PTR(-ENOMEM);
>>>>               goto free_eb;
>>>>           }
>>>> +        /*
>>>> +         * Preallocate page->private for subpage case, so that
>>>> +         * we won't allocate memory with private_lock hold.
>>>> +         * The memory will be freed by attach_extent_buffer_page() or
>>>> +         * freed manually if exit earlier.
>>>> +         */
>>>> +        ret = btrfs_alloc_subpage(fs_info, &prealloc);
>>>> +        if (ret < 0) {
>>>> +            unlock_page(p);
>>>> +            put_page(p);
>>>> +            exists = ERR_PTR(ret);
>>>> +            goto free_eb;
>>>> +        }
>>>> +
>>>
>>> I realize that for subpage sectorsize we'll only have 1 page, but I'd 
>>> still rather see this outside of the for loop, just for clarity sake.
>>
>> This is the trade-off.
>> Either we do every separately, sharing the minimal amount of code (and 
>> need extra for loop for future 16K pages), or using the same loop 
>> sacrifice a little readability.
>>
>> Here I'd say sharing more code is not that a big deal.
>>
> 
> It's not a tradeoff, it's confusing.  What I'm suggesting is you do
> 
> ret = btrfs_alloc_subpage(fs_info, &prealloc);
> if (ret) {
>      exists = ERR_PTR(ret);
>      goto free_eb;
> }
> for (i = 0; i < num_pages; i++, index++) {
> }

This means for later 16K page support, we still need to move 
btrfs_alloc_subpage() into the loop.

But I totally understand your point here.

I'd put a comment here explaining why we can just allocate one subpage 
structure here.

Thanks,
Qu

> 
> free_eb:
>      kmem_cache_free(prealloc);
> 
> The subpage portion is part of the eb itself, and there's one per eb, 
> and thus should be pre-allocated outside of the loop that is doing the 
> page lookup, as it's logically a different thing.  Thanks,
> 
> Josef
> 


  reply	other threads:[~2021-01-21  1:25 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-16  7:15 [PATCH v4 00/18] btrfs: add read-only support for subpage sector size Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 01/18] btrfs: update locked page dirty/writeback/error bits in __process_pages_contig() Qu Wenruo
2021-01-19 21:41   ` Josef Bacik
2021-01-21  6:32     ` Qu Wenruo
2021-01-21  6:51       ` Qu Wenruo
2021-01-23 19:13         ` David Sterba
2021-01-24  0:35           ` Qu Wenruo
2021-01-24 11:49             ` David Sterba
2021-01-16  7:15 ` [PATCH v4 02/18] btrfs: merge PAGE_CLEAR_DIRTY and PAGE_SET_WRITEBACK into PAGE_START_WRITEBACK Qu Wenruo
2021-01-19 21:43   ` Josef Bacik
2021-01-19 21:45   ` Josef Bacik
2021-01-16  7:15 ` [PATCH v4 03/18] btrfs: introduce the skeleton of btrfs_subpage structure Qu Wenruo
2021-01-18 22:46   ` David Sterba
2021-01-18 22:54     ` Qu Wenruo
2021-01-19 15:51       ` David Sterba
2021-01-19 16:06         ` David Sterba
2021-01-20  0:19           ` Qu Wenruo
2021-01-23 19:37             ` David Sterba
2021-01-24  0:24               ` Qu Wenruo
2021-01-18 23:01   ` David Sterba
2021-01-16  7:15 ` [PATCH v4 04/18] btrfs: make attach_extent_buffer_page() to handle subpage case Qu Wenruo
2021-01-18 22:51   ` David Sterba
2021-01-19 21:54   ` Josef Bacik
2021-01-19 22:35     ` David Sterba
2021-01-26  7:29       ` Qu Wenruo
2021-01-27 19:58         ` David Sterba
2021-01-20  0:27     ` Qu Wenruo
2021-01-20 14:22       ` Josef Bacik
2021-01-21  1:20         ` Qu Wenruo [this message]
2021-01-16  7:15 ` [PATCH v4 05/18] btrfs: make grab_extent_buffer_from_page() " Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 06/18] btrfs: support subpage for extent buffer page release Qu Wenruo
2021-01-20 14:44   ` Josef Bacik
2021-01-21  0:45     ` Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 07/18] btrfs: attach private to dummy extent buffer pages Qu Wenruo
2021-01-20 14:48   ` Josef Bacik
2021-01-21  0:47     ` Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 08/18] btrfs: introduce helper for subpage uptodate status Qu Wenruo
2021-01-19 19:45   ` David Sterba
2021-01-20 14:55   ` Josef Bacik
2021-01-26  7:21     ` Qu Wenruo
2021-01-20 15:00   ` Josef Bacik
2021-01-21  0:49     ` Qu Wenruo
2021-01-21  1:28       ` Josef Bacik
2021-01-21  1:38         ` Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 09/18] btrfs: introduce helper for subpage error status Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 10/18] btrfs: make set/clear_extent_buffer_uptodate() to support subpage size Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 11/18] btrfs: make btrfs_clone_extent_buffer() to be subpage compatible Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 12/18] btrfs: implement try_release_extent_buffer() for subpage metadata support Qu Wenruo
2021-01-20 15:05   ` Josef Bacik
2021-01-21  0:51     ` Qu Wenruo
2021-01-23 20:36     ` David Sterba
2021-01-25 20:02       ` Josef Bacik
2021-01-16  7:15 ` [PATCH v4 13/18] btrfs: introduce read_extent_buffer_subpage() Qu Wenruo
2021-01-20 15:08   ` Josef Bacik
2021-01-16  7:15 ` [PATCH v4 14/18] btrfs: extent_io: make endio_readpage_update_page_status() to handle subpage case Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 15/18] btrfs: disk-io: introduce subpage metadata validation check Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 16/18] btrfs: introduce btrfs_subpage for data inodes Qu Wenruo
2021-01-19 20:48   ` David Sterba
2021-01-20 15:28   ` Josef Bacik
2021-01-26  7:05     ` Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 17/18] btrfs: integrate page status update for data read path into begin/end_page_read() Qu Wenruo
2021-01-20 15:41   ` Josef Bacik
2021-01-21  1:05     ` Qu Wenruo
2021-01-16  7:15 ` [PATCH v4 18/18] btrfs: allow RO mount of 4K sector size fs on 64K page system Qu Wenruo
2021-01-18 23:17 ` [PATCH v4 00/18] btrfs: add read-only support for subpage sector size David Sterba
2021-01-18 23:26   ` Qu Wenruo
2021-01-24 12:29     ` David Sterba
2021-01-25  1:19       ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=05a4af18-cc90-6695-6e92-6fba9d8aa44d@suse.com \
    --to=wqu@suse.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).