linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 17/18] btrfs: integrate page status update for data read path into begin/end_page_read()
Date: Thu, 21 Jan 2021 09:05:57 +0800
Message-ID: <490e331e-a586-3e7a-db59-b360a8118a50@gmx.com>
In-Reply-To: <b5e21d24-d2db-dc0f-bd96-1cbcad1a634e@toxicpanda.com>



On 2021/1/20 11:41 PM, Josef Bacik wrote:
> On 1/16/21 2:15 AM, Qu Wenruo wrote:
>> In btrfs data page read path, the page status update is handled in two
>> different locations:
>>
>>    btrfs_do_readpage()
>>    {
>>     while (cur <= end) {
>>         /* No need to read from disk */
>>         if (HOLE/PREALLOC/INLINE){
>>             memset();
>>             set_extent_uptodate();
>>             continue;
>>         }
>>         /* Read from disk */
>>         ret = submit_extent_page(end_bio_extent_readpage);
>>     }
>>    }
>>
>>    end_bio_extent_readpage()
>>    {
>>     endio_readpage_update_page_status();
>>    }
>>
>> This is fine for sectorsize == PAGE_SIZE case, as for above loop we
>> should only hit one branch and then exit.
>>
>> But for subpage, there is more work to be done in page status update:
>> - Page Unlock condition
>>    Unlike regular page size == sectorsize case, we can no longer just
>>    unlock a page unconditionally.
>>    Only the last reader of the page can unlock the page.
>>    This means, we can unlock the page either in the while() loop, or in
>>    the endio function.
>>
>> - Page uptodate condition
>>    Since we have multiple sectors to read for a page, we can only mark
>>    the full page uptodate if all sectors are uptodate.
>>
>> To handle both subpage and regular cases, introduce a pair of functions
>> to help handling page status update:
>>
>> - begin_page_read()
>>    For regular case, it does nothing.
>>    For subpage case, it updates the reader counters so that later
>>    end_page_read() can know who is the last one to unlock the page.
>>
>> - end_page_read()
>>    This is just endio_readpage_update_page_status() renamed.
>>    The original name is a little too long and too specific for endio.
>>
>>    The only new trick added is the condition for page unlock.
>>    Now for subpage data, we unlock the page if we're the last reader.
>>
>> This not only provides the basis for subpage data read, but also
>> hides the special handling of page read from the main read loop.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/extent_io.c | 38 +++++++++++++++++++----------
>>   fs/btrfs/subpage.h   | 57 +++++++++++++++++++++++++++++++++++---------
>>   2 files changed, 72 insertions(+), 23 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 4bce03fed205..6ae820144ec7 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -2839,8 +2839,17 @@ static void endio_readpage_release_extent(struct processed_extent *processed,
>>       processed->uptodate = uptodate;
>>   }
>> -static void endio_readpage_update_page_status(struct page *page, bool uptodate,
>> -                          u64 start, u32 len)
>> +static void begin_data_page_read(struct btrfs_fs_info *fs_info, struct page *page)
>> +{
>> +    ASSERT(PageLocked(page));
>> +    if (fs_info->sectorsize == PAGE_SIZE)
>> +        return;
>> +
>> +    ASSERT(PagePrivate(page));
>> +    btrfs_subpage_start_reader(fs_info, page, page_offset(page), PAGE_SIZE);
>> +}
>> +
>> +static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len)
>>   {
>>       struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
>> @@ -2856,7 +2865,12 @@ static void endio_readpage_update_page_status(struct page *page, bool uptodate,
>>       if (fs_info->sectorsize == PAGE_SIZE)
>>           unlock_page(page);
>> -    /* Subpage locking will be handled in later patches */
>> +    else if (is_data_inode(page->mapping->host))
>> +        /*
>> +         * For subpage data, unlock the page if we're the last reader.
>> +         * For subpage metadata, page lock is not utilized for read.
>> +         */
>> +        btrfs_subpage_end_reader(fs_info, page, start, len);
>>   }
>>   /*
>> @@ -2993,7 +3007,7 @@ static void end_bio_extent_readpage(struct bio *bio)
>>           bio_offset += len;
>>           /* Update page status and unlock */
>> -        endio_readpage_update_page_status(page, uptodate, start, len);
>> +        end_page_read(page, uptodate, start, len);
>>           endio_readpage_release_extent(&processed, BTRFS_I(inode),
>>                             start, end, uptodate);
>>       }
>> @@ -3267,6 +3281,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
>>                 unsigned int read_flags, u64 *prev_em_start)
>>   {
>>       struct inode *inode = page->mapping->host;
>> +    struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>       u64 start = page_offset(page);
>>       const u64 end = start + PAGE_SIZE - 1;
>>       u64 cur = start;
>> @@ -3310,6 +3325,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
>>               kunmap_atomic(userpage);
>>           }
>>       }
>> +    begin_data_page_read(fs_info, page);
>>       while (cur <= end) {
>>           bool force_bio_submit = false;
>>           u64 disk_bytenr;
>> @@ -3327,13 +3343,14 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
>>                           &cached, GFP_NOFS);
>>               unlock_extent_cached(tree, cur,
>>                            cur + iosize - 1, &cached);
>> +            end_page_read(page, true, cur, iosize);
>>               break;
>>           }
>>           em = __get_extent_map(inode, page, pg_offset, cur,
>>                         end - cur + 1, em_cached);
>>           if (IS_ERR_OR_NULL(em)) {
>> -            SetPageError(page);
>>               unlock_extent(tree, cur, end);
>> +            end_page_read(page, false, cur, end + 1 - cur);
>>               break;
>>           }
>>           extent_offset = cur - em->start;
>> @@ -3416,6 +3433,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
>>                           &cached, GFP_NOFS);
>>               unlock_extent_cached(tree, cur,
>>                            cur + iosize - 1, &cached);
>> +            end_page_read(page, true, cur, iosize);
>>               cur = cur + iosize;
>>               pg_offset += iosize;
>>               continue;
>> @@ -3425,6 +3443,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
>>                      EXTENT_UPTODATE, 1, NULL)) {
>>               check_page_uptodate(tree, page);
>>               unlock_extent(tree, cur, cur + iosize - 1);
>> +            end_page_read(page, true, cur, iosize);
>>               cur = cur + iosize;
>>               pg_offset += iosize;
>>               continue;
>> @@ -3433,8 +3452,8 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
>>            * to date.  Error out
>>            */
>>           if (block_start == EXTENT_MAP_INLINE) {
>> -            SetPageError(page);
>>               unlock_extent(tree, cur, cur + iosize - 1);
>> +            end_page_read(page, false, cur, iosize);
>>               cur = cur + iosize;
>>               pg_offset += iosize;
>>               continue;
>> @@ -3451,19 +3470,14 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
>>               nr++;
>>               *bio_flags = this_bio_flag;
>>           } else {
>> -            SetPageError(page);
>>               unlock_extent(tree, cur, cur + iosize - 1);
>> +            end_page_read(page, false, cur, iosize);
>>               goto out;
>>           }
>>           cur = cur + iosize;
>>           pg_offset += iosize;
>>       }
>>   out:
>> -    if (!nr) {
>> -        if (!PageError(page))
>> -            SetPageUptodate(page);
>> -        unlock_page(page);
>> -    }
> 
> Huh?  Now in the normal case we're not getting an unlocked page.

The page unlock is handled in end_page_read().

We need no special handling at the end any more: every branch now does
its own page status update, so nothing is left to do at the out: label.

>  Not 
> only that we're not setting it uptodate if we had to 0 the whole page, 
> so we're just left dangling here because the endio will never be called.

A page read has only two routes: submit a bio to do the read, or fill
the range with zeros inside btrfs_do_readpage().

Now in btrfs_do_readpage(), we call end_page_read() in all branches
except the bio submission route, where the endio function calls it instead.
So every byte is covered.

This is especially important for subpage, e.g.
for a 64K page which has two extents in it:

0		32K		64K
|---- Hole -----|--- Regular ---|

In this case, we fill [0, 32K) with zeros, set it uptodate in
end_page_read(), reduce readers to 8 without unlocking the page, and
generate a bio for [32K, 64K).

And in endio for [32K, 64K) we will set the range uptodate, set the full
page uptodate, reduce readers to 0, and unlock the page.
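
The reader accounting in this example can be modelled in userspace. The sketch below is not kernel code: `struct sim_page`, `sim_begin_page_read()` and `sim_end_page_read()` are hypothetical names, and it assumes one reader count per 4K sector on a 64K page, which is what the 16 -> 8 -> 0 progression above implies. Only the successful-read path is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only; every name here is hypothetical, not the btrfs API. */
#define SIM_PAGE_SIZE   (64u * 1024u)
#define SIM_SECTORSIZE  (4u * 1024u)
#define SIM_NR_SECTORS  (SIM_PAGE_SIZE / SIM_SECTORSIZE)   /* 16 sectors */

struct sim_page {
	unsigned int readers;      /* sectors whose read has not finished yet */
	uint16_t uptodate_bitmap;  /* one bit per sector */
	bool locked;
	bool uptodate;             /* full-page uptodate flag */
};

/* Plays the role of begin_data_page_read(): account one reader per sector. */
static void sim_begin_page_read(struct sim_page *p)
{
	assert(p->locked);
	p->readers = SIM_NR_SECTORS;
}

/* Plays the role of end_page_read() for a range that was read successfully. */
static void sim_end_page_read(struct sim_page *p, uint32_t start, uint32_t len)
{
	uint32_t first = start / SIM_SECTORSIZE;
	uint32_t nr = len / SIM_SECTORSIZE;

	assert(start % SIM_SECTORSIZE == 0 && len % SIM_SECTORSIZE == 0);
	assert(start + len <= SIM_PAGE_SIZE);
	assert(nr >= 1 && nr <= p->readers);

	/* Mark the sectors of this range uptodate. */
	for (uint32_t i = 0; i < nr; i++)
		p->uptodate_bitmap |= (uint16_t)(1u << (first + i));

	/* The full page is uptodate only when all 16 sectors are. */
	if (p->uptodate_bitmap == UINT16_MAX)
		p->uptodate = true;

	/* Only the last reader unlocks the page. */
	p->readers -= nr;
	if (p->readers == 0)
		p->locked = false;
}
```

Starting from a locked page, calling sim_begin_page_read() and then sim_end_page_read() for [0, 32K) leaves readers at 8 with the page still locked; the second call for [32K, 64K) drops readers to 0, marks the full page uptodate and unlocks it.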


For regular sectorsize, the page is either a hole or a regular extent, and
end_page_read() always unlocks the page and sets either uptodate or error.

> 
> Not to mention you're deleting all of the SetPageError() calls for no 
> reason that I can see, and not replacing it with anything else, so 
> you've essentially ripped out any error handling on memory allocation.  

Nope, just check end_page_read(): if @uptodate is false, we set the error
bit for the page range.
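
For the regular sectorsize == PAGE_SIZE case, this behaviour can be sketched as a small userspace model. `struct model_page` and `model_end_page_read()` are hypothetical stand-ins, not the kernel's struct page or the function from the patch. The model mirrors only what is stated in this thread: a true @uptodate marks the page uptodate, a false @uptodate sets the error bit, and the page is unlocked in both cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the page flags touched by the read path. */
struct model_page {
	bool locked;
	bool uptodate;
	bool error;
};

/*
 * Model of the regular case, where one call covers the whole page.
 * The error bit replaces the SetPageError() calls removed from the loop.
 */
static void model_end_page_read(struct model_page *page, bool uptodate)
{
	assert(page->locked);

	if (uptodate)
		page->uptodate = true;
	else
		page->error = true;

	/* Regular sectorsize: unlock unconditionally. */
	page->locked = false;
}
```

A zero-filled hole ends up uptodate and unlocked, and a failed branch ends up with the error bit set and unlocked, which is what the removed `if (!nr)` tail plus the removed SetPageError() calls used to produce.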

I guess you need to read the resulting code rather than the diff,
especially since in patch 0014 I integrated the page lock/unlock into
endio_readpage_update_page_status(), which later becomes end_page_read().

And if it were really as you said and I missed a page unlock, even a very
basic fsstress run would expose it on x86, not to mention the full fstests.

Thanks,
Qu

> Thanks,
> 
> Josef
