All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: dsterba@suse.cz, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH v2 1/3] btrfs: avoid double clean up when submit_one_bio() failed
Date: Wed, 13 Apr 2022 07:32:41 +0800	[thread overview]
Message-ID: <447a2d76-dfff-9efb-09e8-9778ac4a44f2@gmx.com> (raw)
In-Reply-To: <20220412204104.GA15609@twin.jikos.cz>



On 2022/4/13 04:41, David Sterba wrote:
> On Tue, Apr 12, 2022 at 08:30:13PM +0800, Qu Wenruo wrote:
>> [BUG]
>> When running generic/475 with 64K page size and 4K sector size, it has a
>> very high chance (almost 100%) to hang, with mostly data page locked but
>> no one is going to unlock it.
>>
>> [CAUSE]
>> With commit 1784b7d502a9 ("btrfs: handle csum lookup errors properly on
>> reads"), if we failed to lookup checksum due to metadata IO error, we
>> will return error for btrfs_submit_data_bio().
>>
>> This will cause the page to be unlocked twice in btrfs_do_readpage():
>>
>>   btrfs_do_readpage()
>>   |- submit_extent_page()
>>   |  |- submit_one_bio()
>>   |     |- btrfs_submit_data_bio()
>>   |        |- if (ret) {
>>   |        |-     bio->bi_status = ret;
>>   |        |-     bio_endio(bio); }
>>   |               In the endio function, we will call end_page_read()
>>   |               and unlock_extent() to cleanup the subpage range.
>>   |
>>   |- if (ret) {
>>   |-        unlock_extent(); end_page_read() }
>>             Here we unlock the extent and cleanup the subpage range
>>             again.
>>
>> For unlock_extent(), it's mostly double unlock safe.
>>
>> But for end_page_read(), it's not, especially for subpage case,
>> as for subpage case we will call btrfs_subpage_end_reader() to reduce
>> the reader number, and use that to number to determine if we need to
>> unlock the full page.
>>
>> If double accounted, it can underflow the number and leave the page
>> locked without anyone to unlock it.
>>
>> [FIX]
>> The commit 1784b7d502a9 ("btrfs: handle csum lookup errors properly on
>> reads") itself is completely fine, it's our existing code not properly
>> handling the error from bio submission hook properly.
>>
>> This patch will make submit_one_bio() to return void so that the callers
>> will never be able to do cleanup when bio submission hook fails.
>>
>> CC: stable@vger.kernel.org # 5.18+
>
> BTW stable tags are only for released kernels, if it's still in some rc
> then Fixes: is appropriate.

The problem is I don't have a good idea on which commit to be fixed.

Commit 1784b7d502a9 ("btrfs: handle csum lookup errors properly on
reads") is completely fine by itself.

The problem is there for a long long time, but can only be triggered
with IO errors with that newer commit.

Should we really use that commit? It looks like a scapegoat to me...

Thanks,
Qu

  reply	other threads:[~2022-04-12 23:43 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-12 12:30 [PATCH v2 0/3] btrfs: vairous bug fixes related to generic/475 failure with subpage cases Qu Wenruo
2022-04-12 12:30 ` [PATCH v2 1/3] btrfs: avoid double clean up when submit_one_bio() failed Qu Wenruo
2022-04-12 20:41   ` David Sterba
2022-04-12 23:32     ` Qu Wenruo [this message]
2022-04-13 13:46       ` David Sterba
2022-04-13 23:23         ` Qu Wenruo
2022-04-15  6:28   ` Christoph Hellwig
2022-04-15  7:02     ` Qu Wenruo
2022-04-15  7:12       ` Christoph Hellwig
2022-04-15  7:14         ` Qu Wenruo
2022-04-24 23:26           ` Qu Wenruo
2022-04-12 12:30 ` [PATCH v2 2/3] btrfs: fix the error handling for submit_extent_page() for btrfs_do_readpage() Qu Wenruo
2022-04-12 12:30 ` [PATCH v2 3/3] btrfs: return correct error number for __extent_writepage_io() Qu Wenruo
2022-04-12 20:42 ` [PATCH v2 0/3] btrfs: vairous bug fixes related to generic/475 failure with subpage cases David Sterba
2022-04-14 19:38   ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=447a2d76-dfff-9efb-09e8-9778ac4a44f2@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.