linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH RFC] btrfs: Introduce btrfs child tree block verification system
Date: Thu, 12 Sep 2019 07:38:14 +0800	[thread overview]
Message-ID: <3993aeab-a695-3bd1-88d6-48e9743ab597@gmx.com> (raw)
In-Reply-To: <20190911160202.wtprsigurzfxwtic@MacBook-Pro-91.local>


[-- Attachment #1.1: Type: text/plain, Size: 3118 bytes --]



On 2019/9/12 上午12:02, Josef Bacik wrote:
> On Wed, Sep 11, 2019 at 03:46:24PM +0800, Qu Wenruo wrote:
>> Although we have btrfs_verify_level_key() function to check the first
>> key and level at tree block read time, it has its limitation due to tree
>> lock context, it's not reliable handling new tree blocks.
>>
> 
> How is it not reliable with new tree blocks?

Current btrfs_verify_level_key() skips first_key verification for any
tree blocks newer than last committed.

> 
>> So btrfs_verify_level_key() is good as a pre-check, but it can't ensure
>> new tree blocks are still sane at runtime.
>>
> 
> I mean I guess this is good, but we have to keep the parent locked when we're
> adding new blocks anyway, so I'm not entirely sure what this gains us?

For cases like tree search on current node, where all tree blocks can be
newly CoWed tree blocks.

If bit flip happens affecting those new tree blocks, we can detect them
at runtime, and that's the only time we can catch such error.

Write time tree checker doesn't go beyond single leave/node, thus has no
way to detect such parent-child mismatch case.

>  You are
> essentially duplicating the checks that we already do on reads, and then adding
> the first_key check.
> 
> I'll go along with the first_key check being relatively useful, but why exactly
> do we need all this infrastructure when we can just check it as we walk down the
> tree?

You can't really do the nritems and first key check at the current
timing of btrfs_verify_level_key() for new tree blocks due to lock context.

That's the only reason the new infrastructure is here, to block the only
hole of btrfs_verify_level_key().

> 
> <snip>
> 
>> @@ -2887,24 +2982,28 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root,
>>  			}
>>  
>>  			if (!p->skip_locking) {
>> -				level = btrfs_header_level(b);
>> -				if (level <= write_lock_level) {
>> +				if (level - 1 <= write_lock_level) {
>>  					err = btrfs_try_tree_write_lock(b);
>>  					if (!err) {
>>  						btrfs_set_path_blocking(p);
>>  						btrfs_tree_lock(b);
>>  					}
>> -					p->locks[level] = BTRFS_WRITE_LOCK;
>> +					p->locks[level - 1] = BTRFS_WRITE_LOCK;
>>  				} else {
>>  					err = btrfs_tree_read_lock_atomic(b);
>>  					if (!err) {
>>  						btrfs_set_path_blocking(p);
>>  						btrfs_tree_read_lock(b);
>>  					}
>> -					p->locks[level] = BTRFS_READ_LOCK;
>> +					p->locks[level - 1] = BTRFS_READ_LOCK;
>>  				}
>> -				p->nodes[level] = b;
>> +				p->nodes[level - 1] = b;
>>  			}
> 
> This makes no sense to me.  Why do we need to change how we do level here just
> for the btrfs_verify_child() check?

Because we can't trust the level from @b unless we have verified it.

(Although level is always checked in btrfs_verify_level_key(), but that
function is not 100% sure to be kept as is).

Thanks,
Qu

>  We've already init'ed verify further up,
> and we're not changing b here, so messing with level here just makes the code
> less clear.  Thanks,
> 
> Josef
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2019-09-11 23:38 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-11  7:46 [PATCH RFC] btrfs: Introduce btrfs child tree block verification system Qu Wenruo
2019-09-11 16:02 ` Josef Bacik
2019-09-11 23:38   ` Qu Wenruo [this message]
2019-09-12  9:59     ` Josef Bacik
2019-09-12 10:19       ` WenRuo Qu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3993aeab-a695-3bd1-88d6-48e9743ab597@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).