linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@gmail.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: Qu Wenruo <wqu@suse.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	David Sterba <dsterba@suse.com>
Subject: Re: [PATCH 1/3] btrfs: tree-checker: Fix false alerts on log trees
Date: Fri, 4 Oct 2019 15:13:02 +0100	[thread overview]
Message-ID: <CAL3q7H43Qsz9cy_EULphP=L=FjpPpiY2KwScHYCSbncJogujjg@mail.gmail.com> (raw)
In-Reply-To: <4e1e4ffd-c8c1-0d02-d8a0-ebdb4deabe4c@suse.com>

On Fri, Oct 4, 2019 at 2:54 PM Nikolay Borisov <nborisov@suse.com> wrote:
>
>
>
> On 4.10.19 г. 12:31 ч., Qu Wenruo wrote:
> > [BUG]
> > When running btrfs/063 in a loop, we got the following random write time
> > tree checker error:
> >
> >   BTRFS critical (device dm-4): corrupt leaf: root=18446744073709551610 block=33095680 slot=2 ino=307 file_offset=0, invalid previous key objectid, have 305 expect 307
> >   BTRFS info (device dm-4): leaf 33095680 gen 7 total ptrs 47 free space 12146 owner 18446744073709551610
> >   BTRFS info (device dm-4): refs 1 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 26176
> >           item 0 key (305 1 0) itemoff 16123 itemsize 160
> >                   inode generation 0 size 0 mode 40777
> >           item 1 key (305 12 257) itemoff 16111 itemsize 12
> >           item 2 key (307 108 0) itemoff 16058 itemsize 53 <<<
> >                   extent data disk bytenr 0 nr 0
> >                   extent data offset 0 nr 614400 ram 671744
> >           item 3 key (307 108 614400) itemoff 16005 itemsize 53
> >                   extent data disk bytenr 195342336 nr 57344
> >                   extent data offset 0 nr 53248 ram 57344
> >           item 4 key (307 108 667648) itemoff 15952 itemsize 53
> >                   extent data disk bytenr 194048000 nr 4096
> >                   extent data offset 0 nr 4096 ram 4096
> >         [...]
> >   BTRFS error (device dm-4): block=33095680 write time tree block corruption detected
> >   BTRFS: error (device dm-4) in btrfs_commit_transaction:2332: errno=-5 IO failure (Error while writing out transaction)
> >   BTRFS info (device dm-4): forced readonly
> >   BTRFS warning (device dm-4): Skipping commit of aborted transaction.
> >   BTRFS info (device dm-4): use zlib compression, level 3
> >   BTRFS: error (device dm-4) in cleanup_transaction:1890: errno=-5 IO failure
> >
> > [CAUSE]
> > Commit 59b0d030fb30 ("btrfs: tree-checker: Try to detect missing INODE_ITEM")
> > assumes all XATTR_ITEM/DIR_INDEX/DIR_ITEM/INODE_REF/EXTENT_DATA items
> > should have previous key with the same objectid as ino.
> >
> > But it's only true for fs trees. For log-tree, we can get above log tree
> > block where an EXTENT_DATA item has no previous key with the same ino.
> > As log tree only records modified items, it won't record unmodified
> > items like INODE_ITEM.
> >
> > So this triggers write time tree check warning.
> >
> > [FIX]
> > As a quick fix, check header owner to skip the previous key if it's not
> > fs tree (log tree doesn't count as fs tree).
> >
> > This fix is only to be merged as a quick fix.
> > There will be a more comprehensive fix to refactor the common check into
> > one function.
> >
> > Reported-by: David Sterba <dsterba@suse.com>
> > Fixes: 59b0d030fb30 ("btrfs: tree-checker: Try to detect missing INODE_ITEM")
> > Signed-off-by: Qu Wenruo <wqu@suse.com>
>
>
> It's not entirely clear why this bug manifests. My tests show that when
> we write extents we always update the inode's c/m time so it's always
> dirtied hence it's logged. OTOH when punching a hole the same thing is
> valid.
>
> Filipe, under what conditions should it be possible to log an
> EXTENT_DATA item without first logging the inode it belongs to? It seems
> using the usual write paths (e.g. buffered write and punchole) that's
> impossible?

The tests you did are pointless, none of those operations write to a
log tree, only fsync does that.

This change is perfectly fine. Logging (fsync) always logs the inode
item since commit [1] (2015),
however it might do so after logging extents and other items, and in
between that, if writeback for
the log tree leaf happens we get that error from the tree-checker.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e4545de5b035c7debb73d260c78377dbb69cbfb5

>
> > ---
> >  fs/btrfs/tree-checker.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> > index b8f82d9be9f0..5e34cd5e3e2e 100644
> > --- a/fs/btrfs/tree-checker.c
> > +++ b/fs/btrfs/tree-checker.c
> > @@ -148,7 +148,8 @@ static int check_extent_data_item(struct extent_buffer *leaf,
> >        * But if objectids mismatch, it means we have a missing
> >        * INODE_ITEM.
> >        */
> > -     if (slot > 0 && prev_key->objectid != key->objectid) {
> > +     if (slot > 0 && is_fstree(btrfs_header_owner(leaf)) &&
> > +         prev_key->objectid != key->objectid) {
> >               file_extent_err(leaf, slot,
> >               "invalid previous key objectid, have %llu expect %llu",
> >                               prev_key->objectid, key->objectid);
> > @@ -322,7 +323,8 @@ static int check_dir_item(struct extent_buffer *leaf,
> >       u32 cur = 0;
> >
> >       /* Same check as in check_extent_data_item() */
> > -     if (slot > 0 && prev_key->objectid != key->objectid) {
> > +     if (slot > 0 && is_fstree(btrfs_header_owner(leaf)) &&
> > +         prev_key->objectid != key->objectid) {
> >               dir_item_err(leaf, slot,
> >               "invalid previous key objectid, have %llu expect %llu",
> >                            prev_key->objectid, key->objectid);
> >



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

  reply	other threads:[~2019-10-04 14:13 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-04  9:31 [PATCH 0/3] btrfs: tree-checker: False alerts fixes for log trees Qu Wenruo
2019-10-04  9:31 ` [PATCH 1/3] btrfs: tree-checker: Fix false alerts on " Qu Wenruo
2019-10-04 13:52   ` Nikolay Borisov
2019-10-04 14:13     ` Filipe Manana [this message]
2019-10-04 14:19       ` Nikolay Borisov
2019-10-04 14:15   ` Filipe Manana
2019-10-07 15:31     ` David Sterba
2019-10-04  9:31 ` [PATCH 2/3] btrfs: tree-checker: Refactor prev_key check for ino into a function Qu Wenruo
2019-10-04  9:31 ` [PATCH 3/3] btrfs: Enhance the error outputting for write time tree checker Qu Wenruo
2019-10-07 16:46 ` [PATCH 0/3] btrfs: tree-checker: False alerts fixes for log trees David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAL3q7H43Qsz9cy_EULphP=L=FjpPpiY2KwScHYCSbncJogujjg@mail.gmail.com' \
    --to=fdmanana@gmail.com \
    --cc=dsterba@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=nborisov@suse.com \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).