All of lore.kernel.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Wang Yugui <wangyugui@e16-tech.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 1/2] btrfs: fix btrfs_prev_leaf() to not return the same key twice
Date: Mon, 17 Apr 2023 11:18:46 +0100	[thread overview]
Message-ID: <CAL3q7H6CoXvJGdf9Af4qQPyhwP-3gDPDOLNtK3YWqw1HV28rHw@mail.gmail.com> (raw)
In-Reply-To: <20230414114340.57ED.409509F4@e16-tech.com>

On Fri, Apr 14, 2023 at 4:44 AM Wang Yugui <wangyugui@e16-tech.com> wrote:
>
> Hi,
>
> > Hi,
> >
> > > From: Filipe Manana <fdmanana@suse.com>
> > >
> > > A call to btrfs_prev_leaf() may end up returning a path that points to the
> > > same item (key) again. This happens if while btrfs_prev_leaf(), after we
> > > release the path, a concurrent insertion happens, which moves items off
> > > from a sibbling into the front of the previous leaf, and an item with the
> > > computed previous key does not exists.
> > >
> > > For example, suppose we have the two following leaves:
> > >
> > >   Leaf A
> > >
> > >   -------------------------------------------------------------
> > >   | ...   key (300 96 10)   key (300 96 15)   key (300 96 16) |
> > >   -------------------------------------------------------------
> > >               slot 20             slot 21             slot 22
> > >
> > >   Leaf B
> > >
> > >   -------------------------------------------------------------
> > >   | key (300 96 20)   key (300 96 21)   key (300 96 22)   ... |
> > >   -------------------------------------------------------------
> > >       slot 0             slot 1             slot 2
> > >
> > > If we call btrfs_prev_leaf(), from btrfs_previous_item() for example, with
> > > a path pointing to leaf B and slot 0 and the following happens:
> > >
> > > 1) At btrfs_prev_leaf() we compute the previous key to search as:
> > >    (300 96 19), which is a key that does not exists in the tree;
> > >
> > > 2) Then we call btrfs_release_path() at btrfs_prev_leaf();
> > >
> > > 3) Some other task inserts a key at leaf A, that sorts before the key at
> > >    slot 20, for example it has an objectid of 299. In order to make room
> > >    for the new key, the key at slot 22 is moved to the front of leaf B.
> > >    This happens at push_leaf_right(), called from split_leaf().
> > >
> > >    After this leaf B now looks like:
> > >
> > >   --------------------------------------------------------------------------------
> > >   | key (300 96 16)    key (300 96 20)   key (300 96 21)   key (300 96 22)   ... |
> > >   --------------------------------------------------------------------------------
> > >        slot 0              slot 1             slot 2             slot 3
> > >
> > > 4) At btrfs_prev_leaf() we call btrfs_search_slot() for the computed
> > >    previous key: (300 96 19). Since the key does not exists,
> > >    btrfs_search_slot() returns 1 and with a path pointing to leaf B
> > >    and slot 1, the item with key (300 96 20);
> > >
> > > 5) This makes btrfs_prev_leaf() return a path that points to slot 1 of
> > >    leaf B, the same key as before it was called, since the key at slot 0
> > >    of leaf B (300 96 16) is less than the computed previous key, which is
> > >    (300 96 19);
> > >
> > > 6) As a consequence btrfs_previous_item() returns a path that points again
> > >    to the item with key (300 96 20).
> > >
> > > For some users of btrfs_prev_leaf() or btrfs_previous_item() this may not
> > > be functional a problem, despite not making sense to return a new path
> > > pointing again to the same item/key. However for a caller such as
> > > tree-log.c:log_dir_items(), this has a bad consequence, as it can result
> > > in not logging some dir index deletions in case the directory is being
> > > logged without holding the inode's VFS lock (logging triggered while
> > > logging a child inode for example) - for the example scenario above, in
> > > case the dir index keys 17, 18 and 19 were deleted in the current
> > > transaction.
> >
> > fstests(btrfs/080) had few frequency(<1/10) to fail after this pacth is applied.
> > but it is yet not clear enough.
> >
> > # cat results//btrfs/080.out.bad
> > QA output created by 080
> > Unexpected digest for file /mnt/scratch/20_40_30_986878191_snap/foobar_63
> > (see /usr/hpc-bio/xfstests/results//btrfs/080.full for details)
>
> fstests(btrfs/080) failure happened without this patch too.
>
> the next info update may take some long time, because of the low reproduce
> frequency.

That test sporadically fails without the patch yes.

Also, more important: the test does not exercise at all the function
changed by this patch.
So you can save time and skip doing that pointless info update on
frequency failure...

Thanks.

>
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2023/04/14
>
>

  reply	other threads:[~2023-04-17 10:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-12 10:33 [PATCH 0/2] btrfs: fix for btrfs_prev_leaf() and unexport it fdmanana
2023-04-12 10:33 ` [PATCH 1/2] btrfs: fix btrfs_prev_leaf() to not return the same key twice fdmanana
2023-04-13 19:57   ` Wang Yugui
2023-04-14  3:43     ` Wang Yugui
2023-04-17 10:18       ` Filipe Manana [this message]
2023-04-12 10:33 ` [PATCH 2/2] btrfs: unexport btrfs_prev_leaf() fdmanana
2023-04-15  7:14   ` Anand Jain
2023-04-12 13:47 ` [PATCH 0/2] btrfs: fix for btrfs_prev_leaf() and unexport it Josef Bacik
2023-04-17 18:53 ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAL3q7H6CoXvJGdf9Af4qQPyhwP-3gDPDOLNtK3YWqw1HV28rHw@mail.gmail.com \
    --to=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wangyugui@e16-tech.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.