linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@gmail.com>
To: Josef Bacik <josef@toxicpanda.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>, kernel-team@fb.com
Subject: Re: [PATCH 14/14] btrfs: set the lockdep class for ebs on creation
Date: Fri, 6 Nov 2020 11:58:34 +0000	[thread overview]
Message-ID: <CAL3q7H7jrF3Acu0DHfShMWE_WmwAcwF5b7-ZOj4vRPZ61_pf-Q@mail.gmail.com> (raw)
In-Reply-To: <1cee2922a32c305056a9559ccf7aede49777beae.1604591048.git.josef@toxicpanda.com>

On Thu, Nov 5, 2020 at 3:47 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> Both Filipe and Fedora QA recently hit the following lockdep splat
>
> WARNING: possible recursive locking detected
> 5.10.0-0.rc1.20201028gited8780e3f2ec.57.fc34.x86_64 #1 Not tainted
> --------------------------------------------
> rsync/2610 is trying to acquire lock:
> ffff89617ed48f20 (&eb->lock){++++}-{2:2}, at: btrfs_tree_read_lock_atomic+0x34/0x140
>
> but task is already holding lock:
> ffff8961757b1130 (&eb->lock){++++}-{2:2}, at: btrfs_tree_read_lock_atomic+0x34/0x140
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>        CPU0
>        ----
>   lock(&eb->lock);
>   lock(&eb->lock);
>
>  *** DEADLOCK ***
>  May be due to missing lock nesting notation
> 2 locks held by rsync/2610:
>  #0: ffff896107212b90 (&type->i_mutex_dir_key#10){++++}-{3:3}, at: walk_component+0x10c/0x190
>  #1: ffff8961757b1130 (&eb->lock){++++}-{2:2}, at: btrfs_tree_read_lock_atomic+0x34/0x140
>
> stack backtrace:
> CPU: 1 PID: 2610 Comm: rsync Not tainted 5.10.0-0.rc1.20201028gited8780e3f2ec.57.fc34.x86_64 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
> Call Trace:
>  dump_stack+0x8b/0xb0
>  __lock_acquire.cold+0x12d/0x2a4
>  ? kvm_sched_clock_read+0x14/0x30
>  ? sched_clock+0x5/0x10
>  lock_acquire+0xc8/0x400
>  ? btrfs_tree_read_lock_atomic+0x34/0x140
>  ? read_block_for_search.isra.0+0xdd/0x320
>  _raw_read_lock+0x3d/0xa0
>  ? btrfs_tree_read_lock_atomic+0x34/0x140
>  btrfs_tree_read_lock_atomic+0x34/0x140
>  btrfs_search_slot+0x616/0x9a0
>  btrfs_lookup_dir_item+0x6c/0xb0
>  btrfs_lookup_dentry+0xa8/0x520
>  ? lockdep_init_map_waits+0x4c/0x210
>  btrfs_lookup+0xe/0x30
>  __lookup_slow+0x10f/0x1e0
>  walk_component+0x11b/0x190
>  path_lookupat+0x72/0x1c0
>  filename_lookup+0x97/0x180
>  ? strncpy_from_user+0x96/0x1e0
>  ? getname_flags.part.0+0x45/0x1a0
>  vfs_statx+0x64/0x100
>  ? lockdep_hardirqs_on_prepare+0xff/0x180
>  ? _raw_spin_unlock_irqrestore+0x41/0x50
>  __do_sys_newlstat+0x26/0x40
>  ? lockdep_hardirqs_on_prepare+0xff/0x180
>  ? syscall_enter_from_user_mode+0x27/0x80
>  ? syscall_enter_from_user_mode+0x27/0x80
>  do_syscall_64+0x33/0x40
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> I have also seen a report of lockdep complaining about the lock class
> that was looked up being the same as the lock class on the lock we were
> using, but I can't find the report.
>
> These are problems that occur because we do not have the lockdep class
> set on the extent buffer until _after_ we read the eb in properly.  This
> is problematic for concurrent readers, because we will create the extent
> buffer, lock it, and then attempt to read the extent buffer.
>
> If a second thread comes in and tries to do a search down the same path
> they'll get the above lockdep splat because the class isn't set properly
> on the extent buffer.
>
> There was a good reason for this, we generally didn't know the real
> owner of the eb until we read it, specifically in refcount'ed roots.
>
> However now all refcount'ed roots have the same class name, so we no
> longer need to worry about this.  For non-refcount'ed tree's we know
> which root we're on based on the parent.
>
> Fix this by setting the lockdep class on the eb at creation time instead
> of read time.  This will fix the splat and the weirdness where the class
> changes in the middle of locking the block.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

I couldn't get anymore the lockdep splat I reported before (after
applying the whole patchset of course), it used to happen very often
with btrfs/033.

Looks good, thanks.

> ---
>  fs/btrfs/disk-io.c     | 3 ---
>  fs/btrfs/extent-tree.c | 8 +++++---
>  fs/btrfs/extent_io.c   | 1 +
>  fs/btrfs/volumes.c     | 1 -
>  4 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index f14398b5d933..a90839426cfa 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -577,9 +577,6 @@ int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, u64 phy_offset,
>                 goto err;
>         }
>
> -       btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb),
> -                                      eb, found_level);
> -
>         csum_tree_block(eb, result);
>
>         if (memcmp_extent_buffer(eb, result, 0, csum_size)) {
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 14b6e19f6151..517c2558f973 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4629,6 +4629,11 @@ btrfs_init_new_buffer(struct btrfs_trans_handle *trans, struct btrfs_root *root,
>                 return ERR_PTR(-EUCLEAN);
>         }
>
> +       /*
> +        * This needs to stay, because we could allocate a free'd block from an
> +        * old tree into a new tree, so we need to make sure this new block is
> +        * set to the appropriate level and owner.
> +        */
>         btrfs_set_buffer_lockdep_class(owner, buf, level);
>         __btrfs_tree_lock(buf, nest);
>         btrfs_clean_tree_block(buf);
> @@ -5018,9 +5023,6 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans,
>                                                     level - 1);
>                 if (IS_ERR(next))
>                         return PTR_ERR(next);
> -
> -               btrfs_set_buffer_lockdep_class(root->root_key.objectid, next,
> -                                              level - 1);
>                 reada = 1;
>         }
>         btrfs_tree_lock(next);
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index a883350d5e7f..3c61981b2c7b 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -5196,6 +5196,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
>         eb = __alloc_extent_buffer(fs_info, start, len);
>         if (!eb)
>                 return ERR_PTR(-ENOMEM);
> +       btrfs_set_buffer_lockdep_class(owner_root, eb, level);
>
>         num_pages = num_extent_pages(eb);
>         for (i = 0; i < num_pages; i++, index++) {
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 0ca2e96a9cda..4830c40fc400 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -6902,7 +6902,6 @@ int btrfs_read_sys_array(struct btrfs_fs_info *fs_info)
>         if (IS_ERR(sb))
>                 return PTR_ERR(sb);
>         set_extent_buffer_uptodate(sb);
> -       btrfs_set_buffer_lockdep_class(root->root_key.objectid, sb, 0);
>         /*
>          * The sb extent buffer is artificial and just used to read the system array.
>          * set_extent_buffer_uptodate() call does not properly mark all it's
> --
> 2.26.2
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

  reply	other threads:[~2020-11-06 11:58 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-05 15:45 [PATCH 00/14][REBASED] Set the lockdep class on eb's at allocation time Josef Bacik
2020-11-05 15:45 ` [PATCH 01/14] btrfs: remove lockdep classes for the fs tree Josef Bacik
2020-11-06 11:50   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 02/14] btrfs: cleanup extent buffer readahead Josef Bacik
2020-11-06 11:51   ` Filipe Manana
2020-11-09 15:09   ` David Sterba
2020-11-05 15:45 ` [PATCH 03/14] btrfs: use btrfs_read_node_slot in btrfs_realloc_node Josef Bacik
2020-11-06 11:51   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 04/14] btrfs: use btrfs_read_node_slot in walk_down_reloc_tree Josef Bacik
2020-11-06 11:52   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 05/14] btrfs: use btrfs_read_node_slot in do_relocation Josef Bacik
2020-11-06 11:52   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 06/14] btrfs: use btrfs_read_node_slot in replace_path Josef Bacik
2020-11-06 11:53   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 07/14] btrfs: use btrfs_read_node_slot in walk_down_tree Josef Bacik
2020-11-06 11:53   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 08/14] btrfs: use btrfs_read_node_slot in qgroup_trace_extent_swap Josef Bacik
2020-11-06 11:53   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 09/14] btrfs: use btrfs_read_node_slot in qgroup_trace_new_subtree_blocks Josef Bacik
2020-11-06 11:54   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 10/14] btrfs: use btrfs_read_node_slot in btrfs_qgroup_trace_subtree Josef Bacik
2020-11-06 11:55   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 11/14] btrfs: pass root owner to read_tree_block Josef Bacik
2020-11-06 11:55   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 12/14] btrfs: pass the root owner and level around for reada Josef Bacik
2020-11-06 11:56   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 13/14] btrfs: pass the owner_root and level to alloc_extent_buffer Josef Bacik
2020-11-06 11:54   ` Filipe Manana
2020-11-05 15:45 ` [PATCH 14/14] btrfs: set the lockdep class for ebs on creation Josef Bacik
2020-11-06 11:58   ` Filipe Manana [this message]
2020-11-09 16:50 ` [PATCH 00/14][REBASED] Set the lockdep class on eb's at allocation time David Sterba
  -- strict thread matches above, loose matches on Subject: below --
2020-10-30 21:02 [PATCH 00/14] " Josef Bacik
2020-10-30 21:03 ` [PATCH 14/14] btrfs: set the lockdep class for ebs on creation Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAL3q7H7jrF3Acu0DHfShMWE_WmwAcwF5b7-ZOj4vRPZ61_pf-Q@mail.gmail.com \
    --to=fdmanana@gmail.com \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).