From: Logan Gunthorpe <logang@deltatee.com>
To: Yu Kuai <yukuai1@huaweicloud.com>, song@kernel.org
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	yukuai3@huawei.com, yi.zhang@huawei.com
Subject: Re: [PATCH -next 2/3] md/raid10: convert resync_lock to use seqlock
Date: Thu, 1 Sep 2022 12:41:23 -0600	[thread overview]
Message-ID: <04128618-962f-fd4e-64a9-09ecf7f83776@deltatee.com> (raw)
In-Reply-To: <20220829131502.165356-3-yukuai1@huaweicloud.com>

Hi,

On 2022-08-29 07:15, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
> 
> Currently, wait_barrier() will hold 'resync_lock' to read 'conf->barrier',
> and IO can't be dispatched until 'barrier' is dropped.
> 
> Since holding the 'barrier' is not common, convert 'resync_lock' to use a
> seqlock so that holding the lock can be avoided in the fast path.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
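
As background, the general seqlock pattern the patch moves to looks roughly
like this (an illustrative sketch only, with made-up names, not the actual
raid10 code): readers retry instead of blocking, so the fast path takes no
lock, while writers still serialize against each other.

    #include <linux/seqlock.h>

    static DEFINE_SEQLOCK(example_lock);
    static int example_barrier;

    /* Fast path: lockless read, retried if a writer raced with us. */
    static bool example_barrier_raised(void)
    {
            unsigned int seq;
            bool raised;

            do {
                    seq = read_seqbegin(&example_lock);
                    raised = example_barrier != 0;
            } while (read_seqretry(&example_lock, seq));

            return raised;
    }

    /* Slow path: the write side still takes the lock. */
    static void example_raise_barrier(void)
    {
            write_seqlock_irq(&example_lock);
            example_barrier++;
            write_sequnlock_irq(&example_lock);
    }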

I've found some lockdep issues starting with this patch in md-next while
running mdadm tests (specifically 00raid10 when run about 10 times in a
row).

I've seen a couple of different lockdep errors. The first seems to be
reproducible with this patch alone; it possibly changes to the second one
once the subsequent patches are applied, but I'm not sure exactly.

I haven't dug into it too deeply, but hopefully it can be fixed easily.

Logan

--


    ================================
    WARNING: inconsistent lock state
    6.0.0-rc2-eid-vmlocalyes-dbg-00023-gfd68041d2fd2 #2604 Not tainted
    --------------------------------
    inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
    fsck.ext3/1695 [HC0[0]:SC0[0]:HE0:SE1] takes:
    ffff8881049b0120 (&____s->seqcount#10){+.?.}-{0:0}, at: raid10_read_request+0x21f/0x760
		(raid10.c:1134)

    {IN-SOFTIRQ-W} state was registered at:
      lock_acquire+0x183/0x440
      lower_barrier+0x5e/0xd0
      end_sync_request+0x178/0x180
      end_sync_write+0x193/0x380
      bio_endio+0x346/0x3a0
      blk_update_request+0x1eb/0x7c0
      blk_mq_end_request+0x30/0x50
      lo_complete_rq+0xb7/0x100
      blk_complete_reqs+0x77/0x90
      blk_done_softirq+0x38/0x40
      __do_softirq+0x10c/0x650
      run_ksoftirqd+0x48/0x80
      smpboot_thread_fn+0x302/0x400
      kthread+0x18c/0x1c0
      ret_from_fork+0x1f/0x30

    irq event stamp: 8930
    hardirqs last  enabled at (8929): [<ffffffff96df8351>] _raw_spin_unlock_irqrestore+0x31/0x60
    hardirqs last disabled at (8930): [<ffffffff96df7fc5>] _raw_spin_lock_irq+0x75/0x90
    softirqs last  enabled at (6768): [<ffffffff9554970e>] __irq_exit_rcu+0xfe/0x150
    softirqs last disabled at (6757): [<ffffffff9554970e>] __irq_exit_rcu+0xfe/0x150

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&____s->seqcount#10);
      <Interrupt>
        lock(&____s->seqcount#10);

     *** DEADLOCK ***

    2 locks held by fsck.ext3/1695:
     #0: ffff8881007d0930 (mapping.invalidate_lock#2){++++}-{3:3}, at: page_cache_ra_unbounded+0xaf/0x250
     #1: ffff8881049b0120 (&____s->seqcount#10){+.?.}-{0:0}, at: raid10_read_request+0x21f/0x760

    stack backtrace:
    CPU: 0 PID: 1695 Comm: fsck.ext3 Not tainted 6.0.0-rc2-eid-vmlocalyes-dbg-00023-gfd68041d2fd2 #2604
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x5a/0x74
     dump_stack+0x10/0x12
     print_usage_bug.part.0+0x233/0x246
     mark_lock.part.0.cold+0x73/0x14f
     mark_held_locks+0x71/0xa0
     lockdep_hardirqs_on_prepare+0x158/0x230
     trace_hardirqs_on+0x34/0x100
     _raw_spin_unlock_irq+0x28/0x60
     wait_barrier+0x4a6/0x720
         raid10.c:1004
     raid10_read_request+0x21f/0x760
     raid10_make_request+0x2d6/0x2160
     md_handle_request+0x3f3/0x5b0
     md_submit_bio+0xd9/0x120
     __submit_bio+0x9d/0x100
     submit_bio_noacct_nocheck+0x1fd/0x470
     submit_bio_noacct+0x4c2/0xbb0
     submit_bio+0x3f/0xf0
     mpage_readahead+0x323/0x3b0
     blkdev_readahead+0x15/0x20
     read_pages+0x136/0x7a0
     page_cache_ra_unbounded+0x18d/0x250
     page_cache_ra_order+0x2c9/0x400
     ondemand_readahead+0x320/0x730
     page_cache_sync_ra+0xa6/0xb0
     filemap_get_pages+0x1eb/0xc00
     filemap_read+0x1f1/0x770
     blkdev_read_iter+0x164/0x310
     vfs_read+0x467/0x5a0
     __x64_sys_pread64+0x122/0x160
     do_syscall_64+0x35/0x80
     entry_SYSCALL_64_after_hwframe+0x46/0xb0
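
For reference, the first report reduces to a known lockdep pattern: per the
trace, lower_barrier() enters the seqcount write side from bio_endio() in
softirq context, while wait_barrier() ends up holding the same seqcount in
process context with softirqs enabled. A minimal sketch of that pattern is
below (illustrative only, with made-up names, not what the patch literally
does):

    #include <linux/seqlock.h>

    static DEFINE_SEQLOCK(example_lock);
    static int example_barrier;

    /* Process context, softirqs not blocked: lockdep marks SOFTIRQ-ON-W. */
    static void example_process_side(void)
    {
            write_seqlock(&example_lock);
            example_barrier++;
            write_sequnlock(&example_lock);
    }

    /* Bio completion handler, softirq context: lockdep marks IN-SOFTIRQ-W. */
    static void example_completion_side(void)
    {
            write_seqlock(&example_lock);
            example_barrier--;
            write_sequnlock(&example_lock);
    }

If the softirq fires on the same CPU while example_process_side() is inside
its write section, the completion side spins on the lock forever, which is
the deadlock scenario lockdep prints above.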

--

    ======================================================
    WARNING: possible circular locking dependency detected
    6.0.0-rc2-eid-vmlocalyes-dbg-00027-gcd6aa5181bbb #2600 Not tainted
    ------------------------------------------------------
    systemd-udevd/292 is trying to acquire lock:
    ffff88817b644170 (&(&conf->resync_lock)->lock){....}-{2:2}, at: wait_barrier+0x4fe/0x770

    but task is already holding lock:
    ffff88817b644120 (&____s->seqcount#11){+.+.}-{0:0}, at: raid10_read_request+0x21f/0x760
			raid10.c:1140  wait_barrier()
			raid10.c:1204  regular_request_wait()



    which lock already depends on the new lock.


    the existing dependency chain (in reverse order) is:

    -> #1 (&____s->seqcount#11){+.+.}-{0:0}:
           raise_barrier+0xe0/0x300
		raid10.c:940 write_seqlock_irq()
           raid10_sync_request+0x629/0x4750
		raid10.c:3689 raise_barrier()
           md_do_sync.cold+0x8ec/0x1491
           md_thread+0x19d/0x2d0
           kthread+0x18c/0x1c0
           ret_from_fork+0x1f/0x30

    -> #0 (&(&conf->resync_lock)->lock){....}-{2:2}:
           __lock_acquire+0x1cb4/0x3170
           lock_acquire+0x183/0x440
           _raw_spin_lock_irq+0x4d/0x90
           wait_barrier+0x4fe/0x770
           raid10_read_request+0x21f/0x760
		raid10.c:1140  wait_barrier()
		raid10.c:1204  regular_request_wait()
           raid10_make_request+0x2d6/0x2190
           md_handle_request+0x3f3/0x5b0
           md_submit_bio+0xd9/0x120
           __submit_bio+0x9d/0x100
           submit_bio_noacct_nocheck+0x1fd/0x470
           submit_bio_noacct+0x4c2/0xbb0
           submit_bio+0x3f/0xf0
           submit_bh_wbc+0x270/0x2a0
           block_read_full_folio+0x37c/0x580
           blkdev_read_folio+0x18/0x20
           filemap_read_folio+0x3f/0x110
           do_read_cache_folio+0x13b/0x2c0
           read_cache_folio+0x42/0x50
           read_part_sector+0x74/0x1c0
           read_lba+0x176/0x2a0
           efi_partition+0x1ce/0xdd0
           bdev_disk_changed+0x2e7/0x6a0
           blkdev_get_whole+0xd2/0x140
           blkdev_get_by_dev.part.0+0x37f/0x570
           blkdev_get_by_dev+0x51/0x60
           disk_scan_partitions+0xad/0xf0
           blkdev_common_ioctl+0x3f3/0xdf0
           blkdev_ioctl+0x1e1/0x450
           __x64_sys_ioctl+0xc0/0x100
           do_syscall_64+0x35/0x80
           entry_SYSCALL_64_after_hwframe+0x46/0xb0

    other info that might help us debug this:

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&____s->seqcount#11);
                                   lock(&(&conf->resync_lock)->lock);
                                   lock(&____s->seqcount#11);
      lock(&(&conf->resync_lock)->lock);

     *** DEADLOCK ***

    2 locks held by systemd-udevd/292:
     #0: ffff88817a532528 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev.part.0+0x180/0x570
     #1: ffff88817b644120 (&____s->seqcount#11){+.+.}-{0:0}, at: raid10_read_request+0x21f/0x760

    stack backtrace:
    CPU: 3 PID: 292 Comm: systemd-udevd Not tainted 6.0.0-rc2-eid-vmlocalyes-dbg-00027-gcd6aa5181bbb #2600
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x5a/0x74
     dump_stack+0x10/0x12
     print_circular_bug.cold+0x146/0x14b
     check_noncircular+0x1ff/0x250
     __lock_acquire+0x1cb4/0x3170
     lock_acquire+0x183/0x440
     _raw_spin_lock_irq+0x4d/0x90
     wait_barrier+0x4fe/0x770
     raid10_read_request+0x21f/0x760
     raid10_make_request+0x2d6/0x2190
     md_handle_request+0x3f3/0x5b0
     md_submit_bio+0xd9/0x120
     __submit_bio+0x9d/0x100
     submit_bio_noacct_nocheck+0x1fd/0x470
     submit_bio_noacct+0x4c2/0xbb0
     submit_bio+0x3f/0xf0
     submit_bh_wbc+0x270/0x2a0
     block_read_full_folio+0x37c/0x580
     blkdev_read_folio+0x18/0x20
     filemap_read_folio+0x3f/0x110
     do_read_cache_folio+0x13b/0x2c0
     read_cache_folio+0x42/0x50
     read_part_sector+0x74/0x1c0
     read_lba+0x176/0x2a0
     efi_partition+0x1ce/0xdd0
     bdev_disk_changed+0x2e7/0x6a0
     blkdev_get_whole+0xd2/0x140
     blkdev_get_by_dev.part.0+0x37f/0x570
     blkdev_get_by_dev+0x51/0x60
     disk_scan_partitions+0xad/0xf0
     blkdev_common_ioctl+0x3f3/0xdf0
     blkdev_ioctl+0x1e1/0x450
     __x64_sys_ioctl+0xc0/0x100
     do_syscall_64+0x35/0x80
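
For the second report, the chain is easier to read with the internals of
write_seqlock_irq() in mind: the seqlock's own spinlock is taken before the
seqcount write section is entered (simplified sketch below, paraphrasing
include/linux/seqlock.h). That is where the lock -> seqcount ordering from
raise_barrier() comes from, while wait_barrier() appears to take the internal
spinlock with the seqcount already held, which is the inversion lockdep is
reporting.

    /* Simplified, not the literal kernel source. */
    static inline void write_seqlock_irq(seqlock_t *sl)
    {
            spin_lock_irq(&sl->lock);               /* "&(&conf->resync_lock)->lock" above */
            write_seqcount_begin(&sl->seqcount);    /* "&____s->seqcount#11" above */
    }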

Thread overview: 28+ messages
2022-08-29 13:14 [PATCH -next 0/3] md/raid10: reduce lock contention for io Yu Kuai
2022-08-29 13:15 ` [PATCH -next 1/3] md/raid10: fix improper BUG_ON() in raise_barrier() Yu Kuai
2022-08-29 19:53   ` John Stoffel
2022-08-30  1:01     ` Yu Kuai
2022-08-30  6:32     ` Paul Menzel
2022-08-29 13:15 ` [PATCH -next 2/3] md/raid10: convert resync_lock to use seqlock Yu Kuai
2022-09-01 18:41   ` Logan Gunthorpe [this message]
2022-09-02  0:49     ` Guoqing Jiang
2022-09-02  0:56       ` Logan Gunthorpe
2022-09-02  1:00         ` Guoqing Jiang
2022-09-02  1:21     ` Yu Kuai
2022-09-02  8:14       ` Yu Kuai
2022-09-02 17:03         ` Logan Gunthorpe
2022-09-03  6:07           ` Yu Kuai
2022-09-02  9:42   ` Guoqing Jiang
2022-09-02 10:02     ` Yu Kuai
2022-09-02 10:16       ` Guoqing Jiang
2022-09-02 10:53         ` Yu Kuai
2022-08-29 13:15 ` [PATCH -next 3/3] md/raid10: prevent unnecessary calls to wake_up() in fast path Yu Kuai
2022-08-29 13:40 ` [PATCH -next 0/3] md/raid10: reduce lock contention for io Guoqing Jiang
2022-08-31 11:55   ` Yu Kuai
2022-08-29 13:58 ` Paul Menzel
2022-08-30  1:09   ` Yu Kuai
2022-08-31 11:59     ` Paul Menzel
2022-08-31 12:07       ` Yu Kuai
2022-08-31 18:00 ` Song Liu
2022-09-03  6:08   ` Yu Kuai
2022-09-09 14:45     ` Song Liu
