From: Song Liu <song@kernel.org>
To: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: [PATCH] md: don't unregister sync_thread with reconfig_mutex held
Date: Wed, 10 Feb 2021 23:28:10 -0800
Message-ID: <CAPhsuW5ZU2fpP1smSodKWFCqLu4J91sWqY6DC7ppQ=3VvJM+eQ@mail.gmail.com>
In-Reply-To: <1612923676-18294-1-git-send-email-guoqing.jiang@cloud.ionos.com>

On Tue, Feb 9, 2021 at 6:22 PM Guoqing Jiang
<guoqing.jiang@cloud.ionos.com> wrote:
>
> Unregistering sync_thread doesn't need to hold reconfig_mutex since it
> doesn't reconfigure the array.
>
> Holding it can also cause a deadlock for raid5, as follows:
>
> 1. Process A tries to reap the sync thread with reconfig_mutex held after
>    echoing idle to sync_action.
> 2. The raid5 sync thread is blocked if there are too many active stripes.
> 3. SB_CHANGE_PENDING is set (because write IO comes from the upper layer),
>    which prevents the number of active stripes from decreasing.
> 4. SB_CHANGE_PENDING can't be cleared since md_check_recovery is not able
>    to take reconfig_mutex.
>
> More details in the link:
> https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t
>
> Reported-and-tested-by: Donald Buczek <buczek@molgen.mpg.de>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>

Thanks for debugging the issue. However, I am not sure whether this is
the proper fix. For example, would this break dm-raid.c:raid_message()?
IIUC, raid_message() calls md_reap_sync_thread() without holding
reconfig_mutex, no?
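
To make the concern concrete, here is a minimal userspace sketch using
pthreads, not kernel code. struct array_sketch, array_init() and
reap_sketch() are hypothetical stand-ins for struct mddev and the patched
md_reap_sync_thread(); an error-checking mutex is assumed so that the bad
unlock is reported instead of being undefined behaviour. In the kernel,
unlocking a mutex the caller does not hold is a bug rather than a clean
error return.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for struct mddev; only the lock is modelled. */
struct array_sketch {
	pthread_mutex_t reconfig_mutex;
};

/* Initialise the lock as an error-checking mutex so misuse is reported. */
static int array_init(struct array_sketch *a)
{
	pthread_mutexattr_t attr;
	int err = pthread_mutexattr_init(&attr);

	if (err)
		return err;
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&a->reconfig_mutex, &attr);
	pthread_mutexattr_destroy(&attr);
	return err;
}

/*
 * Models the patched reap path: drop the lock, wait for the sync thread
 * (omitted here), then retake the lock. Returns 0 on success, or the
 * error from the unlock when the caller did not hold the lock.
 */
static int reap_sketch(struct array_sketch *a)
{
	int err = pthread_mutex_unlock(&a->reconfig_mutex);

	if (err)
		return err;	/* caller never held reconfig_mutex */
	/* the wait for the sync thread would happen here */
	return pthread_mutex_lock(&a->reconfig_mutex);
}
```

Calling reap_sketch() with the lock held succeeds and returns with the
lock held again. Calling it without the lock, which is what the
raid_message() path would do if my reading is right, fails at the unlock
with EPERM.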

Thanks,
Song

> ---
>  drivers/md/md.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ca40942..eec8c27 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -9365,13 +9365,18 @@ void md_check_recovery(struct mddev *mddev)
>  EXPORT_SYMBOL(md_check_recovery);
>
>  void md_reap_sync_thread(struct mddev *mddev)
> +       __releases(&mddev->reconfig_mutex)
> +       __acquires(&mddev->reconfig_mutex)
> +
>  {
>         struct md_rdev *rdev;
>         sector_t old_dev_sectors = mddev->dev_sectors;
>         bool is_reshaped = false;
>
>         /* resync has finished, collect result */
> +       mddev_unlock(mddev);
>         md_unregister_thread(&mddev->sync_thread);
> +       mddev_lock_nointr(mddev);
>         if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
>             !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) &&
>             mddev->degraded != mddev->raid_disks) {
> --
> 2.7.4
>
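
The unlock / wait / relock pattern in the hunk above can be sketched in
userspace as follows. This is only an analogy under stated assumptions:
struct array_sketch, sync_worker(), reap_sync_thread() and run_sketch()
are hypothetical names, and the worker here takes the lock directly,
whereas in the reported deadlock the dependency is indirect (the sync
thread waits on stripes that can only drain once md_check_recovery gets
reconfig_mutex).

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mddev. */
struct array_sketch {
	pthread_mutex_t reconfig_mutex;
	pthread_t sync_thread;
	bool progressed;	/* set by the worker once it got the lock */
};

/* Worker that cannot finish until it has taken reconfig_mutex. */
static void *sync_worker(void *arg)
{
	struct array_sketch *a = arg;

	pthread_mutex_lock(&a->reconfig_mutex);
	a->progressed = true;
	pthread_mutex_unlock(&a->reconfig_mutex);
	return NULL;
}

/*
 * Caller must hold reconfig_mutex. The lock is dropped around the join
 * so the worker can make progress, then retaken before returning.
 * Joining with the lock still held would block forever in this sketch.
 */
static void reap_sync_thread(struct array_sketch *a)
{
	pthread_mutex_unlock(&a->reconfig_mutex);
	pthread_join(a->sync_thread, NULL);
	pthread_mutex_lock(&a->reconfig_mutex);
}

/* Runs the scenario once; returns true if the worker made progress. */
static bool run_sketch(void)
{
	struct array_sketch a = { .progressed = false };
	bool ok;

	pthread_mutex_init(&a.reconfig_mutex, NULL);
	pthread_mutex_lock(&a.reconfig_mutex);
	pthread_create(&a.sync_thread, NULL, sync_worker, &a);
	reap_sync_thread(&a);
	ok = a.progressed;
	pthread_mutex_unlock(&a.reconfig_mutex);
	pthread_mutex_destroy(&a.reconfig_mutex);
	return ok;
}
```

The sketch only works because every caller of reap_sync_thread() holds
the lock on entry, which is the assumption questioned earlier in this
reply for the dm-raid path.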

Thread overview: 4+ messages
2021-02-10  2:21 [PATCH] md: don't unregister sync_thread with reconfig_mutex held Guoqing Jiang
2021-02-11  7:28 ` Song Liu [this message]
2021-02-11  8:25   ` Jack Wang
2021-02-11  9:11   ` Guoqing Jiang
