From: Yu Kuai <yukuai1@huaweicloud.com>
To: Song Liu <song@kernel.org>, AceLan Kao <acelan@gmail.com>
Cc: Guoqing Jiang <guoqing.jiang@linux.dev>,
	Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	Bagas Sanjaya <bagasdotme@gmail.com>,
	Christoph Hellwig <hch@lst.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Regressions <regressions@lists.linux.dev>,
	Linux RAID <linux-raid@vger.kernel.org>,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: Infiniate systemd loop when power off the machine with multiple MD RAIDs
Date: Mon, 28 Aug 2023 21:50:59 +0800
Message-ID: <354004ce-ad4e-5ad5-8fe6-303216647e0c@huaweicloud.com>
In-Reply-To: <CAPhsuW7XEy4q3XR389F7CUvXvJ=0JR0QkMOr4LU03avT0erAfg@mail.gmail.com>

Hi,

On 2023/08/28 13:20, Song Liu wrote:
> Hi AceLan,
> 
> Thanks for running the experiments.
> 
> On Fri, Aug 25, 2023 at 9:32 PM AceLan Kao <acelan@gmail.com> wrote:
> [...]
>>>
>>> Could you please run the following two experiments?
>>>
>>> 1. Confirm 12a6caf273240a triggers this. Specifically:
>>>     git checkout 12a6caf273240a => repros
>>>     git checkout 12a6caf273240a~1 => cannot repro
>> Yes, I'm pretty sure about this, that's my bisect result and I just
>> confirmed it again.
>> I also tried reverting 12a6caf273240a and the issue is gone.
> 
> The log doesn't match my guess. Specifically:
> 
> [  420.068142] systemd-shutdown[1]: Stopping MD /dev/md123 (9:123).
> [  420.074718] md_open:md123 openers++ = 1 by systemd-shutdow
> [  420.080787] systemd-shutdown[1]: Failed to sync MD block device
> /dev/md123, ignoring: Input/output error
> [  420.090831] md: md123 stopped.
> [  420.094465] systemd-shutdown[1]: Stopping MD /dev/md122 (9:122).
> [  420.101045] systemd-shutdown[1]: Could not stop MD /dev/md122:
> Device or resource busy

I see that:

systemd-shutdown[1]: Couldn't finalize remaining  MD devices, trying again.

Can we confirm that this is why power-off hangs?

I ask because in my VM power-off does not hang, and I get:

systemd-shutdown[1]: Could not stop MD /dev/md1: Device or resource busy
systemd-shutdown[1]: Failed to finalize MD devices, ignoring.
> 
> For a successful stop on md123, we reach the pr_info() in md_open().
> For a failed stop on md122, the kernel returns -EBUSY before that
> pr_info() in md_open(). There are some changes in md_open() in
> the past few releases, so I am not quite sure we are looking at the
> same code.

By the way, based on code review, it looks like md_open() never returns
-EBUSY, and I think the following is the only place that can return
-EBUSY before md_open() is called:

blkdev_open
  blkdev_get_by_dev
   bd_prepare_to_claim
    bd_may_claim 	-> -EBUSY

AceLan, can you apply the following debug patch on top of Song's patch
and reproduce it again? Hopefully it will confirm why stopping the array
failed with -EBUSY.

diff --git a/block/bdev.c b/block/bdev.c
index 979e28a46b98..699739223dcb 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -789,8 +789,11 @@ struct block_device *blkdev_get_by_dev(dev_t dev, blk_mode_t mode, void *holder,
         if (holder) {
                 mode |= BLK_OPEN_EXCL;
                 ret = bd_prepare_to_claim(bdev, holder, hops);
-               if (ret)
+               if (ret) {
+                       pr_warn("%s:%s bd_prepare_to_claim return %d\n",
+                               disk->disk_name, current->comm, ret);
                         goto put_blkdev;
+               }
         } else {
                 if (WARN_ON_ONCE(mode & BLK_OPEN_EXCL)) {
                         ret = -EIO;
diff --git a/block/fops.c b/block/fops.c
index eaa98a987213..2d69119c71f6 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -587,8 +587,11 @@ static int blkdev_open(struct inode *inode, struct file *filp)

         bdev = blkdev_get_by_dev(inode->i_rdev, file_to_blk_mode(filp),
                                  filp->private_data, NULL);
-       if (IS_ERR(bdev))
+       if (IS_ERR(bdev)) {
+               pr_warn("%pD:%s blkdev_get_by_dev return %ld\n",
+                       filp, current->comm, PTR_ERR(bdev));
                 return PTR_ERR(bdev);
+       }

         if (bdev_nowait(bdev))
                 filp->f_mode |= FMODE_NOWAIT;

Thanks,
Kuai

> 
> Therefore, could you please help clarify:
> 
> 1. Which base kernel are you using?
> 
> From the log, you are using 6.5-rc7-706a74159504. However,
> I think we cannot cleanly revert 12a6caf273240a on top of
> 6.5-rc7-706a74159504. Did you manually fix some issue in the
> revert? If so, could you please share the revert commit?
> 
> 2. If you are not using 6.5-rc7-706a74159504 as base kernel, which
> one are you using?
> 
> Thanks,
> Song
> 
>>
>>>
>>> 2. Try with the following change (add debug messages), which hopefully
>>>     shows which command is holding a reference on mddev->openers.
>>>
>>> Thanks,
>>> Song
>>>
>>> diff --git i/drivers/md/md.c w/drivers/md/md.c
>>> index 78be7811a89f..3e9b718b32c1 100644
>>> --- i/drivers/md/md.c
>>> +++ w/drivers/md/md.c
>>> @@ -7574,11 +7574,15 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
>>>                  if (mddev->pers && atomic_read(&mddev->openers) > 1) {
>>>                          mutex_unlock(&mddev->open_mutex);
>>>                          err = -EBUSY;
>>> +                       pr_warn("%s return -EBUSY for %s with mddev->openers = %d\n",
>>> +                               __func__, mdname(mddev), atomic_read(&mddev->openers));
>>>                          goto out;
>>>                  }
>>>                  if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
>>>                          mutex_unlock(&mddev->open_mutex);
>>>                          err = -EBUSY;
>>> +                       pr_warn("%s return -EBUSY for %s with MD_CLOSING bit set\n",
>>> +                               __func__, mdname(mddev));
>>>                          goto out;
>>>                  }
>>>                  did_set_md_closing = true;
>>> @@ -7789,6 +7793,8 @@ static int md_open(struct gendisk *disk, blk_mode_t mode)
>>>                  goto out_unlock;
>>>
>>>          atomic_inc(&mddev->openers);
>>> +       pr_info("%s:%s openers++ = %d by %s\n", __func__, mdname(mddev),
>>> +               atomic_read(&mddev->openers), current->comm);
>>>          mutex_unlock(&mddev->open_mutex);
>>>
>>>          disk_check_media_change(disk);
>>> @@ -7807,6 +7813,8 @@ static void md_release(struct gendisk *disk)
>>>
>>>          BUG_ON(!mddev);
>>>          atomic_dec(&mddev->openers);
>>> +       pr_info("%s:%s openers-- = %d by %s\n", __func__, mdname(mddev),
>>> +               atomic_read(&mddev->openers), current->comm);
>>>          mddev_put(mddev);
>>>   }
>> Strangely, I can't reproduce the issue after applying the patch.
>>
>> I tried to figure out which part affects the issue and found that when I
>> comment out the pr_info() in md_release(), the issue can be
>> reproduced.
>>
>> --
>> Chia-Lin Kao(AceLan)
>> http://blog.acelan.idv.tw/
>> E-Mail: acelan.kaoATcanonical.com (s/AT/@/)
> .
> 