All of lore.kernel.org
 help / color / mirror / Atom feed
From: AceLan Kao <acelan@gmail.com>
To: Song Liu <song@kernel.org>
Cc: Guoqing Jiang <guoqing.jiang@linux.dev>,
	 Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	Bagas Sanjaya <bagasdotme@gmail.com>,
	 Christoph Hellwig <hch@lst.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	 Linux Regressions <regressions@lists.linux.dev>,
	Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: Infiniate systemd loop when power off the machine with multiple MD RAIDs
Date: Mon, 28 Aug 2023 18:48:58 +0800	[thread overview]
Message-ID: <CAMz9Wg-dcBFnw-3SfsD6Lrhp4o7cVf_w7y2X7yQ_hWCg+CiRXQ@mail.gmail.com> (raw)
In-Reply-To: <CAPhsuW7XEy4q3XR389F7CUvXvJ=0JR0QkMOr4LU03avT0erAfg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4654 bytes --]

Hi Song,

Song Liu <song@kernel.org> 於 2023年8月28日 週一 下午1:21寫道:
>
> Hi AceLan,
>
> Thanks for running the experiments.
>
> On Fri, Aug 25, 2023 at 9:32 PM AceLan Kao <acelan@gmail.com> wrote:
> [...]
> > >
> > > Could you please run the follow two experiments?
> > >
> > > 1. Confirm 12a6caf273240a triggers this. Specifically:
> > >    git checkout 12a6caf273240a => repros
> > >    git checkout 12a6caf273240a~1 => cannot repro
> > Yes, I'm pretty sure about this, that's my bisect result and I just
> > confirmed it again.
> > I also tried reverting 12a6caf273240a and the issue is gone.
>
> The log doesn't match my guess. Specifically:
>
> [  420.068142] systemd-shutdown[1]: Stopping MD /dev/md123 (9:123).
> [  420.074718] md_open:md123 openers++ = 1 by systemd-shutdow
> [  420.080787] systemd-shutdown[1]: Failed to sync MD block device
> /dev/md123, ignoring: Input/output error
> [  420.090831] md: md123 stopped.
> [  420.094465] systemd-shutdown[1]: Stopping MD /dev/md122 (9:122).
> [  420.101045] systemd-shutdown[1]: Could not stop MD /dev/md122:
> Device or resource busy
>
> For a successful stop on md123, we reach the pr_info() in md_open().
> For a failed stop on md122, the kernel returns -EBUSY before that
> pr_info() in md_open(). There are some changes in md_open() in
> the past few release, so I am not quite sure we are looking at the
> same code.
>
> Therefore, could you please help clarify:
>
> 1. Which base kernel are you using?
>
> From the log, you are using 6.5-rc7-706a74159504. However,
> I think we cannot cleanly revert 12a6caf273240a on top of
> 6.5-rc7-706a74159504. Did you manually fix some issue in the
> revert? If so, could you please share the revert commit?
Yes, I'm basing 6.5.0-rc7 706a74159504 to apply your patch
706a74159504 Linux 6.5-rc7

Attached how I revert the commit.

>
> 2. If you are not using 6.5-rc7-706a74159504 as base kernel, which
> one are you using?
>
> Thanks,
> Song
>
> >
> > >
> > > 2. Try with the following change (add debug messages), which hopefully
> > >    shows which command is holding a reference on mddev->openers.
> > >
> > > Thanks,
> > > Song
> > >
> > > diff --git i/drivers/md/md.c w/drivers/md/md.c
> > > index 78be7811a89f..3e9b718b32c1 100644
> > > --- i/drivers/md/md.c
> > > +++ w/drivers/md/md.c
> > > @@ -7574,11 +7574,15 @@ static int md_ioctl(struct block_device *bdev,
> > > blk_mode_t mode,
> > >                 if (mddev->pers && atomic_read(&mddev->openers) > 1) {
> > >                         mutex_unlock(&mddev->open_mutex);
> > >                         err = -EBUSY;
> > > +                       pr_warn("%s return -EBUSY for %s with
> > > mddev->openers = %d\n",
> > > +                               __func__, mdname(mddev),
> > > atomic_read(&mddev->openers));
> > >                         goto out;
> > >                 }
> > >                 if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
> > >                         mutex_unlock(&mddev->open_mutex);
> > >                         err = -EBUSY;
> > > +                       pr_warn("%s return -EBUSY for %s with
> > > MD_CLOSING bit set\n",
> > > +                               __func__, mdname(mddev));
> > >                         goto out;
> > >                 }
> > >                 did_set_md_closing = true;
> > > @@ -7789,6 +7793,8 @@ static int md_open(struct gendisk *disk, blk_mode_t mode)
> > >                 goto out_unlock;
> > >
> > >         atomic_inc(&mddev->openers);
> > > +       pr_info("%s:%s openers++ = %d by %s\n", __func__, mdname(mddev),
> > > +               atomic_read(&mddev->openers), current->comm);
> > >         mutex_unlock(&mddev->open_mutex);
> > >
> > >         disk_check_media_change(disk);
> > > @@ -7807,6 +7813,8 @@ static void md_release(struct gendisk *disk)
> > >
> > >         BUG_ON(!mddev);
> > >         atomic_dec(&mddev->openers);
> > > +       pr_info("%s:%s openers-- = %d by %s\n", __func__, mdname(mddev),
> > > +               atomic_read(&mddev->openers), current->comm);
> > >         mddev_put(mddev);
> > >  }
> > It's pretty strange that I can't reproduce the issue after applied the patch.
> >
> > I tried to figure out which part affect the issue and found when I
> > comment out the pr_info() In md_release(), the issue could be
> > reproduced.
> >
> > --
> > Chia-Lin Kao(AceLan)
> > http://blog.acelan.idv.tw/
> > E-Mail: acelan.kaoATcanonical.com (s/AT/@/)



-- 
Chia-Lin Kao(AceLan)
http://blog.acelan.idv.tw/
E-Mail: acelan.kaoATcanonical.com (s/AT/@/)

[-- Attachment #2: 0001-Revert-md-only-delete-entries-from-all_mddevs-when-t.patch --]
[-- Type: text/x-patch, Size: 5304 bytes --]

From 58dd106db88f00b08d20d4a949a001629238dbc0 Mon Sep 17 00:00:00 2001
From: "Chia-Lin Kao (AceLan)" <acelan.kao@canonical.com>
Date: Mon, 28 Aug 2023 02:34:08 -0400
Subject: [PATCH] Revert "md: only delete entries from all_mddevs when the disk
 is freed"

This reverts commit 12a6caf273240ae42842de8cc05feaa86e582d61.
---
 drivers/md/md.c | 52 ++++++++++++++++---------------------------------
 drivers/md/md.h |  2 --
 2 files changed, 17 insertions(+), 37 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 78be7811a89f..af5d379ad357 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -606,10 +606,6 @@ EXPORT_SYMBOL(md_flush_request);
 
 static inline struct mddev *mddev_get(struct mddev *mddev)
 {
-	lockdep_assert_held(&all_mddevs_lock);
-
-	if (test_bit(MD_DELETED, &mddev->flags))
-		return NULL;
 	atomic_inc(&mddev->active);
 	return mddev;
 }
@@ -624,7 +620,7 @@ void mddev_put(struct mddev *mddev)
 	    mddev->ctime == 0 && !mddev->hold_active) {
 		/* Array is not configured at all, and not held active,
 		 * so destroy it */
-		set_bit(MD_DELETED, &mddev->flags);
+		list_del_init(&mddev->all_mddevs);
 
 		/*
 		 * Call queue_work inside the spinlock so that
@@ -3285,8 +3281,6 @@ static bool md_rdev_overlaps(struct md_rdev *rdev)
 
 	spin_lock(&all_mddevs_lock);
 	list_for_each_entry(mddev, &all_mddevs, all_mddevs) {
-		if (test_bit(MD_DELETED, &mddev->flags))
-			continue;
 		rdev_for_each(rdev2, mddev) {
 			if (rdev != rdev2 && rdev->bdev == rdev2->bdev &&
 			    md_rdevs_overlap(rdev, rdev2)) {
@@ -5488,10 +5482,11 @@ md_attr_show(struct kobject *kobj, struct attribute *attr, char *page)
 	if (!entry->show)
 		return -EIO;
 	spin_lock(&all_mddevs_lock);
-	if (!mddev_get(mddev)) {
+	if (list_empty(&mddev->all_mddevs)) {
 		spin_unlock(&all_mddevs_lock);
 		return -EBUSY;
 	}
+	mddev_get(mddev);
 	spin_unlock(&all_mddevs_lock);
 
 	rv = entry->show(mddev, page);
@@ -5512,10 +5507,11 @@ md_attr_store(struct kobject *kobj, struct attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 	spin_lock(&all_mddevs_lock);
-	if (!mddev_get(mddev)) {
+	if (list_empty(&mddev->all_mddevs)) {
 		spin_unlock(&all_mddevs_lock);
 		return -EBUSY;
 	}
+	mddev_get(mddev);
 	spin_unlock(&all_mddevs_lock);
 	rv = entry->store(mddev, page, length);
 	mddev_put(mddev);
@@ -7826,6 +7822,7 @@ static void md_free_disk(struct gendisk *disk)
 	struct mddev *mddev = disk->private_data;
 
 	percpu_ref_exit(&mddev->writes_pending);
+
 	mddev_free(mddev);
 }
 
@@ -8159,8 +8156,7 @@ static void *md_seq_start(struct seq_file *seq, loff_t *pos)
 	list_for_each(tmp,&all_mddevs)
 		if (!l--) {
 			mddev = list_entry(tmp, struct mddev, all_mddevs);
-			if (!mddev_get(mddev))
-				continue;
+			mddev_get(mddev);
 			spin_unlock(&all_mddevs_lock);
 			return mddev;
 		}
@@ -8174,35 +8170,25 @@ static void *md_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 {
 	struct list_head *tmp;
 	struct mddev *next_mddev, *mddev = v;
-	struct mddev *to_put = NULL;
 
 	++*pos;
 	if (v == (void*)2)
 		return NULL;
 
 	spin_lock(&all_mddevs_lock);
-	if (v == (void*)1) {
+	if (v == (void*)1)
 		tmp = all_mddevs.next;
-	} else {
-		to_put = mddev;
-		tmp = mddev->all_mddevs.next;
-	}
-
-	for (;;) {
-		if (tmp == &all_mddevs) {
-			next_mddev = (void*)2;
-			*pos = 0x10000;
-			break;
-		}
-		next_mddev = list_entry(tmp, struct mddev, all_mddevs);
-		if (mddev_get(next_mddev))
-			break;
-		mddev = next_mddev;
+	else
 		tmp = mddev->all_mddevs.next;
+	if (tmp != &all_mddevs)
+		next_mddev = mddev_get(list_entry(tmp,struct mddev,all_mddevs));
+	else {
+		next_mddev = (void*)2;
+		*pos = 0x10000;
 	}
 	spin_unlock(&all_mddevs_lock);
 
-	if (to_put)
+	if (v != (void*)1)
 		mddev_put(mddev);
 	return next_mddev;
 
@@ -8772,8 +8758,6 @@ void md_do_sync(struct md_thread *thread)
 			goto skip;
 		spin_lock(&all_mddevs_lock);
 		list_for_each_entry(mddev2, &all_mddevs, all_mddevs) {
-			if (test_bit(MD_DELETED, &mddev2->flags))
-				continue;
 			if (mddev2 == mddev)
 				continue;
 			if (!mddev->parallel_resync
@@ -9577,8 +9561,7 @@ static int md_notify_reboot(struct notifier_block *this,
 
 	spin_lock(&all_mddevs_lock);
 	list_for_each_entry_safe(mddev, n, &all_mddevs, all_mddevs) {
-		if (!mddev_get(mddev))
-			continue;
+		mddev_get(mddev);
 		spin_unlock(&all_mddevs_lock);
 		if (mddev_trylock(mddev)) {
 			if (mddev->pers)
@@ -9934,8 +9917,7 @@ static __exit void md_exit(void)
 
 	spin_lock(&all_mddevs_lock);
 	list_for_each_entry_safe(mddev, n, &all_mddevs, all_mddevs) {
-		if (!mddev_get(mddev))
-			continue;
+		mddev_get(mddev);
 		spin_unlock(&all_mddevs_lock);
 		export_array(mddev);
 		mddev->ctime = 0;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 1aef86bf3fc3..4e265e38c229 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -252,7 +252,6 @@ struct md_cluster_info;
  * @MD_NOT_READY: do_md_run() is active, so 'array_state', ust not report that
  *		   array is ready yet.
  * @MD_BROKEN: This is used to stop writes and mark array as failed.
- * @MD_DELETED: This device is being deleted
  *
  * change UNSUPPORTED_MDDEV_FLAGS for each array type if new flag is added
  */
@@ -269,7 +268,6 @@ enum mddev_flags {
 	MD_UPDATING_SB,
 	MD_NOT_READY,
 	MD_BROKEN,
-	MD_DELETED,
 };
 
 enum mddev_sb_flags {
-- 
2.37.2


  reply	other threads:[~2023-08-28 10:49 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-16  9:37 Fwd: Infiniate systemd loop when power off the machine with multiple MD RAIDs Bagas Sanjaya
2023-08-18  8:16 ` Mariusz Tkaczyk
2023-08-18  9:21   ` Hannes Reinecke
2023-08-21  3:23     ` AceLan Kao
2023-08-22  3:51   ` Guoqing Jiang
2023-08-22  6:17     ` Song Liu
2023-08-22  6:39       ` Mariusz Tkaczyk
2023-08-22  8:13         ` AceLan Kao
2023-08-22 12:41           ` Guoqing Jiang
2023-08-23  8:02             ` AceLan Kao
2023-08-23 13:25               ` Song Liu
2023-08-26  4:31                 ` AceLan Kao
2023-08-28  5:20                   ` Song Liu
2023-08-28 10:48                     ` AceLan Kao [this message]
2023-08-29  3:12                       ` AceLan Kao
2023-08-28 13:50                     ` Yu Kuai
2023-08-31  2:28                       ` Yu Kuai
2023-08-31  6:50                         ` Mariusz Tkaczyk
2023-09-06  6:26                           ` AceLan Kao
2023-09-06 10:27                             ` Mariusz Tkaczyk
2023-09-07  2:04                               ` Yu Kuai
2023-09-07 10:18                                 ` Mariusz Tkaczyk
2023-09-07 11:26                                   ` Yu Kuai
2023-09-07 12:14                                     ` Yu Kuai
2023-09-07 12:41                                       ` Mariusz Tkaczyk
2023-09-07 12:53                                         ` Yu Kuai
2023-09-07 15:09                                           ` Mariusz Tkaczyk
2023-09-08 20:25                                             ` Song Liu
2023-08-21 13:18 ` Fwd: " Yu Kuai
2023-08-22  1:39   ` AceLan Kao
2023-08-22 18:56 ` Song Liu
2023-08-22 19:13   ` Carlos Carvalho
2023-08-23  1:28     ` Yu Kuai
2023-08-23  6:04       ` Hannes Reinecke

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMz9Wg-dcBFnw-3SfsD6Lrhp4o7cVf_w7y2X7yQ_hWCg+CiRXQ@mail.gmail.com \
    --to=acelan@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=guoqing.jiang@linux.dev \
    --cc=hch@lst.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=mariusz.tkaczyk@linux.intel.com \
    --cc=regressions@lists.linux.dev \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.