All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Use MD_BROKEN for redundant arrays
@ 2021-09-17 15:18 Mariusz Tkaczyk
  2021-09-17 15:18 ` [PATCH 1/2] md, raid1, raid10: Set MD_BROKEN for RAID1 and RAID10 Mariusz Tkaczyk
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mariusz Tkaczyk @ 2021-09-17 15:18 UTC (permalink / raw)
  To: song; +Cc: linux-kernel

Hi Song,
This patchset adds usage of MD_BROKEN for each redundant level.
This should simplify IO failure stack when md device is failed and
fixes raid456 bug.

Mariusz Tkaczyk (2):
  md: Set MD_BROKEN for RAID1 and RAID10
  raid5: introduce MD_BROKEN

 drivers/md/md.c     | 16 ++++++++++------
 drivers/md/md.h     |  4 ++--
 drivers/md/raid1.c  |  1 +
 drivers/md/raid10.c |  1 +
 drivers/md/raid5.c  | 34 ++++++++++++++++------------------
 5 files changed, 30 insertions(+), 26 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] md, raid1, raid10: Set MD_BROKEN for RAID1 and RAID10
  2021-09-17 15:18 [PATCH 0/2] Use MD_BROKEN for redundant arrays Mariusz Tkaczyk
@ 2021-09-17 15:18 ` Mariusz Tkaczyk
  2021-09-17 15:18 ` [PATCH 2/2] raid5: introduce MD_BROKEN Mariusz Tkaczyk
  2021-09-17 15:32 ` [PATCH 0/2] Use MD_BROKEN for redundant arrays Tkaczyk, Mariusz
  2 siblings, 0 replies; 5+ messages in thread
From: Mariusz Tkaczyk @ 2021-09-17 15:18 UTC (permalink / raw)
  To: song; +Cc: linux-kernel

The idea of stopping all writes if devices is failed, introduced by
62f7b1989c0 ("md raid0/linear: Mark array as 'broken' and fail BIOs if
a member is gone") seems to be reasonable so use MD_BROKEN for RAID1 and
RAID10. Support in RAID456 is added in next commit.
If userspace (mdadm) forces md to fail the device (Failure state
written via sysfs), then EBUSY is expected if array will become failed.
To achieve that, check for MD_BROKEN and if is set, then return EBUSY to
be complaint with userspace.
For faulty state, handled via ioctl, let the error_handler to decide.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
---
 drivers/md/md.c     | 16 ++++++++++------
 drivers/md/md.h     |  4 ++--
 drivers/md/raid1.c  |  1 +
 drivers/md/raid10.c |  1 +
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index c322841d4edc..ac20eb2ddff7 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -926,8 +926,9 @@ static void super_written(struct bio *bio)
 		pr_err("md: %s gets error=%d\n", __func__,
 		       blk_status_to_errno(bio->bi_status));
 		md_error(mddev, rdev);
-		if (!test_bit(Faulty, &rdev->flags)
-		    && (bio->bi_opf & MD_FAILFAST)) {
+		if (!test_bit(Faulty, &rdev->flags) &&
+		     !test_bit(MD_BROKEN, &mddev->flags) &&
+		     (bio->bi_opf & MD_FAILFAST)) {
 			set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags);
 			set_bit(LastDev, &rdev->flags);
 		}
@@ -2979,7 +2980,8 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
 	int err = -EINVAL;
 	if (cmd_match(buf, "faulty") && rdev->mddev->pers) {
 		md_error(rdev->mddev, rdev);
-		if (test_bit(Faulty, &rdev->flags))
+
+		if (!test_bit(MD_BROKEN, &rdev->mddev->flags))
 			err = 0;
 		else
 			err = -EBUSY;
@@ -7974,12 +7976,14 @@ void md_error(struct mddev *mddev, struct md_rdev *rdev)
 	if (!mddev->pers || !mddev->pers->error_handler)
 		return;
 	mddev->pers->error_handler(mddev,rdev);
-	if (mddev->degraded)
+	if (mddev->degraded && !test_bit(MD_BROKEN, &mddev->flags))
 		set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
 	sysfs_notify_dirent_safe(rdev->sysfs_state);
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-	md_wakeup_thread(mddev->thread);
+	if (!test_bit(MD_BROKEN, &mddev->flags)) {
+		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+		md_wakeup_thread(mddev->thread);
+	}
 	if (mddev->event_work.func)
 		queue_work(md_misc_wq, &mddev->event_work);
 	md_new_event(mddev);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 4c96c36bd01a..e01433f3b46a 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -259,8 +259,8 @@ enum mddev_flags {
 	MD_NOT_READY,		/* do_md_run() is active, so 'array_state'
 				 * must not report that array is ready yet
 				 */
-	MD_BROKEN,              /* This is used in RAID-0/LINEAR only, to stop
-				 * I/O in case an array member is gone/failed.
+	MD_BROKEN,              /* This is used to stop I/O and mark device as
+				 * dead in case an array becomes failed.
 				 */
 };
 
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 19598bd38939..79462d860177 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1639,6 +1639,7 @@ static void raid1_error(struct mddev *mddev, struct md_rdev *rdev)
 		 */
 		conf->recovery_disabled = mddev->recovery_disabled;
 		spin_unlock_irqrestore(&conf->device_lock, flags);
+		set_bit(MD_BROKEN, &mddev->flags);
 		return;
 	}
 	set_bit(Blocked, &rdev->flags);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index aa2636582841..02a4d84b4d2e 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1964,6 +1964,7 @@ static void raid10_error(struct mddev *mddev, struct md_rdev *rdev)
 		 * Don't fail the drive, just return an IO error.
 		 */
 		spin_unlock_irqrestore(&conf->device_lock, flags);
+		set_bit(MD_BROKEN, &mddev->flags);
 		return;
 	}
 	if (test_and_clear_bit(In_sync, &rdev->flags))
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] raid5: introduce MD_BROKEN
  2021-09-17 15:18 [PATCH 0/2] Use MD_BROKEN for redundant arrays Mariusz Tkaczyk
  2021-09-17 15:18 ` [PATCH 1/2] md, raid1, raid10: Set MD_BROKEN for RAID1 and RAID10 Mariusz Tkaczyk
@ 2021-09-17 15:18 ` Mariusz Tkaczyk
  2021-09-17 15:32 ` [PATCH 0/2] Use MD_BROKEN for redundant arrays Tkaczyk, Mariusz
  2 siblings, 0 replies; 5+ messages in thread
From: Mariusz Tkaczyk @ 2021-09-17 15:18 UTC (permalink / raw)
  To: song; +Cc: linux-kernel

Raid456 module had allowed to achieve failed state, distinct from other
redundant levels. It was fixed by fb73b357fb9 ("raid5: block failing
device if raid will be failed").
This fix introduces a bug, now if raid5 fails during IO, it may result
with a hung task without completion. Faulty flag on the device is
necessary to process all requests and is checked many times, mainly in
anaylze_stripe().
Allow to set faulty flag on drive again and set MD_BROKEN if raid is
failed.

Fixes: fb73b357fb9 ("raid5: block failing device if raid will be failed")
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
---
 drivers/md/raid5.c | 34 ++++++++++++++++------------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 02ed53b20654..43e1ff43a222 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -690,6 +690,9 @@ static int has_failed(struct r5conf *conf)
 {
 	int degraded;
 
+	if (test_bit(MD_BROKEN, &conf->mddev->flags))
+		return 1;
+
 	if (conf->mddev->reshape_position == MaxSector)
 		return conf->mddev->degraded > conf->max_degraded;
 
@@ -2877,34 +2880,29 @@ static void raid5_error(struct mddev *mddev, struct md_rdev *rdev)
 	unsigned long flags;
 	pr_debug("raid456: error called\n");
 
-	spin_lock_irqsave(&conf->device_lock, flags);
-
-	if (test_bit(In_sync, &rdev->flags) &&
-	    mddev->degraded == conf->max_degraded) {
-		/*
-		 * Don't allow to achieve failed state
-		 * Don't try to recover this device
-		 */
-		conf->recovery_disabled = mddev->recovery_disabled;
-		spin_unlock_irqrestore(&conf->device_lock, flags);
-		return;
-	}
+	pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n",
+		mdname(mddev), bdevname(rdev->bdev, b));
 
+	spin_lock_irqsave(&conf->device_lock, flags);
 	set_bit(Faulty, &rdev->flags);
 	clear_bit(In_sync, &rdev->flags);
 	mddev->degraded = raid5_calc_degraded(conf);
+
+	if (has_failed(conf)) {
+		set_bit(MD_BROKEN, &mddev->flags);
+		conf->recovery_disabled = mddev->recovery_disabled;
+		pr_crit("md/raid:%s: Cannot continue on %d devices.\n",
+			mdname(mddev), conf->raid_disks - mddev->degraded);
+	} else
+		pr_crit("md/raid:%s: Operation continuing on %d devices.\n",
+			mdname(mddev), conf->raid_disks - mddev->degraded);
+
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 
 	set_bit(Blocked, &rdev->flags);
 	set_mask_bits(&mddev->sb_flags, 0,
 		      BIT(MD_SB_CHANGE_DEVS) | BIT(MD_SB_CHANGE_PENDING));
-	pr_crit("md/raid:%s: Disk failure on %s, disabling device.\n"
-		"md/raid:%s: Operation continuing on %d devices.\n",
-		mdname(mddev),
-		bdevname(rdev->bdev, b),
-		mdname(mddev),
-		conf->raid_disks - mddev->degraded);
 	r5c_update_on_rdev_error(mddev, rdev);
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/2] Use MD_BROKEN for redundant arrays
  2021-09-17 15:18 [PATCH 0/2] Use MD_BROKEN for redundant arrays Mariusz Tkaczyk
  2021-09-17 15:18 ` [PATCH 1/2] md, raid1, raid10: Set MD_BROKEN for RAID1 and RAID10 Mariusz Tkaczyk
  2021-09-17 15:18 ` [PATCH 2/2] raid5: introduce MD_BROKEN Mariusz Tkaczyk
@ 2021-09-17 15:32 ` Tkaczyk, Mariusz
  2 siblings, 0 replies; 5+ messages in thread
From: Tkaczyk, Mariusz @ 2021-09-17 15:32 UTC (permalink / raw)
  To: song; +Cc: linux-kernel

On 17.09.2021 17:18, Mariusz Tkaczyk wrote:
> Hi Song,
> This patchset adds usage of MD_BROKEN for each redundant level.
> This should simplify IO failure stack when md device is failed and
> fixes raid456 bug.
> 
> Mariusz Tkaczyk (2):
>    md: Set MD_BROKEN for RAID1 and RAID10
>    raid5: introduce MD_BROKEN
> 
>   drivers/md/md.c     | 16 ++++++++++------
>   drivers/md/md.h     |  4 ++--
>   drivers/md/raid1.c  |  1 +
>   drivers/md/raid10.c |  1 +
>   drivers/md/raid5.c  | 34 ++++++++++++++++------------------
>   5 files changed, 30 insertions(+), 26 deletions(-)
> 

Hi,
Please ignore it. I added wrong list.
Sorry for noise.

Thanks,
Mariusz

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 0/2] Use MD_BROKEN for redundant arrays
@ 2021-09-17 15:34 Mariusz Tkaczyk
  0 siblings, 0 replies; 5+ messages in thread
From: Mariusz Tkaczyk @ 2021-09-17 15:34 UTC (permalink / raw)
  To: song; +Cc: linux-raid

Hi Song,
This patchset adds usage of MD_BROKEN for each redundant level.
This should simplify IO failure stack when md device is failed and
fixes raid456 bug..

Mariusz Tkaczyk (2):
  md: Set MD_BROKEN for RAID1 and RAID10
  raid5: introduce MD_BROKEN

 drivers/md/md.c     | 16 ++++++++++------
 drivers/md/md.h     |  4 ++--
 drivers/md/raid1.c  |  1 +
 drivers/md/raid10.c |  1 +
 drivers/md/raid5.c  | 34 ++++++++++++++++------------------
 5 files changed, 30 insertions(+), 26 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-09-17 15:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-17 15:18 [PATCH 0/2] Use MD_BROKEN for redundant arrays Mariusz Tkaczyk
2021-09-17 15:18 ` [PATCH 1/2] md, raid1, raid10: Set MD_BROKEN for RAID1 and RAID10 Mariusz Tkaczyk
2021-09-17 15:18 ` [PATCH 2/2] raid5: introduce MD_BROKEN Mariusz Tkaczyk
2021-09-17 15:32 ` [PATCH 0/2] Use MD_BROKEN for redundant arrays Tkaczyk, Mariusz
2021-09-17 15:34 Mariusz Tkaczyk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.