* [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
@ 2019-08-16 13:40 ` Guilherme G. Piccoli
  0 siblings, 0 replies; 30+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-16 13:40 UTC (permalink / raw)
  To: linux-raid
  Cc: Song Liu, gpiccoli, NeilBrown, linux-block, dm-devel, jay.vosburgh

Currently, if an md raid0/linear array has one or more members removed while
mounted, the kernel keeps showing the state 'clean' in the 'array_state'
sysfs attribute. Although udev signals that the member device is gone, 'mdadm'
cannot issue the STOP_ARRAY ioctl successfully, since the array is mounted.

Nothing else hints that something is wrong (except that the removed devices
don't show properly in the output of the 'mdadm detail' command). There is no
other property to check, and if the user is not performing reads/writes to
the array, even the kernel log is quiet and gives no clue about the missing
member.

This patch changes that behavior: when 'array_state' is read, we perform an
inexpensive check (only for raid0/md-linear) that compares the total number
of disks at assembly time against the gendisk flags of those devices, to
validate that all members are available and functional. A new array state,
'broken', is added: it mimics the state 'clean' in every aspect and is
useful only to distinguish whether such an array has a member missing. We
also print a rate-limited warning to the kernel log in that case.

This patch has no functional change other than adding the 'clean'-like
state; it was tested with the ext4 and xfs filesystems. It requires an
'mdadm' counterpart to handle the 'broken' state.

Cc: NeilBrown <neilb@suse.com>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
---

Cover-letter from v1:
lore.kernel.org/linux-block/20190729203135.12934-1-gpiccoli@canonical.com

v1 -> v2:
* Added handling for md/linear 'broken' state;
* Check for is_missing_dev() instead of personality (thanks Neil for
the suggestion);
* Changed is_missing_dev() handlers to static;
* Print a rate-limited warning in case more members go away, not only the
first.

Song, after thinking and experimenting a bit more, I don't think it makes
sense to allow setting the 'broken' state, unless we refactor and
"re-purpose" this state. What 'broken' really means is a 'clean' state with
1+ members missing; setting it would be the same as setting 'clean', so it'd
be counter-intuitive to set 'broken' but in fact have set 'clean' merely
under the name 'broken'.

The purpose of 'broken' is really just to indicate a missing member, as
noticed by the kernel during array_state reads.
If you really think we should use 'broken' for other purposes, and hence
that users should be allowed to set it, let's discuss and rewrite it.
Personally, I think the less we change, the better. We just need some way
to warn users about the missing members of their arrays.
Thanks,

Guilherme


 drivers/md/md-linear.c | 25 +++++++++++++++++++++++++
 drivers/md/md.c        | 22 ++++++++++++++++++----
 drivers/md/md.h        |  2 ++
 drivers/md/raid0.c     | 26 ++++++++++++++++++++++++++
 4 files changed, 71 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md-linear.c b/drivers/md/md-linear.c
index ed2297541414..a404f700e3f8 100644
--- a/drivers/md/md-linear.c
+++ b/drivers/md/md-linear.c
@@ -237,6 +237,30 @@ static void linear_free(struct mddev *mddev, void *priv)
 	kfree(conf);
 }
 
+static bool linear_is_missing_dev(struct mddev *mddev)
+{
+	struct md_rdev *rdev;
+	static int already_missing;
+	int def_disks, work_disks = 0;
+
+	def_disks = mddev->raid_disks;
+	rdev_for_each(rdev, mddev)
+		if (rdev->bdev->bd_disk->flags & GENHD_FL_UP)
+			work_disks++;
+
+	if (unlikely(def_disks - work_disks)) {
+		if (already_missing < (def_disks - work_disks)) {
+			already_missing = def_disks - work_disks;
+			pr_warn("md: %s: linear array has %d missing/failed members\n",
+				mdname(mddev), already_missing);
+		}
+		return true;
+	}
+
+	already_missing = 0;
+	return false;
+}
+
 static bool linear_make_request(struct mddev *mddev, struct bio *bio)
 {
 	char b[BDEVNAME_SIZE];
@@ -325,6 +349,7 @@ static struct md_personality linear_personality =
 	.size		= linear_size,
 	.quiesce	= linear_quiesce,
 	.congested	= linear_congested,
+	.is_missing_dev	= linear_is_missing_dev,
 };
 
 static int __init linear_init (void)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ba4de55eea13..0ef93d21cab6 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4160,12 +4160,17 @@ __ATTR_PREALLOC(resync_start, S_IRUGO|S_IWUSR,
  * active-idle
  *     like active, but no writes have been seen for a while (100msec).
  *
+ * broken
+ *     RAID0/LINEAR-only: same as clean, but array is missing a member.
+ *     It's useful because RAID0/LINEAR mounted-arrays aren't stopped
+ *     when a member is gone, so this state will at least alert the
+ *     user that something is wrong.
  */
 enum array_state { clear, inactive, suspended, readonly, read_auto, clean, active,
-		   write_pending, active_idle, bad_word};
+		   write_pending, active_idle, broken, bad_word};
 static char *array_states[] = {
 	"clear", "inactive", "suspended", "readonly", "read-auto", "clean", "active",
-	"write-pending", "active-idle", NULL };
+	"write-pending", "active-idle", "broken", NULL };
 
 static int match_word(const char *word, char **list)
 {
@@ -4181,7 +4186,7 @@ array_state_show(struct mddev *mddev, char *page)
 {
 	enum array_state st = inactive;
 
-	if (mddev->pers)
+	if (mddev->pers) {
 		switch(mddev->ro) {
 		case 1:
 			st = readonly;
@@ -4201,7 +4206,15 @@ array_state_show(struct mddev *mddev, char *page)
 				st = active;
 			spin_unlock(&mddev->lock);
 		}
-	else {
+
+		if ((mddev->pers->is_missing_dev) &&
+		   ((st == clean) || (st == broken))) {
+			if (mddev->pers->is_missing_dev(mddev))
+				st = broken;
+			else
+				st = clean;
+		}
+	} else {
 		if (list_empty(&mddev->disks) &&
 		    mddev->raid_disks == 0 &&
 		    mddev->dev_sectors == 0)
@@ -4315,6 +4328,7 @@ array_state_store(struct mddev *mddev, const char *buf, size_t len)
 		break;
 	case write_pending:
 	case active_idle:
+	case broken:
 		/* these cannot be set */
 		break;
 	}
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 5d7c1cad4946..2b40990c7642 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -590,6 +590,8 @@ struct md_personality
 	int (*congested)(struct mddev *mddev, int bits);
 	/* Changes the consistency policy of an active array. */
 	int (*change_consistency_policy)(struct mddev *mddev, const char *buf);
+	/* Check if there are any missing/failed members - RAID0/LINEAR only. */
+	bool (*is_missing_dev)(struct mddev *mddev);
 };
 
 struct md_sysfs_entry {
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index ef896ee7198b..f7144be96029 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -455,6 +455,31 @@ static inline int is_io_in_chunk_boundary(struct mddev *mddev,
 	}
 }
 
+static bool raid0_is_missing_dev(struct mddev *mddev)
+{
+	struct md_rdev *rdev;
+	static int already_missing;
+	int def_disks, work_disks = 0;
+	struct r0conf *conf = mddev->private;
+
+	def_disks = conf->strip_zone[0].nb_dev;
+	rdev_for_each(rdev, mddev)
+		if (rdev->bdev->bd_disk->flags & GENHD_FL_UP)
+			work_disks++;
+
+	if (unlikely(def_disks - work_disks)) {
+		if (already_missing < (def_disks - work_disks)) {
+			already_missing = def_disks - work_disks;
+			pr_warn("md: %s: raid0 array has %d missing/failed members\n",
+				mdname(mddev), already_missing);
+		}
+		return true;
+	}
+
+	already_missing = 0;
+	return false;
+}
+
 static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
 {
 	struct r0conf *conf = mddev->private;
@@ -788,6 +813,7 @@ static struct md_personality raid0_personality=
 	.takeover	= raid0_takeover,
 	.quiesce	= raid0_quiesce,
 	.congested	= raid0_congested,
+	.is_missing_dev	= raid0_is_missing_dev,
 };
 
 static int __init raid0_init (void)
-- 
2.22.0


* [PATCH v2 2/2] mdadm: Introduce new array state 'broken' for raid0/linear
  2019-08-16 13:40 ` Guilherme G. Piccoli
@ 2019-08-16 13:41   ` Guilherme G. Piccoli
  -1 siblings, 0 replies; 30+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-16 13:41 UTC (permalink / raw)
  To: linux-raid
  Cc: Song Liu, gpiccoli, NeilBrown, linux-block, dm-devel, jay.vosburgh

Currently, if an md raid0/linear array has one or more members removed while
mounted, the kernel keeps showing the state 'clean' in the 'array_state'
sysfs attribute. Although udev signals that the member device is gone, 'mdadm'
cannot issue the STOP_ARRAY ioctl successfully, since the array is mounted.

Nothing else hints that something is wrong (except that the removed devices
don't show properly in the output of the mdadm 'detail' command). There is
no other property to check, and if the user is not performing reads/writes
to the array, even the kernel log is quiet and gives no clue about the
missing member.

This patch is the mdadm counterpart of the kernel's new array state
'broken'. The 'broken' state mimics the state 'clean' in every aspect and is
useful only to distinguish whether an array has a member missing. All the
necessary paths in mdadm were changed to deal with the 'broken' state, and
if the tool runs on a kernel that is not updated, it works normally, i.e.,
it doesn't require the 'broken' state in order to work.
Also, this patch changes the way the array state is shown by the 'detail'
command (for raid0/linear only): it now takes the 'array_state' sysfs
attribute into account instead of relying only on the MD_SB_CLEAN flag.

Cc: NeilBrown <neilb@suse.com>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
---

v1 -> v2:
* Added handling for md/linear 'broken' state.


 Detail.c  | 17 +++++++++++++++--
 Monitor.c |  9 +++++++--
 maps.c    |  1 +
 mdadm.h   |  1 +
 mdmon.h   |  2 +-
 monitor.c |  4 ++--
 6 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/Detail.c b/Detail.c
index ad60434..cc7e9f1 100644
--- a/Detail.c
+++ b/Detail.c
@@ -81,6 +81,7 @@ int Detail(char *dev, struct context *c)
 	int external;
 	int inactive;
 	int is_container = 0;
+	char arrayst[12] = { 0 }; /* no state is >10 chars currently */
 
 	if (fd < 0) {
 		pr_err("cannot open %s: %s\n",
@@ -485,9 +486,21 @@ int Detail(char *dev, struct context *c)
 			else
 				st = ", degraded";
 
+			if (array.state & (1 << MD_SB_CLEAN)) {
+				if ((array.level == 0) ||
+				    (array.level == LEVEL_LINEAR))
+					strncpy(arrayst,
+						map_num(sysfs_array_states,
+							sra->array_state),
+						sizeof(arrayst)-1);
+				else
+					strncpy(arrayst, "clean",
+						sizeof(arrayst)-1);
+			} else
+				strncpy(arrayst, "active", sizeof(arrayst)-1);
+
 			printf("             State : %s%s%s%s%s%s \n",
-			       (array.state & (1 << MD_SB_CLEAN)) ?
-			       "clean" : "active", st,
+			       arrayst, st,
 			       (!e || (e->percent < 0 &&
 				       e->percent != RESYNC_PENDING &&
 				       e->percent != RESYNC_DELAYED)) ?
diff --git a/Monitor.c b/Monitor.c
index 036103f..9fd5406 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -1055,8 +1055,12 @@ int Wait(char *dev)
 	}
 }
 
+/* The state "broken" is used only for RAID0/LINEAR - it's the same as
+ * "clean", but used in case the array has one or more members missing.
+ */
+#define CLEAN_STATES_LAST_POS	5
 static char *clean_states[] = {
-	"clear", "inactive", "readonly", "read-auto", "clean", NULL };
+	"clear", "inactive", "readonly", "read-auto", "clean", "broken", NULL };
 
 int WaitClean(char *dev, int verbose)
 {
@@ -1116,7 +1120,8 @@ int WaitClean(char *dev, int verbose)
 			rv = read(state_fd, buf, sizeof(buf));
 			if (rv < 0)
 				break;
-			if (sysfs_match_word(buf, clean_states) <= 4)
+			if (sysfs_match_word(buf, clean_states)
+			    <= CLEAN_STATES_LAST_POS)
 				break;
 			rv = sysfs_wait(state_fd, &delay);
 			if (rv < 0 && errno != EINTR)
diff --git a/maps.c b/maps.c
index 02a0474..49b7f2c 100644
--- a/maps.c
+++ b/maps.c
@@ -150,6 +150,7 @@ mapping_t sysfs_array_states[] = {
 	{ "read-auto", ARRAY_READ_AUTO },
 	{ "clean", ARRAY_CLEAN },
 	{ "write-pending", ARRAY_WRITE_PENDING },
+	{ "broken", ARRAY_BROKEN },
 	{ NULL, ARRAY_UNKNOWN_STATE }
 };
 
diff --git a/mdadm.h b/mdadm.h
index 43b07d5..c88ceab 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -373,6 +373,7 @@ struct mdinfo {
 		ARRAY_ACTIVE,
 		ARRAY_WRITE_PENDING,
 		ARRAY_ACTIVE_IDLE,
+		ARRAY_BROKEN,
 		ARRAY_UNKNOWN_STATE,
 	} array_state;
 	struct md_bb bb;
diff --git a/mdmon.h b/mdmon.h
index 818367c..b3d72ac 100644
--- a/mdmon.h
+++ b/mdmon.h
@@ -21,7 +21,7 @@
 extern const char Name[];
 
 enum array_state { clear, inactive, suspended, readonly, read_auto,
-		   clean, active, write_pending, active_idle, bad_word};
+		   clean, active, write_pending, active_idle, broken, bad_word};
 
 enum sync_action { idle, reshape, resync, recover, check, repair, bad_action };
 
diff --git a/monitor.c b/monitor.c
index 81537ed..e0d3be6 100644
--- a/monitor.c
+++ b/monitor.c
@@ -26,7 +26,7 @@
 
 static char *array_states[] = {
 	"clear", "inactive", "suspended", "readonly", "read-auto",
-	"clean", "active", "write-pending", "active-idle", NULL };
+	"clean", "active", "write-pending", "active-idle", "broken", NULL };
 static char *sync_actions[] = {
 	"idle", "reshape", "resync", "recover", "check", "repair", NULL
 };
@@ -476,7 +476,7 @@ static int read_and_act(struct active_array *a, fd_set *fds)
 		a->next_state = clean;
 		ret |= ARRAY_DIRTY;
 	}
-	if (a->curr_state == clean) {
+	if ((a->curr_state == clean) || (a->curr_state == broken)) {
 		a->container->ss->set_array_state(a, 1);
 	}
 	if (a->curr_state == active ||
-- 
2.22.0


* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-16 13:40 ` Guilherme G. Piccoli
@ 2019-08-19 18:10   ` Song Liu
  -1 siblings, 0 replies; 30+ messages in thread
From: Song Liu @ 2019-08-19 18:10 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh

On Fri, Aug 16, 2019 at 6:41 AM Guilherme G. Piccoli
<gpiccoli@canonical.com> wrote:
>
> Currently if md raid0/linear array gets one or more members removed while
> being mounted, kernel keeps showing state 'clean' in the 'array_state'
> sysfs attribute. Despite udev signaling the member device is gone, 'mdadm'
> cannot issue the STOP_ARRAY ioctl successfully, given the array is mounted.
>
> Nothing else hints that something is wrong (except that the removed devices
> don't show properly in the output of 'mdadm detail' command). There is no
> other property to be checked, and if user is not performing reads/writes
> to the array, even kernel log is quiet and doesn't give a clue about the
> missing member.
>
> This patch changes this behavior; when 'array_state' is read we introduce
> a non-expensive check (only for raid0/md-linear) that relies in the
> comparison of the total number of disks when array was assembled with
> gendisk flags of those devices to validate if all members are available
> and functional. A new array state 'broken' was added: it mimics the state
> 'clean' in every aspect, being useful only to distinguish if such array
> has some member missing. Also, we show a rate-limited warning in kernel
> log in such case.
>
> This patch has no proper functional change other than adding the
> 'clean'-like state; it was tested with ext4 and xfs filesystems. It
> requires a 'mdadm' counterpart to handle the 'broken' state.
>
> Cc: NeilBrown <neilb@suse.com>
> Cc: Song Liu <songliubraving@fb.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>

If we merge this with the MD_BROKEN patch, would the code look simpler?

Thanks,
Song


* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-19 18:10   ` Song Liu
@ 2019-08-19 18:30     ` Guilherme G. Piccoli
  -1 siblings, 0 replies; 30+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-19 18:30 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh

On 19/08/2019 15:10, Song Liu wrote:
> [...]
> 
> If we merge this with the MD_BROKEN patch, would the code look simpler?
> 
> Thanks,
> Song
> 

Hi Song, I don't believe it changes the complexity/"appearance" of the
code. Both patches are conceptually related, but their code is different
in nature. My goal in splitting them was to make the changes more
bisectable.

But feel free to merge them into a single patch, or let me know if you
prefer it that way and I can do it.

There's also a chance I haven't understood your statement/question
correctly - if that's the case, please clarify for me!
Cheers,

Guilherme

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-19 18:30     ` Guilherme G. Piccoli
@ 2019-08-19 18:57       ` Song Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Song Liu @ 2019-08-19 18:57 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh



> On Aug 19, 2019, at 11:30 AM, Guilherme G. Piccoli <gpiccoli@canonical.com> wrote:
> 
> On 19/08/2019 15:10, Song Liu wrote:
>> [...]
>> 
>> If we merge this with the MD_BROKEN patch, would the code look simpler?
>> 
>> Thanks,
>> Song
>> 
> 
> Hi Song, I don't believe it changes the complexity/"appearance" of the
> code. Both patches are conceptually related, but their code is different
> in nature. My goal in splitting them was to make the changes more
> bisectable.
> 
> But feel free to merge them into a single patch, or let me know if you
> prefer it that way and I can do it.
> 
> There's also a chance I haven't understood your statement/question
> correctly - if that's the case, please clarify for me!

I was thinking, if we can set MD_BROKEN when the device fails, we can 
just test MD_BROKEN in array_state_show() (instead of iterating through 
all devices). 

Would this work?

Thanks,
Song

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-19 18:57       ` Song Liu
@ 2019-08-19 19:11         ` Guilherme G. Piccoli
  0 siblings, 0 replies; 30+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-19 19:11 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh

On 19/08/2019 15:57, Song Liu wrote:
>[...] 
> 
> I was thinking, if we can set MD_BROKEN when the device fails, we can 
> just test MD_BROKEN in array_state_show() (instead of iterating through 
> all devices). 
> 
> Would this work?
> 
> Thanks,
> Song
> 

This could work, but it will require some refactoring; it would simplify
the check for a healthy array (no need for the is_missing_dev() function)
but would complicate the criteria for clearing MD_BROKEN and for
detecting further member failures (we would stop detecting after the 1st
failure if we only tested MD_BROKEN).

We will always need to test GENHD_FL_UP in some code path, to be able to
show users that a device recovered from a failure.
Where do you suggest testing this flag?

Thanks,


Guilherme

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-19 19:11         ` Guilherme G. Piccoli
@ 2019-08-19 21:57           ` Song Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Song Liu @ 2019-08-19 21:57 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh



> On Aug 19, 2019, at 12:11 PM, Guilherme G. Piccoli <gpiccoli@canonical.com> wrote:
> 
> On 19/08/2019 15:57, Song Liu wrote:
>> [...] 
>> 
>> I was thinking, if we can set MD_BROKEN when the device fails, we can 
>> just test MD_BROKEN in array_state_show() (instead of iterating through 
>> all devices). 
>> 
>> Would this work?
>> 
>> Thanks,
>> Song
>> 
> 
> This could work, but it will require some refactoring; it would simplify
> the check for a healthy array (no need for the is_missing_dev() function)
> but would complicate the criteria for clearing MD_BROKEN and for
> detecting further member failures (we would stop detecting after the 1st
> failure if we only tested MD_BROKEN).
> 
> We will always need to test GENHD_FL_UP in some code path, to be able to
> show users that a device recovered from a failure.
> Where do you suggest testing this flag?

How about we test this when we do clear_bit(Faulty..)? And maybe also in 
add_new_disk()?

Thanks,
Song

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-19 21:57           ` Song Liu
@ 2019-08-21 14:16             ` Guilherme G. Piccoli
  0 siblings, 0 replies; 30+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-21 14:16 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh

On 19/08/2019 18:57, Song Liu wrote:
> [...]
> How about we test this when we do clear_bit(Faulty..)? And maybe also in 
> add_new_disk()?
> 
> Thanks,
> Song
> 

Song, thanks for the suggestions. I've been working on the refactor; so
far it's working fine. But I cannot re-add a member to raid0/linear
without performing a full stop (with "mdadm --stop"), and in that case
md_clean() will clear the flag. Restarting the array this way works fine.

If I try writing 'inactive' to array_state, I cannot reinsert the member
into the array. That said, I don't think we need to worry about clearing
MD_BROKEN for RAID0/LINEAR, and that makes things far easier.
Are you ok with that? I'll submit V3 after our discussion.

Thanks,


Guilherme

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-21 14:16             ` Guilherme G. Piccoli
@ 2019-08-21 16:14               ` Song Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Song Liu @ 2019-08-21 16:14 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh



> On Aug 21, 2019, at 7:16 AM, Guilherme G. Piccoli <gpiccoli@canonical.com> wrote:
> 
> On 19/08/2019 18:57, Song Liu wrote:
>> [...]
>> How about we test this when we do clear_bit(Faulty..)? And maybe also in 
>> add_new_disk()?
>> 
>> Thanks,
>> Song
>> 
> 
> Song, thanks for the suggestions. I've been working on the refactor; so
> far it's working fine. But I cannot re-add a member to raid0/linear
> without performing a full stop (with "mdadm --stop"), and in that case
> md_clean() will clear the flag. Restarting the array this way works fine.
> 
> If I try writing 'inactive' to array_state, I cannot reinsert the member
> into the array. That said, I don't think we need to worry about clearing
> MD_BROKEN for RAID0/LINEAR, and that makes things far easier.
> Are you ok with that? I'll submit V3 after our discussion.
> 

What do you mean by "not clear MD_BROKEN"? Do you mean we need to restart
the array? 

IOW, the following won't work:

  mdadm --fail /dev/md0 /dev/sdx
  mdadm --remove /dev/md0 /dev/sdx
  mdadm --add /dev/md0 /dev/sdx
  
And we need the following instead:

  mdadm --fail /dev/md0 /dev/sdx
  mdadm --remove /dev/md0 /dev/sdx
  mdadm --stop /dev/md0
  mdadm --add /dev/md0 /dev/sdx
  mdadm --run /dev/md0

Thanks,
Song

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-21 16:14               ` Song Liu
@ 2019-08-21 18:06                 ` Song Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Song Liu @ 2019-08-21 18:06 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh



> On Aug 21, 2019, at 9:14 AM, Song Liu <songliubraving@fb.com> wrote:
> 
> 
> 
>> On Aug 21, 2019, at 7:16 AM, Guilherme G. Piccoli <gpiccoli@canonical.com> wrote:
>> 
>> On 19/08/2019 18:57, Song Liu wrote:
>>> [...]
>>> How about we test this when we do clear_bit(Faulty..)? And maybe also in 
>>> add_new_disk()?
>>> 
>>> Thanks,
>>> Song
>>> 
>> 
>> Song, thanks for the suggestions. I've been working on the refactor; so
>> far it's working fine. But I cannot re-add a member to raid0/linear
>> without performing a full stop (with "mdadm --stop"), and in that case
>> md_clean() will clear the flag. Restarting the array this way works fine.
>> 
>> If I try writing 'inactive' to array_state, I cannot reinsert the member
>> into the array. That said, I don't think we need to worry about clearing
>> MD_BROKEN for RAID0/LINEAR, and that makes things far easier.
>> Are you ok with that? I'll submit V3 after our discussion.
>> 
> 
> What do you mean by "not clear MD_BROKEN"? Do you mean we need to restart
> the array? 
> 
> IOW, the following won't work:
> 
>  mdadm --fail /dev/md0 /dev/sdx
>  mdadm --remove /dev/md0 /dev/sdx
>  mdadm --add /dev/md0 /dev/sdx
> 
> And we need the following instead:
> 
>  mdadm --fail /dev/md0 /dev/sdx
>  mdadm --remove /dev/md0 /dev/sdx
>  mdadm --stop /dev/md0
>  mdadm --add /dev/md0 /dev/sdx
>  mdadm --run /dev/md0

Btw, the MD_BROKEN patch conflicts with one of Neil's patches. Please 
rebase your work on top of

https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/tree/?h=md-next

Thanks,
Song

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-21 16:14               ` Song Liu
@ 2019-08-21 19:10                 ` Guilherme G. Piccoli
  0 siblings, 0 replies; 30+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-21 19:10 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh

On 21/08/2019 13:14, Song Liu wrote:
> [...] 
> 
> What do you mean by "not clear MD_BROKEN"? Do you mean we need to restart
> the array? 
> 
> IOW, the following won't work:
> 
>   mdadm --fail /dev/md0 /dev/sdx
>   mdadm --remove /dev/md0 /dev/sdx
>   mdadm --add /dev/md0 /dev/sdx
>   
> And we need the following instead:
> 
>   mdadm --fail /dev/md0 /dev/sdx
>   mdadm --remove /dev/md0 /dev/sdx
>   mdadm --stop /dev/md0
>   mdadm --add /dev/md0 /dev/sdx
>   mdadm --run /dev/md0
> 
> Thanks,
> Song
> 

Song, I've tried the first procedure (without the --stop) and failed to
make it work on linear/raid0 arrays, even on a vanilla kernel.
What I could do is:

1) Mount an array and while writing, remove a member (nvme1n1 in my
case); "mdadm --detail md0" will either show 'clean' state or 'broken'
if we have my patch;

2) Unmount the array and run: "mdadm -If nvme1n1 --path
pci-0000:00:08.0-nvme-1"
This will result: "mdadm: set device faulty failed for nvme1n1:  Device
or resource busy"
Despite the error, md0 device is gone.

3) echo 1 > /sys/bus/pci/rescan [nvme1 device is back]

4) mdadm -A --scan [md0 is back, with both devices and 'clean' state]

So, whether we "--stop" the array or incrementally fail a member, when
it's back the state will be 'clean' and not 'broken'.
Hence, I don't see a point in clearing the MD_BROKEN flag for
raid0/linear arrays, nor do I see where we could do it.

And thanks for the link for your tree, I'll certainly rebase my patch
against that.
Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-21 19:10                 ` Guilherme G. Piccoli
@ 2019-08-21 19:22                   ` Song Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Song Liu @ 2019-08-21 19:22 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh



> On Aug 21, 2019, at 12:10 PM, Guilherme G. Piccoli <gpiccoli@canonical.com> wrote:
> 
> On 21/08/2019 13:14, Song Liu wrote:
>> [...] 
>> 
>> What do you mean by "not clear MD_BROKEN"? Do you mean we need to restart
>> the array? 
>> 
>> IOW, the following won't work:
>> 
>>  mdadm --fail /dev/md0 /dev/sdx
>>  mdadm --remove /dev/md0 /dev/sdx
>>  mdadm --add /dev/md0 /dev/sdx
>> 
>> And we need the following instead:
>> 
>>  mdadm --fail /dev/md0 /dev/sdx
>>  mdadm --remove /dev/md0 /dev/sdx
>>  mdadm --stop /dev/md0
>>  mdadm --add /dev/md0 /dev/sdx
>>  mdadm --run /dev/md0
>> 
>> Thanks,
>> Song
>> 
> 
> Song, I've tried the first procedure (without the --stop) and failed to
> make it work on linear/raid0 arrays, even on a vanilla kernel.
> What I could do is:
> 
> 1) Mount an array and while writing, remove a member (nvme1n1 in my
> case); "mdadm --detail md0" will either show 'clean' state or 'broken'
> if we have my patch;
> 
> 2) Unmount the array and run: "mdadm -If nvme1n1 --path
> pci-0000:00:08.0-nvme-1"
> This will result: "mdadm: set device faulty failed for nvme1n1:  Device
> or resource busy"
> Despite the error, md0 device is gone.
> 
> 3) echo 1 > /sys/bus/pci/rescan [nvme1 device is back]
> 
> 4) mdadm -A --scan [md0 is back, with both devices and 'clean' state]
> 
> So, whether we "--stop" the array or incrementally fail a member, when
> it's back the state will be 'clean' and not 'broken'.
> Hence, I don't see a point in clearing the MD_BROKEN flag for
> raid0/linear arrays, nor do I see where we could do it.

I think this makes sense. Please send the patch and we can discuss
further while looking at the code. 

Thanks,
Song

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-16 13:40 ` Guilherme G. Piccoli
@ 2019-08-22  8:49   ` Guoqing Jiang
  0 siblings, 0 replies; 30+ messages in thread
From: Guoqing Jiang @ 2019-08-22  8:49 UTC (permalink / raw)
  To: Guilherme G. Piccoli, linux-raid
  Cc: linux-block, jay.vosburgh, dm-devel, Song Liu, NeilBrown

Hi,

On 8/16/19 3:40 PM, Guilherme G. Piccoli wrote:
> +static bool linear_is_missing_dev(struct mddev *mddev)
> +{
> +	struct md_rdev *rdev;
> +	static int already_missing;
> +	int def_disks, work_disks = 0;
> +
> +	def_disks = mddev->raid_disks;
> +	rdev_for_each(rdev, mddev)
> +		if (rdev->bdev->bd_disk->flags & GENHD_FL_UP)
> +			work_disks++;
> +
> +	if (unlikely(def_disks - work_disks)) {

If my understanding is correct, after entering the branch,

> +		if (already_missing < (def_disks - work_disks)) {

the above is always true since already_missing should be '0', right?
If so, maybe the above checking is pointless.

> +			already_missing = def_disks - work_disks;
> +			pr_warn("md: %s: linear array has %d missing/failed members\n",
> +				mdname(mddev), already_missing);
> +		}
> +		return true;
> +	}
> +
> +	already_missing = 0;
> +	return false;
> +}

I'd suggest something like the following, just FYI.

bool already_missing = false;
int missing_disks;

missing_disks = def_disks - work_disks;
if (unlikely(missing_disks)) {
	already_missing = true;
	pr_warn("md: %s: linear array has %d missing/failed members\n",
		mdname(mddev), missing_disks);
}
return already_missing;


Thanks,
Guoqing

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-22  8:49   ` Guoqing Jiang
@ 2019-08-22 15:35     ` Guilherme G. Piccoli
  0 siblings, 0 replies; 30+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-22 15:35 UTC (permalink / raw)
  To: Guoqing Jiang, linux-raid
  Cc: linux-block, jay.vosburgh, dm-devel, Song Liu, NeilBrown

On 22/08/2019 05:49, Guoqing Jiang wrote:
> Hi,

Hi Guoqing, thanks for the review!


> 
> On 8/16/19 3:40 PM, Guilherme G. Piccoli wrote:
>> +static bool linear_is_missing_dev(struct mddev *mddev)
>> +{
>> +    struct md_rdev *rdev;
>> +    static int already_missing;
>> +    int def_disks, work_disks = 0;
>> +
>> +    def_disks = mddev->raid_disks;
>> +    rdev_for_each(rdev, mddev)
>> +        if (rdev->bdev->bd_disk->flags & GENHD_FL_UP)
>> +            work_disks++;
>> +
>> +    if (unlikely(def_disks - work_disks)) {
> 
> If my understanding is correct, after entering the branch,
> 
>> +        if (already_missing < (def_disks - work_disks)) {
> 
> the above is always true since already_missing should be '0', right?
> If so, maybe the above checking is pointless.

The variable 'already_missing' is static, so indeed it starts at 0.
When there are missing disks, we'll enter the 'if (def_disks -
work_disks)' branch and print the message. But notice we'll then set
'already_missing = def_disks - work_disks', so we won't enter the inner
if and print the message again _unless_ another member is removed and
'already_missing' falls behind 'def_disks - work_disks'.

The idea behind it is to show a single message whenever a new member is
removed. The use of a static variable allows us to do that.

Nevertheless, this path will be dropped in the V3, which is ready to be sent.
Cheers,


Guilherme

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken'
  2019-08-21 19:22                   ` Song Liu
@ 2019-08-22 16:16                     ` Guilherme G. Piccoli
  -1 siblings, 0 replies; 30+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-22 16:16 UTC (permalink / raw)
  To: Song Liu
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, Jay Vosburgh

On 21/08/2019 16:22, Song Liu wrote:
> [...] 
> 
> I think this makes sense. Please send the patch and we can discuss
> further while looking at the code. 

Thanks Song, just sent the V3:
lore.kernel.org/linux-block/20190822161318.26236-1-gpiccoli@canonical.com

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2019-08-22 16:20 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-16 13:40 [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken' Guilherme G. Piccoli
2019-08-16 13:40 ` Guilherme G. Piccoli
2019-08-16 13:41 ` [PATCH v2 2/2] mdadm: Introduce new array state 'broken' for raid0/linear Guilherme G. Piccoli
2019-08-16 13:41   ` Guilherme G. Piccoli
2019-08-19 18:10 ` [PATCH v2 1/2] md raid0/linear: Introduce new array state 'broken' Song Liu
2019-08-19 18:10   ` Song Liu
2019-08-19 18:30   ` Guilherme G. Piccoli
2019-08-19 18:30     ` Guilherme G. Piccoli
2019-08-19 18:57     ` Song Liu
2019-08-19 18:57       ` Song Liu
2019-08-19 19:11       ` Guilherme G. Piccoli
2019-08-19 19:11         ` Guilherme G. Piccoli
2019-08-19 21:57         ` Song Liu
2019-08-19 21:57           ` Song Liu
2019-08-21 14:16           ` Guilherme G. Piccoli
2019-08-21 14:16             ` Guilherme G. Piccoli
2019-08-21 16:14             ` Song Liu
2019-08-21 16:14               ` Song Liu
2019-08-21 18:06               ` Song Liu
2019-08-21 18:06                 ` Song Liu
2019-08-21 19:10               ` Guilherme G. Piccoli
2019-08-21 19:10                 ` Guilherme G. Piccoli
2019-08-21 19:22                 ` Song Liu
2019-08-21 19:22                   ` Song Liu
2019-08-22 16:16                   ` Guilherme G. Piccoli
2019-08-22 16:16                     ` Guilherme G. Piccoli
2019-08-22  8:49 ` Guoqing Jiang
2019-08-22  8:49   ` Guoqing Jiang
2019-08-22 15:35   ` Guilherme G. Piccoli
2019-08-22 15:35     ` Guilherme G. Piccoli
