All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Get failed disk count from array state
@ 2017-05-15 12:17 Tomasz Majchrzak
  2017-05-19 19:43 ` Jes Sorensen
  0 siblings, 1 reply; 4+ messages in thread
From: Tomasz Majchrzak @ 2017-05-15 12:17 UTC (permalink / raw)
  To: linux-raid; +Cc: jes.sorensen, Tomasz Majchrzak

Recent commit has changed the way failed disks are counted. It breaks
recovery for external metadata arrays as failed disks are not part of
the array and have no corresponding entries is sysfs (they are only
reported for containers) so degraded arrays show no failed disks.

Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it
is not set again if GET_STATE has not been requested. As GET_STATE
provides the same information as GET_DEGRADED, the latter is not needed
anymore. Remove GET_DEGRADED option and replace it with GET_STATE
option.

Don't count number of failed disks looking at sysfs entries but
calculate it at the end. Do it only for arrays as containers report
no disks, just spares.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 Incremental.c | 14 ++++----------
 managemon.c   |  4 ++--
 mdadm.h       |  1 -
 raid6check.c  |  2 +-
 sysfs.c       | 17 +++++++----------
 5 files changed, 14 insertions(+), 24 deletions(-)

diff --git a/Incremental.c b/Incremental.c
index 680d318..64b63a1 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -886,16 +886,10 @@ static int array_try_spare(char *devname, int *dfdp, struct dev_policy *pol,
 		}
 		sra = sysfs_read(-1, mp->devnm,
 				 GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|
-				 GET_DEGRADED|GET_COMPONENT|GET_VERSION);
-		if (!sra) {
-			/* Probably a container - no degraded info */
-			sra = sysfs_read(-1, mp->devnm,
-					 GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|
-					 GET_COMPONENT|GET_VERSION);
-			if (sra)
-				sra->array.failed_disks = -1;
-		}
-		if (!sra)
+				 GET_COMPONENT|GET_VERSION);
+		if (sra)
+			sra->array.failed_disks = -1;
+		else
 			continue;
 		if (st == NULL) {
 			int i;
diff --git a/managemon.c b/managemon.c
index 3c1d4cb..f1eb68a 100644
--- a/managemon.c
+++ b/managemon.c
@@ -685,8 +685,8 @@ static void manage_new(struct mdstat_ent *mdstat,
 
 	mdi = sysfs_read(-1, mdstat->devnm,
 			 GET_LEVEL|GET_CHUNK|GET_DISKS|GET_COMPONENT|
-			 GET_DEGRADED|GET_SAFEMODE|
-			 GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|GET_LAYOUT);
+			 GET_SAFEMODE|GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|
+			 GET_LAYOUT);
 
 	if (!mdi)
 		return;
diff --git a/mdadm.h b/mdadm.h
index a92feb2..38b2180 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -632,7 +632,6 @@ enum sysfs_read_flags {
 	GET_MISMATCH	= (1 << 5),
 	GET_VERSION	= (1 << 6),
 	GET_DISKS	= (1 << 7),
-	GET_DEGRADED	= (1 << 8),
 	GET_SAFEMODE	= (1 << 9),
 	GET_BITMAP_LOCATION = (1 << 10),
 
diff --git a/raid6check.c b/raid6check.c
index 551f835..a8e6005 100644
--- a/raid6check.c
+++ b/raid6check.c
@@ -562,7 +562,7 @@ int main(int argc, char *argv[])
 			  GET_LEVEL|
 			  GET_LAYOUT|
 			  GET_DISKS|
-			  GET_DEGRADED |
+			  GET_STATE |
 			  GET_COMPONENT|
 			  GET_CHUNK|
 			  GET_DEVS|
diff --git a/sysfs.c b/sysfs.c
index f7967e8..dff986a 100644
--- a/sysfs.c
+++ b/sysfs.c
@@ -162,18 +162,12 @@ struct mdinfo *sysfs_read(int fd, char *devnm, unsigned long options)
 			goto abort;
 		sra->array.layout = strtoul(buf, NULL, 0);
 	}
-	if (options & GET_DISKS) {
+	if (options & (GET_DISKS|GET_STATE)) {
 		strcpy(base, "raid_disks");
 		if (load_sys(fname, buf, sizeof(buf)))
 			goto abort;
 		sra->array.raid_disks = strtoul(buf, NULL, 0);
 	}
-	if (options & GET_DEGRADED) {
-		strcpy(base, "degraded");
-		if (load_sys(fname, buf, sizeof(buf)))
-			goto abort;
-		sra->array.failed_disks = strtoul(buf, NULL, 0);
-	}
 	if (options & GET_COMPONENT) {
 		strcpy(base, "component_size");
 		if (load_sys(fname, buf, sizeof(buf)))
@@ -362,10 +356,8 @@ struct mdinfo *sysfs_read(int fd, char *devnm, unsigned long options)
 				dev->disk.state |= (1<<MD_DISK_SYNC);
 				sra->array.active_disks++;
 			}
-			if (strstr(buf, "faulty")) {
+			if (strstr(buf, "faulty"))
 				dev->disk.state |= (1<<MD_DISK_FAULTY);
-				sra->array.failed_disks++;
-			}
 			if (dev->disk.state == 0)
 				sra->array.spare_disks++;
 		}
@@ -376,6 +368,11 @@ struct mdinfo *sysfs_read(int fd, char *devnm, unsigned long options)
 			dev->errors = strtoul(buf, NULL, 0);
 		}
 	}
+
+	if ((options & GET_STATE) && sra->array.raid_disks)
+		sra->array.failed_disks = sra->array.raid_disks -
+			sra->array.active_disks - sra->array.spare_disks;
+
 	closedir(dir);
 	return sra;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] Get failed disk count from array state
  2017-05-15 12:17 [PATCH] Get failed disk count from array state Tomasz Majchrzak
@ 2017-05-19 19:43 ` Jes Sorensen
  2017-05-31 10:46   ` [PATCH v2] " Tomasz Majchrzak
  0 siblings, 1 reply; 4+ messages in thread
From: Jes Sorensen @ 2017-05-19 19:43 UTC (permalink / raw)
  To: Tomasz Majchrzak, linux-raid

On 05/15/2017 08:17 AM, Tomasz Majchrzak wrote:
> Recent commit has changed the way failed disks are counted. It breaks
> recovery for external metadata arrays as failed disks are not part of
> the array and have no corresponding entries is sysfs (they are only
> reported for containers) so degraded arrays show no failed disks.
> 
> Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it
> is not set again if GET_STATE has not been requested. As GET_STATE
> provides the same information as GET_DEGRADED, the latter is not needed
> anymore. Remove GET_DEGRADED option and replace it with GET_STATE
> option.
> 
> Don't count number of failed disks looking at sysfs entries but
> calculate it at the end. Do it only for arrays as containers report
> no disks, just spares.

Hi Tomasz,

I have been sitting on a stack of patches in this area as I wanted to 
test them before pushing them out. I finally got a change to run the 
testsuite and pushed them, but your patch no longer applied.

Would you mind having a look at the current trunk and see how you want 
to address it?

Thanks,
Jes

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2] Get failed disk count from array state
  2017-05-19 19:43 ` Jes Sorensen
@ 2017-05-31 10:46   ` Tomasz Majchrzak
  2017-06-05 15:12     ` Jes Sorensen
  0 siblings, 1 reply; 4+ messages in thread
From: Tomasz Majchrzak @ 2017-05-31 10:46 UTC (permalink / raw)
  To: linux-raid; +Cc: jes.sorensen, Tomasz Majchrzak

Recent commit has changed the way failed disks are counted. It breaks
recovery for external metadata arrays as failed disks are not part of
the array and have no corresponding entries is sysfs (they are only
reported for containers) so degraded arrays show no failed disks.

Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it
is not set again if GET_STATE has not been requested. As GET_STATE
provides the same information as GET_DEGRADED, the latter is not needed
anymore. Remove GET_DEGRADED option and replace it with GET_STATE
option.

Don't count number of failed disks looking at sysfs entries but
calculate it at the end. Do it only for arrays as containers report
no disks, just spares.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 Incremental.c | 14 ++++----------
 Monitor.c     |  4 ++--
 managemon.c   |  4 ++--
 mdadm.h       |  1 -
 raid6check.c  |  2 +-
 sysfs.c       | 18 ++++++++----------
 6 files changed, 17 insertions(+), 26 deletions(-)

v2:
  adapted to apply to latest upstream version

diff --git a/Incremental.c b/Incremental.c
index 30dc7a2..6cf2174 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -886,16 +886,10 @@ static int array_try_spare(char *devname, int *dfdp, struct dev_policy *pol,
 		}
 		sra = sysfs_read(-1, mp->devnm,
 				 GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|
-				 GET_DEGRADED|GET_COMPONENT|GET_VERSION);
-		if (!sra) {
-			/* Probably a container - no degraded info */
-			sra = sysfs_read(-1, mp->devnm,
-					 GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|
-					 GET_COMPONENT|GET_VERSION);
-			if (sra)
-				sra->array.failed_disks = -1;
-		}
-		if (!sra)
+				 GET_COMPONENT|GET_VERSION);
+		if (sra)
+			sra->array.failed_disks = -1;
+		else
 			continue;
 		if (st == NULL) {
 			int i;
diff --git a/Monitor.c b/Monitor.c
index 725f47d..bef2f1b 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -485,8 +485,8 @@ static int check_array(struct state *st, struct mdstat_ent *mdstat,
 	if (st->devnm[0] == 0)
 		strcpy(st->devnm, fd2devnm(fd));
 
-	sra = sysfs_read(-1, st->devnm, GET_LEVEL | GET_DISKS | GET_DEGRADED |
-			 GET_MISMATCH | GET_DEVS | GET_STATE);
+	sra = sysfs_read(-1, st->devnm, GET_LEVEL | GET_DISKS | GET_MISMATCH |
+			 GET_DEVS | GET_STATE);
 	if (!sra)
 		goto disappeared;
 
diff --git a/managemon.c b/managemon.c
index a8df666..68f0c2d 100644
--- a/managemon.c
+++ b/managemon.c
@@ -685,8 +685,8 @@ static void manage_new(struct mdstat_ent *mdstat,
 
 	mdi = sysfs_read(-1, mdstat->devnm,
 			 GET_LEVEL|GET_CHUNK|GET_DISKS|GET_COMPONENT|
-			 GET_DEGRADED|GET_SAFEMODE|
-			 GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|GET_LAYOUT);
+			 GET_SAFEMODE|GET_DEVS|GET_OFFSET|GET_SIZE|GET_STATE|
+			 GET_LAYOUT);
 
 	if (!mdi)
 		return;
diff --git a/mdadm.h b/mdadm.h
index ec0a39e..ee9b837 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -637,7 +637,6 @@ enum sysfs_read_flags {
 	GET_MISMATCH	= (1 << 5),
 	GET_VERSION	= (1 << 6),
 	GET_DISKS	= (1 << 7),
-	GET_DEGRADED	= (1 << 8),
 	GET_SAFEMODE	= (1 << 9),
 	GET_BITMAP_LOCATION = (1 << 10),
 
diff --git a/raid6check.c b/raid6check.c
index 551f835..a8e6005 100644
--- a/raid6check.c
+++ b/raid6check.c
@@ -562,7 +562,7 @@ int main(int argc, char *argv[])
 			  GET_LEVEL|
 			  GET_LAYOUT|
 			  GET_DISKS|
-			  GET_DEGRADED |
+			  GET_STATE |
 			  GET_COMPONENT|
 			  GET_CHUNK|
 			  GET_DEVS|
diff --git a/sysfs.c b/sysfs.c
index e47f5e4..78d2b52 100644
--- a/sysfs.c
+++ b/sysfs.c
@@ -162,18 +162,12 @@ struct mdinfo *sysfs_read(int fd, char *devnm, unsigned long options)
 			goto abort;
 		sra->array.layout = strtoul(buf, NULL, 0);
 	}
-	if (options & GET_DISKS) {
+	if (options & (GET_DISKS|GET_STATE)) {
 		strcpy(base, "raid_disks");
 		if (load_sys(fname, buf, sizeof(buf)))
 			goto abort;
 		sra->array.raid_disks = strtoul(buf, NULL, 0);
 	}
-	if (options & GET_DEGRADED) {
-		strcpy(base, "degraded");
-		if (load_sys(fname, buf, sizeof(buf)))
-			goto abort;
-		sra->array.failed_disks = strtoul(buf, NULL, 0);
-	}
 	if (options & GET_COMPONENT) {
 		strcpy(base, "component_size");
 		if (load_sys(fname, buf, sizeof(buf)))
@@ -359,10 +353,9 @@ struct mdinfo *sysfs_read(int fd, char *devnm, unsigned long options)
 			strcpy(dbase, "state");
 			if (load_sys(fname, buf, sizeof(buf)))
 				goto abort;
-			if (strstr(buf, "faulty")) {
+			if (strstr(buf, "faulty"))
 				dev->disk.state |= (1<<MD_DISK_FAULTY);
-				sra->array.failed_disks++;
-			} else {
+			else {
 				sra->array.working_disks++;
 				if (strstr(buf, "in_sync")) {
 					dev->disk.state |= (1<<MD_DISK_SYNC);
@@ -379,6 +372,11 @@ struct mdinfo *sysfs_read(int fd, char *devnm, unsigned long options)
 			dev->errors = strtoul(buf, NULL, 0);
 		}
 	}
+
+	if ((options & GET_STATE) && sra->array.raid_disks)
+		sra->array.failed_disks = sra->array.raid_disks -
+			sra->array.active_disks - sra->array.spare_disks;
+
 	closedir(dir);
 	return sra;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] Get failed disk count from array state
  2017-05-31 10:46   ` [PATCH v2] " Tomasz Majchrzak
@ 2017-06-05 15:12     ` Jes Sorensen
  0 siblings, 0 replies; 4+ messages in thread
From: Jes Sorensen @ 2017-06-05 15:12 UTC (permalink / raw)
  To: Tomasz Majchrzak, linux-raid

On 05/31/2017 06:46 AM, Tomasz Majchrzak wrote:
> Recent commit has changed the way failed disks are counted. It breaks
> recovery for external metadata arrays as failed disks are not part of
> the array and have no corresponding entries is sysfs (they are only
> reported for containers) so degraded arrays show no failed disks.
> 
> Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it
> is not set again if GET_STATE has not been requested. As GET_STATE
> provides the same information as GET_DEGRADED, the latter is not needed
> anymore. Remove GET_DEGRADED option and replace it with GET_STATE
> option.
> 
> Don't count number of failed disks looking at sysfs entries but
> calculate it at the end. Do it only for arrays as containers report
> no disks, just spares.
> 
> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
> ---
>   Incremental.c | 14 ++++----------
>   Monitor.c     |  4 ++--
>   managemon.c   |  4 ++--
>   mdadm.h       |  1 -
>   raid6check.c  |  2 +-
>   sysfs.c       | 18 ++++++++----------
>   6 files changed, 17 insertions(+), 26 deletions(-)
> 
> v2:
>    adapted to apply to latest upstream version

Applied!

Thanks,
Jes


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-06-05 15:12 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-15 12:17 [PATCH] Get failed disk count from array state Tomasz Majchrzak
2017-05-19 19:43 ` Jes Sorensen
2017-05-31 10:46   ` [PATCH v2] " Tomasz Majchrzak
2017-06-05 15:12     ` Jes Sorensen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.