All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v2] md: no longer compare spare disk superblock events in super_load
  2019-09-25  5:54 ` Yufen Yu
  (?)
@ 2019-09-25  5:40 ` Greg KH
  -1 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2019-09-25  5:40 UTC (permalink / raw)
  To: Yufen Yu; +Cc: songliubraving, linux-raid, neilb, stable

On Wed, Sep 25, 2019 at 01:54:49PM +0800, Yufen Yu wrote:
> We have a test case as follow:
> 
>   mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal
>   mdadm -S /dev/md1
>   mdadm -A /dev/md1 /dev/sd[b-c] --run --force
> 
>   mdadm --zero /dev/sda
>   mdadm /dev/md1 -a /dev/sda
> 
>   echo offline > /sys/block/sdc/device/state
>   echo offline > /sys/block/sdb/device/state
>   sleep 5
>   mdadm -S /dev/md1
> 
>   echo running > /sys/block/sdb/device/state
>   echo running > /sys/block/sdc/device/state
>   mdadm -A /dev/md1 /dev/sd[a-c] --run --force
> 
> When we readd /dev/sda to the array, it started to do recovery.
> After offline the other two disks in md1, the recovery have
> been interrupted and superblock update info cannot be written
> to the offline disks. While the spare disk (/dev/sda) can continue
> to update superblock info.
> 
> After stopping the array and assemble it, we found the array
> run fail, with the follow kernel message:
> 
> [  172.986064] md: kicking non-fresh sdb from array!
> [  173.004210] md: kicking non-fresh sdc from array!
> [  173.022383] md/raid1:md1: active with 0 out of 4 mirrors
> [  173.022406] md1: failed to create bitmap (-5)
> [  173.023466] md: md1 stopped.
> 
> Since both sdb and sdc have the value of 'sb->events' smaller than
> that in sda, they have been kicked from the array. However, the only
> remained disk sda is in 'spare' state before stop and it cannot be
> added to conf->mirrors[] array. In the end, raid array assemble and run fail.
> 
> In fact, we can use the older disk sdb or sdc to assemble the array.
> That means we should not choose the 'spare' disk as the fresh disk in
> analyze_sbs().
> 
> To fix the problem, we do not compare superblock events when it is
> a spare disk, as same as validate_super.
> 
> Signed-off-by: Yufen Yu <yuyufen@huawei.com>
> 
> v1->v2:
>   fix wrong return value in super_90_load
> ---
>  drivers/md/md.c | 44 ++++++++++++++++++++++++--------------------
>  1 file changed, 24 insertions(+), 20 deletions(-)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] md: no longer compare spare disk superblock events in super_load
@ 2019-09-25  5:54 ` Yufen Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Yufen Yu @ 2019-09-25  5:54 UTC (permalink / raw)
  To: songliubraving; +Cc: linux-raid, neilb, stable

We have a test case as follow:

  mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal
  mdadm -S /dev/md1
  mdadm -A /dev/md1 /dev/sd[b-c] --run --force

  mdadm --zero /dev/sda
  mdadm /dev/md1 -a /dev/sda

  echo offline > /sys/block/sdc/device/state
  echo offline > /sys/block/sdb/device/state
  sleep 5
  mdadm -S /dev/md1

  echo running > /sys/block/sdb/device/state
  echo running > /sys/block/sdc/device/state
  mdadm -A /dev/md1 /dev/sd[a-c] --run --force

When we readd /dev/sda to the array, it started to do recovery.
After offline the other two disks in md1, the recovery have
been interrupted and superblock update info cannot be written
to the offline disks. While the spare disk (/dev/sda) can continue
to update superblock info.

After stopping the array and assemble it, we found the array
run fail, with the follow kernel message:

[  172.986064] md: kicking non-fresh sdb from array!
[  173.004210] md: kicking non-fresh sdc from array!
[  173.022383] md/raid1:md1: active with 0 out of 4 mirrors
[  173.022406] md1: failed to create bitmap (-5)
[  173.023466] md: md1 stopped.

Since both sdb and sdc have the value of 'sb->events' smaller than
that in sda, they have been kicked from the array. However, the only
remained disk sda is in 'spare' state before stop and it cannot be
added to conf->mirrors[] array. In the end, raid array assemble and run fail.

In fact, we can use the older disk sdb or sdc to assemble the array.
That means we should not choose the 'spare' disk as the fresh disk in
analyze_sbs().

To fix the problem, we do not compare superblock events when it is
a spare disk, as same as validate_super.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>

v1->v2:
  fix wrong return value in super_90_load
---
 drivers/md/md.c | 44 ++++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1be7abeb24fd..0a91c20071b3 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1097,7 +1097,7 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 {
 	char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
 	mdp_super_t *sb;
-	int ret;
+	int ret = 0;
 
 	/*
 	 * Calculate the position of the superblock (512byte sectors),
@@ -1111,14 +1111,12 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 	if (ret)
 		return ret;
 
-	ret = -EINVAL;
-
 	bdevname(rdev->bdev, b);
 	sb = page_address(rdev->sb_page);
 
 	if (sb->md_magic != MD_SB_MAGIC) {
 		pr_warn("md: invalid raid superblock magic on %s\n", b);
-		goto abort;
+		return -EINVAL;
 	}
 
 	if (sb->major_version != 0 ||
@@ -1126,15 +1124,15 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 	    sb->minor_version > 91) {
 		pr_warn("Bad version number %d.%d on %s\n",
 			sb->major_version, sb->minor_version, b);
-		goto abort;
+		return -EINVAL;
 	}
 
 	if (sb->raid_disks <= 0)
-		goto abort;
+		return -EINVAL;
 
 	if (md_csum_fold(calc_sb_csum(sb)) != md_csum_fold(sb->sb_csum)) {
 		pr_warn("md: invalid superblock checksum on %s\n", b);
-		goto abort;
+		return -EINVAL;
 	}
 
 	rdev->preferred_minor = sb->md_minor;
@@ -1156,19 +1154,22 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 		if (!md_uuid_equal(refsb, sb)) {
 			pr_warn("md: %s has different UUID to %s\n",
 				b, bdevname(refdev->bdev,b2));
-			goto abort;
+			return -EINVAL;
 		}
 		if (!md_sb_equal(refsb, sb)) {
 			pr_warn("md: %s has same UUID but different superblock to %s\n",
 				b, bdevname(refdev->bdev, b2));
-			goto abort;
+			return -EINVAL;
 		}
 		ev1 = md_event(sb);
 		ev2 = md_event(refsb);
-		if (ev1 > ev2)
-			ret = 1;
-		else
-			ret = 0;
+
+		/* Insist on good event counter while assembling, except
+		 * for spares (which don't need an event count) */
+		if (sb->disks[rdev->desc_nr].state & (
+			(1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)))
+			if (ev1 > ev2)
+				ret = 1;
 	}
 	rdev->sectors = rdev->sb_start;
 	/* Limit to 4TB as metadata cannot record more than that.
@@ -1180,9 +1181,8 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 
 	if (rdev->sectors < ((sector_t)sb->size) * 2 && sb->level >= 1)
 		/* "this cannot possibly happen" ... */
-		ret = -EINVAL;
+		return -EINVAL;
 
- abort:
 	return ret;
 }
 
@@ -1520,7 +1520,7 @@ static __le32 calc_sb_1_csum(struct mdp_superblock_1 *sb)
 static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_version)
 {
 	struct mdp_superblock_1 *sb;
-	int ret;
+	int ret = 0;
 	sector_t sb_start;
 	sector_t sectors;
 	char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
@@ -1676,10 +1676,14 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 		ev1 = le64_to_cpu(sb->events);
 		ev2 = le64_to_cpu(refsb->events);
 
-		if (ev1 > ev2)
-			ret = 1;
-		else
-			ret = 0;
+		/* Insist of good event counter while assembling, except for
+		 * spares (which don't need an event count) */
+		if (rdev->desc_nr >= 0 &&
+		    rdev->desc_nr < le32_to_cpu(sb->max_dev) &&
+		    (le16_to_cpu(sb->dev_roles[rdev->desc_nr]) < MD_DISK_ROLE_MAX ||
+		     le16_to_cpu(sb->dev_roles[rdev->desc_nr]) == MD_DISK_ROLE_JOURNAL))
+			if (ev1 > ev2)
+				ret = 1;
 	}
 	if (minor_version) {
 		sectors = (i_size_read(rdev->bdev->bd_inode) >> 9);
-- 
2.17.2

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2] md: no longer compare spare disk superblock events in super_load
@ 2019-09-25  5:54 ` Yufen Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Yufen Yu @ 2019-09-25  5:54 UTC (permalink / raw)
  To: songliubraving; +Cc: linux-raid, neilb, stable

We have a test case as follow:

  mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal
  mdadm -S /dev/md1
  mdadm -A /dev/md1 /dev/sd[b-c] --run --force

  mdadm --zero /dev/sda
  mdadm /dev/md1 -a /dev/sda

  echo offline > /sys/block/sdc/device/state
  echo offline > /sys/block/sdb/device/state
  sleep 5
  mdadm -S /dev/md1

  echo running > /sys/block/sdb/device/state
  echo running > /sys/block/sdc/device/state
  mdadm -A /dev/md1 /dev/sd[a-c] --run --force

When we readd /dev/sda to the array, it started to do recovery.
After offline the other two disks in md1, the recovery have
been interrupted and superblock update info cannot be written
to the offline disks. While the spare disk (/dev/sda) can continue
to update superblock info.

After stopping the array and assemble it, we found the array
run fail, with the follow kernel message:

[  172.986064] md: kicking non-fresh sdb from array!
[  173.004210] md: kicking non-fresh sdc from array!
[  173.022383] md/raid1:md1: active with 0 out of 4 mirrors
[  173.022406] md1: failed to create bitmap (-5)
[  173.023466] md: md1 stopped.

Since both sdb and sdc have the value of 'sb->events' smaller than
that in sda, they have been kicked from the array. However, the only
remained disk sda is in 'spare' state before stop and it cannot be
added to conf->mirrors[] array. In the end, raid array assemble and run fail.

In fact, we can use the older disk sdb or sdc to assemble the array.
That means we should not choose the 'spare' disk as the fresh disk in
analyze_sbs().

To fix the problem, we do not compare superblock events when it is
a spare disk, as same as validate_super.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>

v1->v2:
  fix wrong return value in super_90_load
---
 drivers/md/md.c | 44 ++++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1be7abeb24fd..0a91c20071b3 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1097,7 +1097,7 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 {
 	char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
 	mdp_super_t *sb;
-	int ret;
+	int ret = 0;
 
 	/*
 	 * Calculate the position of the superblock (512byte sectors),
@@ -1111,14 +1111,12 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 	if (ret)
 		return ret;
 
-	ret = -EINVAL;
-
 	bdevname(rdev->bdev, b);
 	sb = page_address(rdev->sb_page);
 
 	if (sb->md_magic != MD_SB_MAGIC) {
 		pr_warn("md: invalid raid superblock magic on %s\n", b);
-		goto abort;
+		return -EINVAL;
 	}
 
 	if (sb->major_version != 0 ||
@@ -1126,15 +1124,15 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 	    sb->minor_version > 91) {
 		pr_warn("Bad version number %d.%d on %s\n",
 			sb->major_version, sb->minor_version, b);
-		goto abort;
+		return -EINVAL;
 	}
 
 	if (sb->raid_disks <= 0)
-		goto abort;
+		return -EINVAL;
 
 	if (md_csum_fold(calc_sb_csum(sb)) != md_csum_fold(sb->sb_csum)) {
 		pr_warn("md: invalid superblock checksum on %s\n", b);
-		goto abort;
+		return -EINVAL;
 	}
 
 	rdev->preferred_minor = sb->md_minor;
@@ -1156,19 +1154,22 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 		if (!md_uuid_equal(refsb, sb)) {
 			pr_warn("md: %s has different UUID to %s\n",
 				b, bdevname(refdev->bdev,b2));
-			goto abort;
+			return -EINVAL;
 		}
 		if (!md_sb_equal(refsb, sb)) {
 			pr_warn("md: %s has same UUID but different superblock to %s\n",
 				b, bdevname(refdev->bdev, b2));
-			goto abort;
+			return -EINVAL;
 		}
 		ev1 = md_event(sb);
 		ev2 = md_event(refsb);
-		if (ev1 > ev2)
-			ret = 1;
-		else
-			ret = 0;
+
+		/* Insist on good event counter while assembling, except
+		 * for spares (which don't need an event count) */
+		if (sb->disks[rdev->desc_nr].state & (
+			(1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)))
+			if (ev1 > ev2)
+				ret = 1;
 	}
 	rdev->sectors = rdev->sb_start;
 	/* Limit to 4TB as metadata cannot record more than that.
@@ -1180,9 +1181,8 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
 
 	if (rdev->sectors < ((sector_t)sb->size) * 2 && sb->level >= 1)
 		/* "this cannot possibly happen" ... */
-		ret = -EINVAL;
+		return -EINVAL;
 
- abort:
 	return ret;
 }
 
@@ -1520,7 +1520,7 @@ static __le32 calc_sb_1_csum(struct mdp_superblock_1 *sb)
 static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_version)
 {
 	struct mdp_superblock_1 *sb;
-	int ret;
+	int ret = 0;
 	sector_t sb_start;
 	sector_t sectors;
 	char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
@@ -1676,10 +1676,14 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 		ev1 = le64_to_cpu(sb->events);
 		ev2 = le64_to_cpu(refsb->events);
 
-		if (ev1 > ev2)
-			ret = 1;
-		else
-			ret = 0;
+		/* Insist of good event counter while assembling, except for
+		 * spares (which don't need an event count) */
+		if (rdev->desc_nr >= 0 &&
+		    rdev->desc_nr < le32_to_cpu(sb->max_dev) &&
+		    (le16_to_cpu(sb->dev_roles[rdev->desc_nr]) < MD_DISK_ROLE_MAX ||
+		     le16_to_cpu(sb->dev_roles[rdev->desc_nr]) == MD_DISK_ROLE_JOURNAL))
+			if (ev1 > ev2)
+				ret = 1;
 	}
 	if (minor_version) {
 		sectors = (i_size_read(rdev->bdev->bd_inode) >> 9);
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] md: no longer compare spare disk superblock events in super_load
  2019-09-25  5:54 ` Yufen Yu
@ 2019-10-12  1:25   ` Yufen Yu
  -1 siblings, 0 replies; 8+ messages in thread
From: Yufen Yu @ 2019-10-12  1:25 UTC (permalink / raw)
  To: songliubraving; +Cc: linux-raid, neilb, stable

ping


On 2019/9/25 13:54, Yufen Yu wrote:
> We have a test case as follow:
>
>    mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal
>    mdadm -S /dev/md1
>    mdadm -A /dev/md1 /dev/sd[b-c] --run --force
>
>    mdadm --zero /dev/sda
>    mdadm /dev/md1 -a /dev/sda
>
>    echo offline > /sys/block/sdc/device/state
>    echo offline > /sys/block/sdb/device/state
>    sleep 5
>    mdadm -S /dev/md1
>
>    echo running > /sys/block/sdb/device/state
>    echo running > /sys/block/sdc/device/state
>    mdadm -A /dev/md1 /dev/sd[a-c] --run --force
>
> When we readd /dev/sda to the array, it started to do recovery.
> After offline the other two disks in md1, the recovery have
> been interrupted and superblock update info cannot be written
> to the offline disks. While the spare disk (/dev/sda) can continue
> to update superblock info.
>
> After stopping the array and assemble it, we found the array
> run fail, with the follow kernel message:
>
> [  172.986064] md: kicking non-fresh sdb from array!
> [  173.004210] md: kicking non-fresh sdc from array!
> [  173.022383] md/raid1:md1: active with 0 out of 4 mirrors
> [  173.022406] md1: failed to create bitmap (-5)
> [  173.023466] md: md1 stopped.
>
> Since both sdb and sdc have the value of 'sb->events' smaller than
> that in sda, they have been kicked from the array. However, the only
> remained disk sda is in 'spare' state before stop and it cannot be
> added to conf->mirrors[] array. In the end, raid array assemble and run fail.
>
> In fact, we can use the older disk sdb or sdc to assemble the array.
> That means we should not choose the 'spare' disk as the fresh disk in
> analyze_sbs().
>
> To fix the problem, we do not compare superblock events when it is
> a spare disk, as same as validate_super.
>
> Signed-off-by: Yufen Yu <yuyufen@huawei.com>
>
> v1->v2:
>    fix wrong return value in super_90_load
> ---
>   drivers/md/md.c | 44 ++++++++++++++++++++++++--------------------
>   1 file changed, 24 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 1be7abeb24fd..0a91c20071b3 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -1097,7 +1097,7 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   {
>   	char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
>   	mdp_super_t *sb;
> -	int ret;
> +	int ret = 0;
>   
>   	/*
>   	 * Calculate the position of the superblock (512byte sectors),
> @@ -1111,14 +1111,12 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   	if (ret)
>   		return ret;
>   
> -	ret = -EINVAL;
> -
>   	bdevname(rdev->bdev, b);
>   	sb = page_address(rdev->sb_page);
>   
>   	if (sb->md_magic != MD_SB_MAGIC) {
>   		pr_warn("md: invalid raid superblock magic on %s\n", b);
> -		goto abort;
> +		return -EINVAL;
>   	}
>   
>   	if (sb->major_version != 0 ||
> @@ -1126,15 +1124,15 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   	    sb->minor_version > 91) {
>   		pr_warn("Bad version number %d.%d on %s\n",
>   			sb->major_version, sb->minor_version, b);
> -		goto abort;
> +		return -EINVAL;
>   	}
>   
>   	if (sb->raid_disks <= 0)
> -		goto abort;
> +		return -EINVAL;
>   
>   	if (md_csum_fold(calc_sb_csum(sb)) != md_csum_fold(sb->sb_csum)) {
>   		pr_warn("md: invalid superblock checksum on %s\n", b);
> -		goto abort;
> +		return -EINVAL;
>   	}
>   
>   	rdev->preferred_minor = sb->md_minor;
> @@ -1156,19 +1154,22 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   		if (!md_uuid_equal(refsb, sb)) {
>   			pr_warn("md: %s has different UUID to %s\n",
>   				b, bdevname(refdev->bdev,b2));
> -			goto abort;
> +			return -EINVAL;
>   		}
>   		if (!md_sb_equal(refsb, sb)) {
>   			pr_warn("md: %s has same UUID but different superblock to %s\n",
>   				b, bdevname(refdev->bdev, b2));
> -			goto abort;
> +			return -EINVAL;
>   		}
>   		ev1 = md_event(sb);
>   		ev2 = md_event(refsb);
> -		if (ev1 > ev2)
> -			ret = 1;
> -		else
> -			ret = 0;
> +
> +		/* Insist on good event counter while assembling, except
> +		 * for spares (which don't need an event count) */
> +		if (sb->disks[rdev->desc_nr].state & (
> +			(1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)))
> +			if (ev1 > ev2)
> +				ret = 1;
>   	}
>   	rdev->sectors = rdev->sb_start;
>   	/* Limit to 4TB as metadata cannot record more than that.
> @@ -1180,9 +1181,8 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   
>   	if (rdev->sectors < ((sector_t)sb->size) * 2 && sb->level >= 1)
>   		/* "this cannot possibly happen" ... */
> -		ret = -EINVAL;
> +		return -EINVAL;
>   
> - abort:
>   	return ret;
>   }
>   
> @@ -1520,7 +1520,7 @@ static __le32 calc_sb_1_csum(struct mdp_superblock_1 *sb)
>   static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_version)
>   {
>   	struct mdp_superblock_1 *sb;
> -	int ret;
> +	int ret = 0;
>   	sector_t sb_start;
>   	sector_t sectors;
>   	char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
> @@ -1676,10 +1676,14 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
>   		ev1 = le64_to_cpu(sb->events);
>   		ev2 = le64_to_cpu(refsb->events);
>   
> -		if (ev1 > ev2)
> -			ret = 1;
> -		else
> -			ret = 0;
> +		/* Insist of good event counter while assembling, except for
> +		 * spares (which don't need an event count) */
> +		if (rdev->desc_nr >= 0 &&
> +		    rdev->desc_nr < le32_to_cpu(sb->max_dev) &&
> +		    (le16_to_cpu(sb->dev_roles[rdev->desc_nr]) < MD_DISK_ROLE_MAX ||
> +		     le16_to_cpu(sb->dev_roles[rdev->desc_nr]) == MD_DISK_ROLE_JOURNAL))
> +			if (ev1 > ev2)
> +				ret = 1;
>   	}
>   	if (minor_version) {
>   		sectors = (i_size_read(rdev->bdev->bd_inode) >> 9);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] md: no longer compare spare disk superblock events in super_load
@ 2019-10-12  1:25   ` Yufen Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Yufen Yu @ 2019-10-12  1:25 UTC (permalink / raw)
  To: songliubraving; +Cc: linux-raid, neilb, stable

ping


On 2019/9/25 13:54, Yufen Yu wrote:
> We have a test case as follow:
>
>    mdadm -CR /dev/md1 -l 1 -n 4 /dev/sd[a-d] --assume-clean --bitmap=internal
>    mdadm -S /dev/md1
>    mdadm -A /dev/md1 /dev/sd[b-c] --run --force
>
>    mdadm --zero /dev/sda
>    mdadm /dev/md1 -a /dev/sda
>
>    echo offline > /sys/block/sdc/device/state
>    echo offline > /sys/block/sdb/device/state
>    sleep 5
>    mdadm -S /dev/md1
>
>    echo running > /sys/block/sdb/device/state
>    echo running > /sys/block/sdc/device/state
>    mdadm -A /dev/md1 /dev/sd[a-c] --run --force
>
> When we readd /dev/sda to the array, it started to do recovery.
> After offline the other two disks in md1, the recovery have
> been interrupted and superblock update info cannot be written
> to the offline disks. While the spare disk (/dev/sda) can continue
> to update superblock info.
>
> After stopping the array and assemble it, we found the array
> run fail, with the follow kernel message:
>
> [  172.986064] md: kicking non-fresh sdb from array!
> [  173.004210] md: kicking non-fresh sdc from array!
> [  173.022383] md/raid1:md1: active with 0 out of 4 mirrors
> [  173.022406] md1: failed to create bitmap (-5)
> [  173.023466] md: md1 stopped.
>
> Since both sdb and sdc have the value of 'sb->events' smaller than
> that in sda, they have been kicked from the array. However, the only
> remained disk sda is in 'spare' state before stop and it cannot be
> added to conf->mirrors[] array. In the end, raid array assemble and run fail.
>
> In fact, we can use the older disk sdb or sdc to assemble the array.
> That means we should not choose the 'spare' disk as the fresh disk in
> analyze_sbs().
>
> To fix the problem, we do not compare superblock events when it is
> a spare disk, as same as validate_super.
>
> Signed-off-by: Yufen Yu <yuyufen@huawei.com>
>
> v1->v2:
>    fix wrong return value in super_90_load
> ---
>   drivers/md/md.c | 44 ++++++++++++++++++++++++--------------------
>   1 file changed, 24 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 1be7abeb24fd..0a91c20071b3 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -1097,7 +1097,7 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   {
>   	char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
>   	mdp_super_t *sb;
> -	int ret;
> +	int ret = 0;
>   
>   	/*
>   	 * Calculate the position of the superblock (512byte sectors),
> @@ -1111,14 +1111,12 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   	if (ret)
>   		return ret;
>   
> -	ret = -EINVAL;
> -
>   	bdevname(rdev->bdev, b);
>   	sb = page_address(rdev->sb_page);
>   
>   	if (sb->md_magic != MD_SB_MAGIC) {
>   		pr_warn("md: invalid raid superblock magic on %s\n", b);
> -		goto abort;
> +		return -EINVAL;
>   	}
>   
>   	if (sb->major_version != 0 ||
> @@ -1126,15 +1124,15 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   	    sb->minor_version > 91) {
>   		pr_warn("Bad version number %d.%d on %s\n",
>   			sb->major_version, sb->minor_version, b);
> -		goto abort;
> +		return -EINVAL;
>   	}
>   
>   	if (sb->raid_disks <= 0)
> -		goto abort;
> +		return -EINVAL;
>   
>   	if (md_csum_fold(calc_sb_csum(sb)) != md_csum_fold(sb->sb_csum)) {
>   		pr_warn("md: invalid superblock checksum on %s\n", b);
> -		goto abort;
> +		return -EINVAL;
>   	}
>   
>   	rdev->preferred_minor = sb->md_minor;
> @@ -1156,19 +1154,22 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   		if (!md_uuid_equal(refsb, sb)) {
>   			pr_warn("md: %s has different UUID to %s\n",
>   				b, bdevname(refdev->bdev,b2));
> -			goto abort;
> +			return -EINVAL;
>   		}
>   		if (!md_sb_equal(refsb, sb)) {
>   			pr_warn("md: %s has same UUID but different superblock to %s\n",
>   				b, bdevname(refdev->bdev, b2));
> -			goto abort;
> +			return -EINVAL;
>   		}
>   		ev1 = md_event(sb);
>   		ev2 = md_event(refsb);
> -		if (ev1 > ev2)
> -			ret = 1;
> -		else
> -			ret = 0;
> +
> +		/* Insist on good event counter while assembling, except
> +		 * for spares (which don't need an event count) */
> +		if (sb->disks[rdev->desc_nr].state & (
> +			(1<<MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE)))
> +			if (ev1 > ev2)
> +				ret = 1;
>   	}
>   	rdev->sectors = rdev->sb_start;
>   	/* Limit to 4TB as metadata cannot record more than that.
> @@ -1180,9 +1181,8 @@ static int super_90_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor
>   
>   	if (rdev->sectors < ((sector_t)sb->size) * 2 && sb->level >= 1)
>   		/* "this cannot possibly happen" ... */
> -		ret = -EINVAL;
> +		return -EINVAL;
>   
> - abort:
>   	return ret;
>   }
>   
> @@ -1520,7 +1520,7 @@ static __le32 calc_sb_1_csum(struct mdp_superblock_1 *sb)
>   static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_version)
>   {
>   	struct mdp_superblock_1 *sb;
> -	int ret;
> +	int ret = 0;
>   	sector_t sb_start;
>   	sector_t sectors;
>   	char b[BDEVNAME_SIZE], b2[BDEVNAME_SIZE];
> @@ -1676,10 +1676,14 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
>   		ev1 = le64_to_cpu(sb->events);
>   		ev2 = le64_to_cpu(refsb->events);
>   
> -		if (ev1 > ev2)
> -			ret = 1;
> -		else
> -			ret = 0;
> +		/* Insist of good event counter while assembling, except for
> +		 * spares (which don't need an event count) */
> +		if (rdev->desc_nr >= 0 &&
> +		    rdev->desc_nr < le32_to_cpu(sb->max_dev) &&
> +		    (le16_to_cpu(sb->dev_roles[rdev->desc_nr]) < MD_DISK_ROLE_MAX ||
> +		     le16_to_cpu(sb->dev_roles[rdev->desc_nr]) == MD_DISK_ROLE_JOURNAL))
> +			if (ev1 > ev2)
> +				ret = 1;
>   	}
>   	if (minor_version) {
>   		sectors = (i_size_read(rdev->bdev->bd_inode) >> 9);



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] md: no longer compare spare disk superblock events in super_load
  2019-10-12  1:25   ` Yufen Yu
  (?)
@ 2019-10-15  0:31   ` Song Liu
  2019-10-15  2:35       ` Yufen Yu
  -1 siblings, 1 reply; 8+ messages in thread
From: Song Liu @ 2019-10-15  0:31 UTC (permalink / raw)
  To: Yufen Yu; +Cc: Song Liu, linux-raid, NeilBrown, stable

On Fri, Oct 11, 2019 at 6:35 PM Yufen Yu <yuyufen@huawei.com> wrote:
>
> ping

Sorry for the late reply.

The patch looks good over all. Please fix issues reported by
./scripts/checkpatch.pl
and resubmit.

Thanks,
Song

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] md: no longer compare spare disk superblock events in super_load
  2019-10-15  0:31   ` Song Liu
@ 2019-10-15  2:35       ` Yufen Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Yufen Yu @ 2019-10-15  2:35 UTC (permalink / raw)
  To: Song Liu; +Cc: Song Liu, linux-raid, NeilBrown, stable



On 2019/10/15 8:31, Song Liu wrote:
> On Fri, Oct 11, 2019 at 6:35 PM Yufen Yu <yuyufen@huawei.com> wrote:
>> ping
> Sorry for the late reply.
>
> The patch looks good over all. Please fix issues reported by
> ./scripts/checkpatch.pl
> and resubmit.

Sorry for forgetting to check the patch by checkpatch.pl.
And thanks a lot for your review and response.

Thanks,
Yufen

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] md: no longer compare spare disk superblock events in super_load
@ 2019-10-15  2:35       ` Yufen Yu
  0 siblings, 0 replies; 8+ messages in thread
From: Yufen Yu @ 2019-10-15  2:35 UTC (permalink / raw)
  To: Song Liu; +Cc: Song Liu, linux-raid, NeilBrown, stable



On 2019/10/15 8:31, Song Liu wrote:
> On Fri, Oct 11, 2019 at 6:35 PM Yufen Yu <yuyufen@huawei.com> wrote:
>> ping
> Sorry for the late reply.
>
> The patch looks good over all. Please fix issues reported by
> ./scripts/checkpatch.pl
> and resubmit.

Sorry for forgetting to check the patch by checkpatch.pl.
And thanks a lot for your review and response.

Thanks,
Yufen


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-10-15  2:35 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-25  5:54 [PATCH v2] md: no longer compare spare disk superblock events in super_load Yufen Yu
2019-09-25  5:54 ` Yufen Yu
2019-09-25  5:40 ` Greg KH
2019-10-12  1:25 ` Yufen Yu
2019-10-12  1:25   ` Yufen Yu
2019-10-15  0:31   ` Song Liu
2019-10-15  2:35     ` Yufen Yu
2019-10-15  2:35       ` Yufen Yu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.