* BUG drivers/md/md.c: data-offset reshape renders array unloadable
@ 2015-01-24 21:44 Wesley W. Terpstra
  2015-01-24 22:23 ` Wesley W. Terpstra
  0 siblings, 1 reply; 5+ messages in thread
From: Wesley W. Terpstra @ 2015-01-24 21:44 UTC (permalink / raw)
  To: linux-raid

[-- Attachment #1: Type: text/plain, Size: 2067 bytes --]

On Wed, Jan 21, 2015 at 1:37 PM, Wesley W. Terpstra <wesley@terpstra.ca> wrote:
> Try: mdadm -A -o --run --force --freeze-reshape /dev/md120 /dev/sd[a-d]4
> It failed the same as before. Logs in mdadm-a.log and dmesg.log

I have now tried 3.18.3, and had exactly the same problem.
To recap, for anyone new to this thread:

I have a raid5 array with 4 disks. I did a reshape to change the
data-offset from 5120 to 8192, changing nothing else. The number of
disks remained the same. The reshape was going fine and was probably
around 50% done when the system was rebooted. All the superblocks have
matching event counts, all checksums pass, and all disks have clean
bills of health. However, any attempt to reassemble the array fails.

I have gone ahead and applied a patch to md.c to diagnose WHY it does
not assemble the array. Find the patch I used (against 3.18.3)
attached. When I try to assemble the array using my patched 3.18.3,
this is what I see:

[    5.830139] md: FYI: 8192 old 8192 new
[    5.830167] md: sectors mismatch 5842894848 < 5842897920 on sdc4
[    5.830199] md: sdc4 does not have a valid v1.2 superblock, not importing!
[    5.830635] md: md_import_device returned -22

First, it is obviously the last test in super_1_load that is rejecting
the array. The superblock reports more sectors than are calculated, so
the check
        if (sectors < le64_to_cpu(sb->data_size)) {
fails.

IMO, it seems that the problem is that the superblock has had
rdev->data_offset updated prematurely! The reshape was incomplete, so
I would have expected data_offset to still be 5120 and new_data_offset
to be 8192. If data_offset HAD been 5120, the two values compared in
the failing inequality would be equal.

I am considering simply modifying my superblock to reset data_offset
to 5120 and seeing whether the reshape then resumes correctly. Is this
the correct fix to recover my array? I imagine the source of my problems
must be a bug somewhere in the mdadm reshape code that updates both
new_data_offset and data_offset at the same time when doing a
data-offset reshape?

[-- Attachment #2: debug.patch --]
[-- Type: application/octet-stream, Size: 5159 bytes --]

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 9233c71..5c6a56a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -1385,6 +1385,8 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 		sb_start = 8;
 		break;
 	default:
+		printk("md: invalid superblock minor %d on %s\n",
+			minor_version, bdevname(rdev->bdev,b));
 		return -EINVAL;
 	}
 	rdev->sb_start = sb_start;
@@ -1393,17 +1395,39 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 	 * and it is safe to read 4k, so we do that
 	 */
 	ret = read_disk_sb(rdev, 4096);
-	if (ret) return ret;
-
+	if (ret) {
+		printk("md: cannot read superblock on %s\n",
+			bdevname(rdev->bdev,b));
+		return ret;
+	}
+	
 	sb = page_address(rdev->sb_page);
 
-	if (sb->magic != cpu_to_le32(MD_SB_MAGIC) ||
-	    sb->major_version != cpu_to_le32(1) ||
-	    le32_to_cpu(sb->max_dev) > (4096-256)/2 ||
-	    le64_to_cpu(sb->super_offset) != rdev->sb_start ||
-	    (le32_to_cpu(sb->feature_map) & ~MD_FEATURE_ALL) != 0)
+	if (sb->magic != cpu_to_le32(MD_SB_MAGIC)) {
+		printk("md: bad magic %x on %s\n",
+			sb->magic, bdevname(rdev->bdev,b));
 		return -EINVAL;
-
+	}
+	if (sb->major_version != cpu_to_le32(1)) {
+		printk("md: wrong major version %d on %s\n",
+			sb->major_version, bdevname(rdev->bdev,b));
+		return -EINVAL;
+	}
+	if (le32_to_cpu(sb->max_dev) > (4096-256)/2) {
+		printk("md: too many devices %d on %s\n",
+			sb->max_dev, bdevname(rdev->bdev,b));
+		return -EINVAL;
+	}
+	if (le64_to_cpu(sb->super_offset) != rdev->sb_start) {
+		printk("md: superblock not where expected, %llu on %s\n",
+			(unsigned long long)le64_to_cpu(sb->super_offset), bdevname(rdev->bdev,b));
+		return -EINVAL;
+	}
+	if ((le32_to_cpu(sb->feature_map) & ~MD_FEATURE_ALL) != 0) {
+		printk("md: required feature unsupported by kernel %x on %s\n",
+			sb->feature_map, bdevname(rdev->bdev,b));
+		return -EINVAL;
+	}
 	if (calc_sb_1_csum(sb) != sb->sb_csum) {
 		printk("md: invalid superblock checksum on %s\n",
 			bdevname(rdev->bdev,b));
@@ -1416,9 +1440,12 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 	}
 	if (sb->pad0 ||
 	    sb->pad3[0] ||
-	    memcmp(sb->pad3, sb->pad3+1, sizeof(sb->pad3) - sizeof(sb->pad3[1])))
+	    memcmp(sb->pad3, sb->pad3+1, sizeof(sb->pad3) - sizeof(sb->pad3[1]))) {
 		/* Some padding is non-zero, might be a new feature */
+		printk("md: non-zero padding on %s\n",
+		       bdevname(rdev->bdev,b));
 		return -EINVAL;
+	}
 
 	rdev->preferred_minor = 0xffff;
 	rdev->data_offset = le64_to_cpu(sb->data_offset);
@@ -1433,12 +1460,24 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 	if (rdev->sb_size & bmask)
 		rdev->sb_size = (rdev->sb_size | bmask) + 1;
 
+	printk("md: FYI: %ld old %ld new\n", rdev->data_offset, rdev->new_data_offset);
+	
 	if (minor_version
-	    && rdev->data_offset < sb_start + (rdev->sb_size/512))
+	    && rdev->data_offset < sb_start + (rdev->sb_size/512)) {
+		printk("md: data_offset=%ld < %ld on %s\n",
+		       rdev->data_offset,
+		       sb_start + (rdev->sb_size/512),
+		       bdevname(rdev->bdev,b));
 		return -EINVAL;
+	}
 	if (minor_version
-	    && rdev->new_data_offset < sb_start + (rdev->sb_size/512))
+	    && rdev->new_data_offset < sb_start + (rdev->sb_size/512)) {
+		printk("md: new_data_offset=%ld < %ld on %s\n",
+		       rdev->new_data_offset,
+		       sb_start + (rdev->sb_size/512),
+		       bdevname(rdev->bdev,b));
 		return -EINVAL;
+	}
 
 	if (sb->level == cpu_to_le32(LEVEL_MULTIPATH))
 		rdev->desc_nr = -1;
@@ -1460,11 +1499,17 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 		u64 *bbp;
 		int i;
 		int sectors = le16_to_cpu(sb->bblog_size);
-		if (sectors > (PAGE_SIZE / 512))
+		if (sectors > (PAGE_SIZE / 512)) {
+			printk("md: bad block map too big %d on %s\n",
+			       sectors, bdevname(rdev->bdev,b));
 			return -EINVAL;
+		}
 		offset = le32_to_cpu(sb->bblog_offset);
-		if (offset == 0)
+		if (offset == 0) {
+			printk("md: bad block log offset is zero on %s\n",
+			       bdevname(rdev->bdev,b));
 			return -EINVAL;
+		}
 		bb_sector = (long long)offset;
 		if (!sync_page_io(rdev, bb_sector, sectors << 9,
 				  rdev->bb_page, READ, true))
@@ -1480,8 +1525,11 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 			if (bb + 1 == 0)
 				break;
 			if (md_set_badblocks(&rdev->badblocks,
-					     sector, count, 1) == 0)
+					     sector, count, 1) == 0) {
+				printk("md: set bad blocks failed on %s\n",
+				       bdevname(rdev->bdev,b));
 				return -EINVAL;
+			}
 		}
 	} else if (sb->bblog_offset != 0)
 		rdev->badblocks.shift = 0;
@@ -1515,8 +1563,11 @@ static int super_1_load(struct md_rdev *rdev, struct md_rdev *refdev, int minor_
 		sectors -= rdev->data_offset;
 	} else
 		sectors = rdev->sb_start;
-	if (sectors < le64_to_cpu(sb->data_size))
+	if (sectors < le64_to_cpu(sb->data_size)) {
+		printk("md: sectors mismatch %ld < %lld on %s\n",
+			sectors, le64_to_cpu(sb->data_size), bdevname(rdev->bdev,b));
 		return -EINVAL;
+	}
 	rdev->sectors = le64_to_cpu(sb->data_size);
 	return ret;
 }


* Re: BUG drivers/md/md.c: data-offset reshape renders array unloadable
  2015-01-24 21:44 BUG drivers/md/md.c: data-offset reshape renders array unloadable Wesley W. Terpstra
@ 2015-01-24 22:23 ` Wesley W. Terpstra
  2015-01-25 16:46   ` Wesley W. Terpstra
  0 siblings, 1 reply; 5+ messages in thread
From: Wesley W. Terpstra @ 2015-01-24 22:23 UTC (permalink / raw)
  To: linux-raid

On Sat, Jan 24, 2015 at 10:44 PM, Wesley W. Terpstra <wesley@terpstra.ca> wrote:
> The reshape was going fine and was probably
> around 50% done when the system was rebooted.

It is also possible the system was hung, but continued the reshape
until completion.

> First, it is obviously the last test in super_1_load that is rejecting
> the array. The superblock reports more sectors than are calculated, so
> the check
>         if (sectors < le64_to_cpu(sb->data_size)) {
> fails.

Thus, an alternative explanation could be that the sb->data_size
was not updated after the reshape completed. The superblocks have a
"Feature Map: 0x0" according to mdadm. I gather from reading
linux/raid/md_p.h that it should at least have
MD_FEATURE_RESHAPE_ACTIVE, if the reshape were still in progress.


* Re: BUG drivers/md/md.c: data-offset reshape renders array unloadable
  2015-01-24 22:23 ` Wesley W. Terpstra
@ 2015-01-25 16:46   ` Wesley W. Terpstra
  2015-02-05  6:14     ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Wesley W. Terpstra @ 2015-01-25 16:46 UTC (permalink / raw)
  To: linux-raid

[-- Attachment #1: Type: text/plain, Size: 765 bytes --]

On Sat, Jan 24, 2015 at 11:23 PM, Wesley W. Terpstra <wesley@terpstra.ca> wrote:
>> First, it is obviously the last test in super_1_load that is rejecting
>> the array. The superblock reports more sectors than are calculated, so
>> the check
>>         if (sectors < le64_to_cpu(sb->data_size)) {
>> fails.
>
> Thus, an alternative explanation could be that the sb->data_size
> was not updated after the reshape completed.

I can confirm that this was the problem.

I manually modified my super block using the attached quick hack.
Thereafter I was able to reassemble the array and fsck everything
successfully.

I will try and see if I can reproduce the problem tomorrow. It's a
pretty nasty bug to have a reshape complete and render your array
unassemblable.

[-- Attachment #2: fix-size.c --]
[-- Type: text/x-csrc, Size: 878 bytes --]

#include <string.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void) {
  /* One 512-byte block holding the start of the v1.x superblock */
  unsigned char buf[512];
  uint64_t data_offset, data_size, reshape_position;
  uint32_t checksum;

  if (fread(buf, sizeof(buf), 1, stdin) != 1) {
    fprintf(stderr, "short read on stdin\n");
    return 1;
  }

  /* Assumes little-endian */
  memcpy(&reshape_position, buf+0x68, 8);
  memcpy(&data_offset, buf+0x80, 8);
  memcpy(&data_size,   buf+0x88, 8);
  memcpy(&checksum,    buf+0xD8, 4);

  fprintf(stderr, "data_offset: %"PRIu64"\n", data_offset);
  fprintf(stderr, "data_size:   %"PRIu64"\n", data_size);
  fprintf(stderr, "reshape:     %"PRIu64"\n", reshape_position);
  fprintf(stderr, "checksum:    %x\n", checksum);

  /* Assumes no overflow: the checksum is additive, so it moves by the
     same amount as data_size */
  checksum += (5842894848 - data_size);
  data_size = 5842894848;

  memcpy(buf+0x88, &data_size, 8);
  memcpy(buf+0xD8, &checksum,  4);

  if (fwrite(buf, sizeof(buf), 1, stdout) != 1) {
    fprintf(stderr, "short write on stdout\n");
    return 1;
  }

  return 0;
}


* Re: BUG drivers/md/md.c: data-offset reshape renders array unloadable
  2015-01-25 16:46   ` Wesley W. Terpstra
@ 2015-02-05  6:14     ` NeilBrown
  2015-02-05  8:17       ` Wesley W. Terpstra
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2015-02-05  6:14 UTC (permalink / raw)
  To: Wesley W. Terpstra; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

On Sun, 25 Jan 2015 17:46:20 +0100 "Wesley W. Terpstra" <wesley@terpstra.ca>
wrote:

> On Sat, Jan 24, 2015 at 11:23 PM, Wesley W. Terpstra <wesley@terpstra.ca> wrote:
> >> First, it is obviously the last test in super_1_load that is rejecting
> >> the array. The superblock reports more sectors than are calculated, so
> >> the check
> >>         if (sectors < le64_to_cpu(sb->data_size)) {
> >> fails.
> >
> > Thus, an alternative explanation could be that the sb->data_size
> > was not updated after the reshape completed.
> 
> I can confirm that this was the problem.
> 
> I manually modified my super block using the attached quick hack.
> Thereafter I was able to reassemble the array and fsck everything
> successfully.
> 
> I will try and see if I can reproduce the problem tomorrow. It's a
> pretty nasty bug to have a reshape complete and render your array
> unassemblable.

Thanks for all the details and analysis!!!

A v1.2 array normally has data from the data_offset all the way to the end of
the device (maybe rounded  down to chunk size).

So you normally wouldn't be able to increase the data_offset from 5120 to
8192 unless you first reduced the --size of the array (having reduced the
size of any filesystem first).
Did you do that?  That should have reduced sb->data_size appropriately.

If you did --grow the --size first, that should have reduced sb->data_size.
If you didn't, the --grow --data-offset should have failed.

Obviously one of those 'should's didn't...

NeilBrown

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]


* Re: BUG drivers/md/md.c: data-offset reshape renders array unloadable
  2015-02-05  6:14     ` NeilBrown
@ 2015-02-05  8:17       ` Wesley W. Terpstra
  0 siblings, 0 replies; 5+ messages in thread
From: Wesley W. Terpstra @ 2015-02-05  8:17 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On Thu, Feb 5, 2015 at 7:14 AM, NeilBrown <neilb@suse.de> wrote:
> A v1.2 array normally has data from the data_offset all the way to the end of
> the device (maybe rounded  down to chunk size).

Right. So in my case there were 5120 sectors unused at the start.
Unfortunately, I do not know whether the array initially had a 'size'
that included the trailing 3072 sectors or not.

What I do know: before reshape I had a chunk size of 512K and 4 disks
in raid5. When I created the array, tools at the time did not support
--data-offset, but I probably specified all the options I could find
saying I wanted 4M alignment. I reshaped the array from 1 disk to 2 to
3 to 4 as I added disks.

Unfortunately, I do not have a record of what 'mdadm -E' said before I
started the reshape.

Before I fixed the superblock:
 Avail Dev Size : 5842897920 (2786.11 GiB 2991.56 GB)
     Array Size : 8764342272 (8358.33 GiB 8974.69 GB)
  Used Dev Size : 5842894848 (2786.11 GiB 2991.56 GB)
    Data Offset : 8192 sectors

After I fixed the superblock:
 Avail Dev Size : 5842894848 (2786.11 GiB 2991.56 GB)
     Array Size : 8764342272 (8358.33 GiB 8974.69 GB)
    Data Offset : 8192 sectors

> So you normally wouldn't be able to increase the data_offset from 5120 to
> 8192 unless you first reduced the --size of the array (having reduced the
> size of any filesystem first).

Well, it let me do it.

> Did you do that?  That should have reduced sb->data_size appropriately.

No. I just went ahead and increased the data offset.

> If you did --grow the --size first, that should have reduced sb->data_size.

I did not do this step.

> If you didn't, the --grow --data-offset should have failed.

I did this directly, and it let me. I was running a slightly older
kernel when I started the reshape: 3.17.7, with mdadm 3.3.2.


