linux-kernel.vger.kernel.org archive mirror
* Bug in 2.6.17 / mdadm 2.5.1
From: Ronald Lembcke @ 2006-06-25 13:59 UTC
  To: linux-kernel, linux-raid; +Cc: es186


Hi!

There's a bug in kernel 2.6.17 and/or mdadm which prevents (re)adding
a disk to a degraded RAID5 array.

The mail I'm replying to was sent to linux-raid only. A summary of my
problem is in the quoted part, and everything you need to reproduce it
is below.
There's more information (kernel log, output of mdadm -E, ...) in the
original mail (Subject: RAID5 degraded after mdadm -S, mdadm --assemble
(everytime); Message-ID: <20060624104745.GA6352@defiant.crash>; it can
be found here, for example:
http://www.spinics.net/lists/raid/msg12859.html).

More about this problem below the quoted part.

On Sat Jun 24 12:47:45 2006, I wrote:
> I set up a RAID5 array of 4 disks. I initially created a degraded array
> and added the fourth disk (sda1) later.
> 
> The array is "clean", but when I do  
>   mdadm -S /dev/md0 
>   mdadm --assemble /dev/md0 /dev/sd[abcd]1
> it won't start. It always says sda1 is "failed".
> 
> When I remove sda1 and add it again everything seems to be fine until I
> stop the array. 

CPU: AMD-K6(tm) 3D processor
Kernel: Linux version 2.6.17 (root@ganges) (gcc version 4.0.3 (Debian
4.0.3-1)) #2 Tue Jun 20 17:48:32 CEST 2006

The problem is: The superblocks get inconsistent, but I couldn't find
where this actually happens.

Here are some simple steps to reproduce it (don't forget to adjust the
device names if you're already using /dev/md1 or /dev/loop[0-3]):
The behaviour changes when you execute --zero-superblock in the example 
below (it looks even more broken).
It also changes when you fail some other disk instead of loop2.
When loop3 is failed (without executing --zero-superblock) it can
successfully be re-added.


############################################
cd /tmp; mkdir raidtest; cd raidtest
dd bs=1M count=1 if=/dev/zero of=disk0
dd bs=1M count=1 if=/dev/zero of=disk1
dd bs=1M count=1 if=/dev/zero of=disk2
dd bs=1M count=1 if=/dev/zero of=disk3
losetup /dev/loop0 disk0
losetup /dev/loop1 disk1
losetup /dev/loop2 disk2
losetup /dev/loop3 disk3
mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
mdadm /dev/md1 --fail /dev/loop2
mdadm /dev/md1 --remove /dev/loop2

#mdadm --zero-superblock /dev/loop2
# here something goes wrong
mdadm /dev/md1 --add /dev/loop2

mdadm --stop /dev/md1
# can't reassemble
mdadm --assemble /dev/md1 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
############################################
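
(As an aside: the state of each component's superblock can be
inspected between the steps above with mdadm's examine mode, e.g.

  mdadm --examine /dev/loop2

The dev_roles[] listings below are a condensed view of that
superblock data.)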


To clean up everything :)
############################################
mdadm --stop /dev/md1
losetup -d /dev/loop0
losetup -d /dev/loop1
losetup -d /dev/loop2
losetup -d /dev/loop3
rm disk0 disk1 disk2 disk3
###########################################


After mdadm --create the superblocks are OK, but look a little bit
strange (note the entry for the failed device):

  dev_roles[i]: 0000 0001 0002 fffe 0003
and the disks have dev_number 0, 1, 2, 4.

But after --fail --remove --add:
  dev_roles[i]: 0000 0001 fffe fffe 0003 0002
while the disks still have dev_number 0, 1, 2, 4.
Either loop2 must have dev_number=5, or dev_roles needs to be
0,1,2,0xfffe,3.
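
To make the mismatch concrete, here is a minimal stand-alone sketch
(my simplification for illustration, not the kernel code) of the
lookup md performs: a device's dev_number indexes dev_roles[], where
0xfffe means "failed" and 0xffff means "spare".

  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          /* dev_roles[] as observed after --fail / --remove / --add */
          uint16_t dev_roles[] = { 0x0000, 0x0001, 0xfffe, 0xfffe,
                                   0x0003, 0x0002 };
          uint32_t dev_number = 2; /* loop2's superblock still says 2 */
          uint16_t role = dev_roles[dev_number];

          if (role == 0xfffe)
                  printf("loop2 looks failed\n"); /* what 2.6.17 sees */
          else if (role == 0xffff)
                  printf("loop2 looks spare\n");
          else
                  printf("loop2 is active in slot %u\n", (unsigned)role);
          return 0;
  }

The fresh role (2, at index 5) is never consulted, because the
re-added disk's superblock still carries dev_number 2, and
dev_roles[2] is 0xfffe.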


Greetings,
           Roni



* Re: Bug in 2.6.17 / mdadm 2.5.1
From: Neil Brown @ 2006-06-26  1:06 UTC
  To: Ronald Lembcke; +Cc: linux-kernel, linux-raid

On Sunday June 25, es186@fen-net.de wrote:
> Hi!
> 
> There's a bug in kernel 2.6.17 and/or mdadm which prevents (re)adding
> a disk to a degraded RAID5 array.

Thank you for the detailed report.
The bug is in the md driver in the kernel (not in mdadm), and only
affects version-1 superblocks.  Debian recently changed the default
(in /etc/mdadm/mdadm.conf) to use version-1 superblocks, which I
thought would be OK (I've done some testing) but obviously I missed
something. :-(

If you remove the "metadata=1" (or whatever it is) from
/etc/mdadm/mdadm.conf and then create the array, it will be created
with a version-0.90 superblock, which has had more testing.
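
For example (I don't have the Debian config in front of me, so the
exact line may differ) that means dropping "metadata=1" from the
CREATE line in /etc/mdadm/mdadm.conf, or forcing the old format
explicitly when creating, e.g. with the loop devices from your test:

  mdadm --create /dev/md1 --metadata=0.90 --level=5 --raid-devices=4 \
        /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3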

Alternatively, you can apply the following patch to the kernel, and
version-1 superblocks should work better.

NeilBrown

-------------------------------------------------
Set desc_nr correctly for version-1 superblocks.

This has to be done in ->load_super, not ->validate_super.

### Diffstat output
 ./drivers/md/md.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2006-06-26 11:02:43.000000000 +1000
+++ ./drivers/md/md.c	2006-06-26 11:02:46.000000000 +1000
@@ -1057,6 +1057,11 @@ static int super_1_load(mdk_rdev_t *rdev
 	if (rdev->sb_size & bmask)
 		rdev-> sb_size = (rdev->sb_size | bmask)+1;
 
+	if (sb->level == cpu_to_le32(LEVEL_MULTIPATH))
+		rdev->desc_nr = -1;
+	else
+		rdev->desc_nr = le32_to_cpu(sb->dev_number);
+
 	if (refdev == 0)
 		ret = 1;
 	else {
@@ -1165,7 +1170,6 @@ static int super_1_validate(mddev_t *mdd
 
 	if (mddev->level != LEVEL_MULTIPATH) {
 		int role;
-		rdev->desc_nr = le32_to_cpu(sb->dev_number);
 		role = le16_to_cpu(sb->dev_roles[rdev->desc_nr]);
 		switch(role) {
 		case 0xffff: /* spare */


* Re: Bug in 2.6.17 / mdadm 2.5.1
From: Neil Brown @ 2006-06-26  1:53 UTC
  To: Ronald Lembcke; +Cc: linux-kernel, linux-raid

On Monday June 26, neilb@suse.de wrote:
> On Sunday June 25, es186@fen-net.de wrote:
> > Hi!
> > 
> > There's a bug in kernel 2.6.17 and/or mdadm which prevents (re)adding
> > a disk to a degraded RAID5 array.
> 
> Thank you for the detailed report.
> The bug is in the md driver in the kernel (not in mdadm), and only
> affects version-1 superblocks.  Debian recently changed the default
> (in /etc/mdadm/mdadm.conf) to use version-1 superblocks, which I
> thought would be OK (I've done some testing) but obviously I missed
> something. :-(
> 
> If you remove the "metadata=1" (or whatever it is) from
> /etc/mdadm/mdadm.conf and then create the array, it will be created
> with a version-0.90 superblock, which has had more testing.
> 
> Alternatively, you can apply the following patch to the kernel, and
> version-1 superblocks should work better.

And as a third alternative, you can apply this patch to mdadm 2.5.1.
It will work around the kernel bug.

NeilBrown

diff .prev/Manage.c ./Manage.c
--- .prev/Manage.c	2006-06-20 10:01:17.000000000 +1000
+++ ./Manage.c	2006-06-26 11:46:56.000000000 +1000
@@ -271,8 +271,14 @@ int Manage_subdevs(char *devname, int fd
 				 * If so, we can simply re-add it.
 				 */
 				st->ss->uuid_from_super(duuid, dsuper);
-			
-				if (osuper) {
+
+				/* re-add doesn't work for version-1 superblocks
+				 * before 2.6.18 :-(
+				 */
+				if (array.major_version == 1 &&
+				    get_linux_version() <= 2006018)
+					;
+				else if (osuper) {
 					st->ss->uuid_from_super(ouuid, osuper);
 					if (memcmp(duuid, ouuid, sizeof(ouuid))==0) {
 						/* look close enough for now.  Kernel
@@ -295,7 +301,10 @@ int Manage_subdevs(char *devname, int fd
 					}
 				}
 			}
-			for (j=0; j< st->max_devs; j++) {
+			/* due to a bug in 2.6.17 and earlier, we start
+			 * looking from raid_disks, not 0
+			 */
+			for (j = array.raid_disks ; j< st->max_devs; j++) {
 				disc.number = j;
 				if (ioctl(fd, GET_DISK_INFO, &disc))
 					break;

diff .prev/super1.c ./super1.c
--- .prev/super1.c	2006-06-20 10:01:46.000000000 +1000
+++ ./super1.c	2006-06-26 11:47:12.000000000 +1000
@@ -277,6 +277,18 @@ static void examine_super1(void *sbv, ch
 	default: break;
 	}
 	printf("\n");
+	printf("    Array Slot : %d (", __le32_to_cpu(sb->dev_number));
+	for (i= __le32_to_cpu(sb->max_dev); i> 0 ; i--)
+		if (__le16_to_cpu(sb->dev_roles[i-1]) != 0xffff)
+			break;
+	for (d=0; d < i; d++) {
+		int role = __le16_to_cpu(sb->dev_roles[d]);
+		if (d) printf(", ");
+		if (role == 0xffff) printf("empty");
+		else if(role == 0xfffe) printf("failed");
+		else printf("%d", role);
+	}
+	printf(")\n");
 	printf("   Array State : ");
 	for (d=0; d<__le32_to_cpu(sb->raid_disks); d++) {
 		int cnt = 0;
@@ -767,7 +779,8 @@ static int write_init_super1(struct supe
 		if (memcmp(sb->set_uuid, refsb->set_uuid, 16)==0) {
 			/* same array, so preserve events and dev_number */
 			sb->events = refsb->events;
-			sb->dev_number = refsb->dev_number;
+			if (get_linux_version() >= 2006018)
+				sb->dev_number = refsb->dev_number;
 		}
 		free(refsb);
 	}
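
For reference: with this patch applied, "mdadm -E" on the re-added
loop2 from the report above should print something like the following
(my reconstruction from the dev_roles dump, not captured output):

    Array Slot : 2 (0, 1, failed, failed, 3, 2)

which makes the stale dev_number / dev_roles combination visible at a
glance.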


* Re: Bug in 2.6.17 / mdadm 2.5.1
From: Andre Tomt @ 2006-06-26 21:24 UTC
  To: Neil Brown; +Cc: Ronald Lembcke, linux-kernel, linux-raid

Neil Brown wrote:
<snip>
> Alternately you can apply the following patch to the kernel and
> version-1 superblocks should work better.

-stable material?


* Re: Bug in 2.6.17 / mdadm 2.5.1
From: Neil Brown @ 2006-06-27  1:00 UTC
  To: Andre Tomt; +Cc: Ronald Lembcke, linux-kernel, linux-raid

On Monday June 26, andre@tomt.net wrote:
> Neil Brown wrote:
> <snip>
> > Alternately you can apply the following patch to the kernel and
> > version-1 superblocks should work better.
> 
> -stable material?

Maybe.  I'm not sure it exactly qualifies, but I might try sending it
to them and see what they think.

NeilBrown

