From: "Schmidt, Annemarie" <Annemarie.Schmidt@stratus.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: RE: Mdadm re-add fails
Date: Fri, 27 May 2011 17:16:46 -0400
Message-ID: <5AA430FFE4486C448003201AC83BC85E01B0357D@EXHQ.corp.stratus.com>
In-Reply-To: <5AA430FFE4486C448003201AC83BC85E01B0353E@EXHQ.corp.stratus.com>
Hi Neil,
I've unfortunately run into a problem with the patch to the enough_fd code. It does not appear to work in all cases.
mdadm --detail /dev/md21
    Number   Major   Minor   RaidDevice State
       3      65       18        0      active sync   /dev/sdc2
       2      65       50        1      active sync   /dev/sdk2
Here it works when I remove /dev/sdk2
>> mdadm /dev/md21 -f /dev/sdk2 -r /dev/sdk2
mdadm: set /dev/sdk2 faulty in /dev/md21
mdadm: hot removed /dev/sdk2 from /dev/md21
>> mdadm /dev/md21 -a /dev/sdk2
mdadm: re-added /dev/sdk2
But when I remove the other disk, /dev/sdc2, the re-add fails:
>> mdadm /dev/md21 -f /dev/sdc2 -r /dev/sdc2
mdadm: set /dev/sdc2 faulty in /dev/md21
mdadm: hot removed /dev/sdc2 from /dev/md21
>> mdadm /dev/md21 -a /dev/sdc2
mdadm: /dev/sdc2 reports being an active member for /dev/md21, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sdc2 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdc2" first.
I could get it all to work when I removed this line from the patch:
+ array.raid_disks--;
>> mdadm_good_patch_minus_dec /dev/md21 -f /dev/sdk2 -r /dev/sdk2
mdadm: set /dev/sdk2 faulty in /dev/md21
mdadm: hot removed /dev/sdk2 from /dev/md21
>> mdadm_good_patch_minus_dec /dev/md21 -a /dev/sdk2
mdadm: re-added /dev/sdk2
>> mdadm_good_patch_minus_dec /dev/md21 -f /dev/sdc2 -r /dev/sdc2
mdadm: set /dev/sdc2 faulty in /dev/md21
mdadm: hot removed /dev/sdc2 from /dev/md21
>> mdadm_good_patch_minus_dec /dev/md21 -a /dev/sdc2
mdadm: re-added /dev/sdc2
So can this line simply be removed or does the patch need to be reworked?
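
A minimal user-space sketch of what may be happening, not the mdadm code itself. It assumes that md21 has raid_disks == 2, that empty slots are reported with major 0 and minor 0, and that the surviving device sits in the slot shown in the --detail output above (sdc2 is Number 3 with role 0, sdk2 is Number 2 with role 1). The struct and function names are hypothetical stand-ins for the GET_DISK_INFO results. Under those assumptions, decrementing raid_disks before the range check rejects a survivor whose role is 1, which would match the failing case.

```c
#include <string.h>

#define MAX_SLOTS 1024  /* loop cap used in the patch */

/* Hypothetical stand-in for what GET_DISK_INFO reports for one slot. */
struct slot {
    int present;    /* 0 models major == 0 && minor == 0 */
    int raid_disk;  /* role of the device in the array */
    int in_sync;    /* models the MD_DISK_SYNC state bit */
};

/*
 * Models the patched loop. If decrement is non-zero, raid_disks is
 * decremented for every present device, as in the patch; otherwise it
 * is left alone. Returns how many in-sync devices pass the range check.
 */
int count_avail(const struct slot *slots, int raid_disks, int decrement)
{
    int avail = 0;

    for (int i = 0; i < MAX_SLOTS && raid_disks > 0; i++) {
        if (!slots[i].present)
            continue;
        if (decrement)
            raid_disks--;
        if (!slots[i].in_sync)
            continue;
        /* With raid_disks already reduced to 1, role 1 is out of range. */
        if (slots[i].raid_disk < 0 || slots[i].raid_disk >= raid_disks)
            continue;
        avail++;
    }
    return avail;
}

/* A two-device raid1 in which only one device, in slot `number`, remains. */
int avail_with_one_device(int number, int raid_disk, int decrement)
{
    static struct slot slots[MAX_SLOTS];

    memset(slots, 0, sizeof slots);
    slots[number] = (struct slot){ .present = 1, .raid_disk = raid_disk, .in_sync = 1 };
    return count_avail(slots, 2, decrement);
}
```

Calling avail_with_one_device(3, 0, 1) models the case where sdk2 was removed, and avail_with_one_device(2, 1, 1) the case where sdc2 was removed; passing 0 as the last argument models the build without the decrement.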
Thanks & regards,
Annemarie Schmidt
-----Original Message-----
From: Schmidt, Annemarie
Sent: Friday, May 20, 2011 1:16 PM
To: 'NeilBrown'
Cc: linux-raid@vger.kernel.org; Dailey, Nate
Subject: RE: Mdadm re-add fails
Neil,
Yes, that worked:
>> [root@typhon ~]# mdadm --detail /dev/md24
/dev/md24:
Version : 1.2
Creation Time : Fri May 20 11:42:17 2011
Raid Level : raid1
Array Size : 5241844 (5.00 GiB 5.37 GB)
Used Dev Size : 5241844 (5.00 GiB 5.37 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Fri May 20 12:47:09 2011
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : typhon.mno.stratus.com:24 (local to host typhon.mno.stratus.com)
UUID : 562323d9:9a7b2979:a734abf0:b3fb8f0b
Events : 155
    Number   Major   Minor   RaidDevice State
       3      65       22        0      active sync   /dev/sdc6
       2      65       54        1      active sync   /dev/sdk6
>> [root@typhon sbin]# mdadm /dev/md24 -f /dev/sdk6 -r /dev/sdk6
mdadm: set /dev/sdk6 faulty in /dev/md24
mdadm: hot removed /dev/sdk6 from /dev/md24
Without the fix:
---------------------
>> [root@typhon sbin]# mdadm /dev/md24 -a /dev/sdk6
mdadm: /dev/sdk6 reports being an active member for /dev/md24, but a --re-add fails.
mdadm: not performing --add as that would convert /dev/sdk6 in to a spare.
mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdk6" first.
With the fix:
-----------------
>> [root@typhon ~]# ./mdadm /dev/md24 -a /dev/sdk6
mdadm: re-added /dev/sdk6
Thanks very much for the assistance.
Regards,
Annemarie
-----Original Message-----
From: NeilBrown [mailto:neilb@suse.de]
Sent: Thursday, May 19, 2011 7:52 PM
To: Schmidt, Annemarie
Cc: linux-raid@vger.kernel.org
Subject: Re: Mdadm re-add fails
On Wed, 18 May 2011 10:43:47 -0400 "Schmidt, Annemarie"
<Annemarie.Schmidt@stratus.com> wrote:
> Hi!
>
> I have a 2 disk raid1 data array. As a result of other testing, the device info
> in the superblock for one of the partners, /dev/sdc2, ended up being in slot 3
> of the device info array:
>
> [root@typhon ~]# mdadm --detail /dev/md21
> /dev/md21:
> Version : 1.2
> Creation Time : Mon May 9 11:19:43 2011
> Raid Level : raid1
> Array Size : 5241844 (5.00 GiB 5.37 GB)
> Used Dev Size : 5241844 (5.00 GiB 5.37 GB)
> Raid Devices : 2
> Total Devices : 2
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Thu May 12 15:51:50 2011
> State : active
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> Name : typhon.mno.stratus.com:21 (local to host typhon.mno.stratus.com)
> UUID : 996d993f:baac367a:8b154ba9:43e56cff
> Events : 687
>
>     Number   Major   Minor   RaidDevice State
> -->    3      65       34        0      active sync   /dev/sdc2
>        2      65       82        1      active sync   /dev/sdk2
>
> When I remove /dev/sdk2 and then re-add it, the re-add fails:
>
> >> [root@typhon ~]# mdadm /dev/md21 -f /dev/sdk2 -r /dev/sdk2
> mdadm: set /dev/sdk2 faulty in /dev/md21
> mdadm: hot removed /dev/sdk2 from /dev/md21
>
> >> [root@typhon ~]# mdadm /dev/md21 -a /dev/sdk2
> mdadm: /dev/sdk2 reports being an active member for /dev/md21, but a --re-add
> fails.
> mdadm: not performing --add as that would convert /dev/sdk2 in to a spare.
> mdadm: To make this a spare, use "mdadm --zero-superblock /dev/sdk2" first.
>
> I believe the re-add fails because the enough_fd function (util.c) is not searching deep enough into the
> dev_info array with this line of code:
> for (i=0; i<array.raid_disks + array.nr_disks; i++)
>
> array.raid_disks = 2 and array.nr_disks = 1, and so for this particular md device, it is only looking at slots 0-2.
> I believe the code needs to be changed to look at all possible dev_info array slots, taking into account the
> version of the superblock (like the Detail function does in Detail.c).
>
> Do folks agree?
>
I do - largely. I think there might be a better more general way to control
the loop though.
Could you try this please?
Thanks,
NeilBrown
diff --git a/util.c b/util.c
index 1056ae4..d005e0a 100644
--- a/util.c
+++ b/util.c
@@ -370,10 +370,14 @@ int enough_fd(int fd)
             array.raid_disks <= 0)
                 return 0;
         avail = calloc(array.raid_disks, 1);
-        for (i=0; i<array.raid_disks + array.nr_disks; i++) {
+        for (i=0; i < 1024 && array.raid_disks > 0; i++) {
                 disk.number = i;
                 if (ioctl(fd, GET_DISK_INFO, &disk) != 0)
                         continue;
+                if (disk.major == 0 && disk.minor == 0)
+                        continue;
+                array.raid_disks--;
+
                 if (! (disk.state & (1<<MD_DISK_SYNC)))
                         continue;
                 if (disk.raid_disk < 0 || disk.raid_disk >= array.raid_disks)
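
For comparison, a small sketch that contrasts the original loop bound with one possible rework. The rework is an assumption, not something confirmed in this thread: it supposes the intent of the early exit was to stop once every present device has been seen, so it counts down a copy of nr_disks and leaves raid_disks untouched for the range check. The struct, the helper, and the function names are hypothetical, and the slot table stands in for GET_DISK_INFO.

```c
#include <string.h>

#define MAX_SLOTS 1024  /* same cap as in the patch */

/* Hypothetical stand-in for what GET_DISK_INFO reports for one slot. */
struct slot {
    int present;    /* 0 models major == 0 && minor == 0 */
    int raid_disk;  /* role of the device in the array */
    int in_sync;    /* models the MD_DISK_SYNC state bit */
};

/* Original bound: only slots 0 .. raid_disks + nr_disks - 1 are examined. */
int avail_old_bound(const struct slot *slots, int raid_disks, int nr_disks)
{
    int avail = 0;

    for (int i = 0; i < raid_disks + nr_disks && i < MAX_SLOTS; i++) {
        if (!slots[i].present || !slots[i].in_sync)
            continue;
        if (slots[i].raid_disk < 0 || slots[i].raid_disk >= raid_disks)
            continue;
        avail++;
    }
    return avail;
}

/* Assumed rework: stop after nr_disks present devices; raid_disks is not modified. */
int avail_countdown(const struct slot *slots, int raid_disks, int nr_disks)
{
    int avail = 0;

    for (int i = 0; i < MAX_SLOTS && nr_disks > 0; i++) {
        if (!slots[i].present)
            continue;
        nr_disks--;
        if (!slots[i].in_sync)
            continue;
        if (slots[i].raid_disk < 0 || slots[i].raid_disk >= raid_disks)
            continue;
        avail++;
    }
    return avail;
}

/* Builds a slot table holding a single in-sync device in slot `number`. */
const struct slot *one_device(int number, int raid_disk)
{
    static struct slot slots[MAX_SLOTS];

    memset(slots, 0, sizeof slots);
    slots[number].present = 1;
    slots[number].raid_disk = raid_disk;
    slots[number].in_sync = 1;
    return slots;
}
```

With raid_disks = 2 and nr_disks = 1, the original bound stops at slot 2 and so never sees a device in slot 3, while the countdown variant finds the surviving device in either slot 2 or slot 3 in this model.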