* Trying to get POLICY working
From: Caspar Smit @ 2014-10-31 15:19 UTC (permalink / raw)
  To: linux-raid

Hi all,

I'm trying to get the POLICY framework of mdadm working but I can't seem to.

As I understand from the mdadm man page, Incremental mode together with
POLICY directives should allow adding a new disk without an MD superblock
as a spare to an already active array:

"Note that mdadm will normally only add devices to an array which were
previously working (active or spare) parts of that array.  The support
for automatic inclusion of a new drive as a spare in some array
requires a configuration through POLICY in config file."

Furthermore:

"If no md metadata is found, the device may be still added to an array
as a spare if POLICY allows."


To get the basics working I created a system with 3 disks /dev/sdb,
/dev/sdc and /dev/sdd

Created a RAID5 with one missing disk:

mdadm -C /dev/md0 -l 5 -n 3 /dev/sd[b-c] missing

I set the POLICY in mdadm.conf to:

POLICY action=force-spare

This should add any device (passed through mdadm --incremental) as a
spare no matter what (am I correct?).

Now when I do:

#mdadm --incremental /dev/sdd
mdadm: no RAID superblock on /dev/sdd.

Well, I know there is no MD superblock on /dev/sdd, but shouldn't the
policy setting kick in here and add /dev/sdd as a spare to /dev/md0 (and
hence start rebuilding)?

mdadm version: 3.2.5-5 (latest debian wheezy stable)
kernel version: 3.2.63-2 (latest debian wheezy stable)

Kind regards,
Caspar Smit


* Re: Trying to get POLICY working
From: Robin Hill @ 2014-10-31 15:34 UTC (permalink / raw)
  To: Caspar Smit; +Cc: linux-raid


On Fri Oct 31, 2014 at 04:19:04PM +0100, Caspar Smit wrote:

> I set the POLICY in mdadm.conf to:
> 
> POLICY action=force-spare
> 
According to the mdadm.conf manual page on my machine:
      The action item determines the automatic behavior allowed for
      devices matching the path and type in the same line. If a
      device matches several lines with different actions then the
      most permissive will apply. The ordering of policy lines is
      irrelevant to the end result.

With the examples given being:
      POLICY domain=domain1 metadata=imsm path=pci-0000:00:1f.2-scsi-* action=spare
      POLICY domain=domain1 metadata=imsm path=pci-0000:04:00.0-scsi-[01]* action=include

So I'd guess that the path= entry is required (though the type value
would look to be optional, which is not clear from the text).
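
The "most permissive will apply" rule from the manual page excerpt can be
sketched in shell. This is only an illustration and not how mdadm implements
it: the function names action_rank and most_permissive are made up for this
example, and the ranking assumes the order in which mdadm.conf(5) lists the
actions (include, re-add, spare, spare-same-slot, force-spare), from least to
most permissive.

```shell
# Hypothetical helper: map an action name to a rank, higher = more permissive.
# The ordering is an assumption taken from the list in mdadm.conf(5).
action_rank() {
    case "$1" in
        include)         echo 1 ;;
        re-add)          echo 2 ;;
        spare)           echo 3 ;;
        spare-same-slot) echo 4 ;;
        force-spare)     echo 5 ;;
        *)               echo 0 ;;   # unknown action names rank lowest
    esac
}

# Hypothetical helper: print the most permissive of the actions given.
most_permissive() {
    best=
    best_rank=0
    for a in "$@"; do
        r=$(action_rank "$a")
        if [ "$r" -gt "$best_rank" ]; then
            best=$a
            best_rank=$r
        fi
    done
    printf '%s\n' "$best"
}

# A device matching both example POLICY lines (action=spare and action=include).
most_permissive spare include
```

Under that assumption, a device matching both example lines would be treated
as action=spare, whichever line comes first in the file.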

HTH,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |


* Re: Trying to get POLICY working
From: NeilBrown @ 2014-11-01  0:20 UTC (permalink / raw)
  To: Caspar Smit; +Cc: linux-raid


On Fri, 31 Oct 2014 16:19:04 +0100 Caspar Smit <c.smit@truebit.nl> wrote:

> I set the POLICY in mdadm.conf to:
> 
> POLICY action=force-spare
> 
> This should add any device (passed through mdadm --incremental) as
> spare no matter what (Am i correct?)

That is the theory, yes.

> 
> Now when I do:
> 
> #mdadm --incremental /dev/sdd
> mdadm: no RAID superblock on /dev/sdd.

The message suggests that 'guess_super' found something on the device, but
it didn't turn out to be anything useful. Not a very helpful message, I know.

What does "mdadm --examine /dev/sdd" report?
I suspect there is a partition table and that is causing the confusion.
Try removing the partition table (use dd to write zeros from /dev/zero over
the first few KiB of the device), then try again.
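
A minimal sketch of that wipe step, run against a scratch image file rather
than a real disk so it is safe to try. The helper name wipe_start and the
4 KiB size are assumptions for illustration; the thread only says "a few K".

```shell
# Hypothetical helper: zero the first 4 KiB of the target.
# conv=notrunc stops dd from truncating the target when it is a regular file.
wipe_start() {
    dd if=/dev/zero of="$1" bs=1024 count=4 conv=notrunc 2>/dev/null
}

# Build an 8 KiB scratch image instead of touching a real disk.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1024 count=8 2>/dev/null

# Plant the two MBR signature bytes (0x55 0xaa) at offset 510.
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null

wipe_start "$img"

# Dump the two bytes at offset 510 after the wipe.
dd if="$img" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1
```

On the system in this thread the target would be /dev/sdd, which needs root
and destroys whatever is on the disk. Zeroing only the start of the device
would not clear the backup GPT header that GPT-labelled disks keep at the end
of the disk; "wipefs --all" from util-linux may be a more thorough option.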

Probably need a fix like:

diff --git a/Incremental.c b/Incremental.c
index c9372587f518..3156190c4603 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -196,7 +196,7 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
 	policy = disk_policy(&dinfo);
 	have_target = policy_check_path(&dinfo, &target_array);
 
-	if (st == NULL && (st = guess_super(dfd)) == NULL) {
+	if (st == NULL && (st = guess_super_type(dfd, guess_array)) == NULL) {
 		if (c->verbose >= 0)
 			pr_err("no recognisable superblock on %s.\n",
 			       devname);


and I should probably improve the error messages too...

Thanks for the report.  Please let me know if that works, and what other
difficulties you hit.

Thanks,
NeilBrown





* Re: Trying to get POLICY working
From: NeilBrown @ 2014-11-03  1:54 UTC (permalink / raw)
  To: Caspar Smit; +Cc: linux-raid


On Sat, 1 Nov 2014 11:20:01 +1100 NeilBrown <neilb@suse.de> wrote:

> Thanks for the report.  Please let me know if that works, and what other
> difficulties you hit.

Actually, don't bother.  I must have been asleep.

Your problem is that you haven't defined a 'domain'.
A new spare needs to be assigned to a 'domain', and it will be attached to
any array in the same domain, as needed.

You can give all devices the domain "default" with

   POLICY domain=default

The domain of an array is inherited from the member devices, or can be set
with "spare-group=" in mdadm.conf.

So

   POLICY domain=default action=force-spare

should make it work for you.
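
A quick way to check a config file for this, as a rough sketch. The helper
name has_spare_policy is made up for this example. It only matches the exact
spelling used in this thread, with domain and action on the same POLICY line,
and it does not handle continuation lines.

```shell
# Hypothetical helper: succeed if FILE has a POLICY line that sets both
# a domain and action=force-spare on that same line.
has_spare_policy() {
    grep -E '^POLICY[[:space:]]' "$1" \
        | grep -E '[[:space:]]domain=[^[:space:]]+' \
        | grep -Eq '[[:space:]]action=force-spare([[:space:]]|$)'
}

# Try it on a scratch file rather than the real mdadm.conf.
conf=$(mktemp)

# The original setting from the first mail: no domain.
printf 'POLICY action=force-spare\n' > "$conf"
has_spare_policy "$conf" && echo "domain and force-spare set" || echo "no domain set"

# The corrected setting.
printf 'POLICY domain=default action=force-spare\n' > "$conf"
has_spare_policy "$conf" && echo "domain and force-spare set" || echo "no domain set"
```

On Debian the real file is usually /etc/mdadm/mdadm.conf, but that path is an
assumption here; the thread does not say which file was edited.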

NeilBrown


* Re: Trying to get POLICY working
From: Caspar Smit @ 2014-11-03  9:43 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hi Neil,

Actually BOTH of your answers were correct, thank you for that.

1) Your hunch was correct: my disk contained a partition table (in my
case an msdos label), so it was not added and I got the error from my
first mail:

mdadm: no RAID superblock on /dev/sdd.

mdadm -E /dev/sdd shows:

/dev/sdd:
   MBR Magic : aa55

So it finds 'something', but clearly nothing usable by mdadm.
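
The same check can be sketched without mdadm by reading the two signature
bytes at offset 510 of the first sector. The helper name has_mbr_magic is
made up for this example, and it is shown on a scratch image file. On disk the
bytes are 0x55 0xaa; mdadm presumably prints them as the little-endian 16-bit
value aa55.

```shell
# Hypothetical helper: succeed if bytes 510-511 of FILE are 0x55 0xaa.
has_mbr_magic() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Build a blank 512-byte scratch image instead of reading a real disk.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
has_mbr_magic "$img" && echo "MBR signature present" || echo "no MBR signature"

# Plant the signature and check again.
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null
has_mbr_magic "$img" && echo "MBR signature present" || echo "no MBR signature"
```

Pointing the helper at /dev/sdd instead of the image would need read access to
the device, which normally means root.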

Wiping the partition table and trying again resulted in a different
error message:

mdadm: no recognisable superblock on /dev/sdd.

That is better, but the disk was still not added to the array.

2) To make it work I also needed domain=default in the POLICY setting.

It still gave me the:

mdadm: no recognisable superblock on /dev/sdd.

But now the disk got added to the array and started rebuilding.

Note: ONLY setting domain=default in POLICY, without clearing the
partition table, results in:

mdadm: no RAID superblock on /dev/sdd.

and the disk is not added, so BOTH measures were needed.

Note 2: I didn't need the spare-group directive, so I think
domain=default is a special case where all disks and arrays are placed
in the same domain.


Furthermore, I found something which I think should not happen
(bug?), or maybe I am wrong:

With a working clean array:

# more /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd[3] sdc[1] sdb[0]
      203776 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

# mdadm --fail /dev/md0 /dev/sdd
mdadm: set /dev/sdd faulty in /dev/md0

# mdadm --remove /dev/md0 /dev/sdd
mdadm: hot removed /dev/sdd from /dev/md0

# mdadm --incremental /dev/sdd
mdadm: failed to add /dev/sdd to /dev/md/0: Invalid argument.

So when it actually finds a device with an MD superblock, it doesn't
add it. Is this expected behavior because the disk was failed (so it is
probably not a good idea to add it back), or is this a bug?

Kind regards,
Caspar




* Re: Trying to get POLICY working
From: NeilBrown @ 2014-11-05  5:28 UTC (permalink / raw)
  To: Caspar Smit; +Cc: linux-raid


On Mon, 3 Nov 2014 10:43:29 +0100 Caspar Smit <c.smit@truebit.nl> wrote:

> Note: ONLY setting the domain=default in POLICY without clearing the
> partition table results in:
> mdadm: no RAID superblock on /dev/sdd. and the disk will not be added
> so BOTH measures were needed.

Thanks for testing and reporting. The patch I posted before (included in a
more complete form below) should allow "domain=default" to be enough on its
own.


> 
> Note2: I didn't need the spare-group directive so I think
> domain=default is a special case where all disks and arrays are placed
> in the same domain.

"spare-group" is really only for "legacy" support.  If a domain is defined
for disks, the array made up of those disks inherits the domain.


> 
> 
> Furthermore i found out something which i think should not happen
> (bug?) or maybe i am wrong:
> 
> With a working clean array:
> 
> # more /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid5 sdd[3] sdc[1] sdb[0]
>       203776 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]
> 
> # mdadm --fail /dev/md0 /dev/sdd
> mdadm: set /dev/sdd faulty in /dev/md0
> 
> # mdadm --remove /dev/md0 /dev/sdd
> mdadm: hot removed /dev/sdd from /dev/md0
> 
> # mdadm --incremental /dev/sdd
> mdadm: failed to add /dev/sdd to /dev/md/0: Invalid argument.
> 
> So when it actually finds a device with an MD superblock it doesn't
> add it, is this expected behavior as the disk was failed (so probably
> not a good idea to add it back) or is this a bug?

Presumably "action=force-spare" was still active when you tried this?

In that case it is a bug (I think).  It should clean out the device and add
it as a spare...

I just tested with mdadm from my git tree, and it works as expected.
With "action=force-spare" I get:

mdadm: /dev/loop2 attached to /dev/md0 which is already active.

When I have "action=re-add" I get:

mdadm: can only add /dev/loop2 to /dev/md0 as a spare, and force-spare is not set.
mdadm: failed to add /dev/loop2 to existing array /dev/md0: Invalid argument.

Maybe you need a newer mdadm ...

Thanks,
NeilBrown


From: NeilBrown <neilb@suse.de>
Date: Wed, 5 Nov 2014 16:21:42 +1100
Subject: [PATCH] Incremental: don't be distracted by partition table when
 calling try_spare.

Currently a partition table on a device makes "mdadm -I" think
the array has a particular metadata type and so will only
add it to an array of that (partition table) type .. which doesn't
make any sense.

So tell guess_super to only look for 'array' metadata.

Reported-by: Caspar Smit <c.smit@truebit.nl>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/Incremental.c b/Incremental.c
index c9372587f518..13b68bc0adea 100644
--- a/Incremental.c
+++ b/Incremental.c
@@ -196,13 +196,13 @@ int Incremental(struct mddev_dev *devlist, struct context *c,
 	policy = disk_policy(&dinfo);
 	have_target = policy_check_path(&dinfo, &target_array);
 
-	if (st == NULL && (st = guess_super(dfd)) == NULL) {
+	if (st == NULL && (st = guess_super_type(dfd, guess_array)) == NULL) {
 		if (c->verbose >= 0)
 			pr_err("no recognisable superblock on %s.\n",
 			       devname);
 		rv = try_spare(devname, &dfd, policy,
 			       have_target ? &target_array : NULL,
-			       st, c->verbose);
+			       NULL, c->verbose);
 		goto out;
 	}
 	st->ignore_hw_compat = 1;
