All of lore.kernel.org
* mdadm 3.3 fails to kick out non fresh disk
@ 2013-09-13 13:22 Francis Moreau
  2013-09-13 20:43 ` NeilBrown
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-09-13 13:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hi Neil,

I'm probably doing something wrong, since this would be a pretty
critical bug, but I can't see what.

I create a RAID1 array with 1.2 metadata. After that I stop the
array and restart it with only one disk. I write some data to the
array and then stop it again:

# mkfs.ext4 /dev/md125
# mdadm --stop /dev/md125
# mdadm -IRs /dev/loop0
# mount /dev/md125 /mnt/
# date >/mnt/foo
# umount /mnt
# mdadm --stop /dev/md125
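
(Editorial note: the array itself is created before the listing above;
a minimal setup on loop devices might look like the following sketch.
The file sizes, loop device names and the array name are assumptions,
not taken from the original report:)

```shell
# create two small backing files and attach them to loop devices
truncate -s 128M disk0.img disk1.img
losetup /dev/loop0 disk0.img
losetup /dev/loop1 disk1.img

# create the RAID1 array with 1.2 metadata
mdadm --create /dev/md/array1 --level=1 --raid-devices=2 \
      --metadata=1.2 /dev/loop0 /dev/loop1
```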

Finally I restart the array with both disks (one of them now outdated)
and mdadm happily activates the array without error. Note that I add
the outdated disk first in this case:

# mdadm -IRs /dev/loop1
mdadm: /dev/loop1 attached to /dev/md/array1, which has been started.
# mdadm -IRs /dev/loop0
mdadm: /dev/loop0 attached to /dev/md/array1 which is already active.
# cat /proc/mdstat
Personalities : [raid1]
md125 : active raid1 loop0[0] loop1[1]
      117056 blocks super 1.2 [2/2] [UU]
# mount /dev/md125 /mnt
# ls /mnt/
[  457.321771] EXT4-fs error (device md125): ext4_lookup:1047: inode
#2: comm ls: deleted inode referenced: 12
ls: cannot access /mnt/1: Input/output error

If I add the outdated disk last, I get this:
# mdadm -IRs /dev/loop0
mdadm: /dev/loop0 attached to /dev/md/array1, which has been started.
# mdadm -IRs /dev/loop1
mdadm: can only add /dev/loop1 to /dev/md/array1 as a spare, and
force-spare is not set.
mdadm: failed to add /dev/loop1 to existing array /dev/md/array1:
Invalid argument.

which doesn't tell me why loop1 must be a spare.

Is this expected? If so, could you enlighten me?

Thanks
-- 
Francis

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-13 13:22 mdadm 3.3 fails to kick out non fresh disk Francis Moreau
@ 2013-09-13 20:43 ` NeilBrown
  2013-09-13 22:35   ` Francis Moreau
  0 siblings, 1 reply; 25+ messages in thread
From: NeilBrown @ 2013-09-13 20:43 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid


On Fri, 13 Sep 2013 15:22:20 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> Hi Neil,
> 
> I'm probably doing something wrong since it's a pretty critical bug
> but can't see what.
> 
> I'm creating a RAID1 array with 1.2 metadata. After that I stop the
> array, and restart the array with only one disk. I write random data
> on the array and then stop it again:
> 
> # mkfs.ext4 /dev/md125
> # mdadm --stop /dev/md125
> # mdadm -IRs /dev/loop0
> # mount /dev/md125 /mnt/
> # date >/mnt/foo
> # umount /mnt
> # mdadm --stop /dev/md125
> 
> Finally I restart the array with the 2 disks (one disk is outdated)
> and mdadm happily activates the array without error. Note that I add
> the outdated disk first in that case:
> 
> # mdadm -IRs /dev/loop1
> mdadm: /dev/loop1 attached to /dev/md/array1, which has been started.
> # mdadm -IRs /dev/loop0
> mdadm: /dev/loop0 attached to /dev/md/array1 which is already active.

That's a worry.  I'm not sure how to fix it.

I would probably suggest you don't use "-IR" to add devices.  That would make
it a lot less likely to happen.


> # cat /proc/mdstat
> Personalities : [raid1]
> md125 : active raid1 loop0[0] loop1[1]
>       117056 blocks super 1.2 [2/2] [UU]
> # mount /dev/md125 /mnt
> # ls /mnt/
> [  457.321771] EXT4-fs error (device md125): ext4_lookup:1047: inode
> #2: comm ls: deleted inode referenced: 12
> ls: cannot access /mnt/1: Input/output error
> 
> If I add the outdated disk last I got this:
> # mdadm -IRs /dev/loop0
> mdadm: /dev/loop0 attached to /dev/md/array1, which has been started.
> # mdadm -IRs /dev/loop1
> mdadm: can only add /dev/loop1 to /dev/md/array1 as a spare, and
> force-spare is not set.
> mdadm: failed to add /dev/loop1 to existing array /dev/md/array1:
> Invalid argument.
> 
> which didn't tell me the reason why loop1 must be a spare.

It must be a spare because it is out of date.


NeilBrown

> 
> Is this expected ? If so could you enlight me ?
> 
> Thanks




* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-13 20:43 ` NeilBrown
@ 2013-09-13 22:35   ` Francis Moreau
  2013-09-13 23:56     ` Roberto Spadim
  2013-09-14 10:38     ` NeilBrown
  0 siblings, 2 replies; 25+ messages in thread
From: Francis Moreau @ 2013-09-13 22:35 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hi Neil,

On Fri, Sep 13, 2013 at 10:43 PM, NeilBrown <neilb@suse.de> wrote:
> On Fri, 13 Sep 2013 15:22:20 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
>
>> Hi Neil,
>>
>> I'm probably doing something wrong since it's a pretty critical bug
>> but can't see what.
>>
>> I'm creating a RAID1 array with 1.2 metadata. After that I stop the
>> array, and restart the array with only one disk. I write random data
>> on the array and then stop it again:
>>
>> # mkfs.ext4 /dev/md125
>> # mdadm --stop /dev/md125
>> # mdadm -IRs /dev/loop0
>> # mount /dev/md125 /mnt/
>> # date >/mnt/foo
>> # umount /mnt
>> # mdadm --stop /dev/md125
>>
>> Finally I restart the array with the 2 disks (one disk is outdated)
>> and mdadm happily activates the array without error. Note that I add
>> the outdated disk first in that case:
>>
>> # mdadm -IRs /dev/loop1
>> mdadm: /dev/loop1 attached to /dev/md/array1, which has been started.
>> # mdadm -IRs /dev/loop0
>> mdadm: /dev/loop0 attached to /dev/md/array1 which is already active.
>
> That's a worry.  I'm not sure how to fix it.
>
> I would probably suggest you don't use "-IR" to add devices.  That would make
> it a lot less likely to happen.
>

Well, I'm not sure how I should start an array, then...

For example doing:

# mdadm -I /dev/loop0
# mdadm -I /dev/loop1
# mdadm -R /dev/md125

works for arrays using 1.2 metadata, but not if the array uses DDF
(mdmon is not started). To work around this issue you suggested using
-IRs:

# mdadm -IRs /dev/loop0
# mdadm -IRs /dev/loop1

but now mdadm can't detect the outdated disk anymore.

Could you suggest a way to start an array that works in all cases
(DDF or 1.2, adding a non-fresh disk, ...)?

>
>> # cat /proc/mdstat
>> Personalities : [raid1]
>> md125 : active raid1 loop0[0] loop1[1]
>>       117056 blocks super 1.2 [2/2] [UU]
>> # mount /dev/md125 /mnt
>> # ls /mnt/
>> [  457.321771] EXT4-fs error (device md125): ext4_lookup:1047: inode
>> #2: comm ls: deleted inode referenced: 12
>> ls: cannot access /mnt/1: Input/output error
>>
>> If I add the outdated disk last I got this:
>> # mdadm -IRs /dev/loop0
>> mdadm: /dev/loop0 attached to /dev/md/array1, which has been started.
>> # mdadm -IRs /dev/loop1
>> mdadm: can only add /dev/loop1 to /dev/md/array1 as a spare, and
>> force-spare is not set.
>> mdadm: failed to add /dev/loop1 to existing array /dev/md/array1:
>> Invalid argument.
>>
>> which didn't tell me the reason why loop1 must be a spare.
>
> It  must be a spare because it is out of date.
>

Yes, but I think mdadm should state the reason, no?

Thanks
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-13 22:35   ` Francis Moreau
@ 2013-09-13 23:56     ` Roberto Spadim
  2013-09-14 10:38     ` NeilBrown
  1 sibling, 0 replies; 25+ messages in thread
From: Roberto Spadim @ 2013-09-13 23:56 UTC (permalink / raw)
  To: Francis Moreau; +Cc: NeilBrown, Linux-RAID

Hi guys, I'm just reading this thread, and I sometimes have doubts
about mdadm and mdmon too... other times I know why something failed.

Just a side point, not related to the problem in this thread:

git has some nice hints that talk to the user, such as
use "git add <file>..." to include in what will be committed

Tips like these are very helpful for beginners and new users; would it
be nice to put some hints in mdadm too?
I don't remember the exact cases where hints would help, but maybe we
could start a new mail thread to discuss this feature?

Thanks for the space, guys; sorry I can't help in this thread itself.

2013/9/13 Francis Moreau <francis.moro@gmail.com>:
> Hi Neil,
>
> On Fri, Sep 13, 2013 at 10:43 PM, NeilBrown <neilb@suse.de> wrote:
>> On Fri, 13 Sep 2013 15:22:20 +0200 Francis Moreau <francis.moro@gmail.com>
>> wrote:
>>
>>> Hi Neil,
>>>
>>> I'm probably doing something wrong since it's a pretty critical bug
>>> but can't see what.
>>>
>>> I'm creating a RAID1 array with 1.2 metadata. After that I stop the
>>> array, and restart the array with only one disk. I write random data
>>> on the array and then stop it again:
>>>
>>> # mkfs.ext4 /dev/md125
>>> # mdadm --stop /dev/md125
>>> # mdadm -IRs /dev/loop0
>>> # mount /dev/md125 /mnt/
>>> # date >/mnt/foo
>>> # umount /mnt
>>> # mdadm --stop /dev/md125
>>>
>>> Finally I restart the array with the 2 disks (one disk is outdated)
>>> and mdadm happily activates the array without error. Note that I add
>>> the outdated disk first in that case:
>>>
>>> # mdadm -IRs /dev/loop1
>>> mdadm: /dev/loop1 attached to /dev/md/array1, which has been started.
>>> # mdadm -IRs /dev/loop0
>>> mdadm: /dev/loop0 attached to /dev/md/array1 which is already active.
>>
>> That's a worry.  I'm not sure how to fix it.
>>
>> I would probably suggest you don't use "-IR" to add devices.  That would make
>> it a lot less likely to happen.
>>
>
> Well I'm not sure how I should start an array...
>
> For example doing:
>
> # mdadm -I /dev/loop0
> # mdadm -I /dev/loop1
> # mdadm -R /dev/md125
>
> works for array using metadata 1.2 but doesn't if the array is using
> DDF (mdmon not started). To workaround this issue you suggested to use
> -IRs:
>
> # mdadm -IRs /dev/loop0
> # mdadm -IRs /dev/loop1
>
> but now mdadm can't detect outdated disk anymore.
>
> Could you suggest something to start an array which would work in all
> cases (ddf or 1.2, add non-fresh disk...) ?
>
>>
>>> # cat /proc/mdstat
>>> Personalities : [raid1]
>>> md125 : active raid1 loop0[0] loop1[1]
>>>       117056 blocks super 1.2 [2/2] [UU]
>>> # mount /dev/md125 /mnt
>>> # ls /mnt/
>>> [  457.321771] EXT4-fs error (device md125): ext4_lookup:1047: inode
>>> #2: comm ls: deleted inode referenced: 12
>>> ls: cannot access /mnt/1: Input/output error
>>>
>>> If I add the outdated disk last I got this:
>>> # mdadm -IRs /dev/loop0
>>> mdadm: /dev/loop0 attached to /dev/md/array1, which has been started.
>>> # mdadm -IRs /dev/loop1
>>> mdadm: can only add /dev/loop1 to /dev/md/array1 as a spare, and
>>> force-spare is not set.
>>> mdadm: failed to add /dev/loop1 to existing array /dev/md/array1:
>>> Invalid argument.
>>>
>>> which didn't tell me the reason why loop1 must be a spare.
>>
>> It  must be a spare because it is out of date.
>>
>
> Yes but I think mdadm should tell the reason, no  ?
>
> Thanks
> --
> Francis



-- 
Roberto Spadim


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-13 22:35   ` Francis Moreau
  2013-09-13 23:56     ` Roberto Spadim
@ 2013-09-14 10:38     ` NeilBrown
  2013-09-14 14:33       ` Francis Moreau
  1 sibling, 1 reply; 25+ messages in thread
From: NeilBrown @ 2013-09-14 10:38 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid


On Sat, 14 Sep 2013 00:35:47 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> Hi Neil,
> 
> On Fri, Sep 13, 2013 at 10:43 PM, NeilBrown <neilb@suse.de> wrote:
> > On Fri, 13 Sep 2013 15:22:20 +0200 Francis Moreau <francis.moro@gmail.com>
> > wrote:
> >
> >> Hi Neil,
> >>
> >> I'm probably doing something wrong since it's a pretty critical bug
> >> but can't see what.
> >>
> >> I'm creating a RAID1 array with 1.2 metadata. After that I stop the
> >> array, and restart the array with only one disk. I write random data
> >> on the array and then stop it again:
> >>
> >> # mkfs.ext4 /dev/md125
> >> # mdadm --stop /dev/md125
> >> # mdadm -IRs /dev/loop0
> >> # mount /dev/md125 /mnt/
> >> # date >/mnt/foo
> >> # umount /mnt
> >> # mdadm --stop /dev/md125
> >>
> >> Finally I restart the array with the 2 disks (one disk is outdated)
> >> and mdadm happily activates the array without error. Note that I add
> >> the outdated disk first in that case:
> >>
> >> # mdadm -IRs /dev/loop1
> >> mdadm: /dev/loop1 attached to /dev/md/array1, which has been started.
> >> # mdadm -IRs /dev/loop0
> >> mdadm: /dev/loop0 attached to /dev/md/array1 which is already active.
> >
> > That's a worry.  I'm not sure how to fix it.
> >
> > I would probably suggest you don't use "-IR" to add devices.  That would make
> > it a lot less likely to happen.
> >
> 
> Well I'm not sure how I should start an array...
> 
> For example doing:
> 
> # mdadm -I /dev/loop0
> # mdadm -I /dev/loop1
> # mdadm -R /dev/md125
> 
> works for array using metadata 1.2 but doesn't if the array is using
> DDF (mdmon not started). To workaround this issue you suggested to use
> -IRs:
> 
> # mdadm -IRs /dev/loop0
> # mdadm -IRs /dev/loop1

This isn't what I meant.
I meant that after you have run
  mdadm -I /dev/foo
for all devices, you then run
  mdadm -IRs
to start any arrays that are still degraded.
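
(Editorial note: the suggested sequence can thus be sketched as follows;
the device names are examples:)

```shell
# offer every member device to mdadm without forcing a start
mdadm -I /dev/loop0
mdadm -I /dev/loop1

# then start any arrays that are assembled but still degraded
mdadm -IRs
```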

BTW I think I've fixed the issue with mdadm -R /dev/md125 for DDF.
Try the latest git.

NeilBrown


> 
> but now mdadm can't detect outdated disk anymore.
> 
> Could you suggest something to start an array which would work in all
> cases (ddf or 1.2, add non-fresh disk...) ?
> 
> >
> >> # cat /proc/mdstat
> >> Personalities : [raid1]
> >> md125 : active raid1 loop0[0] loop1[1]
> >>       117056 blocks super 1.2 [2/2] [UU]
> >> # mount /dev/md125 /mnt
> >> # ls /mnt/
> >> [  457.321771] EXT4-fs error (device md125): ext4_lookup:1047: inode
> >> #2: comm ls: deleted inode referenced: 12
> >> ls: cannot access /mnt/1: Input/output error
> >>
> >> If I add the outdated disk last I got this:
> >> # mdadm -IRs /dev/loop0
> >> mdadm: /dev/loop0 attached to /dev/md/array1, which has been started.
> >> # mdadm -IRs /dev/loop1
> >> mdadm: can only add /dev/loop1 to /dev/md/array1 as a spare, and
> >> force-spare is not set.
> >> mdadm: failed to add /dev/loop1 to existing array /dev/md/array1:
> >> Invalid argument.
> >>
> >> which didn't tell me the reason why loop1 must be a spare.
> >
> > It  must be a spare because it is out of date.
> >
> 
> Yes but I think mdadm should tell the reason, no  ?
> 
> Thanks




* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-14 10:38     ` NeilBrown
@ 2013-09-14 14:33       ` Francis Moreau
  2013-09-14 15:06         ` Francis Moreau
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-09-14 14:33 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

On Sat, Sep 14, 2013 at 12:38 PM, NeilBrown <neilb@suse.de> wrote:
> On Sat, 14 Sep 2013 00:35:47 +0200 Francis Moreau <francis.moro@gmail.com>
> wrote:
>
>> Hi Neil,
>>
>> On Fri, Sep 13, 2013 at 10:43 PM, NeilBrown <neilb@suse.de> wrote:
>> > On Fri, 13 Sep 2013 15:22:20 +0200 Francis Moreau <francis.moro@gmail.com>
>> > wrote:
>> >
>> >> Hi Neil,
>> >>
>> >> I'm probably doing something wrong since it's a pretty critical bug
>> >> but can't see what.
>> >>
>> >> I'm creating a RAID1 array with 1.2 metadata. After that I stop the
>> >> array, and restart the array with only one disk. I write random data
>> >> on the array and then stop it again:
>> >>
>> >> # mkfs.ext4 /dev/md125
>> >> # mdadm --stop /dev/md125
>> >> # mdadm -IRs /dev/loop0
>> >> # mount /dev/md125 /mnt/
>> >> # date >/mnt/foo
>> >> # umount /mnt
>> >> # mdadm --stop /dev/md125
>> >>
>> >> Finally I restart the array with the 2 disks (one disk is outdated)
>> >> and mdadm happily activates the array without error. Note that I add
>> >> the outdated disk first in that case:
>> >>
>> >> # mdadm -IRs /dev/loop1
>> >> mdadm: /dev/loop1 attached to /dev/md/array1, which has been started.
>> >> # mdadm -IRs /dev/loop0
>> >> mdadm: /dev/loop0 attached to /dev/md/array1 which is already active.
>> >
>> > That's a worry.  I'm not sure how to fix it.
>> >
>> > I would probably suggest you don't use "-IR" to add devices.  That would make
>> > it a lot less likely to happen.
>> >
>>
>> Well I'm not sure how I should start an array...
>>
>> For example doing:
>>
>> # mdadm -I /dev/loop0
>> # mdadm -I /dev/loop1
>> # mdadm -R /dev/md125
>>
>> works for array using metadata 1.2 but doesn't if the array is using
>> DDF (mdmon not started). To workaround this issue you suggested to use
>> -IRs:
>>
>> # mdadm -IRs /dev/loop0
>> # mdadm -IRs /dev/loop1
>
> This isn't what I meant.
> I mean that after you had run
>   mdadm -I /dev/foo
> for all devices, you then run
>   mdadm -IRs
> to start any that are degraded.

Oh, sorry, I misunderstood what you previously wrote. Using '-I' to
add the devices makes mdadm notice that one disk is outdated.

>
> BTW I think I've fixed the issue with mdadm -R /dev/md125 for DDF.
> Try the latest git.

It seems it fixes the issue: mdmon is now correctly started with a
degraded DDF array.

However, after using the system with only one disk (sda), sdb is now
outdated. I rebooted the system with 2 disks but mdadm doesn't seem to
notice that sdb is outdated:

# mdadm -I /dev/sda
mdadm: container /dev/md/ddf0 now has 1 device
mdadm: /dev/md/array1_0 assembled with 1 device but not started
# mdadm -I /dev/sdb
mdadm: container /dev/md/ddf0 now has 2 devices
mdadm: Started /dev/md/array1_0 with 2 devices (1 new)
# cat /proc/mdstat
Personalities : [raid1]
md126 : active (auto-read-only) raid1 sdb[1] sda[0]
      2064384 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sdb[1](S) sda[0](S)
      65536 blocks super external:ddf

So this time mdadm fails to kick out the non-fresh disk (even when
using '-I'), but now with DDF.
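
(Editorial aside: one way to see what mdadm does or does not detect
here is to compare the on-disk metadata of both members with -E; for
1.2 metadata the stale member shows a lower event count, and the DDF
headers carry sequence numbers that can be compared the same way. The
exact output fields below are from memory, not from this thread:)

```shell
# native 1.2 metadata: compare the per-member event counters
mdadm -E /dev/loop0 | grep -i events
mdadm -E /dev/loop1 | grep -i events

# DDF: dump and compare the metadata of each container member
mdadm -E /dev/sda
mdadm -E /dev/sdb
```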

Thanks
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-14 14:33       ` Francis Moreau
@ 2013-09-14 15:06         ` Francis Moreau
  2013-09-14 20:43           ` Martin Wilck
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-09-14 15:06 UTC (permalink / raw)
  To: Martin Wilck; +Cc: linux-raid, NeilBrown

Martin,

On Sat, Sep 14, 2013 at 4:33 PM, Francis Moreau <francis.moro@gmail.com> wrote:
> On Sat, Sep 14, 2013 at 12:38 PM, NeilBrown <neilb@suse.de> wrote:
>> On Sat, 14 Sep 2013 00:35:47 +0200 Francis Moreau <francis.moro@gmail.com>
>> wrote:

[...]

>>
>> BTW I think I've fixed the issue with mdadm -R /dev/md125 for DDF.
>> Try the latest git.
>
> It seems it fixes the issue: mdmon is now correctly started with a
> degraded DDF array.
>
> However, after using the system with only one disk (sda), sdb is now
> outdated. I rebooted the system with 2 disks but mdadm doesn't seem to
> notice that sdb is outdated:
>
> # mdadm -I /dev/sda
> mdadm: container /dev/md/ddf0 now has 1 device
> mdadm: /dev/md/array1_0 assembled with 1 device but not started
> # mdadm -I /dev/sdb
> mdadm: container /dev/md/ddf0 now has 2 devices
> mdadm: Started /dev/md/array1_0 with 2 devices (1 new)
> # cat /proc/mdstat
> Personalities : [raid1]
> md126 : active (auto-read-only) raid1 sdb[1] sda[0]
>       2064384 blocks super external:/md127/0 [2/2] [UU]
>
> md127 : inactive sdb[1](S) sda[0](S)
>       65536 blocks super external:ddf
>
> So this time mdadm fails to kick out non fresh disk (when using '-I')
> but with DDF.
>

Do you perhaps have an idea why mdadm doesn't notice that sdb is
outdated in this case (DDF)?

Thanks
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-14 15:06         ` Francis Moreau
@ 2013-09-14 20:43           ` Martin Wilck
  2013-09-16 13:56             ` Francis Moreau
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2013-09-14 20:43 UTC (permalink / raw)
  To: Francis Moreau, linux-raid

On 09/14/2013 05:06 PM, Francis Moreau wrote:
> Martin,
> 
> On Sat, Sep 14, 2013 at 4:33 PM, Francis Moreau <francis.moro@gmail.com> wrote:
>> On Sat, Sep 14, 2013 at 12:38 PM, NeilBrown <neilb@suse.de> wrote:
>>> On Sat, 14 Sep 2013 00:35:47 +0200 Francis Moreau <francis.moro@gmail.com>
>>> wrote:
> 
> [...]
> 
>>>
>>> BTW I think I've fixed the issue with mdadm -R /dev/md125 for DDF.
>>> Try the latest git.
>>
>> It seems it fixes the issue: mdmon is now correctly started with a
>> degraded DDF array.
>>
>> However, after using the system with only one disk (sda), sdb is now
>> outdated. I rebooted the system with 2 disks but mdadm doesn't seem to
>> notice that sdb is outdated:
>>
>> # mdadm -I /dev/sda
>> mdadm: container /dev/md/ddf0 now has 1 device
>> mdadm: /dev/md/array1_0 assembled with 1 device but not started
>> # mdadm -I /dev/sdb
>> mdadm: container /dev/md/ddf0 now has 2 devices
>> mdadm: Started /dev/md/array1_0 with 2 devices (1 new)
>> # cat /proc/mdstat
>> Personalities : [raid1]
>> md126 : active (auto-read-only) raid1 sdb[1] sda[0]
>>       2064384 blocks super external:/md127/0 [2/2] [UU]
>>
>> md127 : inactive sdb[1](S) sda[0](S)
>>       65536 blocks super external:ddf
>>
>> So this time mdadm fails to kick out non fresh disk (when using '-I')
>> but with DDF.
>>
> 
> Maybe you have an idea on why mdadm doesn't notice that sdb is
> outdated in that case (DDF) ?

It's a bug. I am sending a patch soon. Thanks a lot.

Martin


> 
> Thanks



* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-14 20:43           ` Martin Wilck
@ 2013-09-16 13:56             ` Francis Moreau
  2013-09-16 17:04               ` Martin Wilck
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-09-16 13:56 UTC (permalink / raw)
  To: Martin Wilck; +Cc: linux-raid

Hi Martin,

On Sat, Sep 14, 2013 at 10:43 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 09/14/2013 05:06 PM, Francis Moreau wrote:
>> Martin,
>>
>> On Sat, Sep 14, 2013 at 4:33 PM, Francis Moreau <francis.moro@gmail.com> wrote:
>>> On Sat, Sep 14, 2013 at 12:38 PM, NeilBrown <neilb@suse.de> wrote:
>>>> On Sat, 14 Sep 2013 00:35:47 +0200 Francis Moreau <francis.moro@gmail.com>
>>>> wrote:
>>
>> [...]
>>
>>>>
>>>> BTW I think I've fixed the issue with mdadm -R /dev/md125 for DDF.
>>>> Try the latest git.
>>>
>>> It seems it fixes the issue: mdmon is now correctly started with a
>>> degraded DDF array.
>>>
>>> However, after using the system with only one disk (sda), sdb is now
>>> outdated. I rebooted the system with 2 disks but mdadm doesn't seem to
>>> notice that sdb is outdated:
>>>
>>> # mdadm -I /dev/sda
>>> mdadm: container /dev/md/ddf0 now has 1 device
>>> mdadm: /dev/md/array1_0 assembled with 1 device but not started
>>> # mdadm -I /dev/sdb
>>> mdadm: container /dev/md/ddf0 now has 2 devices
>>> mdadm: Started /dev/md/array1_0 with 2 devices (1 new)
>>> # cat /proc/mdstat
>>> Personalities : [raid1]
>>> md126 : active (auto-read-only) raid1 sdb[1] sda[0]
>>>       2064384 blocks super external:/md127/0 [2/2] [UU]
>>>
>>> md127 : inactive sdb[1](S) sda[0](S)
>>>       65536 blocks super external:ddf
>>>
>>> So this time mdadm fails to kick out non fresh disk (when using '-I')
>>> but with DDF.
>>>
>>
>> Maybe you have an idea on why mdadm doesn't notice that sdb is
>> outdated in that case (DDF) ?
>
> It's a bug. I am sending a patch soon. Thanks a lot.
>

I gave your patch "DDF: compare_super_ddf: fix sequence number
check" a try, and mdadm is now able to detect a difference between the
2 disks. It therefore refuses to insert the second disk, which is
an improvement.

However, it's still not able to detect which disk is the fresher one,
the way mdadm does with native RAID1 (metadata 1.2). Therefore mdadm is
not able to kick out the first disk if it happens to be the outdated one.

Is that expected?
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-16 13:56             ` Francis Moreau
@ 2013-09-16 17:04               ` Martin Wilck
  2013-09-20  8:56                 ` Francis Moreau
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2013-09-16 17:04 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid

On 09/16/2013 03:56 PM, Francis Moreau wrote:

> I did give your patch "DDF: compare_super_ddf: fix sequence number
> check" a try and now mdadm is able to detect a difference between the
> 2 disks. Therefore it refuses to insert the second disk which is
> better.
> 
> However it's still not able to detect which version is the "fresher"
> like mdadm does with soft RAID1 (metadata 1.2). Therefore mdadm is not
> able to kick out the first disk if it's the outdated one.
> 
> Is that expected ?

At the moment, yes. This needs work.

Martin


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-16 17:04               ` Martin Wilck
@ 2013-09-20  8:56                 ` Francis Moreau
  2013-09-20 18:07                   ` Martin Wilck
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-09-20  8:56 UTC (permalink / raw)
  To: Martin Wilck; +Cc: linux-raid

Hello Martin,

On Mon, Sep 16, 2013 at 7:04 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 09/16/2013 03:56 PM, Francis Moreau wrote:
>
>> I did give your patch "DDF: compare_super_ddf: fix sequence number
>> check" a try and now mdadm is able to detect a difference between the
>> 2 disks. Therefore it refuses to insert the second disk which is
>> better.
>>
>> However it's still not able to detect which version is the "fresher"
>> like mdadm does with soft RAID1 (metadata 1.2). Therefore mdadm is not
>> able to kick out the first disk if it's the outdated one.
>>
>> Is that expected ?
>
> At the moment, yes. This needs work.
>

Actually, this is worse than I thought: with your patch applied, mdadm
no longer rebuilds a re-added spare disk into a degraded DDF array.

For example on a DDF array:

# cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sdb[1] sda[0]
      2064384 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sdb[1](S) sda[0](S)
      65536 blocks super external:ddf

unused devices: <none>

# mdadm /dev/md126 --fail sdb
[   24.118434] md/raid1:md126: Disk failure on sdb, disabling device.
[   24.118437] md/raid1:md126: Operation continuing on 1 devices.
mdadm: set sdb faulty in /dev/md126

# mdadm /dev/md127 --remove sdb
mdadm: hot removed sdb from /dev/md127

# mdadm /dev/md127 --add /dev/sdb
mdadm: added /dev/sdb

# cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sda[0]
      2064384 blocks super external:/md127/0 [2/1] [U_]

md127 : inactive sdb[1](S) sda[0](S)
      65536 blocks super external:ddf

unused devices: <none>


As you can see, the reinserted disk sdb sits there as a spare and
isn't added back to the array.

Is it possible to make this major feature work again while keeping your improvement?

Thanks
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-20  8:56                 ` Francis Moreau
@ 2013-09-20 18:07                   ` Martin Wilck
  2013-09-20 21:08                     ` Francis Moreau
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2013-09-20 18:07 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid

On 09/20/2013 10:56 AM, Francis Moreau wrote:
> Hello Martin,
> 
> On Mon, Sep 16, 2013 at 7:04 PM, Martin Wilck <mwilck@arcor.de> wrote:
>> On 09/16/2013 03:56 PM, Francis Moreau wrote:
>>
>>> I did give your patch "DDF: compare_super_ddf: fix sequence number
>>> check" a try and now mdadm is able to detect a difference between the
>>> 2 disks. Therefore it refuses to insert the second disk which is
>>> better.
>>>
>>> However it's still not able to detect which version is the "fresher"
>>> like mdadm does with soft RAID1 (metadata 1.2). Therefore mdadm is not
>>> able to kick out the first disk if it's the outdated one.
>>>
>>> Is that expected ?
>>
>> At the moment, yes. This needs work.
>>
> 
> Actually this is worse than I thought: with your patch applied mdadm
> refuses to add back a spare disk into a degraded DDF array.
> 
> For example on a DDF array:
> 
> # cat /proc/mdstat
> Personalities : [raid1]
> md126 : active raid1 sdb[1] sda[0]
>       2064384 blocks super external:/md127/0 [2/2] [UU]
> 
> md127 : inactive sdb[1](S) sda[0](S)
>       65536 blocks super external:ddf
> 
> unused devices: <none>
> 
> # mdadm /dev/md126 --fail sdb
> [   24.118434] md/raid1:md126: Disk failure on sdb, disabling device.
> [   24.118437] md/raid1:md126: Operation continuing on 1 devices.
> mdadm: set sdb faulty in /dev/md126
> 
> # mdadm /dev/md127 --remove sdb
> mdadm: hot removed sdb from /dev/md127
> 
> # mdadm /dev/md127 --add /dev/sdb
> mdadm: added /dev/sdb
> 
> # cat /proc/mdstat
> Personalities : [raid1]
> md126 : active raid1 sda[0]
>       2064384 blocks super external:/md127/0 [2/1] [U_]
> 
> md127 : inactive sdb[1](S) sda[0](S)
>       65536 blocks super external:ddf
> 
> unused devices: <none>
> 
> 
> As you can see the reinserted disk sdb sits as spare and isn't added
> back to the array.

That's correct. You marked that disk failed.

> Is it possible to add this major feature work again and keep your improvement ?

No. A failed disk can't be added again without a rebuild. I am positive
about that.

Martin

> 
> Thanks



* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-20 18:07                   ` Martin Wilck
@ 2013-09-20 21:08                     ` Francis Moreau
  2013-09-21 13:22                       ` Francis Moreau
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-09-20 21:08 UTC (permalink / raw)
  To: Martin Wilck; +Cc: linux-raid

Hello Martin,

On Fri, Sep 20, 2013 at 8:07 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 09/20/2013 10:56 AM, Francis Moreau wrote:
>> Hello Martin,
>>
>> On Mon, Sep 16, 2013 at 7:04 PM, Martin Wilck <mwilck@arcor.de> wrote:
>>> On 09/16/2013 03:56 PM, Francis Moreau wrote:
>>>
>>>> I did give your patch "DDF: compare_super_ddf: fix sequence number
>>>> check" a try and now mdadm is able to detect a difference between the
>>>> 2 disks. Therefore it refuses to insert the second disk which is
>>>> better.
>>>>
>>>> However it's still not able to detect which version is the "fresher"
>>>> like mdadm does with soft RAID1 (metadata 1.2). Therefore mdadm is not
>>>> able to kick out the first disk if it's the outdated one.
>>>>
>>>> Is that expected ?
>>>
>>> At the moment, yes. This needs work.
>>>
>>
>> Actually this is worse than I thought: with your patch applied mdadm
>> refuses to add back a spare disk into a degraded DDF array.
>>
>> For example on a DDF array:
>>
>> # cat /proc/mdstat
>> Personalities : [raid1]
>> md126 : active raid1 sdb[1] sda[0]
>>       2064384 blocks super external:/md127/0 [2/2] [UU]
>>
>> md127 : inactive sdb[1](S) sda[0](S)
>>       65536 blocks super external:ddf
>>
>> unused devices: <none>
>>
>> # mdadm /dev/md126 --fail sdb
>> [   24.118434] md/raid1:md126: Disk failure on sdb, disabling device.
>> [   24.118437] md/raid1:md126: Operation continuing on 1 devices.
>> mdadm: set sdb faulty in /dev/md126
>>
>> # mdadm /dev/md127 --remove sdb
>> mdadm: hot removed sdb from /dev/md127
>>
>> # mdadm /dev/md127 --add /dev/sdb
>> mdadm: added /dev/sdb
>>
>> # cat /proc/mdstat
>> Personalities : [raid1]
>> md126 : active raid1 sda[0]
>>       2064384 blocks super external:/md127/0 [2/1] [U_]
>>
>> md127 : inactive sdb[1](S) sda[0](S)
>>       65536 blocks super external:ddf
>>
>> unused devices: <none>
>>
>>
>> As you can see the reinserted disk sdb sits as spare and isn't added
>> back to the array.
>
> That's correct. You marked that disk failed.
>
>> Is it possible to add this major feature work again and keep your improvement ?
>
> No. A failed disk can't be added again without rebuild. I am positive
> about that.
>

Hmm, that's not the case with soft Linux RAID AFAICS: doing the same
thing with soft RAID, the reinserted disk is added back to the RAID
array and synchronised automatically. You can try it easily.

Could you show me the mdadm command I should use to insert sdb into the array ?

Thanks.
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-20 21:08                     ` Francis Moreau
@ 2013-09-21 13:22                       ` Francis Moreau
  2013-09-23 20:02                         ` Martin Wilck
  2013-09-24 17:38                         ` Martin Wilck
  0 siblings, 2 replies; 25+ messages in thread
From: Francis Moreau @ 2013-09-21 13:22 UTC (permalink / raw)
  To: Martin Wilck; +Cc: linux-raid

On Fri, Sep 20, 2013 at 11:08 PM, Francis Moreau <francis.moro@gmail.com> wrote:
> Hello Martin,
>
> On Fri, Sep 20, 2013 at 8:07 PM, Martin Wilck <mwilck@arcor.de> wrote:
>> On 09/20/2013 10:56 AM, Francis Moreau wrote:
>>> Hello Martin,
>>>
>>> On Mon, Sep 16, 2013 at 7:04 PM, Martin Wilck <mwilck@arcor.de> wrote:
>>>> On 09/16/2013 03:56 PM, Francis Moreau wrote:
>>>>
>>>>> I did give your patch "DDF: compare_super_ddf: fix sequence number
>>>>> check" a try and now mdadm is able to detect a difference between the
>>>>> 2 disks. Therefore it refuses to insert the second disk which is
>>>>> better.
>>>>>
>>>>> However it's still not able to detect which version is the "fresher"
>>>>> like mdadm does with soft RAID1 (metadata 1.2). Therefore mdadm is not
>>>>> able to kick out the first disk if it's the outdated one.
>>>>>
>>>>> Is that expected ?
>>>>
>>>> At the moment, yes. This needs work.
>>>>
>>>
>>> Actually this is worse than I thought: with your patch applied mdadm
>>> refuses to add back a spare disk into a degraded DDF array.
>>>
>>> For example on a DDF array:
>>>
>>> # cat /proc/mdstat
>>> Personalities : [raid1]
>>> md126 : active raid1 sdb[1] sda[0]
>>>       2064384 blocks super external:/md127/0 [2/2] [UU]
>>>
>>> md127 : inactive sdb[1](S) sda[0](S)
>>>       65536 blocks super external:ddf
>>>
>>> unused devices: <none>
>>>
>>> # mdadm /dev/md126 --fail sdb
>>> [   24.118434] md/raid1:md126: Disk failure on sdb, disabling device.
>>> [   24.118437] md/raid1:md126: Operation continuing on 1 devices.
>>> mdadm: set sdb faulty in /dev/md126
>>>
>>> # mdadm /dev/md127 --remove sdb
>>> mdadm: hot removed sdb from /dev/md127
>>>
>>> # mdadm /dev/md127 --add /dev/sdb
>>> mdadm: added /dev/sdb
>>>
>>> # cat /proc/mdstat
>>> Personalities : [raid1]
>>> md126 : active raid1 sda[0]
>>>       2064384 blocks super external:/md127/0 [2/1] [U_]
>>>
>>> md127 : inactive sdb[1](S) sda[0](S)
>>>       65536 blocks super external:ddf
>>>
>>> unused devices: <none>
>>>
>>>
>>> As you can see the reinserted disk sdb sits as spare and isn't added
>>> back to the array.
>>
>> That's correct. You marked that disk failed.
>>
>>> Is it possible to add this major feature work again and keep your improvement ?
>>
>> No. A failed disk can't be added again without rebuild. I am positive
>> about that.
>>
>
> Hmm that's not the case with soft linux RAID AFAICS: doing the same
> thing with soft RAID and the reinserted disk is added to the raid
> array and it's synchronised automatically. You can try it easily.

BTW, that's also the case for DDF if I don't apply your patch.


>
> Could you show me the mdadm command I should use to insert sdb into the array ?
>
> Thanks.
> --
> Francis



-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-21 13:22                       ` Francis Moreau
@ 2013-09-23 20:02                         ` Martin Wilck
  2013-09-27  8:26                           ` Francis Moreau
  2013-09-24 17:38                         ` Martin Wilck
  1 sibling, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2013-09-23 20:02 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid

On 09/21/2013 03:22 PM, Francis Moreau wrote:
> On Fri, Sep 20, 2013 at 11:08 PM, Francis Moreau <francis.moro@gmail.com> wrote:
>> Hello Martin,
>>
>> On Fri, Sep 20, 2013 at 8:07 PM, Martin Wilck <mwilck@arcor.de> wrote:
>>> On 09/20/2013 10:56 AM, Francis Moreau wrote:
>>>> Hello Martin,
>>>>
>>>> On Mon, Sep 16, 2013 at 7:04 PM, Martin Wilck <mwilck@arcor.de> wrote:
>>>>> On 09/16/2013 03:56 PM, Francis Moreau wrote:
>>>>>
>>>>>> I did give your patch "DDF: compare_super_ddf: fix sequence number
>>>>>> check" a try and now mdadm is able to detect a difference between the
>>>>>> 2 disks. Therefore it refuses to insert the second disk which is
>>>>>> better.
>>>>>>
>>>>>> However it's still not able to detect which version is the "fresher"
>>>>>> like mdadm does with soft RAID1 (metadata 1.2). Therefore mdadm is not
>>>>>> able to kick out the first disk if it's the outdated one.
>>>>>>
>>>>>> Is that expected ?
>>>>>
>>>>> At the moment, yes. This needs work.
>>>>>
>>>>
>>>> Actually this is worse than I thought: with your patch applied mdadm
>>>> refuses to add back a spare disk into a degraded DDF array.
>>>>
>>>> For example on a DDF array:
>>>>
>>>> # cat /proc/mdstat
>>>> Personalities : [raid1]
>>>> md126 : active raid1 sdb[1] sda[0]
>>>>       2064384 blocks super external:/md127/0 [2/2] [UU]
>>>>
>>>> md127 : inactive sdb[1](S) sda[0](S)
>>>>       65536 blocks super external:ddf
>>>>
>>>> unused devices: <none>
>>>>
>>>> # mdadm /dev/md126 --fail sdb
>>>> [   24.118434] md/raid1:md126: Disk failure on sdb, disabling device.
>>>> [   24.118437] md/raid1:md126: Operation continuing on 1 devices.
>>>> mdadm: set sdb faulty in /dev/md126
>>>>
>>>> # mdadm /dev/md127 --remove sdb
>>>> mdadm: hot removed sdb from /dev/md127
>>>>
>>>> # mdadm /dev/md127 --add /dev/sdb
>>>> mdadm: added /dev/sdb
>>>>
>>>> # cat /proc/mdstat
>>>> Personalities : [raid1]
>>>> md126 : active raid1 sda[0]
>>>>       2064384 blocks super external:/md127/0 [2/1] [U_]
>>>>
>>>> md127 : inactive sdb[1](S) sda[0](S)
>>>>       65536 blocks super external:ddf
>>>>
>>>> unused devices: <none>
>>>>
>>>>
>>>> As you can see the reinserted disk sdb sits as spare and isn't added
>>>> back to the array.
>>>
>>> That's correct. You marked that disk failed.
>>>
>>>> Is it possible to add this major feature work again and keep your improvement ?
>>>
>>> No. A failed disk can't be added again without rebuild. I am positive
>>> about that.
>>>
>>
>> Hmm that's not the case with soft linux RAID AFAICS: doing the same
>> thing with soft RAID and the reinserted disk is added to the raid
>> array and it's synchronised automatically. You can try it easily.
> 

Sorry, I didn't read your problem description carefully enough. You used
mdadm --add, and that should work and should trigger a rebuild, as you said.

> BTW, that's also the case for DDF if I don't apply your patch.

I don't understand this. My patch doesn't change the behavior of "mdadm
--add". AFAICS compare_super() isn't called in that code path.

I just posted two unit tests that cover this use (or better: failure)
case, please verify that they meet your scenario.

On my system, with my latest patch, these tests are successful.

I also tried a VM, as you suggested, and did exactly what you described,
successfully. After failing/removing one disk and rebooting, the system
comes up degraded; mdadm -I the old disk fails (that's correct), but I
can mdadm --add the old disk and recovery starts automatically. So all
is fine - the question is why it doesn't work on your system.

> Additionnal information: looking at sda shows that it doesn't seem to
> have metadata anymore after having added it to the container:
> 
> # mdadm -E /dev/sda
> /dev/sda:
>    MBR Magic : aa55
> Partition[0] :      3564382 sectors at         2048 (type 83)
> Partition[1] :       559062 sectors at      3569643 (type 05)

I wonder if this gives us a clue. It seems that something erased the
metadata. I can't imagine that mdadm did that; I wonder if it could
have been your BIOS. However, mdadm --add should work even if the BIOS
had changed something on the disk. I admit I'm clueless here.

In order to make progress, we'd need mdadm -E output of both disks
before and after the BIOS gets to write them, after boot, and after your
trying mdadm --add. The mdmon logs would also be highly appreciated, but
they'll probably be hard for you to generate. You need to compile mdmon
with CXFLAGS="-DDEBUG=1 -g" and make sure mdmon's stderr is captured
somewhere.
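For reference, the debug build and log capture might look something like
this (a sketch under assumptions: the make invocation follows mdadm's
Makefile conventions and the log path is only an example):

```shell
# In the mdadm source tree: rebuild mdmon with debug output enabled
make CXFLAGS="-DDEBUG=1 -g" mdmon
# Run the rebuilt mdmon so its stderr is captured somewhere persistent
./mdmon --takeover --all 2>> /tmp/mdmon-debug.log
```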

Regards
Martin



* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-21 13:22                       ` Francis Moreau
  2013-09-23 20:02                         ` Martin Wilck
@ 2013-09-24 17:38                         ` Martin Wilck
  2013-09-24 17:43                           ` Martin Wilck
  1 sibling, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2013-09-24 17:38 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid

Hi Francis,

> Could you show me the mdadm command I should use to insert sdb into the array ?

I tried your scenario on HW with LSI fake RAID and encountered the same
problem you had. I have come up with a patch that I've just posted ("Fix
mdadm --add for LSI fake RAID" and follow-ups).

Please try if this patch fixes your problem. To be precise, with this
patch applied, mdadm /dev/md127 --add /dev/sdX should work again. You
should be able to run this on your currently broken system if you
somehow manage to transfer the updated mdadm and mdmon executables to
that system, run mdmon --takeover /dev/md127, and then remove/add sdX again.
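Condensed as a command sequence (a sketch only — md127 and sdX are
placeholders for the container and the removed disk on your system):

```shell
# Take over monitoring of the container with the patched mdmon
mdmon --takeover /dev/md127
# Then remove and re-add the disk so the rebuild can start
mdadm /dev/md127 --remove /dev/sdX
mdadm /dev/md127 --add /dev/sdX
```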

The reason for the problem was a difference in the interpretation of the
workspace_lba field in the DDF header between my code and LSI's.

This has nothing to do with my previous patch "compare_super_ddf: fix
sequence number check".

Regards
Martin



* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-24 17:38                         ` Martin Wilck
@ 2013-09-24 17:43                           ` Martin Wilck
  0 siblings, 0 replies; 25+ messages in thread
From: Martin Wilck @ 2013-09-24 17:43 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid

On 09/24/2013 07:38 PM, Martin Wilck wrote:

> I tried your scenario on HW with LSI fake RAID and encountered the same
> problem you had. I have come up with a patch that I've just posted ("Fix
> mdadm --add for LSI fake RAID" and follow-ups).
> 
> Please try if this patch fixes your problem. To be precise, with this
> patch applied, mdadm /dev/md127 --add /dev/sdX should work again. You
> should be able to run this on your currently broken system if you
> somehow manage to transfer the updated mdadm and mdmon executables to
> that system, run mdmon --takeover /dev/md127, and then remove/add sdX again.

One more remark: You are probably better off for now by doing operations
like re-adding disks in your system's BIOS RAID setup tool. The reason is
that we can't be sure yet that the structures we set up are correctly
understood by the BIOS - you may successfully add the disk with mdadm,
but at the next boot the BIOS may not understand the data and break it
again.

For IMSM we were in the lucky situation that the BIOS vendor himself
provided the code. For DDF, figuring this out is a cumbersome
trial-and-error process.

Martin



* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-23 20:02                         ` Martin Wilck
@ 2013-09-27  8:26                           ` Francis Moreau
  2013-09-27 15:47                             ` Francis Moreau
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-09-27  8:26 UTC (permalink / raw)
  To: Martin Wilck; +Cc: linux-raid

Hello Martin,

Sorry for the late answer, I was busy with some other stuff.

On Mon, Sep 23, 2013 at 10:02 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 09/21/2013 03:22 PM, Francis Moreau wrote:
>> On Fri, Sep 20, 2013 at 11:08 PM, Francis Moreau <francis.moro@gmail.com> wrote:
>>> Hello Martin,
>>>
>>> On Fri, Sep 20, 2013 at 8:07 PM, Martin Wilck <mwilck@arcor.de> wrote:
>>>> On 09/20/2013 10:56 AM, Francis Moreau wrote:
>>>>> Hello Martin,
>>>>>
>>>>> On Mon, Sep 16, 2013 at 7:04 PM, Martin Wilck <mwilck@arcor.de> wrote:
>>>>>> On 09/16/2013 03:56 PM, Francis Moreau wrote:
>>>>>>
>>>>>>> I did give your patch "DDF: compare_super_ddf: fix sequence number
>>>>>>> check" a try and now mdadm is able to detect a difference between the
>>>>>>> 2 disks. Therefore it refuses to insert the second disk which is
>>>>>>> better.
>>>>>>>
>>>>>>> However it's still not able to detect which version is the "fresher"
>>>>>>> like mdadm does with soft RAID1 (metadata 1.2). Therefore mdadm is not
>>>>>>> able to kick out the first disk if it's the outdated one.
>>>>>>>
>>>>>>> Is that expected ?
>>>>>>
>>>>>> At the moment, yes. This needs work.
>>>>>>
>>>>>
>>>>> Actually this is worse than I thought: with your patch applied mdadm
>>>>> refuses to add back a spare disk into a degraded DDF array.
>>>>>
>>>>> For example on a DDF array:
>>>>>
>>>>> # cat /proc/mdstat
>>>>> Personalities : [raid1]
>>>>> md126 : active raid1 sdb[1] sda[0]
>>>>>       2064384 blocks super external:/md127/0 [2/2] [UU]
>>>>>
>>>>> md127 : inactive sdb[1](S) sda[0](S)
>>>>>       65536 blocks super external:ddf
>>>>>
>>>>> unused devices: <none>
>>>>>
>>>>> # mdadm /dev/md126 --fail sdb
>>>>> [   24.118434] md/raid1:md126: Disk failure on sdb, disabling device.
>>>>> [   24.118437] md/raid1:md126: Operation continuing on 1 devices.
>>>>> mdadm: set sdb faulty in /dev/md126
>>>>>
>>>>> # mdadm /dev/md127 --remove sdb
>>>>> mdadm: hot removed sdb from /dev/md127
>>>>>
>>>>> # mdadm /dev/md127 --add /dev/sdb
>>>>> mdadm: added /dev/sdb
>>>>>
>>>>> # cat /proc/mdstat
>>>>> Personalities : [raid1]
>>>>> md126 : active raid1 sda[0]
>>>>>       2064384 blocks super external:/md127/0 [2/1] [U_]
>>>>>
>>>>> md127 : inactive sdb[1](S) sda[0](S)
>>>>>       65536 blocks super external:ddf
>>>>>
>>>>> unused devices: <none>
>>>>>
>>>>>
>>>>> As you can see the reinserted disk sdb sits as spare and isn't added
>>>>> back to the array.
>>>>
>>>> That's correct. You marked that disk failed.
>>>>
>>>>> Is it possible to add this major feature work again and keep your improvement ?
>>>>
>>>> No. A failed disk can't be added again without rebuild. I am positive
>>>> about that.
>>>>
>>>
>>> Hmm that's not the case with soft linux RAID AFAICS: doing the same
>>> thing with soft RAID and the reinserted disk is added to the raid
>>> array and it's synchronised automatically. You can try it easily.
>>
>
> Sorry, I didn't read your problem description carefully enough. You used
> mdadm --add, and that should work and should trigger a rebuild, as you said.
>
>> BTW, that's also the case for DDF if I don't apply your patch.
>
> I don't understand this. My patch doesn't change the behavior of "mdadm
> --add". AFAICS compare_super() isn't called in that code path.
>
> I just posted two unit tests that cover this use (or better: failure)
> case, please verify that they meet your scenario.
>
> On my system, with my latest patch, these tests are successful.
>
> I also tried a VM, as you suggested, and did exactly what you described,
> successfully. After failing/removing one disk and rebooting, the system
> comes up degraded; mdadm -I the old disk fails (that's correct), but I
> can mdadm --add the old disk and recovery starts automatically. So all
> is fine - the question is why it doesn't work on your system.

Maybe the kernel is different ? I'm using 3.4.62.

>
>> Additionnal information: looking at sda shows that it doesn't seem to
>> have metadata anymore after having added it to the container:
>>
>> # mdadm -E /dev/sda
>> /dev/sda:
>>    MBR Magic : aa55
>> Partition[0] :      3564382 sectors at         2048 (type 83)
>> Partition[1] :       559062 sectors at      3569643 (type 05)
>
> I wonder if this gives us a clue. It seems that something erased the
> meta data. I can't imagine that mdadm did that. I wonder if that could
> have been your BIOS. Pretty certainly it wasn't mdadm. However mdadm
> --add should work, even if the BIOS had changed something on the disk. I
> admit I'm clueless here.
>
> In order to make progress, we'd need mdadm -E output of both disks
> before and after the BIOS gets to write them, after boot, and after your
> trying mdadm --add. The mdmon logs would also be highly appreciated, but
> they'll probably hard for you to generate. You need to compile mdmon
> with CXFLAGS="-DDEBUG=1 -g" and make sure mdmon's stderr os captured
> somewhere.

I'm not sure why you're talking about the BIOS here... my VM hasn't
been rebooted during the tests described above. BTW I'm using qemu to
run my VM.

Thanks
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-27  8:26                           ` Francis Moreau
@ 2013-09-27 15:47                             ` Francis Moreau
  2013-10-02 18:33                               ` Martin Wilck
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-09-27 15:47 UTC (permalink / raw)
  To: Martin Wilck; +Cc: linux-raid

On Fri, Sep 27, 2013 at 10:26 AM, Francis Moreau <francis.moro@gmail.com> wrote:
> Hello Martin,
>
> Sorry for the late answer, I was busy with some other stuff.
>
> On Mon, Sep 23, 2013 at 10:02 PM, Martin Wilck <mwilck@arcor.de> wrote:
>> On 09/21/2013 03:22 PM, Francis Moreau wrote:
>>> On Fri, Sep 20, 2013 at 11:08 PM, Francis Moreau <francis.moro@gmail.com> wrote:
>>>> Hello Martin,
>>>>
>>>> On Fri, Sep 20, 2013 at 8:07 PM, Martin Wilck <mwilck@arcor.de> wrote:
>>>>> On 09/20/2013 10:56 AM, Francis Moreau wrote:
>>>>>> Hello Martin,
>>>>>>
>>>>>> On Mon, Sep 16, 2013 at 7:04 PM, Martin Wilck <mwilck@arcor.de> wrote:
>>>>>>> On 09/16/2013 03:56 PM, Francis Moreau wrote:
>>>>>>>
>>>>>>>> I did give your patch "DDF: compare_super_ddf: fix sequence number
>>>>>>>> check" a try and now mdadm is able to detect a difference between the
>>>>>>>> 2 disks. Therefore it refuses to insert the second disk which is
>>>>>>>> better.
>>>>>>>>
>>>>>>>> However it's still not able to detect which version is the "fresher"
>>>>>>>> like mdadm does with soft RAID1 (metadata 1.2). Therefore mdadm is not
>>>>>>>> able to kick out the first disk if it's the outdated one.
>>>>>>>>
>>>>>>>> Is that expected ?
>>>>>>>
>>>>>>> At the moment, yes. This needs work.
>>>>>>>
>>>>>>
>>>>>> Actually this is worse than I thought: with your patch applied mdadm
>>>>>> refuses to add back a spare disk into a degraded DDF array.
>>>>>>
>>>>>> For example on a DDF array:
>>>>>>
>>>>>> # cat /proc/mdstat
>>>>>> Personalities : [raid1]
>>>>>> md126 : active raid1 sdb[1] sda[0]
>>>>>>       2064384 blocks super external:/md127/0 [2/2] [UU]
>>>>>>
>>>>>> md127 : inactive sdb[1](S) sda[0](S)
>>>>>>       65536 blocks super external:ddf
>>>>>>
>>>>>> unused devices: <none>
>>>>>>
>>>>>> # mdadm /dev/md126 --fail sdb
>>>>>> [   24.118434] md/raid1:md126: Disk failure on sdb, disabling device.
>>>>>> [   24.118437] md/raid1:md126: Operation continuing on 1 devices.
>>>>>> mdadm: set sdb faulty in /dev/md126
>>>>>>
>>>>>> # mdadm /dev/md127 --remove sdb
>>>>>> mdadm: hot removed sdb from /dev/md127
>>>>>>
>>>>>> # mdadm /dev/md127 --add /dev/sdb
>>>>>> mdadm: added /dev/sdb
>>>>>>
>>>>>> # cat /proc/mdstat
>>>>>> Personalities : [raid1]
>>>>>> md126 : active raid1 sda[0]
>>>>>>       2064384 blocks super external:/md127/0 [2/1] [U_]
>>>>>>
>>>>>> md127 : inactive sdb[1](S) sda[0](S)
>>>>>>       65536 blocks super external:ddf
>>>>>>
>>>>>> unused devices: <none>
>>>>>>
>>>>>>
>>>>>> As you can see the reinserted disk sdb sits as spare and isn't added
>>>>>> back to the array.
>>>>>
>>>>> That's correct. You marked that disk failed.
>>>>>
>>>>>> Is it possible to add this major feature work again and keep your improvement ?
>>>>>
>>>>> No. A failed disk can't be added again without rebuild. I am positive
>>>>> about that.
>>>>>
>>>>
>>>> Hmm that's not the case with soft linux RAID AFAICS: doing the same
>>>> thing with soft RAID and the reinserted disk is added to the raid
>>>> array and it's synchronised automatically. You can try it easily.
>>>
>>
>> Sorry, I didn't read your problem description carefully enough. You used
>> mdadm --add, and that should work and should trigger a rebuild, as you said.
>>
>>> BTW, that's also the case for DDF if I don't apply your patch.
>>
>> I don't understand this. My patch doesn't change the behavior of "mdadm
>> --add". AFAICS compare_super() isn't called in that code path.
>>
>> I just posted two unit tests that cover this use (or better: failure)
>> case, please verify that they meet your scenario.
>>
>> On my system, with my latest patch, these tests are successful.
>>
>> I also tried a VM, as you suggested, and did exactly what you described,
>> successfully. After failing/removing one disk and rebooting, the system
>> comes up degraded; mdadm -I the old disk fails (that's correct), but I
>> can mdadm --add the old disk and recovery starts automatically. So all
>> is fine - the question is why it doesn't work on your system.
>
> Maybe the kernel is different ? I'm using 3.4.62.
>
>>
>>> Additionnal information: looking at sda shows that it doesn't seem to
>>> have metadata anymore after having added it to the container:
>>>
>>> # mdadm -E /dev/sda
>>> /dev/sda:
>>>    MBR Magic : aa55
>>> Partition[0] :      3564382 sectors at         2048 (type 83)
>>> Partition[1] :       559062 sectors at      3569643 (type 05)
>>
>> I wonder if this gives us a clue. It seems that something erased the
>> meta data. I can't imagine that mdadm did that. I wonder if that could
>> have been your BIOS. Pretty certainly it wasn't mdadm. However mdadm
>> --add should work, even if the BIOS had changed something on the disk. I
>> admit I'm clueless here.
>>
>> In order to make progress, we'd need mdadm -E output of both disks
>> before and after the BIOS gets to write them, after boot, and after your
>> trying mdadm --add. The mdmon logs would also be highly appreciated, but
>> they'll probably hard for you to generate. You need to compile mdmon
>> with CXFLAGS="-DDEBUG=1 -g" and make sure mdmon's stderr os captured
>> somewhere.
>
> I'm not sure why you're talking about the BIOS here... my VM hasn't
> been rebooted during the tests described above. BTW I'm using qemu to
> run my VM.

I finally found my issue: the mdmon --takeover service wasn't being
started anymore (I probably messed it up earlier). Therefore the mdmon
started by the initrd was still in use and wasn't working properly.

If you still want me to test something, please tell me.

OTOH, it would be easier if you setup a git tree somewhere with your
patches that you want me to test. BTW I'm not subscribed to linux-raid
mailing list.

Thanks.
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-09-27 15:47                             ` Francis Moreau
@ 2013-10-02 18:33                               ` Martin Wilck
  2013-10-16  4:57                                 ` NeilBrown
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2013-10-02 18:33 UTC (permalink / raw)
  To: Francis Moreau; +Cc: linux-raid

On 09/27/2013 05:47 PM, Francis Moreau wrote:

> I finally found my issue: mdmon --takeover service wasn't started
> anymore (probably I messed it up earlier). Therefore mdmon started by
> initrd was used and wasn't working properly.

Thanks a lot for finding this out, glad your system is working again.
And sorry for having confused you by talking about the BIOS; I was
thinking you were still working on BIOS fake-RAID hardware.

> If you still want me to test something, please tell me.
> 
> OTHO, it would be easier if you setup a git tree somewhere with your
> patches that you want me to test. BTW I'm not subscribed to linux-raid
> mailing list.

I'll wait for Neil to come back and comment on my patches. It doesn't
make sense for me right now to create my own repository.

Regards
Martin


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-10-02 18:33                               ` Martin Wilck
@ 2013-10-16  4:57                                 ` NeilBrown
  2013-10-16 20:10                                   ` Francis Moreau
  0 siblings, 1 reply; 25+ messages in thread
From: NeilBrown @ 2013-10-16  4:57 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Francis Moreau, linux-raid


On Wed, 02 Oct 2013 20:33:53 +0200 Martin Wilck <mwilck@arcor.de> wrote:

> On 09/27/2013 05:47 PM, Francis Moreau wrote:
> 
> > I finally found my issue: mdmon --takeover service wasn't started
> > anymore (probably I messed it up earlier). Therefore mdmon started by
> > initrd was used and wasn't working properly.
> 
> Thanks a lot for finding this out, glad your system is working again.
> And sorry for having lost you and talking about the BIOS; I was thinking
> you were still working on fake BIOS hardware.
> 
> > If you still want me to test something, please tell me.
> > 
> > OTHO, it would be easier if you setup a git tree somewhere with your
> > patches that you want me to test. BTW I'm not subscribed to linux-raid
> > mailing list.
> 
> I'll wait for Neil to come back and comment on my patches. It doesn't
> make sense for me right now to create my own repository.

Hi,
 I'm back from leave now :-)
 Your patches look good - thanks a lot.
 I think I have applied them all (though the numbering was a bit odd and I
 might have missed something).

 I've lost track ... are there any outstanding issues here, or are we "done" ?

Thanks,
NeilBrown



* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-10-16  4:57                                 ` NeilBrown
@ 2013-10-16 20:10                                   ` Francis Moreau
  2013-10-17 10:58                                     ` NeilBrown
  0 siblings, 1 reply; 25+ messages in thread
From: Francis Moreau @ 2013-10-16 20:10 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid

Hi Neil,

On Wed, Oct 16, 2013 at 6:57 AM, NeilBrown <neilb@suse.de> wrote:
> On Wed, 02 Oct 2013 20:33:53 +0200 Martin Wilck <mwilck@arcor.de> wrote:
>
>> On 09/27/2013 05:47 PM, Francis Moreau wrote:
>>
>> > I finally found my issue: mdmon --takeover service wasn't started
>> > anymore (probably I messed it up earlier). Therefore mdmon started by
>> > initrd was used and wasn't working properly.
>>
>> Thanks a lot for finding this out, glad your system is working again.
>> And sorry for having lost you and talking about the BIOS; I was thinking
>> you were still working on fake BIOS hardware.
>>
>> > If you still want me to test something, please tell me.
>> >
>> > OTHO, it would be easier if you setup a git tree somewhere with your
>> > patches that you want me to test. BTW I'm not subscribed to linux-raid
>> > mailing list.
>>
>> I'll wait for Neil to come back and comment on my patches. It doesn't
>> make sense for me right now to create my own repository.
>
> Hi,
>  I'm back from leave now :-)
>  Your patches look good - thanks a lots.
>  I think I have applied them all (though the numbering was a bit odd and I
>  might have missed something).
>
>  I've lost track ... are there any outstanding issues here, or are we "done" ?
>

I think there's still one open issue when sequence numbers don't match during
incremental assembly.

Martin started addressing this in another new thread whose subject is
"RFC: incremental container assembly when sequence numbers don't
match"

Thanks
-- 
Francis


* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-10-16 20:10                                   ` Francis Moreau
@ 2013-10-17 10:58                                     ` NeilBrown
  2013-10-19 20:21                                       ` Martin Wilck
  0 siblings, 1 reply; 25+ messages in thread
From: NeilBrown @ 2013-10-17 10:58 UTC (permalink / raw)
  To: Francis Moreau; +Cc: Martin Wilck, linux-raid


On Wed, 16 Oct 2013 22:10:54 +0200 Francis Moreau <francis.moro@gmail.com>
wrote:

> Hi Neil,
> 
> On Wed, Oct 16, 2013 at 6:57 AM, NeilBrown <neilb@suse.de> wrote:
> > On Wed, 02 Oct 2013 20:33:53 +0200 Martin Wilck <mwilck@arcor.de> wrote:
> >
> >> On 09/27/2013 05:47 PM, Francis Moreau wrote:
> >>
> >> > I finally found my issue: mdmon --takeover service wasn't started
> >> > anymore (probably I messed it up earlier). Therefore mdmon started by
> >> > initrd was used and wasn't working properly.
> >>
> >> Thanks a lot for finding this out, glad your system is working again.
> >> And sorry for having lost you and talking about the BIOS; I was thinking
> >> you were still working on fake BIOS hardware.
> >>
> >> > If you still want me to test something, please tell me.
> >> >
> >> > OTHO, it would be easier if you setup a git tree somewhere with your
> >> > patches that you want me to test. BTW I'm not subscribed to linux-raid
> >> > mailing list.
> >>
> >> I'll wait for Neil to come back and comment on my patches. It doesn't
> >> make sense for me right now to create my own repository.
> >
> > Hi,
> >  I'm back from leave now :-)
> >  Your patches look good - thanks a lots.
> >  I think I have applied them all (though the numbering was a bit odd and I
> >  might have missed something).
> >
> >  I've lost track ... are there any outstanding issues here, or are we "done" ?
> >
> 
> I think there's still one open issue when sequence numbers don't match during
> incremental assembly.
> 
> Martin started addressing this in another new thread whose subject is
> "RFC: incremental container assembly when sequence numbers don't
> match"
> 
>
Ah yes, thanks.  I had a quick look and it seems to make sense, but it
deserves more thorough consideration.
I'll get on to that sometime soon.

Thanks,
NeilBrown



* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-10-17 10:58                                     ` NeilBrown
@ 2013-10-19 20:21                                       ` Martin Wilck
  2013-10-20 23:59                                         ` NeilBrown
  0 siblings, 1 reply; 25+ messages in thread
From: Martin Wilck @ 2013-10-19 20:21 UTC (permalink / raw)
  To: NeilBrown; +Cc: Francis Moreau, linux-raid

Hi Neil,

>> Martin started addressing this in another new thread whose subject is
>> "RFC: incremental container assembly when sequence numbers don't
>> match"
>>
>>
> Ah yes, thanks.  I had a quick look and it seems to make sense, but it
> deserves more thorough consideration.
> I'll get on to that sometime soon.


Good that you're back. I was starting to get nervous :-)

Wrt the sequence number issue: One thing that I need to understand
better is how native MD deals with this kind of thing. Containers have
one additional complexity: one set of meta data for several subarrays.

One thing I think we need is cleaner semantics for compare_super().
The return value should distinguish at least these cases:

  0 - OK
  1 - fatal incompatibility
  2 - nonfatal, new disk has "older" meta data
  3 - nonfatal, new disk has "newer" meta data

IMSM already has this to some extent.
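For illustration, such a return convention could look like this in C - a rough
sketch with hypothetical names, not mdadm's actual API:

```c
#include <assert.h>

/* Hypothetical sketch of the proposed compare_super() return convention.
 * Names are illustrative; this is not mdadm's actual API. */
enum compare_result {
        CMP_OK = 0,             /* metadata matches, device can join */
        CMP_INCOMPATIBLE = 1,   /* fatal: device does not belong here */
        CMP_OLDER = 2,          /* nonfatal: new disk has "older" metadata */
        CMP_NEWER = 3,          /* nonfatal: new disk has "newer" metadata */
};

/* How generic code might map a UUID match and sequence numbers
 * (event counters) onto the proposed return codes. */
static enum compare_result classify_device(int same_uuid,
                                           unsigned long long array_seq,
                                           unsigned long long disk_seq)
{
        if (!same_uuid)
                return CMP_INCOMPATIBLE;
        if (disk_seq == array_seq)
                return CMP_OK;
        return disk_seq < array_seq ? CMP_OLDER : CMP_NEWER;
}
```

The point being that the generic code, not the metadata handler, would then
decide what to do with the "older"/"newer" cases.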

The logic to handle this must be in the generic, non-metadata-specific
code. Metadata handlers need methods to force re-reading the meta data
from certain disk(s). mdmon also needs a way to detect that it needs to
reread the meta data.

Furthermore, compare_super_ddf, at least, does not only compare; it also
makes changes to its internal data structures, and I think other meta data
handlers do the same. IMO it would be more appropriate to do this in a
separate call, after the caller has decided if and how to merge the meta
data.

Then we need to make sure that (to the maximum extent possible) these
issues are handled similarly during Assembly and Incremental Assembly.

The "nonfatal" case is similar to the current "homehost" logic.

As for the complex scenarios in my "RFC" mail, thinking more about it, I
found that the dangerous incremental assembly cases are not so critical
after all, as long as subarrays aren't started prematurely, which is
mostly guaranteed by the current udev rules.

Regards
Martin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: mdadm 3.3 fails to kick out non fresh disk
  2013-10-19 20:21                                       ` Martin Wilck
@ 2013-10-20 23:59                                         ` NeilBrown
  0 siblings, 0 replies; 25+ messages in thread
From: NeilBrown @ 2013-10-20 23:59 UTC (permalink / raw)
  To: Martin Wilck; +Cc: Francis Moreau, linux-raid

[-- Attachment #1: Type: text/plain, Size: 3466 bytes --]

On Sat, 19 Oct 2013 22:21:25 +0200 Martin Wilck <mwilck@arcor.de> wrote:

> Hi Neil,
> 
> >> Martin started addressing this in another new thread whose subject is
> >> "RFC: incremental container assembly when sequence numbers don't
> >> match"
> >>
> >>
> > Ah yes, thanks.  I had a quick look and it seems to make sense, but it
> > deserves more thorough consideration.
> > I'll get on to that sometime soon.
> 
> 
> Good that you're back. I was starting to get nervous :-)

(still hoping someone else will put their hand up to maintain md though....)

> 
> Wrt the sequence number issue: One thing that I need to understand
> better is how native MD deals with this kind of thing. Containers have
> one additional complexity: one set of meta data for several subarrays.

With native MD it is fairly simple - that 'one additional complexity' does
make a real difference.

 - Before the array is running, you can add any device.
 - When the array is started, anything older than the newest device is
   discarded (with the understanding that a bitmap broadens the "age"
   of the newest device, so that oldish devices can still be included).
 - If the array is started but readonly, the missing devices of the same age
   as the newest device can be added (I think).
 - After the array is read-write, devices can only be added if there is a
   bitmap, and they must be in the age range of the bitmap.
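The rules above can be sketched as a single predicate - hypothetical names
and a deliberately simplified model, not the actual kernel/mdadm code:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sketch of the native-MD freshness rules; a hypothetical
 * helper, the real logic lives in the kernel and mdadm. A write-intent
 * bitmap widens the acceptable "age" from one event count to a range. */
static bool device_acceptable(bool array_running,
                              unsigned long long newest_events,
                              unsigned long long dev_events,
                              unsigned long long bitmap_window)
{
        if (!array_running)
                return true;    /* before start: any member device may be added */
        if (dev_events >= newest_events)
                return true;    /* as fresh as the newest device */
        /* older device: only acceptable if the bitmap covers the gap */
        return newest_events - dev_events <= bitmap_window;
}
```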


> 
> One thing I think we need is cleaner semantics for compare_super().
> The return value should distinguish at least these cases:
> 
>   0 - OK
>   1 - fatal incompatibility
>   2 - nonfatal, new disk has "older" meta data
>   3 - nonfatal, new disk has "newer" meta data

Does this really belong in compare_super()?

Currently compare_super() is for checking whether the device belongs to the
array at all.  The test on the sequence number (event counter) is separate.
getinfo_super should return that in info->events.  Currently only super0 and
super1 do that; intel and ddf don't.
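Keeping the two checks separate, the generic flow might look like this
sketch (hypothetical names and flow; the thread's `info->events` is the
only piece taken from mdadm itself):

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch: keep "does this device belong?" (compare_super) separate from
 * "is it fresh?" (the event count that getinfo_super fills in).
 * Hypothetical names, for illustration only. */
struct dev_info {
        unsigned long long events;      /* event counter, as in info->events */
};

/* Generic caller: consult freshness only once membership is settled. */
static int admit_device(bool same_array,
                        const struct dev_info *newest,
                        const struct dev_info *dev)
{
        if (!same_array)
                return -1;                              /* fatal: wrong array */
        return dev->events >= newest->events ? 0 : 1;   /* 1 = stale, caller decides */
}
```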

> 
> IMSM already has this to some extent.
> 
> The logic to handle this must be in the generic, non-metadata-specific
> code. Metadata handlers need methods to force re-reading the meta data
> from certain disk(s). mdmon also needs a way to detect that it needs to
> reread the meta data.

If there is a container with one array started and the other not, then you
might want to reread the metadata for one array but not the other.
That could get messy but should be doable.

> 
> Furthermore, compare_super_ddf, at least, does not only compare; it also
> makes changes to its internal data structures, and I think other meta data
> handlers do the same. IMO it would be more appropriate to do this in a
> separate call, after the caller has decided if and how to merge the meta
> data.

Sounds reasonable.

> 
> Then we need to make sure that (to the maximum extent possible) these
> issues are handled similarly during Assembly and Incremental Assembly.
> 
> The "nonfatal" case is similar to the current "homehost" logic.
> 
> As for the complex scenarios in my "RFC" mail, thinking more about it, I
> found that the dangerous incremental assembly cases are not so critical
> after all, as long as subarrays aren't started prematurely, which is
> mostly guaranteed by the current udev rules.

Agreed.  We should do our best to detect the dangerous cases, but they
shouldn't be likely.

NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2013-10-20 23:59 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-13 13:22 mdadm 3.3 fails to kick out non fresh disk Francis Moreau
2013-09-13 20:43 ` NeilBrown
2013-09-13 22:35   ` Francis Moreau
2013-09-13 23:56     ` Roberto Spadim
2013-09-14 10:38     ` NeilBrown
2013-09-14 14:33       ` Francis Moreau
2013-09-14 15:06         ` Francis Moreau
2013-09-14 20:43           ` Martin Wilck
2013-09-16 13:56             ` Francis Moreau
2013-09-16 17:04               ` Martin Wilck
2013-09-20  8:56                 ` Francis Moreau
2013-09-20 18:07                   ` Martin Wilck
2013-09-20 21:08                     ` Francis Moreau
2013-09-21 13:22                       ` Francis Moreau
2013-09-23 20:02                         ` Martin Wilck
2013-09-27  8:26                           ` Francis Moreau
2013-09-27 15:47                             ` Francis Moreau
2013-10-02 18:33                               ` Martin Wilck
2013-10-16  4:57                                 ` NeilBrown
2013-10-16 20:10                                   ` Francis Moreau
2013-10-17 10:58                                     ` NeilBrown
2013-10-19 20:21                                       ` Martin Wilck
2013-10-20 23:59                                         ` NeilBrown
2013-09-24 17:38                         ` Martin Wilck
2013-09-24 17:43                           ` Martin Wilck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.