* Problem with Raid1 when all drives failed
From: Baldysiak, Pawel @ 2013-06-20  6:22 UTC
  To: neilb; +Cc: linux-raid

Hi Neil,

We have observed strange behavior of a RAID1 volume when all of its drives have failed.
Here is our test case:

Steps to reproduce:
1. Create a 2-drive RAID1 array (tested with both native and IMSM metadata)
2. Wait for the end of the initial resync 
3. Hot-unplug both drives of the RAID1 volume
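
For reference, here is a rough command-line sketch of the test case (a sketch only; /dev/sdb and /dev/sdc are placeholder device names, and the hot-unplug is simulated by deleting the SCSI devices through sysfs):

  # 1. Create a 2-drive RAID1 array (native metadata shown)
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

  # 2. Wait for the initial resync to finish
  mdadm --wait /dev/md0

  # 3. Simulate hot-unplugging both members
  echo 1 > /sys/block/sdb/device/delete
  echo 1 > /sys/block/sdc/device/delete

  # The array is still listed: degraded, with one member remaining
  cat /proc/mdstat
  mdadm --detail /dev/md0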

Actual behavior:
The RAID1 volume is still present in the OS as a degraded one-drive array.

Expected behavior:
Should the RAID volume disappear from the OS?

I see that when a drive is removed from the OS, udev runs "mdadm -If <>" for the missing member, which tries to write "faulty" to the state of that array member.
I also see that the md driver prevents this operation for the last drive in a RAID1 array, so when both drives fail, nothing really happens to the drive that fails second.
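
For illustration, the same behavior can be observed directly through the md sysfs interface (a sketch only; md0 and sdb are placeholder names, and the exact error returned for the last member may differ):

  # Current state of the member as seen by the md driver
  cat /sys/block/md0/md/dev-sdb/state        # e.g. "in_sync"

  # This is effectively what "mdadm -If" requests for the vanished member
  echo faulty > /sys/block/md0/md/dev-sdb/state

  # For the last working member of a RAID1 the driver refuses the
  # transition, so the array stays up as a degraded one-drive array
  cat /proc/mdstat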

This can be very dangerous: if the user has a filesystem mounted on this array, it can lead to unstable system behavior or even a system crash. Moreover, the user does not have accurate information about the state of the array.

How should this work according to the design? Should mdadm stop the volume when all of its members disappear?

Pawel Baldysiak


* Re: Problem with Raid1 when all drives failed
From: Stan Hoeppner @ 2013-06-20  9:11 UTC
  To: Baldysiak, Pawel; +Cc: neilb, linux-raid

On 6/20/2013 1:22 AM, Baldysiak, Pawel wrote:
> Hi Neil,
> 
> We have observed strange behavior of a RAID1 volume when all of its drives have failed.
> Here is our test case:
> 
> Steps to reproduce:
> 1. Create a 2-drive RAID1 array (tested with both native and IMSM metadata)
> 2. Wait for the end of the initial resync 
> 3. Hot-unplug both drives of the RAID1 volume
> 
> Actual behavior:
> The RAID1 volume is still present in the OS as a degraded one-drive array.
> 
> Expected behavior:
> Should the RAID volume disappear from the OS?
> 
> I see that when a drive is removed from the OS, udev runs "mdadm -If <>" for the missing member, which tries to write "faulty" to the state of that array member.
> I also see that the md driver prevents this operation for the last drive in a RAID1 array, so when both drives fail, nothing really happens to the drive that fails second.
> 
> This can be very dangerous: if the user has a filesystem mounted on this array, it can lead to unstable system behavior or even a system crash. Moreover, the user does not have accurate information about the state of the array.
> 
> How should this work according to the design? Should mdadm stop the volume when all of its members disappear?

How is this scenario meaningfully different from tripping over an eSATA
cable, accidentally unplugging a JBOD chassis SAS cable, losing power to
a JBOD chassis, etc?

-- 
Stan




* Re: Problem with Raid1 when all drives failed
From: NeilBrown @ 2013-06-24  7:08 UTC
  To: Baldysiak, Pawel; +Cc: linux-raid


On Thu, 20 Jun 2013 06:22:32 +0000 "Baldysiak, Pawel"
<pawel.baldysiak@intel.com> wrote:

> Hi Neil,
> 
> We have observed strange behavior of a RAID1 volume when all of its drives have failed.
> Here is our test case:
> 
> Steps to reproduce:
> 1. Create a 2-drive RAID1 array (tested with both native and IMSM metadata)
> 2. Wait for the end of the initial resync 
> 3. Hot-unplug both drives of the RAID1 volume
> 
> Actual behavior:
> The RAID1 volume is still present in the OS as a degraded one-drive array.

That is what I expect.

> 
> Expected behavior:
> Should the RAID volume disappear from the OS?

How exactly?  If the filesystem is mounted, that would be impossible.

> 
> I see that when a drive is removed from the OS, udev runs "mdadm -If <>" for the missing member, which tries to write "faulty" to the state of that array member.
> I also see that the md driver prevents this operation for the last drive in a RAID1 array, so when both drives fail, nothing really happens to the drive that fails second.
> 
> This can be very dangerous: if the user has a filesystem mounted on this array, it can lead to unstable system behavior or even a system crash. Moreover, the user does not have accurate information about the state of the array.

It shouldn't lead to a crash, but it could certainly cause problems.
Unplugging active devices often does.

> 
> How should this work according to the design? Should mdadm stop the volume when all of its members disappear?

Have a look at the current code in my "master" branch.  When this happens it
will try to stop the array (which will fail if the array is mounted), and
will try to get "udisks" to unmount the array (which will fail if the
filesystem is in use).
So it goes a little way in the direction you want, but I think that what you
are asking for is impossible with Linux.
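
The manual equivalent of that sequence is roughly the following (a sketch only; /dev/md0 is a placeholder, and the new mdadm code asks udisks to perform the unmount rather than calling umount itself):

  # Release the filesystem first; this fails if it is in use
  umount /dev/md0

  # Then try to stop the array; this fails if it is still mounted
  mdadm --stop /dev/md0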

NeilBrown



