* Unexpected raid1 behaviour
@ 2017-12-16 19:50 Dark Penguin
  2017-12-17 11:58 ` Duncan
                   ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Dark Penguin @ 2017-12-16 19:50 UTC (permalink / raw)
  To: linux-btrfs

Could someone please point me towards something to read about how btrfs
handles multiple devices? Namely, kicking faulty devices and re-adding them.

I've been using btrfs on single devices for a while, but now I want to
start using it in raid1 mode. I booted into an Ubuntu 17.10 LiveCD and
tried to see how it handles various situations. The experience left
me very surprised; I've tried a number of things, all of which produced
unexpected results.

I create a btrfs raid1 filesystem on two hard drives and mount it.
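Roughly the sequence I used (device names and mount point are just
examples):

  mkfs.btrfs -f -d raid1 -m raid1 /dev/sdb /dev/sdc   # mirror data and metadata
  mount /dev/sdb /mnt
  btrfs filesystem df /mnt   # shows Data, RAID1 and Metadata, RAID1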

- When I pull one of the drives out (simulating a simple cable failure,
which happens pretty often to me), the filesystem sometimes goes
read-only. ???
- But only after a while, and not always. ???
- When I fix the cable problem (plug the device back), it's immediately
"re-added" back. But I see no replication of the data I've written onto
a degraded filesystem... Nothing shows any problems, so "my filesystem
must be ok". ???
- If I unmount the filesystem and then mount it back, I see all my
recent changes lost (everything I wrote during the "degraded" period).
- If I continue working with a degraded raid1 filesystem (even without
damaging it further by re-adding the faulty device), after a while it
won't mount at all, even with "-o degraded".

I can't wrap my head around all this. Either the kicked device should not
be re-added, or it should be re-added "properly", or it should at least
show some errors and not pretend nothing happened, right?..

I must be missing something. Is there an explanation somewhere about
what's really going on during those situations? Also, do I understand
correctly that upon detecting a faulty device (a write error), nothing
is done about it except logging an error into the 'btrfs device stats'
report? No device kicking, no notification?.. And what about degraded
filesystems - is it absolutely forbidden to work with them without
converting them to a "single" filesystem first?..

On Ubuntu 17.10, there's Linux 4.13.0-16 and btrfs-progs 4.12-1.


-- 
darkpenguin


* Re: Unexpected raid1 behaviour
  2017-12-16 19:50 Unexpected raid1 behaviour Dark Penguin
@ 2017-12-17 11:58 ` Duncan
  2017-12-17 15:48   ` Peter Grandi
  2017-12-18  5:11   ` Anand Jain
  2017-12-18  1:20 ` Qu Wenruo
  2017-12-18 13:31 ` Austin S. Hemmelgarn
  2 siblings, 2 replies; 61+ messages in thread
From: Duncan @ 2017-12-17 11:58 UTC (permalink / raw)
  To: linux-btrfs

Dark Penguin posted on Sat, 16 Dec 2017 22:50:33 +0300 as excerpted:

> Could someone please point me towards some read about how btrfs handles
> multiple devices? Namely, kicking faulty devices and re-adding them.
> 
> I've been using btrfs on single devices for a while, but now I want to
> start using it in raid1 mode. I booted into an Ubuntu 17.10 LiveCD and
> tried to see how does it handle various situations. The experience left
> me very surprised; I've tried a number of things, all of which produced
> unexpected results.
> 
> I create a btrfs raid1 filesystem on two hard drives and mount it.
> 
> - When I pull one of the drives out (simulating a simple cable failure,
> which happens pretty often to me), the filesystem sometimes goes
> read-only. ???
> - But only after a while, and not always. ???
> - When I fix the cable problem (plug the device back), it's immediately
> "re-added" back. But I see no replication of the data I've written onto
> a degraded filesystem... Nothing shows any problems, so "my filesystem
> must be ok". ???
> - If I unmount the filesystem and then mount it back, I see all my
> recent changes lost (everything I wrote during the "degraded" period). -
> If I continue working with a degraded raid1 filesystem (even without
> damaging it further by re-adding the faulty device), after a while it
> won't mount at all, even with "-o degraded".
> 
> I can't wrap my head about all this. Either the kicked device should not
> be re-added, or it should be re-added "properly", or it should at least
> show some errors and not pretend nothing happened, right?..
> 
> I must be missing something. Is there an explanation somewhere about
> what's really going on during those situations? Also, do I understand
> correctly that upon detecting a faulty device (a write error), nothing
> is done about it except logging an error into the 'btrfs device stats'
> report? No device kicking, no notification?.. And what about degraded
> filesystems - is it absolutely forbidden to work with them without
> converting them to a "single" filesystem first?..
> 
> On Ubuntu 17.10, there's Linux 4.13.0-16 and btrfs-progs 4.12-1 .

Btrfs device handling at this point is still "development level" and very 
rough, but there's a patch set in active review ATM that should improve 
things dramatically, perhaps as soon as 4.16 (4.15 is already well on the 
way).

Basically, at this point btrfs doesn't have "dynamic" device handling.  
That is, if a device disappears, it doesn't know it.  So it continues 
attempting to write to (and read from, but the reads are redirected) the 
missing device until things go bad enough it kicks to read-only for 
safety.

If a device is added back, the kernel normally shuffles device names and 
assigns a new one.  Btrfs will see it and list the new device, but it's 
still trying to use the old one internally.  =:^(

Thus, if a device disappears, to get it back you really have to reboot, 
or at least unload/reload the btrfs kernel module, in order to clear 
the stale device state and have btrfs rescan and reassociate devices with 
the matching filesystems.

Meanwhile, once a device goes stale -- other devices in the filesystem 
have data that should have been written to the stale one but it was gone 
so the data couldn't get to it -- once you do the module unload/reload or 
reboot cycle and btrfs picks up the device again, you should immediately 
do a btrfs scrub, which will detect and "catch up" the differences.

Btrfs tracks atomic filesystem updates via a monotonically increasing 
generation number, aka transaction-id (transid).  When a device goes 
offline, its generation number of course gets stuck at the point it went 
offline, while the other devices continue to update their generation 
numbers.

When a stale device is readded, btrfs should automatically find and use 
the device with the latest generation, but the old one isn't 
automatically caught up -- a scrub is the mechanism by which you do this.
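
As a rough sketch of that recovery sequence (device name and mount point
are just examples; the module can of course only be unloaded while no
btrfs filesystem is mounted):

  umount /mnt
  modprobe -r btrfs && modprobe btrfs   # or simply reboot
  btrfs device scan                     # reassociate devices with filesystems
  mount /dev/sdb /mnt
  btrfs scrub start -Bd /mnt            # -B waits, -d shows per-device stats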

One thing you do **NOT** want to do is degraded-writable mount one 
device, then the other device, of a raid1 pair, because that'll diverge 
the two with new data on each, and that's no longer simple to correct.  
If you /have/ to degraded-writable mount a raid1, always make sure it's 
the same one mounted writable if you want to combine them again.  If you 
/do/ need to recombine two diverged raid1 devices, the only safe way to 
do so is to wipe the one so btrfs has only the one copy of the data to go 
on, and add the wiped device back as a new device.
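
A sketch of that recombination, assuming /dev/sdb is the copy being kept
and /dev/sdc is the diverged device being sacrificed (names are just
examples):

  wipefs -a /dev/sdc                 # destroy the stale btrfs signature
  mount -o degraded /dev/sdb /mnt
  btrfs device add /dev/sdc /mnt     # comes back as a brand-new device
  btrfs device remove missing /mnt   # drop the old, now-missing member

Using "btrfs replace start <devid-of-missing> /dev/sdc /mnt" instead of
the add/remove pair is usually faster, where replacing a missing device
by devid is supported.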

Meanwhile, until /very/ recently... 4.13 may not be current enough... if 
you mounted a two-device raid1 degraded-writable, btrfs would try to 
write and note that it couldn't do raid1 because there wasn't a second 
device, so it would create single chunks to write into.

And the older filesystem safe-mount mechanism would see those single 
chunks on a raid1 and decide it wasn't safe to mount the filesystem 
writable at all after that, even if all the single chunks were actually 
present on the remaining device.

The effect was that if a device died, you had exactly one degraded-
writable mount to replace it successfully.  If you didn't complete the 
replace in that single chance writable mount, the filesystem would refuse 
to mount writable again, and thus it was impossible to repair the 
filesystem since that required a writable mount and that was no longer 
possible!  Fortunately the filesystem could still be mounted degraded-
readonly (unless there was some other problem), allowing people to at 
least get at the read-only data to copy it elsewhere.

With a new enough btrfs, while btrfs will still create those single 
chunks on a degraded-writable mount of a raid1, it's at least smart 
enough to do per-chunk checks to see if they're all available on existing 
devices (none only on the missing device), and will continue to allow 
degraded-writable mounting if so.

But once the filesystem is back to multi-device (with writable space on 
at least two devices), a balance-convert of those single chunks to raid1 
should be done, otherwise if the device with them on it goes...
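
That balance-convert is typically something like this (mount point is
just an example; the "soft" filter skips chunks that are already raid1):

  btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt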

And there's work on allowing it to do only single-copy, thus incomplete-
raid1, chunk writes as well.  This should prevent the single mode chunks 
entirely, thus eliminating the need for the balance-convert, tho a scrub 
would still be needed to fully sync back up.  But I'm not sure what the 
status is on that.

Meanwhile, as mentioned above, there's active work on proper dynamic 
btrfs device tracking and management.  It may or may not be ready for 
4.16, but once it goes in, btrfs should properly detect a device going 
away and react accordingly, and it should detect a device coming back as 
a different device too.  As I write this it occurs to me that I've not 
read closely enough to know if it actually initiates scrub/resync on its 
own in the current patch set, but that's obviously an eventual goal if 
not.

Longer term, there's further patches that will provide a hot-spare 
functionality, automatically bringing in a device pre-configured as a hot-
spare if a device disappears, but that of course requires that btrfs 
properly recognize devices disappearing and coming back first, so one 
thing at a time.  Tho as originally presented, that hot-spare 
functionality was a bit limited -- it was a global hot-spare list, and 
with multiple btrfs of different sizes and multiple hot-spare devices 
also of different sizes, it would always just pick the first spare on the 
list for the first btrfs needing one, regardless of whether the size was 
appropriate for that filesystem or not.  By the time the feature actually 
gets merged it may have changed some, and regardless, it should 
eventually get less limited, but that's _eventually_, with a target time 
likely still in years, so don't hold your breath.


I think that answers most of your questions.  Basically, you have to be 
quite careful with btrfs raid1 today, as btrfs simply doesn't have the 
automated functionality to handle it yet.  It's still possible to do two-
device-only raid1 and replace a failed device when you're down to one, 
but it's not as easy or automated as more mature raid options such as 
mdraid, and you do have to keep on top of it as a result.  But it can and 
does work reasonably well for those (like me) who use btrfs raid1 as 
their "daily driver", as long as you /do/ keep on top of it... and don't 
try to use raid1 as a replacement for real backups, because it's *not* a 
backup! =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Unexpected raid1 behaviour
  2017-12-17 11:58 ` Duncan
@ 2017-12-17 15:48   ` Peter Grandi
  2017-12-17 20:42     ` Chris Murphy
                       ` (2 more replies)
  2017-12-18  5:11   ` Anand Jain
  1 sibling, 3 replies; 61+ messages in thread
From: Peter Grandi @ 2017-12-17 15:48 UTC (permalink / raw)
  To: Linux fs Btrfs

"Duncan"'s reply is slightly optimistic in parts, so some
further information...

[ ... ]

> Basically, at this point btrfs doesn't have "dynamic" device
> handling.  That is, if a device disappears, it doesn't know
> it.

That's just the consequence of what is a completely broken
conceptual model: the current way most multi-device profiles are
designed is that block-devices can only be "added" or "removed",
and cannot be "broken"/"missing". Therefore if IO fails, that is
just one IO failing, not the entire block-device going away.
The time when a block-device is noticed as sort-of missing is
when it is not available for "add"-ing at start.

Put another way, the multi-device design is/was based on the
demented idea that block-devices that are missing are/should be
"remove"d, so that a 2-device volume with a 'raid1' profile
becomes a 1-device volume with a 'single'/'dup' profile, and not
a 2-device volume with a missing block-device and an incomplete
'raid1' profile, even if things have been awkwardly moving in
that direction in recent years.

Note the above is not totally accurate today because various
hacks have been introduced to work around the various issues.

> Thus, if a device disappears, to get it back you really have
> to reboot, or at least unload/reload the btrfs kernel module,
> in ordered to clear the stale device state and have btrfs
> rescan and reassociate devices with the matching filesystems.

IIRC that is not quite accurate: a "missing" device can be
nowadays "replace"d (by "devid") or "remove"d, the latter
possibly implying profile changes:

  https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Using_add_and_delete

Terrible tricks like this also work:

  https://www.spinics.net/lists/linux-btrfs/msg48394.html

> Meanwhile, as mentioned above, there's active work on proper
> dynamic btrfs device tracking and management. It may or may
> not be ready for 4.16, but once it goes in, btrfs should
> properly detect a device going away and react accordingly,

I haven't seen that, but I doubt that it is the radical redesign
of the multi-device layer of Btrfs that is needed to give it
operational semantics similar to those of MD RAID, and that I
have vaguely described previously.

> and it should detect a device coming back as a different
> device too.

That is disagreeable because of poor terminology: I guess that
what was intended that it should be able to detect a previous
member block-device becoming available again as a different
device inode, which currently is very dangerous in some vital
situations.

> Longer term, there's further patches that will provide a
> hot-spare functionality, automatically bringing in a device
> pre-configured as a hot- spare if a device disappears, but
> that of course requires that btrfs properly recognize devices
> disappearing and coming back first, so one thing at a time.

That would be trivial if the complete redesign of block-device
states of the Btrfs multi-device layer happened, adding an
"active" flag to an "accessible" flag to describe new member
states, for example.

My guess is that while logically consistent, the current
multi-device logic is fundamentally broken from an operational
point of view, and needs a complete replacement instead of
fixes.


* Re: Unexpected raid1 behaviour
  2017-12-17 15:48   ` Peter Grandi
@ 2017-12-17 20:42     ` Chris Murphy
  2017-12-18  8:49       ` Anand Jain
  2017-12-18  8:49     ` Anand Jain
  2017-12-18 13:06     ` Austin S. Hemmelgarn
  2 siblings, 1 reply; 61+ messages in thread
From: Chris Murphy @ 2017-12-17 20:42 UTC (permalink / raw)
  To: Peter Grandi; +Cc: Linux fs Btrfs

On Sun, Dec 17, 2017 at 8:48 AM, Peter Grandi <pg@btrfs.list.sabi.co.uk> wrote:
> "Duncan"'s reply is slightly optimistic in parts, so some
> further information...

>> and it should detect a device coming back as a different
>> device too.
>
> That is disagreeable because of poor terminology: I guess that
> what was intended that it should be able to detect a previous
> member block-device becoming available again as a different
> device inode, which currently is very dangerous in some vital
> situations.

Duncan probably means if the device reappears with different
enumeration (was /dev/sdb1 but comes back as /dev/sde1), that Btrfs
can recover from this by using the Btrfs specific dev.uuid to
recognize the device. Also, by knowing the generation, it in effect has
a virtual write-intent bitmap it can use to catch up that device for
missing commits, which is something that doesn't currently happen
automatically; it requires either a scrub or balance to catch up a
formerly missing device - a very big penalty because the whole array
has to be done to catch it up for what might be only a few minutes of
missing time.



-- 
Chris Murphy


* Re: Unexpected raid1 behaviour
  2017-12-16 19:50 Unexpected raid1 behaviour Dark Penguin
  2017-12-17 11:58 ` Duncan
@ 2017-12-18  1:20 ` Qu Wenruo
  2017-12-18 13:31 ` Austin S. Hemmelgarn
  2 siblings, 0 replies; 61+ messages in thread
From: Qu Wenruo @ 2017-12-18  1:20 UTC (permalink / raw)
  To: Dark Penguin, linux-btrfs





On 2017年12月17日 03:50, Dark Penguin wrote:
> Could someone please point me towards some read about how btrfs handles
> multiple devices? Namely, kicking faulty devices and re-adding them.
> 
> I've been using btrfs on single devices for a while, but now I want to
> start using it in raid1 mode. I booted into an Ubuntu 17.10 LiveCD and
> tried to see how does it handle various situations.
> The experience left
> me very surprised; I've tried a number of things, all of which produced
> unexpected results.
> 
> I create a btrfs raid1 filesystem on two hard drives and mount it.

Initial info like "btrfs fi df" output will help us dig into this further.
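
For example, output of something like the following (mount point is just
an example):

  btrfs filesystem show
  btrfs filesystem df /mnt
  btrfs device stats /mnt
  dmesg | grep -i btrfs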

> 
> - When I pull one of the drives out (simulating a simple cable failure,
> which happens pretty often to me), the filesystem sometimes goes
> read-only. ???

Please provide the kernel message.

> - But only after a while, and not always. ???
> - When I fix the cable problem (plug the device back), it's immediately
> "re-added" back. But I see no replication of the data I've written onto
> a degraded filesystem... Nothing shows any problems, so "my filesystem
> must be ok". ???

Needs extra info like "btrfs fi df"

> - If I unmount the filesystem and then mount it back, I see all my
> recent changes lost (everything I wrote during the "degraded" period).
> - If I continue working with a degraded raid1 filesystem (even without
> damaging it further by re-adding the faulty device), after a while it
> won't mount at all, even with "-o degraded".

Please provide kernel message too.
Although I have doubts about its usefulness, it's still better than nothing.

Thanks,
Qu

> 
> I can't wrap my head about all this. Either the kicked device should not
> be re-added, or it should be re-added "properly", or it should at least
> show some errors and not pretend nothing happened, right?..
> 
> I must be missing something. Is there an explanation somewhere about
> what's really going on during those situations? Also, do I understand
> correctly that upon detecting a faulty device (a write error), nothing
> is done about it except logging an error into the 'btrfs device stats'
> report? No device kicking, no notification?.. And what about degraded
> filesystems - is it absolutely forbidden to work with them without
> converting them to a "single" filesystem first?..
> 
> On Ubuntu 17.10, there's Linux 4.13.0-16 and btrfs-progs 4.12-1 .
> 
> 




* Re: Unexpected raid1 behaviour
  2017-12-17 11:58 ` Duncan
  2017-12-17 15:48   ` Peter Grandi
@ 2017-12-18  5:11   ` Anand Jain
  1 sibling, 0 replies; 61+ messages in thread
From: Anand Jain @ 2017-12-18  5:11 UTC (permalink / raw)
  To: Duncan, linux-btrfs



  Nice status update about the btrfs volume manager. Thanks.

  Below I have added the names of the patches (on the ML or WIP) addressing
  the current limitations.

On 12/17/2017 07:58 PM, Duncan wrote:
> Dark Penguin posted on Sat, 16 Dec 2017 22:50:33 +0300 as excerpted:
> 
>> Could someone please point me towards some read about how btrfs handles
>> multiple devices? Namely, kicking faulty devices and re-adding them.
>>
>> I've been using btrfs on single devices for a while, but now I want to
>> start using it in raid1 mode. I booted into an Ubuntu 17.10 LiveCD and
>> tried to see how does it handle various situations. The experience left
>> me very surprised; I've tried a number of things, all of which produced
>> unexpected results.
>>
>> I create a btrfs raid1 filesystem on two hard drives and mount it.
>>
>> - When I pull one of the drives out (simulating a simple cable failure,
>> which happens pretty often to me), the filesystem sometimes goes
>> read-only. ???
>> - But only after a while, and not always. ???
>> - When I fix the cable problem (plug the device back), it's immediately
>> "re-added" back. But I see no replication of the data I've written onto
>> a degraded filesystem... Nothing shows any problems, so "my filesystem
>> must be ok". ???
>> - If I unmount the filesystem and then mount it back, I see all my
>> recent changes lost (everything I wrote during the "degraded" period). -
>> If I continue working with a degraded raid1 filesystem (even without
>> damaging it further by re-adding the faulty device), after a while it
>> won't mount at all, even with "-o degraded".
>>
>> I can't wrap my head about all this. Either the kicked device should not
>> be re-added, or it should be re-added "properly", or it should at least
>> show some errors and not pretend nothing happened, right?..
>>
>> I must be missing something. Is there an explanation somewhere about
>> what's really going on during those situations? Also, do I understand
>> correctly that upon detecting a faulty device (a write error), nothing
>> is done about it except logging an error into the 'btrfs device stats'
>> report? No device kicking, no notification?.. And what about degraded
>> filesystems - is it absolutely forbidden to work with them without
>> converting them to a "single" filesystem first?..
>>
>> On Ubuntu 17.10, there's Linux 4.13.0-16 and btrfs-progs 4.12-1 .
> 
> Btrfs device handling at this point is still "development level" and very
> rough, but there's a patch set in active review ATM that should improve
> things dramatically, perhaps as soon as 4.16 (4.15 is already well on the
> way).
> 
> Basically, at this point btrfs doesn't have "dynamic" device handling.
> That is, if a device disappears, it doesn't know it.  So it continues
> attempting to write to (and read from, but the reads are redirected) the
> missing device until things go bad enough it kicks to read-only for
> safety.

   btrfs: introduce device dynamic state transition to failed

> If a device is added back, the kernel normally shuffles device names and
> assigns a new one.  Btrfs will see it and list the new device, but it's
> still trying to use the old one internally.  =:^(

   btrfs: handle dynamically reappearing missing device

> Thus, if a device disappears, to get it back you really have to reboot,
> or at least unload/reload the btrfs kernel module, in ordered to clear
> the stale device state and have btrfs rescan and reassociate devices with
> the matching filesystems.
> 
> Meanwhile, once a device goes stale -- other devices in the filesystem
> have data that should have been written to the stale one but it was gone
> so the data couldn't get to it -- once you do the module unload/reload or
> reboot cycle and btrfs picks up the device again, you should immediately
> do a btrfs scrub, which will detect and "catch up" the differences.
 >
> Btrfs tracks atomic filesystem updates via a monotonically increasing
> generation number, aka transaction-id (transid).  When a device goes
> offline, its generation number of course gets stuck at the point it went
> offline, while the other devices continue to update their generation
> numbers.
 >
> When a stale device is readded, btrfs should automatically find and use
> the device with the latest generation, but the old one isn't
> automatically caught up -- a scrub is the mechanism by which you do this.
>
> One thing you do **NOT** want to do is degraded-writable mount one
> device, then the other device, of a raid1 pair, because that'll diverge
> the two with new data on each, and that's no longer simple to correct.
> If you /have/ to degraded-writable mount a raid1, always make sure it's
> the same one mounted writable if you want to combine them again.  If you
> /do/ need to recombine two diverged raid1 devices, the only safe way to
> do so is to wipe the one so btrfs has only the one copy of the data to go
> on, and add the wiped device back as a new device.

   btrfs: handle volume split brain scenario

> Meanwhile, until /very/ recently... 4.13 may not be current enough... if
> you mounted a two-device raid1 degraded-writable, btrfs would try to
> write and note that it couldn't do raid1 because there wasn't a second
> device, so it would create single chunks to write into.
 >
> And the older filesystem safe-mount mechanism would see those single
> chunks on a raid1 and decide it wasn't safe to mount the filesystem
> writable at all after that, even if all the single chunks were actually
> present on the remaining device.
 >
> The effect was that if a device died, you had exactly one degraded-
> writable mount to replace it successfully.  If you didn't complete the
> replace in that single chance writable mount, the filesystem would refuse
> to mount writable again, and thus it was impossible to repair the
> filesystem since that required a writable mount and that was no longer
> possible!  Fortunately the filesystem could still be mounted degraded-
> readonly (unless there was some other problem), allowing people to at
> least get at the read-only data to copy it elsewhere.
 >
> With a new enough btrfs, while btrfs will still create those single
> chunks on a degraded-writable mount of a raid1, it's at least smart
> enough to do per-chunk checks to see if they're all available on existing
> devices (none only on the missing device), and will continue to allow
> degraded-writable mounting if so.

   (v4.14)
   btrfs: Introduce a function to check if all chunks a OK for degraded rw mount

> But once the filesystem is back to multi-device (with writable space on
> at least two devices), a balance-convert of those single chunks to raid1
> should be done, otherwise if the device with them on it goes...
 >
> And there's work on allowing it to do only single-copy, thus incomplete-
> raid1, chunk writes as well.  This should prevent the single mode chunks
> entirely, thus eliminating the need for the balance-convert, tho a scrub
> would still be needed to fully sync back up.  But I'm not sure what the
> status is on that.

   btrfs: create degraded-RAID1 chunks
   (Patch is still WIP. There is a good workaround.)

> Meanwhile, as mentioned above, there's active work on proper dynamic
> btrfs device tracking and management. 

   btrfs: Introduce device pool sysfs attributes
   (needs revival)

> It may or may not be ready for
> 4.16, but once it goes in, btrfs should properly detect a device going
> away and react accordingly, and it should detect a device coming back as
> a different device too.  As I write this it occurs to me that I've not
> read close enough to know if it actually initiates scrub/resync on its
> own in the current patch set, but that's obviously an eventual goal if
> not.

   Right. It doesn't as of now; it's on my list of things to fix.

> Longer term, there's further patches that will provide a hot-spare
> functionality, automatically bringing in a device pre-configured as a hot-
> spare if a device disappears, but that of course requires that btrfs
> properly recognize devices disappearing and coming back first, so one
> thing at a time.  Tho as originally presented, that hot-spare
> functionality was a bit limited -- it was a global hot-spare list, and
> with multiple btrfs of different sizes and multiple hot-spare devices
> also of different sizes, it would always just pick the first spare on the
> list for the first btrfs needing one, regardless of whether the size was
> appropriate for that filesystem or not.  By the time the feature actually
> gets merged it may have changed some, and regardless, it should
> eventually get less limited, but that's _eventually_, with a target time
> likely still in years, so don't hold your breath.

   hah.

   - It's not that difficult to pick up a suitably sized disk from the
     global hot spare list.
   - A CLI can show which fsid/volume a global hot spare is a candidate
     replacement for.
   - An auto-replace priority can be set at the fsid/volume end, or we could
     still dedicate a global hot spare device to a fsid/volume.

  Related patches (needs revival):
   btrfs: block incompatible optional features at scan
   btrfs: introduce BTRFS_FEATURE_INCOMPAT_SPARE_DEV
   btrfs: add check not to mount a spare device
   btrfs: support btrfs dev scan for spare device
   btrfs: provide framework to get and put a spare device
   btrfs: introduce helper functions to perform hot replace
   btrfs: check for failed device and hot replace

> I think that answers most of your questions.  Basically, you have to be
> quite careful with btrfs raid1 today, as btrfs simply doesn't have the
> automated functionality to handle it yet.  It's still possible to do two-
> device-only raid1 and replace a failed device when you're down to one,
> but it's not as easy or automated as more mature raid options such as
> mdraid, and you do have to keep on top of it as a result.  But it can and
> does work reasonably well for those (like me) who use btrfs raid1 as
> their "daily driver", as long as you /do/ keep on top of it... and don't
> try to use raid1 as a replacement for real backups, because it's *not* a
> backup! =:^)
> 

Thanks, Anand


* Re: Unexpected raid1 behaviour
  2017-12-17 15:48   ` Peter Grandi
  2017-12-17 20:42     ` Chris Murphy
@ 2017-12-18  8:49     ` Anand Jain
  2017-12-18 10:36       ` Peter Grandi
                         ` (2 more replies)
  2017-12-18 13:06     ` Austin S. Hemmelgarn
  2 siblings, 3 replies; 61+ messages in thread
From: Anand Jain @ 2017-12-18  8:49 UTC (permalink / raw)
  To: Peter Grandi, Linux fs Btrfs



> Put another way, the multi-device design is/was based on the
> demented idea that block-devices that are missing are/should be
> "remove"d, so that a 2-device volume with a 'raid1' profile
> becomes a 1-device volume with a 'single'/'dup' profile, and not
> a 2-device volume with a missing block-device and an incomplete
> 'raid1' profile, 

  Agreed. IMO degraded-raid1-single-chunk is an accidental feature
  caused by [1], which we should revert, since:
    - balance (to raid1 chunks) may fail if the FS is near full
    - recovery (to raid1 chunks) will take more writes compared
      to recovery under degraded raid1 chunks

  [1]
  commit 95669976bd7d30ae265db938ecb46a6b7f8cb893
  Btrfs: don't consider the missing device when allocating new chunks

  There is an attempt to fix it [2], but it will certainly take time as
  there are many things to fix around this.

  [2]
  [PATCH RFC] btrfs: create degraded-RAID1 chunks

 > even if things have been awkwardly moving in
 > that direction in recent years.
> Note the above is not totally accurate today because various
> hacks have been introduced to work around the various issues.

  Maybe you are talking about [3]. Please note it's a workaround
  patch (as I mentioned in its original submission). It's nice that
  we fixed the availability issue through this patch, and the
  helper function it added also helps other development.
  But for the long term we need to work on [2].

  [3]
  btrfs: Introduce a function to check if all chunks a OK for degraded rw mount

>> Thus, if a device disappears, to get it back you really have
>> to reboot, or at least unload/reload the btrfs kernel module,
>> in ordered to clear the stale device state and have btrfs
>> rescan and reassociate devices with the matching filesystems.
> 
> IIRC that is not quite accurate: a "missing" device can be
> nowadays "replace"d (by "devid") or "remove"d, the latter
> possibly implying profile changes:
 >
>    https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Using_add_and_delete
> 
> Terrible tricks like this also work:
> 
>    https://www.spinics.net/lists/linux-btrfs/msg48394.html

  It's replace, which isn't about bringing back a missing disk.


>> Meanwhile, as mentioned above, there's active work on proper
>> dynamic btrfs device tracking and management. It may or may
>> not be ready for 4.16, but once it goes in, btrfs should
>> properly detect a device going away and react accordingly,
> 
> I haven't seen that, but I doubt that it is the radical redesign
> of the multi-device layer of Btrfs that is needed to give it
> operational semantics similar to those of MD RAID, and that I
> have vaguely described previously.

  I agree that the btrfs volume manager is incomplete in view of
  data center RAS requisites; there are a couple of critical
  bugs and inconsistent design between raid profiles, but I
  doubt it needs a radical redesign.

  Please take a look at [4]; comments are appreciated as usual.
  I have experimented with two approaches, and both are reasonable:
  - There isn't any harm in leaving the failed disk open (but stopping
    any new IO to it), and there will be a udev
    'btrfs dev forget --mounted <dev>' call when the device disappears
    so that we can close the device.
  - In the 2nd approach, close the failed device right away when a disk
    write fails, so that we continue to have only two device states.
  I like the latter.

>> and it should detect a device coming back as a different
>> device too.
> 
> That is disagreeable because of poor terminology: I guess that
> what was intended that it should be able to detect a previous
> member block-device becoming available again as a different
> device inode, which currently is very dangerous in some vital
> situations.

  If a device disappears, patch [4] will completely take the
  device out of btrfs and continue RW in degraded mode.
  When it reappears, [5] will bring it back into the RW list.

   [4]
   btrfs: introduce device dynamic state transition to failed
   [5]
   btrfs: handle dynamically reappearing missing device

  From the original btrfs design, it always depends on the device SB
  fsid:uuid:devid, so the device path, device inode, and device transport
  layer do not matter. For example, you can dynamically bring a device
  up under a different transport and it will work without any downtime.


 > That would be trivial if the complete redesign of block-device
 > states of the Btrfs multi-device layer happened, adding an
 > "active" flag to an "accessible" flag to describe new member
 > states, for example.

  I think you are talking about BTRFS_DEV_STATE. But I think
  Duncan is talking about the patches I included in my
  reply.

Thanks, Anand



* Re: Unexpected raid1 behaviour
  2017-12-17 20:42     ` Chris Murphy
@ 2017-12-18  8:49       ` Anand Jain
  0 siblings, 0 replies; 61+ messages in thread
From: Anand Jain @ 2017-12-18  8:49 UTC (permalink / raw)
  To: Chris Murphy, Peter Grandi; +Cc: Linux fs Btrfs


> formerly missing device - a very big penalty because the whole array
> has to be done to catch it up for what might be only a few minutes of
> missing time.

  For raid1, the [1] CLI will pick only the new chunks.
   [1]
   btrfs bal start -dprofiles=single -mprofiles=single <mnt>

Thanks, Anand


* Re: Unexpected raid1 behaviour
  2017-12-18  8:49     ` Anand Jain
@ 2017-12-18 10:36       ` Peter Grandi
  2017-12-18 12:10       ` Nikolay Borisov
  2017-12-18 22:28       ` Chris Murphy
  2 siblings, 0 replies; 61+ messages in thread
From: Peter Grandi @ 2017-12-18 10:36 UTC (permalink / raw)
  To: Linux fs Btrfs

>> I haven't seen that, but I doubt that it is the radical
>> redesign of the multi-device layer of Btrfs that is needed to
>> give it operational semantics similar to those of MD RAID,
>> and that I have vaguely described previously.

> I agree that btrfs volume manager is incomplete in view of
> data center RAS requisites, there are couple of critical
> bugs and inconsistent design between raid profiles, but I
> doubt if it needs a radical redesign.

Well it needs a radical redesign because the original design was
based on an entirely consistent and logical concept that was
quite different from that required for sensible operations, and
then special-case code was added (and keeps being added) to
fix the consequences.

But I suspect that it does not need a radical *recoding*,
because most if not all of the needed code is already there.
All that needs changing most likely is the member state-machine;
that's the bit that needs a radical redesign, and it is a
relatively small part of the whole.

The closer the member state-machine design is to the MD RAID one
the better as it is a very workable, proven model.

Sometimes I suspect that the design needs to be changed to also
add a formal notion of "stripe" to the Btrfs internals, where a
"stripe" is a collection of chunks that are "related" (and
something like that is already part of the 'raid10' profile),
but I think that needs not be user-visible.


* Re: Unexpected raid1 behaviour
  2017-12-18  8:49     ` Anand Jain
  2017-12-18 10:36       ` Peter Grandi
@ 2017-12-18 12:10       ` Nikolay Borisov
  2017-12-18 13:43         ` Anand Jain
  2017-12-18 22:28       ` Chris Murphy
  2 siblings, 1 reply; 61+ messages in thread
From: Nikolay Borisov @ 2017-12-18 12:10 UTC (permalink / raw)
  To: Anand Jain, Peter Grandi, Linux fs Btrfs



On 18.12.2017 10:49, Anand Jain wrote:
> 
> 
>> Put another way, the multi-device design is/was based on the
>> demented idea that block-devices that are missing are/should be
>> "remove"d, so that a 2-device volume with a 'raid1' profile
>> becomes a 1-device volume with a 'single'/'dup' profile, and not
>> a 2-device volume with a missing block-device and an incomplete
>> 'raid1' profile, 
> 
>  Agreed. IMO degraded-raid1-single-chunk is an accidental feature
>  caused by [1], which we should revert back, since..
>    - balance (to raid1 chunk) may fail if FS is near full
>    - recovery (to raid1 chunk) will take more writes as compared
>      to recovery under degraded raid1 chunks
> 
>  [1]
>  commit 95669976bd7d30ae265db938ecb46a6b7f8cb893
>  Btrfs: don't consider the missing device when allocating new chunks
> 
>  There is an attempt to fix it [2], but will certainly takes time as
>  there are many things to fix around this.
> 
>  [2]
>  [PATCH RFC] btrfs: create degraded-RAID1 chunks
> 
>> even if things have been awkwardly moving in
>> that direction in recent years.
>> Note the above is not totally accurate today because various
>> hacks have been introduced to work around the various issues.
>  May be you are talking about [3]. Pls note its a workaround
>  patch (which I mentioned in its original patch). Its nice that
>  we fixed the availability issue through this patch and the
>  helper function it added also helps the other developments.
>  But for long term we need to work on [2].
> 
>  [3]
>  btrfs: Introduce a function to check if all chunks a OK for degraded rw
> mount
> 
>>> Thus, if a device disappears, to get it back you really have
>>> to reboot, or at least unload/reload the btrfs kernel module,
>>> in ordered to clear the stale device state and have btrfs
>>> rescan and reassociate devices with the matching filesystems.
>>
>> IIRC that is not quite accurate: a "missing" device can be
>> nowadays "replace"d (by "devid") or "remove"d, the latter
>> possibly implying profile changes:
>>
>>   
>> https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Using_add_and_delete
>>
>>
>> Terrible tricks like this also work:
>>
>>    https://www.spinics.net/lists/linux-btrfs/msg48394.html
> 
>  Its replace, which isn't about bringing back a missing disk.
> 
> 
>>> Meanwhile, as mentioned above, there's active work on proper
>>> dynamic btrfs device tracking and management. It may or may
>>> not be ready for 4.16, but once it goes in, btrfs should
>>> properly detect a device going away and react accordingly,
>>
>> I haven't seen that, but I doubt that it is the radical redesign
>> of the multi-device layer of Btrfs that is needed to give it
>> operational semantics similar to those of MD RAID, and that I
>> have vaguely described previously.
> 
>  I agree that btrfs volume manager is incomplete in view of
>  data center RAS requisites, there are couple of critical
>  bugs and inconsistent design between raid profiles, but I
>  doubt if it needs a radical redesign.
> 
>  Pls take a look at [4], comments are appreciated as usual.
>  I have experimented with two approaches and both are reasonable. -
>  There isn't any harm to leave failed disk opened (but stop any
>  new IO to it). And there will be udev
>  'btrfs dev forget --mounted <dev>' call when device disappears
>  so that we can close the device.
>  In the 2nd approach, close the failed device right away when disk
>  write fails, so that we continue to have only two device states.
>  I like the latter.
> 
>>> and it should detect a device coming back as a different
>>> device too.
>>
>> That is disagreeable because of poor terminology: I guess that
>> what was intended that it should be able to detect a previous
>> member block-device becoming available again as a different
>> device inode, which currently is very dangerous in some vital
>> situations.
> 
>  If device disappears, the patch [4] will completely take out the
>  device from btrfs, and continues to RW in degraded mode.
>  When it reappears then [5] will bring it back to the RW list.

but [5] relies on someone from userspace (presumably udev) actually
invoking BTRFS_IOC_SCAN_DEV/BTRFS_IOC_DEVICES_READY, no? Because
device_list_add is only ever called from btrfs_scan_one_device, which in
turn is called by either of the aforementioned IOCTLS or during mount
(which is not at play here).

> 
>   [4]
>   btrfs: introduce device dynamic state transition to failed
>   [5]
>   btrfs: handle dynamically reappearing missing device
> 
>  From the btrfs original design, it always depends on device SB
>  fsid:uuid:devid so it does not matter about the device
>  path or device inode or device transport layer. For eg. Dynamically
>  you can bring a device under different transport and it will work
>  without any down time.
> 
> 
>> That would be trivial if the complete redesign of block-device
>> states of the Btrfs multi-device layer happened, adding an
>> "active" flag to an "accessible" flag to describe new member
>> states, for example.
> 
>  I think you are talking about BTRFS_DEV_STATE.. But I think
>  Duncan is talking about the patches which I included in my
>  reply.
> 
> Thanks, Anand


* Re: Unexpected raid1 behaviour
  2017-12-17 15:48   ` Peter Grandi
  2017-12-17 20:42     ` Chris Murphy
  2017-12-18  8:49     ` Anand Jain
@ 2017-12-18 13:06     ` Austin S. Hemmelgarn
  2017-12-18 19:43       ` Tomasz Pala
  2 siblings, 1 reply; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-18 13:06 UTC (permalink / raw)
  To: Peter Grandi, Linux fs Btrfs

On 2017-12-17 10:48, Peter Grandi wrote:
> "Duncan"'s reply is slightly optimistic in parts, so some
> further information...
> 
> [ ... ]
> 
>> Basically, at this point btrfs doesn't have "dynamic" device
>> handling.  That is, if a device disappears, it doesn't know
>> it.
> 
> That's just the consequence of what is a completely broken
> conceptual model: the current way most multi-device profiles are
> designed is that block-devices and only be "added" or "removed",
> and cannot be "broken"/"missing". Therefore if IO fails, that is
> just one IO failing, not the entire block-device going away.
> The time when a block-device is noticed as sort-of missing is
> when it is not available for "add"-ing at start.
> 
> Put another way, the multi-device design is/was based on the
> demented idea that block-devices that are missing are/should be
> "remove"d, so that a 2-device volume with a 'raid1' profile
> becomes a 1-device volume with a 'single'/'dup' profile, and not
> a 2-device volume with a missing block-device and an incomplete
> 'raid1' profile, even if things have been awkwardly moving in
> that direction in recent years.
> 
> Note the above is not totally accurate today because various
> hacks have been introduced to work around the various issues.
You do realize you just restated exactly what Duncan said, just in a 
much more verbose (and aggressively negative) manner...
> 
>> Thus, if a device disappears, to get it back you really have
>> to reboot, or at least unload/reload the btrfs kernel module,
>> in ordered to clear the stale device state and have btrfs
>> rescan and reassociate devices with the matching filesystems.
> 
> IIRC that is not quite accurate: a "missing" device can be
> nowadays "replace"d (by "devid") or "remove"d, the latter
> possibly implying profile changes:
> 
>    https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices#Using_add_and_delete
> 
> Terrible tricks like this also work:
> 
>    https://www.spinics.net/lists/linux-btrfs/msg48394.html
While that is all true, none of that _fixes_ the issue of a device 
disappearing and then being reconnected.  In theory, you can use `btrfs 
device replace` to force BTRFS to acknowledge the new name (by 
'replacing' the missing device with the now returned device), but doing 
so is so horribly inefficient as to not be worth it unless you have no 
other choice.
> 
>> Meanwhile, as mentioned above, there's active work on proper
>> dynamic btrfs device tracking and management. It may or may
>> not be ready for 4.16, but once it goes in, btrfs should
>> properly detect a device going away and react accordingly,
> 
> I haven't seen that, but I doubt that it is the radical redesign
> of the multi-device layer of Btrfs that is needed to give it
> operational semantics similar to those of MD RAID, and that I
> have vaguely described previously.
Anand has been working on hot spare support, and as part of that has 
done some work on handling of missing devices.
> 
>> and it should detect a device coming back as a different
>> device too.
> 
> That is disagreeable because of poor terminology: I guess that
> what was intended that it should be able to detect a previous
> member block-device becoming available again as a different
> device inode, which currently is very dangerous in some vital
> situations.
How exactly is this dangerous?  The only situation I can think of is if 
a bogus device is hot-plugged and happens to perfectly match all the 
required identifiers, and at that point you've either got someone 
attacking your system who already has sufficient access to do whatever 
the hell they want with it, or you did something exceedingly stupid, and 
both cases are dangerous by themselves.
> 
>> Longer term, there's further patches that will provide a
>> hot-spare functionality, automatically bringing in a device
>> pre-configured as a hot- spare if a device disappears, but
>> that of course requires that btrfs properly recognize devices
>> disappearing and coming back first, so one thing at a time.
> 
> That would be trivial if the complete redesign of block-device
> states of the Btrfs multi-device layer happened, adding an
> "active" flag to an "accessible" flag to describe new member
> states, for example.
No, it wouldn't be trivial, because a complete redesign of part of the 
filesystem would be needed.
> 
> My guess that while logically consistent, the current
> multi-device logic is fundamentally broken from an operational
> point of view, and needs a complete replacement instead of
> fixes.
Then why don't you go write up some patches yourself if you feel so 
strongly about it?

The fact is, the only cases where this is really an issue are if you've 
either got intermittently bad hardware, or are dealing with external 
storage devices.  For the majority of people who are using multi-device 
setups, the common case is internally connected fixed storage devices 
with properly working hardware, and for that use case, it works 
perfectly fine.  In fact, the only people I've seen any reports of 
issues from are either:

1. Testing the behavior of device management (such as the OP), in which 
case, yes it doesn't work if you do things that aren't reasonably 
expected of working hardware.
2. Trying to do multi-device on USB, which is a bad idea regardless of 
what you're using to create a single volume, because USB has pretty 
serious reliability issues.

Neither case is 'normal' usage of a multi-device volume though.  Yes, 
the second case could be better supported, but that's likely going to 
require some help from the block layer, and verification of writes.  As 
far as handling of other marginal hardware, I'm very inclined to say 
that BTRFS should not care.  At the point at which a device is dropping 
off the bus and reappearing with enough regularity for this to be an 
issue, you have absolutely no idea how else it's corrupting your data, 
and support of such a situation is beyond any filesystem (including ZFS).


* Re: Unexpected raid1 behaviour
  2017-12-16 19:50 Unexpected raid1 behaviour Dark Penguin
  2017-12-17 11:58 ` Duncan
  2017-12-18  1:20 ` Qu Wenruo
@ 2017-12-18 13:31 ` Austin S. Hemmelgarn
  2018-01-12 12:26   ` Dark Penguin
  2 siblings, 1 reply; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-18 13:31 UTC (permalink / raw)
  To: Dark Penguin, linux-btrfs

On 2017-12-16 14:50, Dark Penguin wrote:
> Could someone please point me towards some read about how btrfs handles
> multiple devices? Namely, kicking faulty devices and re-adding them.
> 
> I've been using btrfs on single devices for a while, but now I want to
> start using it in raid1 mode. I booted into an Ubuntu 17.10 LiveCD and
> tried to see how does it handle various situations. The experience left
> me very surprised; I've tried a number of things, all of which produced
> unexpected results.
Expounding a bit on Duncan's answer with some more specific info.
> 
> I create a btrfs raid1 filesystem on two hard drives and mount it.
> 
> - When I pull one of the drives out (simulating a simple cable failure,
> which happens pretty often to me), the filesystem sometimes goes
> read-only. ???
> - But only after a while, and not always. ???
The filesystem won't go read-only until it hits an I/O error, and it's 
non-deterministic how long that will take on an idle filesystem that only 
sees read access (because all of the files being read may already be in 
the page cache).
> - When I fix the cable problem (plug the device back), it's immediately
> "re-added" back. But I see no replication of the data I've written onto
> a degraded filesystem... Nothing shows any problems, so "my filesystem
> must be ok". ???
One of two things happens in this case, and why there is no re-sync is 
dependent on which happens, but both ultimately have to do with the fact 
that BTRFS assumes I/O errors are from device failures, and are at worst 
transient.  Either:

1. The device reappears with the same name. This happens if the time it 
was disconnected is less than the kernel's command timeout (30 seconds 
by default).  In this case, BTRFS may not even notice that the device 
was gone (and if it doesn't, then a re-sync isn't necessary, since it 
will retry all the writes it needs to).  In this case, BTRFS assumes the 
I/O errors were temporary, and keeps using the device after logging the 
errors.  If this happens, then you need to manually re-sync things by 
scrubbing the filesystem (or balancing, but scrubbing is preferred as it 
should run quicker and will only re-write what is actually needed).
2. The device reappears with a different name.  In this case, the device 
was gone long enough that the block layer is certain it was 
disconnected, and thus when it reappears and BTRFS still holds open 
references to the old device node, it gets a new device node.  In this 
case, if the 'new' device is scanned, BTRFS will recognize it as part of 
the FS, but will keep using the old device node.  The correct fix here 
is to unmount the filesystem, re-scan all devices, and then remount the 
filesystem and manually re-sync with a scrub.
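
In command form, that fix is roughly (device name and mount point are just 
examples):

  umount /mnt
  btrfs device scan            # pick up the returned device under its new node
  mount /dev/sdb /mnt
  btrfs scrub start -Bd /mnt   # re-sync the formerly missing device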

> - If I unmount the filesystem and then mount it back, I see all my
> recent changes lost (everything I wrote during the "degraded" period).
I'm not quite sure about this, but I think BTRFS is rolling back to the 
last common generation number for some reason.

> - If I continue working with a degraded raid1 filesystem (even without
> damaging it further by re-adding the faulty device), after a while it
> won't mount at all, even with "-o degraded".
This is (probably) a known bug relating to chunk handling.  In a two 
device volume using a raid1 profile with a missing device, older kernels 
(I don't remember when the fix went in, but I could have sworn it was in 
4.13) will (erroneously) generate single-profile chunks when they need 
to allocate new chunks.  When you then go to mount the filesystem, the 
check for the degraded mount-ability of the FS fails because there is a 
device missing and single profile chunks.

Now, even without that bug, it's never a good idea to run a storage 
array degraded for any extended period of time, regardless of what type 
of array it is (BTRFS, ZFS, MD, LVM, or even hardware RAID).  By keeping 
it in 'degraded' mode, you're essentially telling the system that the 
array will be fixed in a reasonably short time-frame, which impacts how 
it handles the array.  If you're not going to fix it almost immediately, 
you should almost always reshape the array to account for the missing 
device if at all possible, as that will improve relative data safety and 
generally get you better performance than running degraded will.
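
For a two-device raid1 that has lost one device, that reshape is roughly 
(device name and mount point are just examples):

  mount -o degraded /dev/sdb /mnt
  btrfs balance start -dconvert=single -mconvert=dup /mnt   # profiles one device can hold
  btrfs device remove missing /mnt                          # forget the dead member
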
> 
> I can't wrap my head about all this. Either the kicked device should not
> be re-added, or it should be re-added "properly", or it should at least
> show some errors and not pretend nothing happened, right?..
BTRFS is not the best at error reporting at the moment.  If you check 
the output of `btrfs device stats` for that filesystem though, it should 
show non-zero values in the error counters (note that these counters are 
cumulative, so they are counts since the last time they were reset (or 
when the FS was created if they have never been reset).  Similarly, 
scrub should report errors, there should be error messages in the kernel 
log, and switching the FS to read-only mode _is_ technically reporting 
an error, as that's standard error behavior for most sensible 
filesystems (ext[234] being the notable exception, they just continue as 
if nothing happened).
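
For example (mount point is just an example):

  btrfs device stats /mnt      # per-device write/read/flush/corruption/generation error counters
  btrfs device stats -z /mnt   # print the counters, then reset them to zero
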
> 
> I must be missing something. Is there an explanation somewhere about
> what's really going on during those situations? Also, do I understand
> correctly that upon detecting a faulty device (a write error), nothing
> is done about it except logging an error into the 'btrfs device stats'
> report? No device kicking, no notification?.. And what about degraded
> filesystems - is it absolutely forbidden to work with them without
> converting them to a "single" filesystem first?..
As mentioned above, going read-only _is_ a notification that something 
is wrong.  Translating that (and the error counter increase, and the 
kernel log messages) into a user visible notification is not really the 
job of BTRFS, especially considering that no other filesystem or device 
manager does so either (yes, you can get nice notifications from LVM, 
but they aren't _from_ LVM itself, they're from other software that 
watches for errors, and the same type of software works just fine for 
BTRFS too).  If you're this worried about it and don't want to keep on 
top of it yourself by monitoring things manually, you really need to 
look into a tool like monit [1] that can handle this for you.
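
As a minimal sketch of that kind of monitoring (the script path, mount
point and service name are assumptions, not anything btrfs-progs or
monit ship):

    #!/bin/sh
    # /usr/local/bin/check-btrfs-errors.sh
    # exit non-zero if any btrfs error counter on /mnt is non-zero
    btrfs device stats /mnt | awk '$2 != 0 { bad = 1 } END { exit bad }'

    # monitrc fragment calling the helper above
    check program btrfs_errors with path "/usr/local/bin/check-btrfs-errors.sh"
        if status != 0 then alert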


[1] https://mmonit.com/monit/

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 12:10       ` Nikolay Borisov
@ 2017-12-18 13:43         ` Anand Jain
  0 siblings, 0 replies; 61+ messages in thread
From: Anand Jain @ 2017-12-18 13:43 UTC (permalink / raw)
  To: Nikolay Borisov, Peter Grandi, Linux fs Btrfs



>>> what was intended that it should be able to detect a previous
>>> member block-device becoming available again as a different
>>> device inode, which currently is very dangerous in some vital
>>> situations.

  Peter, What's the dangerous part here ?

>>   If device disappears, the patch [4] will completely take out the
>>   device from btrfs, and continues to RW in degraded mode.
>>   When it reappears then [5] will bring it back to the RW list.
> 
> but [5] relies on someone from userspace (presumably udev) actually
> invoking BTRFS_IOC_SCAN_DEV/IOSC_DEVICES_READY, no ?

  Nikolay, yes. Most of the distro udev rules already do that: udev
  calls btrfs dev scan when the SB is overwritten from userland or when
  a device whose primary SB is btrfs (re)appears.
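
  For reference, the rule involved is roughly the systemd
  64-btrfs.rules shipped by most distributions (quoted from memory,
  details vary by version):

    # approximate contents of /usr/lib/udev/rules.d/64-btrfs.rules
    SUBSYSTEM!="block", GOTO="btrfs_end"
    ACTION=="remove", GOTO="btrfs_end"
    ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"
    # registers the device with the kernel and asks whether the volume
    # is complete (BTRFS_IOC_DEVICES_READY under the hood)
    IMPORT{builtin}="btrfs ready $devnode"
    # tell systemd not to treat the device as usable until then
    ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0"
    LABEL="btrfs_end"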

> Because
> device_list_add is only ever called from btrfs_scan_one_device, which in
> turn is called by either of the aforementioned IOCTLS or during mount
> (which is not at play here).

  Hm. as above.

Thanks Anand


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 13:06     ` Austin S. Hemmelgarn
@ 2017-12-18 19:43       ` Tomasz Pala
  2017-12-18 22:01         ` Peter Grandi
  2017-12-19 12:25         ` Austin S. Hemmelgarn
  0 siblings, 2 replies; 61+ messages in thread
From: Tomasz Pala @ 2017-12-18 19:43 UTC (permalink / raw)
  To: Linux fs Btrfs

On Mon, Dec 18, 2017 at 08:06:57 -0500, Austin S. Hemmelgarn wrote:

> The fact is, the only cases where this is really an issue is if you've 
> either got intermittently bad hardware, or are dealing with external 

Well, the RAID1+ is all about the failing hardware.

> storage devices.  For the majority of people who are using multi-device 
> setups, the common case is internally connected fixed storage devices 
> with properly working hardware, and for that use case, it works 
> perfectly fine.

If you're talking about "RAID"-0 or storage pools (volume management)
that is true.
But if you imply that RAID1+ "works perfectly fine as long as hardware
works fine", this is fundamentally wrong. If the hardware needs to work
properly for the RAID to work properly, no one would need this RAID in
the first place.

> that BTRFS should not care.  At the point at which a device is dropping 
> off the bus and reappearing with enough regularity for this to be an 
> issue, you have absolutely no idea how else it's corrupting your data, 
> and support of such a situation is beyond any filesystem (including ZFS).

Support for such a situation is exactly what RAID provides. So don't
blame people for expecting this to be handled as long as you call the
filesystem feature 'RAID'.

If this feature is not going to mitigate hardware hiccups by design (as
opposed to "not implemented yet, needs some time", which is perfectly
understandable), just don't call it 'RAID'.

All the features currently working, like bit-rot mitigation for
duplicated data (dup/raid*) using checksums, are something different
than RAID itself. RAID means "survive failure of N devices/controllers"
- I got one "RAID1" stuck in r/o after degraded mount, not nice... Not
_expected_ to happen after single disk failure (without any reappearing).

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 19:43       ` Tomasz Pala
@ 2017-12-18 22:01         ` Peter Grandi
  2017-12-19 12:46           ` Austin S. Hemmelgarn
  2017-12-19 12:25         ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 61+ messages in thread
From: Peter Grandi @ 2017-12-18 22:01 UTC (permalink / raw)
  To: Linux fs Btrfs

>> The fact is, the only cases where this is really an issue is
>> if you've either got intermittently bad hardware, or are
>> dealing with external

> Well, the RAID1+ is all about the failing hardware.

>> storage devices. For the majority of people who are using
>> multi-device setups, the common case is internally connected
>> fixed storage devices with properly working hardware, and for
>> that use case, it works perfectly fine.

> If you're talking about "RAID"-0 or storage pools (volume
> management) that is true. But if you imply, that RAID1+ "works
> perfectly fine as long as hardware works fine" this is
> fundamentally wrong.

I really agree with this, the argument about "properly working
hardware" is utterly ridiculous. I'll to this: apparently I am
not the first one to discover the "anomalies" in the "RAID"
profiles, but I may have been the first to document some of
them, e.g. the famous issues with the 'raid1' profile. How did I
discover them? Well, I had used Btrfs in single device mode for
a bit, and wanted to try multi-device, and the docs seemed
"strange", so I did tests before trying it out.

The tests were simply on a spare PC with a bunch of old disks to
create two block devices (partitions), put them in 'raid1' first
natively, then by adding a new member to an existing partition,
and then 'remove' one, or simply unplug it (actually 'echo 1 >
/sys/block/.../device/delete') initially. I wanted to check
exactly what happened, resync times, speed, behaviour and speed
when degraded, just ordinary operational tasks.

Well I found significant problems after less than one hour. I
can't imagine anyone with some experience of hw or sw RAID
(especially hw RAID, as hw RAID firmware is often fantastically
buggy especially as to RAID operations) that wouldn't have done
the same tests before operational use, and would not have found
the same issues too straight away. The only guess I could draw
is that whoever designed the "RAID" profile had zero operational
system administration experience.

> If the hardware needs to work properly for the RAID to work
> properly, noone would need this RAID in the first place.

It is not just that, but some maintenance operations are needed
even if the hardware works properly: for example preventive
maintenance, replacing drives that are becoming too old,
expanding capacity, testing periodically hardware bits. Systems
engineers don't just say "it works, let's assume it continues to
work properly, why worry".

My impression is that multi-device and "chunks" were designed in
one way by someone, and someone else did not understand the
intent, and confused them with "RAID", and based the 'raid'
profiles on that confusion. For example the 'raid10' profile
seems the least confused to me, and that's I think because the
"RAID" aspect is kept more distinct from the "multi-device"
aspect. But perhaps I am an optimist...

To simplify a longer discussion to have "RAID" one needs an
explicit design concept of "stripe", which in Btrfs needs to be
quite different from that of "set of member devices" and
"chunks", so that for example adding/removing to a "stripe" is
not quite the same thing as adding/removing members to a volume,
plus to make a distinction between online and offline members,
not just added and removed ones, and well-defined state machine
transitions (e.g. in response to hardware problems) among all
those, like in MD RAID. But the importance of such distinctions
may not be apparent to everybody.

But I may have read comments in which "block device" (a data
container on some medium), "block device inode" (a descriptor
for that) and "block device name" (a path to a "block device
inode") were hopelessly confused, so I don't hold a lot of
hope. :-(

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18  8:49     ` Anand Jain
  2017-12-18 10:36       ` Peter Grandi
  2017-12-18 12:10       ` Nikolay Borisov
@ 2017-12-18 22:28       ` Chris Murphy
  2017-12-18 22:29         ` Chris Murphy
                           ` (3 more replies)
  2 siblings, 4 replies; 61+ messages in thread
From: Chris Murphy @ 2017-12-18 22:28 UTC (permalink / raw)
  To: Anand Jain; +Cc: Peter Grandi, Linux fs Btrfs

On Mon, Dec 18, 2017 at 1:49 AM, Anand Jain <anand.jain@oracle.com> wrote:

>  Agreed. IMO degraded-raid1-single-chunk is an accidental feature
>  caused by [1], which we should revert back, since..
>    - balance (to raid1 chunk) may fail if FS is near full
>    - recovery (to raid1 chunk) will take more writes as compared
>      to recovery under degraded raid1 chunks


The advantage of writing single chunks when degraded, is in the case
where a missing device returns (is readded, intact). Catching up that
device with the first drive, is a manual but simple invocation of
'btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft'   The
alternative is a full balance or full scrub. It's pretty tedious for
big arrays.

mdadm uses bitmap=internal for any array larger than 100GB for this
reason, avoiding full resync.

'btrfs sub find' will list all *added* files since an arbitrarily
specified generation; but not deletions.
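
For the record, the full invocations are roughly as follows (mount
point, device and generation number are placeholders):

    # the generation the stale device was left at can be read from
    # its superblock
    btrfs inspect-internal dump-super /dev/sdb | grep generation
    # list files added/changed since that generation
    btrfs subvolume find-new /mnt 123456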


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 22:28       ` Chris Murphy
@ 2017-12-18 22:29         ` Chris Murphy
  2017-12-19 12:30         ` Adam Borowski
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 61+ messages in thread
From: Chris Murphy @ 2017-12-18 22:29 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Anand Jain, Peter Grandi, Linux fs Btrfs

On Mon, Dec 18, 2017 at 3:28 PM, Chris Murphy <lists@colorremedies.com> wrote:
> On Mon, Dec 18, 2017 at 1:49 AM, Anand Jain <anand.jain@oracle.com> wrote:
>
>>  Agreed. IMO degraded-raid1-single-chunk is an accidental feature
>>  caused by [1], which we should revert back, since..
>>    - balance (to raid1 chunk) may fail if FS is near full
>>    - recovery (to raid1 chunk) will take more writes as compared
>>      to recovery under degraded raid1 chunks
>
>
> The advantage of writing single chunks when degraded, is in the case
> where a missing device returns (is readded, intact). Catching up that
> device with the first drive, is a manual but simple invocation of
> 'btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft'   The
> alternative is a full balance or full scrub. It's pretty tedious for
> big arrays.
>
> mdadm uses bitmap=internal for any array larger than 100GB for this
> reason, avoiding full resync.
>
> 'btrfs sub find' will list all *added* files since an arbitrarily
> specified generation; but not deletions.

Looks like LVM raid types (the non-legacy ones that use md driver)
also use a bitmap by default.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 19:43       ` Tomasz Pala
  2017-12-18 22:01         ` Peter Grandi
@ 2017-12-19 12:25         ` Austin S. Hemmelgarn
  2017-12-19 14:46           ` Tomasz Pala
  1 sibling, 1 reply; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-19 12:25 UTC (permalink / raw)
  To: Tomasz Pala, Linux fs Btrfs

On 2017-12-18 14:43, Tomasz Pala wrote:
> On Mon, Dec 18, 2017 at 08:06:57 -0500, Austin S. Hemmelgarn wrote:
> 
>> The fact is, the only cases where this is really an issue is if you've
>> either got intermittently bad hardware, or are dealing with external
> 
> Well, the RAID1+ is all about the failing hardware.
About catastrophically failing hardware, not intermittent failure.
> 
>> storage devices.  For the majority of people who are using multi-device
>> setups, the common case is internally connected fixed storage devices
>> with properly working hardware, and for that use case, it works
>> perfectly fine.
> 
> If you're talking about "RAID"-0 or storage pools (volume management)
> that is true.
> But if you imply, that RAID1+ "works perfectly fine as long as hardware
> works fine" this is fundamentally wrong. If the hardware needs to work
> properly for the RAID to work properly, noone would need this RAID in
> the first place.
I never said the hardware needed to not fail, just that it needed to 
fail in a consistent manner.  BTRFS handles catastrophic failures of 
storage devices just fine right now.  It has issues with intermittent 
failures, but so does hardware RAID, and so do MD and LVM to a lesser 
degree.
> 
>> that BTRFS should not care.  At the point at which a device is dropping
>> off the bus and reappearing with enough regularity for this to be an
>> issue, you have absolutely no idea how else it's corrupting your data,
>> and support of such a situation is beyond any filesystem (including ZFS).
> 
> Support for such situation is exactly what RAID performs. So don't blame
> people for expecting this to be handled as long as you call the
> filesystem feature a 'RAID'.
No, classical RAID (other than RAID0) is supposed to handle catastrophic 
failure of component devices.  That is the entirety of the original 
design purpose, and that is the entirety of what you should be using it 
for in production.  The point at which you are getting random corruption 
on a disk and you're using anything but BTRFS for replication, you 
_NEED_ to replace that disk, and if you don't you risk it causing 
corruption on the other disk.  As of right now, BTRFS is no different in 
that respect, but I agree that it _should_ be able to handle such a 
situation eventually.
> 
> If this feature is not going to mitigate hardware hiccups by design (as
> opposed to "not implemented yet, needs some time", which is perfectly
> understandable), just don't call it 'RAID'.
It shouldn't have been called RAID in the first place, that we can agree 
on (even if for different reasons).
> 
> All the features currently working, like bit-rot mitigation for
> duplicated data (dup/raid*) using checksums, are something different
> than RAID itself. RAID means "survive failure of N devices/controllers"
> - I got one "RAID1" stuck in r/o after degraded mount, not nice... Not
> _expected_ to happen after single disk failure (without any reappearing).
And that's a known bug on older kernels (not to mention that you should 
not be mounting writable and degraded for any purpose other than fixing 
the volume).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 22:28       ` Chris Murphy
  2017-12-18 22:29         ` Chris Murphy
@ 2017-12-19 12:30         ` Adam Borowski
  2017-12-19 12:54         ` Andrei Borzenkov
  2017-12-19 12:59         ` Peter Grandi
  3 siblings, 0 replies; 61+ messages in thread
From: Adam Borowski @ 2017-12-19 12:30 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Anand Jain, Peter Grandi, Linux fs Btrfs

On Mon, Dec 18, 2017 at 03:28:14PM -0700, Chris Murphy wrote:
> On Mon, Dec 18, 2017 at 1:49 AM, Anand Jain <anand.jain@oracle.com> wrote:
> >  Agreed. IMO degraded-raid1-single-chunk is an accidental feature
> >  caused by [1], which we should revert back, since..
> >    - balance (to raid1 chunk) may fail if FS is near full
> >    - recovery (to raid1 chunk) will take more writes as compared
> >      to recovery under degraded raid1 chunks
> 
> The advantage of writing single chunks when degraded, is in the case
> where a missing device returns (is readded, intact). Catching up that
> device with the first drive, is a manual but simple invocation of
> 'btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft'   The
> alternative is a full balance or full scrub. It's pretty tedious for
> big arrays.
> 
> mdadm uses bitmap=internal for any array larger than 100GB for this
> reason, avoiding full resync.
> 
> 'btrfs sub find' will list all *added* files since an arbitrarily
> specified generation; but not deletions.

This is fine, as scrub cares about extents, not files.  The newer generation
of metadata doesn't have a reference to the deleted extent anymore.

Selective scrub hasn't been implemented, but it should be pretty
straightforward -- unless nocow is involved.  Correct me if I'm wrong, but I
believe there's no way to tell which copy of a nocow extent is the good one.


Meow!
-- 
// If you believe in so-called "intellectual property", please immediately
// cease using counterfeit alphabets.  Instead, contact the nearest temple
// of Amon, whose priests will provide you with scribal services for all
// your writing needs, for Reasonable And Non-Discriminatory prices.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 22:01         ` Peter Grandi
@ 2017-12-19 12:46           ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-19 12:46 UTC (permalink / raw)
  To: Peter Grandi, Linux fs Btrfs

On 2017-12-18 17:01, Peter Grandi wrote:
>>> The fact is, the only cases where this is really an issue is
>>> if you've either got intermittently bad hardware, or are
>>> dealing with external
> 
>> Well, the RAID1+ is all about the failing hardware.
> 
>>> storage devices. For the majority of people who are using
>>> multi-device setups, the common case is internally connected
>>> fixed storage devices with properly working hardware, and for
>>> that use case, it works perfectly fine.
> 
>> If you're talking about "RAID"-0 or storage pools (volume
>> management) that is true. But if you imply, that RAID1+ "works
>> perfectly fine as long as hardware works fine" this is
>> fundamentally wrong.
> 
> I really agree with this, the argument about "properly working
> hardware" is utterly ridiculous. I'll to this: apparently I am
> not the first one to discover the "anomalies" in the "RAID"
> profiles, but I may have been the first to document some of
> them, e.g. the famous issues with the 'raid1' profile. How did I
> discover them? Well, I had used Btrfs in single device mode for
> a bit, and wanted to try multi-device, and the docs seemed
> "strange", so I did tests before trying it out.
> 
> The tests were simply on a spare PC with a bunch of old disks to
> create two block devices (partitions), put them in 'raid1' first
> natively, then by adding a new member to an existing partition,
> and then 'remove' one, or simply unplug it (actually 'echo 1 >
> /sys/block/.../device/delete') initially. I wanted to check
> exactly what happened, resync times, speed, behaviour and speed
> when degraded, just ordinary operational tasks.
> 
> Well I found significant problems after less than one hour. I
> can't imagine anyone with some experience of hw or sw RAID
> (especially hw RAID, as hw RAID firmware is often fantastically
> buggy especially as to RAID operations) that wouldn't have done
> the same tests before operational use, and would not have found
> the same issues too straight away. The only guess I could draw
> is that whover designed the "RAID" profile had zero operational
> system administration experience.
Or possibly that you didn't read the documentation thoroughly at all, 
which any reasonable system administrator would do before even starting 
to test something.  Unless you were doing stupid stuff like running for 
extended periods of time with half an array or not trying at all to 
repair things after the device reappeared, then none of what you 
described should have caused any issues.
> 
>> If the hardware needs to work properly for the RAID to work
>> properly, noone would need this RAID in the first place.
> 
> It is not just that, but some maintenance operations are needed
> even if the hardware works properly: for example preventive
> maintenance, replacing drives that are becoming too old,
> expanding capacity, testing periodically hardware bits. Systems
> engineers don't just say "it works, let's assume it continues to
> work properly, why worry".
Really?  So replacing hard drives just doesn't work on BTRFS?

Hmm...

Then that means that all the testing I do regularly of reshaping arrays 
and replacing devices that is consistently working (except for raid5 and 
raid6, but those have other issues too right now) must be a complete 
fluke.  I guess I have to go check my hardware and the QEMU sources to 
figure out how those are broken such that all of this is working 
successfully...

Seriously though, did you even _test_ replacing devices using the 
procedures described in the documentation, or did you just see that 
things didn't work in the couple of cases you thought were most 
important and assume nothing else worked?
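
For reference, the documented replacement path I'm referring to is
roughly (devid, device names and mount point are placeholders):

    # one-step replacement; the devid of a missing device can be used
    # as the source
    btrfs replace start 2 /dev/sdd /mnt
    btrfs replace status /mnt
    # or the older two-step variant
    btrfs device add /dev/sdd /mnt
    btrfs device remove missing /mnt
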
> 
> My impression is that multi-device and "chunks" were designed in
> one way by someone, and someone else did not understand the
> intent, and confused them with "RAID", and based the 'raid'
> profiles on that confusion. For example the 'raid10' profile
> seems the least confused to me, and that's I think because the
> "RAID" aspect is kept more distinct from the "multi-device"
> aspect. But perhaps I am an optimist...
The names were a stupid choice intended to convey the basic behavior in 
a way that idiots who have no business being sysadmins could understand 
(and yes, the raid1 profiles do behave as someone with a naive 
understanding of RAID1 as simple replication would expect). 
Unfortunately, we're stuck with them now, and there's no point in 
complaining beyond just acknowledging that the names were a poor choice.
> 
> To simplify a longer discussion to have "RAID" one needs an
> explicit design concept of "stripe", which in Btrfs needs to be
> quite different from that of "set of member devices" and
> "chunks", so that for example adding/removing to a "stripe" is
> not quite the same thing as adding/removing members to a volume,
> plus to make a distinction between online and offline members,
> not just added and removed ones, and well-defined state machine
> transitions (e.g. in response to hardware problems) among all
> those, like in MD RAID. But the importance of such distinctions
> may not be apparent to everybody.
Or maybe people are sensible and don't care about such distinctions as 
long as things work within the defined parameters?  It's only engineers 
and scientists that care about how and why (or stuffy bureaucrats who 
want control over things).  Regular users, and even some developers 
don't care about the exact implementation provided it works how they 
need it to work.
> 
[Obviously intentionally inflammatory comment removed]

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 22:28       ` Chris Murphy
  2017-12-18 22:29         ` Chris Murphy
  2017-12-19 12:30         ` Adam Borowski
@ 2017-12-19 12:54         ` Andrei Borzenkov
  2017-12-19 12:59         ` Peter Grandi
  3 siblings, 0 replies; 61+ messages in thread
From: Andrei Borzenkov @ 2017-12-19 12:54 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Anand Jain, Peter Grandi, Linux fs Btrfs

On Tue, Dec 19, 2017 at 1:28 AM, Chris Murphy <lists@colorremedies.com> wrote:
> On Mon, Dec 18, 2017 at 1:49 AM, Anand Jain <anand.jain@oracle.com> wrote:
>
>>  Agreed. IMO degraded-raid1-single-chunk is an accidental feature
>>  caused by [1], which we should revert back, since..
>>    - balance (to raid1 chunk) may fail if FS is near full
>>    - recovery (to raid1 chunk) will take more writes as compared
>>      to recovery under degraded raid1 chunks
>
>
> The advantage of writing single chunks when degraded, is in the case
> where a missing device returns (is readded, intact). Catching up that
> device with the first drive, is a manual but simple invocation of
> 'btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft'   The
> alternative is a full balance or full scrub. It's pretty tedious for
> big arrays.
>

The alternative would be to introduce new "resilver" operation that
would allocate second copy for every degraded chunk. And it could even
be started automatically when enough redundancy is present again.

> mdadm uses bitmap=internal for any array larger than 100GB for this
> reason, avoiding full resync.
>

ZFS manages to avoid full sync in this case quite efficiently.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 22:28       ` Chris Murphy
                           ` (2 preceding siblings ...)
  2017-12-19 12:54         ` Andrei Borzenkov
@ 2017-12-19 12:59         ` Peter Grandi
  3 siblings, 0 replies; 61+ messages in thread
From: Peter Grandi @ 2017-12-19 12:59 UTC (permalink / raw)
  To: Linux fs Btrfs

[ ... ]

> The advantage of writing single chunks when degraded, is in
> the case where a missing device returns (is readded,
> intact).  Catching up that device with the first drive, is a
> manual but simple invocation of 'btrfs balance start
> -dconvert=raid1,soft -mconvert=raid1,soft' The alternative is
> a full balance or full scrub. It's pretty tedious for big
> arrays.

That is merely an after-the-fact rationalization for a design
that is at the same time entirely logical and quite broken: that
the intended replication factor is the same as the current
number of members of the volume, so if a volume has (currently)
only one member, then only "single" chunks get created.

A design that would work better for operations would be to have
"profiles" to be a concept entirely independent of number of
members, or perhaps more precisely to have the "desired" profile
of a chunk be distinct from the "actual" profile (dependent on
the actual number of members of a volume) of that chunk, so that
if a volume has only one member chunks could be created that
have "desired" profile 'raid1' but "actual" profile 'single', or
perhaps more sensibly 'raid1-with-missing-mirror', with checks
that "actual" profile be usable else the volume is not
mountable.

Note: ideally every chunk would have both a static desired
profile and a desired stripe width, and a computed actual
profile and a actual stripe width. Or perhaps the desired
profile and width would be properties of the volume (for each of
the three types of data).

For example in MD RAID it is perfectly legitimate to create a
RAID6 set with "desired" width of 6 and "actual" width of 4 (in
which case it can be activated as degraded) or a RAID5 set with
"desired" width of 5 and actual width of 3 (in which case it
cannot be activated at all until at least another member is
added).

The difference with MD RAID is that in MD RAID there is (except
in one case , during conversion) an exact match between
"desired" profile stripe width and number of members, while at
least in principle a Btrfs volume can have any number of chunks
of any profile of any desired stripe size (except that current
implementation is not so flexible in most profiles).

That would require scanning all chunks to determine whether a
volume is mountable at all or mountable only as degraded, while
MD RAID can just count the members. Apparently recent versions
of the Btrfs 'raid1' profile do just that.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 12:25         ` Austin S. Hemmelgarn
@ 2017-12-19 14:46           ` Tomasz Pala
  2017-12-19 16:35             ` Austin S. Hemmelgarn
                               ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: Tomasz Pala @ 2017-12-19 14:46 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue, Dec 19, 2017 at 07:25:49 -0500, Austin S. Hemmelgarn wrote:

>> Well, the RAID1+ is all about the failing hardware.
> About catastrophically failing hardware, not intermittent failure.

It shouldn't matter - as long as a disk that fails once is kicked out of the
array *if possible*. Or reattached in write-only mode as a best effort,
meaning "will try to keep your *redundancy* copy, but won't trust it to
be read from".
As you can see, the "failure level handled" is a matter not of definition, but of implementation.

*if possible* == when there are other volume members having the same
data /or/ there are spare members that could take over the failing ones.

> I never said the hardware needed to not fail, just that it needed to 
> fail in a consistent manner.  BTRFS handles catastrophic failures of 
> storage devices just fine right now.  It has issues with intermittent 
> failures, but so does hardware RAID, and so do MD and LVM to a lesser 
> degree.

When planning hardware failovers/backups I can't predict the failing
pattern. So first of all - every *known* shortcoming should be
documented somehow. Secondly - permanent failures are not handled "just
fine", as there is (1) no automatic mount as degraded, so the machine
won't reboot properly, and (2) the r/w degraded mount is[*] a one-timer.
Again, this should be:
1. documented in manpage, as a comment to profiles, not wiki page or
linux-btrfs archives,
2. printed on screen when creating/converting "RAID1" profile (by btrfs tools),
3. blown into one's face when doing r/w degraded mount (by kernel).

[*] yes, I know the recent kernels handle this, but the last LTS (4.14)
is just too young.

I'm not aware of the issues with MD you're referring to - I got drives
kicked off many times and they *never* caused any problems despite
being visible in the system. Moreover, since 4.10 there is FAILFAST,
which would do this even faster. There is also no problem with mounting
a degraded MD array automatically, so saying that btrfs is doing "just
fine" is, well... not even theoretically close. And in my practice it
never saved the day, but has already ruined a few... It's not right for
the protection to cause more problems than it solves.

> No, classical RAID (other than RAID0) is supposed to handle catastrophic 
> failure of component devices.  That is the entirety of the original 
> design purpose, and that is the entirety of what you should be using it 
> for in production. 

1. no, it's not: https://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf

2. even if it were, a single I/O failure (e.g. one bad block) might
   be interpreted as "catastrophic", and then the entire drive would
   have to be kicked off.

3. if the sysadmin doesn't request any kind of device autobinding, the
device that has already failed doesn't matter anymore - regardless of
its current state or reappearances.

> The point at which you are getting random corruption 
> on a disk and you're using anything but BTRFS for replication, you 
> _NEED_ to replace that disk, and if you don't you risk it causing 
> corruption on the other disk. 

Not only BTRFS - there are hardware solutions like T10 PI/DIF.
Guess what a RAID controller should do in such a situation? Fail
the drive immediately after the first CRC mismatch?

BTW do you consider "random corruption" as a catastrophic failure?

> As of right now, BTRFS is no different in 
> that respect, but I agree that it _should_ be able to handle such a 
> situation eventually.

The first step should be to realize that there are some tunables
required if you want to handle many different situations.

Having said that, let's get back to reality:


The classical RAID is about keeping the system functional - trashing a
single drive from RAID1 should be fully-ignorable by sysadmin. The
system must reboot properly, work properly, and there MUST NOT be ANY
functional differences compared to non-degraded mode except for slower
read rate (and having no more redundancy obviously).


- not having this == not having RAID1.

> It shouldn't have been called RAID in the first place, that we can agree 
> on (even if for different reasons).

The misnaming would be much less of a problem if it were documented
properly (man page, btrfs-progs and finally kernel screaming).

>> - I got one "RAID1" stuck in r/o after degraded mount, not nice... Not
>> _expected_ to happen after single disk failure (without any reappearing).
> And that's a known bug on older kernels (not to mention that you should 
> not be mounting writable and degraded for any purpose other than fixing 
> the volume).

Yes, ...but:

1. "known" only to the people that already stepped into it, meaning too
   late - it should be "COMMONLY known", i.e. documented,
2. "older kernels" are not so old, the newest mature LTS (4.9) is still
   affected,
3. I was about to fix the volume when the machine accidentally rebooted.
   Which should have done no harm if I had a RAID1.
4. As already said before, using r/w degraded RAID1 is FULLY ACCEPTABLE,
   as long as you accept "no more redundancy"...
4a. ...or had an N-way mirror and there is still some redundancy if N>2.


Since we agree that btrfs RAID != common RAID, as there are/were
different design principles and some features are in WIP state at best,
the current behaviour should be better documented. That's it.


-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 14:46           ` Tomasz Pala
@ 2017-12-19 16:35             ` Austin S. Hemmelgarn
  2017-12-19 17:56               ` Tomasz Pala
  2017-12-19 18:31             ` George Mitchell
  2017-12-19 19:35             ` Chris Murphy
  2 siblings, 1 reply; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-19 16:35 UTC (permalink / raw)
  To: Tomasz Pala, Linux fs Btrfs

On 2017-12-19 09:46, Tomasz Pala wrote:
> On Tue, Dec 19, 2017 at 07:25:49 -0500, Austin S. Hemmelgarn wrote:
> 
>>> Well, the RAID1+ is all about the failing hardware.
>> About catastrophically failing hardware, not intermittent failure.
> 
> It shouldn't matter - as long as disk failing once is kicked out of the
> array *if possible*. Or reattached in write-only mode as a best effort,
> meaning "will try to keep your *redundancy* copy, but won't trust it to
> be read from".
> As you see, the "failure level handled" is not by definition, but by implementation.
> 
> *if possible* == when there are other volume members having the same
> data /or/ there are spare members that could take over the failing ones.
Actually, it very much does matter, at least with hardware RAID.  The 
exact failure mode that causes issues for BTRFS (intermittent 
disconnects at the bus level) causes just as many issues with most 
hardware RAID controllers (though the exact issues are not quite the 
same), and is in and of itself an indicator that something else is wrong.
> 
>> I never said the hardware needed to not fail, just that it needed to
>> fail in a consistent manner.  BTRFS handles catastrophic failures of
>> storage devices just fine right now.  It has issues with intermittent
>> failures, but so does hardware RAID, and so do MD and LVM to a lesser
>> degree.
> 
> When planning hardware failovers/backups I can't predict the failing
> pattern. So first of all - every *known* shortcoming should be
> documented somehow. Secondly - permanent failures are not handled "just
> fine", as there is (1) no automatic mount as degraded, so the machine
> won't reboot properly and (2) the r/w degraded mount is[*] one-timer.
> Again, this should be:
> 1. documented in manpage, as a comment to profiles, not wiki page or
> linux-btrfs archives,
Agreed, our documentation needs to be consolidated in general (I would 
absolutely love to see it just be the man pages, and have those up on 
the wiki like some other software does).
> 2. printed on screen when creating/converting "RAID1" profile (by btrfs tools),
I don't agree on this one.  It is in no way unreasonable to expect that 
someone has read the documentation _before_ trying to use something.
> 3. blown into one's face when doing r/w degraded mount (by kernel).
Agreed here though.
> 
> [*] yes, I know the recent kernels handle this, but the last LTS (4.14)
> is just too young.
4.14 should have gotten that patch last I checked.
> 
> I'm now aware of issues with MD you're referring to - I got drives
> kicked off many times and they were *never* causing any problems despite
> being visible in the system. Moreover, since 4.10 there is FAILFAST
> which would do this even faster. There is also no problem with mounting
> degraded MD array automatically, so telling that btrfs is doing "just
> fine" is, well... not even theoretically close. And in my practice it
> never saved the day, but already ruined a few ones... It's not right for
> the protection to make more problems than it solves.
Regarding handling of degraded mounts, BTRFS _is_ working just fine, we 
just chose a different default behavior from MD and LVM (we make certain 
the user knows about the issue without having to look through syslog).
> 
>> No, classical RAID (other than RAID0) is supposed to handle catastrophic
>> failure of component devices.  That is the entirety of the original
>> design purpose, and that is the entirety of what you should be using it
>> for in production.
> 
> 1. no, it's not: https://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf
OK, so I see here performance as a motivation, but listed secondarily to 
reliability, and all the discussion of reliability assumes that either:
1. Disks fail catastrophically.
or:
2. Disks return read or write errors when there is a problem.

Following just those constraints, RAID is not designed to handle devices 
that randomly drop off the bus and reappear or exhibit silent data 
corruption, so my original statement largely was accurate, the primary 
design intent was handling of catastrophic failures.
> 
> 2. even if there was, the single I/O failure (e.g. one bad block) might
>     be interpreted as "catastrophic" and the entire drive should be kicked off then.
This I will agree with, given that it's common behavior in many RAID 
implementations.  As people are quick to point out BTRFS _IS NOT_ RAID, 
the devs just made a poor choice in the original naming of the 2-way 
replication implementation, and it stuck.
> 
> 3. if sysadmin doesn't request any kind of device autobinding, the
> device that were already failed doesn't matter anymore - regardless of
> it's current state or reappearences.
You have to explicitly disable automatic binding of drivers to 
hot-plugged devices though, so that's rather irrelevant.  Yes, you can 
do so yourself if you want, and it will mitigate one of the issues with 
BTRFS to a limited degree (we still don't 'kick-out' old devices, even 
if we should).
> 
>> The point at which you are getting random corruption
>> on a disk and you're using anything but BTRFS for replication, you
>> _NEED_ to replace that disk, and if you don't you risk it causing
>> corruption on the other disk.
> 
> Not only BTRFS, there are hardware solutions like T10 PI/DIF.
> Guess what should RAID controller do in such situation? Fail
> drive immediately after the first CRC mismatch?
If it's more than single errors, yes, it should fail the drive.  If 
you're getting any kind of recurring corruption, it's time to replace 
the drive, whether the error gets corrected or not.
> 
> BTW do you consider "random corruption" as a catastrophic failure?
No, catastrophic failure in reference to hard drives is (usually) 
mechanical failure rendering the drive unusable (such as a head crash 
for example), or a complete controller failure (for example, the drive 
won't enumerate at all).

To use a (possibly strained) analogy:  Catastrophic failure is like a 
handgun blowing up when you try to fire it, you won't be able to use it 
ever again.  Random corruption is equivalent to not consistently feeding 
new rounds from the magazine properly, it still technically works, and 
can (theoretically) be fixed, but it's usually just simpler (and 
significantly safer) to replace the gun than it is to try and jury rig 
things so that it works reliably.
> 
>> As of right now, BTRFS is no different in
>> that respect, but I agree that it _should_ be able to handle such a
>> situation eventually.
> 
> The first step should be to realize, that there are some tunables
> required if you want to handle many different situation.
> 
> Having said that, let's back to reallity:
> 
> 
> The classical RAID is about keeping the system functional - trashing a
> single drive from RAID1 should be fully-ignorable by sysadmin. The
> system must reboot properly, work properly and there MUST NOT by ANY
> functional differences compared to non-degraded mode except for slower
> read rate (and having no more redundancy obviously).
'No functional differences' isn't even a standard that MD or LVM 
achieve, and it's definitely not one that most hardware RAID controllers 
have.
> 
> - not having this == not having RAID1.
Again, BTRFS _IS NOT_ RAID.
> 
>> It shouldn't have been called RAID in the first place, that we can agree
>> on (even if for different reasons).
> 
> The misnaming would be much less of a problem if it were documented
> properly (man page, btrfs-progs and finally kernel screaming).
Yes, our documentation could be significantly better.
> 
>>> - I got one "RAID1" stuck in r/o after degraded mount, not nice... Not
>>> _expected_ to happen after single disk failure (without any reappearing).
>> And that's a known bug on older kernels (not to mention that you should
>> not be mounting writable and degraded for any purpose other than fixing
>> the volume).
> 
> Yes, ...but:
> 
> 1. "known" only to the people that already stepped into it, meaning too
>     late - it should be "COMMONLY known", i.e. documented,
And also known to people who have done proper research.

> 2. "older kernels" are not so old, the newest mature LTS (4.9) is still
>     affected,
I really don't see this as a valid excuse.  It's pretty well documented 
that you absolutely should be running the most recent kernel if you're 
using BTRFS.

> 3. I was about to fix the volume, accidentally the machine has rebooted.
>     Which should do no harm if I had a RAID1.
Agreed.

> 4. As already said before, using r/w degraded RAID1 is FULLY ACCEPTABLE,
>     as long as you accept "no more redundancy"...
This is a matter of opinion.  I still contend that running half a two 
device array for an extended period of time without reshaping it to be a 
single device is a bad idea for cases other than BTRFS.  The fewer 
layers of code you're going through, the safer you are.

> 4a. ...or had an N-way mirror and there is still some redundancy if N>2.
N-way mirroring is still on the list of things to implement, believe me, 
many people want it.
> 
> 
> Since we agree, that btrfs RAID != common RAID, as there are/were
> different design principles and some features are in WIP state at best,
> the current behaviour should be better documented. That's it.
Patches would be gratefully accepted.  It's really not hard to update 
the documentation, it's just that nobody has had the time to do it.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 16:35             ` Austin S. Hemmelgarn
@ 2017-12-19 17:56               ` Tomasz Pala
  2017-12-19 19:47                 ` Chris Murphy
  2017-12-19 20:11                 ` Austin S. Hemmelgarn
  0 siblings, 2 replies; 61+ messages in thread
From: Tomasz Pala @ 2017-12-19 17:56 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue, Dec 19, 2017 at 11:35:02 -0500, Austin S. Hemmelgarn wrote:

>> 2. printed on screen when creating/converting "RAID1" profile (by btrfs tools),
> I don't agree on this one.  It is in no way unreasonable to expect that 
> someone has read the documentation _before_ trying to use something.

Provided there are:
- decent documentation AND
- an appropriate[*] level of "common knowledge" AND
- stable behaviour and mature code (kernel, tools etc.)

BTRFS lacks all of these - there are major functional changes in
current kernels, and it reaches far beyond LTS. All the knowledge YOU
have here, on this mailing list, should be 'engraved' into btrfs-progs,
as there are people still using kernels with serious malfunctions.
btrfs-progs could easily check the kernel version and print an
appropriate warning - consider this a "software quirk".

[*] by 'appropriate' I mean knowledge so common, as the real word usage
itself.

Moreover, the fact that I've read the documentation and did
comprehensive[**] research today doesn't mean I should have to do it
again after a kernel change, for example.

[**] apparently what I thought was comprehensive wasn't at all. Most of
the btrfs quirks I've found HERE. As a regular user, not an fs
developer, I shouldn't even be looking at this list.

BTW, doesn't SuSE use btrfs by default? Would you expect everyone using
this distro to research every component used?

>> [*] yes, I know the recent kernels handle this, but the last LTS (4.14)
>> is just too young.
> 4.14 should have gotten that patch last I checked.

I meant too young to be widely adopted yet. This requires some
countermeasures in the toolkit that is easier to upgrade, like userspace.

> Regarding handling of degraded mounts, BTRFS _is_ working just fine, we 
> just chose a different default behavior from MD and LVM (we make certain 
> the user knows about the issue without having to look through syslog).

I'm not arguing about the behaviour - apparently there were some
technical reasons. But IF the reasons are not technical, but
philosophical, I'd like to have either mount option (allow_degraded) or
even kernel-level configuration knob for this to happen RAID-style.

Now, if current kernels won't force a degraded RAID1 to ro, can I
safely add "degraded" to the mount options? My primary concern is
machine UPTIME. I care less about the data, as they are backed up to
some remote location and losing a day or a week of changes is
acceptable, split-brain as well, while every hour of downtime costs me
real money.
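
(Purely to illustrate what I'm asking about - a hypothetical fstab
line, with a placeholder UUID:

    UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /srv  btrfs  defaults,degraded  0  0

- whether doing that permanently is wise is exactly my question.)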


Meanwhile I can't fix a broken server using 'remote hands' - mounting a
degraded volume means using a physical keyboard or KVM, which might not
be available at a site. Current btrfs behaviour requires physical
presence AND downtime (if a machine rebooted) for fixing things that
could be fixed remotely and on-line.

Anyway, users shouldn't have to look through syslog; device status
should be reported by some monitoring tool.

A deviation this big (relative to common RAID1 scenarios) deserves to
be documented. Or renamed...

> reliability, and all the discussion of reliability assumes that either:
> 1. Disks fail catastrophically.
> or:
> 2. Disks return read or write errors when there is a problem.
> 
> Following just those constraints, RAID is not designed to handle devices 
> that randomly drop off the bus and reappear

If it drops, there would be I/O errors eventually. Without the errors - agreed.

> implementations.  As people are quick to point out BTRFS _IS NOT_ RAID, 
> the devs just made a poor choice in the original naming of the 2-way 
> replication implementation, and it stuck.

Well, the question is: either it is not raid YET, or maybe it's time to consider renaming?

>> 3. if sysadmin doesn't request any kind of device autobinding, the
>> device that were already failed doesn't matter anymore - regardless of
>> it's current state or reappearences.
> You have to explicitly disable automatic binding of drivers to 
> hot-plugged devices though, so that's rather irrelevant.  Yes, you can 

Ha! I got this disabled on every bus (although for different reasons)
after boot completes. Lucky me:)

>> 1. "known" only to the people that already stepped into it, meaning too
>>     late - it should be "COMMONLY known", i.e. documented,
> And also known to people who have done proper research.

All the OpenSUSE userbase? ;)

>> 2. "older kernels" are not so old, the newest mature LTS (4.9) is still
>>     affected,
> I really don't see this as a valid excuse.  It's pretty well documented 
> that you absolutely should be running the most recent kernel if you're 
> using BTRFS.

Good point.

>> 4. As already said before, using r/w degraded RAID1 is FULLY ACCEPTABLE,
>>     as long as you accept "no more redundancy"...
> This is a matter of opinion.

Sure! And the particular opinion depends on the system being affected.
I'd rather not have any split-brain scenario under my database servers,
but also won't mind data loss on a BGP router as long as it keeps
running and is fully operational.

> I still contend that running half a two 
> device array for an extended period of time without reshaping it to be a 
> single device is a bad idea for cases other than BTRFS.  The fewer 
> layers of code you're going through, the safer you are.

I create a single-device degraded MD RAID1 when I attach one disk for
deployment (usually test machines) which is going to be converted into
a dual-disk (production) setup in the future - attaching a second disk
to the array is much easier and faster than messing with device nodes
(or labels or anything). The same applies to LVM; it's better to have
it even when not used at the moment. In the case of btrfs there is no
need for such preparations, as devices are added without renaming.
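
(The kind of invocation I mean, with placeholder device names:

    # create a raid1 with one real member and one intentionally
    # missing slot
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda2 missing
    # later, when the second (production) disk is attached
    mdadm /dev/md0 --add /dev/sdb2

so the array is already in place and only needs the second member.)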

However, sometimes the systems end up without a second disk attached -
either because of their low importance, sometimes because of power
usage; others need to be quiet.


One might ask why I don't attach the second disk before initial system
creation - the answer is simple: I usually use the same drive models in
RAID1, but it happens that drives bought from the same production lot
fail simultaneously, so this approach mitigates the problem and gives
more time to react.

> Patches would be gratefully accepted.  It's really not hard to update 
> the documentation, it's just that nobody has had the time to do it.

Writing accurate documentation requires a deep understanding of the
internals. Me, for example - I know some of the results: "don't do
this", "if X happens, Y should be done", "Z doesn't work yet, but there
were some patches", "V was fixed in some recent kernel, but no idea
which commit it was exactly", "W was severely broken in kernel I.J.K"
etc. Not the hard data that could be posted without creating the
impression that it's all about compiling a list of complaints. Not to
mention I'm absolutely not familiar with current patches, WIP and many
other corner cases or usage scenarios. In fact, not only the internals,
but also the motivation and design principles must be well understood
to write a piece of documentation.

Otherwise some "fake news" propaganda is being created, just like
https://suckless.org/sucks/systemd or other systemd-haters that haven't
spent a day in their life for writing SysV init scripts or managing a
bunch of mission critical machines with handcrafted supervisors.

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 14:46           ` Tomasz Pala
  2017-12-19 16:35             ` Austin S. Hemmelgarn
@ 2017-12-19 18:31             ` George Mitchell
  2017-12-19 20:28               ` Tomasz Pala
  2017-12-19 19:35             ` Chris Murphy
  2 siblings, 1 reply; 61+ messages in thread
From: George Mitchell @ 2017-12-19 18:31 UTC (permalink / raw)
  To: linux-btrfs

On 12/19/2017 06:46 AM, Tomasz Pala wrote:
> On Tue, Dec 19, 2017 at 07:25:49 -0500, Austin S. Hemmelgarn wrote:
>
>>> Well, the RAID1+ is all about the failing hardware.
>> About catastrophically failing hardware, not intermittent failure.
> It shouldn't matter - as long as disk failing once is kicked out of the
> array *if possible*. Or reattached in write-only mode as a best effort,
> meaning "will try to keep your *redundancy* copy, but won't trust it to
> be read from".
> As you see, the "failure level handled" is not by definition, but by implementation.
>
> *if possible* == when there are other volume members having the same
> data /or/ there are spare members that could take over the failing ones.
>
>> I never said the hardware needed to not fail, just that it needed to
>> fail in a consistent manner.  BTRFS handles catastrophic failures of
>> storage devices just fine right now.  It has issues with intermittent
>> failures, but so does hardware RAID, and so do MD and LVM to a lesser
>> degree.
> When planning hardware failovers/backups I can't predict the failing
> pattern. So first of all - every *known* shortcoming should be
> documented somehow. Secondly - permanent failures are not handled "just
> fine", as there is (1) no automatic mount as degraded, so the machine
> won't reboot properly and (2) the r/w degraded mount is[*] one-timer.
> Again, this should be:
> 1. documented in manpage, as a comment to profiles, not wiki page or
> linux-btrfs archives,
> 2. printed on screen when creating/converting "RAID1" profile (by btrfs tools),
> 3. blown into one's face when doing r/w degraded mount (by kernel).
>
> [*] yes, I know the recent kernels handle this, but the last LTS (4.14)
> is just too young.
>
> I'm now aware of issues with MD you're referring to - I got drives
> kicked off many times and they were *never* causing any problems despite
> being visible in the system. Moreover, since 4.10 there is FAILFAST
> which would do this even faster. There is also no problem with mounting
> degraded MD array automatically, so telling that btrfs is doing "just
> fine" is, well... not even theoretically close. And in my practice it
> never saved the day, but already ruined a few ones... It's not right for
> the protection to make more problems than it solves.
>
>> No, classical RAID (other than RAID0) is supposed to handle catastrophic
>> failure of component devices.  That is the entirety of the original
>> design purpose, and that is the entirety of what you should be using it
>> for in production.
> 1. no, it's not: https://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf
>
> 2. even if there was, the single I/O failure (e.g. one bad block) might
>     be interpreted as "catastrophic" and the entire drive should be kicked off then.
>
> 3. if sysadmin doesn't request any kind of device autobinding, the
> device that were already failed doesn't matter anymore - regardless of
> it's current state or reappearences.
>
>> The point at which you are getting random corruption
>> on a disk and you're using anything but BTRFS for replication, you
>> _NEED_ to replace that disk, and if you don't you risk it causing
>> corruption on the other disk.
> Not only BTRFS, there are hardware solutions like T10 PI/DIF.
> Guess what should RAID controller do in such situation? Fail
> drive immediately after the first CRC mismatch?
>
> BTW do you consider "random corruption" as a catastrophic failure?
>
>> As of right now, BTRFS is no different in
>> that respect, but I agree that it _should_ be able to handle such a
>> situation eventually.
> The first step should be to realize, that there are some tunables
> required if you want to handle many different situation.
>
> Having said that, let's back to reallity:
>
>
> The classical RAID is about keeping the system functional - trashing a
> single drive from RAID1 should be fully-ignorable by sysadmin. The
> system must reboot properly, work properly and there MUST NOT by ANY
> functional differences compared to non-degraded mode except for slower
> read rate (and having no more redundancy obviously).
>
>
> - not having this == not having RAID1.
>
>> It shouldn't have been called RAID in the first place, that we can agree
>> on (even if for different reasons).
> The misnaming would be much less of a problem if it were documented
> properly (man page, btrfs-progs and finally kernel screaming).
>
>>> - I got one "RAID1" stuck in r/o after degraded mount, not nice... Not
>>> _expected_ to happen after single disk failure (without any reappearing).
>> And that's a known bug on older kernels (not to mention that you should
>> not be mounting writable and degraded for any purpose other than fixing
>> the volume).
> Yes, ...but:
>
> 1. "known" only to the people that already stepped into it, meaning too
>     late - it should be "COMMONLY known", i.e. documented,
> 2. "older kernels" are not so old, the newest mature LTS (4.9) is still
>     affected,
> 3. I was about to fix the volume, accidentally the machine has rebooted.
>     Which should do no harm if I had a RAID1.
> 4. As already said before, using r/w degraded RAID1 is FULLY ACCEPTABLE,
>     as long as you accept "no more redundancy"...
> 4a. ...or had an N-way mirror and there is still some redundancy if N>2.
>
>
> Since we agree, that btrfs RAID != common RAID, as there are/were
> different design principles and some features are in WIP state at best,
> the current behaviour should be better documented. That's it.
>
>
I have significant experience as a user of raid1. I spent years using 
software raid1 and then more years using hardware (3ware) raid1 and now 
around 3 years using btrfs raid1. I have not found btrfs raid1 to be 
less reliable than any of the previous implementations of raid.  I have 
found that any implementation of raid whether it be software, hardware, 
or filesystem, is not infallible.  I have also found that when you have 
a failure, you don't just plug things back in and expect it to be fixed 
without seriously investigating what has gone wrong and potential 
unexpected consequences.  I have found that even with hardware raid you 
can find ways to screw things up to the point that you lose your data.  
I have had situations where I reconnected a drive on hardware raid1 only 
to find that the array would not sync and from there on I ended up 
having to directly attach one of the drives and recover the partition 
table with TestDisk in order to regain access to my data.  So NO FORM 
of raid is a replacement for backups and NO FORM of raid is a 
replacement for due diligence in recovery from failure mode.  Raid gives 
you a second chance when things go wrong, it does not make failures 
transparent which is seemingly what we sometimes expect from raid.  And 
I doubt that we will ever achieve that goal no matter how much effort we 
put into making it happen. Even with hardware raid things can happen 
that were not foreseen by the designers.  So I think we have to be 
careful when we compare various raid (or "raid like") implementations.  
There is no such thing as "fool proof" raid and likely never will be. 
And with that I will end my rant.




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 14:46           ` Tomasz Pala
  2017-12-19 16:35             ` Austin S. Hemmelgarn
  2017-12-19 18:31             ` George Mitchell
@ 2017-12-19 19:35             ` Chris Murphy
  2017-12-19 20:41               ` Tomasz Pala
  2 siblings, 1 reply; 61+ messages in thread
From: Chris Murphy @ 2017-12-19 19:35 UTC (permalink / raw)
  To: Tomasz Pala; +Cc: Linux fs Btrfs

On Tue, Dec 19, 2017 at 7:46 AM, Tomasz Pala <gotar@polanet.pl> wrote:

>Secondly - permanent failures are not handled "just
> fine", as there is (1) no automatic mount as degraded, so the machine
> won't reboot properly and (2) the r/w degraded mount is[*] one-timer.
> Again, this should be:

One of the reasons for problem 1 is problem 2. If we had automatic
degraded mount, people would run into problem 2 and now they're stuck
with a read only file system. Another reason is the kernel code and
udev rule for device "readiness" means the volume is not "ready" until
all member devices are present. And while the volume is not "ready"
systemd will not even attempt to mount. Solving this requires kernel
and udev work, or possibly a helper, to wait an appropriate amount of
time. I also think it's a bad idea to implement automatic degraded
mounts unless there's an API for user space to receive either a push
or request notification for degradedness state, so desktop
environments can inform the user of degradedness.

There is no amount of documentation that makes up for these
deficiencies enough to enable automatic degraded mounts by default. I
would consider it a high order betrayal of user trust to do it.



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 17:56               ` Tomasz Pala
@ 2017-12-19 19:47                 ` Chris Murphy
  2017-12-19 21:17                   ` Tomasz Pala
  2017-12-20 16:53                   ` Andrei Borzenkov
  2017-12-19 20:11                 ` Austin S. Hemmelgarn
  1 sibling, 2 replies; 61+ messages in thread
From: Chris Murphy @ 2017-12-19 19:47 UTC (permalink / raw)
  To: Tomasz Pala; +Cc: Linux fs Btrfs

On Tue, Dec 19, 2017 at 10:56 AM, Tomasz Pala <gotar@polanet.pl> wrote:
> On Tue, Dec 19, 2017 at 11:35:02 -0500, Austin S. Hemmelgarn wrote:
>
>>> 2. printed on screen when creating/converting "RAID1" profile (by btrfs tools),
>> I don't agree on this one.  It is in no way unreasonable to expect that
>> someone has read the documentation _before_ trying to use something.
>
> Provided there are:
> - a decent documentation AND
> - appropriate[*] level of "common knowledge" AND
> - stable behaviour and mature code (kernel, tools etc.)
>
> BTRFS lacks all of these - there are major functional changes in current
> kernels and it reaches far beyond LTS. All the knowledge YOU have here,
> on this maillist, should be 'engraved' into btrfs-progs, as there are
> people still using kernels with serious malfunctions. btrfs-progs could
> easily check kernel version and print appropriate warning - consider
> this a "software quirks".

The more verbose man pages are, the more likely it is that information
gets stale. We already see this with the Btrfs Wiki. So are you
volunteering to do the btrfs-progs work to easily check kernel
versions and print appropriate warnings? Or is this a case of
complaining about what other people aren't doing with their time?

>
> BTW, doesn't SuSE use btrfs by default? Would you expect everyone using
> this distro to research every component used?

As far as I'm aware, only the Btrfs single-device stuff is "supported".
The multiple-device stuff is definitely not supported on openSUSE, but
I have no idea to what degree they support it with an enterprise
license; no doubt that support must come with caveats.


>
>>> [*] yes, I know the recent kernels handle this, but the last LTS (4.14)
>>> is just too young.
>> 4.14 should have gotten that patch last I checked.
>
> I meant too young to be widely adopted yet. This requires some
> countermeasures in the toolkit that is easier to upgrade, like userspace.
>
>> Regarding handling of degraded mounts, BTRFS _is_ working just fine, we
>> just chose a different default behavior from MD and LVM (we make certain
>> the user knows about the issue without having to look through syslog).
>
> I'm not arguing about the behaviour - apparently there were some
> technical reasons. But IF the reasons are not technical, but
> philosophical, I'd like to have either mount option (allow_degraded) or
> even kernel-level configuration knob for this to happen RAID-style.


They are technical, which then runs into the philosophical. Giving
users a hurt me button is not ethical programming.


>
> Now, if the current kernels won't toggle degraded RAID1 as ro, can I
> safely add "degraded" to the mount options? My primary concern is the
> machine UPTIME. I care less about the data, as they are backed up to
> some remote location and loosing day or week of changes is acceptable,
> brain-split as well, while every hour of downtime costs me a real money.

Btrfs simply is not ready for this use case. If you need to depend on
degraded raid1 booting, you need to use mdadm or LVM or hardware raid.
Complaining about the lack of maturity in this area? Get in line. Or
propose a design and scope of work that needs to be completed to
enable it.



> Meanwhile I can't fix broken server using 'remote hands' - mounting degraded
> volume means using physical keyboard or KVM which might be not available
> at a site. Current btrfs behavious requires physical presence AND downtime
> (if a machine rebooted) for fixing things, that could be fixed remotely
> an on-line.

Right. It's not ready for this use case. Complaining about this fact
isn't going to make it ready for this use case. What will make it
ready for the use case is a design, a lot of work, and testing.



> Anyway, users shouldn't look through syslog, device status should be
> reported by some monitoring tool.

Yes. And it doesn't exist yet.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 17:56               ` Tomasz Pala
  2017-12-19 19:47                 ` Chris Murphy
@ 2017-12-19 20:11                 ` Austin S. Hemmelgarn
  2017-12-19 21:58                   ` Tomasz Pala
  2017-12-19 23:53                   ` Chris Murphy
  1 sibling, 2 replies; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-19 20:11 UTC (permalink / raw)
  To: Tomasz Pala, Linux fs Btrfs

On 2017-12-19 12:56, Tomasz Pala wrote:
> On Tue, Dec 19, 2017 at 11:35:02 -0500, Austin S. Hemmelgarn wrote:
> 
>>> 2. printed on screen when creating/converting "RAID1" profile (by btrfs tools),
>> I don't agree on this one.  It is in no way unreasonable to expect that
>> someone has read the documentation _before_ trying to use something.
> 
> Provided there are:
> - a decent documentation AND
> - appropriate[*] level of "common knowledge" AND
> - stable behaviour and mature code (kernel, tools etc.)
> 
> BTRFS lacks all of these - there are major functional changes in current
> kernels and it reaches far beyond LTS. All the knowledge YOU have here,
> on this maillist, should be 'engraved' into btrfs-progs, as there are
> people still using kernels with serious malfunctions. btrfs-progs could
> easily check kernel version and print appropriate warning - consider
> this a "software quirks".
Except the systems running on those ancient kernel versions are not 
necessarily using a recent version of btrfs-progs.

It might be possible to write up a script to check the kernel version 
and report known issues with it, but I don't think having it tightly 
integrated will be much help, at least not for quite some time.
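
A very rough sketch of what such a check could look like, purely as an 
illustration (the 4.14 baseline below is an arbitrary example, not an 
official cut-off):

  #!/bin/sh
  # warn when the running kernel is older than a chosen baseline
  min_major=4 min_minor=14
  kver=$(uname -r)
  major=${kver%%.*}; rest=${kver#*.}; minor=${rest%%.*}
  if [ "$major" -lt "$min_major" ] || \
     { [ "$major" -eq "$min_major" ] && [ "$minor" -lt "$min_minor" ]; }; then
      echo "WARNING: kernel $kver predates $min_major.$min_minor;" \
           "some known btrfs multi-device fixes may be missing" >&2
  fi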
> 
> [*] by 'appropriate' I mean knowledge so common, as the real word usage
> itself.
> 
> Moreover, the fact that I've read the documentation and did a
> comprehensive[**] reseach today, doesn't mean I should do this again
> after kernel change for example.
> 
> [**] apparently what I thought was comprehensive, wasn't at all. Most of
> the btrfs quirks I've found HERE. As a regular user, not fs developer, I
> shouldn't be even looking at this list.
That last bit is debatable.  BTRFS doesn't have separate developer and 
user lists, so this list serves both purposes (though IRC also serves 
some of the function of a user list).  I'll agree that searching the 
archives shouldn't be needed to get a baseline of knowledge.
> 
> BTW, doesn't SuSE use btrfs by default? Would you expect everyone using
> this distro to research every component used?
SuSE also provides very good support by themselves.
> 
>>> [*] yes, I know the recent kernels handle this, but the last LTS (4.14)
>>> is just too young.
>> 4.14 should have gotten that patch last I checked.
> 
> I meant too young to be widely adopted yet. This requires some
> countermeasures in the toolkit that is easier to upgrade, like userspace.
So in other words, spend the time to write up code for btrfs-progs that 
will then be run by a significant minority of users because people using 
old kernels usually use old userspace, and people using new kernels 
won't have to care, instead of working on other bugs that are still 
affecting people?
> 
>> Regarding handling of degraded mounts, BTRFS _is_ working just fine, we
>> just chose a different default behavior from MD and LVM (we make certain
>> the user knows about the issue without having to look through syslog).
> 
> I'm not arguing about the behaviour - apparently there were some
> technical reasons. But IF the reasons are not technical, but
> philosophical, I'd like to have either mount option (allow_degraded) or
> even kernel-level configuration knob for this to happen RAID-style.
> 
> Now, if the current kernels won't toggle degraded RAID1 as ro, can I
> safely add "degraded" to the mount options? My primary concern is the
> machine UPTIME. I care less about the data, as they are backed up to
> some remote location and loosing day or week of changes is acceptable,
> brain-split as well, while every hour of downtime costs me a real money.
In which case you shouldn't be relying on _ANY_ kind of RAID by itself, 
let alone BTRFS.  If you care that much about uptime, you should be 
investing in a HA setup and going from there.  If downtime costs you 
money, you need to be accounting for kernel updates and similar things, 
and therefore should have things set up such that you can reboot a 
system with no issues.
> 
> 
> Meanwhile I can't fix broken server using 'remote hands' - mounting degraded
> volume means using physical keyboard or KVM which might be not available
> at a site. Current btrfs behavious requires physical presence AND downtime
> (if a machine rebooted) for fixing things, that could be fixed remotely
> an on-line.
Assuming you have a sensibly designed system and are able to do remote 
management, physical presence should only be required for handling an 
issue with the root filesystem, and downtime should only be needed long 
enough to get the other filesystems into a sensible enough state that 
you can repair them the rest of the way online.  There's not really 
anything you can do about the root filesystem, but sensible organization 
of application data can mitigate the issues for other filesystems.
> 
> Anyway, users shouldn't look through syslog, device status should be
> reported by some monitoring tool.
This is a common complaint, and based on developer response, I think the 
consensus is that it's out of scope for the time being.  There have been 
some people starting work on such things, but nobody really got anywhere 
because most of the users who care enough about monitoring to be 
interested are already using some external monitoring tool that it's 
easy to hook into.

TBH, you essentially need external monitoring in most RAID situations 
anyway unless you've got some pre-built purpose specific system that 
already includes it (see FreeNAS for an example).
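
For what it's worth, such a hook can be as small as a cron job wrapped 
around 'btrfs device stats'; the mountpoint and recipient below are 
made-up examples, not a recommendation:

  #!/bin/sh
  # mail any non-zero btrfs device error counters (illustrative only)
  errs=$(btrfs device stats /mnt/data | awk '$2 != 0')
  [ -n "$errs" ] && echo "$errs" | mail -s "btrfs errors on $(hostname)" root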
> 
> Deviation so big (respectively to common RAID1 scenarios) deserves being documented.
> Or renamed...
Really? Some examples of where MD and LVM provide direct monitoring 
without needing third-party software, please.  LVM technically has the 
ability to handle it through dmeventd, but it's decidedly non-trivial to 
monitor state with that directly, and as a result almost everyone uses 
third-party software there.  MD I don't have as much background with (I 
prefer the flexibility LVM offers), but anything I've seen regarding it 
requires manual setup of some external software as well.
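
Even on the MD side, the usual answer is a separately configured monitor 
daemon, something along these lines (the address and paths are examples):

  # /etc/mdadm.conf (location varies by distro)
  MAILADDR root@example.com

  # started from an init script or service unit
  mdadm --monitor --scan --daemonise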
> 
>> reliability, and all the discussion of reliability assumes that either:
>> 1. Disks fail catastrophically.
>> or:
>> 2. Disks return read or write errors when there is a problem.
>>
>> Following just those constraints, RAID is not designed to handle devices
>> that randomly drop off the bus and reappear
> 
> If it drops, there would be I/O errors eventually. Without the errors - agreed.
Classical hardware RAID will kick the device when it drops and never 
re-add it automatically, which is functionally what BTRFS does too.  The 
only difference is how they then treat the 'failed' disk: hardware RAID 
will stop using it, while BTRFS will keep trying to use it.
> 
>> implementations.  As people are quick to point out BTRFS _IS NOT_ RAID,
>> the devs just made a poor choice in the original naming of the 2-way
>> replication implementation, and it stuck.
> 
> Well, the question is: either it is not raid YET, or maybe it's time to consider renaming?
Again, the naming is too ingrained.  At a minimum, you will have to keep 
the old naming, and at that point you're just wasting time and making 
things _more_ confusing because some documentation will use the old 
naming and some will use the new (keep in mind that third-party 
documentation rarely gets updated).
> 
>>> 3. if the sysadmin doesn't request any kind of device autobinding, a
>>>      device that has already failed doesn't matter anymore - regardless of
>>>      its current state or reappearances.
>> You have to explicitly disable automatic binding of drivers to
>> hot-plugged devices though, so that's rather irrelevant.  Yes, you can
> 
> Ha! I got this disabled on every bus (although for different reasons)
> after boot completes. Lucky me:)
Security I'm guessing (my laptop behaves like that for USB devices for 
that exact reason)?  It's a viable option on systems that are tightly 
controlled.  Once you look at consumer devices though, it's just 
impractical.  People expect hardware to just work when they plug it in 
these days.
> 
>>> 1. "known" only to the people that already stepped into it, meaning too
>>>      late - it should be "COMMONLY known", i.e. documented,
>> And also known to people who have done proper research.
> 
> All the OpenSUSE userbase? ;)
I don't think you quite understand what the SuSE business model is. 
SuSE does the research, and then provides support for customers so they 
don't have to.  Red Hat has a similar model.  Most normal distros 
however, do not, and those people using them need to be doing proper 
research.
> 
>>> 2. "older kernels" are not so old, the newest mature LTS (4.9) is still
>>>      affected,
>> I really don't see this as a valid excuse.  It's pretty well documented
>> that you absolutely should be running the most recent kernel if you're
>> using BTRFS.
> 
> Good point.
> 
>>> 4. As already said before, using r/w degraded RAID1 is FULLY ACCEPTABLE,
>>>      as long as you accept "no more redundancy"...
>> This is a matter of opinion.
> 
> Sure! And the particular opinion depends on system being affected. I'd
> rather not have any brain-split scenario under my database servers, but
> also won't mind data loss on BGP router as long as it keeps running and
> is fully operational.
> 
>> I still contend that running half a two
>> device array for an extended period of time without reshaping it to be a
>> single device is a bad idea for cases other than BTRFS.  The fewer
>> layers of code you're going through, the safer you are.
> 
> I create single-device degraded MD RAID1 when I attach one disk for
> deployment (usually test machines), which are going to be converted into
> dual (production) in a future - attaching second disk to array is much
> easier and faster than messing with device nodes (or labels or
> anything). The same applies to LVM, it's better to have it even when not
> used at a moment. In case of btrfs there is no need for such
> preparations, as the devices are added without renaming.
Unless you're pulling some complex black magic, you're not running 
degraded, you're running both in single device mode (which is not the 
same as a degraded two device RAID1 array) and converting to two device 
RAID1 later, which is a perfectly normal use case I have absolutely no 
issues with.
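
For reference, the BTRFS side of that workflow is just two commands once 
the second disk shows up (device name and mountpoint are hypothetical):

  # grow a single-device filesystem to two devices, then convert profiles
  btrfs device add /dev/sdb /mnt
  btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt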
> 
> However, sometimes the systems end up without second disk attached.
> Either due to their low importance, sometimes power usage, others
> need to be quiet.
> 
> One might ask, why don't I attach second disk before initial system
> creation - the answer is simple: I usually use the same drive models in
> RAID1, but it happens that drives bought from the same production lot
> fail simultaneously, so this approach mitigates the problem and gives
> more time to react.
You appear to be misunderstanding me here.  I'm not saying I think 
running with a single disk is bad, I'm saying that I feel that running 
with a single disk and not telling the storage stack that the other one 
isn't coming back any time soon is bad.

IOW, if I lose a disk in a two device BTRFS volume set up for 
replication, I'll mount it degraded, and convert it from the raid1 
profile to the single profile and then remove the missing disk from the 
volume.  Similarly, for a 2-device LVM RAID1 LV, I would use lvconvert 
to turn it into a regular linear LV.  Going through the multi-device 
code in BTRFS or the DM-RAID code in LVM when you've only got one actual 
device is a waste of processing power, and adds another layer where 
things can go wrong.
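
Spelled out, that looks roughly like the following; device names and the 
mountpoint are examples, and the exact steps depend on the situation:

  # BTRFS: stop pretending to be raid1, then drop the dead device
  mount -o degraded /dev/sda /mnt
  btrfs balance start -dconvert=single -mconvert=single /mnt   # dup is another common metadata choice
  btrfs device remove missing /mnt

  # LVM counterpart: reduce the RAID1 LV back to a plain linear LV
  lvconvert -m0 vg/lv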
> 
>> Patches would be gratefully accepted.  It's really not hard to update
>> the documentation, it's just that nobody has had the time to do it.
> 
> Writing accurate documentation requires deep understanding of internals.
> Me - for example, I know some of the results: "don't do this", "if X happens, Y
> should be done", "Z doesn't work yet, but there were some patches", "V
> was fixed in some recent kernel, but no idea which commit it was
> exactly", "W was severely broken in kernel I.J.K" etc. Not the hard data
> that could be posted without creating the impression that it's all
> about creating a complaint list. Not to mention I'm absolutely not familiar
> with current patches, WIP and many, many other corner cases or usage
> scenarios. In fact, not only the internals, but the motivation and design
> principles must be well understood to write a piece of documentation.
Writing up something like that is near useless, it would only be valid 
for upstream kernels (And if you're using upstream kernels and following 
the advice of keeping up to date, what does it matter anyway?  The 
moment a new btrfs-progs gets released, you're already going to be on a 
kernel that fixes the issues it reports.), because distros do whatever 
the hell they want with version numbers (RHEL for example is notorious 
for using _ancient_ version numbers but having bunches of stuff 
back-ported, and most other big distros that aren't Arch, Gentoo, or 
Slackware derived do so too to a lesser degree), and it would require 
constant curation to keep up to date.  Only for long-term known issues 
does it make sense, but those absolutely should be documented in the 
regular documentation, and doing that really isn't that hard if you just 
go for current issues.
> 
> Otherwise some "fake news" propaganda is being created, just like
> https://suckless.org/sucks/systemd or other systemd-haters that haven't
> spent a day in their life for writing SysV init scripts or managing a
> bunch of mission critical machines with handcrafted supervisors.
I hate to tell you that:
1. This type of thing happens regardless.  Systemd has just garnered a 
lot of hatred because it redesigned everything from the ground up and 
was then functionally forced on most of the Linux community.
2. There are quite a few of us who dislike systemd who have had to 
handle actual systems administration before (and quite a few such 
individuals are primarily complaining about other aspects of systemd, 
like the journal crap or how it handles manually mounted filesystems for 
which mount units exist (namely, if it thinks the underlying device 
isn't ready, it will unmount them immediately, even if the user just 
manually mounted them), not the service files replacing init scripts).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 18:31             ` George Mitchell
@ 2017-12-19 20:28               ` Tomasz Pala
  0 siblings, 0 replies; 61+ messages in thread
From: Tomasz Pala @ 2017-12-19 20:28 UTC (permalink / raw)
  To: George Mitchell; +Cc: linux-btrfs

On Tue, Dec 19, 2017 at 10:31:40 -0800, George Mitchell wrote:

> I have significant experience as a user of raid1. I spent years using 
> software raid1 and then more years using hardware (3ware) raid1 and now 
> around 3 years using btrfs raid1. I have not found btrfs raid1 to be 
> less reliable than any of the previous implementations of raid.  I have 

You are aware that in order to prove something, one needs only one
example? Degraded r/o is such an example, QED.
It doesn't matter how long you rode on top of any RAID implementation
unless you saw it in action, i.e. had an actual drive malfunction. Did you
have a broken drive under btrfs raid?

> a failure, you don't just plug things back in and expect it to be fixed 
> without seriously investigating what has gone wrong and potential 
> unexpected consequences.  I have found that even with hardware raid you 
> can find ways to screw things up to the point that you lose your data.  

Everything can be screwed up beyond comprehension, but we're talking
about PRIMARY objectives. In the case of RAID1+ it seems obvious:

https://en.oxforddictionaries.com/definition/redundancy

- unplugging ANY SINGLE drive MUST NOT render the system unusable.
It is really as simple as that.

> I have had situations where I reconnected a drive on hardware raid1 only 
> to find that the array would not sync and from there on I ended up 
> having to directly attach one of the drives and recover the partition 

I had a situation where replugging a drive started a sync of older data
over the newer. So what? This doesn't change a thing - drive
reappearance or resync is the RECOVERY part. RECOVERY scenarios are an
entirely different thing from REDUNDANCY itself. The RECOVERY phase in
some implementations could be an entirely off-line process and it would
still be RAID. Remove the REDUNDANCY part and it's not RAID anymore.

If one names a thing an apple, one shouldn't be surprised if others
compare it to apples, not oranges.

> table with TestDisk in order to regain access to my data.  So NO FORM 
> of raid is a replacement for backups and NO FORM of raid is a 
> replacement for due diligence in recovery from failure mode.  Raid gives 

And who said it is?

> you a second chance when things go wrong, it does not make failures 
> transparent which is seemingly what we sometimes expect from raid.  And 

I wouldn't want to worry you, but properly managed RAIDs make I/J-of-K
trivial failures transparent, just like ECC protects N/M bits transparently.

Investigating the causes is the sysadmin's job, just like other
maintenance, including restoring the protection level.

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 19:35             ` Chris Murphy
@ 2017-12-19 20:41               ` Tomasz Pala
  2017-12-19 20:47                 ` Austin S. Hemmelgarn
  2017-12-19 23:59                 ` Chris Murphy
  0 siblings, 2 replies; 61+ messages in thread
From: Tomasz Pala @ 2017-12-19 20:41 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue, Dec 19, 2017 at 12:35:20 -0700, Chris Murphy wrote:

> with a read only file system. Another reason is the kernel code and
> udev rule for device "readiness" means the volume is not "ready" until
> all member devices are present. And while the volume is not "ready"
> systemd will not even attempt to mount. Solving this requires kernel
> and udev work, or possibly a helper, to wait an appropriate amount of

Sth like this? I got such problem a few months ago, my solution was
accepted upstream:
https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f

Rationale is in referred ticket, udev would not support any more btrfs
logic, so unless btrfs handles this itself on kernel level (daemon?),
that is all that can be done.

> time. I also think it's a bad idea to implement automatic degraded
> mounts unless there's an API for user space to receive either a push
[...]
> There is no amount of documentation that makes up for these
> deficiencies enough to enable automatic degraded mounts by default. I
> would consider it a high order betrayal of user trust to do it.

It doesn't have to be default, might be kernel compile-time knob, module
parameter or anything else to make the *R*aid work.

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 20:41               ` Tomasz Pala
@ 2017-12-19 20:47                 ` Austin S. Hemmelgarn
  2017-12-19 22:23                   ` Tomasz Pala
  2017-12-21 11:44                   ` Andrei Borzenkov
  2017-12-19 23:59                 ` Chris Murphy
  1 sibling, 2 replies; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-19 20:47 UTC (permalink / raw)
  To: Tomasz Pala, Linux fs Btrfs

On 2017-12-19 15:41, Tomasz Pala wrote:
> On Tue, Dec 19, 2017 at 12:35:20 -0700, Chris Murphy wrote:
> 
>> with a read only file system. Another reason is the kernel code and
>> udev rule for device "readiness" means the volume is not "ready" until
>> all member devices are present. And while the volume is not "ready"
>> systemd will not even attempt to mount. Solving this requires kernel
>> and udev work, or possibly a helper, to wait an appropriate amount of
> 
> Sth like this? I got such problem a few months ago, my solution was
> accepted upstream:
> https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f
> 
> Rationale is in referred ticket, udev would not support any more btrfs
> logic, so unless btrfs handles this itself on kernel level (daemon?),
> that is all that can be done.
Or maybe systemd can quit trying to treat BTRFS like a volume manager 
(which it isn't) and just try to mount the requested filesystem with the 
requested options?  Then you would just be able to specify 'degraded' in 
your mount options, and you don't have to care that the kernel refuses 
to mount degraded filesystems without being explicitly asked to.
> 
>> time. I also think it's a bad idea to implement automatic degraded
>> mounts unless there's an API for user space to receive either a push
> [...]
>> There is no amount of documentation that makes up for these
>> deficiencies enough to enable automatic degraded mounts by default. I
>> would consider it a high order betrayal of user trust to do it.
> 
> It doesn't have to be default, might be kernel compile-time knob, module
> parameter or anything else to make the *R*aid work.
There's a mount option for it per-filesystem.  Just add that to all your 
mount calls, and you get exactly the same effect.
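
For example (device name and mountpoint are illustrative):

  mount -o degraded /dev/sda2 /srv
  # or persistently, in the options field of /etc/fstab:
  #   UUID=<fs uuid>  /srv  btrfs  defaults,degraded  0  0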

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 19:47                 ` Chris Murphy
@ 2017-12-19 21:17                   ` Tomasz Pala
  2017-12-20  0:08                     ` Chris Murphy
  2017-12-20 16:53                   ` Andrei Borzenkov
  1 sibling, 1 reply; 61+ messages in thread
From: Tomasz Pala @ 2017-12-19 21:17 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue, Dec 19, 2017 at 12:47:33 -0700, Chris Murphy wrote:

> The more verbose man pages are, the more likely it is that information
> gets stale. We already see this with the Btrfs Wiki. So are you

True. The same applies to git documentation (3rd paragraph):

https://stevebennett.me/2012/02/24/10-things-i-hate-about-git/

Fortunately this CAN be done properly; one of the greatest sets of
documentation I've seen is systemd's.

What I don't like about documentation is lack of objectivity:

$ zgrep -i bugs /usr/share/man/man8/*btrfs*.8.gz | grep -v bugs.debian.org

Nothing. The old-school manuals all had BUGS section even if it was
empty. Seriously, nothing appropriate to be put in there? Documentation
must be symmetric - if it mentions feature X, it must mention at least the
most common caveats.

> volunteering to do the btrfs-progs work to easily check kernel
> versions and print appropriate warnings? Or is this a case of
> complaining about what other people aren't doing with their time?

This is definitely the second case. You see, I've had my issues with btrfs; I
already know where to use it and where not. I've learned the HARD way and still
haven't fully recovered (some dangling r/o, some ENOSPC due to
fragmentation etc). What I /MIGHT/ do to help the community is share my
opinions and suggestions. And it's all up to you what you do with
this. Either you blame me for complaining or you ignore me - you
should realize that _I_do_not_care_, because I already know the things that
I write. At least some other guy, some other day, will read this thread and my
opinions might save HIS day. After all, using btrfs should be preceded
by research.

No offence, just trying to be honest with you. Because the other thing
that I've learned the hard way in my life is to listen to the regular users of my
products and appreciate any feedback, even if it doesn't suit me.

>> Now, if the current kernels won't toggle degraded RAID1 as ro, can I
>> safely add "degraded" to the mount options? My primary concern is the
[...]
> Btrfs simply is not ready for this use case. If you need to depend on
> degraded raid1 booting, you need to use mdadm or LVM or hardware raid.
> Complaining about the lack of maturity in this area? Get in line. Or
> propose a design and scope of work that needs to be completed to
> enable it.

I thought the work was already done if current kernel handles degraded RAID1
without switching to r/o, doesn't it? Or something else is missing?

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 20:11                 ` Austin S. Hemmelgarn
@ 2017-12-19 21:58                   ` Tomasz Pala
  2017-12-20 13:10                     ` Austin S. Hemmelgarn
  2017-12-19 23:53                   ` Chris Murphy
  1 sibling, 1 reply; 61+ messages in thread
From: Tomasz Pala @ 2017-12-19 21:58 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue, Dec 19, 2017 at 15:11:22 -0500, Austin S. Hemmelgarn wrote:

> Except the systems running on those ancient kernel versions are not 
> necessarily using a recent version of btrfs-progs.

Still, it is much easier to update userspace tools than the kernel (consider
binary drivers for various hardware).

> So in other words, spend the time to write up code for btrfs-progs that 
> will then be run by a significant minority of users because people using 
> old kernels usually use old userspace, and people using new kernels 
> won't have to care, instead of working on other bugs that are still 
> affecting people?

I am aware of the dilemma, and the answer is: that depends.
It depends on the expected usefulness of such infrastructure regarding _future_
changes and possible bugs.
In the case of stable/mature/frozen projects this doesn't make much sense,
as the possible incompatibilities would be very rare.
Whether this makes sense for btrfs? I don't know - it's not mature, but if the quirk rate
is too high to track appropriate kernel versions, it might be
really better to officially state "DO USE 4.14+ kernel, REALLY".

This might be accomplished very easily - when releasing a new btrfs-progs,
check the currently available LTS kernel and use it as a base reference for
the warning.

After all, "giving users a hurt me button is not ethical programming."

>> Now, if the current kernels won't toggle degraded RAID1 as ro, can I
>> safely add "degraded" to the mount options? My primary concern is the
>> machine UPTIME. I care less about the data, as they are backed up to
>> some remote location and loosing day or week of changes is acceptable,
>> brain-split as well, while every hour of downtime costs me a real money.
> In which case you shouldn't be relying on _ANY_ kind of RAID by itself, 
> let alone BTRFS.  If you care that much about uptime, you should be 
> investing in a HA setup and going from there.  If downtime costs you 

I got this handled and don't use btrfs there - the question remains:
in a situation as described above, is it safe now to add "degraded"?

To rephrase the question: can degraded RAID1 run permanently as rw
without some *internal* damage?

>> Anyway, users shouldn't look through syslog, device status should be
>> reported by some monitoring tool.
> This is a common complaint, and based on developer response, I think the 
> consensus is that it's out of scope for the time being.  There have been 
> some people starting work on such things, but nobody really got anywhere 
> because most of the users who care enough about monitoring to be 
> interested are already using some external monitoring tool that it's 
> easy to hook into.

I agree, the btrfs code should only emit events, so
SomeUserspaceGUIWhatever could display blinking exclamation mark.

>> Well, the question is: either it is not raid YET, or maybe it's time to consider renaming?
> Again, the naming is too ingrained.  At a minimum, you will have to keep 
> the old naming, and at that point you're just wasting time and making 
> things _more_ confusing because some documentation will use the old 

True, but realizing that documentation is already flawed it gets easier.
But I still don't know if it is going to be RAID some day? Or won't be
"by design"?

>> Ha! I got this disabled on every bus (although for different reasons)
>> after boot completes. Lucky me:)
> Security I'm guessing (my laptop behaves like that for USB devices for 
> that exact reason)?  It's a viable option on systems that are tightly 

Yes, machines are locked and only authorized devices are allowed during
boot.

> IOW, if I lose a disk in a two device BTRFS volume set up for 
> replication, I'll mount it degraded, and convert it from the raid1 
> profile to the single profile and then remove the missing disk from the 
> volume.

I was about to do the same with my r/o-stuck btrfs system, but unfortunately
I unplugged the wrong cable...

>> Writing accurate documentation requires deep understanding of internals.
[...]
> Writing up something like that is near useless, it would only be valid 
> for upstream kernels (And if you're using upstream kernels and following 
> the advice of keeping up to date, what does it matter anyway?  The 
[...]
> kernel that fixes the issues it reports.), because distros do whatever 
> the hell they want with version numbers (RHEL for example is notorious 
> for using _ancient_ version numbers but having bunches of stuff 
> back-ported, and most other big distros that aren't Arch, Gentoo, or 
> Slackware derived do so too to a lesser degree), and it would require 
> constant curation to keep up to date.  Only for long-term known issues 

OK, you've convinced me that a kernel-vs-feature list is too much overhead.

So maybe another approach: just like systemd sets the system time (when no
time source is available) to its own release date, maybe btrfs-progs
should assume the version of the kernel on which it was built?

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 20:47                 ` Austin S. Hemmelgarn
@ 2017-12-19 22:23                   ` Tomasz Pala
  2017-12-20 13:33                     ` Austin S. Hemmelgarn
  2017-12-21 11:44                   ` Andrei Borzenkov
  1 sibling, 1 reply; 61+ messages in thread
From: Tomasz Pala @ 2017-12-19 22:23 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue, Dec 19, 2017 at 15:47:03 -0500, Austin S. Hemmelgarn wrote:

>> Sth like this? I got such problem a few months ago, my solution was
>> accepted upstream:
>> https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f
>> 
>> Rationale is in referred ticket, udev would not support any more btrfs
>> logic, so unless btrfs handles this itself on kernel level (daemon?),
>> that is all that can be done.
> Or maybe systemd can quit trying to treat BTRFS like a volume manager 
> (which it isn't) and just try to mount the requested filesystem with the 
> requested options?

I tried that before ("just mount my filesystem, stupid"); it is a no-go.
The source of the problem is not systemd treating BTRFS differently, but
the btrfs kernel logic that it relies on. Just to show it:

1. create 2-volume btrfs, e.g. /dev/sda and /dev/sdb,
2. reboot the system into clean state (init=/bin/sh), (or remove btrfs-scan tool),
3. try
mount /dev/sda /test - fails
mount /dev/sdb /test - works
4. reboot again and try in reversed order
mount /dev/sdb /test - fails
mount /dev/sda /test - works

THIS readiness is exposed via udev to systemd. And it must be used for
multi-layer setups to work (consider stacked LUKS, LVM, MD, iSCSI, FC etc).

In short: until *something* scans all the btrfs components, so that the
kernel marks the volume ready, systemd won't even try to mount it.

> Then you would just be able to specify 'degraded' in 
> your mount options, and you don't have to care that the kernel refuses 
> to mount degraded filesystems without being explicitly asked to.

Exactly. But since LP refused to try mounting despite the kernel's "not-ready"
state, it is the kernel that must emit 'ready'. So the
question is: how can I make the kernel mark a degraded array as "ready"?

The obvious answer is: do it via kernel command line, just like mdadm
does:
rootflags=device=/dev/sda,device=/dev/sdb
rootflags=device=/dev/sda,device=missing
rootflags=device=/dev/sda,device=/dev/sdb,degraded

If only btrfs.ko recognized this, the kernel would be able to assemble a
multi-volume btrfs itself. Not only would this allow automated degraded
mounts, it would also allow using initrd-less kernels on such volumes.
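
For what it's worth, a regular userspace mount can already take that list 
via the device= mount option, which scans the named devices before the 
mount proceeds; whether the same works through rootflags= at early boot 
is exactly the open question (device names are examples):

  mount -o device=/dev/sdb,device=/dev/sdc /dev/sdb /mnt
  # explicitly opting in to a degraded mount:
  mount -o degraded,device=/dev/sdb /dev/sdb /mnt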

>> It doesn't have to be default, might be kernel compile-time knob, module
>> parameter or anything else to make the *R*aid work.
> There's a mount option for it per-filesystem.  Just add that to all your 
> mount calls, and you get exactly the same effect.

If only they were passed...

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 20:11                 ` Austin S. Hemmelgarn
  2017-12-19 21:58                   ` Tomasz Pala
@ 2017-12-19 23:53                   ` Chris Murphy
  2017-12-20 13:12                     ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 61+ messages in thread
From: Chris Murphy @ 2017-12-19 23:53 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Tomasz Pala, Linux fs Btrfs

On Tue, Dec 19, 2017 at 1:11 PM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2017-12-19 12:56, Tomasz Pala wrote:

>> BTRFS lacks all of these - there are major functional changes in current
>> kernels and it reaches far beyond LTS. All the knowledge YOU have here,
>> on this maillist, should be 'engraved' into btrfs-progs, as there are
>> people still using kernels with serious malfunctions. btrfs-progs could
>> easily check kernel version and print appropriate warning - consider
>> this a "software quirks".
>
> Except the systems running on those ancient kernel versions are not
> necessarily using a recent version of btrfs-progs.

Indeed, it is much more common to find old userspace tools, for
whatever reason, compared to the kernel version in use.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 20:41               ` Tomasz Pala
  2017-12-19 20:47                 ` Austin S. Hemmelgarn
@ 2017-12-19 23:59                 ` Chris Murphy
  2017-12-20  8:34                   ` Tomasz Pala
  1 sibling, 1 reply; 61+ messages in thread
From: Chris Murphy @ 2017-12-19 23:59 UTC (permalink / raw)
  To: Tomasz Pala; +Cc: Linux fs Btrfs

On Tue, Dec 19, 2017 at 1:41 PM, Tomasz Pala <gotar@polanet.pl> wrote:
> On Tue, Dec 19, 2017 at 12:35:20 -0700, Chris Murphy wrote:
>
>> with a read only file system. Another reason is the kernel code and
>> udev rule for device "readiness" means the volume is not "ready" until
>> all member devices are present. And while the volume is not "ready"
>> systemd will not even attempt to mount. Solving this requires kernel
>> and udev work, or possibly a helper, to wait an appropriate amount of
>
> Sth like this? I got such problem a few months ago, my solution was
> accepted upstream:
> https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f

I can't parse this commit. In particular I can't tell how long it
waits, or what triggers the end to waiting.



>
> Rationale is in referred ticket, udev would not support any more btrfs
> logic, so unless btrfs handles this itself on kernel level (daemon?),
> that is all that can be done.
>
>> time. I also think it's a bad idea to implement automatic degraded
>> mounts unless there's an API for user space to receive either a push
> [...]
>> There is no amount of documentation that makes up for these
>> deficiencies enough to enable automatic degraded mounts by default. I
>> would consider it a high order betrayal of user trust to do it.
>
> It doesn't have to be default, might be kernel compile-time knob, module
> parameter or anything else to make the *R*aid work.

OK, but that's putting the cart before the horse. The horse is proper recovery
behavior once a delayed/missing drive reappears, i.e. resync.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 21:17                   ` Tomasz Pala
@ 2017-12-20  0:08                     ` Chris Murphy
  2017-12-23  4:08                       ` Tomasz Pala
  0 siblings, 1 reply; 61+ messages in thread
From: Chris Murphy @ 2017-12-20  0:08 UTC (permalink / raw)
  To: Tomasz Pala; +Cc: Linux fs Btrfs

On Tue, Dec 19, 2017 at 2:17 PM, Tomasz Pala <gotar@polanet.pl> wrote:
> On Tue, Dec 19, 2017 at 12:47:33 -0700, Chris Murphy wrote:
>
>> The more verbose man pages are, the more likely it is that information
>> gets stale. We already see this with the Btrfs Wiki. So are you
>
> True. The same applies to git documentation (3rd paragraph):
>
> https://stevebennett.me/2012/02/24/10-things-i-hate-about-git/
>
> Fortunately this CAN be done properly; one of the greatest sets of
> documentation I've seen is systemd's.
>
> What I don't like about documentation is lack of objectivity:
>
> $ zgrep -i bugs /usr/share/man/man8/*btrfs*.8.gz | grep -v bugs.debian.org
>
> Nothing. The old-school manuals all had BUGS section even if it was
> empty. Seriously, nothing appropriate to be put in there? Documentation
> must be symmetric - if it mentions feature X, it must mention at least the
> most common caveats.

It's reasonable to have a known bugs section in the man page, so long
as people are willing to do the work adding to it and deleting it when
bugs are fixed.





>
>> volunteering to do the btrfs-progs work to easily check kernel
>> versions and print appropriate warnings? Or is this a case of
>> complaining about what other people aren't doing with their time?
>
> This is definitely the second case. You see, I've had my issues with btrfs; I
> already know where to use it and where not. I've learned the HARD way and still
> haven't fully recovered (some dangling r/o, some ENOSPC due to
> fragmentation etc). What I /MIGHT/ do to help the community is share my
> opinions and suggestions. And it's all up to you what you do with
> this. Either you blame me for complaining or you ignore me - you
> should realize that _I_do_not_care_, because I already know the things that
> I write. At least some other guy, some other day, will read this thread and my
> opinions might save HIS day. After all, using btrfs should be preceded
> by research.
>
> No offence, just trying to be honest with you. Because the other thing
> that I've learned the hard way in my life is to listen to the regular users of my
> products and appreciate any feedback, even if it doesn't suit me.

Btrfs development has definitely been a lot more fractured and wild
west than perhaps other Linux file systems, with the people who do the
work getting to dictate the direction, and that's certainly true
compared to ZFS, which has a small team with a very clear ideology and
direction established from the outset.


>
>>> Now, if the current kernels won't toggle degraded RAID1 as ro, can I
>>> safely add "degraded" to the mount options? My primary concern is the
> [...]
>> Btrfs simply is not ready for this use case. If you need to depend on
>> degraded raid1 booting, you need to use mdadm or LVM or hardware raid.
>> Complaining about the lack of maturity in this area? Get in line. Or
>> propose a design and scope of work that needs to be completed to
>> enable it.
>
> I thought the work was already done if current kernel handles degraded RAID1
> without switching to r/o, doesn't it? Or something else is missing?

Well, it only mounts degraded rw once; the next degraded mount is ro -
there are patches dealing with this better, but I don't know their
state. And there's no resync code that I'm aware of; it's absolutely not
good enough to just kick off a full scrub - that has huge performance
implications and I'd consider it a regression compared to the default
functionality in LVM and mdadm RAID with the write-intent bitmap.
Without some equivalent shortcut, automatic degraded mounts mean a
decently likely scenario where a slightly late assembly at boot time
ends up requiring a full scrub. That's not an improvement over manual
degraded, so people aren't hit with even more silent consequences of
their unfortunate situation.
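
For context, the only resync mechanism available today is that full 
scrub, e.g. (mountpoint is an example):

  # reads every copy and repairs bad ones from a good mirror where possible
  btrfs scrub start -Bd /mnt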



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 23:59                 ` Chris Murphy
@ 2017-12-20  8:34                   ` Tomasz Pala
  2017-12-20  8:51                     ` Tomasz Pala
  2017-12-20 19:49                     ` Chris Murphy
  0 siblings, 2 replies; 61+ messages in thread
From: Tomasz Pala @ 2017-12-20  8:34 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue, Dec 19, 2017 at 16:59:39 -0700, Chris Murphy wrote:

>> Sth like this? I got such problem a few months ago, my solution was
>> accepted upstream:
>> https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f
> 
> I can't parse this commit. In particular I can't tell how long it
> waits, or what triggers the end to waiting.

The point is - it doesn't wait at all. Instead, every 'ready' btrfs
device triggers an event on all the pending devices. Consider a 3-device
filesystem consisting of /dev/sd[abd], with /dev/sdc being a different,
standalone btrfs:

/dev/sda -> 'not ready'
/dev/sdb -> 'not ready'
/dev/sdc -> 'ready', triggers /dev/sda -> 'not ready' and /dev/sdb - still 'not ready'
/dev/sdc -> kernel says 'ready', triggers /dev/sda - 'ready' and /dev/sdb -> 'ready'

This way all the parts of a volume are marked as ready, so systemd won't
refuse mounting using legacy device nodes like /dev/sda.
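
In effect (this is only an illustration of the behaviour, not the actual 
rule text from that commit), each device that becomes ready re-triggers 
the members still marked pending, roughly as if running:

  udevadm trigger --action=change --subsystem-match=block \
          --property-match=ID_BTRFS_READY=0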


This particular solution depends on the kernel reporting 'btrfs ready',
which would obviously not work for degraded arrays unless btrfs.ko
handled some 'missing' or 'mount_degraded' kernel cmdline option
_before_ actually _trying_ to mount it with -o degraded.

And there is a logical problem with this - _which_ array components
should be ignored? Consider:

volume1: /dev/sda /dev/sdb
volume2: /dev/sdc /dev/sdd-broken

If /dev/sdd is missing from the system, it would never be scanned, so
/dev/sdc would stay pending. It cannot be assembled at scan time
alone, because the same would happen with /dev/sda, and there would be
a desync with /dev/sdb, which IS available - just a few moments later.

This is the place for the timeout you've mentioned - there should be
*some* decent timeout allowing all the devices to show up (udev waits
for 90 seconds by default or x-systemd.device-timeout=N from fstab).

After such timeout, I'd like to tell the kernel: "no more devices, give
me all the remaining btrfs volumes in degraded mode if possible". By
"give me btrfs vulumes" I mean "mark them as 'ready'" so the udev could
fire it's rules. And if there would be anything for udev to distinguish
'ready' from 'ready-degraded' one could easily compose some notification
scripting on top of it, including sending e-mail to sysadmin.

Is there anything that would make the kernel do the above?

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20  8:34                   ` Tomasz Pala
@ 2017-12-20  8:51                     ` Tomasz Pala
  2017-12-20 19:49                     ` Chris Murphy
  1 sibling, 0 replies; 61+ messages in thread
From: Tomasz Pala @ 2017-12-20  8:51 UTC (permalink / raw)
  To: Linux fs Btrfs

Errata:

On Wed, Dec 20, 2017 at 09:34:48 +0100, Tomasz Pala wrote:

> /dev/sda -> 'not ready'
> /dev/sdb -> 'not ready'
> /dev/sdc -> 'ready', triggers /dev/sda -> 'not ready' and /dev/sdb - still 'not ready'
> /dev/sdc -> kernel says 'ready', triggers /dev/sda - 'ready' and /dev/sdb -> 'ready'

The last line should start with /dev/sdd.

> After such timeout, I'd like to tell the kernel: "no more devices, give
> me all the remaining btrfs volumes in degraded mode if possible". By

Actually "if possible" means both:
- if technically possible (i.e. required data is available, like half of RAID1),
- AND if allowed for specific volume as there might be different policies.

For example - one might allow rootfs to be started in degraded-rw mode in
order for the system to boot up, /home in degraded read-only for the
users to have access to their files and do not mount /srv degraded at all.
The failed mount can be non-critical with 'nofail' fstab flag.
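
In fstab terms such a policy could look something like the sketch below 
(UUIDs are placeholders; as far as I know there is currently no way to 
express "read-only only when degraded" per volume):

  # illustrative /etc/fstab policy
  UUID=<root-uuid>  /     btrfs  degraded         0  0  # may come up with a member missing
  UUID=<srv-uuid>   /srv  btrfs  defaults,nofail  0  0  # never degraded; failure not fatal to boot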

> "give me btrfs vulumes" I mean "mark them as 'ready'" so the udev could
> fire it's rules. And if there would be anything for udev to distinguish
> 'ready' from 'ready-degraded' one could easily compose some notification
> scripting on top of it, including sending e-mail to sysadmin.

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 21:58                   ` Tomasz Pala
@ 2017-12-20 13:10                     ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-20 13:10 UTC (permalink / raw)
  To: Tomasz Pala, Linux fs Btrfs

On 2017-12-19 16:58, Tomasz Pala wrote:
> On Tue, Dec 19, 2017 at 15:11:22 -0500, Austin S. Hemmelgarn wrote:
> 
>> Except the systems running on those ancient kernel versions are not
>> necessarily using a recent version of btrfs-progs.
> 
> Still, it is much easier to update userspace tools than the kernel (consider
> binary drivers for various hardware).
OK, let's look at this objectively:

Current version of btrfs-progs is 4.14, released last month, and current 
kernel is 4.14.8 (or a 4.15 RC release).

In various distributions:
* Arch Linux:
	btrfs-progs version is 4.14-2
	kernel version is 4.14.6-1
* Alpine Linux:
	btrfs-progs version is 4.10.2-r0
	kernel version is 4.9.32-0
* Debian Sid:
	btrfs-progs version is 4.13.3-1
	kernel version is 4.14.0-1
* Debian 9.3:
	btrfs-progs version is 4.7.3-1
	kernel version is 4.9.0-4
* Fedora 27:
	btrfs-progs version is 4.11.3-3
	kernel version is 4.14.6-300
* Gentoo ~amd64 (equivalent of Debian Sid or Fedora Rawhide):
	btrfs-progs version is 4.14
	kernel version is 4.14.7
* Gentoo stable:
	btrfs-progs version is 4.10.2
	kernel version is 4.14.7
* Manjaro (a somewhat popular Arch Linux derivative):
	btrfs-progs version is 4.14-1
	kernel version is 4.11.12-1-rt16
* OpenSUSE Leap 42.3:
	btrfs-progs version is 4.5.3+20160729
	kernel version is 4.4.103-36
* OpenSUSE Tumbleweed:
	btrfs-progs version is 4.13.3
	kernel version is 4.14.6-1
* Ubuntu 17.10:
	btrfs-progs version is 4.12-1
	kernel version is 4.13.0-19
* Ubuntu 16.04.3:
	btrfs-progs version is 4.4-1ubuntu1
	kernel version is 4.4.0-104

Based on this, it looks like Alpine, Manjaro, and OpenSUSE Leap are the 
only distros for which it was easier to upgrade the userspace than the 
kernel, and Alpine and Manjaro are the only two where it even makes sense 
for that to be the case, given that they use GRSecurity and RT patches 
respectively.

The fact is that most people use whatever version their distro packages, 
and don't install software themselves through other means, so for most 
people, it is easier to upgrade the kernel.

Even as a 'power user' using Gentoo (where it's really easy to install 
stuff from external sources because you have all the development tools 
pre-installed), I almost never pull anything that's beyond the main 
repositories or the small handful of user repositories that I've got 
enabled, and that's only for stuff I can't get in a repository.
> 
>> So in other words, spend the time to write up code for btrfs-progs that
>> will then be run by a significant minority of users because people using
>> old kernels usually use old userspace, and people using new kernels
>> won't have to care, instead of working on other bugs that are still
>> affecting people?
> 
> I am aware of the dilemma, and the answer is: that depends.
> It depends on the expected usefulness of such infrastructure regarding _future_
> changes and possible bugs.
> In the case of stable/mature/frozen projects this doesn't make much sense,
> as the possible incompatibilities would be very rare.
> Whether this makes sense for btrfs? I don't know - it's not mature, but if the quirk rate
> is too high to track appropriate kernel versions, it might be
> really better to officially state "DO USE 4.14+ kernel, REALLY".
>
> This might be accomplished very easily - when releasing a new btrfs-progs,
> check the currently available LTS kernel and use it as a base reference for
> the warning.
>
> After all, "giving users a hurt me button is not ethical programming."
Scaring users needlessly is also not ethical programming.  As an example:

4.9 is the current LTS release (4.9.71 as of right now).  Dozens of bugs 
have been fixed since then.  If we were to start doing as you propose, 
then we'd be spitting out potentially bogus warnings for everything up 
through current kernels.
> 
>>> Now, if the current kernels won't toggle degraded RAID1 as ro, can I
>>> safely add "degraded" to the mount options? My primary concern is the
>>> machine UPTIME. I care less about the data, as they are backed up to
>>> some remote location and losing a day or week of changes is acceptable,
>>> brain-split as well, while every hour of downtime costs me real money.
>> In which case you shouldn't be relying on _ANY_ kind of RAID by itself,
>> let alone BTRFS.  If you care that much about uptime, you should be
>> investing in a HA setup and going from there.  If downtime costs you
> 
> I got this handled and don't use btrfs there - the question remains:
> in a situation as described above, is it safe now to add "degraded"?
> 
> To rephrase the question: can degraded RAID1 run permanently as rw
> without some *internal* damage?
Not on kernels that don't have the patch that's been mentioned a couple 
of times in this thread, with the caveat that 'internal damage' means 
that it won't mount on such kernels after the first time (but will mount 
on newer kernels that have been patched).
> 
>>> Anyway, users shouldn't look through syslog, device status should be
>>> reported by some monitoring tool.
>> This is a common complaint, and based on developer response, I think the
>> consensus is that it's out of scope for the time being.  There have been
>> some people starting work on such things, but nobody really got anywhere
>> because most of the users who care enough about monitoring to be
>> interested are already using some external monitoring tool that it's
>> easy to hook into.
> 
> I agree, the btrfs code should only emit events, so
> SomeUserspaceGUIWhatever could display a blinking exclamation mark.
No, it shouldn't _only_ emit events.  It damn well should be logging to 
the kernel log even if it's emitting events; LVM does so, MD does so, 
ZFS does so, so why the hell should BTRFS _NOT_ do so?
> 
>>> Well, the question is: either it is not raid YET, or maybe it's time to consider renaming?
>> Again, the naming is too ingrained.  At a minimum, you will have to keep
>> the old naming, and at that point you're just wasting time and making
>> things _more_ confusing because some documentation will use the old
> 
> True, but once you realize the documentation is already flawed, it gets easier.
> But I still don't know: is it going to be RAID some day? Or won't it be,
> "by design"?
> 
>>> Ha! I got this disabled on every bus (although for different reasons)
>>> after boot completes. Lucky me:)
>> Security I'm guessing (my laptop behaves like that for USB devices for
>> that exact reason)?  It's a viable option on systems that are tightly
> 
> Yes, machines are locked and only authorized devices are allowed during
> boot.
> 
>> IOW, if I lose a disk in a two device BTRFS volume set up for
>> replication, I'll mount it degraded, and convert it from the raid1
>> profile to the single profile and then remove the missing disk from the
>> volume.
> 
> I was about to do the same with my r/o-stuck btrfs system, but unfortunately
> unplugged the wrong cable...
> 
>>> Writing accurate documentation requires deep understanding of internals.
> [...]
>> Writing up something like that is near useless, it would only be valid
>> for upstream kernels (And if you're using upstream kernels and following
>> the advice of keeping up to date, what does it matter anyway?  The
> [...]
>> kernel that fixes the issues it reports.), because distros do whatever
>> the hell they want with version numbers (RHEL for example is notorious
>> for using _ancient_ version numbers but having bunches of stuff
>> back-ported, and most other big distros that aren't Arch, Gentoo, or
>> Slackware derived do so too to a lesser degree), and it would require
>> constant curation to keep up to date.  Only for long-term known issues
> 
> OK, you've convinced me that kernel-vs-feature list is overhead.
> 
> So maybe another approach: just like systemd sets the system time (when no
> time source is available) to its own release date, maybe btrfs-progs
> should take the version of the kernel on which it was built?
The systemd thing works because it knows the current time can't be older 
than when it was built (short of time-travel, but that's probably 
irrelevant right now).  Grabbing the kernel version of the build system 
and then using that as our own version absolutely does not because:
1. The kernel on the build system has (or should have) zero impact on 
how btrfs-progs works on the target system.  The only bit that matters is 
the UAPI headers that are installed, and if those mismatch then 
btrfs-progs won't run at all.
2. While the code in btrfs-progs is developed in concert with kernel 
code, it is not directly dependent on it for most of its operation.  As 
a couple of people are apt to point out, kernel version matters mostly 
for regular operation, btrfs-progs version matters mostly for recovery 
and repair.  However, _both_ do matter, so just displaying one is a bad 
idea.

As of right now, the versioning of btrfs-progs is largely linked to 
whatever the current stable kernel version is at the time of release. 
That provides a good enough indication of the vintage that most people 
have no issues just running:

	btrfs --version
	uname -r

to figure out what they have, though of course `uname -r` is essentially 
useless for outsiders on RHEL or OEL systems (and there is nothing the 
btrfs community can do about that).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 23:53                   ` Chris Murphy
@ 2017-12-20 13:12                     ` Austin S. Hemmelgarn
  0 siblings, 0 replies; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-20 13:12 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Tomasz Pala, Linux fs Btrfs

On 2017-12-19 18:53, Chris Murphy wrote:
> On Tue, Dec 19, 2017 at 1:11 PM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> On 2017-12-19 12:56, Tomasz Pala wrote:
> 
>>> BTRFS lacks all of these - there are major functional changes in current
>>> kernels and it reaches far beyond LTS. All the knowledge YOU have here,
>>> on this maillist, should be 'engraved' into btrfs-progs, as there are
>>> people still using kernels with serious malfunctions. btrfs-progs could
>>> easily check kernel version and print appropriate warning - consider
>>> this a "software quirks".
>>
>> Except the systems running on those ancient kernel versions are not
>> necessarily using a recent version of btrfs-progs.
> 
> Indeed it is much more common to find old user space tools, for
> whatever reason, compared to the kernel version.
Most distros have infrastructure in place to handle quick updates to the 
kernel, and tend to keep the kernel up to date to fix hardware issues 
that affect people who may not be using BTRFS.

In contrast, btrfs-progs updates generally aren't high priority, because 
they benefit a much smaller user base (unless you're SuSE).


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 22:23                   ` Tomasz Pala
@ 2017-12-20 13:33                     ` Austin S. Hemmelgarn
  2017-12-20 17:28                       ` Duncan
  0 siblings, 1 reply; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-20 13:33 UTC (permalink / raw)
  To: Tomasz Pala, Linux fs Btrfs

On 2017-12-19 17:23, Tomasz Pala wrote:
> On Tue, Dec 19, 2017 at 15:47:03 -0500, Austin S. Hemmelgarn wrote:
> 
>>> Sth like this? I got such problem a few months ago, my solution was
>>> accepted upstream:
>>> https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f
>>>
>>> Rationale is in referred ticket, udev would not support any more btrfs
>>> logic, so unless btrfs handles this itself on kernel level (daemon?),
>>> that is all that can be done.
>> Or maybe systemd can quit trying to treat BTRFS like a volume manager
>> (which it isn't) and just try to mount the requested filesystem with the
>> requested options?
> 
> Tried that before ("just mount my filesystem, stupid"), it is a no-go.
> The problem source is not within systemd treating BTRFS differently, but
> in btrfs kernel logic that it uses. Just to show it:
> 
> 1. create 2-volume btrfs, e.g. /dev/sda and /dev/sdb,
> 2. reboot the system into clean state (init=/bin/sh), (or remove btrfs-scan tool),
> 3. try
> mount /dev/sda /test - fails
> mount /dev/sdb /test - works
> 4. reboot again and try in reversed order
> mount /dev/sdb /test - fails
> mount /dev/sda /test - works
> 
> THIS readiness is exposed via udev to systemd. And it must be used for
> multi-layer setups to work (consider stacked LUKS, LVM, MD, iSCSI, FC etc).
Except BTRFS _IS NOT MULTIPLE LAYERS_.  It's one layer at the filesystem 
layer, and handles the other 'layers' internally.
> 
> In short: until *something* scans all the btrfs components, so the
> kernel makes it ready, systemd won't even try to mount it.
Which is the problem here.  Systemd needs to treat BTRFS differently, 
even if the ioctl it's using gets 'fixed'.  Currently it's treating it 
like LVM or MD, when it needs to be treated as just a filesystem with an 
extra wait condition prior to mount (and needs to trust that the user 
knows what they are doing when they mount something by hand).  The IOCTL 
systemd is using was poorly named; what it really does is say that the 
FS is ready to mount normally (that is, without needing 'device=' or 
'degraded' mount options).  Aside from this being problematic with 
degraded volumes, it's got an inherent TOCTOU race condition (so do the 
checks with all the other block layers you mentioned FWIW).  If systemd 
would just treat BTRFS like a filesystem instead of a volume manager, 
and try to mount the volume with the specified options (after waiting 
for udev to report that it's done scanning everything) instead of asking 
the kernel if it's ready, none of this would be an issue.

Put slightly differently:  I use OpenRC and sysv init.  I have a script 
that runs right after udev starts and directly scans all fixed disks for 
BTRFS signatures, and that's _all_ that I need to do to get multi-device 
BTRFS working properly with the standard local filesystem mount script 
in Gentoo.  I don't have to deal with any of this crap that systemd 
users do because Gentoo's OpenRC script for mounting local filesystems 
treats BTRFS like any other filesystem, and (sensibly) assumes that if 
the call to mount succeeds, things are ready and working.
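
The core of that script is nothing more than the equivalent of this (a
minimal sketch of the approach, not my actual script):

	#!/bin/sh
	# Register every btrfs member device with the kernel before the
	# normal localmount script runs.  'btrfs device scan' probes the
	# block devices listed in /proc/partitions for btrfs signatures.
	btrfs device scan
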
> 
>> Then you would just be able to specify 'degraded' in
>> your mount options, and you don't have to care that the kernel refuses
>> to mount degraded filesystems without being explicitly asked to.
> 
> Exactly. But since LP refused to try mounting despite kernel "not-ready"
> state - it is the kernel that must emit 'ready'. So the
> question is: how can I make kernel to mark degraded array as "ready"?
You can't, because the DEVICE_READY IOCTL is coded to mark the volume 
ready when all component devices are ready.  IOW, it's there to say 
'this mount will work without needing -o degraded or specifying any 
devices in the mount options'.

The issue is the interaction here, not the kernel behavior by itself, 
since the kernel behavior produces no issues whatsoever for other init 
systems (though I will acknowledge that the ioctl itself is really only 
used by systemd, but I contend that that's because everything else is 
sensible enough to understand that the ioctl is functionally useless and 
just avoid it).
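
For what it's worth, you can poke that ioctl from the command line too;
`btrfs device ready` is just a thin wrapper around it (using /dev/sda as
a stand-in for any member device):

	# Exit status 0: all member devices are known to the kernel, so the
	# volume is mountable without -o degraded.  Non-zero: at least one
	# member is still missing.
	btrfs device ready /dev/sda; echo $?
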
> 
> The obvious answer is: do it via kernel command line, just like mdadm
> does:
> rootflags=device=/dev/sda,device=/dev/sdb
> rootflags=device=/dev/sda,device=missing
> rootflags=device=/dev/sda,device=/dev/sdb,degraded
> 
> If only btrfs.ko recognized this, kernel would be able to assemble
> multivolume btrfs itself. Not only this would allow automated degraded
> mounts, it would also allow using initrd-less kernels on such volumes.
Last I checked, the 'device=' options work on upstream kernels just 
fine, though I've never tried the degraded option.  Of course, I'm also 
not using systemd, so it may be some interaction with systemd that's 
causing them to not work (and yes, I understand that I'm inclined to 
blame systemd most of the time based on significant past experience with 
systemd creating issues that never existed before).
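
Concretely, using the 'device=' options is nothing more exotic than
something like this (made-up device names, and with the caveat that I've
not personally tested the degraded part):

	# device= tells the kernel about every member up front, and
	# degraded permits mounting with one member missing
	mount -o device=/dev/sda,device=/dev/sdb,degraded /dev/sda /mnt/data
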
> 
>>> It doesn't have to be default, might be kernel compile-time knob, module
>>> parameter or anything else to make the *R*aid work.
>> There's a mount option for it per-filesystem.  Just add that to all your
>> mount calls, and you get exactly the same effect.
> 
> If only they were passed...
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 19:47                 ` Chris Murphy
  2017-12-19 21:17                   ` Tomasz Pala
@ 2017-12-20 16:53                   ` Andrei Borzenkov
  2017-12-20 16:57                     ` Austin S. Hemmelgarn
  2017-12-20 20:02                     ` Chris Murphy
  1 sibling, 2 replies; 61+ messages in thread
From: Andrei Borzenkov @ 2017-12-20 16:53 UTC (permalink / raw)
  To: Chris Murphy, Tomasz Pala; +Cc: Linux fs Btrfs

19.12.2017 22:47, Chris Murphy wrote:
> 
>>
>> BTW, doesn't SuSE use btrfs by default? Would you expect everyone using
>> this distro to research every component used?
> 
> As far as I'm aware, only Btrfs single device stuff is "supported".
> The multiple device stuff is definitely not supported on openSUSE, but
> I have no idea to what degree they support it with enterprise license,
> no doubt that support must come with caveats.
> 

I was rather surprised seeing RAID1 and RAID10 listed as supported in
SLES 12.x release notes, especially as there is no support for
multi-device btrfs in YaST and hence no way to even install on such
filesystem.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20 16:53                   ` Andrei Borzenkov
@ 2017-12-20 16:57                     ` Austin S. Hemmelgarn
  2017-12-20 20:02                     ` Chris Murphy
  1 sibling, 0 replies; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-20 16:57 UTC (permalink / raw)
  To: Andrei Borzenkov, Chris Murphy, Tomasz Pala; +Cc: Linux fs Btrfs

On 2017-12-20 11:53, Andrei Borzenkov wrote:
> 19.12.2017 22:47, Chris Murphy wrote:
>>
>>>
>>> BTW, doesn't SuSE use btrfs by default? Would you expect everyone using
>>> this distro to research every component used?
>>
>> As far as I'm aware, only Btrfs single device stuff is "supported".
>> The multiple device stuff is definitely not supported on openSUSE, but
>> I have no idea to what degree they support it with enterprise license,
>> no doubt that support must come with caveats.
>>
> 
> I was rather surprised seeing RAID1 and RAID10 listed as supported in
> SLES 12.x release notes, especially as there is no support for
> multi-device btrfs in YaST and hence no way to even install on such
> filesystem.
That's the beauty of it all though: you don't need to install on such a 
setup directly like you would need to with hardware RAID.  You can 
install in single-device mode and then convert the system on-line to use 
multiple devices, and that will (usually) be faster than a direct 
install if you're using replication (unless you're using RAID10 and have 
a _lot_ of disks).
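
The conversion itself is just something like this (a sketch, with
/dev/sdb standing in for the second disk and / as the target):

	# add the second disk to the mounted filesystem
	btrfs device add /dev/sdb /
	# rewrite existing data and metadata chunks with the raid1 profile
	btrfs balance start -dconvert=raid1 -mconvert=raid1 /
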

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20 13:33                     ` Austin S. Hemmelgarn
@ 2017-12-20 17:28                       ` Duncan
  0 siblings, 0 replies; 61+ messages in thread
From: Duncan @ 2017-12-20 17:28 UTC (permalink / raw)
  To: linux-btrfs

Austin S. Hemmelgarn posted on Wed, 20 Dec 2017 08:33:03 -0500 as
excerpted:

>> The obvious answer is: do it via kernel command line, just like mdadm
>> does:
>> rootflags=device=/dev/sda,device=/dev/sdb
>> rootflags=device=/dev/sda,device=missing
>> rootflags=device=/dev/sda,device=/dev/sdb,degraded
>> 
>> If only btrfs.ko recognized this, kernel would be able to assemble
>> multivolume btrfs itself. Not only this would allow automated degraded
>> mounts, it would also allow using initrd-less kernels on such volumes.
> Last I checked, the 'device=' options work on upstream kernels just
> fine, though I've never tried the degraded option.  Of course, I'm also
> not using systemd, so it may be some interaction with systemd that's
> causing them to not work (and yes, I understand that I'm inclined to
> blame systemd most of the time based on significant past experience with
> systemd creating issues that never existed before).

Has the bug where rootflags=device=/dev/sda1,device=/dev/sdb1 failed 
been fixed?  Last I knew (which was ancient history in btrfs terms, but 
I've not seen mention of a patch for it in all that time either), device= 
on the userspace commandline worked, and device= on the kernel commandline 
worked if there was just one device, but it would fail for more than one 
device.  Mounting degraded (on a pair-device raid1) would then of course 
work, since it would just use the one device=, but that's simply 
dangerous for routine use regardless of whether it actually assembled or 
not, thus effectively forcing an initr* for multi-device btrfs root in 
order to get it mounted properly.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20  8:34                   ` Tomasz Pala
  2017-12-20  8:51                     ` Tomasz Pala
@ 2017-12-20 19:49                     ` Chris Murphy
  1 sibling, 0 replies; 61+ messages in thread
From: Chris Murphy @ 2017-12-20 19:49 UTC (permalink / raw)
  To: Tomasz Pala; +Cc: Linux fs Btrfs

On Wed, Dec 20, 2017 at 1:34 AM, Tomasz Pala <gotar@polanet.pl> wrote:
> On Tue, Dec 19, 2017 at 16:59:39 -0700, Chris Murphy wrote:
>
>>> Sth like this? I got such problem a few months ago, my solution was
>>> accepted upstream:
>>> https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f
>>
>> I can't parse this commit. In particular I can't tell how long it
>> waits, or what triggers the end to waiting.
>
> The point is - it doesn't wait at all. Instead, every 'ready' btrfs
> device triggers event on all the pending devices. Consider 3-device
> filesystem consisting of /dev/sd[abd] with /dev/sdc being different,
> standalone btrfs:
>
> /dev/sda -> 'not ready'
> /dev/sdb -> 'not ready'
> /dev/sdc -> 'ready', triggers /dev/sda -> still 'not ready' and /dev/sdb -> still 'not ready'
> /dev/sdd -> kernel says 'ready', triggers /dev/sda -> 'ready' and /dev/sdb -> 'ready'
>
> This way all the parts of a volume are marked as ready, so systemd won't
> refuse mounting using legacy device nodes like /dev/sda.
>
>
> This particular solution depends on kernel returning 'btrfs ready',
> which would obviously not work for degraded arrays unless the btrfs.ko
> handles some 'missing' or 'mount_degraded' kernel cmdline options
> _before_ actually _trying_ to mount it with -o degraded.


The thing that evaluates a Btrfs volume's "readiness" is udev. The kernel
doesn't care; it still instantiates a volume UUID. And if you pass -o
degraded to a mount of a non-ready Btrfs volume, the kernel code will try
to mount that volume in degraded mode (assuming it passes tests for
the minimum number of devices, can find all the supers it needs, and
bootstrap the chunk tree, etc.).


If the udev rule were smarter, it could infer "non-ready" Btrfs volume
to mean it should wait (and complaining might be nice so we know why
it's waiting) for some period of time, and then, if it's still not
ready, try to mount with -o degraded. I don't know where teaching the
system about degraded attempts belongs - whether the udev rule can tell
systemd to add that mount option if a volume is still not ready, or if
systemd needs hard-coded understanding of this mount option for Btrfs.
There is no risk of using -o degraded on a Btrfs volume if it's
missing too many devices, such a degraded mount will simply fail.
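
For context, the rule in question (systemd's 64-btrfs.rules, quoted
roughly from memory, so treat it as a sketch) boils down to:

	SUBSYSTEM!="block", GOTO="btrfs_end"
	ACTION=="remove", GOTO="btrfs_end"
	ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"
	# ask the kernel whether this btrfs filesystem has all of its members
	IMPORT{builtin}="btrfs ready $devnode"
	# if not, mark the device as not ready so systemd defers the mount
	ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0"
	LABEL="btrfs_end"

so any waiting or degraded fallback would have to be taught either to
that udev builtin or to systemd itself.
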



> After such timeout, I'd like to tell the kernel: "no more devices, give
> me all the remaining btrfs volumes in degraded mode if possible". By
> "give me btrfs vulumes" I mean "mark them as 'ready'" so the udev could
> fire it's rules. And if there would be anything for udev to distinguish
> 'ready' from 'ready-degraded' one could easily compose some notification
> scripting on top of it, including sending e-mail to sysadmin.

I think the linguistics of "btrfs devices ready" is confusing because
what we really care about is whether the volume/array can be mounted
normally (not degraded). The BTRFS_IOC_DEVICES_READY ioctl is pointed
to any one of the volume's devices, and you get a pass/fail. If it
passes (ready), all other devices are present. If it fails (not
ready), one or more devices are missing. It's not necessary to hit
every device with this ioctl to understand what's going on.

If the question can be answered with: ready, ready-degraded - it's
highly likely that you always get ready-degraded as the answer for all
btrfs multiple-device volumes. So if udev were to get ready-degraded,
will it still wait to see if the state goes to ready? How long does it
wait? Seems like it still should wait 90 seconds. In which case it's
going to try to mount with -o degraded.

So I see zero advantage and multiple disadvantages to having the
kernel do a degradedness test well before the mount will be attempted.
I think this is asking for a race condition.



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20 16:53                   ` Andrei Borzenkov
  2017-12-20 16:57                     ` Austin S. Hemmelgarn
@ 2017-12-20 20:02                     ` Chris Murphy
  2017-12-20 20:07                       ` Chris Murphy
  1 sibling, 1 reply; 61+ messages in thread
From: Chris Murphy @ 2017-12-20 20:02 UTC (permalink / raw)
  To: Andrei Borzenkov; +Cc: Chris Murphy, Tomasz Pala, Linux fs Btrfs

On Wed, Dec 20, 2017 at 9:53 AM, Andrei Borzenkov <arvidjaar@gmail.com> wrote:
> 19.12.2017 22:47, Chris Murphy wrote:
>>
>>>
>>> BTW, doesn't SuSE use btrfs by default? Would you expect everyone using
>>> this distro to research every component used?
>>
>> As far as I'm aware, only Btrfs single device stuff is "supported".
>> The multiple device stuff is definitely not supported on openSUSE, but
>> I have no idea to what degree they support it with enterprise license,
>> no doubt that support must come with caveats.
>>
>
> I was rather surprised seeing RAID1 and RAID10 listed as supported in
> SLES 12.x release notes, especially as there is no support for
> multi-device btrfs in YaST and hence no way to even install on such
> filesystem.

Haha. OK well I'm at a loss then. And they use systemd which is going
to run into the udev rule that prevents systemd from even attempting
to mount rootfs if one or more devices are missing. So I don't know
how it really gets supported. At the dracut prompt, manually mount
using -o degraded to /sysroot and then exit? I guess?


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20 20:02                     ` Chris Murphy
@ 2017-12-20 20:07                       ` Chris Murphy
  2017-12-20 20:14                         ` Austin S. Hemmelgarn
  2017-12-21 11:49                         ` Andrei Borzenkov
  0 siblings, 2 replies; 61+ messages in thread
From: Chris Murphy @ 2017-12-20 20:07 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Andrei Borzenkov, Tomasz Pala, Linux fs Btrfs

On Wed, Dec 20, 2017 at 1:02 PM, Chris Murphy <lists@colorremedies.com> wrote:
> On Wed, Dec 20, 2017 at 9:53 AM, Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>> 19.12.2017 22:47, Chris Murphy wrote:
>>>
>>>>
>>>> BTW, doesn't SuSE use btrfs by default? Would you expect everyone using
>>>> this distro to research every component used?
>>>
>>> As far as I'm aware, only Btrfs single device stuff is "supported".
>>> The multiple device stuff is definitely not supported on openSUSE, but
>>> I have no idea to what degree they support it with enterprise license,
>>> no doubt that support must come with caveats.
>>>
>>
>> I was rather surprised seeing RAID1 and RAID10 listed as supported in
>> SLES 12.x release notes, especially as there is no support for
>> multi-device btrfs in YaST and hence no way to even install on such
>> filesystem.
>
> Haha. OK well I'm at a loss then. And they use systemd which is going
> to run into the udev rule that prevents systemd from even attempting
> to mount rootfs if one or more devices are missing. So I don't know
> how it really gets supported. At the dracut prompt, manually mount
> using -o degraded to /sysroot and then exit? I guess?


There is an irony here:

YaST doesn't have Btrfs raid1 or raid10 options; and also won't do
encrypted root with Btrfs either because YaST enforces LVM to do LUKS
encryption for some weird reason; and it also enforces NOT putting
Btrfs on LVM.

Meanwhile, Fedora/Red Hat's Anaconda installer has supported both of
these use cases for something like 5 years (does support Btrfs raid1
and raid10 layouts; and also supports Btrfs directly on dmcrypt
without LVM) - with the caveat that it enforces /boot to be on ext4.



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20 20:07                       ` Chris Murphy
@ 2017-12-20 20:14                         ` Austin S. Hemmelgarn
  2017-12-21  1:34                           ` Chris Murphy
  2017-12-21 11:49                         ` Andrei Borzenkov
  1 sibling, 1 reply; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-20 20:14 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Andrei Borzenkov, Tomasz Pala, Linux fs Btrfs

On 2017-12-20 15:07, Chris Murphy wrote:
> On Wed, Dec 20, 2017 at 1:02 PM, Chris Murphy <lists@colorremedies.com> wrote:
>> On Wed, Dec 20, 2017 at 9:53 AM, Andrei Borzenkov <arvidjaar@gmail.com> wrote:
>>> 19.12.2017 22:47, Chris Murphy wrote:
>>>>
>>>>>
>>>>> BTW, doesn't SuSE use btrfs by default? Would you expect everyone using
>>>>> this distro to research every component used?
>>>>
>>>> As far as I'm aware, only Btrfs single device stuff is "supported".
>>>> The multiple device stuff is definitely not supported on openSUSE, but
>>>> I have no idea to what degree they support it with enterprise license,
>>>> no doubt that support must come with caveats.
>>>>
>>>
>>> I was rather surprised seeing RAID1 and RAID10 listed as supported in
>>> SLES 12.x release notes, especially as there is no support for
>>> multi-device btrfs in YaST and hence no way to even install on such
>>> filesystem.
>>
>> Haha. OK well I'm at a loss then. And they use systemd which is going
>> to run into the udev rule that prevents systemd from even attempting
>> to mount rootfs if one or more devices are missing. So I don't know
>> how it really gets supported. At the dracut prompt, manually mount
>> using -o degraded to /sysroot and then exit? I guess?
> 
> 
> There is an irony here:
> 
> YaST doesn't have Btrfs raid1 or raid10 options; and also won't do
> encrypted root with Btrfs either because YaST enforces LVM to do LUKS
> encryption for some weird reason; and it also enforces NOT putting
> Btrfs on LVM.
The 'LUKS must use LVM' thing is likely historical.  The BCP for using 
LUKS is that it's at the bottom level (so you leak absolutely nothing 
about how your storage stack is structured), and if that's the case you 
need something on top to support separate filesystems, which up until 
BTRFS came around has solely been LVM.

The 'No BTRFS on LVM' thing is likely for sanity reasons.  Using BTRFS 
on SuSE means allocating /boot and swap, and the entire rest of the disk 
is BTRFS.  They only support a single PV or a single BTRFS volume at the 
bottom level per-disk for /.
> 
> Meanwhile, Fedora/Red Hat's Anaconda installer has supported both of
> these use cases for something like 5 years (does support Btrfs raid1
> and raid10 layouts; and also supports Btrfs directly on dmcrypt
> without LVM) - with the caveat that it enforces /boot to be on ext4.
And this caveat is because for some reason Fedora has chosen not to 
integrate BTRFS support into their version of GRUB.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20 20:14                         ` Austin S. Hemmelgarn
@ 2017-12-21  1:34                           ` Chris Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Chris Murphy @ 2017-12-21  1:34 UTC (permalink / raw)
  To: Austin S. Hemmelgarn
  Cc: Chris Murphy, Andrei Borzenkov, Tomasz Pala, Linux fs Btrfs

On Wed, Dec 20, 2017 at 1:14 PM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2017-12-20 15:07, Chris Murphy wrote:

>> There is an irony here:
>>
>> YaST doesn't have Btrfs raid1 or raid10 options; and also won't do
>> encrypted root with Btrfs either because YaST enforces LVM to do LUKS
>> encryption for some weird reason; and it also enforces NOT putting
>> Btrfs on LVM.
>
> The 'LUKS must use LVM' thing is likely historical.  The BCP for using LUKS
> is that it's at the bottom level (so you leak absolutely nothing about how
> your storage stack is structured), and if that's the case you need something
> on top to support separate filesystems, which up until BTRFS came around has
> solely been LVM.

*shrug* Anaconda has supported plain partition LUKS without
device-mapper for ext3/4 and XFS since forever, even before the
rewrite.


>> Meanwhile, Fedora/Red Hat's Anaconda installer has supported both of
>> these use cases for something like 5 years (does support Btrfs raid1
>> and raid10 layouts; and also supports Btrfs directly on dmcrypt
>> without LVM) - with the caveat that it enforces /boot to be on ext4.
>
> And this caveat is because for some reason Fedora has chosen not to
> integrate BTRFS support into their version of GRUB.

No. The Fedora patchset for upstream GRUB doesn't remove Btrfs
support. However, they don't use grub-mkconfig to rewrite the grub.cfg
when a new kernel is installed. Instead, they use an unrelated project
called grubby, which modifies the existing grub.cfg (and also supports
most all other configs like syslinux/extlinux, yaboot, uboot, lilo,
and others). And grubby gets confused [1] if the grub.cfg is on a
subvolume (other than ID 5). If the grub.cfg is in the ID 5 subvolume,
in a normal directory structure, it works fine.

Chris Murphy



[1] Gory details

The central part of the confusion appears to be this sequence of
comments in this insanely long bug:
https://bugzilla.redhat.com/show_bug.cgi?id=864198#c3
https://bugzilla.redhat.com/show_bug.cgi?id=864198#c5
https://bugzilla.redhat.com/show_bug.cgi?id=864198#c6
https://bugzilla.redhat.com/show_bug.cgi?id=864198#c7

The comments from Gene Czarcinski (now deceased, that's how old this
bug is) try to negotiate an understanding of the problem; he had a fix,
but it didn't meet some upstream grubby requirement, and so the patch
wasn't accepted. Grubby is sufficiently messy that near as I can tell
no other distribution uses it, and no one really cares to maintain it
until something in RHEL breaks and then *that* gets attention.

Upstream bug
https://github.com/rhboot/grubby/issues/22

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-19 20:47                 ` Austin S. Hemmelgarn
  2017-12-19 22:23                   ` Tomasz Pala
@ 2017-12-21 11:44                   ` Andrei Borzenkov
  2017-12-21 12:27                     ` Austin S. Hemmelgarn
  1 sibling, 1 reply; 61+ messages in thread
From: Andrei Borzenkov @ 2017-12-21 11:44 UTC (permalink / raw)
  To: Austin S. Hemmelgarn; +Cc: Tomasz Pala, Linux fs Btrfs

On Tue, Dec 19, 2017 at 11:47 PM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2017-12-19 15:41, Tomasz Pala wrote:
>>
>> On Tue, Dec 19, 2017 at 12:35:20 -0700, Chris Murphy wrote:
>>
>>> with a read only file system. Another reason is the kernel code and
>>> udev rule for device "readiness" means the volume is not "ready" until
>>> all member devices are present. And while the volume is not "ready"
>>> systemd will not even attempt to mount. Solving this requires kernel
>>> and udev work, or possibly a helper, to wait an appropriate amount of
>>
>>
>> Sth like this? I got such problem a few months ago, my solution was
>> accepted upstream:
>>
>> https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f
>>
>> Rationale is in referred ticket, udev would not support any more btrfs
>> logic, so unless btrfs handles this itself on kernel level (daemon?),
>> that is all that can be done.
>
> Or maybe systemd can quit trying to treat BTRFS like a volume manager (which
> it isn't) and just try to mount the requested filesystem with the requested
> options?

You can't mount a filesystem until a sufficient number of devices is
present, and not waiting (or at least attempting to wait) for them opens
you up to races on startup. So far systemd's position has been: it is up to
the filesystem to give it something to wait on. And while apparently
everyone agrees that the current "btrfs device ready" does not fit the
bill, it is the only thing we have.

This integration issue was so far silently ignored both by btrfs and
systemd developers.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20 20:07                       ` Chris Murphy
  2017-12-20 20:14                         ` Austin S. Hemmelgarn
@ 2017-12-21 11:49                         ` Andrei Borzenkov
  1 sibling, 0 replies; 61+ messages in thread
From: Andrei Borzenkov @ 2017-12-21 11:49 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Tomasz Pala, Linux fs Btrfs

On Wed, Dec 20, 2017 at 11:07 PM, Chris Murphy <lists@colorremedies.com> wrote:
>
> YaST doesn't have Btrfs raid1 or raid10 options; and also won't do
> encrypted root with Btrfs either because YaST enforces LVM to do LUKS
> encryption for some weird reason; and it also enforces NOT putting
> Btrfs on LVM.
>

That's incorrect - btrfs on LVM is the default on some SLES flavors and one
of the three standard proposals (where you do not need to go into expert
mode) - normal partitions, LVM, encrypted LVM - even on openSUSE.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-21 11:44                   ` Andrei Borzenkov
@ 2017-12-21 12:27                     ` Austin S. Hemmelgarn
  2017-12-22 16:05                       ` Tomasz Pala
  0 siblings, 1 reply; 61+ messages in thread
From: Austin S. Hemmelgarn @ 2017-12-21 12:27 UTC (permalink / raw)
  To: Andrei Borzenkov; +Cc: Tomasz Pala, Linux fs Btrfs

On 2017-12-21 06:44, Andrei Borzenkov wrote:
> On Tue, Dec 19, 2017 at 11:47 PM, Austin S. Hemmelgarn
> <ahferroin7@gmail.com> wrote:
>> On 2017-12-19 15:41, Tomasz Pala wrote:
>>>
>>> On Tue, Dec 19, 2017 at 12:35:20 -0700, Chris Murphy wrote:
>>>
>>>> with a read only file system. Another reason is the kernel code and
>>>> udev rule for device "readiness" means the volume is not "ready" until
>>>> all member devices are present. And while the volume is not "ready"
>>>> systemd will not even attempt to mount. Solving this requires kernel
>>>> and udev work, or possibly a helper, to wait an appropriate amount of
>>>
>>>
>>> Sth like this? I got such problem a few months ago, my solution was
>>> accepted upstream:
>>>
>>> https://github.com/systemd/systemd/commit/0e8856d25ab71764a279c2377ae593c0f2460d8f
>>>
>>> Rationale is in referred ticket, udev would not support any more btrfs
>>> logic, so unless btrfs handles this itself on kernel level (daemon?),
>>> that is all that can be done.
>>
>> Or maybe systemd can quit trying to treat BTRFS like a volume manager (which
>> it isn't) and just try to mount the requested filesystem with the requested
>> options?
> 
> You can't mount filesystem until sufficient number of devices are
> present and not waiting (at least attempting to wait) for them opens
> you to races on startup. So far systemd position was - it is up to
> filesystem to give it something to wait on. And while apparently
> everyone agrees that current "btrfs device ready" does not fit the
> bill, this is the only thing we have.
No, it isn't.  You can just make the damn mount call with the supplied 
options.  If it succeeds, the volume was ready; if it fails, it wasn't.  
It's that simple, and there's absolutely no reason that systemd can't 
just do that in a loop until it succeeds or a timeout is reached.  That 
isn't any more racy than waiting on them is (waiting on them to be ready 
and then mounting them is a TOCTOU race condition), and it doesn't have 
any of these issues with the volume being completely unusable in a 
degraded state.
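
In shell terms, the behaviour I'm describing is nothing more complicated
than this (a sketch, with made-up device and mount point names and a 90
second cap):

	#!/bin/sh
	# try the mount exactly as specified; retry once per second until it
	# succeeds or we give up after roughly 90 seconds
	tries=0
	until mount /dev/sda /mnt/data; do
		tries=$((tries + 1))
		[ "$tries" -ge 90 ] && exit 1
		sleep 1
	done
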

Also, it's not 'up to the filesystem', it's 'up to the underlying 
device'.  LUKS, LVM, MD, and everything else that's an actual device 
layer is what systemd waits on.  XFS, ext4, and any other filesystem 
except BTRFS (and possibly ZFS, but I'm not 100% sure about that) 
provides absolutely _NOTHING_ to wait on.  Systemd just chose to handle 
BTRFS like a device layer, and not a filesystem, so we have this crap to 
deal with, as well as the fact that it makes it impossible to manually 
mount a BTRFS volume with missing or failed devices in degraded mode 
under systemd (because it unmounts it damn near instantly because it 
somehow thinks it knows better than the user what the user wants to do).
> 
> This integration issue was so far silently ignored both by btrfs and
> systemd developers. 
It's been ignored by BTRFS devs because there is _nothing_ wrong on this 
side other than the naming choice for the ioctl.  Systemd is _THE ONLY_ 
init system which has this issue, every other one works just fine.

As far as the systemd side, I have no idea why they are ignoring it, 
though I suspect it's the usual spoiled brat mentality that seems to be 
present about everything that people complain about regarding systemd.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-21 12:27                     ` Austin S. Hemmelgarn
@ 2017-12-22 16:05                       ` Tomasz Pala
  2017-12-22 21:04                         ` Chris Murphy
  0 siblings, 1 reply; 61+ messages in thread
From: Tomasz Pala @ 2017-12-22 16:05 UTC (permalink / raw)
  To: Linux fs Btrfs

On Thu, Dec 21, 2017 at 07:27:23 -0500, Austin S. Hemmelgarn wrote:

> No, it isn't.  You can just make the damn mount call with the supplied 
> options.  If it succeeds, the volume was ready, if it fails, it wasn't, 
> it's that simple, and there's absolutely no reason that systemd can't 
> just do that in a loop until it succeeds or a timeout is reached.  That 

There is no such loop, so if the mount happened before all the required
devices showed up, it would either fail outright or, with 'degraded'
in fstab, just start degraded.

> any of these issues with the volume being completely unusable in a 
> degraded state.
> 
> Also, it's not 'up to the filesystem', it's 'up to the underlying 
> device'.  LUKS, LVM, MD, and everything else that's an actual device 
> layer is what systemd waits on.  XFS, ext4, and any other filesystem 
> except BTRFS (and possibly ZFS, but I'm not 100% sure about that) 
> provides absolutely _NOTHING_ to wait on.  Systemd just chose to handle 

You wait for all the devices to settle. One might have a dozen drives,
including some attached via network, and it might take time for them to
become available. Since systemd knows nothing about the underlying
components, it simply waits for the btrfs itself to announce it's ready.

> BTRFS like a device layer, and not a filesystem, so we have this crap to 

As btrfs handles multiple devices in its "lower part", it effectively is a
device layer. Mounting /dev/sda happens to mount various other /dev/sd*
devices that are _not_ explicitly exposed, so there is really no
alternative - except for the 'mount loop', which is a no-go.

> deal with, as well as the fact that it makes it impossible to manually 
> mount a BTRFS volume with missing or failed devices in degraded mode 
> under systemd (because it unmounts it damn near instantly because it 
> somehow thinks it knows better than the user what the user wants to do).

This seems to be some distro-specific misconfiguration; it didn't happen to
me on plain systemd/udev. What is the scenario to reproduce it?

>> This integration issue was so far silently ignored both by btrfs and
>> systemd developers. 
> It's been ignored by BTRFS devs because there is _nothing_ wrong on this 
> side other than the naming choice for the ioctl.  Systemd is _THE ONLY_ 
> init system which has this issue, every other one works just fine.

Not true - mounting btrfs without "btrfs device scan" doesn't work at
all without udev rules (which mimic the behaviour of that command). Let me
repeat the example from Dec 19th:

1. create 2-volume btrfs, e.g. /dev/sda and /dev/sdb,
2. reboot the system into clean state (init=/bin/sh), (or remove btrfs-scan tool),
3. try
mount /dev/sda /test - fails
mount /dev/sdb /test - works
4. reboot again and try in reversed order
mount /dev/sdb /test - fails
mount /dev/sda /test - works
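
For completeness, a sketch of the two workarounds that make both orders
above work - which is exactly the logic the udev rules replicate:

	# either register all btrfs members with the kernel first...
	btrfs device scan
	mount /dev/sda /test
	# ...or name every member explicitly on the mount itself
	mount -o device=/dev/sda,device=/dev/sdb /dev/sda /test
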

> As far as the systemd side, I have no idea why they are ignoring it, 
> though I suspect it's the usual spoiled brat mentality that seems to be 
> present about everything that people complain about regarding systemd.

Explanation above. This is the point where _you_ need to stop ignoring
the fact that you simply cannot just try mounting devices in a loop, as
this would render any NAS/FC/iSCSI-backed or more complicated systems
unusable, or hide problems in case of temporary connection issues.

systemd waits for the _underlying_ devices - unless btrfs exposes them as
a list of _actual_ devices to wait for, there is nothing systemd can do
except wait for btrfs itself.

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-22 16:05                       ` Tomasz Pala
@ 2017-12-22 21:04                         ` Chris Murphy
  2017-12-23  2:52                           ` Tomasz Pala
  0 siblings, 1 reply; 61+ messages in thread
From: Chris Murphy @ 2017-12-22 21:04 UTC (permalink / raw)
  To: Tomasz Pala; +Cc: Linux fs Btrfs

On Fri, Dec 22, 2017 at 9:05 AM, Tomasz Pala <gotar@polanet.pl> wrote:
> On Thu, Dec 21, 2017 at 07:27:23 -0500, Austin S. Hemmelgarn wrote:
>
>> Also, it's not 'up to the filesystem', it's 'up to the underlying
>> device'.  LUKS, LVM, MD, and everything else that's an actual device
>> layer is what systemd waits on.  XFS, ext4, and any other filesystem
>> except BTRFS (and possibly ZFS, but I'm not 100% sure about that)
>> provides absolutely _NOTHING_ to wait on.  Systemd just chose to handle
>
> You wait for all the devices to settle. One might have a dozen drives,
> including some attached via network, and it might take time for them to
> become available. Since systemd knows nothing about the underlying
> components, it simply waits for the btrfs itself to announce it's ready.


I'm pretty sure degraded boot timeout policy is handled by dracut. The
kernel doesn't just automatically assemble an md array as soon as it's
possible (degraded) and then switch to normal operation as other
devices appear. I have no idea how LVM manages the delay policy for
multiple devices.

I don't think the delay policy belongs in the kernel.

It's pie in the sky, and unicorns, but it sure would be nice to have
standardization rather than everyone rolling their own solution. The
Red Hat Stratis folks will need something to do this for their
solution so yet another one is about to be developed...


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-22 21:04                         ` Chris Murphy
@ 2017-12-23  2:52                           ` Tomasz Pala
  2017-12-23  5:40                             ` Duncan
  0 siblings, 1 reply; 61+ messages in thread
From: Tomasz Pala @ 2017-12-23  2:52 UTC (permalink / raw)
  To: Linux fs Btrfs

On Fri, Dec 22, 2017 at 14:04:43 -0700, Chris Murphy wrote:

> I'm pretty sure degraded boot timeout policy is handled by dracut. The

Well, last time I checked, dracut on a systemd system couldn't even
generate a systemd-less image.

> kernel doesn't just automatically assemble an md array as soon as it's
> possible (degraded) and then switch to normal operation as other

MD devices are explicitly listed in mdadm.conf (for mdadm --assemble
--scan), on the kernel command line, or in the metadata of autodetected partitions (type fd).

> devices appear. I have no idea how LVM manages the delay policy for
> multiple devices.

I *guess* it's not about waiting, but simply being executed after the
devices are ready.

And there is a VERY long history of various init systems having problems
booting systems using multi-layer setups (LVM/MD under or above LUKS,
not to mention remote ones that need networking to be set up).

All of this works reasonably well under systemd - except for btrfs,
which uses a single device node to match an entire group of devices. That is
convenient for a human (no need to switch between /dev/mdX and
/dev/sdX), but impossible for userspace tools to guess automatically.
There is only the probe ioctl, which doesn't handle degraded mode.

> I don't think the delay policy belongs in the kernel.

That is exactly why the systemd waits for appropriate udev state.

> It's pie in the sky, and unicorns, but it sure would be nice to have
> standardization rather than everyone rolling their own solution. The

There was a de facto standard, I think - expose the component devices or
require them to be specified. Apparently there is no such thing in btrfs, so
it must be handled in a btrfs-specific way.

Also note that MD can be assembled by the kernel itself, while btrfs cannot
(so initrd is required for rootfs).

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-20  0:08                     ` Chris Murphy
@ 2017-12-23  4:08                       ` Tomasz Pala
  2017-12-23  5:23                         ` Duncan
  0 siblings, 1 reply; 61+ messages in thread
From: Tomasz Pala @ 2017-12-23  4:08 UTC (permalink / raw)
  To: Linux fs Btrfs

On Tue, Dec 19, 2017 at 17:08:28 -0700, Chris Murphy wrote:

>>>> Now, if the current kernels won't toggle degraded RAID1 as ro, can I
>>>> safely add "degraded" to the mount options? My primary concern is the
>> [...]
> 
> Well it only does rw once, then the next degraded is ro - there are
> patches dealing with this better but I don't know the state. And
> there's no resync code that I'm aware of, absolutely it's not good
> enough to just kick off a full scrub - that has huge performance
> implications and I'd consider it a regression compared to
> functionality in LVM and mdadm RAID by default with the write intent
> bitmap.  Without some equivalent short cut, automatic degraded means a

I read about the 'scrub' all the time here, so let me ask this
directly, as this is also not documented clearly:

1. is the full scrub required after ANY desync? (like: degraded mount
followed by re-adding the old device)?

2. if the scrub is omitted - is it possible that btrfs returns invalid data (from the
desynced and re-added drive)?

3. is the scrub required to be scheduled on a regular basis? By 'required'
I mean by design/implementation issues/quirks, _not_ related to possible
hardware malfunctions.

-- 
Tomasz Pala <gotar@pld-linux.org>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-23  4:08                       ` Tomasz Pala
@ 2017-12-23  5:23                         ` Duncan
  0 siblings, 0 replies; 61+ messages in thread
From: Duncan @ 2017-12-23  5:23 UTC (permalink / raw)
  To: linux-btrfs

Tomasz Pala posted on Sat, 23 Dec 2017 05:08:16 +0100 as excerpted:

> On Tue, Dec 19, 2017 at 17:08:28 -0700, Chris Murphy wrote:
> 
>>>>> Now, if the current kernels won't toggle degraded RAID1 as ro, can I
>>>>> safely add "degraded" to the mount options? My primary concern is
>>>>> the
>>> [...]
>> 
>> Well it only does rw once, then the next degraded is ro - there are
>> patches dealing with this better but I don't know the state. And
>> there's no resync code that I'm aware of, absolutely it's not good
>> enough to just kick off a full scrub - that has huge performance
>> implications and I'd consider it a regression compared to functionality
>> in LVM and mdadm RAID by default with the write intent bitmap.  Without
>> some equivalent short cut, automatic degraded means a
> 
> I read about the 'scrub' all the time here, so let me ask this
> directly, as this is also not documented clearly:
> 
> 1. is the full scrub required after ANY desync? (like: degraded mount
> followed by re-adding the old device)?

It is very strongly recommended.

> 2. if the scrub is omitted - is it possible that btrfs returns invalid
> data (from the desynced and re-added drive)?

Were invalid data returned it would be a bug.  However, a reasonably 
common refrain here is that btrfs is "still stabilizing, not yet fully 
stable and mature", so occasional bugs can be expected, tho both the 
ideal and experience suggest that they're gradually reducing in 
frequency and severity as time goes on and we get closer to "fully stable 
and mature".

Which of course is why both having usable and tested backups, and keeping 
current with the kernel, are strongly recommended as well, the first in 
case one of those bugs does hit and it's severe enough to take out your 
working btrfs, the second because later kernels have fewer known bugs in 
the first place.

Functioning as designed and as intent-coded, in the case of a desync, 
btrfs will use the copy with the latest generation/transid serial, and 
thus should never return older data from the desynced device.  Further, 
btrfs is designed to be self-healing and will transparently rewrite the 
out-of-sync copy, syncing it in the process, as it comes across each 
stale block.

But the only way to be sure everything's consistent again is that scrub, 
and of course if something should happen to the only current copy while 
the desync still has the other copy stale, /then/ you lose data.

And as I said, that's functioning as designed and intent-coded, assuming 
no bugs, an explicitly unsafe assumption given btrfs' "still stabilizing" 
state.

So... "strongly recommended" indeed, tho in theory it shouldn't be 
absolutely required as long as unlucky fate doesn't strike before the 
data is transparently synced in normal usage.  YMMV, but I definitely do 
those scrubs here.
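
FWIW, "those scrubs" are nothing fancy, just something along these lines
(with /mnt standing in for the actual mountpoint):

	# re-sync and verify both copies after running degraded; -B stays in
	# the foreground, -d prints per-device statistics when it finishes
	btrfs scrub start -Bd /mnt
	# and check the cumulative error counters afterward
	btrfs device stats /mnt
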

> 3. is the scrub required to be scheduled on a regular basis? By 'required'
> I mean by design/implementation issues/quirks, _not_ related to possible
> hardware malfunctions.

Perhaps I'm tempting fate, but I don't do scheduled/regular scrubs here.  
Only if I have an ungraceful shutdown or see complaints in the log (which 
I tail to a system status dashboard so I'd be likely to notice a problem 
one way or the other pretty quickly).

But I do keep those backups, and while it has been quite some time (over 
a year, I'd say about 18 months to two years, and I was actually able to 
use btrfs restore and avoid having to use the backups themselves the last 
time it happened even 18 months or whatever ago) now since I had to use 
them, I /did/ actually spend some significant money upgrading my backups 
to all-SSD in order to make updating those backups easier and encourage 
me to keep them much more current than I had been (btrfs restore saved me 
more trouble than I'm comfortable admitting, given that I /did/ have 
backups, but they weren't the freshest at the time).

If as some people I had my backups offsite and would have to download 
them if I actually needed them, I'd potentially be rather stricter and 
schedule regular scrubs.

So by design and intention-coding, no, regularly scheduled scrubs aren't 
"required".  But I'd treat them the same as I would on non-btrfs raid, or 
a bit stricter given the above discussed btrfs stability status.  If 
you'd be uncomfortable not scheduling regular scrubs on your non-btrfs 
raid, you better be uncomfortable not scheduling them on btrfs as well!

And as always, btrfs or no btrfs, scrub or no scrub, have your backups or 
you are literally defining your data as not worth the time/trouble/
resources necessary to do them, and some day, maybe 10 minutes from now, 
maybe 10 years from now, fate's going to call you on that definition!

(Yes, I know /you/ know that or we'd not have this thread, which 
demonstrates that you /do/ care about your data.  But it's as much about 
the lurkers and googlers coming across the thread later as it is the 
direct participants...)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-23  2:52                           ` Tomasz Pala
@ 2017-12-23  5:40                             ` Duncan
  0 siblings, 0 replies; 61+ messages in thread
From: Duncan @ 2017-12-23  5:40 UTC (permalink / raw)
  To: linux-btrfs

Tomasz Pala posted on Sat, 23 Dec 2017 03:52:47 +0100 as excerpted:

> On Fri, Dec 22, 2017 at 14:04:43 -0700, Chris Murphy wrote:
> 
>> I'm pretty sure degraded boot timeout policy is handled by dracut. The
> 
> Well, last time I've checked dracut on systemd-system couldn't even
> generate systemd-less image.

??

Unless it changed recently (I /chose/ a systemd-based dracut setup here, 
so I'd not be aware if it did), dracut can indeed do systemd-less initr* 
images.  Dracut is modular, and systemd is one of the modules, enabled by 
default on a systemd system, but not required, as I know because I had 
dracut set up without the systemd module for some time after I switched to 
systemd for my main sysinit, and I verified it didn't install systemd in 
the initr* until I activated the systemd module.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: Unexpected raid1 behaviour
  2017-12-18 13:31 ` Austin S. Hemmelgarn
@ 2018-01-12 12:26   ` Dark Penguin
  0 siblings, 0 replies; 61+ messages in thread
From: Dark Penguin @ 2018-01-12 12:26 UTC (permalink / raw)
  To: Austin S. Hemmelgarn, linux-btrfs



On 18/12/17 16:31, Austin S. Hemmelgarn wrote:
> On 2017-12-16 14:50, Dark Penguin wrote:
>> Could someone please point me towards some read about how btrfs handles
>> multiple devices? Namely, kicking faulty devices and re-adding them.
>>
>> I've been using btrfs on single devices for a while, but now I want to
>> start using it in raid1 mode. I booted into an Ubuntu 17.10 LiveCD and
>> tried to see how does it handle various situations. The experience left
>> me very surprised; I've tried a number of things, all of which produced
>> unexpected results.
> Expounding a bit on Duncan's answer with some more specific info.
>>
>> I create a btrfs raid1 filesystem on two hard drives and mount it.
>>
>> - When I pull one of the drives out (simulating a simple cable failure,
>> which happens pretty often to me), the filesystem sometimes goes
>> read-only. ???
>> - But only after a while, and not always. ???
> The filesystem won't go read-only until it hits an I/O error, and it's
> non-deterministic how long it will be before that happens on an idle
> filesystem that only sees read access (because if all the files that are
> being read are in the page cache).
>> - When I fix the cable problem (plug the device back), it's immediately
>> "re-added" back. But I see no replication of the data I've written onto
>> a degraded filesystem... Nothing shows any problems, so "my filesystem
>> must be ok". ???
> One of two things happens in this case, and why there is no re-sync is
> dependent on which happens, but both ultimately have to do with the fact
> that BTRFS assumes I/O errors are from device failures, and are at worst
> transient.  Either:
>
> 1. The device reappears with the same name. This happens if the time it
> was disconnected is less than the kernel's command timeout (30 seconds
> by default).  In this case, BTRFS may not even notice that the device
> was gone (and if it doesn't, then a re-sync isn't necessary, since it
> will retry all the writes it needs to).  In this case, BTRFS assumes the
> I/O errors were temporary, and keeps using the device after logging the
> errors.  If this happens, then you need to manually re-sync things by
> scrubbing the filesystem (or balancing, but scrubbing is preferred as it
> should run quicker and will only re-write what is actually needed).
> 2. The device reappears with a different name.  In this case, the device
> was gone long enough that the block layer is certain it was
> disconnected, and thus when it reappears and BTRFS still holds open
> references to the old device node, it gets a new device node.  In this
> case, if the 'new' device is scanned, BTRFS will recognize it as part of
> the FS, but will keep using the old device node.  The correct fix here
> is to unmount the filesystem, re-scan all devices, and then remount the
> filesystem and manually re-sync with a scrub.
>
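
(So, if I follow, the manual recovery in the two cases would look
roughly like this; /mnt/data and /dev/sdb stand in for my mountpoint
and the reappeared device:

    # case 1: device came back under the same name -> just re-sync
    btrfs scrub start -Bd /mnt/data

    # case 2: device came back under a new name
    umount /mnt/data
    btrfs device scan
    mount /dev/sdb /mnt/data
    btrfs scrub start -Bd /mnt/data

...and the scrub rewrites whatever is missing or stale on the
reconnected device.)
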
>> - If I unmount the filesystem and then mount it back, I see all my
>> recent changes lost (everything I wrote during the "degraded" period).
> I'm not quite sure about this, but I think BTRFS is rolling back to the
> last common generation number for some reason.
>
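
(Presumably one could check whether the two copies have diverged by
comparing the generation field in each device's superblock, something
like the following, with /dev/sdb and /dev/sdc standing in for the two
members:

    btrfs inspect-internal dump-super /dev/sdb | grep -w generation
    btrfs inspect-internal dump-super /dev/sdc | grep -w generation
)
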
>> - If I continue working with a degraded raid1 filesystem (even without
>> damaging it further by re-adding the faulty device), after a while it
>> won't mount at all, even with "-o degraded".
> This is (probably) a known bug relating to chunk handling.  In a two
> device volume using a raid1 profile with a missing device, older kernels
> (I don't remember when the fix went in, but I could have sworn it was in
> 4.13) will (erroneously) generate single-profile chunks when they need
> to allocate new chunks.  When you then go to mount the filesystem, the
> degraded-mountability check fails because the FS has both a missing
> device and single-profile chunks.
>
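
(To see whether such single-profile chunks have appeared, something
like this should list the allocated profiles; /mnt/data is a
placeholder:

    btrfs filesystem df /mnt/data
    # or, with reasonably recent progs:
    btrfs filesystem usage /mnt/data
)
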
> Now, even without that bug, it's never a good idea to run a storage
> array degraded for any extended period of time, regardless of what type
> of array it is (BTRFS, ZFS, MD, LVM, or even hardware RAID).  By keeping
> it in 'degraded' mode, you're essentially telling the system that the
> array will be fixed in a reasonably short time-frame, which impacts how
> it handles the array.  If you're not going to fix it almost immediately,
> you should almost always reshape the array to account for the missing
> device if at all possible, as that will improve relative data safety and
> generally get you better performance than running degraded will.
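
(If I read this right, the "reshape" for a two-device raid1 that lost a
member would be a conversion plus dropping the missing device, roughly:

    mount -o degraded /dev/sdb /mnt/data
    btrfs balance start -f -dconvert=single -mconvert=dup /mnt/data
    btrfs device remove missing /mnt/data

Device and mountpoint names are placeholders; -f allows the metadata
redundancy to be reduced during the conversion.)
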
>>
>> I can't wrap my head about all this. Either the kicked device should not
>> be re-added, or it should be re-added "properly", or it should at least
>> show some errors and not pretend nothing happened, right?..
> BTRFS is not the best at error reporting at the moment.  If you check
> the output of `btrfs device stats` for that filesystem though, it should
> show non-zero values in the error counters (note that these counters are
> cumulative, so they are counts since the last time they were reset, or
> since the FS was created if they have never been reset).  Similarly,
> scrub should report errors, there should be error messages in the kernel
> log, and switching the FS to read-only mode _is_ technically reporting
> an error, as that's standard error behavior for most sensible
> filesystems (ext[234] being the notable exception, they just continue as
> if nothing happened).
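
(For the record, the counters can be read and reset like this, with
/mnt/data as a placeholder:

    # show cumulative per-device error counters
    btrfs device stats /mnt/data
    # reset them after dealing with the cause
    btrfs device stats -z /mnt/data
)
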
>>
>> I must be missing something. Is there an explanation somewhere about
>> what's really going on during those situations? Also, do I understand
>> correctly that upon detecting a faulty device (a write error), nothing
>> is done about it except logging an error into the 'btrfs device stats'
>> report? No device kicking, no notification?.. And what about degraded
>> filesystems - is it absolutely forbidden to work with them without
>> converting them to a "single" filesystem first?..
> As mentioned above, going read-only _is_ a notification that something
> is wrong.  Translating that (and the error counter increase, and the
> kernel log messages) into a user visible notification is not really the
> job of BTRFS, especially considering that no other filesystem or device
> manager does so either (yes, you can get nice notifications from LVM,
> but they aren't _from_ LVM itself, they're from other software that
> watches for errors, and the same type of software works just fine for
> BTRFS too).  If you're this worried about it and don't want to keep on
> top of it yourself by monitoring things manually, you really need to
> look into a tool like monit [1] that can handle this for you.
>
>
> [1] https://mmonit.com/monit/


Thank you! That was a really detailed explanation!

I had been using MD for a long time, so I was expecting much the same
behaviour - refusing to re-add a failed device without resyncing,
kicking faulty devices from the array, sending email warnings, being
able to use the array in degraded mode without problems (in the RAID1
case) and so on. But I guess a few things are different in the btrfs
mindset. It behaves more like a filesystem, so it doesn't enforce data
integrity for you; noticing errors and fixing them is up to you, as
with any normal filesystem.

The test I did was a "try to break btrfs and see if it survives" test,
which mdadm would have passed (probably), but now I understand that
btrfs was not made for that. However, with some error-reporting tools,
it's probably possible to make it reasonably reliable.
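
Even something as simple as a cron job that checks the error counters
would probably do for a start (a rough, untested sketch; the mountpoint
and mail recipient are placeholders):

    #!/bin/sh
    # warn if any btrfs error counter on /mnt/data is non-zero
    STATS="$(btrfs device stats /mnt/data)"
    if echo "$STATS" | grep -vq ' 0$'; then
        echo "$STATS" | mail -s "btrfs errors on $(hostname)" root
    fi

or the same check wired into monit as a "check program" entry.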


-- 
darkpenguin


^ permalink raw reply	[flat|nested] 61+ messages in thread

Thread overview: 61+ messages
2017-12-16 19:50 Unexpected raid1 behaviour Dark Penguin
2017-12-17 11:58 ` Duncan
2017-12-17 15:48   ` Peter Grandi
2017-12-17 20:42     ` Chris Murphy
2017-12-18  8:49       ` Anand Jain
2017-12-18  8:49     ` Anand Jain
2017-12-18 10:36       ` Peter Grandi
2017-12-18 12:10       ` Nikolay Borisov
2017-12-18 13:43         ` Anand Jain
2017-12-18 22:28       ` Chris Murphy
2017-12-18 22:29         ` Chris Murphy
2017-12-19 12:30         ` Adam Borowski
2017-12-19 12:54         ` Andrei Borzenkov
2017-12-19 12:59         ` Peter Grandi
2017-12-18 13:06     ` Austin S. Hemmelgarn
2017-12-18 19:43       ` Tomasz Pala
2017-12-18 22:01         ` Peter Grandi
2017-12-19 12:46           ` Austin S. Hemmelgarn
2017-12-19 12:25         ` Austin S. Hemmelgarn
2017-12-19 14:46           ` Tomasz Pala
2017-12-19 16:35             ` Austin S. Hemmelgarn
2017-12-19 17:56               ` Tomasz Pala
2017-12-19 19:47                 ` Chris Murphy
2017-12-19 21:17                   ` Tomasz Pala
2017-12-20  0:08                     ` Chris Murphy
2017-12-23  4:08                       ` Tomasz Pala
2017-12-23  5:23                         ` Duncan
2017-12-20 16:53                   ` Andrei Borzenkov
2017-12-20 16:57                     ` Austin S. Hemmelgarn
2017-12-20 20:02                     ` Chris Murphy
2017-12-20 20:07                       ` Chris Murphy
2017-12-20 20:14                         ` Austin S. Hemmelgarn
2017-12-21  1:34                           ` Chris Murphy
2017-12-21 11:49                         ` Andrei Borzenkov
2017-12-19 20:11                 ` Austin S. Hemmelgarn
2017-12-19 21:58                   ` Tomasz Pala
2017-12-20 13:10                     ` Austin S. Hemmelgarn
2017-12-19 23:53                   ` Chris Murphy
2017-12-20 13:12                     ` Austin S. Hemmelgarn
2017-12-19 18:31             ` George Mitchell
2017-12-19 20:28               ` Tomasz Pala
2017-12-19 19:35             ` Chris Murphy
2017-12-19 20:41               ` Tomasz Pala
2017-12-19 20:47                 ` Austin S. Hemmelgarn
2017-12-19 22:23                   ` Tomasz Pala
2017-12-20 13:33                     ` Austin S. Hemmelgarn
2017-12-20 17:28                       ` Duncan
2017-12-21 11:44                   ` Andrei Borzenkov
2017-12-21 12:27                     ` Austin S. Hemmelgarn
2017-12-22 16:05                       ` Tomasz Pala
2017-12-22 21:04                         ` Chris Murphy
2017-12-23  2:52                           ` Tomasz Pala
2017-12-23  5:40                             ` Duncan
2017-12-19 23:59                 ` Chris Murphy
2017-12-20  8:34                   ` Tomasz Pala
2017-12-20  8:51                     ` Tomasz Pala
2017-12-20 19:49                     ` Chris Murphy
2017-12-18  5:11   ` Anand Jain
2017-12-18  1:20 ` Qu Wenruo
2017-12-18 13:31 ` Austin S. Hemmelgarn
2018-01-12 12:26   ` Dark Penguin
