* btrfs check of a raid0?
From: Marc MERLIN @ 2018-07-01 16:27 UTC
  To: linux-btrfs

Howdy,

I have a btrfs filesystem made out of 2 devices:
[   75.141414] BTRFS: device label btrfs_space devid 1 transid 429220 /dev/bcache3
[   75.164745] BTRFS: device label btrfs_space devid 2 transid 429220 /dev/bcache2

One of the 2 devices had a hardware error (not btrfs' fault):
[201504.939659] BTRFS error (device bcache3): bdev /dev/bcache2 errs: wr 552, rd 39, flush 1, corrupt 0, gen 0
[201504.995967] BTRFS warning (device bcache3): bcache3 checksum verify failed on 399998976 wanted F3019EEA found E6A97DC4 level 0
[201505.032209] BTRFS error (device bcache3): bdev /dev/bcache2 errs: wr 552, rd 40, flush 1, corrupt 0, gen 0
[201505.062447] BTRFS error (device bcache3): parent transid verify failed on 399998976 wanted 434763 found 434245
[201600.262142] BTRFS error (device bcache3): bdev /dev/bcache2 errs: wr 552, rd 41, flush 1, corrupt 0, gen 0
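
For reference, the per-device counters in the "errs:" lines above can be pulled out mechanically. This is only a minimal Python sketch, assuming the kernel log format matches the lines quoted here; `parse_errs` and `latest_counters` are hypothetical helper names, not part of btrfs-progs.

```python
import re

# Matches the counter line btrfs logs for a device, for example:
# "BTRFS error (device bcache3): bdev /dev/bcache2 errs: wr 552, rd 39, flush 1, corrupt 0, gen 0"
# Assumption: the format is exactly as in the log excerpt above.
ERRS_RE = re.compile(
    r"bdev (?P<bdev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_errs(line):
    """Return (device, counters) for an 'errs:' line, or None if it does not match."""
    m = ERRS_RE.search(line)
    if m is None:
        return None
    counters = {k: int(v) for k, v in m.groupdict().items() if k != "bdev"}
    return m.group("bdev"), counters


def latest_counters(lines):
    """Keep the most recent counters seen for each device, in log order."""
    latest = {}
    for line in lines:
        parsed = parse_errs(line)
        if parsed is not None:
            bdev, counters = parsed
            latest[bdev] = counters
    return latest
```

Fed the three "errs:" lines above, `latest_counters` would keep only the last set of counters for /dev/bcache2, which makes it easy to see that only the read counter is still moving.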

I unmounted it, and I'm trying to check the filesystem now.

How is it supposed to work when you have multiple devices for a btrfs
filesystem?

gargamel:~# btrfs check --repair -p /dev/bcache2 
enabling repair mode
ERROR: mount check: cannot open /dev/bcache2: No such device or address
ERROR: could not check mount status: No such device or address
gargamel:~# btrfs check --repair -p /dev/bcache3
enabling repair mode
ERROR: cannot open device '/dev/bcache3': Device or resource busy
ERROR: cannot open file system

[205248.299528] BTRFS info (device bcache3): disk space caching is enabled
[205248.320335] BTRFS error (device bcache3): Remounting read-write after error is not allowed

Yes, rebooting would likely get around the problem, but I'd rather not
reboot: I have long-running jobs that I would rather not stop.

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

* Re: btrfs check of a raid0?
From: Chris Murphy @ 2018-07-01 19:15 UTC
  To: Marc MERLIN; +Cc: Btrfs BTRFS

On Sun, Jul 1, 2018 at 10:27 AM, Marc MERLIN <marc@merlins.org> wrote:
> Howdy,
>
> I have a btrfs filesystem made out of 2 devices:
> [   75.141414] BTRFS: device label btrfs_space devid 1 transid 429220 /dev/bcache3
> [   75.164745] BTRFS: device label btrfs_space devid 2 transid 429220 /dev/bcache2
>
> One of the 2 devices had a hardware error (not btrfs' fault):
> [201504.939659] BTRFS error (device bcache3): bdev /dev/bcache2 errs: wr 552, rd 39, flush 1, corrupt 0, gen 0
> [201504.995967] BTRFS warning (device bcache3): bcache3 checksum verify failed on 399998976 wanted F3019EEA found E6A97DC4 level 0
> [201505.032209] BTRFS error (device bcache3): bdev /dev/bcache2 errs: wr 552, rd 40, flush 1, corrupt 0, gen 0
> [201505.062447] BTRFS error (device bcache3): parent transid verify failed on 399998976 wanted 434763 found 434245
> [201600.262142] BTRFS error (device bcache3): bdev /dev/bcache2 errs: wr 552, rd 41, flush 1, corrupt 0, gen 0
>
> I unmounted it, and I'm trying to check the filesystem now.

Is it raid0 metadata? Because if it's raid1 metadata it should have
passively recovered from a good copy and then fixed the bad copy.


>
> How is it supposed to work when you have multiple devices for a btrfs
> filesystem?
>
> gargamel:~# btrfs check --repair -p /dev/bcache2
> enabling repair mode
> ERROR: mount check: cannot open /dev/bcache2: No such device or address
> ERROR: could not check mount status: No such device or address
> gargamel:~# btrfs check --repair -p /dev/bcache3
> enabling repair mode
> ERROR: cannot open device '/dev/bcache3': Device or resource busy
> ERROR: cannot open file system
>
> [205248.299528] BTRFS info (device bcache3): disk space caching is enabled
> [205248.320335] BTRFS error (device bcache3): Remounting read-write after error is not allowed

If it's successfully unmounted, I don't understand the error messages
that it can't be opened. Is umount hung? Sounds to me like btrfs check
thinks it's still mounted.
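
One way to test the "still mounted" hypothesis is to look for each member device in /proc/mounts. A multi-device btrfs is usually listed under only one of its devices, so it seems worth checking all of them. The following is a minimal Python sketch under that assumption; `mounted_members` is a hypothetical helper name, and a busy device can also be held open by something other than a mount, which this does not detect.

```python
def mounted_members(mounts_text, devices):
    """Return {device: [mount points]} for each given device that appears
    as a mount source in /proc/mounts-style text.

    Each line is expected to start with: source, mount point, fstype, ...
    """
    wanted = set(devices)
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        source, target = fields[0], fields[1]
        if source in wanted:
            found.setdefault(source, []).append(target)
    return found
```

On a live system the input would be the contents of /proc/mounts and the device list would be /dev/bcache2 and /dev/bcache3; an empty result would suggest that the "busy" error comes from some other holder of the device.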



-- 
Chris Murphy

* Re: btrfs check of a raid0?
From: Marc MERLIN @ 2018-07-01 23:03 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Sun, Jul 01, 2018 at 01:15:09PM -0600, Chris Murphy wrote:
> > How is it supposed to work when you have multiple devices for a btrfs
> > filesystem?
> >
> > gargamel:~# btrfs check --repair -p /dev/bcache2
> > enabling repair mode
> > ERROR: mount check: cannot open /dev/bcache2: No such device or address
> > ERROR: could not check mount status: No such device or address
> > gargamel:~# btrfs check --repair -p /dev/bcache3
> > enabling repair mode
> > ERROR: cannot open device '/dev/bcache3': Device or resource busy
> > ERROR: cannot open file system
> >
> > [205248.299528] BTRFS info (device bcache3): disk space caching is enabled
> > [205248.320335] BTRFS error (device bcache3): Remounting read-write after error is not allowed
> 
> If it's successfully unmounted, I don't understand the error messages
> that it can't be opened. Is umount hung? Sounds to me like btrfs check
> thinks it's still mounted.

I spent more time on this. Apparently, because the underlying device
had a hardware fault (it fell off the bus), its dmcrypt device is still
there but no longer working.
In turn, I can't dmsetup rm it because it's still in use by bcache, which
never released it, and bcache won't let me release it because the device
got removed.
So I'm stuck with a reboot in the end, oh well...
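
The "in use by" relationship between stacked block devices is visible in sysfs, where each block device has a holders directory naming the devices layered on top of it. This is a minimal Python sketch for listing them; `list_holders` is a hypothetical helper, and the device name passed in (something like "dm-0") is an assumption, since the thread does not name the dmcrypt device.

```python
import os


def list_holders(dev_name, sysfs_root="/sys/block"):
    """Return the sorted names of devices stacked on top of dev_name.

    Reads <sysfs_root>/<dev_name>/holders; returns an empty list if the
    device or its holders directory does not exist.
    """
    holders_dir = os.path.join(sysfs_root, dev_name, "holders")
    try:
        return sorted(os.listdir(holders_dir))
    except (FileNotFoundError, NotADirectoryError):
        return []
```

A non-empty result for the dmcrypt device would be consistent with bcache still holding it, which is the situation that blocks the removal described above.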

Marc
