* Issues with FS going read-only and bad drive
@ 2020-01-16  3:39 Sabrina Cathey
  0 siblings, 0 replies; only message in thread
From: Sabrina Cathey @ 2020-01-16  3:39 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 2494 bytes --]

Sent this earlier but it hasn't shown up on the mailing list.  I know
greylisting adds a delay, but usually not 30 minutes.

----

Required information up front:

uname -a;btrfs --version;btrfs fi show;btrfs fi df /shizzle/
Linux babel.thegnomedev.com 5.3.8-arch1-1 #1 SMP PREEMPT @1572357769
x86_64 GNU/Linux
btrfs-progs v5.3.1
Label: 'shizzle'  uuid: 92b267f2-c8af-40eb-b433-e53e140ebd01
Total devices 10 FS bytes used 34.18TiB
devid    2 size 5.46TiB used 4.28TiB path /dev/sdb1
devid    3 size 5.46TiB used 4.28TiB path /dev/sdg1
devid    4 size 5.46TiB used 4.28TiB path /dev/sdh1
devid    5 size 5.46TiB used 4.28TiB path /dev/sdi1
devid    6 size 5.46TiB used 4.28TiB path /dev/sdj1
devid    7 size 5.46TiB used 4.28TiB path /dev/sdf1
devid    8 size 5.46TiB used 4.28TiB path /dev/sda1
devid    9 size 5.46TiB used 4.28TiB path /dev/sdd1
devid   10 size 5.46TiB used 4.28TiB path /dev/sde1
devid   11 size 5.46TiB used 4.28TiB path /dev/sdc1

Data, RAID6: total=34.18TiB, used=34.13TiB
System, RAID6: total=256.00MiB, used=1.73MiB
Metadata, RAID6: total=60.00GiB, used=54.65GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

----

The dmesg output is over 100k and my understanding is that the list has
a size limit, so here is a pastebin: https://pastebin.com/d4BPRS6m

----

The story is that I found the server unresponsive, and when I rebooted
I ended up seeing that a disk was missing: https://i.imgur.com/iLgnNBM.jpg

I mucked about trying to figure out what to do.  I ended up rebooting
again to see if I could spot an issue in the drive controller BIOS, and
when I got back into the OS things seemed okay at first.  The
filesystem was mounted and looked fine, but then I noticed "parent
transid verify failed" errors in dmesg.
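The lines below are a reconstruction from memory (the real ones are in
the pastebin), but this is roughly how I counted them in the full dump:

```shell
# Sample of the kind of line I'm seeing (reconstructed, not verbatim),
# piped through the grep I used to count occurrences in the full dmesg:
printf 'BTRFS error (device sde1): parent transid verify failed on 123456 wanted 99887 found 99885\n' \
  | grep -c 'parent transid verify failed'
```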

It's late and I was grasping at straws with random googling.  I tried a
scrub; it failed and the filesystem went read-only.  I retried a few
times, because insanity.

I tried btrfsck (the default, non-destructive check) and it also bailed
out: https://i.imgur.com/ZEq0RjU.jpg

Looking at btrfs device stats, it looks like one of the devices
(/dev/sde) is bad - probably the one that was found missing initially.
I'm attaching the output of that command.  I'm way out of my depth
here; my thought is to use btrfs device delete /dev/sde1.
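To be explicit about what I'm contemplating (NOT run yet - the device
and mount point are the ones from my setup above, and I'd like
confirmation before touching anything):

```shell
# Sketch only: print the command instead of running it until someone
# confirms this is the right move for a failing RAID6 member.
dev=/dev/sde1
mnt=/shizzle
echo "would run: btrfs device delete $dev $mnt"
```

I've also read that btrfs replace is often preferred over delete+add
for a failing device, but I'm not sure whether that applies when the
device is still semi-present like this.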

Please can you help me not lose my data?  With this much data, I have
yet to invest in another set of disks for backup (I know RAID isn't a
backup and I should have one).

Any help would be most appreciated.

Thanks

Sabrina

[-- Attachment #2: btrfs.device.stats.shizzle.txt --]
[-- Type: text/plain, Size: 1558 bytes --]

[/dev/sdb1].write_io_errs    0
[/dev/sdb1].read_io_errs     0
[/dev/sdb1].flush_io_errs    0
[/dev/sdb1].corruption_errs  2
[/dev/sdb1].generation_errs  0
[/dev/sdg1].write_io_errs    0
[/dev/sdg1].read_io_errs     0
[/dev/sdg1].flush_io_errs    0
[/dev/sdg1].corruption_errs  0
[/dev/sdg1].generation_errs  0
[/dev/sdh1].write_io_errs    0
[/dev/sdh1].read_io_errs     0
[/dev/sdh1].flush_io_errs    0
[/dev/sdh1].corruption_errs  0
[/dev/sdh1].generation_errs  0
[/dev/sdi1].write_io_errs    0
[/dev/sdi1].read_io_errs     0
[/dev/sdi1].flush_io_errs    0
[/dev/sdi1].corruption_errs  4
[/dev/sdi1].generation_errs  0
[/dev/sdj1].write_io_errs    0
[/dev/sdj1].read_io_errs     0
[/dev/sdj1].flush_io_errs    0
[/dev/sdj1].corruption_errs  3
[/dev/sdj1].generation_errs  0
[/dev/sdf1].write_io_errs    0
[/dev/sdf1].read_io_errs     0
[/dev/sdf1].flush_io_errs    0
[/dev/sdf1].corruption_errs  0
[/dev/sdf1].generation_errs  0
[/dev/sda1].write_io_errs    0
[/dev/sda1].read_io_errs     0
[/dev/sda1].flush_io_errs    0
[/dev/sda1].corruption_errs  0
[/dev/sda1].generation_errs  0
[/dev/sdd1].write_io_errs    0
[/dev/sdd1].read_io_errs     0
[/dev/sdd1].flush_io_errs    0
[/dev/sdd1].corruption_errs  0
[/dev/sdd1].generation_errs  0
[/dev/sde1].write_io_errs    6075
[/dev/sde1].read_io_errs     5965
[/dev/sde1].flush_io_errs    184
[/dev/sde1].corruption_errs  0
[/dev/sde1].generation_errs  0
[/dev/sdc1].write_io_errs    0
[/dev/sdc1].read_io_errs     0
[/dev/sdc1].flush_io_errs    0
[/dev/sdc1].corruption_errs  0
[/dev/sdc1].generation_errs  0
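For anyone skimming the attachment, this is the filter I used to pull
out just the nonzero counters (two sample lines inlined here so the
snippet stands alone; I ran it against the attached file):

```shell
# Keep only stats lines whose counter (second field) is nonzero.
printf '[/dev/sde1].write_io_errs    6075\n[/dev/sdc1].write_io_errs    0\n' \
  | awk '$2 != 0 {print $1, $2}'
```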
