* Unbootable root btrfs
@ 2019-05-16 10:36 Lee Fleming
  2019-05-16 21:39 ` Chris Murphy
  0 siblings, 1 reply; 6+ messages in thread

From: Lee Fleming @ 2019-05-16 10:36 UTC (permalink / raw)
  To: linux-btrfs

After seeking advice on reddit I've been advised to post this problem here. See
https://www.reddit.com/r/btrfs/comments/bp0awe/broken_btrfs_filesystem_following_a_reboot/

I have a root btrfs filesystem on top of mdadm raid10 and lvm. The raid and
lvm appear to be ok but the btrfs partition will not mount. I have booted a
live recovery and tried to mount/repair the filesystem. This is the result.

% mount /dev/mapper/vg-root /mnt/gentoo
mount: /mnt/gentoo: wrong fs type, bad option, bad superblock on
/dev/mapper/vg-root, missing codepage or helper program, or other error.

Trying to mount with recovery gives the same result:

% mount -o ro,recovery /dev/mapper/vg-root /mnt/gentoo
mount: /mnt/gentoo: wrong fs type, bad option, bad superblock on
/dev/mapper/vg-root, missing codepage or helper program, or other error.
And a btrfs check gives the following:

% btrfs check --repair /dev/mapper/vg-root
enabling repair mode
bytenr mismatch, want=898031484928, have=898006728704
ERROR: cannot open file system

% dmesg | grep -i btrfs
[    5.562419] Btrfs loaded, crc32c=crc32c-generic
[   14.381989] BTRFS: device fsid 1fb019f1-a8cc-46ef-8122-ac6b1bedd522 devid 1 transid 51979 /dev/dm-1
[   14.382647] BTRFS info (device dm-1): disk space caching is enabled
[   14.382652] BTRFS info (device dm-1): has skinny extents
[   15.777186] BTRFS error (device dm-1): bad tree block start 0 898031337472
[   15.777334] BTRFS error (device dm-1): bad tree block start 0 898031353856
[   15.777486] BTRFS error (device dm-1): bad tree block start 0 898031370240
[   15.864239] BTRFS error (device dm-1): bad tree block start 898006728704 898031484928
[   15.871367] BTRFS error (device dm-1): bad tree block start 898003812352 898031484928
[   15.871382] BTRFS error (device dm-1): failed to read block groups: -5
[   15.892051] BTRFS error (device dm-1): open_ctree failed
[   16.016182] BTRFS info (device dm-1): disk space caching is enabled
[   16.016186] BTRFS info (device dm-1): has skinny extents
[   17.319016] BTRFS error (device dm-1): bad tree block start 0 898031337472
[   17.319157] BTRFS error (device dm-1): bad tree block start 0 898031353856
[   17.319303] BTRFS error (device dm-1): bad tree block start 0 898031370240
[   17.422706] BTRFS error (device dm-1): bad tree block start 898006728704 898031484928
[   17.429831] BTRFS error (device dm-1): bad tree block start 898003812352 898031484928
[   17.429845] BTRFS error (device dm-1): failed to read block groups: -5
[   17.450035] BTRFS error (device dm-1): open_ctree failed

% uname -r
4.14.70-std531-amd64

% wipefs /dev/mapper/vg-root
DEVICE  OFFSET  TYPE   UUID                                  LABEL
vg-root 0x10040 btrfs  1fb019f1-a8cc-46ef-8122-ac6b1bedd522

I was asked to try with a more recent kernel. I booted archiso which showed
similar results.
# uname -r
5.0.10-arch1-1-ARCH

# mount /dev/mapper/vg-root /mnt/funtoo
[  208.724214] BTRFS error (device dm-1): bad tree block start, want 898031337472 have 0
[  208.724343] BTRFS error (device dm-1): bad tree block start, want 898031353856 have 0
[  208.724556] BTRFS error (device dm-1): bad tree block start, want 898031370240 have 0
[  208.805279] BTRFS error (device dm-1): bad tree block start, want 898031484928 have 898006728704
[  208.812412] BTRFS error (device dm-1): bad tree block start, want 898031484928 have 898003812352
[  208.812451] BTRFS error (device dm-1): failed to read block groups: -5
[  208.840576] BTRFS error (device dm-1): open_ctree failed
mount: /mnt/funtoo: wrong fs type, bad option, bad superblock on
/dev/mapper/vg-root, missing codepage or helper program, or other error.

# dmesg | grep -i btrfs
[   23.028283] Btrfs loaded, crc32c=crc32c-intel
[   23.061402] BTRFS: device fsid 1fb019f1-a8cc-46ef-8122-ac6b1bedd522 devid 1 transid 51979 /dev/dm-1
[  207.437375] BTRFS info (device dm-1): disk space caching is enabled
[  207.437379] BTRFS info (device dm-1): has skinny extents
[  208.724214] BTRFS error (device dm-1): bad tree block start, want 898031337472 have 0
[  208.724343] BTRFS error (device dm-1): bad tree block start, want 898031353856 have 0
[  208.724556] BTRFS error (device dm-1): bad tree block start, want 898031370240 have 0
[  208.805279] BTRFS error (device dm-1): bad tree block start, want 898031484928 have 898006728704
[  208.812412] BTRFS error (device dm-1): bad tree block start, want 898031484928 have 898003812352
[  208.812451] BTRFS error (device dm-1): failed to read block groups: -5
[  208.840576] BTRFS error (device dm-1): open_ctree failed

Any idea if this can be fixed?

Cheers
Lee

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: Unbootable root btrfs
  2019-05-16 10:36 Unbootable root btrfs Lee Fleming
@ 2019-05-16 21:39 ` Chris Murphy
       [not found]   ` <CAKS=YrMB6SNbCnJsU=rD5gC6cR5yEnSzPDax5eP-VQ-UpzHvAg@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread

From: Chris Murphy @ 2019-05-16 21:39 UTC (permalink / raw)
  To: Lee Fleming; +Cc: Btrfs BTRFS

On Thu, May 16, 2019 at 4:37 AM Lee Fleming <leeflemingster@gmail.com> wrote:
> And a btrfs check gives the following:
>
> % btrfs check --repair /dev/mapper/vg-root

Why use repair? From the man page:

    Warning
    Do not use --repair unless you are advised to do so by a developer
    or an experienced user

> [   17.429845] BTRFS error (device dm-1): failed to read block groups: -5
> [   17.450035] BTRFS error (device dm-1): open_ctree failed

Was there a crash or power failure during write before the problem
started? What precipitated the problem?

It might be possible to successfully mount with '-o
ro,nologreplay,degraded'. If that works, I'd take the opportunity to
refresh backups. I'm not sure whether this can be repaired, but I'm also
not sure what the problem is.

If it doesn't work, then the next step, until a developer has an opinion
on it, is 'btrfs restore', which is a way to scrape data out of an
unmountable file system. It's better than nothing if the data is
important, though it's still preferable if at least a read-only mount
works.

-- 
Chris Murphy
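The order of operations Chris suggests can be sketched as a short script. The device and mount point are the ones from this thread; `/mnt/backup` is a hypothetical destination, and the script only prints the commands, since they need the actual broken device.

```shell
#!/bin/sh
# Sketch of the recovery order suggested above: first try a read-only
# mount that skips log replay, and only if that fails fall back to
# scraping files out with `btrfs restore`. Commands are printed rather
# than executed; DEST is a hypothetical scratch location.
DEV=/dev/mapper/vg-root
MNT=/mnt/gentoo
DEST=/mnt/backup   # needs enough free space for the scraped files

MOUNT_CMD="mount -o ro,nologreplay,degraded $DEV $MNT"
RESTORE_CMD="btrfs restore -iv $DEV $DEST"   # -i ignore errors, -v verbose

echo "1) $MOUNT_CMD"
echo "2) $RESTORE_CMD"
```

Note that `btrfs restore` writes recovered files to the destination; it does not modify or repair the source filesystem, which is why it is a safe step before anything destructive.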
[parent not found: <CAKS=YrMB6SNbCnJsU=rD5gC6cR5yEnSzPDax5eP-VQ-UpzHvAg@mail.gmail.com>]
* Re: Unbootable root btrfs
       [not found] ` <CAKS=YrMB6SNbCnJsU=rD5gC6cR5yEnSzPDax5eP-VQ-UpzHvAg@mail.gmail.com>
@ 2019-05-18  4:06   ` Chris Murphy
  2019-05-18  4:39     ` Robert White
  0 siblings, 1 reply; 6+ messages in thread

From: Chris Murphy @ 2019-05-18 4:06 UTC (permalink / raw)
  To: Lee Fleming, Btrfs BTRFS; +Cc: Chris Murphy

On Fri, May 17, 2019 at 2:18 AM Lee Fleming <leeflemingster@gmail.com> wrote:
>
> I didn't see that particular warning. I did see a warning that it could
> cause damage and should be tried after trying some other things, which I
> did. The data on this drive isn't important. I just wanted to see if it
> could be recovered before reinstalling.
>
> There was no crash, just a reboot. I was setting up KVM and I rebooted
> into a different kernel to see if some performance problems were kernel
> related. And it just didn't boot.

OK, so the corrupted Btrfs volume is a guest file system? That's
unexpected. There must be some configuration-specific issue instigating
this. I've done quite a lot of Btrfs testing in qemu-kvm, including
virtio-blk devices using unsafe caching, and I do vile things to the VMs,
intentionally trying to blow up Btrfs, including force quitting the VM
while it's writing. And I haven't gotten any corruptions.

All I can recommend is to try to reproduce it again, and this time keep
track of the exact steps so that anyone can try to reproduce it. It might
be a bug you've found. But we need a reproducer.

Is it using QCOW2 or raw file backing, or LVM, or a plain partition? What
is the qemu command for the VM? You can get that with 'ps aux | grep qemu'
and it should show all the options used, including the kind of block
devices and caching. And then, what is the workload inside the VM?

-- 
Chris Murphy
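Pulling the caching policy out of a running VM's command line, as Chris suggests, can be sketched like this; the command line below is a hypothetical stand-in for real `ps aux | grep qemu` output, including a hypothetical image path.

```shell
#!/bin/sh
# Hypothetical qemu invocation standing in for real `ps aux | grep qemu`
# output; on a live host you would capture the string from ps instead.
QEMU_CMDLINE='qemu-system-x86_64 -m 4096 -smp 4 -drive file=/var/lib/libvirt/images/guest.qcow2,if=virtio,cache=unsafe'

# Extract just the cache policy; cache=unsafe means guest flushes are
# ignored by the host, which is the failure mode discussed in the thread.
CACHE_MODE=$(printf '%s\n' "$QEMU_CMDLINE" | grep -o 'cache=[a-z]*')
echo "$CACHE_MODE"
```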
* Re: Unbootable root btrfs
  2019-05-18  4:06 ` Chris Murphy
@ 2019-05-18  4:39   ` Robert White
  2019-05-18 19:28     ` Chris Murphy
  0 siblings, 1 reply; 6+ messages in thread

From: Robert White @ 2019-05-18 4:39 UTC (permalink / raw)
  To: Chris Murphy, Lee Fleming, Btrfs BTRFS

On 5/18/19 4:06 AM, Chris Murphy wrote:
> On Fri, May 17, 2019 at 2:18 AM Lee Fleming <leeflemingster@gmail.com> wrote:
>> [...]
>
> OK the corrupted Btrfs volume is a guest file system?

Was the reboot a reboot of the guest instance or the host? A reboot of
the host can be indistinguishable from a crash to the guest file system
images if shutdown is taking a long time. That meager fifteen-second gap
between SIGTERM and SIGKILL can be a real VM killer, even in an orderly
shutdown. If you don't have a qemu shutdown script in your host
environment, then every orderly shutdown is a risk to any running VM.

The question that comes to my mind is: what -blockdev and/or -drive
parameters are you using? Some of the combinations of features and flags
can, in the name of speed, "helpfully violate" the necessary I/O
orderings that filesystems depend on. So if the crash kills qemu before
qemu has flushed and completed a guest-system-critical write to the host
store, you've suffered a corruption that has nothing to do with the
filesystem code base.

So, for example, you shut down your host system. It sends SIGTERM to
qemu. The guest system sends SIGTERM to its processes.
The guest is still waiting out its nominal 15 seconds when the host
evicts it from memory with a SIGKILL, because the host's 15-second timer
started sooner. (15 seconds is the canonical time from my UNIX days; I
don't know what the real times are for every distribution.)

Upping the caching behaviours for writes can be just as deadly in some
conditions.

None of this may apply to the OP, but it's the thing I'd check before
digging too far.
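The host-side shutdown helper Robert alludes to can be sketched as a small function: send the VM process SIGTERM, then poll until it has actually exited, escalating to SIGKILL only on your own schedule rather than the init system's. The demo below uses a stand-in `sleep` process, not a real qemu guest.

```shell
#!/bin/sh
# Sketch of a graceful-stop helper: TERM first, poll for exit, KILL only
# after our own timeout. Demonstrated on a stand-in `sleep` process.
graceful_stop() {
    pid=$1 timeout=$2 waited=0
    kill -TERM "$pid" 2>/dev/null
    while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$timeout" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid"   # last resort, on OUR schedule, not init's
        echo "killed"
    else
        echo "stopped cleanly after ${waited}s"
    fi
}

PIDFILE=$(mktemp)
# Launch the stand-in from a subshell so it is reparented and reaped by
# init, keeping the kill -0 liveness check honest (no lingering zombie).
( sleep 100 & echo $! > "$PIDFILE" )
RESULT=$(graceful_stop "$(cat "$PIDFILE")" 10)
rm -f "$PIDFILE"
echo "$RESULT"
```

With a real qemu guest you would rather ask the guest to power down (for example via its monitor socket) before resorting to signals, but the wait-before-KILL structure is the part that protects in-flight writes.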
* Re: Unbootable root btrfs
  2019-05-18  4:39 ` Robert White
@ 2019-05-18 19:28   ` Chris Murphy
  2019-05-18 19:43     ` Lee Fleming
  0 siblings, 1 reply; 6+ messages in thread

From: Chris Murphy @ 2019-05-18 19:28 UTC (permalink / raw)
  To: Robert White; +Cc: Chris Murphy, Lee Fleming, Btrfs BTRFS

On Fri, May 17, 2019 at 10:39 PM Robert White <rwhite@pobox.com> wrote:
>
> Was the reboot a reboot of the guest instance or the host? A reboot of
> the host can be indistinguishable from a crash to the guest file system
> images if shutdown is taking a long time. That meager fifteen-second gap
> between SIGTERM and SIGKILL can be a real VM killer, even in an orderly
> shutdown. If you don't have a qemu shutdown script in your host
> environment, then every orderly shutdown is a risk to any running VM.

Yep, it's a good point.

> The question that comes to my mind is: what -blockdev and/or -drive
> parameters are you using? Some of the combinations of features and flags
> can, in the name of speed, "helpfully violate" the necessary I/O
> orderings that filesystems depend on.

In particular, unsafe caching. But it does make for faster writes,
especially for NTFS and Btrfs in the VM guest.
> So if the crash kills qemu before qemu has flushed and completed a
> guest-system-critical write to the host store, you've suffered a
> corruption that has nothing to do with the filesystem code base.

For Btrfs, I think the worst case should be that you lose up to 30s of
writes. The super block should still point to a valid, completely
committed set of trees that point to valid data extents. But I have no
idea what the write ordering could be if, say, the guest has written
data > metadata > super, and the host, not honoring fsync (some cache
policies do ignore it), ends up writing out a new super before it writes
out the metadata; of course the host has no idea what these writes are
for from the guest's point of view. If the host reboots before all the
metadata is written, you have a superblock pointing to a partial metadata
write, and that will show up as corruption.

What *should* still be true is that Btrfs can be made to fall back to a
previous root tree by using the mount option -o usebackuproot.

-- 
Chris Murphy
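The fallback Chris names is just a mount option; a minimal sketch follows, printed rather than executed since it needs the damaged device from the thread. The `-drive` string is an illustrative host-side counterpart with a hypothetical image name, not something taken from the thread.

```shell
#!/bin/sh
# Printed rather than executed: the mount needs the actual damaged
# device discussed in this thread.
DEV=/dev/mapper/vg-root
MNT=/mnt/gentoo

# usebackuproot asks btrfs to try the older root trees recorded in the
# superblock's backup slots (roughly the last few commits).
FALLBACK_CMD="mount -o ro,usebackuproot $DEV $MNT"
echo "$FALLBACK_CMD"

# Host-side prevention (hypothetical image name): a cache policy that
# honors guest flushes, unlike cache=unsafe which drops them.
SAFE_DRIVE="file=guest.img,if=virtio,format=raw,cache=none"
echo "-drive $SAFE_DRIVE"
```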
* Re: Unbootable root btrfs
  2019-05-18 19:28 ` Chris Murphy
@ 2019-05-18 19:43   ` Lee Fleming
  0 siblings, 0 replies; 6+ messages in thread

From: Lee Fleming @ 2019-05-18 19:43 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Robert White, Btrfs BTRFS

No. It was the host.

I've nuked the filesystem now. I'm sorry - I know that doesn't help you
diagnose this problem. There wasn't anything important on this drive. I
just wanted to see if it could be recovered before reinstalling
everything. But I wanted to get it back up and running now.

On Sat, 18 May 2019 at 20:28, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Fri, May 17, 2019 at 10:39 PM Robert White <rwhite@pobox.com> wrote:
> >
> > Was the reboot a reboot of the guest instance or the host? [...]
>
> [...]
>
> What *should* still be true is Btrfs can be made to fallback to a
> previous root tree by using mount option -o usebackuproot
>
> -- 
> Chris Murphy
end of thread, other threads:[~2019-05-18 19:43 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-16 10:36 Unbootable root btrfs Lee Fleming
2019-05-16 21:39 ` Chris Murphy
     [not found]   ` <CAKS=YrMB6SNbCnJsU=rD5gC6cR5yEnSzPDax5eP-VQ-UpzHvAg@mail.gmail.com>
2019-05-18  4:06     ` Chris Murphy
2019-05-18  4:39       ` Robert White
2019-05-18 19:28         ` Chris Murphy
2019-05-18 19:43           ` Lee Fleming