* Filesystem will remount read-only
@ 2016-09-16 14:57 Jeffrey Michels
  2016-09-16 23:18 ` Duncan
  2016-09-17  0:08 ` Chris Murphy
  0 siblings, 2 replies; 6+ messages in thread
From: Jeffrey Michels @ 2016-09-16 14:57 UTC (permalink / raw)
  To: 'linux-btrfs@vger.kernel.org'

Hello,

I have a system that has been in production for a few years.  About a month ago the SAN hosting the VM had a hardware failure, and since then one of the two btrfs filesystems is remounted read-only shortly after boot.  Here is the system information:

uname -a

Linux retain 3.0.101-0.47.71-default #1 SMP Thu Nov 12 12:22:22 UTC 2015 (b5b212e) x86_64 x86_64 x86_64 GNU/Linux

btrfs --version

Btrfs v0.20+

btrfs fi show

Label: none  uuid: f1e23038-22c1-44b2-8cf8-a3ca6363d2f4
	Total devices 1 FS bytes used 303.01GiB
	devid    1 size 1024.00GiB used 351.04GiB path /dev/dm-2

Label: none  uuid: 85e58f4e-ce56-4b11-9ed9-16abeead8863
	Total devices 1 FS bytes used 83.83GiB
	devid    1 size 149.49GiB used 101.49GiB path /dev/dm-0

Btrfs v0.20+

btrfs fi df /retain

Data: total=261.01GiB, used=259.23GiB
System, DUP: total=8.00MiB, used=40.00KiB
System: total=4.00MiB, used=0.00
Metadata, DUP: total=45.00GiB, used=43.77GiB
Metadata: total=8.00MiB, used=0.00

dmesg (I can provide the full output as an attachment if needed).  Here is where the filesystem is remounted read-only:

[   55.181245] btrfs: parent transid verify failed on 153295646720 wanted 230487 found 230484
[   55.187980] btrfs: parent transid verify failed on 153295646720 wanted 230487 found 230484
[   55.187991] BTRFS debug (device dm-2): run_one_delayed_ref returned -5
[   55.187994] ------------[ cut here ]------------
[   55.188021] WARNING: at /usr/src/packages/BUILD/kernel-default-3.0.101/linux-3.0/fs/btrfs/super.c:255 __btrfs_abort_transaction+0x60/0x170 [btrfs]()
[   55.188024] Hardware name: VMware Virtual Platform
[   55.188026] btrfs: Transaction aborted (error -5)
[   55.188028] Modules linked in: acpiphp microcode fuse xfs ext3 jbd mbcache loop sr_mod ppdev vmw_balloon(X) i2c_piix4 intel_agp pciehp ipv6_lib cdrom parport_pc shpchp parport rtc_cmos intel_gtt pci_hotplug floppy i2c_core sg container ac mptctl serio_raw button pcspkr btrfs zlib_deflate crc32c libcrc32c dm_mirror dm_region_hash dm_log linear sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh dm_snapshot dm_mod vmw_pvscsi vmxnet3 ata_generic ata_piix libata mptspi mptscsih mptbase scsi_transport_spi scsi_mod
[   55.188071] Supported: Yes, External
[   55.188075] Pid: 1985, comm: sync Tainted: G             X 3.0.101-0.47.71-default #1
[   55.188077] Call Trace:
[   55.188090]  [<ffffffff81004b95>] dump_trace+0x75/0x300
[   55.188097]  [<ffffffff814638a3>] dump_stack+0x69/0x6f
[   55.188104]  [<ffffffff81061f07>] warn_slowpath_common+0x87/0xe0
[   55.188109]  [<ffffffff81062015>] warn_slowpath_fmt+0x45/0x60
[   55.188125]  [<ffffffffa018eb40>] __btrfs_abort_transaction+0x60/0x170 [btrfs]
[   55.188152]  [<ffffffffa01ab466>] btrfs_run_delayed_refs+0x3a6/0x520 [btrfs]
[   55.188192]  [<ffffffffa01bb3de>] btrfs_commit_transaction+0x42e/0xa00 [btrfs]
[   55.188228]  [<ffffffff8118a0f2>] __sync_filesystem+0x62/0xb0
[   55.188234]  [<ffffffff8116133a>] iterate_supers+0x6a/0xc0
[   55.188239]  [<ffffffff8118a1b2>] sys_sync+0x52/0x80
[   55.188244]  [<ffffffff8146e772>] system_call_fastpath+0x16/0x1b
[   55.188251]  [<00007f45758cafc7>] 0x7f45758cafc6
[   55.188253] ---[ end trace c5a604849514ffcd ]---
[   55.188257] BTRFS error (device dm-2) in btrfs_run_delayed_refs:2688: errno=-5 IO failure
[   55.188259] BTRFS info (device dm-2): forced readonly
[   55.188263] BTRFS warning (device dm-2): Skipping commit of aborted transaction.
[   55.188266] BTRFS error (device dm-2) in cleanup_transaction:1538: errno=-5 IO failure
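
The two numbers in the "parent transid verify failed" lines are transaction IDs (generations): the parent node expects the block at that byte offset to carry generation 230487, but the block on disk carries 230484. A minimal Python sketch for pulling those values out of a saved dmesg capture follows. The function name `transid_failures` and the `sample` text are illustrative assumptions, not part of any btrfs tool.

```python
import re

# Matches the kernel message quoted above, e.g.
# "btrfs: parent transid verify failed on 153295646720 wanted 230487 found 230484"
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def transid_failures(log_text):
    """Return unique (bytenr, wanted, found, gap) tuples in order of first appearance."""
    seen = []
    for match in TRANSID_RE.finditer(log_text):
        bytenr, wanted, found = (int(group) for group in match.groups())
        entry = (bytenr, wanted, found, wanted - found)
        if entry not in seen:  # the kernel often repeats the same failure
            seen.append(entry)
    return seen


# Hypothetical sample: the two lines from the dmesg excerpt above.
sample = """\
[   55.181245] btrfs: parent transid verify failed on 153295646720 wanted 230487 found 230484
[   55.187980] btrfs: parent transid verify failed on 153295646720 wanted 230487 found 230484
"""

failures = transid_failures(sample)
for bytenr, wanted, found, gap in failures:
    print(f"block {bytenr}: wanted generation {wanted}, found {found} (gap {gap})")
```

For the excerpt above this reports a single block that is three generations behind what its parent expects, which would be consistent with writes lost during the SAN failure, though the log alone does not prove that.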

Thank you for your assistance,

Jeff Michels

iCon 2017 Registration is Now Open!
Agents of Innovation
March 8 - 10, 2017
TradeWinds Island Resort, St. Pete Beach, Florida
Register today at: www.skyward.com/icon

PRIVILEGED AND CONFIDENTIAL
Skyward Communication

This is a transmission from Skyward, Inc. and may contain information which is privileged, confidential, and protected by service work privileges.  The response is in direct relationship to the information provided to Skyward.   If you are not the addressee, note that any disclosure, copying, distribution, or use of the contents of this message is prohibited.  If you have received this transmission in error, please destroy it and notify us immediately at 715-341-9406.


* Re: Filesystem will remount read-only
  2016-09-16 14:57 Filesystem will remount read-only Jeffrey Michels
@ 2016-09-16 23:18 ` Duncan
  2016-09-17  0:08 ` Chris Murphy
  1 sibling, 0 replies; 6+ messages in thread
From: Duncan @ 2016-09-16 23:18 UTC (permalink / raw)
  To: linux-btrfs

Jeffrey Michels posted on Fri, 16 Sep 2016 14:57:43 +0000 as excerpted:

> Hello,
> 
> I have a system that has been in production for a few years.  The SAN
> the VM was running on had a hardware failure about a month ago and now
> one of the two btrfs filesystems will remount after boot read-only. 
> Here is the system information:
> 
> uname -a
> 
> Linux retain 3.0.101-0.47.71-default #1 SMP Thu Nov 12 12:22:22 UTC 2015
> (b5b212e) x86_64 x86_64 x86_64 GNU/Linux
> 
> Btrfs --version
> 
> Btrfs v0.20+

That is positively /ancient/, both kernel and userspace (btrfs-progs).  
Keep in mind that btrfs was still considered very experimental back then, 
with the experimental labels coming off only with 3.14 or thereabouts, 
IIRC (userspace releases got version-synced with kernelspace in 3.12, so 
3.14 applies to both).

So you have been running an at-the-time still extremely experimental 
filesystem for years now, and it's only now coming up with problems that 
need fixing.  Pretty remarkable for the experimental state back then, but 
it doesn't change the fact that it /was/ "may eat your data and burn your 
kids alive as a sacrifice to appease the filesystem gods" level 
experimental, with the corresponding warnings, back then.

So the first thing I'd suggest is to update to the 4.4 LTS kernel series, 
with a similarly recent btrfs-progs userspace.  Then, given the age and 
experimental nature of the filesystem back then, I'd kill the filesystems 
and do a fresh mkfs.btrfs, restoring from backups.  That way you're 
starting with a well tested and stable LTS kernel that is both reasonably 
mature already and will be supported for some time to come, and you 
eliminate any possibility of long fixed and forgotten bugs coming back to 
bite you years later.

Alternatively, if you're using a long-term support distro, you have the 
choice of going to them for that support, since unlike this list which 
focuses on the state going forward, that sort of deep long-term support 
of long outdated versions is a good part of the reason such distros 
exist, and a good part of why a lot of people are willing to pay 
sometimes rather sizable sums of money /for/ that level of support.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Filesystem will remount read-only
  2016-09-16 14:57 Filesystem will remount read-only Jeffrey Michels
  2016-09-16 23:18 ` Duncan
@ 2016-09-17  0:08 ` Chris Murphy
  2016-09-17  0:56   ` Chris Murphy
  2016-09-21 15:24   ` Jeffrey Michels
  1 sibling, 2 replies; 6+ messages in thread
From: Chris Murphy @ 2016-09-17  0:08 UTC (permalink / raw)
  To: Jeffrey Michels; +Cc: linux-btrfs

On Fri, Sep 16, 2016 at 8:57 AM, Jeffrey Michels <jeffreym@skyward.com> wrote:
> Hello,
>
> I have a system that has been in production for a few years.  The SAN the VM was running on had a hardware failure about a month ago and now one of the two btrfs filesystems will remount after boot read-only.  Here is the system information:
>
> uname -a
>
> Linux retain 3.0.101-0.47.71-default #1 SMP Thu Nov 12 12:22:22 UTC 2015 (b5b212e) x86_64 x86_64 x86_64 GNU/Linux
>
> Btrfs --version
>
> Btrfs v0.20+

Impressive that it's been running in production this long, and with
such an old kernel. I like it!

Anyway, you could try mounting with -o recovery and see if that works.
That's about the only thing I'd trust with such an old kernel and
btrfs-progs. I don't even think it's worth trying btrfsck on v0.20
just to see what the problems might be, and certainly not for actually
using the repair mode.  Actually, I'm not even sure progs that old can
do repairs; it might be from the era of notify only.

If -o recovery doesn't work, you'll need to use something newer. You
could use one of:

Fedora Rawhide nightly with 4.8-rc6 kernel and btrfs-progs 4.7.2. This
is a small netinstall image. dd it to a USB stick, choose the
Troubleshooting option, then the Rescue option, then after startup
choose option 3 to get to a shell where you can try to mount normally,
or use btrfs check. Limited tty, no sshd.
https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20160914.n.0/compose/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-Rawhide-20160914.n.0.iso.n.0.iso

Or something more official with published hashes for the image and a
GUI: Fedora 24 Workstation has kernel 4.5.5 and btrfs-progs 4.5.2
https://getfedora.org/en/workstation/download/




-- 
Chris Murphy


* Re: Filesystem will remount read-only
  2016-09-17  0:08 ` Chris Murphy
@ 2016-09-17  0:56   ` Chris Murphy
  2016-09-21 15:24   ` Jeffrey Michels
  1 sibling, 0 replies; 6+ messages in thread
From: Chris Murphy @ 2016-09-17  0:56 UTC (permalink / raw)
  Cc: Jeffrey Michels, linux-btrfs

On Fri, Sep 16, 2016 at 6:08 PM, Chris Murphy <lists@colorremedies.com> wrote:
>
> If -o recovery doesn't work, you'll need to use something newer. You
> could use one of:
>
> Fedora Rawhide nightly with 4.8-rc6 kernel and btrfs-progs 4.7.2. This
> is a small netinstall image. dd it to a USB stick, choose the
> Troubleshooting option, then the Rescue option, then after startup
> choose option 3 to get to a shell where you can try to mount normally,
> or use btrfs check. Limited tty, no sshd.
> https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20160914.n.0/compose/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-Rawhide-20160914.n.0.iso.n.0.iso
>
> Or something more official with published hashes for the image and a
> GUI: Fedora 24 Workstation has kernel 4.5.5 and btrfs-progs 4.5.2
> https://getfedora.org/en/workstation/download/

Just to complete the thought... use these just to boot and have access
to something newer; I'm not suggesting you install them. First try a
normal mount. If that fails, try -o recovery. If that also fails, I'm
curious what these report:

btrfs rescue super-recover -v <dev>
btrfs check <dev>

What I'm after is a way to get it to mount cleanly with a new kernel,
in the hope that you can then just reboot with the ancient kernel and
it'll be back to normal.
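
The escalation described above can be written down as an ordered list of commands. The following is only a sketch in Python: the helper name `rescue_steps` and the mount point `/mnt` are hypothetical, and `/dev/dm-2` is the device name from the original report, which may well differ when booted from a live image. It builds the command lines without running them, since each step should be attempted only if the previous one failed.

```python
def rescue_steps(dev, mountpoint):
    """Return the commands from the message above, in the suggested order."""
    return [
        ["mount", dev, mountpoint],                       # 1. normal mount
        ["mount", "-o", "recovery", dev, mountpoint],     # 2. only if step 1 fails
        ["btrfs", "rescue", "super-recover", "-v", dev],  # 3. only if step 2 fails
        ["btrfs", "check", dev],                          # 4. check without --repair
    ]


# Assumed values: device from the original report, hypothetical mount point.
steps = rescue_steps("/dev/dm-2", "/mnt")
for argv in steps:
    print(" ".join(argv))
```

Note that `btrfs check` is deliberately listed without `--repair` here, matching the message; in that form it only reports problems.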


-- 
Chris Murphy


* RE: Filesystem will remount read-only
  2016-09-17  0:08 ` Chris Murphy
  2016-09-17  0:56   ` Chris Murphy
@ 2016-09-21 15:24   ` Jeffrey Michels
  2016-09-21 15:49     ` Chris Murphy
  1 sibling, 1 reply; 6+ messages in thread
From: Jeffrey Michels @ 2016-09-21 15:24 UTC (permalink / raw)
  To: 'Chris Murphy'; +Cc: linux-btrfs


Hello,

I've booted into the latest nightly build of Fedora and run btrfs rescue super-recover -v and also btrfs check.

Super-recover reports that "All supers are valid, no need to recover."  Btrfs check displays the same errors as before:

Parent transid verify failed on xxxx wanted xxxx found xxxxx
Ignoring transid failure
Leaf parent key incorrect xxxxxxx
Bad block xxxxxxxxxxx
Errors found in extent allocation tree or chunk allocation.

The check eventually fails with "Segmentation fault (core dumped)."

Attempting to mount with -o recovery results in the error "Can't read superblock."  I am able to mount the filesystem read-only; however, attempting to copy all of the data off has been unsuccessful.  The copy appears to hit a bad area and just hangs.

Is there anything else I can try?

Thank you for your insight,

Jeff Michels




* Re: Filesystem will remount read-only
  2016-09-21 15:24   ` Jeffrey Michels
@ 2016-09-21 15:49     ` Chris Murphy
  0 siblings, 0 replies; 6+ messages in thread
From: Chris Murphy @ 2016-09-21 15:49 UTC (permalink / raw)
  To: Jeffrey Michels; +Cc: Chris Murphy, linux-btrfs

On Wed, Sep 21, 2016 at 9:24 AM, Jeffrey Michels <jeffreym@skyward.com> wrote:
> Hello,
>
> I've booted into the latest nightly build of Fedora and run btrfs rescue super-recover -v and also btrfs check.
>
> Super-recover reports that "All supers are valid, no need to recover."  Btrfs check displays the same errors as before:
>
> Parent transid verify failed on xxxx wanted xxxx found xxxxx
> Ignoring transid failure
> Leaf parent key incorrect xxxxxxx
> Bad block xxxxxxxxxxx
> Errors found in extent allocation tree or chunk allocation.
>
> The check eventually fails with "Segmentation fault (core dumped)."
>
> Attempting to mount with -o recovery results in the error "Can't read superblock."  I am able to mount the filesystem read-only; however, attempting to copy all of the data off has been unsuccessful.  The copy appears to hit a bad area and just hangs.
>
> Is there anything else I can try?

Kinda weird that the supers are all valid with one tool but then the
kernel says it can't read the superblock?

Try btrfs-image -c9 -t4, optionally with -s, and if the resulting
file isn't too big, put it somewhere and I can try to iterate some
options on the file. You could do the same thing yourself by imaging
it, restoring the image to an LVM LV of the same size or bigger, and
then seeing whether some combination of these helps:
btrfs check --repair
btrfs check --repair --init-extent-tree
btrfs rescue chunk-recover
btrfs rescue zero-log

That or some other order of things might fix it, and the only way to
know is to try it. And unfortunately each thing can make changes that
can cause the next thing to fail. Hence doing this on an image.
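
To give a sense of how much trial and error "some other order of things" implies, here is a small Python sketch that enumerates every ordering of the four commands listed above. It is only an illustration: the names `REPAIR_STEPS` and `repair_orderings` are made up, the device argument each command needs is omitted, and it covers full orderings only, so trying subsets of the steps would add more cases.

```python
from itertools import permutations

# The four repair commands named in the message above (device argument omitted).
REPAIR_STEPS = (
    "btrfs check --repair",
    "btrfs check --repair --init-extent-tree",
    "btrfs rescue chunk-recover",
    "btrfs rescue zero-log",
)


def repair_orderings(steps=REPAIR_STEPS):
    """Return every ordering of the given steps as a list of tuples."""
    return list(permutations(steps))


orderings = repair_orderings()
print(len(orderings), "orderings, each to be tried on a fresh copy of the image")
print(" -> ".join(orderings[0]))
```

With four steps there are 4! = 24 orderings. Because each step can change the filesystem in a way that makes the next one fail, every attempt would start from a freshly restored copy of the image rather than from the result of the previous attempt.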

-- 
Chris Murphy

