* Help needed with filesystem errors: parent transid verify failed
@ 2021-03-28 15:40 B A
  2021-03-29  1:02 ` Chris Murphy
  2021-03-30  0:07 ` Chris Murphy
  0 siblings, 2 replies; 13+ messages in thread
From: B A @ 2021-03-28 15:40 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 3416 bytes --]

Dear btrfs experts,


On my desktop PC, I have one btrfs partition on a single SSD with 3 subvolumes (/, /home, /var). Whenever I boot my PC and log in to GNOME, the btrfs partition is remounted read-only (ro) because of errors. This is the dmesg output at that time:

> [  616.155392] BTRFS error (device dm-0): parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
> [  616.155650] BTRFS error (device dm-0): parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
> [  616.155657] BTRFS: error (device dm-0) in __btrfs_free_extent:3054: errno=-5 IO failure
> [  616.155662] BTRFS info (device dm-0): forced readonly
> [  616.155665] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2124: errno=-5 IO failure
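
The two numbers in each "parent transid verify failed" line can be pulled out mechanically to see how far apart the expected and found transaction IDs are. This is only a minimal sketch, assuming the message format stays exactly as quoted above; the regex, the function name `parse_transid_errors` and the `gap` field are illustrative labels, not part of btrfs or btrfs-progs.

```python
import re

# Matches the kernel message quoted above. The group names are labels
# chosen for this sketch, not official btrfs terminology.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<bytenr>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def parse_transid_errors(log_text):
    """Return one dict per matching message: bytenr, wanted, found, gap."""
    results = []
    for m in TRANSID_RE.finditer(log_text):
        wanted = int(m.group("wanted"))
        found = int(m.group("found"))
        results.append({
            "bytenr": int(m.group("bytenr")),
            "wanted": wanted,
            "found": found,
            "gap": wanted - found,  # how many transactions the block lags behind
        })
    return results


# One of the dmesg lines from this report, used as sample input.
sample = (
    "[  616.155392] BTRFS error (device dm-0): parent transid verify failed "
    "on 1144783093760 wanted 2734307 found 2734305\n"
)
errors = parse_transid_errors(sample)
print(errors)
```

For the line above, the block at 1144783093760 is two transactions behind what its parent expects (2734307 versus 2734305).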

The issue first appeared today after login. Yesterday everything worked fine. I suspect something went wrong during the last shutdown, but I can't be sure: this disk also holds my logs, and they show no errors for that shutdown.

System info:
* Fedora 33 x86_64
* kernel: Linux 5.11.10-200.fc33.x86_64 #1 SMP
* btrfs-progs v5.10 (5.10-1.fc33.x86_64)
* Samsung 840 series SSD (SMART data looks fine)

What happens:
1. I boot my PC including mounting the root partition
2. Everything works fine.
3. I can log in as root or my user on tty and do basic stuff there and it works
4. I log in to my user account (gdm, GNOME shell). Alternatively, running e.g. `dnf history info last` also triggers the dmesg output shown above.
5. Many applications don't work any more. The common root cause seems to be that the filesystem is remounted readonly due to the errors noted above.
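
Whether the filesystem has really been flipped to read-only (step 5) can be checked by looking for the `ro` option in `/proc/mounts`. The following is a minimal sketch, assuming the usual `/proc/mounts` field order (device, mount point, filesystem type, options). The helper name `readonly_mounts` and the sample lines, including the device name in them, are hypothetical and not taken from the affected system.

```python
def readonly_mounts(mounts_text, fstype="btrfs"):
    """Return (device, mountpoint) pairs of the given type mounted with 'ro'."""
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        device, mountpoint, fs, options = fields[:4]
        # Options are comma-separated; compare whole tokens so that
        # something like 'errors=remount-ro' does not count as 'ro'.
        if fs == fstype and "ro" in options.split(","):
            hits.append((device, mountpoint))
    return hits


# Hypothetical /proc/mounts excerpt for illustration only.
sample = (
    "/dev/mapper/luks-example / btrfs ro,relatime,ssd,subvol=/root 0 0\n"
    "/dev/sda1 /boot ext4 rw,relatime 0 0\n"
)
print(readonly_mounts(sample))
```

On a real system the input would be the contents of `/proc/mounts`, for example `readonly_mounts(open("/proc/mounts").read())`.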

Basic info: see attached file "dmesg info.txt" (generated from Fedora live system)

What I've tried so far:
1. I ran `btrfs scrub` from a live system. It errors out:

> [root@localhost-live liveuser]# btrfs scrub start -B /mnt
> ERROR: scrubbing /mnt failed for device id 1: ret=-1, errno=5 (Input/output error)
> scrub canceled for 1a149bda-057d-4775-ba66-1bf259fce9a5
> Scrub started:    Sun Mar 28 07:20:07 2021
> Status:           aborted
> Duration:         0:13:00
> Total to scrub:   269.06GiB
> Rate:             252.24MiB/s
> Error summary:    no errors found

At the same time, in `dmesg`, I see this:

> [ 7878.612534] BTRFS error (device dm-2): parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
> [ 7878.637673] BTRFS error (device dm-2): parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
> [ 7878.639459] BTRFS info (device dm-2): scrub: not finished on devid 1 with status: -5
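
The `status: -5` in dmesg and the `errno=5 (Input/output error)` printed by `btrfs scrub` appear to be the same error seen from two sides: the kernel reports the negated errno value. A minimal sketch of that mapping, using only Python's standard `errno` and `os` modules; the function name `describe_kernel_status` is made up for this example.

```python
import errno
import os


def describe_kernel_status(status):
    """Map a negative kernel-style status (e.g. -5) to (errno name, message)."""
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# -5 is the status reported for the scrub and for __btrfs_free_extent above.
print(describe_kernel_status(-5))
```

On Linux, errno 5 is `EIO`, which matches the "IO failure" wording in the kernel messages.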

2. I ran `btrfs check` (without repair) from a live system. This also shows errors (see attached file "btrfs check.txt").


Side note: There is also a small chance that this issue was triggered by a software update I installed yesterday. It included an update of systemd-246.10 to systemd-246.13 and of kernel-5.11.8 to kernel-5.11.10.
Changes in systemd: https://src.fedoraproject.org/rpms/systemd/commits/f33
Changes in kernel: https://src.fedoraproject.org/rpms/kernel/commits/f33
This update has also been deployed to many other users (I am using the stable channel), and I have not seen any related issues in Fedora's bugzilla or discourse, so I doubt it is related.


What should I do now? Do I need one of the invasive methods (`btrfs rescue` or `btrfs check --repair`), and if so, which one should I choose?

Kind regards,
Chris

[-- Attachment #2: btrfs check.txt --]
[-- Type: text/plain, Size: 2192 bytes --]

[root@localhost-live liveuser]# btrfs check /dev/mapper/luks-ff6e174f-4cd3-42a7-8ee5-47005dd077dc
Opening filesystem to check...
ERROR: /dev/mapper/luks-ff6e174f-4cd3-42a7-8ee5-47005dd077dc is currently mounted, use --force if you really intend to check the filesystem
[root@localhost-live liveuser]# btrfs check /dev/mapper/luks-ff6e174f-4cd3-42a7-8ee5-47005dd077dc
Opening filesystem to check...
Checking filesystem on /dev/mapper/luks-ff6e174f-4cd3-42a7-8ee5-47005dd077dc
UUID: 1a149bda-057d-4775-ba66-1bf259fce9a5
[1/7] checking root items
parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=1144881201152 item=14 parent level=1 child level=2
ERROR: failed to repair root items: Input/output error
[2/7] checking extents
parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
Ignoring transid failure
bad block 1144783093760
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space cache
parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=1144881201152 item=14 parent level=1 child level=2
cache appears valid but isn't 1062040764416
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
parent transid verify failed on 1144783093760 wanted 2734307 found 2734305
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=1144881201152 item=14 parent level=1 child level=2
Error going to next leaf -5
csum exists for 1062926516224-1062935089152 but there is no extent record
ERROR: errors found in csum tree
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
ERROR: transid errors in file system
found 11738640384 bytes used, error(s) found
total csum bytes: 0
total tree bytes: 3719168
total fs tree bytes: 0
total extent tree bytes: 3522560
btree space waste bytes: 1056895
file data blocks allocated: 69992448
 referenced 69992448


[-- Attachment #3: btrfs info.txt --]
[-- Type: text/plain, Size: 569 bytes --]

[root@localhost-live liveuser]# btrfs --version
btrfs-progs v5.7
[root@localhost-live liveuser]# btrfs fi show
Label: 'fedora_chstpc-2'  uuid: 1a149bda-057d-4775-ba66-1bf259fce9a5
	Total devices 1 FS bytes used 230.46GiB
	devid    1 size 300.00GiB used 269.06GiB path /dev/mapper/luks-ff6e174f-4cd3-42a7-8ee5-47005dd077dc

[root@localhost-live liveuser]# btrfs fi df /mnt
Data, single: total=263.00GiB, used=228.47GiB
System, DUP: total=32.00MiB, used=48.00KiB
Metadata, DUP: total=3.00GiB, used=1.99GiB
GlobalReserve, single: total=397.25MiB, used=0.00B
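
The `btrfs fi df` figures above can be turned into per-type usage percentages with a few lines of parsing. This is a minimal sketch, assuming the output keeps the `Type, profile: total=..., used=...` layout shown above and uses binary units (B, KiB, MiB, GiB, TiB). The regex and the function name `parse_fi_df` are illustrative and not part of btrfs-progs.

```python
import re

# One line of `btrfs fi df` output as shown above, e.g.
# "Metadata, DUP: total=3.00GiB, used=1.99GiB"
LINE_RE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<tunit>(?:[KMGT]i)?B), "
    r"used=(?P<used>[\d.]+)(?P<uunit>(?:[KMGT]i)?B)$"
)
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def parse_fi_df(text):
    """Return {kind: {profile, total, used, pct}} with sizes in bytes."""
    rows = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore prompts, blank lines, anything unexpected
        total = float(m["total"]) * UNITS[m["tunit"]]
        used = float(m["used"]) * UNITS[m["uunit"]]
        rows[m["kind"]] = {
            "profile": m["profile"],
            "total": total,
            "used": used,
            "pct": 100.0 * used / total if total else 0.0,
        }
    return rows


# The four lines from the attachment above.
sample = """\
Data, single: total=263.00GiB, used=228.47GiB
System, DUP: total=32.00MiB, used=48.00KiB
Metadata, DUP: total=3.00GiB, used=1.99GiB
GlobalReserve, single: total=397.25MiB, used=0.00B
"""
rows = parse_fi_df(sample)
for kind, row in rows.items():
    print(f"{kind:14s} {row['profile']:7s} {row['pct']:.1f}% used")
```

The percentages only describe how full each allocated chunk type is; they say nothing about the transid errors themselves.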

