From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Carsten Grommel <c.grommel@profihost.ag>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: How to (attempt to) repair these btrfs errors
Date: Sat, 5 Mar 2022 20:36:56 -0500
Message-ID: <YiQQOFQO7G4NZTKS@hungrycats.org>
In-Reply-To: <AM0PR08MB3265280A4F4EF8151DA289F58E029@AM0PR08MB3265.eurprd08.prod.outlook.com>

On Tue, Mar 01, 2022 at 10:55:50AM +0000, Carsten Grommel wrote:
> Follow-up pastebin with the most recent errors in dmesg:
> 
> https://pastebin.com/4yJJdQPJ

This seems to have expired.

> ________________________________________
> From: Carsten Grommel
> Sent: Monday, 28 February 2022 19:41
> To: linux-btrfs@vger.kernel.org
> Subject: How to (attempt to) repair these btrfs errors
> 
> Hi,
> 
> Short background: a btrfs filesystem used for storing Ceph RBD backups in subvolumes got corrupted.
> The underlying storage is three RAID 6 arrays; btrfs is mounted on top of them as RAID 0 for performance (we have to store massive amounts of data).
> 
> Linux cloud8-1550 5.10.93+2-ph #1 SMP Fri Jan 21 07:52:51 UTC 2022 x86_64 GNU/Linux
> 
> But it was Kernel 5.4.121 before
> 
> btrfs --version
> btrfs-progs v4.20.1
> 
> btrfs fi show
> Label: none  uuid: b634a011-28fa-41d7-8d6e-3f68ccb131d0
>                 Total devices 3 FS bytes used 56.74TiB
>                 devid    1 size 25.46TiB used 22.70TiB path /dev/sda1
>                 devid    2 size 25.46TiB used 22.69TiB path /dev/sdb1
>                 devid    3 size 25.46TiB used 22.70TiB path /dev/sdd1
> 
> btrfs fi df /vmbackup/
> Data, RAID0: total=66.62TiB, used=56.45TiB
> System, RAID1: total=8.00MiB, used=4.36MiB
> Metadata, RAID1: total=750.00GiB, used=294.90GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> Attached is the dmesg.log; a few dmesg messages regarding the different errors follow (some information redacted):
> 
> [Mon Feb 28 18:53:57 2022] BTRFS error (device sda1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 69074516, gen 184286
> 
> [Mon Feb 28 18:53:57 2022] BTRFS error (device sda1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 69074517, gen 184286
> 
> [Mon Feb 28 18:54:23 2022] BTRFS error (device sda1): unable to fixup (regular) error at logical 776693776384 on dev /dev/sdd1
> 
> [Mon Feb 28 18:54:25 2022] scrub_handle_errored_block: 21812 callbacks suppressed
> 
> [Mon Feb 28 18:54:31 2022] BTRFS warning (device sda1): checksum error at logical 777752285184 on dev /dev/sdd1, physical 259607957504, root 108747, inode 257, offset 59804737536, length 4096, links 1 (path: cephstorX_vm-XXX-disk-X-base.img_1645337735)
> 
> I am able to mount the filesystem read-write, but accessing specific blocks seems to make btrfs force the filesystem back to read-only.
> I am currently running a scrub over the filesystem.
> 
> The system was rebooted and the fs remounted 2-3 times. In my experience btrfs would usually fix these kinds of errors after a remount, but not this time.
> 
> Before I run "btrfs check --repair" I would like some advice on how to tackle these errors.

The corruption and generation event counts indicate sdd1 (or one of its
component devices) was offline for a long time or suffered corruption
on a large scale.

Data is raid0, so data repair is not possible.  Delete all the files
that contain corrupt data.
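
One rough way to enumerate the affected files is to pull the paths out
of the kernel log, as in this sketch (assuming the rate-limited
"checksum error ... (path: ...)" lines are still in the buffer; the
paths are relative to the subvolume root, so prefix the mount point and
subvolume before deleting anything):

    # list unique paths named in btrfs checksum error messages
    dmesg | grep -oE '\(path: [^)]*\)' | sort -u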

If you are using space_cache=v1, now is a good time to upgrade to
space_cache=v2.  v1 space cache is stored in the data profile, and it has
likely been corrupted.  btrfs will usually detect and repair corruption
in space_cache=v1, but there is no need to take any such risk here
when you can easily use v2 instead (or at least clear the v1 cache).
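
Something like this should do it (a sketch, untested on your setup;
--clear-space-cache requires the filesystem to be unmounted, and
space_cache=v2 needs kernel 4.5 or later, which you have):

    # with the filesystem unmounted, drop the possibly-corrupt v1 cache
    btrfs check --clear-space-cache v1 /dev/sda1
    # mount once with the v2 cache; the setting persists afterward
    mount -o space_cache=v2 /dev/sda1 /vmbackup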

I don't see any errors in these logs that would indicate a metadata issue,
but huge numbers of messages are suppressed.  Perhaps a log closer
to the moment when the filesystem goes read-only will be more useful.

I would expect that if there are no problems on sda1 or sdb1 then it
should be possible to repair the metadata errors on sdd1 by scrubbing
that device.
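
e.g. something like:

    # scrub only the suspect device; -B runs in the foreground
    # and prints a summary when it finishes
    btrfs scrub start -B /dev/sdd1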

> Kind regards
> Carsten Grommel
> 
