From: Goffredo Baroncelli <kreijack@libero.it>
To: David Woodhouse <dwmw2@infradead.org>, Forza <forza@tnonline.net>,
linux-btrfs@vger.kernel.org
Subject: Re: Reading files with bad data checksum
Date: Mon, 11 Jan 2021 19:56:34 +0100
Message-ID: <3e309cce-d322-1df8-052a-f370ccbd2784@libero.it>
In-Reply-To: <0544b98786dbd980e8bdde58bd5713247178a74e.camel@infradead.org>
On 1/10/21 1:36 PM, David Woodhouse wrote:
> On Sun, 2021-01-10 at 13:08 +0100, Forza wrote:
[...]
>
> It showed up as errors. There appears to be a btrfs bug there but since
Yes, it is an old btrfs bug, and qemu is not to blame.
https://lore.kernel.org/linux-btrfs/cf8a733f-2c9d-7ffe-e865-4c13d99dfb60@libero.it/
That email includes code to reproduce it.
Basically, it is difficult to keep the checksum in sync with the data when O_DIRECT
is used. Even OpenZFS has a problem with it. The OpenZFS solution is to lie about O_DIRECT:
it accepts the flag but does not honor it.
I think that BTRFS should behave like ZFS: when csums are enabled, O_DIRECT shouldn't be
honored (either by returning an error or by silently ignoring the flag, as ZFS does).
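To illustrate the kind of race I mean, here is a minimal Python sketch. It is not the reproducer from the email linked above: the function name, the 4096-byte block size and the file path are all assumptions made for illustration. One thread keeps changing a page-aligned buffer while the main thread writes that same buffer to the file. With O_DIRECT the buffer may change after the checksum is computed but before the data reaches the disk, so on btrfs this may (or may not) leave a csum mismatch behind.

```python
import mmap
import os
import threading

BLOCK = 4096  # assumed block size and O_DIRECT alignment


def racy_writes(path, rounds=200, extra_flags=0):
    """Rewrite one block repeatedly while another thread mutates the buffer."""
    # An anonymous mmap is page-aligned, which O_DIRECT requires.
    buf = mmap.mmap(-1, BLOCK)
    stop = threading.Event()

    def scribble():
        value = 0
        while not stop.is_set():
            value = (value + 1) % 256
            # Change the data while a write of the same buffer may be in flight.
            buf[:] = bytes([value]) * BLOCK

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | extra_flags, 0o600)
    worker = threading.Thread(target=scribble)
    worker.start()
    try:
        for _ in range(rounds):
            # The interpreter lock is released during the syscall,
            # so scribble() can run concurrently with the write.
            os.pwrite(fd, buf, 0)
    finally:
        stop.set()
        worker.join()
        os.close(fd)
        buf.close()
    return rounds
```

To try it against a scratch file on a btrfs mount (hypothetical path), pass the flag explicitly, e.g. `racy_writes("/mnt/scratch/testfile", extra_flags=os.O_DIRECT)`, then read the file back and check dmesg. Without the flag the write goes through the page cache and the race should not be visible.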
> I suspect it'll be easy to reproduce I'm more focused on recovery right
> now.
>
>>> In the short term, all I want to do is make a copy of the file, using
>>> the data which are in the disk regardless of the fact that btrfs thinks
>>> the checksum doesn't match. Is there a way I can turn off *checking* of
>>> the checksum for that specific file (or file descriptor?).
>>>
>>> Or is the only way to do it with something like FIBMAP, reading the
>>> offending blocks directly from the underlying disk and then writing
>>> them into the appropriate offset in (a copy of) the file? A plan which
>>> is slightly complicated by the fact that of course btrfs doesn't
>>> support FIBMAP.
>>>
>>> What's the best way to recover the data?
>>>
>>
>> You can use GNU ddrescue to copy files. It can skip the offending blocks
>> and replace the bad data with zeroes. Not sure how well qemu will handle
>> that though.
>
> Right. I've already copied the image with dd conv=sync,noerror to a new
> one with the +C flag. It passes 'qemu-img check', and in fact the guest
> is running just fine with it. I was expecting it to stop with
> catastrophic file system errors but I can't see anything wrong at all.
> I'm just paranoid that eventually I'll find out that the blocks belong
> to some file(s) I actually want, and I'd like to recover them.
>
> Right now I have a horribly fragmented image file with these 'errors'
> cluttering up my file system and making backups of the host go
> extremely slow. I'd like to get those blocks back so I can make a clean
> copy of the image, and keep it around for reference in case I later
> *do* discover that I need the contents of those blocks.
>
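For reference, the kind of copy that dd conv=sync,noerror performs can be sketched in Python as below. This is only a rough sketch under assumptions: the function name and the 4096-byte block size are mine, and I assume that a read of a block with a bad checksum fails with an OSError (EIO). Unlike dd with conv=sync, it does not pad the final partial block, and it returns the offsets that could not be read, so you know which blocks were replaced by zeroes.

```python
import os

BLOCK = 4096  # assumed to match the filesystem block size


def copy_zero_filling(src, dst, block=BLOCK, read_block=os.pread):
    """Copy src to dst block by block; unreadable blocks become zeroes.

    Returns the list of byte offsets that could not be read.
    """
    bad = []
    in_fd = os.open(src, os.O_RDONLY)
    out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        size = os.fstat(in_fd).st_size
        for offset in range(0, size, block):
            want = min(block, size - offset)
            try:
                data = read_block(in_fd, want, offset)
            except OSError:
                # Typically EIO when the checksum does not match.
                data = b""
                bad.append(offset)
            # Pad failed or short reads so later data keeps its offset.
            data = data.ljust(want, b"\0")
            os.pwrite(out_fd, data, offset)
    finally:
        os.close(in_fd)
        os.close(out_fd)
    return bad
```

The read_block parameter exists only so the error path can be exercised without a damaged file; in normal use the default os.pread is what you want.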
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5