linux-btrfs.vger.kernel.org archive mirror
From: Goffredo Baroncelli <kreijack@libero.it>
To: David Woodhouse <dwmw2@infradead.org>, Forza <forza@tnonline.net>,
	linux-btrfs@vger.kernel.org
Subject: Re: Reading files with bad data checksum
Date: Mon, 11 Jan 2021 19:56:34 +0100
Message-ID: <3e309cce-d322-1df8-052a-f370ccbd2784@libero.it>
In-Reply-To: <0544b98786dbd980e8bdde58bd5713247178a74e.camel@infradead.org>

On 1/10/21 1:36 PM, David Woodhouse wrote:
> On Sun, 2021-01-10 at 13:08 +0100, Forza wrote:
[...]
> 
> It showed up as errors. There appears to be a btrfs bug there but since

Yes, it is an old btrfs bug, and qemu is not to blame.

https://lore.kernel.org/linux-btrfs/cf8a733f-2c9d-7ffe-e865-4c13d99dfb60@libero.it/

That email includes code to reproduce it.

Basically, it is difficult to keep the checksum in sync with the data when O_DIRECT
is used. Even OpenZFS has a problem with it. The OpenZFS solution is to lie about
O_DIRECT: it accepts the flag but does not honor it.
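
A minimal sketch of how the checksum and the data can drift apart, assuming the
cause is the application changing its buffer while a direct write is still in
flight. This is a user-space simulation only, not btrfs code: the function names
are hypothetical, and zlib.crc32 stands in for the real checksum algorithm.

```python
import zlib


def direct_write_racy(buf, mutate):
    """Simulated O_DIRECT write: the checksum and the I/O both read the
    caller's buffer, which may change between the two steps."""
    csum = zlib.crc32(bytes(buf))  # step 1: checksum the user buffer
    mutate(buf)                    # step 2: application rewrites the buffer
    on_disk = bytes(buf)           # step 3: the device reads the changed buffer
    return csum, on_disk


def write_with_private_copy(buf, mutate):
    """Simulated buffered write: data is copied first, so the checksum
    and the I/O see the same stable bytes."""
    snapshot = bytes(buf)          # private copy, like a page-cache page
    csum = zlib.crc32(snapshot)
    mutate(buf)                    # later changes do not affect the snapshot
    return csum, snapshot


def csum_ok(csum, data):
    """Return True if the stored checksum matches the stored data."""
    return zlib.crc32(data) == csum


def flip_first_byte(buf):
    """Hypothetical in-flight modification by the application."""
    buf[0] ^= 0xFF
```

With the racy variant the stored checksum no longer matches the stored data,
so a later read reports a csum error even though the disk is fine. With the
private copy the two always agree, at the cost of the extra copy that O_DIRECT
was meant to avoid.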

I think that BTRFS should behave like ZFS: when csums are enabled, O_DIRECT shouldn't be
honored (either by returning an error or by silently ignoring the flag, as ZFS does).
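
The two options can be sketched as a flag filter. This is only an illustration
of the policy, not kernel code: effective_open_flags and its policy parameter
are hypothetical names, and the O_DIRECT fallback value is an assumption for
platforms where the os module does not define it.

```python
import errno
import os

# os.O_DIRECT exists on Linux; the fallback value is an assumption so the
# sketch still runs elsewhere.
O_DIRECT = getattr(os, "O_DIRECT", 0o40000)


def effective_open_flags(flags, csum_enabled, policy="ignore"):
    """Return the flags a checksumming filesystem would actually honor.

    policy="ignore": accept O_DIRECT but drop it (the ZFS-like behavior
                     described above).
    policy="reject": refuse the open with EINVAL.
    """
    if not csum_enabled or not (flags & O_DIRECT):
        return flags  # nothing to do: no checksums, or no O_DIRECT requested
    if policy == "ignore":
        return flags & ~O_DIRECT  # fall back to buffered I/O
    if policy == "reject":
        raise OSError(errno.EINVAL, "O_DIRECT not supported with checksums")
    raise ValueError(f"unknown policy: {policy!r}")
```

Files without checksums (for example, ones with the +C attribute) pass through
unchanged in this sketch, so they would keep real direct I/O.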


> I suspect it'll be easy to reproduce I'm more focused on recovery right
> now.
> 
>>> In the short term, all I want to do is make a copy of the file, using
>>> the data which are in the disk regardless of the fact that btrfs thinks
>>> the checksum doesn't match. Is there a way I can turn off *checking* of
>>> the checksum for that specific file (or file descriptor?).
>>>
>>> Or is the only way to do it with something like FIBMAP, reading the
>>> offending blocks directly from the underlying disk and then writing
>>> them into the appropriate offset in (a copy of) the file? A plan which
>>> is slightly complicated by the fact that of course btrfs doesn't
>>> support FIBMAP.
>>>
>>> What's the best way to recover the data?
>>>
>>
>> You can use GNU ddrescue to copy files. It can skip the offending blocks
>> and replace the bad data with zeroes. Not sure how well qemu will handle
>> that though.
> 
> Right. I've already copied the image with dd conv=sync,noerror to a new
> one with the +C flag. It passes 'qemu-img check', and in fact the guest
> is running just fine with it. I was expecting it to stop with
> catastrophic file system errors but I can't see anything wrong at all.
> I'm just paranoid that eventually I'll find out that the blocks belong
> to some file(s) I actually want, and I'd like to recover them.
> 
> Right now I have a horribly fragmented image file with these 'errors'
> cluttering up my file system and making backups of the host go
> extremely slow. I'd like to get those blocks back so I can make a clean
> copy of the image, and keep it around for reference in case I later
> *do* discover that I need the contents of those blocks.
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
