From: Christoph Anton Mitterer <calestyo@scientia.net>
To: kreijack@inwind.it
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Ongoing Btrfs stability issues
Date: Mon, 12 Mar 2018 22:48:58 +0100
Message-ID: <1520891338.4266.16.camel@scientia.net>
In-Reply-To: <3fd8f21b-2e4d-3696-8e92-a20e4dda13ec@inwind.it>

On Mon, 2018-03-12 at 22:22 +0100, Goffredo Baroncelli wrote:
> Unfortunately no, the likelihood might be 100%: there are some
> patterns which trigger this problem quite easily. See The link which
> I posted in my previous email. There was a program which creates a
> bad checksum (in COW+DATASUM mode), and the file became unreadable.
But that rather seems like a plain bug?!

There's no reason that would conceptually make checksumming+nodatacow
impossible.

AFAIU, conceptually it would work like this:
- data is written in nodatacow
  => thus a checksum must be written as well, so write it
- what can then of course happen is
  - both csum and data are written => fine
  - csum is written but data not and then some crash => csum will show
    that => fine
  - data is written but csum not and then some crash => csum will give
    a false positive

Still, better a few false positives than many unnoticed data
corruptions and no true RAID repair.
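
The three crash outcomes listed above can be sketched with a toy model.
This is only an illustration under simple assumptions: a block and its
CRC32 are stored separately, data is overwritten in place (no CoW), and
a crash can land after either write. Nothing here is btrfs code; the
names Block, write_in_place and the crash_after parameter are made up
for this example.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Block:
    data: bytes  # contents as they sit on disk
    csum: int    # checksum stored separately from the data


def crc(data: bytes) -> int:
    return zlib.crc32(data)


def write_in_place(block: Block, new_data: bytes, crash_after=None) -> None:
    """Overwrite a block without CoW.

    crash_after is hypothetical: None means both writes complete,
    "csum" means only the checksum reached disk before the crash,
    "data" means only the data reached disk before the crash.
    """
    if crash_after == "csum":
        block.csum = crc(new_data)
        return
    if crash_after == "data":
        block.data = new_data
        return
    block.data = new_data
    block.csum = crc(new_data)


def verify(block: Block) -> bool:
    """True if the stored checksum matches the stored data."""
    return crc(block.data) == block.csum


def outcome(crash_after):
    block = Block(b"old", crc(b"old"))
    write_in_place(block, b"new", crash_after)
    return verify(block), block.data


results = {case: outcome(case) for case in (None, "csum", "data")}
for case, (ok, data) in results.items():
    print(f"crash_after={case!r}: csum ok={ok}, data on disk={data!r}")
```

Only the last case is the false positive: the block holds the new,
intact data, yet verification fails. In the "csum" case the mismatch is
a true alarm, since the block still holds the old contents.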


> If you cannot know if a checksum is bad or the data is bad, the
> checksum is not useful at all!
Why not? It's only uncertain in the case of a crash anyway... and it at
least tells you that something is fishy.
A program which cares about its data will have its own journaling
and can simply recover with that... or users could then just restore
from a backup.
Or one could provide some API/userland tool to recompute the csums of
the affected file (and possibly live with bad data).


> If I read correctly what you wrote, it seems that you consider a
> "minor issue" the fact that the checksum is not correct. If you
> accept the possibility that a checksum might be wrong, you wont trust
> anymore the checksum; so the checksum became not useful.
There's simply no disadvantage compared to not having checksumming at
all in the nodatacow case.
Because then you never have any idea whether your data is correct or
not... whereas checksumming + nodatacow, which can give a false
positive on a crash when the data was written correctly but the
checksum was not, at least covers the other cases of data corruption
(silent data corruption, or csum written but data not or only
partially written in case of a crash).
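
As a small illustration of the silent-corruption case, here is a sketch
that assumes a CRC32 stored alongside the data. The flip_bit helper and
the sample contents are made up for this example and are not btrfs
internals.

```python
import zlib


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with a single bit inverted."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


stored = b"database page contents"
stored_csum = zlib.crc32(stored)  # written together with the data

# Simulate silent corruption: one bit changes on disk, no crash involved.
on_disk = flip_bit(stored, 5)

# With a checksum the mismatch shows up on the next read.
detected = zlib.crc32(on_disk) != stored_csum
print("corruption detected:", detected)
```

Without a stored checksum there is nothing to compare on_disk against,
so the same bit flip would simply be returned to the reader as valid
data.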


> Again, you are assuming that the likelihood of having a bad checksum
> is low. Unfortunately this is not true. There are pattern which
> exploits this bug with a likelihood=100%.

Okay, I don't understand why this would be so and wouldn't assume that
the IO pattern can affect it heavily... but I'm not really a btrfs
expert.

My blind assumption would have been that writing an extent of data
takes much longer to complete than writing the corresponding checksum.

Even if not... it should only be a problem in case of a crash during
that... and then I'd still prefer to get the false positive than bad
data.


Anyway... it's not going to happen so the discussion is pointless.
I think people can probably use dm-integrity (which btw: does no CoW
either (IIRC) and still can provide integrity... ;-) ) to see whether
their data is valid.
Not nice, but since it won't change on btrfs, a possible alternative.


Cheers,
Chris.

