linux-bcachefs.vger.kernel.org archive mirror
From: Janpieter Sollie <janpieter.sollie@kabelmail.de>
To: linux-bcachefs@vger.kernel.org
Subject: bcachefs csum: what about scrubbing with ec?
Date: Sun, 24 Apr 2022 15:10:57 +0200
Message-ID: <800887d8-7c00-76d0-81af-e0fd08a77847@kabelmail.de>


Hi everyone,


I'm still learning about bcachefs. My experiments currently mostly target bcachefs-tools;
digging into the whole bcachefs structure is still somewhat advanced for me.
I'm considering implementing bcachefs scrub: during the past months, I had several fs upgrades
where the filesystem decided the checksum wasn't correct, whereas the data was fine.  It would
be nice if that were fixable.
However, something I thought about:
All non-encryption checksum algorithms are at most 64 bits wide, and as such fit in the .lo
part of bch_csum.
What about using the upper part for ec?  The possibilities I see are:
- Using another checksum algorithm to check whether a failed checksum is an error in the
checksum itself.  Maybe use bit 4 of CSUM_TYPES to say: if 0, no 2nd checksum; if 1, use crc32c
(as it is mostly hardware accelerated).
This would be fast and would tell whether the checksum is corrupt or the file is.
It would however limit future checksums to 3: 0b1000 (a none algorithm can't have a 2nd
checksum), 0b1011 and 0b1100 (as those do not have room for a 2nd checksum, and it may be unsafe).
- Using Reed-Solomon as a new algorithm (nr 8).
I'm not entirely sure about this one: using RS where there are only 16 bytes for ec in a
gigabyte-sized data block (many files are > 1GB these days) is mostly useless.
But, for what it's worth, it would allow correcting single bit errors on the fly (e.g. the user
would never notice a bit error in their file, as it would be corrected automatically).
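
To make the arithmetic of the first option concrete, here is a minimal sketch of the type
encoding. It assumes the existing type field uses three bits, with "none" at 0 and the two
encryption types at 3 and 4; the enum names, CSUM_SECOND_FLAG and both helper functions are
made up for illustration and are not taken from bcachefs_format.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed numbering of the existing 3-bit checksum type field. Only the
 * positions of "none" (0) and the two encryption types (3, 4) matter here.
 * Names are illustrative, not copied from the kernel headers.
 */
enum csum_type {
	CSUM_NONE			= 0,
	CSUM_CRC32C_NONZERO		= 1,
	CSUM_CRC64_NONZERO		= 2,
	CSUM_CHACHA20_POLY1305_80	= 3,
	CSUM_CHACHA20_POLY1305_128	= 4,
	CSUM_CRC32C			= 5,
	CSUM_CRC64			= 6,
	CSUM_XXHASH			= 7,
};

#define CSUM_TYPE_MASK		0x7u	/* low three bits: primary algorithm */
#define CSUM_SECOND_FLAG	0x8u	/* hypothetical "bit 4": crc32c kept in the upper half */

/* True if the primary type leaves the upper 64 bits free for a second checksum. */
bool csum_can_have_second(unsigned type)
{
	assert(type < 16);	/* the sketch only models a 4-bit field */

	switch (type & CSUM_TYPE_MASK) {
	case CSUM_NONE:				/* nothing to cross-check */
	case CSUM_CHACHA20_POLY1305_80:		/* MAC is wider than 64 bits */
	case CSUM_CHACHA20_POLY1305_128:
		return false;
	default:
		return true;
	}
}

/* Count the 4-bit encodings that set the flag but could never use it. */
unsigned csum_count_free_encodings(void)
{
	unsigned n = 0;

	for (unsigned t = CSUM_SECOND_FLAG; t < 16; t++)
		if (!csum_can_have_second(t))
			n++;
	return n;
}
```

Under these assumptions csum_count_free_encodings() counts exactly the three values named
above (0b1000, 0b1011, 0b1100), which are the only encodings left over for future algorithms.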
Would either feature be worth investigating?
I had a chat with woobilicious about the topic on IRC, but we weren't really sure about the
usefulness of either of them.
So, what do other developers think about it?

Technically, an invalid checksum points at either an invalid checksum calculation (in which
case the file can be scrubbed) or a damaged file (in which case the file must be marked as dirty).
Currently, the only way to scrub is to assume the data is correct and the checksum isn't.
I'd be glad to hear your opinions.


Janpieter Sollie

Thread overview: 2+ messages
2022-04-24 13:10 Janpieter Sollie [this message]
2022-05-07 18:27 ` bcachefs csum: what about scrubbing with ec? Kent Overstreet
