bcachefs csum: what about scrubbing with ec?

* bcachefs csum: what about scrubbing with ec?
@ 2022-04-24 13:10 Janpieter Sollie
  2022-05-07 18:27 ` Kent Overstreet
  0 siblings, 1 reply; 2+ messages in thread
From: Janpieter Sollie @ 2022-04-24 13:10 UTC
  To: linux-bcachefs

Hi everyone,

I'm still learning about bcachefs. My experiments so far mostly target bcachefs-tools;
digging into the whole bcachefs structure is still somewhat advanced for me.
I'm considering implementing bcachefs scrub. Over the past months, I had several fs upgrades
after which the filesystem decided a checksum wasn't correct, whereas the data was fine.  It
would be nice if that were fixable.
However, here is something I thought about:
All non-encryption checksum algorithms are at most 64 bits wide, and as such fit in the .lo
part of bch_csum.
What about using the upper part for ec? I see two possibilities:
- Using another checksum algorithm to check whether a failed checksum is an error in the
checksum itself rather than in the data.  Maybe use bit 4 of CSUM_TYPES to say: if 0, no 2nd
checksum; if 1, use crc32c (as it is mostly hardware accelerated).
This would be fast and would tell whether the checksum is corrupt or the file is.
It would, however, limit future checksums to 3: 0b1000 (a none algorithm can't have a 2nd
checksum), 0b1011 and 0b1100 (as those do not have room for a 2nd checksum, and it may be unsafe).
- Using Reed-Solomon as a new algorithm (nr 8).
I'm not entirely sure about this one: using RS when there are only 16 bytes for ec in a
gigabyte-sized data block (many files are > 1GB these days) is mostly useless.
But, for what it's worth, it would allow correcting single-bit errors on the fly (e.g. the user
would never notice a bit error in their file, because it could be corrected automatically).
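
To make the first possibility a bit more concrete, here is a rough sketch in Python (not
bcachefs code). It assumes bch_csum is two little-endian 64-bit halves (lo, hi) and that
"bit 4" means the value 0b1000 in the type field. The names SECOND_CSUM_FLAG, make_csum and
split_csum are hypothetical, zlib.crc32 only stands in for whatever primary checksum is
configured, and crc32c is a slow bitwise software version for illustration.

```python
import struct
import zlib

# Hypothetical flag: "bit 4" (counting from 1) of the checksum type field.
SECOND_CSUM_FLAG = 0b1000


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def make_csum(csum_type: int, data: bytes):
    """Return (type with flag set, 16-byte checksum field)."""
    lo = zlib.crc32(data)   # stand-in for the primary checksum (fits in 64 bits)
    hi = crc32c(data)       # second checksum stored in the otherwise unused half
    return csum_type | SECOND_CSUM_FLAG, struct.pack("<QQ", lo, hi)


def split_csum(csum_type: int, raw: bytes):
    """Return (primary, second); second is None when the flag is not set."""
    lo, hi = struct.unpack("<QQ", raw)
    return lo, (hi if csum_type & SECOND_CSUM_FLAG else None)
```

A reader that does not know about the flag would still find the primary checksum in the lower
half; only the type value changes, which is exactly why the flagged values of types without
room for a 2nd checksum stay free.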
Would either feature be worth investigating?
I had a chat with woobilicious about the topic on IRC, but we weren't really sure about the
usefulness of either of them.
So, what do other developers think about it?

Technically, an invalid checksum points at either an invalid checksum calculation (in which
case the file can be scrubbed) or a damaged file (in which case the file must be marked as
dirty).  Currently, the only way to scrub is to assume the data is correct and the checksum
isn't.
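
With a second checksum stored next to the first one, a scrub could tell these two cases apart.
The decision could look roughly like the sketch below. It is an illustration only: the
classify function and its return strings are hypothetical, and zlib.crc32 / zlib.adler32
merely stand in for two independent checksum algorithms.

```python
import zlib


def classify(data: bytes, stored_primary: int, stored_second: int) -> str:
    """Decide what a scrub should do with one checksummed block."""
    primary_ok = zlib.crc32(data) == stored_primary
    second_ok = zlib.adler32(data) == stored_second

    if primary_ok and second_ok:
        return "ok"
    if second_ok:
        # The data is vouched for by the second checksum: rewrite the primary.
        return "primary checksum corrupt"
    if primary_ok:
        # The data is vouched for by the primary checksum: rewrite the second.
        return "second checksum corrupt"
    # Neither checksum agrees with the data: most likely the data is damaged.
    return "data damaged"


data = b"hello bcachefs"
primary, second = zlib.crc32(data), zlib.adler32(data)
print(classify(data, primary ^ 1, second))           # flipped bit in the checksum
print(classify(b"jello bcachefs", primary, second))  # changed byte in the data
```

Only the "data damaged" result would lead to marking the file as dirty; the other two failures
can be repaired by recomputing the checksum that disagrees.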

I'd be glad to hear your opinions.

Janpieter Sollie
