From: Sage Weil <sage@newdream.net>
To: Gregory Farnum <gfarnum@redhat.com>
Cc: Chris Dunlop <chris@onthe.net.au>,
	Allen Samuels <Allen.Samuels@sandisk.com>,
	Igor Fedotov <ifedotov@mirantis.com>,
	ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: Adding compression/checksum support for bluestore.
Date: Sun, 3 Apr 2016 09:27:22 -0400 (EDT)
Message-ID: <alpine.DEB.2.11.1604030921090.19675@cpach.fuggernut.com>
In-Reply-To: <CAJ4mKGYPyqy9tUJH8eUEHFAt25Sqm6uKQpr84ckwDERqsRxUKA@mail.gmail.com>

On Fri, 1 Apr 2016, Gregory Farnum wrote:
> On Fri, Apr 1, 2016 at 10:05 PM, Chris Dunlop <chris@onthe.net.au> wrote:
> > On Fri, Apr 01, 2016 at 07:51:07PM -0700, Gregory Farnum wrote:
> >> On Fri, Apr 1, 2016 at 7:23 PM, Allen Samuels <Allen.Samuels@sandisk.com> wrote:
> >>> Talk about mental failures. The first statement is correct. It's about the ratio of checksum to data bits. After that please ignore. If you double the data you need to double the checksum bits to maintain the BER.
> >>
> >> Forgive me if I'm wrong here — I haven't done anything with
> >> checksumming since I graduated college — but good checksumming is
> >> about probabilities and people suck at evaluating probability: I'm
> >> really not sure any of the explanations given in this thread are
> >> right. Bit errors aren't random and in general it requires a lot more
> >> than one bit flip to collide a checksum, so I don't think it's a
> >> linear relationship between block size and chance of error. Finding
> >
> > A single bit flip can certainly result in a checksum collision, with the
> > same chance as any other error, i.e. 1 in 2^number_of_checksum_bits.
> 
> That's just not true. I'll quote from
> https://en.m.wikipedia.org/wiki/Cyclic_redundancy_check#Introduction
> 
> > Typically an n-bit CRC applied to a data block of arbitrary length will detect any single error burst not longer than n bits and will detect a fraction 1 − 2^(−n) of all longer error bursts.
> 
> And over (at least) the ranges they're designed for, it's even better:
> they provide guarantees about how many bits (in any combination or
> arrangement) must be flipped before they can have a false match. (It
> says "typically" because CRCs are a wide family and yes, you do have
> to select the right ones in the right ways in order to get the desired
> effects.)

That's pretty cool.  I have a new respect for CRCs.  :)
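
The guarantee quoted above can be checked empirically. What follows is a
minimal sketch, not code from Ceph or BlueStore: it uses Python's zlib.crc32
(the common 32-bit CRC) as a stand-in for whatever checksum is eventually
chosen, and the helper names are made up for illustration. Bits are numbered
LSB-first within each byte, which is the order this CRC processes them, so
that "consecutive bits" here matches the CRC's notion of a burst.

```python
import random
import zlib


def flip_bits(data: bytes, bit_positions) -> bytes:
    """Return a copy of data with the given bit positions inverted.

    Bit i lives in byte i // 8 at mask 1 << (i % 8), i.e. LSB-first,
    matching the bit order zlib's reflected CRC-32 consumes.
    """
    buf = bytearray(data)
    for bit in bit_positions:
        buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


def all_single_bit_flips_detected(data: bytes) -> bool:
    """Exhaustively flip each bit once and check that the CRC changes."""
    good = zlib.crc32(data)
    return all(
        zlib.crc32(flip_bits(data, [bit])) != good
        for bit in range(len(data) * 8)
    )


def random_bursts_detected(data: bytes, trials: int, seed: int = 0) -> bool:
    """Try random error bursts of at most 32 bits; check each is caught."""
    rng = random.Random(seed)
    good = zlib.crc32(data)
    nbits = len(data) * 8
    for _ in range(trials):
        length = rng.randint(1, 32)
        start = rng.randrange(nbits - length + 1)
        # A burst of this length flips its first and last bit;
        # the bits in between are flipped at random.
        positions = {start, start + length - 1}
        positions.update(
            start + i for i in range(1, length - 1) if rng.random() < 0.5
        )
        if zlib.crc32(flip_bits(data, positions)) == good:
            return False
    return True


if __name__ == "__main__":
    print(all_single_bit_flips_detected(bytes(range(64))))
    print(random_bursts_detected(b"\x00" * 4096, trials=200))
```

Neither check depends on the block length, which is the point of the quoted
passage: a single flipped bit or a burst of up to 32 bits never collides with
this CRC, so it is not a 1-in-2^32 event. Longer or multiple bursts are where
the probabilistic bound applies.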
 
> As Allen says, flash may require something different, but it will be
> similar. Getting the people who actually understand this is definitely
> the way to go — it's an active research field but I think over the
> ranges we're interested in it's a solved problem. And certainly if we
> try and guess about things based on our intuition, we *will* get it
> wrong. So somebody interested in this feature set needs to go out and
> do the reading or talk to the right people, please! :)

Yep.  I was halfway through responding to Chris's last message when I 
convinced myself that actually he was right (block size doesn't matter). 
But I don't trust my intuition here anymore.  :/

In any case, it seems like the way to proceed is to have a variable length 
checksum_block_size, since we need that anyway for other reasons (e.g., 
balancing minimum read size and read amplification for small IOs vs 
metadata overhead).
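
To make that trade-off concrete, here is a rough sketch under stated
assumptions: a 4-byte checksum per block, and a read that must cover every
checksum block it touches in order to be verified. The function names and the
4-byte default are hypothetical and are not taken from the BlueStore code.

```python
def checksum_metadata_bytes(object_size: int, csum_block_size: int,
                            csum_bytes: int = 4) -> int:
    """Bytes of checksum metadata needed to cover one object.

    Assumes one csum_bytes-wide checksum per csum_block_size bytes,
    with a partial trailing block still getting its own checksum.
    """
    blocks = -(-object_size // csum_block_size)  # ceiling division
    return blocks * csum_bytes


def verified_read_span(offset: int, length: int,
                       csum_block_size: int) -> tuple[int, int]:
    """(start, length) of the region that must be read to verify a request.

    The request is widened to whole checksum blocks, since a checksum
    can only be validated against the full block it was computed over.
    """
    start = (offset // csum_block_size) * csum_block_size
    end = -(-(offset + length) // csum_block_size) * csum_block_size
    return start, end - start


if __name__ == "__main__":
    for block in (4096, 65536):
        meta = checksum_metadata_bytes(4 * 1024 * 1024, block)
        span = verified_read_span(5000, 512, block)
        print(block, meta, span)
```

For a 4 MiB object, 4 KiB checksum blocks cost 4096 bytes of metadata and a
512-byte read at offset 5000 pulls in 4 KiB; 64 KiB blocks cut the metadata
to 256 bytes but the same read pulls in 64 KiB.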

sage
