From: Christoph Anton Mitterer <calestyo@scientia.net>
To: Sanidhya Solanki <jpage.lkml@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] BTRFS: Adds an option to select RAID Stripe size
Date: Tue, 29 Dec 2015 07:03:11 +0100	[thread overview]
Message-ID: <1451368991.7094.45.camel@scientia.net> (raw)
In-Reply-To: <20151228203111.7ba8b0be@gmail.com>

On Mon, 2015-12-28 at 20:31 -0500, Sanidhya Solanki wrote:
> What is your experience like about running a production system on
> what is essentially a beta product? Crashes?
What do you mean? btrfs? I'm not yet running it in production (there
was a subthread recently where I explained in a bit more detail why).

But right now I'm reorganising a lot of our data pools and considering
setting up those that serve just as replica holders with btrfs.
But the RAID would likely still come from the HW controller.


> Would something like ZFS not be more suited to your environment?
Well, I guess that's my personal political decision... I simply think
that btrfs should and will be the next-gen main Linux filesystem.
Plus, IIRC, zfs-fuse is now unmaintained and linux-zfs is not yet part
of Debian, which rules it out anyway, as I'd be too lazy to compile it
myself (at least for work ;) ).


> Especially as not all disks will be full, and, if a disk was to fail,
> the entire disk would need to be rebuilt from parity drives (as
> opposed to ZFS only using the parity data, and not copying empty
> blocks (another feature that is planned for BTRFS))
Ah? I thought btrfs would already do that as well?

Well anyway... I did some comparison between HW RAID and MD RAID each
with ext4 and btrfs.
I didn't try btrfs RAID6 back then, since it's IMHO still too far
away from being production ready.

IIRC, there were some (for us) interesting cases where MD RAID would
have been somewhat faster than HW RAID... but there are some other
major IO patterns (IIRC sequential read/write) where HW RAID was simply
magnitudes faster (no big surprise, of course).


> I do not believe it would be possible to guarantee crash or error
> recovery when using an in-place rebuild, without slowing down the
> entire rebuild to cache each block before replacing it with the new
> block. That would slow it down considerably, as you would have to:
> 
> copy to cache > checksum > write in place on disk > checksum new
> data > verify checksums
I'm not sure what you mean by "cache"... wouldn't btrfs' CoW mean that
you "just" copy the data, and once that is done, update the metadata,
so things would either be consistent or not (and in case of a crash
still point to the old, not-yet-reshaped data)?
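To illustrate the ordering I have in mind (just a toy sketch in Python, nothing to do with actual btrfs internals — the file/symlink scheme here is only an analogy for extent data and metadata pointers): the new data is fully written and made durable somewhere else first, and only then is the "metadata" switched atomically, so a crash at any point leaves either the old or the new state visible, never a mix:

```python
import os
import tempfile


def cow_update(directory, pointer_name, new_data):
    """Toy copy-on-write update: write new data to a fresh file, then
    atomically repoint 'pointer_name' (a symlink standing in for the
    metadata) at it.  A crash before the final rename leaves the old
    data fully intact and still referenced."""
    # 1. Write the new copy somewhere else first (never in place).
    fd, new_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(new_data)
        f.flush()
        os.fsync(f.fileno())  # the data must be durable before we repoint

    # 2. Atomically switch the "metadata" to reference the new copy.
    tmp_link = os.path.join(directory, pointer_name + ".tmp")
    os.symlink(os.path.basename(new_path), tmp_link)
    os.replace(tmp_link, os.path.join(directory, pointer_name))  # atomic


def read_current(directory, pointer_name):
    """Read whatever the pointer currently references."""
    with open(os.path.join(directory, pointer_name), "rb") as f:
        return f.read()
```

The old copy is never touched; it simply becomes unreferenced once the pointer moves, which is roughly why I'd expect a CoW reshape to be crash-safe without an extra cache.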

A special case would of course be nodatacow'ed data... there one may
need some kind of cache or journal... (see my other thread, where I
ask for checksumming of non-CoWed data =) ).
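For that nodatacow case, what I imagine is roughly the copy-to-cache > checksum > write-in-place > verify sequence you quoted — here as a rough Python sketch (the class and method names are entirely made up, and real recovery would of course replay a persistent journal, not an in-memory dict):

```python
import hashlib


class JournaledStore:
    """Toy sketch of a journaled in-place block rewrite: the old block
    is copied (with its checksum) to a journal before being overwritten,
    so a crash mid-write can be rolled back."""

    def __init__(self, nblocks, bs=16):
        self.bs = bs
        self.blocks = [bytes(bs) for _ in range(nblocks)]
        self.journal = {}  # block number -> (old data, old checksum)

    @staticmethod
    def _csum(data):
        return hashlib.sha256(data).hexdigest()

    def rewrite(self, n, new_data):
        assert len(new_data) == self.bs
        # 1. copy the old block to the journal, together with its checksum
        old = self.blocks[n]
        self.journal[n] = (old, self._csum(old))
        # 2. write the new block in place
        self.blocks[n] = new_data
        # 3. verify the written data before discarding the journal copy
        if self._csum(self.blocks[n]) != self._csum(new_data):
            raise IOError("verify failed; journal copy still available")
        del self.journal[n]

    def recover(self):
        """After a crash: restore any blocks whose rewrite never
        completed, using the journaled old copies."""
        for n, (old, csum) in self.journal.items():
            assert self._csum(old) == csum  # journal itself is checksummed
            self.blocks[n] = old
        self.journal.clear()
```

Which also makes your point about the cost: every in-place rewrite pays for an extra copy plus two checksum passes.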


> I suppose that is the only proper way to do it anyway, but it will
> definitely be slow.
From my PoV... slowness doesn't matter *that* much here anyway, while
consistency/safety does.
I mean reshaping a RAID wouldn't be something you'd do every month (at
least not in production systems - test systems are of course another
case).
Once I'd determined that another RAID chunk size would perform
*considerably* better than the current one, I'd reshape... and whether
that then runs for a week or two... as long as it happens online and as
long as I can control a bit how much IO is spent on the reshape: who cares?


Cheers,
Chris.

