From: Juan Alberto Cirez <jacirez@rdcsafety.com>
To: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
Cc: Chris Murphy <lists@colorremedies.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Add device while rebalancing
Date: Wed, 27 Apr 2016 09:58:03 -0600
Message-ID: <CAHaPQf3cR4jXKziSCqp0CnrB6oKQoO=2kywsKuo7BLqHqBjBRw@mail.gmail.com>
In-Reply-To: <5720A0E8.5000407@gmail.com>

WOW!
Correct me if I'm wrong, but the sum total of the above seems to
suggest (at first glance) that Btrfs adds several layers of complexity
for little real benefit (at least in the use case of Btrfs at the
brick layer with a distributed filesystem on top)...

"...I've always thought it'd be neat in a Btrfs + GlusterFS, if it were
possible for Btrfs to inform GlusterFS of "missing/corrupt" files,
and then for Btrfs to drop reference for those files, instead of
either rebuilding or remaining degraded. And then let GlusterFS deal
with replication of those files to maintain redundancy. i.e. the Btrfs
volumes would be single profile for data, and raid1 for metadata. When
there's n-way raid1, each drive can have a copy of the file system,
and it'd tolerate in effect n-1 drive failures and the file system
could at least still inform Gluster (or Ceph) of the missing data, the
file system still remains valid, only briefly degraded, and can still
be expanded when new drives become available..."

That, in my n00b opinion, would be brilliant in a real-world use case.
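
To put rough numbers on the two redundancy points in the quoted
discussion (two-copy raid1 versus n-way raid1, and the growing chance
of a double failure as a file system spans more drives), here is a
minimal sketch. It is only arithmetic, not anything Btrfs or GlusterFS
provides: the function names, the independent-failure model, and the
5% per-drive failure probability are all assumptions made for
illustration.

```python
"""Back-of-the-envelope redundancy arithmetic (illustrative only)."""


def tolerated_failures(copies: int) -> int:
    """Worst-case drive losses survivable when every block has `copies`
    replicas, each on a different drive (assumption: strict mirroring)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return copies - 1


def p_at_least_two_failures(drives: int, p_fail: float) -> float:
    """Probability that two or more of `drives` fail, assuming each drive
    fails independently with probability `p_fail` (a hypothetical model)."""
    if drives < 1 or not 0.0 <= p_fail <= 1.0:
        raise ValueError("need drives >= 1 and 0 <= p_fail <= 1")
    p_none = (1.0 - p_fail) ** drives
    p_one = drives * p_fail * (1.0 - p_fail) ** (drives - 1)
    return 1.0 - p_none - p_one


if __name__ == "__main__":
    # Two-copy raid1 survives one loss; n-way raid1 would survive n-1.
    for copies in (2, 3, 4):
        print(f"{copies} copies -> tolerates {tolerated_failures(copies)} failure(s)")

    # With only two copies, a double failure can lose data, and it
    # becomes more likely as the file system spans more drives.
    for drives in (2, 4, 8, 16):
        p = p_at_least_two_failures(drives, p_fail=0.05)  # 5% is made up
        print(f"{drives:2d} drives -> P(two or more failures) = {p:.4f}")
```

Real drives do not fail independently (shared batches, controllers,
and power all correlate failures), so treat the second function as an
optimistic estimate of the double-failure risk.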


On Wed, Apr 27, 2016 at 5:22 AM, Austin S. Hemmelgarn
<ahferroin7@gmail.com> wrote:
> On 2016-04-26 20:58, Chris Murphy wrote:
>>
>> On Tue, Apr 26, 2016 at 5:44 AM, Juan Alberto Cirez
>> <jacirez@rdcsafety.com> wrote:
>>>
>>>
>>> With GlusterFS as a distributed volume, the files are already spread
>>> among the servers causing file I/O to be spread fairly evenly among
>>> them as well, thus probably providing the benefit one might expect
>>> with stripe (RAID10).
>>
>>
>> Yes, the raid1 of Btrfs is just so you don't have to rebuild volumes
>> if you lose a drive. But since raid1 is not n-way copies, and only
>> means two copies, you don't really want the file systems getting that
>> big or you increase the chances of a double failure.
>>
>> I've always thought it'd be neat in a Btrfs + GlusterFS, if it were
>> possible for Btrfs to inform GlusterFS of "missing/corrupt" files,
>> and then for Btrfs to drop reference for those files, instead of
>> either rebuilding or remaining degraded. And then let GlusterFS deal
>> with replication of those files to maintain redundancy. i.e. the Btrfs
>> volumes would be single profile for data, and raid1 for metadata. When
>> there's n-way raid1, each drive can have a copy of the file system,
>> and it'd tolerate in effect n-1 drive failures and the file system
>> could at least still inform Gluster (or Ceph) of the missing data, the
>> file system still remains valid, only briefly degraded, and can still
>> be expanded when new drives become available.
>
> FWIW, I _think_ this can be done with the scrubbing code in GlusterFS. It's
> designed to repair data mismatches, but I'm not sure how it handles missing
> copies of data.  However, in the current state, there's no way without
> external scripts to handle re-shaping of the storage bricks if part of them
> fails.
