* Potential to lose data in case of disk failure
@ 2015-11-11 17:30 Jim Murphy
2015-11-11 20:24 ` Sean Greenslade
2015-11-11 23:13 ` Chris Murphy
0 siblings, 2 replies; 9+ messages in thread
From: Jim Murphy @ 2015-11-11 17:30 UTC (permalink / raw)
To: linux-btrfs
Hi all,
What am I missing or misunderstanding? I have a newly
purchased laptop I want/need to multi boot different OSs
on. As a result after partitioning I have ended up with two
partitions on each of the two internal drives (sda3, sda8,
sdb3 and sdb8). FWIW, sda3 and sdb3 are the same size
and sda8 and sdb8 are the same size. As an end result
I want one btrfs raid1 filesystem. For lack of better terms,
sda3 and sda8 "concatenated" together, sdb3 and sdb8
"concatenated" together and then mirroring "sda" to "sdb"
using only btrfs. So far have found no use-case to cover
this.
If I create a raid1 btrfs volume using all 4 "devices" as I
understand it I would lose data if I were to lose a drive
because two mirror possibilities would be:
sda3 mirrored to sda8
sdb3 mirrored to sdb8
Is what I want to do possible without using MD-RAID and/or
LVM? If so would someone point me to the documentation
I missed. For whatever reason, I don't want to believe that
this can't be done. I want to believe that the code in btrfs
is smart enough to know that sda3 and sda8 are on the same
drive and would not try to mirror data between them except in
a test setup. I hope I just missed some documentation,
somewhere.
Thanks in advance for your help. And last but not least,
thanks to all for your work on btrfs.
Jim
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Potential to lose data in case of disk failure
2015-11-11 17:30 Potential to lose data in case of disk failure Jim Murphy
@ 2015-11-11 20:24 ` Sean Greenslade
2015-11-12 12:47 ` Austin S Hemmelgarn
2015-11-11 23:13 ` Chris Murphy
1 sibling, 1 reply; 9+ messages in thread
From: Sean Greenslade @ 2015-11-11 20:24 UTC (permalink / raw)
To: Jim Murphy; +Cc: linux-btrfs
On Wed, Nov 11, 2015 at 11:30:57AM -0600, Jim Murphy wrote:
> Hi all,
>
> What am I missing or misunderstanding? I have a newly
> purchased laptop I want/need to multi boot different OSs
> on. As a result after partitioning I have ended up with two
> partitions on each of the two internal drives(sda3, sda8,
> sdb3 and sdb8). FWIW, sda3 and sdb3 are the same size
> and sda8 and sdb8 are the same size. As an end result
> I want one btrfs raid1 filesystem. For lack of better terms,
> sda3 and sda8 "concatenated" together, sdb3 and sdb8
> "concatenated" together and then mirroring "sda" to "sdb"
> using only btrfs. So far have found no use-case to cover
> this.
>
> If I create a raid1 btrfs volume using all 4 "devices" as I
> understand it I would lose data if I were to lose a drive
> because two mirror possibilities would be:
>
> sda3 mirrored to sda8
> sdb3 mirrored to sdb8
>
> Is what I want to do possible without using MD-RAID and/or
> LVM? If so would someone point me to the documentation
> I missed. For whatever reason, I don't want to believe that
> this can't be done. I want to believe that the code in btrfs
> is smart enough to know that sda3 and sda8 are on the same
> drive and would not try to mirror data between them except in
> a test setup. I hope I just missed some documentation,
> somewhere.
>
> Thanks in advance for your help. And last but not least,
> thanks to all for your work on btrfs.
>
> Jim
That's a pretty unusual setup, so I'm not surprised there's no quick and
easy answer. The best solution in my opinion would be to shuffle your
partitions around and combine sda3 and sda8 into a single partition.
There's generally no reason to present btrfs with two different
partitions on the same disk.
If there's something that prevents you from doing that, you may be able
to use RAID10 or RAID6 somehow. I'm not really sure, though, so I'll
defer to others on the list for implementation details.
--Sean
* Re: Potential to lose data in case of disk failure
2015-11-11 17:30 Potential to lose data in case of disk failure Jim Murphy
2015-11-11 20:24 ` Sean Greenslade
@ 2015-11-11 23:13 ` Chris Murphy
2015-11-12 5:07 ` Duncan
1 sibling, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2015-11-11 23:13 UTC (permalink / raw)
To: Btrfs BTRFS
On Wed, Nov 11, 2015 at 12:30 PM, Jim Murphy <srlinuxadmin@gmail.com> wrote:
> Hi all,
>
> What am I missing or misunderstanding? I have a newly
> purchased laptop I want/need to multi boot different OSs
> on. As a result after partitioning I have ended up with two
> partitions on each of the two internal drives(sda3, sda8,
> sdb3 and sdb8). FWIW, sda3 and sdb3 are the same size
> and sda8 and sdb8 are the same size. As an end result
> I want one btrfs raid1 filesystem. For lack of better terms,
> sda3 and sda8 "concatenated" together, sdb3 and sdb8
> "concatenated" together and then mirroring "sda" to "sdb"
> using only btrfs. So far have found no use-case to cover
> this.
I'm going to assume the mkfs.btrfs -mraid1 -draid1 command is pointed
at the two /dev/mapper/X devices resulting from the linear concat.
>
> If I create a raid1 btrfs volume using all 4 "devices" as I
> understand it I would lose data if I were to lose a drive
> because two mirror possibilities would be:
>
> sda3 mirrored to sda8
> sdb3 mirrored to sdb8
The problem with this arrangement is that you don't actually know how
the mirroring will allocate chunks. But yes, it's possible some chunks
on sda3 will be mirrored to sda8, which is not what you'd want, so the
linear concat idea is fine using either the md driver or lvm.
> Is what I want to do possible without using MD-RAID and/or
> LVM?
Yes, either are suitable for this purpose. The decision comes down to
the user space tools, use the tool that you're most comfortable with.
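For illustration, a rough sketch of both approaches using the OP's
partition layout (the md device numbers and VG/LV names here are made
up, and the commands are untested, so treat this as a sketch rather
than a recipe):

```shell
# Option A: mdadm -- one linear (concat) array per physical disk,
# then btrfs raid1 across the two resulting md devices.
mdadm --create /dev/md0 --level=linear --raid-devices=2 /dev/sda3 /dev/sda8
mdadm --create /dev/md1 --level=linear --raid-devices=2 /dev/sdb3 /dev/sdb8
mkfs.btrfs -mraid1 -draid1 /dev/md0 /dev/md1

# Option B: LVM -- one VG and one linear LV per disk, same end result.
pvcreate /dev/sda3 /dev/sda8 /dev/sdb3 /dev/sdb8
vgcreate vg_a /dev/sda3 /dev/sda8
vgcreate vg_b /dev/sdb3 /dev/sdb8
lvcreate -n concat_a -l 100%FREE vg_a
lvcreate -n concat_b -l 100%FREE vg_b
mkfs.btrfs -mraid1 -draid1 /dev/vg_a/concat_a /dev/vg_b/concat_b
```

Either way btrfs only ever sees one device per physical disk, so each
raid1 chunk copy necessarily lands on a different spindle.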
> If so would someone point me to the documentation
> I missed. For whatever reason, I don't want to believe that
> this can't be done. I want to believe that the code in btrfs
> is smart enough to know that sda3 and sda8 are on the same
> drive and would not try to mirror data between them except in
> a test setup. I hope I just missed some documentation,
> somewhere.
As far as I know, btrfs currently works strictly at the block device
level and considers different partitions to be different block devices;
it doesn't grok the underlying physical device relationship.
--
Chris Murphy
* Re: Potential to lose data in case of disk failure
2015-11-11 23:13 ` Chris Murphy
@ 2015-11-12 5:07 ` Duncan
0 siblings, 0 replies; 9+ messages in thread
From: Duncan @ 2015-11-12 5:07 UTC (permalink / raw)
To: linux-btrfs
Chris Murphy posted on Wed, 11 Nov 2015 18:13:22 -0500 as excerpted:
> On Wed, Nov 11, 2015 at 12:30 PM, Jim Murphy <srlinuxadmin@gmail.com>
> wrote:
>> Hi all,
>>
>> What am I missing or misunderstanding? I have a newly purchased laptop
>> I want/need to multi boot different OSs on. As a result after
>> partitioning I have ended up with two partitions on each of the two
>> internal drives(sda3, sda8, sdb3 and sdb8). FWIW, sda3 and sdb3 are
>> the same size and sda8 and sdb8 are the same size. As an end result I
>> want one btrfs raid1 filesystem. For lack of better terms,
>> sda3 and sda8 "concatenated" together, sdb3 and sdb8 "concatenated"
>> together and then mirroring "sda" to "sdb" using only btrfs. So far
>> have found no use-case to cover this.
There isn't any... using ONLY btrfs (as the OP specified). You need
either mdraid or lvm to concatenate the two logical devices (partitions)
on a single physical device into one, so then btrfs will see only two
devices and put a raid1 copy on each.
This is because (reordering a bit of your quote from further below)...
> btrfs works strictly at the block device level, and considers different
> partitions different block devices, it doesn't grok the underlying
> physical device relationship.
[end of reordered bit]
> I'm going to assume that mkfs.btrfs -mraid1 -draid1 command is pointed
> at the two resulting /dev/mapper/X devices resulting from the linear
> concat.
Except, under the conditions he specified, there will be no such linear
concat mapper device available.
>> If I create a raid1 btrfs volume using all 4 "devices" as I understand
>> it I would lose data if I were to lose a drive because two mirror
>> possibilities would be:
>>
>> sda3 mirrored to sda8 sdb3 mirrored to sdb8
>
> Well you don't actually know how the mirroring will allocate is the
> problem with the arrangement. But yes, it's possible some chunks on sda3
> will be mirrored to sda8, which is not what you'd want so the linear
> concat idea is fine using either the md driver or lvm.
>
>> Is what I want to do possible without using MD-RAID and/or LVM?
>
> Yes, either are suitable for this purpose. The decision comes down to
> the user space tools, use the tool that you're most comfortable with.
It's not possible /without/ using them, no. Using them, yes, but that's
not the question that was asked.
As for the explanation, that's the part that I reordered above, btrfs
only sees block devices. It doesn't know nor care what they're from.
One workaround would be as Sean Greenslade's, using either a partitioning
tool that can safely move partitions around without destroying the data
in them, or simply copying off to backup, deleting the partitions and
recreating them in a more workable layout, then restoring from backup to
the new partitions, combining the two partitions on each physical device
into one.
Another workaround would be putting btrfs on top of an mdraid or lvm
setup as above, thereby using software to overcome the hardware layout
limitations.
Yet another, the one I'd almost certainly use here unless the use-case
made it too inconvenient or even impossible (say, one file too big for
either pair alone), would be to simply create two entirely separate
btrfs raid1 filesystems, each composed of the two partitions of similar
size. I'm already a strong booster of using partitioning to avoid
putting all my data eggs in one basket, and I already use multiple
separate btrfs raid1s on partitions of the same two physical devices,
so this wouldn't even be beyond what I'm already doing. And if the OP's
already dealing with that many partitions, it sounds like he'd be able
to handle it fairly well too. After all, there are always symlinks and
bind-mounts available to make parts of one filesystem available in
arbitrary locations on another, if the location of the directories
themselves is hard-coded to such an extent that you can't simply move
them and point everything that referenced the old location at the new
one.
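Sketched out (device names as in the OP's mail, mount points purely
illustrative):

```shell
# Two independent btrfs raid1 filesystems, one per matched-size pair,
# each mirroring across the two physical disks.
mkfs.btrfs -mraid1 -draid1 /dev/sda3 /dev/sdb3
mkfs.btrfs -mraid1 -draid1 /dev/sda8 /dev/sdb8

# Mount each filesystem wherever it fits the layout, e.g.:
mount /dev/sda3 /mnt/pool1
mount /dev/sda8 /mnt/pool2
```

No md or lvm layer needed, at the cost of managing free space in two
pools instead of one.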
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Potential to lose data in case of disk failure
2015-11-11 20:24 ` Sean Greenslade
@ 2015-11-12 12:47 ` Austin S Hemmelgarn
2015-11-12 17:23 ` Dmitry Katsubo
0 siblings, 1 reply; 9+ messages in thread
From: Austin S Hemmelgarn @ 2015-11-12 12:47 UTC (permalink / raw)
To: Sean Greenslade, Jim Murphy; +Cc: linux-btrfs
On 2015-11-11 15:24, Sean Greenslade wrote:
> On Wed, Nov 11, 2015 at 11:30:57AM -0600, Jim Murphy wrote:
>> Hi all,
>>
>> What am I missing or misunderstanding? I have a newly
>> purchased laptop I want/need to multi boot different OSs
>> on. As a result after partitioning I have ended up with two
>> partitions on each of the two internal drives(sda3, sda8,
>> sdb3 and sdb8). FWIW, sda3 and sdb3 are the same size
>> and sda8 and sdb8 are the same size. As an end result
>> I want one btrfs raid1 filesystem. For lack of better terms,
>> sda3 and sda8 "concatenated" together, sdb3 and sdb8
>> "concatenated" together and then mirroring "sda" to "sdb"
>> using only btrfs. So far have found no use-case to cover
>> this.
>>
>> If I create a raid1 btrfs volume using all 4 "devices" as I
>> understand it I would lose data if I were to lose a drive
>> because two mirror possibilities would be:
>>
>> sda3 mirrored to sda8
>> sdb3 mirrored to sdb8
>>
>> Is what I want to do possible without using MD-RAID and/or
>> LVM? If so would someone point me to the documentation
>> I missed. For whatever reason, I don't want to believe that
>> this can't be done. I want to believe that the code in btrfs
>> is smart enough to know that sda3 and sda8 are on the same
>> drive and would not try to mirror data between them except in
>> a test setup. I hope I just missed some documentation,
>> somewhere.
>>
>> Thanks in advance for your help. And last but not least,
>> thanks to all for your work on btrfs.
>>
>> Jim
>
> That's a pretty unusual setup, so I'm not surprised there's no quick and
> easy answer. The best solution in my opinion would be to shuffle your
> partitions around and combine sda3 and sda8 into a single partition.
> There's generally no reason to present btrfs with two different
> partitions on the same disk.
>
> If there's something that prevents you from doing that, you may be able
> to use RAID10 or RAID6 somehow. I'm not really sure, though, so I'll
> defer to others on the list for implementation details.
RAID10 has the same issue. Assume you have 1 block. This gets stored
as 2 copies, each with 2 stripes, with the stripes split symmetrically.
For this, call the first half of the first copy 1a, the second half
1b, and likewise for 2a and 2b with the second copy. 1a and 2a have
identical contents, and 1b and 2b have identical contents. It is fully
possible that you will end up with this block striped such that 1a and
2a are on one disk, and 1b and 2b on the other. Based on this, losing
one disk would mean losing half the block, which, given how BTRFS
works, means losing the whole block (because neither copy would be
complete).
* Re: Potential to lose data in case of disk failure
2015-11-12 12:47 ` Austin S Hemmelgarn
@ 2015-11-12 17:23 ` Dmitry Katsubo
2015-11-12 17:55 ` Austin S Hemmelgarn
2015-11-13 14:51 ` Chris Murphy
0 siblings, 2 replies; 9+ messages in thread
From: Dmitry Katsubo @ 2015-11-12 17:23 UTC (permalink / raw)
To: linux-btrfs
On 2015-11-12 13:47, Austin S Hemmelgarn wrote:
>> That's a pretty unusual setup, so I'm not surprised there's no quick and
>> easy answer. The best solution in my opinion would be to shuffle your
>> partitions around and combine sda3 and sda8 into a single partition.
>> There's generally no reason to present btrfs with two different
>> partitions on the same disk.
>>
>> If there's something that prevents you from doing that, you may be able
>> to use RAID10 or RAID6 somehow. I'm not really sure, though, so I'll
>> defer to others on the list for implementation details.
> RAID10 has the same issue. Assume you have 1 block. This gets stored
> as 2 copies, each with 2 stripes, with the stripes split symmetrically.
> For this, call the first half of the first copy 1a, the second half 1b,
> and likewise for 2a and 2b with the second copy. 1a and 2a have
> identical contents, and 1b and 2b have identical contents. It is fully
> possible that you will end up with this block striped such that 1a and
> 2a are on one disk, and 1b and 2b on the other. Based on this, losing
> one disk would mean losing half the block, which would mean based on how
> BTRFS works that you would lose the whole block (because neither copy
> would be complete).
Does it equally apply to RAID1? Namely, if I create
mkfs.btrfs -mraid1 -draid1 /dev/sda3 /dev/sda8
then btrfs will "believe" that these are different drives and mistakenly
think that the RAID pre-condition is satisfied. Am I right? If so then I
think this is a trap, and mkfs.btrfs should at least warn (or require
--force) if two partitions are on the same drive for raid1/raid5/raid10.
In other words, the only scenario when this check should be skipped is:
mkfs.btrfs -mraid0 -draid0 /dev/sda3 /dev/sda8
--
With best regards,
Dmitry
* Re: Potential to lose data in case of disk failure
2015-11-12 17:23 ` Dmitry Katsubo
@ 2015-11-12 17:55 ` Austin S Hemmelgarn
2015-11-13 14:51 ` Chris Murphy
1 sibling, 0 replies; 9+ messages in thread
From: Austin S Hemmelgarn @ 2015-11-12 17:55 UTC (permalink / raw)
To: Dmitry Katsubo, linux-btrfs
On 2015-11-12 12:23, Dmitry Katsubo wrote:
> On 2015-11-12 13:47, Austin S Hemmelgarn wrote:
>>> That's a pretty unusual setup, so I'm not surprised there's no quick and
>>> easy answer. The best solution in my opinion would be to shuffle your
>>> partitions around and combine sda3 and sda8 into a single partition.
>>> There's generally no reason to present btrfs with two different
>>> partitions on the same disk.
>>>
>>> If there's something that prevents you from doing that, you may be able
>>> to use RAID10 or RAID6 somehow. I'm not really sure, though, so I'll
>>> defer to others on the list for implementation details.
>> RAID10 has the same issue. Assume you have 1 block. This gets stored
>> as 2 copies, each with 2 stripes, with the stripes split symmetrically.
>> For this, call the first half of the first copy 1a, the second half 1b,
>> and likewise for 2a and 2b with the second copy. 1a and 2a have
>> identical contents, and 1b and 2b have identical contents. It is fully
>> possible that you will end up with this block striped such that 1a and
>> 2a are on one disk, and 1b and 2b on the other. Based on this, losing
>> one disk would mean losing half the block, which would mean based on how
>> BTRFS works that you would lose the whole block (because neither copy
>> would be complete).
>
> Does it equally apply to RAID1? Namely, if I create
>
> mkfs.btrfs -mraid1 -draid1 /dev/sda3 /dev/sda8
>
> then btrfs will "believe" that these are different drives and mistakenly
> think that RAID pre-condition is satisfied. Am I right? If so then I
> think this is a trap, and mkfs.btrfs should at least warn (or require
> --force) if two partitions are on the same drive for raid1/raid5/raid10.
> In other words, the only scenario when this check should be skipped is:
>
> mkfs.btrfs -mraid0 -draid0 /dev/sda3 /dev/sda8
>
Yes, BTRFS will assume you know what you are doing in that case and just
do it. We probably should have some kind of warning, but that gets
tricky when you throw in stuff like LVM (which can have any arbitrary
name for the logical volumes, and there isn't any way, even using the
tools, to easily figure out what disk a given LV is on).
* Re: Potential to lose data in case of disk failure
2015-11-12 17:23 ` Dmitry Katsubo
2015-11-12 17:55 ` Austin S Hemmelgarn
@ 2015-11-13 14:51 ` Chris Murphy
2015-11-13 15:52 ` Austin S Hemmelgarn
1 sibling, 1 reply; 9+ messages in thread
From: Chris Murphy @ 2015-11-13 14:51 UTC (permalink / raw)
To: Dmitry Katsubo; +Cc: linux-btrfs
On Thu, Nov 12, 2015 at 12:23 PM, Dmitry Katsubo <dma_k@mail.ru> wrote:
> On 2015-11-12 13:47, Austin S Hemmelgarn wrote:
>>> That's a pretty unusual setup, so I'm not surprised there's no quick and
>>> easy answer. The best solution in my opinion would be to shuffle your
>>> partitions around and combine sda3 and sda8 into a single partition.
>>> There's generally no reason to present btrfs with two different
>>> partitions on the same disk.
>>>
>>> If there's something that prevents you from doing that, you may be able
>>> to use RAID10 or RAID6 somehow. I'm not really sure, though, so I'll
>>> defer to others on the list for implementation details.
>> RAID10 has the same issue. Assume you have 1 block. This gets stored
>> as 2 copies, each with 2 stripes, with the stripes split symmetrically.
>> For this, call the first half of the first copy 1a, the second half 1b,
>> and likewise for 2a and 2b with the second copy. 1a and 2a have
>> identical contents, and 1b and 2b have identical contents. It is fully
>> possible that you will end up with this block striped such that 1a and
>> 2a are on one disk, and 1b and 2b on the other. Based on this, losing
>> one disk would mean losing half the block, which would mean based on how
>> BTRFS works that you would lose the whole block (because neither copy
>> would be complete).
>
> Does it equally apply to RAID1? Namely, if I create
>
> mkfs.btrfs -mraid1 -draid1 /dev/sda3 /dev/sda8
>
> then btrfs will "believe" that these are different drives and mistakenly
> think that RAID pre-condition is satisfied. Am I right?
Yes.
>If so then I
> think this is a trap, and mkfs.btrfs should at least warn (or require
> --force) if two partitions are on the same drive for raid1/raid5/raid10.
Does mdadm warn in the same situation? LVM?
There are some assumptions being made about the end user's
understanding of the system they're working on, and those assumptions
aren't unreasonable. But I'm not opposed to an informational message.
--
Chris Murphy
* Re: Potential to lose data in case of disk failure
2015-11-13 14:51 ` Chris Murphy
@ 2015-11-13 15:52 ` Austin S Hemmelgarn
0 siblings, 0 replies; 9+ messages in thread
From: Austin S Hemmelgarn @ 2015-11-13 15:52 UTC (permalink / raw)
To: Chris Murphy, Dmitry Katsubo; +Cc: linux-btrfs
On 2015-11-13 09:51, Chris Murphy wrote:
> On Thu, Nov 12, 2015 at 12:23 PM, Dmitry Katsubo <dma_k@mail.ru> wrote:
>> If so then I
>> think this is a trap, and mkfs.btrfs should at least warn (or require
>> --force) if two partitions are on the same drive for raid1/raid5/raid10.
>
> Does mdadm warn in the same situation? LVM?
>
> There are some assumptions being made about the end user's
> understanding of the system they're working on, and those assumptions
> aren't unreasonable. But I'm not opposed to an informational message.
LVM doesn't, and I don't think mdadm does. With LVM, things are
handled differently from mkfs: you either specify which specific PVs
you want the LV to be on (in which case it's perfectly reasonable to
assume you know what you're doing), or you let LVM try to find optimal
placement (using a similar algorithm to how BTRFS decides which device
a new chunk goes on). LVM doesn't check that the PVs are on different
disks (and in fact, even if it did, you could use pvmove to force them
onto the same disk anyway), but either way it refuses to let you have
more copies than you have PVs for a RAID set. I'm not certain how
mdadm handles things, but I don't think it warns about having multiple
components of a given RAID set on the same disk.
Regardless of what LVM and mdadm do, it's not unreasonable to expect
that people will assume BTRFS is smart enough to balance things
properly (too many people don't do any research about a program before
trying to use it, and I must commend Jim for taking the time to verify
whether things would behave the way he wanted), so I do think putting
a warning in would be a good idea. It won't be perfect, of course,
because people (myself included) use BTRFS on top of DM (be it LVM,
dm-crypt, or something else), MD, and other intervening block layers,
so we can't just try sub-string matching on the device names.
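For what it's worth, a partial heuristic is possible by walking sysfs
back to the parent device; here's a sketch (the helper name is mine,
and it only resolves plain partitions, which is exactly the limitation
being described):

```shell
# parent_disk SYSROOT PART: print the whole-disk parent of a plain
# partition by following the /sys/class/block symlink.  DM/MD devices
# resolve to a virtual parent, not a physical disk -- which is exactly
# why a general same-disk warning is hard to get right.
parent_disk() {
    part=$(basename "$2")
    basename "$(dirname "$(readlink -f "$1/class/block/$part")")"
}

# Demo against a throwaway fake sysfs tree (so this runs anywhere);
# on a real system you would pass /sys and e.g. /dev/sda3.
root=$(mktemp -d)
mkdir -p "$root/devices/pci0000:00/ata1/sda/sda3" "$root/class/block"
ln -s "$root/devices/pci0000:00/ata1/sda/sda3" "$root/class/block/sda3"
parent_disk "$root" /dev/sda3    # prints: sda
```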
We absolutely should not refuse to let the user do this if they want
to, however, as there are people who use RAID1 on a single disk for
protection against corruption, even though it doesn't protect against
hardware failure (although there is a patch on the ML for using the
dup profile for data chunks, so hopefully this practice will become
less common in the near future).
end of thread, other threads:[~2015-11-13 15:53 UTC | newest]
Thread overview: 9+ messages
2015-11-11 17:30 Potential to lose data in case of disk failure Jim Murphy
2015-11-11 20:24 ` Sean Greenslade
2015-11-12 12:47 ` Austin S Hemmelgarn
2015-11-12 17:23 ` Dmitry Katsubo
2015-11-12 17:55 ` Austin S Hemmelgarn
2015-11-13 14:51 ` Chris Murphy
2015-11-13 15:52 ` Austin S Hemmelgarn
2015-11-11 23:13 ` Chris Murphy
2015-11-12 5:07 ` Duncan