* Re: remote mirroring in the works?
       [not found] <AANLkTinmSdHwXXq3s64sM39GjacafgwgTjPadZGHuway@mail.gmail.com>
@ 2010-08-30 17:07 ` Fred van Zwieten
  2010-08-30 17:21   ` Bryan Whitehead
                     ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Fred van Zwieten @ 2010-08-30 17:07 UTC (permalink / raw)
  To: linux-btrfs

Hi there,

I would like to know if there is something functionally equivalent to
NetApp's SnapMirror in the works or being planned. It would require
block-level access to a snap and the ability to rebuild the snaps
(including those of subvolumes) on another machine.

If not, what would be the best way to build something more or less
equivalent using existing tools? rsync-ing a snap seems the same, but
it isn't. First of all, it's file based, which is not very nice for
DBs, and you don't get the same snaps on "the other side".

Fred


* Re: remote mirroring in the works?
  2010-08-30 17:07 ` remote mirroring in the works? Fred van Zwieten
@ 2010-08-30 17:21   ` Bryan Whitehead
  2010-08-30 17:33   ` Roy Sigurd Karlsbakk
  2010-08-30 17:55   ` K. Richard Pixley
  2 siblings, 0 replies; 16+ messages in thread
From: Bryan Whitehead @ 2010-08-30 17:21 UTC (permalink / raw)
  To: Fred van Zwieten; +Cc: linux-btrfs

LVM Snapshot.

lvcreate -s -n SnapShotName /dev/VolumeGroup/SourceLogicalVolumeName

You may need to pass -l or -L to give an initial size for the COW.

(As for rebuilding on another machine, that would require shared
storage or additional LVM tricks to export/import, or good
old-fashioned dd.)
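
A minimal sketch of the snapshot-plus-dd route, assuming classic (non-thin) LVM snapshots and SSH access to the other machine. The helper names, the 1G COW size, and the host and device names are made up for illustration. The functions only build the command lines, so they can be reviewed before anything is run.

```python
import shlex


def snapshot_command(origin, snap_name, cow_size="1G"):
    """Build the lvcreate call that snapshots `origin` (e.g. /dev/vg0/data)."""
    # -s = snapshot, -n = snapshot name, -L = space reserved for COW data
    return ["lvcreate", "-s", "-n", snap_name, "-L", cow_size, origin]


def copy_command(snap_dev, remote_host, remote_dev, block_size="4M"):
    """Build a shell pipeline that streams the snapshot to another machine."""
    local = f"dd if={shlex.quote(snap_dev)} bs={block_size}"
    remote = f"dd of={shlex.quote(remote_dev)} bs={block_size}"
    return f"{local} | ssh {shlex.quote(remote_host)} {shlex.quote(remote)}"


if __name__ == "__main__":
    # Hypothetical volume group, snapshot, and host names.
    print(" ".join(snapshot_command("/dev/vg0/data", "data_snap")))
    print(copy_command("/dev/vg0/data_snap", "backuphost", "/dev/vg1/data_copy"))
```

Note that this copies the whole volume on every run; nothing here is incremental.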

That said, a more appropriate list for this question is linux-lvm@redhat.com

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: remote mirroring in the works?
  2010-08-30 17:07 ` remote mirroring in the works? Fred van Zwieten
  2010-08-30 17:21   ` Bryan Whitehead
@ 2010-08-30 17:33   ` Roy Sigurd Karlsbakk
  2010-08-30 17:55   ` K. Richard Pixley
  2 siblings, 0 replies; 16+ messages in thread
From: Roy Sigurd Karlsbakk @ 2010-08-30 17:33 UTC (permalink / raw)
  To: Fred van Zwieten; +Cc: linux-btrfs

----- Original Message -----
> Hi there,
>
> I would like to know if there is something functionally equivalent to
> NetApp's SnapMirror in the works or being planned. It would require
> block-level access to a snap and the ability to rebuild the snaps
> (including those of subvolumes) on another machine.
>
> If not, what would be the best way to build something more or less
> equivalent using existing tools? rsync-ing a snap seems the same, but
> it isn't. First of all, it's file based, which is not very nice for
> DBs, and you don't get the same snaps on "the other side".

Perhaps DRBD - see http://www.drbd.org/ - that'll mirror the block device(s)
on which btrfs resides.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.

* Re: remote mirroring in the works?
  2010-08-30 17:07 ` remote mirroring in the works? Fred van Zwieten
  2010-08-30 17:21   ` Bryan Whitehead
  2010-08-30 17:33   ` Roy Sigurd Karlsbakk
@ 2010-08-30 17:55   ` K. Richard Pixley
  2010-08-30 17:59     ` Roy Sigurd Karlsbakk
  2010-08-30 21:15     ` Fred van Zwieten
  2 siblings, 2 replies; 16+ messages in thread
From: K. Richard Pixley @ 2010-08-30 17:55 UTC (permalink / raw)
  To: Fred van Zwieten; +Cc: linux-btrfs

  On 20100830 10:07, Fred van Zwieten wrote:
> Hi there,
>
> I would like to know if there is something functionally equivalent to
> NetApp's SnapMirror in the works or being planned. It would require
> block-level access to a snap and the ability to rebuild the snaps
> (including those of subvolumes) on another machine.
>
> If not, what would be the best way to build something more or less
> equivalent using existing tools? rsync-ing a snap seems the same, but
> it isn't. First of all, it's file based, which is not very nice for
> DBs, and you don't get the same snaps on "the other side".
>
> Fred
I think drbd does precisely what you want.

It's not useful for fault tolerance, nor for load balancing, but it will 
produce a remote block copy that can be used as a sort of "hot backup".

You can also do something very similar by combining LVM (the logical
volume manager) and LVM snapshots with NBD (the network block device),
mirroring to an NBD device.

Neither of these approaches can tolerate the remote file system being 
"live" until and unless it takes over for the primary.  But either can 
maintain a dynamic remote block device.

--rich


* Re: remote mirroring in the works?
  2010-08-30 17:55   ` K. Richard Pixley
@ 2010-08-30 17:59     ` Roy Sigurd Karlsbakk
  2010-08-30 18:14       ` K. Richard Pixley
  2010-08-30 21:15     ` Fred van Zwieten
  1 sibling, 1 reply; 16+ messages in thread
From: Roy Sigurd Karlsbakk @ 2010-08-30 17:59 UTC (permalink / raw)
  To: K. Richard Pixley; +Cc: linux-btrfs, Fred van Zwieten

> I think drbd does precisely what you want.
>
> It's not useful for fault tolerance, nor for load balancing, but it
> will
> produce a remote block copy that can be used as a sort of "hot
> backup".

drbd with heartbeat/pacemaker can provide fault tolerance...

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy@karlsbakk.net
http://blogg.karlsbakk.net/

* Re: remote mirroring in the works?
  2010-08-30 17:59     ` Roy Sigurd Karlsbakk
@ 2010-08-30 18:14       ` K. Richard Pixley
  2010-08-31  6:30         ` Simon Kirby
  2010-09-06 21:50         ` David Nicol
  0 siblings, 2 replies; 16+ messages in thread
From: K. Richard Pixley @ 2010-08-30 18:14 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: linux-btrfs, Fred van Zwieten

  On 20100830 10:59, Roy Sigurd Karlsbakk wrote:
>> I think drbd does precisely what you want.
>>
>> It's not useful for fault tolerance, nor for load balancing, but it
>> will
>> produce a remote block copy that can be used as a sort of "hot
>> backup".
> drbd with heartbeat/pacemaker can provide fault tolerance...
I think that's a matter of semantics.

Once you've failed over from the primary system to the secondary, 
changes to your block device are terminal.  It's not easy to produce a 
system which can manage those changes and "heal" in the sense of 
allowing the primary system to return to service.  In effect, returning 
the primary system to service requires taking both systems down and 
copying the block device from the secondary back to the first.

In terms of fault tolerance, I'd call this a tolerance of about a half a
fault since the system cannot return to its initial configuration
without breaking continuity of service.

And there really isn't any way to extend this. It's not fault tolerance 
in the virtual synchrony sense where there can be a pool of N machines, 
all symmetric, which can tolerate N - 1 failures and produce continuing 
service throughout.

It's also not load balanced in the virtual synchrony sense where N 
machines can all be in service concurrently and the service can tolerate 
N - 1 failures, albeit at degraded performance.  Or in the sense where 
failed servers can return to the group dynamically.

It's not sufficient for any application in which I've ever sought fault
tolerance.  If it's sufficient for you, that's great.  But my definition
of "fault tolerance" requires that the system be capable of returning to
its initial state without loss of service.  The heartbeat approach with
single failover can't do that.

--rich - who is likely now off topic.


* Re: remote mirroring in the works?
  2010-08-30 17:55   ` K. Richard Pixley
  2010-08-30 17:59     ` Roy Sigurd Karlsbakk
@ 2010-08-30 21:15     ` Fred van Zwieten
  2010-08-30 21:23       ` Freddie Cash
  2010-08-30 22:56       ` K. Richard Pixley
  1 sibling, 2 replies; 16+ messages in thread
From: Fred van Zwieten @ 2010-08-30 21:15 UTC (permalink / raw)
  To: K. Richard Pixley; +Cc: linux-btrfs

I just glanced over the DRBD/LVM combination, but I don't see it being
functionally equal to SnapMirror. Let me (try to) explain how
SnapMirror works:

On system A there is a volume (vol1). We let this vol1(A) replicate
through SnapMirror to vol1(B). This is done by creating a snap vol1sx(A)
and replicating all blocks that changed between this snapshot (x) and
the previous snapshot (x-1). The first time, there is no x-1 and the
whole volume is replicated, but after this initial "full copy", only
the blocks that changed between the two snapshots are replicated to
system B. This is also called snap-based replication.

Why do we want this? Easy: to support consistent DB snaps. The process
works by first putting the DB in a consistent mode (how depends on the
DB implementation), creating a snapshot, letting the DB continue, and
then replicating the changes. This way a consistent DB state is
replicated.

The cool thing about the NetApp implementation is that the snaps (x,
x-1, x-2, etc.) are also available on system B. When there is trouble,
you can choose to bring the DB online on system B on any of the snaps
or, even cooler, to replicate one of those snaps back to system A,
doing a block-based rollback at the filesystem level.
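
To make the scheme concrete, here is a toy model of it. This is only a sketch under stated assumptions, not how NetApp implements it: a snapshot is modelled as a dict of block number to data, and all names (changed_blocks, Replica, replicate) are invented for the example. A real implementation would read the changed blocks from the snapshot's own block map instead of comparing two snapshots.

```python
def changed_blocks(prev_snap, new_snap):
    """Return {block_no: data} for blocks that differ between two snapshots.

    With no previous snapshot (prev_snap is None) every block counts as
    changed, which corresponds to the initial full copy.
    """
    if prev_snap is None:
        return dict(new_snap)
    return {n: d for n, d in new_snap.items() if prev_snap.get(n) != d}


def freed_blocks(prev_snap, new_snap):
    """Blocks present in the previous snapshot but gone from the new one."""
    if prev_snap is None:
        return set()
    return set(prev_snap) - set(new_snap)


class Replica:
    """System B: keeps every replicated snapshot, not just the latest."""

    def __init__(self):
        self.snapshots = []  # oldest first; snapshots[-1] is snap x

    def receive(self, delta, freed):
        # Start from the newest snapshot already held and apply the delta.
        base = dict(self.snapshots[-1]) if self.snapshots else {}
        for n in freed:
            base.pop(n, None)
        base.update(delta)
        self.snapshots.append(base)


def replicate(source_snaps, replica):
    """Send snapshot x to the replica as a delta against snapshot x-1."""
    new = source_snaps[-1]
    prev = source_snaps[-2] if len(source_snaps) > 1 else None
    delta = changed_blocks(prev, new)
    replica.receive(delta, freed_blocks(prev, new))
    return len(delta)  # number of blocks that crossed the wire
```

Calling replicate() after each new snapshot on A leaves B with the same list of snapshots while sending only the delta each time. Rolling A back is the same operation in the other direction.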

Fred


* Re: remote mirroring in the works?
  2010-08-30 21:15     ` Fred van Zwieten
@ 2010-08-30 21:23       ` Freddie Cash
  2010-08-30 22:56       ` K. Richard Pixley
  1 sibling, 0 replies; 16+ messages in thread
From: Freddie Cash @ 2010-08-30 21:23 UTC (permalink / raw)
  To: Fred van Zwieten; +Cc: K. Richard Pixley, linux-btrfs

On Mon, Aug 30, 2010 at 2:15 PM, Fred van Zwieten <fvzwieten@gmail.com> wrote:
> I just glanced over the DRBD/LVM combination, but I don't see it being
> functionally equal to SnapMirror. Let me (try to) explain how
> SnapMirror works:
>
> On system A there is a volume (vol1). We let this vol1(A) replicate
> through SnapMirror to vol1(B). This is done by creating a snap vol1sx(A)
> and replicating all blocks that changed between this snapshot (x) and
> the previous snapshot (x-1). The first time, there is no x-1 and the
> whole volume is replicated, but after this initial "full copy", only
> the blocks that changed between the two snapshots are replicated to
> system B. This is also called snap-based replication.
>
> Why do we want this? Easy: to support consistent DB snaps. The process
> works by first putting the DB in a consistent mode (how depends on the
> DB implementation), creating a snapshot, letting the DB continue, and
> then replicating the changes. This way a consistent DB state is
> replicated.
>
> The cool thing about the NetApp implementation is that the snaps (x,
> x-1, x-2, etc.) are also available on system B. When there is trouble,
> you can choose to bring the DB online on system B on any of the snaps
> or, even cooler, to replicate one of those snaps back to system A,
> doing a block-based rollback at the filesystem level.

In the ZFS world, this would be the "zfs send" and "zfs recv"
functionality, in case anyone wants to read up on how it works over
there for ideas on how it could be implemented for btrfs in the
future.

-- 
Freddie Cash
fjwcash@gmail.com


* Re: remote mirroring in the works?
  2010-08-30 21:15     ` Fred van Zwieten
  2010-08-30 21:23       ` Freddie Cash
@ 2010-08-30 22:56       ` K. Richard Pixley
  2010-08-31  5:07         ` Fred van Zwieten
  1 sibling, 1 reply; 16+ messages in thread
From: K. Richard Pixley @ 2010-08-30 22:56 UTC (permalink / raw)
  To: Fred van Zwieten; +Cc: linux-btrfs

  If you can put the db into a consistent state, then rsync will do 
this.  Rsync does changed block transfers.
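
As a rough illustration of what a checksum-based "changed block transfer" involves, here is a simplified sketch. It is not rsync's actual algorithm (rsync pairs a rolling weak checksum with a strong one so it can match blocks at arbitrary offsets); the fixed 4096-byte block size and the function names are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # arbitrary block size chosen for this sketch


def block_digests(data, block_size=BLOCK_SIZE):
    """Hash every fixed-size block; note that this reads the entire image."""
    return [
        hashlib.sha1(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_send(source, target_digests, block_size=BLOCK_SIZE):
    """Return (index, block) pairs whose digest differs from the target's."""
    out = []
    for idx, digest in enumerate(block_digests(source, block_size)):
        if idx >= len(target_digests) or target_digests[idx] != digest:
            start = idx * block_size
            out.append((idx, source[start:start + block_size]))
    return out
```

Only the differing blocks cross the wire, but both sides still have to read and hash every block of a changed file to find them.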

--rich


* Re: remote mirroring in the works?
  2010-08-30 22:56       ` K. Richard Pixley
@ 2010-08-31  5:07         ` Fred van Zwieten
  2010-08-31  6:38           ` Simon Kirby
  0 siblings, 1 reply; 16+ messages in thread
From: Fred van Zwieten @ 2010-08-31  5:07 UTC (permalink / raw)
  To: K. Richard Pixley; +Cc: linux-btrfs

Hmmm, maybe, but rsync would take a lot of time to find the changes.
The actual blocks of a snap _are_ the changes; that's why SnapMirror
is very efficient. And I don't see how rsync would retain the snaps
on both sites. It would be great if a tool like rsync could have
access to the changed blocks in a snap. I don't know if btrfs exposes
these somehow.

Fred


* Re: remote mirroring in the works?
  2010-08-30 18:14       ` K. Richard Pixley
@ 2010-08-31  6:30         ` Simon Kirby
  2010-08-31 18:44           ` Fred van Zwieten
  2010-09-06 21:50         ` David Nicol
  1 sibling, 1 reply; 16+ messages in thread
From: Simon Kirby @ 2010-08-31  6:30 UTC (permalink / raw)
  To: K. Richard Pixley; +Cc: Roy Sigurd Karlsbakk, linux-btrfs, Fred van Zwieten

On Mon, Aug 30, 2010 at 11:14:51AM -0700, K. Richard Pixley wrote:

>  On 20100830 10:59, Roy Sigurd Karlsbakk wrote:
>>> I think drbd does precisely what you want.
>>>
>>> It's not useful for fault tolerance, nor for load balancing, but it
>>> will
>>> produce a remote block copy that can be used as a sort of "hot
>>> backup".
>> drbd with heartbeat/pacemaker can provide fault tolerance...
> I think that's a matter of semantics.
>
> Once you've failed over from the primary system to the secondary,  
> changes to your block device are terminal.  It's not easy to produce a  
> system which can manage those changes and "heal" in the sense of  
> allowing the primary system to return to service.  In effect, returning  
> the primary system to service requires taking both systems down and  
> copying the block device from the secondary back to the first.

This is totally incorrect.  DRBD replicates in both directions quite
well, in fact.  I've been using it on about 60 machines for many years,
and I have never had to do what you mention.

What it does not help with is avoiding corruption that occurs above the
block layer; eg, if your file system or your database on top of it barfs,
there is no other "good copy".  fsck or repair is still required in these
cases.  It is just like local RAID 1 in this respect -- you still need a
backup and/or copy at the file level, which is closer to what is needed
here.

Simon-


* Re: remote mirroring in the works?
  2010-08-31  5:07         ` Fred van Zwieten
@ 2010-08-31  6:38           ` Simon Kirby
  2010-08-31 18:29             ` Goffredo Baroncelli
  0 siblings, 1 reply; 16+ messages in thread
From: Simon Kirby @ 2010-08-31  6:38 UTC (permalink / raw)
  To: Fred van Zwieten; +Cc: K. Richard Pixley, linux-btrfs

On Tue, Aug 31, 2010 at 07:07:29AM +0200, Fred van Zwieten wrote:

> Hmmm, maybe, but rsync would take a lot of time to find the changes.
> The actual blocks of a snap _are_ the changes; that's why SnapMirror
> is very efficient. And I don't see how rsync would retain the snaps
> on both sites. It would be great if a tool like rsync could have
> access to the changed blocks in a snap. I don't know if btrfs exposes
> these somehow.

rsync doesn't have the hinting required to do this efficiently.  It has
to scan the whole thing every time it is run, and isn't anything like a
continuous replication in this respect.  Also, we've had problems in the
past with very large file systems causing rsync to run out of memory,
because it builds a file list in memory.  This led us to build a "cpbk"
tool that basically did the same thing without file lists, which turned
out to be a piece of crap, so some other guy kindly rewrote it, but he
unfortunately missed the original point entirely and rewrote it using
file lists.  Sigh.

Anyway, there _is_ this interface:

	btrfs subvolume find-new <path> <last_gen>
		List the recently modified files in a filesystem.

Eg:

	btrfs sub find-new /mnt 0

This should print all files on the file system, and the last transaction
ID marker.  This can be used to call the interface again, which lists
only new changed things since that ID.

So, it might be pretty easy to glue these tools together, for now, until
something does this automatically and/or in some more efficient or
low-level way.
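
A possible shape for that glue, as a sketch only. The two regular expressions encode an assumption about what find-new prints (one "inode ... flags ... path" line per changed extent and a final "transid marker was N" line); check them against your btrfs-progs version. The function names are invented for the example.

```python
import re
import subprocess

# Assumed output format of `btrfs subvolume find-new`; verify before use.
_LINE = re.compile(r"^inode \d+ file offset \d+ len \d+ disk start \d+ "
                   r"offset \d+ gen \d+ flags \S+ (?P<path>.+)$")
_MARKER = re.compile(r"^transid marker was (?P<gen>\d+)$")


def parse_find_new(output):
    """Return (sorted unique paths, transid marker or None)."""
    paths, marker = set(), None
    for line in output.splitlines():
        m = _LINE.match(line)
        if m:
            # A file with several changed extents shows up on several lines.
            paths.add(m.group("path"))
            continue
        m = _MARKER.match(line)
        if m:
            marker = int(m.group("gen"))
    return sorted(paths), marker


def find_new(subvol, last_gen):
    """Run find-new on `subvol` and parse its output (needs root and btrfs)."""
    out = subprocess.run(
        ["btrfs", "subvolume", "find-new", subvol, str(last_gen)],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_find_new(out)
```

The returned paths could be written to a file and handed to rsync with --files-from, and the marker saved as <last_gen> for the next run.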

Simon-


* Re: remote mirroring in the works?
  2010-08-31  6:38           ` Simon Kirby
@ 2010-08-31 18:29             ` Goffredo Baroncelli
  0 siblings, 0 replies; 16+ messages in thread
From: Goffredo Baroncelli @ 2010-08-31 18:29 UTC (permalink / raw)
  To: linux-btrfs

On Tuesday, 31 August, 2010, Simon Kirby wrote:
[...]
> Anyway, there _is_ this interface:
> 
> 	btrfs subvolume find-new <path> <last_gen>
> 		List the recently modified files in a filesystem.
> 
> Eg:
> 
> 	btrfs sub find-new /mnt 0
> 
> This should print all files on the file system, and the last transaction
> ID marker.  This can be used to call the interface again, which lists
> only new changed things since that ID.
  ^^^^

That is not fully correct. In fact, Chris Mason says
(from http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg04620.html):

Chris> When we find an inode in the output, it doesn't mean that inode has
Chris> changed.  It just means the btree block holding that inode has changed.
Chris> So we'll want to add limiting based on the ctime/mtime of the inode as
Chris> well.

So even though this command definitely helps, false positives may happen.
Moreover, an empty file is not detected (I think because the file doesn't
have associated data). But I think that this can be easily corrected.
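
A sketch of the ctime/mtime limiting Chris describes, done in user space on the candidate list. The function name and the idea of remembering the wall-clock time of the previous run as `since` are assumptions for the example. It only weeds out false positives; it does nothing about empty files that never made it into the list.

```python
import os


def really_changed(paths, since, root="."):
    """Keep only paths whose mtime or ctime is at or after `since`.

    `since` is a Unix timestamp, e.g. the time the previous run started.
    Paths that no longer exist under `root` are dropped.
    """
    kept = []
    for rel in paths:
        try:
            st = os.lstat(os.path.join(root, rel))
        except FileNotFoundError:
            continue
        if max(st.st_mtime, st.st_ctime) >= since:
            kept.append(rel)
    return kept
```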

Regards
G.Baroncelli

-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack@inwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512


* Re: remote mirroring in the works?
  2010-08-31  6:30         ` Simon Kirby
@ 2010-08-31 18:44           ` Fred van Zwieten
  0 siblings, 0 replies; 16+ messages in thread
From: Fred van Zwieten @ 2010-08-31 18:44 UTC (permalink / raw)
  To: Simon Kirby; +Cc: K. Richard Pixley, Roy Sigurd Karlsbakk, linux-btrfs

Thinking about this a bit more, would btrfs on top of DRBD come into
the neighborhood of what SnapMirror provides? DRBD does replication
at the block level, without any notion of the filesystem on top of it
(as I understand it). So, if I make a snapshot on a DRBD-backed btrfs
filesystem, this snapshot would also get replicated at the DRBD level.
Provided I put the DB in a consistent state before making the snap, I
have a remote consistent copy of this DB. This copy can be used as a
failover target or as a basis for a restore.

Am I correct?



* Re: remote mirroring in the works?
  2010-08-30 18:14       ` K. Richard Pixley
  2010-08-31  6:30         ` Simon Kirby
@ 2010-09-06 21:50         ` David Nicol
  2010-09-07  0:04           ` K. Richard Pixley
  1 sibling, 1 reply; 16+ messages in thread
From: David Nicol @ 2010-09-06 21:50 UTC (permalink / raw)
  Cc: linux-btrfs

On Mon, Aug 30, 2010 at 1:14 PM, K. Richard Pixley <rich@noir.com> wrote:
> In terms of fault tolerance, I'd call this a tolerance of about a half a
> fault since the system cannot return to its initial configuration without
> breaking continuity of service.
>
> And there really isn't any way to extend this. It's not fault tolerance in
> the virtual synchrony sense where there can be a pool of N machines, all
> symmetric, which can tolerate N - 1 failures and produce continuing service
> throughout.
>
> It's also not load balanced in the virtual synchrony sense where N machines
> can all be in service concurrently and the service can tolerate N - 1
> failures, albeit at degraded performance.  Or in the sense where failed
> servers can return to the group dynamically.
>
> It's not sufficient for any application in which I've ever sought fault
> tolerance.  If it's sufficient for you, that's great.  But my definition of
> "fault tolerance" requires that the system be capable of returning to its
> initial state without loss of service.  The heartbeat approach with single
> failover can't do that.
>
> --rich - who is likely now off topic.


Only off-topic if BTRFS isn't ever going to ooze into the space
currently occupied by the likes of

http://en.wikipedia.org/wiki/Global_File_System

that is, file systems that have multiple nodes simultaneously
accessing block devices and tolerating faults.

There's more to HA than the file system, though, depending on what
kinds of faults you're worried about mitigating.

I imagine (very likely now pollyanna) that  refactoring GFS's lock
discipline to work above BTRFS's on-disk format might be a worthwhile
endeavour, or at least something to think about.  Reimplementing the
locking over BTRFS might be a shorter way there.

I'm for modular extent provisioning, which in theory could allow all
kinds of unnecessarily wasted CPU cycles, as extents get managed by
user-space daemons allocating them out of whatever file systems they
happen to be running on.


-- 
"Elevator Inspection Certificate is on file in the Maintenance Office"

* Re: remote mirroring in the works?
  2010-09-06 21:50         ` David Nicol
@ 2010-09-07  0:04           ` K. Richard Pixley
  0 siblings, 0 replies; 16+ messages in thread
From: K. Richard Pixley @ 2010-09-07  0:04 UTC (permalink / raw)
  To: David Nicol

  On 20100906 14:50, David Nicol wrote:
> Only off-topic if BTRFS isn't ever going to ooze into the space
> currently occupied by the likes of
>
> http://en.wikipedia.org/wiki/Global_File_System
>
> that is, file systems that have multiple nodes simultaneously
> accessing block devices and tolerating faults.

There seem to be a number of other systems looking at building fault 
tolerance and distribution over btrfs: crfs, ceph, lustre.  I'm 
convinced that will happen even if btrfs doesn't do it natively.

Btrfs could probably be built to stripe and/or mirror over several nbd 
devices now, although I haven't tried it.

--rich
