linux-btrfs.vger.kernel.org archive mirror
* Copy/move btrfs volume
@ 2010-07-01 10:28 Lubos Kolouch
  2010-07-01 11:26 ` Daniel J Blueman
  0 siblings, 1 reply; 8+ messages in thread
From: Lubos Kolouch @ 2010-07-01 10:28 UTC (permalink / raw)
  To: linux-btrfs

Hello,

I am testing btrfs on one of our backup servers
(many millions of files, 1.5TB size, running on (non-btrfs-provided-) 
raid5).

I am using subvolumes/snapshots, with an rsync following each snapshot.

It works very well, but I would like to ask a question... say I would need
to copy/move the files to a different server/disk.

Normally I would do it with rsync, but I guess it will not preserve the 
subvolumes, it will also not detect that they are the same files (I guess
they are not just normal hardlinks). So I would end up with duplicated 
files.

What is the correct way to do this?

Thank you and best regards

Lubos Kolouch



* Re: Copy/move btrfs volume
  2010-07-01 10:28 Copy/move btrfs volume Lubos Kolouch
@ 2010-07-01 11:26 ` Daniel J Blueman
  2010-07-01 11:33   ` Lubos Kolouch
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel J Blueman @ 2010-07-01 11:26 UTC (permalink / raw)
  To: Lubos Kolouch; +Cc: linux-btrfs

On 1 July 2010 11:28, Lubos Kolouch <lubos.kolouch@gmail.com> wrote:
> Hello,
>
> I am testing btrfs on one of our backup servers
> (many millions of files, 1.5TB size, running on (non-btrfs-provided-)
> raid5).
>
> I am using subvolumes/snapshots with following rsync.
>
> It works very well, but I would like to ask a question... say I would need
> to copy/move the files to different server/disk.
>
> Normally I would do it with rsync, but I guess it will not preserve the
> subvolumes, it will also not detect that they are the same files (I guess
> they are not just normal hardlinks). So I would end up with duplicated
> files.
>
> What is the correct way to do this?

The only way to do this preserving duplication is to use hardlinks
between duplicated files (which reference counts the inode), and use
'rsync -H'.
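
A minimal sketch of the hardlink approach, using throwaway temp
directories (all paths are made up for the demo). Note that -H only
recreates hard links among files that are part of the same transfer; it
has no knowledge of btrfs snapshots, whose files share extents rather
than inodes.

```shell
#!/bin/sh
# Demo only: every path below is a throwaway temp directory.
set -eu
work=$(mktemp -d)
mkdir -p "$work/src/day1" "$work/src/day2" "$work/dst"
echo "payload" > "$work/src/day1/file"
ln "$work/src/day1/file" "$work/src/day2/file"   # two names, one inode

links=skipped
if command -v rsync >/dev/null 2>&1; then
    # -a: archive mode; -H: recreate hard links found within this transfer
    rsync -aH "$work/src/" "$work/dst/"
    links=$(stat -c %h "$work/dst/day1/file")    # GNU stat: hard link count
fi
echo "destination link count: $links"
rm -rf "$work"
```

If rsync is installed, the link count reported for the destination file
shows whether both names still point at one inode after the copy.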

Dan
-- 
Daniel J Blueman


* Re: Copy/move btrfs volume
  2010-07-01 11:26 ` Daniel J Blueman
@ 2010-07-01 11:33   ` Lubos Kolouch
  2010-07-01 22:21     ` Matt Brown
  2010-07-02  1:29     ` Chris Mason
  0 siblings, 2 replies; 8+ messages in thread
From: Lubos Kolouch @ 2010-07-01 11:33 UTC (permalink / raw)
  To: linux-btrfs

Daniel J Blueman, Thu, 01 Jul 2010 12:26:10 +0100:
>> What is the correct way to do this?
> 
> The only way to do this preserving duplication is to use hardlinks
> between duplicated files (which reference counts the inode), and use
> 'rsync -H'.
> 
> Dan

But when the files are on different snapshots, does rsync see them as 
hardlinked?

A scenario - I have a raid5 of, say, 1TB HDDs. It contains many snapshots.
Then, a few years later, a new machine is bought with, say, 5TB
discs.

So I need to transfer the btrfs volume to the new machine. 

But how to do it so that it looks the *same*, ie. the same snapshots?
I could of course write a custom script to create the subvolume, rsync 
the files, create snapshot, rsync files, etc,

but it would be nice if the btrfs toolset supports this by default...
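
The custom script described above could look roughly like the sketch
below. It is not an existing btrfs tool: the mount points, the
"current" working subvolume, and the snapshot names are hypothetical,
and the function only prints the commands unless DRY_RUN=0 is set. It
assumes the old snapshots are replayed oldest first.

```shell
#!/bin/sh
# Sketch only. DRY_RUN=1 (the default) prints commands instead of running them.
# Mount points and snapshot names are hypothetical placeholders.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# replay_snapshots SRC DST SNAP...   (snapshots listed oldest first)
replay_snapshots() {
    src=$1; dst=$2; shift 2
    run btrfs subvolume create "$dst/current"
    for snap in "$@"; do
        # Files rsync skips as unchanged stay shared with snapshots already
        # taken on the new disk. --inplace --no-whole-file is intended to
        # rewrite only the changed blocks of modified files.
        run rsync -a --delete --inplace --no-whole-file \
            "$src/$snap/" "$dst/current/"
        # -r makes the new snapshot read-only
        run btrfs subvolume snapshot -r "$dst/current" "$dst/$snap"
    done
}

replay_snapshots /mnt/old /mnt/new 2010-06-29 2010-06-30 2010-07-01
```

Each pass copies one old snapshot into the working subvolume and then
freezes it, so the new disk ends up with the same snapshot names. How
much data ends up shared depends on how much rsync has to rewrite.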

Lubos



* Re: Copy/move btrfs volume
  2010-07-01 11:33   ` Lubos Kolouch
@ 2010-07-01 22:21     ` Matt Brown
  2010-07-02  6:15       ` Oystein Viggen
  2010-07-02  1:29     ` Chris Mason
  1 sibling, 1 reply; 8+ messages in thread
From: Matt Brown @ 2010-07-01 22:21 UTC (permalink / raw)
  To: linux-btrfs

On 07/01/2010 05:33 AM, Lubos Kolouch wrote:
> Daniel J Blueman, Thu, 01 Jul 2010 12:26:10 +0100:
>>> What is the correct way to do this?
>>
>> The only way to do this preserving duplication is to use hardlinks
>> between duplicated files (which reference counts the inode), and use
>> 'rsync -H'.
>>
>> Dan

Hello,

With backed up files consisting of hard links, I usually use dd to copy
the file systems at the block level

# dd if=/dev/sda of=/dev/sdb bs=20M

and then expand the file system. This is because I found that tools like
rsync, while usually fast, are extremely slow when dealing with millions
of hard linked files.

This could also be used for btrfs to keep its snapshots.
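
A small sketch of the block-level copy, run on plain image files that
stand in for the two disks so it needs no root access (the file names
and sizes are made up). One caveat for btrfs that should be kept in
mind: a dd clone carries the same file system UUID as the original, so
the two copies should not be attached to the same machine while
mounting.

```shell
#!/bin/sh
# Demo on plain files standing in for /dev/sda and /dev/sdb; no root needed.
set -eu
work=$(mktemp -d)
head -c 4194304 /dev/urandom > "$work/old.img"   # 4 MiB "old disk"
truncate -s 8M "$work/new.img"                    # 8 MiB "new disk"

# conv=notrunc keeps new.img at its full size; without it dd would cut the
# output file down to the copied length. Block devices are not affected.
dd if="$work/old.img" of="$work/new.img" bs=1M conv=notrunc 2>/dev/null

# The first 4 MiB of the target must now match the source byte for byte.
if cmp -s -n 4194304 "$work/old.img" "$work/new.img"; then
    copy_ok=yes
else
    copy_ok=no
fi
new_size=$(stat -c %s "$work/new.img")
echo "copy ok: $copy_ok, target size: $new_size bytes"
rm -rf "$work"

# On real devices the file system would then be grown to fill the larger
# disk, e.g. 'btrfs filesystem resize max <mountpoint>' for btrfs or
# 'resize2fs <device>' for ext4.
```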

> A scenario - I have raid5 of say, 1TB HDDs. It contains many snapshots.
> Then, few years later, new machine is bought and there are, say, 5TB
> discs.
> ...
> Lubos

For me, I had to copy over BackupPC hardlinked files from a full disk to
a smaller disk, both using ext4, and I could not use dd. What normally
should have taken an hour, instead took almost a week. (Yes, I wanted to
use btrfs, but it had a hard link limit of 255 - don't know if it still
does.)

It would be nice to have a btrfs command that could rapidly copy over
the file system, snapshots, and all other file system info.

But what benefit would having a native btrfs 'copy/rsync' command have
over the dd/resize option?

Pros
- Files will be immediately checksummed on new disks, but this may not be
as important since a checksum/verify command will be implemented.
- Great 'feature' for copying files to new drives, and keeping
snapshots. Could even be used to export snapshots.
- I believe compressed files will have to be uncompressed and
recompressed, depending on when file is checksummed. (I may be wrong on
this one). This will actually be a con for slow and/or high load machines.
- One command instead of many (dd -> resize -> verify).

Cons
- File system would still have to be unmounted, or at least read-only,
as I doubt the command will have rsync's update or delete abilities.
But, maybe it could.

Questionable
- May be faster than dd/resize, or it may be just as slow as rsync is
with hard links. And I am talking about dozens to thousands of
snapshots, and millions to billions of files.

Matt


* Re: Copy/move btrfs volume
  2010-07-01 11:33   ` Lubos Kolouch
  2010-07-01 22:21     ` Matt Brown
@ 2010-07-02  1:29     ` Chris Mason
  1 sibling, 0 replies; 8+ messages in thread
From: Chris Mason @ 2010-07-02  1:29 UTC (permalink / raw)
  To: Lubos Kolouch; +Cc: linux-btrfs

On Thu, Jul 01, 2010 at 11:33:59AM +0000, Lubos Kolouch wrote:
> Daniel J Blueman, Thu, 01 Jul 2010 12:26:10 +0100:
> >> What is the correct way to do this?
> > 
> > The only way to do this preserving duplication is to use hardlinks
> > between duplicated files (which reference counts the inode), and use
> > 'rsync -H'.
> > 
> > Dan
> 
> But when the files are on different snapshots, does rsync see them as 
> hardlinked?
> 
> A scenario - I have raid5 of say, 1TB HDDs. It contains many snapshots.
> Then, few years later, new machine is bought and there are, say, 5TB 
> discs.
> 
> So I need to transfer the btrfs volume to the new machine. 
> 
> But how to do it so that it looks the *same*, ie. the same snapshots?
> I could of course write a custom script to create the subvolume, rsync 
> the files, create snapshot, rsync files, etc,
> 
> but it would be nice if the btrfs toolset supports this by default...

This is definitely something I'm looking to add.  The btrfs-progs git
tree has some code that allows userland to walk the btrees and detect
the duplicate files.  But this is just a building block needed for the
full backup program.

Instead of hard links, it is possible to use reflinks with cp, which
uses the cloning ioctl.
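
A minimal sketch of the reflink copy with GNU cp, run in a temp
directory (file names are made up). It uses --reflink=auto, which
clones where the file system supports it and falls back to a normal
copy elsewhere; --reflink=always would fail instead of falling back.

```shell
#!/bin/sh
# Demo only: runs in a throwaway temp directory.
set -eu
work=$(mktemp -d)
echo "backup data" > "$work/original"

# On btrfs this shares the data extents; elsewhere it is an ordinary copy.
cp --reflink=auto "$work/original" "$work/clone"

# Unlike a hard link, the clone is a separate inode: writing to it does
# not change the original.
echo "changed" > "$work/clone"
orig_content=$(cat "$work/original")
echo "original still reads: $orig_content"
rm -rf "$work"
```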

-chris


* Re: Copy/move btrfs volume
  2010-07-01 22:21     ` Matt Brown
@ 2010-07-02  6:15       ` Oystein Viggen
  2010-07-03  7:33         ` Lubos Kolouch
  0 siblings, 1 reply; 8+ messages in thread
From: Oystein Viggen @ 2010-07-02  6:15 UTC (permalink / raw)
  To: linux-btrfs

* [Matt Brown]

> With backed up files consisting of hard links, I usually use dd to copy
> the file systems at the block level
>
> # dd if=/dev/sda of=/dev/sdb bs=20M
>
> and then expand the file system. This is because I found that tools like
> rsync, while usually fast, are extremely slow when dealing with millions
> of hard linked files.
>
> This could also be used for btrfs to keep its snapshots.

If you can (temporarily) attach the old and new drives to the same
computer, putting the ext4 BackupPC store on LVM and moving the LV
around might be more convenient, or at least feel more "high level".

For btrfs with lots of snapshots, I believe "btrfs device add" of the
new device followed by "btrfs device remove" of the old one would be the
most convenient.

One advantage of using LVM and btrfs multi device support in this way is
that the actual downtime is minimal -- you can keep the filesystems
online.  Even on cheap hardware, the only downtime should be to
attach/remove disks.
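
The btrfs variant can be sketched as below. The device names and the
mount point are hypothetical, and the function only prints the commands
unless DRY_RUN=0 is set, since they modify a live file system. Older
btrfs-progs releases, I believe, only know the "delete" spelling of
"remove".

```shell
#!/bin/sh
# Sketch only; device names and mount point are hypothetical.
# DRY_RUN=1 (the default) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# migrate_device NEW_DEV OLD_DEV MOUNTPOINT
migrate_device() {
    new=$1; old=$2; mnt=$3
    run btrfs device add "$new" "$mnt"      # file system now spans both devices
    run btrfs device remove "$old" "$mnt"   # moves data off the old device, then drops it
    run btrfs filesystem show "$mnt"        # check that only the new device remains
}

migrate_device /dev/sdb /dev/sda /mnt/backup
```

The file system stays mounted throughout, which is where the minimal
downtime comes from; the remove step is the slow part because it has to
relocate all data.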

Øystein
-- 
If it ain't broke, don't break it.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Copy/move btrfs volume
  2010-07-02  6:15       ` Oystein Viggen
@ 2010-07-03  7:33         ` Lubos Kolouch
  2010-07-21 15:00           ` Hubert Kario
  0 siblings, 1 reply; 8+ messages in thread
From: Lubos Kolouch @ 2010-07-03  7:33 UTC (permalink / raw)
  To: linux-btrfs

Oystein Viggen, Fri, 02 Jul 2010 08:15:03 +0200:

> For btrfs with lots of snapshots, I believe "btrfs device add" of the
> new device followed by "btrfs device remove" of the old one would be the
> most convenient.
>
> Øystein

This solution is very elegant and cool - if you can put the discs into one
computer.

It does not help much with copying the files over the network while
preserving the snapshots... or can you add a network-attached device
(sshfs) like this?

Lubos


* Re: Copy/move btrfs volume
  2010-07-03  7:33         ` Lubos Kolouch
@ 2010-07-21 15:00           ` Hubert Kario
  0 siblings, 0 replies; 8+ messages in thread
From: Hubert Kario @ 2010-07-21 15:00 UTC (permalink / raw)
  To: Lubos Kolouch; +Cc: linux-btrfs

On Saturday 03 July 2010 09:33:19 Lubos Kolouch wrote:
> Oystein Viggen, Fri, 02 Jul 2010 08:15:03 +0200:
> > For btrfs with lots of snapshots, I believe "btrfs device add" of the
> > new device followed by "btrfs device remove" of the old one would be the
> > most convenient.
> >
> > Øystein
>
> This solution is very elegant and cool - if you can put the discs into one
> computer.
>
> It does not help too much to copy the files over network and preserve the
> snapshots... or can you add like this a network-attached device (sshfs) ?

You could also go for the totally cool (albeit a bit crazy) option and use
network block devices and have no downtime...

The overall process will take more time though.

-- 
Hubert Kario
QBS - Quality Business Software
ul. Ksawerów 30/85
02-656 Warszawa
POLAND
tel. +48 (22) 646-61-51, 646-74-24
fax +48 (22) 646-61-50
