* Moving contents from one subvol to another
@ 2014-11-29 14:21 Shriramana Sharma
  2014-11-29 14:28 ` Hugo Mills
  0 siblings, 1 reply; 6+ messages in thread
From: Shriramana Sharma @ 2014-11-29 14:21 UTC (permalink / raw)
  To: linux-btrfs

Hello. I am now taking the first steps towards setting up my external
backup HDD with Btrfs. From
http://askubuntu.com/questions/119014/btrfs-subvolumes-vs-folders I
understand that the only difference between subvolumes and ordinary
folders is that the former can be snapshotted and independently
mounted.

But I have a question. I have two subvols test1, test2.

$ cd test1
$ dd if=/dev/urandom of=file bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 36.2291 s, 14.5 MB/s
$ time mv file ../test2/
real    0m2.061s
user    0m0.013s
sys     0m0.459s
$ time { cp --reflink ../test2/file . && rm ../test2/file ; }
real    0m0.677s
user    0m0.022s
sys     0m0.086s
$ mkdir foo
$ time mv file foo/
real    0m0.096s
user    0m0.008s
sys     0m0.013s

It seems that mv is not CoW aware and hence is not able to create
reflinks so it is actually processing the entire file because it
thinks test2 is a different device/filesystem/partition or such. Is
this understanding correct?

So doing cp --reflink with rm is much faster. But it is still slower
than doing mv within the same subvol. Is it because of the
housekeeping with updating the metadata of the two subvols?

Methinks --reflink option should be added to mv for the above usecase.
Do people think this is useful? Why or why not?

My concern is that if somebody wants to consolidate two subvols into
one, only the metadata really needs to be processed, but ordinary mv
isn't aware of this, and using cp --reflink with rm is unnecessarily
complicated, especially if it involves multiple files.

And it's not clear to me what it would entail to cp --reflink + rm an
entire directory tree because IIUC I'd have to handle each file
separately. Perhaps something (unnecessarily convoluted) like:

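# skip target itself; read names literally (spaces, backslashes)
find . -path ./target -prune -o -print | while IFS= read -r f
do
# recreate each directory under target and copy its timestamp
[ -d "$f" ] && mkdir -p target/"$f" && touch -r "$f" target/"$f"
# reflink-copy each regular file to the same relative path, then remove it
[ -f "$f" ] && cp -a --reflink "$f" target/"$f" && rm "$f"
done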

Again, what would happen to files which are not regular directories or files?

And why isn't --reflink given a single letter alias for cp?

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा


* Re: Moving contents from one subvol to another
  2014-11-29 14:21 Moving contents from one subvol to another Shriramana Sharma
@ 2014-11-29 14:28 ` Hugo Mills
  2014-11-29 15:15   ` Shriramana Sharma
  0 siblings, 1 reply; 6+ messages in thread
From: Hugo Mills @ 2014-11-29 14:28 UTC (permalink / raw)
  To: Shriramana Sharma; +Cc: linux-btrfs

On Sat, Nov 29, 2014 at 07:51:07PM +0530, Shriramana Sharma wrote:
> Hello. I am now taking the first steps to making my backup external
> HDD in BtrFS. From
> http://askubuntu.com/questions/119014/btrfs-subvolumes-vs-folders I
> understand that the only difference between subvolumes and ordinary
> folders is that the former can be snapshotted and independently
> mounted.
> 
> But I have a question. I have two subvols test1, test2.
> 
> $ cd test1
> $ dd if=/dev/urandom of=file bs=1M count=500
> 500+0 records in
> 500+0 records out
> 524288000 bytes (524 MB) copied, 36.2291 s, 14.5 MB/s
> $ time mv file ../test2/
> real    0m2.061s
> user    0m0.013s
> sys     0m0.459s
> $ time { cp --reflink ../test2/file . && rm ../test2/file ; }
> real    0m0.677s
> user    0m0.022s
> sys     0m0.086s
> $ mkdir foo
> $ time mv file foo/
> real    0m0.096s
> user    0m0.008s
> sys     0m0.013s
> 
> It seems that mv is not CoW aware and hence is not able to create
> reflinks so it is actually processing the entire file because it
> thinks test2 is a different device/filesystem/partition or such. Is
> this understanding correct?

   The latest version of mv should be able to use CoW copies to make
it more efficient. It has a --reflink option, the same as cp. Note
that you can't make reflinks crossing a mount boundary, but you can do
so crossing a subvolume boundary (as you're doing here).
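
As a rough illustration of why plain mv ends up copying here: on Btrfs each subvolume reports its own device number, so rename(2) across subvolumes fails with EXDEV and mv falls back to copy-and-delete. A minimal sketch, assuming GNU coreutils; the helper names `same_device` and `reflink_move` are made up for this example and are not part of any tool:

```shell
# same_device A B: succeed if both paths report the same device number.
# Assumes GNU stat, where "-c %d" prints st_dev.
same_device() {
    [ "$(stat -c %d -- "$1")" = "$(stat -c %d -- "$2")" ]
}

# reflink_move SRC DST: approximate a cross-subvolume mv by hand with a
# reflink copy followed by a delete. --reflink=auto falls back to a normal
# copy where reflinks are unsupported; the source is removed only if the
# copy succeeded.
reflink_move() {
    cp -a --reflink=auto -- "$1" "$2" && rm -rf -- "$1"
}
```

For the test1/test2 case from the original message, `same_device test1 test2` would be expected to fail, while `reflink_move test1/file test2/` should still avoid rewriting the data.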

> So doing cp --reflink with rm is much faster. But it is still slower
> than doing mv within the same subvol. Is it because of the
> housekeeping with updating the metadata of the two subvols?

   I should think so, yes.

> Methinks --reflink option should be added to mv for the above usecase.
> Do people think this is useful? Why or why not?

   See above: it already has been. :)

> My concern is that if somebody wants to consolidate two subvols into
> one, though really only the metadata needs to be processed using
> ordinary mv isn't aware of this and using cp --reflink with rm is
> unnecessarily complicated, especially if it will involve multiple
> files.
> 
> And it's not clear to me what it would entail to cp --reflink + rm an
> entire directory tree because IIUC I'd have to handle each file
> separately. Perhaps something (unnecessarily convoluted) like:
> 
> find . | while read f
> do
> [ -d "$f" ] && mkdir target/"$f" && touch target/"$f" -r "$f"
> [ -f "$f" ] && cp -a --reflink "$f" target/ && rm "$f"
> done
> 
> Again, what would happen to files which are not regular directories or files?

   Probably just the same thing that would happen without the
--reflink=always.
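
For the directory-tree case quoted above, a per-file loop may not be needed: cp -a is recursive and recreates symlinks and special files as such, so a single invocation can cover the whole tree. A minimal sketch, assuming GNU cp and find; `reflink_move_tree` is a hypothetical name, and --reflink=auto is used so that the copy still works where reflinks are unavailable:

```shell
# reflink_move_tree SRC DST: copy the contents of SRC into DST, sharing
# extents where the filesystem allows it, then delete the originals only
# if the whole copy succeeded.
reflink_move_tree() {
    src=$1 dst=$2
    mkdir -p -- "$dst" || return 1
    # "SRC/." copies the contents of SRC (dotfiles included), not SRC itself.
    cp -a --reflink=auto -- "$src/." "$dst/" || return 1
    # Remove everything below SRC, keeping the now-empty SRC directory.
    find "$src" -mindepth 1 -delete
}
```

Because nothing is deleted until the copy has finished, this has the same all-or-nothing space behaviour as a cross-boundary mv.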

> And why isn't --reflink given a single letter alias for cp?

   I don't know about that; you'll have to ask the coreutils
developers. They're probably expecting it to be largely set to a
single value by default (e.g. through a shell alias).
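
One way to get such a default without a single-letter option is a short wrapper. A minimal sketch, assuming GNU cp; the name `cpr` is made up for this example. An interactive alias along the lines of alias cp='cp --reflink=auto' in a shell startup file would serve the same purpose:

```shell
# cpr: cp with reflinks enabled by default. "cpr" is a hypothetical name;
# any short name would do. --reflink=auto shares extents when the
# filesystem supports it and otherwise copies normally.
cpr() {
    cp --reflink=auto "$@"
}
```

All other cp options still pass through, so `cpr -a test1/file test2/` behaves like cp -a with the reflink default added.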

   Hugo.

-- 
Hugo Mills             | "I will not be pushed, filed, stamped, indexed,
hugo@... carfax.org.uk | briefed, debriefed or numbered.
http://carfax.org.uk/  | My life is my own."
PGP: 65E74AC0          |                                Number 6, The Prisoner


* Re: Moving contents from one subvol to another
  2014-11-29 14:28 ` Hugo Mills
@ 2014-11-29 15:15   ` Shriramana Sharma
  2014-11-29 17:07     ` Robert White
  0 siblings, 1 reply; 6+ messages in thread
From: Shriramana Sharma @ 2014-11-29 15:15 UTC (permalink / raw)
  To: Hugo Mills, Shriramana Sharma, linux-btrfs

On Sat, Nov 29, 2014 at 7:58 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
>    The latest version of mv should be able to use CoW copies to make
> it more efficient. It has a --reflink option, the same as cp. Note
> that you can't make reflinks crossing a mount boundary, but you can do
> so crossing a subvolume boundary (as you're doing here).

Hi thanks for this. I suppose you are referring to the commit:

http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=b47231be6813e6cb5305266e391b4bb745f27f13

From http://git.savannah.gnu.org/cgit/coreutils.git/log/?qt=grep&q=mv%3A,
http://git.savannah.gnu.org/cgit/coreutils.git/plain/NEWS?id=b47231be6813e6cb5305266e391b4bb745f27f13
and finally http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/mv.c?id=b47231be6813e6cb5305266e391b4bb745f27f13
it doesn't seem as if there was any earlier commit actually adding a
--reflink option so it seems the improvement is in-built.

That's nice to know!

Any idea when the next coreutils point release with this will be out?

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा


* Re: Moving contents from one subvol to another
  2014-11-29 15:15   ` Shriramana Sharma
@ 2014-11-29 17:07     ` Robert White
  2014-11-30  3:53       ` Shriramana Sharma
  0 siblings, 1 reply; 6+ messages in thread
From: Robert White @ 2014-11-29 17:07 UTC (permalink / raw)
  To: Shriramana Sharma, Hugo Mills, linux-btrfs

On 11/29/2014 07:15 AM, Shriramana Sharma wrote:
> On Sat, Nov 29, 2014 at 7:58 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
>>     The latest version of mv should be able to use CoW copies to make
>> it more efficient. It has a --reflink option, the same as cp. Note
>> that you can't make reflinks crossing a mount boundary, but you can do
>> so crossing a subvolume boundary (as you're doing here).

One thing to keep in mind is that mv, when crossing any of these 
boundaries degenerates to a copy-and-remove operation and _none_ of the 
source files will be removed until _all_ of the files have been copied. 
If any of the copy operations fail the removes will not take place at 
all. It would only take a couple large NOCOW files to put you over a 
limit somewhere.

So if you get to an out-of-space condition mid-move you are going to 
have to disentangle your stuff by hand anyway.

If you are consolidating sub-volumes (as per the original question) on a 
"nearly full" drive you may want to do it all long-hand with a script 
moving various chunks or something instead of just trying a move/copy of
"cp --reflink /vol1/* /vol2/" (same for mv when you get that --reflink
revision).
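
The long-hand approach could look roughly like the following: move one top-level entry at a time and delete it as soon as its copy succeeds, so that a failure part-way leaves only the remaining entries to sort out. This is a sketch under assumptions, not a tested migration tool; it assumes GNU cp, and `move_in_chunks` is a hypothetical name:

```shell
# move_in_chunks SRC DST: move the top-level entries of SRC into DST one
# at a time, deleting each source entry right after its copy succeeds and
# stopping at the first failure so the remaining entries stay untouched.
move_in_chunks() {
    src=$1 dst=$2
    mkdir -p -- "$dst" || return 1
    # The three patterns cover ordinary names and dotfiles, but not . and ..
    for entry in "$src"/* "$src"/.[!.]* "$src"/..?*; do
        # Unmatched globs are left as literal patterns; skip them.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        cp -a --reflink=auto -- "$entry" "$dst/" || return 1
        rm -rf -- "$entry" || return 1
    done
}
```

With --reflink=auto, any file that cannot be reflinked is copied in full, so each chunk still needs enough free space for its largest such file.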

ASIDE: Also be aware that such a moment would be the perfect time to 
consider compression and so on. A regular copy (non reflinks) will apply 
the currently selected compress= regime (and reconsider sparsity etc) in 
a way that the move will not.  e.g. once you decide to do intrusive 
maintenance you might be well served by taking the extra time to 
restructure your storage. 8-)




* Re: Moving contents from one subvol to another
  2014-11-29 17:07     ` Robert White
@ 2014-11-30  3:53       ` Shriramana Sharma
  2014-11-30 13:04         ` Shriramana Sharma
  0 siblings, 1 reply; 6+ messages in thread
From: Shriramana Sharma @ 2014-11-30  3:53 UTC (permalink / raw)
  To: Robert White; +Cc: Hugo Mills, linux-btrfs

On Sat, Nov 29, 2014 at 10:37 PM, Robert White <rwhite@pobox.com> wrote:
>
> One thing to keep in mind is that mv, when crossing any of these boundaries
> degenerates to a copy-and-remove operation and _none_ of the source files
> will be removed until _all_ of the files have been copied. If any of the
> copy operations fail the removes will not take place at all. It would only
> take a couple large NOCOW files to put you over a limit somewhere.

Hmm... So you're saying that the copy routine that mv calls will see
the nocow attribute (it doesn't know it's being called as part of a
move operation) and so do a full copy rather than a reflink?

Correct me if I'm wrong but it seems that mv should actually ignore
the nocow attribute as far as moving it to a new location is
concerned, no, because I'm moving, not copying? Of course it should
retain the attribute of the original files *after* the move is done.

Why should noCoW affect cp --reflink anyhow? I just created a 500 MiB
file from /dev/urandom under a chattr +C-ed dir, and copied to another
subvol using cp --reflink, and fi df still shows 500 MiB, not 1 GiB.

> If you are consolidating sub-volumes (as per the original question) on a
> "nearly full" drive you may want to do it all long-hand with a script moving
> various chunks or something instead of just trying a move/copy of "cp
> --reflinks /vol1/* /vol2/" (same for mv when you get that --reflinks
> revision).

As I said, there doesn't actually seem to be a --reflink command line option.

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा


* Re: Moving contents from one subvol to another
  2014-11-30  3:53       ` Shriramana Sharma
@ 2014-11-30 13:04         ` Shriramana Sharma
  0 siblings, 0 replies; 6+ messages in thread
From: Shriramana Sharma @ 2014-11-30 13:04 UTC (permalink / raw)
  To: Robert White; +Cc: Hugo Mills, linux-btrfs

On Sun, Nov 30, 2014 at 9:23 AM, Shriramana Sharma <samjnaa@gmail.com> wrote:
>
>
> Why should noCoW affect cp --reflink anyhow? I just created a 500 MiB
> file from /dev/urandom under a chattr +C-ed dir, and copied to another
> subvol using cp --reflink, and fi df still shows 500 MiB, not 1 GiB.

Looks like I spoke too soon (I've read that some changes aren't visible
until the next FS commit): right now it actually says 1 GiB used. I
can't grok this: why should a nocow file be physically copied (to new
blocks) just because it's nocow? Is it because the two copies could be
overwritten separately at the same time?

But still, it seems to me that mv should make it so that the nocow
attr is temporarily (atomically?) suspended/ignored just for the
duration of the relocation, since there aren't going to be any two
copies to be overwritten at the same time.
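
To see in advance which files carry the attribute, the No_COW flag set by chattr +C shows up as a "C" in lsattr output. A small sketch, assuming lsattr from e2fsprogs is installed; `is_nocow` is a hypothetical helper name:

```shell
# is_nocow PATH: succeed if lsattr reports the No_COW ("C") attribute.
# Returns failure if the flag is absent or if lsattr cannot read the
# attributes at all (missing tool, unsupported filesystem).
is_nocow() {
    # The first field of lsattr output is the attribute flag string.
    flags=$(lsattr -d -- "$1" 2>/dev/null | awk '{print $1}')
    case $flags in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

Something like `is_nocow file && echo "expect a full copy"` could then flag the files worth checking before a large move.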

Comments?

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
