* CoW behavior when writing same content
@ 2018-10-09 14:48 Gervais, Francois
  2018-10-09 15:52 ` Chris Murphy
  0 siblings, 1 reply; 5+ messages in thread
From: Gervais, Francois @ 2018-10-09 14:48 UTC
  To: linux-btrfs

Hi,

If I have a snapshot in which I overwrite a big file, but only a
small portion of the file is different, will the whole file be rewritten
in the snapshot? Or only the part that differs?

Something like:

$ dd if=/dev/urandom of=/big_file bs=1M count=1024
$ cp /big_file root/
$ btrfs sub snap root snapshot
$ cp /big_file snapshot/

In this case, do root/big_file and snapshot/big_file still share the same data?

Thank you


* Re: CoW behavior when writing same content
  2018-10-09 14:48 CoW behavior when writing same content Gervais, Francois
@ 2018-10-09 15:52 ` Chris Murphy
  2018-10-09 16:21   ` Roman Mamedov
  2018-10-09 17:25   ` Andrei Borzenkov
  0 siblings, 2 replies; 5+ messages in thread
From: Chris Murphy @ 2018-10-09 15:52 UTC
  To: Gervais, Francois; +Cc: linux-btrfs

On Tue, Oct 9, 2018 at 8:48 AM, Gervais, Francois
<FGervais@distech-controls.com> wrote:
> Hi,
>
> If I have a snapshot in which I overwrite a big file, but only a
> small portion of the file is different, will the whole file be rewritten
> in the snapshot? Or only the part that differs?

Depends on how the application modifies files. Many applications write
out a whole new file with a pseudorandom filename, fsync, then rename.
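
As a rough illustration of the write-then-rename pattern described above,
here is a minimal shell sketch. The helper name replace_via_rename and the
.tmp.XXXXXX template are made up for this example, and real applications
usually do this through library calls rather than a shell. Because the
destination ends up as a brand-new file, none of its extents are shared with
an older snapshot, even when most of the content is unchanged.

```shell
#!/bin/sh
# Hypothetical helper: replace "$2" with the contents of "$1" the way many
# applications save files: write a temporary file, flush it, then rename it.
replace_via_rename() {
    src=$1
    dst=$2
    # Create the temporary file next to the destination so that the final
    # mv is a rename within one filesystem, not a cross-device copy.
    tmp=$(mktemp "$(dirname "$dst")/.tmp.XXXXXX") || return 1
    cat "$src" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Flush the new data; "sync FILE" needs a reasonably recent coreutils,
    # so fall back to a global sync if it fails.
    sync "$tmp" 2>/dev/null || sync
    # The rename swaps in the new file; the old inode is dropped.
    mv -f "$tmp" "$dst"
}
```

For the example in this thread, that would be something like
replace_via_rename /big_file snapshot/big_file.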

>
> Something like:
>
> $ dd if=/dev/urandom of=/big_file bs=1M count=1024
> $ cp /big_file root/
> $ btrfs sub snap root snapshot
> $ cp /big_file snapshot/
>
> In this case, do root/big_file and snapshot/big_file still share the same data?

You'll be left with three files. /big_file and root/big_file will
share extents, and snapshot/big_file will have its own extents. You'd
need to copy with --reflink for snapshot/big_file to have shared
extents with /big_file - or deduplicate.
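
A minimal sketch of the --reflink copy mentioned above, assuming GNU cp and
that both paths are on the same btrfs filesystem. The helper name
reflink_copy is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: clone "$1" to "$2", sharing extents where possible.
reflink_copy() {
    # --reflink=auto asks for a clone and falls back to an ordinary copy
    # when the filesystem (or this pair of paths) cannot share extents.
    # Use --reflink=always to get an error instead of a silent fallback.
    cp --reflink=auto -- "$1" "$2"
}
```

For the example in this thread: reflink_copy /big_file snapshot/big_file.
With --reflink=auto the command succeeds either way, so it does not by itself
prove that any extents ended up shared.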


-- 
Chris Murphy


* Re: CoW behavior when writing same content
  2018-10-09 15:52 ` Chris Murphy
@ 2018-10-09 16:21   ` Roman Mamedov
  2018-10-09 17:25   ` Andrei Borzenkov
  1 sibling, 0 replies; 5+ messages in thread
From: Roman Mamedov @ 2018-10-09 16:21 UTC
  To: Chris Murphy; +Cc: Gervais, Francois, linux-btrfs

On Tue, 9 Oct 2018 09:52:00 -0600
Chris Murphy <lists@colorremedies.com> wrote:

> You'll be left with three files. /big_file and root/big_file will
> share extents, and snapshot/big_file will have its own extents. You'd
> need to copy with --reflink for snapshot/big_file to have shared
> extents with /big_file - or deduplicate.

Or use rsync for copying, in the mode where it reads and checksums blocks of
both files, to copy only the non-matching portions.

rsync --inplace

              This  option  is  useful  for  transferring  large  files   with
              block-based  changes  or appended data, and also on systems that
              are disk bound, not network bound.  It  can  also  help  keep  a
              copy-on-write filesystem snapshot from diverging the entire con‐
              tents of a file that only has minor changes.
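
A minimal sketch of such an invocation, assuming rsync is installed. The
helper name inplace_update is invented for this example. For copies between
two local paths rsync defaults to --whole-file, which skips the block
comparison, so the sketch turns that default off explicitly.

```shell
#!/bin/sh
# Hypothetical helper: update "$2" in place from "$1", using rsync's
# block-matching so that matching blocks need not be rewritten.
inplace_update() {
    # --inplace        write into the existing destination file instead of
    #                  building a temporary copy and renaming it
    # --no-whole-file  local copies default to --whole-file, which disables
    #                  the delta algorithm; switch that default off
    rsync --inplace --no-whole-file "$1" "$2"
}
```

For the example in this thread: inplace_update /big_file snapshot/big_file.
Note that rsync skips files whose size and modification time already match,
unless told otherwise.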

-- 
With respect,
Roman


* Re: CoW behavior when writing same content
  2018-10-09 15:52 ` Chris Murphy
  2018-10-09 16:21   ` Roman Mamedov
@ 2018-10-09 17:25   ` Andrei Borzenkov
  2018-10-09 22:31     ` Chris Murphy
  1 sibling, 1 reply; 5+ messages in thread
From: Andrei Borzenkov @ 2018-10-09 17:25 UTC
  To: Chris Murphy, Gervais, Francois; +Cc: linux-btrfs

09.10.2018 18:52, Chris Murphy wrote:
> On Tue, Oct 9, 2018 at 8:48 AM, Gervais, Francois
> <FGervais@distech-controls.com> wrote:
>> Hi,
>>
>> If I have a snapshot in which I overwrite a big file, but only a
>> small portion of the file is different, will the whole file be rewritten
>> in the snapshot? Or only the part that differs?
> 

If you overwrite the whole file, the whole file will be overwritten.
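
By contrast, writing only the changed range into the existing file leaves the
rest of the file untouched. A minimal sketch using dd follows; the helper
name patch_block and its argument order are invented for this example. On a
copy-on-write filesystem one would expect only the rewritten range to receive
new extents, though that is not measured here.

```shell
#!/bin/sh
# Hypothetical helper: copy block number "$3" (counted from 0, block size
# "$4" bytes) from "$1" into the same position in "$2".
patch_block() {
    # skip/seek select the same block offset in the input and the output;
    # conv=notrunc keeps dd from truncating the output file, so everything
    # outside the selected block stays as it was.
    dd if="$1" of="$2" bs="$4" skip="$3" seek="$3" count=1 conv=notrunc 2>/dev/null
}
```

For instance, patch_block /big_file snapshot/big_file 10 1048576 would
rewrite only the eleventh 1 MiB block of snapshot/big_file.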

> Depends on how the application modifies files. Many applications write
> out a whole new file with a pseudorandom filename, fsync, then rename.
> 
>>
>> Something like:
>>
>> $ dd if=/dev/urandom of=/big_file bs=1M count=1024
>> $ cp /big_file root/
>> $ btrfs sub snap root snapshot
>> $ cp /big_file snapshot/
>>

And which portion of these three files is different? They must be
identical. Not that it really matters, but that does not match your
question.

>> In this case, do root/big_file and snapshot/big_file still share the same data?
> 
> You'll be left with three files. /big_file and root/big_file will
> share extents,

How come they share extents? That requires --reflink; is it the default now?

> and snapshot/big_file will have its own extents. You'd
> need to copy with --reflink for snapshot/big_file to have shared
> extents with /big_file - or deduplicate.
> 
This still overwrites the whole file, in the sense that the original
content of "snapshot/big_file" is lost. That the new content happens to
be identical, and will probably be reflinked, does not change the fact
that the original file is gone.


* Re: CoW behavior when writing same content
  2018-10-09 17:25   ` Andrei Borzenkov
@ 2018-10-09 22:31     ` Chris Murphy
  0 siblings, 0 replies; 5+ messages in thread
From: Chris Murphy @ 2018-10-09 22:31 UTC
  To: Andrei Borzenkov; +Cc: Chris Murphy, Gervais, Francois, linux-btrfs

On Tue, Oct 9, 2018 at 11:25 AM, Andrei Borzenkov <arvidjaar@gmail.com> wrote:
> 09.10.2018 18:52, Chris Murphy wrote:

>>> In this case, do root/big_file and snapshot/big_file still share the same data?
>>
>> You'll be left with three files. /big_file and root/big_file will
>> share extents,
>
> How come they share extents? That requires --reflink; is it the default now?

Good catch. It's not the default. I meant to write that initially only
root/big_file and snapshot/big_file have shared extents, and that those
shared extents are lost when snapshot/big_file is "overwritten" by the
copy into snapshot/.


>> and snapshot/big_file will have its own extents. You'd
>> need to copy with --reflink for snapshot/big_file to have shared
>> extents with /big_file - or deduplicate.
>>
> This still overwrites the whole file, in the sense that the original
> content of "snapshot/big_file" is lost. That the new content happens to
> be identical, and will probably be reflinked, does not change the fact
> that the original file is gone.

Agreed.
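
One way to check the sharing discussed above is to compare the files before
and after the copy with "btrfs filesystem du". The sketch below assumes
btrfs-progs is installed and that the paths are on a btrfs filesystem; it may
need root privileges. The helper name report_sharing is invented for this
example.

```shell
#!/bin/sh
# Hypothetical helper: print the btrfs space usage breakdown for the given
# files. Returns 2 if the btrfs tool is not available.
report_sharing() {
    if ! command -v btrfs >/dev/null 2>&1; then
        echo "btrfs command not found" >&2
        return 2
    fi
    # The Exclusive column is data referenced by that file alone. A file
    # whose extents are shared with another file or a snapshot should show
    # a small Exclusive value relative to its Total.
    btrfs filesystem du "$@"
}
```

Running report_sharing root/big_file snapshot/big_file right after the
snapshot, and again after the cp into snapshot/, should show the Exclusive
figures growing once the shared extents are gone.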

-- 
Chris Murphy

