linux-btrfs.vger.kernel.org archive mirror
* cloning a btrfs drive with send and receive: clone is bigger than the original?
@ 2021-01-09 16:01  
  2021-01-09 19:45 ` Andrei Borzenkov
  0 siblings, 1 reply; 8+ messages in thread
From:   @ 2021-01-09 16:01 UTC (permalink / raw)
  To: linux-btrfs

I've got a drive with data, and 3 snapshots of that data. I've transferred all the snapshots to another drive using btrfs send and receive. The send drive has 3.62 GiB of data, the receive drive has 4.99 GiB of data. It seems like the snapshots on the receive drive don't share the data that was unchanged between them.

How can I transfer the snapshots in such a way that the snapshots only occupy the difference between the snapshots?

The data on the original drive is organized like this:
/mnt/send/storage/ <= here's all the data
/mnt/send/storage_snapshots/ <= here are the 3 snapshots

The data on the receiving drive is organized like this:
/mnt/rec/storage/ <= this folder is empty
/mnt/rec/storage_snapshots/ <= here are the 3 snapshots
/mnt/rec/btrfs_receive/ <= here are the 3 files generated by btrfs send 

How can I transfer the snapshots in such a way that /mnt/rec/storage/ holds the latest version of the data, just like on the original drive?

In detail:
# mkfs.btrfs -L SEND /dev/sda3
# mount /dev/sda3 /mnt/send/ -o,compress,noatime
# mkfs.btrfs /dev/sdd2 -L DATA
# mount /dev/sdd2 ./mnt/rec/ -o,compress,noatime
# btrfs subvolume create /mnt/rec/btrfs_receive/
Create subvolume '/mnt/rec/btrfs_receive'
# btrfs subvolume create /mnt/rec/storage_snapshots

# btrfs subvolume create /mnt/send/storage
# btrfs subvolume create /mnt/send/storage_snapshots
# cd /mnt/send/storage
# /home/cedric/mkfiles_and_md5.sh <<generates/ change data on the send drive >>
# btrfs subvolume snapshot -r /mnt/send/storage /mnt/send/storage_snapshots/storage-$(date +%Y_%m_%d-%H%m)
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/storage_snapshots/storage-2021_01_09-1301'
# /home/cedric/mkfiles_and_md5.sh <<generates/ change data on the send drive >>
# btrfs subvolume snapshot -r /mnt/send/storage /mnt/send/storage_snapshots/storage-$(date +%Y_%m_%d-%H%m%S)
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/storage_snapshots/storage-2021_01_09-130120'
# /home/cedric/mkfiles_and_md5.sh <<generates/ change data on the send drive >>
# btrfs subvolume snapshot -r /mnt/send/storage /mnt/send/storage_snapshots/storage-$(date +%Y_%m_%d-%H%m%S)
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/storage_snapshots/storage-2021_01_09-130146'

# btrfs send /mnt/send/storage_snapshots/storage-2021_01_09-1301 -f /mnt/rec/btrfs_receive/storage-2021_01_09-1301.btrfssend
At subvol /mnt/send/storage_snapshots/storage-2021_01_09-1301
[root@bcache-test rec]# btrfs send -p /mnt/send/storage_snapshots/storage-2021_01_09-1301 /mnt/send/storage_snapshots/storage-2021_01_09-130120 -f /mnt/rec/btrfs_receive/storage-2021_01_09-130120.btrfssend
At subvol /mnt/send/storage_snapshots/storage-2021_01_09-130120
[root@bcache-test rec]# btrfs send -p /mnt/send/storage_snapshots/storage-2021_01_09-130120 /mnt/send/storage_snapshots/storage-2021_01_09-130146 -f /mnt/rec/btrfs_receive/storage-2021_01_09-130146.btrfssend
At subvol /mnt/send/storage_snapshots/storage-2021_01_09-130146

# btrfs receive -f /mnt/rec/btrfs_receive/storage-2021_01_09-1301.btrfssend  /mnt/rec/storage_snapshots
At subvol storage-2021_01_09-1301
# btrfs receive -f /mnt/rec/btrfs_receive/storage-2021_01_09-130120.btrfssend  /mnt/rec/storage_snapshots
At snapshot storage-2021_01_09-130120
# btrfs receive -f /mnt/rec/btrfs_receive/storage-2021_01_09-130146.btrfssend /mnt/rec/storage_snapshots
At snapshot storage-2021_01_09-130146

# rm /mnt/rec/btrfs_receive/storage-2021_01_09-1301*
# btrfs filesystem show
Label: 'SEND'  uuid: 61b7e45f-62a7-4b04-bc0c-ba1304548b02
	Total devices 1 FS bytes used 3.62GiB
	devid    1 size 5.00GiB used 4.52GiB path /dev/sda3

Label: 'DATA'  uuid: 95e85fa4-217c-429a-be55-833bb63e2c71
	Total devices 1 FS bytes used 4.99GiB
	devid    1 size 931.01GiB used 10.02GiB path /dev/sdd2



* Re: cloning a btrfs drive with send and receive: clone is bigger than the original?
  2021-01-09 16:01 cloning a btrfs drive with send and receive: clone is bigger than the original?  
@ 2021-01-09 19:45 ` Andrei Borzenkov
  2021-01-09 21:08   `  
  0 siblings, 1 reply; 8+ messages in thread
From: Andrei Borzenkov @ 2021-01-09 19:45 UTC (permalink / raw)
  To: Cedric.dewijs, linux-btrfs

09.01.2021 19:01, Cedric.dewijs@eclipso.eu wrote:
> I've got a drive with data, and 3 snapshots of that data. I've transferred all the snapshots to another drive using btrfs send and receive. The send drive has 3.62 GiB of data, the receive drive has 4.99 GiB of data. It seems like the snapshots on the receive drive don't share the data that was unchanged between them.
> 
> How can I transfer the snapshots in such a way that the snapshots only occupy the difference between the snapshots?
> 
> The data on the original drive is organized like this:
> /mnt/send/storage/ <= here's all the data
> /mnt/send/storage_snapshots/ <= here are the 3 snapshots
> 
> The data on the receiving drive is organized like this:
> /mnt/rec/storage/ <= this folder is empty
> /mnt/rec/storage_snapshots/ <= here are the 3 snapshots
> /mnt/rec/btrfs_receive/ <= here are the 3 files generated by btrfs send 
> 
> How can I transfer the snapshots in such a way that /mnt/rec/storage/ holds the latest version of the data, just like on the original drive?
> 
> In detail:
> # mkfs.btrfs -L SEND /dev/sda3
> # mount /dev/sda3 /mnt/send/ -o,compress,noatime
> # mkfs.btrfs /dev/sdd2 -L DATA
> # mount /dev/sdd2 ./mnt/rec/ -o,compress,noatime

I can think of at least two reasons

1. Inline data is not shared, and compression increases the probability
of inlining.

2. I believe only extents that are aligned to, and an exact multiple of,
the filesystem block size are reflinked during send.
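
One way to check how much data the received snapshots actually share is "btrfs filesystem du", which reports total, exclusive and shared bytes per path. A minimal sketch, assuming the snapshot paths from the original post. The run helper and the report_sharing function are illustrative names, not part of btrfs-progs, and by default the script only prints the command (set DRY_RUN=0 to execute it as root):

```bash
#!/bin/bash
# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Summarize total / exclusive / shared usage for every snapshot path given.
report_sharing() {
    run btrfs filesystem du -s "$@"
}

report_sharing \
    /mnt/rec/storage_snapshots/storage-2021_01_09-1301 \
    /mnt/rec/storage_snapshots/storage-2021_01_09-130120 \
    /mnt/rec/storage_snapshots/storage-2021_01_09-130146
```

If the snapshots share their unchanged data, the exclusive figure for each should be small compared with its total.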



* Re: Re: cloning a btrfs drive with send and receive: clone is bigger than  the original?
  2021-01-09 19:45 ` Andrei Borzenkov
@ 2021-01-09 21:08   `  
  2021-01-10  7:41     `  
  0 siblings, 1 reply; 8+ messages in thread
From:   @ 2021-01-09 21:08 UTC (permalink / raw)
  To: Andrei Borzenkov; +Cc: linux-btrfs

> > How can I transfer the snapshots in such a way that the snapshots only occupy the difference between the snapshots?
> >
> > The data on the original drive is organized like this:
> > /mnt/send/storage/ <= here's all the data
> > /mnt/send/storage_snapshots/ <= here are the 3 snapshots
> >
> > The data on the receiving drive is organized like this:
> > /mnt/rec/storage/ <= this folder is empty
> > /mnt/rec/storage_snapshots/ <= here are the 3 snapshots
> > /mnt/rec/btrfs_receive/ <= here are the 3 files generated by btrfs send
> >
> > How can I transfer the snapshots in such a way that /mnt/rec/storage/ holds the latest version of the data, just like on the original drive?
> >
> > In detail:
> > # mkfs.btrfs -L SEND /dev/sda3
> > # mount /dev/sda3 /mnt/send/ -o,compress,noatime
> > # mkfs.btrfs /dev/sdd2 -L DATA
> > # mount /dev/sdd2 ./mnt/rec/ -o,compress,noatime
>
> I can think of at least two reasons
>
> 1. Inline data is not shared, and compression increases the probability
> of inlining.
>
> 2. I believe only extents that are aligned to, and an exact multiple of,
> the filesystem block size are reflinked during send.


Thanks. I've made the following script to test this in a more controlled way. It turns out that btrfs send and receive work correctly, provided all commands are entered correctly. It's also important to call sync explicitly after deleting subvolumes, before re-creating them or calling btrfs filesystem show.
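
As a side note on the explicit sync: btrfs-progs also offers "btrfs subvolume delete --commit-after", which waits for the transaction commit, and "btrfs subvolume sync", which waits until deleted subvolumes have been cleaned up. The sketch below is not part of the test script. The run helper and reset_subvolumes are illustrative names, nested subvolumes have to be listed before their parent, and by default the commands are only printed (set DRY_RUN=0 to execute them as root):

```bash
#!/bin/bash
# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Delete the given subvolumes (paths relative to the mount point in $1),
# then wait until the filesystem has finished removing them.
reset_subvolumes() {
    mnt=$1
    shift
    for sub in "$@"; do
        run btrfs subvolume delete --commit-after "$mnt/$sub"
    done
    run btrfs subvolume sync "$mnt"
}

reset_subvolumes /mnt/rec diff snapshots/0 snapshots/1 snapshots
```

Whether this fully replaces the plain sync in the script is an assumption that would need to be checked on the system in question.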

# cat ~/btrfs-send-test.sh 
#!/bin/bash

btrfs subvolume delete /mnt/send/storage
btrfs subvolume delete /mnt/send/snapshots/*
btrfs subvolume delete /mnt/send/snapshots/
btrfs subvolume delete /mnt/rec/diff
btrfs subvolume delete /mnt/rec/snapshots/*
btrfs subvolume delete /mnt/rec/snapshots/
sync
btrfs subvolume create /mnt/send/storage
btrfs subvolume create /mnt/send/snapshots/
btrfs subvolume create /mnt/rec/diff
btrfs subvolume create /mnt/rec/snapshots

btrfs subvolume snapshot -r /mnt/send/storage/ /mnt/send/snapshots/0
btrfs send /mnt/send/snapshots/0 | btrfs receive /mnt/rec/snapshots

onelesscounter=0	# index of the previous snapshot, used as the send parent
counter=1
while [ $counter -le 10 ]
do
	# Add one new 100 MiB file of random data and record its checksum.
	dd if=/dev/urandom of=/mnt/send/storage/file$( printf %03d "$counter" ).bin bs=1M count=100
	md5sum /mnt/send/storage/file$( printf %03d "$counter" ).bin >> /mnt/send/storage/md5sums.txt
	# Snapshot the new state, then send only the difference to the previous snapshot.
	btrfs subvolume snapshot -r /mnt/send/storage /mnt/send/snapshots/$counter
	btrfs send -p /mnt/send/snapshots/$onelesscounter /mnt/send/snapshots/$counter -f /mnt/rec/diff/$counter
	btrfs receive -f /mnt/rec/diff/$counter /mnt/rec/snapshots
	((counter++))
	((onelesscounter++))
done
echo All done

# ls -lh /mnt/rec/diff/
total 1001M
-rw------- 1 root root 101M Jan  9 22:54 1
-rw------- 1 root root 101M Jan  9 22:55 10
-rw------- 1 root root 101M Jan  9 22:54 2
-rw------- 1 root root 101M Jan  9 22:54 3
-rw------- 1 root root 101M Jan  9 22:54 4
-rw------- 1 root root 101M Jan  9 22:54 5
-rw------- 1 root root 101M Jan  9 22:54 6
-rw------- 1 root root 101M Jan  9 22:54 7
-rw------- 1 root root 101M Jan  9 22:54 8
-rw------- 1 root root 101M Jan  9 22:55 9

# btrfs filesystem show
Label: 'SEND'  uuid: 61b7e45f-62a7-4b04-bc0c-ba1304548b02
	Total devices 1 FS bytes used 1001.69MiB
	devid    1 size 5.00GiB used 1.52GiB path /dev/sda3

Label: 'DATA'  uuid: 95e85fa4-217c-429a-be55-833bb63e2c71
	Total devices 1 FS bytes used 1.96GiB <= 1GB for the snapshots, and one GB for the diff files
	devid    1 size 931.01GiB used 10.02GiB path /dev/sdd2

Output of the script:
# ~/btrfs-send-test.sh 
Delete subvolume (no-commit): '/mnt/send/storage'
Delete subvolume (no-commit): '/mnt/send/snapshots/0'
Delete subvolume (no-commit): '/mnt/send/snapshots/1'
Delete subvolume (no-commit): '/mnt/send/snapshots/10'
Delete subvolume (no-commit): '/mnt/send/snapshots/2'
Delete subvolume (no-commit): '/mnt/send/snapshots/3'
Delete subvolume (no-commit): '/mnt/send/snapshots/4'
Delete subvolume (no-commit): '/mnt/send/snapshots/5'
Delete subvolume (no-commit): '/mnt/send/snapshots/6'
Delete subvolume (no-commit): '/mnt/send/snapshots/7'
Delete subvolume (no-commit): '/mnt/send/snapshots/8'
Delete subvolume (no-commit): '/mnt/send/snapshots/9'
Delete subvolume (no-commit): '/mnt/send/snapshots'
Delete subvolume (no-commit): '/mnt/rec/diff'
Delete subvolume (no-commit): '/mnt/rec/snapshots/0'
Delete subvolume (no-commit): '/mnt/rec/snapshots/1'
Delete subvolume (no-commit): '/mnt/rec/snapshots/10'
Delete subvolume (no-commit): '/mnt/rec/snapshots/2'
Delete subvolume (no-commit): '/mnt/rec/snapshots/3'
Delete subvolume (no-commit): '/mnt/rec/snapshots/4'
Delete subvolume (no-commit): '/mnt/rec/snapshots/5'
Delete subvolume (no-commit): '/mnt/rec/snapshots/6'
Delete subvolume (no-commit): '/mnt/rec/snapshots/7'
Delete subvolume (no-commit): '/mnt/rec/snapshots/8'
Delete subvolume (no-commit): '/mnt/rec/snapshots/9'
Delete subvolume (no-commit): '/mnt/rec/snapshots'
Create subvolume '/mnt/send/storage'
Create subvolume '/mnt/send/snapshots'
Create subvolume '/mnt/rec/diff'
Create subvolume '/mnt/rec/snapshots'
Create a readonly snapshot of '/mnt/send/storage/' in '/mnt/send/snapshots/0'
At subvol /mnt/send/snapshots/0
At subvol 0
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.482553 s, 217 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/1'
At subvol /mnt/send/snapshots/1
At snapshot 1
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.467191 s, 224 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/2'
At subvol /mnt/send/snapshots/2
At snapshot 2
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.465809 s, 225 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/3'
At subvol /mnt/send/snapshots/3
At snapshot 3
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.418819 s, 250 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/4'
At subvol /mnt/send/snapshots/4
At snapshot 4
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.466965 s, 225 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/5'
At subvol /mnt/send/snapshots/5
At snapshot 5
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.466293 s, 225 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/6'
At subvol /mnt/send/snapshots/6
At snapshot 6
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.46744 s, 224 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/7'
At subvol /mnt/send/snapshots/7
At snapshot 7
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.467267 s, 224 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/8'
At subvol /mnt/send/snapshots/8
At snapshot 8
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.467288 s, 224 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/9'
At subvol /mnt/send/snapshots/9
At snapshot 9
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.467526 s, 224 MB/s
Create a readonly snapshot of '/mnt/send/storage' in '/mnt/send/snapshots/10'
At subvol /mnt/send/snapshots/10
At snapshot 10
All done



* Re: Re: Re: cloning a btrfs drive with send and receive: clone is bigger than  the original?
  2021-01-09 21:08   `  
@ 2021-01-10  7:41     `  
  2021-01-10  7:54       ` Andrei Borzenkov
  2021-01-10 13:06       ` Re: " Graham Cobb
  0 siblings, 2 replies; 8+ messages in thread
From:   @ 2021-01-10  7:41 UTC (permalink / raw)
  To: Cedric.dewijs; +Cc: arvidjaar, linux-btrfs

I've tested some more.

Repeatedly sending the difference between two consecutive snapshots creates a structure on the target drive where all the snapshots share data. So 10 snapshots of 10 files of 100MB takes up 1GB, as expected.

Repeatedly sending the difference between the first snapshot and each next snapshot creates a structure on the target drive where the snapshots are independent, so they don't share any data. How can that be avoided?

Script (version that sends the difference between the first snapshot and each current snapshot):
# cat ~/btrfs-send-test.sh 
#!/bin/bash

btrfs subvolume delete /mnt/send/storage
btrfs subvolume delete /mnt/send/snapshots/*
btrfs subvolume delete /mnt/send/snapshots/
btrfs subvolume delete /mnt/rec/diff
btrfs subvolume delete /mnt/rec/snapshots/*
btrfs subvolume delete /mnt/rec/snapshots/
sync
btrfs subvolume create /mnt/send/storage
btrfs subvolume create /mnt/send/snapshots/
btrfs subvolume create /mnt/rec/diff
btrfs subvolume create /mnt/rec/snapshots

btrfs subvolume snapshot -r /mnt/send/storage/ /mnt/send/snapshots/0
btrfs send /mnt/send/snapshots/0 | btrfs receive /mnt/rec/snapshots

onelesscounter=0	# only used by the commented-out consecutive variant below
counter=1
while [ $counter -le 10 ]
do
	# Add one new 100 MiB file of random data and record its checksum.
	dd if=/dev/urandom of=/mnt/send/storage/file$( printf %03d "$counter" ).bin bs=1M count=100
	md5sum /mnt/send/storage/file$( printf %03d "$counter" ).bin >> /mnt/send/storage/md5sums.txt
	btrfs subvolume snapshot -r /mnt/send/storage /mnt/send/snapshots/$counter
	# The parent is always snapshot 0, so each stream carries every file added since then.
	btrfs send -p /mnt/send/snapshots/0 /mnt/send/snapshots/$counter -f /mnt/rec/diff/$counter
	# Consecutive variant from the previous test:
	#btrfs send -p /mnt/send/snapshots/$onelesscounter /mnt/send/snapshots/$counter -f /mnt/rec/diff/$counter
	btrfs receive -f /mnt/rec/diff/$counter /mnt/rec/snapshots
	((counter++))
	((onelesscounter++))
done
echo All done

# df -h
/dev/sda3       5.0G 1007M  3.6G  22% /mnt/send
/dev/sdd2       932G   11G  919G   2% /mnt/rec

# ls -lh /mnt/rec/diff
total 5.4G
-rw------- 1 root root  101M Jan 10 09:17 1
-rw------- 1 root root 1001M Jan 10 09:19 10
-rw------- 1 root root  201M Jan 10 09:17 2
-rw------- 1 root root  301M Jan 10 09:17 3
-rw------- 1 root root  401M Jan 10 09:17 4
-rw------- 1 root root  501M Jan 10 09:17 5
-rw------- 1 root root  601M Jan 10 09:18 6
-rw------- 1 root root  701M Jan 10 09:18 7
-rw------- 1 root root  801M Jan 10 09:18 8
-rw------- 1 root root  901M Jan 10 09:18 9

# rm /mnt/rec/diff/*
# sync

# df -h
/dev/sda3       5.0G 1007M  3.6G  22% /mnt/send
/dev/sdd2       932G  5.4G  924G   1% /mnt/rec  <= all data is individually stored in the snapshots?
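
The sizes above can be reproduced with a little arithmetic. A minimal sketch, assuming one new 100 MiB file per snapshot and ignoring the small overhead of the send stream itself (the variable names are illustrative):

```bash
#!/bin/bash
# Assumptions: one new 100 MiB file per snapshot, 10 snapshots, stream overhead ignored.
file_mib=100
snapshots=10

chain_total=0   # btrfs send -p <previous snapshot>
base_total=0    # btrfs send -p <snapshot 0>
k=1
while [ "$k" -le "$snapshots" ]; do
    chain_total=$((chain_total + file_mib))      # only the newest file is new relative to the parent
    base_total=$((base_total + k * file_mib))    # all k files are new relative to snapshot 0
    k=$((k + 1))
done

echo "consecutive parents: ${chain_total} MiB"
echo "fixed parent 0:      ${base_total} MiB"
```

5500 MiB is roughly 5.4 GiB, which matches both the "total 5.4G" of the diff files and the 5.4G that remains in use on /mnt/rec after the diff files are removed.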




* Re: cloning a btrfs drive with send and receive: clone is bigger than the original?
  2021-01-10  7:41     `  
@ 2021-01-10  7:54       ` Andrei Borzenkov
  2021-01-10 13:06       ` Re: " Graham Cobb
  1 sibling, 0 replies; 8+ messages in thread
From: Andrei Borzenkov @ 2021-01-10  7:54 UTC (permalink / raw)
  To: Cedric.dewijs; +Cc: linux-btrfs

10.01.2021 10:41, Cedric.dewijs@eclipso.eu wrote:
> I've tested some more.
> 
> Repeatedly sending the difference between two consecutive snapshots creates a structure on the target drive where all the snapshots share data. So 10 snapshots of 10 files of 100MB takes up 1GB, as expected.
> 
> Repeatedly sending the difference between the first snapshot and each next snapshot creates a structure on the target drive where the snapshots are independent, so they don't share any data.

How should "btrfs receive" know that in

btrfs send -p base snap1 | btrfs receive
btrfs send -p base snap2 | btrfs receive

snap1 and snap2 are related? By definition "btrfs send -p base" computes
difference to base snapshot and btrfs receive applies this difference to
replica of base snapshot. btrfs receive cannot reuse replica of snap1
because send stream does not contain any information about it.


> How can that be avoided?
> 

You can specify additional clone sources (btrfs send -p base -c snap1
snap2) but in your example the most efficient is to send delta between
two consecutive snapshots.
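
For illustration, the clone-source variant could be sketched as follows for the test layout used in this thread. This is an assumption-laden sketch, not a tested recipe: the run helper and send_with_clone_sources are illustrative names, every earlier snapshot is passed as a clone source, and each clone source has to exist unmodified on the receiving side as well. By default the commands are only printed (set DRY_RUN=0 to execute them as root):

```bash
#!/bin/bash
# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

SRC=/mnt/send/snapshots   # read-only snapshots on the sending side
DIFF=/mnt/rec/diff        # where the send streams are stored
DST=/mnt/rec/snapshots    # where the snapshots are received

# Send snapshot $1 with snapshot 0 as parent and snapshots 1..$1-1 as clone sources.
send_with_clone_sources() {
    n=$1
    set -- btrfs send -p "$SRC/0"
    i=1
    while [ "$i" -lt "$n" ]; do
        set -- "$@" -c "$SRC/$i"
        i=$((i + 1))
    done
    run "$@" -f "$DIFF/$n" "$SRC/$n"
    run btrfs receive -f "$DIFF/$n" "$DST"
}

send_with_clone_sources 3
```

With the clone sources in place, send can describe extents that already exist in snapshots 1 and 2 as clone operations instead of writing the data again.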


* Re: Re: Re: cloning a btrfs drive with send and receive: clone is bigger than the original?
  2021-01-10  7:41     `  
  2021-01-10  7:54       ` Andrei Borzenkov
@ 2021-01-10 13:06       ` Graham Cobb
  2021-01-10 13:21         ` Hugo Mills
  1 sibling, 1 reply; 8+ messages in thread
From: Graham Cobb @ 2021-01-10 13:06 UTC (permalink / raw)
  To: Cedric.dewijs; +Cc: arvidjaar, linux-btrfs

On 10/01/2021 07:41, Cedric.dewijs@eclipso.eu wrote:
> I've tested some more.
> 
> Repeatedly sending the difference between two consecutive snapshots creates a structure on the target drive where all the snapshots share data. So 10 snapshots of 10 files of 100MB takes up 1GB, as expected.
> 
> Repeatedly sending the difference between the first snapshot and each next snapshot creates a structure on the target drive where the snapshots are independent, so they don't share any data. How can that be avoided?

If you send a snapshot B with a parent A, any files not present in A
will be created in the copy of B. The fact that you already happen to
have a copy of the files somewhere else on the target is not known to
either the sender or the receiver - how would it be?

If you want the send process to take into account *other* snapshots that
have previously been sent, you need to tell send to also use those
snapshots as clone sources. That is what the -c option is for.

Alternatively, use a deduper on the destination after the receive has
finished and let it work out what can be shared.
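
As an example of the deduper route, duperemove is one such tool. The sketch below only shows the general shape. The run helper and dedupe_received are illustrative names, the path is the one from the test script, and whether read-only received snapshots can be deduplicated in place depends on the kernel and tool version, so that part is an assumption to verify. By default the command is only printed (set DRY_RUN=0 to execute it as root):

```bash
#!/bin/bash
# Hypothetical helper: print the command by default, execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# -d: actually deduplicate the duplicate extents that are found
# -r: recurse into the directory tree below the given path
dedupe_received() {
    run duperemove -dr "$1"
}

dedupe_received /mnt/rec/snapshots
```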



* Re: Re: Re: cloning a btrfs drive with send and receive: clone is bigger than the original?
  2021-01-10 13:06       ` Re: " Graham Cobb
@ 2021-01-10 13:21         ` Hugo Mills
  2021-01-10 15:38           ` Andrei Borzenkov
  0 siblings, 1 reply; 8+ messages in thread
From: Hugo Mills @ 2021-01-10 13:21 UTC (permalink / raw)
  To: Graham Cobb; +Cc: Cedric.dewijs, arvidjaar, linux-btrfs

On Sun, Jan 10, 2021 at 01:06:44PM +0000, Graham Cobb wrote:
> On 10/01/2021 07:41, Cedric.dewijs@eclipso.eu wrote:
> > I've tested some more.
> > 
> > Repeatedly sending the difference between two consecutive snapshots creates a structure on the target drive where all the snapshots share data. So 10 snapshots of 10 files of 100MB takes up 1GB, as expected.
> > 
> > Repeatedly sending the difference between the first snapshot and each next snapshot creates a structure on the target drive where the snapshots are independent, so they don't share any data. How can that be avoided?
> 
> If you send a snapshot B with a parent A, any files not present in A
> will be created in the copy of B. The fact that you already happen to
> have a copy of the files somewhere else on the target is not known to
> either the sender or the receiver - how would it be?
> 
> If you want the send process to take into account *other* snapshots that
> have previously been sent, you need to tell send to also use those
> snapshots as clone sources. That is what the -c option is for.

   And even then, it won't spot files that are identical but which
don't share extents.

> Alternatively, use a deduper on the destination after the receive has
> finished and let it work out what can be shared.

   This is a viable approach.

   Hugo.

-- 
Hugo Mills             | The last man on Earth sat in a room. Suddenly, there
hugo@... carfax.org.uk | was a knock at the door.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                        Frederic Brown


* Re: cloning a btrfs drive with send and receive: clone is bigger than the original?
  2021-01-10 13:21         ` Hugo Mills
@ 2021-01-10 15:38           ` Andrei Borzenkov
  0 siblings, 0 replies; 8+ messages in thread
From: Andrei Borzenkov @ 2021-01-10 15:38 UTC (permalink / raw)
  To: Hugo Mills, Graham Cobb, Cedric.dewijs, linux-btrfs

10.01.2021 16:21, Hugo Mills wrote:
> On Sun, Jan 10, 2021 at 01:06:44PM +0000, Graham Cobb wrote:
>> On 10/01/2021 07:41, Cedric.dewijs@eclipso.eu wrote:
>>> I've tested some more.
>>>
>>> Repeatedly sending the difference between two consecutive snapshots creates a structure on the target drive where all the snapshots share data. So 10 snapshots of 10 files of 100MB takes up 1GB, as expected.
>>>
>>> Repeatedly sending the difference between the first snapshot and each next snapshot creates a structure on the target drive where the snapshots are independent, so they don't share any data. How can that be avoided?
>>
>> If you send a snapshot B with a parent A, any files not present in A
>> will be created in the copy of B. The fact that you already happen to
>> have a copy of the files somewhere else on the target is not known to
>> either the sender or the receiver - how would it be?
>>
>> If you want the send process to take into account *other* snapshots that
>> have previously been sent, you need to tell send to also use those
>> snapshots as clone sources. That is what the -c option is for.
> 
>    And even then, it won't spot files that are identical but which
> don't share extents.
> 

It won't with "btrfs send -p" either.

>> Alternatively, use a deduper on the destination after the receive has
>> finished and let it work out what can be shared.
> 
>    This is a viable approach.
> 
>    Hugo.
> 



Thread overview: 8+ messages
2021-01-09 16:01 cloning a btrfs drive with send and receive: clone is bigger than the original?  
2021-01-09 19:45 ` Andrei Borzenkov
2021-01-09 21:08   `  
2021-01-10  7:41     `  
2021-01-10  7:54       ` Andrei Borzenkov
2021-01-10 13:06       ` Re: " Graham Cobb
2021-01-10 13:21         ` Hugo Mills
2021-01-10 15:38           ` Andrei Borzenkov
