* send | receive: received snapshot is missing recent files
@ 2017-09-06  5:37 Dave
       [not found] ` <CAH=dxU7RM7s+pxT=wxE9WcUNMWjSG_A0=1pUWD1dWGVQ6g+g8Q@mail.gmail.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-06  5:37 UTC (permalink / raw)
  To: linux-btrfs

I'm running Arch Linux on BTRFS. I use Snapper to take hourly
snapshots and it works without any issues.

I have a bash script that uses send | receive to transfer snapshots to
a couple of external HDDs. The script runs daily on a systemd timer. I
set all this up recently and I first noticed that it runs every day
and that the expected snapshots are received.

At a glance, everything looked correct. However, today was my day to
drill down and really make sure everything was working.

To my surprise, the newest received incremental snapshots are missing
all recent files. These new snapshots reflect the system state from
weeks ago and no files more recent than a certain date are in the
snapshots.

However, the snapshots are newly created and newly received. The work
is being done fresh each day when my script runs, but the results are
anchored back in time at this earlier date. Weird.

I'm not really sure where to start troubleshooting, so I'll start by
sharing part of my script. I'm sure the problem is in my script, and
is not related to BTRFS or snapper functionality. (As I said, the
Snapper snapshots are totally OK before being sent | received.)

These are the key lines of the script I'm using to send | receive a snapshot:

    old_num=$(snapper -c "$config" list -t single | awk '/'"$selected_uuid"'/ {print $1}')
    old_snap=$SUBVOLUME/.snapshots/$old_num/snapshot
    new_num=$(snapper -c "$config" create --print-number)
    new_snap=$SUBVOLUME/.snapshots/$new_num/snapshot
    btrfs send -c "$old_snap" "$new_snap" | $ssh btrfs receive "$backup_location"

I have to admit that even after reading the following page half a
dozen times, I barely understand the difference between -c and -p.
https://btrfs.wiki.kernel.org/index.php/FAQ#What_is_the_difference_between_-c_and_-p_in_send.3F

After reading that page again today, I feel like I should switch to -p
(maybe). However, the -c vs -p choice probably isn't my problem.
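
For reference, the btrfs-send manual describes the two options roughly as follows: -p names the parent snapshot for an incremental stream, while -c names a clone source and may be given more than once; when only -c is given, btrfs send chooses a parent among the clone sources itself. The sketch below only prints the three command shapes so they can be compared side by side. The function name print_send_cmd and the example paths are hypothetical, and nothing here runs btrfs.

```bash
# Print (do not execute) the btrfs send command line for a given mode.
#   print_send_cmd full   NEW        full, non-incremental stream
#   print_send_cmd parent OLD NEW    incremental stream relative to OLD (-p)
#   print_send_cmd clone  OLD NEW    OLD offered as a clone source (-c);
#                                    btrfs send then picks the parent itself
# Paths are printed unquoted, so this is for display only.
print_send_cmd() {
    case "$1" in
        full)   printf 'btrfs send %s\n' "$2" ;;
        parent) printf 'btrfs send -p %s %s\n' "$2" "$3" ;;
        clone)  printf 'btrfs send -c %s %s\n' "$2" "$3" ;;
        *)      echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

With a single old snapshot, the parent and clone forms differ only in the flag, which is consistent with the same result being reported for both in this thread.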

Any ideas what my problem could be?
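
One thing worth ruling out in a script like the one above is the parent lookup: the awk filter prints the first column of every row that matches $selected_uuid, so it could yield nothing, or several numbers. The sketch below is a minimal guard, assuming snapper snapshot numbers are plain decimal integers. The helper name is_single_snapshot_number is hypothetical and not part of the original script, and this is not a diagnosis of the missing files.

```bash
# Hypothetical helper: succeed only when the argument is exactly one
# non-empty run of digits, i.e. the lookup matched a single snapshot number.
is_single_snapshot_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, multi-line, or non-numeric
        *)           return 0 ;;
    esac
}

# Possible use right after old_num is computed (variable name from the script):
#   is_single_snapshot_number "$old_num" || {
#       echo "unexpected parent lookup result: '$old_num'" >&2
#       exit 1
#   }
```

Failing early here keeps a bad lookup from silently producing a wrong parent path for btrfs send.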


* Re: send | receive: received snapshot is missing recent files
       [not found] ` <CAH=dxU7RM7s+pxT=wxE9WcUNMWjSG_A0=1pUWD1dWGVQ6g+g8Q@mail.gmail.com>
@ 2017-09-06 19:46   ` Dave
  2017-09-07  4:43     ` Dave
  0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-06 19:46 UTC (permalink / raw)
  To: linux-btrfs

This is an even better set of steps for reproducing the problem.

[root@srv]# sync
[root@srv]# mkdir /home/.snapshots/test1
[root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
[root@srv]# sync
[root@srv]# mkdir /mnt/x5a/home/test1
[root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive /mnt/x5a/home/test1/
At subvol /home/.snapshots/test1/home/
At subvol home
[root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
NOTE: all recent files are present
[root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
NOTE: all recent files are present
[root@srv]# mkdir /home/.snapshots/test2
[root@srv]# mkdir /mnt/x5a/home/test2
[root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
[root@srv]# sync
[root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
At subvol /home/.snapshots/test2/home/
At snapshot home
[root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
NOTE: all recent files are MISSING
[root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
NOTE: all recent files are MISSING

Any ideas what could be causing this problem with incremental backups?


On Wed, Sep 6, 2017 at 3:23 PM, Dave <davestechshop@gmail.com> wrote:
>
> Here is more info on this problem. I can reproduce this without using my script. Simple btrfs commands will reproduce the problem every time. The same files are missing every time. There is no randomness to the missing data.
>
> Here are my steps:
>
> 1. snapper -c home create
> result is a valid snapshot at /home/.snapshots/1704/snapshot
> 2. btrfs send /home/.snapshots/1704/snapshot | btrfs receive /mnt/x5a/home/1704
> 3. snapper -c home create
> result is a valid snapshot at /home/.snapshots/1716/snapshot
> 4. btrfs send -c /home/.snapshots/1704/snapshot/ /home/.snapshots/1716/snapshot/ | btrfs receive /mnt/x5a/home/1716/
>
> I expect /mnt/x5a/home/1716/snapshot to be identical to /home/.snapshots/1716/snapshot. However, it is not.
> The result is that /mnt/x5a/home/1716/snapshot is missing all recent files.
>
> Next step was to delete snapshot 1716 (in both locations) and repeat the send | receive using -p
>
> btrfs su del /mnt/x5a/home/1716/snapshot
> snapper -c home delete 1716
> snapper -c home create
> btrfs send -p /home/.snapshots/1704/snapshot/ /home/.snapshots/1716/snapshot/ | btrfs receive /mnt/x5a/home/1716/
>
> The result is once again that /mnt/x5a/home/1716/snapshot is missing all recent files. However, the other snapshots are all valid:
> /home/.snapshots/1704/snapshot is valid & complete
> /mnt/x5a/home/1704/snapshot -- non-incremental send: snapshot is valid & complete
> /home/.snapshots/1716/snapshot is valid & complete
> /mnt/x5a/home/1716/snapshot -- incrementally sent snapshot is missing all recent files whether sent with -c or -p
>
> The incrementally sent snapshot is even missing files that are present in the reference snapshot /mnt/x5a/home/1704/snapshot.
>


* Re: send | receive: received snapshot is missing recent files
  2017-09-06 19:46   ` Dave
@ 2017-09-07  4:43     ` Dave
  2017-09-07  6:24       ` A L
  0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-07  4:43 UTC (permalink / raw)
  To: linux-btrfs

Here is more info and a possible (shocking) explanation. This
aggregates my prior messages and it provides an almost complete set of
steps to reproduce this problem.

Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux
btrfs-progs v4.12

My steps:

[root@srv]# sync
[root@srv]# mkdir /home/.snapshots/test1
[root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
[root@srv]# sync
[root@srv]# mkdir /mnt/x5a/home/test1
[root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive /mnt/x5a/home/test1/
At subvol /home/.snapshots/test1/home/
At subvol home
[root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
NOTE: all recent files are present
[root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
NOTE: all recent files are present
[root@srv]# mkdir /home/.snapshots/test2
[root@srv]# mkdir /mnt/x5a/home/test2
[root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
[root@srv]# sync
[root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
At subvol /home/.snapshots/test2/home/
At snapshot home
[root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
NOTE: all recent files are MISSING
[root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
NOTE: all recent files are MISSING

Below I am including some rsync output to illustrate when a snapshot
is missing files (or not):

[root@srv]# rsync -aniv /home/.snapshots/test1/home/ /home/.snapshots/test2/home/
sending incremental file list

sent 1,143,286 bytes  received 1,123 bytes  762,939.33 bytes/sec
total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)

This indicates that these two subvolumes contain the same files, which
they should because both are read-only snapshots of /home taken with no
changes to files in between, and neither was sent to another physical
device.

The problem is when test2 is sent to another device as shown by the
rsync results below.

[root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
sending incremental file list
.d..t...... ./
.d..t...... user1/
>f.st...... user1/.bash_history
>f.st...... user1/.bashrc
>f+++++++++ user1/test2017-09-06.txt
...
and a long list of other missing files

The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
missing all recent files (any files from the month of August or
September), as my prior visual inspections had indicated. The same
files are missing every time. There is no randomness to the missing
data.
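
The rsync dry-run check above can be condensed into two counts for use in a script. This is a minimal sketch, assuming the itemized format shown in this message, where a line starting with ">f+++++++++" is a file absent from the destination and any other ">f" line is a file that exists but differs. The function name summarize_rsync_itemized is hypothetical.

```bash
# Read `rsync -ani SRC/ DST/` output on stdin and print how many regular
# files are missing from DST and how many exist there but differ.
summarize_rsync_itemized() {
    awk '
        /^>f\+\+\+\+\+\+\+\+\+ / { missing++; next }   # new file, absent on DST
        /^>f/                    { changed++ }         # present but different
        END { printf "missing=%d changed=%d\n", missing, changed }
    '
}

# Possible use (paths from this message):
#   rsync -ani /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/ \
#       | summarize_rsync_itemized
```

A backup script could treat any non-zero count as a failed verification, since a freshly received read-only snapshot is expected to match its source.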

The problem does not happen for me if the receive command target is
located on the same physical device as shown next. (However, I suspect
there's more to it than that, as explained further below.)

[root@srv]# mkdir /home/.snapshots/test2rec
[root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /home/.snapshots/test2rec/
At subvol /home/.snapshots/test2/home/

# rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
sending incremental file list

sent 1,143,286 bytes  received 1,123 bytes  2,288,818.00 bytes/sec
total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)

The above (as well as visual inspection of files) indicates that these
two subvolumes contain the same files, which was not the case when the
same command had a target located on another physical device. Of
course, a snapshot which resides on the same physical device is not a
very good backup. So I do need to send it to another device, but that
results in missing files when the -p or -c options are used with btrfs
send. (Non-incremental sending to another physical device does work.)

I can think of a couple possible explanations.

One is that there is a problem when using the -p or -c options with
btrfs send when the target is another physical device. I suspect this
is not the actual explanation, however.

A second possibility is that the presence of prior existing snapshots
at the target location (even if old and not referenced in any current
btrfs command), can determine the outcome and final contents of an
incremental send operation. I believe the info below suggests this to
be the case.

[root@srv]# btrfs su show /home/.snapshots/test2/home/
test2/home
        Name:                   home
        UUID:                   292e8bbf-a95f-2a4e-8280-129202d389dc
        Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
        Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
        Creation time:          2017-09-06 15:38:16 -0400
        Subvolume ID:           2000
        Generation:             5020
        Gen at creation:        5020
        Parent ID:              257
        Top level ID:           257
        Flags:                  readonly
        Snapshot(s):

[root@srv]# btrfs su show /mnt/x5a/home/test1/home
home/test1/home
        Name:                   home
        UUID:                   dc00b13d-f841-cf48-a169-aa61429a5679
        Parent UUID:            -
        Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
        Creation time:          2017-09-06 15:33:45 -0400
        Subvolume ID:           656
        Generation:             777
        Gen at creation:        773
        Parent ID:              257
        Top level ID:           257
        Flags:                  readonly
        Snapshot(s):

[root@srv]# btrfs su show /mnt/x5a/home/test2/home/
home/test2/home
        Name:                   home
        UUID:                   b01ab63f-17a1-f442-b9d4-ed12a0d057ea
        Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
        Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
        Creation time:          2017-09-06 15:39:51 -0400
        Subvolume ID:           660
        Generation:             779
        Gen at creation:        779
        Parent ID:              257
        Top level ID:           257
        Flags:                  readonly
        Snapshot(s):

[root@srv]# btrfs su show /home/.snapshots/test2rec/home/
test2rec/home
        Name:                   home
        UUID:                   bde1891d-1474-414f-b6ab-2a34c5af224e
        Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
        Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
        Creation time:          2017-09-06 17:36:19 -0400
        Subvolume ID:           2003
        Generation:             5027
        Gen at creation:        5027
        Parent ID:              257
        Top level ID:           257
        Flags:                  readonly
        Snapshot(s):

Below, we have an old, almost forgotten snapshot (date 2017-07-21) on
device /mnt/x5a/home with a Received UUID that matches the Received
UUID of test snapshots that were newly created today. How? Why?

[root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
home/107/snapshot
        Name:                   snapshot
        UUID:                   94d0bc47-dbf2-374e-b1c8-de06d729cde2
        Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
        Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
        Creation time:          2017-07-21 00:00:25 -0400
        Subvolume ID:           433
        Generation:             222
        Gen at creation:        221
        Parent ID:              257
        Top level ID:           257
        Flags:                  readonly
        Snapshot(s):

If my guess is correct, btrfs has found this old snapshot and
referenced it without me telling it to do so. The result is that the
newly executed btrfs commands shown above have a totally unexpected
result.
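
The observation above, several subvolumes reporting the same Received UUID, can be checked across a whole backup location rather than one subvolume at a time. The sketch below assumes its input is one "received-uuid path" pair per line, with "-" meaning no Received UUID, and that paths contain no spaces. The function name report_shared_received_uuids is hypothetical. It only reports sharing and does not decide which snapshot is at fault.

```bash
# Read "received-uuid path" pairs on stdin and print every Received UUID
# carried by more than one subvolume, followed by the paths that carry it.
report_shared_received_uuids() {
    awk '
        $1 != "-" { count[$1]++; paths[$1] = paths[$1] " " $2 }
        END {
            for (u in count)
                if (count[u] > 1)
                    print u ":" paths[u]
        }
    '
}

# One possible way to build the input (not executed here):
#   for s in /mnt/x5a/home/*/snapshot; do
#       uuid=$(btrfs subvolume show "$s" |
#              awk '$1 == "Received" && $2 == "UUID:" { print $3; exit }')
#       printf '%s %s\n' "$uuid" "$s"
#   done | report_shared_received_uuids
```

Any line of output means at least two subvolumes on that filesystem claim to be received copies of the same source.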

Today's new snapshot will not contain any files newer than 2017-07-21.
Is this a known issue?

Refer back to the commands at the top of this message. I created a new
snapshot and did a full (non-incremental) send to the target location
(/mnt/x5a/home). Then I created a snapshot and did a send which only
referenced the prior snapshot created today. Nowhere did I reference
the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
this backup location -- it was intended to hold a lot of them.) Yet,
the very presence of /mnt/x5a/home/107/snapshot on the target device
resulted in today's backup (and all recent backups) being worthless
because they are missing all files since 2017-07-21.

These results are totally repeatable, given my set of existing
backups. But it's bizarre to me. As I understand it, a staff person
could transfer a btrfs snapshot to a target volume and its mere
presence there could make all subsequent backups (incremental sends)
to that target volume invalid and useless. If that is true... wow.

Another interesting observation is that the device that contains the
source snapshot, /home/.snapshots, also contains many, many prior
snapshots, going back to when this system was first set up. Why do
none of them cause a problem? Is it because I had never used
/home/.snapshots as the target of a receive operation (until I did so
today in testing the steps above)?

As far as repeating these steps, all this was totally repeatable for
me as long as /mnt/x5a/home/107/snapshot existed on the target of the
receive command (/mnt/x5a/home/). I do not know how to create such a
"rogue" snapshot on purpose, but doing so may be key to reproducing my
results.

Maybe somebody can explain to me what's really happening. How is it
possible that an old snapshot created 2017-07-21 could have the same
Received UUID as snapshots created today? And how could that fact lead
to the result I'm seeing, which seems very serious? (Unexpected
missing files from a backup which was completed without errors is
pretty serious in my book.)

Most important question: how can we rely on automated incremental
backups with btrfs send | receive given what I'm observing here
(assuming my observations are roughly correct)?

Here's more info just to confirm that my results are not due to
filesystem corruption.

running check on unmounted volume that contains /mnt/x5a/home/test2/home:
[root@srv]# btrfs check -p /dev/mapper/x5a_luks
Checking filesystem on /dev/mapper/x5a_luks
UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
checking extents [o]
checking free space cache [.]
checking fs roots [o]
checking csums
checking root refs
found 258178555904 bytes used, no error found
total csum bytes: 250354776
total tree bytes: 1752088576
total fs tree bytes: 1308540928
total extent tree bytes: 175161344
btree space waste bytes: 215594634
file data blocks allocated: 258634637312
 referenced 292888985600

[root@srv]# btrfs fi show /mnt/x5a/
Label: 'x5a_top'  uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
        Total devices 1 FS bytes used 240.45GiB
        devid    1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks

[root@srv]# btrfs fi df /mnt/x5a/
Data, single: total=239.01GiB, used=238.82GiB
System, DUP: total=32.00MiB, used=48.00KiB
Metadata, DUP: total=2.50GiB, used=1.63GiB
GlobalReserve, single: total=422.73MiB, used=0.00B

# btrfs scrub status -d /mnt/x5a/
scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
scrub device /dev/mapper/x5a_luks (id 1) history
        scrub started at Wed Sep  6 17:09:58 2017 and finished after 01:42:30
        total bytes scrubbed: 242.08GiB with 0 errors


* Re: send | receive: received snapshot is missing recent files
  2017-09-07  4:43     ` Dave
@ 2017-09-07  6:24       ` A L
  2017-09-07 12:39         ` Dave
  0 siblings, 1 reply; 11+ messages in thread
From: A L @ 2017-09-07  6:24 UTC (permalink / raw)
  To: Dave, linux-btrfs

The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
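
The condition described here can be tested before each send. This is a minimal sketch, assuming the `btrfs subvolume show` layout quoted earlier in this thread, where an unset field is printed as "-". The helper names received_uuid and has_received_uuid are hypothetical.

```bash
# Print the Received UUID field from `btrfs subvolume show` output on stdin.
received_uuid() {
    awk '$1 == "Received" && $2 == "UUID:" { print $3; exit }'
}

# Succeed (exit 0) when the output on stdin reports a Received UUID that is
# actually set, i.e. present and not "-".
has_received_uuid() {
    uuid=$(received_uuid)
    [ -n "$uuid" ] && [ "$uuid" != "-" ]
}

# Possible pre-flight check on a source snapshot (path from this thread):
#   if btrfs subvolume show /home/.snapshots/test2/home/ | has_received_uuid; then
#       echo "source snapshot carries a Received UUID; not sending" >&2
#   fi
```

Applied to the output posted earlier, the source snapshot /home/.snapshots/test2/home/ would be flagged, since it lists Received UUID e00d5318-6efd-824e-ac91-f25efa5c2a74.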


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: send | receive: received snapshot is missing recent files
  2017-09-07  6:24       ` A L
@ 2017-09-07 12:39         ` Dave
  2017-09-07 13:34           ` Dave
  0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-07 12:39 UTC (permalink / raw)
  To: linux-btrfs; +Cc: A L

Hello. Can anyone further explain this issue ("you have a Received
UUID on the source volume")?

How does it happen?
How does one remove a Received UUID from the source volume?

And how does that explain my results where I showed that the problem
is not dependent upon the source volume but is instead dependent upon
some existing snapshot on the target volume?

My results do not appear to be fully explained by a Received UUID on
the source volume, as my prior message hopefully shows clearly.

Thank you.

On Thu, Sep 7, 2017 at 2:24 AM, A L <crimsoncottage@gmail.com> wrote:
> The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
>
> ---- From: Dave <davestechshop@gmail.com> -- Sent: 2017-09-07 - 06:43 ----
>
>> Here is more info and a possible (shocking) explanation. This
>> aggregates my prior messages and it provides an almost complete set of
>> steps to reproduce this problem.
>>
>> Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux
>> btrfs-progs v4.12
>>
>> My steps:
>>
>> [root@srv]# sync
>> [root@srv]# mkdir /home/.snapshots/test1
>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
>> [root@srv]# sync
>> [root@srv]# mkdir /mnt/x5a/home/test1
>> [root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive
>> /mnt/x5a/home/test1/
>> At subvol /home/.snapshots/test1/home/
>> At subvol home
>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
>> NOTE: all recent files are present
>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
>> NOTE: all recent files are present
>> [root@srv]# mkdir /home/.snapshots/test2
>> [root@srv]# mkdir /mnt/x5a/home/test2
>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
>> [root@srv]# sync
>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/
>> /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
>> At subvol /home/.snapshots/test2/home/
>> At snapshot home
>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
>> NOTE: all recent files are MISSING
>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
>> NOTE: all recent files are MISSING
>>
>> Below I am including some rsync output to illustrate when a snapshot
>> is missing files (or not):
>>
>> [root@srv]# rsync -aniv /home/.snapshots/test1/home/
>> /home/.snapshots/test2/home/
>> sending incremental file list
>>
>> sent 1,143,286 bytes  received 1,123 bytes  762,939.33 bytes/sec
>> total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)
>>
>> This indicates that these two subvolumes contain the same files, which
>> they should because test2 is a snapshot of test1 without any changes
>> to files, and it was not sent to another physical device.
>>
>> The problem is when test2 is sent to another device as shown by the
>> rsync results below.
>>
>> [root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
>> sending incremental file list
>> .d..t...... ./
>> .d..t...... user1/
>> >f.st...... user1/.bash_history
>> >f.st...... user1/.bashrc
>> >f+++++++++ user1/test2017-09-06.txt
>> ...
>> and a long list of other missing files
>>
>> The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
>> missing all recent files (any files from the month of August or
>> September), as my prior visual inspections had indicated. The same
>> files are missing every time. There is no randomness to the missing
>> data.
>>
>> The problem does not happen for me if the receive command target is
>> located on the same physical device as shown next. (However, I suspect
>> there's more to it than that, as explained further below.)
>>
>> [root@srv]# mkdir /home/.snapshots/test2rec
>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/
>> /home/.snapshots/test2/home/ | btrfs receive
>> /home/.snapshots/test2rec/
>> At subvol /home/.snapshots/test2/home/
>>
>> # rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
>> sending incremental file list
>>
>> sent 1,143,286 bytes  received 1,123 bytes  2,288,818.00 bytes/sec
>> total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)
>>
>> The above (as well as visual inspection of files) indicates that these
>> two subvolumes contain the same files, which was not the case when the
>> same command had a target located on another physical device. Of
>> course, a snapshot which resides on the same physical device is not a
>> very good backup. So I do need to send it to another device, but that
>> results in missing files when the -p or -c options are used with btrfs
>> send. (Non-incremental sending to another physical device does work.)
>>
>> I can think of a couple possible explanations.
>>
>> One is that there is a problem when using the -p or -c options with
>> btrfs send when the target is another physical device. I suspect this
>> is not the actual explanation, however.
>>
>> A second possibility is that the presence of prior existing snapshots
>> at the target location (even if old and not referenced in any current
>> btrfs command), can determine the outcome and final contents of an
>> incremental send operation. I believe the info below suggests this to
>> be the case.
>>
>> [root@srv]# btrfs su show /home/.snapshots/test2/home/
>> test2/home
>>         Name:                   home
>>         UUID:                   292e8bbf-a95f-2a4e-8280-129202d389dc
>>         Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>         Creation time:          2017-09-06 15:38:16 -0400
>>         Subvolume ID:           2000
>>         Generation:             5020
>>         Gen at creation:        5020
>>         Parent ID:              257
>>         Top level ID:           257
>>         Flags:                  readonly
>>         Snapshot(s):
>>
>> [root@srv]# btrfs su show /mnt/x5a/home/test1/home
>> home/test1/home
>>         Name:                   home
>>         UUID:                   dc00b13d-f841-cf48-a169-aa61429a5679
>>         Parent UUID:            -
>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>         Creation time:          2017-09-06 15:33:45 -0400
>>         Subvolume ID:           656
>>         Generation:             777
>>         Gen at creation:        773
>>         Parent ID:              257
>>         Top level ID:           257
>>         Flags:                  readonly
>>         Snapshot(s):
>>
>> [root@srv]# btrfs su show /mnt/x5a/home/test2/home/
>> home/test2/home
>>         Name:                   home
>>         UUID:                   b01ab63f-17a1-f442-b9d4-ed12a0d057ea
>>         Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>         Creation time:          2017-09-06 15:39:51 -0400
>>         Subvolume ID:           660
>>         Generation:             779
>>         Gen at creation:        779
>>         Parent ID:              257
>>         Top level ID:           257
>>         Flags:                  readonly
>>         Snapshot(s):
>>
>> [root@srv]# btrfs su show /home/.snapshots/test2rec/home/
>> test2rec/home
>>         Name:                   home
>>         UUID:                   bde1891d-1474-414f-b6ab-2a34c5af224e
>>         Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>         Creation time:          2017-09-06 17:36:19 -0400
>>         Subvolume ID:           2003
>>         Generation:             5027
>>         Gen at creation:        5027
>>         Parent ID:              257
>>         Top level ID:           257
>>         Flags:                  readonly
>>         Snapshot(s):
>>
>> Below, we have an old, almost forgotten snapshot (date 2017-07-21) on
>> device /mnt/x5a/home with a Received UUID that matches the Received
>> UUID of test snapshots that were newly created today. How? Why?
>>
>> [root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
>> home/107/snapshot
>>         Name:                   snapshot
>>         UUID:                   94d0bc47-dbf2-374e-b1c8-de06d729cde2
>>         Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>         Creation time:          2017-07-21 00:00:25 -0400
>>         Subvolume ID:           433
>>         Generation:             222
>>         Gen at creation:        221
>>         Parent ID:              257
>>         Top level ID:           257
>>         Flags:                  readonly
>>         Snapshot(s):
>>
>> If my guess is correct, btrfs has found this old snapshot and
>> referenced it without me telling it to do so. The result is that the
>> newly executed btrfs commands shown above have a totally unexpected
>> result.
>>
>> Today's new snapshot will not contain any files newer than 2017-07-21.
>> Is this a known issue?
>>
>> Refer back to the commands at the top of this message. I created a new
>> snapshot and did a full (non-incremental) send to the target location
>> (/mnt/x5a/home). Then I created a snapshot and did a send which only
>> referenced the prior snapshot created today. Nowhere did I reference
>> the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
>> this backup location -- it was intended to hold a lot of them.) Yet,
>> the very presence of /mnt/x5a/home/107/snapshot on the target device
>> resulted in today's backup (and all recent backups) being worthless
>> due to them missing all files since  2017-07-21.
>>
>> These results are totally repeatable, given my set of existing
>> backups. But it's bizarre to me. As I understand it, a staff person
>> could transfer a btrfs snapshot to a target volume and its mere
>> presence there could make all subsequent backups (incremental sends)
>> to that target volume invalid and useless. If that is true... wow.
>>
>> Another interesting observation is that the device that contains the
>> source snapshot, /home/.snapshots, also contains many, many prior
>> snapshots, going back to when this system was first set up. Why do
>> none of them cause a problem? Is it because I had never used
>> /home/.snapshots as the target of a receive operation (until I did so
>> today in testing the steps above)?
>>
>> As far as repeating these steps, all this was totally repeatable for
>> me as long as /mnt/x5a/home/107/snapshot existed on the target of the
>> receive command (/mnt/x5a/home/). I do not know how to create such a
>> "rogue" snapshot on purpose, but doing so may be key to reproducing my
>> results.
>>
>> Maybe somebody can explain to me what's really happening. How is it
>> possible that an old snapshot created  2017-07-21 could have the same
>> Received UUID as snapshots created today? And how could that fact lead
>> to the result I'm seeing, which seems very serious. (Unexpected
>> missing files from a backup which was completed without errors is
>> pretty serious in my book.)
>>
>> Most important question: how can we rely on automated incremental
>> backups with btrfs send | receive given what I'm observing here
>> (assuming my observations are roughly correct)?
>>
>> Here's more info just to confirm that my results are not due to
>> filesystem corruption.
>>
>> running check on unmounted volume that contains /mnt/x5a/home/test2/home:
>> [root@srv]# btrfs check -p /dev/mapper/x5a_luks
>> Checking filesystem on /dev/mapper/x5a_luks
>> UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>> checking extents [o]
>> checking free space cache [.]
>> checking fs roots [o]
>> checking csums
>> checking root refs
>> found 258178555904 bytes used, no error found
>> total csum bytes: 250354776
>> total tree bytes: 1752088576
>> total fs tree bytes: 1308540928
>> total extent tree bytes: 175161344
>> btree space waste bytes: 215594634
>> file data blocks allocated: 258634637312
>>  referenced 292888985600
>>
>> [root@srv]# btrfs fi show /mnt/x5a/
>> Label: 'x5a_top'  uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>         Total devices 1 FS bytes used 240.45GiB
>>         devid    1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
>>
>> [root@srv]# btrfs fi df /mnt/x5a/
>> Data, single: total=239.01GiB, used=238.82GiB
>> System, DUP: total=32.00MiB, used=48.00KiB
>> Metadata, DUP: total=2.50GiB, used=1.63GiB
>> GlobalReserve, single: total=422.73MiB, used=0.00B
>>
>> # btrfs scrub status -d /mnt/x5a/
>> scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
>> scrub device /dev/mapper/x5a_luks (id 1) history
>>         scrub started at Wed Sep  6 17:09:58 2017 and finished after 01:42:30
>>         total bytes scrubbed: 242.08GiB with 0 errors
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: send | receive: received snapshot is missing recent files
  2017-09-07 12:39         ` Dave
@ 2017-09-07 13:34           ` Dave
  2017-09-07 14:33             ` Axel Burri
  0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-07 13:34 UTC (permalink / raw)
  To: linux-btrfs; +Cc: A L

I just ran a test. The btrfs send - receive problem I described is
indeed fully resolved by removing the "problematic" snapshot on the
target device. I did not make any changes to the source volume. I did
not make any other changes in my steps (see earlier message for my
exact steps).

Therefore, the problem I described in my earlier message is not due
exclusively to having a Received UUID on the source volume (or to any
other feature of the source volume). It is not related to any feature
of the directly specified parent volume either. More details are
included in my earlier email.

Thanks for any further feedback, including answers to my questions and
comments about whether this is a known issue.
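
For anyone wanting to check a backup target for the same condition, here is a
minimal sketch, not a tested tool. It assumes you have already collected one
line per snapshot in the form "<received_uuid> <path>" (for example from the
"Received UUID:" line of "btrfs su show" for each snapshot); that input format
and the function name dup_received_uuids are my own assumptions. It lists every
Received UUID that appears on more than one snapshot, which is the situation
shown in my earlier message.

```shell
#!/bin/sh
# Hypothetical helper: reads "<received_uuid> <path>" lines on stdin and
# prints each Received UUID carried by more than one subvolume, followed
# by the indented paths that share it. Lines whose UUID is "-" are ignored.
dup_received_uuids() {
    awk '
        $1 != "-" && NF >= 2 {
            uuid = $1
            # Drop the UUID column; the rest of the line is the path.
            sub(/^[^ \t]+[ \t]+/, "")
            count[uuid]++
            paths[uuid] = paths[uuid] "  " $0 "\n"
        }
        END {
            for (u in count)
                if (count[u] > 1)
                    printf "%s\n%s", u, paths[u]
        }
    '
}
```

Any UUID it prints is worth inspecting by hand before trusting further
incremental sends to that target.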


On Thu, Sep 7, 2017 at 8:39 AM, Dave <davestechshop@gmail.com> wrote:
>
> Hello. Can anyone further explain this issue ("you have a Received UUID on the source volume")?
>
> How does it happen?
> How does one remove a Received UUID from the source volume?
>
> And how does that explain my results where I showed that the problem
> is not dependent upon the source volume but is instead dependent upon
> some existing snapshot on the target volume?
>
> My results do not appear to be fully explained by a Received UUID on the source volume, as my prior message hopefully shows clearly.
>
> Thank you.
>
> On Thu, Sep 7, 2017 at 2:24 AM, A L <crimsoncottage@gmail.com> wrote:
> > The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
> >
> > ---- From: Dave <davestechshop@gmail.com> -- Sent: 2017-09-07 - 06:43 ----
> >
> >> Here is more info and a possible (shocking) explanation. This
> >> aggregates my prior messages and it provides an almost complete set of
> >> steps to reproduce this problem.
> >>
> >> Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux
> >> btrfs-progs v4.12
> >>
> >> My steps:
> >>
> >> [root@srv]# sync
> >> [root@srv]# mkdir /home/.snapshots/test1
> >> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
> >> Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
> >> [root@srv]# sync
> >> [root@srv]# mkdir /mnt/x5a/home/test1
> >> [root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive
> >> /mnt/x5a/home/test1/
> >> At subvol /home/.snapshots/test1/home/
> >> At subvol home
> >> [root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
> >> NOTE: all recent files are present
> >> [root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
> >> NOTE: all recent files are present
> >> [root@srv]# mkdir /home/.snapshots/test2
> >> [root@srv]# mkdir /mnt/x5a/home/test2
> >> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
> >> Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
> >> [root@srv]# sync
> >> [root@srv]# btrfs send -p /home/.snapshots/test1/home/
> >> /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
> >> At subvol /home/.snapshots/test2/home/
> >> At snapshot home
> >> [root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
> >> NOTE: all recent files are MISSING
> >> [root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
> >> NOTE: all recent files are MISSING
> >>
> >> Below I am including some rsync output to illustrate when a snapshot
> >> is missing files (or not):
> >>
> >> [root@srv]# rsync -aniv /home/.snapshots/test1/home/
> >> /home/.snapshots/test2/home/
> >> sending incremental file list
> >>
> >> sent 1,143,286 bytes  received 1,123 bytes  762,939.33 bytes/sec
> >> total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)
> >>
> >> This indicates that these two subvolumes contain the same files, which
> >> they should because test2 is a snapshot of test1 without any changes
> >> to files, and it was not sent to another physical device.
> >>
> >> The problem is when test2 is sent to another device as shown by the
> >> rsync results below.
> >>
> >> [root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
> >> sending incremental file list
> >> .d..t...... ./
> >> .d..t...... user1/
> >> >f.st...... user1/.bash_history
> >> >f.st...... user1/.bashrc
> >> >f+++++++++ user1/test2017-09-06.txt
> >> ...
> >> and a long list of other missing files
> >>
> >> The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
> >> missing all recent files (any files from the month of August or
> >> September), as my prior visual inspections had indicated. The same
> >> files are missing every time. There is no randomness to the missing
> >> data.
> >>
> >> The problem does not happen for me if the receive command target is
> >> located on the same physical device as shown next. (However, I suspect
> >> there's more to it than that, as explained further below.)
> >>
> >> [root@srv]# mkdir /home/.snapshots/test2rec
> >> [root@srv]# btrfs send -p /home/.snapshots/test1/home/
> >> /home/.snapshots/test2/home/ | btrfs receive
> >> /home/.snapshots/test2rec/
> >> At subvol /home/.snapshots/test2/home/
> >>
> >> # rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
> >> sending incremental file list
> >>
> >> sent 1,143,286 bytes  received 1,123 bytes  2,288,818.00 bytes/sec
> >> total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)
> >>
> >> The above (as well as visual inspection of files) indicates that these
> >> two subvolumes contain the same files, which was not the case when the
> >> same command had a target located on another physical device. Of
> >> course, a snapshot which resides on the same physical device is not a
> >> very good backup. So I do need to send it to another device, but that
> >> results in missing files when the -p or -c options are used with btrfs
> >> send. (Non-incremental sending to another physical device does work.)
> >>
> >> I can think of a couple possible explanations.
> >>
> >> One is that there is a problem when using the -p or -c options with
> >> btrfs send when the target is another physical device. I suspect this
> >> is not the actual explanation, however.
> >>
> >> A second possibility is that the presence of prior existing snapshots
> >> at the target location (even if old and not referenced in any current
> >> btrfs command), can determine the outcome and final contents of an
> >> incremental send operation. I believe the info below suggests this to
> >> be the case.
> >>
> >> [root@srv]# btrfs su show /home/.snapshots/test2/home/
> >> test2/home
> >>         Name:                   home
> >>         UUID:                   292e8bbf-a95f-2a4e-8280-129202d389dc
> >>         Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
> >>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
> >>         Creation time:          2017-09-06 15:38:16 -0400
> >>         Subvolume ID:           2000
> >>         Generation:             5020
> >>         Gen at creation:        5020
> >>         Parent ID:              257
> >>         Top level ID:           257
> >>         Flags:                  readonly
> >>         Snapshot(s):
> >>
> >> [root@srv]# btrfs su show /mnt/x5a/home/test1/home
> >> home/test1/home
> >>         Name:                   home
> >>         UUID:                   dc00b13d-f841-cf48-a169-aa61429a5679
> >>         Parent UUID:            -
> >>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
> >>         Creation time:          2017-09-06 15:33:45 -0400
> >>         Subvolume ID:           656
> >>         Generation:             777
> >>         Gen at creation:        773
> >>         Parent ID:              257
> >>         Top level ID:           257
> >>         Flags:                  readonly
> >>         Snapshot(s):
> >>
> >> [root@srv]# btrfs su show /mnt/x5a/home/test2/home/
> >> home/test2/home
> >>         Name:                   home
> >>         UUID:                   b01ab63f-17a1-f442-b9d4-ed12a0d057ea
> >>         Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
> >>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
> >>         Creation time:          2017-09-06 15:39:51 -0400
> >>         Subvolume ID:           660
> >>         Generation:             779
> >>         Gen at creation:        779
> >>         Parent ID:              257
> >>         Top level ID:           257
> >>         Flags:                  readonly
> >>         Snapshot(s):
> >>
> >> [root@srv]# btrfs su show /home/.snapshots/test2rec/home/
> >> test2rec/home
> >>         Name:                   home
> >>         UUID:                   bde1891d-1474-414f-b6ab-2a34c5af224e
> >>         Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
> >>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
> >>         Creation time:          2017-09-06 17:36:19 -0400
> >>         Subvolume ID:           2003
> >>         Generation:             5027
> >>         Gen at creation:        5027
> >>         Parent ID:              257
> >>         Top level ID:           257
> >>         Flags:                  readonly
> >>         Snapshot(s):
> >>
> >> Below, we have an old, almost forgotten snapshot (date 2017-07-21) on
> >> device /mnt/x5a/home with a Received UUID that matches the Received
> >> UUID of test snapshots that were newly created today. How? Why?
> >>
> >> [root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
> >> home/107/snapshot
> >>         Name:                   snapshot
> >>         UUID:                   94d0bc47-dbf2-374e-b1c8-de06d729cde2
> >>         Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
> >>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
> >>         Creation time:          2017-07-21 00:00:25 -0400
> >>         Subvolume ID:           433
> >>         Generation:             222
> >>         Gen at creation:        221
> >>         Parent ID:              257
> >>         Top level ID:           257
> >>         Flags:                  readonly
> >>         Snapshot(s):
> >>
> >> If my guess is correct, btrfs has found this old snapshot and
> >> referenced it without me telling it to do so. The result is that the
> >> newly executed btrfs commands shown above have a totally unexpected
> >> result.
> >>
> >> Today's new snapshot will not contain any files newer than 2017-07-21.
> >> Is this a known issue?
> >>
> >> Refer back to the commands at the top of this message. I created a new
> >> snapshot and did a full (non-incremental) send to the target location
> >> (/mnt/x5a/home). Then I created a snapshot and did a send which only
> >> referenced the prior snapshot created today. Nowhere did I reference
> >> the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
> >> this backup location -- it was intended to hold a lot of them.) Yet,
> >> the very presence of /mnt/x5a/home/107/snapshot on the target device
> >> resulted in today's backup (and all recent backups) being worthless
> >> due to them missing all files since  2017-07-21.
> >>
> >> These results are totally repeatable, given my set of existing
> >> backups. But it's bizarre to me. As I understand it, a staff person
> >> could transfer a btrfs snapshot to a target volume and its mere
> >> presence there could make all subsequent backups (incremental sends)
> >> to that target volume invalid and useless. If that is true... wow.
> >>
> >> Another interesting observation is that the device that contains the
> >> source snapshot, /home/.snapshots, also contains many, many prior
> >> snapshots, going back to when this system was first set up. Why do
> >> none of them cause a problem? Is it because I had never used
> >> /home/.snapshots as the target of a receive operation (until I did so
> >> today in testing the steps above)?
> >>
> >> As far as repeating these steps, all this was totally repeatable for
> >> me as long as /mnt/x5a/home/107/snapshot existed on the target of the
> >> receive command (/mnt/x5a/home/). I do not know how to create such a
> >> "rogue" snapshot on purpose, but doing so may be key to reproducing my
> >> results.
> >>
> >> Maybe somebody can explain to me what's really happening. How is it
> >> possible that an old snapshot created  2017-07-21 could have the same
> >> Received UUID as snapshots created today? And how could that fact lead
> >> to the result I'm seeing, which seems very serious. (Unexpected
> >> missing files from a backup which was completed without errors is
> >> pretty serious in my book.)
> >>
> >> Most important question: how can we rely on automated incremental
> >> backups with btrfs send | receive given what I'm observing here
> >> (assuming my observations are roughly correct)?
> >>
> >> Here's more info just to confirm that my results are not due to
> >> filesystem corruption.
> >>
> >> running check on unmounted volume that contains /mnt/x5a/home/test2/home:
> >> [root@srv]# btrfs check -p /dev/mapper/x5a_luks
> >> Checking filesystem on /dev/mapper/x5a_luks
> >> UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
> >> checking extents [o]
> >> checking free space cache [.]
> >> checking fs roots [o]
> >> checking csums
> >> checking root refs
> >> found 258178555904 bytes used, no error found
> >> total csum bytes: 250354776
> >> total tree bytes: 1752088576
> >> total fs tree bytes: 1308540928
> >> total extent tree bytes: 175161344
> >> btree space waste bytes: 215594634
> >> file data blocks allocated: 258634637312
> >>  referenced 292888985600
> >>
> >> [root@srv]# btrfs fi show /mnt/x5a/
> >> Label: 'x5a_top'  uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
> >>         Total devices 1 FS bytes used 240.45GiB
> >>         devid    1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
> >>
> >> [root@srv]# btrfs fi df /mnt/x5a/
> >> Data, single: total=239.01GiB, used=238.82GiB
> >> System, DUP: total=32.00MiB, used=48.00KiB
> >> Metadata, DUP: total=2.50GiB, used=1.63GiB
> >> GlobalReserve, single: total=422.73MiB, used=0.00B
> >>
> >> # btrfs scrub status -d /mnt/x5a/
> >> scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
> >> scrub device /dev/mapper/x5a_luks (id 1) history
> >>         scrub started at Wed Sep  6 17:09:58 2017 and finished after 01:42:30
> >>         total bytes scrubbed: 242.08GiB with 0 errors
> >
> >

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: send | receive: received snapshot is missing recent files
  2017-09-07 13:34           ` Dave
@ 2017-09-07 14:33             ` Axel Burri
  2017-09-08  4:44               ` Dave
  0 siblings, 1 reply; 11+ messages in thread
From: Axel Burri @ 2017-09-07 14:33 UTC (permalink / raw)
  To: Dave, linux-btrfs; +Cc: A L

Having a received_uuid set on the source volume ("/home" in your case)
is indeed a bad thing when it comes to send/receive. You probably
restored a backup with send/receive, and made it read/write using "btrfs
property set -ts /home ro false". This is an evil thing, as it leaves
received_uuid intact. To make a subvolume read-write, I recommend using
"btrfs subvolume snapshot <ro-subvol> <rw-subvol>" instead.

There is a FAQ entry on btrbk on how to fix this:

https://github.com/digint/btrbk/blob/master/doc/FAQ.md#im-getting-an-error-aborted-received-uuid-is-set
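
To see whether a subvolume is affected, one can look at the "Received UUID:"
line of "btrfs subvolume show". The following is only a rough sketch: the
function names received_uuid and has_received_uuid are made up for
illustration, and the parsing assumes the output layout quoted earlier in this
thread, where an unset field is printed as "-".

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of btrfs-progs.

# Extract the "Received UUID:" value from `btrfs subvolume show` output
# supplied on stdin.
received_uuid() {
    awk -F':[ \t]*' '/^[ \t]*Received UUID:/ { print $2; exit }'
}

# Succeed (exit 0) when the output on stdin carries a real Received UUID,
# i.e. the field is present and is not the "-" placeholder.
has_received_uuid() {
    uuid=$(received_uuid)
    [ -n "$uuid" ] && [ "$uuid" != "-" ]
}
```

A possible use would be "btrfs subvolume show /home | has_received_uuid &&
echo 'received_uuid is set on /home'". A read-write subvolume that reports a
Received UUID this way is the case described above.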


On 2017-09-07 15:34, Dave wrote:
> I just ran a test. The btrfs send - receive problem I described is
> indeed fully resolved by removing the "problematic" snapshot on the
> target device. I did not make any changes to the source volume. I did
> not make any other changes in my steps (see earlier message for my
> exact steps).
> 
> Therefore, the problem I described in my earlier message is not due
> exclusively to having a Received UUID on the source volume (or to any
> other feature of the source volume). It is not related to any feature
> of the directly specified parent volume either. More details are
> included in my earlier email.
> 
> Thanks for any further feedback, including answers to my questions and
> comments about whether this is a known issue.
> 
> 
> On Thu, Sep 7, 2017 at 8:39 AM, Dave <davestechshop@gmail.com> wrote:
>>
>> Hello. Can anyone further explain this issue ("you have a Received UUID on the source volume")?
>>
>> How does it happen?
>> How does one remove a Received UUID from the source volume?
>>
>> And how does that explain my results where I showed that the problem
>> is not dependent upon the source volume but is instead dependent upon
>> some existing snapshot on the target volume?
>>
>> My results do not appear to be fully explained by a Received UUID on the source volume, as my prior message hopefully shows clearly.
>>
>> Thank you.
>>
>> On Thu, Sep 7, 2017 at 2:24 AM, A L <crimsoncottage@gmail.com> wrote:
>>> The problem can be that you have a Received UUID on the source volume. This breaks send-receive.
>>>
>>> ---- From: Dave <davestechshop@gmail.com> -- Sent: 2017-09-07 - 06:43 ----
>>>
>>>> Here is more info and a possible (shocking) explanation. This
>>>> aggregates my prior messages and it provides an almost complete set of
>>>> steps to reproduce this problem.
>>>>
>>>> Linux srv 4.9.41-1-lts #1 SMP Mon Aug 7 17:32:35 CEST 2017 x86_64 GNU/Linux
>>>> btrfs-progs v4.12
>>>>
>>>> My steps:
>>>>
>>>> [root@srv]# sync
>>>> [root@srv]# mkdir /home/.snapshots/test1
>>>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test1/
>>>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test1//home'
>>>> [root@srv]# sync
>>>> [root@srv]# mkdir /mnt/x5a/home/test1
>>>> [root@srv]# btrfs send /home/.snapshots/test1/home/ | btrfs receive
>>>> /mnt/x5a/home/test1/
>>>> At subvol /home/.snapshots/test1/home/
>>>> At subvol home
>>>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user1/
>>>> NOTE: all recent files are present
>>>> [root@srv]# ls -la /mnt/x5a/home/test1/home/user2/Documents/
>>>> NOTE: all recent files are present
>>>> [root@srv]# mkdir /home/.snapshots/test2
>>>> [root@srv]# mkdir /mnt/x5a/home/test2
>>>> [root@srv]# btrfs su sn -r /home/ /home/.snapshots/test2/
>>>> Create a readonly snapshot of '/home/' in '/home/.snapshots/test2//home'
>>>> [root@srv]# sync
>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
>>>> At subvol /home/.snapshots/test2/home/
>>>> At snapshot home
>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
>>>> NOTE: all recent files are MISSING
>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
>>>> NOTE: all recent files are MISSING
>>>>
>>>> Below I am including some rsync output to illustrate when a snapshot
>>>> is missing files (or not):
>>>>
>>>> [root@srv]# rsync -aniv /home/.snapshots/test1/home/ /home/.snapshots/test2/home/
>>>> sending incremental file list
>>>>
>>>> sent 1,143,286 bytes  received 1,123 bytes  762,939.33 bytes/sec
>>>> total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)
>>>>
>>>> This indicates that these two subvolumes contain the same files, which
>>>> they should because test2 is a snapshot of test1 without any changes
>>>> to files, and it was not sent to another physical device.
>>>>
>>>> The problem is when test2 is sent to another device as shown by the
>>>> rsync results below.
>>>>
>>>> [root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
>>>> sending incremental file list
>>>> .d..t...... ./
>>>> .d..t...... user1/
>>>> >f.st...... user1/.bash_history
>>>> >f.st...... user1/.bashrc
>>>> >f+++++++++ user1/test2017-09-06.txt
>>>> ...
>>>> and a long list of other missing files
>>>>
>>>> The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
>>>> missing all recent files (any files from the month of August or
>>>> September), as my prior visual inspections had indicated. The same
>>>> files are missing every time. There is no randomness to the missing
>>>> data.
>>>>
>>>> The problem does not happen for me if the receive command target is
>>>> located on the same physical device as shown next. (However, I suspect
>>>> there's more to it than that, as explained further below.)
>>>>
>>>> [root@srv]# mkdir /home/.snapshots/test2rec
>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/ /home/.snapshots/test2/home/ | btrfs receive /home/.snapshots/test2rec/
>>>> At subvol /home/.snapshots/test2/home/
>>>>
>>>> # rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
>>>> sending incremental file list
>>>>
>>>> sent 1,143,286 bytes  received 1,123 bytes  2,288,818.00 bytes/sec
>>>> total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)
>>>>
>>>> The above (as well as visual inspection of files) indicates that these
>>>> two subvolumes contain the same files, which was not the case when the
>>>> same command had a target located on another physical device. Of
>>>> course, a snapshot which resides on the same physical device is not a
>>>> very good backup. So I do need to send it to another device, but that
>>>> results in missing files when the -p or -c options are used with btrfs
>>>> send. (Non-incremental sending to another physical device does work.)
>>>>
>>>> I can think of a couple possible explanations.
>>>>
>>>> One is that there is a problem when using the -p or -c options with
>>>> btrfs send when the target is another physical device. I suspect this
>>>> is not the actual explanation, however.
>>>>
>>>> A second possibility is that the presence of prior existing snapshots
>>>> at the target location (even if old and not referenced in any current
>>>> btrfs command), can determine the outcome and final contents of an
>>>> incremental send operation. I believe the info below suggests this to
>>>> be the case.
>>>>
>>>> [root@srv]# btrfs su show /home/.snapshots/test2/home/
>>>> test2/home
>>>>         Name:                   home
>>>>         UUID:                   292e8bbf-a95f-2a4e-8280-129202d389dc
>>>>         Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>         Creation time:          2017-09-06 15:38:16 -0400
>>>>         Subvolume ID:           2000
>>>>         Generation:             5020
>>>>         Gen at creation:        5020
>>>>         Parent ID:              257
>>>>         Top level ID:           257
>>>>         Flags:                  readonly
>>>>         Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /mnt/x5a/home/test1/home
>>>> home/test1/home
>>>>         Name:                   home
>>>>         UUID:                   dc00b13d-f841-cf48-a169-aa61429a5679
>>>>         Parent UUID:            -
>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>         Creation time:          2017-09-06 15:33:45 -0400
>>>>         Subvolume ID:           656
>>>>         Generation:             777
>>>>         Gen at creation:        773
>>>>         Parent ID:              257
>>>>         Top level ID:           257
>>>>         Flags:                  readonly
>>>>         Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /mnt/x5a/home/test2/home/
>>>> home/test2/home
>>>>         Name:                   home
>>>>         UUID:                   b01ab63f-17a1-f442-b9d4-ed12a0d057ea
>>>>         Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>         Creation time:          2017-09-06 15:39:51 -0400
>>>>         Subvolume ID:           660
>>>>         Generation:             779
>>>>         Gen at creation:        779
>>>>         Parent ID:              257
>>>>         Top level ID:           257
>>>>         Flags:                  readonly
>>>>         Snapshot(s):
>>>>
>>>> [root@srv]# btrfs su show /home/.snapshots/test2rec/home/
>>>> test2rec/home
>>>>         Name:                   home
>>>>         UUID:                   bde1891d-1474-414f-b6ab-2a34c5af224e
>>>>         Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>         Creation time:          2017-09-06 17:36:19 -0400
>>>>         Subvolume ID:           2003
>>>>         Generation:             5027
>>>>         Gen at creation:        5027
>>>>         Parent ID:              257
>>>>         Top level ID:           257
>>>>         Flags:                  readonly
>>>>         Snapshot(s):
>>>>
>>>> Below, we have an old, almost forgotten snapshot (dated 2017-07-21) on
>>>> device /mnt/x5a/home with a Received UUID that matches the Received
>>>> UUID of the test snapshots that were newly created today. How? Why?
>>>>
>>>> [root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
>>>> home/107/snapshot
>>>>         Name:                   snapshot
>>>>         UUID:                   94d0bc47-dbf2-374e-b1c8-de06d729cde2
>>>>         Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>         Creation time:          2017-07-21 00:00:25 -0400
>>>>         Subvolume ID:           433
>>>>         Generation:             222
>>>>         Gen at creation:        221
>>>>         Parent ID:              257
>>>>         Top level ID:           257
>>>>         Flags:                  readonly
>>>>         Snapshot(s):
>>>>
>>>> If my guess is correct, btrfs has found this old snapshot and
>>>> referenced it without me telling it to do so. The result is that the
>>>> newly executed btrfs commands shown above have a totally unexpected
>>>> result.
>>>>
>>>> Today's new snapshot will not contain any files newer than 2017-07-21.
>>>> Is this a known issue?
>>>>
>>>> Refer back to the commands at the top of this message. I created a new
>>>> snapshot and did a full (non-incremental) send to the target location
>>>> (/mnt/x5a/home). Then I created a snapshot and did a send which only
>>>> referenced the prior snapshot created today. Nowhere did I reference
>>>> the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
>>>> this backup location -- it was intended to hold a lot of them.) Yet,
>>>> the very presence of /mnt/x5a/home/107/snapshot on the target device
>>>> resulted in today's backup (and all recent backups) being worthless
>>>> due to them missing all files since 2017-07-21.
>>>>
>>>> These results are totally repeatable, given my set of existing
>>>> backups. But it's bizarre to me. As I understand it, a staff person
>>>> could transfer a btrfs snapshot to a target volume and its mere
>>>> presence there could make all subsequent backups (incremental sends)
>>>> to that target volume invalid and useless. If that is true... wow.
>>>>
>>>> Another interesting observation is that the device that contains the
>>>> source snapshot, /home/.snapshots, also contains many, many prior
>>>> snapshots, going back to when this system was first set up. Why do
>>>> none of them cause a problem? Is it because I had never used
>>>> /home/.snapshots as the target of a receive operation (until I did so
>>>> today in testing the steps above)?
>>>>
>>>> As far as repeating these steps, all this was totally repeatable for
>>>> me as long as /mnt/x5a/home/107/snapshot existed on the target of the
>>>> receive command (/mnt/x5a/home/). I do not know how to create such a
>>>> "rogue" snapshot on purpose, but doing so may be key to reproducing my
>>>> results.
>>>>
>>>> Maybe somebody can explain to me what's really happening. How is it
>>>> possible that an old snapshot created 2017-07-21 could have the same
>>>> Received UUID as snapshots created today? And how could that fact lead
>>>> to the result I'm seeing, which seems very serious? (Unexpected
>>>> missing files from a backup which was completed without errors is
>>>> pretty serious in my book.)
>>>>
>>>> Most important question: how can we rely on automated incremental
>>>> backups with btrfs send | receive given what I'm observing here
>>>> (assuming my observations are roughly correct)?
>>>>
>>>> Here's more info just to confirm that my results are not due to
>>>> filesystem corruption.
>>>>
>>>> running check on unmounted volume that contains /mnt/x5a/home/test2/home:
>>>> [root@srv]# btrfs check -p /dev/mapper/x5a_luks
>>>> Checking filesystem on /dev/mapper/x5a_luks
>>>> UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>> checking extents [o]
>>>> checking free space cache [.]
>>>> checking fs roots [o]
>>>> checking csums
>>>> checking root refs
>>>> found 258178555904 bytes used, no error found
>>>> total csum bytes: 250354776
>>>> total tree bytes: 1752088576
>>>> total fs tree bytes: 1308540928
>>>> total extent tree bytes: 175161344
>>>> btree space waste bytes: 215594634
>>>> file data blocks allocated: 258634637312
>>>>  referenced 292888985600
>>>>
>>>> [root@srv]# btrfs fi show /mnt/x5a/
>>>> Label: 'x5a_top'  uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>>         Total devices 1 FS bytes used 240.45GiB
>>>>         devid    1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
>>>>
>>>> [root@srv]# btrfs fi df /mnt/x5a/
>>>> Data, single: total=239.01GiB, used=238.82GiB
>>>> System, DUP: total=32.00MiB, used=48.00KiB
>>>> Metadata, DUP: total=2.50GiB, used=1.63GiB
>>>> GlobalReserve, single: total=422.73MiB, used=0.00B
>>>>
>>>> # btrfs scrub status -d /mnt/x5a/
>>>> scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>> scrub device /dev/mapper/x5a_luks (id 1) history
>>>>         scrub started at Wed Sep  6 17:09:58 2017 and finished after 01:42:30
>>>>         total bytes scrubbed: 242.08GiB with 0 errors
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>

* Re: send | receive: received snapshot is missing recent files
  2017-09-07 14:33             ` Axel Burri
@ 2017-09-08  4:44               ` Dave
  2017-09-11 17:53                 ` Axel Burri
  0 siblings, 1 reply; 11+ messages in thread
From: Dave @ 2017-09-08  4:44 UTC (permalink / raw)
  To: linux-btrfs; +Cc: A L, Axel Burri

I'm referring to the link below. Using "btrfs subvolume snapshot -r"
copies the Received UUID from the source into the new snapshot. The
btrbk FAQ entry suggests otherwise. Has something changed?
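
A minimal sketch for checking this, assuming the output layout of
"btrfs subvolume show" matches what was pasted earlier in this thread.
The function name received_uuid is made up for illustration; it only
filters text on stdin and does not call btrfs itself.

```shell
#!/bin/sh
# Print the value of the "Received UUID" field from `btrfs subvolume show`
# output supplied on stdin. A value of "-" means no Received UUID is set.
# Assumption: the field is labelled exactly "Received UUID:" as in the
# listings quoted in this thread.
received_uuid() {
    sed -n 's/^[[:space:]]*Received UUID:[[:space:]]*//p'
}

# Possible usage (needs root and a real btrfs filesystem, so not run here):
#   btrfs subvolume show /home | received_uuid
#   btrfs subvolume show /home/.snapshots/test2/home | received_uuid
```

If both commands print the same non-"-" value, the read-only snapshot has
inherited the Received UUID from its source, which is the behaviour
described above.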

The only way I see to remove a Received UUID is to create a rw
snapshot (above command without the "-r"), which is not ideal in this
situation when cleaning up readonly source snapshots.
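
Before cleaning anything up, it may help to see which snapshots share a
Received UUID, since the listings earlier in this thread show several
snapshots carrying the same value. The following is only a sketch: the
function name shared_received_uuids and the "<received-uuid> <path>"
input format are assumptions for illustration, and paths are assumed to
contain no whitespace.

```shell
#!/bin/sh
# Read lines of "<received-uuid> <path>" on stdin and print each Received
# UUID that occurs more than once, followed by the paths that carry it.
# Lines whose UUID is "-" (unset) are ignored.
#
# One way to build the input by hand, for each snapshot path p (needs root):
#   printf '%s %s\n' \
#     "$(btrfs subvolume show "$p" | sed -n 's/^[[:space:]]*Received UUID:[[:space:]]*//p')" \
#     "$p"
shared_received_uuids() {
    awk '
        $1 != "-" { count[$1]++; paths[$1] = paths[$1] "  " $2 "\n" }
        END {
            for (u in count)
                if (count[u] > 1)
                    printf "%s\n%s", u, paths[u]
        }
    '
}
```

An empty result would mean no two listed snapshots share a Received UUID.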

Any suggestions? Thanks

On Thu, Sep 7, 2017 at 10:33 AM, Axel Burri <axel@tty0.ch> wrote:
>
> Having a received_uuid set on the source volume ("/home" in your case)
> is indeed a bad thing when it comes to send/receive. You probably
> restored a backup with send/receive, and made it read/write using "btrfs
> property set -ts /home ro false". This is an evil thing, as it leaves
> received_uuid intact. In order to make a subvolume read-write, I
> recommend using "btrfs subvolume snapshot <ro-subvol> <rw-subvol>".
>
> There is a FAQ entry on btrbk on how to fix this:
>
> https://github.com/digint/btrbk/blob/master/doc/FAQ.md#im-getting-an-error-aborted-received-uuid-is-set
>
>
> On 2017-09-07 15:34, Dave wrote:
> > I just ran a test. The btrfs send - receive problem I described is
> > indeed fully resolved by removing the "problematic" snapshot on the
> > target device. I did not make any changes to the source volume. I did
> > not make any other changes in my steps (see earlier message for my
> > exact steps).
> >
> > Therefore, the problem I described in my earlier message is not due
> > exclusively to having a Received UUID on the source volume (or to any
> > other feature of the source volume). It is not related to any feature
> > of the directly specified parent volume either. More details are
> > included in my earlier email.
> >
> > Thanks for any further feedback, including answers to my questions and
> > comments about whether this is a known issue.
> >
> >

* Re: send | receive: received snapshot is missing recent files
  2017-09-08  4:44               ` Dave
@ 2017-09-11 17:53                 ` Axel Burri
  2017-09-12  3:19                   ` Andrei Borzenkov
  0 siblings, 1 reply; 11+ messages in thread
From: Axel Burri @ 2017-09-11 17:53 UTC (permalink / raw)
  To: Dave, linux-btrfs; +Cc: A L

On 2017-09-08 06:44, Dave wrote:
> I'm referring to the link below. Using "btrfs subvolume snapshot -r"
> copies the Received UUID from the source into the new snapshot. The
> btrbk FAQ entry suggests otherwise. Has something changed?

I don't think anything has changed; the description of read-only
subvolumes in the btrbk FAQ was just wrong (fixed now).

> The only way I see to remove a Received UUID is to create a rw
> snapshot (above command without the "-r"), which is not ideal in this
> situation when cleaning up readonly source snapshots.
> 
> Any suggestions? Thanks

No suggestions on my part; as far as I know, there is no easy way to
remove or change the received_uuid of a subvolume.

As you mentioned, you can snapshot it twice:

# btrfs subvolume snapshot mysubvol mysubvol.rw
# btrfs subvolume delete mysubvol
# btrfs subvolume snapshot -r mysubvol.rw mysubvol
# btrfs subvolume delete mysubvol.rw

Instead of the second snapshot operation, this time you could also use
the (evil) command: "btrfs property set -ts mysnapshot ro true"
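
The four snapshot/delete steps above could be wrapped roughly as
follows. This is a sketch, not a tested tool: the function name
reset_received_uuid and the BTRFS variable are made up for illustration.
By default it only prints the commands (dry run); nothing is changed
unless BTRFS=btrfs is set explicitly.

```shell
#!/bin/sh
# Re-create a read-only subvolume via a temporary read-write snapshot, so
# that the result no longer carries the original received_uuid.
# BTRFS is a hypothetical override: unset means "echo btrfs" (dry run).
reset_received_uuid() {
    subvol=$1
    tmp="$subvol.rw"
    # Each step runs only if the previous one succeeded.
    ${BTRFS:-echo btrfs} subvolume snapshot "$subvol" "$tmp" &&
    ${BTRFS:-echo btrfs} subvolume delete "$subvol" &&
    ${BTRFS:-echo btrfs} subvolume snapshot -r "$tmp" "$subvol" &&
    ${BTRFS:-echo btrfs} subvolume delete "$tmp"
}

# Dry run, prints the four commands without executing them:
#   reset_received_uuid mysubvol
# Real run (as root, after checking the dry-run output):
#   BTRFS=btrfs reset_received_uuid mysubvol
```

Note that the re-created subvolume is a new snapshot with a new UUID, so
it likely cannot serve as the parent for backups that were sent from
the old one.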

>>>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/
>>>>>> /home/.snapshots/test2/home/ | btrfs receive /mnt/x5a/home/test2/
>>>>>> At subvol /home/.snapshots/test2/home/
>>>>>> At snapshot home
>>>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user1/
>>>>>> NOTE: all recent files are MISSING
>>>>>> [root@srv]# ls -la /mnt/x5a/home/test2/home/user2/Documents/
>>>>>> NOTE: all recent files are MISSING
>>>>>>
>>>>>> Below I am including some rsync output to illustrate when a snapshot
>>>>>> is missing files (or not):
>>>>>>
>>>>>> [root@srv]# rsync -aniv /home/.snapshots/test1/home/
>>>>>> /home/.snapshots/test2/home/
>>>>>> sending incremental file list
>>>>>>
>>>>>> sent 1,143,286 bytes  received 1,123 bytes  762,939.33 bytes/sec
>>>>>> total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)
>>>>>>
>>>>>> This indicates that these two subvolumes contain the same files, which
>>>>>> they should because test2 is a snapshot of test1 without any changes
>>>>>> to files, and it was not sent to another physical device.
>>>>>>
>>>>>> The problem is when test2 is sent to another device as shown by the
>>>>>> rsync results below.
>>>>>>
>>>>>> [root@srv]# rsync -aniv /home/.snapshots/test2/home/ /mnt/x5a/home/test2/home/
>>>>>> sending incremental file list
>>>>>> .d..t...... ./
>>>>>> .d..t...... user1/
>>>>>>> f.st...... user1/.bash_history
>>>>>>> f.st...... user1/.bashrc
>>>>>>> f+++++++++ user1/test2017-09-06.txt
>>>>>> ...
>>>>>> and a long list of other missing files
>>>>>>
>>>>>> The incrementally sent snapshot at /mnt/x5a/home/test2/home/ is
>>>>>> missing all recent files (any files from the month of August or
>>>>>> September), as my prior visual inspections had indicated. The same
>>>>>> files are missing every time. There is no randomness to the missing
>>>>>> data.
>>>>>>
>>>>>> The problem does not happen for me if the receive command target is
>>>>>> located on the same physical device as shown next. (However, I suspect
>>>>>> there's more to it than that, as explained further below.)
>>>>>>
>>>>>> [root@srv]# mkdir /home/.snapshots/test2rec
>>>>>> [root@srv]# btrfs send -p /home/.snapshots/test1/home/
>>>>>> /home/.snapshots/test2/home/ | btrfs receive
>>>>>> /home/.snapshots/test2rec/
>>>>>> At subvol /home/.snapshots/test2/home/
>>>>>>
>>>>>> # rsync -aniv /home/.snapshots/test2/home/ /home/.snapshots/test2rec/home/
>>>>>> sending incremental file list
>>>>>>
>>>>>> sent 1,143,286 bytes  received 1,123 bytes  2,288,818.00 bytes/sec
>>>>>> total size is 3,642,972,271  speedup is 3,183.28 (DRY RUN)
>>>>>>
>>>>>> The above (as well as visual inspection of files) indicates that these
>>>>>> two subvolumes contain the same files, which was not the case when the
>>>>>> same command had a target located on another physical device. Of
>>>>>> course, a snapshot which resides on the same physical device is not a
>>>>>> very good backup. So I do need to send it to another device, but that
>>>>>> results in missing files when the -p or -c options are used with btrfs
>>>>>> send. (Non-incremental sending to another physical device does work.)
>>>>>>
>>>>>> I can think of a couple possible explanations.
>>>>>>
>>>>>> One is that there is a problem when using the -p or -c options with
>>>>>> btrfs send when the target is another physical device. I suspect this
>>>>>> is not the actual explanation, however.
>>>>>>
>>>>>> A second possibility is that the presence of prior existing snapshots
>>>>>> at the target location (even if old and not referenced in any current
>>>>>> btrfs command), can determine the outcome and final contents of an
>>>>>> incremental send operation. I believe the info below suggests this to
>>>>>> be the case.
>>>>>>
>>>>>> [root@srv]# btrfs su show /home/.snapshots/test2/home/
>>>>>> test2/home
>>>>>>         Name:                   home
>>>>>>         UUID:                   292e8bbf-a95f-2a4e-8280-129202d389dc
>>>>>>         Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
>>>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>>>         Creation time:          2017-09-06 15:38:16 -0400
>>>>>>         Subvolume ID:           2000
>>>>>>         Generation:             5020
>>>>>>         Gen at creation:        5020
>>>>>>         Parent ID:              257
>>>>>>         Top level ID:           257
>>>>>>         Flags:                  readonly
>>>>>>         Snapshot(s):
>>>>>>
>>>>>> [root@srv]# btrfs su show /mnt/x5a/home/test1/home
>>>>>> home/test1/home
>>>>>>         Name:                   home
>>>>>>         UUID:                   dc00b13d-f841-cf48-a169-aa61429a5679
>>>>>>         Parent UUID:            -
>>>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>>>         Creation time:          2017-09-06 15:33:45 -0400
>>>>>>         Subvolume ID:           656
>>>>>>         Generation:             777
>>>>>>         Gen at creation:        773
>>>>>>         Parent ID:              257
>>>>>>         Top level ID:           257
>>>>>>         Flags:                  readonly
>>>>>>         Snapshot(s):
>>>>>>
>>>>>> [root@srv]# btrfs su show /mnt/x5a/home/test2/home/
>>>>>> home/test2/home
>>>>>>         Name:                   home
>>>>>>         UUID:                   b01ab63f-17a1-f442-b9d4-ed12a0d057ea
>>>>>>         Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>>>         Creation time:          2017-09-06 15:39:51 -0400
>>>>>>         Subvolume ID:           660
>>>>>>         Generation:             779
>>>>>>         Gen at creation:        779
>>>>>>         Parent ID:              257
>>>>>>         Top level ID:           257
>>>>>>         Flags:                  readonly
>>>>>>         Snapshot(s):
>>>>>>
>>>>>> [root@srv]# btrfs su show /home/.snapshots/test2rec/home/
>>>>>> test2rec/home
>>>>>>         Name:                   home
>>>>>>         UUID:                   bde1891d-1474-414f-b6ab-2a34c5af224e
>>>>>>         Parent UUID:            62418df6-a1f8-d74a-a152-11f519593053
>>>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>>>         Creation time:          2017-09-06 17:36:19 -0400
>>>>>>         Subvolume ID:           2003
>>>>>>         Generation:             5027
>>>>>>         Gen at creation:        5027
>>>>>>         Parent ID:              257
>>>>>>         Top level ID:           257
>>>>>>         Flags:                  readonly
>>>>>>         Snapshot(s):
>>>>>>
>>>>>> Below, we have an old, almost forgotten snapshot (date 2017-07-21) on
>>>>>> device /mnt/x5a/home with a Received UUID that matches the Received
>>>>>> UUID of test snapshots that were newly created today. How? Why?
>>>>>>
>>>>>> [root@thehulk home]# btrfs su show /mnt/x5a/home/107/snapshot
>>>>>> home/107/snapshot
>>>>>>         Name:                   snapshot
>>>>>>         UUID:                   94d0bc47-dbf2-374e-b1c8-de06d729cde2
>>>>>>         Parent UUID:            8bf40f97-10e0-9f47-a281-1a0b21bbbad0
>>>>>>         Received UUID:          e00d5318-6efd-824e-ac91-f25efa5c2a74
>>>>>>         Creation time:          2017-07-21 00:00:25 -0400
>>>>>>         Subvolume ID:           433
>>>>>>         Generation:             222
>>>>>>         Gen at creation:        221
>>>>>>         Parent ID:              257
>>>>>>         Top level ID:           257
>>>>>>         Flags:                  readonly
>>>>>>         Snapshot(s):
>>>>>>
>>>>>> If my guess is correct, btrfs has found this old snapshot and
>>>>>> referenced it without me telling it to do so. The result is that the
>>>>>> newly executed btrfs commands shown above have a totally unexpected
>>>>>> result.
>>>>>>
>>>>>> Today's new snapshot will not contain any files newer than 2017-07-21.
>>>>>> Is this a known issue?
>>>>>>
>>>>>> Refer back to the commands at the top of this message. I created a new
>>>>>> snapshot and did a full (non-incremental) send to the target location
>>>>>> (/mnt/x5a/home). Then I created a snapshot and did a send which only
>>>>>> referenced the prior snapshot created today. Nowhere did I reference
>>>>>> the ancient /mnt/x5a/home/107/snapshot. (Many prior snapshots exist at
>>>>>> this backup location -- it was intended to hold a lot of them.) Yet,
>>>>>> the very presence of /mnt/x5a/home/107/snapshot on the target device
>>>>>> resulted in today's backup (and all recent backups) being worthless
>>>>>> due to them missing all files since  2017-07-21.
>>>>>>
>>>>>> These results are totally repeatable, given my set of existing
>>>>>> backups. But it's bizarre to me. As I understand it, a staff person
>>>>>> could transfer a btrfs snapshot to a target volume and its mere
>>>>>> presence there could make all subsequent backups (incremental sends)
>>>>>> to that target volume invalid and useless. If that is true... wow.
>>>>>>
>>>>>> Another interesting observation is that the device that contains the
>>>>>> source snapshot, /home/.snapshots, also contains many, many prior
>>>>>> snapshots, going back to when this system was first set up. Why do
>>>>>> none of them cause a problem? Is it because I had never used
>>>>>> /home/.snapshots as the target of a receive operation (until I did so
>>>>>> today in testing the steps above)?
>>>>>>
>>>>>> As far as repeating these steps, all this was totally repeatable for
>>>>>> me as long as /mnt/x5a/home/107/snapshot existed on the target of the
>>>>>> receive command (/mnt/x5a/home/). I do not know how to create such a
>>>>>> "rogue" snapshot on purpose, but doing so may be key to reproducing my
>>>>>> results.
>>>>>>
>>>>>> Maybe somebody can explain to me what's really happening. How is it
>>>>>> possible that an old snapshot created  2017-07-21 could have the same
>>>>>> Received UUID as snapshots created today? And how could that fact lead
>>>>>> to the result I'm seeing, which seems very serious. (Unexpected
>>>>>> missing files from a backup which was completed without errors is
>>>>>> pretty serious in my book.)
>>>>>>
>>>>>> Most important question: how can we rely on automated incremental
>>>>>> backups with btrfs send | receive given what I'm observing here
>>>>>> (assuming my observations are roughly correct)?
>>>>>>
>>>>>> Here's more info just to confirm that my results are not due to
>>>>>> filesystem corruption.
>>>>>>
>>>>>> running check on unmounted volume that contains /mnt/x5a/home/test2/home:
>>>>>> [root@srv]# btrfs check -p /dev/mapper/x5a_luks
>>>>>> Checking filesystem on /dev/mapper/x5a_luks
>>>>>> UUID: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>>>> checking extents [o]
>>>>>> checking free space cache [.]
>>>>>> checking fs roots [o]
>>>>>> checking csums
>>>>>> checking root refs
>>>>>> found 258178555904 bytes used, no error found
>>>>>> total csum bytes: 250354776
>>>>>> total tree bytes: 1752088576
>>>>>> total fs tree bytes: 1308540928
>>>>>> total extent tree bytes: 175161344
>>>>>> btree space waste bytes: 215594634
>>>>>> file data blocks allocated: 258634637312
>>>>>>  referenced 292888985600
>>>>>>
>>>>>> [root@srv]# btrfs fi show /mnt/x5a/
>>>>>> Label: 'x5a_top'  uuid: 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>>>>         Total devices 1 FS bytes used 240.45GiB
>>>>>>         devid    1 size 4.55TiB used 244.07GiB path /dev/mapper/x5a_luks
>>>>>>
>>>>>> [root@srv]# btrfs fi df /mnt/x5a/
>>>>>> Data, single: total=239.01GiB, used=238.82GiB
>>>>>> System, DUP: total=32.00MiB, used=48.00KiB
>>>>>> Metadata, DUP: total=2.50GiB, used=1.63GiB
>>>>>> GlobalReserve, single: total=422.73MiB, used=0.00B
>>>>>>
>>>>>> # btrfs scrub status -d /mnt/x5a/
>>>>>> scrub status for 724f7cc1-41d8-456f-9fab-7ace457bd62a
>>>>>> scrub device /dev/mapper/x5a_luks (id 1) history
>>>>>>         scrub started at Wed Sep  6 17:09:58 2017 and finished after 01:42:30
>>>>>>         total bytes scrubbed: 242.08GiB with 0 errors
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>>


* Re: send | receive: received snapshot is missing recent files
  2017-09-11 17:53                 ` Axel Burri
@ 2017-09-12  3:19                   ` Andrei Borzenkov
  2017-09-13 16:52                     ` Dave
  0 siblings, 1 reply; 11+ messages in thread
From: Andrei Borzenkov @ 2017-09-12  3:19 UTC (permalink / raw)
  To: Axel Burri, Dave, linux-btrfs; +Cc: A L

11.09.2017 20:53, Axel Burri wrote:
> On 2017-09-08 06:44, Dave wrote:
>> I'm referring to the link below. Using "btrfs subvolume snapshot -r"
>> copies the Received UUID from the source into the new snapshot. The
>> btrbk FAQ entry suggests otherwise. Has something changed?
> 
> I don't think something has changed, the description for the read-only
> subvolumes on the btrbk FAQ was just wrong (fixed now).
> 
>> The only way I see to remove a Received UUID is to create a rw
>> snapshot (above command without the "-r"), which is not ideal in this
>> situation when cleaning up readonly source snapshots.
>>
>> Any suggestions? Thanks
> 
> No suggestions from my part, as far as I know there is no way to easily
> remove/change a received_uuid from a subvolume.
> 

There is the BTRFS_IOC_SET_RECEIVED_SUBVOL ioctl, which is used by "btrfs
receive". My understanding is that it can also be set to empty (thus
clearing it). You could write a small program to do it.

In general it sounds like a bug: removing the read-only flag from a
subvolume by any means should also clear the Received UUID, as we can no
longer guarantee that the subvolume content is the same.
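
Until that is addressed, the inconsistent state described above (a writable subvolume that still carries a Received UUID) can at least be detected. Below is a minimal sketch, assuming "btrfs subvolume show" prints "Flags: readonly" for read-only subvolumes, as in the outputs quoted in this thread, and some other value such as "-" otherwise. The function name writable_with_received_uuid is hypothetical; it only parses text on stdin.

```shell
#!/bin/sh
# Hypothetical check (not part of btrfs-progs): exit 0 when the
# `btrfs subvolume show` output on stdin describes a subvolume that is
# NOT read-only but still has a Received UUID set; exit 1 otherwise.
# Assumption: unset fields are printed as "-" and a read-only subvolume
# shows "Flags: readonly".
writable_with_received_uuid() {
    awk '
        $1 == "Received" && $2 == "UUID:" { uuid = $3 }
        $1 == "Flags:"                    { flags = $2 }
        END {
            has_uuid = (uuid != "" && uuid != "-")
            ro       = (flags == "readonly")
            exit ((has_uuid && !ro) ? 0 : 1)
        }
    '
}
```

For example, "btrfs subvolume show /home | writable_with_received_uuid && echo suspect" would flag a source volume like the "/home" discussed earlier in this thread.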


* Re: send | receive: received snapshot is missing recent files
  2017-09-12  3:19                   ` Andrei Borzenkov
@ 2017-09-13 16:52                     ` Dave
  0 siblings, 0 replies; 11+ messages in thread
From: Dave @ 2017-09-13 16:52 UTC (permalink / raw)
  To: Andrei Borzenkov; +Cc: Axel Burri, linux-btrfs, A L

On Mon, Sep 11, 2017 at 11:19 PM, Andrei Borzenkov <arvidjaar@gmail.com> wrote:
> 11.09.2017 20:53, Axel Burri wrote:
>> On 2017-09-08 06:44, Dave wrote:
>>> I'm referring to the link below. Using "btrfs subvolume snapshot -r"
>>> copies the Received UUID from the source into the new snapshot. The
>>> btrbk FAQ entry suggests otherwise. Has something changed?
>>
>> I don't think something has changed, the description for the read-only
>> subvolumes on the btrbk FAQ was just wrong (fixed now).
>>
>>> The only way I see to remove a Received UUID is to create a rw
>>> snapshot (above command without the "-r"), which is not ideal in this
>>> situation when cleaning up readonly source snapshots.
>>>
>>> Any suggestions? Thanks
>>
>> No suggestions from my part, as far as I know there is no way to easily
>> remove/change a received_uuid from a subvolume.
>>
>
> There is the BTRFS_IOC_SET_RECEIVED_SUBVOL ioctl, which is used by "btrfs
> receive". My understanding is that it can also be set to empty (thus
> clearing it). You could write a small program to do it.
>
> In general it sounds like a bug: removing the read-only flag from a
> subvolume by any means should also clear the Received UUID, as we can no
> longer guarantee that the subvolume content is the same.

Yes! That makes a great deal of sense.


Thread overview: 11+ messages
2017-09-06  5:37 send | receive: received snapshot is missing recent files Dave
     [not found] ` <CAH=dxU7RM7s+pxT=wxE9WcUNMWjSG_A0=1pUWD1dWGVQ6g+g8Q@mail.gmail.com>
2017-09-06 19:46   ` Dave
2017-09-07  4:43     ` Dave
2017-09-07  6:24       ` A L
2017-09-07 12:39         ` Dave
2017-09-07 13:34           ` Dave
2017-09-07 14:33             ` Axel Burri
2017-09-08  4:44               ` Dave
2017-09-11 17:53                 ` Axel Burri
2017-09-12  3:19                   ` Andrei Borzenkov
2017-09-13 16:52                     ` Dave
