From: Dennis Katsonis <dennisk@netspace.net.au>
To: Chris Murphy <lists@colorremedies.com>
Cc: Hans van Kranenburg <Hans.van.Kranenburg@mendix.com>,
	Nikolay Borisov <nborisov@suse.com>,
	Andrei Borzenkov <arvidjaar@gmail.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	David Sterba <dsterba@suse.cz>
Subject: Re: Incremental receive completes succesfully despite missing files
Date: Sun, 27 Jan 2019 12:58:50 +1100
Message-ID: <dd80ddc0-a88b-af50-2e5b-f30602ecfabf@netspace.net.au>
In-Reply-To: <CAJCQCtSgO4heWumhQMsp_ZRc0o79EfcvniXSoHcPyO9D=cgP2A@mail.gmail.com>

On 1/27/19 10:09 AM, Chris Murphy wrote:
> On Fri, Jan 25, 2019 at 7:43 PM Dennis Katsonis <dennisk@netspace.net.au> wrote:
>>
>> On 1/25/19 4:22 AM, Chris Murphy wrote:
>>> On Thu, Jan 24, 2019 at 3:40 AM Dennis K <dennisk@netspace.net.au> wrote:
>>>>
>>>> The fact is, this thread is the first time I've seen explicitly written
>>>> that parents must be the same at receiving and sending ends, or else
>>>> btrfs-send/receive will produce a subvolume which differs from the source.
>>>
>>> The central user error, as well as the btrfs-progs bug, is the failure
>>> to meet the requirement that the source(s) be snapshots. Either a full
>>> send, or an incremental send, whether with -p or -c, all of them must
>>> be snapshots. And none of yours were snapshots. They were read-only
>>> subvolumes made using 'btrfs property' to set the read-only flag, and
>>> those are not snapshots.
>>
>> Doesn't matter even if you only send snapshots and never set a subvolume
>> to ro to please btrfs-send.
>>
>> # btrfs sub create 1
>> # touch 1/file1
>> # btrfs sub snap -r 1 2
>> # btrfs send 2 | btrfs receive /destination
>> !# sudo btrfs prop set /destination/2/ ro false
>> !# rm /destination/2/file1
>> # touch 1/file2
>> # sudo btrfs sub snap -r 1 3
>> # btrfs send -p 2 3 | btrfs receive /storage2/
>> # ls /storage2/3/
>> file2
>> # ls 3/
>> file1 file2
>>
>> The lines with the '!' are the source of the trouble, but it doesn't
>> have to be rm.  You could modify file 1 instead.  It is obvious that
>> users shouldn't do that, AFTER you've run through this sequence.
>>
>> Note that snapshot 2 at the receiving end was never set back to ro.
> 
> You're just finding more bugs (technically they are missing features,
> as checking for user error is a feature); and I'm noticing there are
> zero warnings in the man page for 'btrfs property' about unsetting ro
> on snapshots. I agree with others that as soon as the ro flag is unset
> on a snapshot, whatever metadata that makes it a snapshot and
> references its parent, including both parent UUID and received UUIDs,
> should be removed.
> 

Early on, when I started using btrfs, I assumed that the generation
numbers of the sent and received subvolumes were initially the same, and
that these numbers were compared to ensure the parents hadn't changed
relative to each other.
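
Since no such comparison seems to happen, a wrapper script could at
least refuse to use a destination parent whose ro flag is currently
unset.  This is only a rough sketch: parent_is_readonly is a made-up
helper name, and it relies on 'btrfs property get -ts <path> ro'
printing "ro=true" for a read-only subvolume.  It will not catch a
parent that was made writable, modified, and then flipped back to ro.

```shell
# Hypothetical guard: succeed only if the subvolume at $1 currently has
# its read-only flag set.  Assumes 'btrfs property get -ts PATH ro'
# prints exactly "ro=true" or "ro=false".
parent_is_readonly() {
    [ "$(btrfs property get -ts "$1" ro 2>/dev/null)" = "ro=true" ]
}

# Possible use before an incremental receive (paths from the example
# in this thread):
#   parent_is_readonly /destination/2 || echo "parent is writable, stop" >&2
```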



>> You can also add an additional "rm /destination/snap2/file1" or even
>> "rm -rf /destination/snap2/*" before the last rsync, and it still gives
>> you a full replication.
>>
>> The expectation after running rsync without any explicit additional
>> inclusion/exclusion clauses is replication, which is what it provides.
> 
> Not exactly. From man rsync:
>>>>
>        Rsync finds files that need to be transferred using a "quick
> check" algorithm (by default) that looks for files that have changed
> in size or in last-modified time. Any changes in the other preserved
> attributes (as requested by options) are made on the destination file
> directly when the quick check indicates that the file’s data does not
> need to be updated.
>>>>
> 
> If you initially replicate a destination using -r, the destination
> files have *current* time stamps. Each additional -r causes
> replication because the modification times for the source and
> destination are different. If you initially use 'rsync -a' and then
> follow it up with rsync -r, you'll see that the destination inodes
> remain the same, no copy happens.
> 
> Anyway, if you're saying that rsync will at *worst* do an unnecessary
> file copy, that's true and hence why -c exists; whereas with btrfs
> send receive you can apparently successfully sabotage incremental send
> receive by tampering with the receive side snapshot using 'btrfs
> property' to unset ro, and remove a file; and a send/receive will not
> make a subsequent receive side snapshot identical to the send side
> source.
> 

Yes.  The worst case is that it takes longer, there is more disk and/or
network activity, and files that appear to have been sent already are
sent again.  But no matter what interpretation the user has, or whether
this is the result of accidental or deliberate action, or even a
previously aborted run, the end result after success is returned is a
duplicated file tree, which is what rsync is 'contracted' to do.

This isn't quite the case with send/receive.
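
Until receive itself fails in this situation, the only way I can see
to be sure is to compare the two trees after the fact instead of
trusting the exit status.  A minimal sketch, assuming both snapshots
are reachable from the same host; trees_match is a made-up helper name,
and 'diff -r' compares only file names and contents, not ownership,
permissions or timestamps.

```shell
# Hypothetical helper: succeed only if the directory trees at $1 and $2
# contain the same file names with the same contents.
trees_match() {
    diff -r "$1" "$2" >/dev/null 2>&1
}

# Possible use after the incremental receive shown earlier in this
# thread:
#   trees_match 3 /storage2/3 || echo "received snapshot differs" >&2
```

In the sequence quoted above this would report a difference, because
file1 exists in 3 but not in /storage2/3.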


> 
> 
> [chris@flap ~]$ ls -li test1/
> total 1024
> 5185860 -rw-rw-r--. 1 chris chris 524288 Jan 26 15:55 file3
> 5185861 -rw-rw-r--. 1 chris chris 524288 Jan 26 15:55 file4
> [chris@flap ~]$ ls -li test2/
> total 1024
> 5185866 -rw-rw-r--. 1 chris chris 524288 Jan 26 15:55 file3
> 5185867 -rw-rw-r--. 1 chris chris 524288 Jan 26 15:55 file4
> [chris@flap ~]$ rsync -rv test1/ test2/
> sending incremental file list
> sent 77 bytes  received 12 bytes  178.00 bytes/sec
> total size is 1,048,576  speedup is 11,781.75
> [chris@flap ~]$ ls -li test1/
> total 1024
> 5185860 -rw-rw-r--. 1 chris chris 524288 Jan 26 15:55 file3
> 5185861 -rw-rw-r--. 1 chris chris 524288 Jan 26 15:55 file4
> [chris@flap ~]$ ls -li test2/
> total 1024
> 5185866 -rw-rw-r--. 1 chris chris 524288 Jan 26 15:55 file3
> 5185867 -rw-rw-r--. 1 chris chris 524288 Jan 26 15:55 file4
> [chris@flap ~]$
> 
> 
> 
>>> I'm still really not following where your confusion stems from, and
>>> therefore I'm not sure what needs fixing other than the items I've
>>> already mentioned - which itself at least would have stopped you in
>>> your tracks, to go dig deeper or ask questions before arriving at the
>>> understandably confusing results you were getting.
>>>
>>
>> From the man page
>>
>> "       btrfs receive will fail in the following cases:
>>
>>         1. receiving subvolume already exists
>>
>>         2. previously received subvolume has been changed after it was
>> received
>>
>>         3. default subvolume has changed or you didn’t mount the
>> filesystem at the toplevel subvolume
>>
>>        A subvolume is made read-only after the receiving process
>> finishes successfully (see BUGS below).
>> "
>>
>> Point 2 is what you are referring to, but it doesn't fail.  It reports
>> success and doesn't return an error code to the shell.  So the
>> documentation could be improved by reflecting this fact, which can avoid
>> users assuming that the snapshots at the receiving end must be the same
>> because it didn't fail.
> 
> The bug is that it doesn't fail. I consider the documentation correct,
> but the command behavior is wrong.
> 
> 
>> Also, the btrfs-send man page could be improved, by including parents as
>> well as clone sources as being required to be the same on both ends.
> 
> man btrfs send does say this for clones, but not for parents. And I
> agree that the documentation needs to be more clear. Someone needs to
> volunteer to make the changes and submit a patch.
> 
>>>>
>        You must not specify clone sources unless you guarantee that
> these snapshots are exactly in the same state on both sides—both for
> the sender and the receiver.
>>>>
> 
> 


