linux-btrfs.vger.kernel.org archive mirror
From: Andrei Borzenkov <arvidjaar@gmail.com>
To: Dennis K <dennisk@netspace.net.au>, linux-btrfs@vger.kernel.org
Subject: Re: Incremental receive completes succesfully despite missing files
Date: Sun, 20 Jan 2019 17:04:00 +0300
Message-ID: <bfdab6f0-e282-47b9-911b-30cee765d043@gmail.com>
In-Reply-To: <110c46c8-6fe9-84ea-0f4e-8269fd8000ed@netspace.net.au>

20.01.2019 13:25, Dennis K wrote:
> Apologies in advance, if the issue I put forward is actually the
> intended behavior of BTRFS.
> 
> I have noted while playing with sub-volumes, and trying to determine
> what exactly are the requirements for a subvolume to act as a legitimate
> parent during a receive operation, that modification of one subvolume,
> can affect children subvolumes that are received.
> 
> It's possible I have noted this before, when directories which I thought
> should have existed in the destination volume were not present,
> despite being present in the snapshot at the sending end.  (i.e., a
> subvolume is sent incrementally, but the received subvolume is missing
> files that exist on the sent side).
> 
> I can replicate this as follows
> 
> Create the subvolumes and put some files in them.
> # btrfs sub create 1
> # btrfs sub create 2
> # cd 1
> # dd if=/dev/urandom bs=1M count=10 of=test
> # cd ..
> # btrfs sub snap 1 2

Apparently some command is missing here.

> # dd if=/dev/urandom bs=1M count=1 of=test2

This creates test2 outside of subvolumes 1 or 2.

> # cd ..
> 

And this goes one level up, so the next commands are invalid (they
assume you are still in the direct parent of 1 and 2).

Also, I do not see what purpose your "btrfs sub snap" serves. It creates
snapshot 2/1, but this snapshot is not part of the replication anyway.

> Now set as read-only to send.  Subvolume 1 has the file "test", and
> subvolume 2 has the files "test" and "test2".
> # btrfs prop set 1 ro true
> # btrfs prop set 2 ro true
> 
> Send, snapshot 2 is an incremental send.  The files created are the
> expected sizes.
> # btrfs send 1 -f /tmp/1
> # btrfs send -p 1 2 -f /tmp 2
> 

That must be a typo; the text that follows implies /tmp/2. Never
type commands in manually; always copy and paste them (or record them
using script or similar and attach the exact recording). Otherwise
nothing in your report can be trusted.
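
One way to get an exact recording without retyping anything, as a minimal sketch: save the steps to a file and run that file with tracing enabled. The helper name `record_repro` and the file names `repro.sh` and `repro.log` are hypothetical and only serve the illustration; script(1) is an alternative for interactive sessions.

```shell
# Hypothetical helper: run the saved steps with tracing enabled so that each
# command and its output end up in one log file.
record_repro() {
    # $1 = file containing the reproduction steps, $2 = log file to attach
    sh -x "$1" > "$2" 2>&1
}

# Example: record_repro repro.sh repro.log
```

The resulting log shows every command exactly as it was executed, followed by whatever it printed, so typos such as a stray space in a path become visible.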

> Now we make subvolume one read-write
> # btrfs prop set 1 ro false

At this point all bets are off.

> # rm 1/test
> 

Now subvolume 1 no longer matches the state that was used to generate
the incremental stream.

> Delete subvolume 2 and then recreate it by receiving it.
> # btrfs sub del 2
> # btrfs receive -f /tmp/2 .
> 
> What happens, is that subvolume 2 is created, but it is missing the file
> 'test' which was present in subvolume 1 at the time it was created as a
> snapshot and sent.  It now only contains the file "test2", which is NOT
> the state that it was sent.
> 

That is correct. /tmp/2 contains only the *incremental* replication
stream: instructions for turning base subvolume 1 into subvolume 2.
It does *not* contain the full content of (the replica of) subvolume 2,
because the receiving side first clones its replica of subvolume 1 and
then applies the changes from the replication stream.
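
The effect can be illustrated with a toy model that needs no btrfs at all. This is only a sketch under loud assumptions: plain directories stand in for subvolumes, a directory of added files stands in for the incremental stream, and `toy_receive` is a hypothetical name. It is not how btrfs implements receive; it only mirrors the "clone the base, then apply changes" order described above.

```shell
# Toy model only: plain directories stand in for subvolumes, and the "delta"
# directory stands in for the incremental stream. Not real btrfs behaviour.
work=$(mktemp -d)

mkdir "$work/base"                   # plays the role of subvolume 1
echo data > "$work/base/test"

# "Incremental stream": only what subvolume 2 adds on top of the base.
mkdir "$work/delta"
echo more > "$work/delta/test2"

# Toy receive: clone the current base, then apply the delta on top of it.
toy_receive() {
    cp -R "$work/base" "$work/$1"
    cp -R "$work/delta/." "$work/$1/"
}

toy_receive good                     # base untouched at receive time
rm "$work/base/test"                 # base modified after the "send"
toy_receive bad                      # same delta, different starting point

ls "$work/good"                      # lists test and test2
ls "$work/bad"                       # lists only test2
```

The delta is identical in both cases; only the state of the base at receive time differs, which is exactly what went wrong in the report.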

> 
> Note the same results are obtained, if you also delete subvolume 1 and
> then recreate it with btrfs-receive.
> 
> This may explain why I previously found that a send operation resulted
> in the receiving end missing files.
> 
> I understand that during send/receive, a snapshot is taken of the parent
> subvolume, then it is modified.  The problem is that if that snapshot is
> modified, then these modifications will affect the received subvolumes,
> including, in this case, silent data loss.
> 

I am not sure I parse this part correctly, but in your case you
intentionally modified the base subvolume and made btrfs apply changes
to the wrong initial state. This is the classic case of "doctor, it
hurts when I stab myself in the eye".

> 
> It would be better for the receive operation to fail, or at least put
> out a warning if the parent subvolume it is using has changed or is
> different from the reference subvolume used during send. 

I was honestly surprised that btrfs receive did not refuse to apply
changes to a read-write subvolume. Beyond that, a replication stream is
normally applied on the receiving side, which simply does not have
enough information to check that the *source* was not changed. The
destination only knows the UUID of the parent snapshot and assumes it
was not changed.

Personally, I consider the ability to flip the read-only bit a major
usability issue, and it leads to the problems you observed.
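
Until receive itself refuses, a wrapper can at least check the flag first. The following is a minimal sketch, not part of btrfs-progs: `parent_is_readonly` and `guarded_receive` are hypothetical names, and the caller is assumed to know which local subvolume serves as the parent. It relies only on `btrfs property get -ts <path> ro`, which prints `ro=true` or `ro=false`.

```shell
# Sketch: refuse an incremental receive when the local parent is writable.
# `btrfs property get -ts <path> ro` prints "ro=true" or "ro=false".
parent_is_readonly() {
    [ "$(btrfs property get -ts "$1" ro 2>/dev/null)" = "ro=true" ]
}

# Hypothetical wrapper; $1 = local parent subvolume, $2 = stream file,
# $3 = directory to receive into.
guarded_receive() {
    if ! parent_is_readonly "$1"; then
        echo "refusing: parent $1 is not read-only" >&2
        return 1
    fi
    btrfs receive -f "$2" "$3"
}

# Example: guarded_receive ./1 /tmp/2 .
```

Note the limit of this check: it only sees the current value of the flag. A parent that was made read-write, modified, and then set read-only again would pass, so it catches the scenario in this report but not every modification of the base.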

> I'm not sure
> whether BTRFS can check this via generation number or some other data,
> but at the moment, there is no such check and this appears to be a bug.
> 
> Is this correct behaviour?  Does BTRFS rely on the user, and user-space
> tools, never changing any subvolume in order to avoid silent data loss?
> 

Yes.
