* btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
@ 2015-11-22 21:59 Nils Steinger
  2015-11-23  5:49 ` Duncan
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Nils Steinger @ 2015-11-22 21:59 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1021 bytes --]

Hi,

I recently ran into a problem while trying to back up some of my btrfs
subvolumes over the network:
`btrfs send` works flawlessly on snapshots of most subvolumes, but keeps
failing on snapshots of a certain subvolume — always after sending 15 GiB:

btrfs send /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT |
pv | ssh kappa "btrfs receive /mnt/300gb/backups/snapshots/zeta/home/"
At subvol /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
At subvol 2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
ERROR: send ioctl failed with -2: No such file or directory
  15GB 0:34:34 [7,41MB/s]

I've tried piping the output to /dev/null instead of ssh and got the
same error (again after sending 15 GiB), so this seems to be on the
sending side.

However, btrfs scrub reports no errors and I don't get any messages in
dmesg when the btrfs send fails.

What could cause this kind of error?
And is there a way to fix it, preferably without recreating the FS?


Regards,
Nils Steinger



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-22 21:59 btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors Nils Steinger
@ 2015-11-23  5:49 ` Duncan
  2015-11-23 12:26 ` Austin S Hemmelgarn
  2015-11-24 21:11 ` Filipe Manana
  2 siblings, 0 replies; 16+ messages in thread
From: Duncan @ 2015-11-23  5:49 UTC (permalink / raw)
  To: linux-btrfs

Nils Steinger posted on Sun, 22 Nov 2015 22:59:36 +0100 as excerpted:

> I recently ran into a problem while trying to back up some of my btrfs
> subvolumes over the network:
> `btrfs send` works flawlessly on snapshots of most subvolumes, but keeps
> failing on snapshots of a certain subvolume — always after sending 15
> GiB:
> 
> btrfs send /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT |
> pv | ssh kappa "btrfs receive /mnt/300gb/backups/snapshots/zeta/home/"
> At subvol /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
> At subvol 2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
> ERROR: send ioctl failed with -2: No such file or directory
>   15GB 0:34:34 [7,41MB/s]
> 
> I've tried piping the output to /dev/null instead of ssh and got the
> same error (again after sending 15 GiB), so this seems to be on the
> sending side.
> 
> However, btrfs scrub reports no errors and I don't get any messages in
> dmesg when the btrfs send fails.
> 
> What could cause this kind of error?
> And is there a way to fix it, preferably without recreating the FS?

Btrfs scrub?  Why do you believe it will detect or fix the problem?  Do 
you have reason to believe the hardware is unreliable and returning data 
different from what was saved in the first place, or that your RAM is 
bad and thus that the checksums recorded for the data and metadata were 
unreliable from the moment they were saved?

Because what btrfs scrub does is very simple.  It checks that the data 
and metadata on the filesystem still produce checksums that match the 
checksums recorded when that data/metadata and the checksums covering it 
were originally saved.  If the checksums match, scrub reports no problems.

But what scrub does NOT detect are problems that occurred before the 
data and metadata were saved.  If you downloaded a jpeg image, for 
instance, and it was corrupted in the download, but the data you got was 
saved to btrfs exactly as you received it, scrub won't report it as 
invalid, because the checksum was taken on data that was already 
invalid.  But if the file was correct as downloaded and saved, and the 
physical device hosting the btrfs is going bad, so that it returns 
different data for that file than what was originally saved, then the 
checksum taken on the data before it was saved isn't going to match what 
you're getting back, and /that/ error would be detected.

So what btrfs scrub detects (and, under all but single and raid0 modes, 
potentially corrects, using either the redundant copy of dup or raid1/10 
modes or the parity cross-checks of raid5/6 modes) is a very limited 
subset of potential errors: generally, only that the data that was 
written still matches the checksum written for it when it is read back.  
It won't detect other problems, such as a bug in btrfs itself that makes 
it checksum and write the wrong data, or data that was otherwise invalid 
before it was checksummed and written in the first place (as with the 
example of the jpeg corrupted during download).

What you almost certainly want to run instead is btrfs check, since it 
actually checks for various other filesystem-level problems.  (The 
recommendation is not to run it with the --repair option until you know 
what errors it reports in the default check-but-don't-fix mode, and 
until you know that --repair will actually fix them, generally after 
posting the check-only results here and getting confirmation that 
--repair will handle the problems properly.)
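As a sketch of that order of operations (the device and mountpoint are placeholders, and the function only *prints* the plan rather than running anything, since --repair should never be run before the read-only report has been reviewed):

```shell
#!/bin/sh
# Print (do not run) the recommended check-before-repair sequence.
# /mnt/btrfs and the device argument are placeholders for your setup.
print_check_plan() {
    dev=$1
    cat <<EOF
umount /mnt/btrfs        # btrfs check wants the filesystem unmounted
btrfs check $dev         # default mode: report only, changes nothing
# post the report to linux-btrfs; only after confirmation there:
# btrfs check --repair $dev
EOF
}
```

For example, `print_check_plan /dev/sdX` prints the sequence for review before any of it is typed in anger.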


However, note that just because send is failing doesn't mean check will 
actually find something wrong.  It might, or it might not.  The general 
send/receive situation is as follows:

If both send and receive complete successfully, you can be quite 
confident that you have a faithfully reproduced copy.  However, there are 
various corner-cases that send/receive may still have problems with, 
although over time the ones found have been fixed to work correctly.

Here's a very simple example that was one of the first such corner-cases 
fixed.  Suppose you have a subvolume that originally has two directories, 
A and B, with B nested inside A such that B is a subdir of A.  That's 
what you do your original send/receive based on.  Then, sometime later, 
you decide B should be the outer directory, with A nested inside it.  
Then, you do another send/receive, this one incremental, using the first 
one as the parent. That reversed nesting order corner-case used to trip 
up send/receive, which didn't originally know how to deal with that 
case.  But as I said, that was one of the first corner-case breakages 
found, and a patch soon taught send/receive how to deal with it properly.

But there have been a number of other similar corner-case failures, 
generally more complex than that one.  As they've been found they've been 
fixed.

The problem, however, is that as a dev you never really know that you've 
found and fixed *ALL* of them: as you fix the most common ones, the 
remaining corner-case failures become less and less common, and you can 
never tell whether there are more that are simply too rare for people to 
have hit and reported yet, or whether you've really caught them all.


But, again, it's worth noting that the failure mode is to fail loudly.  
That is, if both ends report success, you can be quite confident it is 
indeed a reliable copy.  The chance of silent failure is extremely small, 
and if there is a failure, you know about it as one end or the other 
fails with an error you can see, even if you don't know exactly what's 
causing it.


So definitely, do that check and see if it reports problems.  But don't 
be too surprised if it doesn't, because it very well could be another 
corner-case that is entirely valid at the filesystem level (just like 
the nesting reversal above, which is entirely legit), and send/receive 
simply doesn't know how to deal with it yet.

If the check does come up clean, then the next thing, since you didn't 
mention your kernel or btrfs-progs versions, is to upgrade to current 
versions if necessary, since send/receive (and check) will have been 
taught about more problems and how to deal with them, in newer versions.  
Try again (both the send/receive and the check) with those current 
versions.

If with current you're still getting failure, but check coming up clean, 
then you've very possibly hit another corner-case, and the devs are 
likely to be interested in trying to debug and trace it down, to 
eliminate it as they've done the others.


Meanwhile, if you don't have time to debug with them, you can of course 
try resolving the situation yourself.  Since it's reproducibly failing 
at 15 GiB, it's always happening at the same place.  You can try 
deleting things, or moving them temporarily to a different filesystem or 
subvolume, and see whether the problem goes away or moves.  By bisecting 
(repeatedly cutting the problem space in half, testing what was the bad 
half in the previous step), you have a very good chance of figuring out 
which subdir, and possibly eventually which file, is causing the 
problem.  Once you know that, you can delete just that subdir or file 
and restore it from backup, hopefully deleting the problem along with 
the file, and not bringing it back with the restore.
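That descent can be sketched roughly as follows.  `send_ok` is a placeholder for the actual site-specific test (snapshot a subvolume containing the path, pipe `btrfs send` of that snapshot to /dev/null, and return send's exit status); only the narrowing-down logic is real here:

```shell
#!/bin/sh
# Walk a tree, testing each subdirectory; when the test fails, recurse
# so the failure is narrowed down to ever-smaller subtrees.
# `send_ok` is a PLACEHOLDER -- replace it with something that snapshots
# the subvolume, pipes `btrfs send` of the snapshot to /dev/null, and
# returns send's exit status.
send_ok() {
    true    # stub: always "passes" until you plug in the real test
}

scan_dirs() {
    # $1 = directory whose children should be tested
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        if send_ok "$d"; then
            echo "ok:  $d"
        else
            echo "BAD: $d"      # descend to narrow the failure down
            scan_dirs "$d"
        fi
    done
}
```

With a real `send_ok` plugged in, the innermost `BAD:` lines point at the smallest directories (and eventually files) worth recreating, as described above.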

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-22 21:59 btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors Nils Steinger
  2015-11-23  5:49 ` Duncan
@ 2015-11-23 12:26 ` Austin S Hemmelgarn
  2015-11-23 21:10   ` Nils Steinger
  2015-11-24 21:11 ` Filipe Manana
  2 siblings, 1 reply; 16+ messages in thread
From: Austin S Hemmelgarn @ 2015-11-23 12:26 UTC (permalink / raw)
  To: Nils Steinger, linux-btrfs


On 2015-11-22 16:59, Nils Steinger wrote:
> Hi,
>
> I recently ran into a problem while trying to back up some of my btrfs
> subvolumes over the network:
> `btrfs send` works flawlessly on snapshots of most subvolumes, but keeps
> failing on snapshots of a certain subvolume — always after sending 15 GiB:
>
> btrfs send /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT |
> pv | ssh kappa "btrfs receive /mnt/300gb/backups/snapshots/zeta/home/"
> At subvol /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
> At subvol 2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
> ERROR: send ioctl failed with -2: No such file or directory
>    15GB 0:34:34 [7,41MB/s]
>
> I've tried piping the output to /dev/null instead of ssh and got the
> same error (again after sending 15 GiB), so this seems to be on the
> sending side.
This is an issue that comes up sometimes with send.  It's not well 
understood or documented, but something in the source FS can get into a 
state that send chokes on, and then crashes.  I've actually been trying 
to reproduce this myself on a small filesystem so that it's easier to 
debug, but so far I've been unsuccessful: I have yet to find any 
reliable way to reproduce it, and thus have no reliable way to prevent 
it from happening either.
>
> However, btrfs scrub reports no errors and I don't get any messages in
> dmesg when the btrfs send fails.
Scrub is intended to fix corruption due to hardware failures.  In almost 
all cases that I've seen of what you are getting, it wasn't a provable 
hardware issue, and scrub returned no errors.
>
> What could cause this kind of error?
> And is there a way to fix it, preferably without recreating the FS?
In general (assuming you are seeing the same issue I run into from time 
to time), there are two options other than recreating the filesystem:
1. Recreate the file that send is choking on.  You can see which file by 
adding -vv to the receive command-line, although be ready for lots of 
output.  It's important to note that mv won't work for this unless 
you're moving the data to a different filesystem (if it's a directory, 
copy everything out, recreate the directory, then copy everything back 
in).  The downside to this option is that you will usually run into 
multiple files that send chokes on, and the only way to find them all is 
to keep repeating the process until send completes successfully.
2. Run a full balance on the FS (this doesn't work anywhere near as 
reliably as the first option, but is the only way to fix some issues 
caused by doing batch deduplication on some older kernels).
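Option 1 above can be sketched as a tiny helper.  The paths here are placeholders, and the scratch directory is assumed to be on a *different* filesystem (a plain `mv` within the same btrfs only renames the inode and would not rewrite the data):

```shell
#!/bin/sh
# Rewrite a problem file (as reported by `btrfs receive -vv`) so its
# data lands in fresh extents: move it off the btrfs entirely, copy it
# back (which writes new data), then drop the scratch copy.
rewrite_file() {
    file=$1         # the file send chokes on
    scratch_dir=$2  # MUST be on a different filesystem (e.g. tmpfs)
    scratch="$scratch_dir/$(basename "$file")"
    mv -- "$file" "$scratch" &&     # data leaves the btrfs
    cp -p -- "$scratch" "$file" &&  # written back as brand-new extents
    rm -- "$scratch"
}
```

As noted above, expect to repeat this for several files before a send finally completes.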



* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-23 12:26 ` Austin S Hemmelgarn
@ 2015-11-23 21:10   ` Nils Steinger
  2015-11-24  5:42     ` Duncan
  0 siblings, 1 reply; 16+ messages in thread
From: Nils Steinger @ 2015-11-23 21:10 UTC (permalink / raw)
  To: Duncan, Austin S Hemmelgarn; +Cc: linux-btrfs

On Mon, Nov 23, 2015 at 05:49:05AM +0000, Duncan wrote:
> Btrfs scrub?  Why do you believe it will detect/fix the problem?

I was under the impression that btrfs scrub would detect all kinds of
inconsistencies (not just data-checksum mismatches), including whatever caused
btrfs send to fail.
Thanks for clearing up that misconception!


In this case, I ended up following Austin's first suggestion: I ran `btrfs
receive -vv` and moved the directory where it failed (something inside my Chromium
profile) off the filesystem and back onto it. When I reran `send` it failed
inside a neighbouring directory, so I recreated that one too. Cue some more
repetitions of that (with files from my Firefox and Skype profile directories).
In the end, I just rsync'd my entire home directory to a new subvolume, which
finally seems to have done the trick — at least I could `btrfs send` a snapshot
of the new subvolume to /dev/null.

Do we know anything about what might cause a filesystem to enter a state which
`send` chokes on?
I've only seen a small sample of the corrupted files before growing tired of
the process and just recreating the whole thing, but all of them were database
files (presumably SQLite). Could it be that the files were being written to
during an unclean shutdown, leading to some kind of corruption of the FS?
Unfortunately, I was a little triggerhappy when cleaning up old snapshots, so
there aren't any left to aid in troubleshooting this problem further…


Regards,
Nils


* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-23 21:10   ` Nils Steinger
@ 2015-11-24  5:42     ` Duncan
  2015-11-24 12:46       ` Austin S Hemmelgarn
  0 siblings, 1 reply; 16+ messages in thread
From: Duncan @ 2015-11-24  5:42 UTC (permalink / raw)
  To: linux-btrfs

Nils Steinger posted on Mon, 23 Nov 2015 22:10:12 +0100 as excerpted:

> Do we know anything about what might cause a filesystem to enter a
> state which `send` chokes on?
> I've only seen a small sample of the corrupted files before growing
> tired of the process and just recreating the whole thing, but all of
> them were database files (presumably SQLite). Could it be that the files
> were being written to during an unclean shutdown, leading to some kind
> of corruption of the FS? Unfortunately, I was a little triggerhappy when
> cleaning up old snapshots, so there aren't any left to aid in
> troubleshooting this problem further…

Austin's the one attempting to trace down the problem, so he'd have the 
most direct answer there.  (My use-case doesn't involve snapshotting or 
send/receive at all.)

But if any type of files would be likely to create issues, it'd be 
something like database or VM image files, since the random-file-rewrite-
pattern they typically have is in general the most problematic for copy-
on-write (COW) filesystems such as btrfs.  Without some sort of 
additional fragmentation management (like the autodefrag mount option), 
these files will end up _highly_ fragmented on btrfs, often thousands of 
fragments, tens of thousands when the files in question are multi-gig.

For the typically smallish sqlite database files, autodefrag can help 
with the fragmentation such a rewrite pattern generally triggers with 
COW, and it'd be recommended in general if all such files on the 
filesystem are smallish (quarter to a half gig or smaller), but if you're 
running large VM images, etc, autodefrag doesn't scale so well to them, 
and much more complex fragmentation management will be needed.  But 
that'd be for a different post as I don't yet know if it applies here, 
and I'm trying to keep this one short.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24  5:42     ` Duncan
@ 2015-11-24 12:46       ` Austin S Hemmelgarn
  2015-11-24 18:48         ` Christoph Anton Mitterer
  0 siblings, 1 reply; 16+ messages in thread
From: Austin S Hemmelgarn @ 2015-11-24 12:46 UTC (permalink / raw)
  To: Duncan, Nils Steinger, linux-btrfs


On 2015-11-24 00:42, Duncan wrote:
> Nils Steinger posted on Mon, 23 Nov 2015 22:10:12 +0100 as excerpted:
>
>> Do we know anything about what might cause a filesystem to enter a
>> state which `send` chokes on?
>> I've only seen a small sample of the corrupted files before growing
>> tired of the process and just recreating the whole thing, but all of
>> them were database files (presumably SQLite). Could it be that the files
>> were being written to during an unclean shutdown, leading to some kind
>> of corruption of the FS? Unfortunately, I was a little triggerhappy when
>> cleaning up old snapshots, so there aren't any left to aid in
>> troubleshooting this problem further…
That's OK, I've not been able to figure out much anyway, despite a case 
of this I had about a month ago with about 200 different files hitting 
the issue (I had written a script at the time to automate fixing it, but 
haven't been able to find it since), and the other cases I've had on my 
systems over the past year (I only started using send about a year ago 
for backups).  It might be worth noting that you're the first person to 
report this directly (I would have, but I hate to report something that 
isn't a critical data-safety issue without a reliable reproducer).
>
> Austin's the one attempting to trace down the problem, so he'd have the
> most direct answer there.  (My use-case doesn't involve snapshotting or
> send/receive at all.)
I stopped using send/receive for backups about a month ago, after 
hitting this for what I think is the seventh time in the past year.  I 
still use snapshots for backups, but now I use them to generate SquashFS 
images (I really don't care about the block layout, inode numbers, or 
most of the BTRFS-related properties).  That preserves my desire to have 
bootable backups, and also saves significant storage space both locally 
and on the cloud storage services I use for off-site backups (which in 
turn saves money on those too).  I am still trying to pull together 
something to reliably reproduce this, though, as I still use 
send/receive for some things (like cloning VMs without taking them 
offline or hitting the issues with block copies of a BTRFS filesystem).
>
> But if any type of files would be likely to create issues, it'd be
> something like database or VM image files, since the random-file-rewrite-
> pattern they typically have is in general the most problematic for copy-
> on-write (COW) filesystems such as btrfs.  Without some sort of
> additional fragmentation management (like the autodefrag mount option),
> these files will end up _highly_ fragmented on btrfs, often thousands of
> fragments, tens of thousands when the files in question are multi-gig.
In general, I've seen this mostly with three types of files:
1. Database files and VM images (In my experience, this has been the 
majority of the issue on filesystems that have them.  Autodefrag doesn't 
seem to help, at least, not for SQLite or BerkDB/GDBM databases).
2. Shared libraries and executables (these are the majority of the issue 
on filesystems without databases or VM images, although I can't for the 
life of me figure out why, as they are usually written to very infrequently)
3. Plain text configuration files.

For example, the last time I had this happen, it was on the root 
filesystem of one of my systems, and about a third of the problem files 
were either in /etc or text files under /usr/share, while the remaining 
two-thirds were mostly stuff under /usr/lib and /lib.  It's probably worth 
noting also that I've never seen certain files trigger this that I would 
expect to based on the above info, in particular:
1. ClamAV virus databases (IIRC, these are similar in structure to 
SQLite DB's).
2. BOINC applications.
3. Almost anything in /usr/libexec (stuff like GCC and binutils).
4. Almost any kind of script.
It's probably also worth noting that I occasionally see inconsistencies 
in database files that cause this to happen, but have never seen any 
corruption in any other types of file, so it doesn't seem to have an 
impact on data safety.



* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 12:46       ` Austin S Hemmelgarn
@ 2015-11-24 18:48         ` Christoph Anton Mitterer
  2015-11-24 20:44           ` Austin S Hemmelgarn
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Anton Mitterer @ 2015-11-24 18:48 UTC (permalink / raw)
  To: Austin S Hemmelgarn, Duncan, Nils Steinger, linux-btrfs


Hey.

All that sounds pretty serious, doesn't it? So in other words, AFAIU,
send/receive cannot really be reliably used.

So far I've used it for making incremental backups, but I've also
experienced some problems (though not the issue discussed here).


Cheers,
Chris.


* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 18:48         ` Christoph Anton Mitterer
@ 2015-11-24 20:44           ` Austin S Hemmelgarn
  2015-11-24 20:50             ` Christoph Anton Mitterer
  0 siblings, 1 reply; 16+ messages in thread
From: Austin S Hemmelgarn @ 2015-11-24 20:44 UTC (permalink / raw)
  To: Christoph Anton Mitterer, Duncan, Nils Steinger, linux-btrfs


On 2015-11-24 13:48, Christoph Anton Mitterer wrote:
> Hey.
>
> All that sounds pretty serious, doesn't it? So in other words, AFAIU,
> send/receive cannot really be reliably used.
>
> I did so far for making incremental backups, but I've also experienced
> some problems (though not what this is about here).
>
I would say it's currently usable for one-shot stuff, but probably not 
reliably usable for automated things without some kind of 
administrative oversight.  In theory, it wouldn't be hard to write a 
script to automate fixing this particular issue when send encounters it, 
but that has its own issues (you have to either toggle the snapshot 
writable temporarily, or modify the source and re-snapshot).
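For reference, the "toggle the snapshot writable" step would use the snapshot's `ro` property.  A sketch that only prints the sequence rather than running it (the snapshot path is hypothetical, and note that flipping a snapshot writable can affect its use as a parent for later incrementals):

```shell
#!/bin/sh
# Print (do not run) the toggle-fix-toggle sequence for a read-only
# snapshot.  The path argument is a placeholder for the affected
# snapshot.
print_toggle_plan() {
    snap=$1
    cat <<EOF
btrfs property set -ts $snap ro false   # make the snapshot writable
# ...recreate the offending file inside the snapshot...
btrfs property set -ts $snap ro true    # restore read-only status
EOF
}
```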




* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 20:44           ` Austin S Hemmelgarn
@ 2015-11-24 20:50             ` Christoph Anton Mitterer
  2015-11-24 20:58               ` Austin S Hemmelgarn
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Anton Mitterer @ 2015-11-24 20:50 UTC (permalink / raw)
  To: Austin S Hemmelgarn, Duncan, Nils Steinger, linux-btrfs


On Tue, 2015-11-24 at 15:44 -0500, Austin S Hemmelgarn wrote:
> I would say it's currently usable for one-shot stuff, but probably not
> reliably usable for automated things without some kind of
> administrative oversight.  In theory, it wouldn't be hard to write a
> script to automate fixing this particular issue when send encounters
> it, but that has its own issues (you have to either toggle the snapshot
> writable temporarily, or modify the source and re-snapshot).

Well, AFAIU, *this* very issue is at least something that bails out
loudly with an error... I rather worry about cases where send/receive
just exits without any error (status or message) yet still hasn't
managed to copy everything correctly.

The case that I had was that I incrementally send/received (with -p)
backups to another disk.
At some point I removed one of the older snapshots on that backup
disk... and then had fs errors, as if the data were gone. :(


* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 20:50             ` Christoph Anton Mitterer
@ 2015-11-24 20:58               ` Austin S Hemmelgarn
  2015-11-24 21:17                 ` Christoph Anton Mitterer
  0 siblings, 1 reply; 16+ messages in thread
From: Austin S Hemmelgarn @ 2015-11-24 20:58 UTC (permalink / raw)
  To: Christoph Anton Mitterer, Duncan, Nils Steinger, linux-btrfs


On 2015-11-24 15:50, Christoph Anton Mitterer wrote:
> On Tue, 2015-11-24 at 15:44 -0500, Austin S Hemmelgarn wrote:
>> I would say it's currently usable for one-shot stuff, but probably
>> not
>> reliably useable for automated things without some kind of
>> administrative oversight.  In theory, it wouldn't be hard to write a
>> script to automate fixing this particular issue when send encounters
>> it,
>> but that has it's own issues (you have to either toggle the snapshot
>> writable temporarily, or modify the source and re-snapshot).
>
> Well AFAIU, *this* very issue is at least something that bails out
> loudly with an error... I rather worry about cases where send/receive
> just exits without any error (status or message) and still didn't
> manage to correctly copy everything.
>
> The case that I had was that I incrementally send/received (with -p)
> backups to another disk.
> At some point in time I removed one of the older snapshots on that
> backup disk... and then had fs errors... as if the data would have been
> gone.. :(
>
I had tried using send/receive once with -p, but had numerous issues. 
The incrementals I've been doing have used -c instead, and I haven't had 
any issues with data loss with that.  The issue outlined here was only a 
small part of why I stopped using it for backups.  The main reason was 
to provide better consistency between my local copies and what I upload 
to S3/Dropbox, meaning I only have to test one backup image per 
filesystem backed up, instead of two.



* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-22 21:59 btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors Nils Steinger
  2015-11-23  5:49 ` Duncan
  2015-11-23 12:26 ` Austin S Hemmelgarn
@ 2015-11-24 21:11 ` Filipe Manana
  2 siblings, 0 replies; 16+ messages in thread
From: Filipe Manana @ 2015-11-24 21:11 UTC (permalink / raw)
  To: Nils Steinger; +Cc: linux-btrfs

On Sun, Nov 22, 2015 at 9:59 PM, Nils Steinger <nst@voidptr.de> wrote:
> Hi,
>
> I recently ran into a problem while trying to back up some of my btrfs
> subvolumes over the network:
> `btrfs send` works flawlessly on snapshots of most subvolumes, but keeps
> failing on snapshots of a certain subvolume — always after sending 15 GiB:
>
> btrfs send /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT |
> pv | ssh kappa "btrfs receive /mnt/300gb/backups/snapshots/zeta/home/"
> At subvol /btrfs/snapshots/home/2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
> At subvol 2015-11-17_03:28:14_BOOT-AUTOSNAPSHOT
> ERROR: send ioctl failed with -2: No such file or directory
>   15GB 0:34:34 [7,41MB/s]

Which kernel version?

Try with a 4.3 kernel (or the latest you can, like a 4.2 or 4.1). If
it persists, you can create an image of your filesystem like this for
example:

btrfs-image -c 9 /dev/whatever fs.img

The image won't contain your data (it will all be replaced with
zeroes) but file and directory names and xattrs will remain untouched
(there's an option to sanitize file names, but that might not help
debugging what's going on with send).
If this is an option for you, you can send me the image for debugging
and getting the bug fixed - but please make sure you try a recent
kernel first (ideally 4.3) to see if the problem reproduces there; the
send code (like the rest of btrfs and the linux kernel) keeps changing
between kernel versions (bug fixes, etc).

>
> I've tried piping the output to /dev/null instead of ssh and got the
> same error (again after sending 15 GiB), so this seems to be on the
> sending side.
>
> However, btrfs scrub reports no errors and I don't get any messages in
> dmesg when the btrfs send fails.
>
> What could cause this kind of error?
> And is there a way to fix it, preferably without recreating the FS?
>
>
> Regards,
> Nils Steinger
>



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 20:58               ` Austin S Hemmelgarn
@ 2015-11-24 21:17                 ` Christoph Anton Mitterer
  2015-11-24 21:27                   ` Hugo Mills
  0 siblings, 1 reply; 16+ messages in thread
From: Christoph Anton Mitterer @ 2015-11-24 21:17 UTC (permalink / raw)
  To: Austin S Hemmelgarn, Duncan, Nils Steinger, linux-btrfs


On Tue, 2015-11-24 at 15:58 -0500, Austin S Hemmelgarn wrote:
> I had tried using send/receive once with -p, but had numerous issues.
> The incrementals I've been doing have used -c instead, and I haven't had
> any issues with data loss with that.  The issue outlined here was only a
> small part of why I stopped using it for backups.  The main reason was
> to provide better consistency between my local copies and what I upload
> to S3/Dropbox, meaning I only have to test one backup image per
> filesystem backed up, instead of two.

Okay maybe I just don't understand how to use send/receive correctly...


What I have is roughly the following (simplified):

master-fs:
5
|
+--data (subvol, my precious data)
|
+--snapshots
   |
   +--2015-11-01 (subvol, ro-snapshot of /data)

So 2015-11-01 is basically the first snapshot ever made.

Now I want to have it on:
backup-fs
+--2015-11-01 (subvol, ro-snapshot of /data)


So far I did
btrfs send /master-fs/snapshots/2015-11-01 | btrfs receive /backup-fs/2015-11-01




Then time goes by and I get new content in the data subvol, so what
I'd like to have then is a new snapshot on the master-fs:
5
|
+--data (subvol, more of my precious data)
|
+--snapshots
   |
   +--2015-11-01 (subvol, ro-snapshot of /data)
   +--2015-11-20 (subvol, ro-snapshot of /data)

And this should go incrementally on backup-fs:
backup-fs
+--2015-11-01 (subvol, ro-snapshot of /data)
+--2015-11-20 (subvol, ro-snapshot of /data)

So far I used something like:
btrfs send -p 2015-11-01 /master-fs/snapshots/2015-11-20 | btrfs receive /backup-fs/2015-11-20

And obviously I want it to share all the reflinks and stuff...
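The two-step workflow above can be sketched as a small script. This is a sketch with illustrative paths taken from the layout above; the commands are only assembled and printed, since running them for real requires root and existing btrfs mounts. One detail worth noting: `btrfs receive` takes the destination *directory*, and the received subvolume is created inside it under the snapshot's own name.

```shell
#!/bin/sh
# Sketch of the full + incremental backup workflow described above.
# Paths and snapshot names are illustrative.
SRC=/master-fs/snapshots
DST=/backup-fs
FIRST=2015-11-01
SECOND=2015-11-20

# Initial full transfer of the first snapshot. `btrfs receive` creates
# a subvolume named "$FIRST" inside "$DST".
full_cmd="btrfs send $SRC/$FIRST | btrfs receive $DST"

# Later, incremental transfer using the previous snapshot as parent
# (it must still exist, read-only, on both sides):
incr_cmd="btrfs send -p $SRC/$FIRST $SRC/$SECOND | btrfs receive $DST"

echo "$full_cmd"
echo "$incr_cmd"
```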


So in other words, what's the difference between -p and -c? :D


Thx,
Chris.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5313 bytes --]


* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 21:17                 ` Christoph Anton Mitterer
@ 2015-11-24 21:27                   ` Hugo Mills
  2015-11-24 21:36                     ` Christoph Anton Mitterer
  2015-11-26 15:44                     ` Duncan
  0 siblings, 2 replies; 16+ messages in thread
From: Hugo Mills @ 2015-11-24 21:27 UTC (permalink / raw)
  To: Christoph Anton Mitterer
  Cc: Austin S Hemmelgarn, Duncan, Nils Steinger, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 3114 bytes --]

On Tue, Nov 24, 2015 at 10:17:13PM +0100, Christoph Anton Mitterer wrote:
> On Tue, 2015-11-24 at 15:58 -0500, Austin S Hemmelgarn wrote:
> > I had tried using send/receive once with -p, but had numerous issues.
>  
> > The incrementals I've been doing have used -c instead, and I hadn't had 
> > any issues with data loss with that.  The issue outlined here was only a 
> > small part of why I stopped using it for backups.  The main reason was 
> > to provide better consistency between my local copies and what I upload 
> > to S3/Dropbox, meaning I only have to test one back up image per 
> > filesystem backed-up, instead of two.
> 
> Okay, maybe I just don't understand how to use send/receive correctly...
> 
> 
> What I have is roughly the following (simplified):
> 
> master-fs:
> 5
> |
> +--data (subvol, my precious data)
> |
> +--snapshots
>    |
>    +--2015-11-01 (subvol, ro-snapshot of /data)
> 
> So 2015-11-01 is basically the first snapshot ever made.
> 
> Now I want to have it on:
> backup-fs
> +--2015-11-01 (subvol, ro-snapshot of /data)
> 
> 
> So far I did
> btrfs send /master-fs/snapshots/2015-11-01 | btrfs receive /backup-fs/2015-11-01
> 
> 
> 
> 
> Then time goes by and I get new content in the data subvol, so what
> I'd like to have then is a new snapshot on the master-fs:
> 5
> |
> +--data (subvol, more of my precious data)
> |
> +--snapshots
>    |
>    +--2015-11-01 (subvol, ro-snapshot of /data)
>    +--2015-11-20 (subvol, ro-snapshot of /data)
> 
> And this should go incrementally on backup-fs:
> backup-fs
> +--2015-11-01 (subvol, ro-snapshot of /data)
> +--2015-11-20 (subvol, ro-snapshot of /data)
> 
> So far I used something like:
> btrfs send -p 2015-11-01 /master-fs/snapshots/2015-11-20 | btrfs receive /backup-fs/2015-11-20
> 
> And obviously I want it to share all the reflinks and stuff...
> 
> 
> So in other words, what's the difference between -p and -c? :D

   -p only sends the file metadata for the changes from the reference
snapshot to the sent snapshot. -c sends all the file metadata, but
will preserve the reflinks between the sent snapshot and the (one or
more) reference snapshots. You can only use one -p (because there's
only one difference you can compute at any one time), but you can use
as many -c as you like (because you can share extents with any number
of subvols).

   In both cases, the reference snapshot(s) must exist on the
receiving side.

   In implementation terms, on the receiver, -p takes a (writable)
snapshot of the reference subvol, and modifies it according to the
stream data. -c makes a new empty subvol, and populates it from
scratch, using the reflink ioctl to use data which is known to exist
in the reference subvols.
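In command form, the distinction Hugo describes looks like this (paths are illustrative; in both cases the reference snapshot(s) must already exist on the receiving side). The commands are only assembled and printed here, not executed:

```shell
#!/bin/sh
# -p: exactly one parent; the receiver snapshots it and applies only
# the delta from the send stream.
parent_send="btrfs send -p /snapshots/2015-11-01 /snapshots/2015-11-20"

# -c: any number of clone sources; the receiver builds a fresh subvolume
# and reflinks extents already present in the listed snapshots.
clone_send="btrfs send -c /snapshots/2015-11-01 -c /snapshots/2015-11-10 /snapshots/2015-11-20"

echo "$parent_send | btrfs receive /backup-fs"
echo "$clone_send | btrfs receive /backup-fs"
```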

   Hugo.

-- 
Hugo Mills             | Anyone who claims their cryptographic protocol is
hugo@... carfax.org.uk | secure is either a genius or a fool. Given the
http://carfax.org.uk/  | genius/fool ratio for our species, the odds aren't
PGP: E2AB1DE4          | good.                                  Bruce Schneier

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]


* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 21:27                   ` Hugo Mills
@ 2015-11-24 21:36                     ` Christoph Anton Mitterer
  2015-11-24 22:08                       ` Hugo Mills
  2015-11-26 15:44                     ` Duncan
  1 sibling, 1 reply; 16+ messages in thread
From: Christoph Anton Mitterer @ 2015-11-24 21:36 UTC (permalink / raw)
  To: Hugo Mills; +Cc: Austin S Hemmelgarn, Duncan, Nils Steinger, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1778 bytes --]

On Tue, 2015-11-24 at 21:27 +0000, Hugo Mills wrote:
>    -p only sends the file metadata for the changes from the reference
> snapshot to the sent snapshot. -c sends all the file metadata, but
> will preserve the reflinks between the sent snapshot and the (one or
> more) reference snapshots.
Let me see if I got that right:
- -p sends just the differences, for both data and meta-data.
- Plus, -c sends *all* the metadata, you said... but will it send all
data (and simply ignore what's already there) or will it also just send
the differences in terms of data?
- So that means effectively I'll end up with the same result... right?

In other words, -p should be a tiny bit faster... but not by much (unless I have tons[0] of metadata changes)

>  You can only use one -p (because there's
> only one difference you can compute at any one time), but you can use
> as many -c as you like (because you can share extents with any number
> of subvols).
So that means, if it would work correctly, -p would be the right choice
for me, as I never have multiple snapshots that I need to draw my
reflinks from, right?


>    In implementation terms, on the receiver, -p takes a (writable)
> snapshot of the reference subvol, and modifies it according to the
> stream data. -c makes a new empty subvol, and populates it from
> scratch, using the reflink ioctl to use data which is known to exist
> in the reference subvols.
I see...
I think the manpage needs more information like this... :)


Thanks, for you help :-)
Chris.


[0] People may argue that one has XXbytes of metadata, and tons are a
measurement of weight... but when I recently carried 4 of the 8TB HDDs
on my back... I came to the conclusion that data correlates with grams ;-)

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5313 bytes --]


* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 21:36                     ` Christoph Anton Mitterer
@ 2015-11-24 22:08                       ` Hugo Mills
  0 siblings, 0 replies; 16+ messages in thread
From: Hugo Mills @ 2015-11-24 22:08 UTC (permalink / raw)
  To: Christoph Anton Mitterer
  Cc: Austin S Hemmelgarn, Duncan, Nils Steinger, linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 3439 bytes --]

On Tue, Nov 24, 2015 at 10:36:26PM +0100, Christoph Anton Mitterer wrote:
> On Tue, 2015-11-24 at 21:27 +0000, Hugo Mills wrote:
> >    -p only sends the file metadata for the changes from the reference
> > snapshot to the sent snapshot. -c sends all the file metadata, but
> > will preserve the reflinks between the sent snapshot and the (one or
> > more) reference snapshots.
> Let me see if I got that right:
> - -p sends just the differences, for both data and meta-data.
> - Plus, -c sends *all* the metadata, you said... but will it send all
> data (and simply ignore what's already there) or will it also just send
> the differences in terms of data?

   Well, if you have a snapshot A, snap to A', and then send -p A A',
it'll send the same amount of data as send -c A A'.

   However, the effect on the receiving system is slightly different
in terms of the subvol metadata -- with -p, it will preserve the
information that A and A' are snapshots of the same original. With -c,
it won't preserve that.

   This will probably have knock-on effects in terms of round-tripping
the snapshots (e.g. for restoring one to the hosed system and
continuing with the incremental backup scheme). I'd have to do some
hard thinking again with the send/receive algebra to work out what the
effect would be, but with the -c approach, you'd probably have
difficulties. The round-tripping feature hasn't been implemented yet,
so the point is currently moot, but it's certainly possible to do it
(with a small send stream change), and it probably will be done at
some point.

> - So that means effectively I'll end up with the same result... right?
> 
> In other words, -p should be a tiny bit faster... but not by much (unless I have tons[0] of metadata changes)

   Yes.

> >  You can only use one -p (because there's
> > only one difference you can compute at any one time), but you can use
> > as many -c as you like (because you can share extents with any number
> > of subvols).
> So that means, if it would work correctly, -p would be the right choice
> for me, as I never have multiple snapshots that I need to draw my
> reflinks from, right?

   Correct. The -c case is much less often needed. It's useful if you
have, say, several otherwise unrelated subvols that you need to
transfer efficiently from a filesystem that has had dedup run on it.
(Other use cases may apply as well).
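A hypothetical instance of that dedup use case (the subvolume names here are invented for illustration): two otherwise unrelated read-only snapshots that share extents after deduplication. Sending the first normally and then using it as a clone source for the second lets the receiver reflink the shared extents instead of receiving them again. Assembled-but-not-executed sketch:

```shell
#!/bin/sh
# /fs/projects-snap and /fs/archive-snap are unrelated subvolumes that
# share extents (e.g. after dedup). Send the first normally, then pass
# it as a clone source (-c) when sending the second.
first="btrfs send /fs/projects-snap"
second="btrfs send -c /fs/projects-snap /fs/archive-snap"

echo "$first | btrfs receive /backup-fs"
echo "$second | btrfs receive /backup-fs"
```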

> >    In implementation terms, on the receiver, -p takes a (writable)
> > snapshot of the reference subvol, and modifies it according to the
> > stream data. -c makes a new empty subvol, and populates it from
> > scratch, using the reflink ioctl to use data which is known to exist
> > in the reference subvols.
> I see...
> I think the manpage needs more information like this... :)
[snip]
> [0] People may argue that one has XXbytes of metadata, and tons are a
> measurement of weight... but when I recently carried 4 of the 8TB HDDs
> on my back... I came to the conclusion that data correlates with grams ;-)

   Yeah, I've met that particular equation too... :)

   Hugo.

-- 
Hugo Mills             | Anyone who claims their cryptographic protocol is
hugo@... carfax.org.uk | secure is either a genius or a fool. Given the
http://carfax.org.uk/  | genius/fool ratio for our species, the odds aren't
PGP: E2AB1DE4          | good.                                  Bruce Schneier

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]


* Re: btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors
  2015-11-24 21:27                   ` Hugo Mills
  2015-11-24 21:36                     ` Christoph Anton Mitterer
@ 2015-11-26 15:44                     ` Duncan
  1 sibling, 0 replies; 16+ messages in thread
From: Duncan @ 2015-11-26 15:44 UTC (permalink / raw)
  To: linux-btrfs

Hugo Mills posted on Tue, 24 Nov 2015 21:27:46 +0000 as excerpted:

[In the context of btrfs send...]

>    -p only sends the file metadata for the changes from the reference
> snapshot to the sent snapshot. -c sends all the file metadata, but will
> preserve the reflinks between the sent snapshot and the (one or more)
> reference snapshots. You can only use one -p (because there's only one
> difference you can compute at any one time), but you can use as many -c
> as you like (because you can share extents with any number of subvols).
> 
>    In both cases, the reference snapshot(s) must exist on the
> receiving side.
> 
>    In implementation terms, on the receiver, -p takes a (writable)
> snapshot of the reference subvol, and modifies it according to the
> stream data. -c makes a new empty subvol, and populates it from scratch,
> using the reflink ioctl to use data which is known to exist in the
> reference subvols.

Thanks, Hugo.  I had a vague idea that the above was the difference in 
general, but as CAM says, the manpage (and wiki) isn't particularly 
detailed on the differences, so I didn't know whether my vague idea was 
correct or not.  Your explanation makes perfect sense and clears things 
up dramatically. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



end of thread, other threads:[~2015-11-26 15:45 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-22 21:59 btrfs send reproducibly fails for a specific subvolume after sending 15 GiB, scrub reports no errors Nils Steinger
2015-11-23  5:49 ` Duncan
2015-11-23 12:26 ` Austin S Hemmelgarn
2015-11-23 21:10   ` Nils Steinger
2015-11-24  5:42     ` Duncan
2015-11-24 12:46       ` Austin S Hemmelgarn
2015-11-24 18:48         ` Christoph Anton Mitterer
2015-11-24 20:44           ` Austin S Hemmelgarn
2015-11-24 20:50             ` Christoph Anton Mitterer
2015-11-24 20:58               ` Austin S Hemmelgarn
2015-11-24 21:17                 ` Christoph Anton Mitterer
2015-11-24 21:27                   ` Hugo Mills
2015-11-24 21:36                     ` Christoph Anton Mitterer
2015-11-24 22:08                       ` Hugo Mills
2015-11-26 15:44                     ` Duncan
2015-11-24 21:11 ` Filipe Manana
