* Understanding btrfs and backups
@ 2014-03-06 18:18 Eric Mesa
  2014-03-06 21:33 ` Duncan
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Mesa @ 2014-03-06 18:18 UTC (permalink / raw)
  To: linux-btrfs

apologies if this is a resend - it appeared to me that it was rejected
because of something in how Gmail was formatting the message. I can't find
it in the Gmane archives which leads me to believe it was never delivered.

I was hoping to gain some clarification on btrfs snapshots and how they
function as backups.

I did a bit of Googling and found lots of examples of bash commands, but no
one seemed to explain what was going on to a level that would satisfy me for
my data needs.

I read this Ars Technica article today
http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/

First of all, the btrfs-raid1 sounds awesome. Because it helps protect
against one of RAID1's failings - bit rot issues. But raid1 is not backup,
it's just redundancy.

Second, the article mentions using snapshots as a backup method. Page 3
section: Using the features.

He makes a snapshot and sends that. Then he sends what changed the second
time. He mentions that because btrfs knows what's changed it's a quick process.

Right now on my Linux computer I use Back in Time which, I think, is just an
rsync frontend. It takes a long time to complete the backup for my 1 TB
/home drive. The copy part is nice and quick, but the comparison part takes
a long time and hammers the CPU. I have it set up to run at night because if
it runs while I'm using the computer, things can crawl.

So I was wondering if btrfs snapshots are a substitute for this. Right now
if I realize I deleted a file 5 days ago, I can go into Back in Time (the
gui) or just navigate to it on the backup drive and restore that one file.
From what I've read about btrfs, I'd have to restore the entire home drive,
right? Which means I'd lose all the changes from the past five days. If
that's the case, it wouldn't really solve my problem - although maybe I'm
just not thinking creatively.

Also, if I first do the big snapshot backup and then the increments, how do
I delete the older snapshots? In other words, the way I'm picturing things
working is that I have the main snapshot and every snapshot after that is
just a description of what's changed since then. So wouldn't the entire
chain be necessary to reconstruct where I'm at now?

On a somewhat separate note, I have noticed that many people/utilities for
btrfs mention making snapshots every hour. Are the snapshots generally that
small that such a thing wouldn't quickly fill a hard drive?

Thanks for reading my questions, I appreciate the help. When all is said and
done I'd certainly like to publish a how-to from my point of understanding.


--
Eric Mesa



* Re: Understanding btrfs and backups
  2014-03-06 18:18 Understanding btrfs and backups Eric Mesa
@ 2014-03-06 21:33 ` Duncan
  2014-03-07 10:13   ` Wolfgang Mader
                     ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Duncan @ 2014-03-06 21:33 UTC (permalink / raw)
  To: linux-btrfs

Eric Mesa posted on Thu, 06 Mar 2014 18:18:15 +0000 as excerpted:

> apologies if this is a resend - it appeared to me that it was rejected
> because of something in how Gmail was formatting the message. I can't
> find it in the Gmane archives which leads me to believe it was never
> delivered.

Probably HTML-formatted.  AFAIK vger.kernel.org (the list-serv for many 
kernel lists) is set to reject that.  Too bad more list-servs don't do 
likewise. =:^(

> I was hoping to gain some clarification on btrfs snapshots and how they
> function as backups.

Looking at the below it does indeed appear you are confused, but this is 
the place to post the questions necessary to get unconfused. =:^)

> I did a bit of Googling and found lots of examples of bash commands, but
> no one seemed to explain what was going on to a level that would satisfy
> me for my data needs.

You don't mention whether you've seen/read the btrfs wiki or not.  That's 
the most direct and authoritative place to look... and to bookmark. =:^)

https://btrfs.wiki.kernel.org

> I read this Ars Technica article today
> http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/
> 
> First of all, the btrfs-raid1 sounds awesome. Because it helps protect
> against one of RAID1's failings - bit rot issues. But raid1 is not
> backup, it's just redundancy.
> 
> Second, the article mentions using snapshots as a backup method.

Well, this is where you start to be confused.  Snapshots are not backups 
either, altho they're sort of opposite raid in that while raid is 
redundancy-only, snapshots are rollback-only, without the redundancy 
(I'll explain...).

> Page 3 section: Using the features.
> 
> He makes a snapshot and sends that. Then he sends what changed the
> second time. He mentions that because btrfs knows what's changed it's a
> quick process.

OK, what that is discussing is btrfs send/receive, with snapshots simply 
part of the process of doing that.  Think rsync in effect, but btrfs-
specific and much more efficient.  Btrfs send/receive does use snapshots 
but only as part of making the send/receive process more reliable and 
efficient.  I'll discuss snapshots (and COW) first, below, then bring in 
btrfs send/receive at the end.

> Right now on my Linux computer I use Back in Time which, I think, is
> just an rsync frontend. It takes a long time to complete the backup for
> my 1 TB /home drive. The copy part is nice and quick, but the comparison
> part takes a long time and hammers the CPU. I have it setup to run at
> night because if it runs while I'm using the computer, things can crawl.
> 
> So I was wondering if btrfs snapshots are a substitute for this. Right
> now if I realize I deleted a file 5 days ago, I can go into Back in Time
> (the gui) or just navigate to it on the backup drive and restore that
> one file.

> From what I've read about btrfs, I'd have to restore the entire home
> drive, right? Which means I'd lose all the changes from the past five
> days. If that's the case, it wouldn't really solve my problem -
> although maybe I'm just not thinking creatively.

No, in snapshot terms you don't restore the entire drive.  Rather, the 
snapshots are taken on the local filesystem, storing (like one still 
frame in a series that makes a movie, thus the term snapshot) the state 
of the filesystem at the point the snapshot was taken.  Files can be 
created/deleted/moved/altered after the snapshot, and only the 
differences between snapshots, and between the last snapshot and the 
current state, take up additional space.

The fact that btrfs is a copy-on-write (COW) filesystem makes 
snapshotting very easy... trivial... since it's a byproduct of the COW 
nature of the filesystem and thus comes very nearly for free.  All that's 
necessary in order to get snapshotting is hooking up some way to access 
specific bits of functionality that are already there.

A copy-on-write illustration (please view with a monospace font for 
proper alignment):

Suppose each letter of the following string represents a block of a 
particular size (say 4KiB) of a file, with the corresponding block 
addresses noted as well:

0000000001111111  
1234567890123456
||||||||||||||||
abcdefgxijklmnop

It's the first bit of the alphabet, but notice the x where h belongs.  
Now someone notices and edits the file, correcting the problem:

abcdefghijklmnop

Except when they save the file, a COW-based filesystem will make the 
change like this:

0000000501111111
1234567390123456
||||||| ||||||||
abcdefg ijklmnop
       |
       h

The unchanged blocks of the file all remain in place.  The only change is 
to the one block, which unlike normal filesystems, isn't edited in-place, 
but rather, is written into a new location, and the filesystem simply 
notes that the new location (53) should be used to read that file block 
now, instead of the old location (08).  Of course as illustrated here, 
the addresses each take up two characters while the data block only takes 
up one, but each of those letters represents a whole 4 KiB, so in 
actuality the data is much larger than the address referring to it.

Now all that a snapshot taken when the first copy of the file was there 
has to do is keep the old address list for it, 01-16 around when the new 
copy, addresses 01-07,53,09-16, gets made.  And the only space the 
snapshot takes up is the metadata block for the old address list and the 
single data block number 08, where that x was in the illustration.

The only thing needed was that some mechanism be hooked up to tell the 
filesystem when to record the current situation as a snapshot, and some 
way to select the various snapshots.
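
The illustration above can be sketched as a toy Python model.  This is 
not btrfs code; the class name, the method names and the choice of 53 as 
the next free address are assumptions made only to mirror the diagram.

```python
# Toy model of copy-on-write; not btrfs code, all names are illustrative.

class CowFile:
    def __init__(self, data):
        # Block store: address -> contents. Addresses start at 1 as in the diagram.
        self.blocks = {addr: ch for addr, ch in enumerate(data, start=1)}
        # The file's current address list, one address per file block.
        self.addresses = list(range(1, len(data) + 1))
        self.next_free = 53  # arbitrary free address, chosen to match the diagram

    def write(self, index, value):
        """Write a file block (0-based index) without touching the old block."""
        new_addr = self.next_free
        self.next_free += 1
        self.blocks[new_addr] = value
        self.addresses[index] = new_addr

    def snapshot(self):
        """A snapshot is a copy of the address list, not of the data."""
        return list(self.addresses)

    def read(self, addresses=None):
        addresses = self.addresses if addresses is None else addresses
        return "".join(self.blocks[a] for a in addresses)


f = CowFile("abcdefgxijklmnop")
snap = f.snapshot()              # addresses 01-16
f.write(7, "h")                  # the eighth block is written to address 53
print(f.read())                  # abcdefghijklmnop
print(f.read(snap))              # abcdefgxijklmnop
print(f.addresses[7], snap[7])   # 53 8
```

The snapshot costs one extra address list plus the single old block it 
alone still refers to; everything else is shared with the live file.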


As for restoring, since a snapshot is a copy of the filesystem as it 
existed at that point, and the method btrfs exposes for accessing them is 
to mount that specific snapshot, to restore an individual file from a 
snapshot, you simply mount the snapshot you want somewhere and copy the 
file as it existed in that snapshot over top of your current version 
(which will have presumably already been mounted elsewhere, before you 
mounted the snapshot to retrieve the file from), then unmount the 
snapshot and go about your day. =:^)
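
As a rough sketch of that single-file restore, assuming the snapshot is 
already mounted (or otherwise visible) as an ordinary directory: the 
helper name restore_file and all paths below are hypothetical, and the 
demonstration uses temporary directories in place of a real snapshot.

```python
import shutil
import tempfile
from pathlib import Path

def restore_file(snapshot_root, live_root, relative_path):
    """Copy one file from a mounted snapshot over the live copy."""
    src = Path(snapshot_root) / relative_path
    dst = Path(live_root) / relative_path
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)   # copy2 also preserves timestamps and mode bits
    return dst

# Ordinary temporary directories stand in for the mounted snapshot and the
# live /home; both locations are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp) / "snapshot"
    live = Path(tmp) / "home"
    (snap / "docs").mkdir(parents=True)
    live.mkdir()
    (snap / "docs" / "taxes.txt").write_text("2013 figures\n")
    restored = restore_file(snap, live, "docs/taxes.txt")
    restored_text = restored.read_text()
    print(restored_text)
```

Nothing else in the live tree is touched, so the other changes made 
since the snapshot was taken are kept.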

> Also, if I first do the big snapshot backup and then the increments, how
> do I delete the older snapshots? In other words, the way I'm picturing
> things working is that I have the main snapshot and every snapshot after
> that is just a description of what's changed since then. So wouldn't the
> entire chain be necessary to reconstruct where I'm at now?

Since a snapshot is an image of the filesystem as it was at that 
particular point in time, and btrfs by nature copies blocks elsewhere 
when they are modified, all (well, not "all" as there's metadata like 
file owner, permissions and group, too, but that's handled the same way) 
the snapshot does is map what blocks composed each file at the time the 
snapshot was taken.

Which means you can delete any of them, and other snapshots remain in 
place.

Meanwhile, the actual data blocks remain where they were, as long as they 
are tracked by at least one snapshot.  In the illustration above, as long 
as at least one snapshot remains that contains block number 08 (the x), 
it won't be entirely erased, since something still links to the contents 
of that block.  As soon as all snapshots containing the 08 block are 
deleted, then block 08 itself can be returned to the pool of free blocks 
to be used again, since all snapshots tracking that block are now gone.
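
The same idea as a small Python sketch, reusing the addresses from the 
illustration.  It is a toy model only: the function name and the snapshot 
names are made up, and real btrfs tracks references very differently in 
detail.

```python
# Toy model: a block is free once neither the live file nor any snapshot
# references it. Not btrfs code.

def free_blocks(all_blocks, live_addresses, snapshots):
    """Return the set of block addresses that nothing references any more."""
    referenced = set(live_addresses)
    for addresses in snapshots.values():
        referenced.update(addresses)
    return set(all_blocks) - referenced

all_blocks = set(range(1, 17)) | {53}
live = [1, 2, 3, 4, 5, 6, 7, 53, 9, 10, 11, 12, 13, 14, 15, 16]
snapshots = {
    "before-fix": list(range(1, 17)),   # still contains block 08 (the x)
    "after-fix": list(live),
}

freed_before = free_blocks(all_blocks, live, snapshots)
del snapshots["before-fix"]             # any snapshot can be deleted on its own
freed_after = free_blocks(all_blocks, live, snapshots)
print(freed_before)                     # set()
print(freed_after)                      # {8}
```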

> On a somewhat separate note, I have noticed that many people/utilities
> for btrfs mention making snapshots every hour. Are the snapshots
> generally that small that such a thing wouldn't quickly fill a hard
> drive?

Yes, they're that small.  Actually, if nothing has changed between 
snapshots, the only space taken by a snapshot is the space for the 
snapshot name and similar metadata.  No data space is used at all.  If 
only one block of one file has changed, then that's all the data space 
the snapshot will take.

Of course if nearly the entire filesystem changes, then it'll need nearly 
double the space, but that doesn't normally happen (for filesystems of 
any size anyway) when the snapshots are taken an hour apart!


However, best snapshot management practice does progressive snapshot 
thinning, so you never have more than a few hundred snapshots to manage 
at once.  Think of it this way.  If you realize you deleted something you 
needed yesterday, you might well remember about when you deleted it and 
can thus pick the correct snapshot to mount and copy it back from.  But 
if you don't realize you need it until a year later, say when you're 
doing your taxes, how likely are you to remember the specific hour, or 
even the specific day, you deleted it?  A year later, getting a copy from 
the correct week, or perhaps the correct month, will probably suffice, 
and even if you DID still have every single hour's snapshots a year 
later, how would you ever know which one to pick?  So while a day out, 
hourly snapshots are nice, a year out, they're just noise.

As a result, a typical automated snapshot thinning script, working with 
snapshots each hour to begin with, might look like this:

Keep two days of hourly snapshots: 48 hourly snapshots

After two days, delete five of six snapshots, leaving a snapshot every 6 
hours, four snapshots a day, for another 5 days: 4*5=20 6-hourly, 
20+48=68 total.

After a week, delete three of the four 6-hour snapshots, leaving daily 
snapshots, for 12 weeks (plus the week of more frequent snapshots above, 
13 weeks total): 7*12=84 daily snaps, 68+84=152 total.

After a quarter (13 weeks), delete six of seven daily snapshots, leaving 
weekly snapshots, for 3 more quarters plus the one above of more frequent 
snapshots, totaling a year: 3*13=39 weekly snaps, 152+39=191 total.

After a year, delete 12 of the 13 weekly snapshots, leaving one a 
quarter.  At 191 for the latest year plus one a quarter you can have 
several years worth of snapshots (well beyond the normal life of the 
storage media) and still be in the low 200s snapshots total, while 
keeping them reasonably easy to manage. =:^)
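
The arithmetic of that schedule can be checked with a few lines of 
Python.  The tier table and the helper total_after are just one way to 
write the numbers down, not part of any existing thinning script.

```python
# Tiers from the schedule above: (label, snapshots per day, days covered).
tiers = [
    ("hourly",   24, 2),        # two days of hourly snapshots
    ("6-hourly",  4, 5),        # the rest of the first week
    ("daily",     1, 7 * 12),   # 12 more weeks
]
weekly = 3 * 13                 # three more quarters of weekly snapshots

running = 0
for label, per_day, days in tiers:
    running += per_day * days
    print(label, per_day * days, running)
running += weekly
print("weekly", weekly, running)        # weekly 39 191

def total_after(years):
    """Snapshots kept after `years` full years: 191 plus four per older year."""
    return running + 4 * (years - 1)

print(total_after(5))                   # 207
```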


*But*, btrfs snapshots by themselves remain on the existing btrfs 
filesystem, and thus are subject to many of the same risks as the 
filesystem itself.  As you mentioned raid is redundancy not backup, 
snapshots aren't backup either; snapshots are multiple logical copies 
thus protecting you from accidental deletion or bad editing, but pointed 
at the same data blocks without redundancy, and if those data blocks or 
the entire physical media go bad...

Which is where real backups, separate copies on separate physical media, 
come in, which is where btrfs send/receive, as the ars-technica article 
was describing, comes in.

The idea is to make a read-only snapshot on the local filesystem, read-
only so it can't change while it's being sent, and then use btrfs send to 
send that snapshot to be stored on some other media, which can optionally 
be over the network to a machine and media at a different site, altho it 
can be to a different device on the same machine, as well.

The first time you do this, there's no existing copy at the other end, so 
btrfs send sends a full copy and btrfs receive writes it out.  After 
that, the receive side has a snapshot identical to the one created on the 
send side and further btrfs send/receives to the same set simply 
duplicate the differences between the reference and the new snapshot from 
the send end to the receive end.  As with local snapshots, old ones can 
be deleted on both the send and receive ends, as long as at least one 
common reference snapshot is maintained on both ends, so diffs taken 
against the send side reference can be applied to an appropriately 
identical receive side reference, thereby updating the receive side to 
match the new read-only snapshot on the send side.
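
A toy Python model of that exchange may help.  Here a snapshot is simply 
a dict mapping path to content, and the functions send and receive are 
illustrative stand-ins; this is not the btrfs send stream format, only 
the idea of shipping differences against a common reference.

```python
# Toy model of incremental send/receive; a snapshot is a dict of path -> content.

def send(new, parent=None):
    """Full stream if there is no parent, else only the differences."""
    parent = parent or {}
    changed = {p: c for p, c in new.items() if parent.get(p) != c}
    deleted = [p for p in parent if p not in new]
    return {"changed": changed, "deleted": deleted}

def receive(stream, parent=None):
    """Rebuild the new snapshot from the receiver's copy of the parent."""
    result = dict(parent or {})
    result.update(stream["changed"])
    for path in stream["deleted"]:
        del result[path]
    return result

snap1 = {"a.txt": "one", "b.txt": "two"}
snap2 = {"a.txt": "one", "b.txt": "TWO", "c.txt": "three"}

backup1 = receive(send(snap1))              # first run: full copy
stream = send(snap2, parent=snap1)          # later runs: differences only
backup2 = receive(stream, parent=backup1)

print(sorted(stream["changed"]))            # ['b.txt', 'c.txt']
print(backup2 == snap2)                     # True
```

On a real system the corresponding commands are btrfs subvolume snapshot 
-r, btrfs send (with -p naming the parent snapshot for an incremental 
stream) and btrfs receive; check the man pages of your btrfs-progs 
version before relying on them.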


Hopefully that's clearer now. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Understanding btrfs and backups
  2014-03-06 21:33 ` Duncan
@ 2014-03-07 10:13   ` Wolfgang Mader
  2014-03-09 15:46     ` Duncan
  2014-03-07 14:03   ` Eric Mesa
  2014-03-17  5:42   ` Understanding btrfs and backups => automatic snapshot script Marc MERLIN
  2 siblings, 1 reply; 25+ messages in thread
From: Wolfgang Mader @ 2014-03-07 10:13 UTC (permalink / raw)
  To: linux-btrfs

Duncan, thank you for this comprehensive post. Really helpful as always!

[...]

> As for restoring, since a snapshot is a copy of the filesystem as it
> existed at that point, and the method btrfs exposes for accessing them is
> to mount that specific snapshot, to restore an individual file from a
> snapshot, you simply mount the snapshot you want somewhere and copy the
> file as it existed in that snapshot over top of your current version
> (which will have presumably already been mounted elsewhere, before you
> mounted the snapshot to retrieve the file from), then unmount the
> snapshot and go about your day. =:^)

Please, how do I list mounted snapshots only?

[...]

> 
> Since a snapshot is an image of the filesystem as it was at that
> particular point in time, and btrfs by nature copies blocks elsewhere
> when they are modified, all (well, not "all" as there's metadata like
> file owner, permissions and group, too, but that's handled the same way)
> the snapshot does is map what blocks composed each file at the time the
> snapshot was taken.

Is it correct, that e.g. ownership is recorded separately from the data 
itself, so if I would change the owner of all my files, the respective 
snapshot would only store the old owner information?

[...]

> 
> The first time you do this, there's no existing copy at the other end, so
> btrfs send sends a full copy and btrfs receive writes it out.  After
> that, the receive side has a snapshot identical to the one created on the
> send side and further btrfs send/receives to the same set simply
> duplicate the differences between the reference and the new snapshot from
> the send end to the receive end.  As with local snapshots, old ones can
> be deleted on both the send and receive ends, as long as at least one
> common reference snapshot is maintained on both ends, so diffs taken
> against the send side reference can be applied to an appropriately
> identical receive side reference, thereby updating the receive side to
> match the new read-only snapshot on the send side.

Is the receiving side a complete file system in its own right? If so, I only 
need to maintain one common reference in order to apply the received snapshot, 
right? If I would in any way get the send and receive side out of sync, such 
that they do not share a common reference any more, only the send/receive 
would fail, but I still would have the complete filesystem on the receiving 
side, and could copy it all over (cp, rsync) to the send side in case of a 
disaster on the send side. Is this correct?

Thank you!
Best,
Wolfgang

-- 
Wolfgang Mader
Wolfgang.Mader@fdm.uni-freiburg.de
Telefon: +49 (761) 203-7710
Institute of Physics
Hermann-Herder Str. 3, 79104 Freiburg, Germany
Office: 207


* Re: Understanding btrfs and backups
  2014-03-06 21:33 ` Duncan
  2014-03-07 10:13   ` Wolfgang Mader
@ 2014-03-07 14:03   ` Eric Mesa
  2014-03-07 15:14     ` Sander
                       ` (2 more replies)
  2014-03-17  5:42   ` Understanding btrfs and backups => automatic snapshot script Marc MERLIN
  2 siblings, 3 replies; 25+ messages in thread
From: Eric Mesa @ 2014-03-07 14:03 UTC (permalink / raw)
  To: linux-btrfs

Duncan <1i5t5.duncan <at> cox.net> writes:
> *But*, btrfs snapshots by themselves remain on the existing btrfs 
> filesystem, and thus are subject to many of the same risks as the 
> filesystem itself.  As you mentioned raid is redundancy not backup, 
> snapshots aren't backup either; snapshots are multiple logical copies 
> thus protecting you from accidental deletion or bad editing, but pointed 
> at the same data blocks without redundancy, and if those data blocks or 
> the entire physical media go bad...
> 
> Which is where real backups, separate copies on separate physical media, 
> come in, which is where btrfs send/receive, as the ars-technica article 
> was describing, comes in.
> 
> The idea is to make a read-only snapshot on the local filesystem, read-
> only so it can't change while it's being sent, and then use btrfs send to 
> send that snapshot to be stored on some other media, which can optionally 
> be over the network to a machine and media at a different site, altho it 
> can be to a different device on the same machine, as well.
> 
> The first time you do this, there's no existing copy at the other end, so 
> btrfs send sends a full copy and btrfs receive writes it out.  After 
> that, the receive side has a snapshot identical to the one created on the 
> send side and further btrfs send/receives to the same set simply 
> duplicate the differences between the reference and the new snapshot from 
> the send end to the receive end.  As with local snapshots, old ones can 
> be deleted on both the send and receive ends, as long as at least one 
> common reference snapshot is maintained on both ends, so diffs taken 
> against the send side reference can be applied to an appropriately 
> identical receive side reference, thereby updating the receive side to 
> match the new read-only snapshot on the send side.
> 
> Hopefully that's clearer now. =:^)
> 


Duncan - thanks for this comprehensive explanation. For a huge portion of
your reply...I was all wondering why you and others were saying snapshots
aren't backups. They certainly SEEMED like backups. But now I see that the
problem is one of precise terminology vs colloquialisms. In other words,
snapshots are not backups in and of themselves. They are like Mac's Time
Machine. BUT if you take these snapshots and then put them on another media
- whether that's local or not - THEN you have backups. Am I right, or am I
still missing something subtle? 

I think the most important thing you said was at the end and I'd like a
little clarification on that if it's OK with you. 

"As with local snapshots, old ones can 
> be deleted on both the send and receive ends, as long as at least one 
> common reference snapshot is maintained on both ends, so diffs taken 
> against the send side reference can be applied to an appropriately 
> identical receive side reference, thereby updating the receive side to 
> match the new read-only snapshot on the send side."

So, let's say I have everything set up. This means I created the read-only
shot on my home btrfs volume and sent it to the backup drive. I'm making
hourly snapshots and after each snapshot is made, it's sent to the backup
drive. So, obviously the backup drive needs to be at least as big as the
home drive so it can store what's on home plus the snapshot-diffs. Now let's
be extreme and say that in the course of a year I touch and somehow change
every single file on the home drive. That means if I only had one snapshot
I'd need home drive x 2 space. (for used space, not unused space, naturally)
So I might want my backups to have last year's data, but wouldn't want to
need to upgrade the size of my actual home drive. So I would want to
maintain fewer snapshots on my home drive than my backup drive. (It's
possible I'm missing something here...something subtle that makes this not
necessary) So do I only need to make sure I have the latest snapshot or
maybe latest plus n-1 on the home drive while the backup drive can have all
snapshots since the beginning? I THINK that can be the case based on reading
your sentence, but I just want to make sure. 

In case you were wondering, this is based on what's happened to me with Back
in Time. I had to reduce the number of backups I was keeping because my home
drive wasn't at 100%, but the backup drive was at 100% because I'd added and
deleted some VMs and other large files (video files I think). And Back in
Time intelligently does not remove the oldest backup off the top until it
knows it has made a new backup - which it couldn't do because it was at
100%. So I had to delete the top 1 or 2 backups and then tell it to keep
fewer backups. Your description of snapshots makes it seem much less likely
that this would be an issue. Although Back in Time is an incremental backup,
it takes up more space. If I may venture to see if I've learned something
from your response, is it because when I change a file Back in Time stores
the entire changed file while btrfs only stores the bits that have changed?
Also, does it matter if the file is binary or text? If I'm editing metadata
on an mp3 file is the resulting snapshot the entire mp3 or just what's
changed? (vs how it would work with a text file)



* Re: Understanding btrfs and backups
  2014-03-07 14:03   ` Eric Mesa
@ 2014-03-07 15:14     ` Sander
  2014-03-09  4:13       ` Chris Samuel
  2014-03-09 16:40     ` Duncan
  2014-03-13 17:12     ` Understanding btrfs and backups Chris Murphy
  2 siblings, 1 reply; 25+ messages in thread
From: Sander @ 2014-03-07 15:14 UTC (permalink / raw)
  To: Eric Mesa; +Cc: linux-btrfs

Eric Mesa wrote (ao):
> Duncan - thanks for this comprehensive explanation. For a huge portion of
> your reply...I was all wondering why you and others were saying snapshots
> aren't backups. They certainly SEEMED like backups. But now I see that the
> problem is one of precise terminology vs colloquialisms. In other words,
> snapshots are not backups in and of themselves. They are like Mac's Time
> Machine. BUT if you take these snapshots and then put them on another media
> - whether that's local or not - THEN you have backups. Am I right, or am I
> still missing something subtle? 

Snapshots are backups, but only protect you against a limited number of
disasters. Snapshots are very convenient to quickly go back in time for
some or all files and directories. But if the filesystem or underlying
disk goes up in flames, the snapshots are toast as well. So you need
additional backups, preferably not on the same hardware, for real
protection against data loss.

The convenience of snapshots is that you can (almost) make them as often
as you want, fully automated, with (almost) no impact on performance,
without the need for extra hardware, and a restore is no more than a
simple copy.

	Sander


* Re: Understanding btrfs and backups
  2014-03-07 15:14     ` Sander
@ 2014-03-09  4:13       ` Chris Samuel
  2014-03-09 15:30         ` Duncan
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Samuel @ 2014-03-09  4:13 UTC (permalink / raw)
  To: linux-btrfs

On Fri, 7 Mar 2014 04:14:16 PM Sander wrote:

> But if the filesystem or underlying disk goes up in flames, the
> snapshots are toast as well. So you need additional backups,
> preferably not on the same hardware, for real protection against
> data loss.

...and don't forget to think about off-site backups too.

http://www.flickr.com/photos/94482242@N00/7746409996/

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC




* Re: Understanding btrfs and backups
  2014-03-09  4:13       ` Chris Samuel
@ 2014-03-09 15:30         ` Duncan
  2014-03-13  8:18           ` Chris Samuel
  0 siblings, 1 reply; 25+ messages in thread
From: Duncan @ 2014-03-09 15:30 UTC (permalink / raw)
  To: linux-btrfs

Chris Samuel posted on Sun, 09 Mar 2014 15:13:42 +1100 as excerpted:

> On Fri, 7 Mar 2014 04:14:16 PM Sander wrote:
> 
>> But if the filesystem or underlying disk goes up in flames, the
>> snapshots are toast as well. So you need additional backups, preferably
>> not on the same hardware, for real protection against data loss.
> 
> ...and don't forget to think about off-site backups too.
> 
> http://www.flickr.com/photos/94482242@N00/7746409996/

I realize that was in reference to the "up in flames" comment, and 
presumably if there's a need to worry about that, offsite backup /is/ of 
some value.  But for some people, offsite backup really isn't that valuable.

I figure if something like that happens here, I'll have FAR more pressing 
things to worry about for awhile than restoring my computer.  And by the 
time life does get somewhat back to normal and I can think about the data 
that was on the computer, I might as well do over from scratch, like I 
will have done with much of the rest of my life by that point.  The real 
valuable data is backed up where it counts -- to my head -- and if I lose 
that, well, I won't be very worried about it any more, will I?

Of course if I were a bush doctor like the guy who owned the computer in 
that photo apparently was, then there'd be other people's medical records 
and the like to worry about too, and having offsite backups of that 
/would/ be important!

And of course the same would apply if I had a bunch of family pictures on 
the computer to worry about, but for that I'd need a family first...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Understanding btrfs and backups
  2014-03-07 10:13   ` Wolfgang Mader
@ 2014-03-09 15:46     ` Duncan
  0 siblings, 0 replies; 25+ messages in thread
From: Duncan @ 2014-03-09 15:46 UTC (permalink / raw)
  To: linux-btrfs

Wolfgang Mader posted on Fri, 07 Mar 2014 11:13:51 +0100 as excerpted:

> Duncan, thank you for this comprehensive post. Really helpful as always!
> 
> [...]
> 
>> As for restoring, since a snapshot is a copy of the filesystem as it
>> existed at that point, and the method btrfs exposes for accessing them
>> is to mount that specific snapshot, to restore an individual file from
>> a snapshot, you simply mount the snapshot you want somewhere and copy
>> the file as it existed in that snapshot over top of your current
>> version
> 
> Please, how do I list mounted snapshots only?
> 
> [...]

I personally don't use snapshots a whole lot (tho I like the concept) as 
they don't really fit my use-case.  So in general I won't try to answer 
usage-detail questions such as that.

That said, see the "Managing snapshots" section on the sysadmin guide 
page on the wiki, for some general snapshot management hints.

https://btrfs.wiki.kernel.org/index.php/SysadminGuide#Managing_snapshots

The main point from there is to leave the top level of the filesystem 
empty but for the subvolumes/snapshots (see the tree diagrams) and to set 
a default subvolume that will be your normal subvolume-mount if you don't 
specify one.  Then you can mount the root subvolume (subvolid=0, see the 
fstab line for /media/btrfs) when you want to manage snapshots.

But the example there is full snapshot rollback.  To restore an 
individual file instead of that, you'd just mount the root subvolume and 
the snapshots would all appear as subdirs, such that you could browse 
them as you would a normal filesystem, diving into the snapshot and its 
subdirs until you find the file you want to restore, and then copying it 
over to the working copy/snapshot.
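
Browsing for the right version can also be scripted.  The sketch below 
assumes the snapshots show up as sibling subdirectories of one directory, 
as in the layout described above; the function name, the snapshot names 
and the file path are hypothetical, and temporary directories stand in 
for the mounted root subvolume.

```python
import tempfile
from pathlib import Path

def versions_in_snapshots(snapshots_dir, relative_path):
    """List (snapshot name, path) for every snapshot that still has the file."""
    found = []
    for snap in sorted(Path(snapshots_dir).iterdir()):
        candidate = snap / relative_path
        if candidate.is_file():
            found.append((snap.name, candidate))
    return found

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    layout = [("2014-03-01", True), ("2014-03-05", True), ("2014-03-08", False)]
    for name, has_file in layout:
        (root / name / "docs").mkdir(parents=True)
        if has_file:
            (root / name / "docs" / "notes.txt").write_text(name)
    names = [snap for snap, _ in versions_in_snapshots(root, "docs/notes.txt")]
    print(names)   # ['2014-03-01', '2014-03-05']
```

The newest snapshot that still lists the file is usually the one to copy 
from.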

That doesn't directly answer how to list mounted snapshots only, but 
given the above tree layout, I don't really see that you'd /need/ to list 
mounted snapshots only, since presumably you'd have only the default 
mounted, plus the root subvolume, where you could browse into all the 
snapshots just as if they were normal directories.

Also see the subvolumes and snapshots section of the FAQ:

https://btrfs.wiki.kernel.org/index.php/FAQ#Subvolumes

>> Since a snapshot is an image of the filesystem as it was at that
>> particular point in time, and btrfs by nature copies blocks elsewhere
>> when they are modified, all (well, not "all" as there's metadata like
>> file owner, permissions and group, too, but that's handled the same
>> way) the snapshot does is map what blocks composed each file at the
>> time the snapshot was taken.
> 
> Is it correct, that e.g. ownership is recorded separately from the data
> itself, so if I would change the owner of all my files, the respective
> snapshot would only store the old owner information?

Yes.  If you change the owner of the files in your "current" subvolume, 
the previous snapshots will retain their old ownership.  Owner/
permissions/etc are metadata, stored separately from the actual data, 
with both data and metadata being snapshotted.


[ on btrfs send/receive ]
> 
> Is the receiving side a complete file system in its own right?

Normally, yes.

However, send normally serializes its output to STDOUT and that output 
can be sent to a specific file on some other filesystem (like ext4), or 
to tape or whatever, instead.  In this case you can read back from that 
file using cat (or netcat if it's over the network, or whatever), 
directing its output to btrfs receive, to turn that data back into a 
filesystem.  Used like this, you can think of the original send as a full 
backup (to tape or whatever), and child sends as incremental backups.  
Obviously, if stored in this form, in order to restore the incrementals 
you'd need the full backup they were based upon, just as you would if 
doing the same thing using conventional backup to tape or whatever.
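
A rough sketch of that file-based variant, under the assumption that the 
stream files live on some other filesystem.  The function names and the 
BTRFS override variable are invented for illustration; only "btrfs send", 
"btrfs send -p" and "btrfs receive" are the real commands:

```shell
#!/bin/sh
# BTRFS is overridable so the functions can be tried against a stub.
BTRFS=${BTRFS:-btrfs}

# Full "backup": serialize read-only snapshot $1 into stream file $2.
send_full() {
    "$BTRFS" send "$1" > "$2"
}

# Incremental: only the difference between parent snapshot $1 and
# snapshot $2 goes into stream file $3.
send_incremental() {
    "$BTRFS" send -p "$1" "$2" > "$3"
}

# Restore: replay stream file $1 into the btrfs directory $2.
# The full stream must be replayed before any incremental based on it.
receive_stream() {
    cat "$1" | "$BTRFS" receive "$2"
}
```

The stream files are only useful together: an incremental file on its own 
cannot be received unless the snapshot it was based on already exists on 
the receiving filesystem.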

> If so, I only need to maintain one common reference in order to apply
> the received snapshot, right. If I would in any way get the send and
> receive side out of sync, such that they do not share a common
> reference any more, only the send/receive would fail, but I still would
> have the complete filesystem on the receiving side, and could copy it
> all over (cp, rsync) to the send side in case of a disaster on the
> send side. Is this correct?

In the normal case (not stored as a file or serialized data stream as 
described above), yes.

Meanwhile, given that we're talking of btrfs send/receive in the context 
of backups, it's worth explicitly making note of the current on-list 
reports and bugfixes in the area of send/receive.  In general, we're talking 
about an in-principle feature that should eventually be reliable enough 
to use as backup in the way discussed.  However, at present, if it's data 
you'd really miss were it to disappear, please back it up using another 
method (say rsync or conventional backups) as well.  To my knowledge, if 
the send and receive both occur without error, it should be a faithful 
copy of the data just as reliable as the original, but there are still 
corner-cases that are erroring out, and I'd definitely hate to actually 
need a current backup shortly after my send/receive started triggering 
errors due to some corner-case but before I had set up an alternative, 
such that I didn't have a current backup available!

IOW, yes, set it up and test it, but if we're talking about backups that 
you're actually going to be relying on right now, not something you're 
testing now in order to have the setup and experience for when you 
might rely on it say a year from now, I strongly recommend that you 
choose something with a bit more proven reliability than btrfs
send/receive at this point.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Understanding btrfs and backups
  2014-03-07 14:03   ` Eric Mesa
  2014-03-07 15:14     ` Sander
@ 2014-03-09 16:40     ` Duncan
  2014-03-11  0:39       ` Testing BTRFS Lists
  2014-03-13 17:12     ` Understanding btrfs and backups Chris Murphy
  2 siblings, 1 reply; 25+ messages in thread
From: Duncan @ 2014-03-09 16:40 UTC (permalink / raw)
  To: linux-btrfs

Eric Mesa posted on Fri, 07 Mar 2014 14:03:44 +0000 as excerpted:

> Duncan - thanks for this comprehensive explanation. For a huge portion
> of your reply...I was all wondering why you and others were saying
> snapshots aren't backups. They certainly SEEMED like backups. But now I
> see that the problem is one of precise terminology vs colloquialisms. In
> other words, snapshots are not backups in and of themselves. They are
> like Mac's Time Machine. BUT if you take these snapshots and then put
> them on another media - whether that's local or not - THEN you have
> backups. Am I right, or am I still missing something subtle?

You got it. =:^)

Tho as I just mentioned in a reply on a different subthread, it's worth 
noting that btrfs send/receive is still a bit buggy at present and is 
giving people with corner-cases some errors.  To my knowledge, if both 
the send and receive sides complete without error, it's a perfectly 
reliable backup.  The problem is, they aren't always completing without 
errors at present, and I'd hate to have to actually need a current backup 
shortly after those send/receives started triggering errors, before I had 
a chance to put a different solution in place.  So at this point I'd 
recommend having that other solution in place from the beginning, just in 
case.

IOW, it's fine to play with send/receive right now, but don't depend on 
it with your life, or the life of your data!  In a year or even six 
months, hopefully those bugs should be worked out and it'll be as 
reliable as the sunrise, but I wouldn't count on that for my own data 
ATM, and I'd recommend you don't either.

Tho as I said, to the best of my knowledge, if both sides complete 
without error, it's as reliable as btrfs itself is ATM.  (Tho while 
kernel 3.13 did tone down the "might-eat-your-babies" warning on the 
kernel's btrfs config option, it's still what I'd classify as "semi-
stable", so keep those backups updated and tested, and run current 
kernels since older kernels do still mean known bugs that are fixed in 
current!)

> I think the most important thing you said was at the end and I'd like a
> little clarification on that if it's OK with you.
> 
> "As with local snapshots, old ones can
>> be deleted on both the send and receive ends, as long as at least one
>> common reference snapshot is maintained on both ends, so diffs taken
>> against the send side reference can be applied to an appropriately
>> identical receive side reference, thereby updating the receive side to
>> match the new read-only snapshot on the send side."
> 
> So, let's say I have everything set up. This means I created the
> read-only shot on my home btrfs volume and sent it to the backup drive.
> I'm making hourly snapshots and after each snapshot is made, it's sent
> to the backup drive. So, obviously the backup drive needs to be at least
> as big as the home drive so it can store what's on home plus the
> snapshot-diffs. Now let's be extreme and say that in the course of a
> year I touch and somehow change every single file on the home drive.
> That means if I only had one snapshot I'd need home drive x 2 space.
> (for used space, not unused space, naturally)

Well, not strictly as you said.  If you changed every BLOCK of every file 
over that year, THEN you'd need 2X the space.  But if a lot of those 
files are say half-gig-plus ISOs and you only changed say one word of one 
file on each ISO, then no, it wouldn't be the whole files changed, only a 
single individual (btrfs size, 4 KiB AFAIK) block within the file, and 4 
KiB out of half a gig is under 1/10 of 1 percent, so you wouldn't need 2X 
the space in a scenario like that.
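
To put numbers on that, here is the arithmetic as a small sketch.  The 
512 MiB file size is an assumption standing in for "half a gig", and the 
4 KiB block size is the AFAIK figure from above:

```shell
#!/bin/sh
# Assumed sizes: a 512 MiB ISO and a 4 KiB btrfs block.
file_bytes=$(( 512 * 1024 * 1024 ))
block_bytes=$(( 4 * 1024 ))

# Number of blocks in the file; changing one dirties 1/blocks of it.
blocks=$(( file_bytes / block_bytes ))
echo "blocks per file: $blocks"

# 1/10 of 1 percent is 1/1000, so the changed share is smaller than
# that whenever the file has more than 1000 blocks.
if [ "$blocks" -gt 1000 ]; then
    echo "one changed block is under 0.1% of the file"
fi
```

With these sizes the file has 131072 blocks, so a single rewritten block 
is a tiny fraction of what a whole-file copy would cost.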

> So I might want my backups to have last's year's data, but wouldn't want
> to need to upgrade the size of my actual home drive. So I would want to
> maintain less snapshots on my home drive than my backup drive. (It's
> possible I'm missing something here...something subtle that makes this
> not necessary) So do I only need to make sure I have the latest snapshot
> or maybe latest plus n-1 on the home drive while the backup drive can
> have all snapshots since the beginning? I THINK that can be the case
> based on reading your sentence, but I just want to make sure.

In general, yes.  Tho if you're doing hourly snapshots I'd probably keep 
say a day's worth locally, plus one a day for a week, and 1 weekly 
snapshot before that, just to cover the case of my needing to recover 
a backup and finding that the remote backup just keeled over 12 hours 
ago.  Unless you're writing/erasing heavily, snapshots take up very 
nearly zero space, so keeping a few extra around isn't going to hurt a 
whole lot.
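
One hourly round of that could look roughly like the sketch below.  The 
function name, argument order and BTRFS override variable are made up for 
illustration, and given the send/receive caveats above it should be 
treated as something to test with, not to rely on:

```shell
#!/bin/sh
BTRFS=${BTRFS:-btrfs}

# $1 = subvolume to back up      $2 = local snapshot directory
# $3 = name of the previous snapshot (present on BOTH sides)
# $4 = name for the new snapshot $5 = backup filesystem mount point
incremental_backup() {
    src=$1; snapdir=$2; prev=$3; new=$4; backup=$5
    # send needs a read-only snapshot, hence -r.
    "$BTRFS" subvolume snapshot -r "$src" "$snapdir/$new" || return 1
    # Only the difference against the common reference is transferred.
    # Note: without "set -o pipefail" (bash) a failing send is only
    # noticed if receive fails as well.
    "$BTRFS" send -p "$snapdir/$prev" "$snapdir/$new" |
        "$BTRFS" receive "$backup"
}
```

After a successful run the new snapshot becomes the common reference for 
the next one, so older local snapshots can be removed with "btrfs 
subvolume delete" while the backup side keeps its copies.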

Meanwhile, however, I'd suggest a reasonable thinning down script on the 
remote backup as well, because at least at present, there are overhead 
issues once you get over several hundred snapshots.  But realistically, 
if you decide you need a file 11 months old, are you really going to care 
or even know exactly what hour it was, eleven months ago?  Not very 
likely.  It's far more likely that all those hundreds of snapshots will 
just be getting in your way, and it's unlikely to matter eleven months 
out whether you even get the precise /week/, vs. the one before or after 
that.

So I'd recommend something like thinning down the hourly snapshots to say 
one every six hours after a couple days, perhaps one a day after a week, 
one a week after a quarter (13 weeks), and one a quarter after a year.  
That keeps things manageable should you actually need to go back a year, 
so you're not sorting thru thousands (24*365=8760) of hourly snapshots 
just to pick one at random from sometime in the correct week after all.  
With a bit of reasonable thinning, you can keep that to well under 500 
(I've counted them up in examples a few times and come up with 200-300) 
without much trouble, making it a LOT easier to actually FIND a useful 
snapshot without all the extra noise, if you actually NEED to.
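
For the curious, here is one way to count that up.  The bucket boundaries 
are just one reading of the schedule above (hourly for 2 days, six-hourly 
to the end of the first week, daily to 13 weeks, weekly to 52 weeks), and 
the number of extra years kept is an arbitrary assumption:

```shell
#!/bin/sh
# Bucket boundaries below are one reading of the schedule, not a standard.
hourly=$(( 2 * 24 ))             # every hour for the first 2 days
six_hourly=$(( (7 - 2) * 4 ))    # 4 per day from day 2 to day 7
daily=$(( 13 * 7 - 7 ))          # 1 per day from week 1 to week 13
weekly=$(( 52 - 13 ))            # 1 per week from week 13 to week 52
first_year=$(( hourly + six_hourly + daily + weekly ))
echo "thinned snapshots covering the first year: $first_year"

# Each further year kept adds only 4 quarterly snapshots.
extra_years=3
total=$(( first_year + 4 * extra_years ))
echo "with $extra_years more years of quarterlies: $total"
echo "unthinned hourly snapshots per year: $(( 24 * 365 ))"
```

That comes to 191 snapshots for the first year and 203 with three more 
years of quarterlies, against 8760 unthinned hourlies per year.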

> If I may venture to see if I've learned
> something from your response, is it because when I change a file Back in
> Time stores the entire changed file while btrfs only stores the bits
> that have changed? Also, does it matter if the file is binary or text?
> If I'm editing metadata on an mp3 file is the resulting snapshot the
> entire mp3 or just what's changed? (vs how it would work with a text
> file)

I think I covered that above. =:^)  If you're not using btrfs 
compression, text or incompressible binary shouldn't matter, and as I 
said, I believe the block size is 4 KiB (on x86 and amd64, it's actually 
the kernel memory page size, which differs on some archs).

Come to think of it, tho, in a heavy-snapshot scenario, compression may 
actually use more space, since I believe compression blocks are 128 KiB.  
But I'm far from sure on how compression and snapshots interact and what 
that would do in practice, size-wise.  Hopefully a dev or someone else 
with more information on that particular aspect will step in with 
accurate information there, either confirming or dispelling my thought.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Testing BTRFS
  2014-03-09 16:40     ` Duncan
@ 2014-03-11  0:39       ` Lists
  2014-03-11  1:02         ` Avi Miller
  2014-03-11 13:33         ` Josef Bacik
  0 siblings, 2 replies; 25+ messages in thread
From: Lists @ 2014-03-11  0:39 UTC (permalink / raw)
  To: linux-btrfs

I'd like to begin testing BTRFS. We'd probably begin roll out in 6 
months to a year if testing goes well.

We're currently using CentOS6/64 everywhere, are aware of BTRFS being a 
"Technology preview" in RHEL 7beta and would like to begin 
production-level load testing. We generate about 10 GB of distinct data 
daily that is stored redundantly by default on a combination of ZFS and 
Ext4.

Is there a "recommended way" to do this? Is it anywhere as easy as 
ZFSonLinux yum install?

-Ben

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Testing BTRFS
  2014-03-11  0:39       ` Testing BTRFS Lists
@ 2014-03-11  1:02         ` Avi Miller
  2014-03-11 19:08           ` Eric Sandeen
  2014-03-13 18:10           ` Testing BTRFS Lists
  2014-03-11 13:33         ` Josef Bacik
  1 sibling, 2 replies; 25+ messages in thread
From: Avi Miller @ 2014-03-11  1:02 UTC (permalink / raw)
  To: Lists; +Cc: linux-btrfs


On 11 Mar 2014, at 11:39 am, Lists <lists@benjamindsmith.com> wrote:

> Is there a "recommended way" to do this? Is it anywhere as easy as ZFSonLinux yum install?

Oracle Linux 6 with the Unbreakable Enterprise Kernel Release 2 or Release 3 has production-ready btrfs support. You can even convert your existing CentOS6 boxes across to Oracle Linux 6 in-place without reinstalling:

http://linux.oracle.com/switch/centos/

Oracle also now provides all errata, including security and bug fixes for free at http://public-yum.oracle.com and our kernel source code can be found at https://oss.oracle.com/git/

Cheers,
Avi

--
Oracle <http://www.oracle.com>
Avi Miller | Product Management Director | +61 (3) 8616 3496
Oracle Linux and Virtualization
417 St Kilda Road, Melbourne, Victoria 3004 Australia


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Testing BTRFS
  2014-03-11  0:39       ` Testing BTRFS Lists
  2014-03-11  1:02         ` Avi Miller
@ 2014-03-11 13:33         ` Josef Bacik
  1 sibling, 0 replies; 25+ messages in thread
From: Josef Bacik @ 2014-03-11 13:33 UTC (permalink / raw)
  To: Lists, linux-btrfs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/10/2014 08:39 PM, Lists wrote:
> I'd like to begin testing BTRFS. We'd probably begin roll out in 6 
> months to a year if testing goes well.
> 
> We're currently using CentOS6/64 everywhere, are aware of BTRFS
> being a "Technology preview" in RHEL 7beta and would like to begin
> testing production-level load testing. We generate about 10 GB of
> distinct data daily that is stored redundantly by default on a
> combination of ZFS and Ext4.
> 
> Is there a "recommended way" to do this? Is it anywhere as easy as 
> ZFSonLinux yum install?
> 
> 

There is way too much churn for any "enterprise" distro to be able to
keep up with bugfixes and stuff.  You are best off rolling your own
kernel based on the stable series if you want to think about using
btrfs in production.  Thanks,

Josef

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJTHxCtAAoJEANb+wAKly3BdJ8QAK5hYoNtJT/UEkpKakpNoXfV
q6lg2NVPT6EeHzcMhTRS+VTOJ/bjvfwX0qDxRkjvo73F+nkQYcrO78cEMvPqtwTq
HKxrGMibCtt5PlzzcbKqSc1VGIDFEkD2z7fr5y2n4V+E5x0EPCFxU6VOjXgqXyEZ
8tKW24oxLAwbWBvyiaKrB/gWm47Aw6p2pVWgWrqMjMFUaNQBoisAU+1Ezn0Xjg6w
4wazfGqUkUZ3pMcZr5IMQ9X+p+FUid7JWcdNwPjIsPMQhP7mkIK0Mq8eDu6ijVv2
nI52pZuYaZs3+7OlkEoHRssnAwIWUwUq9UQwRjl4WK8FrpgdyYe0n2zlZIWGinvF
qZRMmB5PtM+SYT9Wt5OPAgZxb/ivc9Vz7ACG4edNSBqZ1D7+52aazT4JY0fqWGGU
8vapdKUmyXPQT9MphvHUEqnJtA/K9ek8Frt+f304KCcl/0IEESAoo3InlS7Hw45D
ANEO4ZCwaUp/WjhqvwuvhYrqn8ENsbCm31RYAvAGEOoROzwXEnbl/Nv4DaKa+Q7b
I6uSpyS60cNA2wmKm3wzFGpvSkP8PMzA1zSepK/yJ9p3PxUdxUpY1OqYc8y7gqOf
+ACNUuNMbNwAhMb9udEZzuBZojX3/vVPlqOWLPYDr3fVCrDIwuSKtloao+czkrpo
sxJbe80q3rqtw+p0pStO
=n1mD
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Testing BTRFS
  2014-03-11  1:02         ` Avi Miller
@ 2014-03-11 19:08           ` Eric Sandeen
  2014-03-11 20:30             ` Avi Miller
  2014-03-12 11:15             ` xfstests btrfs/035 (was Re: Testing BTRFS) David Disseldorp
  2014-03-13 18:10           ` Testing BTRFS Lists
  1 sibling, 2 replies; 25+ messages in thread
From: Eric Sandeen @ 2014-03-11 19:08 UTC (permalink / raw)
  To: Avi Miller, Lists; +Cc: linux-btrfs

On 3/10/14, 8:02 PM, Avi Miller wrote:
> 
> On 11 Mar 2014, at 11:39 am, Lists <lists@benjamindsmith.com> wrote:
> 
>> Is there a "recommended way" to do this? Is it anywhere as easy as
>> ZFSonLinux yum install?
> 
> Oracle Linux 6 with the Unbreakable Enterprise Kernel Release 2 or
> Release 3 has production-ready btrfs support. You can even convert
> your existing CentOS6 boxes across to Oracle Linux 6 in-place without
> reinstalling:
> 
> http://linux.oracle.com/switch/centos/

If we're plugging distros... I can also tell you that you can install
upcoming RHEL7 on btrfs if you like, and it has a very up-to-date
btrfs codebase.  Of course Fedora and other "non-enterprise" distros
have btrfs support as well.

But we're keeping it "tech preview" in RHEL7 for now, because in our
testing, it does not yet reach the level of reliability that we
wish to provide to our customers.

Indeed, testing 3.8.13-26.2.1.el6uek.x86_64 (which is, I believe,
the kernel which Avi referred to) via xfstests, I saw
failures on btrfs/009 and btrfs/022; then the box deadlocked
on btrfs/024.  I rebooted & resumed, then deadlocked on btrfs/030.
Rebooted and resumed again, then panicked on btrfs/035.  At that
point I stopped.

Ben, the best advice I have for you is to test *your* workload
on btrfs with whatever qualification tests you have, and see how
things fare.  If you want to know the current state of btrfs,
test the upstream code as best you can; if you hope to deploy
on a distribution with a longer support window, test on that
distribution.

But I agree with Josef that for now, the fixes and changes are
still flying fast & furious, and except in limited use cases,
btrfs is not yet ready for general commercial deployment.

-Eric

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Testing BTRFS
  2014-03-11 19:08           ` Eric Sandeen
@ 2014-03-11 20:30             ` Avi Miller
  2014-03-12 11:15             ` xfstests btrfs/035 (was Re: Testing BTRFS) David Disseldorp
  1 sibling, 0 replies; 25+ messages in thread
From: Avi Miller @ 2014-03-11 20:30 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Lists, linux-btrfs

Hey,

On 12 Mar 2014, at 6:08 am, Eric Sandeen <sandeen@redhat.com> wrote:

> If we're plugging distros... I can also tell you that you can install
> upcoming RHEL7 on btrfs if you like, and it has a very up-to-date
> btrfs codebase.

Ditto for OL7, for obvious reasons. :)

> Indeed, testing 3.8.13-26.2.1.el6uek.x86_64 (which is, I believe,
> the kernel which Avi referred to) via xfstests, I saw
> failures on btrfs/009 and btrfs/022; then the box deadlocked
> on btrfs/024.  I rebooted & resumed, then deadlocked on btrfs/030.
> Rebooted and resumed again, then panicked on btrfs/035.  At that
> point I stopped.

We have a bunch of btrfs fixes queued for UEK3-QU2 which is in alpha build internally at the moment. We do run the full xfstests against our UEK3 releases and are working with Liu Bo to backport fixes from mainline which should resolve some (hopefully all) of the failing xfstests. It’s also worth ensuring that you’re upgrading the userspace btrfs-progs package that ships with the updated UEK3 kernels, if applicable.

> Ben, the best advice I have for you is to test *your* workload
> on btrfs with whatever qualification tests you have, and see how
> things fare.  If you want to know the current state of btrfs,
> test the upstream code as best you can; if you hope to deploy
> on a distribution with a longer support window, test on that
> distribution.

Agreed.

> But I agree with Josef that for now, the fixes and changes are
> still flying fast & furious, and except in limited use cases,
> btrfs is not yet ready for general commercial deployment.

Obviously, we disagree (somewhat) here. We’re happy with the status of btrfs functionality in UEK3 to provide limited production support, but this is just from the Oracle Linux team. The other product teams within Oracle (RDBMS, Java, middleware, etc) obviously have to do their own validation and testing and are responsible for their own support. As above, I agree with Eric that you should test your own workloads and requirements and make your own judgement call.

Cheers,
Avi

--
Oracle <http://www.oracle.com>
Avi Miller | Product Management Director | +61 (3) 8616 3496
Oracle Linux and Virtualization
417 St Kilda Road, Melbourne, Victoria 3004 Australia


^ permalink raw reply	[flat|nested] 25+ messages in thread

* xfstests btrfs/035 (was Re: Testing BTRFS)
  2014-03-11 19:08           ` Eric Sandeen
  2014-03-11 20:30             ` Avi Miller
@ 2014-03-12 11:15             ` David Disseldorp
  1 sibling, 0 replies; 25+ messages in thread
From: David Disseldorp @ 2014-03-12 11:15 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Avi Miller, Lists, linux-btrfs

Hi Eric,

On Tue, 11 Mar 2014 14:08:02 -0500, Eric Sandeen wrote:

> Indeed, testing 3.8.13-26.2.1.el6uek.x86_64 (which is, I believe,
> the kernel which Avi referred to) via xfstests, I saw
> failures on btrfs/009 and btrfs/022; then the box deadlocked
> on btrfs/024.  I rebooted & resumed, then deadlocked on btrfs/030.
> Rebooted and resumed again, then panicked on btrfs/035.  At that
> point I stopped.

FWIW, Liu Bo recently proposed a fix for the btrfs/035 BUG_ON():

http://permalink.gmane.org/gmane.comp.file-systems.btrfs/33327

Cheers, David

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Understanding btrfs and backups
  2014-03-09 15:30         ` Duncan
@ 2014-03-13  8:18           ` Chris Samuel
  0 siblings, 0 replies; 25+ messages in thread
From: Chris Samuel @ 2014-03-13  8:18 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 637 bytes --]

On Sun, 9 Mar 2014 03:30:44 PM Duncan wrote:

> While I realize that was in reference to the "up in flames" comment and 
> presumably if there's a need to worry about that, offsite backup /is/ of 
> some value, for some people, offsite backup really isn't that valuable.

Actually I missed that comment altogether, it was really just an illustration 
of why people should think about it - and then come to a decision about 
whether or not it makes sense for them.

In your case maybe not, but for me (and my wife) it certainly does.

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC


[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 482 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Understanding btrfs and backups
  2014-03-07 14:03   ` Eric Mesa
  2014-03-07 15:14     ` Sander
  2014-03-09 16:40     ` Duncan
@ 2014-03-13 17:12     ` Chris Murphy
  2 siblings, 0 replies; 25+ messages in thread
From: Chris Murphy @ 2014-03-13 17:12 UTC (permalink / raw)
  To: Btrfs BTRFS


On Mar 7, 2014, at 7:03 AM, Eric Mesa <ericsbinaryworld@gmail.com> wrote:
> 
> Duncan - thanks for this comprehensive explanation. For a huge portion of
> your reply...I was all wondering why you and others were saying snapshots
> aren't backups. They certainly SEEMED like backups. But now I see that the
> problem is one of precise terminology vs colloquialisms. In other words,
> snapshots are not backups in and of themselves. They are like Mac's Time
> Machine. BUT if you take these snapshots and then put them on another media
> - whether that's local or not - THEN you have backups. Am I right, or am I
> still missing something subtle?

Hmm, yes, you're still missing something, because snapshots on a mirrored drive are on another medium, but that's still not considered a backup. I think what makes a backup is a separate device and a separate file system. That's because the top vectors for data loss are: user induced, device failure, and file system corruption. These are substantially mitigated by having backup files located on both a separate file system and a separate device.

Also, Time Machine qualifies as a backup because it copies files to a separate device with a separate file system. (There is a feature in recent OS X versions that store hourly incremental backups on the local drive when the usual target device isn't available - these are arguably not backups but rather snapshots that are pending backups. Once the target device is available, the snapshots are copied over to it.)

If you have data you feel is really important, my suggestion is that you have a completely different backup/restore method than what you're talking about. It needs to be bulletproof and well tested. And consider all the Btrfs send/receive work you're doing as testing/work-in-progress. There are still cases on the list where people have had problems with send/receive, and both the send and receive code have a lot of churn, so I don't know that anyone can definitively tell you that a backup based only on btrfs send/receive is going to reliably restore in one month, let alone three years. Should it? Yes of course. Will it?


Chris Murphy


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Testing BTRFS
  2014-03-11  1:02         ` Avi Miller
  2014-03-11 19:08           ` Eric Sandeen
@ 2014-03-13 18:10           ` Lists
  2014-03-13 20:20             ` Avi Miller
  1 sibling, 1 reply; 25+ messages in thread
From: Lists @ 2014-03-13 18:10 UTC (permalink / raw)
  To: Avi Miller; +Cc: linux-btrfs

On 03/10/2014 06:02 PM, Avi Miller wrote:
> Oracle Linux 6 with the Unbreakable Enterprise Kernel Release 2 or Release 3 has production-ready btrfs support. You can even convert your existing CentOS6 boxes across to Oracle Linux 6 in-place without reinstalling:
>
> http://linux.oracle.com/switch/centos/
>
> Oracle also now provides all errata, including security and bug fixes for free at http://public-yum.oracle.com and our kernel source code can be found at https://oss.oracle.com/git/

Is there any issue with BTRFS and 32 bit O/S like with ZFS?

-Ben

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Testing BTRFS
  2014-03-13 18:10           ` Testing BTRFS Lists
@ 2014-03-13 20:20             ` Avi Miller
  0 siblings, 0 replies; 25+ messages in thread
From: Avi Miller @ 2014-03-13 20:20 UTC (permalink / raw)
  To: Lists; +Cc: linux-btrfs

Hi,

On 14 Mar 2014, at 5:10 am, Lists <lists@benjamindsmith.com> wrote:

> Is there any issue with BTRFS and 32 bit O/S like with ZFS?

We provide some btrfs support with the 32-bit UEK Release 2 on OL6, but we strongly recommend only using the UEK Release 3 which is 64-bit only.

--
Oracle <http://www.oracle.com>
Avi Miller | Product Management Director | +61 (3) 8616 3496
Oracle Linux and Virtualization
417 St Kilda Road, Melbourne, Victoria 3004 Australia


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Understanding btrfs and backups => automatic snapshot script
  2014-03-06 21:33 ` Duncan
  2014-03-07 10:13   ` Wolfgang Mader
  2014-03-07 14:03   ` Eric Mesa
@ 2014-03-17  5:42   ` Marc MERLIN
  2014-03-21  5:57     ` Marc MERLIN
  2 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2014-03-17  5:42 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Thu, Mar 06, 2014 at 09:33:24PM +0000, Duncan wrote:
> However, best snapshot management practice does progressive snapshot 
> thinning, so you never have more than a few hundred snapshots to manage 
> at once.  Think of it this way.  If you realize you deleted something you 
> needed yesterday, you might well remember about when you deleted it and 
> can thus pick the correct snapshot to mount and copy it back from.  But 
> if you don't realize you need it until a year later, say when you're 
> doing your taxes, how likely are you to remember the specific hour, or 
> even the specific day, you deleted it?  A year later, getting a copy from 
> the correct week, or perhaps the correct month, will probably suffice, 
> and even if you DID still have every single hour's snapshots a year 
> later, how would you ever know which one to pick?  So while a day out, 
> hourly snapshots are nice, a year out, they're just noise.

I'm happy to share my script with others if that helps:
http://marc.merlins.org/linux/scripts/btrfs-snaps

Or for the list archives/google:
----------------------------------------------------------------------------
#!/bin/bash

# By Marc MERLIN <marc_soft@merlins.org>
# License GPL-2 or BSD at your option.

# This lets you create sets of snapshots at any interval (I use hourly,
# daily, and weekly) and delete the older ones automatically.

# Usage:
# This is called from /etc/cron.d like so:
# 0 * * * * root btrfs-snaps hourly 3 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'
# 1 0 * * * root btrfs-snaps daily  4 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'
# 2 0 * * 0 root btrfs-snaps weekly 4 | egrep -v '(Create a snapshot of|Will delete the oldest|Delete subvolume|Making snapshot of )'

: ${BTRFSROOT:=/mnt/btrfs_pool1}
DATE="$(date '+%Y%m%d_%H:%M:%S')"

type=${1:-hourly}
keep=${2:-3}

# Bail out rather than snapshot/delete in the wrong directory.
cd "$BTRFSROOT" || exit 1

# Loop over every subvolume that has no parent, i.e. is not itself a snapshot.
for i in $(btrfs subvolume list -q . | grep "parent_uuid -" | awk '{print $11}')
do
    # Skip duplicate dirs once a year on DST 1h rewind.
    test -d "$BTRFSROOT/${i}_${type}_$DATE" && continue
    echo "Making snapshot of $type"
    /sbin/btrfs subvolume snapshot "$BTRFSROOT/$i" "$BTRFSROOT/${i}_${type}_$DATE"
    # The date suffix makes names sort chronologically, so the first
    # $clip entries listed are the oldest snapshots of this type.
    count="$(ls -d "${i}_${type}_"* | wc -l)"
    clip=$(( count - keep ))
    if [ "$clip" -gt 0 ]; then
        echo "Will delete the oldest $clip snapshots for $type"
        for sub in $(ls -d "${i}_${type}_"* | head -n "$clip")
        do
            #echo "Will delete $sub"
            /sbin/btrfs subvolume delete "$sub"
        done
    fi
done
----------------------------------------------------------------------------
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Understanding btrfs and backups => automatic snapshot script
  2014-03-17  5:42   ` Understanding btrfs and backups => automatic snapshot script Marc MERLIN
@ 2014-03-21  5:57     ` Marc MERLIN
  2014-03-21  7:41       ` Duncan
  0 siblings, 1 reply; 25+ messages in thread
From: Marc MERLIN @ 2014-03-21  5:57 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Sun, Mar 16, 2014 at 10:42:24PM -0700, Marc MERLIN wrote:
> On Thu, Mar 06, 2014 at 09:33:24PM +0000, Duncan wrote:
> > However, best snapshot management practice does progressive snapshot 
> > thinning, so you never have more than a few hundred snapshots to manage 
> > at once.  Think of it this way.  If you realize you deleted something you 
> > needed yesterday, you might well remember about when you deleted it and 
> > can thus pick the correct snapshot to mount and copy it back from.  But 
> > if you don't realize you need it until a year later, say when you're 
> > doing your taxes, how likely are you to remember the specific hour, or 
> > even the specific day, you deleted it?  A year later, getting a copy from 
> > the correct week, or perhaps the correct month, will probably suffice, 
> > and even if you DID still have every single hour's snapshots a year 
> > later, how would you ever know which one to pick?  So while a day out, 
> > hourly snapshots are nice, a year out, they're just noise.
> 
> I'm happy to share my script with others if that helps:
> http://marc.merlins.org/linux/scripts/btrfs-snaps

Now added to
http://marc.merlins.org/perso/btrfs/post_2014-03-21_Btrfs-Tips_-How-To-Setup-Netapp-Style-Snapshots.html
(mostly to seed google and the archives)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Understanding btrfs and backups => automatic snapshot script
  2014-03-21  5:57     ` Marc MERLIN
@ 2014-03-21  7:41       ` Duncan
  0 siblings, 0 replies; 25+ messages in thread
From: Duncan @ 2014-03-21  7:41 UTC (permalink / raw)
  To: linux-btrfs

Marc MERLIN posted on Thu, 20 Mar 2014 22:57:33 -0700 as excerpted:

> On Sun, Mar 16, 2014 at 10:42:24PM -0700, Marc MERLIN wrote:
>> On Thu, Mar 06, 2014 at 09:33:24PM +0000, Duncan wrote:
>> > However, best snapshot management practice does progressive snapshot
>> > thinning, so you never have more than a few hundred snapshots to
>> > manage at once.
>> 
>> I'm happy to share my script with others if that helps:
>> http://marc.merlins.org/linux/scripts/btrfs-snaps
> 
> Now added to
> http://marc.merlins.org/perso/btrfs/post_2014-03-21_Btrfs-Tips_-How-To-Setup-Netapp-Style-Snapshots.html

Hmm... I hadn't actually looked that closely at scripted snapshotting.  
Now that I have, and see how easy it is to manage both snapshotting and 
thinning, I just might try it.
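
To make the thinning idea concrete, here is a minimal sketch in Python. It is not Marc's script; the function name and the retention thresholds (everything for a day, one per day for 30 days, one per week for a year, one per month beyond that) are assumptions chosen only to illustrate "hourly is nice a day out, noise a year out".

```python
from datetime import datetime, timedelta


def thin_snapshots(timestamps, now):
    """Return the set of snapshot timestamps worth keeping.

    Hypothetical policy (an assumption, not taken from any real tool):
      - younger than 1 day:  keep every snapshot
      - up to 30 days old:   keep the newest snapshot of each calendar day
      - up to 365 days old:  keep the newest snapshot of each ISO week
      - older than that:     keep the newest snapshot of each month
    """
    keep = set()
    seen_buckets = set()
    # Walk newest-first so the first snapshot seen in a bucket is its newest.
    for ts in sorted(timestamps, reverse=True):
        age = now - ts
        if age <= timedelta(days=1):
            keep.add(ts)
            continue
        if age <= timedelta(days=30):
            bucket = ("day", ts.date())
        elif age <= timedelta(days=365):
            iso = ts.isocalendar()
            bucket = ("week", iso[0], iso[1])
        else:
            bucket = ("month", ts.year, ts.month)
        if bucket not in seen_buckets:
            seen_buckets.add(bucket)
            keep.add(ts)
    return keep
```

Anything not in the returned set would be a candidate for deletion; the actual deletion step is left out on purpose, since that is where a mistake costs data.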

But I recently switched to systemd, including replacing my cron jobs with 
timer-unit scripts (which I set up like cron.hourly.d, daily.d, etc.; I 
only had those two to worry about, so I didn't set up weekly or beyond).  
I've not actually unmerged cron yet, but I probably will one of these 
days.  Anyway, I might well find myself setting up 
weekly/quarterly/whatever timers too, with your script or something like 
it modified for systemd-timer usage.  It'd give me an excuse to practice 
my unit-file setup skills some more. =:^)


-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: Understanding btrfs and backups
  2014-03-06 19:27 Understanding btrfs and backups Eric Mesa
@ 2014-03-06 21:17 ` Brendan Hide
  0 siblings, 0 replies; 25+ messages in thread
From: Brendan Hide @ 2014-03-06 21:17 UTC (permalink / raw)
  To: Eric Mesa, linux-btrfs

On 2014/03/06 09:27 PM, Eric Mesa wrote:
> Brian Wong wrote: a snapshot is different than a backup
> [snip]
>
> ...
>
> Three hard drives: A, B, and C.
>
> Hard drives A and B - btrfs RAID-1 so that if one drive dies I can keep
> using my system until the replacement for the raid arrives.
>
> Hard drive C - gets (hourly/daily/weekly/or some combination of the above)
> snapshots from the RAID. (Starting with the initial state snapshot) Each
> timepoint another snapshot is copied to hard drive C.
>
> [snip]...
>
> So if that's what I'm doing, do snapshots become a way to do backups?
An important distinction for anyone joining the conversation is that 
snapshots are *not* backups, in a similar way that you mentioned that 
RAID is not a backup. If a hard drive implodes, its snapshots go with it.

Snapshots can (and should) be used as part of a backup methodology - and 
your example is almost exactly the same as previous good backup 
examples. Most of the time there's mention of an external "backup 
server" keeping the backups, which is the only major difference compared 
to the process you're looking at. Btrfs send/receive with snapshots can 
make the process far more efficient than rsync. Rsync keeps no record of 
what has changed, so it has to scan and compare every file on both sides 
(causing heavy I/O). Btrfs already knows what changed between two 
snapshots and can skip straight to sending that data.
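
As a rough sketch of what that looks like in practice, the Python below only builds the argument lists for `btrfs subvolume snapshot -r`, `btrfs send [-p parent]` and `btrfs receive`, plus a small helper that pipes one into the other. The function names and the snapshot paths in the usage note are hypothetical; the first transfer is a full send, and later ones pass the previous snapshot as the parent, which must still exist on both the source and the backup disk.

```python
import subprocess


def snapshot_cmd(subvol, snap_path):
    # Read-only snapshot; btrfs send only accepts read-only snapshots.
    return ["btrfs", "subvolume", "snapshot", "-r", subvol, snap_path]


def send_cmd(snap_path, parent=None):
    # Without a parent this is a full send; with -p only the difference
    # between the parent snapshot and snap_path is streamed.
    cmd = ["btrfs", "send"]
    if parent is not None:
        cmd += ["-p", parent]
    cmd.append(snap_path)
    return cmd


def receive_cmd(dest_dir):
    return ["btrfs", "receive", dest_dir]


def send_snapshot(snap_path, dest_dir, parent=None):
    """Pipe `btrfs send` into `btrfs receive`.

    Needs root and real btrfs filesystems, so treat it as a sketch.
    """
    sender = subprocess.Popen(send_cmd(snap_path, parent),
                              stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive_cmd(dest_dir), stdin=sender.stdout)
    sender.stdout.close()  # let the sender see a broken pipe if receive dies
    receiver.wait()
    sender.wait()
    if sender.returncode or receiver.returncode:
        raise RuntimeError("btrfs send/receive failed")
```

For example, `send_snapshot("/home/.snapshots/day2", "/mnt/btrfs-backup", parent="/home/.snapshots/day1")` would send only what changed between the two (made-up) snapshot names.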

I do something similar to what you have described on my Arch Linux 
desktop - however I haven't updated my (very old) backup script to take 
advantage of btrfs' send/receive functionality. I'm still using rsync. :-/

/ and /home are on btrfs-raid1 on two smallish disks
/mnt/btrfs-backup is on btrfs single/dup on a single larger disk

See https://btrfs.wiki.kernel.org/index.php/Incremental_Backup for a 
basic incremental methodology using btrfs send/receive

-- 
__________
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97



* Re: Understanding btrfs and backups
@ 2014-03-06 20:37 Eric Mesa
  0 siblings, 0 replies; 25+ messages in thread
From: Eric Mesa @ 2014-03-06 20:37 UTC (permalink / raw)
  To: linux-btrfs

Brian Wong wrote: a snapshot is different than a backup, with a snapshot
you're still accessing a read-only version of the live filesystem.  i don't
know the specifics of btrfs but if you take daily snapshots, you should be
able to restore a single file from the five-days-ago snapshot by browsing
that snapshot's directory tree and then copying the file to the live version
of the filesystem, if that makes sense.
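
Since a snapshot is browsable like any other directory tree, the single-file restore described above is an ordinary copy. A minimal Python sketch, assuming the snapshot is mounted or visible at some path (the function name and any paths passed to it are hypothetical):

```python
import shutil
from pathlib import Path


def restore_file(snapshot_root, live_root, relative_path):
    """Copy one file from a snapshot's tree back into the live tree.

    relative_path is the file's path relative to both roots.
    Returns the path of the restored file.
    """
    src = Path(snapshot_root) / relative_path
    dst = Path(live_root) / relative_path
    if not src.is_file():
        raise FileNotFoundError(src)
    # Recreate the parent directory in case it was deleted as well.
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)  # copy2 also preserves timestamps and mode bits
    return dst
```

Nothing else in the live filesystem is touched, so the changes made since the snapshot stay in place.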

in the snapshot case the live filesystem serves the same function as the
full backup would if you did full backups then incrementals.  the snapshots
are the incrementals of the live filesystem, only going backwards in time
whereas with backup you would take a full backup then go forward in time
with incrementals.  the filesystem takes care of making sure every snapshot
is complete.

in the snapshot case redundancy is then more important because you may not
have a bunch of full backups (i.e. full copies) lying around.  so full
backups still are useful.

--

OK, I THINK I understand things a bit better. So from the point of view of
restoring a single file, that functionality is there. Excellent. And I guess
you're saying that because the snapshots are diffs off the live system,
I'd need a backup of the live system - i.e. snapshots wouldn't be enough. But
what if my first snapshot was a clone of the system at that point (as it
seems from the article), and I backed that up to a separate drive? Let me
illustrate with exactly what I plan to do.

Three hard drives: A, B, and C.

Hard drives A and B - btrfs RAID-1 so that if one drive dies I can keep
using my system until the replacement for the raid arrives.

Hard drive C - gets (hourly/daily/weekly/or some combination of the above)
snapshots from the RAID. (Starting with the initial state snapshot) Each
timepoint another snapshot is copied to hard drive C. 

So in the case of a file disappearing on me or being overwritten or
whatever - I reach into the directory of the snapshot that contains the
file, just as I would now with the backup.

So if that's what I'm doing, do snapshots become a way to do backups?

Thanks



* Re: Understanding btrfs and backups
@ 2014-03-06 19:27 Eric Mesa
  2014-03-06 21:17 ` Brendan Hide
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Mesa @ 2014-03-06 19:27 UTC (permalink / raw)
  To: linux-btrfs




end of thread

Thread overview: 25+ messages
2014-03-06 18:18 Understanding btrfs and backups Eric Mesa
2014-03-06 21:33 ` Duncan
2014-03-07 10:13   ` Wolfgang Mader
2014-03-09 15:46     ` Duncan
2014-03-07 14:03   ` Eric Mesa
2014-03-07 15:14     ` Sander
2014-03-09  4:13       ` Chris Samuel
2014-03-09 15:30         ` Duncan
2014-03-13  8:18           ` Chris Samuel
2014-03-09 16:40     ` Duncan
2014-03-11  0:39       ` Testing BTRFS Lists
2014-03-11  1:02         ` Avi Miller
2014-03-11 19:08           ` Eric Sandeen
2014-03-11 20:30             ` Avi Miller
2014-03-12 11:15             ` xfstests btrfs/035 (was Re: Testing BTRFS) David Disseldorp
2014-03-13 18:10           ` Testing BTRFS Lists
2014-03-13 20:20             ` Avi Miller
2014-03-11 13:33         ` Josef Bacik
2014-03-13 17:12     ` Understanding btrfs and backups Chris Murphy
2014-03-17  5:42   ` Understanding btrfs and backups => automatic snapshot script Marc MERLIN
2014-03-21  5:57     ` Marc MERLIN
2014-03-21  7:41       ` Duncan
2014-03-06 19:27 Understanding btrfs and backups Eric Mesa
2014-03-06 21:17 ` Brendan Hide
2014-03-06 20:37 Eric Mesa
