All of lore.kernel.org
 help / color / mirror / Atom feed
From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: 4.11 relocate crash, null pointer + rolling back a filesystem by X hours?
Date: Wed, 24 May 2017 10:16:50 +0000 (UTC)	[thread overview]
Message-ID: <pan$8532b$ae0e1e6d$c1cd3583$d6dbef31@cox.net> (raw)
In-Reply-To: 20170523165847.fbu45eq3w24tyh3c@merlins.org

Marc MERLIN posted on Tue, 23 May 2017 09:58:47 -0700 as excerpted:

> That's a valid point, and in my case, I can back it up/restore, it just
> takes a bit of time, but most of the time is manually babysitting all
> those subvolumes that I need to recreate by hand with btrfs send/restore
> relationships, which all get lost during backup/restore.
> This is the most painful part.
> What's too big? I've only ever used a filesystem that fits on on a raid
> of 4 data drives. That value has increased over time, but I don't have a
> a crazy array of 20+ drives as a single filesystem, or anything.
> Since drives have gotten bigger, but not that much faster, I use bcache
> to make things more acceptable in speed.

What's too big?  That depends on your tolerance for pain, but given the 
subvolumes manually recreated by hand with send/receive scenario, I'd 
probably try to break it down so while there's the same number of 
snapshots to restore, the number of subvolumes the snapshots are taken 
against are limited.

My own rule of thumb is if it's taking so long that it's a barrier to 
doing it, I really need to either break things down further, or upgrade 
to faster storage.  The latter is why I'm actually looking at upgrading 
my media and second backup set, on spinning rust, to ssd.  Because while 
I used to do backups spinning rust to spinning rust of that size all the 
time, ssds have spoiled me, and now I dread doing the spinning rust 
backups... or restores.   Tho in my case the spinning rust is only a half-
TB, so a pair of half-TB to 1 TB ssds for an upgrade is still cost 
effective.  It's not like I'm going multi-TB, which would still be cost 
prohibitive on SSD, particularly since I want raid1, so doubling the 
number of SSDs.

Meanwhile, what I'd do with that raid of four drives (and /did/ do with 
my 4-drive raid back a few storage generations ago, when 300 GB spinning-
rust disks were still quite big, and what I do with my paired SSDs with 
btrfs now) is partition them up and do raids of partitions on each drive.

One thing that's nice about that is that you can actually do a set of 
backups on a second set of partitions on the same physical devices, 
because the physical device redundancy of the raids covers loss of a 
device, and the separate partitions and raids (btrfs raid1 now) cover the 
fat-finger or simple loss of filesystem risk.  A second set of backups to 
separate devices can then be made just in case, and depending on the 
need, swapped out to off-premises or uploaded to the cloud or whatever, 
but you always have the primary backup at hand to boot to or mount if the 
working copy fails, by simply pointing to the backup partitions and 
filesystem instead of the normal working copy.  For root, I even have a 
grub menu item that switches to the backup copy, and for fstab, I have a 
set of stubs that are assembled via script into three copies of fstab 
that swap working and backup copies as necessary, with /etc/fstab itself 
being a symlink to the working copy one, that I simply switch to point to 
the one that loads the backup copies as working, on the backup.  Or I can 
mount the root filesystem for maintenance from the initramfs, and switch 
the fstab symlink from there, before exiting maintenance and booting the 
main system.

I learned this "split it up" method the hard way back before mdraid had 
write-intent bitmaps, and I had only two much larger raids, working and 
backup, where if one device dropped out and I brought it back in, I had 
to wait way too long for the huge working raid to resync.  When I split 
things up by function into multiple raids, most of the time only some of 
them were active and only one or two of the active ones would actually 
have been being written at the time so were out of sync, and syncing them 
was fast as they were much smaller than the larger full system raids I 
had been using previously.

>> *BUT*, and here's the "go further" part, keep in mind that
>> subvolume-read-
>> only is a property, gettable and settable by btrfs property.
>> 
>> So you should be able to unset the read-only property of a subvolume or
>> snapshot, move it, then if desired, set it again.
>> 
>> Of course I wouldn't expect send -p to work with such a snapshot, but
>> send -c /might/ still work, I'm not actually sure but I'd consider it
>> worth trying.  (I'd try -p as well, but expect it to fail...)
> 
> That's an interesting point, thanks for making it.
> In that case, I did have to destroy and recreate the filesystem since
> btrfs check --repair was unable to fix it, but knowing how to reparent
> read only subvolumes may be handy in the future, thanks.

Hopefully you won't end up testing it any time soon, but if you do, 
please confirm whether my suspicions that send -p won't work after 
toggling and reparenting, but send -c still will, are correct.

(For those who read this out of thread context where I believe I already 
stated it, my own use-case involves neither snapshots nor send-receive.  
But it'd be useful information to confirm, both for others, and in case I 
suddenly find myself with a different use-case for some reason or other.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


  reply	other threads:[~2017-05-24 10:17 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-20 14:39 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Marc MERLIN
2017-06-20 15:23 ` Hugo Mills
2017-06-20 15:26   ` Marc MERLIN
2017-06-20 15:36     ` Hugo Mills
2017-06-20 15:44       ` Marc MERLIN
2017-06-20 23:12         ` Marc MERLIN
2017-06-20 23:58           ` Marc MERLIN
2017-06-21  3:31           ` Chris Murphy
2017-06-21  3:43             ` Marc MERLIN
2017-06-21 15:13               ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Marc MERLIN
2017-06-21 23:22                 ` Chris Murphy
2017-06-22  0:48                   ` Marc MERLIN
2017-06-22  2:22                 ` Qu Wenruo
2017-06-22  2:53                   ` Marc MERLIN
2017-06-22  4:08                     ` Qu Wenruo
2017-06-23  4:06                       ` Marc MERLIN
2017-06-23  8:54                         ` Lu Fengqi
2017-06-23 16:17                           ` Marc MERLIN
2017-06-24  2:34                             ` Marc MERLIN
2017-06-26 10:46                               ` Lu Fengqi
2017-06-27 23:11                                 ` Marc MERLIN
2017-06-28  7:10                                   ` Lu Fengqi
2017-06-28 14:43                                     ` Marc MERLIN
2017-05-01 17:06                                       ` 4.11 relocate crash, null pointer Marc MERLIN
2017-05-01 18:08                                         ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Marc MERLIN
2017-05-02  1:50                                           ` Chris Murphy
2017-05-02  3:23                                             ` Marc MERLIN
2017-05-02  4:56                                               ` Chris Murphy
2017-05-02  5:11                                                 ` Marc MERLIN
2017-05-02 18:47                                                   ` btrfs check --repair: failed to repair damaged filesystem, aborting Marc MERLIN
2017-05-03  6:00                                                     ` Marc MERLIN
2017-05-03  6:17                                                       ` Marc MERLIN
2017-05-03  6:32                                                         ` Roman Mamedov
2017-05-03 20:40                                                           ` Marc MERLIN
2017-07-07  5:37                                                   ` ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5 Marc MERLIN
2017-07-07  5:39                                                     ` Marc MERLIN
2017-07-07  9:33                                                       ` Lu Fengqi
2017-07-07 16:38                                                         ` Marc MERLIN
2017-07-09  4:34                                                           ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
2017-07-09  5:05                                                             ` We really need a better/working btrfs check --repair Marc MERLIN
2017-07-09  6:34                                                             ` 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0) Marc MERLIN
2017-07-09  7:57                                                             ` Martin Steigerwald
2017-07-09  9:16                                                               ` Paul Jones
2017-07-09 11:17                                                                 ` Duncan
2017-07-09 13:00                                                                   ` Martin Steigerwald
2017-07-29 19:29                                                                   ` Imran Geriskovan
2017-07-29 23:38                                                                     ` Duncan
2017-07-30 14:54                                                                       ` Imran Geriskovan
2017-07-31  4:53                                                                         ` Duncan
2017-07-31 20:32                                                                           ` Imran Geriskovan
2017-08-01  1:36                                                                             ` Duncan
2017-08-01 15:18                                                                               ` Imran Geriskovan
2017-07-31 21:07                                                               ` Ivan Sizov
2017-07-31 21:17                                                                 ` Marc MERLIN
2017-07-31 21:39                                                                   ` Ivan Sizov
2017-08-01 16:41                                                                     ` Ivan Sizov
2017-07-31 22:00                                                                   ` Justin Maggard
2017-08-01  6:38                                                                     ` Marc MERLIN
2017-05-02 19:59                                                 ` 4.11 relocate crash, null pointer + rolling back a filesystem by X hours? Kai Krakow
2017-05-02  5:01                                               ` Duncan
2017-05-02 19:53                                                 ` Kai Krakow
2017-05-23 16:58                                                 ` Marc MERLIN
2017-05-24 10:16                                                   ` Duncan [this message]
2017-05-05  1:19                                               ` Qu Wenruo
2017-05-05  2:10                                                 ` Qu Wenruo
2017-05-05  2:40                                                 ` Marc MERLIN
2017-05-05  5:03                                                   ` Qu Wenruo
2017-05-05 15:43                                                     ` Marc MERLIN
2017-05-17 18:23                                                       ` Kai Krakow
2017-05-05  1:13                                           ` Qu Wenruo
2017-06-29 13:36                                       ` How to fix errors that check --mode lomem finds, but --mode normal doesn't? Lu Fengqi
2017-06-29 15:30                                         ` Marc MERLIN
2017-06-30 14:59                                           ` Lu Fengqi
2017-06-22  4:08                     ` Qu Wenruo
2017-06-21 12:04           ` 4.11.3: BTRFS critical (device dm-1): unable to add free space :-17 => btrfs check --repair runs clean Duncan
2017-06-21  3:26         ` Chris Murphy
2017-06-21  4:06           ` Marc MERLIN

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='pan$8532b$ae0e1e6d$c1cd3583$d6dbef31@cox.net' \
    --to=1i5t5.duncan@cox.net \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.