From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: 4.11.6 / more corruption / root 15455 has a root item with a more recent gen (33682) compared to the found root node (0)
Date: Mon, 31 Jul 2017 04:53:07 +0000 (UTC)
Message-ID: <pan$3b64a$a9b921fc$3e4651f5$627d0636@cox.net>
In-Reply-To: CAK5rZE6vMujb5KEqwUZD2-ZmZ4vi2T4+J2BzTLi9qqx8GeFKgg@mail.gmail.com

Imran Geriskovan posted on Sun, 30 Jul 2017 16:54:25 +0200 as excerpted:

> On 7/30/17, Duncan <1i5t5.duncan@cox.net> wrote:
>>>> Also, all my btrfs are raid1 or dup for checksummed redundancy
> 
>>> Do you have any experience/advice/comment regarding dup data on ssds?
> 
>> Very good question. =:^)
> 
>> Limited.  Most of my btrfs are raid1, with dup only used on the device-
>> respective /boot btrfs (of which there are four, one on each of the two
>> ssds that otherwise form the btrfs raid1 pairs, for each of the working
>> and backup copy pairs -- I can use BIOS to select any of the four to
>> boot), and those are all sub-GiB mixed-bg mode.
> 
> Is this a military or deep space device? ;)

It just happens to have four physical ssds, two pairs, with everything but
/boot being paired btrfs raid1.  Because I wanted a similar partition
layout for ease of management, that's a /boot on each one, and because
the BIOS can only point to one at a time, that's four separate grub installs
[1], each of which is configured to load its own /boot.

While four is a bit much, three can certainly be very useful.  It means
a bad grub upgrade can be core-installed to one BIOS-boot partition, and
I can then fat-finger a second device by pointing its grub at the wrong
/boot, destroying my ability to boot from that one as well, and still
have a third, untouched, to boot from.  The fourth is simply bonus
insurance on top of that, more an accident of having two pairs than
because I really needed it.

A minimum of three /boots is also quite convenient for my kernel update 
routine, given I routinely test and sometimes bisect pre-release 
kernels.  The default/working /boot gets the prereleases with a release 
and stable fallback, the first backup the releases and a stable fallback, 
and the secondary backups get updated less frequently, generally when I'm 
doing a / backup cycle as well and there has been either a kernel config 
or system change substantial enough that I'm no longer confident the 
older kernels will work correctly with the updated system.

Of course the same general testing/release/stable /boot system works well 
for other related updates, say to the grub menu (I use grub2's bash-like 
scripting language directly, not the high level stuff which I find too 
difficult to tweak to my liking) or the initrd, which I attach to the 
individual kernels at build-time, so a tested kernel selection is a 
tested initramfs selection as well.

> For /boot, I've also tried dup data.
> 
> But because of combinations of constraints you've mentioned,
> I totally give-up trying to have a bullet proof /boot as my poor laptop
> is not mission critical as your device and as I do always have bootable
> backups and always carry some bootable sdcards.

When I complained about the 64-MiB default mixed-bg mode chunk size on a 
256 MiB filesystem being too big to allow balance in dup mode, a dev 
answered that in theory chunk sizes are supposed to be limited to 1/8 
filesystem size (down to something like a 16 MiB minimum chunk size I 
think, but might be 8 or 32), but something about my setup, likely the 
mixed-bg mode as it's less tested, was short-circuiting that, thus the 
quarter-fs-size 64 MiB chunk sizes, which he agreed didn't make much 
sense on a 256 MiB filesystem in dup mode.

He was able to duplicate the problem, and there seemed to be no
disagreement that it was a bug, but I'm not sure if mkfs.btrfs was ever
patched to fix it, and of course now with the bigger half-gig filesystem
the same 64-MiB initial chunk size is fine.
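
As a rough sketch of the arithmetic behind that complaint (a simplified
model only, not what mkfs.btrfs or the kernel allocator actually does):
in dup mode each chunk is stored twice on the same device, and balance,
as I understand it, needs unallocated room for a new chunk before it can
release an old one.  The function names and the 16 MiB floor below are
assumptions for illustration; the real floor may be 8 or 32 MiB, and
system chunks and reserved space are ignored.

```python
def dup_chunk_footprint(chunk_size_mib: int) -> int:
    """Raw space one chunk occupies in dup mode (two copies, same device)."""
    return 2 * chunk_size_mib


def max_dup_chunks(fs_size_mib: int, chunk_size_mib: int) -> int:
    """How many dup chunks fit, ignoring system chunks and reserved space."""
    return fs_size_mib // dup_chunk_footprint(chunk_size_mib)


def eighth_rule_chunk(fs_size_mib: int, floor_mib: int = 16) -> int:
    """Chunk size under the 1/8-of-filesystem rule; the floor is assumed."""
    return max(fs_size_mib // 8, floor_mib)


for fs_size in (256, 512):
    observed = max_dup_chunks(fs_size, 64)
    ruled = max_dup_chunks(fs_size, eighth_rule_chunk(fs_size))
    print(f"{fs_size} MiB fs: {observed} dup chunks at 64 MiB, "
          f"{ruled} dup chunks under the 1/8 rule")
```

With 64 MiB chunks a 256 MiB filesystem has room for only two dup
chunks in this model, so once both exist there is nowhere to relocate
to, while the half-gig filesystem has room for four.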

And my other quarter-gig btrfs, log, is raid1, a quarter gig per device,
so I'd not see the problem there, mixed-mode or not.  (As mentioned in
the footnote below, in this go-round it's not mixed-mode, more by
accident than intent.)

Meanwhile, such bugs come with the territory when you're running what 
might be roughly compared at the commercial software level to late beta 
or rc level software, or even initial release, pre-service-release-1, 
level, which I'd argue is a more accurate btrfs comparison at this 
point.  As long as you stay within the known stable areas the danger of 
it eating your data is relatively small now, but the full feature set 
isn't there yet, and some of the features that are there are 
significantly less mature and stable than others.

> Perhaps that has something to do with me kicking out all systemd, inits,
> initramfs, mkinitcpio, dracut, etc, etc.
> 
> Now the init on /boot is a "19 lines" shell script, including lines for
> keymap, hdparm, cryptsetup. And let's not forget this is possible by a
> custom kernel, its reliable buddy syslinux.

FWIW...

I really like grub2, especially its quite flexible bash-like scripting
language (the higher level stuff intended for normal users just isn't
flexible enough for me, so I need the scripting language anyway, and once
I knew that, the higher level stuff only got in the way) and its command
line, which allows all sorts of stuff, like browsing for kernel
commandline documentation at the boot prompt, that I never imagined
possible in a boot manager.

And after holding off for a while, I'm now a cautious adopter and
supporter of systemd in general, tho I don't use its solutions for 
/everything/ and don't like its extremely aggressive feature expansion.

And after resisting an initr* for years as unnecessary, I've been a 
reluctant adopter since a btrfs raid1 root effectively requires it 
(rootflags=device= doesn't seem to work, for whatever reason, or at least 
didn't when I initially converted to btrfs, so at least a limited initr* 
seems the only viable solution for a btrfs raid1 root).

And I'm using dracut for that, tho quite cut down from its default, with 
a monolithic kernel and only installing necessary dracut modules.

But particularly after the last dracut update pulled in kmod as a 
mandatory dep as it now links against its libs, despite my monolithic 
kernel built without module support, I've been considering similar initr* 
alternatives, including hand-rolling my own initr* build scripts.

I'm still not happy having to run an initr* at all, especially since
there's more "magic" there than I'm comfortable with.  I like to grok the
boot process, and thus the potential recovery process, better than I
currently do, and dracut was just the most convenient option at the
time.

But kmod isn't a /huge/ dep, particularly with the executables and docs 
install-masked so it's only the library, headers and *.pc config file 
installed, and the current dracut solution works /reasonably/ well, so 
finding/creating an alternative isn't particularly high on my priority 
list, and I'll probably never do it unless dracut suddenly decides some 
of its other modules are going to need mandatory deps, or something else 
radically changes the current fragile balance and I really do need that 
currently lacking initr* grok.

> Interestingly my search for reliability started with "dup data" and ended
> up here. :)

=:^)

---
[1] Grub and partition layout:  I install grub-core (i386-pc) to a raw 
GPT legacy BIOS boot partition.  While this only requires a partition 
size of about a third of a MiB, I use gdisk's default 1 MiB alignment and 
the first MiB is the GPT and the alignment gap, so this first BIOS boot
partition starts at 1 MiB and must be a whole MiB unit in size.  Because 
I wanted plenty of room, however, and wanted additional partitions a 
minimum of 4 MiB aligned, I configured a 3 MiB BIOS boot partition for 
grub to use, thus accomplishing that 4 MiB alignment for further 
partitions.

The second partition is a currently unused GPT EFI partition for forward 
compatibility, 252 MiB in size so further partitions are quarter-GiB 
aligned.

The third partition is the /boot partition we've been discussing, a half 
GiB in size, thus ending at 3/4 GiB.  It's my only btrfs mixed-mode dup 
in the layout, so a half gig in size but a quarter gig usable.  As 
mentioned, with four physical ssds that's a total of four /boots, each 
pointed at by the grub-core installation in the first partition on the 
corresponding ssd.

Partition 4 is the log partition, a quarter GiB in size as log rotation 
keeps typical usage under 50 MiB, but the quarter gig size means it ends 
on the 1 GiB boundary and further partitions are GiB aligned.  In the 
last layout generation this was a half gig and /boot a quarter gig, but I 
decided /boot could use the extra quarter gig more than log so I traded 
sizes.  This, like all further partitions, is btrfs raid1.  I intended to 
make it mixed-bg mode, as it was in the previous generation layout, but 
forgot the mkfs.btrfs switch for that and it no longer defaults to mixed 
at under a gig, so I got standard mode.  Nevertheless, with raid1
instead of dup, and low normal usage, the chunk size is small enough that 
balance shouldn't be an issue, and if it is I can always blow it away and 
recreate in mixed mode.
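
The alignment arithmetic for those first four partitions can be checked
with a short sketch.  The sizes and the 1 MiB starting offset come from
the description above; the partition labels and function names are
made up for illustration.

```python
# (label, size in MiB) in on-disk order; labels are illustrative only.
LAYOUT = [
    ("bios_boot", 3),    # raw BIOS boot partition for grub-core
    ("efi", 252),        # currently unused EFI partition
    ("boot", 512),       # btrfs mixed-mode dup /boot
    ("log", 256),        # btrfs raid1 log
]


def partition_ends(layout, first_start_mib=1):
    """Return the end offset (MiB) of each partition, laid out back to back."""
    ends = {}
    position = first_start_mib  # first MiB holds the GPT and alignment gap
    for label, size_mib in layout:
        position += size_mib
        ends[label] = position
    return ends


def dup_usable_mib(size_mib):
    """Usable space in dup mode, where everything is stored twice."""
    return size_mib // 2


for label, end in partition_ends(LAYOUT).items():
    print(f"{label:10s} ends at {end:5d} MiB")
print("usable /boot in dup mode:", dup_usable_mib(512), "MiB")
```

Each end offset lands on the boundary named in the text: 4 MiB, a
quarter GiB, three quarters of a GiB, and 1 GiB.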

All further partitions are gig-aligned btrfs raid1 pair-device, three 
copies, working/0 and backups 1 and 2, on two separate pairs of ssds.  
The older pair is 256GB/238GiB with the backup/1 copy, the newer pair is 
1TB/931GiB with working/0 and backup/2.  The partition size and layout is 
identical on all four thru the sub-GiB and first copy, with the second 
copy on the larger pair being a same-sequence same-size repeat of the 
first, beyond the non-duplicated sub-GiB, of course.  So as long as the 
GPT on one of the four remains intact and bootable, I can easily recreate 
the other three.
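
For reference, the GB/GiB pairs quoted above are just the decimal
capacities expressed in binary units, assuming the nominal sizes are
exact powers of ten (the helper name is made up for illustration):

```python
def to_gib(size_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return size_bytes / 2**30


older_pair = 256 * 10**9   # nominal 256 GB
newer_pair = 10**12        # nominal 1 TB

print(f"256 GB is about {to_gib(older_pair):.1f} GiB")
print(f"1 TB is about {to_gib(newer_pair):.1f} GiB")
```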

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

