From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Andrei Borzenkov <arvidjaar@gmail.com>,
Adam Borowski <kilobyte@angband.pl>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: degraded permanent mount option
Date: Mon, 29 Jan 2018 14:00:53 -0500
Message-ID: <f97f4e60-47fb-f1d6-dc09-0b46638a9eb4@gmail.com>
In-Reply-To: <dd2863d2-6f50-47b4-06a9-f34e7b341f2f@gmail.com>
On 2018-01-29 12:58, Andrei Borzenkov wrote:
> 29.01.2018 14:24, Adam Borowski wrote:
> ...
>>
>> So any event (the user's request) has already happened. A rc system, of
>> which systemd is one, knows whether we reached the "want root filesystem" or
>> "want secondary filesystems" stage. Once you're there, you can issue the
>> mount() call and let the kernel do the work.
>>
>>> It is a btrfs choice to not expose compound device as separate one (like
>>> every other device manager does)
>>
>> Btrfs is not a device manager, it's a filesystem.
>>
>>> it is a btrfs drawback that doesn't provide anything else except for this
>>> IOCTL with its logic
>>
>> How can it provide you with something it doesn't yet have? If you want the
>> information, call mount(). And as others in this thread have mentioned,
>> what, pray tell, would you want to know "would a mount succeed?" for if you
>> don't want to mount?
>>
>>> it is a btrfs drawback that there is nothing to push assembling into "OK,
>>> going degraded" state
>>
>> The way to do so is to timeout, then retry with -o degraded.
>>
>
> That's a possible way to solve it. This likely requires support from
> mount.btrfs (or btrfs.ko) to return proper indication that filesystem is
> incomplete so caller can decide whether to retry or to try degraded mount.
We already do so in the accepted standard manner. If the mount fails
because of a missing device, you get a very specific message in the
kernel log about it, as is the case for most other common errors (for
uncommon ones you usually just get a generic open_ctree error). This is
really the only option too, as the mount() syscall (which the mount
command calls) returns only 0 on success or -1 and an appropriate errno
value on failure, and we can't exactly go about creating a half dozen
new error numbers just for this (well, technically we could, but I very
much doubt that they would be accepted upstream, which defeats the purpose).
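To make the point about mount() concrete, here is a minimal sketch of
what a caller actually gets back. It is an illustration only, not code
from util-linux or btrfs-progs; the names try_mount and describe are
made up for this example, and the `libc` parameter exists only so the
sketch can be exercised without root. The return value and errno are
all the caller has to go on, and the specific reason stays in the
kernel log.

```python
import ctypes
import errno
import os


def try_mount(source, target, fstype, options="", libc=None):
    """Call mount(2) once; return 0 on success or the errno value on failure.

    `libc` is injectable (an assumption of this sketch) so it can be
    tested with a stand-in object instead of the real C library.
    """
    if libc is None:
        # use_errno=True makes ctypes keep a private per-thread errno copy.
        libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.mount(
        source.encode(),
        target.encode(),
        fstype.encode(),
        ctypes.c_ulong(0),   # mountflags
        options.encode(),    # filesystem-specific data, e.g. b"degraded"
    )
    if rc == 0:
        return 0
    return ctypes.get_errno()


def describe(err):
    """Everything a caller can say from the return value alone."""
    if err == 0:
        return "mounted"
    name = errno.errorcode.get(err, "unknown")
    return f"mount failed: {name} ({os.strerror(err)})"
```

Anything more specific than the errno name has to be read from the
kernel log by the caller.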
>
> Or maybe mount.btrfs should implement this logic internally. This would
> really be the simplest way to make it acceptable to the other side by
> not needing to accept anything :)
And would also be another layering violation which would require a
proliferation of extra mount options to control the mount command itself
and adjust the timeout handling.
This has been done before with mount.nfs, but for slightly different
reasons (primarily to allow nested NFS mounts, since a missing local
mount point is treated like a mount timeout), and it offers almost no
control. It works there because the complicated policy decisions are
left to userspace: there is no support for retrying with different
options or trying a different server.
With what you're proposing for BTRFS however, _everything_ is a
complicated decision, namely:
1. Do you retry at all? During boot, the answer should usually be yes,
but during normal system operation it should normally be no (because we
should be letting the user handle issues at that point).
2. How long should you wait before you retry? There is no right answer
here that will work in all cases (I've seen systems which take multiple
minutes for devices to become available on boot), especially considering
those of us who would rather have things fail early.
3. If the retry fails, do you retry again? How many times before it
just outright fails? This is going to be system specific policy. On
systems where devices may take a while to come online, the answer is
probably yes and some reasonably large number, while on systems where
devices are known to reliably be online immediately, it makes no sense
to retry more than once or twice.
4. If you are going to retry, should you try a degraded mount? Again,
this is going to be system specific policy (regular users would probably
want this to be a yes, while people who care about data integrity over
availability would likely want it to be a no).
5. Assuming you do retry with the degraded mount, how many times should
a normal mount fail before things go degraded? This ties in with 3 and
has the same arguments about variability I gave there.
6. How many times do you try a degraded mount before just giving up?
Again, similar variability to 3.
7. Should each attempt try first a regular mount and then a degraded
one, or do you try just normal a couple times and then switch to
degraded, or even start out trying normal and then start alternating?
Any of those patterns has valid arguments both for and against it, so
this again needs to be user configurable policy.
Altogether, that's a total of 7 policy decisions that should be user
configurable. Having a config file other than /etc/fstab for the mount
command should probably be avoided for sanity reasons (again, BTRFS is a
filesystem, not a volume manager), so they would all have to be handled
through mount options. The kernel will additionally have to understand
that those options need to be ignored (things do try to mount
filesystems without calling a mount helper, most notably the kernel when
it mounts the root filesystem on boot if you're not using an initramfs).
All in all, this type of thing gets out of hand _very_ fast.
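To show how quickly the seven decisions above turn into configuration,
here is a rough sketch of them as a policy object plus a retry loop.
None of this exists in mount.btrfs or the kernel; MountPolicy, plan,
mount_with_policy, the field names, the defaults, and the two pattern
names are all hypothetical, and `try_mount` is an assumed callable that
takes an option string and returns True on success.

```python
import time
from dataclasses import dataclass


@dataclass
class MountPolicy:
    retry: bool = True               # 1. retry at all?
    retry_delay_s: float = 5.0       # 2. how long to wait between attempts
    max_normal_tries: int = 3        # 3. normal attempts before giving up
    allow_degraded: bool = False     # 4. ever fall back to -o degraded?
    normal_before_degraded: int = 3  # 5. normal failures before degraded
    max_degraded_tries: int = 1      # 6. degraded attempts before giving up
    pattern: str = "normal_first"    # 7. "normal_first" or "alternate"


def plan(policy):
    """Return the ordered list of option strings to try ("" = normal)."""
    if not policy.retry:
        return [""]
    if not policy.allow_degraded:
        return [""] * policy.max_normal_tries
    lead = min(policy.normal_before_degraded, policy.max_normal_tries)
    attempts = [""] * lead
    if policy.pattern == "normal_first":
        return attempts + ["degraded"] * policy.max_degraded_tries
    # "alternate": after the lead-in, interleave degraded and normal tries.
    normal_left = policy.max_normal_tries - lead
    degraded_left = policy.max_degraded_tries
    while normal_left > 0 or degraded_left > 0:
        if degraded_left > 0:
            attempts.append("degraded")
            degraded_left -= 1
        if normal_left > 0:
            attempts.append("")
            normal_left -= 1
    return attempts


def mount_with_policy(policy, try_mount, sleep=time.sleep):
    """Run the plan; return the options that worked, or None on failure."""
    attempts = plan(policy)
    for i, options in enumerate(attempts):
        if try_mount(options):
            return options
        if i + 1 < len(attempts):
            sleep(policy.retry_delay_s)
    return None
```

Even this toy version needs seven knobs, and each one would have to
become a mount option that the kernel knows to ignore.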