From: NeilBrown <email@example.com>
To: "Guilherme G. Piccoli" <firstname.lastname@example.org>,
Cc: email@example.com, firstname.lastname@example.org,
Subject: Re: [RFC] [PATCH 0/1] Introduce emergency raid0 stop for mounted arrays
Date: Thu, 02 Aug 2018 11:51:35 +1000
Message-ID: <email@example.com>

On Wed, Aug 01 2018, Guilherme G. Piccoli wrote:
> Currently the md driver relies completely on userspace to stop an
> array in case of some failure. There's an interesting case for raid0: if
> we remove a raid0 member, e.g. by PCI hot(un)plugging an NVMe device, and
> the raid0 array is _mounted_, mdadm cannot stop the array, since the tool
> tries to open the block device (to perform the ioctl) with the O_EXCL flag.
> So, in this case the array is still alive - users may write to this
> "broken-yet-alive" array and, unless they check the kernel log or some
> other monitoring tool, everything will seem fine and the writes complete
> with no errors. To be more precise, direct writes will not work, but since
> writes are usually done in the regular way, i.e., backed by the page
> cache, the most common scenario is a user being able to keep writing
> to a broken raid0 and getting all their data corrupted.
>
> The idea proposed here to fix this behavior is to mimic other block devices:
> if one has a filesystem mounted on a block device on top of an NVMe or
> SCSI disk and the disk gets removed, writes are prevented, errors are
> observed and it's obvious something is wrong. The same goes for USB sticks,
> which are sometimes even removed physically from the machine without
> their filesystem being unmounted first.
>
> We believe the md driver currently does not behave properly for raid0
> arrays (it does handle these errors for other levels, though). The approach
> taken for raid0 is basically an emergency removal procedure, in which I/O
> to the device is blocked, the regular clean-up happens and the associated
> disk is deleted. It went through extensive testing, as detailed below.
>
> Not everything is rosy: there are some caveats that need to be resolved.
> Feedback is _much appreciated_.

If you have a hard drive and some sectors or tracks stop working, I think
you would still expect IO to the other sectors or tracks to keep
working.

For this reason, the behaviour of md/raid0 is to continue to serve IO to
working devices, and only fail IO to failed/missing devices.
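
The behaviour described above can be illustrated with a toy model. This
is only a sketch in Python, not the md driver's code: it assumes
equal-sized members in a single striping zone, and the names
CHUNK_SECTORS, member_for_sector and submit_io are made up for
illustration.

```python
# Toy model of raid0 striping; NOT the md driver's actual code.
# Assumption: all members are the same size (one striping zone).
CHUNK_SECTORS = 128  # assumed chunk size: 64 KiB with 512-byte sectors


def member_for_sector(sector, n_members, chunk_sectors=CHUNK_SECTORS):
    """Return the index of the member device that holds an array sector."""
    return (sector // chunk_sectors) % n_members


def submit_io(sector, n_members, missing):
    """Return True if the IO can be served, False if it hits a missing member.

    `missing` is a set of member indices that have failed or been removed.
    IO that maps to a working member still succeeds.
    """
    return member_for_sector(sector, n_members) not in missing
```

With two members and member 1 missing, IO to sectors in the first chunk
(on member 0) still succeeds in this model, while IO to sectors in the
second chunk (on member 1) fails.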

It seems reasonable that you might want a different behaviour, but I
think that should be optional. i.e. you would need to explicitly set a
"one-out-all-out" flag on the array. I'm not sure if this should cause
reads to fail, but it seems quite reasonable that it would cause all
writes to fail.

I would only change the kernel to recognise the flag and refuse any
writes after any error has been seen.
I would use udev/mdadm to detect a device removal and to mark the
relevant component device as missing.
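
The optional "one-out-all-out" policy suggested above could be sketched
roughly as follows. This is a hypothetical Python model, not kernel
code: the class Raid0Policy and its methods are invented for
illustration, and it assumes reads to working devices keep being served
(the message leaves that point open).

```python
# Hypothetical model of an optional "one-out-all-out" flag for raid0.
# The names here are illustrative only; no such kernel interface is implied.
class Raid0Policy:
    def __init__(self, one_out_all_out=False):
        self.one_out_all_out = one_out_all_out  # opt-in flag, off by default
        self.error_seen = False                 # set once any error is recorded

    def record_error(self):
        """Note a failure, e.g. userspace marked a component device missing."""
        self.error_seen = True

    def allow(self, is_write, target_missing):
        """Decide whether a single IO request is served."""
        if target_missing:
            # IO to a failed/missing member always fails and counts as an error.
            self.record_error()
            return False
        if self.one_out_all_out and self.error_seen and is_write:
            # With the flag set, refuse every write after the first error.
            return False
        # Otherwise keep serving IO to working members (current behaviour).
        return True
```

Without the flag, the model keeps today's behaviour: only IO that
targets a missing member fails. With the flag set, the first error
makes every later write fail, while reads from working members are
still served.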

Thread overview: 7+ messages
2018-08-01 14:56 [RFC] [PATCH 0/1] Introduce emergency raid0 stop for mounted arrays Guilherme G. Piccoli
2018-08-01 14:56 ` [RFC] [PATCH 1/1] md/raid0: Introduce emergency stop for raid0 arrays Guilherme G. Piccoli
2018-08-02 1:51 ` NeilBrown [this message]
2018-08-02 13:30 ` [RFC] [PATCH 0/1] Introduce emergency raid0 stop for mounted arrays Guilherme G. Piccoli
2018-08-02 21:37 ` NeilBrown
2018-08-09 23:17 ` Guilherme G. Piccoli
2018-09-03 12:16 ` Guilherme G. Piccoli