From: Hannes Reinecke <hare@suse.de>
To: Luis Chamberlain <mcgrof@kernel.org>, axboe@kernel.dk
Cc: bvanassche@acm.org, ming.lei@redhat.com, hch@infradead.org,
jack@suse.cz, osandov@fb.com, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 8/8] block: add add_disk() failure injection support
Date: Wed, 12 May 2021 17:22:48 +0200
Message-ID: <e938c21f-3872-232c-4956-dfa53aec579b@suse.de>
In-Reply-To: <20210512064629.13899-9-mcgrof@kernel.org>
On 5/12/21 8:46 AM, Luis Chamberlain wrote:
> For a long time we have lived without any error handling
> on the add_disk() error path. Now that we have some initial
> error handling, add error injection support for its path so
> that we can test it and ensure we don't regress this path
> moving forward.
>
> This only adds runtime code *iff* the new bool CONFIG_FAIL_ADD_DISK is
> enabled in your kernel. If you don't have this enabled this provides
> no new functionality. When CONFIG_FAIL_ADD_DISK is disabled the new routine
> blk_should_fail_add_disk() ends up being transformed to if (false), and
> so the compiler should optimize these out as dead code producing no
> new effective binary changes.
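
A minimal user-space sketch of the compile-out pattern described above, for illustration only. It is not the code from the patch: the helper names, the `fail_requested` knob, and the choice of -ENOMEM are all assumptions, and the real blk_should_fail_add_disk() signature is not shown in this message.

```c
#include <stdbool.h>
#include <errno.h>

/*
 * Hypothetical model, not the patch's code. Build with
 * -DCONFIG_FAIL_ADD_DISK to compile the injection hook in.
 */
#ifdef CONFIG_FAIL_ADD_DISK
static bool fail_requested;	/* hypothetical runtime knob, off by default */

static inline bool should_fail_add_disk_sketch(void)
{
	return fail_requested;
}
#else
/* Constant false: the branch in the caller becomes dead code. */
static inline bool should_fail_add_disk_sketch(void)
{
	return false;
}
#endif

/* Hypothetical caller standing in for one step of add_disk(). */
static int add_disk_step_sketch(void)
{
	if (should_fail_add_disk_sketch())
		return -ENOMEM;	/* assumed error code for the sketch */
	return 0;
}
```

With the option undefined, the stub is a constant and an optimizing compiler can drop the error branch entirely, which is the "if (false)" effect the description refers to.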
>
> Failure injection lets us configure at boot how often we want a failure
> to take place by specifying the interval, the probability, and when needed
> a size constraint. We don't need to test for size constraints for
> add_disk() and so ignore that part of error injection. Although testing
> early boot failures with add_disk() failures might be useful we don't
> want to make add_disk() fail every time as otherwise we wouldn't be able to
> boot. So enabling add_disk() error injection requires a second post
> boot step where you specify where in the add_disk() code path you want
> to enable failure injection for. This lets us verify correctness of
> the different error handling parts of add_disk(), while also allowing
> a respective blktests test to grow dynamically in case the add_disk()
> path grows.
>
> We currently enable 11 code paths on add_disk() which can fail
> and we can test for:
>
> # ls -1 /sys/kernel/debug/block/config_fail_add_disk/
> alloc_devt
> alloc_events
> bdi_register
> device_add
> disk_add_events
> get_queue
> integrity_add
> register_disk
> register_queue
> sysfs_bdi_link
> sysfs_depr_link
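
The two-step gating described above (shared fault-injection knobs plus a per-path enable that defaults to off) can be modelled roughly as follows. This is a simplified user-space sketch under stated assumptions: `fault_attr_sketch`, `site_enabled`, and `should_fail_site()` are hypothetical names, only `probability` and `times` are modelled, and the kernel's real fault-injection logic has more checks than shown here.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for the shared knobs; not the kernel structure. */
struct fault_attr_sketch {
	unsigned int probability;	/* percent, 0..100 */
	int times;			/* failures left, -1 means unlimited */
};

/* A few of the listed paths, used only as labels in this sketch. */
enum add_disk_site {
	SITE_ALLOC_DEVT,
	SITE_DEVICE_ADD,
	SITE_REGISTER_QUEUE,
	SITE_COUNT
};

static struct fault_attr_sketch attr = { .probability = 0, .times = -1 };
static bool site_enabled[SITE_COUNT];	/* every path disabled by default */

static bool should_fail_site(enum add_disk_site site)
{
	if (!site_enabled[site])
		return false;		/* second step not done for this path */
	if (attr.times == 0)
		return false;		/* failure budget used up */
	if (attr.probability < 100 &&
	    (unsigned int)(rand() % 100) >= attr.probability)
		return false;		/* lost the probability draw */
	if (attr.times > 0)
		attr.times--;		/* consume one failure from the budget */
	return true;
}
```

Setting the shared knobs alone injects nothing in this model; a path fails only after its own switch is turned on, which is what keeps boot unaffected.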
>
> If you want to modify the configuration of fail_add_disk dynamically
> at boot, you can enable CONFIG_FAULT_INJECTION_DEBUG_FS. If you've
> enabled CONFIG_FAIL_ADD_DISK you will see these knobs:
>
> # ls -1 /sys/kernel/debug/block/fail_add_disk/
> interval
> probability
> space
> task-filter
> times
> verbose
> verbose_ratelimit_burst
> verbose_ratelimit_interval_ms
>
> Suggested-by: Bart Van Assche <bvanassche@acm.org>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  .../fault-injection/fault-injection.rst       | 23 ++++++++
>  block/Makefile                                |  1 +
>  block/blk-core.c                              |  1 +
>  block/blk.h                                   | 55 ++++++++++++++++++
>  block/failure-injection.c                     | 54 ++++++++++++++++++
>  block/genhd.c                                 | 57 +++++++++++++++++++
>  lib/Kconfig.debug                             | 13 +++++
>  7 files changed, 204 insertions(+)
>  create mode 100644 block/failure-injection.c
>
[ .. ]
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index d1467658361f..4fccc0fad190 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1917,6 +1917,19 @@ config FAULT_INJECTION_USERCOPY
>  	  Provides fault-injection capability to inject failures
>  	  in usercopy functions (copy_from_user(), get_user(), ...).
>
> +config FAIL_ADD_DISK
> +	bool "Fault-injection capability for add_disk() callers"
> +	depends on FAULT_INJECTION && BLOCK
> +	help
> +	  Provide fault-injection capability for the add_disk() block layer
> +	  call path. This allows the kernel to provide error injection when
> +	  the add_disk() call is made. You would use something like blktests
> +	  test against this or just load the null_blk driver. This only
> +	  enables the error injection functionality. To use it you must
> +	  configure which path you want to trigger on error on using debugfs
> +	  under /sys/kernel/debug/block/config_fail_add_disk/. By default
> +	  all of these are disabled.
> +
>  config FAIL_MAKE_REQUEST
>  	bool "Fault-injection capability for disk IO"
>  	depends on FAULT_INJECTION && BLOCK
>
Hmm. Not a fan of this approach.
Having to have a separate piece of code just to test individual
functions, _and_ having to place hooks in the code to _simulate_ a
failure seems rather fragile to me.
I would have vastly preferred it if we could do this via generic tools
like eBPF or livepatching.

Also I'm worried that this approach doesn't really scale; taken to
extremes we would have to add duplicate calls to each and every function
for full error injection, essentially doubling the size of the code just
on the off-chance that someone wants to do error injection.

So I'd rather delegate the topic of error injection to a more general
discussion (LSF springs to mind ...), and then agree on a framework
which is suitable for every function.

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer