linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hannes Reinecke <hare@suse.de>
To: Luis Chamberlain <mcgrof@kernel.org>, axboe@kernel.dk
Cc: bvanassche@acm.org, ming.lei@redhat.com, hch@infradead.org,
	jack@suse.cz, osandov@fb.com, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 8/8] block: add add_disk() failure injection support
Date: Wed, 12 May 2021 17:22:48 +0200	[thread overview]
Message-ID: <e938c21f-3872-232c-4956-dfa53aec579b@suse.de> (raw)
In-Reply-To: <20210512064629.13899-9-mcgrof@kernel.org>

On 5/12/21 8:46 AM, Luis Chamberlain wrote:
> For a long time we have lived without any error handling
> on the add_disk() error path. Now that we have some initial
> error handling, add error injection support for its path so
> that we can test it and ensure we don't regress this path
> moving forward.
> 
> This only adds runtime code *iff* the new bool CONFIG_FAIL_ADD_DISK is
> enabled in your kernel. If you don't have this enabled this provides
> no new functional. When CONFIG_FAIL_ADD_DISK is disabled the new routine
> blk_should_fail_add_disk() ends up being transformed to if (false), and
> so the compiler should optimize these out as dead code producing no
> new effective binary changes.
> 
> Failure injection lets us configure at boot how often we want a failure
> to take place by specifying the interval, the probability, and when needed
> a size constraint. We don't need to test for size constraints for
> add_disk() and so ignore that part of error injection. Although testing
> early boot failures with add_disk() failures might be useful we don't
> to make add_disk() fail every time as otherwise we wouldn't be able to
> boot. So enabling add_disk() error injection requires a second post
> boot step where you specify where in the add_disk() code path you want
> to enable failure injection for. This lets us verify correctness of
> the different error handling parts of add_disk(), while also allowing
> a respective blktests test to grow dynamically in case the add_disk()
> paths grows.
> 
> We currently enable 11 code paths on add_disk() which can fail
> and we can test for:
> 
> 	# ls -1 /sys/kernel/debug/block/config_fail_add_disk/
> 	alloc_devt
> 	alloc_events
> 	bdi_register
> 	device_add
> 	disk_add_events
> 	get_queue
> 	integrity_add
> 	register_disk
> 	register_queue
> 	sysfs_bdi_link
> 	sysfs_depr_link
> 
> If you want to modify the configuration of fail_add_disk dynamically
> at boot, you can enable CONFIG_FAULT_INJECTION_DEBUG_FS. If you've
> enabled CONFIG_FAIL_ADD_DISK you will see these knobs:
> 
> 	# ls -1 /sys/kernel/debug/block/fail_add_disk/
> 	interval
> 	probability
> 	space
> 	task-filter
> 	times
> 	verbose
> 	verbose_ratelimit_burst
> 	verbose_ratelimit_interval_ms
> 
> Suggested-by: Bart Van Assche <bvanassche@acm.org>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>   .../fault-injection/fault-injection.rst       | 23 ++++++++
>   block/Makefile                                |  1 +
>   block/blk-core.c                              |  1 +
>   block/blk.h                                   | 55 ++++++++++++++++++
>   block/failure-injection.c                     | 54 ++++++++++++++++++
>   block/genhd.c                                 | 57 +++++++++++++++++++
>   lib/Kconfig.debug                             | 13 +++++
>   7 files changed, 204 insertions(+)
>   create mode 100644 block/failure-injection.c
> 
[ .. ]
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index d1467658361f..4fccc0fad190 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1917,6 +1917,19 @@ config FAULT_INJECTION_USERCOPY
>   	  Provides fault-injection capability to inject failures
>   	  in usercopy functions (copy_from_user(), get_user(), ...).
>   
> +config FAIL_ADD_DISK
> +	bool "Fault-injection capability for add_disk() callers"
> +	depends on FAULT_INJECTION && BLOCK
> +	help
> +	  Provide fault-injection capability for the add_disk() block layer
> +	  call path. This allows the kernel to provide error injection when
> +	  the add_disk() call is made. You would use something like blktests
> +	  test against this or just load the null_blk driver. This only
> +	  enables the error injection functionality. To use it you must
> +	  configure which path you want to trigger on error on using debugfs
> +	  under /sys/kernel/debug/block/config_fail_add_disk/. By default
> +	  all of these are disabled.
> +
>   config FAIL_MAKE_REQUEST
>   	bool "Fault-injection capability for disk IO"
>   	depends on FAULT_INJECTION && BLOCK
> 

Hmm. Not a fan of this approach.

Having to have a separate piece of code just to test individual 
functions, _and_ having to place hooks in the code to _simulate_ a 
failure seems rather fragile to me.

I would have vastly preferred if we could to this via generic tools like 
ebpf or livepatching.
Also I'm worried that this approach doesn't really scale; taken to 
extremes we would have to add duplicate calls to each and every function 
for full error injection, essentially double the size of the code just 
on the off-chance that someone wants to do error injection.

So I'd rather delegate the topic of error injection to a more general 
discussion (LSF springs to mind ...), and then agree on a framework 
which is suitable for every function.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

  reply	other threads:[~2021-05-12 15:50 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-12  6:46 [PATCH v1 0/8] block: add error handling for *add_disk*() Luis Chamberlain
2021-05-12  6:46 ` [PATCH v1 1/8] block: refcount the request_queue early in __device_add_disk() Luis Chamberlain
2021-05-12 15:07   ` Hannes Reinecke
2021-05-12  6:46 ` [PATCH v1 2/8] block: move disk announce work from register_disk() to a helper Luis Chamberlain
2021-05-12 15:08   ` Hannes Reinecke
2021-05-12  6:46 ` [PATCH v1 3/8] block: move disk invalidation from del_gendisk() into " Luis Chamberlain
2021-05-12 15:09   ` Hannes Reinecke
2021-05-12  6:46 ` [PATCH v1 4/8] block: move disk unregistration work from del_gendisk() to " Luis Chamberlain
2021-05-12 15:09   ` Hannes Reinecke
2021-05-12  6:46 ` [PATCH v1 5/8] block: add initial error handling for *add_disk()* and friends Luis Chamberlain
2021-05-12 15:15   ` Hannes Reinecke
2021-05-12  6:46 ` [PATCH v1 6/8] loop: add error handling support for add_disk() Luis Chamberlain
2021-05-12 15:15   ` Hannes Reinecke
2021-05-12  6:46 ` [PATCH v1 7/8] null_blk: " Luis Chamberlain
2021-05-12 15:16   ` Hannes Reinecke
2021-05-12 16:47     ` Luis Chamberlain
2021-05-12 17:12       ` Hannes Reinecke
2021-05-12 17:20         ` Luis Chamberlain
2021-05-12 17:28           ` Hannes Reinecke
2021-05-19 19:57             ` Luis Chamberlain
2021-05-12  6:46 ` [PATCH v1 8/8] block: add add_disk() failure injection support Luis Chamberlain
2021-05-12 15:22   ` Hannes Reinecke [this message]
2021-05-12 16:56     ` Luis Chamberlain
2021-05-12 17:55       ` Hannes Reinecke
2021-05-12 14:44 ` [PATCH v1 0/8] block: add error handling for *add_disk*() Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e938c21f-3872-232c-4956-dfa53aec579b@suse.de \
    --to=hare@suse.de \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=osandov@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).