linux-bcache.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dmitry Vyukov <dvyukov@google.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>,
	akpm@linux-foundation.org, mhocko@suse.com, vbabka@suse.cz,
	hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de,
	dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
	void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com,
	ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com,
	axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org,
	nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com,
	vschneid@redhat.com, cl@linux.com, penberg@kernel.org,
	iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com,
	elver@google.com, shakeelb@google.com, songmuchun@bytedance.com,
	arnd@arndb.de, jbaron@akamai.com, rientjes@google.com,
	minchan@google.com, kaleshsingh@google.com,
	kernel-team@android.com, linux-mm@kvack.org,
	iommu@lists.linux.dev, kasan-dev@googlegroups.com,
	io-uring@vger.kernel.org, linux-arch@vger.kernel.org,
	xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org,
	linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 22/30] Code tagging based fault injection
Date: Thu, 1 Sep 2022 10:43:26 +0200	[thread overview]
Message-ID: <CACT4Y+bMeqvWQwqzG3nfcf0-VOjU7usxht5mKgUwMcOpWKRjxQ@mail.gmail.com> (raw)
In-Reply-To: <20220831173010.wc5j3ycmfjx6ezfu@moria.home.lan>

 On Wed, 31 Aug 2022 at 19:30, Kent Overstreet
<kent.overstreet@linux.dev> wrote:
> > > From: Kent Overstreet <kent.overstreet@linux.dev>
> > >
> > > This adds a new fault injection capability, based on code tagging.
> > >
> > > To use, simply insert somewhere in your code
> > >
> > >   dynamic_fault("fault_class_name")
> > >
> > > and check whether it returns true - if so, inject the error.
> > > For example
> > >
> > >   if (dynamic_fault("init"))
> > >       return -EINVAL;
> >
> > Hi Suren,
> >
> > If this is going to be used by mainline kernel, it would be good to
> > integrate this with fail_nth systematic fault injection:
> > https://elixir.bootlin.com/linux/latest/source/lib/fault-inject.c#L109
> >
> > Otherwise these dynamic sites won't be tested by testing systems doing
> > systematic fault injection testing.
>
> That's a discussion we need to have, yeah. We don't want two distinct fault
> injection frameworks, we'll have to have a discussion as to whether this is (or
> can be) better enough to make a switch worthwhile, and whether a compatibility
> interface is needed - or maybe there's enough distinct interesting bits in both
> to make merging plausible?
>
> The debugfs interface for this fault injection code is necessarily different
> from our existing fault injection - this gives you a fault injection point _per
> callsite_, which is huge - e.g. for filesystem testing what I need is to be able
> to enable fault injection points within a given module. I can do that easily
> with this, not with our current fault injection.
>
> I think the per-callsite fault injection points would also be pretty valuable
> for CONFIG_FAULT_INJECTION_USERCOPY, too.
>
> OTOH, existing kernel fault injection can filter based on task - this fault
> injection framework doesn't have that. Easy enough to add, though. Similar for
> the interval/probability/ratelimit stuff.
>
> fail_function is the odd one out, I'm not sure how that would fit into this
> model. Everything else I've seen I think fits into this model.
>
> Also, it sounds like you're more familiar with our existing fault injection than
> I am, so if I've misunderstood anything about what it can do please do correct
> me.

What you are saying makes sense. But I can't say if we want to do a
global switch or not. I don't know how many existing users there are
(by users I mean automated testing b/c humans can switch for one-off
manual testing).

However, fail_nth that I mentioned is orthogonal to this. It's a
different mechanism to select the fault site that needs to be failed
(similar to what you mentioned as "interval/probability/ratelimit
stuff"). fail_nth allows to fail the specified n-th call site in the
specified task. And that's the only mechanism we use in
syzkaller/syzbot.
And I think it can be supported relatively easily (copy a few lines to
the "does this site needs to fail" check).

I don't know how exactly you want to use this new mechanism, but I
found fail_nth much better than any of the existing selection
mechanisms, including what this will add for specific site failing.

fail_nth allows to fail every site in a given test/syscall one-by-one
systematically. E.g. we can even have strace-like utility that repeats
the given test failing all sites in to systematically:
$ fail_all ./a_unit_test
This can be integrated into any CI system, e.g. running all LTP tests with this.

For file:line-based selection, first, we need to get these file:line
from somewhere; second, lines are changing over time so can't be
hardcoded in tests; third, it still needs to be per-task, since
unrelated processes can execute the same code.

One downside of fail_nth, though, is that it does not cover background
threads/async work. But we found that there are so many untested
synchronous error paths, that moving to background threads is not
necessary at this point.



> Interestingly: I just discovered from reading the code that
> CONFIG_FAULT_INJECTION_STACKTRACE_FILTER is a thing (hadn't before because it
> depends on !X86_64 - what?). That's cool, though.

  reply	other threads:[~2022-09-01  8:43 UTC|newest]

Thread overview: 138+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-30 21:48 [RFC PATCH 00/30] Code tagging framework and applications Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 01/30] kernel/module: move find_kallsyms_symbol_value declaration Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 02/30] lib/string_helpers: Drop space in string_get_size's output Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 03/30] Lazy percpu counters Suren Baghdasaryan
2022-08-31 10:02   ` Mel Gorman
2022-08-31 15:37     ` Suren Baghdasaryan
2022-08-31 16:20     ` Kent Overstreet
2022-09-01  6:51   ` Peter Zijlstra
2022-09-01 14:32     ` Kent Overstreet
2022-09-01 14:48       ` Steven Rostedt
2022-09-01 15:43         ` Kent Overstreet
2022-09-01 18:59       ` Peter Zijlstra
2022-08-30 21:48 ` [RFC PATCH 04/30] scripts/kallysms: Always include __start and __stop symbols Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 05/30] lib: code tagging framework Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 06/30] lib: code tagging module support Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 07/30] lib: add support for allocation tagging Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 08/30] lib: introduce page " Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 09/30] change alloc_pages name in dma_map_ops to avoid name conflicts Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 10/30] mm: enable page allocation tagging for __get_free_pages and alloc_pages Suren Baghdasaryan
2022-08-31 10:11   ` Mel Gorman
2022-08-31 15:45     ` Suren Baghdasaryan
2022-08-31 15:52       ` Suren Baghdasaryan
2022-08-31 17:46     ` Kent Overstreet
2022-09-01  1:07       ` Suren Baghdasaryan
2022-09-01  7:41       ` Peter Zijlstra
2022-08-30 21:49 ` [RFC PATCH 11/30] mm: introduce slabobj_ext to support slab object extensions Suren Baghdasaryan
2022-09-01 23:35   ` Roman Gushchin
2022-09-02  0:23     ` Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 12/30] mm: introduce __GFP_NO_OBJ_EXT flag to selectively prevent slabobj_ext creation Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 13/30] mm/slab: introduce SLAB_NO_OBJ_EXT to avoid obj_ext creation Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 14/30] mm: prevent slabobj_ext allocations for slabobj_ext and kmem_cache objects Suren Baghdasaryan
2022-09-01 23:40   ` Roman Gushchin
2022-09-02  0:24     ` Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 15/30] lib: introduce slab allocation tagging Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 16/30] mm: enable slab allocation tagging for kmalloc and friends Suren Baghdasaryan
2022-09-01 23:50   ` Roman Gushchin
2022-08-30 21:49 ` [RFC PATCH 17/30] lib/string.c: strsep_no_empty() Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 18/30] codetag: add codetag query helper functions Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 19/30] move stack capture functionality into a separate function for reuse Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 20/30] lib: introduce support for storing code tag context Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 21/30] lib: implement context capture support for page and slab allocators Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 22/30] Code tagging based fault injection Suren Baghdasaryan
2022-08-31  1:51   ` Randy Dunlap
2022-08-31 15:56     ` Suren Baghdasaryan
2022-08-31 10:37   ` Dmitry Vyukov
2022-08-31 15:51     ` Suren Baghdasaryan
2022-08-31 17:30     ` Kent Overstreet
2022-09-01  8:43       ` Dmitry Vyukov [this message]
2022-08-30 21:49 ` [RFC PATCH 23/30] timekeeping: Add a missing include Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 24/30] wait: Clean up waitqueue_entry initialization Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 25/30] lib/time_stats: New library for statistics on events Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 26/30] bcache: Convert to lib/time_stats Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 27/30] Code tagging based latency tracking Suren Baghdasaryan
2022-08-31  1:53   ` Randy Dunlap
2022-08-31 15:55     ` Suren Baghdasaryan
2022-09-01  7:11   ` Peter Zijlstra
2022-09-01 14:43     ` Kent Overstreet
2022-09-01 21:38   ` Steven Rostedt
2022-09-01 21:46     ` Steven Rostedt
2022-09-01 21:54     ` Kent Overstreet
2022-09-01 22:34       ` Steven Rostedt
2022-09-01 22:55         ` Kent Overstreet
2022-09-02  0:23           ` Steven Rostedt
2022-09-02  1:35             ` Kent Overstreet
2022-09-02  1:59               ` Steven Rostedt
2022-08-30 21:49 ` [RFC PATCH 28/30] Improved symbolic error names Suren Baghdasaryan
2022-09-01 23:19   ` Joe Perches
2022-09-01 23:26     ` Kent Overstreet
2022-08-30 21:49 ` [RFC PATCH 29/30] dyndbg: Convert to code tagging Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 30/30] MAINTAINERS: Add entries for code tagging & related Suren Baghdasaryan
2022-08-31  7:38 ` [RFC PATCH 00/30] Code tagging framework and applications Peter Zijlstra
2022-08-31  8:42   ` Kent Overstreet
2022-08-31 10:19     ` Mel Gorman
2022-08-31 10:47       ` Michal Hocko
2022-08-31 15:28         ` Suren Baghdasaryan
2022-08-31 16:48           ` Suren Baghdasaryan
2022-08-31 19:01         ` Kent Overstreet
2022-08-31 20:56           ` Yosry Ahmed
2022-08-31 21:38             ` Suren Baghdasaryan
2022-09-01 22:27             ` Roman Gushchin
2022-09-01 22:37               ` Kent Overstreet
2022-09-01 22:53                 ` Roman Gushchin
2022-09-01 23:36                   ` Suren Baghdasaryan
2022-09-02  0:17                   ` Kent Overstreet
2022-09-02  1:04                     ` Roman Gushchin
2022-09-02  1:16                       ` Kent Overstreet
2022-09-02 12:02                       ` Jens Axboe
2022-09-02 19:48                         ` Kent Overstreet
2022-09-02 19:53                           ` Jens Axboe
2022-09-02 20:05                             ` Kent Overstreet
2022-09-02 20:23                               ` Jens Axboe
2022-09-01  7:18           ` Michal Hocko
2022-09-01 15:33             ` Suren Baghdasaryan
2022-09-01 19:15               ` Michal Hocko
2022-09-01 19:39                 ` Suren Baghdasaryan
2022-09-01 20:15                   ` Kent Overstreet
2022-09-05  8:49                     ` Michal Hocko
2022-09-05 23:46                       ` Kent Overstreet
2022-09-06  7:23                         ` Michal Hocko
2022-09-06 18:20                           ` Kent Overstreet
2022-09-07 11:00                             ` Michal Hocko
2022-09-07 13:04                               ` Kent Overstreet
2022-09-07 13:45                                 ` Steven Rostedt
2022-09-08  6:35                                   ` Kent Overstreet
2022-09-08  6:49                                     ` Suren Baghdasaryan
2022-09-08  7:07                                       ` Kent Overstreet
2022-09-08  7:12                                     ` Michal Hocko
2022-09-08  7:29                                       ` Kent Overstreet
2022-09-08  7:47                                         ` Michal Hocko
2022-09-05  1:32                 ` Suren Baghdasaryan
2022-09-05  8:12                   ` Michal Hocko
2022-09-05  8:58                     ` Marco Elver
2022-09-05 18:07                       ` Suren Baghdasaryan
2022-09-05 18:03                     ` Suren Baghdasaryan
2022-09-06  8:01                       ` Michal Hocko
2022-09-06 15:35                         ` Suren Baghdasaryan
2022-09-05 15:07                   ` Steven Rostedt
2022-09-05 18:08                     ` Suren Baghdasaryan
2022-09-05 20:42                       ` Kent Overstreet
2022-09-05 22:16                         ` Steven Rostedt
2022-09-05 23:50                           ` Kent Overstreet
2022-09-01  8:05           ` David Hildenbrand
2022-09-01 14:23             ` Kent Overstreet
2022-09-01 15:07               ` David Hildenbrand
2022-09-01 15:39                 ` Suren Baghdasaryan
2022-09-01 15:48                 ` Kent Overstreet
2022-08-31 15:59       ` Kent Overstreet
2022-09-01  7:05         ` Peter Zijlstra
2022-09-01  7:36           ` Daniel Bristot de Oliveira
2022-09-01  7:42           ` Peter Zijlstra
2022-09-01 11:05         ` Mel Gorman
2022-09-01 16:31           ` Kent Overstreet
2022-09-01  7:00       ` Peter Zijlstra
2022-09-01 14:29         ` Kent Overstreet
2022-09-05 18:44       ` Nadav Amit
2022-09-05 19:16         ` Steven Rostedt
2022-09-01  4:52 ` Oscar Salvador
2022-09-01  5:05   ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACT4Y+bMeqvWQwqzG3nfcf0-VOjU7usxht5mKgUwMcOpWKRjxQ@mail.gmail.com \
    --to=dvyukov@google.com \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=changbin.du@intel.com \
    --cc=cl@linux.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=io-uring@vger.kernel.org \
    --cc=iommu@lists.linux.dev \
    --cc=jbaron@akamai.com \
    --cc=juri.lelli@redhat.com \
    --cc=kaleshsingh@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kent.overstreet@linux.dev \
    --cc=kernel-team@android.com \
    --cc=ldufour@linux.ibm.com \
    --cc=liam.howlett@oracle.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=masahiroy@kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=minchan@google.com \
    --cc=nathan@kernel.org \
    --cc=penberg@kernel.org \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=rostedt@goodmis.org \
    --cc=shakeelb@google.com \
    --cc=songmuchun@bytedance.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=vincent.guittot@linaro.org \
    --cc=void@manifault.com \
    --cc=vschneid@redhat.com \
    --cc=willy@infradead.org \
    --cc=xen-devel@lists.xenproject.org \
    --cc=ytcoode@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).