All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dmitry Vyukov <dvyukov@google.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>,
	akpm@linux-foundation.org, mhocko@suse.com, vbabka@suse.cz,
	hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de,
	dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
	void@manifault.com, peterz@infradead.org, juri.lelli@redhat.com,
	ldufour@linux.ibm.com, peterx@redhat.com, david@redhat.com,
	axboe@kernel.dk, mcgrof@kernel.org, masahiroy@kernel.org,
	nathan@kernel.org, changbin.du@intel.com, ytcoode@gmail.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com,
	vschneid@redhat.com, cl@linux.com, penberg@kernel.org,
	iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com, glider@google.com,
	elver@google.com, shakeelb@google.com, songmuchun@bytedance.com,
	arnd@arndb.de, jbaron@akamai.com, rientjes@google.com,
	minchan@google.com, kaleshsingh@google.com,
	kernel-team@android.com, linux-mm@kvack.org,
	iommu@lists.linux.dev, kasan-dev@googlegroups.com,
	io-uring@vger.kernel.org, linux-arch@vger.kernel.org,
	xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org,
	linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 22/30] Code tagging based fault injection
Date: Thu, 1 Sep 2022 10:43:26 +0200	[thread overview]
Message-ID: <CACT4Y+bMeqvWQwqzG3nfcf0-VOjU7usxht5mKgUwMcOpWKRjxQ@mail.gmail.com> (raw)
In-Reply-To: <20220831173010.wc5j3ycmfjx6ezfu@moria.home.lan>

 On Wed, 31 Aug 2022 at 19:30, Kent Overstreet
<kent.overstreet@linux.dev> wrote:
> > > From: Kent Overstreet <kent.overstreet@linux.dev>
> > >
> > > This adds a new fault injection capability, based on code tagging.
> > >
> > > To use, simply insert somewhere in your code
> > >
> > >   dynamic_fault("fault_class_name")
> > >
> > > and check whether it returns true - if so, inject the error.
> > > For example
> > >
> > >   if (dynamic_fault("init"))
> > >       return -EINVAL;
> >
> > Hi Suren,
> >
> > If this is going to be used by mainline kernel, it would be good to
> > integrate this with fail_nth systematic fault injection:
> > https://elixir.bootlin.com/linux/latest/source/lib/fault-inject.c#L109
> >
> > Otherwise these dynamic sites won't be tested by testing systems doing
> > systematic fault injection testing.
>
> That's a discussion we need to have, yeah. We don't want two distinct fault
> injection frameworks, we'll have to have a discussion as to whether this is (or
> can be) better enough to make a switch worthwhile, and whether a compatibility
> interface is needed - or maybe there's enough distinct interesting bits in both
> to make merging plausible?
>
> The debugfs interface for this fault injection code is necessarily different
> from our existing fault injection - this gives you a fault injection point _per
> callsite_, which is huge - e.g. for filesystem testing what I need is to be able
> to enable fault injection points within a given module. I can do that easily
> with this, not with our current fault injection.
>
> I think the per-callsite fault injection points would also be pretty valuable
> for CONFIG_FAULT_INJECTION_USERCOPY, too.
>
> OTOH, existing kernel fault injection can filter based on task - this fault
> injection framework doesn't have that. Easy enough to add, though. Similar for
> the interval/probability/ratelimit stuff.
>
> fail_function is the odd one out, I'm not sure how that would fit into this
> model. Everything else I've seen I think fits into this model.
>
> Also, it sounds like you're more familiar with our existing fault injection than
> I am, so if I've misunderstood anything about what it can do please do correct
> me.

What you are saying makes sense. But I can't say if we want to do a
global switch or not. I don't know how many existing users there are
(by users I mean automated testing b/c humans can switch for one-off
manual testing).

However, fail_nth that I mentioned is orthogonal to this. It's a
different mechanism to select the fault site that needs to be failed
(similar to what you mentioned as "interval/probability/ratelimit
stuff"). fail_nth allows to fail the specified n-th call site in the
specified task. And that's the only mechanism we use in
syzkaller/syzbot.
And I think it can be supported relatively easily (copy a few lines to
the "does this site needs to fail" check).

I don't know how exactly you want to use this new mechanism, but I
found fail_nth much better than any of the existing selection
mechanisms, including what this will add for specific site failing.

fail_nth allows to fail every site in a given test/syscall one-by-one
systematically. E.g. we can even have strace-like utility that repeats
the given test failing all sites in to systematically:
$ fail_all ./a_unit_test
This can be integrated into any CI system, e.g. running all LTP tests with this.

For file:line-based selection, first, we need to get these file:line
from somewhere; second, lines are changing over time so can't be
hardcoded in tests; third, it still needs to be per-task, since
unrelated processes can execute the same code.

One downside of fail_nth, though, is that it does not cover background
threads/async work. But we found that there are so many untested
synchronous error paths, that moving to background threads is not
necessary at this point.



> Interestingly: I just discovered from reading the code that
> CONFIG_FAULT_INJECTION_STACKTRACE_FILTER is a thing (hadn't before because it
> depends on !X86_64 - what?). That's cool, though.

  reply	other threads:[~2022-09-01  8:43 UTC|newest]

Thread overview: 142+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-30 21:48 [RFC PATCH 00/30] Code tagging framework and applications Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 01/30] kernel/module: move find_kallsyms_symbol_value declaration Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 02/30] lib/string_helpers: Drop space in string_get_size's output Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 03/30] Lazy percpu counters Suren Baghdasaryan
2022-08-31 10:02   ` Mel Gorman
2022-08-31 15:37     ` Suren Baghdasaryan
2022-08-31 16:20     ` Kent Overstreet
2022-09-01  6:51   ` Peter Zijlstra
2022-09-01 14:32     ` Kent Overstreet
2022-09-01 14:48       ` Steven Rostedt
2022-09-01 15:43         ` Kent Overstreet
2022-09-01 18:59       ` Peter Zijlstra
2022-08-30 21:48 ` [RFC PATCH 04/30] scripts/kallysms: Always include __start and __stop symbols Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 05/30] lib: code tagging framework Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 06/30] lib: code tagging module support Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 07/30] lib: add support for allocation tagging Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 08/30] lib: introduce page " Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 09/30] change alloc_pages name in dma_map_ops to avoid name conflicts Suren Baghdasaryan
2022-08-30 21:48 ` [RFC PATCH 10/30] mm: enable page allocation tagging for __get_free_pages and alloc_pages Suren Baghdasaryan
2022-08-31 10:11   ` Mel Gorman
2022-08-31 15:45     ` Suren Baghdasaryan
2022-08-31 15:52       ` Suren Baghdasaryan
2022-08-31 17:46     ` Kent Overstreet
2022-09-01  1:07       ` Suren Baghdasaryan
2022-09-01  7:41       ` Peter Zijlstra
2022-08-30 21:49 ` [RFC PATCH 11/30] mm: introduce slabobj_ext to support slab object extensions Suren Baghdasaryan
2022-09-01 23:35   ` Roman Gushchin
2022-09-02  0:23     ` Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 12/30] mm: introduce __GFP_NO_OBJ_EXT flag to selectively prevent slabobj_ext creation Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 13/30] mm/slab: introduce SLAB_NO_OBJ_EXT to avoid obj_ext creation Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 14/30] mm: prevent slabobj_ext allocations for slabobj_ext and kmem_cache objects Suren Baghdasaryan
2022-09-01 23:40   ` Roman Gushchin
2022-09-02  0:24     ` Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 15/30] lib: introduce slab allocation tagging Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 16/30] mm: enable slab allocation tagging for kmalloc and friends Suren Baghdasaryan
2022-08-31  3:26   ` kernel test robot
2022-09-01 23:50   ` Roman Gushchin
2022-08-30 21:49 ` [RFC PATCH 17/30] lib/string.c: strsep_no_empty() Suren Baghdasaryan
2022-08-31  6:41   ` kernel test robot
2022-08-30 21:49 ` [RFC PATCH 18/30] codetag: add codetag query helper functions Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 19/30] move stack capture functionality into a separate function for reuse Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 20/30] lib: introduce support for storing code tag context Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 21/30] lib: implement context capture support for page and slab allocators Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 22/30] Code tagging based fault injection Suren Baghdasaryan
2022-08-31  1:51   ` Randy Dunlap
2022-08-31 15:56     ` Suren Baghdasaryan
2022-08-31 10:37   ` Dmitry Vyukov
2022-08-31 15:51     ` Suren Baghdasaryan
2022-08-31 17:30     ` Kent Overstreet
2022-09-01  8:43       ` Dmitry Vyukov [this message]
2022-08-30 21:49 ` [RFC PATCH 23/30] timekeeping: Add a missing include Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 24/30] wait: Clean up waitqueue_entry initialization Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 25/30] lib/time_stats: New library for statistics on events Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 26/30] bcache: Convert to lib/time_stats Suren Baghdasaryan
2022-08-30 21:49 ` [RFC PATCH 27/30] Code tagging based latency tracking Suren Baghdasaryan
2022-08-31  1:53   ` Randy Dunlap
2022-08-31 15:55     ` Suren Baghdasaryan
2022-09-01  7:11   ` Peter Zijlstra
2022-09-01 14:43     ` Kent Overstreet
2022-09-01 21:38   ` Steven Rostedt
2022-09-01 21:46     ` Steven Rostedt
2022-09-01 21:54     ` Kent Overstreet
2022-09-01 22:34       ` Steven Rostedt
2022-09-01 22:55         ` Kent Overstreet
2022-09-02  0:23           ` Steven Rostedt
2022-09-02  1:35             ` Kent Overstreet
2022-09-02  1:59               ` Steven Rostedt
2022-08-30 21:49 ` [RFC PATCH 28/30] Improved symbolic error names Suren Baghdasaryan
2022-08-31  2:46   ` kernel test robot
2022-09-01 23:19   ` Joe Perches
2022-09-01 23:26     ` Kent Overstreet
2022-08-30 21:49 ` [RFC PATCH 29/30] dyndbg: Convert to code tagging Suren Baghdasaryan
2022-08-31  3:26   ` kernel test robot
2022-08-30 21:49 ` [RFC PATCH 30/30] MAINTAINERS: Add entries for code tagging & related Suren Baghdasaryan
2022-08-31  7:38 ` [RFC PATCH 00/30] Code tagging framework and applications Peter Zijlstra
2022-08-31  8:42   ` Kent Overstreet
2022-08-31 10:19     ` Mel Gorman
2022-08-31 10:47       ` Michal Hocko
2022-08-31 15:28         ` Suren Baghdasaryan
2022-08-31 16:48           ` Suren Baghdasaryan
2022-08-31 19:01         ` Kent Overstreet
2022-08-31 20:56           ` Yosry Ahmed
2022-08-31 21:38             ` Suren Baghdasaryan
2022-09-01 22:27             ` Roman Gushchin
2022-09-01 22:37               ` Kent Overstreet
2022-09-01 22:53                 ` Roman Gushchin
2022-09-01 23:36                   ` Suren Baghdasaryan
2022-09-02  0:17                   ` Kent Overstreet
2022-09-02  1:04                     ` Roman Gushchin
2022-09-02  1:16                       ` Kent Overstreet
2022-09-02 12:02                       ` Jens Axboe
2022-09-02 19:48                         ` Kent Overstreet
2022-09-02 19:53                           ` Jens Axboe
2022-09-02 20:05                             ` Kent Overstreet
2022-09-02 20:23                               ` Jens Axboe
2022-09-01  7:18           ` Michal Hocko
2022-09-01 15:33             ` Suren Baghdasaryan
2022-09-01 19:15               ` Michal Hocko
2022-09-01 19:39                 ` Suren Baghdasaryan
2022-09-01 20:15                   ` Kent Overstreet
2022-09-05  8:49                     ` Michal Hocko
2022-09-05 23:46                       ` Kent Overstreet
2022-09-06  7:23                         ` Michal Hocko
2022-09-06 18:20                           ` Kent Overstreet
2022-09-07 11:00                             ` Michal Hocko
2022-09-07 13:04                               ` Kent Overstreet
2022-09-07 13:45                                 ` Steven Rostedt
2022-09-08  6:35                                   ` Kent Overstreet
2022-09-08  6:49                                     ` Suren Baghdasaryan
2022-09-08  7:07                                       ` Kent Overstreet
2022-09-08  7:12                                     ` Michal Hocko
2022-09-08  7:29                                       ` Kent Overstreet
2022-09-08  7:47                                         ` Michal Hocko
2022-09-05  1:32                 ` Suren Baghdasaryan
2022-09-05  8:12                   ` Michal Hocko
2022-09-05  8:58                     ` Marco Elver
2022-09-05 18:07                       ` Suren Baghdasaryan
2022-09-05 18:03                     ` Suren Baghdasaryan
2022-09-06  8:01                       ` Michal Hocko
2022-09-06 15:35                         ` Suren Baghdasaryan
2022-09-05 15:07                   ` Steven Rostedt
2022-09-05 18:08                     ` Suren Baghdasaryan
2022-09-05 20:42                       ` Kent Overstreet
2022-09-05 22:16                         ` Steven Rostedt
2022-09-05 23:50                           ` Kent Overstreet
2022-09-01  8:05           ` David Hildenbrand
2022-09-01 14:23             ` Kent Overstreet
2022-09-01 15:07               ` David Hildenbrand
2022-09-01 15:39                 ` Suren Baghdasaryan
2022-09-01 15:48                 ` Kent Overstreet
2022-08-31 15:59       ` Kent Overstreet
2022-09-01  7:05         ` Peter Zijlstra
2022-09-01  7:36           ` Daniel Bristot de Oliveira
2022-09-01  7:42           ` Peter Zijlstra
2022-09-01 11:05         ` Mel Gorman
2022-09-01 16:31           ` Kent Overstreet
2022-09-01  7:00       ` Peter Zijlstra
2022-09-01 14:29         ` Kent Overstreet
2022-09-05 18:44       ` Nadav Amit
2022-09-05 19:16         ` Steven Rostedt
2022-09-01  4:52 ` Oscar Salvador
2022-09-01  5:05   ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACT4Y+bMeqvWQwqzG3nfcf0-VOjU7usxht5mKgUwMcOpWKRjxQ@mail.gmail.com \
    --to=dvyukov@google.com \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=changbin.du@intel.com \
    --cc=cl@linux.com \
    --cc=dave@stgolabs.net \
    --cc=david@redhat.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=elver@google.com \
    --cc=glider@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=io-uring@vger.kernel.org \
    --cc=iommu@lists.linux.dev \
    --cc=jbaron@akamai.com \
    --cc=juri.lelli@redhat.com \
    --cc=kaleshsingh@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=kent.overstreet@linux.dev \
    --cc=kernel-team@android.com \
    --cc=ldufour@linux.ibm.com \
    --cc=liam.howlett@oracle.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-modules@vger.kernel.org \
    --cc=masahiroy@kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=minchan@google.com \
    --cc=nathan@kernel.org \
    --cc=penberg@kernel.org \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=rostedt@goodmis.org \
    --cc=shakeelb@google.com \
    --cc=songmuchun@bytedance.com \
    --cc=surenb@google.com \
    --cc=vbabka@suse.cz \
    --cc=vincent.guittot@linaro.org \
    --cc=void@manifault.com \
    --cc=vschneid@redhat.com \
    --cc=willy@infradead.org \
    --cc=xen-devel@lists.xenproject.org \
    --cc=ytcoode@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.