From: Marco Elver <elver@google.com>
To: Qian Cai <cai@lca.pw>
Cc: LKMM Maintainers -- Akira Yokosawa <akiyks@gmail.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Alexander Potapenko <glider@google.com>,
	Andrea Parri <parri.andrea@gmail.com>,
	Andrey Konovalov <andreyknvl@google.com>,
	Andy Lutomirski <luto@kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>, Boqun Feng <boqun.feng@gmail.com>,
	Borislav Petkov <bp@alien8.de>, Daniel Axtens <dja@axtens.net>,
	Daniel Lustig <dlustig@nvidia.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Howells <dhowells@redhat.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Jade Alglave <j.alglave@ucl.ac.uk>,
	Joel Fernandes <joel@joelfernandes.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Luc Maranget <luc.maranget@inria.fr>,
	Mark Rutland <mark.rutland@arm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>, Eric Dumazet <edumazet@google.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	linux-arch <linux-arch@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	linux-efi@vger.kernel.org,
	Linux Kbuild mailing list <linux-kbuild@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	the arch/x86 maintainers <x86@kernel.org>
Subject: Re: [PATCH v4 00/10] Add Kernel Concurrency Sanitizer (KCSAN)
Date: Wed, 20 Nov 2019 16:54:48 +0100
Message-ID: <20191120155448.GA21320@google.com>
In-Reply-To: <CANpmjNPynCwYc8-GKTreJ8HF81k14JAHZXLt0jQJr_d+ukL=6A@mail.gmail.com>

On Tue, 19 Nov 2019, Marco Elver wrote:

> On Tue, 19 Nov 2019 at 21:13, Qian Cai <cai@lca.pw> wrote:
> >
> > On Thu, 2019-11-14 at 19:02 +0100, 'Marco Elver' via kasan-dev wrote:
> > > This is the patch-series for the Kernel Concurrency Sanitizer (KCSAN).
> > > KCSAN is a sampling watchpoint-based *data race detector*. More details
> > > are included in **Documentation/dev-tools/kcsan.rst**. This patch-series
> > > only enables KCSAN for x86, but we expect adding support for other
> > > architectures is relatively straightforward (we are aware of
> > > experimental ARM64 and POWER support).
> >
> > This does not allow the system to boot. Just hang forever at the end.
> >
> > https://cailca.github.io/files/dmesg.txt
> >
> > the config (deselect KASAN and select KCSAN with default options):
> >
> > https://raw.githubusercontent.com/cailca/linux-mm/master/x86.config
> 
> Thanks! That config enables lots of other debug code. I could
> reproduce the hang. It's related to CONFIG_PROVE_LOCKING etc.
> 
> The problem is definitely not the fact that kcsan_setup_watchpoint
> disables interrupts (tested by removing that code). Although lockdep
> still complains here, and looking at the code in kcsan/core.c, I just
> can't see how local_irq_restore cannot be called before returning (in
> the stacktrace you provided, there is no kcsan function), and
> interrupts should always be re-enabled. (Interrupts are only disabled
> during delay in kcsan_setup_watchpoint.)
> 
> What I also notice is that this happens when the console starts
> getting spammed with data-race reports (presumably because some extra
> debug code has lots of data races according to KCSAN).
> 
> My guess is that some of the extra debug logic enabled in that config
> is incompatible with KCSAN. However, so far I cannot tell where
> exactly the problem is. For now the work-around would be not using
> KCSAN with these extra debug options. I will investigate more, but
> nothing obviously wrong stands out.

It seems that because spinlock_debug.c contains data races, the console
gets spammed with reports. It is also possible to hit a deadlock, e.g.
printk lock -> spinlock_debug -> KCSAN detects data race
-> kcsan_print_report() -> printk lock -> deadlock.
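
To make the chain above concrete, here is a minimal userspace sketch. It
is only a model: every name in it (model_printk, model_report_race, and
so on) is hypothetical and none of it is kernel or KCSAN code. It assumes
a non-recursive print lock whose debug checks always trigger a race
report, and it shows one generic way to avoid self-deadlock (dropping the
report when the print lock is already held). That is an illustration
only, not the fix proposed in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static bool console_lock_held;	/* stands in for the non-recursive printk lock */
static int lines_printed;
static int reports_dropped;

static void model_report_race(void);

/* Stands in for lock-debugging code that the detector flags as racy. */
static void model_lock_debug_check(void)
{
	model_report_race();
}

static void model_console_lock(void)
{
	/* A real non-recursive lock would spin forever here: the deadlock. */
	assert(!console_lock_held);
	console_lock_held = true;
	/* Debug checks run with the lock held, which is what closes the loop. */
	model_lock_debug_check();
}

static void model_console_unlock(void)
{
	console_lock_held = false;
}

static void model_printk(void)
{
	model_console_lock();
	lines_printed++;
	model_console_unlock();
}

/* Stands in for the report path, which itself needs the print lock. */
static void model_report_race(void)
{
	if (console_lock_held) {
		/* Printing now would re-take the lock, so drop the report. */
		reports_dropped++;
		return;
	}
	model_printk();
}
```

Without the check in model_report_race(), the first model_printk() call
would re-enter model_console_lock() with the lock already held and trip
the assertion, which is the model's stand-in for the deadlock.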

So the best thing is to fix the data races in spinlock_debug. I will
send a patch separately for you to test.

As for lockdep still reporting an inconsistency in IRQ flags tracing, I
cannot yet say what the problem is. Lockdep's IRQ flags tracing may
interact badly with KCSAN for a number of reasons. For example, if the
lockdep and IRQ flags tracing code is itself instrumented, it calls into
KCSAN, which disables and re-enables interrupts and, because of tracing,
calls back into the lockdep code. In other words, there may be some
recursion which corrupts hardirqs_enabled.
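
One hypothetical ordering that would produce such corruption can be
modelled in userspace as below. This is a sketch under stated
assumptions, not a confirmed diagnosis: all names are made up, the
tracing hooks are assumed to be instrumented after they update their
state, and the "on" hook is assumed to run before interrupts are
actually enabled. The in_detector guard is also an assumption, added so
the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static bool real_irqs_on = true;	/* models the CPU interrupt flag */
static bool shadow_irqs_on = true;	/* models the tracer's view (hardirqs_enabled) */
static bool in_detector;		/* assumed guard so the model terminates */

static void model_instrumented_access(void);

/* Tracing hooks: update the shadow state, then hit an instrumented access. */
static void model_trace_irqs_off(void)
{
	shadow_irqs_on = false;
	model_instrumented_access();
}

static void model_trace_irqs_on(void)
{
	shadow_irqs_on = true;
	model_instrumented_access();
}

static bool model_irq_save(void)
{
	bool flags = real_irqs_on;

	real_irqs_on = false;
	model_trace_irqs_off();
	return flags;
}

static void model_irq_restore(bool flags)
{
	if (flags) {
		model_trace_irqs_on();
		real_irqs_on = true;
	} else {
		real_irqs_on = false;
		model_trace_irqs_off();
	}
}

static void model_irq_disable(void)
{
	real_irqs_on = false;
	model_trace_irqs_off();
}

static void model_irq_enable(void)
{
	model_trace_irqs_on();	/* hook runs while interrupts are still off */
	real_irqs_on = true;
}

/* Stands in for the detector: it brackets its work with save/restore. */
static void model_instrumented_access(void)
{
	if (in_detector)
		return;
	in_detector = true;
	bool flags = model_irq_save();
	assert(!real_irqs_on);	/* interrupts are off inside the window */
	model_irq_restore(flags);
	in_detector = false;
}
```

In this model, model_irq_disable() followed by model_irq_enable() ends
with real_irqs_on true but shadow_irqs_on false: the detector, entered
from inside the "on" hook, saves a still-disabled state and its restore
path marks interrupts as off again just before they are really enabled.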

Thanks,
-- Marco
