From: Marco Elver <elver@google.com>
To: Qian Cai <cai@lca.pw>
Cc: LKMM Maintainers -- Akira Yokosawa <akiyks@gmail.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Alexander Potapenko <glider@google.com>,
	Andrea Parri <parri.andrea@gmail.com>,
	Andrey Konovalov <andreyknvl@google.com>,
	Andy Lutomirski <luto@kernel.org>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>,
	Boqun Feng <boqun.feng@gmail.com>,
	Borislav Petkov <bp@alien8.de>,
	Daniel Axtens <dja@axtens.net>,
	Daniel Lustig <dlustig@nvidia.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Howells <dhowells@redhat.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Ingo Molnar <mingo@redhat.com>,
	Jade Alglave <j.alglave@ucl.ac.uk>,
	Joel Fernandes <joel@joelfernandes.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Luc Maranget <luc.maranget@inria.fr>,
	Mark Rutland <Mark.Rutland@arm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	linux-arch <linux-arch@vger.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@vger.kernel.org>,
	linux-efi@vger.kernel.org,
	Linux Kbuild mailing list <linux-kbuild@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	"the arch/x86 maintainers" <x86@kernel.org>
Subject: Re: [PATCH v4 01/10] kcsan: Add Kernel Concurrency Sanitizer infrastructure
Date: Tue, 14 Jan 2020 13:51:01 +0100
Message-ID: <CANpmjNMXD3Qzj748CXWtmenxx4cC3Q8Fr70L5PWNe6ZSARcZ9w@mail.gmail.com>
In-Reply-To: <53F6B915-AC53-41BB-BF32-33732515B3A0@lca.pw>

On Tue, 14 Jan 2020 at 12:08, Qian Cai <cai@lca.pw> wrote:
>
> > On Jan 6, 2020, at 7:47 AM, Marco Elver <elver@google.com> wrote:
> >
> > Thanks, I'll look into KCSAN + lockdep compatibility. It's probably
> > missing some KCSAN_SANITIZE := n in some Makefile.
>
> Can I have a update on fixing this? It looks like more of a problem
> that kcsan_setup_watchpoint() will disable IRQs and then dive into the
> page allocator where it would complain because it might sleep.

KCSAN does *not* keep IRQs disabled (we have a clear irqsave / restore
pair in kcsan_setup_watchpoint). If you look closer at the warning you
sent in this thread, the warning is not generated because IRQs are off
when it wants to sleep, but rather because IRQs are enabled but IRQ
tracing state is inconsistent:
"DEBUG_LOCKS_WARN_ON(!current->hardirqs_enabled)" in lockdep checks
that if IRQs are enabled, the trace state matches. These are only
checked with LOCKDEP_DEBUG and TRACE_IRQFLAGS. In other words, IRQ
trace flags got corrupted somewhere.

AFAIK, this problem here is only relevant with TRACE_IRQFLAGS -- again,
it is clear that IRQs are enabled but the IRQ tracing logic somehow
ended up corrupting hardirqs_enabled (TRACE_IRQFLAGS).

I believe this patch will take care of this issue:
http://lkml.kernel.org/r/20200114124919.11891-1-elver@google.com

Thanks,
-- Marco

> BTW, I saw Paul sent a pull request for 5.6 but it is ugly to have
> everybody could trigger a deadlock (sleep function called in atomic
> context) like this during boot once this hits the mainline not to
> mention about only recently it is possible to test this feature
> (thanks to warning ratelimit) with the existing debugging options
> because it was unable to boot due to the brokenness with
> debug_pagealloc as mentioned in this thread, so this does sounds like
> it needs more soak time for the mainline to me.
>
> 0000000000000400
> [ 13.416814][ T1] Call Trace:
> [ 13.416814][ T1]  lock_is_held_type+0x66/0x160
> [ 13.416814][ T1]  ___might_sleep+0xc1/0x1d0
> [ 13.416814][ T1]  __might_sleep+0x5b/0xa0
> [ 13.416814][ T1]  slab_pre_alloc_hook+0x7b/0xa0
> [ 13.416814][ T1]  __kmalloc_node+0x60/0x300
> [ 13.416814][ T1]  ? alloc_cpumask_var_node+0x44/0x70
> [ 13.416814][ T1]  ? topology_phys_to_logical_die+0x7e/0x180
> [ 13.416814][ T1]  alloc_cpumask_var_node+0x44/0x70
> [ 13.416814][ T1]  zalloc_cpumask_var+0x2a/0x40
> [ 13.416814][ T1]  native_smp_prepare_cpus+0x246/0x425
> [ 13.416814][ T1]  kernel_init_freeable+0x1b8/0x496
> [ 13.416814][ T1]  ? rest_init+0x381/0x381
> [ 13.416814][ T1]  kernel_init+0x18/0x17f
> [ 13.416814][ T1]  ? rest_init+0x381/0x381
> [ 13.416814][ T1]  ret_from_fork+0x3a/0x50
> [ 13.416814][ T1] irq event stamp: 910
> [ 13.416814][ T1] hardirqs last enabled at (909): [<ffffffff8d1240f3>] _raw_write_unlock_irqrestore+0x53/0x57
> [ 13.416814][ T1] hardirqs last disabled at (910): [<ffffffff8c8bba76>] kcsan_setup_watchpoint+0x96/0x460
> [ 13.416814][ T1] softirqs last enabled at (0): [<ffffffff8c6b697a>] copy_process+0x11fa/0x34f0
> [ 13.416814][ T1] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 13.416814][ T1] ---[ end trace 7d1df66da055aa92 ]---
> [ 13.416814][ T1] possible reason: unannotated irqs-on.
> [ 13.416814][ T1] irq event stamp: 910
> [ 13.416814][ T1] hardirqs last enabled at (909): [<ffffffff8d1240f3>] _raw_write_unlock_irqrestore+0x53/0x57
> [ 13.416814][ T1] hardirqs last disabled at (910): [<ffffffff8c8bba76>] kcsan_setup_watchpoint+0x96/0x460
> [ 13.416814][ T1] softirqs last enabled at (0): [<ffffffff8c6b697a>] copy_process+0x11fa/0x34f0
> [ 13.416814][ T1] softirqs last disabled at (0): [<0000000000000000>] 0x0
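
For readers following the distinction drawn in the reply above (IRQs actually enabled versus the IRQ tracing state claiming otherwise), the consistency check can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every name in it (model_cpu, model_task, model_check_flags, and the model_* helpers) is hypothetical, and the scenario of a traced disable followed by an untraced enable is just one way the two states could diverge, not a statement about what the linked patch changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */

/* What the hardware interrupt flag actually says. */
struct model_cpu {
	bool irqs_on;
};

/* What the IRQ tracing bookkeeping believes (stands in for
 * current->hardirqs_enabled from the message above). */
struct model_task {
	bool hardirqs_enabled;
};

enum flags_verdict {
	FLAGS_CONSISTENT,
	FLAGS_UNANNOTATED_IRQS_ON,  /* IRQs on, tracing says off */
	FLAGS_UNANNOTATED_IRQS_OFF, /* IRQs off, tracing says on */
};

/* Models the consistency check: the real flag and the traced
 * state must agree, otherwise a warning would be reported. */
enum flags_verdict model_check_flags(const struct model_cpu *cpu,
				     const struct model_task *task)
{
	assert(cpu != NULL && task != NULL);

	if (cpu->irqs_on && !task->hardirqs_enabled)
		return FLAGS_UNANNOTATED_IRQS_ON;
	if (!cpu->irqs_on && task->hardirqs_enabled)
		return FLAGS_UNANNOTATED_IRQS_OFF;
	return FLAGS_CONSISTENT;
}

/* Traced disable: updates both the flag and the bookkeeping. */
void model_traced_irq_disable(struct model_cpu *cpu, struct model_task *task)
{
	cpu->irqs_on = false;
	task->hardirqs_enabled = false;
}

/* Traced enable: updates both the flag and the bookkeeping. */
void model_traced_irq_enable(struct model_cpu *cpu, struct model_task *task)
{
	cpu->irqs_on = true;
	task->hardirqs_enabled = true;
}

/* Untraced enable: flips only the real flag, so the bookkeeping
 * goes stale. This is the assumed source of divergence here. */
void model_raw_irq_enable(struct model_cpu *cpu)
{
	cpu->irqs_on = true;
}
```

In this model, a traced disable followed by an untraced enable leaves IRQs on while the bookkeeping still says off, which is the "unannotated irqs-on" shape of the warning quoted above: nothing is sleeping with IRQs disabled, only the tracked state is wrong.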