linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Byungchul Park <byungchul.park@lge.com>
Cc: torvalds@linux-foundation.org, peterz@infradead.org,
	mingo@redhat.com, will@kernel.org, linux-kernel@vger.kernel.org,
	tglx@linutronix.de, rostedt@goodmis.org, joel@joelfernandes.org,
	alexander.levin@microsoft.com, daniel.vetter@ffwll.ch,
	chris@chris-wilson.co.uk, duyuyang@gmail.com,
	johannes.berg@intel.com, tj@kernel.org, tytso@mit.edu,
	willy@infradead.org, david@fromorbit.com, amir73il@gmail.com,
	bfields@fieldses.org, gregkh@linuxfoundation.org,
	kernel-team@lge.com
Subject: Re: [RFC] Are you good with Lockdep?
Date: Wed, 11 Nov 2020 11:54:41 +0100	[thread overview]
Message-ID: <20201111105441.GA78848@gmail.com> (raw)
In-Reply-To: <20201111050559.GA24438@X58A-UD3R>


* Byungchul Park <byungchul.park@lge.com> wrote:

> PROBLEM 1) First of all, Lockdep gets disabled on the first detection.

Lockdep disabling itself after the first warning was an intentional 
and deliberate design decision. (See more details below.)

>    What if there are more than two problems?

So the usual way this happens is that the first reported bug gets 
discovered & fixed, then the second gets discovered & fixed.

> We cannot get reported other than the first one.

Correct. Experience has shown that the overwhelming majority of 
lockdep reports are single-cause and single-report.

This is an optimal approach, because after a decade of exorcising 
locking bugs from the kernel, lockdep is currently, most of the time, 
in 'steady-state', with there being no reports for the overwhelming 
majority of testcases, so the statistical probability of there being 
just one new report is by far the highest.

If on the other hand there's some bug in lockdep itself that causes 
excessive false positives, it's better to limit the number of reports 
to one per bootup, so that it's not seen as a nuisance debugging 
facility.

Or if lockdep gets extended that causes multiple previously unreported 
(but very much real) bugs to be reported, it's *still* better to 
handle them one by one: because lockdep doesn't know whether it's real 
or a nuisance, and also because of the "race to log" reasoning below.

>    So the one who has introduced the first one should fix it as soon 
>    as possible so that the other problems can be reported and fixed. 
>    It will get even worse if it's a false positive because it's 
>    worth nothing but only preventing reporting real ones.

Since kernel development is highly distributed, and 90%+ of new 
commits get created in dozens of bigger and hundreds of smaller 
maintainer topic trees, the chance of getting two independent locking 
bugs in the same tree without the first bug being found & fixed is 
actually pretty low.

linux-next offers several weeks/months advance integration testing to 
see whether the combination of maintainer trees causes 
problems/warnings.

And if multiple locking bugs on top of each other happen regularly in 
a particular maintainer tree, it's probably not lockdep's fault. ;-)

>    That's why kernel developers are so sensitive to Lockdep's false
>    positive reporting - I would, too. But precisely speaking, it's a
>    problem of how Lockdep was designed and implemented, not false
>    positive itself. Annoying false positives - as WARN()'s messages are
>    annoying - should be fixed but we don't have to be as sensitive as we
>    are now if the tool keeps normally working even after reporting.

I disagree, and even for WARN()s we are seeing a steady movement 
towards WARN_ON_ONCE(): exactly because developers are usually 
interested in the first warning primarily.

Followup warnings are even marked 'tainted' by the kernel - if a bug 
happened we cannot trust the state of the kernel anymore, even if it 
seems otherwise functional. This is doubly true for lockdep, where 
locking state is complex because the kernel with its thousands of lock 
types and millions of lock instances is fundamentally & inescapably 
complex.

The 'first warning' is by far the most valuable one - and this is what 
lockdep's "turn off after the first warning" policy implements.

But for lockdep there's another concern: we do occasionally report 
bugs in locking facilities themselves. In that case it's imperative 
for all lockdep activity to cease & desist, so that we are able to get 
a log entry out before the kernel goes down potentially.

I.e. there's a "race to log the bug as quickly as possible", which is 
the other reason we shut down lockdep immediately. But once shut down, 
all the lockdep data structures are hopelessly out of sync and it 
cannot be restarted reasonably.

I.e. these are two independent reasons to shut down lockdep after the 
first problem found.

>    But it's very hard to achieve it with Lockdep because of the complex
>    design. Maybe re-designing and re-implementing almost whole code
>    would be required.

Making the code simpler is always welcome, but I disagree with 
enabling multiple warnings, for the technical reasons outlined above.

> PROBLEM 2) Lockdep forces us to emulate lock acquisition for non-lock.

>    I have the patch set. Let me share it with you in a few days.

Not sure I understand the "problem 2)" outlined here, but I'm looking 
forward to your patchset!

Thanks,

	Ingo

  reply	other threads:[~2020-11-11 10:54 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-11  5:05 [RFC] Are you good with Lockdep? Byungchul Park
2020-11-11 10:54 ` Ingo Molnar [this message]
2020-11-11 14:36   ` Steven Rostedt
2020-11-11 23:16     ` Thomas Gleixner
2020-11-12  8:10       ` Byungchul Park
2020-11-12 14:26         ` Steven Rostedt
2020-11-12 14:52           ` Matthew Wilcox
2020-11-16  8:57             ` Byungchul Park
2020-11-16 15:37               ` Matthew Wilcox
2020-11-18  1:45                 ` Boqun Feng
2020-11-18  3:30                   ` Matthew Wilcox
2020-11-23 13:15                 ` Byungchul Park
2020-11-12 14:58           ` Byungchul Park
2020-11-16  9:05             ` Byungchul Park
2020-11-23 10:45               ` Byungchul Park
2020-11-12 10:32     ` Byungchul Park
2020-11-12 13:56       ` Daniel Vetter
2020-11-16  8:45         ` Byungchul Park
2020-11-12  6:15   ` Byungchul Park
2020-11-12  8:51     ` Byungchul Park
2020-11-12  9:46       ` Byungchul Park
2020-11-23 11:05 ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
2020-11-23 11:36     ` [RFC 2/6] dept: Apply Dept to spinlock Byungchul Park
2020-11-23 11:36     ` [RFC 3/6] dept: Apply Dept to mutex families Byungchul Park
2020-11-23 11:36     ` [RFC 4/6] dept: Apply Dept to rwlock Byungchul Park
2020-11-23 11:36     ` [RFC 5/6] dept: Apply Dept to wait_for_completion()/complete() Byungchul Park
2020-11-23 11:36     ` [RFC 6/6] dept: Assign custom dept_keys or disable to avoid false positives Byungchul Park
2020-11-23 12:29   ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
2020-11-23 11:13 ` [RFC] Dept(Dependency Tracker) Report Example Byungchul Park
2020-11-23 12:14   ` Byungchul Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201111105441.GA78848@gmail.com \
    --to=mingo@kernel.org \
    --cc=alexander.levin@microsoft.com \
    --cc=amir73il@gmail.com \
    --cc=bfields@fieldses.org \
    --cc=byungchul.park@lge.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=david@fromorbit.com \
    --cc=duyuyang@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=joel@joelfernandes.org \
    --cc=johannes.berg@intel.com \
    --cc=kernel-team@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).