linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] Are you good with Lockdep?
@ 2020-11-11  5:05 Byungchul Park
  2020-11-11 10:54 ` Ingo Molnar
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-11  5:05 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

Hello folks,

We have no choise but to use Lockdep to track dependencies for deadlock
detection with the current kernel. I'm wondering if they are satifsied
in that tool. Lockdep has too big problems to continue to use.

---

PROBLEM 1) First of all, Lockdep gets disabled on the first detection.

   What if there are more than two problems? We cannot get reported
   other than the first one. So the one who has introduced the first one
   should fix it as soon as possible so that the other problems can be
   reported and fixed. It will get even worse if it's a false positive
   because it's worth nothing but only preventing reporting real ones.

   That's why kernel developers are so sensitive to Lockdep's false
   positive reporting - I would, too. But precisely speaking, it's a
   problem of how Lockdep was designed and implemented, not false
   positive itself. Annoying false positives - as WARN()'s messages are
   annoying - should be fixed but we don't have to be as sensitive as we
   are now if the tool keeps normally working even after reporting.

   But it's very hard to achieve it with Lockdep because of the complex
   design. Maybe re-designing and re-implementing almost whole code
   would be required.

PROBLEM 2) Lockdep forces us to emulate lock acquisition for non-lock.

   We add manual annotations for non-lock code in the following way:

   At the interest wait,

      ...
      lockdep_acquire(X);
      lockdep_release(X);
      wait_for_something(X);
      ...

   At begin and end of the region where we expect there's the something,

      ...
      lockdep_acquire(X);
      (or lockdep_acquire_read(); to allow recursive annotations.)
      function_doing_the_something(X);
      lockdep_release(X);
      ...

   This way we try to detect deadlocks by waits for now. But don't you
   think it looks ugly? Are you good if it manages to work by some
   means? That even doesn't work correctly. Instead it should look like:

   At the interest wait,

      ...
      xxx_wait(X);
      wait_for_something(X);
      ...

   At the something,

      ...
      xxx_event(X);
      do_the_something(X);
      ...

   Or at begin and end of the region for hint,

      ...
      xxx_event_context_enter(X);
      function_doing_the_something(X);
      xxx_event_context_exit(X);
      ...

   Lockdep had been a not bad tool for detecting deadlock by problematic
   acquisition order. But it's worth noting that deadlock is caused by
   *waits* and their *events* that never reach. Deadlock detection tool
   should focus on waits and events instead of lock acquisition order.

   Just FYI, it should look like for locks:

   At the interest lock acquisition,

      ...
      xxx_wait(X);
      xxx_event_context_enter(X);
      lock(X);
      ...

   At the lock acquisition using trylock type,

      ...
      xxx_event_context_enter(X);
      lock(X);
      ...

   At the lock release,

      ...
      xxx_event(X);
      xxx_event_context_exit(X);
      unlock(X);
      ...

---

These two are big-why we should not keep using Lockdep as a deadlock
detection tool. Other small things can be fixed by modifying Lockdep but
these two are not.

Fine. What could we do for it? Options that I've considered are:

---

OPTION 1) Revert reverting cross-release locking checks (e966eaeeb62
locking/lockdep: Remove the cross-release locking checks) or implement
another Lockdep extention like cross-release.

   The reason cross-release was reverted was a few false positives -
   someone was lying like there were too many false positives though -
   leading people to disable Lockdep. I admit it had to be done that way.
   Folks still don't like Lockdep's false positive that stops the tool.

OPTION 2) Newally design and implement another tool for deadlock
detection based on wait-event model. And replace Lockdep right away.

   Lockdep definitely includes all the efforts great developers have
   made for a long time as as to be quite stable enough. But the new one
   is not. It's not good idea to replace Lockdep right away.

OPTION 3) Newally design and implement another tool for deadlock
detection based on wait-event model. And keep both Lockdep and the new
tool until the new one gets considered stable.

   For people who need stronger capacity for deadlock detection, the new
   tool needs to be introduced but de-coupled with Lockdep so as to be
   getting matured independently. I think this option is the best.

   I have the patch set. Let me share it with you in a few days.

---

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-11  5:05 [RFC] Are you good with Lockdep? Byungchul Park
@ 2020-11-11 10:54 ` Ingo Molnar
  2020-11-11 14:36   ` Steven Rostedt
  2020-11-12  6:15   ` Byungchul Park
  2020-11-23 11:05 ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
  2020-11-23 11:13 ` [RFC] Dept(Dependency Tracker) Report Example Byungchul Park
  2 siblings, 2 replies; 31+ messages in thread
From: Ingo Molnar @ 2020-11-11 10:54 UTC (permalink / raw)
  To: Byungchul Park
  Cc: torvalds, peterz, mingo, will, linux-kernel, tglx, rostedt, joel,
	alexander.levin, daniel.vetter, chris, duyuyang, johannes.berg,
	tj, tytso, willy, david, amir73il, bfields, gregkh, kernel-team


* Byungchul Park <byungchul.park@lge.com> wrote:

> PROBLEM 1) First of all, Lockdep gets disabled on the first detection.

Lockdep disabling itself after the first warning was an intentional 
and deliberate design decision. (See more details below.)

>    What if there are more than two problems?

So the usual way this happens is that the first reported bug gets 
discovered & fixed, then the second gets discovered & fixed.

> We cannot get reported other than the first one.

Correct. Experience has shown that the overwhelming majority of 
lockdep reports are single-cause and single-report.

This is an optimal approach, because after a decade of exorcising 
locking bugs from the kernel, lockdep is currently, most of the time, 
in 'steady-state', with there being no reports for the overwhelming 
majority of testcases, so the statistical probability of there being 
just one new report is by far the highest.

If on the other hand there's some bug in lockdep itself that causes 
excessive false positives, it's better to limit the number of reports 
to one per bootup, so that it's not seen as a nuisance debugging 
facility.

Or if lockdep gets extended that causes multiple previously unreported 
(but very much real) bugs to be reported, it's *still* better to 
handle them one by one: because lockdep doesn't know whether it's real 
or a nuisance, and also because of the "race to log" reasoning below.

>    So the one who has introduced the first one should fix it as soon 
>    as possible so that the other problems can be reported and fixed. 
>    It will get even worse if it's a false positive because it's 
>    worth nothing but only preventing reporting real ones.

Since kernel development is highly distributed, and 90%+ of new 
commits get created in dozens of bigger and hundreds of smaller 
maintainer topic trees, the chance of getting two independent locking 
bugs in the same tree without the first bug being found & fixed is 
actually pretty low.

linux-next offers several weeks/months advance integration testing to 
see whether the combination of maintainer trees causes 
problems/warnings.

And if multiple locking bugs on top of each other happen regularly in 
a particular maintainer tree, it's probably not lockdep's fault. ;-)

>    That's why kernel developers are so sensitive to Lockdep's false
>    positive reporting - I would, too. But precisely speaking, it's a
>    problem of how Lockdep was designed and implemented, not false
>    positive itself. Annoying false positives - as WARN()'s messages are
>    annoying - should be fixed but we don't have to be as sensitive as we
>    are now if the tool keeps normally working even after reporting.

I disagree, and even for WARN()s we are seeing a steady movement 
towards WARN_ON_ONCE(): exactly because developers are usually 
interested in the first warning primarily.

Followup warnings are even marked 'tainted' by the kernel - if a bug 
happened we cannot trust the state of the kernel anymore, even if it 
seems otherwise functional. This is doubly true for lockdep, where 
locking state is complex because the kernel with its thousands of lock 
types and millions of lock instances is fundamentally & inescapably 
complex.

The 'first warning' is by far the most valuable one - and this is what 
lockdep's "turn off after the first warning" policy implements.

But for lockdep there's another concern: we do occasionally report 
bugs in locking facilities themselves. In that case it's imperative 
for all lockdep activity to cease & desist, so that we are able to get 
a log entry out before the kernel goes down potentially.

I.e. there's a "race to log the bug as quickly as possible", which is 
the other reason we shut down lockdep immediately. But once shut down, 
all the lockdep data structures are hopelessly out of sync and it 
cannot be restarted reasonably.

I.e. these are two independent reasons to shut down lockdep after the 
first problem found.

>    But it's very hard to achieve it with Lockdep because of the complex
>    design. Maybe re-designing and re-implementing almost whole code
>    would be required.

Making the code simpler is always welcome, but I disagree with 
enabling multiple warnings, for the technical reasons outlined above.

> PROBLEM 2) Lockdep forces us to emulate lock acquisition for non-lock.

>    I have the patch set. Let me share it with you in a few days.

Not sure I understand the "problem 2)" outlined here, but I'm looking 
forward to your patchset!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-11 10:54 ` Ingo Molnar
@ 2020-11-11 14:36   ` Steven Rostedt
  2020-11-11 23:16     ` Thomas Gleixner
  2020-11-12 10:32     ` Byungchul Park
  2020-11-12  6:15   ` Byungchul Park
  1 sibling, 2 replies; 31+ messages in thread
From: Steven Rostedt @ 2020-11-11 14:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Byungchul Park, torvalds, peterz, mingo, will, linux-kernel,
	tglx, joel, alexander.levin, daniel.vetter, chris, duyuyang,
	johannes.berg, tj, tytso, willy, david, amir73il, bfields,
	gregkh, kernel-team

On Wed, 11 Nov 2020 11:54:41 +0100
Ingo Molnar <mingo@kernel.org> wrote:

> * Byungchul Park <byungchul.park@lge.com> wrote:
> 
> > PROBLEM 1) First of all, Lockdep gets disabled on the first detection.  
> 
> Lockdep disabling itself after the first warning was an intentional 
> and deliberate design decision. (See more details below.)
> 

[..]

> Making the code simpler is always welcome, but I disagree with 
> enabling multiple warnings, for the technical reasons outlined above.

I 100% agree with Ingo. I wish we could stop *all* warnings after the first
one. The number of times people sent me bug reports with warnings without
showing me previous warnings that caused me to go on wild goose chases is
too many to count. The first warning is the *only* thing I look at.
Anything after that is likely to be caused by the integrity of the system
being compromised by the first bug.

And this is especially true with lockdep, because lockdep only detects the
deadlock, it doesn't tell you which lock was the incorrect locking.

For example. If we have a locking chain of:

 A -> B -> D

 A -> C -> D

Which on a correct system looks like this:

 lock(A)
 lock(B)
 unlock(B)
 unlock(A)

 lock(B)
 lock(D)
 unlock(D)
 unlock(B)

 lock(A)
 lock(C)
 unlock(C)
 unlock(A)

 lock(C)
 lock(D)
 unlock(D)
 unlock(C)

which creates the above chains in that order.

But, lets say we have a bug and the system boots up doing:

 lock(D)
 lock(A)
 unlock(A)
 unlock(D)

which creates the incorrect chain.

 D -> A


Now you do the correct locking:

 lock(A)
 lock(B)

Creates A -> B

 lock(A)
 lock(C)

Creates A -> C

 lock(B)
 lock(D)

Creates B -> D and lockdep detects:

 D -> A -> B -> D

and gives us the lockdep splat!!!

But we don't disable lockdep. We let it continue...

 lock(C)
 lock(D)

Which creates C -> D

Now it explodes with D -> A -> C -> D

Which it already reported. And it can be much more complex when dealing
with interrupt contexts and longer chains. That is, perhaps a different
chain had a missing irq disable, now you might get 5 or 6 more lockdep
splats because of that one bug.

The point I'm making is that the lockdep splats after the first one may
just be another version of the same bug and not a new one. Worse, if you
only look at the later lockdep splats, it may be much more difficult to
find the original bug than if you just had the first one. Believe me, I've
been down that road too many times!

And it can be very difficult to know if new lockdep splats are not the same
bug, and this will waste a lot of developers time!

This is why the decision to disable lockdep after the first splat was made.
There were times I wanted to check locking somewhere, but is was using
linux-next which had a lockdep splat that I didn't care about. So I
made it not disable lockdep. And then I hit this exact scenario, that the
one incorrect chain was causing reports all over the place. To solve it, I
had to patch the incorrect chain to do raw locking to have lockdep ignore
it ;-) Then I was able to test the code I was interested in.

> 
> > PROBLEM 2) Lockdep forces us to emulate lock acquisition for non-lock.  
> 
> >    I have the patch set. Let me share it with you in a few days.  
> 
> Not sure I understand the "problem 2)" outlined here, but I'm looking 
> forward to your patchset!
> 

I think I understand it. For things like completions and other "wait for
events" we have lockdep annotation, but it is rather awkward to implement.
Having something that says "lockdep_wait_event()" and
"lockdep_exec_event()" wrappers would be useful.

-- Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-11 14:36   ` Steven Rostedt
@ 2020-11-11 23:16     ` Thomas Gleixner
  2020-11-12  8:10       ` Byungchul Park
  2020-11-12 10:32     ` Byungchul Park
  1 sibling, 1 reply; 31+ messages in thread
From: Thomas Gleixner @ 2020-11-11 23:16 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar
  Cc: Byungchul Park, torvalds, peterz, mingo, will, linux-kernel,
	joel, alexander.levin, daniel.vetter, chris, duyuyang,
	johannes.berg, tj, tytso, willy, david, amir73il, bfields,
	gregkh, kernel-team

On Wed, Nov 11 2020 at 09:36, Steven Rostedt wrote:
> Ingo Molnar <mingo@kernel.org> wrote:
>> Not sure I understand the "problem 2)" outlined here, but I'm looking 
>> forward to your patchset!
>> 
> I think I understand it. For things like completions and other "wait for
> events" we have lockdep annotation, but it is rather awkward to implement.
> Having something that says "lockdep_wait_event()" and
> "lockdep_exec_event()" wrappers would be useful.

Wrappers which make things simpler are always useful, but the lack of
wrappers does not justify a wholesale replacement.

We all know that lockdep has limitations but I yet have to see a proper
argument why this can't be solved incrementaly on top of the existing
infrastructure.

That said, I'm not at all interested in a wholesale replacement of
lockdep which will take exactly the same amount of time to stabilize and
weed out the shortcomings again.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-11 10:54 ` Ingo Molnar
  2020-11-11 14:36   ` Steven Rostedt
@ 2020-11-12  6:15   ` Byungchul Park
  2020-11-12  8:51     ` Byungchul Park
  1 sibling, 1 reply; 31+ messages in thread
From: Byungchul Park @ 2020-11-12  6:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: torvalds, peterz, mingo, will, linux-kernel, tglx, rostedt, joel,
	alexander.levin, daniel.vetter, chris, duyuyang, johannes.berg,
	tj, tytso, willy, david, amir73il, bfields, gregkh, kernel-team

On Wed, Nov 11, 2020 at 11:54:41AM +0100, Ingo Molnar wrote:
> > We cannot get reported other than the first one.
> 
> Correct. Experience has shown that the overwhelming majority of 
> lockdep reports are single-cause and single-report.
> 
> This is an optimal approach, because after a decade of exorcising 
> locking bugs from the kernel, lockdep is currently, most of the time, 

I also think Lockdep has been doing great job exorcising almost all
locking bugs so far. Respect it.

> in 'steady-state', with there being no reports for the overwhelming 
> majority of testcases, so the statistical probability of there being 
> just one new report is by far the highest.

This is true if Lockdep is only for checking if maintainers' tree are
ok and if we totally ignore how a tool could help folks in the middle of
development esp. when developing something complicated wrt.
synchronization.

But I don't agree if a tool could help while developing something that
could introduce many dependency issues.

> If on the other hand there's some bug in lockdep itself that causes 
> excessive false positives, it's better to limit the number of reports 
> to one per bootup, so that it's not seen as a nuisance debugging 
> facility.
> 
> Or if lockdep gets extended that causes multiple previously unreported 
> (but very much real) bugs to be reported, it's *still* better to 
> handle them one by one: because lockdep doesn't know whether it's real 

Why do you think we cannot handle them one by one with multi-reporting?
We can handle them with the first one as we do with single-reporting.
And also that's how we work, for example, when building the kernel or
somethinig.

> >    So the one who has introduced the first one should fix it as soon 
> >    as possible so that the other problems can be reported and fixed. 
> >    It will get even worse if it's a false positive because it's 
> >    worth nothing but only preventing reporting real ones.
> 
> Since kernel development is highly distributed, and 90%+ of new 
> commits get created in dozens of bigger and hundreds of smaller 
> maintainer topic trees, the chance of getting two independent locking 
> bugs in the same tree without the first bug being found & fixed is 
> actually pretty low.

Again, this is true if Lockdep is for checking maintainers' tree only.

> linux-next offers several weeks/months advance integration testing to 
> see whether the combination of maintainer trees causes 
> problems/warnings.

Good for us.

> >    That's why kernel developers are so sensitive to Lockdep's false
> >    positive reporting - I would, too. But precisely speaking, it's a
> >    problem of how Lockdep was designed and implemented, not false
> >    positive itself. Annoying false positives - as WARN()'s messages are
> >    annoying - should be fixed but we don't have to be as sensitive as we
> >    are now if the tool keeps normally working even after reporting.
> 
> I disagree, and even for WARN()s we are seeing a steady movement 
> towards WARN_ON_ONCE(): exactly because developers are usually 
> interested in the first warning primarily.
> 
> Followup warnings are even marked 'tainted' by the kernel - if a bug 
> happened we cannot trust the state of the kernel anymore, even if it 
> seems otherwise functional. This is doubly true for lockdep, where 

I definitely think so. Already tainted kernel is not the kernel we can
trust anymore. Again, IMO, a tool should help us not only for checking
almost final trees but also in developing something. No?

> But for lockdep there's another concern: we do occasionally report 
> bugs in locking facilities themselves. In that case it's imperative 
> for all lockdep activity to cease & desist, so that we are able to get 
> a log entry out before the kernel goes down potentially.

Sure. Makes sense.

> I.e. there's a "race to log the bug as quickly as possible", which is 
> the other reason we shut down lockdep immediately. But once shut down, 

Not sure I understand this part.

> all the lockdep data structures are hopelessly out of sync and it 
> cannot be restarted reasonably.

Is it about tracking IRQ and IRQ-enabled state? That's exactly what I'd
like to point out. Or is there something else?

> Not sure I understand the "problem 2)" outlined here, but I'm looking 
> forward to your patchset!

Thank you for the response.

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-11 23:16     ` Thomas Gleixner
@ 2020-11-12  8:10       ` Byungchul Park
  2020-11-12 14:26         ` Steven Rostedt
  0 siblings, 1 reply; 31+ messages in thread
From: Byungchul Park @ 2020-11-12  8:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Steven Rostedt, Ingo Molnar, torvalds, peterz, mingo, will,
	linux-kernel, joel, alexander.levin, daniel.vetter, chris,
	duyuyang, johannes.berg, tj, tytso, willy, david, amir73il,
	bfields, gregkh, kernel-team

On Thu, Nov 12, 2020 at 12:16:50AM +0100, Thomas Gleixner wrote:
> Wrappers which make things simpler are always useful, but the lack of
> wrappers does not justify a wholesale replacement.

Totally right. Lack of wrappers doesn't matter at all. That could be
achieved easily by modifying the original e.i. Lockdep. That's why I
didn't mention wrapper things in the description. (Sorry if I misled
you so it looked like I mentioned just wrappers. I should've explained
it in more detail.)

xxx_wait(), xxx_event() and xxx_event_context_start() are much more than
wrappers. It was about what deadlock detection tool should work based on.

> We all know that lockdep has limitations but I yet have to see a proper
> argument why this can't be solved incrementaly on top of the existing
> infrastructure.

This is exactly what I'd like to address. As you can see in the first
mail, the reasons why this can't be solved incrementaly are:

1. Lockdep's design and implementation are too complicated to be
   generalized so as to allow multi-reporting. Quite big change onto
   Lockdep would be required.

   I think allowing multi-reporting is very important for tools like
   Lockdep. As long as false positive in the single-reporting manner
   bothers folks, tools like Lockdep cannot be enhanced so as to have
   stronger capability.

2. Does Lockdep do what a deadlock detection tool should do? From
   internal engine to APIs, all the internal data structure and
   algotithm of Lockdep is only looking at lock(?) acquisition order.
   Fundamentally Lockdep cannot work correctly with all general cases,
   for example, read/write/trylock and any wait/event.

   This can be done by re-introducing cross-release but still partially.
   A deadlock detector tool should thoroughly focus on *waits* and
   *events* to be more perfect at detecting deadlock because the fact is
   *waits* and their *events* that never reach cause deadlock.

   With the philosophy of Lockdep, we can only handle partial cases
   fundamently. We have no choice but to do various work-around or adopt
   tricky ways to cover more cases if we keep using Lockdep.

> That said, I'm not at all interested in a wholesale replacement of
> lockdep which will take exactly the same amount of time to stabilize and
> weed out the shortcomings again.

I don't want to bother ones who don't want to be bothered from the tool.
But I think some day we need a new tool doing exactly what it should do
for deadlock detection for sure.

I'm willing to make it matured on my own, or with ones who need a
stronger tool or willing to make it matured together - I wish tho.
That's why I suggest to make both there until the new tool gets
considered stable.

FYI, roughly Lockdep is doing:

   1. Dependency check
   2. Lock usage correctness check (including RCU)
   3. IRQ related usage correctness check with IRQFLAGS

2 and 3 should be there forever which is subtle and have gotten matured.
But 1 is not. I've been talking about 1. But again, it's not about
replacing it right away but having both for a while. I'm gonna try my
best to make it better.

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12  6:15   ` Byungchul Park
@ 2020-11-12  8:51     ` Byungchul Park
  2020-11-12  9:46       ` Byungchul Park
  0 siblings, 1 reply; 31+ messages in thread
From: Byungchul Park @ 2020-11-12  8:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: torvalds, peterz, mingo, will, linux-kernel, tglx, rostedt, joel,
	alexander.levin, daniel.vetter, chris, duyuyang, johannes.berg,
	tj, tytso, willy, david, amir73il, bfields, gregkh, kernel-team

On Thu, Nov 12, 2020 at 03:15:32PM +0900, Byungchul Park wrote:
> > If on the other hand there's some bug in lockdep itself that causes 
> > excessive false positives, it's better to limit the number of reports 
> > to one per bootup, so that it's not seen as a nuisance debugging 
> > facility.
> > 
> > Or if lockdep gets extended that causes multiple previously unreported 
> > (but very much real) bugs to be reported, it's *still* better to 
> > handle them one by one: because lockdep doesn't know whether it's real 
> 
> Why do you think we cannot handle them one by one with multi-reporting?
> We can handle them with the first one as we do with single-reporting.
> And also that's how we work, for example, when building the kernel or
> somethinig.

Let me add a little bit more. I just said the fact that we are able to
handle the bugs one by one as if we do with single-reporting.

But the thing is multi-reporting could be more useful in some cases.
More precisely speaking, bugs not caused by IRQ state will be reported
without annoying nuisance. I bet you have experienced a ton of nuisances
when multi-reporting Lockdep detected a deadlock by IRQ state.

For some cases, multi-reporting is as useful as single-reporting, while
for the other cases, multi-reporting is more useful. Then I think we
have to go with mutil-reporting if there's no technical issue.

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12  8:51     ` Byungchul Park
@ 2020-11-12  9:46       ` Byungchul Park
  0 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-12  9:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: torvalds, peterz, mingo, will, linux-kernel, tglx, rostedt, joel,
	alexander.levin, daniel.vetter, chris, duyuyang, johannes.berg,
	tj, tytso, willy, david, amir73il, bfields, gregkh, kernel-team

On Thu, Nov 12, 2020 at 05:51:14PM +0900, Byungchul Park wrote:
> On Thu, Nov 12, 2020 at 03:15:32PM +0900, Byungchul Park wrote:
> > > If on the other hand there's some bug in lockdep itself that causes 
> > > excessive false positives, it's better to limit the number of reports 
> > > to one per bootup, so that it's not seen as a nuisance debugging 
> > > facility.
> > > 
> > > Or if lockdep gets extended that causes multiple previously unreported 
> > > (but very much real) bugs to be reported, it's *still* better to 
> > > handle them one by one: because lockdep doesn't know whether it's real 
> > 
> > Why do you think we cannot handle them one by one with multi-reporting?
> > We can handle them with the first one as we do with single-reporting.
> > And also that's how we work, for example, when building the kernel or
> > somethinig.
> 
> Let me add a little bit more. I just said the fact that we are able to
> handle the bugs one by one as if we do with single-reporting.
> 
> But the thing is multi-reporting could be more useful in some cases.
> More precisely speaking, bugs not caused by IRQ state will be reported
> without annoying nuisance. I bet you have experienced a ton of nuisances
> when multi-reporting Lockdep detected a deadlock by IRQ state.

Last, we should never use multi-reporting Lockdep if Lockdep works like
the current code because redundant warnings caused by IRQ state would be
reported almost infinite times. I'm afraid you were talking about it.

Thanks,
Byungchul

> For some cases, multi-reporting is as useful as single-reporting, while
> for the other cases, multi-reporting is more useful. Then I think we
> have to go with mutil-reporting if there's no technical issue.
> 
> Thanks,
> Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-11 14:36   ` Steven Rostedt
  2020-11-11 23:16     ` Thomas Gleixner
@ 2020-11-12 10:32     ` Byungchul Park
  2020-11-12 13:56       ` Daniel Vetter
  1 sibling, 1 reply; 31+ messages in thread
From: Byungchul Park @ 2020-11-12 10:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, torvalds, peterz, mingo, will, linux-kernel, tglx,
	joel, alexander.levin, daniel.vetter, chris, duyuyang,
	johannes.berg, tj, tytso, willy, david, amir73il, bfields,
	gregkh, kernel-team

On Wed, Nov 11, 2020 at 09:36:09AM -0500, Steven Rostedt wrote:
> And this is especially true with lockdep, because lockdep only detects the
> deadlock, it doesn't tell you which lock was the incorrect locking.
> 
> For example. If we have a locking chain of:
> 
>  A -> B -> D
> 
>  A -> C -> D
> 
> Which on a correct system looks like this:
> 
>  lock(A)
>  lock(B)
>  unlock(B)
>  unlock(A)
> 
>  lock(B)
>  lock(D)
>  unlock(D)
>  unlock(B)
> 
>  lock(A)
>  lock(C)
>  unlock(C)
>  unlock(A)
> 
>  lock(C)
>  lock(D)
>  unlock(D)
>  unlock(C)
> 
> which creates the above chains in that order.
> 
> But, lets say we have a bug and the system boots up doing:
> 
>  lock(D)
>  lock(A)
>  unlock(A)
>  unlock(D)
> 
> which creates the incorrect chain.
> 
>  D -> A
> 
> 
> Now you do the correct locking:
> 
>  lock(A)
>  lock(B)
> 
> Creates A -> B
> 
>  lock(A)
>  lock(C)
> 
> Creates A -> C
> 
>  lock(B)
>  lock(D)
> 
> Creates B -> D and lockdep detects:
> 
>  D -> A -> B -> D
> 
> and gives us the lockdep splat!!!
> 
> But we don't disable lockdep. We let it continue...
> 
>  lock(C)
>  lock(D)
> 
> Which creates C -> D
> 
> Now it explodes with D -> A -> C -> D

It would be better to check both so that we can choose either
breaking a single D -> A chain or both breaking A -> B -> D and
A -> C -> D.

> Which it already reported. And it can be much more complex when dealing
> with interrupt contexts and longer chains. That is, perhaps a different

IRQ context is much much worse than longer chains. I understand what you
try to explain.

> chain had a missing irq disable, now you might get 5 or 6 more lockdep
> splats because of that one bug.
> 
> The point I'm making is that the lockdep splats after the first one may
> just be another version of the same bug and not a new one. Worse, if you
> only look at the later lockdep splats, it may be much more difficult to
> find the original bug than if you just had the first one. Believe me, I've

If the later lockdep splats make us more difficult to fix, then we can
look at the first one. If it's more informative, then we can check the
all splats. Anyway it's up to us.

> been down that road too many times!
> 
> And it can be very difficult to know if new lockdep splats are not the same
> bug, and this will waste a lot of developers time!

Again, we don't have to waste time. We can go with the first one.

> This is why the decision to disable lockdep after the first splat was made.
> There were times I wanted to check locking somewhere, but is was using
> linux-next which had a lockdep splat that I didn't care about. So I
> made it not disable lockdep. And then I hit this exact scenario, that the
> one incorrect chain was causing reports all over the place. To solve it, I
> had to patch the incorrect chain to do raw locking to have lockdep ignore
> it ;-) Then I was able to test the code I was interested in.

It's not a problem of whether it's single-reporting or multi-reporting
but it's the problem of the lock creating the incorrect chain and making
you felt hard to handle.

Even if you were using single-reporting Lockdep, you anyway had to
continue to ignore locks in the same way until you got to the intest.

> I think I understand it. For things like completions and other "wait for
> events" we have lockdep annotation, but it is rather awkward to implement.
> Having something that says "lockdep_wait_event()" and
> "lockdep_exec_event()" wrappers would be useful.

Yes. It's a problem of lack of APIs. It can be done by reverting revert
of cross-release without big change. ;-)

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12 10:32     ` Byungchul Park
@ 2020-11-12 13:56       ` Daniel Vetter
  2020-11-16  8:45         ` Byungchul Park
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Vetter @ 2020-11-12 13:56 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Steven Rostedt, Ingo Molnar, Linus Torvalds, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Linux Kernel Mailing List,
	Thomas Gleixner, Joel Fernandes, Sasha Levin, Wilson, Chris,
	duyuyang, Johannes Berg, Tejun Heo, Theodore Ts'o,
	Matthew Wilcox, Dave Chinner, Amir Goldstein, J. Bruce Fields,
	Greg KH, kernel-team

On Thu, Nov 12, 2020 at 11:33 AM Byungchul Park <byungchul.park@lge.com> wrote:
>
> On Wed, Nov 11, 2020 at 09:36:09AM -0500, Steven Rostedt wrote:
> > And this is especially true with lockdep, because lockdep only detects the
> > deadlock, it doesn't tell you which lock was the incorrect locking.
> >
> > For example. If we have a locking chain of:
> >
> >  A -> B -> D
> >
> >  A -> C -> D
> >
> > Which on a correct system looks like this:
> >
> >  lock(A)
> >  lock(B)
> >  unlock(B)
> >  unlock(A)
> >
> >  lock(B)
> >  lock(D)
> >  unlock(D)
> >  unlock(B)
> >
> >  lock(A)
> >  lock(C)
> >  unlock(C)
> >  unlock(A)
> >
> >  lock(C)
> >  lock(D)
> >  unlock(D)
> >  unlock(C)
> >
> > which creates the above chains in that order.
> >
> > But, lets say we have a bug and the system boots up doing:
> >
> >  lock(D)
> >  lock(A)
> >  unlock(A)
> >  unlock(D)
> >
> > which creates the incorrect chain.
> >
> >  D -> A
> >
> >
> > Now you do the correct locking:
> >
> >  lock(A)
> >  lock(B)
> >
> > Creates A -> B
> >
> >  lock(A)
> >  lock(C)
> >
> > Creates A -> C
> >
> >  lock(B)
> >  lock(D)
> >
> > Creates B -> D and lockdep detects:
> >
> >  D -> A -> B -> D
> >
> > and gives us the lockdep splat!!!
> >
> > But we don't disable lockdep. We let it continue...
> >
> >  lock(C)
> >  lock(D)
> >
> > Which creates C -> D
> >
> > Now it explodes with D -> A -> C -> D
>
> It would be better to check both so that we can choose either
> breaking a single D -> A chain or both breaking A -> B -> D and
> A -> C -> D.
>
> > Which it already reported. And it can be much more complex when dealing
> > with interrupt contexts and longer chains. That is, perhaps a different
>
> IRQ context is much much worse than longer chains. I understand what you
> try to explain.
>
> > chain had a missing irq disable, now you might get 5 or 6 more lockdep
> > splats because of that one bug.
> >
> > The point I'm making is that the lockdep splats after the first one may
> > just be another version of the same bug and not a new one. Worse, if you
> > only look at the later lockdep splats, it may be much more difficult to
> > find the original bug than if you just had the first one. Believe me, I've
>
> If the later lockdep splats make us more difficult to fix, then we can
> look at the first one. If it's more informative, then we can check the
> all splats. Anyway it's up to us.
>
> > been down that road too many times!
> >
> > And it can be very difficult to know if new lockdep splats are not the same
> > bug, and this will waste a lot of developers time!
>
> Again, we don't have to waste time. We can go with the first one.
>
> > This is why the decision to disable lockdep after the first splat was made.
> > There were times I wanted to check locking somewhere, but is was using
> > linux-next which had a lockdep splat that I didn't care about. So I
> > made it not disable lockdep. And then I hit this exact scenario, that the
> > one incorrect chain was causing reports all over the place. To solve it, I
> > had to patch the incorrect chain to do raw locking to have lockdep ignore
> > it ;-) Then I was able to test the code I was interested in.
>
> It's not a problem of whether it's single-reporting or multi-reporting
> but it's the problem of the lock creating the incorrect chain and making
> you felt hard to handle.
>
> Even if you were using single-reporting Lockdep, you anyway had to
> continue to ignore locks in the same way until you got to the intest.
>
> > I think I understand it. For things like completions and other "wait for
> > events" we have lockdep annotation, but it is rather awkward to implement.
> > Having something that says "lockdep_wait_event()" and
> > "lockdep_exec_event()" wrappers would be useful.
>
> Yes. It's a problem of lack of APIs. It can be done by reverting revert
> of cross-release without big change. ;-)

+1 on lockdep-native support for this. For another use case I've added
annotations for dma_fence_wait, and they're not entirely correct
unfortunately. But the false positives is along the lines of "you
really shouldn't do this, even if it's in theory deadlock free". See

commit 5fbff813a4a328b730cb117027c43a4ae9d8b6c0
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Jul 7 22:12:05 2020 +0200

   dma-fence: basic lockdep annotations

for fairly lengthy discussion of the problem and what I ended up with.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12  8:10       ` Byungchul Park
@ 2020-11-12 14:26         ` Steven Rostedt
  2020-11-12 14:52           ` Matthew Wilcox
  2020-11-12 14:58           ` Byungchul Park
  0 siblings, 2 replies; 31+ messages in thread
From: Steven Rostedt @ 2020-11-12 14:26 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Thomas Gleixner, Ingo Molnar, torvalds, peterz, mingo, will,
	linux-kernel, joel, alexander.levin, daniel.vetter, chris,
	duyuyang, johannes.berg, tj, tytso, willy, david, amir73il,
	bfields, gregkh, kernel-team

On Thu, 12 Nov 2020 17:10:30 +0900
Byungchul Park <byungchul.park@lge.com> wrote:

> 2. Does Lockdep do what a deadlock detection tool should do? From
>    internal engine to APIs, all the internal data structure and
>    algotithm of Lockdep is only looking at lock(?) acquisition order.
>    Fundamentally Lockdep cannot work correctly with all general cases,
>    for example, read/write/trylock and any wait/event.

But lockdep does handle read/write/trylock and can handle wait/event (just
needs better wrappers to annotate this). Perhaps part of the confusion here
is that we believe that lockdep already does what you are asking for.

> 
>    This can be done by re-introducing cross-release but still partially.
>    A deadlock detector tool should thoroughly focus on *waits* and
>    *events* to be more perfect at detecting deadlock because the fact is
>    *waits* and their *events* that never reach cause deadlock.
> 
>    With the philosophy of Lockdep, we can only handle partial cases
>    fundamently. We have no choice but to do various work-around or adopt
>    tricky ways to cover more cases if we keep using Lockdep.
> 
> > That said, I'm not at all interested in a wholesale replacement of
> > lockdep which will take exactly the same amount of time to stabilize and
> > weed out the shortcomings again.  
> 
> I don't want to bother ones who don't want to be bothered from the tool.
> But I think some day we need a new tool doing exactly what it should do
> for deadlock detection for sure.
> 
> I'm willing to make it matured on my own, or with ones who need a
> stronger tool or willing to make it matured together - I wish tho.
> That's why I suggest to make both there until the new tool gets
> considered stable.
> 
> FYI, roughly Lockdep is doing:
> 
>    1. Dependency check
>    2. Lock usage correctness check (including RCU)
>    3. IRQ related usage correctness check with IRQFLAGS
> 
> 2 and 3 should be there forever which is subtle and have gotten matured.
> But 1 is not. I've been talking about 1. But again, it's not about
> replacing it right away but having both for a while. I'm gonna try my
> best to make it better.

And I believe lockdep does handle 1. Perhaps show some tangible use case
that you want to cover that you do not believe that lockdep can handle. If
lockdep cannot handle it, it will show us where lockdep is lacking. If it
can handle it, it will educate you on other ways that lockdep can be
helpful in your development ;-)

-- Steve

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12 14:26         ` Steven Rostedt
@ 2020-11-12 14:52           ` Matthew Wilcox
  2020-11-16  8:57             ` Byungchul Park
  2020-11-12 14:58           ` Byungchul Park
  1 sibling, 1 reply; 31+ messages in thread
From: Matthew Wilcox @ 2020-11-12 14:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Byungchul Park, Thomas Gleixner, Ingo Molnar, torvalds, peterz,
	mingo, will, linux-kernel, joel, alexander.levin, daniel.vetter,
	chris, duyuyang, johannes.berg, tj, tytso, david, amir73il,
	bfields, gregkh, kernel-team

On Thu, Nov 12, 2020 at 09:26:12AM -0500, Steven Rostedt wrote:
> > FYI, roughly Lockdep is doing:
> > 
> >    1. Dependency check
> >    2. Lock usage correctness check (including RCU)
> >    3. IRQ related usage correctness check with IRQFLAGS
> > 
> > 2 and 3 should be there forever which is subtle and have gotten matured.
> > But 1 is not. I've been talking about 1. But again, it's not about
> > replacing it right away but having both for a while. I'm gonna try my
> > best to make it better.
> 
> And I believe lockdep does handle 1. Perhaps show some tangible use case
> that you want to cover that you do not believe that lockdep can handle. If
> lockdep cannot handle it, it will show us where lockdep is lacking. If it
> can handle it, it will educate you on other ways that lockdep can be
> helpful in your development ;-)

Something I believe lockdep is missing is a way to annotate "This lock
will be released by a softirq".  If we had lockdep for lock_page(), this
would be a great case to show off.  The filesystem locks the page, then
submits it to a device driver.  On completion, the filesystem's bio
completion handler will be called in softirq context and unlock the page.

So if the filesystem has another lock which is acquired by the completion
handler. we could get an ABBA deadlock that lockdep would be unable to see.

There are other similar things; if you look at the remaining semaphore
users in the kernel, you'll see the general pattern is that they're
acquired in process context and then released in interrupt context.
If we had a way to transfer ownership of the semaphore to a generic
"interrupt context", they could become mutexes and lockdep could check
that nothing else will cause a deadlock.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12 14:26         ` Steven Rostedt
  2020-11-12 14:52           ` Matthew Wilcox
@ 2020-11-12 14:58           ` Byungchul Park
  2020-11-16  9:05             ` Byungchul Park
  1 sibling, 1 reply; 31+ messages in thread
From: Byungchul Park @ 2020-11-12 14:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Byungchul Park, Thomas Gleixner, Ingo Molnar, Linus Torvalds,
	Peter Zijlstra, mingo, will, LKML, Joel Fernandes,
	alexander.levin, Daniel Vetter, Chris Wilson, duyuyang,
	johannes.berg, Tejun Heo, Theodore Ts'o, willy, david,
	Amir Goldstein, bfields, gregkh, kernel-team

On Thu, Nov 12, 2020 at 11:28 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 12 Nov 2020 17:10:30 +0900
> Byungchul Park <byungchul.park@lge.com> wrote:
>
> > 2. Does Lockdep do what a deadlock detection tool should do? From
> >    internal engine to APIs, all the internal data structure and
> >    algotithm of Lockdep is only looking at lock(?) acquisition order.
> >    Fundamentally Lockdep cannot work correctly with all general cases,
> >    for example, read/write/trylock and any wait/event.
>
> But lockdep does handle read/write/trylock and can handle wait/event (just
> needs better wrappers to annotate this). Perhaps part of the confusion here
> is that we believe that lockdep already does what you are asking for.
>
> >
> >    This can be done by re-introducing cross-release but still partially.
> >    A deadlock detector tool should thoroughly focus on *waits* and
> >    *events* to be more perfect at detecting deadlock because the fact is
> >    *waits* and their *events* that never reach cause deadlock.
> >
> >    With the philosophy of Lockdep, we can only handle partial cases
> >    fundamently. We have no choice but to do various work-around or adopt
> >    tricky ways to cover more cases if we keep using Lockdep.
> >
> > > That said, I'm not at all interested in a wholesale replacement of
> > > lockdep which will take exactly the same amount of time to stabilize and
> > > weed out the shortcomings again.
> >
> > I don't want to bother ones who don't want to be bothered from the tool.
> > But I think some day we need a new tool doing exactly what it should do
> > for deadlock detection for sure.
> >
> > I'm willing to make it matured on my own, or with ones who need a
> > stronger tool or willing to make it matured together - I wish tho.
> > That's why I suggest to make both there until the new tool gets
> > considered stable.
> >
> > FYI, roughly Lockdep is doing:
> >
> >    1. Dependency check
> >    2. Lock usage correctness check (including RCU)
> >    3. IRQ related usage correctness check with IRQFLAGS
> >
> > 2 and 3 should be there forever which is subtle and have gotten matured.
> > But 1 is not. I've been talking about 1. But again, it's not about
> > replacing it right away but having both for a while. I'm gonna try my
> > best to make it better.
>
> And I believe lockdep does handle 1. Perhaps show some tangible use case
> that you want to cover that you do not believe that lockdep can handle. If
> lockdep cannot handle it, it will show us where lockdep is lacking. If it
> can handle it, it will educate you on other ways that lockdep can be
> helpful in your development ;-)

Yes. That's the best thing I can do for all of us. I will.

I already did exactly the same thing while I was developing cross-release.
But I'm willing to do it again with the current Lockdep code.

But not today. It's over mid-night. Good night~

-- 
Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12 13:56       ` Daniel Vetter
@ 2020-11-16  8:45         ` Byungchul Park
  0 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-16  8:45 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Steven Rostedt, Ingo Molnar, Linus Torvalds, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Linux Kernel Mailing List,
	Thomas Gleixner, Joel Fernandes, Sasha Levin, Wilson, Chris,
	duyuyang, Johannes Berg, Tejun Heo, Theodore Ts'o,
	Matthew Wilcox, Dave Chinner, Amir Goldstein, J. Bruce Fields,
	Greg KH, kernel-team

On Thu, Nov 12, 2020 at 02:56:49PM +0100, Daniel Vetter wrote:
> > > I think I understand it. For things like completions and other "wait for
> > > events" we have lockdep annotation, but it is rather awkward to implement.
> > > Having something that says "lockdep_wait_event()" and
> > > "lockdep_exec_event()" wrappers would be useful.
> >
> > Yes. It's a problem of lack of APIs. It can be done by reverting revert
> > of cross-release without big change. ;-)
> 
> +1 on lockdep-native support for this. For another use case I've added
> annotations for dma_fence_wait, and they're not entirely correct
> unfortunately. But the false positives is along the lines of "you

I'd like to help you solve the problem you are facing. Let me be back
and help you later. I have to all-stop what I'm doing at the moment
becasue of a very big personal issue, which is a sad thing.

Thank you,
Byungchul

> really shouldn't do this, even if it's in theory deadlock free". See
> 
> commit 5fbff813a4a328b730cb117027c43a4ae9d8b6c0
> Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> Date:   Tue Jul 7 22:12:05 2020 +0200
> 
>    dma-fence: basic lockdep annotations
> 
> for fairly lengthy discussion of the problem and what I ended up with.
> 
> Thanks, Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12 14:52           ` Matthew Wilcox
@ 2020-11-16  8:57             ` Byungchul Park
  2020-11-16 15:37               ` Matthew Wilcox
  0 siblings, 1 reply; 31+ messages in thread
From: Byungchul Park @ 2020-11-16  8:57 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Steven Rostedt, Thomas Gleixner, Ingo Molnar, torvalds, peterz,
	mingo, will, linux-kernel, joel, alexander.levin, daniel.vetter,
	chris, duyuyang, johannes.berg, tj, tytso, david, amir73il,
	bfields, gregkh, kernel-team

On Thu, Nov 12, 2020 at 02:52:51PM +0000, Matthew Wilcox wrote:
> On Thu, Nov 12, 2020 at 09:26:12AM -0500, Steven Rostedt wrote:
> > > FYI, roughly Lockdep is doing:
> > > 
> > >    1. Dependency check
> > >    2. Lock usage correctness check (including RCU)
> > >    3. IRQ related usage correctness check with IRQFLAGS
> > > 
> > > 2 and 3 should be there forever which is subtle and have gotten matured.
> > > But 1 is not. I've been talking about 1. But again, it's not about
> > > replacing it right away but having both for a while. I'm gonna try my
> > > best to make it better.
> > 
> > And I believe lockdep does handle 1. Perhaps show some tangible use case
> > that you want to cover that you do not believe that lockdep can handle. If
> > lockdep cannot handle it, it will show us where lockdep is lacking. If it
> > can handle it, it will educate you on other ways that lockdep can be
> > helpful in your development ;-)
> 
> Something I believe lockdep is missing is a way to annotate "This lock
> will be released by a softirq".  If we had lockdep for lock_page(), this
> would be a great case to show off.  The filesystem locks the page, then
> submits it to a device driver.  On completion, the filesystem's bio
> completion handler will be called in softirq context and unlock the page.
> 
> So if the filesystem has another lock which is acquired by the completion
> handler. we could get an ABBA deadlock that lockdep would be unable to see.
> 
> There are other similar things; if you look at the remaining semaphore
> users in the kernel, you'll see the general pattern is that they're
> acquired in process context and then released in interrupt context.
> If we had a way to transfer ownership of the semaphore to a generic
> "interrupt context", they could become mutexes and lockdep could check
> that nothing else will cause a deadlock.

Yes. Those are exactly what Cross-release feature solves. Those problems
can be achieved with Cross-release. But even with Cross-release, we
still cannot solve the problem of (1) readlock handling (2) and false
positives preventing further reporting.

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-12 14:58           ` Byungchul Park
@ 2020-11-16  9:05             ` Byungchul Park
  2020-11-23 10:45               ` Byungchul Park
  0 siblings, 1 reply; 31+ messages in thread
From: Byungchul Park @ 2020-11-16  9:05 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Steven Rostedt, Thomas Gleixner, Ingo Molnar, Linus Torvalds,
	Peter Zijlstra, mingo, will, LKML, Joel Fernandes,
	alexander.levin, Daniel Vetter, Chris Wilson, duyuyang,
	johannes.berg, Tejun Heo, Theodore Ts'o, willy, david,
	Amir Goldstein, bfields, gregkh, kernel-team

On Thu, Nov 12, 2020 at 11:58:44PM +0900, Byungchul Park wrote:
> > > FYI, roughly Lockdep is doing:
> > >
> > >    1. Dependency check
> > >    2. Lock usage correctness check (including RCU)
> > >    3. IRQ related usage correctness check with IRQFLAGS
> > >
> > > 2 and 3 should be there forever which is subtle and have gotten matured.
> > > But 1 is not. I've been talking about 1. But again, it's not about
> > > replacing it right away but having both for a while. I'm gonna try my
> > > best to make it better.
> >
> > And I believe lockdep does handle 1. Perhaps show some tangible use case
> > that you want to cover that you do not believe that lockdep can handle. If
> > lockdep cannot handle it, it will show us where lockdep is lacking. If it
> > can handle it, it will educate you on other ways that lockdep can be
> > helpful in your development ;-)

1) OK. Lockdep might work with trylock well.
2) Definitely Lockdep cannot do what Cross-release was doing.
3) For readlock handling, let me be back later and give you examples. I
   need check current Lockdep code first. But I have to all-stop what
   I'm doing at the moment because of a very big personal issue, which
   is a sad thing.

Sorry for the late response.

Thank you,
Byungchul

> 
> Yes. That's the best thing I can do for all of us. I will.
> 
> I already did exactly the same thing while I was developing cross-release.
> But I'm willing to do it again with the current Lockdep code.
> 
> But not today. It's over mid-night. Good night~
> 
> -- 
> Thanks,
> Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-16  8:57             ` Byungchul Park
@ 2020-11-16 15:37               ` Matthew Wilcox
  2020-11-18  1:45                 ` Boqun Feng
  2020-11-23 13:15                 ` Byungchul Park
  0 siblings, 2 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-11-16 15:37 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Steven Rostedt, Thomas Gleixner, Ingo Molnar, torvalds, peterz,
	mingo, will, linux-kernel, joel, alexander.levin, daniel.vetter,
	chris, duyuyang, johannes.berg, tj, tytso, david, amir73il,
	bfields, gregkh, kernel-team

On Mon, Nov 16, 2020 at 05:57:57PM +0900, Byungchul Park wrote:
> On Thu, Nov 12, 2020 at 02:52:51PM +0000, Matthew Wilcox wrote:
> > On Thu, Nov 12, 2020 at 09:26:12AM -0500, Steven Rostedt wrote:
> > > > FYI, roughly Lockdep is doing:
> > > > 
> > > >    1. Dependency check
> > > >    2. Lock usage correctness check (including RCU)
> > > >    3. IRQ related usage correctness check with IRQFLAGS
> > > > 
> > > > 2 and 3 should be there forever which is subtle and have gotten matured.
> > > > But 1 is not. I've been talking about 1. But again, it's not about
> > > > replacing it right away but having both for a while. I'm gonna try my
> > > > best to make it better.
> > > 
> > > And I believe lockdep does handle 1. Perhaps show some tangible use case
> > > that you want to cover that you do not believe that lockdep can handle. If
> > > lockdep cannot handle it, it will show us where lockdep is lacking. If it
> > > can handle it, it will educate you on other ways that lockdep can be
> > > helpful in your development ;-)
> > 
> > Something I believe lockdep is missing is a way to annotate "This lock
> > will be released by a softirq".  If we had lockdep for lock_page(), this
> > would be a great case to show off.  The filesystem locks the page, then
> > submits it to a device driver.  On completion, the filesystem's bio
> > completion handler will be called in softirq context and unlock the page.
> > 
> > So if the filesystem has another lock which is acquired by the completion
> > handler. we could get an ABBA deadlock that lockdep would be unable to see.
> > 
> > There are other similar things; if you look at the remaining semaphore
> > users in the kernel, you'll see the general pattern is that they're
> > acquired in process context and then released in interrupt context.
> > If we had a way to transfer ownership of the semaphore to a generic
> > "interrupt context", they could become mutexes and lockdep could check
> > that nothing else will cause a deadlock.
> 
> Yes. Those are exactly what Cross-release feature solves. Those problems
> can be achieved with Cross-release. But even with Cross-release, we
> still cannot solve the problem of (1) readlock handling (2) and false
> positives preventing further reporting.

It's not just about lockdep for semaphores.  Mutexes will spin if the
current owner is still running, so to convert an interrupt-released
semaphore to a mutex, we need a way to mark the mutex as being released
by the new owner.

I really don't think you want to report subsequent lockdep splats.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-16 15:37               ` Matthew Wilcox
@ 2020-11-18  1:45                 ` Boqun Feng
  2020-11-18  3:30                   ` Matthew Wilcox
  2020-11-23 13:15                 ` Byungchul Park
  1 sibling, 1 reply; 31+ messages in thread
From: Boqun Feng @ 2020-11-18  1:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Byungchul Park, Steven Rostedt, Thomas Gleixner, Ingo Molnar,
	torvalds, peterz, mingo, will, linux-kernel, joel,
	alexander.levin, daniel.vetter, chris, duyuyang, johannes.berg,
	tj, tytso, david, amir73il, bfields, gregkh, kernel-team

Hi Matthew,

On Mon, Nov 16, 2020 at 03:37:29PM +0000, Matthew Wilcox wrote:
[...]
> 
> It's not just about lockdep for semaphores.  Mutexes will spin if the
> current owner is still running, so to convert an interrupt-released
> semaphore to a mutex, we need a way to mark the mutex as being released

Could you provide an example for the conversion from interrupt-released
semaphore to a mutex? I'd like to see if we can improve lockdep to help
on that case.

Regards,
Boqun

> by the new owner.
> 
> I really don't think you want to report subsequent lockdep splats.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-18  1:45                 ` Boqun Feng
@ 2020-11-18  3:30                   ` Matthew Wilcox
  0 siblings, 0 replies; 31+ messages in thread
From: Matthew Wilcox @ 2020-11-18  3:30 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Byungchul Park, Steven Rostedt, Thomas Gleixner, Ingo Molnar,
	torvalds, peterz, mingo, will, linux-kernel, joel,
	alexander.levin, daniel.vetter, chris, duyuyang, johannes.berg,
	tj, tytso, david, amir73il, bfields, gregkh, kernel-team

On Wed, Nov 18, 2020 at 09:45:40AM +0800, Boqun Feng wrote:
> Hi Matthew,
> 
> On Mon, Nov 16, 2020 at 03:37:29PM +0000, Matthew Wilcox wrote:
> [...]
> > 
> > It's not just about lockdep for semaphores.  Mutexes will spin if the
> > current owner is still running, so to convert an interrupt-released
> > semaphore to a mutex, we need a way to mark the mutex as being released
> 
> Could you provide an example for the conversion from interrupt-released
> semaphore to a mutex? I'd like to see if we can improve lockdep to help
> on that case.

How about adb_probe_mutex in drivers/macintosh/adb.c.  Most of
the acquires/releases are within the same task.  But adb_reset_bus()
calls down(&adb_probe_mutex), then schedules adb_reset_work() which runs
adb_probe_task() which calls up(&adb_probe_mutex).

Ideally adb_probe_mutex would become a mutex instead of the semaphore
it currently is.  adb_reset_bus() would pass ownership of the mutex to
kadbprobe since it's the one which must run in order to release the mutex.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-16  9:05             ` Byungchul Park
@ 2020-11-23 10:45               ` Byungchul Park
  0 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 10:45 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Steven Rostedt, Thomas Gleixner, Ingo Molnar, Linus Torvalds,
	Peter Zijlstra, mingo, will, LKML, Joel Fernandes,
	alexander.levin, Daniel Vetter, Chris Wilson, duyuyang,
	johannes.berg, Tejun Heo, Theodore Ts'o, willy, david,
	Amir Goldstein, bfields, gregkh, kernel-team

On Mon, Nov 16, 2020 at 06:05:47PM +0900, Byungchul Park wrote:
> On Thu, Nov 12, 2020 at 11:58:44PM +0900, Byungchul Park wrote:
> > > > FYI, roughly Lockdep is doing:
> > > >
> > > >    1. Dependency check
> > > >    2. Lock usage correctness check (including RCU)
> > > >    3. IRQ related usage correctness check with IRQFLAGS
> > > >
> > > > 2 and 3 should be there forever which is subtle and have gotten matured.
> > > > But 1 is not. I've been talking about 1. But again, it's not about
> > > > replacing it right away but having both for a while. I'm gonna try my
> > > > best to make it better.
> > >
> > > And I believe lockdep does handle 1. Perhaps show some tangible use case
> > > that you want to cover that you do not believe that lockdep can handle. If
> > > lockdep cannot handle it, it will show us where lockdep is lacking. If it
> > > can handle it, it will educate you on other ways that lockdep can be
> > > helpful in your development ;-)
> 
> 1) OK. Lockdep might work with trylock well.
> 2) Definitely Lockdep cannot do what Cross-release was doing.
> 3) For readlock handling, let me be back later and give you examples. I
>    need check current Lockdep code first. But I have to all-stop what
>    I'm doing at the moment because of a very big personal issue, which
>    is a sad thing.

I just found that Boqun Feng has made a lot of changes into Lockdep
recently to support tracking recursive read locks, while I was checking
how the current Lockdep deals with read locks.

I need to read the code more.. I'll add my opinion on it once I see
how it works. Before that, I'd like to share my approach so that you
guys can see what means to track *wait* and *event*, how simply the tool
can work and what exactly a deadlock detection tool should do. Let me
add my patches onto this thread right away.

I understand you all don't want to replace such a stable tool but I
hope you to see *the* right way to track things for that purpose.
Again, I do never touch any other functions of Lockdep including all
great efforts that have been made but dependency tracking.

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC] Dept(Dependency Tracker) Implementation
  2020-11-11  5:05 [RFC] Are you good with Lockdep? Byungchul Park
  2020-11-11 10:54 ` Ingo Molnar
@ 2020-11-23 11:05 ` Byungchul Park
  2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
  2020-11-23 12:29   ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
  2020-11-23 11:13 ` [RFC] Dept(Dependency Tracker) Report Example Byungchul Park
  2 siblings, 2 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 11:05 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

Hi,

This patchset is too nasty to get reviewed in detail for now.

This have:

   1. applying Dept to spinlock/mutex/rwlock/completion
   2. assigning custom keys or disable maps to avoid false positives

This doesn't have yet (but will be done):

   1. proc interfaces e.g. to see dependecies the tool has built,
   2. applying Dept to rw semaphore and the like,
   3. applying Dept to lock_page()/unlock_page(),
   4. assigning custom keys to more places properly,
   5. replace all manual Lockdep annotations,
   (and so on..)

But I decided to share it to let others able to test how it works and
someone who wants to see the detail able to check the code. The most
important thing I'd like to show is what exactly a deadlock detection
tool should do.

Turn on CONFIG_DEPT to test it. Feel free to leave any questions if you
have.

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC] Dept(Dependency Tracker) Report Example
  2020-11-11  5:05 [RFC] Are you good with Lockdep? Byungchul Park
  2020-11-11 10:54 ` Ingo Molnar
  2020-11-23 11:05 ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
@ 2020-11-23 11:13 ` Byungchul Park
  2020-11-23 12:14   ` Byungchul Park
  2 siblings, 1 reply; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 11:13 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

Search the following dmesg for the reports with "Dept".

Of course, it's not real but fake to show you how Dept reports multiple
problems. This is FYI. Feel free to ask me if you have questions.

--->8---
[    0.000000] Linux version 5.9.0+ (byungchul.park@X58A-UD3R) (gcc (Ubuntu 6.5.0-2ubuntu1~14.04.1) 6.5.0 20181026, GNU ld (GNU Binutils for Ubuntu) 2.24) #8 SMP Mon Nov 23 18:47:03 KST 2020
[    0.000000] Command line: root=/dev/sda1 text console=ttyS0 nokaslr
[    0.000000] x86/fpu: x87 FPU will use FXSAVE
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bfffdfff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bfffe000-0x00000000bfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] SMBIOS 2.4 present.
[    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] tsc: Detected 3197.604 MHz processor
[    0.000752] last_pfn = 0x140000 max_arch_pfn = 0x400000000
[    0.000819] x86/PAT: PAT not supported by the CPU.
[    0.000830] x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC  
[    0.000834] last_pfn = 0xbfffe max_arch_pfn = 0x400000000
[    0.001462] found SMP MP-table at [mem 0x000f0ae0-0x000f0aef]
[    0.001488] check: Scanning 1 areas for low memory corruption
[    0.002605] ACPI: Early table checksum verification disabled
[    0.002612] ACPI: RSDP 0x00000000000F08B0 000014 (v00 BOCHS )
[    0.002618] ACPI: RSDT 0x00000000BFFFFCFC 000034 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
[    0.002626] ACPI: FACP 0x00000000BFFFF1C0 000074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.002634] ACPI: DSDT 0x00000000BFFFE040 001180 (v01 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[    0.002640] ACPI: FACS 0x00000000BFFFE000 000040
[    0.002645] ACPI: SSDT 0x00000000BFFFF234 000A00 (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
[    0.002651] ACPI: APIC 0x00000000BFFFFC34 000090 (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.002657] ACPI: HPET 0x00000000BFFFFCC4 000038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
[    0.002900] No NUMA configuration found
[    0.002904] Faking a node at [mem 0x0000000000000000-0x000000013fffffff]
[    0.002910] NODE_DATA(0) allocated [mem 0x13fff9000-0x13fffcfff]
[    0.002936] Zone ranges:
[    0.002939]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.002943]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.002947]   Normal   [mem 0x0000000100000000-0x000000013fffffff]
[    0.002951] Movable zone start for each node
[    0.002955] Early memory node ranges
[    0.002958]   node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.002962]   node   0: [mem 0x0000000000100000-0x00000000bfffdfff]
[    0.002966]   node   0: [mem 0x0000000100000000-0x000000013fffffff]
[    0.003444] Zeroed struct page in unavailable ranges: 100 pages
[    0.003446] Initmem setup node 0 [mem 0x0000000000001000-0x000000013fffffff]
[    0.017534] ACPI: PM-Timer IO Port: 0xb008
[    0.017550] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[    0.017583] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[    0.017589] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.017593] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[    0.017597] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.017601] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[    0.017604] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[    0.017621] Using ACPI (MADT) for SMP configuration information
[    0.017625] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[    0.017635] smpboot: Allowing 4 CPUs, 0 hotplug CPUs
[    0.017653] PM: hibernation: Registered nosave memory: [mem 0x00000000-0x00000fff]
[    0.017657] PM: hibernation: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
[    0.017661] PM: hibernation: Registered nosave memory: [mem 0x000a0000-0x000effff]
[    0.017664] PM: hibernation: Registered nosave memory: [mem 0x000f0000-0x000fffff]
[    0.017668] PM: hibernation: Registered nosave memory: [mem 0xbfffe000-0xbfffffff]
[    0.017671] PM: hibernation: Registered nosave memory: [mem 0xc0000000-0xfeffbfff]
[    0.017674] PM: hibernation: Registered nosave memory: [mem 0xfeffc000-0xfeffffff]
[    0.017678] PM: hibernation: Registered nosave memory: [mem 0xff000000-0xfffbffff]
[    0.017681] PM: hibernation: Registered nosave memory: [mem 0xfffc0000-0xffffffff]
[    0.017690] [mem 0xc0000000-0xfeffbfff] available for PCI devices
[    0.017695] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.022235] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:4 nr_node_ids:1
[    0.023359] percpu: Embedded 54 pages/cpu s182080 r8192 d30912 u524288
[    0.023418] Built 1 zonelists, mobility grouping on.  Total pages: 1032071
[    0.023426] Policy zone: Normal
[    0.023438] Kernel command line: root=/dev/sda1 text console=ttyS0 nokaslr
[    0.028817] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[    0.029379] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[    0.029436] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.082983] Memory: 4014012K/4193904K available (14340K kernel code, 1973K rwdata, 4560K rodata, 1184K init, 10680K bss, 179636K reserved, 0K cma-reserved)
[    0.083161] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.083194] Kernel/User page tables isolation: enabled
[    0.083738] rcu: Hierarchical RCU implementation.
[    0.083748] rcu: 	RCU event tracing is enabled.
[    0.083758] rcu: 	RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=4.
[    0.083767] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
[    0.083777] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.084125] NR_IRQS: 4352, nr_irqs: 456, preallocated irqs: 16
[    0.084669] random: get_random_bytes called from start_kernel+0x367/0x533 with crng_init=0
[    0.087902] Console: colour VGA+ 80x25
[    0.141251] printk: console [ttyS0] enabled
[    0.141659] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.142413] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.142811] ... MAX_LOCK_DEPTH:          48
[    0.143217] ... MAX_LOCKDEP_KEYS:        8192
[    0.143645] ... CLASSHASH_SIZE:          4096
[    0.144069] ... MAX_LOCKDEP_ENTRIES:     32768
[    0.144498] ... MAX_LOCKDEP_CHAINS:      65536
[    0.144928] ... CHAINHASH_SIZE:          32768
[    0.145357]  memory used by lock dependency info: 3573 kB
[    0.145875]  per task-struct memory footprint: 1920 bytes
[    0.146403] DEPendency Tracker: Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
[    0.147163] ... DEPT_MAX_STACK_ENTRY: 16
[    0.147546] ... DEPT_MAX_WAIT_HIST  : 16
[    0.147929] ... DEPT_MAX_ECXT_HELD  : 48
[    0.148316] ... DEPT_MAX_SUBCLASSES : 16
[    0.148701] ... memory used by dep: 416 KB
[    0.149101] ... memory used by class: 608 KB
[    0.149517] ... memory used by stack: 2304 KB
[    0.149942] ... memory used by ecxt: 416 KB
[    0.150355] ... memory used by wait: 384 KB
[    0.150762] ... hash list head used by dep: 32 KB
[    0.151216] ... hash list head used by class: 32 KB
[    0.151687] ... total memory used by objects and hashs: 4192 KB
[    0.152253] ... per task memory footprint: 1864 bytes
[    0.152770] ACPI: Core revision 20200717
[    0.153363] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[    0.154350] APIC: Switch to symmetric I/O mode setup
[    0.156041] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.161359] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x2e1772a46d9, max_idle_ns: 440795370507 ns
[    0.162659] Calibrating delay loop (skipped), value calculated using timer frequency.. 6395.20 BogoMIPS (lpj=3197604)
[    0.163660] pid_max: default: 32768 minimum: 301
[    0.164729] LSM: Security Framework initializing
[    0.165294] SELinux:  Initializing.
[    0.165741] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes, linear)
[    0.166668] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes, linear)
[    0.168426] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[    0.168664] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
[    0.169668] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[    0.170660] Spectre V2 : Mitigation: Full generic retpoline
[    0.171299] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.172580] Speculative Store Bypass: Vulnerable
[    0.172668] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
[    0.173947] Freeing SMP alternatives memory: 40K
[    0.174836] ------------[ cut here ]------------
[    0.175392] DEPT_WARN_ONCE: Need to expand the ring buffer.
[    0.175650] WARNING: CPU: 0 PID: 0 at kernel/dependency/dept.c:1365 dept_event+0x3b7/0x580
[    0.175650] Modules linked in:
[    0.175650] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0+ #8
[    0.175650] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    0.175650] RIP: 0010:dept_event+0x3b7/0x580
[    0.175650] Code: 85 c0 75 32 80 3d e6 3f 6e 01 00 75 29 48 c7 c7 68 f9 31 82 4c 89 4c 24 20 44 89 44 24 18 c6 05 cc 3f 6e 01 01 e8 59 cd f7 ff <0f> 0b 44 8b 44 24 18 4c 8b 4c 24 20 41 83 ff ff 0f 84 d4 fd ff ff
[    0.175650] RSP: 0000:ffffffff82603e58 EFLAGS: 00010096
[    0.175650] RAX: 000000000000002f RBX: ffffffff82d22968 RCX: 0000000000000001
[    0.175650] RDX: 0000000000000000 RSI: ffffffff810c7d2c RDI: ffffffff826801e0
[    0.175650] RBP: 0000000000000000 R08: ffffffff8200c310 R09: ffffffff8200c310
[    0.175650] R10: 0000000000000000 R11: ffffffff82603cf5 R12: ffffffff8263a8c0
[    0.175650] R13: ffffffff8200bf58 R14: ffffffff82d22980 R15: 0000000000000008
[    0.175650] FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
[    0.175650] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.175650] CR2: ffff88813ffff000 CR3: 0000000002632000 CR4: 00000000000006f0
[    0.175650] Call Trace:
[    0.175650]  ? rest_init+0x107/0x120
[    0.175650]  ? find_held_lock+0x2d/0x90
[    0.175650]  ? __noinstr_text_end+0x1/0x1
[    0.175650]  complete+0x28/0x60
[    0.175650]  rest_init+0x107/0x120
[    0.175650]  start_kernel+0x523/0x533
[    0.175650]  secondary_startup_64+0xb6/0xc0
[    0.175650] ---[ end trace 6e44938380d9db2c ]---
[    0.278002] smpboot: CPU0: Intel QEMU Virtual CPU version 2.0.0 (family: 0x6, model: 0x6, stepping: 0x3)
[    0.279295] Performance Events: PMU not available due to virtualization, using software events only.
[    0.279837] rcu: Hierarchical SRCU implementation.
[    0.281787] smp: Bringing up secondary CPUs ...
[    0.282962] x86: Booting SMP configuration:
[    0.283465] .... node  #0, CPUs:      #1
[    0.072074] smpboot: CPU 1 Converting physical 0 to logical die 1
[    0.346270]  #2
[    0.072074] smpboot: CPU 2 Converting physical 0 to logical die 2
[    0.410303]  #3
[    0.072074] smpboot: CPU 3 Converting physical 0 to logical die 3
[    0.471795] smp: Brought up 1 node, 4 CPUs
[    0.472195] smpboot: Max logical packages: 4
[    0.472670] smpboot: Total of 4 processors activated (25579.89 BogoMIPS)
[    0.474099] devtmpfs: initialized
[    0.476925] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    0.477675] futex hash table entries: 1024 (order: 5, 196608 bytes, linear)
[    0.478852] PM: RTC time: 09:47:17, date: 2020-11-23
[    0.479693] NET: Registered protocol family 16
[    0.481116] audit: initializing netlink subsys (disabled)
[    0.481719] audit: type=2000 audit(1606124837.326:1): state=initialized audit_enabled=0 res=1
[    0.482752] thermal_sys: Registered thermal governor 'step_wise'
[    0.482757] thermal_sys: Registered thermal governor 'user_space'
[    0.483518] cpuidle: using governor menu
[    0.484738] ACPI: bus type PCI registered
[    0.485493] PCI: Using configuration type 1 for base access
[    0.512937] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.513847] cryptomgr_test (35) used greatest stack depth: 15104 bytes left
[    0.519990] ACPI: Added _OSI(Module Device)
[    0.520418] ACPI: Added _OSI(Processor Device)
[    0.520667] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.521107] ACPI: Added _OSI(Processor Aggregator Device)
[    0.521664] ACPI: Added _OSI(Linux-Dell-Video)
[    0.522090] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[    0.522664] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[    0.526888] ACPI: 2 ACPI AML tables successfully acquired and loaded
[    0.536947] ACPI: Interpreter enabled
[    0.537465] ACPI: (supports S0 S3 S4 S5)
[    0.537660] ACPI: Using IOAPIC for interrupt routing
[    0.538183] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.540484] ACPI: Enabled 16 GPEs in block 00 to 0F
[    0.571959] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    0.572585] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI HPX-Type3]
[    0.572736] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[    0.574905] PCI host bridge to bus 0000:00
[    0.575297] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.575661] pci_bus 0000:00: root bus resource [io  0x0d00-0xadff window]
[    0.576266] pci_bus 0000:00: root bus resource [io  0xae0f-0xaeff window]
[    0.576661] pci_bus 0000:00: root bus resource [io  0xaf20-0xafdf window]
[    0.577662] pci_bus 0000:00: root bus resource [io  0xafe4-0xffff window]
[    0.578282] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.578661] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfebfffff window]
[    0.579664] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.580220] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
[    0.582483] pci 0000:00:01.0: [8086:7000] type 00 class 0x060100
[    0.584608] pci 0000:00:01.1: [8086:7010] type 00 class 0x010180
[    0.587034] pci 0000:00:01.1: reg 0x20: [io  0xc040-0xc04f]
[    0.588432] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
[    0.588661] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
[    0.589661] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
[    0.590292] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
[    0.592010] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000
[    0.592937] pci 0000:00:01.3: quirk: [io  0xb000-0xb03f] claimed by PIIX4 ACPI
[    0.593587] pci 0000:00:01.3: quirk: [io  0xb100-0xb10f] claimed by PIIX4 SMB
[    0.595518] pci 0000:00:02.0: [1013:00b8] type 00 class 0x030000
[    0.596661] pci 0000:00:02.0: reg 0x10: [mem 0xfc000000-0xfdffffff pref]
[    0.598440] pci 0000:00:02.0: reg 0x14: [mem 0xfebf0000-0xfebf0fff]
[    0.603038] pci 0000:00:02.0: reg 0x30: [mem 0xfebe0000-0xfebeffff pref]
[    0.604943] pci 0000:00:03.0: [8086:100e] type 00 class 0x020000
[    0.606009] pci 0000:00:03.0: reg 0x10: [mem 0xfebc0000-0xfebdffff]
[    0.607357] pci 0000:00:03.0: reg 0x14: [io  0xc000-0xc03f]
[    0.611657] pci 0000:00:03.0: reg 0x30: [mem 0xfeb80000-0xfebbffff pref]
[    0.615890] ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
[    0.616780] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
[    0.617678] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
[    0.618547] ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
[    0.618851] ACPI: PCI Interrupt Link [LNKS] (IRQs *9)
[    0.623002] iommu: Default domain type: Translated 
[    0.623712] pci 0000:00:02.0: vgaarb: setting as boot VGA device
[    0.624269] pci 0000:00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[    0.624672] pci 0000:00:02.0: vgaarb: bridge control possible
[    0.625665] vgaarb: loaded
[    0.626231] SCSI subsystem initialized
[    0.626930] ACPI: bus type USB registered
[    0.627744] usbcore: registered new interface driver usbfs
[    0.628337] usbcore: registered new interface driver hub
[    0.628785] usbcore: registered new device driver usb
[    0.629365] pps_core: LinuxPPS API ver. 1 registered
[    0.629659] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.630674] PTP clock support registered
[    0.631824] Advanced Linux Sound Architecture Driver Initialized.
[    0.633396] NetLabel: Initializing
[    0.633660] NetLabel:  domain hash size = 128
[    0.634064] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
[    0.634705] NetLabel:  unlabeled traffic allowed by default
[    0.635359] PCI: Using ACPI for IRQ routing
[    0.635860] hpet: 3 channels of 0 reserved for per-cpu timers
[    0.636667] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[    0.637127] hpet0: 3 comparators, 64-bit 100.000000 MHz counter
[    0.640800] clocksource: Switched to clocksource tsc-early
[    0.888845] VFS: Disk quotas dquot_6.6.0
[    0.889303] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.890178] pnp: PnP ACPI init
[    0.893715] pnp: PnP ACPI: found 6 devices
[    0.919486] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[    0.920623] NET: Registered protocol family 2
[    0.921743] tcp_listen_portaddr_hash hash table entries: 2048 (order: 6, 262144 bytes, linear)
[    0.922730] TCP established hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.923611] TCP bind hash table entries: 32768 (order: 9, 3932160 bytes, linear)
[    0.926324] TCP: Hash tables configured (established 32768 bind 32768)
[    0.927113] UDP hash table entries: 2048 (order: 7, 524288 bytes, linear)
[    0.928005] UDP-Lite hash table entries: 2048 (order: 7, 524288 bytes, linear)
[    0.929141] NET: Registered protocol family 1
[    0.930453] RPC: Registered named UNIX socket transport module.
[    0.931065] RPC: Registered udp transport module.
[    0.931590] RPC: Registered tcp transport module.
[    0.932071] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.933210] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7 window]
[    0.933820] pci_bus 0000:00: resource 5 [io  0x0d00-0xadff window]
[    0.934394] pci_bus 0000:00: resource 6 [io  0xae0f-0xaeff window]
[    0.934961] pci_bus 0000:00: resource 7 [io  0xaf20-0xafdf window]
[    0.935559] pci_bus 0000:00: resource 8 [io  0xafe4-0xffff window]
[    0.936114] pci_bus 0000:00: resource 9 [mem 0x000a0000-0x000bffff window]
[    0.936773] pci_bus 0000:00: resource 10 [mem 0xc0000000-0xfebfffff window]
[    0.937921] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[    0.938487] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[    0.939032] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[    0.939661] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[    0.940438] PCI: CLS 0 bytes, default 64
[    0.941050] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    0.941676] software IO TLB: mapped [mem 0xbbffe000-0xbfffe000] (64MB)
[    0.946671] check: Scanning for low memory corruption every 60 seconds
[    0.950094] Initialise system trusted keyrings
[    0.951024] workingset: timestamp_bits=56 max_order=20 bucket_order=0
[    0.978755] NFS: Registering the id_resolver key type
[    0.979236] Key type id_resolver registered
[    0.979647] Key type id_legacy registered
[    0.990223] Key type asymmetric registered
[    0.990669] Asymmetric key parser 'x509' registered
[    0.991155] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[    0.991849] io scheduler mq-deadline registered
[    0.992267] io scheduler kyber registered
[    0.994101] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[    0.995081] ===================================================
[    0.995619] Dept: Circular dependency has been detected.
[    0.995816] 5.9.0+ #8 Tainted: G        W        
[    0.996493] ---------------------------------------------------
[    0.996493] summary
[    0.996493] ---------------------------------------------------
[    0.996493] *** AA DEADLOCK ***
[    0.996493] 
[    0.996493] context A
[    0.996493]     [S] __mutex_lock(&dev->mutex:0)
[    0.996493]     [W] __mutex_lock(&dev->mutex:0)
[    0.996493]     [E] __mutex_unlock(&dev->mutex:0)
[    0.996493] 
[    0.996493] [S]: start of the event context
[    0.996493] [W]: the wait blocked
[    0.996493] [E]: the event not reachable
[    0.996493] ---------------------------------------------------
[    0.996493] context A's detail
[    0.996493] ---------------------------------------------------
[    0.996493] context A
[    0.996493]     [S] __mutex_lock(&dev->mutex:0)
[    0.996493]     [W] __mutex_lock(&dev->mutex:0)
[    0.996493]     [E] __mutex_unlock(&dev->mutex:0)
[    0.996493] 
[    0.996493] [S] __mutex_lock(&dev->mutex:0):
[    0.996493] [<ffffffff8168a0f8>] device_driver_attach+0x18/0x50
[    0.996493] stacktrace:
[    0.996493]       __mutex_lock+0x6b5/0x8f0
[    0.996493]       device_driver_attach+0x18/0x50
[    0.996493]       __driver_attach+0x82/0xc0
[    0.996493]       bus_for_each_dev+0x57/0x90
[    0.996493]       bus_add_driver+0x175/0x1f0
[    0.996493]       driver_register+0x56/0xe0
[    0.996493]       do_one_initcall+0x62/0x20f
[    0.996493]       kernel_init_freeable+0x22e/0x26a
[    0.996493]       kernel_init+0x5/0x110
[    0.996493]       ret_from_fork+0x22/0x30
[    0.996493] 
[    0.996493] [W] __mutex_lock(&dev->mutex:0):
[    0.996493] [<ffffffff816841f2>] device_del+0x32/0x3d0
[    0.996493] stacktrace:
[    0.996493]       dept_wait_ecxt_enter+0x130/0x2b0
[    0.996493]       __mutex_lock+0x6b5/0x8f0
[    0.996493]       device_del+0x32/0x3d0
[    0.996493]       device_unregister+0x9/0x20
[    0.996493]       wakeup_source_unregister+0x20/0x30
[    0.996493]       device_wakeup_enable+0x93/0xd0
[    0.996493]       acpi_button_add+0x3a1/0x480
[    0.996493]       acpi_device_probe+0x40/0x100
[    0.996493]       really_probe+0x1b7/0x380
[    0.996493]       driver_probe_device+0x4a/0xa0
[    0.996493]       device_driver_attach+0x4a/0x50
[    0.996493]       __driver_attach+0x82/0xc0
[    0.996493]       bus_for_each_dev+0x57/0x90
[    0.996493]       bus_add_driver+0x175/0x1f0
[    0.996493]       driver_register+0x56/0xe0
[    0.996493]       do_one_initcall+0x62/0x20f
[    0.996493] 
[    0.996493] [E] __mutex_unlock(&dev->mutex:0):
[    0.996493] (N/A)
[    0.996493] ---------------------------------------------------
[    0.996493] information that might be helpful
[    0.996493] ---------------------------------------------------
[    0.996493] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.9.0+ #8
[    0.996493] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    0.996493] Call Trace:
[    0.996493]  dump_stack+0x77/0x9b
[    0.996493]  print_circle+0x431/0x670
[    0.996493]  ? device_driver_attach+0x18/0x50
[    0.996493]  ? print_circle+0x670/0x670
[    0.996493]  ? device_del+0x32/0x3d0
[    0.996493]  cb_check_dl+0x45/0x60
[    0.996493]  bfs+0x6c/0x190
[    0.996493]  ? device_del+0x32/0x3d0
[    0.996493]  add_dep+0x6b/0x80
[    0.996493]  add_wait+0x2fb/0x350
[    0.996493]  ? device_del+0x32/0x3d0
[    0.996493]  dept_wait_ecxt_enter+0x130/0x2b0
[    0.996493]  __mutex_lock+0x6b5/0x8f0
[    0.996493]  ? device_del+0x32/0x3d0
[    0.996493]  ? _raw_spin_unlock_irqrestore+0x39/0x40
[    0.996493]  ? _raw_spin_unlock_irqrestore+0x39/0x40
[    0.996493]  ? device_del+0x32/0x3d0
[    0.996493]  device_del+0x32/0x3d0
[    0.996493]  ? del_timer_sync+0x74/0xa0
[    0.996493]  device_unregister+0x9/0x20
[    0.996493]  wakeup_source_unregister+0x20/0x30
[    0.996493]  device_wakeup_enable+0x93/0xd0
[    0.996493]  acpi_button_add+0x3a1/0x480
[    0.996493]  acpi_device_probe+0x40/0x100
[    0.996493]  really_probe+0x1b7/0x380
[    0.996493]  ? rdinit_setup+0x26/0x26
[    0.996493]  driver_probe_device+0x4a/0xa0
[    0.996493]  device_driver_attach+0x4a/0x50
[    0.996493]  __driver_attach+0x82/0xc0
[    0.996493]  ? device_driver_attach+0x50/0x50
[    0.996493]  bus_for_each_dev+0x57/0x90
[    0.996493]  bus_add_driver+0x175/0x1f0
[    0.996493]  driver_register+0x56/0xe0
[    0.996493]  ? acpi_ac_init+0x8e/0x8e
[    0.996493]  do_one_initcall+0x62/0x20f
[    0.996493]  kernel_init_freeable+0x22e/0x26a
[    0.996493]  ? rest_init+0x120/0x120
[    0.996493]  kernel_init+0x5/0x110
[    0.996493]  ret_from_fork+0x22/0x30
[    1.033630] ACPI: Power Button [PWRF]
[    1.034325] ===================================================
[    1.034425] Dept: Circular dependency has been detected.
[    1.034425] 5.9.0+ #8 Tainted: G        W        
[    1.034425] ---------------------------------------------------
[    1.034425] summary
[    1.034425] ---------------------------------------------------
[    1.034425] *** DEADLOCK ***
[    1.034425] 
[    1.034425] context A
[    1.034425]     [S] __mutex_lock(&dev->mutex:0)
[    1.034425]     [W] __mutex_lock(cpu_add_remove_lock:0)
[    1.034425]     [E] __mutex_unlock(&dev->mutex:0)
[    1.034425] 
[    1.034425] context B
[    1.034425]     [S] __mutex_lock(pmus_lock:0)
[    1.034425]     [W] __mutex_lock(&dev->mutex:0)
[    1.034425]     [E] __mutex_unlock(pmus_lock:0)
[    1.034425] 
[    1.034425] context C
[    1.034425]     [S] __mutex_lock(cpu_add_remove_lock:0)
[    1.034425]     [W] __mutex_lock(pmus_lock:0)
[    1.034425]     [E] __mutex_unlock(cpu_add_remove_lock:0)
[    1.034425] 
[    1.034425] [S]: start of the event context
[    1.034425] [W]: the wait blocked
[    1.034425] [E]: the event not reachable
[    1.034425] ---------------------------------------------------
[    1.034425] context A's detail
[    1.034425] ---------------------------------------------------
[    1.034425] context A
[    1.034425]     [S] __mutex_lock(&dev->mutex:0)
[    1.034425]     [W] __mutex_lock(cpu_add_remove_lock:0)
[    1.034425]     [E] __mutex_unlock(&dev->mutex:0)
[    1.034425] 
[    1.034425] [S] __mutex_lock(&dev->mutex:0):
[    1.034425] [<ffffffff8168a0f8>] device_driver_attach+0x18/0x50
[    1.034425] stacktrace:
[    1.034425]       __mutex_lock+0x6b5/0x8f0
[    1.034425]       device_driver_attach+0x18/0x50
[    1.034425]       __driver_attach+0x82/0xc0
[    1.034425]       bus_for_each_dev+0x57/0x90
[    1.034425]       bus_add_driver+0x175/0x1f0
[    1.034425]       driver_register+0x56/0xe0
[    1.034425]       acpi_processor_driver_init+0x1c/0xad
[    1.034425]       do_one_initcall+0x62/0x20f
[    1.034425]       kernel_init_freeable+0x22e/0x26a
[    1.034425]       kernel_init+0x5/0x110
[    1.034425]       ret_from_fork+0x22/0x30
[    1.034425] 
[    1.034425] [W] __mutex_lock(cpu_add_remove_lock:0):
[    1.034425] [<ffffffff8106635e>] cpu_hotplug_disable+0xe/0x30
[    1.034425] stacktrace:
[    1.034425]       dept_wait_ecxt_enter+0x130/0x2b0
[    1.034425]       __mutex_lock+0x6b5/0x8f0
[    1.034425]       cpu_hotplug_disable+0xe/0x30
[    1.034425]       acpi_processor_start+0x25/0x40
[    1.034425]       really_probe+0x1b7/0x380
[    1.034425]       driver_probe_device+0x4a/0xa0
[    1.034425]       device_driver_attach+0x4a/0x50
[    1.034425]       __driver_attach+0x82/0xc0
[    1.034425]       bus_for_each_dev+0x57/0x90
[    1.034425]       bus_add_driver+0x175/0x1f0
[    1.034425]       driver_register+0x56/0xe0
[    1.034425]       acpi_processor_driver_init+0x1c/0xad
[    1.034425]       do_one_initcall+0x62/0x20f
[    1.034425]       kernel_init_freeable+0x22e/0x26a
[    1.034425]       kernel_init+0x5/0x110
[    1.034425]       ret_from_fork+0x22/0x30
[    1.034425] 
[    1.034425] [E] __mutex_unlock(&dev->mutex:0):
[    1.034425] (N/A)
[    1.034425] ---------------------------------------------------
[    1.034425] context B's detail
[    1.034425] ---------------------------------------------------
[    1.034425] context B
[    1.034425]     [S] __mutex_lock(pmus_lock:0)
[    1.034425]     [W] __mutex_lock(&dev->mutex:0)
[    1.034425]     [E] __mutex_unlock(pmus_lock:0)
[    1.034425] 
[    1.034425] [S] __mutex_lock(pmus_lock:0):
[    1.034425] [<ffffffff82cad537>] perf_event_sysfs_init+0x12/0x86
[    1.034425] stacktrace:
[    1.034425]       __mutex_lock+0x6b5/0x8f0
[    1.034425]       perf_event_sysfs_init+0x12/0x86
[    1.034425]       do_one_initcall+0x62/0x20f
[    1.034425]       kernel_init_freeable+0x22e/0x26a
[    1.034425]       kernel_init+0x5/0x110
[    1.034425]       ret_from_fork+0x22/0x30
[    1.034425] 
[    1.034425] [W] __mutex_lock(&dev->mutex:0):
[    1.034425] [<ffffffff81689974>] __device_attach+0x24/0x120
[    1.034425] stacktrace:
[    1.034425]       __mutex_lock+0x6b5/0x8f0
[    1.034425]       __device_attach+0x24/0x120
[    1.034425]       bus_probe_device+0x97/0xb0
[    1.034425]       device_add+0x49a/0x810
[    1.034425]       pmu_dev_alloc+0x83/0xf0
[    1.034425]       perf_event_sysfs_init+0x50/0x86
[    1.034425]       do_one_initcall+0x62/0x20f
[    1.034425]       kernel_init_freeable+0x22e/0x26a
[    1.034425]       kernel_init+0x5/0x110
[    1.034425]       ret_from_fork+0x22/0x30
[    1.034425] 
[    1.034425] [E] __mutex_unlock(pmus_lock:0):
[    1.034425] (N/A)
[    1.034425] ---------------------------------------------------
[    1.034425] context C's detail
[    1.034425] ---------------------------------------------------
[    1.034425] context C
[    1.034425]     [S] __mutex_lock(cpu_add_remove_lock:0)
[    1.034425]     [W] __mutex_lock(pmus_lock:0)
[    1.034425]     [E] __mutex_unlock(cpu_add_remove_lock:0)
[    1.034425] 
[    1.034425] [S] __mutex_lock(cpu_add_remove_lock:0):
[    1.034425] [<ffffffff81067f97>] cpu_up+0x27/0xa0
[    1.034425] stacktrace:
[    1.034425]       __mutex_lock+0x6b5/0x8f0
[    1.034425]       cpu_up+0x27/0xa0
[    1.034425]       bringup_nonboot_cpus+0x4a/0x60
[    1.034425]       smp_init+0x21/0x6f
[    1.034425]       kernel_init_freeable+0x13f/0x26a
[    1.034425]       kernel_init+0x5/0x110
[    1.034425]       ret_from_fork+0x22/0x30
[    1.034425] 
[    1.034425] [W] __mutex_lock(pmus_lock:0):
[    1.034425] [<ffffffff81176515>] perf_event_init_cpu+0x55/0x100
[    1.034425] stacktrace:
[    1.034425]       __mutex_lock+0x6b5/0x8f0
[    1.034425]       perf_event_init_cpu+0x55/0x100
[    1.034425]       cpuhp_invoke_callback+0xaf/0x610
[    1.034425]       _cpu_up+0xa2/0x130
[    1.034425]       cpu_up+0x61/0xa0
[    1.034425]       bringup_nonboot_cpus+0x4a/0x60
[    1.034425]       smp_init+0x21/0x6f
[    1.034425]       kernel_init_freeable+0x13f/0x26a
[    1.034425]       kernel_init+0x5/0x110
[    1.034425]       ret_from_fork+0x22/0x30
[    1.034425] 
[    1.034425] [E] __mutex_unlock(cpu_add_remove_lock:0):
[    1.034425] [<ffffffff81088ea7>] __kthread_parkme+0x27/0x60
[    1.034425] ---------------------------------------------------
[    1.034425] information that might be helpful
[    1.034425] ---------------------------------------------------
[    1.034425] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.9.0+ #8
[    1.034425] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    1.034425] Call Trace:
[    1.034425]  dump_stack+0x77/0x9b
[    1.034425]  ? __kthread_parkme+0x27/0x60
[    1.034425]  print_circle+0x431/0x670
[    1.034425]  ? cpu_up+0x27/0xa0
[    1.034425]  ? print_circle+0x670/0x670
[    1.034425]  cb_check_dl+0x58/0x60
[    1.034425]  bfs+0xd1/0x190
[    1.034425]  ? cpu_hotplug_disable+0xe/0x30
[    1.034425]  add_dep+0x6b/0x80
[    1.034425]  add_wait+0x2fb/0x350
[    1.034425]  ? cpu_hotplug_disable+0xe/0x30
[    1.034425]  dept_wait_ecxt_enter+0x130/0x2b0
[    1.034425]  __mutex_lock+0x6b5/0x8f0
[    1.034425]  ? kernfs_add_one+0x12e/0x140
[    1.034425]  ? __mutex_unlock_slowpath+0x45/0x290
[    1.034425]  ? cpu_hotplug_disable+0xe/0x30
[    1.034425]  ? kernfs_add_one+0x12e/0x140
[    1.034425]  ? cpu_hotplug_disable+0xe/0x30
[    1.034425]  cpu_hotplug_disable+0xe/0x30
[    1.034425]  acpi_processor_start+0x25/0x40
[    1.034425]  really_probe+0x1b7/0x380
[    1.034425]  ? rdinit_setup+0x26/0x26
[    1.034425]  driver_probe_device+0x4a/0xa0
[    1.034425]  device_driver_attach+0x4a/0x50
[    1.034425]  __driver_attach+0x82/0xc0
[    1.034425]  ? device_driver_attach+0x50/0x50
[    1.034425]  bus_for_each_dev+0x57/0x90
[    1.034425]  bus_add_driver+0x175/0x1f0
[    1.034425]  driver_register+0x56/0xe0
[    1.034425]  ? acpi_video_init+0x7d/0x7d
[    1.034425]  acpi_processor_driver_init+0x1c/0xad
[    1.034425]  do_one_initcall+0x62/0x20f
[    1.034425]  kernel_init_freeable+0x22e/0x26a
[    1.034425]  ? rest_init+0x120/0x120
[    1.034425]  kernel_init+0x5/0x110
[    1.034425]  ret_from_fork+0x22/0x30
[    1.102213] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    1.103016] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[    1.106696] Non-volatile memory driver v1.3
[    1.107354] Linux agpgart interface v0.103
[    1.133716] loop: module loaded
[    1.140278] scsi host0: ata_piix
[    1.141849] scsi host1: ata_piix
[    1.142560] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc040 irq 14
[    1.143177] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc048 irq 15
[    1.144821] e100: Intel(R) PRO/100 Network Driver
[    1.145263] e100: Copyright(c) 1999-2006 Intel Corporation
[    1.145847] e1000: Intel(R) PRO/1000 Network Driver
[    1.146302] e1000: Copyright (c) 1999-2006 Intel Corporation.
[    1.149639] PCI Interrupt Link [LNKC] enabled at IRQ 11
[    1.272470] ata1.00: ATA-7: QEMU HARDDISK, 2.0.0, max UDMA/100
[    1.273150] ata1.00: 16777216 sectors, multi 16: LBA48 
[    1.273808] ata2.00: ATAPI: QEMU DVD-ROM, 2.0.0, max UDMA/100
[    1.275445] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    0    PQ: 0 ANSI: 5
[    1.277740] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    1.277937] sd 0:0:0:0: [sda] 16777216 512-byte logical blocks: (8.59 GB/8.00 GiB)
[    1.279176] sd 0:0:0:0: [sda] Write Protect is off
[    1.280032] scsi 1:0:0:0: CD-ROM            QEMU     QEMU DVD-ROM     2.0. PQ: 0 ANSI: 5
[    1.281733] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.287285]  sda: sda1 sda2 < sda5 >
[    1.290588] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.400427] sr 1:0:0:0: [sr0] scsi3-mmc drive: 4x/4x cd/rw xa/form2 tray
[    1.401118] cdrom: Uniform CD-ROM driver Revision: 3.20
[    1.406553] sr 1:0:0:0: Attached scsi generic sg1 type 5
[    1.456822] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
[    1.457472] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
[    1.458256] e1000e: Intel(R) PRO/1000 Network Driver
[    1.458741] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[    1.459375] sky2: driver version 1.30
[    1.460732] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.461363] ehci-pci: EHCI PCI platform driver
[    1.461872] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    1.462464] ohci-pci: OHCI PCI platform driver
[    1.462968] uhci_hcd: USB Universal Host Controller Interface driver
[    1.463844] usbcore: registered new interface driver usblp
[    1.464433] usbcore: registered new interface driver usb-storage
[    1.465370] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
[    1.467044] serio: i8042 KBD port at 0x60,0x64 irq 1
[    1.467535] serio: i8042 AUX port at 0x60,0x64 irq 12
[    1.469937] rtc_cmos 00:00: RTC can wake from S4
[    1.471005] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1
[    1.472993] rtc_cmos 00:00: registered as rtc0
[    1.473839] rtc_cmos 00:00: alarms up to one day, 114 bytes nvram, hpet irqs
[    1.475466] device-mapper: ioctl: 4.42.0-ioctl (2020-02-27) initialised: dm-devel@redhat.com
[    1.476279] intel_pstate: CPU model not supported
[    1.476814] hid: raw HID events driver (C) Jiri Kosina
[    1.478926] usbcore: registered new interface driver usbhid
[    1.479450] usbhid: USB HID core driver
[    1.482086] Initializing XFRM netlink socket
[    1.484824] NET: Registered protocol family 10
[    1.486654] Segment Routing with IPv6
[    1.487535] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[    1.489292] NET: Registered protocol family 17
[    1.489856] Key type dns_resolver registered
[    1.492066] IPI shorthand broadcast: enabled
[    1.492516] sched_clock: Marking stable (1420822386, 71074958)->(1569263593, -77366249)
[    1.493921] registered taskstats version 1
[    1.494323] Loading compiled-in X.509 certificates
[    1.495557] PM:   Magic number: 12:947:778
[    1.496041] printk: console [netcon0] enabled
[    1.496457] netconsole: network logging started
[    1.498082] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    1.499579] kworker/u8:1 (80) used greatest stack depth: 13616 bytes left
[    1.500936] ===================================================
[    1.501536] Dept: Circular dependency has been detected.
[    1.502355] 5.9.0+ #8 Tainted: G        W        
[    1.503244] ---------------------------------------------------
[    1.504366] summary
[    1.504816] ---------------------------------------------------
[    1.505938] *** AA DEADLOCK ***
[    1.505938] 
[    1.506808] context A
[    1.507213]     [S] (unknown)(&larval->completion:0)
[    1.508142]     [W] wait_for_completion_killable(&larval->completion:0)
[    1.509325]     [E] complete_all(&larval->completion:0)
[    1.509833] 
[    1.509975] [S]: start of the event context
[    1.510342] [W]: the wait blocked
[    1.510637] [E]: the event not reachable
[    1.511061] ---------------------------------------------------
[    1.511580] context A's detail
[    1.511929] ---------------------------------------------------
[    1.512446] context A
[    1.512651]     [S] (unknown)(&larval->completion:0)
[    1.513149]     [W] wait_for_completion_killable(&larval->completion:0)
[    1.513781]     [E] complete_all(&larval->completion:0)
[    1.514244] 
[    1.514382] [S] (unknown)(&larval->completion:0):
[    1.514870] (N/A)
[    1.515051] 
[    1.515214] [W] wait_for_completion_killable(&larval->completion:0):
[    1.515836] [<ffffffff813aad5a>] crypto_wait_for_test+0x3a/0x70
[    1.516359] stacktrace:
[    1.516581]       wait_for_completion_killable+0x34/0x150
[    1.517139]       crypto_wait_for_test+0x3a/0x70
[    1.517545]       crypto_register_instance+0xe8/0x110
[    1.518067]       pkcs1pad_create+0x1d9/0x250
[    1.518452]       cryptomgr_probe+0x33/0x80
[    1.518880]       kthread+0x144/0x180
[    1.519204]       ret_from_fork+0x22/0x30
[    1.519557] 
[    1.519735] [E] complete_all(&larval->completion:0):
[    1.520208] [<ffffffff813b0d9c>] cryptomgr_probe+0x5c/0x80
[    1.520702] stacktrace:
[    1.520967]       complete_all+0x28/0x60
[    1.521339]       cryptomgr_probe+0x5c/0x80
[    1.521824]       kthread+0x144/0x180
[    1.522154]       ret_from_fork+0x22/0x30
[    1.522509] ---------------------------------------------------
[    1.523085] information that might be helpful
[    1.523471] ---------------------------------------------------
[    1.524107] CPU: 3 PID: 82 Comm: cryptomgr_probe Tainted: G        W         5.9.0+ #8
[    1.524873] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    1.525876] Call Trace:
[    1.526104]  dump_stack+0x77/0x9b
[    1.526401]  ? cryptomgr_probe+0x5c/0x80
[    1.526859]  print_circle+0x431/0x670
[    1.527190]  ? stack_trace_save+0x36/0x40
[    1.527547]  ? print_circle+0x670/0x670
[    1.528001]  cb_check_dl+0x45/0x60
[    1.528306]  bfs+0x6c/0x190
[    1.528559]  add_dep+0x6b/0x80
[    1.528916]  dept_event+0x4a9/0x580
[    1.529232]  ? cryptomgr_probe+0x5c/0x80
[    1.529603]  ? crypto_alg_put+0x40/0x40
[    1.530007]  complete_all+0x28/0x60
[    1.530328]  cryptomgr_probe+0x5c/0x80
[    1.530662]  kthread+0x144/0x180
[    1.531007]  ? kthread_destroy_worker+0x40/0x40
[    1.531442]  ret_from_fork+0x22/0x30
[    1.532135] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    1.532949] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    1.533850] cfg80211: failed to load regulatory.db
[    1.533886] ALSA device list:
[    1.534700]   No soundcards found.
[    1.979978] tsc: Refined TSC clocksource calibration: 3197.737 MHz
[    1.981303] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2e17f0aea14, max_idle_ns: 440795355206 ns
[    1.983501] clocksource: Switched to clocksource tsc
[    2.094732] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
[    2.097508] md: Waiting for all devices to be available before autodetect
[    2.098771] md: If you don't use raid, use raid=noautodetect
[    2.099833] md: Autodetecting RAID arrays.
[    2.100591] md: autorun ...
[    2.101163] md: ... autorun DONE.
[    2.108775] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
[    2.109605] EXT4-fs (sda1): write access will be enabled during recovery
[    2.143768] random: fast init done
[    2.156518] EXT4-fs (sda1): recovery complete
[    2.161843] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[    2.162971] VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
[    2.166192] devtmpfs: mounted
[    2.168706] Freeing unused kernel image (initmem) memory: 1184K
[    2.169965] Write protecting the kernel read-only data: 22528k
[    2.175246] Freeing unused kernel image (text/rodata gap) memory: 2040K
[    2.178542] Freeing unused kernel image (rodata/data gap) memory: 1584K
[    2.180077] Run /sbin/init as init process
SELinux:  Could not open policy file <= /etc/selinux/targeted/policy/policy.33:  No such file or directory
[    2.327709] random: init: uninitialized urandom read (12 bytes read)
[    2.356127] hostname (87) used greatest stack depth: 13000 bytes left
[    2.393522] startpar-upstar (93) used greatest stack depth: 12904 bytes left
[    2.402851] ===================================================
[    2.403419] Dept: Circular dependency has been detected.
[    2.404088] 5.9.0+ #8 Tainted: G        W        
[    2.404677] ---------------------------------------------------
[    2.405532] summary
[    2.405887] ---------------------------------------------------
[    2.406745] *** AA DEADLOCK ***
[    2.406745] 
[    2.407383] context A
[    2.407717]     [S] __mutex_lock(&tty->legacy_mutex:0)
[    2.408466]     [W] __mutex_lock(&tty->legacy_mutex:0)
[    2.409208]     [E] __mutex_unlock(&tty->legacy_mutex:0)
[    2.409746] 
[    2.409894] [S]: start of the event context
[    2.410270] [W]: the wait blocked
[    2.410570] [E]: the event not reachable
[    2.411032] ---------------------------------------------------
[    2.411562] context A's detail
[    2.411892] ---------------------------------------------------
[    2.412421] context A
[    2.412774]     [S] __mutex_lock(&tty->legacy_mutex:0)
[    2.413274]     [W] __mutex_lock(&tty->legacy_mutex:0)
[    2.413791]     [E] __mutex_unlock(&tty->legacy_mutex:0)
[    2.414320] 
[    2.414487] [S] __mutex_lock(&tty->legacy_mutex:0):
[    2.415030] [<ffffffff814c468d>] tty_release+0x4d/0x500
[    2.415498] stacktrace:
[    2.415791]       __mutex_lock+0x6b5/0x8f0
[    2.416168]       tty_release+0x4d/0x500
[    2.416526]       __fput+0x97/0x240
[    2.416913]       task_work_run+0x60/0x90
[    2.417281]       exit_to_user_mode_prepare+0x146/0x150
[    2.417797]       syscall_exit_to_user_mode+0x2d/0x190
[    2.418262]       entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    2.418817] 
[    2.418963] [W] __mutex_lock(&tty->legacy_mutex:0):
[    2.419413] [<ffffffff814c4b9c>] __tty_hangup+0x5c/0x2b0
[    2.419941] stacktrace:
[    2.420166]       dept_wait_ecxt_enter+0x130/0x2b0
[    2.420598]       __mutex_lock+0x6b5/0x8f0
[    2.421052]       __tty_hangup+0x5c/0x2b0
[    2.421425]       tty_release+0xf3/0x500
[    2.421889]       __fput+0x97/0x240
[    2.422205]       task_work_run+0x60/0x90
[    2.422565]       exit_to_user_mode_prepare+0x146/0x150
[    2.423066]       syscall_exit_to_user_mode+0x2d/0x190
[    2.423524]       entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    2.424124] 
[    2.424264] [E] __mutex_unlock(&tty->legacy_mutex:0):
[    2.424774] (N/A)
[    2.424961] ---------------------------------------------------
[    2.425764] information that might be helpful
[    2.426160] ---------------------------------------------------
[    2.426737] CPU: 0 PID: 1 Comm: init Tainted: G        W         5.9.0+ #8
[    2.427355] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    2.428126] Call Trace:
[    2.428354]  dump_stack+0x77/0x9b
[    2.428656]  print_circle+0x431/0x670
[    2.429049]  ? tty_release+0x4d/0x500
[    2.429382]  ? print_circle+0x670/0x670
[    2.429778]  ? __tty_hangup+0x5c/0x2b0
[    2.430135]  cb_check_dl+0x45/0x60
[    2.430491]  bfs+0x6c/0x190
[    2.430815]  ? __tty_hangup+0x5c/0x2b0
[    2.431158]  add_dep+0x6b/0x80
[    2.431472]  add_wait+0x2fb/0x350
[    2.431912]  ? __tty_hangup+0x5c/0x2b0
[    2.432252]  dept_wait_ecxt_enter+0x130/0x2b0
[    2.432645]  __mutex_lock+0x6b5/0x8f0
[    2.433076]  ? __tty_hangup+0x29/0x2b0
[    2.433422]  ? __tty_hangup+0x5c/0x2b0
[    2.433810]  ? find_held_lock+0x2d/0x90
[    2.434161]  ? __tty_hangup+0x54/0x2b0
[    2.434501]  ? to_pool+0x43/0x70
[    2.434847]  ? pop_ecxt+0x108/0x110
[    2.435164]  ? __tty_hangup+0x5c/0x2b0
[    2.435503]  __tty_hangup+0x5c/0x2b0
[    2.435900]  tty_release+0xf3/0x500
[    2.436216]  ? _raw_spin_unlock_irq+0x2d/0x30
[    2.436608]  __fput+0x97/0x240
[    2.436941]  task_work_run+0x60/0x90
[    2.437269]  exit_to_user_mode_prepare+0x146/0x150
[    2.437764]  syscall_exit_to_user_mode+0x2d/0x190
[    2.438202]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    2.438653] RIP: 0033:0x7fb31f764f60
[    2.439032] Code: 00 64 c7 00 0d 00 00 00 b8 ff ff ff ff eb 90 b8 ff ff ff ff eb 89 0f 1f 40 00 83 3d 1d 81 2d 00 00 75 10 b8 03 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe c1 01 00 48 89 04 24
[    2.440820] RSP: 002b:00007ffeaef17ad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[    2.441499] RAX: 0000000000000000 RBX: 0000561ce2919ca0 RCX: 00007fb31f764f60
[    2.442199] RDX: 0000561ce2919c70 RSI: 00007ffeaef17b10 RDI: 000000000000000a
[    2.442900] RBP: 0000561ce2919c70 R08: 0000000000000000 R09: 0000561ce2919d00
[    2.443567] R10: 00007fb31fa38c8c R11: 0000000000000246 R12: 0000561ce2919d00
[    2.444274] R13: 0000561ce2919c70 R14: 0000561ce2919ca0 R15: 00007ffeaef17d60
[    2.445912] init: plymouth-upstart-bridge main process (89) terminated with status 1
[    2.446781] init: plymouth-upstart-bridge main process ended, respawning
[    2.467366] init: plymouth-upstart-bridge main process (100) terminated with status 1
[    2.469260] init: plymouth-upstart-bridge main process ended, respawning
[    2.485527] init: ureadahead main process (92) terminated with status 5
[    2.486635] init: plymouth-upstart-bridge main process (104) terminated with status 1
[    2.486750] init: plymouth-upstart-bridge main process ended, respawning
[    2.492446] plymouthd (90) used greatest stack depth: 12744 bytes left
[    2.512003] ===================================================
[    2.512589] Dept: Circular dependency has been detected.
[    2.513102] 5.9.0+ #8 Tainted: G        W        
[    2.513538] ---------------------------------------------------
[    2.514408] summary
[    2.514611] ---------------------------------------------------
[    2.515243] *** AA DEADLOCK ***
[    2.515243] 
[    2.515678] context A
[    2.515898]     [S] __raw_spin_lock(&wq_head->lock:0)
[    2.516382]         <softirq interrupt>
[    2.516745]         [W] __raw_spin_lock_irqsave(&wq_head->lock:0)
[    2.517324]     [E] __raw_spin_unlock(&wq_head->lock:0)
[    2.517806] 
[    2.517950] [S]: start of the event context
[    2.518338] [W]: the wait blocked
[    2.518648] [E]: the event not reachable
[    2.519012] ---------------------------------------------------
[    2.519558] context A's detail
[    2.519844] ---------------------------------------------------
[    2.520388] context A
[    2.520601]     [S] __raw_spin_lock(&wq_head->lock:0)
[    2.521076]         <softirq interrupt>
[    2.521437]         [W] __raw_spin_lock_irqsave(&wq_head->lock:0)
[    2.521993]     [E] __raw_spin_unlock(&wq_head->lock:0)
[    2.522459] 
[    2.522600] softirq has been enabled:
[    2.522946] [<ffffffff81b4fd89>] _raw_spin_unlock_irqrestore+0x39/0x40
[    2.523523] 
[    2.523664] [S] __raw_spin_lock(&wq_head->lock:0):
[    2.524109] [<ffffffff819da779>] unix_dgram_peer_wake_disconnect+0x19/0x80
[    2.524742] stacktrace:
[    2.524975]       dept_wait_ecxt_enter+0x1cc/0x2b0
[    2.525411]       _raw_spin_lock+0x49/0x60
[    2.525789]       unix_dgram_peer_wake_disconnect+0x19/0x80
[    2.526292]       unix_release_sock+0x15c/0x330
[    2.526702]       unix_release+0x14/0x20
[    2.527067]       __sock_release+0x38/0xb0
[    2.527454]       sock_close+0xf/0x20
[    2.527795]       __fput+0x97/0x240
[    2.528117]       task_work_run+0x60/0x90
[    2.528479]       exit_to_user_mode_prepare+0x146/0x150
[    2.528961]       syscall_exit_to_user_mode+0x2d/0x190
[    2.529429]       entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    2.529931] 
[    2.530072] [W] __raw_spin_lock_irqsave(&wq_head->lock:0):
[    2.530574] [<ffffffff810d7f2a>] rcu_sync_func+0x2a/0xb0
[    2.531064] stacktrace:
[    2.531314]       _raw_spin_lock_irqsave+0x5a/0x70
[    2.531764]       rcu_sync_func+0x2a/0xb0
[    2.532134]       rcu_core+0x265/0x870
[    2.532473]       __do_softirq+0x16f/0x390
[    2.532852]       asm_call_irq_on_stack+0x12/0x20
[    2.533313]       do_softirq_own_stack+0x32/0x40
[    2.533742]       irq_exit_rcu+0xb0/0xc0
[    2.534105]       sysvec_apic_timer_interrupt+0x2c/0x80
[    2.534585]       asm_sysvec_apic_timer_interrupt+0x12/0x20
[    2.535097]       _raw_spin_unlock_irqrestore+0x3b/0x40
[    2.535565]       __set_cpus_allowed_ptr+0x7a/0x190
[    2.536020]       __kthread_create_on_node+0x161/0x1c0
[    2.536478]       kthread_create_on_node+0x32/0x40
[    2.536922]       kthread_create_on_cpu+0x22/0x90
[    2.537352]       __smpboot_create_thread.part.8+0x5e/0x110
[    2.537861]       smpboot_create_threads+0x61/0x90
[    2.538295] 
[    2.538436] [E] __raw_spin_unlock(&wq_head->lock:0):
[    2.538894] (N/A)
[    2.539072] ---------------------------------------------------
[    2.539598] information that might be helpful
[    2.540004] ---------------------------------------------------
[    2.540531] CPU: 3 PID: 98 Comm: status Tainted: G        W         5.9.0+ #8
[    2.541188] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[    2.541917] Call Trace:
[    2.542146]  dump_stack+0x77/0x9b
[    2.542448]  print_circle+0x431/0x670
[    2.542787]  ? unix_dgram_peer_wake_disconnect+0x19/0x80
[    2.543268]  ? print_circle+0x670/0x670
[    2.543627]  cb_check_dl+0x45/0x60
[    2.543963]  bfs+0x6c/0x190
[    2.544219]  add_iecxt+0x123/0x1d0
[    2.544529]  ? arch_stack_walk+0x7e/0xb0
[    2.544901]  ? unix_dgram_peer_wake_disconnect+0x19/0x80
[    2.545589]  add_ecxt+0x126/0x1a0
[    2.545907]  ? unix_dgram_peer_wake_disconnect+0x19/0x80
[    2.546386]  dept_wait_ecxt_enter+0x1cc/0x2b0
[    2.546789]  _raw_spin_lock+0x49/0x60
[    2.547140]  ? unix_dgram_peer_wake_disconnect+0x19/0x80
[    2.547617]  unix_dgram_peer_wake_disconnect+0x19/0x80
[    2.548095]  unix_release_sock+0x15c/0x330
[    2.548494]  unix_release+0x14/0x20
[    2.548824]  __sock_release+0x38/0xb0
[    2.549188]  sock_close+0xf/0x20
[    2.549488]  __fput+0x97/0x240
[    2.549776]  task_work_run+0x60/0x90
[    2.550109]  exit_to_user_mode_prepare+0x146/0x150
[    2.550541]  syscall_exit_to_user_mode+0x2d/0x190
[    2.550981]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    2.551436] RIP: 0033:0x7f397dbc3f60
[    2.551769] Code: 00 64 c7 00 0d 00 00 00 b8 ff ff ff ff eb 90 b8 ff ff ff ff eb 89 0f 1f 40 00 83 3d 1d 81 2d 00 00 75 10 b8 03 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe c1 01 00 48 89 04 24
[    2.553462] RSP: 002b:00007ffed2511978 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[    2.554149] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 00007f397dbc3f60
[    2.554801] RDX: 00007f397de96780 RSI: 0000000000000000 RDI: 0000000000000003
[    2.555441] RBP: 0000000000000000 R08: 0000000000000a00 R09: 0000563a8fc69ec0
[    2.556083] R10: 00007ffed2511740 R11: 0000000000000246 R12: 0000563a8dae973b
[    2.556716] R13: 00007ffed2511b40 R14: 0000000000000000 R15: 0000000000000000
[    2.595010] random: mountall: uninitialized urandom read (12 bytes read)
[    2.792977] apt-config (170) used greatest stack depth: 12728 bytes left
 * Starting Mount filesystems on boot^[[74G[ OK ]
 * Starting Populate /dev filesystem^[[74G[ OK ]
 * Stopping Populate /dev filesystem^[[74G[ OK ]
 * Starting Populate and link to /run filesystem^[[74G[ OK ]
 * Stopping Populate and link to /run filesystem^[[74G[ OK ]
 * Stopping Track if upstart is running in a container^[[74G[ OK ]
 * Starting Initialize or finalize resolvconf^[[74G[ OK ]
 * Starting set console keymap^[[74G[ OK ]
 * Starting Signal sysvinit that virtual filesystems are mounted^[[74G[ OK ]
 * Starting Signal sysvinit that virtual filesystems are mounted^[[74G[ OK ]
 * Starting Bridge udev events into upstart^[[74G[ OK ]
 * Starting Signal sysvinit that remote filesystems are mounted^[[74G[ OK ]
 * Starting device node and kernel event manager^[[74G[ OK ]
 * Stopping set console keymap^[[74G[ OK ]
 * Starting load modules from /etc/modules^[[74G[ OK ]
 * Starting cold plug devices^[[74G[ OK ]
 * Starting log initial device creation^[[74G[ OK ]
 * Stopping load modules from /etc/modules^[[74G[ OK ]
 * Starting Uncomplicated firewall^[[74G[ OK ]
 * Starting configure network device security^[[74G[ OK ]
 * Starting configure network device^[[74G[ OK ]
 * Starting configure network device security^[[74G[ OK ]
 * Starting configure network device security^[[74G[ OK ]
 * Starting configure network device^[[74G[ OK ]
 * Starting Mount network filesystems^[[74G[ OK ]
 * Starting Signal sysvinit that the rootfs is mounted^[[74G[ OK ]
 * Stopping Mount network filesystems^[[74G[ OK ]
 * Stopping cold plug devices^[[74G[ OK ]
 * Stopping log initial device creation^[[74G[ OK ]
 * Starting load fallback graphics devices^[[74G[ OK ]
 * Stopping load fallback graphics devices^[[74G[ OK ]
 * Starting set console font^[[74G[ OK ]
 * Starting set sysctls from /etc/sysctl.conf^[[74G[ OK ]
 * Stopping set sysctls from /etc/sysctl.conf^[[74G[ OK ]
 * Starting Clean /tmp directory^[[74G[ OK ]
 * Starting Bridge socket events into upstart^[[74G[ OK ]
 * Stopping Clean /tmp directory^[[74G[ OK ]
 * Starting Signal sysvinit that local filesystems are mounted^[[74G[ OK ]
 * Starting restore software rfkill state^[[74G[ OK ]
 * Starting configure network device security^[[74G[ OK ]
 * Stopping set console font^[[74G[ OK ]
 * Stopping restore software rfkill state^[[74G[ OK ]
 * Starting userspace bootsplash^[[74G[ OK ]
 * Starting flush early job output to logs^[[74G[ OK ]
 * Stopping Failsafe Boot Delay^[[74G[ OK ]
 * Stopping Mount filesystems on boot^[[74G[ OK ]
 * Starting System V initialisation compatibility^[[74G[ OK ]
 * Starting Send an event to indicate plymouth is up^[[74G[ OK ]
 * Starting configure network device^[[74G[ OK ]
 * Starting D-Bus system message bus^[[74G[ OK ]
 * Stopping userspace bootsplash^[[74G[ OK ]
 * Stopping flush early job output to logs^[[74G[ OK ]
 * Starting modem connection manager^[[74G[ OK ]
 * Starting configure network device security^[[74G[ OK ]
ModemManager[564]: <info>  ModemManager (version 1.0.0) starting...

 * Stopping Send an event to indicate plymouth is up^[[74G[ OK ]
 * Starting SystemD login management service^[[74G[ OK ]
 * Starting configure virtual network devices^[[74G[ OK ]
 * Starting bluetooth daemon^[[74G[ OK ]
 * Starting network connection manager^[[74G[ OK ]
 * Starting system logging daemon^[[74G[ OK ]
 * Setting up X socket directories...       ^[[80G \r^[[74G[ OK ]
 * Starting mDNS/DNS-SD daemon^[[74G[ OK ]
 * Starting Reload cups, upon starting avahi-daemon to make sure remote queues are populated^[[74G[ OK ]
 * Stopping System V initialisation compatibility^[[74G[ OK ]
 * Starting System V runlevel compatibility^[[74G[ OK ]
 * Starting Restore Sound Card State^[[74G[ OK ]
 * Starting anac(h)ronistic cron^[[74G[ OK ]
 * Starting save kernel messages^[[74G[ OK ]
 * Starting CPU interrupts balancing daemon^[[74G[ OK ]
 * Starting ACPI daemon^[[74G[ OK ]
 * Starting crash report submission daemon^[[74G[ OK ]
 * Starting OpenSSH server^[[74G[ OK ]
 * Starting regular background program processing daemon^[[74G[ OK ]
 * Stopping Restore Sound Card State^[[74G[ OK ]
 * speech-dispatcher disabled; edit /etc/default/speech-dispatcher
saned disabled; edit /etc/default/saned
 * Stopping save kernel messages^[[74G[ OK ]
 * Stopping Reload cups, upon starting avahi-daemon to make sure remote queues are populated^[[74G[ OK ]
 * Restoring resolver state...       ^[[80G \r^[[74G[ OK ]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC 1/6] dept: Implement Dept(Dependency Tracker)
  2020-11-23 11:05 ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
@ 2020-11-23 11:36   ` Byungchul Park
  2020-11-23 11:36     ` [RFC 2/6] dept: Apply Dept to spinlock Byungchul Park
                       ` (4 more replies)
  2020-11-23 12:29   ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
  1 sibling, 5 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 11:36 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

CURRENT STATUS
--------------
Lockdep tracks acquisition order of locks in order to detect deadlock,
and IRQ and IRQ enable/disable state as well to take accident
acquisitions into account.

Lockdep should be turned off once it detects and reports a deadlock
since the data structure and algorithm are not reusable after detection
because of the complex design.

PROBLEM
-------
*Waits* and their *events* that never reach eventually cause deadlock.
However, Lockdep is only interested in lock acquisition order, forcing
to emulate lock acqusition even for just waits and events that have
nothing to do with real lock.

Even worse, no one likes Lockdep's false positive detection because that
prevents further one that might be more valuable. That's why all the
kernel developers are sensitive to Lockdep's false positive.

Besides those, by tracking acquisition order, it cannot correctly deal
with read lock and cross-event e.g. wait_for_completion()/complete() for
deadlock detection. Lockdep is no longer a good tool for that purpose.

SOLUTION
--------
Again, *waits* and their *events* that never reach eventually cause
deadlock. The new solution, Dept(DEPendency Tracker), focuses on waits
and events themselves. Dept tracks waits and events and report it if
any event would be never reachable.

Dept does:
   . Works with read lock in the right way.
   . Works with any wait and event e.i. cross-event.
   . Continue to work even after reporting multiple times.
   . Provides simple and intuitive APIs.
   . Does exactly what dependency checker should do.

Q & A
-----
Q. Is this the first try ever to address the problem?
A. No. Cross-release feature (b09be676e0ff2 locking/lockdep: Implement
   the 'crossrelease' feature) addressed it 2 years ago that was a
   Lockdep extention and merged but reverted shortly because:

   Cross-release started to report valuable hidden problems but started
   to give report false positive reports a few times. For sure, no one
   likes Lockdep's false positive reports since it makes Lockdep stop,
   preventing reporting further real problems.

Q. Why not Dept was developed as an extention of Lockdep?
A. Lockdep definitely includes all the efforts great developers have
   made for a long time so as to be quite stable enough. But I had to
   design and implement newly because of the following:

   1) Lockdep was designed to track lock acquisition order. The APIs and
      implementation do not fit on wait-event model.
   2) Lockdep would be turned off on detection including false positive.
      Which is terrible and would keep any extention for stronger
      detection from being developed.

Q. Do you intend to totally replace Lockdep?
A. No. Lockdep also checks if lock usage is correct. Of course, the
   deadlock check routine should be replaced but the other functions
   should be still there.

Q. Do you mean the deadlock check routine should be replaced right away?
A. No. I admit Lockdep is stable enough thanks to great efforts kernel
   developers have made. Lockdep and Dept, both should be in the kernel
   until Dept gets considered stable.

Q. Stronger detection would give more false positive report. Which was a
   big problem when cross-release was introduced. Is it ok with Dept?
A. It's ok. Dept allows multiple reporting thanks to simple and quite
   generalized design. Of course, false positive reports should be fixed
   but it's no longer critical problem.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/dept.h            |  495 +++++++++
 include/linux/hardirq.h         |    3 +
 include/linux/irqflags.h        |   33 +-
 include/linux/sched.h           |    3 +
 init/init_task.c                |    2 +
 init/main.c                     |    2 +
 kernel/Makefile                 |    1 +
 kernel/dependency/Makefile      |    4 +
 kernel/dependency/dept.c        | 2313 +++++++++++++++++++++++++++++++++++++++
 kernel/dependency/dept_hash.h   |    9 +
 kernel/dependency/dept_object.h |   12 +
 kernel/exit.c                   |    1 +
 kernel/fork.c                   |    2 +
 kernel/module.c                 |    2 +
 kernel/softirq.c                |    6 +-
 kernel/trace/trace_preemptirq.c |   19 +-
 lib/Kconfig.debug               |   15 +
 17 files changed, 2913 insertions(+), 9 deletions(-)
 create mode 100644 include/linux/dept.h
 create mode 100644 kernel/dependency/Makefile
 create mode 100644 kernel/dependency/dept.c
 create mode 100644 kernel/dependency/dept_hash.h
 create mode 100644 kernel/dependency/dept_object.h

diff --git a/include/linux/dept.h b/include/linux/dept.h
new file mode 100644
index 0000000..7fe1e04
--- /dev/null
+++ b/include/linux/dept.h
@@ -0,0 +1,495 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Dept(DEPendency Tracker) - runtime dependency tracker
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_H
+#define __LINUX_DEPT_H
+
+#ifdef CONFIG_DEPT
+
+#include <linux/list.h>
+#include <linux/llist.h>
+
+#define DEPT_MAX_STACK_ENTRY		16
+#define DEPT_MAX_WAIT_HIST		16
+#define DEPT_MAX_ECXT_HELD		48
+
+#define DEPT_MAX_SUBCLASSES		16
+#define DEPT_MAX_SUBCLASSES_EVT		2
+#define DEPT_MAX_SUBCLASSES_USR		(DEPT_MAX_SUBCLASSES / DEPT_MAX_SUBCLASSES_EVT)
+#define DEPT_MAX_SUBCLASSES_CACHE	2
+
+#define DEPT_SIRQ			0
+#define DEPT_HIRQ			1
+#define DEPT_IRQS_NR			2
+#define DEPT_SIRQF			(1UL << DEPT_SIRQ)
+#define DEPT_HIRQF			(1UL << DEPT_HIRQ)
+
+enum dept_type {
+	DEPT_TYPE_NO_CHECK,
+	DEPT_TYPE_SPIN,
+	DEPT_TYPE_MUTEX,
+	DEPT_TYPE_RW,
+	DEPT_TYPE_WFC,
+	DEPT_TYPE_SDT,
+	DEPT_TYPE_OTHER,
+};
+
+struct dept_class {
+	union {
+		struct llist_node pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t ref;
+
+			/*
+			 * unique information about the class
+			 */
+			const char *name;
+			unsigned long key;
+			int sub;
+
+			/*
+			 * for BFS
+			 */
+			unsigned int bfs_gen;
+			int bfs_dist;
+			struct dept_class *bfs_parent;
+
+			/*
+			 * for hashing this object
+			 */
+			struct hlist_node hash_node;
+
+			/*
+			 * for linking all classes
+			 */
+			struct list_head all_node;
+
+			/*
+			 * for associating its dependencies
+			 */
+			struct list_head dep_head;
+			struct list_head dep_rev_head;
+
+			/*
+			 * for tracking IRQ dependencies
+			 */
+			int iwait_dist[DEPT_IRQS_NR];
+			struct dept_ecxt *iecxt[DEPT_IRQS_NR];
+			struct dept_wait *iwait[DEPT_IRQS_NR];
+		};
+	};
+};
+
+struct dept_stack {
+	union {
+		struct llist_node pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t ref;
+
+			/*
+			 * backtrace entries
+			 */
+			unsigned long raw[DEPT_MAX_STACK_ENTRY];
+			int nr;
+		};
+	};
+};
+
+struct dept_ecxt {
+	union {
+		struct llist_node pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t ref;
+
+			/*
+			 * function that entered to this ecxt
+			 */
+			const char *ecxt_fn;
+
+			/*
+			 * event function
+			 */
+			const char *event_fn;
+
+			/*
+			 * associated class
+			 */
+			struct dept_class *class;
+
+			/*
+			 * flag indicating which IRQ has been
+			 * enabled within the event context
+			 */
+			unsigned long enirqf;
+
+			/*
+			 * where the IRQ-enabled happened
+			 */
+			unsigned long enirq_ip[DEPT_IRQS_NR];
+			struct dept_stack *enirq_stack[DEPT_IRQS_NR];
+
+			/*
+			 * where the event context started
+			 */
+			unsigned long ecxt_ip;
+			struct dept_stack *ecxt_stack;
+
+			/*
+			 * where the event triggered
+			 */
+			unsigned long event_ip;
+			struct dept_stack *event_stack;
+		};
+	};
+};
+
+struct dept_wait {
+	union {
+		struct llist_node pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t ref;
+
+			/*
+			 * function causing this wait
+			 */
+			const char *wait_fn;
+
+			/*
+			 * the associated class
+			 */
+			struct dept_class *class;
+
+			/*
+			 * which IRQ the wait was placed in
+			 */
+			unsigned long irqf;
+
+			/*
+			 * where the wait happened
+			 */
+			unsigned long ip;
+			struct dept_stack *stack;
+		};
+	};
+};
+
+struct dept_dep {
+	union {
+		struct llist_node pool_node;
+		struct {
+			/*
+			 * reference counter for object management
+			 */
+			atomic_t ref;
+
+			/*
+			 * key data of dependency
+			 */
+			struct dept_ecxt *ecxt;
+			struct dept_wait *wait;
+
+			/*
+			 * This object can be referred without dept_lock
+			 * held but with IRQ disabled, e.g. for hash
+			 * lookup. So deferred deletion is needed.
+			 */
+			struct rcu_head rh;
+
+			/*
+			 * for BFS
+			 */
+			struct list_head bfs_node;
+
+			/*
+			 * for hashing this object
+			 */
+			struct hlist_node hash_node;
+
+			/*
+			 * for linking to a class object
+			 */
+			struct list_head dep_node;
+			struct list_head dep_rev_node;
+		};
+	};
+};
+
+struct dept_hash {
+	/*
+	 * hash table
+	 */
+	struct hlist_head *table;
+
+	/*
+	 * size of the table e.i. 2^bits
+	 */
+	int bits;
+};
+
+struct dept_pool {
+	const char *name;
+
+	/*
+	 * object size
+	 */
+	size_t obj_sz;
+
+	/*
+	 * the number of the static array
+	 */
+	atomic_t obj_nr;
+
+	/*
+	 * offset of ->pool_node
+	 */
+	size_t node_off;
+
+	/*
+	 * pointer to the pool
+	 */
+	void *spool;
+	struct llist_head boot_pool;
+	struct llist_head __percpu *lpool;
+};
+
+struct dept_ecxt_held {
+	/*
+	 * associated event context
+	 */
+	struct dept_ecxt *ecxt;
+
+	/*
+	 * unique key for this dept_ecxt_held
+	 */
+	unsigned long key;
+
+	/*
+	 * the wgen when the event context started
+	 */
+	unsigned int wgen;
+
+	/*
+	 * whether the event context has been started along with a wait
+	 */
+	bool with_wait;
+
+	/*
+	 * for allowing user aware nesting
+	 */
+	int nest;
+};
+
+struct dept_wait_hist {
+	/*
+	 * associated wait
+	 */
+	struct dept_wait *wait;
+
+	/*
+	 * unique id of all waits system-wise until wrapped
+	 */
+	unsigned int wgen;
+
+	/*
+	 * local context id to identify IRQ context
+	 */
+	unsigned int ctxt_id;
+};
+
+struct dept_task {
+	/*
+	 * all event contexts that have entered and before exiting
+	 */
+	struct dept_ecxt_held ecxt_held[DEPT_MAX_ECXT_HELD];
+	int ecxt_held_pos;
+
+	/*
+	 * ring buffer holding all waits that have happened
+	 */
+	struct dept_wait_hist wait_hist[DEPT_MAX_WAIT_HIST];
+	int wait_hist_pos;
+
+	/*
+	 * sequencial id to identify each IRQ context
+	 */
+	unsigned int irq_id[DEPT_IRQS_NR];
+
+	/*
+	 * for tracking IRQ-enabled points with cross-event
+	 */
+	unsigned int wgen_enirq[DEPT_IRQS_NR];
+
+	/*
+	 * for keeping up-to-date IRQ-enabled points
+	 */
+	unsigned long enirq_ip[DEPT_IRQS_NR];
+
+	/*
+	 * current effective IRQ-enabled flag
+	 */
+	unsigned long eff_enirqf;
+
+	/*
+	 * for reserving a current stack instance at each operation
+	 */
+	struct dept_stack *stack;
+
+	/*
+	 * for preventing recursive call into Dept engine
+	 */
+	int recursive;
+
+	/*
+	 * for tracking IRQ-enable state
+	 */
+	bool hardirqs_enabled;
+	bool softirqs_enabled;
+};
+
+struct dept_key {
+	union {
+		/*
+		 * Each byte-wise address will be used as its key.
+		 */
+		char subkeys[DEPT_MAX_SUBCLASSES];
+
+		/*
+		 * for caching the main class pointer
+		 */
+		struct dept_class *classes[DEPT_MAX_SUBCLASSES_CACHE];
+	};
+};
+
+struct dept_map {
+	const char *name;
+	struct dept_key *keys;
+	int sub_usr;
+
+	/*
+	 * It's local copy for fast acces to the associated classes. And
+	 * Also used for dept_key instance for statically defined map.
+	 */
+	struct dept_key keys_local;
+
+	/*
+	 * wait timestamp associated to this map
+	 */
+	unsigned int wgen;
+
+	/*
+	 * for handling the map differently according to the type
+	 */
+	enum dept_type type;
+};
+
+#define DEPT_TASK_INITIALIZER(t)					\
+	.dept_task.wait_hist = { { .wait = NULL, } },			\
+	.dept_task.ecxt_held_pos = 0,					\
+	.dept_task.wait_hist_pos = 0,					\
+	.dept_task.irq_id = { 0 },					\
+	.dept_task.wgen_enirq = { 0 },					\
+	.dept_task.enirq_ip = { 0 },					\
+	.dept_task.recursive = 0,					\
+	.dept_task.hardirqs_enabled = false,				\
+	.dept_task.softirqs_enabled = false,
+
+extern void dept_on(void);
+extern void dept_off(void);
+extern void dept_init(void);
+extern void dept_task_init(struct task_struct *t);
+extern void dept_task_exit(struct task_struct *t);
+extern void dept_free_range(void *start, unsigned int sz);
+extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub, const char *n, enum dept_type t);
+extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub, const char *n);
+extern void dept_map_nocheck(struct dept_map *m);
+
+extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, const char *w_fn, int ne);
+extern void dept_wait_ecxt_enter(struct dept_map *m, unsigned long w_f, unsigned long e_f, unsigned long ip, const char *w_fn, const char *c_fn, const char *e_fn, int ne);
+extern void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *c_fn, const char *e_fn, int ne);
+extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *e_fn);
+extern void dept_ecxt_exit(struct dept_map *m, unsigned long ip);
+extern struct dept_map *dept_top_map(void);
+
+/*
+ * for users who want to manage external keys
+ */
+extern void dept_key_init(struct dept_key *k);
+extern void dept_key_destroy(struct dept_key *k);
+
+#define DEPT_MAP_INIT(dname)	{ .name = #dname, .type = DEPT_TYPE_SDT }
+#define DEFINE_DEPT_SDT(x)	struct dept_map x = DEPT_MAP_INIT(x)
+
+/*
+ * SDT(Simple version of Dependency Tracker) APIs
+ *
+ * In case that one dept_map instance maps to a single event, SDT APIs
+ * can be used.
+ */
+#define sdt_map_init(m)							\
+	do {								\
+		static struct dept_key __key;				\
+		dept_map_init(m, &__key, 0, #m, DEPT_TYPE_SDT);		\
+	} while (0)
+#define sdt_map_init_key(m, k)		dept_map_init(m, k, 0, #m)
+
+#define sdt_wait(m)			dept_wait(m, 1UL, _THIS_IP_, "wait", 0)
+#define sdt_wait_ecxt_enter(m)		dept_wait_ecxt_enter(m, 1UL, 1UL, _THIS_IP_, "wait", "start", "event", 0)
+#define sdt_ecxt_enter(m)		dept_ecxt_enter(m, 1UL, _THIS_IP_, "start", "event", 0)
+#define sdt_event(m)			dept_event(m, 1UL, _THIS_IP_, "event")
+#define sdt_ecxt_exit(m)		dept_ecxt_exit(m, _THIS_IP_)
+#else /* !CONFIG_DEPT */
+struct dept_task { };
+struct dept_key  { };
+struct dept_map  { };
+
+#define DEPT_TASK_INITIALIZER(t)
+
+#define dept_on()					do { } while (0)
+#define dept_off()					do { } while (0)
+#define dept_init()					do { } while (0)
+#define dept_task_init(t)				do { } while (0)
+#define dept_task_exit(t)				do { } while (0)
+#define dept_free_range(s, sz)				do { } while (0)
+#define dept_map_init(m, k, s, n, t)			do { (void)(n); (void)(k); } while (0)
+#define dept_map_reinit(m, k, s, n)			do { (void)(n); (void)(k); } while (0)
+#define dept_map_nocheck(m)				do { } while (0)
+
+#define dept_wait(m, w_f, ip, w_fn, ne)			do { } while (0)
+#define dept_wait_ecxt_enter(m, w_f, e_f, ip, w_fn, c_fn, e_fn, ne) do { } while (0)
+#define dept_ecxt_enter(m, e_f, ip, c_fn, e_fn, ne)	do { } while (0)
+#define dept_event(m, e_f, ip, e_fn)			do { } while (0)
+#define dept_ecxt_exit(m, ip)				do { } while (0)
+#define dept_top_map()					NULL
+#define dept_key_init(k)				do { (void)(k); } while (0)
+#define dept_key_destroy(k)				do { (void)(k); } while (0)
+
+#define DEFINE_DEPT_SDT(x)
+
+#define sdt_map_init(m)					do { } while (0)
+#define sdt_map_init_key(m, k)				do { (void)(k); } while (0)
+#define sdt_wait(m)					do { } while (0)
+#define sdt_wait_ecxt_enter(m)				do { } while (0)
+#define sdt_ecxt_enter(m)				do { } while (0)
+#define sdt_event(m)					do { } while (0)
+#define sdt_ecxt_exit(m)				do { } while (0)
+#endif
+#endif /* __LINUX_DEPT_H */
diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index 754f67a..55798fb 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -5,6 +5,7 @@
 #include <linux/context_tracking_state.h>
 #include <linux/preempt.h>
 #include <linux/lockdep.h>
+#include <linux/dept.h>
 #include <linux/ftrace_irq.h>
 #include <linux/vtime.h>
 #include <asm/hardirq.h>
@@ -113,6 +114,7 @@ static inline void rcu_nmi_exit(void) { }
  */
 #define __nmi_enter()						\
 	do {							\
+		dept_off();					\
 		lockdep_off();					\
 		arch_nmi_enter();				\
 		printk_nmi_enter();				\
@@ -137,6 +139,7 @@ static inline void rcu_nmi_exit(void) { }
 		printk_nmi_exit();				\
 		arch_nmi_exit();				\
 		lockdep_on();					\
+		dept_on();					\
 	} while (0)
 
 #define nmi_exit()						\
diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 3ed4e87..dbab683 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -31,6 +31,22 @@
   static inline void lockdep_hardirqs_off(unsigned long ip) { }
 #endif
 
+#ifdef CONFIG_DEPT
+  extern void dept_hardirq_enter(void);
+  extern void dept_softirq_enter(void);
+  extern void dept_enable_hardirq(unsigned long ip);
+  extern void dept_enable_softirq(unsigned long ip);
+  extern void dept_disable_hardirq(unsigned long ip);
+  extern void dept_disable_softirq(unsigned long ip);
+#else
+  static inline void dept_hardirq_enter(void) { }
+  static inline void dept_softirq_enter(void) { }
+  static inline void dept_enable_hardirq(unsigned long ip) { }
+  static inline void dept_enable_softirq(unsigned long ip) { }
+  static inline void dept_disable_hardirq(unsigned long ip) { }
+  static inline void dept_disable_softirq(unsigned long ip) { }
+#endif
+
 #ifdef CONFIG_TRACE_IRQFLAGS
 
 /* Per-task IRQ trace events information. */
@@ -53,15 +69,19 @@ struct irqtrace_events {
 extern void trace_hardirqs_off_finish(void);
 extern void trace_hardirqs_on(void);
 extern void trace_hardirqs_off(void);
+extern void trace_softirqs_on_caller(unsigned long ip);
+extern void trace_softirqs_off_caller(unsigned long ip);
 
 # define lockdep_hardirq_context()	(raw_cpu_read(hardirq_context))
 # define lockdep_softirq_context(p)	((p)->softirq_context)
 # define lockdep_hardirqs_enabled()	(this_cpu_read(hardirqs_enabled))
 # define lockdep_softirqs_enabled(p)	((p)->softirqs_enabled)
-# define lockdep_hardirq_enter()			\
-do {							\
-	if (__this_cpu_inc_return(hardirq_context) == 1)\
-		current->hardirq_threaded = 0;		\
+# define lockdep_hardirq_enter()				\
+do {								\
+	if (__this_cpu_inc_return(hardirq_context) == 1) {	\
+		current->hardirq_threaded = 0;			\
+		dept_hardirq_enter();				\
+	}							\
 } while (0)
 # define lockdep_hardirq_threaded()		\
 do {						\
@@ -73,7 +93,8 @@ struct irqtrace_events {
 } while (0)
 # define lockdep_softirq_enter()		\
 do {						\
-	current->softirq_context++;		\
+	if (!current->softirq_context++)	\
+		dept_softirq_enter();		\
 } while (0)
 # define lockdep_softirq_exit()			\
 do {						\
@@ -123,6 +144,8 @@ struct irqtrace_events {
 # define trace_hardirqs_off_finish()		do { } while (0)
 # define trace_hardirqs_on()			do { } while (0)
 # define trace_hardirqs_off()			do { } while (0)
+# define trace_softirqs_on_caller(ip)		do { } while (0)
+# define trace_softirqs_off_caller(ip)		do { } while (0)
 # define lockdep_hardirq_context()		0
 # define lockdep_softirq_context(p)		0
 # define lockdep_hardirqs_enabled()		0
diff --git a/include/linux/sched.h b/include/linux/sched.h
index afe01e2..d907e5d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -34,6 +34,7 @@
 #include <linux/rseq.h>
 #include <linux/seqlock.h>
 #include <linux/kcsan.h>
+#include <linux/dept.h>
 
 /* task_struct member predeclarations (sorted alphabetically): */
 struct audit_context;
@@ -1008,6 +1009,8 @@ struct task_struct {
 	struct held_lock		held_locks[MAX_LOCK_DEPTH];
 #endif
 
+	struct dept_task		dept_task;
+
 #ifdef CONFIG_UBSAN
 	unsigned int			in_ubsan;
 #endif
diff --git a/init/init_task.c b/init/init_task.c
index f6889fc..1770fb8 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -12,6 +12,7 @@
 #include <linux/audit.h>
 #include <linux/numa.h>
 #include <linux/scs.h>
+#include <linux/dept.h>
 
 #include <linux/uaccess.h>
 
@@ -194,6 +195,7 @@ struct task_struct init_task
 	.curr_chain_key = INITIAL_CHAIN_KEY,
 	.lockdep_recursion = 0,
 #endif
+	DEPT_TASK_INITIALIZER(init_task)
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	.ret_stack	= NULL,
 #endif
diff --git a/init/main.c b/init/main.c
index e880b4e..d33ef05 100644
--- a/init/main.c
+++ b/init/main.c
@@ -63,6 +63,7 @@
 #include <linux/debug_locks.h>
 #include <linux/debugobjects.h>
 #include <linux/lockdep.h>
+#include <linux/dept.h>
 #include <linux/kmemleak.h>
 #include <linux/padata.h>
 #include <linux/pid_namespace.h>
@@ -979,6 +980,7 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
 		      panic_param);
 
 	lockdep_init();
+	dept_init();
 
 	/*
 	 * Need to run this when irqs are enabled, because it wants
diff --git a/kernel/Makefile b/kernel/Makefile
index 9a20016..ce54317 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -50,6 +50,7 @@ obj-y += rcu/
 obj-y += livepatch/
 obj-y += dma/
 obj-y += entry/
+obj-y += dependency/
 
 obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
diff --git a/kernel/dependency/Makefile b/kernel/dependency/Makefile
new file mode 100644
index 0000000..9f7778e
--- /dev/null
+++ b/kernel/dependency/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_DEPT) += dept.o
+
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
new file mode 100644
index 0000000..ac840ac
--- /dev/null
+++ b/kernel/dependency/dept.c
@@ -0,0 +1,2313 @@
+/*
+ * Dept(DEPendency Tracker) - Runtime dependency tracker
+ *
+ * Started by Byungchul Park <max.byungchul.park@gmail.com>:
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ *
+ * Dept provides a general way to detect deadlock possibility in runtime
+ * and the interest is not limited to typical lock but to every
+ * syncronization primitives.
+ *
+ * The following ideas were borrowed from Lockdep:
+ *
+ *    1) Use a graph to track relationship between classes.
+ *    2) Prevent performance regression using hash.
+ *
+ * The following items were enhanced from Lockdep:
+ *
+ *    1) Cover more deadlock cases.
+ *    2) Allow muliple reports.
+ *
+ * TODO: Both Lockdep and Dept should co-exist until Dept is considered
+ * stable. Then the dependency check routine shoud be replaced with Dept
+ * after. It should finally look like:
+ *
+ * As is:
+ *
+ *    Lockdep
+ *    +-----------------------------------------+
+ *    | Lock usage correctness check            | <-> locks
+ *    |                                         |
+ *    | Dependency check                        |
+ *    | (by tracking lock aquisition order)     |
+ *    +-----------------------------------------+
+ *
+ *    Dept
+ *    +-----------------------------------------+
+ *    | Dependency check                        | <-> waits/events
+ *    | (by tracking wait and event context)    |     (covering locks)
+ *    +-----------------------------------------+
+ *
+ * To be:
+ *
+ *    Lockdep (Suggest to rename.)
+ *    +-----------------------------------------+
+ *    | Lock usage correctness check            | <-> locks
+ *    +-----------------------------------------+
+ *
+ *    Dept
+ *    +-----------------------------------------+
+ *    | Dependency check                        | <-> waits/events
+ *    | (by tracking wait and event context)    |     (covering locks)
+ *    +-----------------------------------------+
+ *
+ * ---
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your ootion) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ */
+
+#define DISABLE_BRANCH_PROFILING
+
+#include <linux/sched.h>
+#include <linux/stacktrace.h>
+#include <linux/spinlock.h>
+#include <linux/kallsyms.h>
+#include <linux/hash.h>
+#include <linux/dept.h>
+#include <linux/utsname.h>
+
+static int dept_stop;
+static int dept_per_cpu_ready;
+
+/*
+ * Make all operations using DEPT_WARN_ON() fail on oops_in_progress and
+ * prevent warning message.
+ */
+#define DEPT_WARN_ON_ONCE(c)						\
+	({								\
+		int __ret = 1;						\
+		if (likely(!oops_in_progress))				\
+			__ret = WARN_ONCE(c, "DEPT_WARN_ON_ONCE: " #c);	\
+		__ret;							\
+	})
+
+#define DEPT_WARN_ONCE(s...)						\
+	({								\
+		if (likely(!oops_in_progress))				\
+			WARN_ONCE(1, "DEPT_WARN_ONCE: " s);		\
+	})
+
+#define DEPT_WARN_ON(c)							\
+	({								\
+		int __ret = 1;						\
+		if (likely(!oops_in_progress))				\
+			__ret = WARN(c, "DEPT_WARN_ON: " #c);		\
+		__ret;							\
+	})
+
+#define DEPT_WARN(s...)							\
+	({								\
+		if (likely(!oops_in_progress))				\
+			WARN(1, "DEPT_WARN: " s);			\
+	})
+
+#define DEPT_STOP(s...)							\
+	({								\
+		WRITE_ONCE(dept_stop, 1);				\
+		if (likely(!oops_in_progress))				\
+			WARN(1, "DEPT_STOP: " s);			\
+	})
+
+static arch_spinlock_t dept_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+
+/*
+ * Dept internal engine should be careful in using outside functions
+ * e.g. printk at reporting since that kind of usage might cause
+ * untrackable deadlock.
+ */
+static atomic_t dept_outworld = ATOMIC_INIT(0);
+
+static inline void dept_outworld_enter(void)
+{
+	atomic_inc(&dept_outworld);
+}
+
+static inline void dept_outworld_exit(void)
+{
+	atomic_dec(&dept_outworld);
+}
+
+static inline bool dept_outworld_entered(void)
+{
+	return atomic_read(&dept_outworld);
+}
+
+static inline bool dept_lock(void)
+{
+	while (!arch_spin_trylock(&dept_spin))
+		if (unlikely(dept_outworld_entered()))
+			return false;
+	return true;
+}
+
+static inline void dept_unlock(void)
+{
+	arch_spin_unlock(&dept_spin);
+}
+
+/*
+ * whether to stack-trace on every wait or every ecxt
+ */
+static bool rich_stack = true;
+
+enum bfs_ret {
+	DEPT_BFS_CONTINUE,
+	DEPT_BFS_CONTINUE_REV,
+	DEPT_BFS_DONE,
+	DEPT_BFS_SKIP,
+};
+
+/*
+ * The irq wait is in unknown state. Should identify the state.
+ */
+#define DEPT_IWAIT_UNKNOWN ((void *)1)
+
+static inline bool before(unsigned int a, unsigned int b)
+{
+	return (int)(a - b) < 0;
+}
+
+static inline bool valid_stack(struct dept_stack *s)
+{
+	return s && s->nr > 0;
+}
+
+static inline bool valid_class(struct dept_class *c)
+{
+	return c->key;
+}
+
+static inline void invalidate_class(struct dept_class *c)
+{
+	c->key = 0UL;
+}
+
+static inline struct dept_class *dep_fc(struct dept_dep *d)
+{
+	return d->ecxt->class;
+}
+
+static inline struct dept_class *dep_tc(struct dept_dep *d)
+{
+	return d->wait->class;
+}
+
+static inline const char *irq_str(int irq)
+{
+	if (irq == DEPT_SIRQ)
+		return "softirq";
+	if (irq == DEPT_HIRQ)
+		return "hardirq";
+	return "(unknown)";
+}
+
+static inline struct dept_task *dept_task(void)
+{
+	return &current->dept_task;
+}
+
+/*
+ * Pool
+ * =====================================================================
+ * Dept maintains pools to provide objects in a safe way.
+ *
+ *    1) Static pool is used at the beginning of booting time.
+ *    2) Local pool is tried first before the static pool. Objects that
+ *       have been freed will be placed.
+ */
+
+enum object_t {
+#define OBJECT(id, nr) OBJECT_##id,
+	#include "dept_object.h"
+#undef  OBJECT
+	OBJECT_NR,
+};
+
+#define OBJECT(id, nr)							\
+static struct dept_##id spool_##id[nr];					\
+static DEFINE_PER_CPU(struct llist_head, lpool_##id);
+	#include "dept_object.h"
+#undef  OBJECT
+
+static struct dept_pool pool[OBJECT_NR] = {
+#define OBJECT(id, nr) {						\
+	.name = #id,							\
+	.obj_sz = sizeof(struct dept_##id),				\
+	.obj_nr = ATOMIC_INIT(nr),					\
+	.node_off = offsetof(struct dept_##id, pool_node),		\
+	.spool = spool_##id,						\
+	.lpool = &lpool_##id, },
+	#include "dept_object.h"
+#undef  OBJECT
+};
+
+/*
+ * Can use llist no matter whether CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG is
+ * enabled because Dept never race with NMI by nesting control.
+ */
+static void *from_pool(enum object_t t)
+{
+	struct dept_pool *p;
+	struct llist_head *h;
+	struct llist_node *n;
+
+	/*
+	 * llist_del_first() doesn't allow concurrent access e.g.
+	 * between process and IRQ context.
+	 */
+	if (DEPT_WARN_ON(!irqs_disabled()))
+		return NULL;
+
+	p = &pool[t];
+
+	/*
+	 * Try local pool first.
+	 */
+	if (likely(dept_per_cpu_ready))
+		h = this_cpu_ptr(p->lpool);
+	else
+		h = &p->boot_pool;
+
+	n = llist_del_first(h);
+	if (n)
+		return (void *)n - p->node_off;
+
+	/*
+	 * Try static pool.
+	 */
+	if (atomic_read(&p->obj_nr) > 0) {
+		int idx = atomic_dec_return(&p->obj_nr);
+		if (idx >= 0)
+			return p->spool + (idx * p->obj_sz);
+	}
+
+	DEPT_WARN_ONCE("Pool(%s) is empty.\n", p->name);
+	return NULL;
+}
+
+static void to_pool(void *o, enum object_t t)
+{
+	struct dept_pool *p = &pool[t];
+	struct llist_head *h;
+
+	preempt_disable();
+	if (likely(dept_per_cpu_ready))
+		h = this_cpu_ptr(p->lpool);
+	else
+		h = &p->boot_pool;
+
+	llist_add(o + p->node_off, h);
+	preempt_enable();
+}
+
+#define OBJECT(id, nr)							\
+static void (*ctor_##id)(struct dept_##id *a);				\
+static void (*dtor_##id)(struct dept_##id *a);				\
+static inline struct dept_##id *new_##id(void)				\
+{									\
+	struct dept_##id *a;						\
+									\
+	a = (struct dept_##id *)from_pool(OBJECT_##id);			\
+	if (unlikely(!a))						\
+		return NULL;						\
+									\
+	atomic_set(&a->ref, 1);						\
+									\
+	if (ctor_##id)							\
+		ctor_##id(a);						\
+									\
+	return a;							\
+}									\
+									\
+static inline struct dept_##id *get_##id(struct dept_##id *a)		\
+{									\
+	atomic_inc(&a->ref);						\
+	return a;							\
+}									\
+									\
+static inline void put_##id(struct dept_##id *a)			\
+{									\
+	if (!atomic_dec_return(&a->ref)) {				\
+		if (dtor_##id)						\
+			dtor_##id(a);					\
+		to_pool(a, OBJECT_##id);				\
+	}								\
+}									\
+									\
+static inline void del_##id(struct dept_##id *a)			\
+{									\
+	put_##id(a);							\
+}									\
+									\
+static inline bool id##_refered(struct dept_##id *a, int expect)	\
+{									\
+	return a && atomic_read(&a->ref) > expect;			\
+}
+#include "dept_object.h"
+#undef  OBJECT
+
+#define SET_CONSTRUCTOR(id, f) \
+static void (*ctor_##id)(struct dept_##id *a) = f;
+
+static void initialize_dep(struct dept_dep *d)
+{
+	INIT_LIST_HEAD(&d->bfs_node);
+	INIT_LIST_HEAD(&d->dep_node);
+	INIT_LIST_HEAD(&d->dep_rev_node);
+}
+SET_CONSTRUCTOR(dep, initialize_dep);
+
+static void initialize_class(struct dept_class *c)
+{
+	int i;
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		c->iwait_dist[i] = INT_MAX;
+		WRITE_ONCE(c->iecxt[i], NULL);
+		WRITE_ONCE(c->iwait[i], NULL);
+	}
+	c->bfs_gen = 0U;
+
+	INIT_LIST_HEAD(&c->all_node);
+	INIT_LIST_HEAD(&c->dep_head);
+	INIT_LIST_HEAD(&c->dep_rev_head);
+}
+SET_CONSTRUCTOR(class, initialize_class);
+
+static void initialize_ecxt(struct dept_ecxt *e)
+{
+	int i;
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		e->enirq_stack[i] = NULL;
+		e->enirq_ip[i] = 0UL;
+	}
+	e->ecxt_ip = 0UL;
+	e->ecxt_stack = NULL;
+	e->enirqf = 0UL;
+	e->event_stack = NULL;
+}
+SET_CONSTRUCTOR(ecxt, initialize_ecxt);
+
+static void initialize_wait(struct dept_wait *w)
+{
+	w->irqf = 0UL;
+	w->stack = NULL;
+}
+SET_CONSTRUCTOR(wait, initialize_wait);
+
+static void initialize_stack(struct dept_stack *s)
+{
+	s->nr = 0;
+}
+SET_CONSTRUCTOR(stack, initialize_stack);
+#undef  SET_CONSTRUCTOR
+
+#define SET_DESTRUCTOR(id, f) \
+static void (*dtor_##id)(struct dept_##id *a) = f;
+
+static void destroy_dep(struct dept_dep *d)
+{
+	if (d->ecxt)
+		put_ecxt(d->ecxt);
+	if (d->wait)
+		put_wait(d->wait);
+}
+SET_DESTRUCTOR(dep, destroy_dep);
+
+static void destroy_ecxt(struct dept_ecxt *e)
+{
+	int i;
+
+	for (i = 0; i < DEPT_IRQS_NR; i++)
+		if (e->enirq_stack[i])
+			put_stack(e->enirq_stack[i]);
+	if (e->class)
+		put_class(e->class);
+	if (e->ecxt_stack)
+		put_stack(e->ecxt_stack);
+	if (e->event_stack)
+		put_stack(e->event_stack);
+}
+SET_DESTRUCTOR(ecxt, destroy_ecxt);
+
+static void destroy_wait(struct dept_wait *w)
+{
+	if (w->class)
+		put_class(w->class);
+	if (w->stack)
+		put_stack(w->stack);
+}
+SET_DESTRUCTOR(wait, destroy_wait);
+#undef  SET_DESTRUCTOR
+
+/*
+ * Caching and hashing
+ * =====================================================================
+ * Dept makes use of caching and hashing to improve performance. Each
+ * object can be obtained in O(1) with its key.
+ *
+ * NOTE: Currently we assume all the objects in the hashs will never be
+ * removed. Implement it when needed.
+ */
+
+/*
+ * Some information might be lost but it's only for hashing key.
+ */
+static inline unsigned long mix(unsigned long a, unsigned long b)
+{
+	int halfbits = sizeof(unsigned long) * 8 / 2;
+	unsigned long halfmask = (1UL << halfbits) - 1UL;
+	return (a << halfbits) | (b & halfmask);
+}
+
+static bool cmp_dep(struct dept_dep *d1, struct dept_dep *d2)
+{
+	return dep_fc(d1)->key == dep_fc(d2)->key &&
+	       dep_tc(d1)->key == dep_tc(d2)->key;
+}
+
+static unsigned long key_dep(struct dept_dep *d)
+{
+	return mix(dep_fc(d)->key, dep_tc(d)->key);
+}
+
+static bool cmp_class(struct dept_class *c1, struct dept_class *c2)
+{
+	return c1->key == c2->key;
+}
+
+static unsigned long key_class(struct dept_class *c)
+{
+	return c->key;
+}
+
+#define HASH(id, bits)							\
+static struct hlist_head table_##id[1UL << bits];			\
+									\
+static inline struct hlist_head *head_##id(struct dept_##id *a)		\
+{									\
+	return table_##id + hash_long(key_##id(a), bits);		\
+}									\
+									\
+static inline struct dept_##id *hash_lookup_##id(struct dept_##id *a)	\
+{									\
+	struct dept_##id *b;						\
+	hlist_for_each_entry_rcu(b, head_##id(a), hash_node)		\
+		if (cmp_##id(a, b))					\
+			return b;					\
+	return NULL;							\
+}									\
+									\
+static inline void hash_add_##id(struct dept_##id *a)			\
+{									\
+	hlist_add_head_rcu(&a->hash_node, head_##id(a));		\
+}									\
+									\
+static inline void hash_del_##id(struct dept_##id *a)			\
+{									\
+	hlist_del_rcu(&a->hash_node);					\
+}
+#include "dept_hash.h"
+#undef  HASH
+
+static inline struct dept_dep *dep(struct dept_class *fc,
+				   struct dept_class *tc)
+{
+	struct dept_ecxt dum_e = { .class = fc };
+	struct dept_wait dum_w = { .class = tc };
+	struct dept_dep  dum_d = { .ecxt = &dum_e, .wait = &dum_w };
+	return hash_lookup_dep(&dum_d);
+}
+
+static inline struct dept_class *class(unsigned long key)
+{
+	struct dept_class dum_c = { .key = key };
+	return hash_lookup_class(&dum_c);
+}
+
+/*
+ * Report
+ * =====================================================================
+ * Dept prints useful information to help debuging on detection of
+ * problematic dependency.
+ */
+
+static inline void print_ip_stack(unsigned long ip, struct dept_stack *s)
+{
+	if (ip)
+		print_ip_sym(KERN_WARNING, ip);
+
+	if (valid_stack(s)) {
+		printk("stacktrace:\n");
+		stack_trace_print(s->raw, s->nr, 5);
+	}
+
+	if (!ip && !valid_stack(s))
+		printk("(N/A)\n");
+}
+
+#define print_spc(spc, fmt, ...)					\
+	printk("%*c" fmt, (spc) * 4, ' ', ##__VA_ARGS__)
+
+static void print_diagram(struct dept_dep *d)
+{
+	struct dept_ecxt *e = d->ecxt;
+	struct dept_wait *w = d->wait;
+	struct dept_class *fc = dep_fc(d);
+	struct dept_class *tc = dep_tc(d);
+	unsigned long irqf;
+	int irq;
+	bool firstline = true;
+	int spc = 1;
+	const char *w_fn = w->wait_fn  ?: "(unknown)";
+	const char *e_fn = e->event_fn ?: "(unknown)";
+	const char *c_fn = e->ecxt_fn ?: "(unknown)";
+
+	irqf = e->enirqf & w->irqf;
+
+	if (!irqf) {
+		print_spc(spc, "[S] %s(%s:%d)\n", c_fn, fc->name, fc->sub);
+		print_spc(spc, "[W] %s(%s:%d)\n", w_fn, tc->name, tc->sub);
+		print_spc(spc, "[E] %s(%s:%d)\n", e_fn, fc->name, fc->sub);
+	}
+
+	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+		if (!firstline)
+			printk("\nor\n\n");
+		firstline = false;
+
+		print_spc(spc, "[S] %s(%s:%d)\n", c_fn, fc->name, fc->sub);
+		print_spc(spc, "    <%s interrupt>\n", irq_str(irq));
+		print_spc(spc + 1, "[W] %s(%s:%d)\n", w_fn, tc->name, tc->sub);
+		print_spc(spc, "[E] %s(%s:%d)\n", e_fn, fc->name, fc->sub);
+	}
+}
+
+static void print_dep(struct dept_dep *d)
+{
+	struct dept_ecxt *e = d->ecxt;
+	struct dept_wait *w = d->wait;
+	struct dept_class *fc = dep_fc(d);
+	struct dept_class *tc = dep_tc(d);
+	unsigned long irqf;
+	int irq;
+	const char *w_fn = w->wait_fn  ?: "(unknown)";
+	const char *e_fn = e->event_fn ?: "(unknown)";
+	const char *c_fn = e->ecxt_fn ?: "(unknown)";
+
+	irqf = e->enirqf & w->irqf;
+	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+		printk("%s has been enabled:\n", irq_str(irq));
+		print_ip_stack(e->enirq_ip[irq], e->enirq_stack[irq]);
+		printk("\n");
+	}
+
+	printk("[S] %s(%s:%d):\n", c_fn, fc->name, fc->sub);
+	print_ip_stack(e->ecxt_ip, e->ecxt_stack);
+	printk("\n");
+
+	printk("[W] %s(%s:%d):\n", w_fn, tc->name, tc->sub);
+	print_ip_stack(w->ip, w->stack);
+	printk("\n");
+
+	printk("[E] %s(%s:%d):\n", e_fn, fc->name, fc->sub);
+	print_ip_stack(e->event_ip, e->event_stack);
+}
+
+static void save_current_stack(int skip);
+
+/*
+ * Print all classes in a circle.
+ */
+static void print_circle(struct dept_class *c)
+{
+	struct dept_class *fc = c->bfs_parent;
+	struct dept_class *tc = c;
+	int i;
+
+	dept_outworld_enter();
+	save_current_stack(6);
+
+	printk("===================================================\n");
+	printk("Dept: Circular dependency has been detected.\n");
+	printk("%s %.*s %s\n", init_utsname()->release,
+		(int)strcspn(init_utsname()->version, " "),
+		init_utsname()->version,
+		print_tainted());
+	printk("---------------------------------------------------\n");
+	printk("summary\n");
+	printk("---------------------------------------------------\n");
+
+	if (fc == tc)
+		printk("*** AA DEADLOCK ***\n\n");
+	else
+		printk("*** DEADLOCK ***\n\n");
+
+	i = 0;
+	do {
+		struct dept_dep *d = dep(fc, tc);
+
+		printk("context %c\n", 'A' + (i++));
+		print_diagram(d);
+		if (fc != c)
+			printk("\n");
+
+		tc = fc;
+		fc = fc->bfs_parent;
+	} while (tc != c);
+
+	printk("\n");
+	printk("[S]: start of the event context\n");
+	printk("[W]: the wait blocked\n");
+	printk("[E]: the event not reachable\n");
+
+	i = 0;
+	do {
+		struct dept_dep *d = dep(fc, tc);
+
+		printk("---------------------------------------------------\n");
+		printk("context %c's detail\n", 'A' + i);
+		printk("---------------------------------------------------\n");
+		printk("context %c\n", 'A' + (i++));
+		print_diagram(d);
+		printk("\n");
+		print_dep(d);
+
+		tc = fc;
+		fc = fc->bfs_parent;
+	} while (tc != c);
+
+	printk("---------------------------------------------------\n");
+	printk("information that might be helpful\n");
+	printk("---------------------------------------------------\n");
+	dump_stack();
+
+	dept_outworld_exit();
+}
+
+/*
+ * BFS(Breadth First Search)
+ * =====================================================================
+ * Whenever a new dependency is added into the graph, search the graph
+ * for a new circular dependency.
+ */
+
+static inline void enqueue(struct list_head *h, struct dept_dep *d)
+{
+	list_add_tail(&d->bfs_node, h);
+}
+
+static inline struct dept_dep *dequeue(struct list_head *h)
+{
+	struct dept_dep *d;
+	d = list_first_entry(h, struct dept_dep, bfs_node);
+	list_del(&d->bfs_node);
+	return d;
+}
+
+static inline bool empty(struct list_head *h)
+{
+	return list_empty(h);
+}
+
+static void extend_queue(struct list_head *h, struct dept_class *cur)
+{
+	struct dept_dep *d;
+
+	list_for_each_entry(d, &cur->dep_head, dep_node) {
+		struct dept_class *next = dep_tc(d);
+		if (cur->bfs_gen == next->bfs_gen)
+			continue;
+		next->bfs_gen = cur->bfs_gen;
+		next->bfs_dist = cur->bfs_dist + 1;
+		next->bfs_parent = cur;
+		enqueue(h, d);
+	}
+}
+
+static void extend_queue_rev(struct list_head *h, struct dept_class *cur)
+{
+	struct dept_dep *d;
+
+	list_for_each_entry(d, &cur->dep_rev_head, dep_rev_node) {
+		struct dept_class *next = dep_fc(d);
+		if (cur->bfs_gen == next->bfs_gen)
+			continue;
+		next->bfs_gen = cur->bfs_gen;
+		next->bfs_dist = cur->bfs_dist + 1;
+		next->bfs_parent = cur;
+		enqueue(h, d);
+	}
+}
+
+typedef enum bfs_ret bfs_f(struct dept_dep *d, void *in, void **out);
+static unsigned int bfs_gen;
+
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static void bfs(struct dept_class *c, bfs_f *cb, void *in, void **out)
+{
+	LIST_HEAD(q);
+	enum bfs_ret ret;
+
+	if (DEPT_WARN_ON(!cb))
+		return;
+
+	/*
+	 * Avoid zero bfs_gen.
+	 */
+	bfs_gen = bfs_gen + 1 ?: 1;
+
+	c->bfs_gen = bfs_gen;
+	c->bfs_dist = 0;
+	c->bfs_parent = c;
+
+	ret = cb(NULL, in, out);
+	if (ret == DEPT_BFS_DONE)
+		return;
+	if (ret == DEPT_BFS_SKIP)
+		return;
+	if (ret == DEPT_BFS_CONTINUE)
+		extend_queue(&q, c);
+	if (ret == DEPT_BFS_CONTINUE_REV)
+		extend_queue_rev(&q, c);
+
+	while (!empty(&q)) {
+		struct dept_dep *d = dequeue(&q);
+
+		ret = cb(d, in, out);
+		if (ret == DEPT_BFS_DONE)
+			break;
+		if (ret == DEPT_BFS_SKIP)
+			continue;
+		if (ret == DEPT_BFS_CONTINUE)
+			extend_queue(&q, dep_tc(d));
+		if (ret == DEPT_BFS_CONTINUE_REV)
+			extend_queue_rev(&q, dep_fc(d));
+	}
+
+	while (!empty(&q))
+		dequeue(&q);
+}
+
+/*
+ * Main operations
+ * =====================================================================
+ * Add dependencies - Each new dependency is added into the graph and
+ * checked if it forms a circular dependency.
+ *
+ * Track waits - Waits are queued into the ring buffer for later use to
+ * generate appropriate dependencies with cross-event.
+ *
+ * Track event contexts(ecxt) - Event contexts are pushed into local
+ * stack for later use to generate appropriate dependencies with waits.
+ */
+
+static inline unsigned long cur_enirqf(void);
+static inline int cur_irq(void);
+static inline unsigned int cur_ctxt_id(void);
+
+static inline struct dept_stack *get_current_stack(void)
+{
+	struct dept_stack *s = dept_task()->stack;
+	return s ? get_stack(s) : NULL;
+}
+
+static inline void prepare_current_stack(void)
+{
+	struct dept_stack *s = dept_task()->stack;
+	/*
+	 * The reference counter would be 1 when no one refers to the
+	 * dept_stack.
+	 */
+	const int expect_ref = 1;
+
+	/*
+	 * The dept_stack is already ready.
+	 */
+	if (s && !stack_refered(s, expect_ref)) {
+		s->nr = 0;
+		return;
+	}
+
+	if (s)
+		put_stack(s);
+
+	s = dept_task()->stack = new_stack();
+	if (!s)
+		return;
+
+	get_stack(s);
+	del_stack(s);
+}
+
+static void save_current_stack(int skip)
+{
+	struct dept_stack *s = dept_task()->stack;
+
+	if (!s)
+		return;
+	if (valid_stack(s))
+		return;
+
+	s->nr = stack_trace_save(s->raw, DEPT_MAX_STACK_ENTRY, skip);
+}
+
+static void finish_current_stack(void)
+{
+	struct dept_stack *s = dept_task()->stack;
+	/*
+	 * The reference counter would be 1 when no one refers to the
+	 * dept_stack.
+	 */
+	const int expect_ref = 1;
+
+	if (s && stack_refered(s, expect_ref))
+		save_current_stack(2);
+}
+
+/*
+ * FIXME: For now, disable Lockdep while Dept is working.
+ *
+ * Both Lockdep and Dept report it on a deadlock detection using
+ * printk taking the risk of another deadlock that might be caused by
+ * locks of console or printk between inside and outside of them.
+ *
+ * For Dept, it's no problem since multiple reports are allowed. But it
+ * would be a bad idea for Lockdep since it will stop even on a singe
+ * report. So we need to prevent Lockdep from its reporting the risk
+ * Dept would take when reporting something.
+ */
+#include <linux/lockdep.h>
+
+void dept_off(void)
+{
+	dept_task()->recursive++;
+	lockdep_off();
+}
+
+void dept_on(void)
+{
+	dept_task()->recursive--;
+	lockdep_on();
+}
+
+static inline unsigned long dept_enter(void)
+{
+	unsigned long flags;
+	raw_local_irq_save(flags);
+	dept_off();
+	prepare_current_stack();
+	return flags;
+}
+
+static inline void dept_exit(unsigned long flags)
+{
+	finish_current_stack();
+	dept_on();
+	raw_local_irq_restore(flags);
+}
+
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static struct dept_dep *__add_dep(struct dept_ecxt *e,
+				  struct dept_wait *w)
+{
+	struct dept_dep *d;
+
+	if (!valid_class(e->class) || !valid_class(w->class))
+		return NULL;
+
+	if (dep(e->class, w->class))
+		return NULL;
+
+	d = new_dep();
+	if (unlikely(!d))
+		return NULL;
+
+	d->ecxt = get_ecxt(e);
+	d->wait = get_wait(w);
+
+	/*
+	 * Add the dependency into hash and graph.
+	 */
+	hash_add_dep(d);
+	list_add(&d->dep_node    , &dep_fc(d)->dep_head    );
+	list_add(&d->dep_rev_node, &dep_tc(d)->dep_rev_head);
+	return d;
+}
+
+static enum bfs_ret cb_check_dl(struct dept_dep *d,
+				void *in, void **out)
+{
+	struct dept_dep *new = (struct dept_dep *)in;
+
+	/*
+	 * initial condition for this BFS search
+	 */
+	if (!d) {
+		dep_tc(new)->bfs_parent = dep_fc(new);
+
+		if (dep_tc(new) != dep_fc(new))
+			return DEPT_BFS_CONTINUE;
+
+		/*
+		 * AA circle does not make additional deadlock. We don't
+		 * have to continue this BFS search.
+		 */
+		print_circle(dep_tc(new));
+		return DEPT_BFS_DONE;
+	}
+
+	/*
+	 * Allow multiple reports.
+	 */
+	if (dep_tc(d) == dep_fc(new))
+		print_circle(dep_tc(new));
+
+	return DEPT_BFS_CONTINUE;
+}
+
+static inline void check_dl(struct dept_dep *d)
+{
+	bfs(dep_tc(d), cb_check_dl, (void *)d, NULL);
+}
+
+static void add_dep(struct dept_ecxt *e, struct dept_wait *w)
+{
+	struct dept_dep *d;
+
+	if (dep(e->class, w->class))
+		return;
+
+	if (unlikely(!dept_lock()))
+		return;
+
+	d = __add_dep(e, w);
+	if (d)
+		check_dl(d);
+	dept_unlock();
+}
+
+static enum bfs_ret cb_prop_iwait(struct dept_dep *d,
+				  void *in, void **out)
+{
+	int irq = *(int *)in;
+	struct dept_class *fc;
+	struct dept_class *tc;
+	struct dept_dep *new;
+
+	if (DEPT_WARN_ON(!out))
+		return DEPT_BFS_DONE;
+
+	/*
+	 * initial condition for this BFS search
+	 */
+	if (!d)
+		return DEPT_BFS_CONTINUE;
+
+	fc = dep_fc(d);
+	tc = dep_tc(d);
+
+	if (tc->iwait_dist[irq] <= tc->bfs_dist)
+		return DEPT_BFS_CONTINUE;
+
+	tc->iwait_dist[irq] = tc->bfs_dist;
+	WRITE_ONCE(tc->iwait[irq], fc->iwait[irq]);
+
+	if (!tc->iecxt[irq])
+		return DEPT_BFS_CONTINUE;
+
+	new = __add_dep(tc->iecxt[irq], tc->iwait[irq]);
+	if (!new)
+		return DEPT_BFS_CONTINUE;
+
+	*out = new;
+	return DEPT_BFS_DONE;
+}
+
+static void propagate_iwait(struct dept_class *c, int irq)
+{
+	struct dept_dep *new;
+	do {
+		new = NULL;
+		bfs(c, cb_prop_iwait, (void *)&irq, (void **)&new);
+
+		/*
+		 * Deadlock detected. Let check_dl() report it.
+		 */
+		if (new)
+			check_dl(new);
+	} while (new);
+}
+
+static enum bfs_ret cb_find_iwait(struct dept_dep *d, void *in, void **out)
+{
+	int irq = *(int *)in;
+	struct dept_wait *w;
+
+	if (DEPT_WARN_ON(!out))
+		return DEPT_BFS_DONE;
+
+	/*
+	 * initial condition for this BFS search
+	 */
+	if (!d)
+		return DEPT_BFS_CONTINUE_REV;
+
+	w = dep_fc(d)->iwait[irq];
+	if (!w || w == DEPT_IWAIT_UNKNOWN)
+		return DEPT_BFS_CONTINUE_REV;
+
+	*out = w;
+	return DEPT_BFS_DONE;
+}
+
+static struct dept_wait *find_iwait(struct dept_class *c, int irq)
+{
+	struct dept_wait *w = NULL;
+	bfs(c, cb_find_iwait, (void *)&irq, (void **)&w);
+	return w;
+}
+
+static void add_iecxt(struct dept_ecxt *e, int irq, bool stack)
+{
+	/*
+	 * This access is safe since we ensure e->class has set locally.
+	 */
+	struct dept_class *c = e->class;
+	struct dept_task *dt = dept_task();
+
+	e->enirqf |= (1UL << irq);
+
+	if (READ_ONCE(c->iecxt[irq]))
+		return;
+
+	if (unlikely(!dept_lock()))
+		return;
+
+	if (c->iecxt[irq])
+		goto unlock;
+	WRITE_ONCE(c->iecxt[irq], get_ecxt(e));
+
+	/*
+	 * Should be NULL since it's the first time that these
+	 * enirq_{ip,stack}[irq] have ever set.
+	 */
+	DEPT_WARN_ON(e->enirq_ip[irq]);
+	DEPT_WARN_ON(e->enirq_stack[irq]);
+
+	e->enirq_ip[irq] = dt->enirq_ip[irq];
+	if (stack)
+		e->enirq_stack[irq] = get_current_stack();
+
+	if (c->iwait[irq] == DEPT_IWAIT_UNKNOWN) {
+		struct dept_wait *w = find_iwait(c, irq);
+		WRITE_ONCE(c->iwait[irq], w);
+
+		/*
+		 * Deadlock detected. Let propagate_iwait() report it.
+		 */
+		if (w)
+			propagate_iwait(w->class, irq);
+	} else if (c->iwait[irq]) {
+		struct dept_dep *d = __add_dep(e, c->iwait[irq]);
+
+		/*
+		 * Deadlock detected. Let check_dl() report it.
+		 */
+		if (d)
+			check_dl(d);
+	}
+unlock:
+	dept_unlock();
+}
+
+static void add_iwait(struct dept_wait *w, int irq)
+{
+	struct dept_class *c = w->class;
+	struct dept_wait *iw;
+
+	w->irqf |= (1UL << irq);
+
+	iw = READ_ONCE(c->iwait[irq]);
+	if (iw && READ_ONCE(iw->class) == c)
+		return;
+
+	if (unlikely(!dept_lock()))
+		return;
+
+	iw = c->iwait[irq];
+	if (iw && iw->class == c)
+		goto unlock;
+
+	c->iwait_dist[irq] = 0;
+	WRITE_ONCE(c->iwait[irq], get_wait(w));
+
+	if (c->iecxt[irq]) {
+		struct dept_dep *d = __add_dep(c->iecxt[irq], w);
+
+		/*
+		 * Deadlock detected. Let check_dl() report it.
+		 */
+		if (d)
+			check_dl(d);
+	}
+	propagate_iwait(c, irq);
+unlock:
+	dept_unlock();
+}
+
+static inline struct dept_wait_hist *hist(int pos)
+{
+	struct dept_task *dt = dept_task();
+	return dt->wait_hist + (pos % DEPT_MAX_WAIT_HIST);
+}
+
+static inline int hist_pos_next(void)
+{
+	struct dept_task *dt = dept_task();
+	return dt->wait_hist_pos % DEPT_MAX_WAIT_HIST;
+}
+
+static inline void hist_advance(void)
+{
+	struct dept_task *dt = dept_task();
+	dt->wait_hist_pos++;
+	dt->wait_hist_pos %= DEPT_MAX_WAIT_HIST;
+}
+
+static inline struct dept_wait_hist *new_hist(void)
+{
+	struct dept_wait_hist *wh = hist(hist_pos_next());
+	hist_advance();
+	return wh;
+}
+
+static void add_hist(struct dept_wait *w, unsigned int wg, unsigned int ctxt_id)
+{
+	struct dept_wait_hist *wh = new_hist();
+
+	if (likely(wh->wait))
+		put_wait(wh->wait);
+
+	wh->wait = get_wait(w);
+	wh->wgen = wg;
+	wh->ctxt_id = ctxt_id;
+}
+
+static atomic_t wgen = ATOMIC_INIT(1);
+
+static unsigned int add_wait(struct dept_class *c, unsigned long ip,
+			     const char *w_fn, int ne)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_wait *w;
+	unsigned int wg = 0U;
+	int irq;
+	int i;
+
+	w = new_wait();
+	if (unlikely(!w))
+		return 0U;
+
+	WRITE_ONCE(w->class, get_class(c));
+	w->ip = ip;
+	w->wait_fn = w_fn;
+	w->stack = get_current_stack();
+
+	irq = cur_irq();
+	if (irq < DEPT_IRQS_NR)
+		add_iwait(w, irq);
+
+	/*
+	 * Avoid adding dependency between user aware nested ecxt and
+	 * wait.
+	 */
+	for (i = dt->ecxt_held_pos - 1; i >= 0; i--) {
+		struct dept_ecxt_held *eh;
+		eh = dt->ecxt_held + i;
+		if (eh->ecxt->class != c || eh->nest == ne)
+			break;
+	}
+
+	for (; i >= 0; i--) {
+		struct dept_ecxt_held *eh;
+		eh = dt->ecxt_held + i;
+		add_dep(eh->ecxt, w);
+		if (eh->with_wait)
+			break;
+	}
+
+	if (!wait_refered(w, 1) && !rich_stack) {
+		if (w->stack)
+			put_stack(w->stack);
+		w->stack = NULL;
+	}
+
+	/*
+	 * Avoid zero wgen.
+	 */
+	wg = atomic_inc_return(&wgen) ?: atomic_inc_return(&wgen);
+	add_hist(w, wg, cur_ctxt_id());
+
+	del_wait(w);
+	return wg;
+}
+
+static void add_ecxt(void *obj, struct dept_class *c, bool with_wait,
+		     unsigned long ip, const char *c_fn, const char *e_fn,
+		     int ne)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_ecxt_held *eh;
+	struct dept_ecxt *e;
+	unsigned long irqf;
+	int irq;
+
+	if (DEPT_WARN_ON(dt->ecxt_held_pos == DEPT_MAX_ECXT_HELD))
+		return;
+
+	e = new_ecxt();
+	if (unlikely(!e))
+		return;
+
+	e->class = get_class(c);
+	e->ecxt_ip = ip;
+	e->ecxt_stack = ip && rich_stack ? get_current_stack() : NULL;
+	e->event_fn = e_fn;
+	e->ecxt_fn = c_fn;
+
+	eh = dt->ecxt_held + (dt->ecxt_held_pos++);
+	eh->ecxt = get_ecxt(e);
+	eh->key = (unsigned long)obj;
+	eh->with_wait = with_wait;
+	eh->wgen = atomic_read(&wgen);
+	eh->nest = ne;
+
+	irqf = cur_enirqf();
+	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR)
+		add_iecxt(e, irq, false);
+
+	del_ecxt(e);
+}
+
+static int find_ecxt_pos(unsigned long key, bool newfirst)
+{
+	struct dept_task *dt = dept_task();
+	int i;
+
+	if (newfirst) {
+		for (i = dt->ecxt_held_pos - 1; i >= 0; i--)
+			if (dt->ecxt_held[i].key == key)
+				return i;
+	} else {
+		for (i = 0; i < dt->ecxt_held_pos; i++)
+			if (dt->ecxt_held[i].key == key)
+				return i;
+	}
+	return -1;
+}
+
+static void pop_ecxt(void *obj)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long key = (unsigned long)obj;
+	int pos;
+	int i;
+
+	/*
+	 * TODO: WARN on pos == -1.
+	 */
+	pos = find_ecxt_pos(key, true);
+	if (pos == -1)
+		return;
+
+	put_ecxt(dt->ecxt_held[pos].ecxt);
+	dt->ecxt_held_pos--;
+
+	for (i = pos; i < dt->ecxt_held_pos; i++)
+		dt->ecxt_held[i] = dt->ecxt_held[i + 1];
+}
+
+static inline bool good_hist(struct dept_wait_hist *wh, unsigned int wg)
+{
+	return wh->wait != NULL && before(wg, wh->wgen);
+}
+
+/*
+ * Binary-search the ring buffer for the earliest valid wait.
+ */
+static int find_hist_pos(unsigned int wg)
+{
+	int oldest;
+	int l;
+	int r;
+	int pos;
+
+	oldest = hist_pos_next();
+	if (unlikely(good_hist(hist(oldest), wg))) {
+		DEPT_WARN_ONCE("Need to expand the ring buffer.\n");
+		return oldest;
+	}
+
+	l = oldest + 1;
+	r = oldest + DEPT_MAX_WAIT_HIST - 1;
+	for (pos = (l + r) / 2; l <= r; pos = (l + r) / 2) {
+		struct dept_wait_hist *p = hist(pos - 1);
+		struct dept_wait_hist *wh = hist(pos);
+
+		if (!good_hist(p, wg) && good_hist(wh, wg))
+			return pos % DEPT_MAX_WAIT_HIST;
+		if (good_hist(wh, wg))
+			r = pos - 1;
+		else
+			l = pos + 1;
+	}
+	return -1;
+}
+
+static void do_event(void *obj, struct dept_class *c, unsigned int wg,
+		     unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_wait_hist *wh;
+	struct dept_ecxt_held *eh;
+	unsigned long key = (unsigned long)obj;
+	unsigned int ctxt_id;
+	int end;
+	int pos;
+	int i;
+
+	/*
+	 * The event was triggered before wait.
+	 */
+	if (!wg)
+		return;
+
+	pos = find_ecxt_pos(key, false);
+	if (pos == -1)
+		return;
+
+	eh = dt->ecxt_held + pos;
+	eh->ecxt->event_ip = ip;
+	eh->ecxt->event_stack = get_current_stack();
+
+	/*
+	 * The ecxt already has done what it needs.
+	 */
+	if (!before(wg, eh->wgen))
+		return;
+
+	pos = find_hist_pos(wg);
+	if (pos == -1)
+		return;
+
+	ctxt_id = cur_ctxt_id();
+	end = hist_pos_next();
+	end = end > pos ? end : end + DEPT_MAX_WAIT_HIST;
+	for (wh = hist(pos); pos < end; wh = hist(++pos)) {
+		if (wh->ctxt_id == ctxt_id)
+			add_dep(eh->ecxt, wh->wait);
+		if (!before(wh->wgen, eh->wgen))
+			break;
+	}
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		if (before(dt->wgen_enirq[i], wg))
+			continue;
+		add_iecxt(eh->ecxt, i, false);
+	}
+}
+
+static enum bfs_ret cb_clean_iwait(struct dept_dep *d, void *in, void **out)
+{
+	struct dept_class *c = (struct dept_class *)in;
+	struct dept_class *tc;
+	bool skip = true;
+	int i;
+
+	/*
+	 * initial condition for this BFS search
+	 */
+	if (!d)
+		return DEPT_BFS_CONTINUE;
+
+	tc = dep_tc(d);
+
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		if (tc->iwait[i] != c->iwait[i])
+			continue;
+		WRITE_ONCE(tc->iwait[i], DEPT_IWAIT_UNKNOWN);
+		tc->iwait_dist[i] = INT_MAX;
+		skip = false;
+	}
+
+	if (skip)
+		return DEPT_BFS_SKIP;
+
+	return DEPT_BFS_CONTINUE;
+}
+
+static void clean_iwait(struct dept_class *c)
+{
+	bfs(c, cb_clean_iwait, (void *)c, NULL);
+}
+
+static void del_dep_rcu(struct rcu_head *rh)
+{
+	struct dept_dep *d = container_of(rh, struct dept_dep, rh);
+
+	preempt_disable();
+	del_dep(d);
+	preempt_enable();
+}
+
+/*
+ * NOTE: Must be called with dept_lock held.
+ */
+static void disconnect_class(struct dept_class *c)
+{
+	struct dept_dep *d, *n;
+	int i;
+
+	clean_iwait(c);
+	for (i = 0; i < DEPT_IRQS_NR; i++) {
+		struct dept_ecxt *e = c->iecxt[i];
+		struct dept_wait *w = c->iwait[i];
+		if (e)
+			put_ecxt(e);
+		if (w && w->class == c)
+			put_wait(w);
+		WRITE_ONCE(c->iecxt[i], NULL);
+		WRITE_ONCE(c->iwait[i], NULL);
+	}
+
+	list_for_each_entry_safe(d, n, &c->dep_head, dep_node) {
+		list_del_rcu(&d->dep_node);
+		list_del_rcu(&d->dep_rev_node);
+		hash_del_dep(d);
+		call_rcu(&d->rh, del_dep_rcu);
+	}
+
+	list_for_each_entry_safe(d, n, &c->dep_rev_head, dep_rev_node) {
+		list_del_rcu(&d->dep_node);
+		list_del_rcu(&d->dep_rev_node);
+		hash_del_dep(d);
+		call_rcu(&d->rh, del_dep_rcu);
+	}
+}
+
+/*
+ * IRQ context control
+ * =====================================================================
+ * Whether a wait is in {hard,soft}-IRQ context or whether
+ * {hard,soft}-IRQ has been enabled on the way to an event is very
+ * important to check dependency. All those things should be tracked.
+ */
+
+static inline unsigned long cur_enirqf(void)
+{
+	struct dept_task *dt = dept_task();
+	int he = dt->hardirqs_enabled;
+	int se = dt->softirqs_enabled;
+	if (he)
+		return DEPT_HIRQF | (se ? DEPT_SIRQF : 0UL);
+	return 0UL;
+}
+
+static inline int cur_irq(void)
+{
+	if (lockdep_softirq_context(current))
+		return DEPT_SIRQ;
+	if (lockdep_hardirq_context())
+		return DEPT_HIRQ;
+	return DEPT_IRQS_NR;
+}
+
+static inline unsigned int cur_ctxt_id(void)
+{
+	struct dept_task *dt = dept_task();
+	int irq = cur_irq();
+
+	/*
+	 * Normal process context
+	 */
+	if (irq == DEPT_IRQS_NR)
+		return 0U;
+
+	return dt->irq_id[irq] | (1UL << irq);
+}
+
+static void enirq_transition(int irq)
+{
+	struct dept_task *dt = dept_task();
+	int i;
+
+	/*
+	 * READ wgen >= wgen of an event with IRQ enabled has been
+	 * observed on the way to the event means, the IRQ can cut in
+	 * within the ecxt. Used for cross-event detection.
+	 *
+	 *    wait context	event context(ecxt)
+	 *    ------------	-------------------
+	 *    wait event
+	 *       WRITE wgen
+	 *			observe IRQ enabled
+	 *			   READ wgen
+	 *			   keep the wgen locally
+	 *
+	 *			on the event
+	 *			   check the local wgen
+	 */
+	dt->wgen_enirq[irq] = atomic_read(&wgen);
+
+	for (i = dt->ecxt_held_pos - 1; i >= 0; i--) {
+		struct dept_ecxt_held *eh;
+		eh = dt->ecxt_held + i;
+		add_iecxt(eh->ecxt, irq, true);
+	}
+}
+
+static void enirq_update(unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long irqf;
+	unsigned long prev;
+	int irq;
+
+	prev = dt->eff_enirqf;
+	irqf = cur_enirqf();
+	dt->eff_enirqf = irqf;
+
+	/*
+	 * Do enirq_transition() only on an OFF -> ON transition.
+	 */
+	for_each_set_bit(irq, &irqf, DEPT_IRQS_NR) {
+		unsigned long flags;
+
+		if (prev & (1UL << irq))
+			continue;
+
+		flags = dept_enter();
+		dt->enirq_ip[irq] = ip;
+		enirq_transition(irq);
+		dept_exit(flags);
+	}
+}
+
+/*
+ * Ensure it has been called on OFF -> ON transition.
+ */
+void dept_enable_softirq(unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (DEPT_WARN_ON(early_boot_irqs_disabled))
+		return;
+
+	if (DEPT_WARN_ON(!irqs_disabled()))
+		return;
+
+	dt->softirqs_enabled = true;
+	enirq_update(ip);
+}
+
+/*
+ * Ensure it has been called on OFF -> ON transition.
+ */
+void dept_enable_hardirq(unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (DEPT_WARN_ON(early_boot_irqs_disabled))
+		return;
+
+	if (DEPT_WARN_ON(!irqs_disabled()))
+		return;
+
+	dt->hardirqs_enabled = true;
+	enirq_update(ip);
+}
+
+/*
+ * Ensure it has been called on ON -> OFF transition.
+ */
+void dept_disable_softirq(unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (DEPT_WARN_ON(!irqs_disabled()))
+		return;
+
+	dt->softirqs_enabled = false;
+	enirq_update(ip);
+}
+
+/*
+ * Ensure it has been called on ON -> OFF transition.
+ */
+void dept_disable_hardirq(unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (DEPT_WARN_ON(!irqs_disabled()))
+		return;
+
+	dt->hardirqs_enabled = false;
+	enirq_update(ip);
+}
+
+/*
+ * Ensure it's the outmost softirq context.
+ */
+void dept_softirq_enter(void)
+{
+	struct dept_task *dt = dept_task();
+	dt->irq_id[DEPT_SIRQ] += (1UL << DEPT_IRQS_NR);
+}
+
+/*
+ * Ensure it's the outmost hardirq context.
+ */
+void dept_hardirq_enter(void)
+{
+	struct dept_task *dt = dept_task();
+	dt->irq_id[DEPT_HIRQ] += (1UL << DEPT_IRQS_NR);
+}
+
+/*
+ * Dept API
+ * =====================================================================
+ * Main Dept APIs.
+ */
+
+static inline void clean_dept_key(struct dept_key *k)
+{
+	int i;
+
+	for (i = 0; i < DEPT_MAX_SUBCLASSES_CACHE; i++)
+		k->classes[i] = NULL;
+}
+
+static inline void copy_dept_key(struct dept_key *dst,
+				 struct dept_key *src)
+{
+	int i;
+
+	for (i = 0; i < DEPT_MAX_SUBCLASSES_CACHE; i++)
+		WRITE_ONCE(dst->classes[i], READ_ONCE(src->classes[i]));
+}
+
+void dept_map_init(struct dept_map *m, struct dept_key *k, int sub,
+		   const char *n, enum dept_type t)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	flags = dept_enter();
+
+	if (k && sub >= 0 && sub < DEPT_MAX_SUBCLASSES_USR) {
+		m->keys = k;
+		copy_dept_key(&m->keys_local, k);
+		m->sub_usr = sub;
+	} else {
+		m->keys = NULL;
+		DEPT_WARN_ON(sub < 0 || sub >= DEPT_MAX_SUBCLASSES_USR);
+	}
+
+	m->name = n;
+	m->wgen = 0U;
+	m->type = m->keys ? t : DEPT_TYPE_NO_CHECK;
+
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_map_init);
+
+void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub,
+		     const char *n)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	flags = dept_enter();
+
+	if (k && k != m->keys) {
+		m->keys = k;
+		copy_dept_key(&m->keys_local, k);
+	}
+
+	if (sub >= 0 && sub < DEPT_MAX_SUBCLASSES_USR)
+		m->sub_usr = sub;
+
+	if (n)
+		m->name = n;
+
+	m->wgen = 0;
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_map_reinit);
+
+void dept_map_nocheck(struct dept_map *m)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	flags = dept_enter();
+	m->type = DEPT_TYPE_NO_CHECK;
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_map_nocheck);
+
+static LIST_HEAD(classes);
+
+static inline bool within(const void *addr, void *start, unsigned long size)
+{
+	return addr >= start && addr < start + size;
+}
+
+void dept_free_range(void *start, unsigned int sz)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_class *c, *n;
+	unsigned long flags;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	flags = dept_enter();
+
+	/*
+	 * dept_free_range() should not fail.
+	 *
+	 * FIXME: Should be fixed if dept_free_range() causes deadlock
+	 * with dept_lock().
+	 */
+	while (unlikely(!dept_lock()));
+
+	list_for_each_entry_safe(c, n, &classes, all_node) {
+		if (!within((void *)c->key, start, sz) &&
+		    !within(c->name, start, sz))
+			continue;
+
+		hash_del_class(c);
+		disconnect_class(c);
+		list_del(&c->all_node);
+		invalidate_class(c);
+
+		/*
+		 * Actual deletion will happen on the rcu callback
+		 * that has been added in disconnect_class().
+		 */
+		del_class(c);
+	}
+	dept_unlock();
+	dept_exit(flags);
+
+	/*
+	 * Wait until even lockless hash_lookup_class() for the class
+	 * returns NULL.
+	 */
+	might_sleep();
+	synchronize_rcu();
+}
+
+static inline struct dept_key *map_key(struct dept_map *m)
+{
+	if (likely(m->keys))
+		return m->keys;
+
+	/*
+	 * Use the embedded struct dept_key instance as key in case no
+	 * key has been assigned yet e.g. statically defined lock.
+	 */
+	if (m->type != DEPT_TYPE_NO_CHECK)
+		m->keys = &m->keys_local;
+
+	return m->keys;
+}
+
+static inline int subevent_nr(enum dept_type t)
+{
+	if (t == DEPT_TYPE_RW)
+		return 2;
+	else
+		return 1;
+}
+
+static inline int map_sub(struct dept_map *m, int e)
+{
+	/*
+	 * Make the best use of >dept_key.classes caching.
+	 */
+	if (subevent_nr(m->type) > 1)
+		return m->sub_usr * DEPT_MAX_SUBCLASSES_EVT + e;
+	else
+		return m->sub_usr + e * DEPT_MAX_SUBCLASSES_USR;
+}
+
+static struct dept_class *check_new_class(struct dept_key *local,
+					  struct dept_key *k, int sub,
+					  const char *n)
+{
+	struct dept_class *c = NULL;
+
+	if (DEPT_WARN_ON(sub >= DEPT_MAX_SUBCLASSES))
+		return NULL;
+
+	if (DEPT_WARN_ON(!k))
+		return NULL;
+
+	if (sub < DEPT_MAX_SUBCLASSES_CACHE)
+		c = READ_ONCE(local->classes[sub]);
+
+	if (c)
+		return c;
+
+	if (sub < DEPT_MAX_SUBCLASSES_CACHE)
+		c = READ_ONCE(k->classes[sub]);
+
+	if (c)
+		goto caching;
+
+	c = class((unsigned long)k->subkeys + sub);
+	if (c)
+		goto caching;
+
+	if (unlikely(!dept_lock()))
+		return NULL;
+
+	c = class((unsigned long)k->subkeys + sub);
+	if (unlikely(c))
+		goto unlock;
+
+	c = new_class();
+	if (unlikely(!c))
+		goto unlock;
+
+	c->name = n;
+	c->sub = sub;
+	c->key = (unsigned long)(k->subkeys + sub);
+	hash_add_class(c);
+	list_add(&c->all_node, &classes);
+
+	if (sub < DEPT_MAX_SUBCLASSES_CACHE)
+		WRITE_ONCE(k->classes[sub], c);
+unlock:
+	dept_unlock();
+caching:
+	if (local != k && sub < DEPT_MAX_SUBCLASSES_CACHE && c)
+		WRITE_ONCE(local->classes[sub], c);
+
+	return c;
+}
+
+void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip,
+	       const char *w_fn, int ne)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int e;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (m->type == DEPT_TYPE_NO_CHECK)
+		return;
+
+	flags = dept_enter();
+
+	/*
+	 * Be as conservative as possible. In case of mulitple waits for
+	 * a single dept_map, we are going to keep only the last wait's
+	 * wgen for simplicity - keeping all wgens seems overengineering.
+	 *
+	 * Of course, it might cause missing some dependencies that
+	 * would rarely, probabily never, happen but it helps avoid
+	 * false positive report.
+	 */
+	for_each_set_bit(e, &w_f, DEPT_MAX_SUBCLASSES_EVT) {
+		struct dept_class *c;
+
+		c = check_new_class(&m->keys_local, map_key(m),
+				    map_sub(m, e), m->name);
+		if (!c)
+			continue;
+
+		WRITE_ONCE(m->wgen, add_wait(c, ip, w_fn, ne));
+	}
+
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_wait);
+
+void dept_wait_ecxt_enter(struct dept_map *m, unsigned long w_f,
+			  unsigned long e_f, unsigned long ip,
+			  const char *w_fn, const char *c_fn, const char *e_fn,
+			  int ne)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int e;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (m->type == DEPT_TYPE_NO_CHECK)
+		return;
+
+	flags = dept_enter();
+
+	/*
+	 * Be as conservative as possible. In case of mulitple waits for
+	 * a single dept_map, we are going to keep only the last wait's
+	 * wgen for simplicity - keeping all wgens seems overengineering.
+	 *
+	 * Of course, it might cause missing some dependencies that
+	 * would rarely, probabily never, happen but it helps avoid
+	 * false positive report.
+	 */
+	for_each_set_bit(e, &w_f, DEPT_MAX_SUBCLASSES_EVT) {
+		struct dept_class *c;
+
+		c = check_new_class(&m->keys_local, map_key(m),
+				    map_sub(m, e), m->name);
+		if (!c)
+			continue;
+
+		WRITE_ONCE(m->wgen, add_wait(c, ip, w_fn, ne));
+	}
+
+	for_each_set_bit(e, &e_f, DEPT_MAX_SUBCLASSES_EVT) {
+		struct dept_class *c;
+
+		c = check_new_class(&m->keys_local, map_key(m),
+				    map_sub(m, e), m->name);
+		if (!c)
+			continue;
+
+		add_ecxt(m, c, true, ip, c_fn, e_fn, ne);
+	}
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_wait_ecxt_enter);
+
+void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned long ip,
+		     const char *c_fn, const char *e_fn, int ne)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int e;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (m->type == DEPT_TYPE_NO_CHECK)
+		return;
+
+	flags = dept_enter();
+
+	for_each_set_bit(e, &e_f, DEPT_MAX_SUBCLASSES_EVT) {
+		struct dept_class *c;
+
+		c = check_new_class(&m->keys_local, map_key(m),
+				    map_sub(m, e), m->name);
+		if (!c)
+			continue;
+
+		add_ecxt(m, c, false, ip, c_fn, e_fn, ne);
+	}
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_ecxt_enter);
+
+void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip,
+		const char *e_fn)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int e;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (m->type == DEPT_TYPE_NO_CHECK)
+		return;
+
+	flags = dept_enter();
+
+	for_each_set_bit(e, &e_f, DEPT_MAX_SUBCLASSES_EVT) {
+		struct dept_class *c;
+
+		c = check_new_class(&m->keys_local, map_key(m),
+				    map_sub(m, e), m->name);
+		if (!c)
+			continue;
+
+		add_ecxt(m, c, false, 0UL, NULL, e_fn, 0);
+		do_event(m, c, READ_ONCE(m->wgen), ip);
+		pop_ecxt(m);
+	}
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_event);
+
+void dept_ecxt_exit(struct dept_map *m, unsigned long ip)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	if (m->type == DEPT_TYPE_NO_CHECK)
+		return;
+
+	flags = dept_enter();
+	pop_ecxt(m);
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_ecxt_exit);
+
+struct dept_map *dept_top_map(void)
+{
+	struct dept_task *dt = dept_task();
+	struct dept_map *m;
+	unsigned long flags;
+	int pos;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return NULL;
+
+	flags = dept_enter();
+	pos = dt->ecxt_held_pos;
+	m = pos ? (struct dept_map *)dt->ecxt_held[pos - 1].key : NULL;
+	dept_exit(flags);
+
+	return m;
+}
+EXPORT_SYMBOL_GPL(dept_top_map);
+
+void dept_task_exit(struct task_struct *t)
+{
+	struct dept_task *dt = &t->dept_task;
+	int i;
+
+	raw_local_irq_disable();
+
+	if (dt->stack)
+		put_stack(dt->stack);
+
+	for (i = 0; i < dt->ecxt_held_pos; i++)
+		put_ecxt(dt->ecxt_held[i].ecxt);
+
+	for (i = 0; i < DEPT_MAX_WAIT_HIST; i++)
+		if (dt->wait_hist[i].wait)
+			put_wait(dt->wait_hist[i].wait);
+
+	dept_off();
+
+	raw_local_irq_enable();
+}
+
+void dept_task_init(struct task_struct *t)
+{
+	memset(&t->dept_task, 0x0, sizeof(struct dept_task));
+}
+
+void dept_key_init(struct dept_key *k)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int sub;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	flags = dept_enter();
+
+	/*
+	 * dept_key_init() should not fail.
+	 *
+	 * FIXME: Should be fixed if dept_key_init() causes deadlock
+	 * with dept_lock().
+	 */
+	while (unlikely(!dept_lock()));
+
+	for (sub = 0; sub < DEPT_MAX_SUBCLASSES; sub++) {
+		struct dept_class *c;
+
+		c = class((unsigned long)k->subkeys + sub);
+		if (!c)
+			continue;
+
+		DEPT_STOP("The class(%s/%d) has not been removed.\n",
+			  c->name, sub);
+		break;
+	}
+
+	clean_dept_key(k);
+	dept_unlock();
+	dept_exit(flags);
+}
+EXPORT_SYMBOL_GPL(dept_key_init);
+
+void dept_key_destroy(struct dept_key *k)
+{
+	struct dept_task *dt = dept_task();
+	unsigned long flags;
+	int sub;
+
+	if (READ_ONCE(dept_stop) || dt->recursive)
+		return;
+
+	flags = dept_enter();
+
+	/*
+	 * dept_key_destroy() should not fail.
+	 *
+	 * FIXME: Should be fixed if dept_key_destroy() causes deadlock
+	 * with dept_lock().
+	 */
+	while (unlikely(!dept_lock()));
+
+	for (sub = 0; sub < DEPT_MAX_SUBCLASSES; sub++) {
+		struct dept_class *c;
+
+		c = class((unsigned long)k->subkeys + sub);
+		if (!c)
+			continue;
+
+		hash_del_class(c);
+		disconnect_class(c);
+		list_del(&c->all_node);
+		invalidate_class(c);
+
+		/*
+		 * Actual deletion will happen on the rcu callback
+		 * that has been added in disconnect_class().
+		 */
+		del_class(c);
+	}
+
+	dept_unlock();
+	dept_exit(flags);
+
+	/*
+	 * Wait until even lockless hash_lookup_class() for the class
+	 * returns NULL.
+	 */
+	might_sleep();
+	synchronize_rcu();
+}
+EXPORT_SYMBOL_GPL(dept_key_destroy);
+
+static void move_llist(struct llist_head *to, struct llist_head *from)
+{
+	struct llist_node *first = llist_del_all(from);
+	struct llist_node *last;
+
+	if (!first)
+		return;
+
+	for (last = first; last->next; last = last->next);
+	llist_add_batch(first, last, to);
+}
+
+static void migrate_per_cpu_pool(void)
+{
+	const int boot_cpu = 0;
+	int i;
+
+	/*
+	 * The boot CPU has been using the temperal local pool so far.
+	 * From now on that per_cpu areas have been ready, use the
+	 * per_cpu local pool instead.
+	 */
+	DEPT_WARN_ON(smp_processor_id() != boot_cpu);
+	for (i = 0; i < OBJECT_NR; i++) {
+		struct llist_head *from;
+		struct llist_head *to;
+
+		from = &pool[i].boot_pool;
+		to = per_cpu_ptr(pool[i].lpool, boot_cpu);
+		move_llist(to, from);
+	}
+}
+
+#define B2KB(B) ((B) / 1024)
+
+/*
+ * Should be called after setup_per_cpu_areas() and before no non-boot
+ * CPUs have been on.
+ */
+void __init dept_init(void)
+{
+	size_t mem_total = 0;
+
+	local_irq_disable();
+	dept_per_cpu_ready = 1;
+	migrate_per_cpu_pool();
+	local_irq_enable();
+
+#define OBJECT(id, nr) mem_total += sizeof(struct dept_##id) * nr;
+	#include "dept_object.h"
+#undef  OBJECT
+#define HASH(id, bits) mem_total += sizeof(struct hlist_head) * (1UL << bits);
+	#include "dept_hash.h"
+#undef  HASH
+
+	printk("DEPendency Tracker: Copyright (c) 2020 LG Electronics, Inc., Byungchul Park\n");
+	printk("... DEPT_MAX_STACK_ENTRY: %d\n", DEPT_MAX_STACK_ENTRY);
+	printk("... DEPT_MAX_WAIT_HIST  : %d\n", DEPT_MAX_WAIT_HIST  );
+	printk("... DEPT_MAX_ECXT_HELD  : %d\n", DEPT_MAX_ECXT_HELD  );
+	printk("... DEPT_MAX_SUBCLASSES : %d\n", DEPT_MAX_SUBCLASSES );
+#define OBJECT(id, nr)							\
+	printk("... memory used by %s: %zu KB\n",			\
+	       #id, B2KB(sizeof(struct dept_##id) * nr));
+	#include "dept_object.h"
+#undef  OBJECT
+#define HASH(id, bits)							\
+	printk("... hash list head used by %s: %zu KB\n",		\
+	       #id, B2KB(sizeof(struct hlist_head) * (1UL << bits)));
+	#include "dept_hash.h"
+#undef  HASH
+	printk("... total memory used by objects and hashs: %zu KB\n", B2KB(mem_total));
+	printk("... per task memory footprint: %zu bytes\n", sizeof(struct dept_task));
+}
diff --git a/kernel/dependency/dept_hash.h b/kernel/dependency/dept_hash.h
new file mode 100644
index 0000000..7c3bea0
--- /dev/null
+++ b/kernel/dependency/dept_hash.h
@@ -0,0 +1,9 @@
+/*
+ * HASH(id, bits)
+ *
+ * id  : Id for the object of struct dept_##id.
+ * bits: 1UL << bits is the hash table size.
+ */
+
+HASH(dep	, 12)
+HASH(class	, 12)
diff --git a/kernel/dependency/dept_object.h b/kernel/dependency/dept_object.h
new file mode 100644
index 0000000..a204f14
--- /dev/null
+++ b/kernel/dependency/dept_object.h
@@ -0,0 +1,12 @@
+/*
+ * OBJECT(id, nr)
+ *
+ * id: Id for the object of struct dept_##id.
+ * nr: # of the object that should be kept in the pool.
+ */
+
+OBJECT(dep	,1024 * 4)
+OBJECT(class	,1024 * 4)
+OBJECT(stack	,1024 * 16)
+OBJECT(ecxt	,1024 * 4)
+OBJECT(wait	,1024 * 8)
diff --git a/kernel/exit.c b/kernel/exit.c
index 733e80f..e11f154 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -854,6 +854,7 @@ void __noreturn do_exit(long code)
 	exit_tasks_rcu_finish();
 
 	lockdep_free_task(tsk);
+	dept_task_exit(tsk);
 	do_task_dead();
 }
 EXPORT_SYMBOL_GPL(do_exit);
diff --git a/kernel/fork.c b/kernel/fork.c
index da8d360..9d00042 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -95,6 +95,7 @@
 #include <linux/stackleak.h>
 #include <linux/kasan.h>
 #include <linux/scs.h>
+#include <linux/dept.h>
 
 #include <asm/pgalloc.h>
 #include <linux/uaccess.h>
@@ -2027,6 +2028,7 @@ static __latent_entropy struct task_struct *copy_process(
 #ifdef CONFIG_LOCKDEP
 	lockdep_init_task(p);
 #endif
+	dept_task_init(p);
 
 #ifdef CONFIG_DEBUG_MUTEXES
 	p->blocked_on = NULL; /* not blocked yet */
diff --git a/kernel/module.c b/kernel/module.c
index 1c5cff3..4dc6350 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2253,6 +2253,7 @@ static void free_module(struct module *mod)
 
 	/* Free lock-classes; relies on the preceding sync_rcu(). */
 	lockdep_free_key_range(mod->core_layout.base, mod->core_layout.size);
+	dept_free_range(mod->core_layout.base, mod->core_layout.size);
 
 	/* Finally, free the core (containing the module structure) */
 	module_memfree(mod->core_layout.base);
@@ -4004,6 +4005,7 @@ static int load_module(struct load_info *info, const char __user *uargs,
  free_module:
 	/* Free lock-classes; relies on the preceding sync_rcu() */
 	lockdep_free_key_range(mod->core_layout.base, mod->core_layout.size);
+	dept_free_range(mod->core_layout.base, mod->core_layout.size);
 
 	module_deallocate(mod, info);
  free_copy:
diff --git a/kernel/softirq.c b/kernel/softirq.c
index bf88d7f6..42d1f51 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -132,7 +132,7 @@ void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
 	 * Were softirqs turned off above:
 	 */
 	if (softirq_count() == (cnt & SOFTIRQ_MASK))
-		lockdep_softirqs_off(ip);
+		trace_softirqs_off_caller(ip);
 	raw_local_irq_restore(flags);
 
 	if (preempt_count() == cnt) {
@@ -153,7 +153,7 @@ static void __local_bh_enable(unsigned int cnt)
 		trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
 
 	if (softirq_count() == (cnt & SOFTIRQ_MASK))
-		lockdep_softirqs_on(_RET_IP_);
+		trace_softirqs_on_caller(_RET_IP_);
 
 	__preempt_count_sub(cnt);
 }
@@ -180,7 +180,7 @@ void __local_bh_enable_ip(unsigned long ip, unsigned int cnt)
 	 * Are softirqs going to be turned on now:
 	 */
 	if (softirq_count() == SOFTIRQ_DISABLE_OFFSET)
-		lockdep_softirqs_on(ip);
+		trace_softirqs_on_caller(ip);
 	/*
 	 * Keep preemption disabled until we are done with
 	 * softirq processing:
diff --git a/kernel/trace/trace_preemptirq.c b/kernel/trace/trace_preemptirq.c
index f493804..19cafdfb 100644
--- a/kernel/trace/trace_preemptirq.c
+++ b/kernel/trace/trace_preemptirq.c
@@ -19,6 +19,18 @@
 /* Per-cpu variable to prevent redundant calls when IRQs already off */
 static DEFINE_PER_CPU(int, tracing_irq_cpu);
 
+void trace_softirqs_on_caller(unsigned long ip)
+{
+	lockdep_softirqs_on(ip);
+	dept_enable_softirq(ip);
+}
+
+void trace_softirqs_off_caller(unsigned long ip)
+{
+	lockdep_softirqs_off(ip);
+	dept_disable_softirq(ip);
+}
+
 /*
  * Like trace_hardirqs_on() but without the lockdep invocation. This is
  * used in the low level entry code where the ordering vs. RCU is important
@@ -33,6 +45,7 @@ void trace_hardirqs_on_prepare(void)
 		tracer_hardirqs_on(CALLER_ADDR0, CALLER_ADDR1);
 		this_cpu_write(tracing_irq_cpu, 0);
 	}
+	dept_enable_hardirq(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_on_prepare);
 NOKPROBE_SYMBOL(trace_hardirqs_on_prepare);
@@ -45,6 +58,7 @@ void trace_hardirqs_on(void)
 		tracer_hardirqs_on(CALLER_ADDR0, CALLER_ADDR1);
 		this_cpu_write(tracing_irq_cpu, 0);
 	}
+	dept_enable_hardirq(CALLER_ADDR0);
 
 	lockdep_hardirqs_on_prepare(CALLER_ADDR0);
 	lockdep_hardirqs_on(CALLER_ADDR0);
@@ -66,7 +80,7 @@ void trace_hardirqs_off_finish(void)
 		if (!in_nmi())
 			trace_irq_disable(CALLER_ADDR0, CALLER_ADDR1);
 	}
-
+	dept_disable_hardirq(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_off_finish);
 NOKPROBE_SYMBOL(trace_hardirqs_off_finish);
@@ -81,6 +95,7 @@ void trace_hardirqs_off(void)
 		if (!in_nmi())
 			trace_irq_disable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
 	}
+	dept_disable_hardirq(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_off);
 NOKPROBE_SYMBOL(trace_hardirqs_off);
@@ -93,6 +108,7 @@ __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
 		tracer_hardirqs_on(CALLER_ADDR0, caller_addr);
 		this_cpu_write(tracing_irq_cpu, 0);
 	}
+	dept_enable_hardirq(CALLER_ADDR0);
 
 	lockdep_hardirqs_on_prepare(CALLER_ADDR0);
 	lockdep_hardirqs_on(CALLER_ADDR0);
@@ -110,6 +126,7 @@ __visible void trace_hardirqs_off_caller(unsigned long caller_addr)
 		if (!in_nmi())
 			trace_irq_disable_rcuidle(CALLER_ADDR0, caller_addr);
 	}
+	dept_disable_hardirq(CALLER_ADDR0);
 }
 EXPORT_SYMBOL(trace_hardirqs_off_caller);
 NOKPROBE_SYMBOL(trace_hardirqs_off_caller);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 0c781f9..0c757e8 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1146,6 +1146,21 @@ config DEBUG_PREEMPT
 
 menu "Lock Debugging (spinlocks, mutexes, etc...)"
 
+config DEPT
+	bool "Dependency tracking"
+	depends on DEBUG_KERNEL
+	select DEBUG_LOCK_ALLOC
+	select TRACE_IRQFLAGS
+	select STACKTRACE
+	select FRAME_POINTER if !MIPS && !PPC && !ARM && !S390 && !MICROBLAZE && !ARC && !X86
+	select KALLSYMS
+	select KALLSYMS_ALL
+	default n
+	help
+	  Check dependencies between wait and event and report it if
+	  deadlock possibility has been detected. Multiple reports are
+	  allowed if there are more than a single problem.
+
 config LOCK_DEBUGGING_SUPPORT
 	bool
 	depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 2/6] dept: Apply Dept to spinlock
  2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
@ 2020-11-23 11:36     ` Byungchul Park
  2020-11-23 11:36     ` [RFC 3/6] dept: Apply Dept to mutex families Byungchul Park
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 11:36 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

Makes Dept able to track dependencies by spinlock.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/dept.h             |  4 +++-
 include/linux/llist.h            |  9 +--------
 include/linux/spinlock.h         | 42 ++++++++++++++++++++++++++++++++++++----
 include/linux/spinlock_api_smp.h | 15 ++++++++++++--
 include/linux/spinlock_types.h   | 37 ++++++++++++++++++++++++++++++-----
 include/linux/types.h            |  8 ++++++++
 kernel/locking/spinlock.c        |  6 +++++-
 kernel/locking/spinlock_debug.c  |  4 +++-
 kernel/sched/core.c              |  2 ++
 9 files changed, 105 insertions(+), 22 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 7fe1e04..baf3519 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -14,7 +14,9 @@
 #ifdef CONFIG_DEPT
 
 #include <linux/list.h>
-#include <linux/llist.h>
+#include <linux/types.h>
+
+struct task_struct;
 
 #define DEPT_MAX_STACK_ENTRY		16
 #define DEPT_MAX_WAIT_HIST		16
diff --git a/include/linux/llist.h b/include/linux/llist.h
index 2e9c721..e0c88d5 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -50,14 +50,7 @@
 
 #include <linux/atomic.h>
 #include <linux/kernel.h>
-
-struct llist_head {
-	struct llist_node *first;
-};
-
-struct llist_node {
-	struct llist_node *next;
-};
+#include <linux/types.h>
 
 #define LLIST_HEAD_INIT(name)	{ NULL }
 #define LLIST_HEAD(name)	struct llist_head name = LLIST_HEAD_INIT(name)
diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index f2f12d7..4c5c76e 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -92,15 +92,47 @@
 # include <linux/spinlock_up.h>
 #endif
 
+#ifdef CONFIG_DEPT
+#define dept_spin_init(m, k, s, n)		dept_map_init(m, k, s, n, DEPT_TYPE_SPIN)
+#define dept_spin_reinit(m, k, s, n)		dept_map_reinit(m, k, s, n)
+#define dept_spin_nocheck(m)			dept_map_nocheck(m)
+#define dept_spin_lock(m, e_fn, ip)		dept_wait_ecxt_enter(m, 1UL, 1UL, ip, __func__, __func__, e_fn, 0)
+#define dept_spin_lock_nest(m, n_m, ip)		WARN_ON(dept_top_map() != (n_m))
+#define dept_spin_lock_nested(m, ne, e_fn, ip)	dept_wait_ecxt_enter(m, 1UL, 1UL, ip, __func__, __func__, e_fn, ne)
+#define dept_spin_trylock(m, e_fn, ip)		dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, 0)
+#define dept_spin_unlock(m, ip)			dept_ecxt_exit(m, ip)
+#define dept_spin_enter(m, e_fn, ip)		dept_ecxt_enter(m, 1UL, ip, "spin_lock_enter", e_fn, 0)
+#define dept_spin_exit(m, ip)			dept_ecxt_exit(m, ip)
+#define dept_spin_switch_nested(m, ne, ip)				\
+	do {								\
+		dept_ecxt_exit(m, ip);					\
+		dept_ecxt_enter(m, 1UL, ip, __func__, "spin_switch_nested", ne);\
+	} while (0)
+#else
+#define dept_spin_init(m, k, s, n)		do { (void)(n); (void)(k); } while (0)
+#define dept_spin_reinit(m, k, s, n)		do { (void)(n); (void)(k); } while (0)
+#define dept_spin_nocheck(m)			do { } while (0)
+#define dept_spin_lock(m, e_fn, ip)		do { } while (0)
+#define dept_spin_lock_nest(m, n_m, ip)		do { } while (0)
+#define dept_spin_lock_nested(m, ne, e_fn, ip)	do { } while (0)
+#define dept_spin_trylock(m, e_fn, ip)		do { } while (0)
+#define dept_spin_unlock(m, ip)			do { } while (0)
+#define dept_spin_enter(m, e_fn, ip)		do { } while (0)
+#define dept_spin_exit(m, ip)			do { } while (0)
+#define dept_spin_switch_nested(m, ne, ip)	do { } while (0)
+#endif
+
 #ifdef CONFIG_DEBUG_SPINLOCK
   extern void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
-				   struct lock_class_key *key, short inner);
+				   struct lock_class_key *key,
+				   struct dept_key *dkey, short inner);
 
 # define raw_spin_lock_init(lock)					\
 do {									\
 	static struct lock_class_key __key;				\
+	static struct dept_key __dkey;					\
 									\
-	__raw_spin_lock_init((lock), #lock, &__key, LD_WAIT_SPIN);	\
+	__raw_spin_lock_init((lock), #lock, &__key, &__dkey, LD_WAIT_SPIN); \
 } while (0)
 
 #else
@@ -231,7 +263,8 @@ static inline void do_raw_spin_unlock(raw_spinlock_t *lock) __releases(lock)
 # define raw_spin_lock_nest_lock(lock, nest_lock)			\
 	 do {								\
 		 typecheck(struct lockdep_map *, &(nest_lock)->dep_map);\
-		 _raw_spin_lock_nest_lock(lock, &(nest_lock)->dep_map);	\
+		 typecheck(struct dept_map *, &(nest_lock)->dmap);	\
+		 _raw_spin_lock_nest_lock(lock, &(nest_lock)->dep_map, &(nest_lock)->dmap); \
 	 } while (0)
 #else
 /*
@@ -334,9 +367,10 @@ static __always_inline raw_spinlock_t *spinlock_check(spinlock_t *lock)
 # define spin_lock_init(lock)					\
 do {								\
 	static struct lock_class_key __key;			\
+	static struct dept_key __dkey;				\
 								\
 	__raw_spin_lock_init(spinlock_check(lock),		\
-			     #lock, &__key, LD_WAIT_CONFIG);	\
+			     #lock, &__key, &__dkey, LD_WAIT_CONFIG); \
 } while (0)
 
 #else
diff --git a/include/linux/spinlock_api_smp.h b/include/linux/spinlock_api_smp.h
index 19a9be9..56afa5a 100644
--- a/include/linux/spinlock_api_smp.h
+++ b/include/linux/spinlock_api_smp.h
@@ -23,8 +23,8 @@
 void __lockfunc _raw_spin_lock_nested(raw_spinlock_t *lock, int subclass)
 								__acquires(lock);
 void __lockfunc
-_raw_spin_lock_nest_lock(raw_spinlock_t *lock, struct lockdep_map *map)
-								__acquires(lock);
+_raw_spin_lock_nest_lock(raw_spinlock_t *lock, struct lockdep_map *map,
+		struct dept_map *dmap)				__acquires(lock);
 void __lockfunc _raw_spin_lock_bh(raw_spinlock_t *lock)		__acquires(lock);
 void __lockfunc _raw_spin_lock_irq(raw_spinlock_t *lock)
 								__acquires(lock);
@@ -88,6 +88,7 @@ static inline int __raw_spin_trylock(raw_spinlock_t *lock)
 	preempt_disable();
 	if (do_raw_spin_trylock(lock)) {
 		spin_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+		dept_spin_trylock(&lock->dmap, "__raw_spin_unlock", _RET_IP_);
 		return 1;
 	}
 	preempt_enable();
@@ -108,6 +109,8 @@ static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock)
 	local_irq_save(flags);
 	preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_spin_lock(&lock->dmap, "__raw_spin_unlock_irqrestore", _RET_IP_);
+
 	/*
 	 * On lockdep we dont want the hand-coded irq-enable of
 	 * do_raw_spin_lock_flags() code, because lockdep assumes
@@ -126,6 +129,7 @@ static inline void __raw_spin_lock_irq(raw_spinlock_t *lock)
 	local_irq_disable();
 	preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_spin_lock(&lock->dmap, "__raw_spin_unlock_irq", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
 }
 
@@ -133,6 +137,7 @@ static inline void __raw_spin_lock_bh(raw_spinlock_t *lock)
 {
 	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_spin_lock(&lock->dmap, "__raw_spin_unlock_bh", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
 }
 
@@ -140,6 +145,7 @@ static inline void __raw_spin_lock(raw_spinlock_t *lock)
 {
 	preempt_disable();
 	spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_spin_lock(&lock->dmap, "__raw_spin_unlock", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
 }
 
@@ -148,6 +154,7 @@ static inline void __raw_spin_lock(raw_spinlock_t *lock)
 static inline void __raw_spin_unlock(raw_spinlock_t *lock)
 {
 	spin_release(&lock->dep_map, _RET_IP_);
+	dept_spin_unlock(&lock->dmap, _RET_IP_);
 	do_raw_spin_unlock(lock);
 	preempt_enable();
 }
@@ -156,6 +163,7 @@ static inline void __raw_spin_unlock_irqrestore(raw_spinlock_t *lock,
 					    unsigned long flags)
 {
 	spin_release(&lock->dep_map, _RET_IP_);
+	dept_spin_unlock(&lock->dmap, _RET_IP_);
 	do_raw_spin_unlock(lock);
 	local_irq_restore(flags);
 	preempt_enable();
@@ -164,6 +172,7 @@ static inline void __raw_spin_unlock_irqrestore(raw_spinlock_t *lock,
 static inline void __raw_spin_unlock_irq(raw_spinlock_t *lock)
 {
 	spin_release(&lock->dep_map, _RET_IP_);
+	dept_spin_unlock(&lock->dmap, _RET_IP_);
 	do_raw_spin_unlock(lock);
 	local_irq_enable();
 	preempt_enable();
@@ -172,6 +181,7 @@ static inline void __raw_spin_unlock_irq(raw_spinlock_t *lock)
 static inline void __raw_spin_unlock_bh(raw_spinlock_t *lock)
 {
 	spin_release(&lock->dep_map, _RET_IP_);
+	dept_spin_unlock(&lock->dmap, _RET_IP_);
 	do_raw_spin_unlock(lock);
 	__local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
 }
@@ -181,6 +191,7 @@ static inline int __raw_spin_trylock_bh(raw_spinlock_t *lock)
 	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
 	if (do_raw_spin_trylock(lock)) {
 		spin_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+		dept_spin_trylock(&lock->dmap, "__raw_spin_unlock_bh", _RET_IP_);
 		return 1;
 	}
 	__local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
diff --git a/include/linux/spinlock_types.h b/include/linux/spinlock_types.h
index b981caa..f180d71 100644
--- a/include/linux/spinlock_types.h
+++ b/include/linux/spinlock_types.h
@@ -16,6 +16,7 @@
 #endif
 
 #include <linux/lockdep_types.h>
+#include <linux/dept.h>
 
 typedef struct raw_spinlock {
 	arch_spinlock_t raw_lock;
@@ -26,6 +27,7 @@
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map dep_map;
 #endif
+	struct dept_map dmap;
 } raw_spinlock_t;
 
 #define SPINLOCK_MAGIC		0xdead4ead
@@ -37,17 +39,33 @@
 	.dep_map = {					\
 		.name = #lockname,			\
 		.wait_type_inner = LD_WAIT_SPIN,	\
-	}
+	},
 # define SPIN_DEP_MAP_INIT(lockname)			\
 	.dep_map = {					\
 		.name = #lockname,			\
 		.wait_type_inner = LD_WAIT_CONFIG,	\
-	}
+	},
 #else
 # define RAW_SPIN_DEP_MAP_INIT(lockname)
 # define SPIN_DEP_MAP_INIT(lockname)
 #endif
 
+#ifdef CONFIG_DEPT
+# define RAW_SPIN_DMAP_INIT(lockname)			\
+	.dmap = {					\
+		.name = #lockname,			\
+		.type = DEPT_TYPE_SPIN,			\
+	},
+# define SPIN_DMAP_INIT(lockname)			\
+	.dmap = {					\
+		.name = #lockname,			\
+		.type = DEPT_TYPE_SPIN,			\
+	},
+#else
+# define RAW_SPIN_DMAP_INIT(lockname)
+# define SPIN_DMAP_INIT(lockname)
+#endif
+
 #ifdef CONFIG_DEBUG_SPINLOCK
 # define SPIN_DEBUG_INIT(lockname)		\
 	.magic = SPINLOCK_MAGIC,		\
@@ -61,7 +79,8 @@
 	{					\
 	.raw_lock = __ARCH_SPIN_LOCK_UNLOCKED,	\
 	SPIN_DEBUG_INIT(lockname)		\
-	RAW_SPIN_DEP_MAP_INIT(lockname) }
+	RAW_SPIN_DEP_MAP_INIT(lockname)		\
+	RAW_SPIN_DMAP_INIT(lockname) }
 
 #define __RAW_SPIN_LOCK_UNLOCKED(lockname)	\
 	(raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
@@ -73,11 +92,18 @@
 		struct raw_spinlock rlock;
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define LOCK_PADSIZE (offsetof(struct raw_spinlock, dep_map))
+#define LOCK_PADSIZE (offsetof(struct raw_spinlock, dep_map))
 		struct {
 			u8 __padding[LOCK_PADSIZE];
 			struct lockdep_map dep_map;
 		};
+#undef LOCK_PADSIZE
+#define LOCK_PADSIZE (offsetof(struct raw_spinlock, dmap))
+		struct {
+			u8 __padding_for_dept[LOCK_PADSIZE];
+			struct dept_map dmap;
+		};
+#undef LOCK_PADSIZE
 #endif
 	};
 } spinlock_t;
@@ -86,7 +112,8 @@
 	{					\
 	.raw_lock = __ARCH_SPIN_LOCK_UNLOCKED,	\
 	SPIN_DEBUG_INIT(lockname)		\
-	SPIN_DEP_MAP_INIT(lockname) }
+	SPIN_DEP_MAP_INIT(lockname)		\
+	SPIN_DMAP_INIT(lockname) }
 
 #define __SPIN_LOCK_INITIALIZER(lockname) \
 	{ { .rlock = ___SPIN_LOCK_INITIALIZER(lockname) } }
diff --git a/include/linux/types.h b/include/linux/types.h
index a147977..0b058f5 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -187,6 +187,14 @@ struct hlist_node {
 	struct hlist_node *next, **pprev;
 };
 
+struct llist_head {
+	struct llist_node *first;
+};
+
+struct llist_node {
+	struct llist_node *next;
+};
+
 struct ustat {
 	__kernel_daddr_t	f_tfree;
 	__kernel_ino_t		f_tinode;
diff --git a/kernel/locking/spinlock.c b/kernel/locking/spinlock.c
index 0ff0838..e838e0e 100644
--- a/kernel/locking/spinlock.c
+++ b/kernel/locking/spinlock.c
@@ -359,6 +359,7 @@ void __lockfunc _raw_spin_lock_nested(raw_spinlock_t *lock, int subclass)
 {
 	preempt_disable();
 	spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
+	dept_spin_lock_nested(&lock->dmap, subclass, "_raw_spin_unlock", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
 }
 EXPORT_SYMBOL(_raw_spin_lock_nested);
@@ -371,6 +372,7 @@ unsigned long __lockfunc _raw_spin_lock_irqsave_nested(raw_spinlock_t *lock,
 	local_irq_save(flags);
 	preempt_disable();
 	spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
+	dept_spin_lock_nested(&lock->dmap, subclass, "_raw_spin_unlock_irqrestore", _RET_IP_);
 	LOCK_CONTENDED_FLAGS(lock, do_raw_spin_trylock, do_raw_spin_lock,
 				do_raw_spin_lock_flags, &flags);
 	return flags;
@@ -378,10 +380,12 @@ unsigned long __lockfunc _raw_spin_lock_irqsave_nested(raw_spinlock_t *lock,
 EXPORT_SYMBOL(_raw_spin_lock_irqsave_nested);
 
 void __lockfunc _raw_spin_lock_nest_lock(raw_spinlock_t *lock,
-				     struct lockdep_map *nest_lock)
+				     struct lockdep_map *nest_lock,
+				     struct dept_map *nest_dmap)
 {
 	preempt_disable();
 	spin_acquire_nest(&lock->dep_map, 0, 0, nest_lock, _RET_IP_);
+	dept_spin_lock_nest(&lock->dmap, nest_dmap, _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
 }
 EXPORT_SYMBOL(_raw_spin_lock_nest_lock);
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index b9d9308..03e6812 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -14,7 +14,8 @@
 #include <linux/export.h>
 
 void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
-			  struct lock_class_key *key, short inner)
+			  struct lock_class_key *key,
+			  struct dept_key *dkey, short inner)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	/*
@@ -22,6 +23,7 @@ void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
 	 */
 	debug_check_no_locks_freed((void *)lock, sizeof(*lock));
 	lockdep_init_map_wait(&lock->dep_map, name, key, 0, inner);
+	dept_spin_init(&lock->dmap, dkey, 0, name);
 #endif
 	lock->raw_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 	lock->magic = SPINLOCK_MAGIC;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2d95dc3..f11ef3d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3500,6 +3500,7 @@ static inline void finish_task(struct task_struct *prev)
 	 */
 	rq_unpin_lock(rq, rf);
 	spin_release(&rq->lock.dep_map, _THIS_IP_);
+	dept_spin_exit(&rq->lock.dmap, _THIS_IP_);
 #ifdef CONFIG_DEBUG_SPINLOCK
 	/* this is a valid case when another task releases the spinlock */
 	rq->lock.owner = next;
@@ -3514,6 +3515,7 @@ static inline void finish_lock_switch(struct rq *rq)
 	 * prev into current:
 	 */
 	spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
+	dept_spin_enter(&rq->lock.dmap, "raw_spin_inlock_irq", _THIS_IP_);
 	raw_spin_unlock_irq(&rq->lock);
 }
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 3/6] dept: Apply Dept to mutex families
  2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
  2020-11-23 11:36     ` [RFC 2/6] dept: Apply Dept to spinlock Byungchul Park
@ 2020-11-23 11:36     ` Byungchul Park
  2020-11-23 11:36     ` [RFC 4/6] dept: Apply Dept to rwlock Byungchul Park
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 11:36 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

Makes Dept able to track dependencies by mutex families.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 drivers/base/bus.c                            |  5 +-
 drivers/base/class.c                          |  5 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  5 +-
 drivers/gpu/drm/i915/i915_active.c            |  5 +-
 drivers/gpu/drm/i915/intel_wakeref.c          |  5 +-
 drivers/gpu/drm/nouveau/nvkm/core/subdev.c    |  5 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c |  5 +-
 drivers/net/team/team.c                       |  5 +-
 include/linux/mutex.h                         | 45 ++++++++++++--
 include/linux/rtmutex.h                       | 18 ++++--
 include/linux/ww_mutex.h                      | 10 ++-
 kernel/locking/mutex-debug.c                  |  4 +-
 kernel/locking/mutex-debug.h                  |  3 +-
 kernel/locking/mutex.c                        | 88 ++++++++++++++++++++-------
 kernel/locking/mutex.h                        |  2 +-
 kernel/locking/rtmutex-debug.c                |  4 +-
 kernel/locking/rtmutex-debug.h                |  2 +-
 kernel/locking/rtmutex.c                      | 23 +++++--
 kernel/locking/rtmutex.h                      |  2 +-
 lib/locking-selftest.c                        | 12 ++--
 net/netfilter/ipvs/ip_vs_sync.c               |  5 +-
 21 files changed, 200 insertions(+), 58 deletions(-)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 886e905..12288e3 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -845,7 +845,10 @@ int bus_register(struct bus_type *bus)
 	}
 
 	INIT_LIST_HEAD(&priv->interfaces);
-	__mutex_init(&priv->mutex, "subsys mutex", key);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&priv->mutex, "subsys mutex", key, NULL);
 	klist_init(&priv->klist_devices, klist_devices_get, klist_devices_put);
 	klist_init(&priv->klist_drivers, NULL, NULL);
 
diff --git a/drivers/base/class.c b/drivers/base/class.c
index bcd410e..c52ee21a 100644
--- a/drivers/base/class.c
+++ b/drivers/base/class.c
@@ -163,7 +163,10 @@ int __class_register(struct class *cls, struct lock_class_key *key)
 	klist_init(&cp->klist_devices, klist_class_dev_get, klist_class_dev_put);
 	INIT_LIST_HEAD(&cp->interfaces);
 	kset_init(&cp->glue_dirs);
-	__mutex_init(&cp->mutex, "subsys mutex", key);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&cp->mutex, "subsys mutex", key, NULL);
 	error = kobject_set_name(&cp->subsys.kobj, "%s", cls->name);
 	if (error) {
 		kfree(cp);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index c8421fd..0ecde29 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -53,7 +53,10 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops,
 			  struct lock_class_key *key)
 {
-	__mutex_init(&obj->mm.lock, ops->name ?: "obj->mm.lock", key);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&obj->mm.lock, ops->name ?: "obj->mm.lock", key, NULL);
 
 	spin_lock_init(&obj->vma.lock);
 	INIT_LIST_HEAD(&obj->vma.list);
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index d960d0b..49a6972 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -297,7 +297,10 @@ void __i915_active_init(struct i915_active *ref,
 
 	init_llist_head(&ref->preallocated_barriers);
 	atomic_set(&ref->count, 0);
-	__mutex_init(&ref->mutex, "i915_active", mkey);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&ref->mutex, "i915_active", mkey, NULL);
 	__i915_active_fence_init(&ref->excl, NULL, excl_retire);
 	INIT_WORK(&ref->work, active_work);
 #if IS_ENABLED(CONFIG_LOCKDEP)
diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
index dfd87d0..1fcbecf 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -101,7 +101,10 @@ void __intel_wakeref_init(struct intel_wakeref *wf,
 	wf->rpm = rpm;
 	wf->ops = ops;
 
-	__mutex_init(&wf->mutex, "wakeref.mutex", &key->mutex);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&wf->mutex, "wakeref.mutex", &key->mutex, NULL);
 	atomic_set(&wf->count, 0);
 	wf->wakeref = 0;
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
index 49d468b..f968d62 100644
--- a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
+++ b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
@@ -218,7 +218,10 @@
 	subdev->device = device;
 	subdev->index = index;
 
-	__mutex_init(&subdev->mutex, name, &nvkm_subdev_lock_class[index]);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&subdev->mutex, name, &nvkm_subdev_lock_class[index], NULL);
 	subdev->debug = nvkm_dbgopt(device->dbgopt, name);
 }
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
index 710f3f8..f5602c6a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c
@@ -1048,7 +1048,10 @@ struct nvkm_vma *
 	vmm->debug = mmu->subdev.debug;
 	kref_init(&vmm->kref);
 
-	__mutex_init(&vmm->mutex, "&vmm->mutex", key ? key : &_key);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&vmm->mutex, "&vmm->mutex", key ? key : &_key, NULL);
 
 	/* Locate the smallest page size supported by the backend, it will
 	 * have the the deepest nesting of page tables.
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index bcc4a4c..0292404 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1646,7 +1646,10 @@ static int team_init(struct net_device *dev)
 	netif_carrier_off(dev);
 
 	lockdep_register_key(&team->team_lock_key);
-	__mutex_init(&team->lock, "team->team_lock_key", &team->team_lock_key);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&team->lock, "team->team_lock_key", &team->team_lock_key, NULL);
 	netdev_lockdep_set_classes(dev);
 
 	return 0;
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index dcd185c..14018db 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -63,6 +63,7 @@ struct mutex {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map	dep_map;
 #endif
+	struct dept_map		dmap;
 };
 
 struct ww_class;
@@ -76,6 +77,26 @@ struct ww_mutex {
 #endif
 };
 
+#ifdef CONFIG_DEPT
+#define dept_mutex_init(m, k, s, n)		dept_map_init(m, k, s, n, DEPT_TYPE_MUTEX)
+#define dept_mutex_reinit(m, k, s, n)		dept_map_reinit(m, k, s, n)
+#define dept_mutex_nocheck(m)			dept_map_nocheck(m)
+#define dept_mutex_lock(m, e_fn, ip)		dept_wait_ecxt_enter(m, 1UL, 1UL, ip, __func__, __func__, e_fn, 0)
+#define dept_mutex_lock_nest(m, n_m, ip)	WARN_ON(dept_top_map() != (n_m))
+#define dept_mutex_lock_nested(m, ne, e_fn, ip)	dept_wait_ecxt_enter(m, 1UL, 1UL, ip, __func__, __func__, e_fn, ne)
+#define dept_mutex_trylock(m, e_fn, ip)		dept_ecxt_enter(m, 1UL, ip, __func__, e_fn, 0)
+#define dept_mutex_unlock(m, ip)		dept_ecxt_exit(m, ip)
+#else
+#define dept_mutex_init(m, k, s, n)		do { (void)(n); (void)(k); } while (0)
+#define dept_mutex_reinit(m, k, s, n)		do { (void)(n); (void)(k); } while (0)
+#define dept_mutex_nocheck(m)			do { } while (0)
+#define dept_mutex_lock(m, e_fn, ip)		do { } while (0)
+#define dept_mutex_lock_nest(m, n_m, ip)	do { } while (0)
+#define dept_mutex_lock_nested(m, ne, e_fn, ip)	do { } while (0)
+#define dept_mutex_trylock(m, e_fn, ip)		do { } while (0)
+#define dept_mutex_unlock(m, ip)		do { } while (0)
+#endif
+
 /*
  * This is the control structure for tasks blocked on mutex,
  * which resides on the blocked task's kernel stack:
@@ -115,8 +136,9 @@ static inline void mutex_destroy(struct mutex *lock) {}
 #define mutex_init(mutex)						\
 do {									\
 	static struct lock_class_key __key;				\
+	static struct dept_key __dkey;					\
 									\
-	__mutex_init((mutex), #mutex, &__key);				\
+	__mutex_init((mutex), #mutex, &__key, &__dkey);			\
 } while (0)
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -129,18 +151,30 @@ static inline void mutex_destroy(struct mutex *lock) {}
 # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
 #endif
 
+#ifdef CONFIG_DEPT
+# define __DMAP_MUTEX_INITIALIZER(lockname)			\
+		, .dmap = {					\
+			.name = #lockname,			\
+			.type = DEPT_TYPE_MUTEX,		\
+		}
+#else
+# define __DMAP_MUTEX_INITIALIZER(lockname)
+#endif
+
 #define __MUTEX_INITIALIZER(lockname) \
 		{ .owner = ATOMIC_LONG_INIT(0) \
 		, .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
 		, .wait_list = LIST_HEAD_INIT(lockname.wait_list) \
 		__DEBUG_MUTEX_INITIALIZER(lockname) \
-		__DEP_MAP_MUTEX_INITIALIZER(lockname) }
+		__DEP_MAP_MUTEX_INITIALIZER(lockname) \
+		__DMAP_MUTEX_INITIALIZER(lockname) }
 
 #define DEFINE_MUTEX(mutexname) \
 	struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
 
 extern void __mutex_init(struct mutex *lock, const char *name,
-			 struct lock_class_key *key);
+			 struct lock_class_key *key,
+			 struct dept_key *dkey);
 
 /**
  * mutex_is_locked - is the mutex locked
@@ -156,7 +190,7 @@ extern void __mutex_init(struct mutex *lock, const char *name,
  */
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
-extern void _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock);
+extern void _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock, struct dept_map *nest_dmap);
 
 extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
 					unsigned int subclass);
@@ -172,7 +206,8 @@ extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
 #define mutex_lock_nest_lock(lock, nest_lock)				\
 do {									\
 	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
-	_mutex_lock_nest_lock(lock, &(nest_lock)->dep_map);		\
+	typecheck(struct dept_map *, &(nest_lock)->dmap);	\
+	_mutex_lock_nest_lock(lock, &(nest_lock)->dep_map, &(nest_lock)->dmap); \
 } while (0)
 
 #else
diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 6fd615a..cb81460 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -40,6 +40,7 @@ struct rt_mutex {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map	dep_map;
 #endif
+	struct dept_map		dmap;
 };
 
 struct rt_mutex_waiter;
@@ -65,13 +66,14 @@ struct rt_mutex {
 # define rt_mutex_init(mutex) \
 do { \
 	static struct lock_class_key __key; \
-	__rt_mutex_init(mutex, __func__, &__key); \
+	static struct dept_key __dkey; \
+	__rt_mutex_init(mutex, __func__, &__key, &__dkey); \
 } while (0)
 
  extern void rt_mutex_debug_task_free(struct task_struct *tsk);
 #else
 # define __DEBUG_RT_MUTEX_INITIALIZER(mutexname)
-# define rt_mutex_init(mutex)			__rt_mutex_init(mutex, NULL, NULL)
+# define rt_mutex_init(mutex)			__rt_mutex_init(mutex, NULL, NULL, NULL)
 # define rt_mutex_debug_task_free(t)			do { } while (0)
 #endif
 
@@ -82,12 +84,20 @@ struct rt_mutex {
 #define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)
 #endif
 
+#ifdef CONFIG_DEPT
+#define __DMAP_RT_MUTEX_INITIALIZER(mutexname) \
+	, .dmap = { .name = #mutexname, .type = DEPT_TYPE_MUTEX }
+#else
+#define __DMAP_RT_MUTEX_INITIALIZER(mutexname)
+#endif
+
 #define __RT_MUTEX_INITIALIZER(mutexname) \
 	{ .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(mutexname.wait_lock) \
 	, .waiters = RB_ROOT_CACHED \
 	, .owner = NULL \
 	__DEBUG_RT_MUTEX_INITIALIZER(mutexname) \
-	__DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)}
+	__DEP_MAP_RT_MUTEX_INITIALIZER(mutexname) \
+	__DMAP_RT_MUTEX_INITIALIZER(mutexname) }
 
 #define DEFINE_RT_MUTEX(mutexname) \
 	struct rt_mutex mutexname = __RT_MUTEX_INITIALIZER(mutexname)
@@ -103,7 +113,7 @@ static inline int rt_mutex_is_locked(struct rt_mutex *lock)
 	return lock->owner != NULL;
 }
 
-extern void __rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_class_key *key);
+extern void __rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_class_key *key, struct dept_key *dkey);
 extern void rt_mutex_destroy(struct rt_mutex *lock);
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
index 850424e..8304e49 100644
--- a/include/linux/ww_mutex.h
+++ b/include/linux/ww_mutex.h
@@ -23,6 +23,8 @@ struct ww_class {
 	atomic_long_t stamp;
 	struct lock_class_key acquire_key;
 	struct lock_class_key mutex_key;
+	struct dept_key acquire_dkey;
+	struct dept_key mutex_dkey;
 	const char *acquire_name;
 	const char *mutex_name;
 	unsigned int is_wait_die;
@@ -42,6 +44,7 @@ struct ww_acquire_ctx {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map dep_map;
 #endif
+	struct dept_map dmap;
 #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
 	unsigned int deadlock_inject_interval;
 	unsigned int deadlock_inject_countdown;
@@ -87,7 +90,8 @@ struct ww_acquire_ctx {
 static inline void ww_mutex_init(struct ww_mutex *lock,
 				 struct ww_class *ww_class)
 {
-	__mutex_init(&lock->base, ww_class->mutex_name, &ww_class->mutex_key);
+	__mutex_init(&lock->base, ww_class->mutex_name, &ww_class->mutex_key,
+			&ww_class->mutex_dkey);
 	lock->ctx = NULL;
 #ifdef CONFIG_DEBUG_MUTEXES
 	lock->ww_class = ww_class;
@@ -136,6 +140,9 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
 	lockdep_init_map(&ctx->dep_map, ww_class->acquire_name,
 			 &ww_class->acquire_key, 0);
 	mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_);
+	dept_mutex_init(&ctx->dmap, &ww_class->acquire_dkey, 0,
+			ww_class->acquire_name);
+	dept_mutex_lock(&ctx->dmap, "ww_acquire_fini", _RET_IP_);
 #endif
 #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
 	ctx->deadlock_inject_interval = 1;
@@ -175,6 +182,7 @@ static inline void ww_acquire_fini(struct ww_acquire_ctx *ctx)
 {
 #ifdef CONFIG_DEBUG_MUTEXES
 	mutex_release(&ctx->dep_map, _THIS_IP_);
+	dept_mutex_unlock(&ctx->dmap, _THIS_IP_);
 
 	DEBUG_LOCKS_WARN_ON(ctx->acquired);
 	if (!IS_ENABLED(CONFIG_PROVE_LOCKING))
diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
index a7276aa..dae269e 100644
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -78,7 +78,8 @@ void debug_mutex_unlock(struct mutex *lock)
 }
 
 void debug_mutex_init(struct mutex *lock, const char *name,
-		      struct lock_class_key *key)
+		      struct lock_class_key *key,
+		      struct dept_key *dkey)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	/*
@@ -86,6 +87,7 @@ void debug_mutex_init(struct mutex *lock, const char *name,
 	 */
 	debug_check_no_locks_freed((void *)lock, sizeof(*lock));
 	lockdep_init_map_wait(&lock->dep_map, name, key, 0, LD_WAIT_SLEEP);
+	dept_mutex_init(&lock->dmap, dkey, 0, name);
 #endif
 	lock->magic = lock;
 }
diff --git a/kernel/locking/mutex-debug.h b/kernel/locking/mutex-debug.h
index 1edd3f4..153c680 100644
--- a/kernel/locking/mutex-debug.h
+++ b/kernel/locking/mutex-debug.h
@@ -26,4 +26,5 @@ extern void mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter,
 				struct task_struct *task);
 extern void debug_mutex_unlock(struct mutex *lock);
 extern void debug_mutex_init(struct mutex *lock, const char *name,
-			     struct lock_class_key *key);
+			     struct lock_class_key *key,
+			     struct dept_key *dkey);
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 5352ce5..27f03ad 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -37,7 +37,8 @@
 #endif
 
 void
-__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
+__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key,
+		struct dept_key *dkey)
 {
 	atomic_long_set(&lock->owner, 0);
 	spin_lock_init(&lock->wait_lock);
@@ -46,7 +47,7 @@
 	osq_lock_init(&lock->osq);
 #endif
 
-	debug_mutex_init(lock, name, key);
+	debug_mutex_init(lock, name, key, dkey);
 }
 EXPORT_SYMBOL(__mutex_init);
 
@@ -1098,46 +1099,82 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
 
 static int __sched
 __mutex_lock(struct mutex *lock, long state, unsigned int subclass,
-	     struct lockdep_map *nest_lock, unsigned long ip)
+	     struct lockdep_map *nest_lock, struct dept_map *nest_dmap,
+	     unsigned long ip)
 {
-	return __mutex_lock_common(lock, state, subclass, nest_lock, ip, NULL, false);
+	int ret;
+
+	/*
+	 * TODO: Wrong dependency between this mutex and inner spinlock
+	 * inside __mutex_lock_common() will be added into Dept. For now,
+	 * let's allow it for simplicity.
+	 */
+	if (nest_dmap)
+		dept_mutex_lock_nest(&lock->dmap, nest_dmap, ip);
+	else
+		dept_mutex_lock_nested(&lock->dmap, subclass, "__mutex_unlock", ip);
+
+	ret = __mutex_lock_common(lock, state, subclass, nest_lock, ip, NULL, false);
+
+	if (!nest_dmap && ret)
+		dept_mutex_unlock(&lock->dmap, _THIS_IP_);
+
+	return ret;
 }
 
 static int __sched
 __ww_mutex_lock(struct mutex *lock, long state, unsigned int subclass,
-		struct lockdep_map *nest_lock, unsigned long ip,
-		struct ww_acquire_ctx *ww_ctx)
+		struct lockdep_map *nest_lock, struct dept_map *nest_dmap,
+		unsigned long ip, struct ww_acquire_ctx *ww_ctx)
 {
-	return __mutex_lock_common(lock, state, subclass, nest_lock, ip, ww_ctx, true);
+	int ret;
+
+	/*
+	 * TODO: Wrong dependency between this mutex and inner spinlock
+	 * inside __mutex_lock_common() will be added into Dept. For now,
+	 * let's allow it for simplicity.
+	 */
+	if (nest_dmap)
+		dept_mutex_lock_nest(&lock->dmap, nest_dmap, ip);
+	else
+		dept_mutex_lock_nested(&lock->dmap, subclass, "__mutex_unlock", ip);
+
+	ret = __mutex_lock_common(lock, state, subclass, nest_lock, ip, ww_ctx, true);
+
+	if (!nest_dmap && ret)
+		dept_mutex_unlock(&lock->dmap, _THIS_IP_);
+
+	return ret;
 }
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 void __sched
 mutex_lock_nested(struct mutex *lock, unsigned int subclass)
 {
-	__mutex_lock(lock, TASK_UNINTERRUPTIBLE, subclass, NULL, _RET_IP_);
+	__mutex_lock(lock, TASK_UNINTERRUPTIBLE, subclass, NULL, NULL, _RET_IP_);
 }
 
 EXPORT_SYMBOL_GPL(mutex_lock_nested);
 
 void __sched
-_mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest)
+_mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest,
+		struct dept_map *nest_dmap)
 {
-	__mutex_lock(lock, TASK_UNINTERRUPTIBLE, 0, nest, _RET_IP_);
+	__mutex_lock(lock, TASK_UNINTERRUPTIBLE, 0, nest, nest_dmap, _RET_IP_);
 }
 EXPORT_SYMBOL_GPL(_mutex_lock_nest_lock);
 
 int __sched
 mutex_lock_killable_nested(struct mutex *lock, unsigned int subclass)
 {
-	return __mutex_lock(lock, TASK_KILLABLE, subclass, NULL, _RET_IP_);
+	return __mutex_lock(lock, TASK_KILLABLE, subclass, NULL, NULL, _RET_IP_);
 }
 EXPORT_SYMBOL_GPL(mutex_lock_killable_nested);
 
 int __sched
 mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
 {
-	return __mutex_lock(lock, TASK_INTERRUPTIBLE, subclass, NULL, _RET_IP_);
+	return __mutex_lock(lock, TASK_INTERRUPTIBLE, subclass, NULL, NULL, _RET_IP_);
 }
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
 
@@ -1145,12 +1182,16 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
 mutex_lock_io_nested(struct mutex *lock, unsigned int subclass)
 {
 	int token;
+	int ret;
 
 	might_sleep();
 
 	token = io_schedule_prepare();
-	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
+	dept_mutex_lock_nested(&lock->dmap, subclass, "mutex_unlock", _RET_IP_);
+	ret = __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
 			    subclass, NULL, _RET_IP_, NULL, 0);
+	if (ret)
+		dept_mutex_unlock(&lock->dmap, _THIS_IP_);
 	io_schedule_finish(token);
 }
 EXPORT_SYMBOL_GPL(mutex_lock_io_nested);
@@ -1188,7 +1229,8 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
 
 	might_sleep();
 	ret =  __ww_mutex_lock(&lock->base, TASK_UNINTERRUPTIBLE,
-			       0, ctx ? &ctx->dep_map : NULL, _RET_IP_,
+			       0, ctx ? &ctx->dep_map : NULL,
+			       ctx ? &ctx->dmap : NULL, _RET_IP_,
 			       ctx);
 	if (!ret && ctx && ctx->acquired > 1)
 		return ww_mutex_deadlock_injection(lock, ctx);
@@ -1204,7 +1246,8 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
 
 	might_sleep();
 	ret = __ww_mutex_lock(&lock->base, TASK_INTERRUPTIBLE,
-			      0, ctx ? &ctx->dep_map : NULL, _RET_IP_,
+			      0, ctx ? &ctx->dep_map : NULL,
+			      ctx ? &ctx->dmap : NULL, _RET_IP_,
 			      ctx);
 
 	if (!ret && ctx && ctx->acquired > 1)
@@ -1226,6 +1269,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	unsigned long owner;
 
 	mutex_release(&lock->dep_map, ip);
+	dept_mutex_unlock(&lock->dmap, ip);
 
 	/*
 	 * Release the lock before (potentially) taking the spinlock such that
@@ -1361,26 +1405,26 @@ void __sched mutex_lock_io(struct mutex *lock)
 static noinline void __sched
 __mutex_lock_slowpath(struct mutex *lock)
 {
-	__mutex_lock(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
+	__mutex_lock(lock, TASK_UNINTERRUPTIBLE, 0, NULL, NULL, _RET_IP_);
 }
 
 static noinline int __sched
 __mutex_lock_killable_slowpath(struct mutex *lock)
 {
-	return __mutex_lock(lock, TASK_KILLABLE, 0, NULL, _RET_IP_);
+	return __mutex_lock(lock, TASK_KILLABLE, 0, NULL, NULL, _RET_IP_);
 }
 
 static noinline int __sched
 __mutex_lock_interruptible_slowpath(struct mutex *lock)
 {
-	return __mutex_lock(lock, TASK_INTERRUPTIBLE, 0, NULL, _RET_IP_);
+	return __mutex_lock(lock, TASK_INTERRUPTIBLE, 0, NULL, NULL, _RET_IP_);
 }
 
 static noinline int __sched
 __ww_mutex_lock_slowpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
 	return __ww_mutex_lock(&lock->base, TASK_UNINTERRUPTIBLE, 0, NULL,
-			       _RET_IP_, ctx);
+			       NULL, _RET_IP_, ctx);
 }
 
 static noinline int __sched
@@ -1388,7 +1432,7 @@ void __sched mutex_lock_io(struct mutex *lock)
 					    struct ww_acquire_ctx *ctx)
 {
 	return __ww_mutex_lock(&lock->base, TASK_INTERRUPTIBLE, 0, NULL,
-			       _RET_IP_, ctx);
+			       NULL, _RET_IP_, ctx);
 }
 
 #endif
@@ -1416,8 +1460,10 @@ int __sched mutex_trylock(struct mutex *lock)
 #endif
 
 	locked = __mutex_trylock(lock);
-	if (locked)
+	if (locked) {
 		mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+		dept_mutex_trylock(&lock->dmap, "mutex_unlock", _RET_IP_);
+	}
 
 	return locked;
 }
diff --git a/kernel/locking/mutex.h b/kernel/locking/mutex.h
index 1c2287d..8b3eabb 100644
--- a/kernel/locking/mutex.h
+++ b/kernel/locking/mutex.h
@@ -17,7 +17,7 @@
 #define debug_mutex_free_waiter(waiter)			do { } while (0)
 #define debug_mutex_add_waiter(lock, waiter, ti)	do { } while (0)
 #define debug_mutex_unlock(lock)			do { } while (0)
-#define debug_mutex_init(lock, name, key)		do { } while (0)
+#define debug_mutex_init(lock, name, key, dkey)		do { } while (0)
 
 static inline void
 debug_mutex_lock_common(struct mutex *lock, struct mutex_waiter *waiter)
diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c
index 36e6910..1c35321 100644
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -167,7 +167,8 @@ void debug_rt_mutex_free_waiter(struct rt_mutex_waiter *waiter)
 	memset(waiter, 0x22, sizeof(*waiter));
 }
 
-void debug_rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_class_key *key)
+void debug_rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_class_key *key,
+			 struct dept_key *dkey)
 {
 	/*
 	 * Make sure we are not reinitializing a held lock:
@@ -177,6 +178,7 @@ void debug_rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_cl
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	lockdep_init_map(&lock->dep_map, name, key, 0);
+	dept_mutex_init(&lock->dmap, dkey, 0, name);
 #endif
 }
 
diff --git a/kernel/locking/rtmutex-debug.h b/kernel/locking/rtmutex-debug.h
index fc54971..4e88416 100644
--- a/kernel/locking/rtmutex-debug.h
+++ b/kernel/locking/rtmutex-debug.h
@@ -12,7 +12,7 @@
 
 extern void debug_rt_mutex_init_waiter(struct rt_mutex_waiter *waiter);
 extern void debug_rt_mutex_free_waiter(struct rt_mutex_waiter *waiter);
-extern void debug_rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_class_key *key);
+extern void debug_rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_class_key *key, struct dept_key *dkey);
 extern void debug_rt_mutex_lock(struct rt_mutex *lock);
 extern void debug_rt_mutex_unlock(struct rt_mutex *lock);
 extern void debug_rt_mutex_proxy_lock(struct rt_mutex *lock,
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index cfdd5b9..893629c 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1467,6 +1467,7 @@ static inline void __rt_mutex_lock(struct rt_mutex *lock, unsigned int subclass)
 	might_sleep();
 
 	mutex_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
+	dept_mutex_lock_nested(&lock->dmap, subclass, "__rt_mutex_unlock", _RET_IP_);
 	rt_mutex_fastlock(lock, TASK_UNINTERRUPTIBLE, rt_mutex_slowlock);
 }
 
@@ -1513,9 +1514,12 @@ int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock)
 	might_sleep();
 
 	mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_mutex_lock(&lock->dmap, "rt_mutex_unlock", _RET_IP_);
 	ret = rt_mutex_fastlock(lock, TASK_INTERRUPTIBLE, rt_mutex_slowlock);
-	if (ret)
+	if (ret) {
 		mutex_release(&lock->dep_map, _RET_IP_);
+		dept_mutex_unlock(&lock->dmap, _RET_IP_);
+	}
 
 	return ret;
 }
@@ -1555,11 +1559,14 @@ int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock)
 	might_sleep();
 
 	mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_mutex_lock(&lock->dmap, "rt_mutex_unlock", _RET_IP_);
 	ret = rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
 				       RT_MUTEX_MIN_CHAINWALK,
 				       rt_mutex_slowlock);
-	if (ret)
+	if (ret) {
 		mutex_release(&lock->dep_map, _RET_IP_);
+		dept_mutex_unlock(&lock->dmap, _RET_IP_);
+	}
 
 	return ret;
 }
@@ -1584,8 +1591,10 @@ int __sched rt_mutex_trylock(struct rt_mutex *lock)
 		return 0;
 
 	ret = rt_mutex_fasttrylock(lock, rt_mutex_slowtrylock);
-	if (ret)
+	if (ret) {
 		mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+		dept_mutex_trylock(&lock->dmap, "rt_mutex_unlock", _RET_IP_);
+	}
 
 	return ret;
 }
@@ -1599,6 +1608,7 @@ int __sched rt_mutex_trylock(struct rt_mutex *lock)
 void __sched rt_mutex_unlock(struct rt_mutex *lock)
 {
 	mutex_release(&lock->dep_map, _RET_IP_);
+	dept_mutex_unlock(&lock->dmap, _RET_IP_);
 	rt_mutex_fastunlock(lock, rt_mutex_slowunlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_unlock);
@@ -1671,14 +1681,15 @@ void rt_mutex_destroy(struct rt_mutex *lock)
  * Initializing of a locked rt lock is not allowed
  */
 void __rt_mutex_init(struct rt_mutex *lock, const char *name,
-		     struct lock_class_key *key)
+		     struct lock_class_key *key,
+		     struct dept_key *dkey)
 {
 	lock->owner = NULL;
 	raw_spin_lock_init(&lock->wait_lock);
 	lock->waiters = RB_ROOT_CACHED;
 
 	if (name && key)
-		debug_rt_mutex_init(lock, name, key);
+		debug_rt_mutex_init(lock, name, key, dkey);
 }
 EXPORT_SYMBOL_GPL(__rt_mutex_init);
 
@@ -1699,7 +1710,7 @@ void __rt_mutex_init(struct rt_mutex *lock, const char *name,
 void rt_mutex_init_proxy_locked(struct rt_mutex *lock,
 				struct task_struct *proxy_owner)
 {
-	__rt_mutex_init(lock, NULL, NULL);
+	__rt_mutex_init(lock, NULL, NULL, NULL);
 	debug_rt_mutex_proxy_lock(lock, proxy_owner);
 	rt_mutex_set_owner(lock, proxy_owner);
 }
diff --git a/kernel/locking/rtmutex.h b/kernel/locking/rtmutex.h
index 732f96a..976b1a0 100644
--- a/kernel/locking/rtmutex.h
+++ b/kernel/locking/rtmutex.h
@@ -18,7 +18,7 @@
 #define debug_rt_mutex_proxy_lock(l,p)			do { } while (0)
 #define debug_rt_mutex_proxy_unlock(l)			do { } while (0)
 #define debug_rt_mutex_unlock(l)			do { } while (0)
-#define debug_rt_mutex_init(m, n, k)			do { } while (0)
+#define debug_rt_mutex_init(m, n, k, dk)		do { } while (0)
 #define debug_rt_mutex_deadlock(d, a ,l)		do { } while (0)
 #define debug_rt_mutex_print_deadlock(w)		do { } while (0)
 #define debug_rt_mutex_reset_waiter(w)			do { } while (0)
diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 14f44f5..34f3b85 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -155,12 +155,12 @@ static void init_shared_classes(void)
 #ifdef CONFIG_RT_MUTEXES
 	static struct lock_class_key rt_X, rt_Y, rt_Z;
 
-	__rt_mutex_init(&rtmutex_X1, __func__, &rt_X);
-	__rt_mutex_init(&rtmutex_X2, __func__, &rt_X);
-	__rt_mutex_init(&rtmutex_Y1, __func__, &rt_Y);
-	__rt_mutex_init(&rtmutex_Y2, __func__, &rt_Y);
-	__rt_mutex_init(&rtmutex_Z1, __func__, &rt_Z);
-	__rt_mutex_init(&rtmutex_Z2, __func__, &rt_Z);
+	__rt_mutex_init(&rtmutex_X1, __func__, &rt_X, NULL);
+	__rt_mutex_init(&rtmutex_X2, __func__, &rt_X, NULL);
+	__rt_mutex_init(&rtmutex_Y1, __func__, &rt_Y, NULL);
+	__rt_mutex_init(&rtmutex_Y2, __func__, &rt_Y, NULL);
+	__rt_mutex_init(&rtmutex_Z1, __func__, &rt_Z, NULL);
+	__rt_mutex_init(&rtmutex_Z2, __func__, &rt_Z, NULL);
 #endif
 
 	init_class_X(&lock_X1, &rwlock_X1, &mutex_X1, &rwsem_X1);
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 2b8abbf..f76fac8 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -2034,7 +2034,10 @@ int stop_sync_thread(struct netns_ipvs *ipvs, int state)
  */
 int __net_init ip_vs_sync_net_init(struct netns_ipvs *ipvs)
 {
-	__mutex_init(&ipvs->sync_mutex, "ipvs->sync_mutex", &__ipvs_sync_key);
+	/*
+	 * TODO: Initialize the mutex with a valid dept_key.
+	 */
+	__mutex_init(&ipvs->sync_mutex, "ipvs->sync_mutex", &__ipvs_sync_key, NULL);
 	spin_lock_init(&ipvs->sync_lock);
 	spin_lock_init(&ipvs->sync_buff_lock);
 	return 0;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 4/6] dept: Apply Dept to rwlock
  2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
  2020-11-23 11:36     ` [RFC 2/6] dept: Apply Dept to spinlock Byungchul Park
  2020-11-23 11:36     ` [RFC 3/6] dept: Apply Dept to mutex families Byungchul Park
@ 2020-11-23 11:36     ` Byungchul Park
  2020-11-23 11:36     ` [RFC 5/6] dept: Apply Dept to wait_for_completion()/complete() Byungchul Park
  2020-11-23 11:36     ` [RFC 6/6] dept: Assign custom dept_keys or disable to avoid false positives Byungchul Park
  4 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 11:36 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

Makes Dept able to track dependencies by rwlock.

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/rwlock.h          | 32 ++++++++++++++++++++++++++++++--
 include/linux/rwlock_api_smp.h  | 18 ++++++++++++++++++
 include/linux/rwlock_types.h    | 19 ++++++++++++++++---
 kernel/locking/spinlock_debug.c |  4 +++-
 4 files changed, 67 insertions(+), 6 deletions(-)

diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h
index 3dcd617..f633341 100644
--- a/include/linux/rwlock.h
+++ b/include/linux/rwlock.h
@@ -16,18 +16,46 @@
 
 #ifdef CONFIG_DEBUG_SPINLOCK
   extern void __rwlock_init(rwlock_t *lock, const char *name,
-			    struct lock_class_key *key);
+			    struct lock_class_key *key,
+			    struct dept_key *dkey);
 # define rwlock_init(lock)					\
 do {								\
 	static struct lock_class_key __key;			\
+	static struct dept_key __dkey;				\
 								\
-	__rwlock_init((lock), #lock, &__key);			\
+	__rwlock_init((lock), #lock, &__key, &__dkey);		\
 } while (0)
 #else
 # define rwlock_init(lock)					\
 	do { *(lock) = __RW_LOCK_UNLOCKED(lock); } while (0)
 #endif
 
+#ifdef CONFIG_DEPT
+#define DEPT_EVT_R		1UL
+#define DEPT_EVT_W		(1UL << 1)
+#define DEPT_EVT_RW		(DEPT_EVT_R | DEPT_EVT_W)
+
+#define dept_rw_init(m, k, s, n)		dept_map_init(m, k, s, n, DEPT_TYPE_RW)
+#define dept_rw_reinit(m, k, s, n)		dept_map_reinit(m, k, s, n)
+#define dept_rw_nocheck(m)			dept_map_nocheck(m)
+#define dept_write_lock(m, e_fn, ip)		dept_wait_ecxt_enter(m, DEPT_EVT_RW, DEPT_EVT_W, ip, __func__, __func__, e_fn, 0)
+#define dept_write_trylock(m, e_fn, ip)		dept_ecxt_enter(m, DEPT_EVT_W, ip, __func__, e_fn, 0)
+#define dept_write_unlock(m, ip)		dept_ecxt_exit(m, ip)
+#define dept_read_lock(m, e_fn, ip)		dept_wait_ecxt_enter(m, DEPT_EVT_W, DEPT_EVT_R, ip, __func__, __func__, e_fn, 0)
+#define dept_read_trylock(m, e_fn, ip)		dept_ecxt_enter(m, DEPT_EVT_R, ip, __func__, e_fn, 0)
+#define dept_read_unlock(m, ip)			dept_ecxt_exit(m, ip)
+#else
+#define dept_rw_init(m, k, s, n)		do { (void)(n); (void)(k); } while (0)
+#define dept_rw_reinit(m, k, s, n)		do { (void)(n); (void)(k); } while (0)
+#define dept_rw_nocheck(m)			do { } while (0)
+#define dept_write_lock(m, e_fn, ip)		do { } while (0)
+#define dept_write_trylock(m, e_fn, ip)		do { } while (0)
+#define dept_write_unlock(m, ip)		do { } while (0)
+#define dept_read_lock(m, e_fn, ip)		do { } while (0)
+#define dept_read_trylock(m, e_fn, ip)		do { } while (0)
+#define dept_read_unlock(m, ip)			do { } while (0)
+#endif
+
 #ifdef CONFIG_DEBUG_SPINLOCK
  extern void do_raw_read_lock(rwlock_t *lock) __acquires(lock);
 #define do_raw_read_lock_flags(lock, flags) do_raw_read_lock(lock)
diff --git a/include/linux/rwlock_api_smp.h b/include/linux/rwlock_api_smp.h
index abfb53a..2003104 100644
--- a/include/linux/rwlock_api_smp.h
+++ b/include/linux/rwlock_api_smp.h
@@ -119,6 +119,7 @@ static inline int __raw_read_trylock(rwlock_t *lock)
 	preempt_disable();
 	if (do_raw_read_trylock(lock)) {
 		rwlock_acquire_read(&lock->dep_map, 0, 1, _RET_IP_);
+		dept_read_trylock(&lock->dmap, "__raw_read_unlock", _RET_IP_);
 		return 1;
 	}
 	preempt_enable();
@@ -130,6 +131,7 @@ static inline int __raw_write_trylock(rwlock_t *lock)
 	preempt_disable();
 	if (do_raw_write_trylock(lock)) {
 		rwlock_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+		dept_write_trylock(&lock->dmap, "__raw_write_unlock", _RET_IP_);
 		return 1;
 	}
 	preempt_enable();
@@ -147,6 +149,7 @@ static inline void __raw_read_lock(rwlock_t *lock)
 {
 	preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_read_lock(&lock->dmap, "__raw_read_unlock", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock);
 }
 
@@ -157,6 +160,7 @@ static inline unsigned long __raw_read_lock_irqsave(rwlock_t *lock)
 	local_irq_save(flags);
 	preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_read_lock(&lock->dmap, "__raw_read_unlock_irqrestore", _RET_IP_);
 	LOCK_CONTENDED_FLAGS(lock, do_raw_read_trylock, do_raw_read_lock,
 			     do_raw_read_lock_flags, &flags);
 	return flags;
@@ -167,6 +171,7 @@ static inline void __raw_read_lock_irq(rwlock_t *lock)
 	local_irq_disable();
 	preempt_disable();
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_read_lock(&lock->dmap, "__raw_read_unlock_irq", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock);
 }
 
@@ -174,6 +179,7 @@ static inline void __raw_read_lock_bh(rwlock_t *lock)
 {
 	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
 	rwlock_acquire_read(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_read_lock(&lock->dmap, "__raw_read_unlock_bh", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_read_trylock, do_raw_read_lock);
 }
 
@@ -184,6 +190,7 @@ static inline unsigned long __raw_write_lock_irqsave(rwlock_t *lock)
 	local_irq_save(flags);
 	preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_write_lock(&lock->dmap, "__raw_write_unlock_irqrestore", _RET_IP_);
 	LOCK_CONTENDED_FLAGS(lock, do_raw_write_trylock, do_raw_write_lock,
 			     do_raw_write_lock_flags, &flags);
 	return flags;
@@ -194,6 +201,7 @@ static inline void __raw_write_lock_irq(rwlock_t *lock)
 	local_irq_disable();
 	preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_write_lock(&lock->dmap, "__raw_write_unlock_irq", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
 }
 
@@ -201,6 +209,7 @@ static inline void __raw_write_lock_bh(rwlock_t *lock)
 {
 	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_write_lock(&lock->dmap, "__raw_write_unlock_bh", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
 }
 
@@ -208,6 +217,7 @@ static inline void __raw_write_lock(rwlock_t *lock)
 {
 	preempt_disable();
 	rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+	dept_write_lock(&lock->dmap, "__raw_write_unlock", _RET_IP_);
 	LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
 }
 
@@ -216,6 +226,7 @@ static inline void __raw_write_lock(rwlock_t *lock)
 static inline void __raw_write_unlock(rwlock_t *lock)
 {
 	rwlock_release(&lock->dep_map, _RET_IP_);
+	dept_write_unlock(&lock->dmap, _RET_IP_);
 	do_raw_write_unlock(lock);
 	preempt_enable();
 }
@@ -223,6 +234,7 @@ static inline void __raw_write_unlock(rwlock_t *lock)
 static inline void __raw_read_unlock(rwlock_t *lock)
 {
 	rwlock_release(&lock->dep_map, _RET_IP_);
+	dept_read_unlock(&lock->dmap, _RET_IP_);
 	do_raw_read_unlock(lock);
 	preempt_enable();
 }
@@ -231,6 +243,7 @@ static inline void __raw_read_unlock(rwlock_t *lock)
 __raw_read_unlock_irqrestore(rwlock_t *lock, unsigned long flags)
 {
 	rwlock_release(&lock->dep_map, _RET_IP_);
+	dept_read_unlock(&lock->dmap, _RET_IP_);
 	do_raw_read_unlock(lock);
 	local_irq_restore(flags);
 	preempt_enable();
@@ -239,6 +252,7 @@ static inline void __raw_read_unlock(rwlock_t *lock)
 static inline void __raw_read_unlock_irq(rwlock_t *lock)
 {
 	rwlock_release(&lock->dep_map, _RET_IP_);
+	dept_read_unlock(&lock->dmap, _RET_IP_);
 	do_raw_read_unlock(lock);
 	local_irq_enable();
 	preempt_enable();
@@ -247,6 +261,7 @@ static inline void __raw_read_unlock_irq(rwlock_t *lock)
 static inline void __raw_read_unlock_bh(rwlock_t *lock)
 {
 	rwlock_release(&lock->dep_map, _RET_IP_);
+	dept_read_unlock(&lock->dmap, _RET_IP_);
 	do_raw_read_unlock(lock);
 	__local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
 }
@@ -255,6 +270,7 @@ static inline void __raw_write_unlock_irqrestore(rwlock_t *lock,
 					     unsigned long flags)
 {
 	rwlock_release(&lock->dep_map, _RET_IP_);
+	dept_write_unlock(&lock->dmap, _RET_IP_);
 	do_raw_write_unlock(lock);
 	local_irq_restore(flags);
 	preempt_enable();
@@ -263,6 +279,7 @@ static inline void __raw_write_unlock_irqrestore(rwlock_t *lock,
 static inline void __raw_write_unlock_irq(rwlock_t *lock)
 {
 	rwlock_release(&lock->dep_map, _RET_IP_);
+	dept_write_unlock(&lock->dmap, _RET_IP_);
 	do_raw_write_unlock(lock);
 	local_irq_enable();
 	preempt_enable();
@@ -271,6 +288,7 @@ static inline void __raw_write_unlock_irq(rwlock_t *lock)
 static inline void __raw_write_unlock_bh(rwlock_t *lock)
 {
 	rwlock_release(&lock->dep_map, _RET_IP_);
+	dept_write_unlock(&lock->dmap, _RET_IP_);
 	do_raw_write_unlock(lock);
 	__local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
 }
diff --git a/include/linux/rwlock_types.h b/include/linux/rwlock_types.h
index 3bd03e1..1e79ea7 100644
--- a/include/linux/rwlock_types.h
+++ b/include/linux/rwlock_types.h
@@ -17,6 +17,7 @@
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map dep_map;
 #endif
+	struct dept_map dmap;
 } rwlock_t;
 
 #define RWLOCK_MAGIC		0xdeaf1eed
@@ -26,22 +27,34 @@
 	.dep_map = {							\
 		.name = #lockname,					\
 		.wait_type_inner = LD_WAIT_CONFIG,			\
-	}
+	},
 #else
 # define RW_DEP_MAP_INIT(lockname)
 #endif
 
+#ifdef CONFIG_DEPT
+# define RW_DMAP_INIT(lockname)						\
+	.dmap = {							\
+		.name = #lockname,					\
+		.type = DEPT_TYPE_RW,					\
+	},
+#else
+# define RW_DMAP_INIT(lockname)
+#endif
+
 #ifdef CONFIG_DEBUG_SPINLOCK
 #define __RW_LOCK_UNLOCKED(lockname)					\
 	(rwlock_t)	{	.raw_lock = __ARCH_RW_LOCK_UNLOCKED,	\
 				.magic = RWLOCK_MAGIC,			\
 				.owner = SPINLOCK_OWNER_INIT,		\
 				.owner_cpu = -1,			\
-				RW_DEP_MAP_INIT(lockname) }
+				RW_DEP_MAP_INIT(lockname)		\
+				RW_DMAP_INIT(lockname) }
 #else
 #define __RW_LOCK_UNLOCKED(lockname) \
 	(rwlock_t)	{	.raw_lock = __ARCH_RW_LOCK_UNLOCKED,	\
-				RW_DEP_MAP_INIT(lockname) }
+				RW_DEP_MAP_INIT(lockname)		\
+				RW_DMAP_INIT(lockname) }
 #endif
 
 #define DEFINE_RWLOCK(x)	rwlock_t x = __RW_LOCK_UNLOCKED(x)
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 03e6812..f4deecb 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -34,7 +34,8 @@ void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
 EXPORT_SYMBOL(__raw_spin_lock_init);
 
 void __rwlock_init(rwlock_t *lock, const char *name,
-		   struct lock_class_key *key)
+		   struct lock_class_key *key,
+		   struct dept_key *dkey)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	/*
@@ -42,6 +43,7 @@ void __rwlock_init(rwlock_t *lock, const char *name,
 	 */
 	debug_check_no_locks_freed((void *)lock, sizeof(*lock));
 	lockdep_init_map_wait(&lock->dep_map, name, key, 0, LD_WAIT_CONFIG);
+	dept_rw_init(&lock->dmap, dkey, 0, name);
 #endif
 	lock->raw_lock = (arch_rwlock_t) __ARCH_RW_LOCK_UNLOCKED;
 	lock->magic = RWLOCK_MAGIC;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 5/6] dept: Apply Dept to wait_for_completion()/complete()
  2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
                       ` (2 preceding siblings ...)
  2020-11-23 11:36     ` [RFC 4/6] dept: Apply Dept to rwlock Byungchul Park
@ 2020-11-23 11:36     ` Byungchul Park
  2020-11-23 11:36     ` [RFC 6/6] dept: Assign custom dept_keys or disable to avoid false positives Byungchul Park
  4 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 11:36 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

Makes Dept able to track dependencies by
wait_for_completion()/complete().

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 include/linux/completion.h | 44 ++++++++++++++++++++++++++++++++++++++++----
 kernel/sched/completion.c  | 16 ++++++++++++++--
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index bf8e770..05ce6cb 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -26,15 +26,47 @@
 struct completion {
 	unsigned int done;
 	struct swait_queue_head wait;
+	struct dept_map dmap;
 };
 
-#define init_completion_map(x, m) __init_completion(x)
-#define init_completion(x) __init_completion(x)
+#ifdef CONFIG_DEPT
+#define dept_wfc_init(m, k, s, n)		dept_map_init(m, k, s, n, DEPT_TYPE_WFC)
+#define dept_wfc_reinit(m)			dept_map_reinit(m, NULL, -1, NULL)
+#define dept_wfc_wait(m, ip)			dept_wait(m, 1UL, ip, __func__, 0)
+#define dept_wfc_complete(m, ip)		dept_event(m, 1UL, ip, __func__)
+#define dept_wfc_enter(m, ip)			dept_ecxt_enter(m, 1UL, ip, "completion_context_enter", "complete", 0)
+#define dept_wfc_exit(m, ip)			dept_ecxt_exit(m, ip)
+#else
+#define dept_wfc_init(m, k, s, n)		do { (void)(n); (void)(k); } while (0)
+#define dept_wfc_reinit(m)			do { } while (0)
+#define dept_wfc_wait(m, ip)			do { } while (0)
+#define dept_wfc_complete(m, ip)		do { } while (0)
+#define dept_wfc_enter(m, ip)			do { } while (0)
+#define dept_wfc_exit(m, ip)			do { } while (0)
+#endif
+
+#ifdef CONFIG_DEPT
+#define WFC_DEPT_MAP_INIT(work) .dmap = { .name = #work, .type = DEPT_TYPE_WFC }
+#else
+#define WFC_DEPT_MAP_INIT(work)
+#endif
+
+#define init_completion_map(x, m)				\
+	do {							\
+		static struct dept_key __dkey;			\
+		__init_completion(x, &__dkey, #x);		\
+	} while (0)
+#define init_completion(x)					\
+	do {							\
+		static struct dept_key __dkey;			\
+		__init_completion(x, &__dkey, #x);		\
+	} while (0)
 static inline void complete_acquire(struct completion *x) {}
 static inline void complete_release(struct completion *x) {}
 
 #define COMPLETION_INITIALIZER(work) \
-	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
+	{ 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
+	WFC_DEPT_MAP_INIT(work) }
 
 #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
 	(*({ init_completion_map(&(work), &(map)); &(work); }))
@@ -82,9 +114,12 @@ static inline void complete_release(struct completion *x) {}
  * This inline function will initialize a dynamically created completion
  * structure.
  */
-static inline void __init_completion(struct completion *x)
+static inline void __init_completion(struct completion *x,
+				     struct dept_key *dkey,
+				     const char *name)
 {
 	x->done = 0;
+	dept_wfc_init(&x->dmap, dkey, 0, name);
 	init_swait_queue_head(&x->wait);
 }
 
@@ -98,6 +133,7 @@ static inline void __init_completion(struct completion *x)
 static inline void reinit_completion(struct completion *x)
 {
 	x->done = 0;
+	dept_wfc_reinit(&x->dmap);
 }
 
 extern void wait_for_completion(struct completion *);
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index a778554..e144413 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -29,6 +29,7 @@ void complete(struct completion *x)
 {
 	unsigned long flags;
 
+	dept_wfc_complete(&x->dmap, _RET_IP_);
 	raw_spin_lock_irqsave(&x->wait.lock, flags);
 
 	if (x->done != UINT_MAX)
@@ -58,6 +59,7 @@ void complete_all(struct completion *x)
 {
 	unsigned long flags;
 
+	dept_wfc_complete(&x->dmap, _RET_IP_);
 	lockdep_assert_RT_in_threaded_ctx();
 
 	raw_spin_lock_irqsave(&x->wait.lock, flags);
@@ -135,6 +137,7 @@ void complete_all(struct completion *x)
  */
 void __sched wait_for_completion(struct completion *x)
 {
+	dept_wfc_wait(&x->dmap, _RET_IP_);
 	wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_UNINTERRUPTIBLE);
 }
 EXPORT_SYMBOL(wait_for_completion);
@@ -154,6 +157,7 @@ void __sched wait_for_completion(struct completion *x)
 unsigned long __sched
 wait_for_completion_timeout(struct completion *x, unsigned long timeout)
 {
+	dept_wfc_wait(&x->dmap, _RET_IP_);
 	return wait_for_common(x, timeout, TASK_UNINTERRUPTIBLE);
 }
 EXPORT_SYMBOL(wait_for_completion_timeout);
@@ -168,6 +172,7 @@ void __sched wait_for_completion(struct completion *x)
  */
 void __sched wait_for_completion_io(struct completion *x)
 {
+	dept_wfc_wait(&x->dmap, _RET_IP_);
 	wait_for_common_io(x, MAX_SCHEDULE_TIMEOUT, TASK_UNINTERRUPTIBLE);
 }
 EXPORT_SYMBOL(wait_for_completion_io);
@@ -188,6 +193,7 @@ void __sched wait_for_completion_io(struct completion *x)
 unsigned long __sched
 wait_for_completion_io_timeout(struct completion *x, unsigned long timeout)
 {
+	dept_wfc_wait(&x->dmap, _RET_IP_);
 	return wait_for_common_io(x, timeout, TASK_UNINTERRUPTIBLE);
 }
 EXPORT_SYMBOL(wait_for_completion_io_timeout);
@@ -203,7 +209,9 @@ void __sched wait_for_completion_io(struct completion *x)
  */
 int __sched wait_for_completion_interruptible(struct completion *x)
 {
-	long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_INTERRUPTIBLE);
+	long t;
+	dept_wfc_wait(&x->dmap, _RET_IP_);
+	t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_INTERRUPTIBLE);
 	if (t == -ERESTARTSYS)
 		return t;
 	return 0;
@@ -225,6 +233,7 @@ int __sched wait_for_completion_interruptible(struct completion *x)
 wait_for_completion_interruptible_timeout(struct completion *x,
 					  unsigned long timeout)
 {
+	dept_wfc_wait(&x->dmap, _RET_IP_);
 	return wait_for_common(x, timeout, TASK_INTERRUPTIBLE);
 }
 EXPORT_SYMBOL(wait_for_completion_interruptible_timeout);
@@ -240,7 +249,9 @@ int __sched wait_for_completion_interruptible(struct completion *x)
  */
 int __sched wait_for_completion_killable(struct completion *x)
 {
-	long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_KILLABLE);
+	long t;
+	dept_wfc_wait(&x->dmap, _RET_IP_);
+	t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_KILLABLE);
 	if (t == -ERESTARTSYS)
 		return t;
 	return 0;
@@ -263,6 +274,7 @@ int __sched wait_for_completion_killable(struct completion *x)
 wait_for_completion_killable_timeout(struct completion *x,
 				     unsigned long timeout)
 {
+	dept_wfc_wait(&x->dmap, _RET_IP_);
 	return wait_for_common(x, timeout, TASK_KILLABLE);
 }
 EXPORT_SYMBOL(wait_for_completion_killable_timeout);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [RFC 6/6] dept: Assign custom dept_keys or disable to avoid false positives
  2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
                       ` (3 preceding siblings ...)
  2020-11-23 11:36     ` [RFC 5/6] dept: Apply Dept to wait_for_completion()/complete() Byungchul Park
@ 2020-11-23 11:36     ` Byungchul Park
  4 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 11:36 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

Signed-off-by: Byungchul Park <byungchul.park@lge.com>
---
 arch/arm/mach-omap2/omap_hwmod.c                   |  6 +++++
 arch/powerpc/platforms/powermac/low_i2c.c          | 15 +++++++++++
 block/blk-flush.c                                  |  3 +++
 block/blk.h                                        |  1 +
 drivers/base/class.c                               | 13 +++++-----
 drivers/base/core.c                                |  1 +
 drivers/base/regmap/regmap.c                       | 10 ++++++++
 drivers/fpga/dfl.c                                 |  3 +++
 drivers/gpio/gpio-pca953x.c                        |  2 ++
 drivers/gpu/drm/etnaviv/etnaviv_gem.c              |  6 +++++
 drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c        |  3 +++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c          |  3 +++
 drivers/gpu/drm/i915/gt/intel_gtt.c                |  1 +
 drivers/gpu/drm/i915/i915_active.c                 |  8 +++---
 drivers/gpu/drm/i915/i915_active.h                 |  6 +++--
 drivers/gpu/drm/nouveau/nvkm/core/subdev.c         |  7 +++--
 drivers/input/mousedev.c                           |  2 ++
 drivers/input/serio/libps2.c                       |  1 +
 drivers/input/serio/serio.c                        |  1 +
 drivers/macintosh/mac_hid.c                        |  6 +++++
 drivers/md/bcache/btree.c                          |  1 +
 drivers/media/v4l2-core/v4l2-ctrls.c               |  5 +++-
 .../ethernet/netronome/nfp/nfpcore/nfp_cppcore.c   |  2 ++
 drivers/net/team/team.c                            |  5 ++++
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c       |  3 +++
 drivers/s390/char/raw3270.c                        |  1 +
 drivers/s390/cio/cio.c                             |  2 ++
 drivers/s390/net/qeth_core_main.c                  |  3 +++
 drivers/tty/serial/ifx6x60.c                       |  2 ++
 drivers/tty/serial/serial_core.c                   |  2 ++
 drivers/tty/tty_buffer.c                           |  1 +
 drivers/tty/tty_mutex.c                            |  1 +
 drivers/usb/storage/usb.c                          |  2 ++
 fs/btrfs/disk-io.c                                 |  3 +++
 fs/dcache.c                                        |  1 +
 fs/inode.c                                         |  6 +++++
 fs/ntfs/inode.c                                    |  5 ++++
 fs/ntfs/super.c                                    |  6 +++++
 fs/overlayfs/inode.c                               |  3 +++
 fs/super.c                                         |  6 +++++
 fs/xfs/xfs_dquot.c                                 |  4 +++
 include/linux/device/class.h                       | 24 +++++++++--------
 include/linux/if_team.h                            |  1 +
 include/linux/irqdesc.h                            |  6 +++++
 include/linux/kthread.h                            |  6 +++--
 include/linux/percpu_counter.h                     |  6 +++--
 include/linux/skbuff.h                             |  4 ++-
 include/linux/swait.h                              |  6 +++--
 include/media/v4l2-ctrls.h                         |  8 ++++--
 include/net/sock.h                                 |  1 +
 kernel/events/core.c                               |  4 +++
 kernel/irq/irqdesc.c                               |  3 +++
 kernel/kthread.c                                   |  4 ++-
 kernel/rcu/tree.c                                  |  6 +++++
 kernel/sched/sched.h                               |  1 +
 kernel/sched/swait.c                               |  4 ++-
 kernel/sched/wait.c                                |  4 +++
 kernel/trace/ring_buffer.c                         |  4 +++
 kernel/workqueue.c                                 |  1 +
 lib/locking-selftest.c                             | 15 ++++++-----
 lib/percpu_counter.c                               |  4 ++-
 mm/list_lru.c                                      |  8 +++++-
 net/batman-adv/hash.c                              |  7 ++++-
 net/core/dev.c                                     |  7 +++++
 net/core/neighbour.c                               |  4 ++-
 net/core/sock.c                                    | 30 ++++++++++++++++++++--
 net/l2tp/l2tp_core.c                               |  3 +++
 net/netfilter/ipvs/ip_vs_sync.c                    |  7 +++--
 net/netlink/af_netlink.c                           |  4 +++
 net/rxrpc/call_object.c                            |  6 ++++-
 net/sched/sch_generic.c                            |  6 +++++
 71 files changed, 298 insertions(+), 58 deletions(-)

diff --git a/arch/arm/mach-omap2/omap_hwmod.c b/arch/arm/mach-omap2/omap_hwmod.c
index 15b29a1..7fc1c98 100644
--- a/arch/arm/mach-omap2/omap_hwmod.c
+++ b/arch/arm/mach-omap2/omap_hwmod.c
@@ -2604,6 +2604,12 @@ static int _register(struct omap_hwmod *oh)
 	INIT_LIST_HEAD(&oh->slave_ports);
 	spin_lock_init(&oh->_lock);
 	lockdep_set_class(&oh->_lock, &oh->hwmod_key);
+	/*
+	 * TODO: Should re-initialize the map with a valid dept_key.
+	 * Also needs dept_key_init() before using it, dept_key_destroy()
+	 * before freeing it.
+	 */
+	dept_spin_nocheck(&oh->_lock.dmap);
 
 	oh->_state = _HWMOD_STATE_REGISTERED;
 
diff --git a/arch/powerpc/platforms/powermac/low_i2c.c b/arch/powerpc/platforms/powermac/low_i2c.c
index 126c60a..640eeb2 100644
--- a/arch/powerpc/platforms/powermac/low_i2c.c
+++ b/arch/powerpc/platforms/powermac/low_i2c.c
@@ -583,6 +583,11 @@ static void __init kw_i2c_add(struct pmac_i2c_host_kw *host,
 	bus->xfer = kw_i2c_xfer;
 	mutex_init(&bus->mutex);
 	lockdep_set_class(&bus->mutex, &bus->lock_key);
+	/*
+	 * TODO: Should re-initialize the map with a valid dept_key.
+	 * Also need dept_key_init() before using it.
+	 */
+	dept_mutex_nocheck(&bus->mutex.dmap);
 	if (controller == busnode)
 		bus->flags = pmac_i2c_multibus;
 	list_add(&bus->link, &pmac_i2c_busses);
@@ -811,6 +816,11 @@ static void __init pmu_i2c_probe(void)
 		bus->xfer = pmu_i2c_xfer;
 		mutex_init(&bus->mutex);
 		lockdep_set_class(&bus->mutex, &bus->lock_key);
+		/*
+		 * TODO: Should re-initialize the map with a valid dept_key.
+		 * Also need dept_key_init() before using it.
+		 */
+		dept_mutex_nocheck(&bus->mutex.dmap);
 		bus->flags = pmac_i2c_multibus;
 		list_add(&bus->link, &pmac_i2c_busses);
 
@@ -934,6 +944,11 @@ static void __init smu_i2c_probe(void)
 		bus->xfer = smu_i2c_xfer;
 		mutex_init(&bus->mutex);
 		lockdep_set_class(&bus->mutex, &bus->lock_key);
+		/*
+		 * TODO: Should re-initialize the map with a valid dept_key.
+		 * Also need dept_key_init() before using it.
+		 */
+		dept_mutex_nocheck(&bus->mutex.dmap);
 		bus->flags = 0;
 		list_add(&bus->link, &pmac_i2c_busses);
 
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 53abb5c..d283d8f 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -469,7 +469,9 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
 	INIT_LIST_HEAD(&fq->flush_data_in_flight);
 
 	lockdep_register_key(&fq->key);
+	dept_key_init(&fq->dkey);
 	lockdep_set_class(&fq->mq_flush_lock, &fq->key);
+	dept_spin_reinit(&fq->mq_flush_lock.dmap, &fq->dkey, -1, NULL);
 
 	return fq;
 
@@ -486,6 +488,7 @@ void blk_free_flush_queue(struct blk_flush_queue *fq)
 		return;
 
 	lockdep_unregister_key(&fq->key);
+	dept_key_destroy(&fq->dkey);
 	kfree(fq->flush_rq);
 	kfree(fq);
 }
diff --git a/block/blk.h b/block/blk.h
index 49e2928..15eaef2 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -26,6 +26,7 @@ struct blk_flush_queue {
 	struct request		*flush_rq;
 
 	struct lock_class_key	key;
+	struct dept_key		dkey;
 	spinlock_t		mq_flush_lock;
 };
 
diff --git a/drivers/base/class.c b/drivers/base/class.c
index c52ee21a..aa36756 100644
--- a/drivers/base/class.c
+++ b/drivers/base/class.c
@@ -150,7 +150,8 @@ static void class_remove_groups(struct class *cls,
 	return sysfs_remove_groups(&cls->p->subsys.kobj, groups);
 }
 
-int __class_register(struct class *cls, struct lock_class_key *key)
+int __class_register(struct class *cls, struct lock_class_key *key,
+		     struct dept_key *dkey)
 {
 	struct subsys_private *cp;
 	int error;
@@ -163,10 +164,7 @@ int __class_register(struct class *cls, struct lock_class_key *key)
 	klist_init(&cp->klist_devices, klist_class_dev_get, klist_class_dev_put);
 	INIT_LIST_HEAD(&cp->interfaces);
 	kset_init(&cp->glue_dirs);
-	/*
-	 * TODO: Initialize the mutex with a valid dept_key.
-	 */
-	__mutex_init(&cp->mutex, "subsys mutex", key, NULL);
+	__mutex_init(&cp->mutex, "subsys mutex", key, dkey);
 	error = kobject_set_name(&cp->subsys.kobj, "%s", cls->name);
 	if (error) {
 		kfree(cp);
@@ -227,7 +225,8 @@ static void class_create_release(struct class *cls)
  * making a call to class_destroy().
  */
 struct class *__class_create(struct module *owner, const char *name,
-			     struct lock_class_key *key)
+			     struct lock_class_key *key,
+			     struct dept_key *dkey)
 {
 	struct class *cls;
 	int retval;
@@ -242,7 +241,7 @@ struct class *__class_create(struct module *owner, const char *name,
 	cls->owner = owner;
 	cls->class_release = class_create_release;
 
-	retval = __class_register(cls, key);
+	retval = __class_register(cls, key, dkey);
 	if (retval)
 		goto error;
 
diff --git a/drivers/base/core.c b/drivers/base/core.c
index bb5806a..f127f7e 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2408,6 +2408,7 @@ void device_initialize(struct device *dev)
 	mutex_init(&dev->lockdep_mutex);
 #endif
 	lockdep_set_novalidate_class(&dev->mutex);
+	dept_mutex_nocheck(&dev->mutex.dmap);
 	spin_lock_init(&dev->devres_lock);
 	INIT_LIST_HEAD(&dev->devres_head);
 	device_pm_init(dev);
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index b71f9ec..fdfb2b7 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -749,12 +749,22 @@ struct regmap *__regmap_init(struct device *dev,
 			map->unlock = regmap_unlock_spinlock;
 			lockdep_set_class_and_name(&map->spinlock,
 						   lock_key, lock_name);
+			/*
+			 * TODO: Should re-initialize the map with a valid
+			 * dept_key.
+			 */
+			dept_spin_nocheck(&map->spinlock.dmap);
 		} else {
 			mutex_init(&map->mutex);
 			map->lock = regmap_lock_mutex;
 			map->unlock = regmap_unlock_mutex;
 			lockdep_set_class_and_name(&map->mutex,
 						   lock_key, lock_name);
+			/*
+			 * TODO: Should re-initialize the map with a valid
+			 * dept_key.
+			 */
+			dept_mutex_nocheck(&map->mutex.dmap);
 		}
 		map->lock_arg = map;
 	}
diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
index 649958a..dd947af 100644
--- a/drivers/fpga/dfl.c
+++ b/drivers/fpga/dfl.c
@@ -43,6 +43,7 @@ enum dfl_fpga_devt_type {
 };
 
 static struct lock_class_key dfl_pdata_keys[DFL_ID_MAX];
+static struct dept_key dfl_pdata_dkeys[DFL_ID_MAX];
 
 static const char *dfl_pdata_key_strings[DFL_ID_MAX] = {
 	"dfl-fme-pdata",
@@ -510,6 +511,8 @@ static int build_info_commit_dev(struct build_feature_devs_info *binfo)
 	mutex_init(&pdata->lock);
 	lockdep_set_class_and_name(&pdata->lock, &dfl_pdata_keys[type],
 				   dfl_pdata_key_strings[type]);
+	dept_spin_reinit(&pdata->lock.dmap, &dfl_pdata_dkeys[type], -1,
+				   dfl_pdata_key_strings[type]);
 
 	/*
 	 * the count should be initialized to 0 to make sure
diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
index c2d6121..fdedd20 100644
--- a/drivers/gpio/gpio-pca953x.c
+++ b/drivers/gpio/gpio-pca953x.c
@@ -1084,6 +1084,8 @@ static int pca953x_probe(struct i2c_client *client,
 	 */
 	lockdep_set_subclass(&chip->i2c_lock,
 			     i2c_adapter_depth(client->adapter));
+	dept_mutex_reinit(&chip->i2c_lock.dmap, NULL,
+			  i2c_adapter_depth(client->adapter), NULL);
 
 	/* initialize cached registers from their original values.
 	 * we can't share this chip with another i2c master.
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index f06e19e..48a2d1f 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -16,6 +16,8 @@
 
 static struct lock_class_key etnaviv_shm_lock_class;
 static struct lock_class_key etnaviv_userptr_lock_class;
+static struct dept_key etnaviv_shm_dkey;
+static struct dept_key etnaviv_userptr_dkey;
 
 static void etnaviv_gem_scatter_map(struct etnaviv_gem_object *etnaviv_obj)
 {
@@ -614,6 +616,8 @@ int etnaviv_gem_new_handle(struct drm_device *dev, struct drm_file *file,
 		goto fail;
 
 	lockdep_set_class(&to_etnaviv_bo(obj)->lock, &etnaviv_shm_lock_class);
+	dept_mutex_reinit(&to_etnaviv_bo(obj)->lock.dmap,
+			  &etnaviv_shm_dkey, -1, NULL);
 
 	ret = drm_gem_object_init(dev, obj, size);
 	if (ret)
@@ -732,6 +736,8 @@ int etnaviv_gem_new_userptr(struct drm_device *dev, struct drm_file *file,
 		return ret;
 
 	lockdep_set_class(&etnaviv_obj->lock, &etnaviv_userptr_lock_class);
+	dept_mutex_reinit(&etnaviv_obj->lock.dmap,
+			  &etnaviv_userptr_dkey, -1, NULL);
 
 	etnaviv_obj->userptr.ptr = ptr;
 	etnaviv_obj->userptr.mm = current->mm;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
index 6d9e5c3..d45b191 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
@@ -10,6 +10,7 @@
 #include "etnaviv_gem.h"
 
 static struct lock_class_key etnaviv_prime_lock_class;
+static struct dept_key etnaviv_prime_dkey;
 
 struct sg_table *etnaviv_gem_prime_get_sg_table(struct drm_gem_object *obj)
 {
@@ -116,6 +117,8 @@ struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device *dev,
 		return ERR_PTR(ret);
 
 	lockdep_set_class(&etnaviv_obj->lock, &etnaviv_prime_lock_class);
+	dept_mutex_reinit(&etnaviv_obj->lock.dmap,
+			  &etnaviv_prime_dkey, -1, NULL);
 
 	npages = size / PAGE_SIZE;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 26087dd..e5a7f0f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -770,6 +770,7 @@ static int measure_breadcrumb_dw(struct intel_context *ce)
 
 	spin_lock_init(&engine->active.lock);
 	lockdep_set_subclass(&engine->active.lock, subclass);
+	dept_spin_reinit(&engine->active.lock.dmap, NULL, subclass, NULL);
 
 	/*
 	 * Due to an interesting quirk in lockdep's internal debug tracking,
@@ -788,6 +789,7 @@ static int measure_breadcrumb_dw(struct intel_context *ce)
 create_kernel_context(struct intel_engine_cs *engine)
 {
 	static struct lock_class_key kernel;
+	static struct dept_key kernel_dkey;
 	struct intel_context *ce;
 	int err;
 
@@ -810,6 +812,7 @@ static int measure_breadcrumb_dw(struct intel_context *ce)
 	 * construction.
 	 */
 	lockdep_set_class(&ce->timeline->mutex, &kernel);
+	dept_mutex_reinit(&ce->timeline->mutex.dmap, &kernel_dkey, -1, NULL);
 
 	return ce;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2a72cce..34a68da 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -240,6 +240,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
 	 */
 	mutex_init(&vm->mutex);
 	lockdep_set_subclass(&vm->mutex, subclass);
+	dept_mutex_reinit(&vm->mutex.dmap, NULL, subclass, NULL);
 	i915_gem_shrinker_taints_mutex(vm->i915, &vm->mutex);
 
 	GEM_BUG_ON(!vm->total);
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index 49a6972..44eb691 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -279,7 +279,8 @@ void __i915_active_init(struct i915_active *ref,
 			int (*active)(struct i915_active *ref),
 			void (*retire)(struct i915_active *ref),
 			struct lock_class_key *mkey,
-			struct lock_class_key *wkey)
+			struct lock_class_key *wkey,
+			struct dept_key *mdkey)
 {
 	unsigned long bits;
 
@@ -297,10 +298,7 @@ void __i915_active_init(struct i915_active *ref,
 
 	init_llist_head(&ref->preallocated_barriers);
 	atomic_set(&ref->count, 0);
-	/*
-	 * TODO: Initialize the mutex with a valid dept_key.
-	 */
-	__mutex_init(&ref->mutex, "i915_active", mkey, NULL);
+	__mutex_init(&ref->mutex, "i915_active", mkey, mdkey);
 	__i915_active_fence_init(&ref->excl, NULL, excl_retire);
 	INIT_WORK(&ref->work, active_work);
 #if IS_ENABLED(CONFIG_LOCKDEP)
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h
index cf40581..97aa09b 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -153,14 +153,16 @@ void __i915_active_init(struct i915_active *ref,
 			int (*active)(struct i915_active *ref),
 			void (*retire)(struct i915_active *ref),
 			struct lock_class_key *mkey,
-			struct lock_class_key *wkey);
+			struct lock_class_key *wkey,
+			struct dept_key *mdkey);
 
 /* Specialise each class of i915_active to avoid impossible lockdep cycles. */
 #define i915_active_init(ref, active, retire) do {		\
 	static struct lock_class_key __mkey;				\
 	static struct lock_class_key __wkey;				\
+	static struct dept_key __mdkey;					\
 									\
-	__i915_active_init(ref, active, retire, &__mkey, &__wkey);	\
+	__i915_active_init(ref, active, retire, &__mkey, &__wkey, &__mdkey); \
 } while (0)
 
 int i915_active_ref(struct i915_active *ref,
diff --git a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
index f968d62..180e063 100644
--- a/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
+++ b/drivers/gpu/drm/nouveau/nvkm/core/subdev.c
@@ -27,6 +27,7 @@
 #include <subdev/mc.h>
 
 static struct lock_class_key nvkm_subdev_lock_class[NVKM_SUBDEV_NR];
+static struct dept_key nvkm_subdev_dept_keys[NVKM_SUBDEV_NR];
 
 const char *
 nvkm_subdev_name[NVKM_SUBDEV_NR] = {
@@ -218,10 +219,8 @@
 	subdev->device = device;
 	subdev->index = index;
 
-	/*
-	 * TODO: Initialize the mutex with a valid dept_key.
-	 */
-	__mutex_init(&subdev->mutex, name, &nvkm_subdev_lock_class[index], NULL);
+	__mutex_init(&subdev->mutex, name, &nvkm_subdev_lock_class[index],
+			&nvkm_subdev_dept_keys[index]);
 	subdev->debug = nvkm_dbgopt(device->dbgopt, name);
 }
 
diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index 505c562..b274b27 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -865,6 +865,8 @@ static struct mousedev *mousedev_create(struct input_dev *dev,
 	mutex_init(&mousedev->mutex);
 	lockdep_set_subclass(&mousedev->mutex,
 			     mixdev ? SINGLE_DEPTH_NESTING : 0);
+	dept_mutex_reinit(&mousedev->mutex.dmap, NULL,
+			  mixdev ? SINGLE_DEPTH_NESTING : 0, NULL);
 	init_waitqueue_head(&mousedev->wait);
 
 	if (mixdev) {
diff --git a/drivers/input/serio/libps2.c b/drivers/input/serio/libps2.c
index 8a16e41..3001ccb 100644
--- a/drivers/input/serio/libps2.c
+++ b/drivers/input/serio/libps2.c
@@ -377,6 +377,7 @@ void ps2_init(struct ps2dev *ps2dev, struct serio *serio)
 {
 	mutex_init(&ps2dev->cmd_mutex);
 	lockdep_set_subclass(&ps2dev->cmd_mutex, serio->depth);
+	dept_mutex_reinit(&ps2dev->cmd_mutex.dmap, NULL, serio->depth, NULL);
 	init_waitqueue_head(&ps2dev->wait);
 	ps2dev->serio = serio;
 }
diff --git a/drivers/input/serio/serio.c b/drivers/input/serio/serio.c
index 29f4910..0a371bdd 100644
--- a/drivers/input/serio/serio.c
+++ b/drivers/input/serio/serio.c
@@ -517,6 +517,7 @@ static void serio_init_port(struct serio *serio)
 	} else
 		serio->depth = 0;
 	lockdep_set_subclass(&serio->lock, serio->depth);
+	dept_spin_reinit(&serio->lock.dmap, NULL, serio->depth, NULL);
 }
 
 /*
diff --git a/drivers/macintosh/mac_hid.c b/drivers/macintosh/mac_hid.c
index 28b8581..4bd402d 100644
--- a/drivers/macintosh/mac_hid.c
+++ b/drivers/macintosh/mac_hid.c
@@ -30,6 +30,8 @@ static int mac_hid_create_emumouse(void)
 {
 	static struct lock_class_key mac_hid_emumouse_dev_event_class;
 	static struct lock_class_key mac_hid_emumouse_dev_mutex_class;
+	static struct dept_key mac_hid_emumouse_dev_event_dkey;
+	static struct dept_key mac_hid_emumouse_dev_mutex_dkey;
 	int err;
 
 	mac_hid_emumouse_dev = input_allocate_device();
@@ -40,6 +42,10 @@ static int mac_hid_create_emumouse(void)
 			  &mac_hid_emumouse_dev_event_class);
 	lockdep_set_class(&mac_hid_emumouse_dev->mutex,
 			  &mac_hid_emumouse_dev_mutex_class);
+	dept_spin_reinit(&mac_hid_emumouse_dev->event_lock.dmap,
+			  &mac_hid_emumouse_dev_event_dkey, -1, NULL);
+	dept_mutex_reinit(&mac_hid_emumouse_dev->mutex.dmap,
+			  &mac_hid_emumouse_dev_mutex_dkey, -1, NULL);
 
 	mac_hid_emumouse_dev->name = "Macintosh mouse button emulation";
 	mac_hid_emumouse_dev->id.bustype = BUS_ADB;
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 3d8bd06..6917441b 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -573,6 +573,7 @@ static struct btree *mca_bucket_alloc(struct cache_set *c,
 	lockdep_set_novalidate_class(&b->lock);
 	mutex_init(&b->write_lock);
 	lockdep_set_novalidate_class(&b->write_lock);
+	dept_mutex_nocheck(&b->write_lock.dmap);
 	INIT_LIST_HEAD(&b->list);
 	INIT_DELAYED_WORK(&b->work, btree_node_write_work);
 	b->c = c;
diff --git a/drivers/media/v4l2-core/v4l2-ctrls.c b/drivers/media/v4l2-core/v4l2-ctrls.c
index 45a2403..ff2079c 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls.c
@@ -2258,11 +2258,14 @@ static inline int handler_set_err(struct v4l2_ctrl_handler *hdl, int err)
 /* Initialize the handler */
 int v4l2_ctrl_handler_init_class(struct v4l2_ctrl_handler *hdl,
 				 unsigned nr_of_controls_hint,
-				 struct lock_class_key *key, const char *name)
+				 struct lock_class_key *key,
+				 struct dept_eky *dkey,
+				 const char *name)
 {
 	mutex_init(&hdl->_lock);
 	hdl->lock = &hdl->_lock;
 	lockdep_set_class_and_name(hdl->lock, key, name);
+	dept_mutex_reinit(&hdl->lock->dmap, dkey, -1, name);
 	INIT_LIST_HEAD(&hdl->ctrls);
 	INIT_LIST_HEAD(&hdl->ctrl_refs);
 	INIT_LIST_HEAD(&hdl->requests);
diff --git a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
index 94994a9..2562380 100644
--- a/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
+++ b/drivers/net/ethernet/netronome/nfp/nfpcore/nfp_cppcore.c
@@ -1140,6 +1140,7 @@ int nfp_xpb_writelm(struct nfp_cpp *cpp, u32 xpb_tgt,
 
 /* Lockdep markers */
 static struct lock_class_key nfp_cpp_resource_lock_key;
+static struct dept_key nfp_cpp_resource_lock_dkey;
 
 static void nfp_cpp_dev_release(struct device *dev)
 {
@@ -1192,6 +1193,7 @@ struct nfp_cpp *
 	rwlock_init(&cpp->resource_lock);
 	init_waitqueue_head(&cpp->waitq);
 	lockdep_set_class(&cpp->resource_lock, &nfp_cpp_resource_lock_key);
+	dept_rw_reinit(&cpp->resource_lock.dmap, &nfp_cpp_resource_lock_dkey, -1, NULL);
 	INIT_LIST_HEAD(&cpp->resource_list);
 	INIT_LIST_HEAD(&cpp->area_cache_list);
 	mutex_init(&cpp->area_cache_mutex);
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 0292404..8135a9e 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -1646,6 +1646,7 @@ static int team_init(struct net_device *dev)
 	netif_carrier_off(dev);
 
 	lockdep_register_key(&team->team_lock_key);
+	dept_key_init(&team->team_dkey);
 	/*
 	 * TODO: Initialize the mutex with a valid dept_key.
 	 */
@@ -1682,6 +1683,7 @@ static void team_uninit(struct net_device *dev)
 	mutex_unlock(&team->lock);
 	netdev_change_features(dev);
 	lockdep_unregister_key(&team->team_lock_key);
+	dept_key_destroy(&team->team_dkey);
 }
 
 static void team_destructor(struct net_device *dev)
@@ -1992,6 +1994,9 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev)
 		lockdep_unregister_key(&team->team_lock_key);
 		lockdep_register_key(&team->team_lock_key);
 		lockdep_set_class(&team->lock, &team->team_lock_key);
+		dept_key_destroy(&team->team_dkey);
+		dept_key_init(&team->team_dkey);
+		dept_mutex_reinit(&team->lock, &team->team_dkey, -1, NULL);
 	}
 	netdev_change_features(dev);
 
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index eb396c0..49ad297 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -608,8 +608,11 @@ int iwl_pcie_txq_init(struct iwl_trans *trans, struct iwl_txq *txq,
 
 	if (cmd_queue) {
 		static struct lock_class_key iwl_pcie_cmd_queue_lock_class;
+		static struct dept_key iwl_pcie_cmd_queue_dkey;
 
 		lockdep_set_class(&txq->lock, &iwl_pcie_cmd_queue_lock_class);
+		dept_spin_reinit(&txq->lock.dmap, &iwl_pcie_cmd_queue_dkey,
+				 -1, NULL);
 	}
 
 	__skb_queue_head_init(&txq->overflow_q);
diff --git a/drivers/s390/char/raw3270.c b/drivers/s390/char/raw3270.c
index 63a41b1..10aeb2c 100644
--- a/drivers/s390/char/raw3270.c
+++ b/drivers/s390/char/raw3270.c
@@ -943,6 +943,7 @@ struct raw3270 __init *raw3270_setup_console(void)
 		view->ascebc = rp->ascebc;
 		spin_lock_init(&view->lock);
 		lockdep_set_subclass(&view->lock, subclass);
+		dept_spin_reinit(&view->lock.dmap, NULL, subclass, NULL);
 		list_add(&view->list, &rp->view_list);
 		rc = 0;
 		spin_unlock_irqrestore(get_ccwdev_lock(rp->cdev), flags);
diff --git a/drivers/s390/cio/cio.c b/drivers/s390/cio/cio.c
index 6d716db..cd88dbf 100644
--- a/drivers/s390/cio/cio.c
+++ b/drivers/s390/cio/cio.c
@@ -574,6 +574,7 @@ void __init init_cio_interrupts(void)
 #ifdef CONFIG_CCW_CONSOLE
 static struct subchannel *console_sch;
 static struct lock_class_key console_sch_key;
+static struct dept_key console_sch_dkey;
 
 /*
  * Use cio_tsch to update the subchannel status and call the interrupt handler
@@ -664,6 +665,7 @@ struct subchannel *cio_probe_console(void)
 		return sch;
 
 	lockdep_set_class(sch->lock, &console_sch_key);
+	dept_spin_reinit(&sch->lock->dmap, &console_sch_dkey, -1, NULL);
 	isc_register(CONSOLE_ISC);
 	sch->config.isc = CONSOLE_ISC;
 	sch->config.intparm = (u32)(addr_t)sch;
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 6a73982..f753d93 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -63,6 +63,7 @@ struct qeth_dbf_info qeth_dbf[QETH_DBF_INFOS] = {
 static struct device *qeth_core_root_dev;
 static struct dentry *qeth_debugfs_root;
 static struct lock_class_key qdio_out_skb_queue_key;
+static struct dept_key qdio_out_skb_queue_dkey;
 
 static void qeth_issue_next_read_cb(struct qeth_card *card,
 				    struct qeth_cmd_buffer *iob,
@@ -2632,6 +2633,8 @@ static int qeth_init_qdio_out_buf(struct qeth_qdio_out_q *q, int bidx)
 	newbuf->buffer = q->qdio_bufs[bidx];
 	skb_queue_head_init(&newbuf->skb_list);
 	lockdep_set_class(&newbuf->skb_list.lock, &qdio_out_skb_queue_key);
+	dept_spin_reinit(&newbuf->skb_list.lock.dmap,
+			 &qdio_out_skb_queue_dkey, -1, NULL);
 	newbuf->q = q;
 	newbuf->next_pending = q->bufs[bidx];
 	atomic_set(&newbuf->state, QETH_QDIO_BUF_EMPTY);
diff --git a/drivers/tty/serial/ifx6x60.c b/drivers/tty/serial/ifx6x60.c
index 7d16fe4..f4c4bd5 100644
--- a/drivers/tty/serial/ifx6x60.c
+++ b/drivers/tty/serial/ifx6x60.c
@@ -73,6 +73,7 @@ static int ifx_modem_reboot_callback(struct notifier_block *nfb,
 static struct tty_driver *tty_drv;
 static struct ifx_spi_device *saved_ifx_dev;
 static struct lock_class_key ifx_spi_key;
+static struct dept_key ifx_spi_dkey;
 
 static struct notifier_block ifx_modem_reboot_notifier_block = {
 	.notifier_call = ifx_modem_reboot_callback,
@@ -819,6 +820,7 @@ static int ifx_spi_create_port(struct ifx_spi_device *ifx_dev)
 	spin_lock_init(&ifx_dev->fifo_lock);
 	lockdep_set_class_and_subclass(&ifx_dev->fifo_lock,
 		&ifx_spi_key, 0);
+	dept_spin_reinit(&ifx_dev->fifo_lock.dmap, &ifx_spi_dkey, 0, NULL);
 
 	if (kfifo_alloc(&ifx_dev->tx_fifo, IFX_SPI_FIFO_SIZE, GFP_KERNEL)) {
 		ret = -ENOMEM;
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 124524e..d8fa640 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -39,6 +39,7 @@
  *          want only one lock-class:
  */
 static struct lock_class_key port_lock_key;
+static struct dept_key port_lock_dkey;
 
 #define HIGH_BITS_OFFSET	((sizeof(long)-sizeof(int))*8)
 
@@ -1920,6 +1921,7 @@ static void uart_port_spin_lock_init(struct uart_port *port)
 {
 	spin_lock_init(&port->lock);
 	lockdep_set_class(&port->lock, &port_lock_key);
+	dept_spin_reinit(&port->lock.dmap, &port_lock_dkey, -1, NULL);
 }
 
 #if defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL)
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index ec145a5..65f243b 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -601,6 +601,7 @@ int tty_buffer_set_limit(struct tty_port *port, int limit)
 void tty_buffer_set_lock_subclass(struct tty_port *port)
 {
 	lockdep_set_subclass(&port->buf.lock, TTY_LOCK_SLAVE);
+	dept_mutex_reinit(&port->buf.lock.dmap, NULL, TTY_LOCK_SLAVE, NULL);
 }
 
 bool tty_buffer_restart_work(struct tty_port *port)
diff --git a/drivers/tty/tty_mutex.c b/drivers/tty/tty_mutex.c
index 2640635..b3cde7a 100644
--- a/drivers/tty/tty_mutex.c
+++ b/drivers/tty/tty_mutex.c
@@ -57,4 +57,5 @@ void tty_unlock_slave(struct tty_struct *tty)
 void tty_set_lock_subclass(struct tty_struct *tty)
 {
 	lockdep_set_subclass(&tty->legacy_mutex, TTY_LOCK_SLAVE);
+	dept_mutex_reinit(&tty->legacy_mutex.dmap, NULL, TTY_LOCK_SLAVE, NULL);
 }
diff --git a/drivers/usb/storage/usb.c b/drivers/usb/storage/usb.c
index 94a6472..7ff0a53 100644
--- a/drivers/usb/storage/usb.c
+++ b/drivers/usb/storage/usb.c
@@ -137,6 +137,7 @@
 #ifdef CONFIG_LOCKDEP
 
 static struct lock_class_key us_interface_key[USB_MAXINTERFACES];
+static struct dept_key us_interface_dkey[USB_MAXINTERFACES];
 
 static void us_set_lock_class(struct mutex *mutex,
 		struct usb_interface *intf)
@@ -153,6 +154,7 @@ static void us_set_lock_class(struct mutex *mutex,
 	BUG_ON(i == config->desc.bNumInterfaces);
 
 	lockdep_set_class(mutex, &us_interface_key[i]);
+	dept_mutex_reinit(&mutex->dmap, &us_interface_dkey[i], -1, NULL);
 }
 
 #else
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9f72b09..dc8c583 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -156,6 +156,7 @@ struct async_submit_bio {
 	const char		*name_stem;	/* lock name stem */
 	char			names[BTRFS_MAX_LEVEL + 1][20];
 	struct lock_class_key	keys[BTRFS_MAX_LEVEL + 1];
+	struct dept_key		dkeys[BTRFS_MAX_LEVEL + 1];
 } btrfs_lockdep_keysets[] = {
 	{ .id = BTRFS_ROOT_TREE_OBJECTID,	.name_stem = "root"	},
 	{ .id = BTRFS_EXTENT_TREE_OBJECTID,	.name_stem = "extent"	},
@@ -200,6 +201,8 @@ void btrfs_set_buffer_lockdep_class(u64 objectid, struct extent_buffer *eb,
 
 	lockdep_set_class_and_name(&eb->lock,
 				   &ks->keys[level], ks->names[level]);
+	dept_rw_reinit(&eb->lock.dmap,
+		       &ks->dkeys[level], -1, ks->names[level]);
 }
 
 #endif
diff --git a/fs/dcache.c b/fs/dcache.c
index 70fdcdc..9f8976b 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1343,6 +1343,7 @@ static void d_walk(struct dentry *parent, void *data,
 			spin_release(&dentry->d_lock.dep_map, _RET_IP_);
 			this_parent = dentry;
 			spin_acquire(&this_parent->d_lock.dep_map, 0, 1, _RET_IP_);
+			dept_spin_switch_nested(&this_parent->d_lock.dmap, 0, _RET_IP_);
 			goto repeat;
 		}
 		spin_unlock(&dentry->d_lock);
diff --git a/fs/inode.c b/fs/inode.c
index 72c4c34..893a55b 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -172,6 +172,12 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 		goto out;
 	spin_lock_init(&inode->i_lock);
 	lockdep_set_class(&inode->i_lock, &sb->s_type->i_lock_key);
+	/*
+	 * TODO: Should re-initialize the map with a valid dept_key.
+	 * Also needs dept_key_inti() before using it, dept_key_destroy()
+	 * before freeing it.
+	 */
+	dept_spin_nocheck(&inode->i_lock.dmap);
 
 	init_rwsem(&inode->i_rwsem);
 	lockdep_set_class(&inode->i_rwsem, &sb->s_type->i_mutex_key);
diff --git a/fs/ntfs/inode.c b/fs/ntfs/inode.c
index 9bb9f09..88236e0 100644
--- a/fs/ntfs/inode.c
+++ b/fs/ntfs/inode.c
@@ -398,6 +398,7 @@ void __ntfs_init_inode(struct super_block *sb, ntfs_inode *ni)
  * a separate class for nested inode's mrec_lock's:
  */
 static struct lock_class_key extent_inode_mrec_lock_key;
+static struct dept_key extent_inode_mrec_lock_dkey;
 
 inline ntfs_inode *ntfs_new_extent_inode(struct super_block *sb,
 		unsigned long mft_no)
@@ -408,6 +409,8 @@ inline ntfs_inode *ntfs_new_extent_inode(struct super_block *sb,
 	if (likely(ni != NULL)) {
 		__ntfs_init_inode(sb, ni);
 		lockdep_set_class(&ni->mrec_lock, &extent_inode_mrec_lock_key);
+		dept_mutex_reinit(&ni->mrec_lock.dmap,
+				  &extent_inode_mrec_lock_dkey, -1, NULL);
 		ni->mft_no = mft_no;
 		ni->type = AT_UNUSED;
 		ni->name = NULL;
@@ -1711,6 +1714,7 @@ static int ntfs_read_locked_index_inode(struct inode *base_vi, struct inode *vi)
  * map_mft functions.
  */
 static struct lock_class_key mft_ni_runlist_lock_key, mft_ni_mrec_lock_key;
+static struct dept_key mft_ni_mrec_lock_dkey;
 
 /**
  * ntfs_read_inode_mount - special read_inode for mount time use only
@@ -2145,6 +2149,7 @@ int ntfs_read_inode_mount(struct inode *vi)
 	 */
 	lockdep_set_class(&ni->runlist.lock, &mft_ni_runlist_lock_key);
 	lockdep_set_class(&ni->mrec_lock, &mft_ni_mrec_lock_key);
+	dept_mutex_reinit(&ni->mrec_lock.dmap, &mft_ni_mrec_lock_dkey, -1, NULL);
 
 	return 0;
 
diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
index 7dc3bc6..66f98d7 100644
--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -1745,6 +1745,8 @@ static bool load_and_init_upcase(ntfs_volume *vol)
 static struct lock_class_key
 	lcnbmp_runlist_lock_key, lcnbmp_mrec_lock_key,
 	mftbmp_runlist_lock_key, mftbmp_mrec_lock_key;
+static struct dept_key
+	lcnbmp_mrec_lock_dkey, mftbmp_mrec_lock_dkey;
 
 /**
  * load_system_files - open the system files using normal functions
@@ -1806,6 +1808,8 @@ static bool load_system_files(ntfs_volume *vol)
 			   &mftbmp_runlist_lock_key);
 	lockdep_set_class(&NTFS_I(vol->mftbmp_ino)->mrec_lock,
 			   &mftbmp_mrec_lock_key);
+	dept_mutex_reinit(&NTFS_I(vol->mftbmp_ino)->mrec_lock.dmap,
+			   &mftbmp_mrec_lock_dkey, -1, NULL);
 	/* Read upcase table and setup @vol->upcase and @vol->upcase_len. */
 	if (!load_and_init_upcase(vol))
 		goto iput_mftbmp_err_out;
@@ -1832,6 +1836,8 @@ static bool load_system_files(ntfs_volume *vol)
 			   &lcnbmp_runlist_lock_key);
 	lockdep_set_class(&NTFS_I(vol->lcnbmp_ino)->mrec_lock,
 			   &lcnbmp_mrec_lock_key);
+	dept_mutex_reinit(&NTFS_I(vol->lcnbmp_ino)->mrec_lock.dmap,
+			   &lcnbmp_mrec_lock_dkey, -1, NULL);
 
 	NInoSetSparseDisabled(NTFS_I(vol->lcnbmp_ino));
 	if ((vol->nr_clusters + 7) >> 3 > i_size_read(vol->lcnbmp_ino)) {
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 8be6cd2..6ce92d5 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -562,6 +562,7 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
 	static struct lock_class_key ovl_i_mutex_key[OVL_MAX_NESTING];
 	static struct lock_class_key ovl_i_mutex_dir_key[OVL_MAX_NESTING];
 	static struct lock_class_key ovl_i_lock_key[OVL_MAX_NESTING];
+	static struct dept_key ovl_i_lock_dkey[OVL_MAX_NESTING];
 
 	int depth = inode->i_sb->s_stack_depth - 1;
 
@@ -574,6 +575,8 @@ static inline void ovl_lockdep_annotate_inode_mutex_key(struct inode *inode)
 		lockdep_set_class(&inode->i_rwsem, &ovl_i_mutex_key[depth]);
 
 	lockdep_set_class(&OVL_I(inode)->lock, &ovl_i_lock_key[depth]);
+	dept_mutex_reinit(&OVL_I(inode)->lock.dmap, &ovl_i_lock_dkey[depth],
+			  -1, NULL);
 #endif
 }
 
diff --git a/fs/super.c b/fs/super.c
index 904459b..6ad5883 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -254,6 +254,12 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 	atomic_set(&s->s_active, 1);
 	mutex_init(&s->s_vfs_rename_mutex);
 	lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
+	/*
+	 * TODO: Should re-initialize the map with a valid dept_key.
+	 * Also needs dept_key_init() before using it, dept_key_destroy()
+	 * before freeing it.
+	 */
+	dept_mutex_nocheck(&s->s_vfs_rename_mutex.dmap);
 	init_rwsem(&s->s_dquot.dqio_sem);
 	s->s_maxbytes = MAX_NON_LFS;
 	s->s_op = &default_op;
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index bcd73b9..7370782c 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -43,6 +43,8 @@
 
 static struct lock_class_key xfs_dquot_group_class;
 static struct lock_class_key xfs_dquot_project_class;
+static struct dept_key xfs_dquot_group_dkey;
+static struct dept_key xfs_dquot_project_dkey;
 
 /*
  * This is called to free all the memory associated with a dquot
@@ -461,9 +463,11 @@
 		break;
 	case XFS_DQTYPE_GROUP:
 		lockdep_set_class(&dqp->q_qlock, &xfs_dquot_group_class);
+		dept_mutex_reinit(&dqp->q_qlock.dmap, &xfs_dquot_group_dkey, -1, NULL);
 		break;
 	case XFS_DQTYPE_PROJ:
 		lockdep_set_class(&dqp->q_qlock, &xfs_dquot_project_class);
+		dept_mutex_reinit(&dqp->q_qlock.dmap, &xfs_dquot_project_dkey, -1, NULL);
 		break;
 	default:
 		ASSERT(0);
diff --git a/include/linux/device/class.h b/include/linux/device/class.h
index e8d470c..cbe7e08 100644
--- a/include/linux/device/class.h
+++ b/include/linux/device/class.h
@@ -85,15 +85,17 @@ struct class_dev_iter {
 extern struct kobject *sysfs_dev_block_kobj;
 extern struct kobject *sysfs_dev_char_kobj;
 extern int __must_check __class_register(struct class *class,
-					 struct lock_class_key *key);
+					 struct lock_class_key *key,
+					 struct dept_key *dkey);
 extern void class_unregister(struct class *class);
 
 /* This is a #define to keep the compiler from merging different
  * instances of the __key variable */
-#define class_register(class)			\
-({						\
-	static struct lock_class_key __key;	\
-	__class_register(class, &__key);	\
+#define class_register(class)				\
+({							\
+	static struct lock_class_key __key;		\
+	static struct dept_key __dkey;			\
+	__class_register(class, &__key, &__dkey);	\
 })
 
 struct class_compat;
@@ -251,15 +253,17 @@ struct class_interface {
 
 extern struct class * __must_check __class_create(struct module *owner,
 						  const char *name,
-						  struct lock_class_key *key);
+						  struct lock_class_key *key,
+						  struct dept_key *dkey);
 extern void class_destroy(struct class *cls);
 
 /* This is a #define to keep the compiler from merging different
  * instances of the __key variable */
-#define class_create(owner, name)		\
-({						\
-	static struct lock_class_key __key;	\
-	__class_create(owner, name, &__key);	\
+#define class_create(owner, name)			\
+({							\
+	static struct lock_class_key __key;		\
+	static struct dept_key __dkey;			\
+	__class_create(owner, name, &__key, &__dkey);	\
 })
 
 
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
index add6079..9fc27fe 100644
--- a/include/linux/if_team.h
+++ b/include/linux/if_team.h
@@ -221,6 +221,7 @@ struct team {
 		struct delayed_work dw;
 	} mcast_rejoin;
 	struct lock_class_key team_lock_key;
+	struct dept_key team_dkey;
 	long mode_priv[TEAM_MODE_PRIV_LONGS];
 };
 
diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index 5745491..57b13d0 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -261,6 +261,12 @@ static inline bool irq_is_percpu_devid(unsigned int irq)
 	if (desc) {
 		lockdep_set_class(&desc->lock, lock_class);
 		lockdep_set_class(&desc->request_mutex, request_class);
+		/*
+		 * TODO: Should re-initialize the map with a valid
+		 * dept_key. Also need dept_key_init() before using it.
+		 */
+		dept_spin_nocheck(&desc->lock.dmap);
+		dept_mutex_nocheck(&desc->request_mutex.dmap);
 	}
 }
 
diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 65b81e0..81f4ad5 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -149,12 +149,14 @@ struct kthread_delayed_work {
 #endif
 
 extern void __kthread_init_worker(struct kthread_worker *worker,
-			const char *name, struct lock_class_key *key);
+			const char *name, struct lock_class_key *key,
+			struct dept_key *dkey);
 
 #define kthread_init_worker(worker)					\
 	do {								\
 		static struct lock_class_key __key;			\
-		__kthread_init_worker((worker), "("#worker")->lock", &__key); \
+		static struct dept_key __dkey;				\
+		__kthread_init_worker((worker), "("#worker")->lock", &__key, &__dkey); \
 	} while (0)
 
 #define kthread_init_work(work, fn)					\
diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
index 01861ee..32a1dab 100644
--- a/include/linux/percpu_counter.h
+++ b/include/linux/percpu_counter.h
@@ -29,13 +29,15 @@ struct percpu_counter {
 extern int percpu_counter_batch;
 
 int __percpu_counter_init(struct percpu_counter *fbc, s64 amount, gfp_t gfp,
-			  struct lock_class_key *key);
+			  struct lock_class_key *key,
+			  struct dept_key *dkey);
 
 #define percpu_counter_init(fbc, value, gfp)				\
 	({								\
 		static struct lock_class_key __key;			\
+		static struct dept_key __dkey;			\
 									\
-		__percpu_counter_init(fbc, value, gfp, &__key);		\
+		__percpu_counter_init(fbc, value, gfp, &__key, &__dkey); \
 	})
 
 void percpu_counter_destroy(struct percpu_counter *fbc);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 04a18e0..2936a55 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1885,10 +1885,12 @@ static inline void skb_queue_head_init(struct sk_buff_head *list)
 }
 
 static inline void skb_queue_head_init_class(struct sk_buff_head *list,
-		struct lock_class_key *class)
+		struct lock_class_key *class,
+		struct dept_key *dkey)
 {
 	skb_queue_head_init(list);
 	lockdep_set_class(&list->lock, class);
+	dept_spin_reinit(&list->lock.dmap, dkey, -1, NULL);
 }
 
 /*
diff --git a/include/linux/swait.h b/include/linux/swait.h
index 6a8c22b..84e21ec 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -67,12 +67,14 @@ struct swait_queue {
 	struct swait_queue_head name = __SWAIT_QUEUE_HEAD_INITIALIZER(name)
 
 extern void __init_swait_queue_head(struct swait_queue_head *q, const char *name,
-				    struct lock_class_key *key);
+				    struct lock_class_key *key,
+				    struct dept_key *dkey);
 
 #define init_swait_queue_head(q)				\
 	do {							\
 		static struct lock_class_key __key;		\
-		__init_swait_queue_head((q), #q, &__key);	\
+		static struct dept_key __dkey;			\
+		__init_swait_queue_head((q), #q, &__key, &__dkey); \
 	} while (0)
 
 #ifdef CONFIG_LOCKDEP
diff --git a/include/media/v4l2-ctrls.h b/include/media/v4l2-ctrls.h
index f40e2cb..fc37ee0 100644
--- a/include/media/v4l2-ctrls.h
+++ b/include/media/v4l2-ctrls.h
@@ -465,6 +465,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
  *		buckets are allocated, so there are more slow list lookups).
  *		It will always work, though.
  * @key:	Used by the lock validator if CONFIG_LOCKDEP is set.
+ * @dkey:	Used by the dependency tracker if CONFIG_DEPT is set.
  * @name:	Used by the lock validator if CONFIG_LOCKDEP is set.
  *
  * .. attention::
@@ -477,7 +478,8 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum v4l2_ctrl_type *type,
  */
 int v4l2_ctrl_handler_init_class(struct v4l2_ctrl_handler *hdl,
 				 unsigned int nr_of_controls_hint,
-				 struct lock_class_key *key, const char *name);
+				 struct lock_class_key *key,
+				 struct dept_key *dkey, const char *name);
 
 #ifdef CONFIG_LOCKDEP
 
@@ -504,8 +506,10 @@ int v4l2_ctrl_handler_init_class(struct v4l2_ctrl_handler *hdl,
 (									\
 	({								\
 		static struct lock_class_key _key;			\
+		static struct dept_key _dkey;			\
 		v4l2_ctrl_handler_init_class(hdl, nr_of_controls_hint,	\
 					&_key,				\
+					&_dkey,				\
 					KBUILD_BASENAME ":"		\
 					__stringify(__LINE__) ":"	\
 					"(" #hdl ")->_lock");		\
@@ -513,7 +517,7 @@ int v4l2_ctrl_handler_init_class(struct v4l2_ctrl_handler *hdl,
 )
 #else
 #define v4l2_ctrl_handler_init(hdl, nr_of_controls_hint)		\
-	v4l2_ctrl_handler_init_class(hdl, nr_of_controls_hint, NULL, NULL)
+	v4l2_ctrl_handler_init_class(hdl, nr_of_controls_hint, NULL, NULL, NULL)
 #endif
 
 /**
diff --git a/include/net/sock.h b/include/net/sock.h
index 064637d..3d90488 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1565,6 +1565,7 @@ static inline void sock_release_ownership(struct sock *sk)
 			sizeof((sk)->sk_lock));				\
 	lockdep_set_class_and_name(&(sk)->sk_lock.slock,		\
 				(skey), (sname));				\
+	dept_spin_nocheck(&(sk)->sk_lock.slock.dmap);			\
 	lockdep_init_map(&(sk)->sk_lock.dep_map, (name), (key), 0);	\
 } while (0)
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e8bf9220..10556cc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10703,6 +10703,8 @@ static int pmu_dev_alloc(struct pmu *pmu)
 
 static struct lock_class_key cpuctx_mutex;
 static struct lock_class_key cpuctx_lock;
+static struct dept_key cpuctx_mutex_dkey;
+static struct dept_key cpuctx_spin_dkey;
 
 int perf_pmu_register(struct pmu *pmu, const char *name, int type)
 {
@@ -10771,6 +10773,8 @@ int perf_pmu_register(struct pmu *pmu, const char *name, int type)
 		__perf_event_init_context(&cpuctx->ctx);
 		lockdep_set_class(&cpuctx->ctx.mutex, &cpuctx_mutex);
 		lockdep_set_class(&cpuctx->ctx.lock, &cpuctx_lock);
+		dept_mutex_reinit(&cpuctx->ctx.mutex.dmap, &cpuctx_mutex_dkey, -1, NULL);
+		dept_spin_reinit(&cpuctx->ctx.lock.dmap, &cpuctx_spin_dkey, -1, NULL);
 		cpuctx->ctx.pmu = pmu;
 		cpuctx->online = cpumask_test_cpu(cpu, perf_online_mask);
 
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 1a77236..27d98bf 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -23,6 +23,7 @@
  * lockdep: we want to handle all irq_desc locks as a single lock-class:
  */
 static struct lock_class_key irq_desc_lock_class;
+static struct dept_key irq_desc_dkey;
 
 #if defined(CONFIG_SMP)
 static int __init irq_affinity_setup(char *str)
@@ -403,6 +404,7 @@ static struct irq_desc *alloc_desc(int irq, int node, unsigned int flags,
 
 	raw_spin_lock_init(&desc->lock);
 	lockdep_set_class(&desc->lock, &irq_desc_lock_class);
+	dept_spin_reinit(&desc->lock.dmap, &irq_desc_dkey, -1, NULL);
 	mutex_init(&desc->request_mutex);
 	init_rcu_head(&desc->rcu);
 
@@ -572,6 +574,7 @@ int __init early_irq_init(void)
 		alloc_masks(&desc[i], node);
 		raw_spin_lock_init(&desc[i].lock);
 		lockdep_set_class(&desc[i].lock, &irq_desc_lock_class);
+		dept_spin_reinit(&desc[i].lock.dmap, &irq_desc_dkey, -1, NULL);
 		mutex_init(&desc[i].request_mutex);
 		desc_set_defaults(i, &desc[i], node, NULL, NULL);
 	}
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 3edaa38..212245e 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -642,11 +642,13 @@ int kthreadd(void *unused)
 
 void __kthread_init_worker(struct kthread_worker *worker,
 				const char *name,
-				struct lock_class_key *key)
+				struct lock_class_key *key,
+				struct dept_key *dkey)
 {
 	memset(worker, 0, sizeof(struct kthread_worker));
 	raw_spin_lock_init(&worker->lock);
 	lockdep_set_class_and_name(&worker->lock, key, name);
+	dept_spin_reinit(&worker->lock.dmap, dkey, -1, name);
 	INIT_LIST_HEAD(&worker->work_list);
 	INIT_LIST_HEAD(&worker->delayed_work_list);
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f78ee75..e6a6c31 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4211,6 +4211,8 @@ static void __init rcu_init_one(void)
 	static const char * const fqs[] = RCU_FQS_NAME_INIT;
 	static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 	static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
+	static struct dept_key rcu_node_dkey[RCU_NUM_LVLS];
+	static struct dept_key rcu_fqs_dkey[RCU_NUM_LVLS];
 
 	int levelspread[RCU_NUM_LVLS];		/* kids/node in each level. */
 	int cpustride = 1;
@@ -4240,9 +4242,13 @@ static void __init rcu_init_one(void)
 			raw_spin_lock_init(&ACCESS_PRIVATE(rnp, lock));
 			lockdep_set_class_and_name(&ACCESS_PRIVATE(rnp, lock),
 						   &rcu_node_class[i], buf[i]);
+			dept_spin_reinit(&ACCESS_PRIVATE(rnp, lock).dmap,
+					 &rcu_node_dkey[i], -1, buf[i]);
 			raw_spin_lock_init(&rnp->fqslock);
 			lockdep_set_class_and_name(&rnp->fqslock,
 						   &rcu_fqs_class[i], fqs[i]);
+			dept_spin_reinit(&rnp->fqslock.dmap,
+					 &rcu_fqs_dkey[i], -1, fqs[i]);
 			rnp->gp_seq = rcu_state.gp_seq;
 			rnp->gp_seq_needed = rcu_state.gp_seq;
 			rnp->completedqs = rcu_state.gp_seq;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 28709f6..2d12d23 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2140,6 +2140,7 @@ static inline void double_unlock_balance(struct rq *this_rq, struct rq *busiest)
 {
 	raw_spin_unlock(&busiest->lock);
 	lock_set_subclass(&this_rq->lock.dep_map, 0, _RET_IP_);
+	dept_spin_switch_nested(&this_rq->lock.dmap, 0, _RET_IP_);
 }
 
 static inline void double_lock(spinlock_t *l1, spinlock_t *l2)
diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
index e1c655f..4b71484 100644
--- a/kernel/sched/swait.c
+++ b/kernel/sched/swait.c
@@ -5,10 +5,12 @@
 #include "sched.h"
 
 void __init_swait_queue_head(struct swait_queue_head *q, const char *name,
-			     struct lock_class_key *key)
+			     struct lock_class_key *key,
+			     struct dept_key *dkey)
 {
 	raw_spin_lock_init(&q->lock);
 	lockdep_set_class_and_name(&q->lock, key, name);
+	dept_spin_reinit(&q->lock.dmap, dkey, -1, name);
 	INIT_LIST_HEAD(&q->task_list);
 }
 EXPORT_SYMBOL(__init_swait_queue_head);
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 01f5d30..5964642 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -10,6 +10,10 @@ void __init_waitqueue_head(struct wait_queue_head *wq_head, const char *name, st
 {
 	spin_lock_init(&wq_head->lock);
 	lockdep_set_class_and_name(&wq_head->lock, key, name);
+	/*
+	 * TODO: Should re-initialize the map with a valid dept_key.
+	 */
+	dept_spin_nocheck(&wq_head->lock.dmap);
 	INIT_LIST_HEAD(&wq_head->head);
 }
 
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 93ef0ab..ae1e647 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1533,6 +1533,10 @@ static int rb_allocate_pages(struct ring_buffer_per_cpu *cpu_buffer,
 	cpu_buffer->buffer = buffer;
 	raw_spin_lock_init(&cpu_buffer->reader_lock);
 	lockdep_set_class(&cpu_buffer->reader_lock, buffer->reader_lock_key);
+	/*
+	 * TODO: Shoule initialize the map with a valid dept_key.
+	 */
+	dept_spin_nocheck(&cpu_buffer->reader_lock.dmap);
 	cpu_buffer->lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 	INIT_WORK(&cpu_buffer->update_pages_work, update_pages_handler);
 	init_completion(&cpu_buffer->update_done);
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c41c3c1..710092e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3627,6 +3627,7 @@ static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs)
 		goto fail;
 
 	lockdep_set_subclass(&pool->lock, 1);	/* see put_pwq() */
+	dept_spin_reinit(&pool->lock.dmap, NULL, 1, NULL);
 	copy_workqueue_attrs(pool->attrs, attrs);
 	pool->node = target_node;
 
diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 34f3b85..bc88d1e 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -154,13 +154,14 @@ static void init_shared_classes(void)
 {
 #ifdef CONFIG_RT_MUTEXES
 	static struct lock_class_key rt_X, rt_Y, rt_Z;
-
-	__rt_mutex_init(&rtmutex_X1, __func__, &rt_X, NULL);
-	__rt_mutex_init(&rtmutex_X2, __func__, &rt_X, NULL);
-	__rt_mutex_init(&rtmutex_Y1, __func__, &rt_Y, NULL);
-	__rt_mutex_init(&rtmutex_Y2, __func__, &rt_Y, NULL);
-	__rt_mutex_init(&rtmutex_Z1, __func__, &rt_Z, NULL);
-	__rt_mutex_init(&rtmutex_Z2, __func__, &rt_Z, NULL);
+	static struct dept_key dept_rt_X, dept_rt_Y, dept_rt_Z;
+
+	__rt_mutex_init(&rtmutex_X1, __func__, &rt_X, &dept_rt_X);
+	__rt_mutex_init(&rtmutex_X2, __func__, &rt_X, &dept_rt_X);
+	__rt_mutex_init(&rtmutex_Y1, __func__, &rt_Y, &dept_rt_Y);
+	__rt_mutex_init(&rtmutex_Y2, __func__, &rt_Y, &dept_rt_Y);
+	__rt_mutex_init(&rtmutex_Z1, __func__, &rt_Z, &dept_rt_Z);
+	__rt_mutex_init(&rtmutex_Z2, __func__, &rt_Z, &dept_rt_Z);
 #endif
 
 	init_class_X(&lock_X1, &rwlock_X1, &mutex_X1, &rwsem_X1);
diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index a2345de..d96a6eb 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -139,12 +139,14 @@ s64 __percpu_counter_sum(struct percpu_counter *fbc)
 EXPORT_SYMBOL(__percpu_counter_sum);
 
 int __percpu_counter_init(struct percpu_counter *fbc, s64 amount, gfp_t gfp,
-			  struct lock_class_key *key)
+			  struct lock_class_key *key,
+			  struct dept_key *dkey)
 {
 	unsigned long flags __maybe_unused;
 
 	raw_spin_lock_init(&fbc->lock);
 	lockdep_set_class(&fbc->lock, key);
+	dept_spin_reinit(&fbc->lock.dmap, dkey, -1, NULL);
 	fbc->count = amount;
 	fbc->counters = alloc_percpu_gfp(s32, gfp);
 	if (!fbc->counters)
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 5aa6e44..4c5142b 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -607,8 +607,14 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware,
 
 	for_each_node(i) {
 		spin_lock_init(&lru->node[i].lock);
-		if (key)
+		if (key) {
 			lockdep_set_class(&lru->node[i].lock, key);
+			/*
+			 * TODO: Should re-initialize the map with a valid
+			 * dept_key.
+			 */
+			dept_spin_nocheck(&lru->node[i].lock.dmap);
+		}
 		init_one_lru(&lru->node[i].lru);
 	}
 
diff --git a/net/batman-adv/hash.c b/net/batman-adv/hash.c
index 68638e0..48b4937 100644
--- a/net/batman-adv/hash.c
+++ b/net/batman-adv/hash.c
@@ -79,6 +79,11 @@ void batadv_hash_set_lock_class(struct batadv_hashtable *hash,
 {
 	u32 i;
 
-	for (i = 0; i < hash->size; i++)
+	for (i = 0; i < hash->size; i++) {
 		lockdep_set_class(&hash->list_locks[i], key);
+		/*
+		 * TODO: Should re-initialize the map with a valid dept_key.
+		 */
+		dept_spin_nocheck(&hash->list_locks[i].dmap);
+	}
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index 4906b44..1d7db6b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -441,6 +441,8 @@ static void unlist_netdevice(struct net_device *dev)
 
 static struct lock_class_key netdev_xmit_lock_key[ARRAY_SIZE(netdev_lock_type)];
 static struct lock_class_key netdev_addr_lock_key[ARRAY_SIZE(netdev_lock_type)];
+static struct dept_key netdev_xmit_dkey[ARRAY_SIZE(netdev_lock_type)];
+static struct dept_key netdev_addr_dkey[ARRAY_SIZE(netdev_lock_type)];
 
 static inline unsigned short netdev_lock_pos(unsigned short dev_type)
 {
@@ -461,6 +463,8 @@ static inline void netdev_set_xmit_lockdep_class(spinlock_t *lock,
 	i = netdev_lock_pos(dev_type);
 	lockdep_set_class_and_name(lock, &netdev_xmit_lock_key[i],
 				   netdev_lock_name[i]);
+	dept_spin_reinit(&lock->dmap, &netdev_xmit_dkey[i], -1,
+			 netdev_lock_name[i]);
 }
 
 static inline void netdev_set_addr_lockdep_class(struct net_device *dev)
@@ -471,6 +475,9 @@ static inline void netdev_set_addr_lockdep_class(struct net_device *dev)
 	lockdep_set_class_and_name(&dev->addr_list_lock,
 				   &netdev_addr_lock_key[i],
 				   netdev_lock_name[i]);
+	dept_spin_reinit(&dev->addr_list_lock.dmap,
+			 &netdev_addr_dkey[i], -1,
+			 netdev_lock_name[i]);
 }
 #else
 static inline void netdev_set_xmit_lockdep_class(spinlock_t *lock,
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 8e39e28..e205027 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1669,6 +1669,7 @@ static void neigh_parms_destroy(struct neigh_parms *parms)
 }
 
 static struct lock_class_key neigh_table_proxy_queue_class;
+static struct dept_key neigh_table_proxy_queue_dkey;
 
 static struct neigh_table *neigh_tables[NEIGH_NR_TABLES] __read_mostly;
 
@@ -1715,7 +1716,8 @@ void neigh_table_init(int index, struct neigh_table *tbl)
 			tbl->parms.reachable_time);
 	timer_setup(&tbl->proxy_timer, neigh_proxy_process, 0);
 	skb_queue_head_init_class(&tbl->proxy_queue,
-			&neigh_table_proxy_queue_class);
+			&neigh_table_proxy_queue_class,
+			&neigh_table_proxy_queue_dkey);
 
 	tbl->last_flush = now;
 	tbl->last_rand	= now + tbl->parms.reachable_time * 20;
diff --git a/net/core/sock.c b/net/core/sock.c
index 6c5c6b1..d5de709 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -264,6 +264,11 @@ bool sk_net_capable(const struct sock *sk, int cap)
 static struct lock_class_key af_wlock_keys[AF_MAX];
 static struct lock_class_key af_elock_keys[AF_MAX];
 static struct lock_class_key af_kern_callback_keys[AF_MAX];
+static struct dept_key af_callback_dkeys[AF_MAX];
+static struct dept_key af_rlock_dkeys[AF_MAX];
+static struct dept_key af_wlock_dkeys[AF_MAX];
+static struct dept_key af_elock_dkeys[AF_MAX];
+static struct dept_key af_kern_callback_dkeys[AF_MAX];
 
 /* Run time adjustable parameters. */
 __u32 sysctl_wmem_max __read_mostly = SK_WMEM_MAX;
@@ -1864,6 +1869,18 @@ static void sk_init_common(struct sock *sk)
 	lockdep_set_class_and_name(&sk->sk_callback_lock,
 			af_callback_keys + sk->sk_family,
 			af_family_clock_key_strings[sk->sk_family]);
+	dept_spin_reinit(&sk->sk_receive_queue.lock.dmap,
+			af_rlock_dkeys + sk->sk_family, -1,
+			af_family_rlock_key_strings[sk->sk_family]);
+	dept_spin_reinit(&sk->sk_write_queue.lock.dmap,
+			af_wlock_dkeys + sk->sk_family, -1,
+			af_family_wlock_key_strings[sk->sk_family]);
+	dept_spin_reinit(&sk->sk_error_queue.lock.dmap,
+			af_elock_dkeys + sk->sk_family, -1,
+			af_family_elock_key_strings[sk->sk_family]);
+	dept_rw_reinit(&sk->sk_callback_lock.dmap,
+			af_callback_dkeys + sk->sk_family, -1,
+			af_family_clock_key_strings[sk->sk_family]);
 }
 
 /**
@@ -2987,16 +3004,25 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 	}
 
 	rwlock_init(&sk->sk_callback_lock);
-	if (sk->sk_kern_sock)
+	if (sk->sk_kern_sock) {
 		lockdep_set_class_and_name(
 			&sk->sk_callback_lock,
 			af_kern_callback_keys + sk->sk_family,
 			af_family_kern_clock_key_strings[sk->sk_family]);
-	else
+		dept_rw_reinit(
+			&sk->sk_callback_lock.dmap,
+			af_kern_callback_dkeys + sk->sk_family, -1,
+			af_family_kern_clock_key_strings[sk->sk_family]);
+	} else {
 		lockdep_set_class_and_name(
 			&sk->sk_callback_lock,
 			af_callback_keys + sk->sk_family,
 			af_family_clock_key_strings[sk->sk_family]);
+		dept_rw_reinit(
+			&sk->sk_callback_lock.dmap,
+			af_callback_dkeys + sk->sk_family, -1,
+			af_family_clock_key_strings[sk->sk_family]);
+	}
 
 	sk->sk_state_change	=	sock_def_wakeup;
 	sk->sk_data_ready	=	sock_def_readable;
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 701fc72..7862c46 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1463,6 +1463,7 @@ static int l2tp_tunnel_sock_create(struct net *net,
 }
 
 static struct lock_class_key l2tp_socket_class;
+static struct dept_key l2tp_socket_dkey;
 
 int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id, u32 peer_tunnel_id,
 		       struct l2tp_tunnel_cfg *cfg, struct l2tp_tunnel **tunnelp)
@@ -1595,6 +1596,8 @@ int l2tp_tunnel_register(struct l2tp_tunnel *tunnel, struct net *net,
 	sk->sk_destruct = &l2tp_tunnel_destruct;
 	lockdep_set_class_and_name(&sk->sk_lock.slock, &l2tp_socket_class,
 				   "l2tp_sock");
+	dept_spin_reinit(&sk->sk_lock.slock.dmap, &l2tp_socket_dkey, -1,
+			 "l2tp_sock");
 	sk->sk_allocation = GFP_ATOMIC;
 
 	if (tunnel->fd >= 0)
diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index f76fac8..28b0ffa 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -64,6 +64,7 @@
 #define SYNC_PROTO_VER  1		/* Protocol version in header */
 
 static struct lock_class_key __ipvs_sync_key;
+static struct dept_key __ipvs_sync_dkey;
 /*
  *	IPVS sync connection entry
  *	Version 0, i.e. original version.
@@ -2034,10 +2035,8 @@ int stop_sync_thread(struct netns_ipvs *ipvs, int state)
  */
 int __net_init ip_vs_sync_net_init(struct netns_ipvs *ipvs)
 {
-	/*
-	 * TODO: Initialize the mutex with a valid dept_key.
-	 */
-	__mutex_init(&ipvs->sync_mutex, "ipvs->sync_mutex", &__ipvs_sync_key, NULL);
+	__mutex_init(&ipvs->sync_mutex, "ipvs->sync_mutex", &__ipvs_sync_key,
+			&__ipvs_sync_dkey);
 	spin_lock_init(&ipvs->sync_lock);
 	spin_lock_init(&ipvs->sync_buff_lock);
 	return 0;
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index d2d1448..7374dfe 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -89,6 +89,7 @@ static inline int netlink_is_kernel(struct sock *sk)
 static DECLARE_WAIT_QUEUE_HEAD(nl_table_wait);
 
 static struct lock_class_key nlk_cb_mutex_keys[MAX_LINKS];
+static struct dept_key nlk_cb_mutex_dkeys[MAX_LINKS];
 
 static const char *const nlk_cb_mutex_key_strings[MAX_LINKS + 1] = {
 	"nlk_cb_mutex-ROUTE",
@@ -642,6 +643,9 @@ static int __netlink_create(struct net *net, struct socket *sock,
 		lockdep_set_class_and_name(nlk->cb_mutex,
 					   nlk_cb_mutex_keys + protocol,
 					   nlk_cb_mutex_key_strings[protocol]);
+		dept_spin_reinit(&nlk->cb_mutex->dmap,
+				 nlk_cb_mutex_dkeys + protocol, -1,
+				 nlk_cb_mutex_key_strings[protocol]);
 	}
 	init_waitqueue_head(&nlk->wait);
 
diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c
index ed49769..a3c1a30 100644
--- a/net/rxrpc/call_object.c
+++ b/net/rxrpc/call_object.c
@@ -53,6 +53,7 @@ static void rxrpc_call_timer_expired(struct timer_list *t)
 }
 
 static struct lock_class_key rxrpc_call_user_mutex_lock_class_key;
+static struct dept_key rxrpc_call_user_mutex_lock_class_dkey;
 
 /*
  * find an extant server call
@@ -119,9 +120,12 @@ struct rxrpc_call *rxrpc_alloc_call(struct rxrpc_sock *rx, gfp_t gfp,
 	/* Prevent lockdep reporting a deadlock false positive between the afs
 	 * filesystem and sys_sendmsg() via the mmap sem.
 	 */
-	if (rx->sk.sk_kern_sock)
+	if (rx->sk.sk_kern_sock) {
 		lockdep_set_class(&call->user_mutex,
 				  &rxrpc_call_user_mutex_lock_class_key);
+		dept_spin_reinit(&call->user_mutex.dmap,
+				  &rxrpc_call_user_mutex_lock_class_dkey, -1, NULL);
+	}
 
 	timer_setup(&call->timer, rxrpc_call_timer_expired, 0);
 	INIT_WORK(&call->processor, &rxrpc_process_call);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 54c4172..2d0e2a6 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -858,6 +858,12 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 	lockdep_set_class(&sch->busylock,
 			  dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
 
+	/*
+	 * TODO: Should re-initialize the map with a valid dept_key.
+	 */
+	dept_spin_nocheck(&sch->busylock.dmap);
+	dept_spin_nocheck(&sch->seqlock.dmap);
+
 	seqcount_init(&sch->running);
 	lockdep_set_class(&sch->running,
 			  dev->qdisc_running_key ?: &qdisc_running_key);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [RFC] Dept(Dependency Tracker) Report Example
  2020-11-23 11:13 ` [RFC] Dept(Dependency Tracker) Report Example Byungchul Park
@ 2020-11-23 12:14   ` Byungchul Park
  0 siblings, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 12:14 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

On Mon, Nov 23, 2020 at 08:13:32PM +0900, Byungchul Park wrote:
> [    0.995081] ===================================================
> [    0.995619] Dept: Circular dependency has been detected.
> [    0.995816] 5.9.0+ #8 Tainted: G        W        
> [    0.996493] ---------------------------------------------------
> [    0.996493] summary
> [    0.996493] ---------------------------------------------------
> [    0.996493] *** AA DEADLOCK ***
> [    0.996493] 
> [    0.996493] context A
> [    0.996493]     [S] __mutex_lock(&dev->mutex:0)
> [    0.996493]     [W] __mutex_lock(&dev->mutex:0)
> [    0.996493]     [E] __mutex_unlock(&dev->mutex:0)
> [    0.996493] 
> [    0.996493] [S]: start of the event context
> [    0.996493] [W]: the wait blocked
> [    0.996493] [E]: the event not reachable

Let me explain what [S], [W] and [E] mean using example:

1. In the case of typical locks:

   if condition a
      lock(&a);		<- [S] start of the event context for the event, unlock(&a)
			   [W] wait as well
      ...
      unlock(&a);	<- [E] event to someone who has been waiting lock &a to
			       be released
   else
      lock(&b);		<- [S] start of the event context for the event, unlock(&b)
			   [W] wait as well
      ...
      unlock(&b);	<- [E] event to someone who has been waiting lock &b to
			       be released

2. In the case of general wait and event:

   THREAD 1
      trigger_the_event_context_to_go();
      ...
      wait_for_something(&c); <- [W] wait
         store_timestamp();

   THREAD 2
      notice_someone_triggered_me();
      ...

      (somewhere can see the timestamp of wait_for_something(&c) in THREAD 1)
      <- [S] start of the event context for the event, do_something(&c)

      ...
      do_something(&c); <- [E] event the wait is waiting for in THREAD 1

Thanks,
Byungchul


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Dept(Dependency Tracker) Implementation
  2020-11-23 11:05 ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
  2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
@ 2020-11-23 12:29   ` Byungchul Park
  1 sibling, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 12:29 UTC (permalink / raw)
  To: torvalds, peterz, mingo, will
  Cc: linux-kernel, tglx, rostedt, joel, alexander.levin,
	daniel.vetter, chris, duyuyang, johannes.berg, tj, tytso, willy,
	david, amir73il, bfields, gregkh, kernel-team

On Mon, Nov 23, 2020 at 08:05:27PM +0900, Byungchul Park wrote:
> Hi,
> 
> This patchset is too nasty to get reviewed in detail for now.

I worked Dept against mainline v5.9.

Thanks,
Byungchul

> This have:
> 
>    1. applying Dept to spinlock/mutex/rwlock/completion
>    2. assigning custom keys or disable maps to avoid false positives
> 
> This doesn't have yet (but will be done):
> 
>    1. proc interfaces e.g. to see dependecies the tool has built,
>    2. applying Dept to rw semaphore and the like,
>    3. applying Dept to lock_page()/unlock_page(),
>    4. assigning custom keys to more places properly,
>    5. replace all manual Lockdep annotations,
>    (and so on..)
> 
> But I decided to share it to let others able to test how it works and
> someone who wants to see the detail able to check the code. The most
> important thing I'd like to show is what exactly a deadlock detection
> tool should do.
> 
> Turn on CONFIG_DEPT to test it. Feel free to leave any questions if you
> have.
> 
> Thanks,
> Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC] Are you good with Lockdep?
  2020-11-16 15:37               ` Matthew Wilcox
  2020-11-18  1:45                 ` Boqun Feng
@ 2020-11-23 13:15                 ` Byungchul Park
  1 sibling, 0 replies; 31+ messages in thread
From: Byungchul Park @ 2020-11-23 13:15 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Steven Rostedt, Thomas Gleixner, Ingo Molnar, torvalds, peterz,
	mingo, will, linux-kernel, joel, alexander.levin, daniel.vetter,
	chris, duyuyang, johannes.berg, tj, tytso, david, amir73il,
	bfields, gregkh, kernel-team

On Mon, Nov 16, 2020 at 03:37:29PM +0000, Matthew Wilcox wrote:
> > > Something I believe lockdep is missing is a way to annotate "This lock
> > > will be released by a softirq".  If we had lockdep for lock_page(), this
> > > would be a great case to show off.  The filesystem locks the page, then
> > > submits it to a device driver.  On completion, the filesystem's bio
> > > completion handler will be called in softirq context and unlock the page.
> > > 
> > > So if the filesystem has another lock which is acquired by the completion
> > > handler. we could get an ABBA deadlock that lockdep would be unable to see.
> > > 
> > > There are other similar things; if you look at the remaining semaphore
> > > users in the kernel, you'll see the general pattern is that they're
> > > acquired in process context and then released in interrupt context.
> > > If we had a way to transfer ownership of the semaphore to a generic
> > > "interrupt context", they could become mutexes and lockdep could check
> > > that nothing else will cause a deadlock.
> > 
> > Yes. Those are exactly what Cross-release feature solves. Those problems
> > can be achieved with Cross-release. But even with Cross-release, we
> > still cannot solve the problem of (1) readlock handling (2) and false
> > positives preventing further reporting.
> 
> It's not just about lockdep for semaphores.  Mutexes will spin if the
> current owner is still running, so to convert an interrupt-released
> semaphore to a mutex, we need a way to mark the mutex as being released
> by the new owner.
> 
> I really don't think you want to report subsequent lockdep splats.

Don't you think it would be ok if the # of splats is not too many?

Or is it still a problem even if not?

We shouldn't do that if it clearly makes a big problem. Otherwise, it
should be because any deadlock detection tool cannot be enhanced to be
stonger which inevitably produces false positives until proper keys are
assigned to all classes in the kernel, unless multiple reports are
allowed.

Could you explain why? It would be appreciated.

Thanks,
Byungchul

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2020-11-23 13:18 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-11  5:05 [RFC] Are you good with Lockdep? Byungchul Park
2020-11-11 10:54 ` Ingo Molnar
2020-11-11 14:36   ` Steven Rostedt
2020-11-11 23:16     ` Thomas Gleixner
2020-11-12  8:10       ` Byungchul Park
2020-11-12 14:26         ` Steven Rostedt
2020-11-12 14:52           ` Matthew Wilcox
2020-11-16  8:57             ` Byungchul Park
2020-11-16 15:37               ` Matthew Wilcox
2020-11-18  1:45                 ` Boqun Feng
2020-11-18  3:30                   ` Matthew Wilcox
2020-11-23 13:15                 ` Byungchul Park
2020-11-12 14:58           ` Byungchul Park
2020-11-16  9:05             ` Byungchul Park
2020-11-23 10:45               ` Byungchul Park
2020-11-12 10:32     ` Byungchul Park
2020-11-12 13:56       ` Daniel Vetter
2020-11-16  8:45         ` Byungchul Park
2020-11-12  6:15   ` Byungchul Park
2020-11-12  8:51     ` Byungchul Park
2020-11-12  9:46       ` Byungchul Park
2020-11-23 11:05 ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
2020-11-23 11:36   ` [RFC 1/6] dept: Implement Dept(Dependency Tracker) Byungchul Park
2020-11-23 11:36     ` [RFC 2/6] dept: Apply Dept to spinlock Byungchul Park
2020-11-23 11:36     ` [RFC 3/6] dept: Apply Dept to mutex families Byungchul Park
2020-11-23 11:36     ` [RFC 4/6] dept: Apply Dept to rwlock Byungchul Park
2020-11-23 11:36     ` [RFC 5/6] dept: Apply Dept to wait_for_completion()/complete() Byungchul Park
2020-11-23 11:36     ` [RFC 6/6] dept: Assign custom dept_keys or disable to avoid false positives Byungchul Park
2020-11-23 12:29   ` [RFC] Dept(Dependency Tracker) Implementation Byungchul Park
2020-11-23 11:13 ` [RFC] Dept(Dependency Tracker) Report Example Byungchul Park
2020-11-23 12:14   ` Byungchul Park

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).