From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754732AbcHSMnS (ORCPT <rfc822;w@1wt.eu>);
        Fri, 19 Aug 2016 08:43:18 -0400
Received: from LGEAMRELO13.lge.com ([156.147.23.53]:49787 "EHLO
        lgeamrelo13.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753471AbcHSMmz (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Fri, 19 Aug 2016 08:42:55 -0400
X-Original-SENDERIP: 156.147.1.151
X-Original-MAILFROM: byungchul.park@lge.com
X-Original-SENDERIP: 10.177.222.33
X-Original-MAILFROM: byungchul.park@lge.com
Date: Fri, 19 Aug 2016 21:39:59 +0900
From: Byungchul Park <byungchul.park@lge.com>
To: peterz@infradead.org, mingo@kernel.org
Cc: tglx@linutronix.de, npiggin@kernel.dk, walken@google.com,
        boqun.feng@gmail.com, kirill@shutemov.name,
        linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [Revised document] Crossrelease lockdep
Message-ID: <20160819123959.GW2279@X58A-UD3R>
References: <1467883803-29132-1-git-send-email-byungchul.park@lge.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1467883803-29132-1-git-send-email-byungchul.park@lge.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

I rewrote the document so that the need of crossrelease feature can be
described logically. I think it's more important than to post specific
implementation. Could you let me know your opinions about this?

Thanks,
Byungchul

----->8-----

Crossrelease
============

Started by Byungchul Park <byungchul.park@lge.com>

Contents:

 (*) Background.

     - What causes deadlock.
     - What lockdep detects.
     - How lockdep works.

 (*) Limitation.

     - Limit to typical lock.
     - Pros from the limitation.
     - Cons from the limitation.

 (*) Generalization.

     - Relax the limitation.

 (*) Crossrelease.

     - Introduce crossrelease.
     - Introduce commit.

 (*) Implementation.

     - Data structures.
     - How crossrelease works.

 (*) Optimizations.

     - Avoid duplication.
     - Avoid lock contention.


==========
Background
==========

What causes deadlock
--------------------

A deadlock occurs when a context is waiting for an event to be issued
which cannot be issued because the context or another context who can
issue the event is also waiting for an event to be issued which cannot
be issued. Single context or more than one context both waiting for an
event and issuing an event may paricipate in a deadlock.

For example,

A context who can issue event D is waiting for event A to be issued.
A context who can issue event A is waiting for event B to be issued.
A context who can issue event B is waiting for event C to be issued.
A context who can issue event C is waiting for event D to be issued.

A deadlock occurs when these four operations are run at a time because
event D cannot be issued if event A isn't issued which in turn cannot be
issued if event B isn't issued which in turn cannot be issued if event C
isn't issued which in turn cannot be issued if event D isn't issued. No
event can be issued since any of them never meets its precondition.

We can easily recognize that each wait operation creates a dependency
between two issuings e.g. between issuing D and issuing A like, 'event D
cannot be issued if event A isn't issued', in other words, 'issuing
event D depends on issuing event A'. So the whole example can be
rewritten in terms of dependency,

Do an operation making 'event D cannot be issued if event A isn't issued'.
Do an operation making 'event A cannot be issued if event B isn't issued'.
Do an operation making 'event B cannot be issued if event C isn't issued'.
Do an operation making 'event C cannot be issued if event D isn't issued'.

or,

Do an operation making 'issuing event D depends on issuing event A'.
Do an operation making 'issuing event A depends on issuing event B'.
Do an operation making 'issuing event B depends on issuing event C'.
Do an operation making 'issuing event C depends on issuing event D'.

What causes a deadlock is a set of dependencies a chain of which forms a
cycle, which means that issuing event D depending on issuing event A
depending on issuing event B depending on issuing event C depending on
issuing event D, finally depends on issuing event D itself, which means
no event can be issued.

Any set of operations creating dependencies causes a deadlock. The set
of lock operations e.g. acquire and release is an example. Waiting for a
lock to be released corresponds to waiting for an event and releasing a
lock corresponds to issuing an event. So the description of dependency
above can be altered to one in terms of lock.

In terms of event, issuing event A depends on issuing event B if,

	Event A cannot be issued if event B isn't issued.

In terms of lock, releasing lock A depends on releasing lock B if,

	Lock A cannot be released if lock B isn't released.

CONCLUSION

A set of dependencies a chain of which forms a cycle, causes a deadlock,
no matter what creates the dependencies.


What lockdep detects
--------------------

A deadlock actually occurs only when all operations creating problematic
dependencies are run at a time. However, even if it has not happend, the
deadlock potentially can occur if the problematic dependencies obviously
exist. Thus it's meaningful to detect not only an actual deadlock but
also its possibility. Lockdep does the both.

Whether a deadlock actually occurs or not depends on several factors,
which means a deadlock may not occur even though problematic
dependencies exist. For example, what order contexts are switched in is
a factor. A deadlock will occur when contexts are switched so that all
operations causing a deadlock become run simultaneously.

Lockdep tries to detect a deadlock or its possibility aggressively,
though it also tries to avoid false positive detections. So lockdep is
designed to consider all possible combinations of dependencies so that
it can detect all potential possibilities of deadlock in advance. What
lockdep tries in order to consider all possibilities are,

1. Use a global dependency graph including all dependencies.

   What lockdep checks is based on dependencies instead of what actually
   happened. So no matter which context or call path a new dependency is
   detected in, it's just referred to as a global factor.

2. Use lock classes than lock instances when checking dependencies.

   What actually causes a deadlock is lock instances. However, lockdep
   uses lock classes than its instances when checking dependencies since
   any instance of a same lock class can be altered anytime.

So lockdep detects both an actual deadlock and its possibility. But the
latter is more valuable than the former. When a deadlock actually
occures, we can identify what happens in the system by some means or
other even without lockdep. However, there's no way to detect possiblity
without lockdep unless the whole code is parsed in head. It's terrible.

CONCLUSION

Lockdep does, the fisrt one is more valuable,

1. Detecting and reporting deadlock possibility.
2. Detecting and reporting a deadlock actually occured.


How lockdep works
-----------------

What lockdep should do, to detect a deadlock or its possibility are,

1. Detect a new dependency created.
2. Keep the dependency in a global data structure esp. graph.
3. Check if any of all possible chains of dependencies forms a cycle.
4. Report a deadlock or its possibility if a cycle is detected.

A graph used by lockdep to keep all dependencies looks like,

A -> B -        -> F -> G
        \      /
         -> E -        -> L
        /      \      /
C -> D -        -> H -
                      \
                       -> I -> K
                      /
                   J -

where A, B,..., L are different lock classes.

Lockdep adds a dependency into graph when a new dependency is detected.
For example, it adds a dependency 'A -> B' when a dependency between
releasing lock A and releasing lock B, which has not been added yet, is
detected. It does same thing on other dependencies, too. See 'What
causes deadlock' section.

NOTE: Precisely speaking, a dependency is one between releasing a lock
and releasing another lock as described in 'What causes deadlock'
section. However from now on, we will describe a dependency as if it's
one between a lock and another lock for simplicity. Then 'A -> B' can be
described as a dependency between lock A and lock B.

We already checked how a problematic set of dependencies causes a
deadlock in 'What causes deadlock' section. This time let's check if a
deadlock or its possibility can be detected using a problematic set of
dependencies. Assume that 'A -> B', 'B -> E' and 'E -> A' were added in
the sequence into graph. Then the graph finally will be,

 -> A -> B -> E -
/                \
\                /
 ----------------

where A, B and E are different lock classes.

>>From adding three dependencies, a cycle was created which means, by
definition of dependency, the situation 'lock E must be released to
release lock B which in turn must be released to release lock A which in
turn must be released to release lock E which in turn must be released
to release B and so on infinitely' can happen.

Once the situation happens, no lock can be released since any of them
can never meet each precondition. It's a deadlock. Lockdep can detect a
deadlock or its possibility with checking if a cycle was created after
adding each dependency into graph. This is how lockdep detects a
deadlock or its possibility.

CONCLUSION

Lockdep detects a deadlock or its possibility with checking if a cycle
was created after adding each dependency into graph.


==========
Limitation
==========

Limit to typical lock
---------------------

Limiting what lockdep has to consider to only ones satisfying the
following condition, the implementation of adding dependencies becomes
simple while its capacity for detection becomes limited. Typical lock
e.g. spin lock and mutex is the case. Let's check what pros and cons of
it are, in next section.

	A lock should be released within the context holding the lock.

CONCLUSION

Limiting what lockdep has to consider to typical lock e.g. spin lock and
mutex, the implmentation becomes simple while it has a limited capacity.


Pros from the limitation
------------------------

Given the limitation, when acquiring a lock, any lock being in
held_locks of the acquire context cannot be released if the lock to
acquire was not released yet. Yes. It's the exact case to add a new
dependency 'A -> B' into graph, where lock A represents each lock being
in held_locks and lock B represents the lock to acquire.

For example, only considering typical lock,

	PROCESS X
	--------------
	acquire A

	acquire B -> add a dependency 'A -> B'

	acquire C -> add a dependency 'B -> C'

	release C

	release B

	release A

where A, B and C are different lock classes.

When acquiring lock A, there's nothing in held_locks of PROCESS X thus
no dependency is added. When acquiring lock B, lockdep detects and adds
a new dependency 'A -> B' between lock A being in held_locks and lock B.
And when acquiring lock C, lockdep also adds another dependency 'B -> C'
for same reason. They are added when acquiring each lock, simply.

NOTE: Even though every lock being in held_locks depends on the lock to
acquire, lockdep does not add all dependencies between them because all
of them can be covered by other dependencies except one dependency
between the lock on top of held_locks and the lock to acquire, which
must be added.

Besides, we can expect several advantages from the limitation.

1. Any lock being in held_locks cannot be released unconditionally if
   the context is stuck, thus we can easily identify a dependency when
   acquiring a lock.

2. Considering only locks being in local held_locks of a single context
   makes some races avoidable, even though it fails of course when
   modifying its global dependency graph.

3. To build a dependency graph, lockdep only needs to keep locks not
   released yet. However relaxing the limitation, it might need to keep
   even locks already released, additionally. See 'Crossrelease' section.

CONCLUSION

Given the limitation, the implementation becomes simple and efficient.


Cons from the limitation
------------------------

Given the limitation, lockdep is applicable only to typical lock. For
example, page lock for page access or completion for synchronization
cannot play with lockdep having the limitation. However since page lock
or completion also causes a deadlock, it would be better to detect a
deadlock or its possibility even for them.

Can we detect deadlocks below with lockdep having the limitation?

Example 1:

	PROCESS X	PROCESS Y
	--------------	--------------
	mutext_lock A
			lock_page B
	lock_page B
			mutext_lock A // DEADLOCK
	unlock_page B
			mutext_unlock A
	mutex_unlock A
			unlock_page B

where A and B are different lock classes.

No, we cannot.

Example 2:

	PROCESS X	PROCESS Y	PROCESS Z
	--------------	--------------	--------------
			mutex_lock A
	lock_page B
			lock_page B
					mutext_lock A // DEADLOCK
					mutext_unlock A

					unlock_page B held by X
			unlock_page B
			mutex_unlock A

where A and B are different lock classes.

No, we cannot.

Example 3:

	PROCESS X	PROCESS Y
	--------------	--------------
			mutex_lock A
	mutex_lock A
	mutex_unlock A
			wait_for_complete B // DEADLOCK
	complete B
			mutex_unlock A

where A is a lock class and B is a completion variable.

No, we cannot.

CONCLUSION

Given the limitation, lockdep cannot detect a deadlock or its
possibility caused by page lock or completion.


==============
Generalization
==============

Relax the limitation
--------------------

Detecting and adding new dependencies into graph is very important for
lockdep to work because adding a dependency means adding a chance to
check if it causes a deadlock. More dependencies lockdep adds, more
throughly it can work. Therefore Lockdep has to do its best to add as
many true dependencies as possible.

Relaxing the limitation, lockdep can add additional dependencies since
it makes lockdep deal with additional ones creating the dependencies e.g.
page lock or completion, which might be released in any context. Even so,
it needs to be noted that behaviors adding dependencies created by
typical lock don't need to be changed at all.

For example, only considering typical lock, lockdep builds a graph like,

A -> B -        -> F -> G
        \      /
         -> E -        -> L
        /      \      /
C -> D -        -> H -
                      \
                       -> I -> K
                      /
                   J -

where A, B,..., L are different lock classes, and upper case letters
represent typical lock.

After the relaxing, the graph will have additional dependencies like,

A -> B -        -> F -> G
        \      /
         -> E -        -> L -> c
        /      \      /
C -> D -        -> H -
               /      \
            a -        -> I -> K
                      /
              b -> J -

where a, b, c, A, B,..., L are different lock classes, and upper case
letters represent typical lock while lower case letters represent
non-typical lock e.g. page lock and completion.

However, it might suffer performance degradation since relaxing the
limitation with which design and implementation of lockdep become
efficient might introduce inefficiency inevitably. Each option, that is,
strong detection or efficient detection has its pros and cons, thus the
right of choice between two options should be given to users.

Choosing efficient detection, lockdep only deals with locks satisfying,

	A lock should be released within the context holding the lock.

Choosing strong detection, lockdep deals with any locks satisfying,

	A lock can be released in any context.

In the latter, of course, some contexts are not allowed if they
themselves cause a deadlock. For example, acquiring a lock in irq-safe
context before releasing the lock in irq-unsafe context is not allowed,
which after all ends in a cycle of a dependency chain, meaning a
deadlock. Otherwise, any contexts are allowed to release it.

CONCLUSION

Relaxing the limitation, lockdep adds additional dependencies and gets
additional chances to check if they cause a deadlock. It makes lockdep
additionally deal with what might be released in any context.


============
Crossrelease
============

Introduce crossrelease
----------------------

To allow lockdep to add additional dependencies created by what might be
released in any context, which we call 'crosslock', it's necessary to
introduce a new feature which makes it possible to identify and add the
dependencies. We call the feature 'crossrelease'. Crossrelease feature
has to do,

1. Identify a new dependency created by crosslock.
2. Add the dependency into graph when identifying it.

That's all. Once a meaningful dependency is added into graph, lockdep
will work with the graph as it did. So the most important thing to do is
to identify a dependency created by crosslock. Remind what a dependency
is. For example, Lock A depends on lock B if 'lock A cannot be released
if lock B isn't released'. See 'What causes deadlock' section.

By definition, a lock depends on every lock having been added into
held_locks in the lock's release context since the lock was acquired,
because the lock cannot be released if the release context is stuck by
any of dependent locks, not released. So lockdep should technically
consider release contexts of locks to identify dependencies.

It's no matter of course to typical lock because acquire context is same
as release context for typical lock, which means lockdep would work with
considering only acquire contexts for typical lock. However, for
crosslock, lockdep cannot identify release context and any dependency
until the crosslock will be actually released.

Regarding crosslock, lockdep has to record all history by queueing all
locks potentially creating dependencies so that real dependencies can be
added using the history recorded when identifying release context. We
call it 'commit', that is, to add dependencies in batches. See
'Introduce commit' section.

Of course, some actual deadlocks caused by crosslock cannot be detected
at the time it happened, because the deadlocks cannot be indentified and
detected until the crosslock will be actually released. But this way
deadlock possibility can be detected and it's worth just possibility
detection of deadlock. See 'What lockdep does' section.

CONCLUSION

With crossrelease feature, lockdep can works with what might be released
in any context, namely, crosslock.


Introduce commit
----------------

Crossrelease feature names it 'commit' to identify and add dependencies
into graph in batches. Lockdep is already doing what commit does when
acquiring a lock, for typical lock. However, that way must be changed
for crosslock so that it identifies the crosslock's release context
first and then does commit.

The main reason why lockdep performs additional step, namely commit, for
crosslock is that some dependencies by crosslock cannot be identified
until the crosslock's release context is eventually identified, though
some other dependencies by crosslock can. There are four kinds of
dependencies to consider.

1. 'typical lock A -> typical lock B' dependency

   Just when acquiring lock B, lockdep can identify the dependency
   between lock A and lock B as it did. Commit is unnecessary.

2. 'typical lock A -> crosslock b' dependency

   Just when acquiring crosslock b, lockdep can identify the dependency
   between lock A and crosslock B as well. Commit is unnecessary, too.

3. 'crosslock a -> typical lock B' dependency

   When acquiring lock B, lockdep cannot identify the dependency. It can
   be identified only when crosslock a is released. Commit is necessary.

4. 'crosslock a -> crosslock b' dependency

   Creating this kind of dependency directly is unnecessary since it can
   be covered by other kinds of dependencies.

Lockdep works without commit during dealing with only typical locks.
However, it needs to perform commit step, once at least one crosslock is
acquired, until all crosslocks in progress are released. Introducing
commit, lockdep performs three steps i.e. acquire, commit and release.
What lockdep should do in each step is like,

1. Acquire

   1) For typical lock

	Lockdep does what it originally does and queues the lock so
	that lockdep can check dependencies using it at commit step.

   2) For crosslock

	The crosslock is added to a global linked list so that lockdep
	can check dependencies using it at commit step.

2. Commit

   1) For typical lock

	N/A.

   2) For crosslock

	Lockdep checks and adds dependencies using data saved at acquire
	step, as if the dependencies were added without commit step.

3. Release

   1) For typical lock

	No change.

   2) For crosslock

	Lockdep just remove the crosslock from the global linked list,
	to which it was added at acquire step.

CONCLUSION

Lockdep can detect a deadlock or its possibility caused by what might be
released in any context, using commit step, where it checks and adds
dependencies in batches.


==============
Implementation
==============

Data structures
---------------

Crossrelease feature introduces two main data structures.

1. pend_lock (or plock)

   This is an array embedded in task_struct, for keeping locks queued so
   that real dependencies can be added using them at commit step. So
   this data can be accessed locklessly within the owner context. The
   array is filled when acquiring a typical lock and consumed when doing
   commit. And it's managed in circular manner.

2. cross_lock (or xlock)

   This is a global linked list, for keeping all crosslocks in progress.
   The list grows when acquiring a crosslock and is shrunk when
   releasing the crosslock. lockdep_init_map_crosslock() should be used
   to initialize a crosslock instance instead of lockdep_init_map() so
   that lockdep can recognize it as crosslock.

CONCLUSION

Crossrelease feature uses two main data structures.

1. A pend_lock array for queueing typical locks in circular manner.
2. A cross_lock linked list for managing crosslocks in progress.


How crossrelease works
----------------------

Let's take look at how crossrelease feature works step by step, starting
from how lockdep works without crossrelease feaure.

For example, the below is how lockdep works for typical lock.

	RELEASE CONTEXT of A (= ACQUIRE CONTEXT of A)
	--------------------
	acquire A

	acquire B -> add a dependency 'A -> B'

	acquire C -> add a dependency 'B -> C'

	release C

	release B

	release A

where A, B and C are different lock classes, and upper case letters
represent typical lock.

After adding 'A -> B', the dependency graph will be,

A -> B

where A and B are different lock classes, and upper case letters
represent typical lock.

And after adding 'B -> C', the graph will be,

A -> B -> C

where A, B and C are different lock classes, and upper case letters
represent typical lock.

What if applying commit on typical locks? It's not necessary for typical
lock. Just for showing what commit does.

	RELEASE CONTEXT of A (= ACQUIRE CONTEXT of A)
	--------------------
	acquire A -> mark A as started (nothing before, no queueing)

	acquire B -> mark B as started and queue B

	acquire C -> mark C as started and queue C

	release C -> commit C (nothing queued since C started)

	release B -> commit B -> add a dependency 'B -> C'

	release A -> commit A -> add dependencies 'A -> B' and 'A -> C'

where A, B and C are different lock classes, and upper case letters
represent typical lock.

After doing commit A, B and C, the dependency graph becomes like,

A -> B -> C

where A, B and C are different lock classes, and upper case letters
represent typical lock.

NOTE: A dependency 'A -> C' is optimized out.

Here we can see the final graph is same as the graph built without
commit. Of course the former way leads to finish building the graph
earlier than the latter way, which means we can detect a deadlock or its
possibility sooner. So the former way would be prefered if possible. But
we cannot avoid using the latter way using commit, for crosslock.

Let's look at how commit works for crosslock.

	RELEASE CONTEXT of a	ACQUIRE CONTEXT of a
	--------------------	--------------------
				acquire a -> mark a as started

	(serialized by some means e.g. barrier)

	acquire D -> queue D
				acquire B -> queue B
	release D
				acquire C -> add 'B -> C' and queue C
	acquire E -> queue E
				acquire D -> add 'C -> D' and queue D
	release E
				release D
	release a -> commit a -> add 'a -> D' and 'a -> E'
				release C

				release B

where a, B,..., E are different lock classes, and upper case letters
represent typical lock while lower case letters represent crosslock.

When acquiring crosslock a, no dependency can be added since there's
nothing in the held_locks. However, crossrelease feature marks the
crosslock as started, which means all locks to acquire from now are
candidates which might create new dependencies later when identifying
release context.

When acquiring lock B, lockdep does what it originally does for typical
lock and additionally queues the lock for later commit to refer to
because it might be a dependent lock of the crosslock. It does same
thing on lock C, D and E. And then two dependencies 'a -> D' and 'a -> E'
are added when identifying the release context, at commit step.

The final graph is, with crossrelease feature using commit,

B -> C -
        \
         -> D
        /
     a -
        \
         -> E

where a, B,..., E are different lock classes, and upper case letters
represent typical lock while lower case letters represent crosslock.

However, without crossrelease feature, the final graph will be,

B -> C -> D

where B and C are different lock classes, and upper case letters
represent typical lock.

The former graph has two more dependencies 'a -> D' and 'a -> E' giving
additional chances to check if they cause a deadlock. This way lockdep
can detect a deadlock or its possibility caused by crosslock. Again,
behaviors adding dependencies created by only typical locks are not
changed at all.

CONCLUSION

Crossrelease works using commit for crosslock, leaving behaviors adding
dependencies between only typical locks unchanged.


=============
Optimizations
=============

Avoid duplication
-----------------

Crossrelease feature uses a cache like what lockdep already uses for
dependency chains, but this time it's for caching one dependency like
'crosslock -> typical lock' crossing between two different context. Once
that dependency is cached, same dependency will never be added any more.
Even queueing unnecessary locks is also prevented based on the cache.

CONCLUSION

Crossrelease does not add any duplicate dependency.


Avoid lock contention
---------------------

To keep all typical locks for later use, crossrelease feature adopts a
local array embedded in task_struct, which makes accesses to arrays
lockless by forcing each array to be accessed only within each own
context. It's like how held_locks is accessed. Lockless implmentation is
important since typical locks are very frequently accessed.

CONCLUSION

Crossrelease avoids lock contection as far as possible.

From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <owner-linux-mm@kvack.org>
Received: from mail-oi0-f69.google.com (mail-oi0-f69.google.com [209.85.218.69])
	by kanga.kvack.org (Postfix) with ESMTP id B8AD96B025E
	for <linux-mm@kvack.org>; Fri, 19 Aug 2016 08:42:32 -0400 (EDT)
Received: by mail-oi0-f69.google.com with SMTP id p18so118805132oic.0
        for <linux-mm@kvack.org>; Fri, 19 Aug 2016 05:42:32 -0700 (PDT)
Received: from lgeamrelo13.lge.com (LGEAMRELO13.lge.com. [156.147.23.53])
        by mx.google.com with ESMTP id e11si7961186iof.128.2016.08.19.05.42.30
        for <linux-mm@kvack.org>;
        Fri, 19 Aug 2016 05:42:31 -0700 (PDT)
Date: Fri, 19 Aug 2016 21:39:59 +0900
From: Byungchul Park <byungchul.park@lge.com>
Subject: [Revised document] Crossrelease lockdep
Message-ID: <20160819123959.GW2279@X58A-UD3R>
References: <1467883803-29132-1-git-send-email-byungchul.park@lge.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1467883803-29132-1-git-send-email-byungchul.park@lge.com>
Sender: owner-linux-mm@kvack.org
List-ID: <linux-mm.kvack.org>
To: peterz@infradead.org, mingo@kernel.org
Cc: tglx@linutronix.de, npiggin@kernel.dk, walken@google.com, boqun.feng@gmail.com, kirill@shutemov.name, linux-kernel@vger.kernel.org, linux-mm@kvack.org

Hello,

I rewrote the document so that the need of crossrelease feature can be
described logically. I think it's more important than to post specific
implementation. Could you let me know your opinions about this?

Thanks,
Byungchul

----->8-----

Crossrelease
============

Started by Byungchul Park <byungchul.park@lge.com>

Contents:

 (*) Background.

     - What causes deadlock.
     - What lockdep detects.
     - How lockdep works.

 (*) Limitation.

     - Limit to typical lock.
     - Pros from the limitation.
     - Cons from the limitation.

 (*) Generalization.

     - Relax the limitation.

 (*) Crossrelease.

     - Introduce crossrelease.
     - Introduce commit.

 (*) Implementation.

     - Data structures.
     - How crossrelease works.

 (*) Optimizations.

     - Avoid duplication.
     - Avoid lock contention.


==========
Background
==========

What causes deadlock
--------------------

A deadlock occurs when a context is waiting for an event to be issued
which cannot be issued because the context or another context who can
issue the event is also waiting for an event to be issued which cannot
be issued. Single context or more than one context both waiting for an
event and issuing an event may paricipate in a deadlock.

For example,

A context who can issue event D is waiting for event A to be issued.
A context who can issue event A is waiting for event B to be issued.
A context who can issue event B is waiting for event C to be issued.
A context who can issue event C is waiting for event D to be issued.

A deadlock occurs when these four operations are run at a time because
event D cannot be issued if event A isn't issued which in turn cannot be
issued if event B isn't issued which in turn cannot be issued if event C
isn't issued which in turn cannot be issued if event D isn't issued. No
event can be issued since any of them never meets its precondition.

We can easily recognize that each wait operation creates a dependency
between two issuings e.g. between issuing D and issuing A like, 'event D
cannot be issued if event A isn't issued', in other words, 'issuing
event D depends on issuing event A'. So the whole example can be
rewritten in terms of dependency,

Do an operation making 'event D cannot be issued if event A isn't issued'.
Do an operation making 'event A cannot be issued if event B isn't issued'.
Do an operation making 'event B cannot be issued if event C isn't issued'.
Do an operation making 'event C cannot be issued if event D isn't issued'.

or,

Do an operation making 'issuing event D depends on issuing event A'.
Do an operation making 'issuing event A depends on issuing event B'.
Do an operation making 'issuing event B depends on issuing event C'.
Do an operation making 'issuing event C depends on issuing event D'.

What causes a deadlock is a set of dependencies a chain of which forms a
cycle, which means that issuing event D depending on issuing event A
depending on issuing event B depending on issuing event C depending on
issuing event D, finally depends on issuing event D itself, which means
no event can be issued.

Any set of operations creating dependencies causes a deadlock. The set
of lock operations e.g. acquire and release is an example. Waiting for a
lock to be released corresponds to waiting for an event and releasing a
lock corresponds to issuing an event. So the description of dependency
above can be altered to one in terms of lock.

In terms of event, issuing event A depends on issuing event B if,

	Event A cannot be issued if event B isn't issued.

In terms of lock, releasing lock A depends on releasing lock B if,

	Lock A cannot be released if lock B isn't released.

CONCLUSION

A set of dependencies a chain of which forms a cycle, causes a deadlock,
no matter what creates the dependencies.


What lockdep detects
--------------------

A deadlock actually occurs only when all operations creating problematic
dependencies are run at a time. However, even if it has not happend, the
deadlock potentially can occur if the problematic dependencies obviously
exist. Thus it's meaningful to detect not only an actual deadlock but
also its possibility. Lockdep does the both.

Whether a deadlock actually occurs or not depends on several factors,
which means a deadlock may not occur even though problematic
dependencies exist. For example, what order contexts are switched in is
a factor. A deadlock will occur when contexts are switched so that all
operations causing a deadlock become run simultaneously.

Lockdep tries to detect a deadlock or its possibility aggressively,
though it also tries to avoid false positive detections. So lockdep is
designed to consider all possible combinations of dependencies so that
it can detect all potential possibilities of deadlock in advance. What
lockdep tries in order to consider all possibilities are,

1. Use a global dependency graph including all dependencies.

   What lockdep checks is based on dependencies instead of what actually
   happened. So no matter which context or call path a new dependency is
   detected in, it's just referred to as a global factor.

2. Use lock classes than lock instances when checking dependencies.

   What actually causes a deadlock is lock instances. However, lockdep
   uses lock classes than its instances when checking dependencies since
   any instance of a same lock class can be altered anytime.

So lockdep detects both an actual deadlock and its possibility. But the
latter is more valuable than the former. When a deadlock actually
occures, we can identify what happens in the system by some means or
other even without lockdep. However, there's no way to detect possiblity
without lockdep unless the whole code is parsed in head. It's terrible.

CONCLUSION

Lockdep does, the fisrt one is more valuable,

1. Detecting and reporting deadlock possibility.
2. Detecting and reporting a deadlock actually occured.


How lockdep works
-----------------

What lockdep should do, to detect a deadlock or its possibility are,

1. Detect a new dependency created.
2. Keep the dependency in a global data structure esp. graph.
3. Check if any of all possible chains of dependencies forms a cycle.
4. Report a deadlock or its possibility if a cycle is detected.

A graph used by lockdep to keep all dependencies looks like,

A -> B -        -> F -> G
        \      /
         -> E -        -> L
        /      \      /
C -> D -        -> H -
                      \
                       -> I -> K
                      /
                   J -

where A, B,..., L are different lock classes.

Lockdep adds a dependency into graph when a new dependency is detected.
For example, it adds a dependency 'A -> B' when a dependency between
releasing lock A and releasing lock B, which has not been added yet, is
detected. It does same thing on other dependencies, too. See 'What
causes deadlock' section.

NOTE: Precisely speaking, a dependency is one between releasing a lock
and releasing another lock as described in 'What causes deadlock'
section. However from now on, we will describe a dependency as if it's
one between a lock and another lock for simplicity. Then 'A -> B' can be
described as a dependency between lock A and lock B.

We already checked how a problematic set of dependencies causes a
deadlock in 'What causes deadlock' section. This time let's check if a
deadlock or its possibility can be detected using a problematic set of
dependencies. Assume that 'A -> B', 'B -> E' and 'E -> A' were added in
the sequence into graph. Then the graph finally will be,

 -> A -> B -> E -
/                \
\                /
 ----------------

where A, B and E are different lock classes.

>>From adding three dependencies, a cycle was created which means, by
definition of dependency, the situation 'lock E must be released to
release lock B which in turn must be released to release lock A which in
turn must be released to release lock E which in turn must be released
to release B and so on infinitely' can happen.

Once the situation happens, no lock can be released since any of them
can never meet each precondition. It's a deadlock. Lockdep can detect a
deadlock or its possibility with checking if a cycle was created after
adding each dependency into graph. This is how lockdep detects a
deadlock or its possibility.

CONCLUSION

Lockdep detects a deadlock or its possibility with checking if a cycle
was created after adding each dependency into graph.


==========
Limitation
==========

Limit to typical lock
---------------------

Limiting what lockdep has to consider to only ones satisfying the
following condition, the implementation of adding dependencies becomes
simple while its capacity for detection becomes limited. Typical lock
e.g. spin lock and mutex is the case. Let's check what pros and cons of
it are, in next section.

	A lock should be released within the context holding the lock.

CONCLUSION

Limiting what lockdep has to consider to typical lock e.g. spin lock and
mutex, the implmentation becomes simple while it has a limited capacity.


Pros from the limitation
------------------------

Given the limitation, when acquiring a lock, any lock being in
held_locks of the acquire context cannot be released if the lock to
acquire was not released yet. Yes. It's the exact case to add a new
dependency 'A -> B' into graph, where lock A represents each lock being
in held_locks and lock B represents the lock to acquire.

For example, only considering typical lock,

	PROCESS X
	--------------
	acquire A

	acquire B -> add a dependency 'A -> B'

	acquire C -> add a dependency 'B -> C'

	release C

	release B

	release A

where A, B and C are different lock classes.

When acquiring lock A, there's nothing in held_locks of PROCESS X thus
no dependency is added. When acquiring lock B, lockdep detects and adds
a new dependency 'A -> B' between lock A being in held_locks and lock B.
And when acquiring lock C, lockdep also adds another dependency 'B -> C'
for same reason. They are added when acquiring each lock, simply.

NOTE: Even though every lock being in held_locks depends on the lock to
acquire, lockdep does not add all dependencies between them because all
of them can be covered by other dependencies except one dependency
between the lock on top of held_locks and the lock to acquire, which
must be added.

Besides, we can expect several advantages from the limitation.

1. Any lock being in held_locks cannot be released unconditionally if
   the context is stuck, thus we can easily identify a dependency when
   acquiring a lock.

2. Considering only locks being in local held_locks of a single context
   makes some races avoidable, even though it fails of course when
   modifying its global dependency graph.

3. To build a dependency graph, lockdep only needs to keep locks not
   released yet. However relaxing the limitation, it might need to keep
   even locks already released, additionally. See 'Crossrelease' section.

CONCLUSION

Given the limitation, the implementation becomes simple and efficient.


Cons from the limitation
------------------------

Given the limitation, lockdep is applicable only to typical lock. For
example, page lock for page access or completion for synchronization
cannot play with lockdep having the limitation. However since page lock
or completion also causes a deadlock, it would be better to detect a
deadlock or its possibility even for them.

Can we detect deadlocks below with lockdep having the limitation?

Example 1:

	PROCESS X	PROCESS Y
	--------------	--------------
	mutext_lock A
			lock_page B
	lock_page B
			mutext_lock A // DEADLOCK
	unlock_page B
			mutext_unlock A
	mutex_unlock A
			unlock_page B

where A and B are different lock classes.

No, we cannot.

Example 2:

	PROCESS X	PROCESS Y	PROCESS Z
	--------------	--------------	--------------
			mutex_lock A
	lock_page B
			lock_page B
					mutext_lock A // DEADLOCK
					mutext_unlock A

					unlock_page B held by X
			unlock_page B
			mutex_unlock A

where A and B are different lock classes.

No, we cannot.

Example 3:

	PROCESS X	PROCESS Y
	--------------	--------------
			mutex_lock A
	mutex_lock A
	mutex_unlock A
			wait_for_complete B // DEADLOCK
	complete B
			mutex_unlock A

where A is a lock class and B is a completion variable.

No, we cannot.

CONCLUSION

Given the limitation, lockdep cannot detect a deadlock or its
possibility caused by page lock or completion.


==============
Generalization
==============

Relax the limitation
--------------------

Detecting and adding new dependencies into graph is very important for
lockdep to work because adding a dependency means adding a chance to
check if it causes a deadlock. More dependencies lockdep adds, more
throughly it can work. Therefore Lockdep has to do its best to add as
many true dependencies as possible.

Relaxing the limitation, lockdep can add additional dependencies since
it makes lockdep deal with additional ones creating the dependencies e.g.
page lock or completion, which might be released in any context. Even so,
it needs to be noted that behaviors adding dependencies created by
typical lock don't need to be changed at all.

For example, only considering typical lock, lockdep builds a graph like,

A -> B -        -> F -> G
        \      /
         -> E -        -> L
        /      \      /
C -> D -        -> H -
                      \
                       -> I -> K
                      /
                   J -

where A, B,..., L are different lock classes, and upper case letters
represent typical lock.

After the relaxing, the graph will have additional dependencies like,

A -> B -        -> F -> G
        \      /
         -> E -        -> L -> c
        /      \      /
C -> D -        -> H -
               /      \
            a -        -> I -> K
                      /
              b -> J -

where a, b, c, A, B,..., L are different lock classes, and upper case
letters represent typical lock while lower case letters represent
non-typical lock e.g. page lock and completion.

However, it might suffer performance degradation since relaxing the
limitation with which design and implementation of lockdep become
efficient might introduce inefficiency inevitably. Each option, that is,
strong detection or efficient detection has its pros and cons, thus the
right of choice between two options should be given to users.

Choosing efficient detection, lockdep only deals with locks satisfying,

	A lock should be released within the context holding the lock.

Choosing strong detection, lockdep deals with any locks satisfying,

	A lock can be released in any context.

In the latter, of course, some contexts are not allowed if they
themselves cause a deadlock. For example, acquiring a lock in irq-safe
context before releasing the lock in irq-unsafe context is not allowed,
which after all ends in a cycle of a dependency chain, meaning a
deadlock. Otherwise, any contexts are allowed to release it.

CONCLUSION

Relaxing the limitation, lockdep adds additional dependencies and gets
additional chances to check if they cause a deadlock. It makes lockdep
additionally deal with what might be released in any context.


============
Crossrelease
============

Introduce crossrelease
----------------------

To allow lockdep to add additional dependencies created by what might be
released in any context, which we call 'crosslock', it's necessary to
introduce a new feature which makes it possible to identify and add the
dependencies. We call the feature 'crossrelease'. Crossrelease feature
has to do,

1. Identify a new dependency created by crosslock.
2. Add the dependency into graph when identifying it.

That's all. Once a meaningful dependency is added into graph, lockdep
will work with the graph as it did. So the most important thing to do is
to identify a dependency created by crosslock. Remind what a dependency
is. For example, Lock A depends on lock B if 'lock A cannot be released
if lock B isn't released'. See 'What causes deadlock' section.

By definition, a lock depends on every lock having been added into
held_locks in the lock's release context since the lock was acquired,
because the lock cannot be released if the release context is stuck by
any of dependent locks, not released. So lockdep should technically
consider release contexts of locks to identify dependencies.

It's no matter of course to typical lock because acquire context is same
as release context for typical lock, which means lockdep would work with
considering only acquire contexts for typical lock. However, for
crosslock, lockdep cannot identify release context and any dependency
until the crosslock will be actually released.

Regarding crosslock, lockdep has to record all history by queueing all
locks potentially creating dependencies so that real dependencies can be
added using the history recorded when identifying release context. We
call it 'commit', that is, to add dependencies in batches. See
'Introduce commit' section.

Of course, some actual deadlocks caused by crosslock cannot be detected
at the time it happened, because the deadlocks cannot be indentified and
detected until the crosslock will be actually released. But this way
deadlock possibility can be detected and it's worth just possibility
detection of deadlock. See 'What lockdep does' section.

CONCLUSION

With crossrelease feature, lockdep can works with what might be released
in any context, namely, crosslock.


Introduce commit
----------------

Crossrelease feature names it 'commit' to identify and add dependencies
into graph in batches. Lockdep is already doing what commit does when
acquiring a lock, for typical lock. However, that way must be changed
for crosslock so that it identifies the crosslock's release context
first and then does commit.

The main reason why lockdep performs additional step, namely commit, for
crosslock is that some dependencies by crosslock cannot be identified
until the crosslock's release context is eventually identified, though
some other dependencies by crosslock can. There are four kinds of
dependencies to consider.

1. 'typical lock A -> typical lock B' dependency

   Just when acquiring lock B, lockdep can identify the dependency
   between lock A and lock B as it did. Commit is unnecessary.

2. 'typical lock A -> crosslock b' dependency

   Just when acquiring crosslock b, lockdep can identify the dependency
   between lock A and crosslock B as well. Commit is unnecessary, too.

3. 'crosslock a -> typical lock B' dependency

   When acquiring lock B, lockdep cannot identify the dependency. It can
   be identified only when crosslock a is released. Commit is necessary.

4. 'crosslock a -> crosslock b' dependency

   Creating this kind of dependency directly is unnecessary since it can
   be covered by other kinds of dependencies.

Lockdep works without commit during dealing with only typical locks.
However, it needs to perform commit step, once at least one crosslock is
acquired, until all crosslocks in progress are released. Introducing
commit, lockdep performs three steps i.e. acquire, commit and release.
What lockdep should do in each step is like,

1. Acquire

   1) For typical lock

	Lockdep does what it originally does and queues the lock so
	that lockdep can check dependencies using it at commit step.

   2) For crosslock

	The crosslock is added to a global linked list so that lockdep
	can check dependencies using it at commit step.

2. Commit

   1) For typical lock

	N/A.

   2) For crosslock

	Lockdep checks and adds dependencies using data saved at acquire
	step, as if the dependencies were added without commit step.

3. Release

   1) For typical lock

	No change.

   2) For crosslock

	Lockdep just remove the crosslock from the global linked list,
	to which it was added at acquire step.

CONCLUSION

Lockdep can detect a deadlock or its possibility caused by what might be
released in any context, using commit step, where it checks and adds
dependencies in batches.


==============
Implementation
==============

Data structures
---------------

Crossrelease feature introduces two main data structures.

1. pend_lock (or plock)

   This is an array embedded in task_struct, for keeping locks queued so
   that real dependencies can be added using them at commit step. So
   this data can be accessed locklessly within the owner context. The
   array is filled when acquiring a typical lock and consumed when doing
   commit. And it's managed in circular manner.

2. cross_lock (or xlock)

   This is a global linked list, for keeping all crosslocks in progress.
   The list grows when acquiring a crosslock and is shrunk when
   releasing the crosslock. lockdep_init_map_crosslock() should be used
   to initialize a crosslock instance instead of lockdep_init_map() so
   that lockdep can recognize it as crosslock.

CONCLUSION

Crossrelease feature uses two main data structures.

1. A pend_lock array for queueing typical locks in circular manner.
2. A cross_lock linked list for managing crosslocks in progress.


How crossrelease works
----------------------

Let's take look at how crossrelease feature works step by step, starting
from how lockdep works without crossrelease feaure.

For example, the below is how lockdep works for typical lock.

	RELEASE CONTEXT of A (= ACQUIRE CONTEXT of A)
	--------------------
	acquire A

	acquire B -> add a dependency 'A -> B'

	acquire C -> add a dependency 'B -> C'

	release C

	release B

	release A

where A, B and C are different lock classes, and upper case letters
represent typical lock.

After adding 'A -> B', the dependency graph will be,

A -> B

where A and B are different lock classes, and upper case letters
represent typical lock.

And after adding 'B -> C', the graph will be,

A -> B -> C

where A, B and C are different lock classes, and upper case letters
represent typical lock.

What if applying commit on typical locks? It's not necessary for typical
lock. Just for showing what commit does.

	RELEASE CONTEXT of A (= ACQUIRE CONTEXT of A)
	--------------------
	acquire A -> mark A as started (nothing before, no queueing)

	acquire B -> mark B as started and queue B

	acquire C -> mark C as started and queue C

	release C -> commit C (nothing queued since C started)

	release B -> commit B -> add a dependency 'B -> C'

	release A -> commit A -> add dependencies 'A -> B' and 'A -> C'

where A, B and C are different lock classes, and upper case letters
represent typical lock.

After doing commit A, B and C, the dependency graph becomes like,

A -> B -> C

where A, B and C are different lock classes, and upper case letters
represent typical lock.

NOTE: A dependency 'A -> C' is optimized out.

Here we can see the final graph is same as the graph built without
commit. Of course the former way leads to finish building the graph
earlier than the latter way, which means we can detect a deadlock or its
possibility sooner. So the former way would be prefered if possible. But
we cannot avoid using the latter way using commit, for crosslock.

Let's look at how commit works for crosslock.

	RELEASE CONTEXT of a	ACQUIRE CONTEXT of a
	--------------------	--------------------
				acquire a -> mark a as started

	(serialized by some means e.g. barrier)

	acquire D -> queue D
				acquire B -> queue B
	release D
				acquire C -> add 'B -> C' and queue C
	acquire E -> queue E
				acquire D -> add 'C -> D' and queue D
	release E
				release D
	release a -> commit a -> add 'a -> D' and 'a -> E'
				release C

				release B

where a, B,..., E are different lock classes, and upper case letters
represent typical lock while lower case letters represent crosslock.

When acquiring crosslock a, no dependency can be added since there's
nothing in the held_locks. However, crossrelease feature marks the
crosslock as started, which means all locks to acquire from now are
candidates which might create new dependencies later when identifying
release context.

When acquiring lock B, lockdep does what it originally does for typical
lock and additionally queues the lock for later commit to refer to
because it might be a dependent lock of the crosslock. It does same
thing on lock C, D and E. And then two dependencies 'a -> D' and 'a -> E'
are added when identifying the release context, at commit step.

The final graph is, with crossrelease feature using commit,

B -> C -
        \
         -> D
        /
     a -
        \
         -> E

where a, B,..., E are different lock classes, and upper case letters
represent typical lock while lower case letters represent crosslock.

However, without crossrelease feature, the final graph will be,

B -> C -> D

where B and C are different lock classes, and upper case letters
represent typical lock.

The former graph has two more dependencies 'a -> D' and 'a -> E' giving
additional chances to check if they cause a deadlock. This way lockdep
can detect a deadlock or its possibility caused by crosslock. Again,
behaviors adding dependencies created by only typical locks are not
changed at all.

CONCLUSION

Crossrelease works using commit for crosslock, leaving behaviors adding
dependencies between only typical locks unchanged.


=============
Optimizations
=============

Avoid duplication
-----------------

Crossrelease feature uses a cache like what lockdep already uses for
dependency chains, but this time it's for caching one dependency like
'crosslock -> typical lock' crossing between two different context. Once
that dependency is cached, same dependency will never be added any more.
Even queueing unnecessary locks is also prevented based on the cache.

CONCLUSION

Crossrelease does not add any duplicate dependency.


Avoid lock contention
---------------------

To keep all typical locks for later use, crossrelease feature adopts a
local array embedded in task_struct, which makes accesses to arrays
lockless by forcing each array to be accessed only within each own
context. It's like how held_locks is accessed. Lockless implmentation is
important since typical locks are very frequently accessed.

CONCLUSION

Crossrelease avoids lock contection as far as possible.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>