From: Thomas Gleixner <tglx@linutronix.de>
To: paulmck@kernel.org
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Sebastian Siewior <bigeasy@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Jonathan Corbet <corbet@lwn.net>,
	Randy Dunlap <rdunlap@infradead.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Kurt Schwemmer <kurt.schwemmer@microsemi.com>,
	linux-pci@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Felipe Balbi <balbi@kernel.org>,
	linux-usb@vger.kernel.org, Kalle Valo <kvalo@codeaurora.org>,
	"David S. Miller" <davem@davemloft.net>,
	linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
	Darren Hart <dvhart@infradead.org>,
	Andy Shevchenko <andy@infradead.org>,
	platform-driver-x86@vger.kernel.org,
	Zhang Rui <rui.zhang@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	linux-pm@vger.kernel.org, Len Brown <lenb@kernel.org>,
	linux-acpi@vger.kernel.org, kbuild test robot <lkp@intel.com>,
	Nick Hu <nickhu@andestech.com>, Greentime Hu <green.hu@gmail.com>,
	Vincent Chen <deanbo422@gmail.com>, Guo Ren <guoren@kernel.org>,
	linux-csky@vger.kernel.org, Brian Cain <bcain@codeaurora.org>,
	linux-hexagon@vger.kernel.org, Tony Luck <tony.luck@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	linux-ia64@vger.kernel.org, Michal Simek <monstr@monstr.eu>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Arnd Bergmann <arnd@arndb.de>, Geoff Levand <geoff@infradead.org>,
	linuxppc-dev@lists.ozlabs.org, Davidlohr Bueso <dbueso@suse.de>
Subject: Re: [patch V3 13/20] Documentation: Add lock ordering and nesting documentation
Date: Wed, 25 Mar 2020 00:13:34 +0100	[thread overview]
Message-ID: <87r1xhz6qp.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20200323025501.GE3199@paulmck-ThinkPad-P72>

Paul,

"Paul E. McKenney" <paulmck@kernel.org> writes:
> On Sat, Mar 21, 2020 at 12:25:57PM +0100, Thomas Gleixner wrote:
> In the normal case where the task sleeps through the entire lock
> acquisition, the sequence of events is as follows:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                                lock wakeup
>                                  state = real_state == UNINTERRUPTIBLE
>
> This sequence of events can occur when the task acquires spinlocks
> on its way to sleeping, for example, in a call to wait_event().
>
> The non-lock wakeup can occur when a wakeup races with this wait_event(),
> which can result in the following sequence of events:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                              non lock wakeup
>                                  real_state = RUNNING
>
>                                lock wakeup
>                                  state = real_state == RUNNING
>
> Without this real_state subterfuge, the wakeup might be lost.

I added this with a few modifications which reflect the actual
implementation. Conceptually the same.
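
Roughly, the mechanism looks like this in C-ish pseudocode (helper names
are invented for illustration; the real code lives in the rtmutex slow
path):

    /* Sketch only -- not the actual implementation. */
    static void rtlock_slowpath_block(void)
    {
            /* Preserve whatever state the caller set, e.g. TASK_INTERRUPTIBLE */
            current->saved_state = current->state;
            set_current_state(TASK_UNINTERRUPTIBLE);
            schedule();
    }

    static void rtlock_wakeup(struct task_struct *task)
    {
            /*
             * Lock wakeup: restore the caller's state. A regular wakeup
             * which raced with the lock sleep has already overwritten
             * saved_state with TASK_RUNNING, so that wakeup is not lost.
             */
            task->state = task->saved_state;
    }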

> rwsems have grown special-purpose interfaces that allow non-owner release.
> This non-owner release prevents PREEMPT_RT from substituting RT-mutex
> implementations, for example, by defeating priority inheritance.
> After all, if the lock has no owner, whose priority should be boosted?
> As a result, PREEMPT_RT does not currently support rwsem, which in turn
> means that code using it must therefore be disabled until a workable
> solution presents itself.
>
> [ Note: Not as confident as I would like to be in the above. ]

I'm not confident either, especially not after looking at the actual
code.

In fact I feel really stupid because the rw_semaphore reader non-owner
restriction on RT simply does not exist anymore and my history-biased
memory tricked me.

The first rw_semaphore implementation of RT was simple and restricted
the reader side to a single reader to support PI on both the reader and
the writer side. That obviously did not scale well and made mmap_sem-heavy
use cases pretty unhappy.

The short interlude with multi-reader boosting turned out to be a failed
experiment - Steven might still disagree though :)

At some point we gave up and I myself (sic!) reimplemented the RT
variant of rw_semaphore with a reader-biased mechanism.

The reader never holds the underlying rt_mutex across the read side
critical section. It merely increments the reader count and drops it on
release.

The only time a reader takes the rt_mutex is when it blocks on a
writer. Writers hold the rt_mutex across the write side critical section
to allow incoming readers to boost them. Once the writer releases the
rw_semaphore, it unlocks the rt_mutex, which is then handed off to the
readers. They increment the reader count and then drop the rt_mutex
before continuing in the read side critical section.
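
In C-ish pseudocode (names and fields are invented for illustration;
this is not the actual kernel implementation):

    struct rt_rwsem {
            atomic_t        readers;        /* read side only touches this */
            struct rt_mutex rtmutex;        /* held by writers across the write side */
    };

    static void rt_down_read(struct rt_rwsem *sem)
    {
            if (!writer_active(sem)) {              /* hypothetical check */
                    atomic_inc(&sem->readers);      /* fast path: count only */
                    return;
            }
            /*
             * Slow path: block on the rtmutex so the writer is boosted,
             * then take a reader count and drop the rtmutex again before
             * entering the read side critical section.
             */
            rt_mutex_lock(&sem->rtmutex);
            atomic_inc(&sem->readers);
            rt_mutex_unlock(&sem->rtmutex);
    }

    static void rt_up_read(struct rt_rwsem *sem)
    {
            atomic_dec(&sem->readers);              /* rtmutex is not involved */
    }

    static void rt_down_write(struct rt_rwsem *sem)
    {
            /*
             * Writers hold the rtmutex across the whole write side critical
             * section so that incoming readers can boost them. Releasing
             * the rw_semaphore unlocks the rtmutex which hands off to the
             * waiting readers.
             */
            rt_mutex_lock(&sem->rtmutex);
            wait_until_no_readers(sem);             /* hypothetical */
    }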

So while I changed the implementation, it obviously did not occur to me
that this also lifted the non-owner release restriction. Nobody else
noticed either. So we kept dragging this along in both memory and
implementation. Both will be fixed now :)

The owner semantics of down/up_read() are enforced only by lockdep. That
applies to both RT and !RT. The up/down_read_non_owner() variants exist
solely to tell lockdep that a non-owner release is intended.
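
For illustration (down_read_non_owner()/up_read_non_owner() are the real
interfaces; the usage around them is made up):

    #include <linux/rwsem.h>

    static DECLARE_RWSEM(example_sem);      /* illustrative rwsem */

    /*
     * Normal owner semantics: the task which did down_read() also does
     * up_read(), which is exactly what lockdep checks.
     */
    static void reader(void)
    {
            down_read(&example_sem);
            /* read side critical section */
            up_read(&example_sem);
    }

    /*
     * Non-owner release: acquired here, released later from a different
     * context (e.g. a completion callback) via
     * up_read_non_owner(&example_sem). The _non_owner() variants only
     * adjust the lockdep annotation; the locking itself is unchanged on
     * both RT and !RT.
     */
    static void reader_handoff(void)
    {
            down_read_non_owner(&example_sem);
            /* hand the read side off to the other context here */
    }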

So, I picked up your other suggestions with slight modifications and
adjusted the owner, semaphore and rw_semaphore docs accordingly.

Please have a close look at the patch below (applies on tip core/locking).

Thanks,

        tglx, who is searching a brown paperbag

8<----------

 Documentation/locking/locktypes.rst |  148 +++++++++++++++++++++++-------------
 1 file changed, 98 insertions(+), 50 deletions(-)

--- a/Documentation/locking/locktypes.rst
+++ b/Documentation/locking/locktypes.rst
@@ -67,6 +67,17 @@ Spinning locks implicitly disable preemp
  _irqsave/restore()   Save and disable / restore interrupt disabled state
  ===================  ====================================================
 
+Owner semantics
+===============
+
+The aforementioned lock types except semaphores have strict owner
+semantics:
+
+  The context (task) that acquired the lock must release it.
+
+rw_semaphores have a special interface which allows non-owner release for
+readers.
+
 
 rtmutex
 =======
@@ -83,6 +94,51 @@ interrupt handlers and soft interrupts.
 and rwlock_t to be implemented via RT-mutexes.
 
 
+semaphore
+=========
+
+semaphore is a counting semaphore implementation.
+
+Semaphores are often used for both serialization and waiting, but new use
+cases should instead use separate serialization and wait mechanisms, such
+as mutexes and completions.
+
+semaphores and PREEMPT_RT
+----------------------------
+
+PREEMPT_RT does not change the semaphore implementation. That's impossible
+due to the counting semaphore semantics which have no concept of owners.
+The lack of an owner conflicts with priority inheritance. After all an
+unknown owner cannot be boosted. As a consequence blocking on semaphores
+can be subject to priority inversion.
+
+
+rw_semaphore
+============
+
+rw_semaphore is a multiple readers and single writer lock mechanism.
+
+On non-PREEMPT_RT kernels the implementation is fair, thus preventing
+writer starvation.
+
+rw_semaphore complies by default with the strict owner semantics, but there
+exist special-purpose interfaces that allow non-owner release for readers.
+These work independent of the kernel configuration.
+
+rw_semaphore and PREEMPT_RT
+---------------------------
+
+PREEMPT_RT kernels map rw_semaphore to a separate rt_mutex-based
+implementation, thus changing the fairness:
+
+ Because an rw_semaphore writer cannot grant its priority to multiple
+ readers, a preempted low-priority reader will continue holding its lock,
+ thus starving even high-priority writers.  In contrast, because readers
+ can grant their priority to a writer, a preempted low-priority writer will
+ have its priority boosted until it releases the lock, thus preventing that
+ writer from starving readers.
+
+
 raw_spinlock_t and spinlock_t
 =============================
 
@@ -140,7 +196,16 @@ On a PREEMPT_RT enabled kernel spinlock_
    kernels leave task state untouched.  However, PREEMPT_RT must change
    task state if the task blocks during acquisition.  Therefore, it saves
    the current task state before blocking and the corresponding lock wakeup
-   restores it.
+   restores it::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					lock wakeup
+					  task->state = task->saved_state
 
    Other types of wakeups would normally unconditionally set the task state
    to RUNNING, but that does not work here because the task must remain
@@ -148,7 +213,22 @@ On a PREEMPT_RT enabled kernel spinlock_
    wakeup attempts to awaken a task blocked waiting for a spinlock, it
    instead sets the saved state to RUNNING.  Then, when the lock
    acquisition completes, the lock wakeup sets the task state to the saved
-   state, in this case setting it to RUNNING.
+   state, in this case setting it to RUNNING::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					non lock wakeup
+					  task->saved_state = TASK_RUNNING
+
+					lock wakeup
+					  task->state = task->saved_state
+
+   This ensures that the real wakeup cannot be lost.
+
 
 rwlock_t
 ========
@@ -228,17 +308,16 @@ while holding normal non-raw spinlocks b
 bit spinlocks
 -------------
 
-Bit spinlocks are problematic for PREEMPT_RT as they cannot be easily
-substituted by an RT-mutex based implementation for obvious reasons.
-
-The semantics of bit spinlocks are preserved on PREEMPT_RT kernels and the
-caveats vs. raw_spinlock_t apply.
-
-Some bit spinlocks are substituted by regular spinlock_t for PREEMPT_RT but
-this requires conditional (#ifdef'ed) code changes at the usage site while
-the spinlock_t substitution is simply done by the compiler and the
-conditionals are restricted to header files and core implementation of the
-locking primitives and the usage sites do not require any changes.
+PREEMPT_RT cannot substitute bit spinlocks because a single bit is too
+small to accommodate an RT-mutex.  Therefore, the semantics of bit
+spinlocks are preserved on PREEMPT_RT kernels, so that the raw_spinlock_t
+caveats also apply to bit spinlocks.
+
+Some bit spinlocks are replaced with regular spinlock_t for PREEMPT_RT
+using conditional (#ifdef'ed) code changes at the usage site.  In contrast,
+usage-site changes are not needed for the spinlock_t substitution.
+Instead, conditionals in header files and the core locking implementation
+enable the compiler to do the substitution transparently.
 
 
 Lock type nesting rules
@@ -254,46 +333,15 @@ Lock type nesting rules
 
   - Spinning lock types can nest inside sleeping lock types.
 
-These rules apply in general independent of CONFIG_PREEMPT_RT.
+These constraints apply both in CONFIG_PREEMPT_RT and otherwise.
 
-As PREEMPT_RT changes the lock category of spinlock_t and rwlock_t from
-spinning to sleeping this has obviously restrictions how they can nest with
-raw_spinlock_t.
-
-This results in the following nest ordering:
+The fact that PREEMPT_RT changes the lock category of spinlock_t and
+rwlock_t from spinning to sleeping means that they cannot be acquired while
+holding a raw spinlock.  This results in the following nesting ordering:
 
   1) Sleeping locks
   2) spinlock_t and rwlock_t
   3) raw_spinlock_t and bit spinlocks
 
-Lockdep is aware of these constraints to ensure that they are respected.
-
-
-Owner semantics
-===============
-
-Most lock types in the Linux kernel have strict owner semantics, i.e. the
-context (task) which acquires a lock has to release it.
-
-There are two exceptions:
-
-  - semaphores
-  - rwsems
-
-semaphores have no owner semantics for historical reason, and as such
-trylock and release operations can be called from any context. They are
-often used for both serialization and waiting purposes. That's generally
-discouraged and should be replaced by separate serialization and wait
-mechanisms, such as mutexes and completions.
-
-rwsems have grown interfaces which allow non owner release for special
-purposes. This usage is problematic on PREEMPT_RT because PREEMPT_RT
-substitutes all locking primitives except semaphores with RT-mutex based
-implementations to provide priority inheritance for all lock types except
-the truly spinning ones. Priority inheritance on ownerless locks is
-obviously impossible.
-
-For now the rwsem non-owner release excludes code which utilizes it from
-being used on PREEMPT_RT enabled kernels. In same cases this can be
-mitigated by disabling portions of the code, in other cases the complete
-functionality has to be disabled until a workable solution has been found.
+Lockdep will complain if these constraints are violated, both in
+CONFIG_PREEMPT_RT and otherwise.


WARNING: multiple messages have this Message-ID (diff)
From: Thomas Gleixner <tglx@linutronix.de>
To: paulmck@kernel.org
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Sebastian Siewior <bigeasy@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Jonathan Corbet <corbet@lwn.net>,
	Randy Dunlap <rdunlap@infradead.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Kurt Schwemmer <kurt.schwemmer@microsemi.com>,
	linux-pci@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Felipe Balbi <balbi@kernel.org>,
	linux-usb@vger.kernel.org, Kalle Valo <kvalo@codeaurora.org>,
	"David S. Miller" <davem@davemloft.net>,
	linux-wireless@vger.kernel.orgnet
Subject: Re: [patch V3 13/20] Documentation: Add lock ordering and nesting documentation
Date: Wed, 25 Mar 2020 00:13:34 +0100	[thread overview]
Message-ID: <87r1xhz6qp.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20200323025501.GE3199@paulmck-ThinkPad-P72>

Paul,

"Paul E. McKenney" <paulmck@kernel.org> writes:
> On Sat, Mar 21, 2020 at 12:25:57PM +0100, Thomas Gleixner wrote:
> In the normal case where the task sleeps through the entire lock
> acquisition, the sequence of events is as follows:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                                lock wakeup
>                                  state = real_state == UNINTERRUPTIBLE
>
> This sequence of events can occur when the task acquires spinlocks
> on its way to sleeping, for example, in a call to wait_event().
>
> The non-lock wakeup can occur when a wakeup races with this wait_event(),
> which can result in the following sequence of events:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                              non lock wakeup
>                                  real_state = RUNNING
>
>                                lock wakeup
>                                  state = real_state == RUNNING
>
> Without this real_state subterfuge, the wakeup might be lost.

I added this with a few modifications which reflect the actual
implementation. Conceptually the same.

> rwsems have grown special-purpose interfaces that allow non-owner release.
> This non-owner release prevents PREEMPT_RT from substituting RT-mutex
> implementations, for example, by defeating priority inheritance.
> After all, if the lock has no owner, whose priority should be boosted?
> As a result, PREEMPT_RT does not currently support rwsem, which in turn
> means that code using it must therefore be disabled until a workable
> solution presents itself.
>
> [ Note: Not as confident as I would like to be in the above. ]

I'm not confident either especially not after looking at the actual
code.

In fact I feel really stupid because the rw_semaphore reader non-owner
restriction on RT simply does not exist anymore and my history biased
memory tricked me.

The first rw_semaphore implementation of RT was simple and restricted
the reader side to a single reader to support PI on both the reader and
the writer side. That obviosuly did not scale well and made mmap_sem
heavy use cases pretty unhappy.

The short interlude with multi-reader boosting turned out to be a failed
experiment - Steven might still disagree though :)

At some point we gave up and I myself (sic!) reimplemented the RT
variant of rw_semaphore with a reader biased mechanism.

The reader never holds the underlying rt_mutex accross the read side
critical section. It merily increments the reader count and drops it on
release.

The only time a reader takes the rt_mutex is when it blocks on a
writer. Writers hold the rt_mutex across the write side critical section
to allow incoming readers to boost them. Once the writer releases the
rw_semaphore it unlocks the rt_mutex which is then handed off to the
readers. They increment the reader count and then drop the rt_mutex
before continuing in the read side critical section.

So while I changed the implementation it did obviously not occur to me
that this also lifted the non-owner release restriction. Nobody else
noticed either. So we kept dragging this along in both memory and
implementation. Both will be fixed now :)

The owner semantics of down/up_read() are only enforced by lockdep. That
applies to both RT and !RT. The up/down_read_non_owner() variants are
just there to tell lockdep about it.

So, I picked up your other suggestions with slight modifications and
adjusted the owner, semaphore and rw_semaphore docs accordingly.

Please have a close look at the patch below (applies on tip core/locking).

Thanks,

        tglx, who is searching a brown paperbag

8<----------

 Documentation/locking/locktypes.rst |  148 +++++++++++++++++++++++-------------
 1 file changed, 98 insertions(+), 50 deletions(-)

--- a/Documentation/locking/locktypes.rst
+++ b/Documentation/locking/locktypes.rst
@@ -67,6 +67,17 @@ Spinning locks implicitly disable preemp
  _irqsave/restore()   Save and disable / restore interrupt disabled state
  ===================  ====================================================
 
+Owner semantics
+===============
+
+The aforementioned lock types except semaphores have strict owner
+semantics:
+
+  The context (task) that acquired the lock must release it.
+
+rw_semaphores have a special interface which allows non-owner release for
+readers.
+
 
 rtmutex
 =======
@@ -83,6 +94,51 @@ interrupt handlers and soft interrupts.
 and rwlock_t to be implemented via RT-mutexes.
 
 
+sempahore
+=========
+
+semaphore is a counting semaphore implementation.
+
+Semaphores are often used for both serialization and waiting, but new use
+cases should instead use separate serialization and wait mechanisms, such
+as mutexes and completions.
+
+sempahores and PREEMPT_RT
+----------------------------
+
+PREEMPT_RT does not change the sempahore implementation. That's impossible
+due to the counting semaphore semantics which have no concept of owners.
+The lack of an owner conflicts with priority inheritance. After all an
+unknown owner cannot be boosted. As a consequence blocking on semaphores
+can be subject to priority inversion.
+
+
+rw_sempahore
+============
+
+rw_semaphore is a multiple readers and single writer lock mechanism.
+
+On non-PREEMPT_RT kernels the implementation is fair, thus preventing
+writer starvation.
+
+rw_semaphore complies by default with the strict owner semantics, but there
+exist special-purpose interfaces that allow non-owner release for readers.
+These work independent of the kernel configuration.
+
+rw_sempahore and PREEMPT_RT
+---------------------------
+
+PREEMPT_RT kernels map rw_sempahore to a separate rt_mutex-based
+implementation, thus changing the fairness:
+
+ Because an rw_sempaphore writer cannot grant its priority to multiple
+ readers, a preempted low-priority reader will continue holding its lock,
+ thus starving even high-priority writers.  In contrast, because readers
+ can grant their priority to a writer, a preempted low-priority writer will
+ have its priority boosted until it releases the lock, thus preventing that
+ writer from starving readers.
+
+
 raw_spinlock_t and spinlock_t
 =============================
 
@@ -140,7 +196,16 @@ On a PREEMPT_RT enabled kernel spinlock_
    kernels leave task state untouched.  However, PREEMPT_RT must change
    task state if the task blocks during acquisition.  Therefore, it saves
    the current task state before blocking and the corresponding lock wakeup
-   restores it.
+   restores it::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					lock wakeup
+					  task->state = task->saved_state
 
    Other types of wakeups would normally unconditionally set the task state
    to RUNNING, but that does not work here because the task must remain
@@ -148,7 +213,22 @@ On a PREEMPT_RT enabled kernel spinlock_
    wakeup attempts to awaken a task blocked waiting for a spinlock, it
    instead sets the saved state to RUNNING.  Then, when the lock
    acquisition completes, the lock wakeup sets the task state to the saved
-   state, in this case setting it to RUNNING.
+   state, in this case setting it to RUNNING::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					non lock wakeup
+					  task->saved_state = TASK_RUNNING
+
+					lock wakeup
+					  task->state = task->saved_state
+
+   This ensures that the real wakeup cannot be lost.
+
 
 rwlock_t
 ========
@@ -228,17 +308,16 @@ while holding normal non-raw spinlocks b
 bit spinlocks
 -------------
 
-Bit spinlocks are problematic for PREEMPT_RT as they cannot be easily
-substituted by an RT-mutex based implementation for obvious reasons.
-
-The semantics of bit spinlocks are preserved on PREEMPT_RT kernels and the
-caveats vs. raw_spinlock_t apply.
-
-Some bit spinlocks are substituted by regular spinlock_t for PREEMPT_RT but
-this requires conditional (#ifdef'ed) code changes at the usage site while
-the spinlock_t substitution is simply done by the compiler and the
-conditionals are restricted to header files and core implementation of the
-locking primitives and the usage sites do not require any changes.
+PREEMPT_RT cannot substitute bit spinlocks because a single bit is too
+small to accommodate an RT-mutex.  Therefore, the semantics of bit
+spinlocks are preserved on PREEMPT_RT kernels, so that the raw_spinlock_t
+caveats also apply to bit spinlocks.
+
+Some bit spinlocks are replaced with regular spinlock_t for PREEMPT_RT
+using conditional (#ifdef'ed) code changes at the usage site.  In contrast,
+usage-site changes are not needed for the spinlock_t substitution.
+Instead, conditionals in header files and the core locking implemementation
+enable the compiler to do the substitution transparently.
 
 
 Lock type nesting rules
@@ -254,46 +333,15 @@ Lock type nesting rules
 
   - Spinning lock types can nest inside sleeping lock types.
 
-These rules apply in general independent of CONFIG_PREEMPT_RT.
+These constraints apply both in CONFIG_PREEMPT_RT and otherwise.
 
-As PREEMPT_RT changes the lock category of spinlock_t and rwlock_t from
-spinning to sleeping this has obviously restrictions how they can nest with
-raw_spinlock_t.
-
-This results in the following nest ordering:
+The fact that PREEMPT_RT changes the lock category of spinlock_t and
+rwlock_t from spinning to sleeping means that they cannot be acquired while
+holding a raw spinlock.  This results in the following nesting ordering:
 
   1) Sleeping locks
   2) spinlock_t and rwlock_t
   3) raw_spinlock_t and bit spinlocks
 
-Lockdep is aware of these constraints to ensure that they are respected.
-
-
-Owner semantics
-===============
-
-Most lock types in the Linux kernel have strict owner semantics, i.e. the
-context (task) which acquires a lock has to release it.
-
-There are two exceptions:
-
-  - semaphores
-  - rwsems
-
-semaphores have no owner semantics for historical reason, and as such
-trylock and release operations can be called from any context. They are
-often used for both serialization and waiting purposes. That's generally
-discouraged and should be replaced by separate serialization and wait
-mechanisms, such as mutexes and completions.
-
-rwsems have grown interfaces which allow non owner release for special
-purposes. This usage is problematic on PREEMPT_RT because PREEMPT_RT
-substitutes all locking primitives except semaphores with RT-mutex based
-implementations to provide priority inheritance for all lock types except
-the truly spinning ones. Priority inheritance on ownerless locks is
-obviously impossible.
-
-For now the rwsem non-owner release excludes code which utilizes it from
-being used on PREEMPT_RT enabled kernels. In same cases this can be
-mitigated by disabling portions of the code, in other cases the complete
-functionality has to be disabled until a workable solution has been found.
+Lockdep will complain if these constraints are violated, both in
+CONFIG_PREEMPT_RT and otherwise.

WARNING: multiple messages have this Message-ID (diff)
From: Thomas Gleixner <tglx@linutronix.de>
To: paulmck@kernel.org
Cc: linux-usb@vger.kernel.org, linux-ia64@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	linux-pci@vger.kernel.org,
	Sebastian Siewior <bigeasy@linutronix.de>,
	Oleg Nesterov <oleg@redhat.com>, Guo Ren <guoren@kernel.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Vincent Chen <deanbo422@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	linux-acpi@vger.kernel.org, Brian Cain <bcain@codeaurora.org>,
	Jonathan Corbet <corbet@lwn.net>,
	linux-hexagon@vger.kernel.org,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	linux-csky@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Darren Hart <dvhart@infradead.org>,
	Zhang Rui <rui.zhang@intel.com>, Len Brown <lenb@kernel.org>,
	Fenghua Yu <fenghua.yu@intel.com>, Arnd Bergmann <arnd@arndb.de>,
	linux-pm@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	Greentime Hu <green.hu@gmail.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Kurt Schwemmer <kurt.schwemmer@microsemi.com>,
	platform-driver-x86@vger.kernel.org,
	Kalle Valo <kvalo@codeaurora.org>,
	kbuild test robot <lkp@intel.com>,
	Felipe Balbi <balbi@kernel.org>, Michal Simek <monstr@monstr.eu>,
	Tony Luck <tony.luck@intel.com>, Nick Hu <nickhu@andestech.com>,
	Geoff Levand <geoff@infradead.org>,
	netdev@vger.kernel.org, Randy Dunlap <rdunlap@infradead.org>,
	linux-wireless@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Davidlohr Bueso <dbueso@suse.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	"David S. Miller" <davem@davemloft.net>,
	Andy Shevchenko <andy@infradead.org>
Subject: Re: [patch V3 13/20] Documentation: Add lock ordering and nesting documentation
Date: Wed, 25 Mar 2020 00:13:34 +0100	[thread overview]
Message-ID: <87r1xhz6qp.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20200323025501.GE3199@paulmck-ThinkPad-P72>

Paul,

"Paul E. McKenney" <paulmck@kernel.org> writes:
> On Sat, Mar 21, 2020 at 12:25:57PM +0100, Thomas Gleixner wrote:
> In the normal case where the task sleeps through the entire lock
> acquisition, the sequence of events is as follows:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                                lock wakeup
>                                  state = real_state == UNINTERRUPTIBLE
>
> This sequence of events can occur when the task acquires spinlocks
> on its way to sleeping, for example, in a call to wait_event().
>
> The non-lock wakeup can occur when a wakeup races with this wait_event(),
> which can result in the following sequence of events:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                              non lock wakeup
>                                  real_state = RUNNING
>
>                                lock wakeup
>                                  state = real_state == RUNNING
>
> Without this real_state subterfuge, the wakeup might be lost.

I added this with a few modifications which reflect the actual
implementation. Conceptually the same.

> rwsems have grown special-purpose interfaces that allow non-owner release.
> This non-owner release prevents PREEMPT_RT from substituting RT-mutex
> implementations, for example, by defeating priority inheritance.
> After all, if the lock has no owner, whose priority should be boosted?
> As a result, PREEMPT_RT does not currently support rwsem, which in turn
> means that code using it must therefore be disabled until a workable
> solution presents itself.
>
> [ Note: Not as confident as I would like to be in the above. ]

I'm not confident either especially not after looking at the actual
code.

In fact I feel really stupid because the rw_semaphore reader non-owner
restriction on RT simply does not exist anymore and my history biased
memory tricked me.

The first rw_semaphore implementation of RT was simple and restricted
the reader side to a single reader to support PI on both the reader and
the writer side. That obviosuly did not scale well and made mmap_sem
heavy use cases pretty unhappy.

The short interlude with multi-reader boosting turned out to be a failed
experiment - Steven might still disagree though :)

At some point we gave up and I myself (sic!) reimplemented the RT
variant of rw_semaphore with a reader biased mechanism.

The reader never holds the underlying rt_mutex accross the read side
critical section. It merily increments the reader count and drops it on
release.

The only time a reader takes the rt_mutex is when it blocks on a
writer. Writers hold the rt_mutex across the write side critical section
to allow incoming readers to boost them. Once the writer releases the
rw_semaphore it unlocks the rt_mutex which is then handed off to the
readers. They increment the reader count and then drop the rt_mutex
before continuing in the read side critical section.

So while I changed the implementation it did obviously not occur to me
that this also lifted the non-owner release restriction. Nobody else
noticed either. So we kept dragging this along in both memory and
implementation. Both will be fixed now :)

The owner semantics of down/up_read() are only enforced by lockdep. That
applies to both RT and !RT. The up/down_read_non_owner() variants are
just there to tell lockdep about it.

So, I picked up your other suggestions with slight modifications and
adjusted the owner, semaphore and rw_semaphore docs accordingly.

Please have a close look at the patch below (applies on tip core/locking).

Thanks,

        tglx, who is searching a brown paperbag

8<----------

 Documentation/locking/locktypes.rst |  148 +++++++++++++++++++++++-------------
 1 file changed, 98 insertions(+), 50 deletions(-)

--- a/Documentation/locking/locktypes.rst
+++ b/Documentation/locking/locktypes.rst
@@ -67,6 +67,17 @@ Spinning locks implicitly disable preemp
  _irqsave/restore()   Save and disable / restore interrupt disabled state
  ===================  ====================================================
 
+Owner semantics
+===============
+
+The aforementioned lock types except semaphores have strict owner
+semantics:
+
+  The context (task) that acquired the lock must release it.
+
+rw_semaphores have a special interface which allows non-owner release for
+readers.
+
 
 rtmutex
 =======
@@ -83,6 +94,51 @@ interrupt handlers and soft interrupts.
 and rwlock_t to be implemented via RT-mutexes.
 
 
+sempahore
+=========
+
+semaphore is a counting semaphore implementation.
+
+Semaphores are often used for both serialization and waiting, but new use
+cases should instead use separate serialization and wait mechanisms, such
+as mutexes and completions.
+
+sempahores and PREEMPT_RT
+----------------------------
+
+PREEMPT_RT does not change the sempahore implementation. That's impossible
+due to the counting semaphore semantics which have no concept of owners.
+The lack of an owner conflicts with priority inheritance. After all an
+unknown owner cannot be boosted. As a consequence blocking on semaphores
+can be subject to priority inversion.
+
+
+rw_sempahore
+============
+
+rw_semaphore is a multiple readers and single writer lock mechanism.
+
+On non-PREEMPT_RT kernels the implementation is fair, thus preventing
+writer starvation.
+
+rw_semaphore complies by default with the strict owner semantics, but there
+exist special-purpose interfaces that allow non-owner release for readers.
+These work independent of the kernel configuration.
+
+rw_sempahore and PREEMPT_RT
+---------------------------
+
+PREEMPT_RT kernels map rw_sempahore to a separate rt_mutex-based
+implementation, thus changing the fairness:
+
+ Because an rw_sempaphore writer cannot grant its priority to multiple
+ readers, a preempted low-priority reader will continue holding its lock,
+ thus starving even high-priority writers.  In contrast, because readers
+ can grant their priority to a writer, a preempted low-priority writer will
+ have its priority boosted until it releases the lock, thus preventing that
+ writer from starving readers.
+
+
 raw_spinlock_t and spinlock_t
 =============================
 
@@ -140,7 +196,16 @@ On a PREEMPT_RT enabled kernel spinlock_
    kernels leave task state untouched.  However, PREEMPT_RT must change
    task state if the task blocks during acquisition.  Therefore, it saves
    the current task state before blocking and the corresponding lock wakeup
-   restores it.
+   restores it::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					lock wakeup
+					  task->state = task->saved_state
 
    Other types of wakeups would normally unconditionally set the task state
    to RUNNING, but that does not work here because the task must remain
@@ -148,7 +213,22 @@ On a PREEMPT_RT enabled kernel spinlock_
    wakeup attempts to awaken a task blocked waiting for a spinlock, it
    instead sets the saved state to RUNNING.  Then, when the lock
    acquisition completes, the lock wakeup sets the task state to the saved
-   state, in this case setting it to RUNNING.
+   state, in this case setting it to RUNNING::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					non lock wakeup
+					  task->saved_state = TASK_RUNNING
+
+					lock wakeup
+					  task->state = task->saved_state
+
+   This ensures that the real wakeup cannot be lost.
+
 
 rwlock_t
 ========
@@ -228,17 +308,16 @@ while holding normal non-raw spinlocks b
 bit spinlocks
 -------------
 
-Bit spinlocks are problematic for PREEMPT_RT as they cannot be easily
-substituted by an RT-mutex based implementation for obvious reasons.
-
-The semantics of bit spinlocks are preserved on PREEMPT_RT kernels and the
-caveats vs. raw_spinlock_t apply.
-
-Some bit spinlocks are substituted by regular spinlock_t for PREEMPT_RT but
-this requires conditional (#ifdef'ed) code changes at the usage site while
-the spinlock_t substitution is simply done by the compiler and the
-conditionals are restricted to header files and core implementation of the
-locking primitives and the usage sites do not require any changes.
+PREEMPT_RT cannot substitute bit spinlocks because a single bit is too
+small to accommodate an RT-mutex.  Therefore, the semantics of bit
+spinlocks are preserved on PREEMPT_RT kernels, so that the raw_spinlock_t
+caveats also apply to bit spinlocks.
+
+Some bit spinlocks are replaced with regular spinlock_t for PREEMPT_RT
+using conditional (#ifdef'ed) code changes at the usage site.  In contrast,
+usage-site changes are not needed for the spinlock_t substitution.
+Instead, conditionals in header files and the core locking implemementation
+enable the compiler to do the substitution transparently.
 
 
 Lock type nesting rules
@@ -254,46 +333,15 @@ Lock type nesting rules
 
   - Spinning lock types can nest inside sleeping lock types.
 
-These rules apply in general independent of CONFIG_PREEMPT_RT.
+These constraints apply both in CONFIG_PREEMPT_RT and otherwise.
 
-As PREEMPT_RT changes the lock category of spinlock_t and rwlock_t from
-spinning to sleeping this has obviously restrictions how they can nest with
-raw_spinlock_t.
-
-This results in the following nest ordering:
+The fact that PREEMPT_RT changes the lock category of spinlock_t and
+rwlock_t from spinning to sleeping means that they cannot be acquired while
+holding a raw spinlock.  This results in the following nesting ordering:
 
   1) Sleeping locks
   2) spinlock_t and rwlock_t
   3) raw_spinlock_t and bit spinlocks
 
-Lockdep is aware of these constraints to ensure that they are respected.
-
-
-Owner semantics
-===============
-
-Most lock types in the Linux kernel have strict owner semantics, i.e. the
-context (task) which acquires a lock has to release it.
-
-There are two exceptions:
-
-  - semaphores
-  - rwsems
-
-semaphores have no owner semantics for historical reason, and as such
-trylock and release operations can be called from any context. They are
-often used for both serialization and waiting purposes. That's generally
-discouraged and should be replaced by separate serialization and wait
-mechanisms, such as mutexes and completions.
-
-rwsems have grown interfaces which allow non owner release for special
-purposes. This usage is problematic on PREEMPT_RT because PREEMPT_RT
-substitutes all locking primitives except semaphores with RT-mutex based
-implementations to provide priority inheritance for all lock types except
-the truly spinning ones. Priority inheritance on ownerless locks is
-obviously impossible.
-
-For now the rwsem non-owner release excludes code which utilizes it from
-being used on PREEMPT_RT enabled kernels. In same cases this can be
-mitigated by disabling portions of the code, in other cases the complete
-functionality has to be disabled until a workable solution has been found.
+Lockdep will complain if these constraints are violated, both in
+CONFIG_PREEMPT_RT and otherwise.


WARNING: multiple messages have this Message-ID (diff)
From: Thomas Gleixner <tglx@linutronix.de>
To: paulmck@kernel.org
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Sebastian Siewior <bigeasy@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Jonathan Corbet <corbet@lwn.net>,
	Randy Dunlap <rdunlap@infradead.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Kurt Schwemmer <kurt.schwemmer@microsemi.com>,
	linux-pci@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Felipe Balbi <balbi@kernel.org>,
	linux-usb@vger.kernel.org, Kalle Valo <kvalo@codeaurora.org>,
	"David S. Miller" <davem@davemloft.net>,
	linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
	Darren Hart <dvhart@infradead.org>,
	Andy Shevchenko <andy@infradead.org>,
	platform-driver-x86@vger.kernel.org,
	Zhang Rui <rui.zhang@intel.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	linux-pm@vger.kernel.org, Len Brown <lenb@kernel.org>,
	linux-acpi@vger.kernel.org, kbuild test robot <lkp@intel.com>,
	Nick Hu <nickhu@andestech.com>, Greentime Hu <green.hu@gmail.com>,
	Vincent Chen <deanbo422@gmail.com>, Guo Ren <guoren@kernel.org>,
	linux-csky@vger.kernel.org, Brian Cain <bcain@codeaurora.org>,
	linux-hexagon@vger.kernel.org, Tony Luck <tony.luck@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	linux-ia64@vger.kernel.org, Michal Simek <monstr@monstr.eu>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Arnd Bergmann <arnd@arndb.de>, Geoff Levand <geoff@infradead.org>,
	linuxppc-dev@lists.ozlabs.org, Davidlohr Bueso <dbueso@suse.de>
Subject: Re: [patch V3 13/20] Documentation: Add lock ordering and nesting documentation
Date: Tue, 24 Mar 2020 23:13:34 +0000	[thread overview]
Message-ID: <87r1xhz6qp.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20200323025501.GE3199@paulmck-ThinkPad-P72>
In-Reply-To: <20200321113242.026561244@linutronix.de>

Paul,

"Paul E. McKenney" <paulmck@kernel.org> writes:
> On Sat, Mar 21, 2020 at 12:25:57PM +0100, Thomas Gleixner wrote:
> In the normal case where the task sleeps through the entire lock
> acquisition, the sequence of events is as follows:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                                lock wakeup
>                                  state = real_state = UNINTERRUPTIBLE
>
> This sequence of events can occur when the task acquires spinlocks
> on its way to sleeping, for example, in a call to wait_event().
>
> The non-lock wakeup can occur when a wakeup races with this wait_event(),
> which can result in the following sequence of events:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                              non lock wakeup
>                                  real_state = RUNNING
>
>                                lock wakeup
>                                  state = real_state = RUNNING
>
> Without this real_state subterfuge, the wakeup might be lost.

I added this with a few modifications which reflect the actual
implementation. Conceptually the same.

> rwsems have grown special-purpose interfaces that allow non-owner release.
> This non-owner release prevents PREEMPT_RT from substituting RT-mutex
> implementations, for example, by defeating priority inheritance.
> After all, if the lock has no owner, whose priority should be boosted?
> As a result, PREEMPT_RT does not currently support rwsem, which in turn
> means that code using it must therefore be disabled until a workable
> solution presents itself.
>
> [ Note: Not as confident as I would like to be in the above. ]

I'm not confident either especially not after looking at the actual
code.

In fact I feel really stupid because the rw_semaphore reader non-owner
restriction on RT simply does not exist anymore and my history biased
memory tricked me.

The first rw_semaphore implementation of RT was simple and restricted
the reader side to a single reader to support PI on both the reader and
the writer side. That obviosuly did not scale well and made mmap_sem
heavy use cases pretty unhappy.

The short interlude with multi-reader boosting turned out to be a failed
experiment - Steven might still disagree though :)

At some point we gave up and I myself (sic!) reimplemented the RT
variant of rw_semaphore with a reader biased mechanism.

The reader never holds the underlying rt_mutex accross the read side
critical section. It merily increments the reader count and drops it on
release.

The only time a reader takes the rt_mutex is when it blocks on a
writer. Writers hold the rt_mutex across the write side critical section
to allow incoming readers to boost them. Once the writer releases the
rw_semaphore it unlocks the rt_mutex which is then handed off to the
readers. They increment the reader count and then drop the rt_mutex
before continuing in the read side critical section.

So while I changed the implementation it did obviously not occur to me
that this also lifted the non-owner release restriction. Nobody else
noticed either. So we kept dragging this along in both memory and
implementation. Both will be fixed now :)

The owner semantics of down/up_read() are only enforced by lockdep. That
applies to both RT and !RT. The up/down_read_non_owner() variants are
just there to tell lockdep about it.

So, I picked up your other suggestions with slight modifications and
adjusted the owner, semaphore and rw_semaphore docs accordingly.

Please have a close look at the patch below (applies on tip core/locking).

Thanks,

        tglx, who is searching a brown paperbag

8<----------

 Documentation/locking/locktypes.rst |  148 +++++++++++++++++++++++-------------
 1 file changed, 98 insertions(+), 50 deletions(-)

--- a/Documentation/locking/locktypes.rst
+++ b/Documentation/locking/locktypes.rst
@@ -67,6 +67,17 @@ Spinning locks implicitly disable preemp
  _irqsave/restore()   Save and disable / restore interrupt disabled state
  ==========  ==========================
 
+Owner semantics
+=======+
+The aforementioned lock types except semaphores have strict owner
+semantics:
+
+  The context (task) that acquired the lock must release it.
+
+rw_semaphores have a special interface which allows non-owner release for
+readers.
+
 
 rtmutex
 ===@@ -83,6 +94,51 @@ interrupt handlers and soft interrupts.
 and rwlock_t to be implemented via RT-mutexes.
 
 
+sempahore
+====+
+semaphore is a counting semaphore implementation.
+
+Semaphores are often used for both serialization and waiting, but new use
+cases should instead use separate serialization and wait mechanisms, such
+as mutexes and completions.
+
+sempahores and PREEMPT_RT
+----------------------------
+
+PREEMPT_RT does not change the sempahore implementation. That's impossible
+due to the counting semaphore semantics which have no concept of owners.
+The lack of an owner conflicts with priority inheritance. After all an
+unknown owner cannot be boosted. As a consequence blocking on semaphores
+can be subject to priority inversion.
+
+
+rw_sempahore
+======
+
+rw_semaphore is a multiple readers and single writer lock mechanism.
+
+On non-PREEMPT_RT kernels the implementation is fair, thus preventing
+writer starvation.
+
+rw_semaphore complies by default with the strict owner semantics, but there
+exist special-purpose interfaces that allow non-owner release for readers.
+These work independent of the kernel configuration.
+
+rw_sempahore and PREEMPT_RT
+---------------------------
+
+PREEMPT_RT kernels map rw_sempahore to a separate rt_mutex-based
+implementation, thus changing the fairness:
+
+ Because an rw_sempaphore writer cannot grant its priority to multiple
+ readers, a preempted low-priority reader will continue holding its lock,
+ thus starving even high-priority writers.  In contrast, because readers
+ can grant their priority to a writer, a preempted low-priority writer will
+ have its priority boosted until it releases the lock, thus preventing that
+ writer from starving readers.
+
+
 raw_spinlock_t and spinlock_t
 ============== 
@@ -140,7 +196,16 @@ On a PREEMPT_RT enabled kernel spinlock_
    kernels leave task state untouched.  However, PREEMPT_RT must change
    task state if the task blocks during acquisition.  Therefore, it saves
    the current task state before blocking and the corresponding lock wakeup
-   restores it.
+   restores it::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					lock wakeup
+					  task->state = task->saved_state
 
    Other types of wakeups would normally unconditionally set the task state
    to RUNNING, but that does not work here because the task must remain
@@ -148,7 +213,22 @@ On a PREEMPT_RT enabled kernel spinlock_
    wakeup attempts to awaken a task blocked waiting for a spinlock, it
    instead sets the saved state to RUNNING.  Then, when the lock
    acquisition completes, the lock wakeup sets the task state to the saved
-   state, in this case setting it to RUNNING.
+   state, in this case setting it to RUNNING::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					non lock wakeup
+					  task->saved_state = TASK_RUNNING
+
+					lock wakeup
+					  task->state = task->saved_state
+
+   This ensures that the real wakeup cannot be lost.
+
 
 rwlock_t
 ====
@@ -228,17 +308,16 @@ while holding normal non-raw spinlocks b
 bit spinlocks
 -------------
 
-Bit spinlocks are problematic for PREEMPT_RT as they cannot be easily
-substituted by an RT-mutex based implementation for obvious reasons.
-
-The semantics of bit spinlocks are preserved on PREEMPT_RT kernels and the
-caveats vs. raw_spinlock_t apply.
-
-Some bit spinlocks are substituted by regular spinlock_t for PREEMPT_RT but
-this requires conditional (#ifdef'ed) code changes at the usage site while
-the spinlock_t substitution is simply done by the compiler and the
-conditionals are restricted to header files and core implementation of the
-locking primitives and the usage sites do not require any changes.
+PREEMPT_RT cannot substitute bit spinlocks because a single bit is too
+small to accommodate an RT-mutex.  Therefore, the semantics of bit
+spinlocks are preserved on PREEMPT_RT kernels, so that the raw_spinlock_t
+caveats also apply to bit spinlocks.
+
+Some bit spinlocks are replaced with regular spinlock_t for PREEMPT_RT
+using conditional (#ifdef'ed) code changes at the usage site.  In contrast,
+usage-site changes are not needed for the spinlock_t substitution.
+Instead, conditionals in header files and the core locking implemementation
+enable the compiler to do the substitution transparently.
 
 
 Lock type nesting rules
@@ -254,46 +333,15 @@ Lock type nesting rules
 
   - Spinning lock types can nest inside sleeping lock types.
 
-These rules apply in general independent of CONFIG_PREEMPT_RT.
+These constraints apply both in CONFIG_PREEMPT_RT and otherwise.
 
-As PREEMPT_RT changes the lock category of spinlock_t and rwlock_t from
-spinning to sleeping this has obviously restrictions how they can nest with
-raw_spinlock_t.
-
-This results in the following nest ordering:
+The fact that PREEMPT_RT changes the lock category of spinlock_t and
+rwlock_t from spinning to sleeping means that they cannot be acquired while
+holding a raw spinlock.  This results in the following nesting ordering:
 
   1) Sleeping locks
   2) spinlock_t and rwlock_t
   3) raw_spinlock_t and bit spinlocks
 
-Lockdep is aware of these constraints to ensure that they are respected.
-
-
-Owner semantics
-=======-
-Most lock types in the Linux kernel have strict owner semantics, i.e. the
-context (task) which acquires a lock has to release it.
-
-There are two exceptions:
-
-  - semaphores
-  - rwsems
-
-semaphores have no owner semantics for historical reason, and as such
-trylock and release operations can be called from any context. They are
-often used for both serialization and waiting purposes. That's generally
-discouraged and should be replaced by separate serialization and wait
-mechanisms, such as mutexes and completions.
-
-rwsems have grown interfaces which allow non owner release for special
-purposes. This usage is problematic on PREEMPT_RT because PREEMPT_RT
-substitutes all locking primitives except semaphores with RT-mutex based
-implementations to provide priority inheritance for all lock types except
-the truly spinning ones. Priority inheritance on ownerless locks is
-obviously impossible.
-
-For now the rwsem non-owner release excludes code which utilizes it from
-being used on PREEMPT_RT enabled kernels. In same cases this can be
-mitigated by disabling portions of the code, in other cases the complete
-functionality has to be disabled until a workable solution has been found.
+Lockdep will complain if these constraints are violated, both in
+CONFIG_PREEMPT_RT and otherwise.

WARNING: multiple messages have this Message-ID (diff)
From: Thomas Gleixner <tglx@linutronix.de>
To: paulmck@kernel.org
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Sebastian Siewior <bigeasy@linutronix.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Jonathan Corbet <corbet@lwn.net>,
	Randy Dunlap <rdunlap@infradead.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Kurt Schwemmer <kurt.schwemmer@microsemi.com>,
	linux-pci@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Felipe Balbi <balbi@kernel.org>,
	linux-usb@vger.kernel.org, Kalle Valo <kvalo@codeaurora.org>,
	"David S. Miller" <davem@davemloft.net>,
	linux-wireless@vger.kernel.org, net
Subject: Re: [patch V3 13/20] Documentation: Add lock ordering and nesting documentation
Date: Wed, 25 Mar 2020 00:13:34 +0100	[thread overview]
Message-ID: <87r1xhz6qp.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20200323025501.GE3199@paulmck-ThinkPad-P72>

Paul,

"Paul E. McKenney" <paulmck@kernel.org> writes:
> On Sat, Mar 21, 2020 at 12:25:57PM +0100, Thomas Gleixner wrote:
> In the normal case where the task sleeps through the entire lock
> acquisition, the sequence of events is as follows:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                                lock wakeup
>                                  state = real_state == UNINTERRUPTIBLE
>
> This sequence of events can occur when the task acquires spinlocks
> on its way to sleeping, for example, in a call to wait_event().
>
> The non-lock wakeup can occur when a wakeup races with this wait_event(),
> which can result in the following sequence of events:
>
>      state = UNINTERRUPTIBLE
>      lock()
>        block()
>          real_state = state
>          state = SLEEPONLOCK
>
>                              non lock wakeup
>                                  real_state = RUNNING
>
>                                lock wakeup
>                                  state = real_state == RUNNING
>
> Without this real_state subterfuge, the wakeup might be lost.

I added this with a few modifications which reflect the actual
implementation. Conceptually the same.

> rwsems have grown special-purpose interfaces that allow non-owner release.
> This non-owner release prevents PREEMPT_RT from substituting RT-mutex
> implementations, for example, by defeating priority inheritance.
> After all, if the lock has no owner, whose priority should be boosted?
> As a result, PREEMPT_RT does not currently support rwsem, which in turn
> means that code using it must therefore be disabled until a workable
> solution presents itself.
>
> [ Note: Not as confident as I would like to be in the above. ]

I'm not confident either especially not after looking at the actual
code.

In fact I feel really stupid because the rw_semaphore reader non-owner
restriction on RT simply does not exist anymore and my history biased
memory tricked me.

The first rw_semaphore implementation of RT was simple and restricted
the reader side to a single reader to support PI on both the reader and
the writer side. That obviosuly did not scale well and made mmap_sem
heavy use cases pretty unhappy.

The short interlude with multi-reader boosting turned out to be a failed
experiment - Steven might still disagree though :)

At some point we gave up and I myself (sic!) reimplemented the RT
variant of rw_semaphore with a reader biased mechanism.

The reader never holds the underlying rt_mutex accross the read side
critical section. It merily increments the reader count and drops it on
release.

The only time a reader takes the rt_mutex is when it blocks on a
writer. Writers hold the rt_mutex across the write side critical section
to allow incoming readers to boost them. Once the writer releases the
rw_semaphore it unlocks the rt_mutex which is then handed off to the
readers. They increment the reader count and then drop the rt_mutex
before continuing in the read side critical section.

So while I changed the implementation it did obviously not occur to me
that this also lifted the non-owner release restriction. Nobody else
noticed either. So we kept dragging this along in both memory and
implementation. Both will be fixed now :)

The owner semantics of down/up_read() are enforced only by lockdep. That
applies to both RT and !RT. The up/down_read_non_owner() variants are
just there to tell lockdep about it.
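
For illustration, a (made up, kernel-context only) sketch of what the
non-owner annotation looks like at a usage site; the demo_* names are
invented and this is not taken from any existing driver:

#include <linux/rwsem.h>

static DECLARE_RWSEM(demo_rwsem);

/* Context A acquires the read side, but will not release it itself. */
static void demo_submit(void)
{
    down_read_non_owner(&demo_rwsem);
    /* ... hand the work and the reader "reference" to another context ... */
}

/* Context B, e.g. a completion handler, releases it later. */
static void demo_complete(void)
{
    up_read_non_owner(&demo_rwsem);
}

With plain down_read()/up_read() lockdep would complain in
demo_complete() about releasing a lock which that context never
acquired; the _non_owner() variants tell lockdep that this is intended.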

So, I picked up your other suggestions with slight modifications and
adjusted the owner, semaphore and rw_semaphore docs accordingly.

Please have a close look at the patch below (applies on tip core/locking).

Thanks,

        tglx, who is searching for a brown paper bag

8<----------

 Documentation/locking/locktypes.rst |  148 +++++++++++++++++++++++-------------
 1 file changed, 98 insertions(+), 50 deletions(-)

--- a/Documentation/locking/locktypes.rst
+++ b/Documentation/locking/locktypes.rst
@@ -67,6 +67,17 @@ Spinning locks implicitly disable preemp
  _irqsave/restore()   Save and disable / restore interrupt disabled state
  ===================  ====================================================
 
+Owner semantics
+===============
+
+The aforementioned lock types, except semaphores, have strict owner
+semantics:
+
+  The context (task) that acquired the lock must release it.
+
+rw_semaphores have a special interface which allows non-owner release for
+readers.
+
 
 rtmutex
 =======
@@ -83,6 +94,51 @@ interrupt handlers and soft interrupts.
 and rwlock_t to be implemented via RT-mutexes.
 
 
+semaphore
+=========
+
+semaphore is a counting semaphore implementation.
+
+Semaphores are often used for both serialization and waiting, but new use
+cases should instead use separate serialization and wait mechanisms, such
+as mutexes and completions.
+
+semaphores and PREEMPT_RT
+-------------------------
+
+PREEMPT_RT does not change the semaphore implementation. That's impossible
+due to the counting semaphore semantics, which have no concept of owners.
+The lack of an owner conflicts with priority inheritance. After all, an
+unknown owner cannot be boosted. As a consequence, blocking on semaphores
+can be subject to priority inversion.
+
+
+rw_semaphore
+============
+
+rw_semaphore is a multiple readers and single writer lock mechanism.
+
+On non-PREEMPT_RT kernels the implementation is fair, thus preventing
+writer starvation.
+
+rw_semaphore complies by default with the strict owner semantics, but there
+exist special-purpose interfaces that allow non-owner release for readers.
+These work independently of the kernel configuration.
+
+rw_semaphore and PREEMPT_RT
+---------------------------
+
+PREEMPT_RT kernels map rw_semaphore to a separate rt_mutex-based
+implementation, thus changing the fairness:
+
+ Because an rw_semaphore writer cannot grant its priority to multiple
+ readers, a preempted low-priority reader will continue holding its lock,
+ thus starving even high-priority writers.  In contrast, because readers
+ can grant their priority to a writer, a preempted low-priority writer will
+ have its priority boosted until it releases the lock, thus preventing that
+ writer from starving readers.
+
+
 raw_spinlock_t and spinlock_t
 =============================
 
@@ -140,7 +196,16 @@ On a PREEMPT_RT enabled kernel spinlock_
    kernels leave task state untouched.  However, PREEMPT_RT must change
    task state if the task blocks during acquisition.  Therefore, it saves
    the current task state before blocking and the corresponding lock wakeup
-   restores it.
+   restores it::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					lock wakeup
+					  task->state = task->saved_state
 
    Other types of wakeups would normally unconditionally set the task state
    to RUNNING, but that does not work here because the task must remain
@@ -148,7 +213,22 @@ On a PREEMPT_RT enabled kernel spinlock_
    wakeup attempts to awaken a task blocked waiting for a spinlock, it
    instead sets the saved state to RUNNING.  Then, when the lock
    acquisition completes, the lock wakeup sets the task state to the saved
-   state, in this case setting it to RUNNING.
+   state, in this case setting it to RUNNING::
+
+    task->state = TASK_INTERRUPTIBLE
+     lock()
+       block()
+         task->saved_state = task->state
+	 task->state = TASK_UNINTERRUPTIBLE
+	 schedule()
+					non lock wakeup
+					  task->saved_state = TASK_RUNNING
+
+					lock wakeup
+					  task->state = task->saved_state
+
+   This ensures that the real wakeup cannot be lost.
+
 
 rwlock_t
 ========
@@ -228,17 +308,16 @@ while holding normal non-raw spinlocks b
 bit spinlocks
 -------------
 
-Bit spinlocks are problematic for PREEMPT_RT as they cannot be easily
-substituted by an RT-mutex based implementation for obvious reasons.
-
-The semantics of bit spinlocks are preserved on PREEMPT_RT kernels and the
-caveats vs. raw_spinlock_t apply.
-
-Some bit spinlocks are substituted by regular spinlock_t for PREEMPT_RT but
-this requires conditional (#ifdef'ed) code changes at the usage site while
-the spinlock_t substitution is simply done by the compiler and the
-conditionals are restricted to header files and core implementation of the
-locking primitives and the usage sites do not require any changes.
+PREEMPT_RT cannot substitute bit spinlocks because a single bit is too
+small to accommodate an RT-mutex.  Therefore, the semantics of bit
+spinlocks are preserved on PREEMPT_RT kernels, so that the raw_spinlock_t
+caveats also apply to bit spinlocks.
+
+Some bit spinlocks are replaced with regular spinlock_t for PREEMPT_RT
+using conditional (#ifdef'ed) code changes at the usage site.  In contrast,
+usage-site changes are not needed for the spinlock_t substitution.
+Instead, conditionals in header files and the core locking implementation
+enable the compiler to do the substitution transparently.
 
 
 Lock type nesting rules
@@ -254,46 +333,15 @@ Lock type nesting rules
 
   - Spinning lock types can nest inside sleeping lock types.
 
-These rules apply in general independent of CONFIG_PREEMPT_RT.
+These constraints apply both in CONFIG_PREEMPT_RT and otherwise.
 
-As PREEMPT_RT changes the lock category of spinlock_t and rwlock_t from
-spinning to sleeping this has obviously restrictions how they can nest with
-raw_spinlock_t.
-
-This results in the following nest ordering:
+The fact that PREEMPT_RT changes the lock category of spinlock_t and
+rwlock_t from spinning to sleeping means that they cannot be acquired while
+holding a raw spinlock.  This results in the following nesting ordering:
 
   1) Sleeping locks
   2) spinlock_t and rwlock_t
   3) raw_spinlock_t and bit spinlocks
 
-Lockdep is aware of these constraints to ensure that they are respected.
-
-
-Owner semantics
-===============
-
-Most lock types in the Linux kernel have strict owner semantics, i.e. the
-context (task) which acquires a lock has to release it.
-
-There are two exceptions:
-
-  - semaphores
-  - rwsems
-
-semaphores have no owner semantics for historical reason, and as such
-trylock and release operations can be called from any context. They are
-often used for both serialization and waiting purposes. That's generally
-discouraged and should be replaced by separate serialization and wait
-mechanisms, such as mutexes and completions.
-
-rwsems have grown interfaces which allow non owner release for special
-purposes. This usage is problematic on PREEMPT_RT because PREEMPT_RT
-substitutes all locking primitives except semaphores with RT-mutex based
-implementations to provide priority inheritance for all lock types except
-the truly spinning ones. Priority inheritance on ownerless locks is
-obviously impossible.
-
-For now the rwsem non-owner release excludes code which utilizes it from
-being used on PREEMPT_RT enabled kernels. In same cases this can be
-mitigated by disabling portions of the code, in other cases the complete
-functionality has to be disabled until a workable solution has been found.
+Lockdep will complain if these constraints are violated, both in
+CONFIG_PREEMPT_RT and otherwise.
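
Not part of the patch, just an illustration of the nesting order above
with made-up lock names: the following nesting is valid on both RT and
!RT, while taking demo_mutex or demo_lock with demo_raw_lock held is
exactly what the wait-type checks later in this series are meant to
catch:

#include <linux/mutex.h>
#include <linux/spinlock.h>

static DEFINE_MUTEX(demo_mutex);            /* 1) sleeping lock  */
static DEFINE_SPINLOCK(demo_lock);          /* 2) spinlock_t     */
static DEFINE_RAW_SPINLOCK(demo_raw_lock);  /* 3) raw_spinlock_t */

static void demo_valid_nesting(void)
{
    mutex_lock(&demo_mutex);
    spin_lock(&demo_lock);
    raw_spin_lock(&demo_raw_lock);

    /* innermost critical section */

    raw_spin_unlock(&demo_raw_lock);
    spin_unlock(&demo_lock);
    mutex_unlock(&demo_mutex);
}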


Thread overview: 195+ messages
2020-03-21 11:25 [patch V3 00/20] Lock ordering documentation and annotation for lockdep Thomas Gleixner
2020-03-21 11:25 ` [patch V3 01/20] PCI/switchtec: Fix init_completion race condition with poll_wait() Thomas Gleixner
2020-03-21 11:25 ` [patch V3 02/20] pci/switchtec: Replace completion wait queue usage for poll Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-21 11:25 ` [patch V3 03/20] usb: gadget: Use completion interface instead of open coding it Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Thomas Gleixner
2020-03-25  8:37   ` [patch V3 03/20] " Felipe Balbi
2020-03-27 12:14     ` Sebastian Siewior
2020-03-21 11:25 ` [patch V3 04/20] orinoco_usb: Use the regular completion interfaces Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Thomas Gleixner
2020-03-22 14:42   ` [patch V3 04/20] " Kalle Valo
2020-03-21 11:25 ` [patch V3 05/20] acpi: Remove header dependency Thomas Gleixner
2020-03-21 12:23   ` Andy Shevchenko
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-03-22  7:02   ` [patch V3 05/20] " Rafael J. Wysocki
2020-03-21 11:25 ` [patch V3 06/20] nds32: Remove mm.h from asm/uaccess.h Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-21 11:25 ` [patch V3 07/20] csky: " Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-21 11:25 ` [patch V3 08/20] hexagon: " Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-23 21:46   ` [patch V3 08/20] " Brian Cain
2020-03-21 11:25 ` [patch V3 09/20] ia64: " Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-21 11:25 ` [patch V3 10/20] microblaze: " Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-21 11:25 ` [patch V3 11/20] rcuwait: Add @state argument to rcuwait_wait_event() Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra (Intel)
2020-03-21 11:25 ` [patch V3 12/20] powerpc/ps3: Convert half completion to rcuwait Thomas Gleixner
2020-03-21 13:22   ` Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra (Intel)
2020-03-27 19:14   ` [patch V3 12/20] " Geoff Levand
2020-03-21 11:25 ` [patch V3 13/20] Documentation: Add lock ordering and nesting documentation Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Thomas Gleixner
2020-03-23  2:55   ` [patch V3 13/20] " Paul E. McKenney
2020-03-24 23:13     ` Thomas Gleixner [this message]
2020-03-25  0:28       ` Paul E. McKenney
2020-03-25 12:27         ` Documentation/locking/locktypes: Further clarifications and wordsmithing Thomas Gleixner
2020-03-25 16:02           ` Sebastian Siewior
2020-03-25 16:39             ` Paul E. McKenney
2020-03-25 16:54               ` Sebastian Siewior
2020-03-25 16:58           ` [PATCH v2] Documentation/locking/locktypes: minor copy editor fixes Randy Dunlap
2020-03-26  2:40             ` Paul E. McKenney
2020-03-28 11:52             ` [tip: locking/core] Documentation/locking/locktypes: Minor " tip-bot2 for Randy Dunlap
2020-03-28 11:52           ` [tip: locking/core] Documentation/locking/locktypes: Further clarifications and wordsmithing tip-bot2 for Thomas Gleixner
2020-03-21 11:25 ` [patch V3 14/20] timekeeping: Split jiffies seqlock Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Thomas Gleixner
2020-03-21 11:25 ` [patch V3 15/20] sched/swait: Prepare usage in completions Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Thomas Gleixner
2020-03-21 11:26 ` [patch V3 16/20] completion: Use simple wait queues Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Thomas Gleixner
2020-03-23 15:20   ` [PATCH] completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all() Sebastian Siewior
2020-03-23 17:50     ` [tip: locking/core] " tip-bot2 for Sebastian Siewior
2020-03-21 11:26 ` [patch V3 17/20] lockdep: Introduce wait-type checks Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2020-03-21 11:26 ` [patch V3 18/20] lockdep: Add hrtimer context tracing bits Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-21 16:46     ` Frederic Weisbecker
2020-03-21 11:26 ` [patch V3 19/20] lockdep: Annotate irq_work Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-21 16:40     ` Frederic Weisbecker
2020-03-21 18:12       ` Sebastian Andrzej Siewior
2020-03-22  2:33         ` Frederic Weisbecker
2020-03-22  2:39           ` Frederic Weisbecker
2020-03-22 12:27           ` Sebastian Andrzej Siewior
2020-03-21 11:26 ` [patch V3 20/20] lockdep: Add posixtimer context tracing bits Thomas Gleixner
2020-03-21 15:53   ` [tip: locking/core] " tip-bot2 for Sebastian Andrzej Siewior
2020-03-21 17:19 ` [patch V3 00/20] Lock ordering documentation and annotation for lockdep Davidlohr Bueso
2020-03-21 17:45   ` Thomas Gleixner
