All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>, Will Deacon <will@kernel.org>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Logan Gunthorpe <logang@deltatee.com>,
	Kurt Schwemmer <kurt.schwemmer@microsemi.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org, Felipe Balbi <balbi@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-usb@vger.kernel.org, Kalle Valo <kvalo@codeaurora.org>,
	"David S. Miller" <davem@davemloft.net>,
	linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
	Oleg Nesterov <oleg@redhat.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Arnd Bergmann <arnd@arndb.de>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [patch V2 08/15] Documentation: Add lock ordering and nesting documentation
Date: Sat, 21 Mar 2020 22:49:04 +0100	[thread overview]
Message-ID: <874kuhqsz3.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20200321212144.GA6475@google.com>

Joel Fernandes <joel@joelfernandes.org> writes:
>> +rwlock_t
>> +========
>> +
>> +rwlock_t is a multiple readers and single writer lock mechanism.
>> +
>> +On a non PREEMPT_RT enabled kernel rwlock_t is implemented as a spinning
>> +lock and the suffix rules of spinlock_t apply accordingly. The
>> +implementation is fair and prevents writer starvation.
>>
>
> You mentioned writer starvation, but I think it would be good to also mention
> that rwlock_t on a non-PREEMPT_RT kernel also does not have _reader_
> starvation problem, since it uses queued implementation.  This fact is worth
> mentioning here, since further below you explain that an rwlock in PREEMPT_RT
> does have reader starvation problem.

It's worth mentioning. But RT really has only write starvation not
reader starvation.

>> +rwlock_t and PREEMPT_RT
>> +-----------------------
>> +
>> +On a PREEMPT_RT enabled kernel rwlock_t is mapped to a separate
>> +implementation based on rt_mutex which changes the semantics:
>> +
>> + - Same changes as for spinlock_t
>> +
>> + - The implementation is not fair and can cause writer starvation under
>> +   certain circumstances. The reason for this is that a writer cannot grant
>> +   its priority to multiple readers. Readers which are blocked on a writer
>> +   fully support the priority inheritance protocol.
>
> Is it hard to give priority to multiple readers because the number of readers
> to give priority to could be unbounded?

Yes, and it's horribly complex and racy. We had an implemetation years
ago which taught us not to try it again :)

>> +PREEMPT_RT also offers a local_lock mechanism to substitute the
>> +local_irq_disable/save() constructs in cases where a separation of the
>> +interrupt disabling and the locking is really unavoidable. This should be
>> +restricted to very rare cases.
>
> It would also be nice to mention where else local_lock() can be used, such as
> protecting per-cpu variables without disabling preemption. Could we add a
> section on protecting per-cpu data? (Happy to do that and send a patch if you
> prefer).

The local lock section will come soon when we post the local lock
patches again.

>> +rwsems have grown interfaces which allow non owner release for special
>> +purposes. This usage is problematic on PREEMPT_RT because PREEMPT_RT
>> +substitutes all locking primitives except semaphores with RT-mutex based
>> +implementations to provide priority inheritance for all lock types except
>> +the truly spinning ones. Priority inheritance on ownerless locks is
>> +obviously impossible.
>> +
>> +For now the rwsem non-owner release excludes code which utilizes it from
>> +being used on PREEMPT_RT enabled kernels.
>
> I could not parse the last sentence here, but I think you meant "For now,
> PREEMPT_RT enabled kernels disable code that perform a non-owner release of
> an rwsem". Correct me if I'm wrong.

Right, that's what I wanted to say :)

Care to send a delta patch?

Thanks!

        tglx

WARNING: multiple messages have this Message-ID (diff)
From: Thomas Gleixner <tglx@linutronix.de>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: Randy Dunlap <rdunlap@infradead.org>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-pci@vger.kernel.org,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Oleg Nesterov <oleg@redhat.com>, Will Deacon <will@kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Arnd Bergmann <arnd@arndb.de>,
	Logan Gunthorpe <logang@deltatee.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	linuxppc-dev@lists.ozlabs.org,
	Steven Rostedt <rostedt@goodmis.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Kurt Schwemmer <kurt.schwemmer@microsemi.com>,
	Kalle Valo <kvalo@codeaurora.org>,
	Felipe Balbi <balbi@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-usb@vger.kernel.org, linux-wireless@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	netdev@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [patch V2 08/15] Documentation: Add lock ordering and nesting documentation
Date: Sat, 21 Mar 2020 22:49:04 +0100	[thread overview]
Message-ID: <874kuhqsz3.fsf@nanos.tec.linutronix.de> (raw)
In-Reply-To: <20200321212144.GA6475@google.com>

Joel Fernandes <joel@joelfernandes.org> writes:
>> +rwlock_t
>> +========
>> +
>> +rwlock_t is a multiple readers and single writer lock mechanism.
>> +
>> +On a non PREEMPT_RT enabled kernel rwlock_t is implemented as a spinning
>> +lock and the suffix rules of spinlock_t apply accordingly. The
>> +implementation is fair and prevents writer starvation.
>>
>
> You mentioned writer starvation, but I think it would be good to also mention
> that rwlock_t on a non-PREEMPT_RT kernel also does not have _reader_
> starvation problem, since it uses queued implementation.  This fact is worth
> mentioning here, since further below you explain that an rwlock in PREEMPT_RT
> does have reader starvation problem.

It's worth mentioning. But RT really has only write starvation not
reader starvation.

>> +rwlock_t and PREEMPT_RT
>> +-----------------------
>> +
>> +On a PREEMPT_RT enabled kernel rwlock_t is mapped to a separate
>> +implementation based on rt_mutex which changes the semantics:
>> +
>> + - Same changes as for spinlock_t
>> +
>> + - The implementation is not fair and can cause writer starvation under
>> +   certain circumstances. The reason for this is that a writer cannot grant
>> +   its priority to multiple readers. Readers which are blocked on a writer
>> +   fully support the priority inheritance protocol.
>
> Is it hard to give priority to multiple readers because the number of readers
> to give priority to could be unbounded?

Yes, and it's horribly complex and racy. We had an implemetation years
ago which taught us not to try it again :)

>> +PREEMPT_RT also offers a local_lock mechanism to substitute the
>> +local_irq_disable/save() constructs in cases where a separation of the
>> +interrupt disabling and the locking is really unavoidable. This should be
>> +restricted to very rare cases.
>
> It would also be nice to mention where else local_lock() can be used, such as
> protecting per-cpu variables without disabling preemption. Could we add a
> section on protecting per-cpu data? (Happy to do that and send a patch if you
> prefer).

The local lock section will come soon when we post the local lock
patches again.

>> +rwsems have grown interfaces which allow non owner release for special
>> +purposes. This usage is problematic on PREEMPT_RT because PREEMPT_RT
>> +substitutes all locking primitives except semaphores with RT-mutex based
>> +implementations to provide priority inheritance for all lock types except
>> +the truly spinning ones. Priority inheritance on ownerless locks is
>> +obviously impossible.
>> +
>> +For now the rwsem non-owner release excludes code which utilizes it from
>> +being used on PREEMPT_RT enabled kernels.
>
> I could not parse the last sentence here, but I think you meant "For now,
> PREEMPT_RT enabled kernels disable code that perform a non-owner release of
> an rwsem". Correct me if I'm wrong.

Right, that's what I wanted to say :)

Care to send a delta patch?

Thanks!

        tglx

  reply	other threads:[~2020-03-21 21:49 UTC|newest]

Thread overview: 148+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-18 20:43 [patch V2 00/15] Lock ordering documentation and annotation for lockdep Thomas Gleixner
2020-03-18 20:43 ` Thomas Gleixner
2020-03-18 20:43 ` [patch V2 01/15] PCI/switchtec: Fix init_completion race condition with poll_wait() Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-18 21:25   ` Bjorn Helgaas
2020-03-18 21:25     ` Bjorn Helgaas
2020-03-18 20:43 ` [patch V2 02/15] pci/switchtec: Replace completion wait queue usage for poll Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-18 21:26   ` Bjorn Helgaas
2020-03-18 21:26     ` Bjorn Helgaas
2020-03-18 22:11   ` Logan Gunthorpe
2020-03-18 22:11     ` Logan Gunthorpe
2020-03-18 20:43 ` [patch V2 03/15] usb: gadget: Use completion interface instead of open coding it Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-19  8:41   ` Greg Kroah-Hartman
2020-03-19  8:41     ` Greg Kroah-Hartman
2020-03-18 20:43 ` [patch V2 04/15] orinoco_usb: Use the regular completion interfaces Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-19  8:40   ` Greg Kroah-Hartman
2020-03-19  8:40     ` Greg Kroah-Hartman
2020-03-18 20:43 ` [patch V2 05/15] acpi: Remove header dependency Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-18 20:43 ` [patch V2 06/15] rcuwait: Add @state argument to rcuwait_wait_event() Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-20  5:36   ` Davidlohr Bueso
2020-03-20  5:36     ` Davidlohr Bueso
2020-03-20  8:45     ` Sebastian Andrzej Siewior
2020-03-20  8:45       ` Sebastian Andrzej Siewior
2020-03-20  8:58       ` Davidlohr Bueso
2020-03-20  8:58         ` Davidlohr Bueso
2020-03-20  9:48   ` [PATCH 0/5] Remove mm.h from arch/*/include/asm/uaccess.h Sebastian Andrzej Siewior
2020-03-20  9:48     ` Sebastian Andrzej Siewior
2020-03-20  9:48     ` [PATCH 1/5] nds32: Remove mm.h from asm/uaccess.h Sebastian Andrzej Siewior
2020-03-20  9:48       ` Sebastian Andrzej Siewior
2020-03-20  9:48     ` [PATCH 2/5] csky: " Sebastian Andrzej Siewior
2020-03-20  9:48       ` Sebastian Andrzej Siewior
2020-03-21 11:24       ` Guo Ren
2020-03-21 11:24         ` Guo Ren
2020-03-21 12:08         ` Thomas Gleixner
2020-03-21 12:08           ` Thomas Gleixner
2020-03-21 14:11           ` Guo Ren
2020-03-21 14:11             ` Guo Ren
2020-03-20  9:48     ` [PATCH 3/5] hexagon: " Sebastian Andrzej Siewior
2020-03-20  9:48       ` Sebastian Andrzej Siewior
2020-03-20  9:48     ` [PATCH 4/5] ia64: " Sebastian Andrzej Siewior
2020-03-20  9:48       ` Sebastian Andrzej Siewior
2020-03-20  9:48       ` Sebastian Andrzej Siewior
2020-03-20  9:48     ` [PATCH 5/5] microblaze: " Sebastian Andrzej Siewior
2020-03-20  9:48       ` Sebastian Andrzej Siewior
2020-03-18 20:43 ` [patch V2 07/15] powerpc/ps3: Convert half completion to rcuwait Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-19  9:00   ` Sebastian Andrzej Siewior
2020-03-19  9:00     ` Sebastian Andrzej Siewior
2020-03-19  9:18     ` Peter Zijlstra
2020-03-19  9:18       ` Peter Zijlstra
2020-03-19  9:21   ` Davidlohr Bueso
2020-03-19  9:21     ` Davidlohr Bueso
2020-03-19 10:04   ` Christoph Hellwig
2020-03-19 10:04     ` Christoph Hellwig
2020-03-19 10:26     ` Sebastian Andrzej Siewior
2020-03-19 10:26       ` Sebastian Andrzej Siewior
2020-03-20  0:01       ` Geoff Levand
2020-03-20  0:01         ` Geoff Levand
2020-03-20  0:45       ` Michael Ellerman
2020-03-20  0:45         ` Michael Ellerman
2020-03-21 10:41     ` Thomas Gleixner
2020-03-21 10:41       ` Thomas Gleixner
2020-03-18 20:43 ` [patch V2 08/15] Documentation: Add lock ordering and nesting documentation Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-18 22:31   ` Paul E. McKenney
2020-03-18 22:31     ` Paul E. McKenney
2020-03-19 18:02     ` Thomas Gleixner
2020-03-19 18:02       ` Thomas Gleixner
2020-03-20 16:01       ` Paul E. McKenney
2020-03-20 16:01         ` Paul E. McKenney
2020-03-20 19:51         ` Thomas Gleixner
2020-03-20 19:51           ` Thomas Gleixner
2020-03-20 21:02           ` Paul E. McKenney
2020-03-20 21:02             ` Paul E. McKenney
2020-03-20 22:36             ` Thomas Gleixner
2020-03-20 22:36               ` Thomas Gleixner
2020-03-21  2:29               ` Paul E. McKenney
2020-03-21  2:29                 ` Paul E. McKenney
2020-03-21 10:26                 ` Thomas Gleixner
2020-03-21 10:26                   ` Thomas Gleixner
2020-03-21 17:23                   ` Paul E. McKenney
2020-03-21 17:23                     ` Paul E. McKenney
2020-03-19  8:51   ` Davidlohr Bueso
2020-03-19  8:51     ` Davidlohr Bueso
2020-03-19 15:04   ` Jonathan Corbet
2020-03-19 15:04     ` Jonathan Corbet
2020-03-19 18:04     ` Thomas Gleixner
2020-03-19 18:04       ` Thomas Gleixner
2020-03-21 21:21   ` Joel Fernandes
2020-03-21 21:21     ` Joel Fernandes
2020-03-21 21:49     ` Thomas Gleixner [this message]
2020-03-21 21:49       ` Thomas Gleixner
2020-03-22  1:36       ` Joel Fernandes
2020-03-22  1:36         ` Joel Fernandes
2020-03-18 20:43 ` [patch V2 09/15] timekeeping: Split jiffies seqlock Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-18 20:43 ` [patch V2 10/15] sched/swait: Prepare usage in completions Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-18 20:43 ` [patch V2 11/15] completion: Use simple wait queues Thomas Gleixner
2020-03-18 20:43   ` Thomas Gleixner
2020-03-18 22:28   ` Logan Gunthorpe
2020-03-18 22:28     ` Logan Gunthorpe
2020-03-19  0:33   ` Joel Fernandes
2020-03-19  0:33     ` Joel Fernandes
2020-03-19  0:44     ` Thomas Gleixner
2020-03-19  0:44       ` Thomas Gleixner
2020-03-19  8:42   ` Greg Kroah-Hartman
2020-03-19  8:42     ` Greg Kroah-Hartman
2020-03-19 17:12   ` Linus Torvalds
2020-03-19 17:12     ` Linus Torvalds
2020-03-19 23:25   ` Julian Calaby
2020-03-19 23:25     ` Julian Calaby
2020-03-20  6:59     ` Christoph Hellwig
2020-03-20  6:59       ` Christoph Hellwig
2020-03-20  9:01   ` Davidlohr Bueso
2020-03-20  9:01     ` Davidlohr Bueso
2020-03-18 20:43 ` [patch V2 13/15] lockdep: Add hrtimer context tracing bits Thomas Gleixner
2020-03-18 20:43 ` [patch V2 14/15] lockdep: Annotate irq_work Thomas Gleixner
2020-03-18 20:43 ` [patch V2 15/15] lockdep: Add posixtimer context tracing bits Thomas Gleixner
2020-03-20  8:50 ` [patch V2 00/15] Lock ordering documentation and annotation for lockdep Davidlohr Bueso
2020-03-20  8:50   ` Davidlohr Bueso
2020-03-20  8:55 ` [PATCH 16/15] rcuwait: Get rid of stale name comment Davidlohr Bueso
2020-03-20  8:55   ` Davidlohr Bueso
2020-03-20  8:55   ` [PATCH 17/15] rcuwait: Inform rcuwait_wake_up() users if a wakeup was attempted Davidlohr Bueso
2020-03-20  8:55     ` Davidlohr Bueso
2020-03-20  9:13     ` Sebastian Andrzej Siewior
2020-03-20  9:13       ` Sebastian Andrzej Siewior
2020-03-20 10:44     ` Peter Zijlstra
2020-03-20 10:44       ` Peter Zijlstra
2020-03-20  8:55   ` [PATCH 18/15] kvm: Replace vcpu->swait with rcuwait Davidlohr Bueso
2020-03-20  8:55     ` Davidlohr Bueso
2020-03-20 11:20     ` Paolo Bonzini
2020-03-20 11:20       ` Paolo Bonzini
2020-03-20 12:54     ` Peter Zijlstra
2020-03-20 12:54       ` Peter Zijlstra
2020-03-22 16:33       ` Davidlohr Bueso
2020-03-22 16:33         ` Davidlohr Bueso
2020-03-22 22:32         ` Peter Zijlstra
2020-03-22 22:32           ` Peter Zijlstra
2020-03-20  8:55   ` [PATCH 19/15] sched/swait: Reword some of the main description Davidlohr Bueso
2020-03-20  8:55     ` Davidlohr Bueso
2020-03-20  9:19     ` Sebastian Andrzej Siewior
2020-03-20  9:19       ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874kuhqsz3.fsf@nanos.tec.linutronix.de \
    --to=tglx@linutronix.de \
    --cc=arnd@arndb.de \
    --cc=balbi@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=bigeasy@linutronix.de \
    --cc=dave@stgolabs.net \
    --cc=davem@davemloft.net \
    --cc=gregkh@linuxfoundation.org \
    --cc=joel@joelfernandes.org \
    --cc=kurt.schwemmer@microsemi.com \
    --cc=kvalo@codeaurora.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=logang@deltatee.com \
    --cc=mingo@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=netdev@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.