linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Waiman Long <longman@redhat.com>,
	linux-bcachefs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>, Boqun Feng <boqun.feng@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Stephen Rothwell <sfr@canb.auug.org.au>
Subject: [NAK] Re: [PATCH 11/20] locking/osq: Export osq_(lock|unlock)
Date: Tue, 10 Oct 2023 10:09:38 +0200	[thread overview]
Message-ID: <ZSUGwr5S5Nflbiay@gmail.com> (raw)
In-Reply-To: <20230802214211.y3x3swic4jbphmtg@moria.home.lan>


* Kent Overstreet <kent.overstreet@linux.dev> wrote:

> On Wed, Aug 02, 2023 at 05:09:13PM -0400, Waiman Long wrote:
> > On 8/2/23 16:44, Kent Overstreet wrote:
> > > On Wed, Aug 02, 2023 at 04:16:12PM -0400, Waiman Long wrote:
> > > > On 7/12/23 17:11, Kent Overstreet wrote:
> > > > > These are used by bcachefs's six locks.
> > > > > 
> > > > > Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
> > > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > > Cc: Ingo Molnar <mingo@redhat.com>
> > > > > Cc: Waiman Long <longman@redhat.com>
> > > > > Cc: Boqun Feng <boqun.feng@gmail.com>
> > > > > ---
> > > > >    kernel/locking/osq_lock.c | 2 ++
> > > > >    1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > > > > index d5610ad52b..b752ec5cc6 100644
> > > > > --- a/kernel/locking/osq_lock.c
> > > > > +++ b/kernel/locking/osq_lock.c
> > > > > @@ -203,6 +203,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > > > >    	return false;
> > > > >    }
> > > > > +EXPORT_SYMBOL_GPL(osq_lock);
> > > > >    void osq_unlock(struct optimistic_spin_queue *lock)
> > > > >    {
> > > > > @@ -230,3 +231,4 @@ void osq_unlock(struct optimistic_spin_queue *lock)
> > > > >    	if (next)
> > > > >    		WRITE_ONCE(next->locked, 1);
> > > > >    }
> > > > > +EXPORT_SYMBOL_GPL(osq_unlock);
> > > > Have you considered extending the current rw_semaphore to support a SIX lock
> > > > semantics? There are a number of instances in the kernel that a up_read() is
> > > > followed by a down_write(). Basically, the code try to upgrade the lock from
> > > > read to write. I have been thinking about adding a upgrade_read() API to do
> > > > that. However, the concern that I had was that another writer may come in
> > > > and make modification before the reader can be upgraded to have exclusive
> > > > write access and will make the task to repeat what has been done in the read
> > > > lock part. By adding a read with intent to upgrade to write, we can have
> > > > that guarantee.
> > > It's been discussed, Linus had the same thought.
> > > 
> > > But it'd be a massive change to the rw semaphore code; this "read with
> > > intent" really is a third lock state which needs all the same
> > > lock/trylock/unlock paths, and with the way rw semaphore has separate
> > > entry points for read and write it'd be a _ton_ of new code. It really
> > > touches everything - waitlist handling included.
> > 
> > Yes, it is a major change, but I had done that before and it is certainly
> > doable. There are spare bits in the low byte of rwsem->count that can be
> > used as an intent bit. We also need to add a new rwsem_wake_type for that
> > for waitlist handling.
> > 
> > 
> > > 
> > > And six locks have several other features that bcachefs needs, and other
> > > users may also end up wanting, that rw semaphores don't have; the two
> > > main features being a percpu read lock mode and support for an external
> > > cycle detector (which requires exposing lock waitlists, with some
> > > guarantees about how those waitlists are used).
> > 
> > Can you provide more information about those features?
> > 
> > > 
> > > > With that said, I would prefer to keep osq_{lock/unlock} for internal use by
> > > > some higher level locking primitives - mutex, rwsem and rt_mutex.
> > > Yeah, I'm aware, but it seems like exposing osq_(lock|unlock) is the
> > > most palatable solution for now. Long term, I'd like to get six locks
> > > promoted to kernel/locking.
> > 
> > Your SIX overlaps with rwsem in term of features. So we will have to somehow
> > merge them instead of having 2 APIs with somewhat similar functionality.
> 
> Waiman, if you think you can add all the features of six locks to rwsem,
> knock yourself out - but right now this is a vaporware idea for you, not
> something I can seriously entertain. I'm looking to merge bcachefs next
> cycle, not sit around and bikeshed for the next six months.

That's an entirely inappropriate response to valid review feedback.

Not having two overlapping locking facilities is not 'bikeshedding' at all ...

> If you start making a serious effort on adding those features to rwsem
> I'll start walking you through everything six locks has, but right now
> this is a major digression on a patch that just exports two symbols.

In Linux the burden of work is on people submitting new code, not on 
reviewers. The rule is that you should not reinvent the wheel in new
features - extend existing locking facilities please.

Waiman gave you some pointers as to how to extend rwsems.

Meanwhile, NAK on the export of osq_(lock|unlock):

    NAKed-by: Ingo Molnar <mingo@kernel.org>

Ie. NAK on this commit in linux-next:

    97da2065b7cb ("locking/osq: Export osq_(lock|unlock)")

This is an internal function of Linux locking subsystem we would not like to
expose or export.

This commit was applied without an Ack or Reviewed-by by a locking subsystem
maintainer or reviewer (which Waiman Long is).

Thanks,

	Ingo

  reply	other threads:[~2023-10-10  8:09 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-12 21:10 [PATCH 00/20] bcachefs prereqs patch series Kent Overstreet
2023-07-12 21:10 ` [PATCH 01/20] sched: Add task_struct->faults_disabled_mapping Kent Overstreet
2023-07-12 21:10 ` [PATCH 02/20] fs: factor out d_mark_tmpfile() Kent Overstreet
2023-07-12 21:10 ` [PATCH 03/20] iov_iter: Handle compound highmem pages in copy_page_from_iter_atomic() Kent Overstreet
2023-07-12 21:10 ` [PATCH 04/20] block: Add some exports for bcachefs Kent Overstreet
2023-07-24 17:31   ` Christoph Hellwig
2023-07-25  3:00     ` Kent Overstreet
2023-07-26 13:20       ` Christoph Hellwig
2023-08-01 18:59         ` Kent Overstreet
2023-07-25  2:59   ` Matthew Wilcox
2023-07-12 21:11 ` [PATCH 05/20] block: Allow bio_iov_iter_get_pages() with bio->bi_bdev unset Kent Overstreet
2023-07-24 17:34   ` Christoph Hellwig
2023-07-25  2:43     ` Kent Overstreet
2023-07-26 13:23       ` Christoph Hellwig
2023-08-01 19:04         ` Kent Overstreet
2023-08-02 11:47           ` Christoph Hellwig
2023-08-02 16:44             ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 06/20] block: Bring back zero_fill_bio_iter Kent Overstreet
2023-07-24 17:35   ` Christoph Hellwig
2023-07-25  2:45     ` Kent Overstreet
2023-07-26 13:21       ` Christoph Hellwig
2023-08-01 19:05         ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 07/20] block: Don't block on s_umount from __invalidate_super() Kent Overstreet
2023-07-12 21:11 ` [PATCH 08/20] stacktrace: Export stack_trace_save_tsk Kent Overstreet
2023-07-12 21:11 ` [PATCH 09/20] lib/string_helpers: string_get_size() now returns characters wrote Kent Overstreet
2023-07-12 21:11 ` [PATCH 10/20] lib: Export errname Kent Overstreet
2023-07-13  7:10   ` Eric Biggers
2023-07-12 21:11 ` [PATCH 11/20] locking/osq: Export osq_(lock|unlock) Kent Overstreet
2023-08-02 20:16   ` Waiman Long
2023-08-02 20:44     ` Kent Overstreet
2023-08-02 21:09       ` Waiman Long
2023-08-02 21:42         ` Kent Overstreet
2023-10-10  8:09           ` Ingo Molnar [this message]
2023-10-18 21:04             ` [NAK] " Kent Overstreet
2023-07-12 21:11 ` [PATCH 12/20] bcache: move closures to lib/ Kent Overstreet
2023-07-13  3:21   ` Randy Dunlap
2023-07-13  3:52     ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 13/20] MAINTAINERS: Add entry for closures Kent Overstreet
2023-07-12 21:11 ` [PATCH 14/20] closures: closure_wait_event() Kent Overstreet
2023-07-12 21:11 ` [PATCH 15/20] closures: closure_nr_remaining() Kent Overstreet
2023-07-12 21:11 ` [PATCH 16/20] closures: Add a missing include Kent Overstreet
2023-07-12 21:11 ` [PATCH 17/20] MAINTAINERS: Add entry for generic-radix-tree Kent Overstreet
2023-07-12 21:11 ` [PATCH 18/20] lib/generic-radix-tree.c: Don't overflow in peek() Kent Overstreet
2023-07-12 21:11 ` [PATCH 19/20] lib/generic-radix-tree.c: Add a missing include Kent Overstreet
2023-07-25  3:04   ` Matthew Wilcox
2023-07-25  3:36     ` Kent Overstreet
2023-07-12 21:11 ` [PATCH 20/20] lib/generic-radix-tree.c: Add peek_prev() Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZSUGwr5S5Nflbiay@gmail.com \
    --to=mingo@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=sfr@canb.auug.org.au \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).