linux-kernel.vger.kernel.org archive mirror
From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Waiman Long <longman@redhat.com>,
	mingo@kernel.org, will@kernel.org, tglx@linutronix.de,
	linux-kernel@vger.kernel.org, bigeasy@linutronix.de,
	juri.lelli@redhat.com, williams@redhat.com, bristot@redhat.com,
	dave@stgolabs.net, jack@suse.com
Subject: Re: [PATCH 5/5] locking/percpu-rwsem: Remove the embedded rwsem
Date: Tue, 17 Dec 2019 11:28:08 +0100
Message-ID: <20191217102808.GO2871@hirez.programming.kicks-ass.net>
In-Reply-To: <20191217102654.GA2844@hirez.programming.kicks-ass.net>

On Tue, Dec 17, 2019 at 11:26:54AM +0100, Peter Zijlstra wrote:
> On Tue, Nov 19, 2019 at 04:58:26PM +0100, Oleg Nesterov wrote:
> > On 11/19, Waiman Long wrote:
> > >
> > > On 11/13/19 5:21 AM, Peter Zijlstra wrote:
> > > > +static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
> > > > +				      unsigned int mode, int wake_flags,
> > > > +				      void *key)
> > > > +{
> > > > +	struct task_struct *p = get_task_struct(wq_entry->private);
> > > > +	bool reader = wq_entry->flags & WQ_FLAG_CUSTOM;
> > > > +	struct percpu_rw_semaphore *sem = key;
> > > > +
> > > > +	/* concurrent against percpu_down_write(), can get stolen */
> > > > +	if (!__percpu_rwsem_trylock(sem, reader))
> > > > +		return 1;
> > > > +
> > > > +	list_del_init(&wq_entry->entry);
> > > > +	smp_store_release(&wq_entry->private, NULL);
> > > > +
> > > > +	wake_up_process(p);
> > > > +	put_task_struct(p);
> > > > +
> > > > +	return !reader; /* wake 'all' readers and 1 writer */
> > > > +}
> > > > +
> > >
> > > If I read the function correctly, you are setting the WQ_FLAG_EXCLUSIVE
> > > for both readers and writers and __wake_up() is called with an exclusive
> > > count of one. So only one reader or writer is woken up each time.
> > 
> > This depends on what percpu_rwsem_wake_function() returns. If it returns 1,
> > __wake_up_common() stops, exactly because all waiters have WQ_FLAG_EXCLUSIVE.
> 
> Indeed, let me see if I can clarify that somehow.
> 
> > > However, the comment above said we wake 'all' readers and 1 writer. That
> > > doesn't match the actual code, IMO.
> > 
> > Well, "'all' readers" probably means "all readers before writer",
> 
> Correct.

Does this clarify?

--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -101,6 +101,19 @@ static bool __percpu_rwsem_trylock(struc
 	return __percpu_down_write_trylock(sem);
 }
 
+/*
+ * The return value of wait_queue_entry::func means:
+ *
+ *  <0 - error, wakeup is terminated and the error is returned
+ *   0 - no wakeup, a next waiter is tried
+ *  >0 - woken, if EXCLUSIVE, counted towards @nr_exclusive.
+ *
+ * We use EXCLUSIVE for both readers and writers to preserve FIFO order,
+ * and play games with the return value to allow waking multiple readers.
+ *
+ * Specifically, we wake readers until we've woken a single writer, or until a
+ * trylock fails.
+ */
 static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
 				      unsigned int mode, int wake_flags,
 				      void *key)
@@ -119,7 +132,7 @@ static int percpu_rwsem_wake_function(st
 	wake_up_process(p);
 	put_task_struct(p);
 
-	return !reader; /* wake 'all' readers and 1 writer */
+	return !reader; /* wake (readers until) 1 writer */
 }
 
 static void percpu_rwsem_wait(struct percpu_rw_semaphore *sem, bool reader)
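
For illustration only, and not part of the patch: the return-value
convention described in the comment above can be sketched in plain
userspace C. This is a minimal model under stated assumptions. The names
struct waiter, wake_function() and wake_up_sim() are hypothetical
stand-ins, not kernel symbols, and the trylock failure path (where the
real callback returns 1 without waking anyone) is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued waiter; not a kernel type. */
struct waiter {
	bool reader;	/* true for a reader, false for a writer */
	bool woken;	/* set once the wake callback has run */
};

/*
 * Models the return convention of the wake callback in the patch:
 * a reader returns 0 (not counted), a writer returns 1 (counted
 * towards nr_exclusive). Assumes the trylock always succeeds.
 */
static int wake_function(struct waiter *w)
{
	w->woken = true;
	return !w->reader;
}

/*
 * Models a walk over a FIFO queue in which every waiter is EXCLUSIVE:
 * stop on a negative return, or once nr_exclusive counted wakeups
 * have happened. Returns the number of waiters woken.
 */
static int wake_up_sim(struct waiter *q, size_t n, int nr_exclusive)
{
	int woken = 0;

	assert(nr_exclusive > 0);

	for (size_t i = 0; i < n; i++) {
		int ret = wake_function(&q[i]);

		if (ret < 0)
			break;
		woken++;
		if (ret > 0 && !--nr_exclusive)
			break;
	}
	return woken;
}
```

With a queue of reader, reader, writer, reader and an nr_exclusive of 1,
the walk in this model wakes the two leading readers and the writer, then
stops, leaving the trailing reader queued. That matches the "readers
until 1 writer" wording of the updated comment.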
