From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Waiman Long <longman@redhat.com>,
	mingo@kernel.org, will@kernel.org, tglx@linutronix.de,
	linux-kernel@vger.kernel.org, bigeasy@linutronix.de,
	juri.lelli@redhat.com, williams@redhat.com, bristot@redhat.com,
	dave@stgolabs.net, jack@suse.com
Subject: Re: [PATCH 5/5] locking/percpu-rwsem: Remove the embedded rwsem
Date: Tue, 17 Dec 2019 11:28:08 +0100
Message-ID: <20191217102808.GO2871@hirez.programming.kicks-ass.net>
In-Reply-To: <20191217102654.GA2844@hirez.programming.kicks-ass.net>

On Tue, Dec 17, 2019 at 11:26:54AM +0100, Peter Zijlstra wrote:
> On Tue, Nov 19, 2019 at 04:58:26PM +0100, Oleg Nesterov wrote:
> > On 11/19, Waiman Long wrote:
> > >
> > > On 11/13/19 5:21 AM, Peter Zijlstra wrote:
> > > > +static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
> > > > +				      unsigned int mode, int wake_flags,
> > > > +				      void *key)
> > > > +{
> > > > +	struct task_struct *p = get_task_struct(wq_entry->private);
> > > > +	bool reader = wq_entry->flags & WQ_FLAG_CUSTOM;
> > > > +	struct percpu_rw_semaphore *sem = key;
> > > > +
> > > > +	/* concurrent against percpu_down_write(), can get stolen */
> > > > +	if (!__percpu_rwsem_trylock(sem, reader))
> > > > +		return 1;
> > > > +
> > > > +	list_del_init(&wq_entry->entry);
> > > > +	smp_store_release(&wq_entry->private, NULL);
> > > > +
> > > > +	wake_up_process(p);
> > > > +	put_task_struct(p);
> > > > +
> > > > +	return !reader; /* wake 'all' readers and 1 writer */
> > > > +}
> > > > +
> > >
> > > If I read the function correctly, you are setting the WQ_FLAG_EXCLUSIVE
> > > for both readers and writers and __wake_up() is called with an exclusive
> > > count of one. So only one reader or writer is woken up each time.
> > 
> > This depends on what percpu_rwsem_wake_function() returns. If it returns 1,
> > __wake_up_common() stops, exactly because all waiters have WQ_FLAG_EXCLUSIVE.
> 
> Indeed, let me see if I can clarify that somehow.
> 
> > > However, the comment above said we wake 'all' readers and 1 writer. That
> > > doesn't match the actual code, IMO.
> > 
> > Well, "'all' readers" probably means "all readers before writer",
> 
> Correct.

Does this clarify?

--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -101,6 +101,19 @@ static bool __percpu_rwsem_trylock(struc
 	return __percpu_down_write_trylock(sem);
 }
 
+/*
+ * The return value of wait_queue_entry::func means:
+ *
+ *  <0 - error, wakeup is terminated and the error is returned
+ *   0 - no wakeup, a next waiter is tried
+ *  >0 - woken, if EXCLUSIVE, counted towards @nr_exclusive.
+ *
+ * We use EXCLUSIVE for both readers and writers to preserve FIFO order,
+ * and play games with the return value to allow waking multiple readers.
+ *
+ * Specifically, we wake readers until we've woken a single writer, or until a
+ * trylock fails.
+ */
 static int percpu_rwsem_wake_function(struct wait_queue_entry *wq_entry,
 				      unsigned int mode, int wake_flags,
 				      void *key)
@@ -119,7 +132,7 @@ static int percpu_rwsem_wake_function(st
 	wake_up_process(p);
 	put_task_struct(p);
 
-	return !reader; /* wake 'all' readers and 1 writer */
+	return !reader; /* wake (readers until) 1 writer */
 }
 
 static void percpu_rwsem_wait(struct percpu_rw_semaphore *sem, bool reader)
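
For anyone following along, the return-value game described in the comment
above can be sketched in plain user-space C. This is only an illustration
under stated assumptions, not the kernel code: every sim_* name is
hypothetical, there is no concurrency, and the lock state is reduced to a
reader count plus a single 'blocked' flag that a writer sets and readers
check. The walk treats every waiter as EXCLUSIVE and stops once a callback
has returned >0 nr_exclusive times.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued waiter; not struct wait_queue_entry. */
struct sim_waiter {
	bool reader;	/* true for a reader, false for a writer */
	bool woken;	/* set once the wake callback has handed over the lock */
};

/* Assumed, simplified lock state for illustration only. */
struct sim_sem {
	int readers;	/* readers that currently hold the lock */
	bool blocked;	/* set while a writer owns (or is acquiring) the lock */
};

/* Readers succeed unless a writer has blocked the lock; a writer claims the flag. */
static bool sim_trylock(struct sim_sem *sem, bool reader)
{
	if (sem->blocked)
		return false;
	if (reader)
		sem->readers++;
	else
		sem->blocked = true;
	return true;
}

/*
 * Same return convention as the patch: 0 means "keep walking",
 * >0 means "counts towards nr_exclusive".
 */
static int sim_wake_function(struct sim_sem *sem, struct sim_waiter *w)
{
	if (!sim_trylock(sem, w->reader))
		return 1;		/* lock not available, stop the walk */

	w->woken = true;
	return !w->reader;		/* reader: continue, writer: stop */
}

/* Walk the queue in FIFO order; returns how many waiters were woken. */
static int sim_wake_up(struct sim_sem *sem, struct sim_waiter *queue,
		       size_t nr, int nr_exclusive)
{
	int woken = 0;
	size_t i;

	assert(nr_exclusive > 0);

	for (i = 0; i < nr; i++) {
		int ret;

		if (queue[i].woken)	/* already dequeued by an earlier walk */
			continue;

		ret = sim_wake_function(sem, &queue[i]);
		if (ret < 0)
			break;
		if (queue[i].woken)
			woken++;
		if (ret > 0 && --nr_exclusive == 0)
			break;
	}

	return woken;
}
```

With a queue of reader, reader, writer, reader and nr_exclusive = 1, the
walk in this sketch wakes the two leading readers and the writer, then
stops; the trailing reader stays queued behind the writer, which is the
FIFO behaviour the comment is trying to describe.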
