* [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin
@ 2014-08-07 22:26 Waiman Long
  2014-08-07 22:26 ` [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup Waiman Long
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-07 22:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, Davidlohr Bueso, Jason Low, Scott J Norton, Waiman Long

v1->v2:
 - Remove patch 1 which changes preempt_enable() to
   preempt_enable_no_resched().
 - Remove the RWSEM_READ_OWNED macro and assume readers own the lock
   when owner is NULL.
 - Reduce the spin threshold to 64.
 - Enable writer respin only if spinners are present.

This patch set improves upon the rwsem optimistic spinning patch set
from Davidlohr to enable a better-performing rwsem and more aggressive
use of optimistic spinning.

The following table shows execution times from a microbenchmark that
runs 1 million lock-unlock operations per thread on a 4-socket, 40-core
Westmere-EX x86-64 test machine with 3.16-rc7 based kernels. The tests
use 2 or 10 threads running on different CPUs of the same socket, and
the load is the number of pause instructions executed in the critical
section:

  lock/r:w ratio # of threads	Load:Execution Time (ms)
  -------------- ------------	------------------------
  mutex		      2		1:530.7, 5:406.0, 10:472.7
  mutex		     10		1:1848 , 5:2046 , 10:4394

Before patch:
  rwsem/0:1	      2		1:339.4, 5:368.9, 10:394.0
  rwsem/1:1	      2		1:2915 , 5:2621 , 10:2764
  rwsem/10:1	      2		1:891.2, 5:779.2, 10:827.2
  rwsem/0:1	     10		1:5618 , 5:5722 , 10:5683
  rwsem/1:1	     10		1:14562, 5:14561, 10:14770
  rwsem/10:1	     10		1:5914 , 5:5971 , 10:5912

After patch:
  rwsem/0:1	     2		1:334.6, 5:334.5, 10:366.9
  rwsem/1:1	     2		1:311.0, 5:320.5, 10:300.0
  rwsem/10:1	     2		1:184.6, 5:180.6, 10:188.9
  rwsem/0:1	    10		1:1842 , 5:1925 , 10:2306
  rwsem/1:1	    10		1:1668 , 5:1706 , 10:1555
  rwsem/10:1	    10		1:1266 , 5:1294 , 10:1342

% Change:
  rwsem/0:1	     2		1: -1.4%, 5: -9.6%, 10: -6.7%
  rwsem/1:1	     2		1:-89.3%, 5:-87.7%, 10:-89.1%
  rwsem/10:1	     2		1:-79.3%, 5:-76.8%, 10:-77.2%
  rwsem/0:1	    10		1:-67.2%, 5:-66.4%, 10:-59.4%
  rwsem/1:1	    10		1:-88.5%, 5:-88.3%, 10:-89.5%
  rwsem/10:1	    10		1:-78.6%, 5:-78.3%, 10:-77.3%

There is a dramatic reduction in the execution times. The new rwsem is
now even faster than the mutex, whether the workload is all writers or
a mixture of writers and readers.
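
For reference, each benchmark thread essentially runs a loop of the
following shape (a rough sketch only, not the actual test code; the
pick_reader_by_ratio() helper and the "load" value are hypothetical
placeholders for the r:w ratio and the pause-instruction count):

  static int lock_unlock_thread(void *arg)
  {
	struct rw_semaphore *sem = arg;
	int i, j;

	for (i = 0; i < 1000000; i++) {
		/* hypothetical helper implementing the r:w ratio */
		bool reader = pick_reader_by_ratio();

		if (reader)
			down_read(sem);
		else
			down_write(sem);

		/* critical section: "load" pause instructions */
		for (j = 0; j < load; j++)
			cpu_relax();

		if (reader)
			up_read(sem);
		else
			up_write(sem);
	}
	return 0;
  }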

When running the AIM7 benchmarks on a larger 8-socket, 80-core system
(HT off), the performance improvements on some of the workloads were
as follows:

      Workload	       Before Patch	After Patch	% Change
      --------	       ------------	-----------	--------
  alltests (200-1000)	  337892	  345888	 + 2.4%
  alltests (1100-2000)	  402535	  474065	 +17.8%
  custom (200-1000)	  480651	  547522	 +13.9%
  custom (1100-2000)	  461037	  561588	 +21.8%
  shared (200-1000)	  420845	  458048	 + 8.8%
  shared (1100-2000)	  428045	  473121	 +10.5%

Waiman Long (7):
  locking/rwsem: check for active writer/spinner before wakeup
  locking/rwsem: threshold limited spinning for active readers
  locking/rwsem: rwsem_can_spin_on_owner can be called with preemption
    enabled
  locking/rwsem: more aggressive use of optimistic spinning
  locking/rwsem: move down rwsem_down_read_failed function
  locking/rwsem: enables optimistic spinning for readers
  locking/rwsem: allow waiting writers to go back to spinning

 include/linux/osq_lock.h    |    5 +
 kernel/locking/rwsem-xadd.c |  348 ++++++++++++++++++++++++++++++++++---------
 2 files changed, 283 insertions(+), 70 deletions(-)



* [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
@ 2014-08-07 22:26 ` Waiman Long
  2014-08-08  0:45   ` Davidlohr Bueso
  2014-08-07 22:26 ` [PATCH v2 2/7] locking/rwsem: threshold limited spinning for active readers Waiman Long
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Waiman Long @ 2014-08-07 22:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, Davidlohr Bueso, Jason Low, Scott J Norton, Waiman Long

On a highly contended rwsem, spinlock contention due to the slow
rwsem_wake() call can be a significant portion of the total CPU cycles
used. With writer lock stealing and writer optimistic spinning, there
is also a pretty good chance that the lock may have been stolen
before the waker wakes up the waiters. The woken tasks, if any,
will have to go back to sleep again.

This patch adds checking code at the beginning of the rwsem_wake()
and __rwsem_do_wake() functions to look for a spinner and an active
writer respectively.  The presence of an active writer will abort the
wakeup operation.  The presence of a spinner will still allow the
wakeup operation to proceed as long as the trylock operation succeeds.
This strikes a good balance between avoiding excessive spinlock
contention (especially when there are a lot of active readers) and
avoiding a lot of failed fastpath operations caused by tasks waiting
in the queue.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 include/linux/osq_lock.h    |    5 ++++
 kernel/locking/rwsem-xadd.c |   57 ++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 61 insertions(+), 1 deletions(-)

diff --git a/include/linux/osq_lock.h b/include/linux/osq_lock.h
index 90230d5..ade1973 100644
--- a/include/linux/osq_lock.h
+++ b/include/linux/osq_lock.h
@@ -24,4 +24,9 @@ static inline void osq_lock_init(struct optimistic_spin_queue *lock)
 	atomic_set(&lock->tail, OSQ_UNLOCKED_VAL);
 }
 
+static inline bool osq_is_locked(struct optimistic_spin_queue *lock)
+{
+	return atomic_read(&lock->tail) != OSQ_UNLOCKED_VAL;
+}
+
 #endif
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index a2391ac..d38cfae 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -107,6 +107,37 @@ enum rwsem_wake_type {
 	RWSEM_WAKE_READ_OWNED	/* Waker thread holds the read lock */
 };
 
+#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
+/*
+ * return true if there is an active writer by checking the owner field which
+ * should be set if there is one.
+ */
+static inline bool rwsem_has_active_writer(struct rw_semaphore *sem)
+{
+	struct task_struct *owner = ACCESS_ONCE(sem->owner);
+
+	return owner != NULL;
+}
+
+/*
+ * Return true if the rwsem has active spinner
+ */
+static inline bool rwsem_has_spinner(struct rw_semaphore *sem)
+{
+	return osq_is_locked(&sem->osq);
+}
+#else /* CONFIG_RWSEM_SPIN_ON_OWNER */
+static inline bool rwsem_has_active_writer(struct rw_semaphore *sem)
+{
+	return false;	/* Assume it has no active writer */
+}
+
+static inline bool rwsem_has_spinner(struct rw_semaphore *sem)
+{
+	return false;
+}
+#endif /* CONFIG_RWSEM_SPIN_ON_OWNER */
+
 /*
  * handle the lock release when processes blocked on it that can now run
  * - if we come here from up_xxxx(), then:
@@ -125,6 +156,15 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
 	struct list_head *next;
 	long oldcount, woken, loop, adjustment;
 
+	/*
+	 * Abort the wakeup operation if there is an active writer as the
+	 * lock was stolen. up_write() should have cleared the owner field
+	 * before calling this function. If that field is now set, there must
+	 * be an active writer present.
+	 */
+	if (rwsem_has_active_writer(sem))
+		goto out;
+
 	waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
 	if (waiter->type == RWSEM_WAITING_FOR_WRITE) {
 		if (wake_type == RWSEM_WAKE_ANY)
@@ -475,7 +515,22 @@ struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem)
 {
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&sem->wait_lock, flags);
+	/*
+	 * If a spinner is present, it is not necessary to do the wakeup.
+	 * Try to do wakeup when the trylock succeeds to avoid potential
+	 * spinlock contention which may introduce too much delay in the
+	 * unlock operation.
+	 *
+	 * In case the spinning writer is just going to break out of the loop,
+	 * it will still do a trylock in rwsem_down_write_failed() before
+	 * sleeping.
+	 */
+	if (rwsem_has_spinner(sem)) {
+		if (!raw_spin_trylock_irqsave(&sem->wait_lock, flags))
+			return sem;
+	} else {
+		raw_spin_lock_irqsave(&sem->wait_lock, flags);
+	}
 
 	/* do nothing if list empty */
 	if (!list_empty(&sem->wait_list))
-- 
1.7.1



* [PATCH v2 2/7] locking/rwsem: threshold limited spinning for active readers
  2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
  2014-08-07 22:26 ` [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup Waiman Long
@ 2014-08-07 22:26 ` Waiman Long
  2014-08-07 22:26 ` [PATCH v2 3/7] locking/rwsem: rwsem_can_spin_on_owner can be called with preemption enabled Waiman Long
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-07 22:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, Davidlohr Bueso, Jason Low, Scott J Norton, Waiman Long

Even though only writers can perform optimistic spinning, there is
still a chance that readers may take the lock before a spinning writer
can get it. In that case, the owner field will be NULL and the spinning
writer may spin indefinitely, until its time quantum expires, if some
of the lock-owning readers are not running.

This patch handles this special case by doing threshold-limited
spinning when the owner field is NULL. The threshold is small enough
that even if the readers are not running, it will not cause a lot of
wasted spinning cycles.  With that change, the patch tries to strike a
balance between giving up too early (and losing potential performance
gain) and wasting too many precious CPU cycles when some lock-owning
readers are not running.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/rwsem-xadd.c |   30 +++++++++++++++++++++++++++++-
 1 files changed, 29 insertions(+), 1 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index d38cfae..ddd56d2 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -304,6 +304,14 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
 
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 /*
+ * Thresholds for optimistic spinning on readers
+ *
+ * This is the threshold for the number of spins that happens before the
+ * spinner gives up when the owner field is NULL.
+ */
+#define SPIN_READ_THRESHOLD	64
+
+/*
  * Try to acquire write lock before the writer has been put on wait queue.
  */
 static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
@@ -381,10 +389,20 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
 	return sem->owner == NULL;
 }
 
+/*
+ * With active writer, spinning is done by checking if that writer is on
+ * CPU. With active readers, there is no easy way to determine if all of
+ * them are active. So it falls back to spin a certain number of times
+ * (SPIN_READ_THRESHOLD) before giving up. The threshold is relatively
+ * small with the expectation that readers are quick. For slow readers,
+ * the spinners will still fall back to sleep. On the other hand, it won't
+ * waste too many cycles when the lock owning readers are not running.
+ */
 static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 {
 	struct task_struct *owner;
 	bool taken = false;
+	int  spincnt = 0;
 
 	preempt_disable();
 
@@ -397,8 +415,18 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 
 	while (true) {
 		owner = ACCESS_ONCE(sem->owner);
-		if (owner && !rwsem_spin_on_owner(sem, owner))
+		if (!owner) {
+			/*
+			 * Give up spinning if spincnt reaches the threshold.
+			 */
+			if (spincnt++ >= SPIN_READ_THRESHOLD)
+				break;
+		} else if (!rwsem_spin_on_owner(sem, owner)) {
 			break;
+		} else {
+			/* Reset count when owner is defined */
+			spincnt = 0;
+		}
 
 		/* wait_lock will be acquired if write_lock is obtained */
 		if (rwsem_try_write_lock_unqueued(sem)) {
-- 
1.7.1



* [PATCH v2 3/7] locking/rwsem: rwsem_can_spin_on_owner can be called with preemption enabled
  2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
  2014-08-07 22:26 ` [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup Waiman Long
  2014-08-07 22:26 ` [PATCH v2 2/7] locking/rwsem: threshold limited spinning for active readers Waiman Long
@ 2014-08-07 22:26 ` Waiman Long
  2014-08-07 22:26 ` [PATCH v2 4/7] locking/rwsem: more aggressive use of optimistic spinning Waiman Long
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-07 22:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, Davidlohr Bueso, Jason Low, Scott J Norton, Waiman Long

The rwsem_can_spin_on_owner() function can safely be called with
preemption enabled. This patch moves the call before preempt_disable()
so that no unnecessary preemption operations are performed when
spinning is not possible.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/rwsem-xadd.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index ddd56d2..c8887ba 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -404,11 +404,11 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 	bool taken = false;
 	int  spincnt = 0;
 
-	preempt_disable();
-
 	/* sem->wait_lock should not be held when doing optimistic spinning */
 	if (!rwsem_can_spin_on_owner(sem))
-		goto done;
+		return false;
+
+	preempt_disable();
 
 	if (!osq_lock(&sem->osq))
 		goto done;
-- 
1.7.1



* [PATCH v2 4/7] locking/rwsem: more aggressive use of optimistic spinning
  2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
                   ` (2 preceding siblings ...)
  2014-08-07 22:26 ` [PATCH v2 3/7] locking/rwsem: rwsem_can_spin_on_owner can be called with preemption enabled Waiman Long
@ 2014-08-07 22:26 ` Waiman Long
  2014-08-07 22:26 ` [PATCH v2 5/7] locking/rwsem: move down rwsem_down_read_failed function Waiman Long
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-07 22:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, Davidlohr Bueso, Jason Low, Scott J Norton, Waiman Long

The rwsem_can_spin_on_owner() function currently allows optimistic
spinning only if the owner field is set and the owner is running. That
is too conservative, as it will cause some tasks to miss the
opportunity to spin when readers own the lock, when the owner hasn't
been able to set the owner field in time, or when the lock has just
become available. With threshold-limited spinning for readers, spinning
while lock owners are running should be less of a concern.

This patch enables more aggressive use of optimistic spinning by
assuming that the lock is spinnable unless proved otherwise.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/rwsem-xadd.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index c8887ba..1281de8 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -333,7 +333,7 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 {
 	struct task_struct *owner;
-	bool on_cpu = false;
+	bool on_cpu = true;
 
 	if (need_resched())
 		return false;
-- 
1.7.1



* [PATCH v2 5/7] locking/rwsem: move down rwsem_down_read_failed function
  2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
                   ` (3 preceding siblings ...)
  2014-08-07 22:26 ` [PATCH v2 4/7] locking/rwsem: more aggressive use of optimistic spinning Waiman Long
@ 2014-08-07 22:26 ` Waiman Long
  2014-08-07 22:26 ` [PATCH v2 6/7] locking/rwsem: enables optimistic spinning for readers Waiman Long
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-07 22:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, Davidlohr Bueso, Jason Low, Scott J Norton, Waiman Long

Move the rwsem_down_read_failed() function down below the optimistic
spinning section in preparation for enabling optimistic spinning for
readers. There is no code change.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/rwsem-xadd.c |   96 +++++++++++++++++++++---------------------
 1 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 1281de8..aa03479 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -239,54 +239,6 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
 	return sem;
 }
 
-/*
- * Wait for the read lock to be granted
- */
-__visible
-struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
-{
-	long count, adjustment = -RWSEM_ACTIVE_READ_BIAS;
-	struct rwsem_waiter waiter;
-	struct task_struct *tsk = current;
-
-	/* set up my own style of waitqueue */
-	waiter.task = tsk;
-	waiter.type = RWSEM_WAITING_FOR_READ;
-	get_task_struct(tsk);
-
-	raw_spin_lock_irq(&sem->wait_lock);
-	if (list_empty(&sem->wait_list))
-		adjustment += RWSEM_WAITING_BIAS;
-	list_add_tail(&waiter.list, &sem->wait_list);
-
-	/* we're now waiting on the lock, but no longer actively locking */
-	count = rwsem_atomic_update(adjustment, sem);
-
-	/* If there are no active locks, wake the front queued process(es).
-	 *
-	 * If there are no writers and we are first in the queue,
-	 * wake our own waiter to join the existing active readers !
-	 */
-	if (count == RWSEM_WAITING_BIAS ||
-	    (count > RWSEM_WAITING_BIAS &&
-	     adjustment != -RWSEM_ACTIVE_READ_BIAS))
-		sem = __rwsem_do_wake(sem, RWSEM_WAKE_ANY);
-
-	raw_spin_unlock_irq(&sem->wait_lock);
-
-	/* wait to be given the lock */
-	while (true) {
-		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
-		if (!waiter.task)
-			break;
-		schedule();
-	}
-
-	tsk->state = TASK_RUNNING;
-
-	return sem;
-}
-
 static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
 {
 	if (!(count & RWSEM_ACTIVE_MASK)) {
@@ -465,6 +417,54 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 #endif
 
 /*
+ * Wait for the read lock to be granted
+ */
+__visible
+struct rw_semaphore __sched * rwsem_down_read_failed(struct rw_semaphore *sem)
+{
+	long count, adjustment = -RWSEM_ACTIVE_READ_BIAS;
+	struct rwsem_waiter waiter;
+	struct task_struct *tsk = current;
+
+	/* set up my own style of waitqueue */
+	waiter.task = tsk;
+	waiter.type = RWSEM_WAITING_FOR_READ;
+	get_task_struct(tsk);
+
+	raw_spin_lock_irq(&sem->wait_lock);
+	if (list_empty(&sem->wait_list))
+		adjustment += RWSEM_WAITING_BIAS;
+	list_add_tail(&waiter.list, &sem->wait_list);
+
+	/* we're now waiting on the lock, but no longer actively locking */
+	count = rwsem_atomic_update(adjustment, sem);
+
+	/* If there are no active locks, wake the front queued process(es).
+	 *
+	 * If there are no writers and we are first in the queue,
+	 * wake our own waiter to join the existing active readers !
+	 */
+	if (count == RWSEM_WAITING_BIAS ||
+	    (count > RWSEM_WAITING_BIAS &&
+	     adjustment != -RWSEM_ACTIVE_READ_BIAS))
+		sem = __rwsem_do_wake(sem, RWSEM_WAKE_ANY);
+
+	raw_spin_unlock_irq(&sem->wait_lock);
+
+	/* wait to be given the lock */
+	while (true) {
+		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
+		if (!waiter.task)
+			break;
+		schedule();
+	}
+
+	tsk->state = TASK_RUNNING;
+
+	return sem;
+}
+
+/*
  * Wait until we successfully acquire the write lock
  */
 __visible
-- 
1.7.1



* [PATCH v2 6/7] locking/rwsem: enables optimistic spinning for readers
  2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
                   ` (4 preceding siblings ...)
  2014-08-07 22:26 ` [PATCH v2 5/7] locking/rwsem: move down rwsem_down_read_failed function Waiman Long
@ 2014-08-07 22:26 ` Waiman Long
  2014-08-07 22:26 ` [PATCH v2 7/7] locking/rwsem: allow waiting writers to go back to spinning Waiman Long
  2014-08-07 23:52 ` [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Davidlohr Bueso
  7 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-07 22:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, Davidlohr Bueso, Jason Low, Scott J Norton, Waiman Long

This patch enables readers to go into the optimistic spinning loop
so that both the readers and writers can spin together. This could
speed up workloads that use both readers and writers.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/rwsem-xadd.c |  122 +++++++++++++++++++++++++++++++++++++------
 1 files changed, 106 insertions(+), 16 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index aa03479..16fabe0 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -282,6 +282,79 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 	}
 }
 
+/*
+ * Try to acquire read lock
+ *
+ * There is ambiguity when RWSEM_WAITING_BIAS < count < 0 as a writer may
+ * be active instead of having waiters. So we need to recheck the count
+ * under wait_lock to be sure.
+ */
+static inline bool rwsem_try_read_lock_unqueued(struct rw_semaphore *sem)
+{
+	long old, count = ACCESS_ONCE(sem->count);
+	bool taken = false;	/* True if lock taken */
+
+	while (!taken) {
+		if (count < RWSEM_WAITING_BIAS)
+			break;	/* Have writer and waiter */
+
+		old = count;
+		if (count >= 0 || count == RWSEM_WAITING_BIAS) {
+			count = cmpxchg(&sem->count, old,
+					old + RWSEM_ACTIVE_READ_BIAS);
+			if (count == old) {
+				/* Got the read lock */
+				taken = true;
+				/*
+				 * Try to wake up readers if lock is free
+				 */
+				if ((count == RWSEM_WAITING_BIAS) &&
+				    raw_spin_trylock_irq(&sem->wait_lock)) {
+					if (!list_empty(&sem->wait_list))
+						goto wake_readers;
+					raw_spin_unlock_irq(&sem->wait_lock);
+				}
+			}
+		} else if (!rwsem_has_active_writer(sem)) {
+			long threshold;
+
+			/*
+			 * RWSEM_WAITING_BIAS < count < 0
+			 */
+			raw_spin_lock_irq(&sem->wait_lock);
+			threshold = list_empty(&sem->wait_list)
+				  ? 0 : RWSEM_WAITING_BIAS;
+			count = ACCESS_ONCE(sem->count);
+			if (count < threshold) {
+				raw_spin_unlock_irq(&sem->wait_lock);
+				break;
+			}
+			old   = count;
+			count = cmpxchg(&sem->count, old,
+					old + RWSEM_ACTIVE_READ_BIAS);
+			if (count == old) {
+				taken = true;
+				/*
+				 * Wake up pending readers, if any,
+				 * while holding the lock.
+				 */
+				if (threshold)
+					goto wake_readers;
+			}
+			raw_spin_unlock_irq(&sem->wait_lock);
+		} else {
+			break;
+		}
+	}
+	return taken;
+
+wake_readers:
+	__rwsem_do_wake(sem, RWSEM_WAKE_READ_OWNED);
+	raw_spin_unlock_irq(&sem->wait_lock);
+	return true;
+
+}
+
 static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 {
 	struct task_struct *owner;
@@ -350,7 +423,8 @@ bool rwsem_spin_on_owner(struct rw_semaphore *sem, struct task_struct *owner)
  * the spinners will still fall back to sleep. On the other hand, it won't
  * waste too many cycles when the lock owning readers are not running.
  */
-static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+static bool rwsem_optimistic_spin(struct rw_semaphore *sem,
+				  enum rwsem_waiter_type type)
 {
 	struct task_struct *owner;
 	bool taken = false;
@@ -380,11 +454,11 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 			spincnt = 0;
 		}
 
-		/* wait_lock will be acquired if write_lock is obtained */
-		if (rwsem_try_write_lock_unqueued(sem)) {
-			taken = true;
+		taken = (type == RWSEM_WAITING_FOR_WRITE)
+		      ? rwsem_try_write_lock_unqueued(sem)
+		      : rwsem_try_read_lock_unqueued(sem);
+		if (taken)
 			break;
-		}
 
 		/*
 		 * When there's no owner, we might have preempted between the
@@ -410,7 +484,8 @@ done:
 }
 
 #else
-static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+static bool rwsem_optimistic_spin(struct rw_semaphore *sem,
+				  enum rwsem_waiter_type type)
 {
 	return false;
 }
@@ -422,11 +497,21 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 __visible
 struct rw_semaphore __sched * rwsem_down_read_failed(struct rw_semaphore *sem)
 {
-	long count, adjustment = -RWSEM_ACTIVE_READ_BIAS;
+	long count, adjustment = 0;
 	struct rwsem_waiter waiter;
 	struct task_struct *tsk = current;
 
-	/* set up my own style of waitqueue */
+	/* undo read bias from down_read operation, stop active locking */
+	count = rwsem_atomic_update(-RWSEM_ACTIVE_READ_BIAS, sem);
+
+	/* do optimistic spinning and steal lock if possible */
+	if (rwsem_optimistic_spin(sem, RWSEM_WAITING_FOR_READ))
+		return sem;
+
+	/*
+	 * Optimistic spinning failed, proceed to the slowpath
+	 * and block until we can acquire the sem.
+	 */
 	waiter.task = tsk;
 	waiter.type = RWSEM_WAITING_FOR_READ;
 	get_task_struct(tsk);
@@ -436,8 +521,11 @@ struct rw_semaphore __sched * rwsem_down_read_failed(struct rw_semaphore *sem)
 		adjustment += RWSEM_WAITING_BIAS;
 	list_add_tail(&waiter.list, &sem->wait_list);
 
-	/* we're now waiting on the lock, but no longer actively locking */
-	count = rwsem_atomic_update(adjustment, sem);
+	/* we're now waiting on the lock */
+	if (adjustment)
+		count = rwsem_atomic_update(adjustment, sem);
+	else
+		count = ACCESS_ONCE(sem->count);
 
 	/* If there are no active locks, wake the front queued process(es).
 	 *
@@ -445,8 +533,7 @@ struct rw_semaphore __sched * rwsem_down_read_failed(struct rw_semaphore *sem)
 	 * wake our own waiter to join the existing active readers !
 	 */
 	if (count == RWSEM_WAITING_BIAS ||
-	    (count > RWSEM_WAITING_BIAS &&
-	     adjustment != -RWSEM_ACTIVE_READ_BIAS))
+	   (count >  RWSEM_WAITING_BIAS && adjustment))
 		sem = __rwsem_do_wake(sem, RWSEM_WAKE_ANY);
 
 	raw_spin_unlock_irq(&sem->wait_lock);
@@ -478,7 +565,7 @@ struct rw_semaphore __sched *rwsem_down_write_failed(struct rw_semaphore *sem)
 	count = rwsem_atomic_update(-RWSEM_ACTIVE_WRITE_BIAS, sem);
 
 	/* do optimistic spinning and steal lock if possible */
-	if (rwsem_optimistic_spin(sem))
+	if (rwsem_optimistic_spin(sem, RWSEM_WAITING_FOR_WRITE))
 		return sem;
 
 	/*
@@ -549,9 +636,12 @@ struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem)
 	 * spinlock contention which may introduce too much delay in the
 	 * unlock operation.
 	 *
-	 * In case the spinning writer is just going to break out of the loop,
-	 * it will still do a trylock in rwsem_down_write_failed() before
-	 * sleeping.
+	 * In case the spinner is just going to break out of the loop, it
+	 * will still do a trylock in rwsem_down_write_failed() before
+	 * sleeping, or call __rwsem_do_wake() in rwsem_down_read_failed()
+	 * if it detects a free lock. In either cases, we won't have the
+	 * situation that the lock is free and no task is woken up from the
+	 * waiting queue.
 	 */
 	if (rwsem_has_spinner(sem)) {
 		if (!raw_spin_trylock_irqsave(&sem->wait_lock, flags))
-- 
1.7.1



* [PATCH v2 7/7] locking/rwsem: allow waiting writers to go back to spinning
  2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
                   ` (5 preceding siblings ...)
  2014-08-07 22:26 ` [PATCH v2 6/7] locking/rwsem: enables optimistic spinning for readers Waiman Long
@ 2014-08-07 22:26 ` Waiman Long
  2014-08-07 23:52 ` [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Davidlohr Bueso
  7 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-07 22:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, Davidlohr Bueso, Jason Low, Scott J Norton, Waiman Long

More aggressive use of optimistic spinning and enabling readers to
participate in spinning may make it harder for tasks waiting in the
queue to get access to the semaphore, especially for writers who need
exclusive access.

This patch enables a woken writer to go back to the optimistic spinning
loop as long as the owner is really running and spinners are present.
This gives waiting writers better access to the semaphore as well as
reducing the size of the wait queue.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/rwsem-xadd.c |   55 +++++++++++++++++++++++++++++++++++--------
 1 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 16fabe0..20adbe9 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -355,10 +355,15 @@ wake_readers:
 
 }
 
-static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
+/*
+ * The defval argument controls whether true or false is returned
+ * when the owner field is NULL.
+ */
+static inline bool
+rwsem_can_spin_on_owner(struct rw_semaphore *sem, bool defval)
 {
 	struct task_struct *owner;
-	bool on_cpu = true;
+	bool on_cpu = defval;
 
 	if (need_resched())
 		return false;
@@ -431,7 +436,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem,
 	int  spincnt = 0;
 
 	/* sem->wait_lock should not be held when doing optimistic spinning */
-	if (!rwsem_can_spin_on_owner(sem))
+	if (!rwsem_can_spin_on_owner(sem, true))
 		return false;
 
 	preempt_disable();
@@ -489,6 +494,12 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem,
 {
 	return false;
 }
+
+static inline bool
+rwsem_can_spin_on_owner(struct rw_semaphore *sem, bool defval)
+{
+	return false;
+}
 #endif
 
 /*
@@ -558,12 +569,14 @@ __visible
 struct rw_semaphore __sched *rwsem_down_write_failed(struct rw_semaphore *sem)
 {
 	long count;
-	bool waiting = true; /* any queued threads before us */
+	bool waiting; /* any queued threads before us */
+	bool respin;
 	struct rwsem_waiter waiter;
 
 	/* undo write bias from down_write operation, stop active locking */
 	count = rwsem_atomic_update(-RWSEM_ACTIVE_WRITE_BIAS, sem);
 
+optspin:
 	/* do optimistic spinning and steal lock if possible */
 	if (rwsem_optimistic_spin(sem, RWSEM_WAITING_FOR_WRITE))
 		return sem;
@@ -578,8 +591,7 @@ struct rw_semaphore __sched *rwsem_down_write_failed(struct rw_semaphore *sem)
 	raw_spin_lock_irq(&sem->wait_lock);
 
 	/* account for this before adding a new element to the list */
-	if (list_empty(&sem->wait_list))
-		waiting = false;
+	waiting = !list_empty(&sem->wait_list);
 
 	list_add_tail(&waiter.list, &sem->wait_list);
 
@@ -600,23 +612,46 @@ struct rw_semaphore __sched *rwsem_down_write_failed(struct rw_semaphore *sem)
 
 	/* wait until we successfully acquire the lock */
 	set_current_state(TASK_UNINTERRUPTIBLE);
-	while (true) {
+	respin = false;
+	while (!respin) {
 		if (rwsem_try_write_lock(count, sem))
 			break;
 		raw_spin_unlock_irq(&sem->wait_lock);
 
-		/* Block until there are no active lockers. */
-		do {
+		/*
+		 * Block until there are no active lockers or optimistic
+		 * spinning is possible.
+		 */
+		while (true) {
 			schedule();
 			set_current_state(TASK_UNINTERRUPTIBLE);
-		} while ((count = sem->count) & RWSEM_ACTIVE_MASK);
+			count = ACCESS_ONCE(sem->count);
+			if (!(count & RWSEM_ACTIVE_MASK))
+				break;
+			/*
+			 * Go back to optimistic spinning if the lock
+			 * owner is really running and there are spinners.
+			 * If there is no spinner, the task is already at
+			 * the head of the queue or the lock owner (maybe
+			 * readers) may not be actually running.
+			 */
+			if (rwsem_has_spinner(sem) &&
+			    rwsem_can_spin_on_owner(sem, false)) {
+				respin = true;
+				break;
+			}
+		}
 
 		raw_spin_lock_irq(&sem->wait_lock);
 	}
 	__set_current_state(TASK_RUNNING);
 
 	list_del(&waiter.list);
+	if (respin && list_empty(&sem->wait_list))
+		rwsem_atomic_update(-RWSEM_WAITING_BIAS, sem);
 	raw_spin_unlock_irq(&sem->wait_lock);
+	if (respin)
+		goto optspin;
 
 	return sem;
 }
-- 
1.7.1



* Re: [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin
  2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
                   ` (6 preceding siblings ...)
  2014-08-07 22:26 ` [PATCH v2 7/7] locking/rwsem: allow waiting writers to go back to spinning Waiman Long
@ 2014-08-07 23:52 ` Davidlohr Bueso
  2014-08-08 18:16   ` Waiman Long
  7 siblings, 1 reply; 21+ messages in thread
From: Davidlohr Bueso @ 2014-08-07 23:52 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low, Scott J Norton

On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
> v1->v2:
>  - Remove patch 1 which changes preempt_enable() to
>    preempt_enable_no_resched().
>  - Remove the RWSEM_READ_OWNED macro and assume readers own the lock
>    when owner is NULL.
>  - Reduce the spin threshold to 64.

So I still don't like this, and the fact that it is used in some
virtualization locking bits doesn't really address the concerns about
arbitrary logic in our general locking code.

Also, why did you reduce it from 100 to 64? This very much wants to be
commented.

Thanks,
Davidlohr



* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-07 22:26 ` [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup Waiman Long
@ 2014-08-08  0:45   ` Davidlohr Bueso
  2014-08-08  5:39     ` Davidlohr Bueso
  0 siblings, 1 reply; 21+ messages in thread
From: Davidlohr Bueso @ 2014-08-08  0:45 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low,
	Scott J Norton, aswin

On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
> On a highly contended rwsem, spinlock contention due to the slow
> rwsem_wake() call can be a significant portion of the total CPU cycles
> used. With writer lock stealing and writer optimistic spinning, there
> is also a pretty good chance that the lock may have been stolen
> before the waker wakes up the waiters. The woken tasks, if any,
> will have to go back to sleep again.

Good catch! And this applies to mutexes as well. How about something
like this:

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index dadbf88..e037588 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -707,6 +707,20 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
 
 #endif
 
+#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+	struct task_struct *owner = ACCESS_ONCE(lock->owner);
+
+	return owner != NULL;
+}
+#else
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+	return false;
+}
+#endif
+
 /*
  * Release the lock, slowpath:
  */
@@ -734,6 +748,15 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
 	mutex_release(&lock->dep_map, nested, _RET_IP_);
 	debug_mutex_unlock(lock);
 
+	/*
+	 * Abort the wakeup operation if there is an active writer as the
+	 * lock was stolen. mutex_unlock() should have cleared the owner field
+	 * before calling this function. If that field is now set, there must
+	 * be an active writer present.
+	 */
+	if (mutex_has_owner(lock))
+		goto done;
+
 	if (!list_empty(&lock->wait_list)) {
 		/* get the first entry from the wait-list: */
 		struct mutex_waiter *waiter =
@@ -744,7 +767,7 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
 
 		wake_up_process(waiter->task);
 	}
-
+done:
 	spin_unlock_mutex(&lock->wait_lock, flags);
 }
 








* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-08  0:45   ` Davidlohr Bueso
@ 2014-08-08  5:39     ` Davidlohr Bueso
  2014-08-08 18:30       ` Waiman Long
  2014-08-08 19:50       ` Jason Low
  0 siblings, 2 replies; 21+ messages in thread
From: Davidlohr Bueso @ 2014-08-08  5:39 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low,
	Scott J Norton, aswin

On Thu, 2014-08-07 at 17:45 -0700, Davidlohr Bueso wrote:
> On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
> > On a highly contended rwsem, spinlock contention due to the slow
> > rwsem_wake() call can be a significant portion of the total CPU cycles
> > used. With writer lock stealing and writer optimistic spinning, there
> > is also a pretty good chance that the lock may have been stolen
> > before the waker wakes up the waiters. The woken tasks, if any,
> > will have to go back to sleep again.
> 
> Good catch! And this applies to mutexes as well. How about something
> like this:
> 
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index dadbf88..e037588 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -707,6 +707,20 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
>  
>  #endif
>  
> +#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)

If DEBUG, we don't clear the owner when unlocking. This can just be 

+#ifdef CONFIG_MUTEX_SPIN_ON_OWNER

> +static inline bool mutex_has_owner(struct mutex *lock)
> +{
> +	struct task_struct *owner = ACCESS_ONCE(lock->owner);
> +
> +	return owner != NULL;
> +}
> +#else
> +static inline bool mutex_has_owner(struct mutex *lock)
> +{
> +	return false;
> +}
> +#endif
> +
>  /*
>   * Release the lock, slowpath:
>   */
> @@ -734,6 +748,15 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>  	mutex_release(&lock->dep_map, nested, _RET_IP_);
>  	debug_mutex_unlock(lock);
>  
> +	/*
> +	 * Abort the wakeup operation if there is an active writer as the
> +	 * lock was stolen. mutex_unlock() should have cleared the owner field
> +	 * before calling this function. If that field is now set, there must
> +	 * be an active writer present.
> +	 */
> +	if (mutex_has_owner(lock))
> +		goto done;

Err so we actually deadlock here because we do the check with the
lock->wait_lock held and at the same time another task comes into the
slowpath of a mutex_lock() call which also tries to take the wait_lock.
Ending up with hung tasks. Here's a more tested patch against
peterz-queue, survives aim7 and kernel builds on a 80core box. Thanks.


8<---------------------------------------------------------------
From: Davidlohr Bueso <davidlohr@hp.com>
Subject: [PATCH] locking/mutex: Do not falsely wake-up tasks

The mutex lock-stealing functionality allows another task to
skip its turn in the wait-queue and atomically acquire the lock.
This is fine and a nice optimization; however, when releasing
the mutex, we always wake up the next task in FIFO order. When
the lock has been stolen, this wastes a wakeup: the woken task
immediately finds that it cannot acquire the lock and just goes
back to sleep. This is especially true on highly contended
mutexes that stress the wait_lock.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
---
 kernel/locking/mutex.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index dadbf88..52e1136 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -383,12 +383,26 @@ done:
 
 	return false;
 }
+
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+	struct task_struct *owner = ACCESS_ONCE(lock->owner);
+
+	return owner != NULL;
+}
+
 #else
+
 static bool mutex_optimistic_spin(struct mutex *lock,
 				  struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
 {
 	return false;
 }
+
+static inline bool mutex_has_owner(struct mutex *lock)
+{
+	return false;
+}
 #endif
 
 __visible __used noinline
@@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
 	if (__mutex_slowpath_needs_to_unlock())
 		atomic_set(&lock->count, 1);
 
+/*
+ * Skipping the mutex_has_owner() check when DEBUG, allows us to
+ * avoid taking the wait_lock in order to do not call mutex_release()
+ * and debug_mutex_unlock() when !DEBUG. This can otherwise result in
+ * deadlocks when another task enters the lock's slowpath in mutex_lock().
+ */
+#ifndef CONFIG_DEBUG_MUTEXES
+	/*
+	 * Abort the wakeup operation if there is an another mutex owner, as the
+	 * lock was stolen. mutex_unlock() should have cleared the owner field
+	 * before calling this function. If that field is now set, another task
+	 * must have acquired the mutex.
+	 */
+	if (mutex_has_owner(lock))
+		return;
+#endif
+
 	spin_lock_mutex(&lock->wait_lock, flags);
 	mutex_release(&lock->dep_map, nested, _RET_IP_);
 	debug_mutex_unlock(lock);
@@ -744,7 +775,6 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
 
 		wake_up_process(waiter->task);
 	}
-
 	spin_unlock_mutex(&lock->wait_lock, flags);
 }
 
-- 
1.8.1.4





* Re: [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin
  2014-08-07 23:52 ` [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Davidlohr Bueso
@ 2014-08-08 18:16   ` Waiman Long
  0 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-08 18:16 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low, Scott J Norton

On 08/07/2014 07:52 PM, Davidlohr Bueso wrote:
> On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
>> v1->v2:
>>   - Remove patch 1 which changes preempt_enable() to
>>     preempt_enable_no_resched().
>>   - Remove the RWSEM_READ_OWNED macro and assume readers own the lock
>>     when owner is NULL.
>>   - Reduce the spin threshold to 64.
> So I still don't like this, and the fact that it is used in some
> virtualization locking bits doesn't really address the concerns about
> arbitrary logic in our general locking code.

As I said in the comments, there is no easy way to figure out if all
the readers are running. I set the spin count to a relatively small
number to catch those readers with short critical sections. For those
that hold the lock for a relatively long time, the spin will end and
the task will be put to sleep. I know the solution is not elegant, but
it is simple. I thought about using a more elaborate scheme, but there
is no guarantee that it will be better than a simple spin count while
greatly complicating the code.

> Also, why did you reduce it from 100 to 64? This very much wants to be
> commented.

In the v1 patch, the 100-spin threshold applied to the whole spinning
period. In the v2 patch, the count is reset whenever a writer owner is
observed. That is why I reduced the spin threshold a bit.

-Longman


* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-08  5:39     ` Davidlohr Bueso
@ 2014-08-08 18:30       ` Waiman Long
  2014-08-08 19:03         ` Davidlohr Bueso
  2014-08-08 19:50       ` Jason Low
  1 sibling, 1 reply; 21+ messages in thread
From: Waiman Long @ 2014-08-08 18:30 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low,
	Scott J Norton, aswin

On 08/08/2014 01:39 AM, Davidlohr Bueso wrote:
> On Thu, 2014-08-07 at 17:45 -0700, Davidlohr Bueso wrote:
>> On Thu, 2014-08-07 at 18:26 -0400, Waiman Long wrote:
>>> On a highly contended rwsem, spinlock contention due to the slow
>>> rwsem_wake() call can be a significant portion of the total CPU cycles
>>> used. With writer lock stealing and writer optimistic spinning, there
>>> is also a pretty good chance that the lock may have been stolen
>>> before the waker wakes up the waiters. The woken tasks, if any,
>>> will have to go back to sleep again.
>> Good catch! And this applies to mutexes as well. How about something
>> like this:
>>
>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>> index dadbf88..e037588 100644
>> --- a/kernel/locking/mutex.c
>> +++ b/kernel/locking/mutex.c
>> @@ -707,6 +707,20 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
>>
>>   #endif
>>
>> +#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
> If DEBUG, we don't clear the owner when unlocking. This can just be
>
> +#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
>
>> +static inline bool mutex_has_owner(struct mutex *lock)
>> +{
>> +	struct task_struct *owner = ACCESS_ONCE(lock->owner);
>> +
>> +	return owner != NULL;
>> +}
>> +#else
>> +static inline bool mutex_has_owner(struct mutex *lock)
>> +{
>> +	return false;
>> +}
>> +#endif
>> +
>>   /*
>>    * Release the lock, slowpath:
>>    */
>> @@ -734,6 +748,15 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>>   	mutex_release(&lock->dep_map, nested, _RET_IP_);
>>   	debug_mutex_unlock(lock);
>>
>> +	/*
>> +	 * Abort the wakeup operation if there is an active writer as the
>> +	 * lock was stolen. mutex_unlock() should have cleared the owner field
>> +	 * before calling this function. If that field is now set, there must
>> +	 * be an active writer present.
>> +	 */
>> +	if (mutex_has_owner(lock))
>> +		goto done;
> Err so we actually deadlock here because we do the check with the
> lock->wait_lock held and at the same time another task comes into the
> slowpath of a mutex_lock() call which also tries to take the wait_lock.
> Ending up with hung tasks. Here's a more tested patch against
> peterz-queue, survives aim7 and kernel builds on a 80core box. Thanks.

I couldn't figure out why there would be hung tasks. The logic looks
OK to me.

>
> 8<---------------------------------------------------------------
> From: Davidlohr Bueso<davidlohr@hp.com>
> Subject: [PATCH] locking/mutex: Do not falsely wake-up tasks
>
> Mutexes lock-stealing functionality allows another task to
> skip its turn in the wait-queue and atomically acquire the lock.
> This is fine and a nice optimization, however, when releasing
> the mutex, we always wakeup the next task in FIFO order. When
> the lock has been stolen this leads to wasting waking up a
> task just to immediately realize it cannot acquire the lock
> and just go back to sleep. This is specially true on highly
> contended mutexes that stress the wait_lock.
>
> Signed-off-by: Davidlohr Bueso<davidlohr@hp.com>
> ---
>   kernel/locking/mutex.c | 32 +++++++++++++++++++++++++++++++-
>   1 file changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index dadbf88..52e1136 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -383,12 +383,26 @@ done:
>
>   	return false;
>   }
> +
> +static inline bool mutex_has_owner(struct mutex *lock)
> +{
> +	struct task_struct *owner = ACCESS_ONCE(lock->owner);
> +
> +	return owner != NULL;
> +}
> +
>   #else
> +
>   static bool mutex_optimistic_spin(struct mutex *lock,
>   				  struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
>   {
>   	return false;
>   }
> +
> +static inline bool mutex_has_owner(struct mutex *lock)
> +{
> +	return false;
> +}
>   #endif
>
>   __visible __used noinline
> @@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>   	if (__mutex_slowpath_needs_to_unlock())
>   		atomic_set(&lock->count, 1);
>
> +/*
> + * Skipping the mutex_has_owner() check when DEBUG, allows us to
> + * avoid taking the wait_lock in order to do not call mutex_release()
> + * and debug_mutex_unlock() when !DEBUG. This can otherwise result in
> + * deadlocks when another task enters the lock's slowpath in mutex_lock().
> + */
> +#ifndef CONFIG_DEBUG_MUTEXES
> +	/*
> +	 * Abort the wakeup operation if there is an another mutex owner, as the
> +	 * lock was stolen. mutex_unlock() should have cleared the owner field
> +	 * before calling this function. If that field is now set, another task
> +	 * must have acquired the mutex.
> +	 */
> +	if (mutex_has_owner(lock))
> +		return;
> +#endif
> +
>   	spin_lock_mutex(&lock->wait_lock, flags);
>   	mutex_release(&lock->dep_map, nested, _RET_IP_);
>   	debug_mutex_unlock(lock);
> @@ -744,7 +775,6 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>
>   		wake_up_process(waiter->task);
>   	}
> -
>   	spin_unlock_mutex(&lock->wait_lock, flags);
>   }
>

I have 2 issues with this. First of all, the timing window between
the atomic_set() and the mutex_has_owner() check is really small, so I
doubt it will be that effective. Secondly, I think you may need to call
mutex_release() and debug_mutex_unlock() to make the debugging code
work, but they seem to be called only under the wait_lock. So I think
there is more work that needs to be done before this patch is ready.

-Longman


* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-08 18:30       ` Waiman Long
@ 2014-08-08 19:03         ` Davidlohr Bueso
  2014-08-10 21:41           ` Waiman Long
  0 siblings, 1 reply; 21+ messages in thread
From: Davidlohr Bueso @ 2014-08-08 19:03 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low,
	Scott J Norton, aswin

On Fri, 2014-08-08 at 14:30 -0400, Waiman Long wrote:
> I have 2 issues about this. First of all, the timing windows between 
> atomic_set() and mutex_has_owner() check is really small, I doubt it 
> will be that effective. 

That is true, which is why I didn't bother showing any performance data
in the changelog. However, more important than any performance, avoiding
bogus wakeups is the _right_ thing to do when allowing lock stealing.

> Secondly, I think you may need to call 
> mutex_release() and debug_mutex_unlock() to make the debugging code 
> work, but they seems to be called only under the wait_lock. So I think 
> there is more work that need to be done before this patch is ready.

When !DEBUG both mutex_release() and debug_mutex_unlock() should be
no-ops. So this allows us to do the mutex_has_owner() check *without*
holding the wait_lock.

When DEBUG is set, we don't even bother calling mutex_has_owner(), so
nothing changes.

I don't understand your concern. 

Thanks,
Davidlohr



* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-08  5:39     ` Davidlohr Bueso
  2014-08-08 18:30       ` Waiman Long
@ 2014-08-08 19:50       ` Jason Low
  2014-08-08 20:21         ` Davidlohr Bueso
  1 sibling, 1 reply; 21+ messages in thread
From: Jason Low @ 2014-08-08 19:50 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Waiman Long, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Scott J Norton, aswin


>  __visible __used noinline
> @@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>  	if (__mutex_slowpath_needs_to_unlock())
>  		atomic_set(&lock->count, 1);
>  
> +/*
> + * Skipping the mutex_has_owner() check when DEBUG, allows us to
> + * avoid taking the wait_lock in order to do not call mutex_release()
> + * and debug_mutex_unlock() when !DEBUG. This can otherwise result in
> + * deadlocks when another task enters the lock's slowpath in mutex_lock().
> + */
> +#ifndef CONFIG_DEBUG_MUTEXES
> +	/*
> +	 * Abort the wakeup operation if there is an another mutex owner, as the
> +	 * lock was stolen. mutex_unlock() should have cleared the owner field
> +	 * before calling this function. If that field is now set, another task
> +	 * must have acquired the mutex.
> +	 */
> +	if (mutex_has_owner(lock))
> +		return;

Would we need the mutex lock count to eventually get set to a negative
value if there are waiters? An optimistic spinner can get the lock and
set lock->count to 0. Then the lock count might remain 0, since a
waiter might not get woken up here to try-lock and set lock->count to
-1 if it goes back to sleep in the lock path.




* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-08 19:50       ` Jason Low
@ 2014-08-08 20:21         ` Davidlohr Bueso
  2014-08-08 20:38           ` Jason Low
  0 siblings, 1 reply; 21+ messages in thread
From: Davidlohr Bueso @ 2014-08-08 20:21 UTC (permalink / raw)
  To: Jason Low
  Cc: Waiman Long, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Scott J Norton, aswin

On Fri, 2014-08-08 at 12:50 -0700, Jason Low wrote:
> >  __visible __used noinline
> > @@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
> >  	if (__mutex_slowpath_needs_to_unlock())
> >  		atomic_set(&lock->count, 1);
> >  
> > +/*
> > + * Skipping the mutex_has_owner() check when DEBUG, allows us to
> > + * avoid taking the wait_lock in order to do not call mutex_release()
> > + * and debug_mutex_unlock() when !DEBUG. This can otherwise result in
> > + * deadlocks when another task enters the lock's slowpath in mutex_lock().
> > + */
> > +#ifndef CONFIG_DEBUG_MUTEXES
> > +	/*
> > +	 * Abort the wakeup operation if there is an another mutex owner, as the
> > +	 * lock was stolen. mutex_unlock() should have cleared the owner field
> > +	 * before calling this function. If that field is now set, another task
> > +	 * must have acquired the mutex.
> > +	 */
> > +	if (mutex_has_owner(lock))
> > +		return;
> 
> Would we need the mutex lock count to eventually get set to a negative
> value if there are waiters? An optimistic spinner can get the lock and
> set lock->count to 0. Then the lock count might remain 0 since a waiter
> might not get waken up here to try-lock and set lock->count to -1 if it
> goes back to sleep in the lock path.

This is a good point, but I think we are safe because we do not rely on
strict dependence between the mutex counter and the wait list. So to
see if there are waiters to wake up, we do a !list_empty() check, but
to determine the lock state, we rely on the counter.

Thanks,
Davidlohr



* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-08 20:21         ` Davidlohr Bueso
@ 2014-08-08 20:38           ` Jason Low
  2014-08-10 21:44             ` Waiman Long
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Low @ 2014-08-08 20:38 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Waiman Long, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Scott J Norton, aswin

On Fri, 2014-08-08 at 13:21 -0700, Davidlohr Bueso wrote:
> On Fri, 2014-08-08 at 12:50 -0700, Jason Low wrote:
> > >  __visible __used noinline
> > > @@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
> > >  	if (__mutex_slowpath_needs_to_unlock())
> > >  		atomic_set(&lock->count, 1);
> > >  
> > > +/*
> > > + * Skipping the mutex_has_owner() check when DEBUG, allows us to
> > > + * avoid taking the wait_lock in order to do not call mutex_release()
> > > + * and debug_mutex_unlock() when !DEBUG. This can otherwise result in
> > > + * deadlocks when another task enters the lock's slowpath in mutex_lock().
> > > + */
> > > +#ifndef CONFIG_DEBUG_MUTEXES
> > > +	/*
> > > +	 * Abort the wakeup operation if there is an another mutex owner, as the
> > > +	 * lock was stolen. mutex_unlock() should have cleared the owner field
> > > +	 * before calling this function. If that field is now set, another task
> > > +	 * must have acquired the mutex.
> > > +	 */
> > > +	if (mutex_has_owner(lock))
> > > +		return;
> > 
> > Would we need the mutex lock count to eventually get set to a negative
> > value if there are waiters? An optimistic spinner can get the lock and
> > set lock->count to 0. Then the lock count might remain 0 since a waiter
> > might not get waken up here to try-lock and set lock->count to -1 if it
> > goes back to sleep in the lock path.
> 
> This is a good point, but I think we are safe because we do not rely on
> a strict dependence between the mutex counter and the wait list. So to see
> if there are waiters to wake up, we do a !list_empty() check, but to
> determine the lock state, we rely on the counter.

Right, though if an optimistic spinner gets the lock, it would set
lock->count to 0. After it is done with its critical section and calls
mutex_unlock(), it would skip the slowpath and not wake up the next
thread either, because it sees that lock->count is 0. In that case, the
sleeping waiter is never woken up, since the slowpath is never entered.
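
To make the sequence concrete, here is a sketch assuming the generic
atomic_inc_return()-based unlock fastpath and the usual 1/0/-1 counter
convention (simplified, with illustrative names; it refers to the
slowpath sketch above rather than the real kernel functions):

/*
 * Simplified model of the unlock fastpath: only a non-positive result
 * (i.e. the counter was negative) diverts into the wakeup slowpath.
 */
static inline void mutex_unlock_sketch(struct mutex *lock)
{
	if (unlikely(atomic_inc_return(&lock->count) <= 0))
		__mutex_unlock_slowpath_sketch(lock);
}

/*
 * Possible interleaving (illustrative):
 *
 *   1. A waiter sleeps on lock->wait_list; count == -1.
 *   2. The owner unlocks; the slowpath releases the count, but a
 *      spinner steals the lock (cmpxchg 1 -> 0) and the proposed
 *      mutex_has_owner() check makes the slowpath return without
 *      waking the waiter.
 *   3. The spinner unlocks: atomic_inc_return() takes count from 0 to
 *      1, the result is positive, the slowpath is skipped, and the
 *      sleeping waiter is never woken.
 */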



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-08 19:03         ` Davidlohr Bueso
@ 2014-08-10 21:41           ` Waiman Long
  2014-08-10 23:50             ` Davidlohr Bueso
  0 siblings, 1 reply; 21+ messages in thread
From: Waiman Long @ 2014-08-10 21:41 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low,
	Scott J Norton, aswin

On 08/08/2014 03:03 PM, Davidlohr Bueso wrote:
> On Fri, 2014-08-08 at 14:30 -0400, Waiman Long wrote:
>> I have two issues with this. First of all, the timing window between
>> the atomic_set() and the mutex_has_owner() check is really small, so I
>> doubt it will be that effective.
> That is true, which is why I didn't bother showing any performance data
> in the changelog. However, more important than any performance, avoiding
> bogus wakeups is the _right_ thing to do when allowing lock stealing.
>
>> Secondly, I think you may need to call
>> mutex_release() and debug_mutex_unlock() to make the debugging code
>> work, but they seem to be called only under the wait_lock. So I think
>> there is more work that needs to be done before this patch is ready.
> When !DEBUG both mutex_release() and debug_mutex_unlock() should be
> no-ops. So this allows us to do the mutex_has_owner() check *without*
> holding the wait_lock.
>
> When DEBUG is set, we don't even bother calling mutex_has_owner(), so
> nothing changes.
>
> I don't understand your concern.

It is true that I forgot that MUTEX_SPIN_ON_OWNER is disabled when
DEBUG_MUTEXES is on. However, mutex_release() is controlled by the
LOCKDEP config option, which is independent of DEBUG_MUTEXES. So it is
still a concern.
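
For context, that hook compiles away based on the lock-allocation
debugging option rather than on DEBUG_MUTEXES itself; roughly, in
3.16-era trees (paraphrased, not the verbatim include/linux/lockdep.h):

#ifdef CONFIG_DEBUG_LOCK_ALLOC
# define mutex_release(l, n, i)		lock_release(l, n, i)
#else
# define mutex_release(l, n, i)		do { } while (0)
#endif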

-Longman

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-08 20:38           ` Jason Low
@ 2014-08-10 21:44             ` Waiman Long
  0 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-10 21:44 UTC (permalink / raw)
  To: Jason Low
  Cc: Davidlohr Bueso, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Scott J Norton, aswin

On 08/08/2014 04:38 PM, Jason Low wrote:
> On Fri, 2014-08-08 at 13:21 -0700, Davidlohr Bueso wrote:
>> On Fri, 2014-08-08 at 12:50 -0700, Jason Low wrote:
>>>>   __visible __used noinline
>>>> @@ -730,6 +744,23 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>>>>   	if (__mutex_slowpath_needs_to_unlock())
>>>>   		atomic_set(&lock->count, 1);
>>>>
>>>> +/*
>>>> + * Skipping the mutex_has_owner() check when DEBUG allows us to
>>>> + * avoid taking the wait_lock, since mutex_release() and
>>>> + * debug_mutex_unlock() do not need to be called when !DEBUG. This
>>>> + * can otherwise result in
>>>> + * deadlocks when another task enters the lock's slowpath in mutex_lock().
>>>> + */
>>>> +#ifndef CONFIG_DEBUG_MUTEXES
>>>> +	/*
>>>> +	 * Abort the wakeup operation if there is another mutex owner, as the
>>>> +	 * lock was stolen. mutex_unlock() should have cleared the owner field
>>>> +	 * before calling this function. If that field is now set, another task
>>>> +	 * must have acquired the mutex.
>>>> +	 */
>>>> +	if (mutex_has_owner(lock))
>>>> +		return;
>>> Would we need the mutex lock count to eventually get set to a negative
>>> value if there are waiters? An optimistic spinner can get the lock and
>>> set lock->count to 0. Then the lock count might remain 0 since a waiter
>>> might not get woken up here to try-lock and set lock->count to -1 if it
>>> goes back to sleep in the lock path.
>> This is a good point, but I think we are safe because we do not rely on
>> a strict dependence between the mutex counter and the wait list. So to see
>> if there are waiters to wake up, we do a !list_empty() check, but to
>> determine the lock state, we rely on the counter.
> Right, though if an optimistic spinner gets the lock, it would set
> lock->count to 0. After it is done with its critical section and calls
> mutex_unlock(), it would skip the slowpath and not wake up the next
> thread either, because it sees that lock->count is 0. In that case, the
> sleeping waiter is never woken up, since the slowpath is never entered.
>
>

Actually, I am contemplating making similar changes for mutex. One code
change that I made is for the spinner to set the count value to either
0 or -1, depending on the result of list_empty(), so as to prevent the
missed-wakeup case.
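
A minimal sketch of that idea, using a hypothetical helper name and
without the extra care needed to make the list_empty() check race-free
(this is not the code that was actually posted):

static inline bool mutex_optimistic_trylock_sketch(struct mutex *lock)
{
	/*
	 * Record the presence of waiters in the counter when stealing
	 * the lock, so that a later unlock still drops into the wakeup
	 * slowpath. Checking wait_list without holding wait_lock is
	 * racy and would need handling in a real patch; this only
	 * illustrates the intent.
	 */
	int locked_val = list_empty(&lock->wait_list) ? 0 : -1;

	return atomic_read(&lock->count) == 1 &&
	       atomic_cmpxchg(&lock->count, 1, locked_val) == 1;
}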

-Longman

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-10 21:41           ` Waiman Long
@ 2014-08-10 23:50             ` Davidlohr Bueso
  2014-08-11 19:35               ` Waiman Long
  0 siblings, 1 reply; 21+ messages in thread
From: Davidlohr Bueso @ 2014-08-10 23:50 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low,
	Scott J Norton, aswin

On Sun, 2014-08-10 at 17:41 -0400, Waiman Long wrote:
> On 08/08/2014 03:03 PM, Davidlohr Bueso wrote:
> > On Fri, 2014-08-08 at 14:30 -0400, Waiman Long wrote:
> >> I have two issues with this. First of all, the timing window between
> >> the atomic_set() and the mutex_has_owner() check is really small, so I
> >> doubt it will be that effective.
> > That is true, which is why I didn't bother showing any performance data
> > in the changelog. However, more important than any performance, avoiding
> > bogus wakeups is the _right_ thing to do when allowing lock stealing.
> >
> >> Secondly, I think you may need to call
> >> mutex_release() and debug_mutex_unlock() to make the debugging code
> >> work, but they seem to be called only under the wait_lock. So I think
> >> there is more work that needs to be done before this patch is ready.
> > When !DEBUG both mutex_release() and debug_mutex_unlock() should be
> > no-ops. So this allows us to do the mutex_has_owner() check *without*
> > holding the wait_lock.
> >
> > When DEBUG is set, we don't even bother calling mutex_has_owner(), so
> > nothing changes.
> >
> > I don't understand your concern.
> 
> It is true that I forgot that MUTEX_SPIN_ON_OWNER is disabled when
> DEBUG_MUTEXES is on. However, mutex_release() is controlled by the
> LOCKDEP config option, which is independent of DEBUG_MUTEXES. So it is
> still a concern.

But afaict you cannot have LOCKDEP without enabling DEBUG_MUTEXES
(though not necessarily vice versa). Both are quite intertwined with
other debugging dependencies/options.
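
That matches the Kconfig wiring of that era, roughly (paraphrased, not
the verbatim lib/Kconfig.debug): the user-visible options that turn on
LOCKDEP also end up selecting DEBUG_MUTEXES.

config DEBUG_LOCK_ALLOC
	select DEBUG_MUTEXES
	select LOCKDEP

config PROVE_LOCKING
	select DEBUG_MUTEXES
	select DEBUG_LOCK_ALLOC
	select LOCKDEP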

Thanks,
Davidlohr


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup
  2014-08-10 23:50             ` Davidlohr Bueso
@ 2014-08-11 19:35               ` Waiman Long
  0 siblings, 0 replies; 21+ messages in thread
From: Waiman Long @ 2014-08-11 19:35 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Jason Low,
	Scott J Norton, aswin

On 08/10/2014 07:50 PM, Davidlohr Bueso wrote:
> On Sun, 2014-08-10 at 17:41 -0400, Waiman Long wrote:
>> On 08/08/2014 03:03 PM, Davidlohr Bueso wrote:
>>> On Fri, 2014-08-08 at 14:30 -0400, Waiman Long wrote:
>>>> I have two issues with this. First of all, the timing window between
>>>> the atomic_set() and the mutex_has_owner() check is really small, so I
>>>> doubt it will be that effective.
>>> That is true, which is why I didn't bother showing any performance data
>>> in the changelog. However, more important than any performance, avoiding
>>> bogus wakeups is the _right_ thing to do when allowing lock stealing.
>>>
>>>> Secondly, I think you may need to call
>>>> mutex_release() and debug_mutex_unlock() to make the debugging code
>>>> work, but they seem to be called only under the wait_lock. So I think
>>>> there is more work that needs to be done before this patch is ready.
>>> When !DEBUG both mutex_release() and debug_mutex_unlock() should be
>>> no-ops. So this allows us to do the mutex_has_owner() check *without*
>>> holding the wait_lock.
>>>
>>> When DEBUG is set, we don't even bother calling mutex_has_owner(), so
>>> nothing changes.
>>>
>>> I don't understand your concern.
>> It is true that I forgot that MUTEX_SPIN_ON_OWNER is disabled when
>> DEBUG_MUTEXES is on. However, mutex_release() is controlled by the
>> LOCKDEP config option, which is independent of DEBUG_MUTEXES. So it is
>> still a concern.
> But afaict you cannot have LOCKDEP without enabling DEBUG_MUTEXES
> (though not necessarily vice versa). Both are quite intertwined with
> other debugging dependencies/options.
>

I think you are right. This will require a comment in the code to avoid
this kind of confusion.

-Longman

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2014-08-11 19:35 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-07 22:26 [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Waiman Long
2014-08-07 22:26 ` [PATCH v2 1/7] locking/rwsem: check for active writer/spinner before wakeup Waiman Long
2014-08-08  0:45   ` Davidlohr Bueso
2014-08-08  5:39     ` Davidlohr Bueso
2014-08-08 18:30       ` Waiman Long
2014-08-08 19:03         ` Davidlohr Bueso
2014-08-10 21:41           ` Waiman Long
2014-08-10 23:50             ` Davidlohr Bueso
2014-08-11 19:35               ` Waiman Long
2014-08-08 19:50       ` Jason Low
2014-08-08 20:21         ` Davidlohr Bueso
2014-08-08 20:38           ` Jason Low
2014-08-10 21:44             ` Waiman Long
2014-08-07 22:26 ` [PATCH v2 2/7] locking/rwsem: threshold limited spinning for active readers Waiman Long
2014-08-07 22:26 ` [PATCH v2 3/7] locking/rwsem: rwsem_can_spin_on_owner can be called with preemption enabled Waiman Long
2014-08-07 22:26 ` [PATCH v2 4/7] locking/rwsem: more aggressive use of optimistic spinning Waiman Long
2014-08-07 22:26 ` [PATCH v2 5/7] locking/rwsem: move down rwsem_down_read_failed function Waiman Long
2014-08-07 22:26 ` [PATCH v2 6/7] locking/rwsem: enables optimistic spinning for readers Waiman Long
2014-08-07 22:26 ` [PATCH v2 7/7] locking/rwsem: allow waiting writers to go back to spinning Waiman Long
2014-08-07 23:52 ` [PATCH v2 0/7] locking/rwsem: enable reader opt-spinning & writer respin Davidlohr Bueso
2014-08-08 18:16   ` Waiman Long
