linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH-tip v2 0/6] locking/rwsem: Enable reader optimistic spinning
@ 2016-06-14 22:48 Waiman Long
  2016-06-14 22:48 ` [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier Waiman Long
                   ` (5 more replies)
  0 siblings, 6 replies; 44+ messages in thread
From: Waiman Long @ 2016-06-14 22:48 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, x86, linux-alpha, linux-ia64, linux-s390,
	linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch, Waiman Long

v1->v2:
 - Fixed a 0day build error.
 - Add a new patch 1 to make osq_lock() a proper acquire memory
   barrier.
 - Replaced the explicit enabling of reader spinning by an autotuning
   mechanism that disable reader spinning for those rwsems that may
   not benefit from reader spinning.
 - Remove the last xfs patch as it is no longer necessary.

This patchset enables more aggressive optimistic spinning on
both readers and writers waiting on a writer or reader owned
lock. Spinning on writer is done by looking at the on_cpu flag of the
lock owner. Spinning on readers, on the other hand, is count-based as
there is no easy way to figure out if all the readers are running. The
spinner will stop spinning once the count goes to 0. Because of that,
spinning on readers may hurt performance in some cases.

An autotuning mechanism is used to determine if a rwsem can benefit
from reader optimistic spinning.

Patch 1 updates the osq_lock() function to make it a proper acquire
memory barrier.

Patch 2 reduces the length of the blocking window after a read locking
attempt where writer lock stealing is disabled because of the active
read lock. It can improve rwsem performance for contended lock. It is
independent of the rest of the patchset.

Patch 3 puts in place the autotuning mechanism to check if reader
optimistic spinning should be used or not.

Patch 4 moves down the rwsem_down_read_failed() function for later
patches.

Patch 5 changes RWSEM_WAITING_BIAS to simpify reader trylock code.

Patch 6 enables readers to do optimistic spinning.

Waiman Long (6):
  locking/osq: Make lock/unlock proper acquire/release barrier
  locking/rwsem: Stop active read lock ASAP
  locking/rwsem: Enable count-based spinning on reader
  locking/rwsem: move down rwsem_down_read_failed function
  locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation
  locking/rwsem: Enable spinning readers

 arch/alpha/include/asm/rwsem.h |    7 +-
 arch/ia64/include/asm/rwsem.h  |    6 +-
 arch/s390/include/asm/rwsem.h  |    6 +-
 arch/x86/include/asm/rwsem.h   |   13 ++-
 include/asm-generic/rwsem.h    |    9 +-
 include/linux/rwsem.h          |   19 +++-
 kernel/locking/osq_lock.c      |    4 +-
 kernel/locking/rwsem-xadd.c    |  238 ++++++++++++++++++++++++++++------------
 8 files changed, 209 insertions(+), 93 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-14 22:48 [RFC PATCH-tip v2 0/6] locking/rwsem: Enable reader optimistic spinning Waiman Long
@ 2016-06-14 22:48 ` Waiman Long
  2016-06-15  8:04   ` Boqun Feng
  2016-06-15 16:56   ` Davidlohr Bueso
  2016-06-14 22:48 ` [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP Waiman Long
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 44+ messages in thread
From: Waiman Long @ 2016-06-14 22:48 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, x86, linux-alpha, linux-ia64, linux-s390,
	linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch, Waiman Long

The osq_lock() and osq_unlock() function may not provide the necessary
acquire and release barrier in some cases. This patch makes sure
that the proper barriers are provided when osq_lock() is successful
or when osq_unlock() is called.

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 kernel/locking/osq_lock.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..7dd4ee5 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * cmpxchg in an attempt to undo our queueing.
 	 */
 
-	while (!READ_ONCE(node->locked)) {
+	while (!smp_load_acquire(&node->locked)) {
 		/*
 		 * If we need to reschedule bail... so we can block.
 		 */
@@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
 	 * Second most likely case.
 	 */
 	node = this_cpu_ptr(&osq_node);
-	next = xchg(&node->next, NULL);
+	next = xchg_release(&node->next, NULL);
 	if (next) {
 		WRITE_ONCE(next->locked, 1);
 		return;
-- 
1.7.1

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP
  2016-06-14 22:48 [RFC PATCH-tip v2 0/6] locking/rwsem: Enable reader optimistic spinning Waiman Long
  2016-06-14 22:48 ` [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier Waiman Long
@ 2016-06-14 22:48 ` Waiman Long
  2016-06-15 17:22   ` Peter Zijlstra
  2016-06-14 22:48 ` [RFC PATCH-tip v2 3/6] locking/rwsem: Enable count-based spinning on reader Waiman Long
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-14 22:48 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, x86, linux-alpha, linux-ia64, linux-s390,
	linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch, Waiman Long

Currently, when down_read() fails, the active read locking isn't undone
until the rwsem_down_read_failed() function grabs the wait_lock. If the
wait_lock is contended, it may takes a while to get the lock. During
that period, writer lock stealing will be disabled because of the
active read lock.

This patch will release the active read lock ASAP so that writer lock
stealing can happen sooner. The only downside is when the reader is
the first one in the wait queue as it has to issue another atomic
operation to update the count.

On a 4-socket Haswell machine running on a 4.7-rc1 tip-based kernel,
the fio test with multithreaded randrw and randwrite tests on the
same file on a XFS partition on top of a NVDIMM with DAX were run,
the aggregated bandwidths before and after the patch were as follows:

  Test      BW before patch     BW after patch  % change
  ----      ---------------     --------------  --------
  randrw        1210 MB/s          1352 MB/s      +12%
  randwrite     1622 MB/s          1710 MB/s      +5.4%

The write-only microbench also showed improvement because some read
locking was done by the XFS code.

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 kernel/locking/rwsem-xadd.c |   19 ++++++++++++++-----
 1 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 2031281..29027c6 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -230,11 +230,18 @@ __rwsem_mark_wake(struct rw_semaphore *sem,
 __visible
 struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 {
-	long count, adjustment = -RWSEM_ACTIVE_READ_BIAS;
+	long count, adjustment = 0;
 	struct rwsem_waiter waiter;
 	struct task_struct *tsk = current;
 	WAKE_Q(wake_q);
 
+	/*
+	 * Undo read bias from down_read operation, stop active locking.
+	 * Doing that after taking the wait_lock may block writer lock
+	 * stealing for too long.
+	 */
+	atomic_long_add(-RWSEM_ACTIVE_READ_BIAS, &sem->count);
+
 	/* set up my own style of waitqueue */
 	waiter.task = tsk;
 	waiter.type = RWSEM_WAITING_FOR_READ;
@@ -244,8 +251,11 @@ struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 		adjustment += RWSEM_WAITING_BIAS;
 	list_add_tail(&waiter.list, &sem->wait_list);
 
-	/* we're now waiting on the lock, but no longer actively locking */
-	count = atomic_long_add_return(adjustment, &sem->count);
+	/* we're now waiting on the lock */
+	if (adjustment)
+		count = atomic_long_add_return(adjustment, &sem->count);
+	else
+		count = atomic_long_read(&sem->count);
 
 	/* If there are no active locks, wake the front queued process(es).
 	 *
@@ -253,8 +263,7 @@ struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
 	 * wake our own waiter to join the existing active readers !
 	 */
 	if (count == RWSEM_WAITING_BIAS ||
-	    (count > RWSEM_WAITING_BIAS &&
-	     adjustment != -RWSEM_ACTIVE_READ_BIAS))
+	    (count > RWSEM_WAITING_BIAS && adjustment))
 		sem = __rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q);
 
 	raw_spin_unlock_irq(&sem->wait_lock);
-- 
1.7.1

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [RFC PATCH-tip v2 3/6] locking/rwsem: Enable count-based spinning on reader
  2016-06-14 22:48 [RFC PATCH-tip v2 0/6] locking/rwsem: Enable reader optimistic spinning Waiman Long
  2016-06-14 22:48 ` [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier Waiman Long
  2016-06-14 22:48 ` [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP Waiman Long
@ 2016-06-14 22:48 ` Waiman Long
  2016-06-15 17:38   ` Peter Zijlstra
  2016-06-14 22:48 ` [RFC PATCH-tip v2 4/6] locking/rwsem: move down rwsem_down_read_failed function Waiman Long
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-14 22:48 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, x86, linux-alpha, linux-ia64, linux-s390,
	linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch, Waiman Long

When the rwsem is owned by reader, writers stop optimistic spinning
simply because there is no easy way to figure out if all the readers
are actively running or not. However, there are scenarios where
the readers are unlikely to sleep and optimistic spinning can help
performance.

This patch provides an autotuning mechanism to find out if a rwsem
can benefit from count-based reader optimistic spinning. A count
(rspin_enabled) in the rwsem data structure is used to track if
optimistic spinning should be enabled. Reader spinning is enabled by
default. Each successful spin (with lock acquisition) will increment
the count by 1 and each unsuccessful spin will decrement it by 8.
When the count reaches 0, reader spinning is permanently disabled.
Modification of that count is protected by the osq lock.

Both the spinning threshold and the default value for rspin_enabled
can be overridden by architecture specific rwsem.h header file.

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 include/linux/rwsem.h       |   19 ++++++++++++++-
 kernel/locking/rwsem-xadd.c |   54 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 63 insertions(+), 10 deletions(-)

diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
index dd1d142..466e744 100644
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -32,6 +32,8 @@ struct rw_semaphore {
 	raw_spinlock_t wait_lock;
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	struct optimistic_spin_queue osq; /* spinner MCS lock */
+	int rspin_enabled;	/* protected by osq lock */
+
 	/*
 	 * Write owner. Used as a speculative check to see
 	 * if the owner is running on the cpu.
@@ -69,8 +71,23 @@ static inline int rwsem_is_locked(struct rw_semaphore *sem)
 # define __RWSEM_DEP_MAP_INIT(lockname)
 #endif
 
+/*
+ * Each successful reader spin will increment the rspin_enabled by 1.
+ * Each unsuccessful spin, on the other hand, will decrement it by 8.
+ * Reader spinning will be permanently disabled when it reaches 0.
+ */
+#ifndef RWSEM_RSPIN_ENABLED_DEFAULT
+# define RWSEM_RSPIN_ENABLED_DEFAULT	40
+#endif
+#define RWSEM_RSPIN_ENABLED_MAX		1024
+
+#ifndef RWSEM_RSPIN_THRESHOLD
+# define RWSEM_RSPIN_THRESHOLD	(1 << 12)
+#endif
+
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
-#define __RWSEM_OPT_INIT(lockname) , .osq = OSQ_LOCK_UNLOCKED, .owner = NULL
+#define __RWSEM_OPT_INIT(lockname) , .osq = OSQ_LOCK_UNLOCKED, .owner = NULL, \
+		.rspin_enabled = RWSEM_RSPIN_ENABLED_DEFAULT
 #else
 #define __RWSEM_OPT_INIT(lockname)
 #endif
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 29027c6..adc2f44 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -85,6 +85,7 @@ void __init_rwsem(struct rw_semaphore *sem, const char *name,
 	INIT_LIST_HEAD(&sem->wait_list);
 #ifdef CONFIG_RWSEM_SPIN_ON_OWNER
 	sem->owner = NULL;
+	sem->rspin_enabled = RWSEM_RSPIN_ENABLED_DEFAULT;
 	osq_lock_init(&sem->osq);
 #endif
 }
@@ -347,9 +348,10 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 	owner = READ_ONCE(sem->owner);
 	if (!rwsem_owner_is_writer(owner)) {
 		/*
-		 * Don't spin if the rwsem is readers owned.
+		 * Don't spin if the rwsem is readers owned and the
+		 * reader spinning threshold isn't set.
 		 */
-		ret = !rwsem_owner_is_reader(owner);
+		ret = !rwsem_owner_is_reader(owner) || sem->rspin_enabled;
 		goto done;
 	}
 
@@ -398,7 +400,8 @@ out:
 
 static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 {
-	bool taken = false;
+	bool taken = false, can_spin;
+	int loopcnt;
 
 	preempt_disable();
 
@@ -409,6 +412,8 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 	if (!osq_lock(&sem->osq))
 		goto done;
 
+	loopcnt = sem->rspin_enabled ? RWSEM_RSPIN_THRESHOLD : 0;
+
 	/*
 	 * Optimistically spin on the owner field and attempt to acquire the
 	 * lock whenever the owner changes. Spinning will be stopped when:
@@ -416,7 +421,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 	 *  2) readers own the lock as we can't determine if they are
 	 *     actively running or not.
 	 */
-	while (rwsem_spin_on_owner(sem)) {
+	while ((can_spin = rwsem_spin_on_owner(sem)) || loopcnt) {
 		/*
 		 * Try to acquire the lock
 		 */
@@ -425,13 +430,16 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 			break;
 		}
 
+		if (!can_spin && loopcnt)
+			loopcnt--;
+
 		/*
-		 * When there's no owner, we might have preempted between the
-		 * owner acquiring the lock and setting the owner field. If
-		 * we're an RT task that will live-lock because we won't let
-		 * the owner complete.
+		 * The need_resched() check in rwsem_spin_on_owner() won't
+		 * break the loop anymore. So we need to check this in
+		 * the outer loop. If we're an RT task that will live-lock
+		 * because we won't let the owner complete.
 		 */
-		if (!sem->owner && (need_resched() || rt_task(current)))
+		if (need_resched() || rt_task(current))
 			break;
 
 		/*
@@ -442,6 +450,22 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 		 */
 		cpu_relax_lowlatency();
 	}
+	/*
+	 * Was owner a reader?
+	 */
+	if (rwsem_owner_is_reader(sem->owner)) {
+		/*
+		 * Update rspin_enabled for reader spinning
+		 * Increment by 1 if successfully & decrement by 8 if
+		 * unsuccessful. The decrement amount is kind of arbitrary
+		 * and can be adjusted if necessary.
+		 */
+		if (taken && (sem->rspin_enabled < RWSEM_RSPIN_ENABLED_MAX))
+			sem->rspin_enabled++;
+		else if (!taken)
+			sem->rspin_enabled = (sem->rspin_enabled >= 8)
+					   ? sem->rspin_enabled - 8 : 0;
+	}
 	osq_unlock(&sem->osq);
 done:
 	preempt_enable();
@@ -456,6 +480,13 @@ static inline bool rwsem_has_spinner(struct rw_semaphore *sem)
 	return osq_is_locked(&sem->osq);
 }
 
+/*
+ * Return true if reader optimistic spinning is enabled
+ */
+static inline bool reader_spinning_enabled(struct rw_semaphore *sem)
+{
+	return sem->rspin_enabled;
+}
 #else
 static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 {
@@ -466,6 +497,11 @@ static inline bool rwsem_has_spinner(struct rw_semaphore *sem)
 {
 	return false;
 }
+
+static inline bool reader_spinning_enabled(struct rw_semaphore *sem)
+{
+	return false;
+}
 #endif
 
 /*
-- 
1.7.1

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [RFC PATCH-tip v2 4/6] locking/rwsem: move down rwsem_down_read_failed function
  2016-06-14 22:48 [RFC PATCH-tip v2 0/6] locking/rwsem: Enable reader optimistic spinning Waiman Long
                   ` (2 preceding siblings ...)
  2016-06-14 22:48 ` [RFC PATCH-tip v2 3/6] locking/rwsem: Enable count-based spinning on reader Waiman Long
@ 2016-06-14 22:48 ` Waiman Long
  2016-06-15 17:40   ` Peter Zijlstra
  2016-06-14 22:48 ` [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation Waiman Long
  2016-06-14 22:48 ` [RFC PATCH-tip v2 6/6] locking/rwsem: Enable spinning readers Waiman Long
  5 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-14 22:48 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, x86, linux-alpha, linux-ia64, linux-s390,
	linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch, Waiman Long

Move the rwsem_down_read_failed() function down to below the optimistic
spinning section before enabling optimistic spinning for the readers.
There is no change in code.

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 kernel/locking/rwsem-xadd.c |  116 +++++++++++++++++++++---------------------
 1 files changed, 58 insertions(+), 58 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index adc2f44..c08c778 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -226,64 +226,6 @@ __rwsem_mark_wake(struct rw_semaphore *sem,
 }
 
 /*
- * Wait for the read lock to be granted
- */
-__visible
-struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
-{
-	long count, adjustment = 0;
-	struct rwsem_waiter waiter;
-	struct task_struct *tsk = current;
-	WAKE_Q(wake_q);
-
-	/*
-	 * Undo read bias from down_read operation, stop active locking.
-	 * Doing that after taking the wait_lock may block writer lock
-	 * stealing for too long.
-	 */
-	atomic_long_add(-RWSEM_ACTIVE_READ_BIAS, &sem->count);
-
-	/* set up my own style of waitqueue */
-	waiter.task = tsk;
-	waiter.type = RWSEM_WAITING_FOR_READ;
-
-	raw_spin_lock_irq(&sem->wait_lock);
-	if (list_empty(&sem->wait_list))
-		adjustment += RWSEM_WAITING_BIAS;
-	list_add_tail(&waiter.list, &sem->wait_list);
-
-	/* we're now waiting on the lock */
-	if (adjustment)
-		count = atomic_long_add_return(adjustment, &sem->count);
-	else
-		count = atomic_long_read(&sem->count);
-
-	/* If there are no active locks, wake the front queued process(es).
-	 *
-	 * If there are no writers and we are first in the queue,
-	 * wake our own waiter to join the existing active readers !
-	 */
-	if (count == RWSEM_WAITING_BIAS ||
-	    (count > RWSEM_WAITING_BIAS && adjustment))
-		sem = __rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q);
-
-	raw_spin_unlock_irq(&sem->wait_lock);
-	wake_up_q(&wake_q);
-
-	/* wait to be given the lock */
-	while (true) {
-		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
-		if (!waiter.task)
-			break;
-		schedule();
-	}
-
-	__set_task_state(tsk, TASK_RUNNING);
-	return sem;
-}
-EXPORT_SYMBOL(rwsem_down_read_failed);
-
-/*
  * This function must be called with the sem->wait_lock held to prevent
  * race conditions between checking the rwsem wait list and setting the
  * sem->count accordingly.
@@ -505,6 +447,64 @@ static inline bool reader_spinning_enabled(struct rw_semaphore *sem)
 #endif
 
 /*
+ * Wait for the read lock to be granted
+ */
+__visible
+struct rw_semaphore __sched * rwsem_down_read_failed(struct rw_semaphore *sem)
+{
+	long count, adjustment = 0;
+	struct rwsem_waiter waiter;
+	struct task_struct *tsk = current;
+	WAKE_Q(wake_q);
+
+	/*
+	 * Undo read bias from down_read operation, stop active locking.
+	 * Doing that after taking the wait_lock may block writer lock
+	 * stealing for too long.
+	 */
+	atomic_long_add(-RWSEM_ACTIVE_READ_BIAS, &sem->count);
+
+	/* set up my own style of waitqueue */
+	waiter.task = tsk;
+	waiter.type = RWSEM_WAITING_FOR_READ;
+
+	raw_spin_lock_irq(&sem->wait_lock);
+	if (list_empty(&sem->wait_list))
+		adjustment += RWSEM_WAITING_BIAS;
+	list_add_tail(&waiter.list, &sem->wait_list);
+
+	/* we're now waiting on the lock */
+	if (adjustment)
+		count = atomic_long_add_return(adjustment, &sem->count);
+	else
+		count = atomic_long_read(&sem->count);
+
+	/* If there are no active locks, wake the front queued process(es).
+	 *
+	 * If there are no writers and we are first in the queue,
+	 * wake our own waiter to join the existing active readers !
+	 */
+	if (count == RWSEM_WAITING_BIAS ||
+	    (count > RWSEM_WAITING_BIAS && adjustment))
+		sem = __rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q);
+
+	raw_spin_unlock_irq(&sem->wait_lock);
+	wake_up_q(&wake_q);
+
+	/* wait to be given the lock */
+	while (true) {
+		set_task_state(tsk, TASK_UNINTERRUPTIBLE);
+		if (!waiter.task)
+			break;
+		schedule();
+	}
+
+	__set_task_state(tsk, TASK_RUNNING);
+	return sem;
+}
+EXPORT_SYMBOL(rwsem_down_read_failed);
+
+/*
  * Wait until we successfully acquire the write lock
  */
 static inline struct rw_semaphore *
-- 
1.7.1

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation
  2016-06-14 22:48 [RFC PATCH-tip v2 0/6] locking/rwsem: Enable reader optimistic spinning Waiman Long
                   ` (3 preceding siblings ...)
  2016-06-14 22:48 ` [RFC PATCH-tip v2 4/6] locking/rwsem: move down rwsem_down_read_failed function Waiman Long
@ 2016-06-14 22:48 ` Waiman Long
  2016-06-15 17:43   ` Peter Zijlstra
  2016-06-15 17:45   ` Peter Zijlstra
  2016-06-14 22:48 ` [RFC PATCH-tip v2 6/6] locking/rwsem: Enable spinning readers Waiman Long
  5 siblings, 2 replies; 44+ messages in thread
From: Waiman Long @ 2016-06-14 22:48 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, x86, linux-alpha, linux-ia64, linux-s390,
	linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch, Waiman Long

When the count value is in between 0 and RWSEM_WAITING_BIAS, there
are 2 possibilities. Either a writer is present and there is no
waiter or there are waiters and readers. There is no easy way to
know which is true unless the wait_lock is taken.

This patch changes the RWSEM_WAITING_BIAS from 0xffff (32-bit) or
0xffffffff (64-bit) to 0xc0000000 (32-bit) or 0xc000000000000000
(64-bit). By doing so, we will be able to determine if writers
are present by looking at the count value alone without taking the
wait_lock.

This patch has the effect of halving the maximum number of writers
that can attempt to take the write lock simultaneously. However,
even the reduced maximum of about 16k (32-bit) or 1G (64-bit) should
be more than enough for the foreseeable future.

With that change, the following identity is now no longer true:

  RWSEM_ACTIVE_WRITE_BIAS = RWSEM_WAITING_BIAS + RWSEM_ACTIVE_READ_BIAS

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 arch/alpha/include/asm/rwsem.h |    7 ++++---
 arch/ia64/include/asm/rwsem.h  |    6 +++---
 arch/s390/include/asm/rwsem.h  |    6 +++---
 arch/x86/include/asm/rwsem.h   |   13 ++++++++-----
 include/asm-generic/rwsem.h    |    9 +++++----
 kernel/locking/rwsem-xadd.c    |   32 ++++++++++++++++++++++++--------
 6 files changed, 47 insertions(+), 26 deletions(-)

diff --git a/arch/alpha/include/asm/rwsem.h b/arch/alpha/include/asm/rwsem.h
index 77873d0..0f08fad 100644
--- a/arch/alpha/include/asm/rwsem.h
+++ b/arch/alpha/include/asm/rwsem.h
@@ -17,9 +17,9 @@
 #define RWSEM_UNLOCKED_VALUE		0x0000000000000000L
 #define RWSEM_ACTIVE_BIAS		0x0000000000000001L
 #define RWSEM_ACTIVE_MASK		0x00000000ffffffffL
-#define RWSEM_WAITING_BIAS		(-0x0000000100000000L)
+#define RWSEM_WAITING_BIAS		0xc000000000000000L
 #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)
 
 static inline void __down_read(struct rw_semaphore *sem)
 {
@@ -185,7 +185,8 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 	"2:	br	1b\n"
 	".previous"
 	:"=&r" (oldcount), "=m" (sem->count), "=&r" (temp)
-	:"Ir" (-RWSEM_WAITING_BIAS), "m" (sem->count) : "memory");
+	:"Ir" (-RWSEM_ACTIVE_WRITE_BIAS + RWSEM_ACTIVE_READ_BIAS),
+	 "m" (sem->count) : "memory");
 #endif
 	if (unlikely(oldcount < 0))
 		rwsem_downgrade_wake(sem);
diff --git a/arch/ia64/include/asm/rwsem.h b/arch/ia64/include/asm/rwsem.h
index 8fa98dd..db3693b 100644
--- a/arch/ia64/include/asm/rwsem.h
+++ b/arch/ia64/include/asm/rwsem.h
@@ -30,9 +30,9 @@
 #define RWSEM_UNLOCKED_VALUE		__IA64_UL_CONST(0x0000000000000000)
 #define RWSEM_ACTIVE_BIAS		(1L)
 #define RWSEM_ACTIVE_MASK		(0xffffffffL)
-#define RWSEM_WAITING_BIAS		(-0x100000000L)
+#define RWSEM_WAITING_BIAS		(-(1L << 62))
 #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)
 
 /*
  * lock for reading
@@ -144,7 +144,7 @@ __downgrade_write (struct rw_semaphore *sem)
 
 	do {
 		old = atomic_long_read(&sem->count);
-		new = old - RWSEM_WAITING_BIAS;
+		new = old - RWSEM_ACTIVE_WRITE_BIAS + RWSEM_ACTIVE_READ_BIAS;
 	} while (atomic_long_cmpxchg_release(&sem->count, old, new) != old);
 
 	if (old < 0)
diff --git a/arch/s390/include/asm/rwsem.h b/arch/s390/include/asm/rwsem.h
index 597e7e9..e7f50f5 100644
--- a/arch/s390/include/asm/rwsem.h
+++ b/arch/s390/include/asm/rwsem.h
@@ -42,9 +42,9 @@
 #define RWSEM_UNLOCKED_VALUE	0x0000000000000000L
 #define RWSEM_ACTIVE_BIAS	0x0000000000000001L
 #define RWSEM_ACTIVE_MASK	0x00000000ffffffffL
-#define RWSEM_WAITING_BIAS	(-0x0000000100000000L)
+#define RWSEM_WAITING_BIAS	0xc000000000000000L
 #define RWSEM_ACTIVE_READ_BIAS	RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS	(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+#define RWSEM_ACTIVE_WRITE_BIAS	(-RWSEM_ACTIVE_MASK)
 
 /*
  * lock for reading
@@ -193,7 +193,7 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 {
 	signed long old, new, tmp;
 
-	tmp = -RWSEM_WAITING_BIAS;
+	tmp = -RWSEM_ACTIVE_WRITE_BIAS + RWSEM_ACTIVE_READ_BIAS;
 	asm volatile(
 		"	lg	%0,%2\n"
 		"0:	lgr	%1,%0\n"
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index 089ced4..f1fb09f 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -41,21 +41,23 @@
 
 /*
  * The bias values and the counter type limits the number of
- * potential readers/writers to 32767 for 32 bits and 2147483647
- * for 64 bits.
+ * potential writers to 16383 for 32 bits and 1073741823 for 64 bits.
+ * The combined readers and writers can go up to 65534 for 32-bits and
+ * 4294967294 for 64-bits.
  */
 
 #ifdef CONFIG_X86_64
 # define RWSEM_ACTIVE_MASK		0xffffffffL
+# define RWSEM_WAITING_BIAS		(-(1L << 62))
 #else
 # define RWSEM_ACTIVE_MASK		0x0000ffffL
+# define RWSEM_WAITING_BIAS		(-(1L << 30))
 #endif
 
 #define RWSEM_UNLOCKED_VALUE		0x00000000L
 #define RWSEM_ACTIVE_BIAS		0x00000001L
-#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
 #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)
 
 /*
  * lock for reading
@@ -209,7 +211,8 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 		     "1:\n\t"
 		     "# ending __downgrade_write\n"
 		     : "+m" (sem->count)
-		     : "a" (sem), "er" (-RWSEM_WAITING_BIAS)
+		     : "a" (sem), "er" (-RWSEM_ACTIVE_WRITE_BIAS +
+					 RWSEM_ACTIVE_READ_BIAS)
 		     : "memory", "cc");
 }
 
diff --git a/include/asm-generic/rwsem.h b/include/asm-generic/rwsem.h
index 5be122e..655e405 100644
--- a/include/asm-generic/rwsem.h
+++ b/include/asm-generic/rwsem.h
@@ -18,15 +18,16 @@
  */
 #ifdef CONFIG_64BIT
 # define RWSEM_ACTIVE_MASK		0xffffffffL
+# define RWSEM_WAITING_BIAS		(-(1L << 62))
 #else
 # define RWSEM_ACTIVE_MASK		0x0000ffffL
+# define RWSEM_WAITING_BIAS		(-(1L << 30))
 #endif
 
 #define RWSEM_UNLOCKED_VALUE		0x00000000L
 #define RWSEM_ACTIVE_BIAS		0x00000001L
-#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
 #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)
 
 /*
  * lock for reading
@@ -120,8 +121,8 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 	 * read-locked region is ok to be re-ordered into the
 	 * write side. As such, rely on RELEASE semantics.
 	 */
-	tmp = atomic_long_add_return_release(-RWSEM_WAITING_BIAS,
-				     (atomic_long_t *)&sem->count);
+	tmp = atomic_long_add_return_release(-RWSEM_ACTIVE_WRITE_BIAS +
+			RWSEM_ACTIVE_READ_BIAS, (atomic_long_t *)&sem->count);
 	if (tmp < 0)
 		rwsem_downgrade_wake(sem);
 }
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index c08c778..b1e7923 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -29,28 +29,30 @@
  * 0x00000000	rwsem is unlocked, and no one is waiting for the lock or
  *		attempting to read lock or write lock.
  *
- * 0xffff000X	(1) X readers active or attempting lock, with waiters for lock
+ * 0xc000000X	(1) X readers active or attempting lock, with waiters for lock
  *		    X = #active readers + # readers attempting lock
  *		    (X*ACTIVE_BIAS + WAITING_BIAS)
- *		(2) 1 writer attempting lock, no waiters for lock
+ *
+ * 0xffff000X	(1) 1 writer attempting lock, no waiters for lock
  *		    X-1 = #active readers + #readers attempting lock
  *		    ((X-1)*ACTIVE_BIAS + ACTIVE_WRITE_BIAS)
- *		(3) 1 writer active, no waiters for lock
+ *		(2) 1 writer active, no waiters for lock
  *		    X-1 = #active readers + #readers attempting lock
  *		    ((X-1)*ACTIVE_BIAS + ACTIVE_WRITE_BIAS)
  *
- * 0xffff0001	(1) 1 reader active or attempting lock, waiters for lock
+ * 0xc0000001	(1) 1 reader active or attempting lock, waiters for lock
  *		    (WAITING_BIAS + ACTIVE_BIAS)
- *		(2) 1 writer active or attempting lock, no waiters for lock
+ *
+ * 0xffff0001	(1) 1 writer active or attempting lock, no waiters for lock
  *		    (ACTIVE_WRITE_BIAS)
  *
- * 0xffff0000	(1) There are writers or readers queued but none active
+ * 0xc0000000	(1) There are writers or readers queued but none active
  *		    or in the process of attempting lock.
  *		    (WAITING_BIAS)
  *		Note: writer can attempt to steal lock for this count by adding
  *		ACTIVE_WRITE_BIAS in cmpxchg and checking the old count
  *
- * 0xfffe0001	(1) 1 writer active, or attempting lock. Waiters on queue.
+ * 0xbfff0001	(1) 1 writer active, or attempting lock. Waiters on queue.
  *		    (ACTIVE_WRITE_BIAS + WAITING_BIAS)
  *
  * Note: Readers attempt to lock by adding ACTIVE_BIAS in down_read and checking
@@ -62,9 +64,23 @@
  *	 checking the count becomes ACTIVE_WRITE_BIAS for successful lock
  *	 acquisition (i.e. nobody else has lock or attempts lock).  If
  *	 unsuccessful, in rwsem_down_write_failed, we'll check to see if there
- *	 are only waiters but none active (5th case above), and attempt to
+ *	 are only waiters but none active (7th case above), and attempt to
  *	 steal the lock.
  *
+ *	 We can infer the reader/writer/waiter state of the lock by looking
+ *	 at the count value:
+ *	 (1) count > 0
+ *	     Only readers are present.
+ *	 (2) WAITING_BIAS - ACTIVE_WRITE_BIAS < count < 0
+ *	     Have writers, maybe readers, but no waiter
+ *	 (3) WAITING_BIAS < count <= WAITING_BIAS - ACTIVE_WRITE_BIAS
+ *	     Have readers and waiters, but no writer
+ *	 (4) count < WAITING_BIAS
+ *	     Have writers and waiters, maybe readers
+ *
+ *	 IOW, writers are present when
+ *	 (1) count < WAITING_BIAS, or
+ *	 (2) WAITING_BIAS - ACTIVE_WRITE_BIAS < count < 0
  */
 
 /*
-- 
1.7.1

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [RFC PATCH-tip v2 6/6] locking/rwsem: Enable spinning readers
  2016-06-14 22:48 [RFC PATCH-tip v2 0/6] locking/rwsem: Enable reader optimistic spinning Waiman Long
                   ` (4 preceding siblings ...)
  2016-06-14 22:48 ` [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation Waiman Long
@ 2016-06-14 22:48 ` Waiman Long
  5 siblings, 0 replies; 44+ messages in thread
From: Waiman Long @ 2016-06-14 22:48 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: linux-kernel, x86, linux-alpha, linux-ia64, linux-s390,
	linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch, Waiman Long

This patch enables readers to optimistically spin when the
rspin_threshold is non-zero. That threshold value should only
be set when the lock owners of the rwsem are unlikely to go to
sleep. Otherwise enabling reader spinning may make the performance
worse in some cases.

On a 4-socket Haswell machine running on a 4.7-rc1 tip-based kernel,
the fio test with multithreaded randrw and randwrite tests on the
same file on a XFS partition on top of a NVDIMM with DAX were run,
the aggregated bandwidths before and after the reader optimistic
spinning patchset were as follows:

  Test      BW before patch     BW after patch  % change
  ----      ---------------     --------------  --------
  randrw        1352 MB/s          2164 MB/s      +60%
  randwrite     1710 MB/s          2550 MB/s      +49%

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 kernel/locking/rwsem-xadd.c |   45 +++++++++++++++++++++++++++++++++++++-----
 1 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index b1e7923..f3848fa 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -83,6 +83,12 @@
  *	 (2) WAITING_BIAS - ACTIVE_WRITE_BIAS < count < 0
  */
 
+static inline bool count_has_writer(long count)
+{
+	return (count < RWSEM_WAITING_BIAS) || ((count < 0) &&
+	       (count > RWSEM_WAITING_BIAS - RWSEM_ACTIVE_WRITE_BIAS));
+}
+
 /*
  * Initialize an rwsem:
  */
@@ -294,6 +300,25 @@ static inline bool rwsem_try_write_lock_unqueued(struct rw_semaphore *sem)
 	}
 }
 
+/*
+ * Try to acquire read lock before the reader is put on wait queue
+ */
+static inline bool rwsem_try_read_lock_unqueued(struct rw_semaphore *sem)
+{
+	long count = atomic_long_read(&sem->count);
+
+	if (count_has_writer(count))
+		return false;
+	count = atomic_long_add_return_acquire(RWSEM_ACTIVE_READ_BIAS,
+					       &sem->count);
+	if (!count_has_writer(count))
+		return true;
+
+	/* Back out the change */
+	atomic_long_add(-RWSEM_ACTIVE_READ_BIAS, &sem->count);
+	return false;
+}
+
 static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 {
 	struct task_struct *owner;
@@ -356,7 +381,8 @@ out:
 	return !rwsem_owner_is_reader(READ_ONCE(sem->owner));
 }
 
-static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+static bool rwsem_optimistic_spin(struct rw_semaphore *sem,
+				  enum rwsem_waiter_type type)
 {
 	bool taken = false, can_spin;
 	int loopcnt;
@@ -383,10 +409,11 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 		/*
 		 * Try to acquire the lock
 		 */
-		if (rwsem_try_write_lock_unqueued(sem)) {
-			taken = true;
+		taken = (type == RWSEM_WAITING_FOR_WRITE)
+		      ? rwsem_try_write_lock_unqueued(sem)
+		      : rwsem_try_read_lock_unqueued(sem);
+		if (taken)
 			break;
-		}
 
 		if (!can_spin && loopcnt)
 			loopcnt--;
@@ -446,7 +473,8 @@ static inline bool reader_spinning_enabled(struct rw_semaphore *sem)
 	return sem->rspin_enabled;
 }
 #else
-static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
+static bool rwsem_optimistic_spin(struct rw_semaphore *sem,
+				  enum rwsem_waiter_type type)
 {
 	return false;
 }
@@ -480,6 +508,11 @@ struct rw_semaphore __sched * rwsem_down_read_failed(struct rw_semaphore *sem)
 	 */
 	atomic_long_add(-RWSEM_ACTIVE_READ_BIAS, &sem->count);
 
+	/* do optimistic spinning and steal lock if possible */
+	if (reader_spinning_enabled(sem) &&
+	    rwsem_optimistic_spin(sem, RWSEM_WAITING_FOR_READ))
+		return sem;
+
 	/* set up my own style of waitqueue */
 	waiter.task = tsk;
 	waiter.type = RWSEM_WAITING_FOR_READ;
@@ -536,7 +569,7 @@ __rwsem_down_write_failed_common(struct rw_semaphore *sem, int state)
 	count = atomic_long_sub_return(RWSEM_ACTIVE_WRITE_BIAS, &sem->count);
 
 	/* do optimistic spinning and steal lock if possible */
-	if (rwsem_optimistic_spin(sem))
+	if (rwsem_optimistic_spin(sem, RWSEM_WAITING_FOR_WRITE))
 		return sem;
 
 	/*
-- 
1.7.1

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-14 22:48 ` [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier Waiman Long
@ 2016-06-15  8:04   ` Boqun Feng
  2016-06-15 17:18     ` Peter Zijlstra
  2016-06-15 19:01     ` Waiman Long
  2016-06-15 16:56   ` Davidlohr Bueso
  1 sibling, 2 replies; 44+ messages in thread
From: Boqun Feng @ 2016-06-15  8:04 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Davidlohr Bueso, Jason Low,
	Dave Chinner, Scott J Norton, Douglas Hatch

[-- Attachment #1: Type: text/plain, Size: 1774 bytes --]

Hi Waiman,

On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
> The osq_lock() and osq_unlock() function may not provide the necessary
> acquire and release barrier in some cases. This patch makes sure
> that the proper barriers are provided when osq_lock() is successful
> or when osq_unlock() is called.
> 
> Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
> ---
>  kernel/locking/osq_lock.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index 05a3785..7dd4ee5 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>  	 * cmpxchg in an attempt to undo our queueing.
>  	 */
>  
> -	while (!READ_ONCE(node->locked)) {
> +	while (!smp_load_acquire(&node->locked)) {
>  		/*
>  		 * If we need to reschedule bail... so we can block.
>  		 */
> @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
>  	 * Second most likely case.
>  	 */
>  	node = this_cpu_ptr(&osq_node);
> -	next = xchg(&node->next, NULL);
> +	next = xchg_release(&node->next, NULL);
>  	if (next) {
>  		WRITE_ONCE(next->locked, 1);

So we still use WRITE_ONCE() rather than smp_store_release() here?

Though, IIUC, This is fine for all the archs but ARM64, because there
will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.

Not sure whether it's a problem on ARM64, but I think we certainly need
to add some comments here, if we count on this trick.

Am I missing something or misunderstanding you here?

Regards,
Boqun

>  		return;
> -- 
> 1.7.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-14 22:48 ` [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier Waiman Long
  2016-06-15  8:04   ` Boqun Feng
@ 2016-06-15 16:56   ` Davidlohr Bueso
  2016-06-15 17:12     ` Peter Zijlstra
  1 sibling, 1 reply; 44+ messages in thread
From: Davidlohr Bueso @ 2016-06-15 16:56 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Tue, 14 Jun 2016, Waiman Long wrote:

>The osq_lock() and osq_unlock() function may not provide the necessary
>acquire and release barrier in some cases. This patch makes sure
>that the proper barriers are provided when osq_lock() is successful
>or when osq_unlock() is called.
>
>Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
>---
> kernel/locking/osq_lock.c |    4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
>index 05a3785..7dd4ee5 100644
>--- a/kernel/locking/osq_lock.c
>+++ b/kernel/locking/osq_lock.c
>@@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> 	 * cmpxchg in an attempt to undo our queueing.
> 	 */
>
>-	while (!READ_ONCE(node->locked)) {
>+	while (!smp_load_acquire(&node->locked)) {

Hmm this being a polling path, that barrier can get pretty expensive and
last I checked it was unnecessary:

036cc30c6b6 (locking/osq: No need for load/acquire when acquire-polling)

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 16:56   ` Davidlohr Bueso
@ 2016-06-15 17:12     ` Peter Zijlstra
  2016-06-15 18:27       ` Davidlohr Bueso
  2016-06-15 19:08       ` Waiman Long
  0 siblings, 2 replies; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 17:12 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Waiman Long, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Wed, Jun 15, 2016 at 09:56:59AM -0700, Davidlohr Bueso wrote:
> On Tue, 14 Jun 2016, Waiman Long wrote:

> >+++ b/kernel/locking/osq_lock.c
> >@@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> >	 * cmpxchg in an attempt to undo our queueing.
> >	 */
> >
> >-	while (!READ_ONCE(node->locked)) {
> >+	while (!smp_load_acquire(&node->locked)) {
> 
> Hmm this being a polling path, that barrier can get pretty expensive and
> last I checked it was unnecessary:

I think he'll go rely on it later on.

In any case, its fairly simple to cure, just add
smp_acquire__after_ctrl_dep() at the end. If we bail because
need_resched() we don't need the acquire I think.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15  8:04   ` Boqun Feng
@ 2016-06-15 17:18     ` Peter Zijlstra
  2016-06-15 19:01     ` Waiman Long
  1 sibling, 0 replies; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 17:18 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Waiman Long, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Davidlohr Bueso, Jason Low,
	Dave Chinner, Scott J Norton, Douglas Hatch

On Wed, Jun 15, 2016 at 04:04:46PM +0800, Boqun Feng wrote:
> On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:

> > @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
> >  	 * Second most likely case.
> >  	 */
> >  	node = this_cpu_ptr(&osq_node);
> > -	next = xchg(&node->next, NULL);
> > +	next = xchg_release(&node->next, NULL);
> >  	if (next) {
> >  		WRITE_ONCE(next->locked, 1);
> 
> So we still use WRITE_ONCE() rather than smp_store_release() here?
> 
> Though, IIUC, This is fine for all the archs but ARM64, because there
> will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
> carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.

Not sure. On PPC for example, you'll use lwsync() but will that not
attach to the store to &node->next instead?

Still leaving that store and the WRITE_ONCE() unordered.

Also I don't see the control dependency between xchg-load and WRITE_ONCE
helping anything to order the two stores.


So yeah, subtle if not broken, definitely needs more explanation.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP
  2016-06-14 22:48 ` [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP Waiman Long
@ 2016-06-15 17:22   ` Peter Zijlstra
  2016-06-15 19:17     ` Waiman Long
  0 siblings, 1 reply; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 17:22 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Tue, Jun 14, 2016 at 06:48:05PM -0400, Waiman Long wrote:
> Currently, when down_read() fails, the active read locking isn't undone
> until the rwsem_down_read_failed() function grabs the wait_lock. If the
> wait_lock is contended, it may takes a while to get the lock. During
> that period, writer lock stealing will be disabled because of the
> active read lock.
> 
> This patch will release the active read lock ASAP so that writer lock
> stealing can happen sooner. The only downside is when the reader is
> the first one in the wait queue as it has to issue another atomic
> operation to update the count.
> 
> On a 4-socket Haswell machine running on a 4.7-rc1 tip-based kernel,
> the fio test with multithreaded randrw and randwrite tests on the
> same file on a XFS partition on top of a NVDIMM with DAX were run,
> the aggregated bandwidths before and after the patch were as follows:
> 
>   Test      BW before patch     BW after patch  % change
>   ----      ---------------     --------------  --------
>   randrw        1210 MB/s          1352 MB/s      +12%
>   randwrite     1622 MB/s          1710 MB/s      +5.4%
> 
> The write-only microbench also showed improvement because some read
> locking was done by the XFS code.

How does a reader only micro-bench react? I'm thinking the extra atomic
might hurt a bit.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 3/6] locking/rwsem: Enable count-based spinning on reader
  2016-06-14 22:48 ` [RFC PATCH-tip v2 3/6] locking/rwsem: Enable count-based spinning on reader Waiman Long
@ 2016-06-15 17:38   ` Peter Zijlstra
  2016-06-15 19:28     ` Waiman Long
  0 siblings, 1 reply; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 17:38 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Tue, Jun 14, 2016 at 06:48:06PM -0400, Waiman Long wrote:
>  static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>  {
> -	bool taken = false;
> +	bool taken = false, can_spin;

I would place the variables without assignment first.

> +	int loopcnt;
>  
>  	preempt_disable();
>  
> @@ -409,6 +412,8 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>  	if (!osq_lock(&sem->osq))
>  		goto done;
>  
> +	loopcnt = sem->rspin_enabled ? RWSEM_RSPIN_THRESHOLD : 0;
> +
>  	/*
>  	 * Optimistically spin on the owner field and attempt to acquire the
>  	 * lock whenever the owner changes. Spinning will be stopped when:
> @@ -416,7 +421,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>  	 *  2) readers own the lock as we can't determine if they are
>  	 *     actively running or not.
>  	 */
> -	while (rwsem_spin_on_owner(sem)) {
> +	while ((can_spin = rwsem_spin_on_owner(sem)) || loopcnt) {
>  		/*
>  		 * Try to acquire the lock
>  		 */
> @@ -425,13 +430,16 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>  			break;
>  		}
>  
> +		if (!can_spin && loopcnt)
> +			loopcnt--;

This seems to suggest 'can_spin' is a bad name, because if we cannot
spin, we do in fact spin anyway?

Maybe call it write_spin or something, which makes it clear that if its
not a write spin we'll do a read spin?

Also, isn't this the wrong level to do loopcnt at?
rwsem_spin_on_owner() can have spend any amount of cycles spinning. So
you're not counting loops of similar unit.

> +	/*
> +	 * Was owner a reader?
> +	 */
> +	if (rwsem_owner_is_reader(sem->owner)) {
> +		/*
> +		 * Update rspin_enabled for reader spinning

full stop and newline?

> +		 * Increment by 1 if successfully & decrement by 8 if
> +		 * unsuccessful.

This is bloody obvious from the code, explain why, not what the code
does.

>                                The decrement amount is kind of arbitrary
> +		 * and can be adjusted if necessary.
> +		 */
> +		if (taken && (sem->rspin_enabled < RWSEM_RSPIN_ENABLED_MAX))
> +			sem->rspin_enabled++;
> +		else if (!taken)
> +			sem->rspin_enabled = (sem->rspin_enabled >= 8)
> +					   ? sem->rspin_enabled - 8 : 0;

This is unreadable and against coding style.

> +	}
>  	osq_unlock(&sem->osq);
>  done:
>  	preempt_enable();

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 4/6] locking/rwsem: move down rwsem_down_read_failed function
  2016-06-14 22:48 ` [RFC PATCH-tip v2 4/6] locking/rwsem: move down rwsem_down_read_failed function Waiman Long
@ 2016-06-15 17:40   ` Peter Zijlstra
  2016-06-15 19:21     ` Waiman Long
  0 siblings, 1 reply; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 17:40 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Tue, Jun 14, 2016 at 06:48:07PM -0400, Waiman Long wrote:
> Move the rwsem_down_read_failed() function down to below the optimistic
> spinning section before enabling optimistic spinning for the readers.

newline

> There is no change in code.

Changelog fails to explain the why part.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation
  2016-06-14 22:48 ` [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation Waiman Long
@ 2016-06-15 17:43   ` Peter Zijlstra
  2016-06-15 19:31     ` Waiman Long
  2016-06-15 17:45   ` Peter Zijlstra
  1 sibling, 1 reply; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 17:43 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Tue, Jun 14, 2016 at 06:48:08PM -0400, Waiman Long wrote:
> even the reduced maximum of about 16k (32-bit) or 1G (64-bit) should
> be more than enough for the foreseeable future.

So what happens if I manage to create 16k+ threads on my 32bit kernel
and get them all to do mmap() or so at the same time.

That doesn't seem too far fetched.

Then again, with double that (the current limit) that doesn't seem
impossible either.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation
  2016-06-14 22:48 ` [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation Waiman Long
  2016-06-15 17:43   ` Peter Zijlstra
@ 2016-06-15 17:45   ` Peter Zijlstra
  2016-06-15 19:35     ` Waiman Long
  1 sibling, 1 reply; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 17:45 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Tue, Jun 14, 2016 at 06:48:08PM -0400, Waiman Long wrote:
> +++ b/arch/alpha/include/asm/rwsem.h
> @@ -17,9 +17,9 @@
>  #define RWSEM_UNLOCKED_VALUE		0x0000000000000000L
>  #define RWSEM_ACTIVE_BIAS		0x0000000000000001L
>  #define RWSEM_ACTIVE_MASK		0x00000000ffffffffL
> -#define RWSEM_WAITING_BIAS		(-0x0000000100000000L)
> +#define RWSEM_WAITING_BIAS		0xc000000000000000L
>  #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
> -#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
> +#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)

> +++ b/arch/ia64/include/asm/rwsem.h
> @@ -30,9 +30,9 @@
>  #define RWSEM_UNLOCKED_VALUE		__IA64_UL_CONST(0x0000000000000000)
>  #define RWSEM_ACTIVE_BIAS		(1L)
>  #define RWSEM_ACTIVE_MASK		(0xffffffffL)
> -#define RWSEM_WAITING_BIAS		(-0x100000000L)
> +#define RWSEM_WAITING_BIAS		(-(1L << 62))
>  #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
> -#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
> +#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)

> +++ b/arch/s390/include/asm/rwsem.h
> @@ -42,9 +42,9 @@
>  #define RWSEM_UNLOCKED_VALUE	0x0000000000000000L
>  #define RWSEM_ACTIVE_BIAS	0x0000000000000001L
>  #define RWSEM_ACTIVE_MASK	0x00000000ffffffffL
> -#define RWSEM_WAITING_BIAS	(-0x0000000100000000L)
> +#define RWSEM_WAITING_BIAS	0xc000000000000000L
>  #define RWSEM_ACTIVE_READ_BIAS	RWSEM_ACTIVE_BIAS
> -#define RWSEM_ACTIVE_WRITE_BIAS	(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
> +#define RWSEM_ACTIVE_WRITE_BIAS	(-RWSEM_ACTIVE_MASK)

> +++ b/arch/x86/include/asm/rwsem.h
> @@ -41,21 +41,23 @@
>  
>  /*
>   * The bias values and the counter type limits the number of
> - * potential readers/writers to 32767 for 32 bits and 2147483647
> - * for 64 bits.
> + * potential writers to 16383 for 32 bits and 1073741823 for 64 bits.
> + * The combined readers and writers can go up to 65534 for 32-bits and
> + * 4294967294 for 64-bits.
>   */
>  
>  #ifdef CONFIG_X86_64
>  # define RWSEM_ACTIVE_MASK		0xffffffffL
> +# define RWSEM_WAITING_BIAS		(-(1L << 62))
>  #else
>  # define RWSEM_ACTIVE_MASK		0x0000ffffL
> +# define RWSEM_WAITING_BIAS		(-(1L << 30))
>  #endif
>  
>  #define RWSEM_UNLOCKED_VALUE		0x00000000L
>  #define RWSEM_ACTIVE_BIAS		0x00000001L
> -#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
>  #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
> -#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
> +#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)

> +++ b/include/asm-generic/rwsem.h
> @@ -18,15 +18,16 @@
>   */
>  #ifdef CONFIG_64BIT
>  # define RWSEM_ACTIVE_MASK		0xffffffffL
> +# define RWSEM_WAITING_BIAS		(-(1L << 62))
>  #else
>  # define RWSEM_ACTIVE_MASK		0x0000ffffL
> +# define RWSEM_WAITING_BIAS		(-(1L << 30))
>  #endif
>  
>  #define RWSEM_UNLOCKED_VALUE		0x00000000L
>  #define RWSEM_ACTIVE_BIAS		0x00000001L
> -#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
>  #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
> -#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
> +#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)

Can't we collapse all that? They all seem very similar.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 17:12     ` Peter Zijlstra
@ 2016-06-15 18:27       ` Davidlohr Bueso
  2016-06-15 18:40         ` Peter Zijlstra
  2016-06-15 19:08       ` Waiman Long
  1 sibling, 1 reply; 44+ messages in thread
From: Davidlohr Bueso @ 2016-06-15 18:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Wed, 15 Jun 2016, Peter Zijlstra wrote:

>In any case, its fairly simple to cure, just add
>smp_acquire__after_ctrl_dep() at the end. If we bail because
>need_resched() we don't need the acquire I think.

I was just considering this for your smp_cond_acquire/smp_cond_load_acquire
rework, so yeah I guess an smp_acquire__after_ctrl_dep would be a nice
compromise. 

However, I was always under the impression that races with node->locked were
rather harmless (as indicated in the mentioned commit) -- which is why ->locked
are simple load/stores, with the exception of the unqueueing -- but yeah, that's
not even paired.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 18:27       ` Davidlohr Bueso
@ 2016-06-15 18:40         ` Peter Zijlstra
  2016-06-15 18:56           ` Davidlohr Bueso
  2016-06-17  1:11           ` Davidlohr Bueso
  0 siblings, 2 replies; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 18:40 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Waiman Long, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Wed, Jun 15, 2016 at 11:27:24AM -0700, Davidlohr Bueso wrote:
> On Wed, 15 Jun 2016, Peter Zijlstra wrote:
> 
> >In any case, its fairly simple to cure, just add
> >smp_acquire__after_ctrl_dep() at the end. If we bail because
> >need_resched() we don't need the acquire I think.
> 
> I was just considering this for your smp_cond_acquire/smp_cond_load_acquire

Right, so that need_resched break makes that a bit awkward. Not to
mention the cpu_relaxed() vs cpu_relaxed_lowlatency() difference.

> rework, so yeah I guess an smp_acquire__after_ctrl_dep would be a nice
> compromise.
> 
> However, I was always under the impression that races with node->locked were
> rather harmless (as indicated in the mentioned commit) -- which is why ->locked
> are simple load/stores, with the exception of the unqueueing -- but yeah, that's
> not even paired.

Yeah, see a few patches further in this series, where he guards a
variables with the osq_lock.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 18:40         ` Peter Zijlstra
@ 2016-06-15 18:56           ` Davidlohr Bueso
  2016-06-17  1:11           ` Davidlohr Bueso
  1 sibling, 0 replies; 44+ messages in thread
From: Davidlohr Bueso @ 2016-06-15 18:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Wed, 15 Jun 2016, Peter Zijlstra wrote:

>On Wed, Jun 15, 2016 at 11:27:24AM -0700, Davidlohr Bueso wrote:
>> On Wed, 15 Jun 2016, Peter Zijlstra wrote:
>>
>> >In any case, its fairly simple to cure, just add
>> >smp_acquire__after_ctrl_dep() at the end. If we bail because
>> >need_resched() we don't need the acquire I think.
>>
>> I was just considering this for your smp_cond_acquire/smp_cond_load_acquire
>
>Right, so that need_resched break makes that a bit awkward. Not to
>mention the cpu_relaxed() vs cpu_relaxed_lowlatency() difference.

Oh sure, I was merely refering to the ordering semantics, not the calls
themselves -- although at some point, as archs begin to port locking/core
optimizations, we _will_ need the variants for dealing with '_lowlatency'.

>
>> rework, so yeah I guess an smp_acquire__after_ctrl_dep would be a nice
>> compromise.
>>
>> However, I was always under the impression that races with node->locked were
>> rather harmless (as indicated in the mentioned commit) -- which is why ->locked
>> are simple load/stores, with the exception of the unqueueing -- but yeah, that's
>> not even paired.
>
>Yeah, see a few patches further in this series, where he guards a
>variables with the osq_lock.

*sigh*

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15  8:04   ` Boqun Feng
  2016-06-15 17:18     ` Peter Zijlstra
@ 2016-06-15 19:01     ` Waiman Long
  2016-06-16  2:19       ` Boqun Feng
  1 sibling, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-15 19:01 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Davidlohr Bueso, Jason Low,
	Dave Chinner, Scott J Norton, Douglas Hatch

On 06/15/2016 04:04 AM, Boqun Feng wrote:
> Hi Waiman,
>
> On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
>> The osq_lock() and osq_unlock() function may not provide the necessary
>> acquire and release barrier in some cases. This patch makes sure
>> that the proper barriers are provided when osq_lock() is successful
>> or when osq_unlock() is called.
>>
>> Signed-off-by: Waiman Long<Waiman.Long@hpe.com>
>> ---
>>   kernel/locking/osq_lock.c |    4 ++--
>>   1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
>> index 05a3785..7dd4ee5 100644
>> --- a/kernel/locking/osq_lock.c
>> +++ b/kernel/locking/osq_lock.c
>> @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>   	 * cmpxchg in an attempt to undo our queueing.
>>   	 */
>>
>> -	while (!READ_ONCE(node->locked)) {
>> +	while (!smp_load_acquire(&node->locked)) {
>>   		/*
>>   		 * If we need to reschedule bail... so we can block.
>>   		 */
>> @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
>>   	 * Second most likely case.
>>   	 */
>>   	node = this_cpu_ptr(&osq_node);
>> -	next = xchg(&node->next, NULL);
>> +	next = xchg_release(&node->next, NULL);
>>   	if (next) {
>>   		WRITE_ONCE(next->locked, 1);
> So we still use WRITE_ONCE() rather than smp_store_release() here?
>
> Though, IIUC, This is fine for all the archs but ARM64, because there
> will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
> carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.
>
> Not sure whether it's a problem on ARM64, but I think we certainly need
> to add some comments here, if we count on this trick.
>
> Am I missing something or misunderstanding you here?
>
> Regards,
> Boqun

The change on the unlock side is more for documentation purpose than is 
actually needed. As you had said, the xchg() call has provided the 
necessary memory barrier. Using the _release variant, however, may have 
some performance benefit in some architectures.

BTW, osq_lock/osq_unlock aren't general purpose locking primitives. So 
there is some leeways on how fancy we want on the lock and unlock sides.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 17:12     ` Peter Zijlstra
  2016-06-15 18:27       ` Davidlohr Bueso
@ 2016-06-15 19:08       ` Waiman Long
  2016-06-15 20:04         ` Waiman Long
  1 sibling, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-15 19:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Davidlohr Bueso, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/15/2016 01:12 PM, Peter Zijlstra wrote:
> On Wed, Jun 15, 2016 at 09:56:59AM -0700, Davidlohr Bueso wrote:
>> On Tue, 14 Jun 2016, Waiman Long wrote:
>>> +++ b/kernel/locking/osq_lock.c
>>> @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>> 	 * cmpxchg in an attempt to undo our queueing.
>>> 	 */
>>>
>>> -	while (!READ_ONCE(node->locked)) {
>>> +	while (!smp_load_acquire(&node->locked)) {
>> Hmm this being a polling path, that barrier can get pretty expensive and
>> last I checked it was unnecessary:
> I think he'll go rely on it later on.
>
> In any case, its fairly simple to cure, just add
> smp_acquire__after_ctrl_dep() at the end. If we bail because
> need_resched() we don't need the acquire I think.

Yes, I only need the acquire barrier when the locking is successful. 
Thanks for the suggestion. I will make the change accordingly.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP
  2016-06-15 17:22   ` Peter Zijlstra
@ 2016-06-15 19:17     ` Waiman Long
  2016-06-16  2:14       ` Davidlohr Bueso
  0 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-15 19:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/15/2016 01:22 PM, Peter Zijlstra wrote:
> On Tue, Jun 14, 2016 at 06:48:05PM -0400, Waiman Long wrote:
>> Currently, when down_read() fails, the active read locking isn't undone
>> until the rwsem_down_read_failed() function grabs the wait_lock. If the
>> wait_lock is contended, it may takes a while to get the lock. During
>> that period, writer lock stealing will be disabled because of the
>> active read lock.
>>
>> This patch will release the active read lock ASAP so that writer lock
>> stealing can happen sooner. The only downside is when the reader is
>> the first one in the wait queue as it has to issue another atomic
>> operation to update the count.
>>
>> On a 4-socket Haswell machine running on a 4.7-rc1 tip-based kernel,
>> the fio test with multithreaded randrw and randwrite tests on the
>> same file on a XFS partition on top of a NVDIMM with DAX were run,
>> the aggregated bandwidths before and after the patch were as follows:
>>
>>    Test      BW before patch     BW after patch  % change
>>    ----      ---------------     --------------  --------
>>    randrw        1210 MB/s          1352 MB/s      +12%
>>    randwrite     1622 MB/s          1710 MB/s      +5.4%
>>
>> The write-only microbench also showed improvement because some read
>> locking was done by the XFS code.
> How does a reader only micro-bench react? I'm thinking the extra atomic
> might hurt a bit.
>

A reader only benchmark will not go into the slow path at all. It is 
only when there is a mix of readers and writers will the reader slowpath 
be executed.

I think there will be a little bit of performance impact for a workload 
that produce just the right amount of rwsem contentions. However, it is 
hard to produce a microbenchmark to create such a right amount of 
contention. As the amount of contention increases, I believe this patch 
will help performance instead of hurting it. Even then, the amount of 
performance degradation in that particular case will be pretty small.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 4/6] locking/rwsem: move down rwsem_down_read_failed function
  2016-06-15 17:40   ` Peter Zijlstra
@ 2016-06-15 19:21     ` Waiman Long
  0 siblings, 0 replies; 44+ messages in thread
From: Waiman Long @ 2016-06-15 19:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/15/2016 01:40 PM, Peter Zijlstra wrote:
> On Tue, Jun 14, 2016 at 06:48:07PM -0400, Waiman Long wrote:
>> Move the rwsem_down_read_failed() function down to below the optimistic
>> spinning section before enabling optimistic spinning for the readers.
> newline
>
>> There is no change in code.
> Changelog fails to explain the why part.

Thanks for pointing this out. I will update the changelog to mention 
that the movement is needed because it is going to call 
rwsem_optimistic_spin() in later patch.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 3/6] locking/rwsem: Enable count-based spinning on reader
  2016-06-15 17:38   ` Peter Zijlstra
@ 2016-06-15 19:28     ` Waiman Long
  0 siblings, 0 replies; 44+ messages in thread
From: Waiman Long @ 2016-06-15 19:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/15/2016 01:38 PM, Peter Zijlstra wrote:
> On Tue, Jun 14, 2016 at 06:48:06PM -0400, Waiman Long wrote:
>>   static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>>   {
>> -	bool taken = false;
>> +	bool taken = false, can_spin;
> I would place the variables without assignment first.

Sure, easy change.

>
>> +	int loopcnt;
>>
>>   	preempt_disable();
>>
>> @@ -409,6 +412,8 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>>   	if (!osq_lock(&sem->osq))
>>   		goto done;
>>
>> +	loopcnt = sem->rspin_enabled ? RWSEM_RSPIN_THRESHOLD : 0;
>> +
>>   	/*
>>   	 * Optimistically spin on the owner field and attempt to acquire the
>>   	 * lock whenever the owner changes. Spinning will be stopped when:
>> @@ -416,7 +421,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>>   	 *  2) readers own the lock as we can't determine if they are
>>   	 *     actively running or not.
>>   	 */
>> -	while (rwsem_spin_on_owner(sem)) {
>> +	while ((can_spin = rwsem_spin_on_owner(sem)) || loopcnt) {
>>   		/*
>>   		 * Try to acquire the lock
>>   		 */
>> @@ -425,13 +430,16 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
>>   			break;
>>   		}
>>
>> +		if (!can_spin&&  loopcnt)
>> +			loopcnt--;
> This seems to suggest 'can_spin' is a bad name, because if we cannot
> spin, we do in fact spin anyway?
>
> Maybe call it write_spin or something, which makes it clear that if its
> not a write spin we'll do a read spin?

I am fine with the write_spin name.

>
> Also, isn't this the wrong level to do loopcnt at?
> rwsem_spin_on_owner() can have spend any amount of cycles spinning. So
> you're not counting loops of similar unit.

The loopcnt does not include time spinning on writer. It will be 
decremented only if the lock is owned by reader (can_spin == false). I 
will clarify that with additional comments.

>> +	/*
>> +	 * Was owner a reader?
>> +	 */
>> +	if (rwsem_owner_is_reader(sem->owner)) {
>> +		/*
>> +		 * Update rspin_enabled for reader spinning
> full stop and newline?

Sure.

>
>> +		 * Increment by 1 if successfully&  decrement by 8 if
>> +		 * unsuccessful.
> This is bloody obvious from the code, explain why, not what the code
> does.

Will clarify the comment.

>>                                 The decrement amount is kind of arbitrary
>> +		 * and can be adjusted if necessary.
>> +		 */
>> +		if (taken&&  (sem->rspin_enabled<  RWSEM_RSPIN_ENABLED_MAX))
>> +			sem->rspin_enabled++;
>> +		else if (!taken)
>> +			sem->rspin_enabled = (sem->rspin_enabled>= 8)
>> +					   ? sem->rspin_enabled - 8 : 0;
> This is unreadable and against coding style.

I will fix the coding style problem.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation
  2016-06-15 17:43   ` Peter Zijlstra
@ 2016-06-15 19:31     ` Waiman Long
  2016-06-15 21:57       ` Peter Zijlstra
  0 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-15 19:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/15/2016 01:43 PM, Peter Zijlstra wrote:
> On Tue, Jun 14, 2016 at 06:48:08PM -0400, Waiman Long wrote:
>> even the reduced maximum of about 16k (32-bit) or 1G (64-bit) should
>> be more than enough for the foreseeable future.
> So what happens if I manage to create 16k+ threads on my 32bit kernel
> and get them all to do mmap() or so at the same time.
>
> That doesn't seem too far fetched.
>
> Then again, with double that (the current limit) that doesn't seem
> impossible either.

To hit the limit, we need to have all the threads calling down_write() 
at exactly the same instance in time which, I think, is pretty hard to 
do. Also, I don't believe you will ever see a 16k-cpu massive SMP system 
running on 32-bit kernel. I can imagine such a system running on 64-bit 
kernel, but certainly not 32-bit.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation
  2016-06-15 17:45   ` Peter Zijlstra
@ 2016-06-15 19:35     ` Waiman Long
  0 siblings, 0 replies; 44+ messages in thread
From: Waiman Long @ 2016-06-15 19:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/15/2016 01:45 PM, Peter Zijlstra wrote:
> On Tue, Jun 14, 2016 at 06:48:08PM -0400, Waiman Long wrote:
>> +++ b/arch/alpha/include/asm/rwsem.h
>> @@ -17,9 +17,9 @@
>>   #define RWSEM_UNLOCKED_VALUE		0x0000000000000000L
>>   #define RWSEM_ACTIVE_BIAS		0x0000000000000001L
>>   #define RWSEM_ACTIVE_MASK		0x00000000ffffffffL
>> -#define RWSEM_WAITING_BIAS		(-0x0000000100000000L)
>> +#define RWSEM_WAITING_BIAS		0xc000000000000000L
>>   #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
>> -#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
>> +#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)
>> +++ b/arch/ia64/include/asm/rwsem.h
>> @@ -30,9 +30,9 @@
>>   #define RWSEM_UNLOCKED_VALUE		__IA64_UL_CONST(0x0000000000000000)
>>   #define RWSEM_ACTIVE_BIAS		(1L)
>>   #define RWSEM_ACTIVE_MASK		(0xffffffffL)
>> -#define RWSEM_WAITING_BIAS		(-0x100000000L)
>> +#define RWSEM_WAITING_BIAS		(-(1L<<  62))
>>   #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
>> -#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
>> +#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)
>> +++ b/arch/s390/include/asm/rwsem.h
>> @@ -42,9 +42,9 @@
>>   #define RWSEM_UNLOCKED_VALUE	0x0000000000000000L
>>   #define RWSEM_ACTIVE_BIAS	0x0000000000000001L
>>   #define RWSEM_ACTIVE_MASK	0x00000000ffffffffL
>> -#define RWSEM_WAITING_BIAS	(-0x0000000100000000L)
>> +#define RWSEM_WAITING_BIAS	0xc000000000000000L
>>   #define RWSEM_ACTIVE_READ_BIAS	RWSEM_ACTIVE_BIAS
>> -#define RWSEM_ACTIVE_WRITE_BIAS	(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
>> +#define RWSEM_ACTIVE_WRITE_BIAS	(-RWSEM_ACTIVE_MASK)
>> +++ b/arch/x86/include/asm/rwsem.h
>> @@ -41,21 +41,23 @@
>>
>>   /*
>>    * The bias values and the counter type limits the number of
>> - * potential readers/writers to 32767 for 32 bits and 2147483647
>> - * for 64 bits.
>> + * potential writers to 16383 for 32 bits and 1073741823 for 64 bits.
>> + * The combined readers and writers can go up to 65534 for 32-bits and
>> + * 4294967294 for 64-bits.
>>    */
>>
>>   #ifdef CONFIG_X86_64
>>   # define RWSEM_ACTIVE_MASK		0xffffffffL
>> +# define RWSEM_WAITING_BIAS		(-(1L<<  62))
>>   #else
>>   # define RWSEM_ACTIVE_MASK		0x0000ffffL
>> +# define RWSEM_WAITING_BIAS		(-(1L<<  30))
>>   #endif
>>
>>   #define RWSEM_UNLOCKED_VALUE		0x00000000L
>>   #define RWSEM_ACTIVE_BIAS		0x00000001L
>> -#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
>>   #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
>> -#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
>> +#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)
>> +++ b/include/asm-generic/rwsem.h
>> @@ -18,15 +18,16 @@
>>    */
>>   #ifdef CONFIG_64BIT
>>   # define RWSEM_ACTIVE_MASK		0xffffffffL
>> +# define RWSEM_WAITING_BIAS		(-(1L<<  62))
>>   #else
>>   # define RWSEM_ACTIVE_MASK		0x0000ffffL
>> +# define RWSEM_WAITING_BIAS		(-(1L<<  30))
>>   #endif
>>
>>   #define RWSEM_UNLOCKED_VALUE		0x00000000L
>>   #define RWSEM_ACTIVE_BIAS		0x00000001L
>> -#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
>>   #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
>> -#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
>> +#define RWSEM_ACTIVE_WRITE_BIAS		(-RWSEM_ACTIVE_MASK)
> Can't we collapse all that? They all seem very similar.

Yes, they are actually the same. I think we could extract the macro 
definitions into asm-generic/rwsem-macros.h, for instance, and make the 
architecture specific header file include the new generic header.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 19:08       ` Waiman Long
@ 2016-06-15 20:04         ` Waiman Long
  2016-06-15 21:59           ` Peter Zijlstra
  0 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-15 20:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Davidlohr Bueso, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/15/2016 03:08 PM, Waiman Long wrote:
> On 06/15/2016 01:12 PM, Peter Zijlstra wrote:
>> On Wed, Jun 15, 2016 at 09:56:59AM -0700, Davidlohr Bueso wrote:
>>> On Tue, 14 Jun 2016, Waiman Long wrote:
>>>> +++ b/kernel/locking/osq_lock.c
>>>> @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>>>      * cmpxchg in an attempt to undo our queueing.
>>>>      */
>>>>
>>>> -    while (!READ_ONCE(node->locked)) {
>>>> +    while (!smp_load_acquire(&node->locked)) {
>>> Hmm this being a polling path, that barrier can get pretty expensive 
>>> and
>>> last I checked it was unnecessary:
>> I think he'll go rely on it later on.
>>
>> In any case, its fairly simple to cure, just add
>> smp_acquire__after_ctrl_dep() at the end. If we bail because
>> need_resched() we don't need the acquire I think.
>
> Yes, I only need the acquire barrier when the locking is successful. 
> Thanks for the suggestion. I will make the change accordingly.
>

BTW, when will the smp_acquire__after_ctrl_dep() patch goes into the tip 
tree? My patch will have a dependency on that when I make the change.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation
  2016-06-15 19:31     ` Waiman Long
@ 2016-06-15 21:57       ` Peter Zijlstra
  0 siblings, 0 replies; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 21:57 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, linux-kernel, x86, linux-alpha, linux-ia64,
	linux-s390, linux-arch, Davidlohr Bueso, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Wed, Jun 15, 2016 at 03:31:51PM -0400, Waiman Long wrote:
> On 06/15/2016 01:43 PM, Peter Zijlstra wrote:
> >On Tue, Jun 14, 2016 at 06:48:08PM -0400, Waiman Long wrote:
> >>even the reduced maximum of about 16k (32-bit) or 1G (64-bit) should
> >>be more than enough for the foreseeable future.
> >So what happens if I manage to create 16k+ threads on my 32bit kernel
> >and get them all to do mmap() or so at the same time.
> >
> >That doesn't seem too far fetched.
> >
> >Then again, with double that (the current limit) that doesn't seem
> >impossible either.
> 
> To hit the limit, we need to have all the threads calling down_write() at
> exactly the same instance in time which, I think, is pretty hard to do.
> Also, I don't believe you will ever see a 16k-cpu massive SMP system running
> on 32-bit kernel. I can imagine such a system running on 64-bit kernel, but
> certainly not 32-bit.

Ah, so I thought we kept the WRITE_BIAS while blocking, which we don't.

But if they all get preempted before we undo the WRITE_BIAS then 1 CPU
will be able to trigger this. However utterly unlikely.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 20:04         ` Waiman Long
@ 2016-06-15 21:59           ` Peter Zijlstra
  0 siblings, 0 replies; 44+ messages in thread
From: Peter Zijlstra @ 2016-06-15 21:59 UTC (permalink / raw)
  To: Waiman Long
  Cc: Davidlohr Bueso, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Wed, Jun 15, 2016 at 04:04:07PM -0400, Waiman Long wrote:
> 
> BTW, when will the smp_acquire__after_ctrl_dep() patch goes into the tip
> tree? My patch will have a dependency on that when I make the change.

Should already be in as part of that spin_unlock_wait() fixup.

/me checks..

tip/locking/core contains:

33ac279677dc ("locking/barriers: Introduce smp_acquire__after_ctrl_dep()")

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP
  2016-06-15 19:17     ` Waiman Long
@ 2016-06-16  2:14       ` Davidlohr Bueso
  2016-06-16 21:25         ` Waiman Long
  0 siblings, 1 reply; 44+ messages in thread
From: Davidlohr Bueso @ 2016-06-16  2:14 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Wed, 15 Jun 2016, Waiman Long wrote:

>I think there will be a little bit of performance impact for a 
>workload that produce just the right amount of rwsem contentions. 

I'm not saying the change doesn't make sense, but this is the sort of
thing that will show nice numbers in one workload and go bite you in
another.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 19:01     ` Waiman Long
@ 2016-06-16  2:19       ` Boqun Feng
  2016-06-16 10:16         ` Will Deacon
  2016-06-16 21:35         ` Waiman Long
  0 siblings, 2 replies; 44+ messages in thread
From: Boqun Feng @ 2016-06-16  2:19 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Davidlohr Bueso, Jason Low,
	Dave Chinner, Scott J Norton, Douglas Hatch, Will Deacon

[-- Attachment #1: Type: text/plain, Size: 3794 bytes --]

On Wed, Jun 15, 2016 at 03:01:19PM -0400, Waiman Long wrote:
> On 06/15/2016 04:04 AM, Boqun Feng wrote:
> > Hi Waiman,
> > 
> > On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
> > > The osq_lock() and osq_unlock() function may not provide the necessary
> > > acquire and release barrier in some cases. This patch makes sure
> > > that the proper barriers are provided when osq_lock() is successful
> > > or when osq_unlock() is called.
> > > 
> > > Signed-off-by: Waiman Long<Waiman.Long@hpe.com>
> > > ---
> > >   kernel/locking/osq_lock.c |    4 ++--
> > >   1 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > > index 05a3785..7dd4ee5 100644
> > > --- a/kernel/locking/osq_lock.c
> > > +++ b/kernel/locking/osq_lock.c
> > > @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > >   	 * cmpxchg in an attempt to undo our queueing.
> > >   	 */
> > > 
> > > -	while (!READ_ONCE(node->locked)) {
> > > +	while (!smp_load_acquire(&node->locked)) {
> > >   		/*
> > >   		 * If we need to reschedule bail... so we can block.
> > >   		 */
> > > @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
> > >   	 * Second most likely case.
> > >   	 */
> > >   	node = this_cpu_ptr(&osq_node);
> > > -	next = xchg(&node->next, NULL);
> > > +	next = xchg_release(&node->next, NULL);
> > >   	if (next) {
> > >   		WRITE_ONCE(next->locked, 1);
> > So we still use WRITE_ONCE() rather than smp_store_release() here?
> > 
> > Though, IIUC, This is fine for all the archs but ARM64, because there
> > will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
> > carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.
> > 
> > Not sure whether it's a problem on ARM64, but I think we certainly need
> > to add some comments here, if we count on this trick.
> > 
> > Am I missing something or misunderstanding you here?
> > 
> > Regards,
> > Boqun
> 
> The change on the unlock side is more for documentation purpose than is
> actually needed. As you had said, the xchg() call has provided the necessary
> memory barrier. Using the _release variant, however, may have some

But I'm afraid the barrier doesn't remain if we replace xchg() with
xchg_release() on ARM64v8, IIUC, xchg_release() is just a ldxr+stlxr
loop with no barrier on ARM64v8. This means the following code:

	CPU 0					CPU 1 (next)
	========================		==================
	WRITE_ONCE(x, 1);			r1 = smp_load_acquire(next->locked, 1);
	xchg_release(&node->next, NULL);	r2 = READ_ONCE(x);
	WRITE_ONCE(next->locked, 1);

could result in (r1 == 1 && r2 == 0) on ARM64v8, IIUC.

I translated it into a litmus test:

	AArch64 stlxr+str
	""
	{
	0:X4=x; 0:X5=node; node=next;
	1:X4=x; 1:X5=next;
	}
	 P0              | P1          ;
	 MOV W0,#1       | LDAR W1,[X5];
	 STR W0,[X4]     | LDR W2,[X4] ;
	 MOV X0,#0       |             ;
	 LDXR X2,[X5]    |             ;
	 STLXR W1,X0,[X5]|             ;
	 CBNZ W1, fail   |             ;
	 MOV W0, #1      |             ;
	 STR W0,[X2]     |             ;
	 fail:           |             ;

	exists
	(0:X0 = 1 /\ 1:X1 = 1 /\ 1:X2 = 0)

and herd said "Sometimes".

But I may miss something here or make a mistake in the translation. So
add Will in Cc list ;-)

> performance benefit in some architectures.
> 
> BTW, osq_lock/osq_unlock aren't general purpose locking primitives. So there
> is some leeways on how fancy we want on the lock and unlock sides.
> 

Understood, I think it's fine if we rely on something subtle here, but
I just want to make we won't be bitten by some corner cases.

Regards,
Boqun

> Cheers,
> Longman

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-16  2:19       ` Boqun Feng
@ 2016-06-16 10:16         ` Will Deacon
  2016-06-16 21:35         ` Waiman Long
  1 sibling, 0 replies; 44+ messages in thread
From: Will Deacon @ 2016-06-16 10:16 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Waiman Long, Peter Zijlstra, Ingo Molnar, linux-kernel, x86,
	linux-alpha, linux-ia64, linux-s390, linux-arch, Davidlohr Bueso,
	Jason Low, Dave Chinner, Scott J Norton, Douglas Hatch

Hi guys,

On Thu, Jun 16, 2016 at 10:19:51AM +0800, Boqun Feng wrote:
> On Wed, Jun 15, 2016 at 03:01:19PM -0400, Waiman Long wrote:
> > On 06/15/2016 04:04 AM, Boqun Feng wrote:
> > > On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
> > > > @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
> > > >   	 * Second most likely case.
> > > >   	 */
> > > >   	node = this_cpu_ptr(&osq_node);
> > > > -	next = xchg(&node->next, NULL);
> > > > +	next = xchg_release(&node->next, NULL);
> > > >   	if (next) {
> > > >   		WRITE_ONCE(next->locked, 1);
> > > So we still use WRITE_ONCE() rather than smp_store_release() here?
> > > 
> > > Though, IIUC, This is fine for all the archs but ARM64, because there
> > > will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
> > > carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.
> > > 
> > > Not sure whether it's a problem on ARM64, but I think we certainly need
> > > to add some comments here, if we count on this trick.
> > > 
> > > Am I missing something or misunderstanding you here?
> > > 
> > The change on the unlock side is more for documentation purpose than is
> > actually needed. As you had said, the xchg() call has provided the necessary
> > memory barrier. Using the _release variant, however, may have some
> 
> But I'm afraid the barrier doesn't remain if we replace xchg() with
> xchg_release() on ARM64v8, IIUC, xchg_release() is just a ldxr+stlxr
> loop with no barrier on ARM64v8. This means the following code:
> 
> 	CPU 0					CPU 1 (next)
> 	========================		==================
> 	WRITE_ONCE(x, 1);			r1 = smp_load_acquire(next->locked, 1);
> 	xchg_release(&node->next, NULL);	r2 = READ_ONCE(x);
> 	WRITE_ONCE(next->locked, 1);
> 
> could result in (r1 == 1 && r2 == 0) on ARM64v8, IIUC.

Yes, of course. Why is that unexpected? You could just as easily make
the xchg_release an smp_store_release and this would still be permitted,
that's the whole point of acquire/release -- they're semi-permeable
barriers that allow accesses outside of the critical section to leak in,
but not the other way around.

It's worth noting that you've omitted the control dependency from
xchg_release to the subsequent write in your litmus tests, but I don't
think that actually changes anything here.

Will

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP
  2016-06-16  2:14       ` Davidlohr Bueso
@ 2016-06-16 21:25         ` Waiman Long
  0 siblings, 0 replies; 44+ messages in thread
From: Waiman Long @ 2016-06-16 21:25 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/15/2016 10:14 PM, Davidlohr Bueso wrote:
> On Wed, 15 Jun 2016, Waiman Long wrote:
>
>> I think there will be a little bit of performance impact for a 
>> workload that produce just the right amount of rwsem contentions. 
>
> I'm not saying the change doesn't make sense, but this is the sort of
> thing that will show nice numbers in one workload and go bite you in
> another.
>
> Thanks,
> Davidlohr

I would certainly agree if the additional atomic op is in the fastpath. 
Since it is in the slowpath, one additional atomic op will just be a 
small part of the whole rwsem_down_read_failed() function. I doubt if 
the performance degradation, if any, can be even noticeable.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-16  2:19       ` Boqun Feng
  2016-06-16 10:16         ` Will Deacon
@ 2016-06-16 21:35         ` Waiman Long
  2016-06-17  0:48           ` Boqun Feng
  1 sibling, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-16 21:35 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Davidlohr Bueso, Jason Low,
	Dave Chinner, Scott J Norton, Douglas Hatch, Will Deacon

On 06/15/2016 10:19 PM, Boqun Feng wrote:
> On Wed, Jun 15, 2016 at 03:01:19PM -0400, Waiman Long wrote:
>> On 06/15/2016 04:04 AM, Boqun Feng wrote:
>>> Hi Waiman,
>>>
>>> On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
>>>> The osq_lock() and osq_unlock() function may not provide the necessary
>>>> acquire and release barrier in some cases. This patch makes sure
>>>> that the proper barriers are provided when osq_lock() is successful
>>>> or when osq_unlock() is called.
>>>>
>>>> Signed-off-by: Waiman Long<Waiman.Long@hpe.com>
>>>> ---
>>>>    kernel/locking/osq_lock.c |    4 ++--
>>>>    1 files changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
>>>> index 05a3785..7dd4ee5 100644
>>>> --- a/kernel/locking/osq_lock.c
>>>> +++ b/kernel/locking/osq_lock.c
>>>> @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>>>    	 * cmpxchg in an attempt to undo our queueing.
>>>>    	 */
>>>>
>>>> -	while (!READ_ONCE(node->locked)) {
>>>> +	while (!smp_load_acquire(&node->locked)) {
>>>>    		/*
>>>>    		 * If we need to reschedule bail... so we can block.
>>>>    		 */
>>>> @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
>>>>    	 * Second most likely case.
>>>>    	 */
>>>>    	node = this_cpu_ptr(&osq_node);
>>>> -	next = xchg(&node->next, NULL);
>>>> +	next = xchg_release(&node->next, NULL);
>>>>    	if (next) {
>>>>    		WRITE_ONCE(next->locked, 1);
>>> So we still use WRITE_ONCE() rather than smp_store_release() here?
>>>
>>> Though, IIUC, This is fine for all the archs but ARM64, because there
>>> will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
>>> carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.
>>>
>>> Not sure whether it's a problem on ARM64, but I think we certainly need
>>> to add some comments here, if we count on this trick.
>>>
>>> Am I missing something or misunderstanding you here?
>>>
>>> Regards,
>>> Boqun
>> The change on the unlock side is more for documentation purpose than is
>> actually needed. As you had said, the xchg() call has provided the necessary
>> memory barrier. Using the _release variant, however, may have some
> But I'm afraid the barrier doesn't remain if we replace xchg() with
> xchg_release() on ARM64v8, IIUC, xchg_release() is just a ldxr+stlxr
> loop with no barrier on ARM64v8. This means the following code:
>
> 	CPU 0					CPU 1 (next)
> 	========================		==================
> 	WRITE_ONCE(x, 1);			r1 = smp_load_acquire(next->locked, 1);
> 	xchg_release(&node->next, NULL);	r2 = READ_ONCE(x);
> 	WRITE_ONCE(next->locked, 1);
>
> could result in (r1 == 1&&  r2 == 0) on ARM64v8, IIUC.

If you look into the actual code:

         next = xchg_release(&node->next, NULL);
         if (next) {
                 WRITE_ONCE(next->locked, 1);
                 return;
         }

There is a control dependency that WRITE_ONCE() won't happen until 
xchg_release() returns. For your particular example, I will change it to

     CPU 0
     ===================
     WRITE_ONCE(x, 1);
     xchg_relaxed(&node->next, NULL);
     smp_store_release(next->locked, 1);

I don't change WRITE_ONCE to a smp_store_release() because it may not 
always execute.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-16 21:35         ` Waiman Long
@ 2016-06-17  0:48           ` Boqun Feng
  2016-06-17 15:26             ` Waiman Long
  0 siblings, 1 reply; 44+ messages in thread
From: Boqun Feng @ 2016-06-17  0:48 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Davidlohr Bueso, Jason Low,
	Dave Chinner, Scott J Norton, Douglas Hatch, Will Deacon

[-- Attachment #1: Type: text/plain, Size: 4470 bytes --]

On Thu, Jun 16, 2016 at 05:35:54PM -0400, Waiman Long wrote:
> On 06/15/2016 10:19 PM, Boqun Feng wrote:
> > On Wed, Jun 15, 2016 at 03:01:19PM -0400, Waiman Long wrote:
> > > On 06/15/2016 04:04 AM, Boqun Feng wrote:
> > > > Hi Waiman,
> > > > 
> > > > On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
> > > > > The osq_lock() and osq_unlock() function may not provide the necessary
> > > > > acquire and release barrier in some cases. This patch makes sure
> > > > > that the proper barriers are provided when osq_lock() is successful
> > > > > or when osq_unlock() is called.
> > > > > 
> > > > > Signed-off-by: Waiman Long<Waiman.Long@hpe.com>
> > > > > ---
> > > > >    kernel/locking/osq_lock.c |    4 ++--
> > > > >    1 files changed, 2 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> > > > > index 05a3785..7dd4ee5 100644
> > > > > --- a/kernel/locking/osq_lock.c
> > > > > +++ b/kernel/locking/osq_lock.c
> > > > > @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
> > > > >    	 * cmpxchg in an attempt to undo our queueing.
> > > > >    	 */
> > > > > 
> > > > > -	while (!READ_ONCE(node->locked)) {
> > > > > +	while (!smp_load_acquire(&node->locked)) {
> > > > >    		/*
> > > > >    		 * If we need to reschedule bail... so we can block.
> > > > >    		 */
> > > > > @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
> > > > >    	 * Second most likely case.
> > > > >    	 */
> > > > >    	node = this_cpu_ptr(&osq_node);
> > > > > -	next = xchg(&node->next, NULL);
> > > > > +	next = xchg_release(&node->next, NULL);
> > > > >    	if (next) {
> > > > >    		WRITE_ONCE(next->locked, 1);
> > > > So we still use WRITE_ONCE() rather than smp_store_release() here?
> > > > 
> > > > Though, IIUC, This is fine for all the archs but ARM64, because there
> > > > will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
> > > > carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.
> > > > 
> > > > Not sure whether it's a problem on ARM64, but I think we certainly need
> > > > to add some comments here, if we count on this trick.
> > > > 
> > > > Am I missing something or misunderstanding you here?
> > > > 
> > > > Regards,
> > > > Boqun
> > > The change on the unlock side is more for documentation purpose than is
> > > actually needed. As you had said, the xchg() call has provided the necessary
> > > memory barrier. Using the _release variant, however, may have some
> > But I'm afraid the barrier doesn't remain if we replace xchg() with
> > xchg_release() on ARM64v8, IIUC, xchg_release() is just a ldxr+stlxr
> > loop with no barrier on ARM64v8. This means the following code:
> > 
> > 	CPU 0					CPU 1 (next)
> > 	========================		==================
> > 	WRITE_ONCE(x, 1);			r1 = smp_load_acquire(next->locked, 1);
> > 	xchg_release(&node->next, NULL);	r2 = READ_ONCE(x);
> > 	WRITE_ONCE(next->locked, 1);
> > 
> > could result in (r1 == 1&&  r2 == 0) on ARM64v8, IIUC.
> 
> If you look into the actual code:
> 
>         next = xchg_release(&node->next, NULL);
>         if (next) {
>                 WRITE_ONCE(next->locked, 1);
>                 return;
>         }
> 
> There is a control dependency that WRITE_ONCE() won't happen until

But a control dependency only orders LOAD->STORE pairs, right? And here
the control dependency orders the LOAD part of xchg_release() and the
WRITE_ONCE().

Along with the fact that RELEASE only orders the STORE part of xchg with
the memory operations preceding the STORE part, so for the following
code:

	WRTIE_ONCE(x,1);
	next = xchg_release(&node->next, NULL);
	if (next)
		WRITE_ONCE(next->locked, 1);

such a reordering is allowed to happen on ARM64v8

	next = ldxr [&node->next] // LOAD part of xchg_release()

	if (next)
		WRITE_ONCE(next->locked, 1);

	WRITE_ONCE(x,1);
	stlxr NULL [&node->next]  // STORE part of xchg_releae()

Am I missing your point here?

Regards,
Boqun

> xchg_release() returns. For your particular example, I will change it to
> 
>     CPU 0
>     ===================
>     WRITE_ONCE(x, 1);
>     xchg_relaxed(&node->next, NULL);
>     smp_store_release(next->locked, 1);
> 
> I don't change WRITE_ONCE to a smp_store_release() because it may not always
> execute.
> 
> Cheers,
> Longman
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-15 18:40         ` Peter Zijlstra
  2016-06-15 18:56           ` Davidlohr Bueso
@ 2016-06-17  1:11           ` Davidlohr Bueso
  2016-06-17 14:28             ` Waiman Long
  1 sibling, 1 reply; 44+ messages in thread
From: Davidlohr Bueso @ 2016-06-17  1:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Wed, 15 Jun 2016, Peter Zijlstra wrote:

>Yeah, see a few patches further in this series, where he guards a
>variables with the osq_lock.

So one problem I have with all this is that if we are hardening osq_lock/unlock()
because of some future use that is specific to rwsems, then we will immediately
be hurting mutexes for no good reason.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-17  1:11           ` Davidlohr Bueso
@ 2016-06-17 14:28             ` Waiman Long
  2016-06-17 16:29               ` Davidlohr Bueso
  0 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-17 14:28 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On 06/16/2016 09:11 PM, Davidlohr Bueso wrote:
> On Wed, 15 Jun 2016, Peter Zijlstra wrote:
>
>> Yeah, see a few patches further in this series, where he guards a
>> variables with the osq_lock.
>
> So one problem I have with all this is that if we are hardening 
> osq_lock/unlock()
> because of some future use that is specific to rwsems, then we will 
> immediately
> be hurting mutexes for no good reason.
>

I am going to change it to use smp_acquire__after_ctrl_dep() as 
suggested by PeterZ. Is that a good enough compromise? I have also 
changed the xchg in the unlock side to xchg_release which could help 
performance in some archs. The thing is when developers see the name 
osq_lock/osq_unlock, they will naturally assume the proper barrriers are 
provided which is not currently the case.

Anyway, the change won't affect x86, it is probably ARM or PPC that may 
have an impact.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-17  0:48           ` Boqun Feng
@ 2016-06-17 15:26             ` Waiman Long
  2016-06-17 15:45               ` Will Deacon
  0 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-17 15:26 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Davidlohr Bueso, Jason Low,
	Dave Chinner, Scott J Norton, Douglas Hatch, Will Deacon

On 06/16/2016 08:48 PM, Boqun Feng wrote:
> On Thu, Jun 16, 2016 at 05:35:54PM -0400, Waiman Long wrote:
>> On 06/15/2016 10:19 PM, Boqun Feng wrote:
>>> On Wed, Jun 15, 2016 at 03:01:19PM -0400, Waiman Long wrote:
>>>> On 06/15/2016 04:04 AM, Boqun Feng wrote:
>>>>> Hi Waiman,
>>>>>
>>>>> On Tue, Jun 14, 2016 at 06:48:04PM -0400, Waiman Long wrote:
>>>>>> The osq_lock() and osq_unlock() function may not provide the necessary
>>>>>> acquire and release barrier in some cases. This patch makes sure
>>>>>> that the proper barriers are provided when osq_lock() is successful
>>>>>> or when osq_unlock() is called.
>>>>>>
>>>>>> Signed-off-by: Waiman Long<Waiman.Long@hpe.com>
>>>>>> ---
>>>>>>     kernel/locking/osq_lock.c |    4 ++--
>>>>>>     1 files changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
>>>>>> index 05a3785..7dd4ee5 100644
>>>>>> --- a/kernel/locking/osq_lock.c
>>>>>> +++ b/kernel/locking/osq_lock.c
>>>>>> @@ -115,7 +115,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>>>>>     	 * cmpxchg in an attempt to undo our queueing.
>>>>>>     	 */
>>>>>>
>>>>>> -	while (!READ_ONCE(node->locked)) {
>>>>>> +	while (!smp_load_acquire(&node->locked)) {
>>>>>>     		/*
>>>>>>     		 * If we need to reschedule bail... so we can block.
>>>>>>     		 */
>>>>>> @@ -198,7 +198,7 @@ void osq_unlock(struct optimistic_spin_queue *lock)
>>>>>>     	 * Second most likely case.
>>>>>>     	 */
>>>>>>     	node = this_cpu_ptr(&osq_node);
>>>>>> -	next = xchg(&node->next, NULL);
>>>>>> +	next = xchg_release(&node->next, NULL);
>>>>>>     	if (next) {
>>>>>>     		WRITE_ONCE(next->locked, 1);
>>>>> So we still use WRITE_ONCE() rather than smp_store_release() here?
>>>>>
>>>>> Though, IIUC, This is fine for all the archs but ARM64, because there
>>>>> will always be a xchg_release()/xchg() before the WRITE_ONCE(), which
>>>>> carries a necessary barrier to upgrade WRITE_ONCE() to a RELEASE.
>>>>>
>>>>> Not sure whether it's a problem on ARM64, but I think we certainly need
>>>>> to add some comments here, if we count on this trick.
>>>>>
>>>>> Am I missing something or misunderstanding you here?
>>>>>
>>>>> Regards,
>>>>> Boqun
>>>> The change on the unlock side is more for documentation purpose than is
>>>> actually needed. As you had said, the xchg() call has provided the necessary
>>>> memory barrier. Using the _release variant, however, may have some
>>> But I'm afraid the barrier doesn't remain if we replace xchg() with
>>> xchg_release() on ARM64v8, IIUC, xchg_release() is just a ldxr+stlxr
>>> loop with no barrier on ARM64v8. This means the following code:
>>>
>>> 	CPU 0					CPU 1 (next)
>>> 	========================		==================
>>> 	WRITE_ONCE(x, 1);			r1 = smp_load_acquire(next->locked, 1);
>>> 	xchg_release(&node->next, NULL);	r2 = READ_ONCE(x);
>>> 	WRITE_ONCE(next->locked, 1);
>>>
>>> could result in (r1 == 1&&   r2 == 0) on ARM64v8, IIUC.
>> If you look into the actual code:
>>
>>          next = xchg_release(&node->next, NULL);
>>          if (next) {
>>                  WRITE_ONCE(next->locked, 1);
>>                  return;
>>          }
>>
>> There is a control dependency that WRITE_ONCE() won't happen until
> But a control dependency only orders LOAD->STORE pairs, right? And here
> the control dependency orders the LOAD part of xchg_release() and the
> WRITE_ONCE().
>
> Along with the fact that RELEASE only orders the STORE part of xchg with
> the memory operations preceding the STORE part, so for the following
> code:
>
> 	WRTIE_ONCE(x,1);
> 	next = xchg_release(&node->next, NULL);
> 	if (next)
> 		WRITE_ONCE(next->locked, 1);
>
> such a reordering is allowed to happen on ARM64v8
>
> 	next = ldxr [&node->next] // LOAD part of xchg_release()
>
> 	if (next)
> 		WRITE_ONCE(next->locked, 1);
>
> 	WRITE_ONCE(x,1);
> 	stlxr NULL [&node->next]  // STORE part of xchg_releae()
>
> Am I missing your point here?
>
> Regards,
> Boqun

My understanding of the release barrier is that both prior LOADs and 
STOREs can't move after the barrier. If WRITE_ONCE(x, 1) can move to 
below as shown above, it is not a real release barrier and we may need 
to change the barrier code.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-17 15:26             ` Waiman Long
@ 2016-06-17 15:45               ` Will Deacon
  2016-06-17 18:17                 ` Waiman Long
  0 siblings, 1 reply; 44+ messages in thread
From: Will Deacon @ 2016-06-17 15:45 UTC (permalink / raw)
  To: Waiman Long
  Cc: Boqun Feng, Peter Zijlstra, Ingo Molnar, linux-kernel, x86,
	linux-alpha, linux-ia64, linux-s390, linux-arch, Davidlohr Bueso,
	Jason Low, Dave Chinner, Scott J Norton, Douglas Hatch

On Fri, Jun 17, 2016 at 11:26:41AM -0400, Waiman Long wrote:
> On 06/16/2016 08:48 PM, Boqun Feng wrote:
> >On Thu, Jun 16, 2016 at 05:35:54PM -0400, Waiman Long wrote:
> >>If you look into the actual code:
> >>
> >>         next = xchg_release(&node->next, NULL);
> >>         if (next) {
> >>                 WRITE_ONCE(next->locked, 1);
> >>                 return;
> >>         }
> >>
> >>There is a control dependency that WRITE_ONCE() won't happen until
> >But a control dependency only orders LOAD->STORE pairs, right? And here
> >the control dependency orders the LOAD part of xchg_release() and the
> >WRITE_ONCE().
> >
> >Along with the fact that RELEASE only orders the STORE part of xchg with
> >the memory operations preceding the STORE part, so for the following
> >code:
> >
> >	WRTIE_ONCE(x,1);
> >	next = xchg_release(&node->next, NULL);
> >	if (next)
> >		WRITE_ONCE(next->locked, 1);
> >
> >such a reordering is allowed to happen on ARM64v8
> >
> >	next = ldxr [&node->next] // LOAD part of xchg_release()
> >
> >	if (next)
> >		WRITE_ONCE(next->locked, 1);
> >
> >	WRITE_ONCE(x,1);
> >	stlxr NULL [&node->next]  // STORE part of xchg_releae()
> >
> >Am I missing your point here?
> 
> My understanding of the release barrier is that both prior LOADs and STOREs
> can't move after the barrier. If WRITE_ONCE(x, 1) can move to below as shown
> above, it is not a real release barrier and we may need to change the
> barrier code.

You seem to be missing the point.

{READ,WRITE}_ONCE accesses appearing in program order after a release
are not externally ordered with respect to the release unless they
access the same location.

This is illustrated by Boqun's example, which shows two WRITE_ONCE
accesses being reordered before a store-release forming the write
component of an xchg_release. In both cases, WRITE_ONCE(x, 1) remains
ordered before the store-release.

Will

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-17 14:28             ` Waiman Long
@ 2016-06-17 16:29               ` Davidlohr Bueso
  2016-06-17 16:46                 ` Davidlohr Bueso
  0 siblings, 1 reply; 44+ messages in thread
From: Davidlohr Bueso @ 2016-06-17 16:29 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Fri, 17 Jun 2016, Waiman Long wrote:

>On 06/16/2016 09:11 PM, Davidlohr Bueso wrote:
>>On Wed, 15 Jun 2016, Peter Zijlstra wrote:
>>
>>>Yeah, see a few patches further in this series, where he guards a
>>>variables with the osq_lock.
>>
>>So one problem I have with all this is that if we are hardening 
>>osq_lock/unlock()
>>because of some future use that is specific to rwsems, then we will 
>>immediately
>>be hurting mutexes for no good reason.
>>
>
>I am going to change it to use smp_acquire__after_ctrl_dep() as 
>suggested by PeterZ. Is that a good enough compromise? I have also 
>changed the xchg in the unlock side to xchg_release which could help 
>performance in some archs. The thing is when developers see the name 
>osq_lock/osq_unlock, they will naturally assume the proper barrriers 
>are provided which is not currently the case.

Oh, from your discussions with Boqun, I was under the impression that ->locked
was now going to be properly ordered in all cases now, which is why
I worry about mutexes.

>Anyway, the change won't affect x86, it is probably ARM or PPC that 
>may have an impact.

Yes, that xchg() won't affect x86, but adding an smp_store_release(node->locked, 1)
or such will obviously.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-17 16:29               ` Davidlohr Bueso
@ 2016-06-17 16:46                 ` Davidlohr Bueso
  0 siblings, 0 replies; 44+ messages in thread
From: Davidlohr Bueso @ 2016-06-17 16:46 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel, x86, linux-alpha,
	linux-ia64, linux-s390, linux-arch, Jason Low, Dave Chinner,
	Scott J Norton, Douglas Hatch

On Fri, 17 Jun 2016, Davidlohr Bueso wrote:

>On Fri, 17 Jun 2016, Waiman Long wrote:
>
>>On 06/16/2016 09:11 PM, Davidlohr Bueso wrote:
>>>On Wed, 15 Jun 2016, Peter Zijlstra wrote:
>>>
>>>>Yeah, see a few patches further in this series, where he guards a
>>>>variables with the osq_lock.
>>>
>>>So one problem I have with all this is that if we are hardening 
>>>osq_lock/unlock()
>>>because of some future use that is specific to rwsems, then we 
>>>will immediately
>>>be hurting mutexes for no good reason.
>>>
>>
>>I am going to change it to use smp_acquire__after_ctrl_dep() as 
>>suggested by PeterZ. Is that a good enough compromise? I have also 
>>changed the xchg in the unlock side to xchg_release which could help 
>>performance in some archs. The thing is when developers see the name 
>>osq_lock/osq_unlock, they will naturally assume the proper barrriers 
>>are provided which is not currently the case.
>
>Oh, from your discussions with Boqun, I was under the impression that ->locked
>was now going to be properly ordered in all cases now, which is why
>I worry about mutexes.
>
>>Anyway, the change won't affect x86, it is probably ARM or PPC that 
>>may have an impact.
>
>Yes, that xchg() won't affect x86, but adding an smp_store_release(node->locked, 1)
>or such will obviously.

nm this last part, you're right, x86 smp_store_release is a nop.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-17 15:45               ` Will Deacon
@ 2016-06-17 18:17                 ` Waiman Long
  2016-06-18  8:46                   ` Boqun Feng
  0 siblings, 1 reply; 44+ messages in thread
From: Waiman Long @ 2016-06-17 18:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: Boqun Feng, Peter Zijlstra, Ingo Molnar, linux-kernel, x86,
	linux-alpha, linux-ia64, linux-s390, linux-arch, Davidlohr Bueso,
	Jason Low, Dave Chinner, Scott J Norton, Douglas Hatch

On 06/17/2016 11:45 AM, Will Deacon wrote:
> On Fri, Jun 17, 2016 at 11:26:41AM -0400, Waiman Long wrote:
>> On 06/16/2016 08:48 PM, Boqun Feng wrote:
>>> On Thu, Jun 16, 2016 at 05:35:54PM -0400, Waiman Long wrote:
>>>> If you look into the actual code:
>>>>
>>>>          next = xchg_release(&node->next, NULL);
>>>>          if (next) {
>>>>                  WRITE_ONCE(next->locked, 1);
>>>>                  return;
>>>>          }
>>>>
>>>> There is a control dependency that WRITE_ONCE() won't happen until
>>> But a control dependency only orders LOAD->STORE pairs, right? And here
>>> the control dependency orders the LOAD part of xchg_release() and the
>>> WRITE_ONCE().
>>>
>>> Along with the fact that RELEASE only orders the STORE part of xchg with
>>> the memory operations preceding the STORE part, so for the following
>>> code:
>>>
>>> 	WRTIE_ONCE(x,1);
>>> 	next = xchg_release(&node->next, NULL);
>>> 	if (next)
>>> 		WRITE_ONCE(next->locked, 1);
>>>
>>> such a reordering is allowed to happen on ARM64v8
>>>
>>> 	next = ldxr [&node->next] // LOAD part of xchg_release()
>>>
>>> 	if (next)
>>> 		WRITE_ONCE(next->locked, 1);
>>>
>>> 	WRITE_ONCE(x,1);
>>> 	stlxr NULL [&node->next]  // STORE part of xchg_releae()
>>>
>>> Am I missing your point here?
>> My understanding of the release barrier is that both prior LOADs and STOREs
>> can't move after the barrier. If WRITE_ONCE(x, 1) can move to below as shown
>> above, it is not a real release barrier and we may need to change the
>> barrier code.
> You seem to be missing the point.
>
> {READ,WRITE}_ONCE accesses appearing in program order after a release
> are not externally ordered with respect to the release unless they
> access the same location.
>
> This is illustrated by Boqun's example, which shows two WRITE_ONCE
> accesses being reordered before a store-release forming the write
> component of an xchg_release. In both cases, WRITE_ONCE(x, 1) remains
> ordered before the store-release.
>
> Will

I am sorry that I misread the mail. I am not used to treating xchg as 
two separate instructions. Yes, it is a problem. In that case, we have 
to either keep the xchg() function as it is or use 
smp_store_release(&next->locked, 1). So which one is a better 
alternative for ARM or PPC?

Cheers,
Longman

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-17 18:17                 ` Waiman Long
@ 2016-06-18  8:46                   ` Boqun Feng
  2016-06-20  7:59                     ` Will Deacon
  0 siblings, 1 reply; 44+ messages in thread
From: Boqun Feng @ 2016-06-18  8:46 UTC (permalink / raw)
  To: Waiman Long
  Cc: Will Deacon, Peter Zijlstra, Ingo Molnar, linux-kernel, x86,
	linux-alpha, linux-ia64, linux-s390, linux-arch, Davidlohr Bueso,
	Jason Low, Dave Chinner, Scott J Norton, Douglas Hatch

[-- Attachment #1: Type: text/plain, Size: 2929 bytes --]

On Fri, Jun 17, 2016 at 02:17:27PM -0400, Waiman Long wrote:
> On 06/17/2016 11:45 AM, Will Deacon wrote:
> > On Fri, Jun 17, 2016 at 11:26:41AM -0400, Waiman Long wrote:
> > > On 06/16/2016 08:48 PM, Boqun Feng wrote:
> > > > On Thu, Jun 16, 2016 at 05:35:54PM -0400, Waiman Long wrote:
> > > > > If you look into the actual code:
> > > > > 
> > > > >          next = xchg_release(&node->next, NULL);
> > > > >          if (next) {
> > > > >                  WRITE_ONCE(next->locked, 1);
> > > > >                  return;
> > > > >          }
> > > > > 
> > > > > There is a control dependency that WRITE_ONCE() won't happen until
> > > > But a control dependency only orders LOAD->STORE pairs, right? And here
> > > > the control dependency orders the LOAD part of xchg_release() and the
> > > > WRITE_ONCE().
> > > > 
> > > > Along with the fact that RELEASE only orders the STORE part of xchg with
> > > > the memory operations preceding the STORE part, so for the following
> > > > code:
> > > > 
> > > > 	WRTIE_ONCE(x,1);
> > > > 	next = xchg_release(&node->next, NULL);
> > > > 	if (next)
> > > > 		WRITE_ONCE(next->locked, 1);
> > > > 
> > > > such a reordering is allowed to happen on ARM64v8
> > > > 
> > > > 	next = ldxr [&node->next] // LOAD part of xchg_release()
> > > > 
> > > > 	if (next)
> > > > 		WRITE_ONCE(next->locked, 1);
> > > > 
> > > > 	WRITE_ONCE(x,1);
> > > > 	stlxr NULL [&node->next]  // STORE part of xchg_releae()
> > > > 
> > > > Am I missing your point here?
> > > My understanding of the release barrier is that both prior LOADs and STOREs
> > > can't move after the barrier. If WRITE_ONCE(x, 1) can move to below as shown
> > > above, it is not a real release barrier and we may need to change the
> > > barrier code.
> > You seem to be missing the point.
> > 
> > {READ,WRITE}_ONCE accesses appearing in program order after a release
> > are not externally ordered with respect to the release unless they
> > access the same location.
> > 
> > This is illustrated by Boqun's example, which shows two WRITE_ONCE
> > accesses being reordered before a store-release forming the write
> > component of an xchg_release. In both cases, WRITE_ONCE(x, 1) remains
> > ordered before the store-release.
> > 
> > Will
> 
> I am sorry that I misread the mail. I am not used to treating xchg as two
> separate instructions. Yes, it is a problem. In that case, we have to either

And sorry for the Red Pill ;-)

> keep the xchg() function as it is or use smp_store_release(&next->locked,
> 1). So which one is a better alternative for ARM or PPC?
> 

For PPC, I think xchg_release() + smp_store_release() is better than the 
current code, because the former has two lwsync while the latter has two
sync, and sync is quite expensive than lwsync on PPC.

I need to leave the ARM part to Will ;-)

Regards,
Boqun

> Cheers,
> Longman

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier
  2016-06-18  8:46                   ` Boqun Feng
@ 2016-06-20  7:59                     ` Will Deacon
  0 siblings, 0 replies; 44+ messages in thread
From: Will Deacon @ 2016-06-20  7:59 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Waiman Long, Peter Zijlstra, Ingo Molnar, linux-kernel, x86,
	linux-alpha, linux-ia64, linux-s390, linux-arch, Davidlohr Bueso,
	Jason Low, Dave Chinner, Scott J Norton, Douglas Hatch

On Sat, Jun 18, 2016 at 04:46:20PM +0800, Boqun Feng wrote:
> On Fri, Jun 17, 2016 at 02:17:27PM -0400, Waiman Long wrote:
> > keep the xchg() function as it is or use smp_store_release(&next->locked,
> > 1). So which one is a better alternative for ARM or PPC?
> > 
> 
> For PPC, I think xchg_release() + smp_store_release() is better than the 
> current code, because the former has two lwsync while the latter has two
> sync, and sync is quite expensive than lwsync on PPC.
> 
> I need to leave the ARM part to Will ;-)

I doubt there's much in it, but xchg() has DMB + release, so xchg_release +
smp_store_release is probably slightly better for us too.

Will

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2016-06-20  8:08 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-14 22:48 [RFC PATCH-tip v2 0/6] locking/rwsem: Enable reader optimistic spinning Waiman Long
2016-06-14 22:48 ` [RFC PATCH-tip v2 1/6] locking/osq: Make lock/unlock proper acquire/release barrier Waiman Long
2016-06-15  8:04   ` Boqun Feng
2016-06-15 17:18     ` Peter Zijlstra
2016-06-15 19:01     ` Waiman Long
2016-06-16  2:19       ` Boqun Feng
2016-06-16 10:16         ` Will Deacon
2016-06-16 21:35         ` Waiman Long
2016-06-17  0:48           ` Boqun Feng
2016-06-17 15:26             ` Waiman Long
2016-06-17 15:45               ` Will Deacon
2016-06-17 18:17                 ` Waiman Long
2016-06-18  8:46                   ` Boqun Feng
2016-06-20  7:59                     ` Will Deacon
2016-06-15 16:56   ` Davidlohr Bueso
2016-06-15 17:12     ` Peter Zijlstra
2016-06-15 18:27       ` Davidlohr Bueso
2016-06-15 18:40         ` Peter Zijlstra
2016-06-15 18:56           ` Davidlohr Bueso
2016-06-17  1:11           ` Davidlohr Bueso
2016-06-17 14:28             ` Waiman Long
2016-06-17 16:29               ` Davidlohr Bueso
2016-06-17 16:46                 ` Davidlohr Bueso
2016-06-15 19:08       ` Waiman Long
2016-06-15 20:04         ` Waiman Long
2016-06-15 21:59           ` Peter Zijlstra
2016-06-14 22:48 ` [RFC PATCH-tip v2 2/6] locking/rwsem: Stop active read lock ASAP Waiman Long
2016-06-15 17:22   ` Peter Zijlstra
2016-06-15 19:17     ` Waiman Long
2016-06-16  2:14       ` Davidlohr Bueso
2016-06-16 21:25         ` Waiman Long
2016-06-14 22:48 ` [RFC PATCH-tip v2 3/6] locking/rwsem: Enable count-based spinning on reader Waiman Long
2016-06-15 17:38   ` Peter Zijlstra
2016-06-15 19:28     ` Waiman Long
2016-06-14 22:48 ` [RFC PATCH-tip v2 4/6] locking/rwsem: move down rwsem_down_read_failed function Waiman Long
2016-06-15 17:40   ` Peter Zijlstra
2016-06-15 19:21     ` Waiman Long
2016-06-14 22:48 ` [RFC PATCH-tip v2 5/6] locking/rwsem: Change RWSEM_WAITING_BIAS for better disambiguation Waiman Long
2016-06-15 17:43   ` Peter Zijlstra
2016-06-15 19:31     ` Waiman Long
2016-06-15 21:57       ` Peter Zijlstra
2016-06-15 17:45   ` Peter Zijlstra
2016-06-15 19:35     ` Waiman Long
2016-06-14 22:48 ` [RFC PATCH-tip v2 6/6] locking/rwsem: Enable spinning readers Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).