linux-kernel.vger.kernel.org archive mirror
* [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4
@ 2015-10-06 16:45 Paul E. McKenney
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
  2015-10-06 17:42 ` [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4 Josh Triplett
  0 siblings, 2 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani

Hello!

This series contains performance improvements and locktorture testing
for percpu-rwsem:

1.	Add rtmutex torturing to locktorture, courtesy of Davidlohr Bueso.

2.	Add exports to allow locktorture to be built as a module.

3.	Add torture tests for percpu-rwsem.

4.	Consolidate cond_resched_rcu_qs() into stutter_wait().

5.	Create rcu_sync infrastructure, courtesy of Oleg Nesterov.

6.	Simplify rcu_sync using new rcu_sync_ops structure, courtesy
	of Oleg Nesterov.

7.	Add CONFIG_PROVE_RCU checks for rcu_sync, courtesy of Oleg Nesterov.

8.	Introduce rcu_sync_dtor(), courtesy of Oleg Nesterov.

9.	Make percpu_free_rwsem() after kzalloc() safe, courtesy of Oleg
	Nesterov.

10.	Make percpu-rwsem use rcu_sync, courtesy of Oleg Nesterov.

11.	Fix the comments outdated by rcu_sync, courtesy of Oleg Nesterov.

12.	Clean up the lockdep annotations in percpu_down_read(), courtesy
	of Peter Zijlstra and Oleg Nesterov.

13.	Clean up the CONFIG_PROVE_RCU checks, courtesy of Oleg Nesterov.

							Thanx, Paul

------------------------------------------------------------------------

 b/Documentation/locking/locktorture.txt                       |    3 
 b/include/linux/percpu-rwsem.h                                |    3 
 b/include/linux/rcu_sync.h                                    |  168 ++++--
 b/kernel/locking/locktorture.c                                |  158 +++++
 b/kernel/locking/percpu-rwsem.c                               |   90 +--
 b/kernel/rcu/Makefile                                         |    2 
 b/kernel/rcu/rcutorture.c                                     |    2 
 b/kernel/rcu/sync.c                                           |  269 +++++++++-
 b/kernel/torture.c                                            |    1 
 b/tools/testing/selftests/rcutorture/configs/lock/CFLIST      |    4 
 b/tools/testing/selftests/rcutorture/configs/lock/LOCK05      |    6 
 b/tools/testing/selftests/rcutorture/configs/lock/LOCK05.boot |    1 
 b/tools/testing/selftests/rcutorture/configs/lock/LOCK06      |    6 
 b/tools/testing/selftests/rcutorture/configs/lock/LOCK06.boot |    1 
 14 files changed, 588 insertions(+), 126 deletions(-)



* [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing
  2015-10-06 16:45 [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4 Paul E. McKenney
@ 2015-10-06 16:45 ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 02/13] locking/percpu-rwsem: Export symbols for locktorture Paul E. McKenney
                     ` (11 more replies)
  2015-10-06 17:42 ` [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4 Josh Triplett
  1 sibling, 12 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Davidlohr Bueso,
	Paul E. McKenney

From: Davidlohr Bueso <dave@stgolabs.net>

Real-time mutexes are one of the few general primitives
that we do not have in locktorture. Address this -- a few
considerations:

o To spice things up, enable competing thread(s) to become
rt, such that we can stress different prio boosting paths
in the rtmutex code. Introduce a ->task_boost callback,
only used by rtmutex-torturer. Tasks will boost/deboost
around every 50k (arbitrarily) lock/unlock operations.

o Hold times are similar to what we have for other locks:
only occasionally having longer hold times (per ~200k ops).
So we roughly do two full rt boost+deboosting ops with
short hold times.
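
The boost/deboost decision described above can be sketched in userspace
as a pure function. This is only an illustration under stated
assumptions: the names next_prio_action(), prio_action and BOOST_FACTOR
are hypothetical, the random value is passed in by the caller, and a
forced reset on a task that was never boosted is treated as a no-op,
which is a choice made by this sketch rather than something the patch
states. The deboost test uses a modulus twice as large as the boost
test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the factor mirrors the value quoted in the patch. */
enum prio_action { PRIO_NONE, PRIO_BOOST, PRIO_DEBOOST };

#define BOOST_FACTOR 50000u

/*
 * Decide what to do before one lock acquisition.
 *   boosted:   the task currently runs with real-time priority
 *   have_rand: false models the final "force reset" call at thread exit
 *   rnd:       one value drawn from the per-thread random stream
 *   nwriters:  number of writer threads (must be nonzero)
 */
static enum prio_action next_prio_action(bool boosted, bool have_rand,
					 uint32_t rnd, uint32_t nwriters)
{
	assert(nwriters != 0);

	if (!boosted) {
		/* Assumption of this sketch: nothing to reset, do nothing. */
		if (!have_rand)
			return PRIO_NONE;
		/* Fires when rnd is a multiple of nwriters * BOOST_FACTOR. */
		if (rnd % (nwriters * BOOST_FACTOR) == 0)
			return PRIO_BOOST;
		return PRIO_NONE;
	}

	/* Boosted: forced reset, or a hit on the doubled modulus. */
	if (!have_rand || rnd % (nwriters * BOOST_FACTOR * 2u) == 0)
		return PRIO_DEBOOST;
	return PRIO_NONE;
}
```

A caller would draw one random value per loop iteration, pass it in
together with the current boost state, and change the scheduling policy
only when the result is not PRIO_NONE.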

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 Documentation/locking/locktorture.txt              |   3 +
 kernel/locking/locktorture.c                       | 114 ++++++++++++++++++++-
 .../selftests/rcutorture/configs/lock/CFLIST       |   3 +-
 .../selftests/rcutorture/configs/lock/LOCK05       |   6 ++
 .../selftests/rcutorture/configs/lock/LOCK05.boot  |   1 +
 5 files changed, 124 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/rcutorture/configs/lock/LOCK05
 create mode 100644 tools/testing/selftests/rcutorture/configs/lock/LOCK05.boot

diff --git a/Documentation/locking/locktorture.txt b/Documentation/locking/locktorture.txt
index 619f2bb136a5..a2ef3a929bf1 100644
--- a/Documentation/locking/locktorture.txt
+++ b/Documentation/locking/locktorture.txt
@@ -52,6 +52,9 @@ torture_type	  Type of lock to torture. By default, only spinlocks will
 
 		     o "mutex_lock": mutex_lock() and mutex_unlock() pairs.
 
+		     o "rtmutex_lock": rt_mutex_lock() and rt_mutex_unlock()
+				       pairs. Kernel must have CONFIG_RT_MUTEXES=y.
+
 		     o "rwsem_lock": read/write down() and up() semaphore pairs.
 
 torture_runnable  Start locktorture at boot time in the case where the
diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index 32244186f1f2..e1ca7a2fae91 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -17,12 +17,14 @@
  *
  * Copyright (C) IBM Corporation, 2014
  *
- * Author: Paul E. McKenney <paulmck@us.ibm.com>
+ * Authors: Paul E. McKenney <paulmck@us.ibm.com>
+ *          Davidlohr Bueso <dave@stgolabs.net>
  *	Based on kernel/rcu/torture.c.
  */
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/kthread.h>
+#include <linux/sched/rt.h>
 #include <linux/spinlock.h>
 #include <linux/rwlock.h>
 #include <linux/mutex.h>
@@ -91,11 +93,13 @@ struct lock_torture_ops {
 	void (*init)(void);
 	int (*writelock)(void);
 	void (*write_delay)(struct torture_random_state *trsp);
+	void (*task_boost)(struct torture_random_state *trsp);
 	void (*writeunlock)(void);
 	int (*readlock)(void);
 	void (*read_delay)(struct torture_random_state *trsp);
 	void (*readunlock)(void);
-	unsigned long flags;
+
+	unsigned long flags; /* for irq spinlocks */
 	const char *name;
 };
 
@@ -139,9 +143,15 @@ static void torture_lock_busted_write_unlock(void)
 	  /* BUGGY, do not use in real life!!! */
 }
 
+static void torture_boost_dummy(struct torture_random_state *trsp)
+{
+	/* Only rtmutexes care about priority */
+}
+
 static struct lock_torture_ops lock_busted_ops = {
 	.writelock	= torture_lock_busted_write_lock,
 	.write_delay	= torture_lock_busted_write_delay,
+	.task_boost     = torture_boost_dummy,
 	.writeunlock	= torture_lock_busted_write_unlock,
 	.readlock       = NULL,
 	.read_delay     = NULL,
@@ -185,6 +195,7 @@ static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock)
 static struct lock_torture_ops spin_lock_ops = {
 	.writelock	= torture_spin_lock_write_lock,
 	.write_delay	= torture_spin_lock_write_delay,
+	.task_boost     = torture_boost_dummy,
 	.writeunlock	= torture_spin_lock_write_unlock,
 	.readlock       = NULL,
 	.read_delay     = NULL,
@@ -211,6 +222,7 @@ __releases(torture_spinlock)
 static struct lock_torture_ops spin_lock_irq_ops = {
 	.writelock	= torture_spin_lock_write_lock_irq,
 	.write_delay	= torture_spin_lock_write_delay,
+	.task_boost     = torture_boost_dummy,
 	.writeunlock	= torture_lock_spin_write_unlock_irq,
 	.readlock       = NULL,
 	.read_delay     = NULL,
@@ -275,6 +287,7 @@ static void torture_rwlock_read_unlock(void) __releases(torture_rwlock)
 static struct lock_torture_ops rw_lock_ops = {
 	.writelock	= torture_rwlock_write_lock,
 	.write_delay	= torture_rwlock_write_delay,
+	.task_boost     = torture_boost_dummy,
 	.writeunlock	= torture_rwlock_write_unlock,
 	.readlock       = torture_rwlock_read_lock,
 	.read_delay     = torture_rwlock_read_delay,
@@ -315,6 +328,7 @@ __releases(torture_rwlock)
 static struct lock_torture_ops rw_lock_irq_ops = {
 	.writelock	= torture_rwlock_write_lock_irq,
 	.write_delay	= torture_rwlock_write_delay,
+	.task_boost     = torture_boost_dummy,
 	.writeunlock	= torture_rwlock_write_unlock_irq,
 	.readlock       = torture_rwlock_read_lock_irq,
 	.read_delay     = torture_rwlock_read_delay,
@@ -354,6 +368,7 @@ static void torture_mutex_unlock(void) __releases(torture_mutex)
 static struct lock_torture_ops mutex_lock_ops = {
 	.writelock	= torture_mutex_lock,
 	.write_delay	= torture_mutex_delay,
+	.task_boost     = torture_boost_dummy,
 	.writeunlock	= torture_mutex_unlock,
 	.readlock       = NULL,
 	.read_delay     = NULL,
@@ -361,6 +376,90 @@ static struct lock_torture_ops mutex_lock_ops = {
 	.name		= "mutex_lock"
 };
 
+#ifdef CONFIG_RT_MUTEXES
+static DEFINE_RT_MUTEX(torture_rtmutex);
+
+static int torture_rtmutex_lock(void) __acquires(torture_rtmutex)
+{
+	rt_mutex_lock(&torture_rtmutex);
+	return 0;
+}
+
+static void torture_rtmutex_boost(struct torture_random_state *trsp)
+{
+	int policy;
+	struct sched_param param;
+	const unsigned int factor = 50000; /* yes, quite arbitrary */
+
+	if (!rt_task(current)) {
+		/*
+		 * (1) Boost priority once every ~50k operations. When the
+		 * task tries to take the lock, the rtmutex will account
+		 * for the new priority, and do any corresponding pi-dance.
+		 */
+		if (!(torture_random(trsp) %
+		      (cxt.nrealwriters_stress * factor))) {
+			policy = SCHED_FIFO;
+			param.sched_priority = MAX_RT_PRIO - 1;
+		} else /* common case, do nothing */
+			return;
+	} else {
+		/*
+		 * The task will remain boosted for another ~100k operations,
+		 * then restored back to its original prio, and so forth.
+		 *
+		 * When @trsp is nil, we want to force-reset the task for
+		 * stopping the kthread.
+		 */
+		if (!trsp || !(torture_random(trsp) %
+			       (cxt.nrealwriters_stress * factor * 2))) {
+			policy = SCHED_NORMAL;
+			param.sched_priority = 0;
+		} else /* common case, do nothing */
+			return;
+	}
+
+	sched_setscheduler_nocheck(current, policy, &param);
+}
+
+static void torture_rtmutex_delay(struct torture_random_state *trsp)
+{
+	const unsigned long shortdelay_us = 2;
+	const unsigned long longdelay_ms = 100;
+
+	/*
+	 * We want a short delay mostly to emulate likely code, and
+	 * we want a long delay occasionally to force massive contention.
+	 */
+	if (!(torture_random(trsp) %
+	      (cxt.nrealwriters_stress * 2000 * longdelay_ms)))
+		mdelay(longdelay_ms);
+	if (!(torture_random(trsp) %
+	      (cxt.nrealwriters_stress * 2 * shortdelay_us)))
+		udelay(shortdelay_us);
+#ifdef CONFIG_PREEMPT
+	if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
+		preempt_schedule();  /* Allow test to be preempted. */
+#endif
+}
+
+static void torture_rtmutex_unlock(void) __releases(torture_rtmutex)
+{
+	rt_mutex_unlock(&torture_rtmutex);
+}
+
+static struct lock_torture_ops rtmutex_lock_ops = {
+	.writelock	= torture_rtmutex_lock,
+	.write_delay	= torture_rtmutex_delay,
+	.task_boost     = torture_rtmutex_boost,
+	.writeunlock	= torture_rtmutex_unlock,
+	.readlock       = NULL,
+	.read_delay     = NULL,
+	.readunlock     = NULL,
+	.name		= "rtmutex_lock"
+};
+#endif
+
 static DECLARE_RWSEM(torture_rwsem);
 static int torture_rwsem_down_write(void) __acquires(torture_rwsem)
 {
@@ -419,6 +518,7 @@ static void torture_rwsem_up_read(void) __releases(torture_rwsem)
 static struct lock_torture_ops rwsem_lock_ops = {
 	.writelock	= torture_rwsem_down_write,
 	.write_delay	= torture_rwsem_write_delay,
+	.task_boost     = torture_boost_dummy,
 	.writeunlock	= torture_rwsem_up_write,
 	.readlock       = torture_rwsem_down_read,
 	.read_delay     = torture_rwsem_read_delay,
@@ -442,6 +542,7 @@ static int lock_torture_writer(void *arg)
 		if ((torture_random(&rand) & 0xfffff) == 0)
 			schedule_timeout_uninterruptible(1);
 
+		cxt.cur_ops->task_boost(&rand);
 		cxt.cur_ops->writelock();
 		if (WARN_ON_ONCE(lock_is_write_held))
 			lwsp->n_lock_fail++;
@@ -456,6 +557,8 @@ static int lock_torture_writer(void *arg)
 
 		stutter_wait("lock_torture_writer");
 	} while (!torture_must_stop());
+
+	cxt.cur_ops->task_boost(NULL); /* reset prio */
 	torture_kthread_stopping("lock_torture_writer");
 	return 0;
 }
@@ -642,6 +745,9 @@ static int __init lock_torture_init(void)
 		&spin_lock_ops, &spin_lock_irq_ops,
 		&rw_lock_ops, &rw_lock_irq_ops,
 		&mutex_lock_ops,
+#ifdef CONFIG_RT_MUTEXES
+		&rtmutex_lock_ops,
+#endif
 		&rwsem_lock_ops,
 	};
 
@@ -676,6 +782,10 @@ static int __init lock_torture_init(void)
 	if (strncmp(torture_type, "mutex", 5) == 0)
 		cxt.debug_lock = true;
 #endif
+#ifdef CONFIG_DEBUG_RT_MUTEXES
+	if (strncmp(torture_type, "rtmutex", 7) == 0)
+		cxt.debug_lock = true;
+#endif
 #ifdef CONFIG_DEBUG_SPINLOCK
 	if ((strncmp(torture_type, "spin", 4) == 0) ||
 	    (strncmp(torture_type, "rw_lock", 7) == 0))
diff --git a/tools/testing/selftests/rcutorture/configs/lock/CFLIST b/tools/testing/selftests/rcutorture/configs/lock/CFLIST
index 6910b7370761..6ed32794eaa1 100644
--- a/tools/testing/selftests/rcutorture/configs/lock/CFLIST
+++ b/tools/testing/selftests/rcutorture/configs/lock/CFLIST
@@ -1,4 +1,5 @@
 LOCK01
 LOCK02
 LOCK03
-LOCK04
\ No newline at end of file
+LOCK04
+LOCK05
diff --git a/tools/testing/selftests/rcutorture/configs/lock/LOCK05 b/tools/testing/selftests/rcutorture/configs/lock/LOCK05
new file mode 100644
index 000000000000..1d1da1477fc3
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/lock/LOCK05
@@ -0,0 +1,6 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=4
+CONFIG_HOTPLUG_CPU=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
diff --git a/tools/testing/selftests/rcutorture/configs/lock/LOCK05.boot b/tools/testing/selftests/rcutorture/configs/lock/LOCK05.boot
new file mode 100644
index 000000000000..8ac37307c987
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/lock/LOCK05.boot
@@ -0,0 +1 @@
+locktorture.torture_type=rtmutex_lock
-- 
2.5.2



* [PATCH tip/core/rcu 02/13] locking/percpu-rwsem: Export symbols for locktorture
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 03/13] locktorture: Add torture tests for percpu_rwsem Paul E. McKenney
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

This commit exports percpu_down_read(), percpu_down_write(),
__percpu_init_rwsem(), percpu_up_read(), and percpu_up_write() to allow
locktorture to test them when built as a module.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/locking/percpu-rwsem.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index f32567254867..e2621fbbcbd1 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -22,6 +22,7 @@ int __percpu_init_rwsem(struct percpu_rw_semaphore *brw,
 	init_waitqueue_head(&brw->write_waitq);
 	return 0;
 }
+EXPORT_SYMBOL_GPL(__percpu_init_rwsem);
 
 void percpu_free_rwsem(struct percpu_rw_semaphore *brw)
 {
@@ -87,6 +88,7 @@ void percpu_down_read(struct percpu_rw_semaphore *brw)
 	/* avoid up_read()->rwsem_release() */
 	__up_read(&brw->rw_sem);
 }
+EXPORT_SYMBOL_GPL(percpu_down_read);
 
 int percpu_down_read_trylock(struct percpu_rw_semaphore *brw)
 {
@@ -112,6 +114,7 @@ void percpu_up_read(struct percpu_rw_semaphore *brw)
 	if (atomic_dec_and_test(&brw->slow_read_ctr))
 		wake_up_all(&brw->write_waitq);
 }
+EXPORT_SYMBOL_GPL(percpu_up_read);
 
 static int clear_fast_ctr(struct percpu_rw_semaphore *brw)
 {
@@ -163,6 +166,7 @@ void percpu_down_write(struct percpu_rw_semaphore *brw)
 	/* wait for all readers to complete their percpu_up_read() */
 	wait_event(brw->write_waitq, !atomic_read(&brw->slow_read_ctr));
 }
+EXPORT_SYMBOL_GPL(percpu_down_write);
 
 void percpu_up_write(struct percpu_rw_semaphore *brw)
 {
@@ -176,3 +180,4 @@ void percpu_up_write(struct percpu_rw_semaphore *brw)
 	/* the last writer unblocks update_fast_ctr() */
 	atomic_dec(&brw->write_ctr);
 }
+EXPORT_SYMBOL_GPL(percpu_up_write);
-- 
2.5.2



* [PATCH tip/core/rcu 03/13] locktorture: Add torture tests for percpu_rwsem
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 02/13] locking/percpu-rwsem: Export symbols for locktorture Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 04/13] torture: Consolidate cond_resched_rcu_qs() into stutter_wait() Paul E. McKenney
                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney, Davidlohr Bueso

This commit adds percpu_rwsem tests based on the earlier rwsem tests.
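
How a new lock type plugs into an ops table can be sketched in
userspace as follows. This is a toy under stated assumptions: every
demo_* name is hypothetical, the "lock" is just a counter, and the
lookup by type name is this sketch's guess at how a table like
torture_ops[] is searched, since that loop is not part of this patch.
It does show the two conventions the patch relies on: an optional
->init hook, and a ->task_boost hook that is never NULL so callers can
invoke it unconditionally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down analogue of an ops table for lock testing. */
struct demo_lock_ops {
	void (*init)(void);		/* optional, may be NULL */
	int (*writelock)(void);
	void (*task_boost)(void);	/* never NULL: no-op for most locks */
	void (*writeunlock)(void);
	const char *name;
};

static int demo_inited;		/* set once the init hook has run */
static int demo_depth;		/* 1 while the toy lock is held */
static int demo_acquired;	/* total number of acquisitions */

static void demo_init(void) { demo_inited = 1; }
static int demo_writelock(void) { demo_depth++; demo_acquired++; return 0; }
static void demo_writeunlock(void) { demo_depth--; }
static void demo_boost_dummy(void) { /* only priority-aware locks care */ }

static struct demo_lock_ops demo_a_ops = {
	.writelock	= demo_writelock,
	.task_boost	= demo_boost_dummy,
	.writeunlock	= demo_writeunlock,
	.name		= "demo_a_lock"
};

static struct demo_lock_ops demo_b_ops = {
	.init		= demo_init,	/* this type needs runtime setup */
	.writelock	= demo_writelock,
	.task_boost	= demo_boost_dummy,
	.writeunlock	= demo_writeunlock,
	.name		= "demo_b_lock"
};

static struct demo_lock_ops *demo_ops_table[] = { &demo_a_ops, &demo_b_ops };

/* Return the ops whose name matches type, or NULL if there is none. */
static struct demo_lock_ops *demo_find_ops(const char *type)
{
	size_t i;

	assert(type != NULL);
	for (i = 0; i < sizeof(demo_ops_table) / sizeof(demo_ops_table[0]); i++)
		if (strcmp(type, demo_ops_table[i]->name) == 0)
			return demo_ops_table[i];
	return NULL;
}

/* One writer iteration: boost hook, acquire, release. */
static void demo_writer_iteration(struct demo_lock_ops *ops)
{
	ops->task_boost();
	ops->writelock();
	ops->writeunlock();
}
```

Adding a lock type then amounts to defining its callbacks, filling in
one more ops structure, and appending it to the table.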

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
---
 kernel/locking/locktorture.c                       | 44 ++++++++++++++++++++++
 .../selftests/rcutorture/configs/lock/CFLIST       |  1 +
 .../selftests/rcutorture/configs/lock/LOCK06       |  6 +++
 .../selftests/rcutorture/configs/lock/LOCK06.boot  |  1 +
 4 files changed, 52 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/lock/LOCK06
 create mode 100644 tools/testing/selftests/rcutorture/configs/lock/LOCK06.boot

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index e1ca7a2fae91..8545e12598ce 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -36,6 +36,7 @@
 #include <linux/moduleparam.h>
 #include <linux/delay.h>
 #include <linux/slab.h>
+#include <linux/percpu-rwsem.h>
 #include <linux/torture.h>
 
 MODULE_LICENSE("GPL");
@@ -526,6 +527,48 @@ static struct lock_torture_ops rwsem_lock_ops = {
 	.name		= "rwsem_lock"
 };
 
+#include <linux/percpu-rwsem.h>
+static struct percpu_rw_semaphore pcpu_rwsem;
+
+static void torture_percpu_rwsem_init(void)
+{
+	BUG_ON(percpu_init_rwsem(&pcpu_rwsem));
+}
+
+static int torture_percpu_rwsem_down_write(void) __acquires(pcpu_rwsem)
+{
+	percpu_down_write(&pcpu_rwsem);
+	return 0;
+}
+
+static void torture_percpu_rwsem_up_write(void) __releases(pcpu_rwsem)
+{
+	percpu_up_write(&pcpu_rwsem);
+}
+
+static int torture_percpu_rwsem_down_read(void) __acquires(pcpu_rwsem)
+{
+	percpu_down_read(&pcpu_rwsem);
+	return 0;
+}
+
+static void torture_percpu_rwsem_up_read(void) __releases(pcpu_rwsem)
+{
+	percpu_up_read(&pcpu_rwsem);
+}
+
+static struct lock_torture_ops percpu_rwsem_lock_ops = {
+	.init		= torture_percpu_rwsem_init,
+	.writelock	= torture_percpu_rwsem_down_write,
+	.write_delay	= torture_rwsem_write_delay,
+	.task_boost     = torture_boost_dummy,
+	.writeunlock	= torture_percpu_rwsem_up_write,
+	.readlock       = torture_percpu_rwsem_down_read,
+	.read_delay     = torture_rwsem_read_delay,
+	.readunlock     = torture_percpu_rwsem_up_read,
+	.name		= "percpu_rwsem_lock"
+};
+
 /*
  * Lock torture writer kthread.  Repeatedly acquires and releases
  * the lock, checking for duplicate acquisitions.
@@ -749,6 +792,7 @@ static int __init lock_torture_init(void)
 		&rtmutex_lock_ops,
 #endif
 		&rwsem_lock_ops,
+		&percpu_rwsem_lock_ops,
 	};
 
 	if (!torture_init_begin(torture_type, verbose, &torture_runnable))
diff --git a/tools/testing/selftests/rcutorture/configs/lock/CFLIST b/tools/testing/selftests/rcutorture/configs/lock/CFLIST
index 6ed32794eaa1..b9611c523723 100644
--- a/tools/testing/selftests/rcutorture/configs/lock/CFLIST
+++ b/tools/testing/selftests/rcutorture/configs/lock/CFLIST
@@ -3,3 +3,4 @@ LOCK02
 LOCK03
 LOCK04
 LOCK05
+LOCK06
diff --git a/tools/testing/selftests/rcutorture/configs/lock/LOCK06 b/tools/testing/selftests/rcutorture/configs/lock/LOCK06
new file mode 100644
index 000000000000..1d1da1477fc3
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/lock/LOCK06
@@ -0,0 +1,6 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=4
+CONFIG_HOTPLUG_CPU=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
diff --git a/tools/testing/selftests/rcutorture/configs/lock/LOCK06.boot b/tools/testing/selftests/rcutorture/configs/lock/LOCK06.boot
new file mode 100644
index 000000000000..f92219cd4ad9
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/lock/LOCK06.boot
@@ -0,0 +1 @@
+locktorture.torture_type=percpu_rwsem_lock
-- 
2.5.2



* [PATCH tip/core/rcu 04/13] torture: Consolidate cond_resched_rcu_qs() into stutter_wait()
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 02/13] locking/percpu-rwsem: Export symbols for locktorture Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 03/13] locktorture: Add torture tests for percpu_rwsem Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 05/13] rcu: Create rcu_sync infrastructure Paul E. McKenney
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

This commit moves cond_resched_rcu_qs() into stutter_wait(), saving
a line and also avoiding RCU CPU stall warnings from all torture
loops containing a stutter_wait().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/rcutorture.c | 2 --
 kernel/torture.c        | 1 +
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 77192953dee5..8a65b7d471a0 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -768,7 +768,6 @@ static int rcu_torture_boost(void *arg)
 				}
 				call_rcu_time = jiffies;
 			}
-			cond_resched_rcu_qs();
 			stutter_wait("rcu_torture_boost");
 			if (torture_must_stop())
 				goto checkwait;
@@ -1208,7 +1207,6 @@ rcu_torture_reader(void *arg)
 		__this_cpu_inc(rcu_torture_batch[completed]);
 		preempt_enable();
 		cur_ops->readunlock(idx);
-		cond_resched_rcu_qs();
 		stutter_wait("rcu_torture_reader");
 	} while (!torture_must_stop());
 	if (irqreader && cur_ops->irq_capable) {
diff --git a/kernel/torture.c b/kernel/torture.c
index 3e4840633d3e..44aa462d033f 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -523,6 +523,7 @@ static int stutter;
  */
 void stutter_wait(const char *title)
 {
+	cond_resched_rcu_qs();
 	while (READ_ONCE(stutter_pause_test) ||
 	       (torture_runnable && !READ_ONCE(*torture_runnable))) {
 		if (stutter_pause_test)
-- 
2.5.2



* [PATCH tip/core/rcu 05/13] rcu: Create rcu_sync infrastructure
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (2 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 04/13] torture: Consolidate cond_resched_rcu_qs() into stutter_wait() Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 06/13] rcu_sync: Simplify rcu_sync using new rcu_sync_ops structure Paul E. McKenney
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

The rcu_sync infrastructure is intended for implementing reader-writer
primitives that have extremely lightweight readers during times when
there are no writers.  The first use is in the percpu_rwsem used by
the VFS subsystem.

This infrastructure is functionally equivalent to

        struct rcu_sync_struct {
                atomic_t counter;
        };

	/* Check possibility of fast-path read-side operations. */
        static inline bool rcu_sync_is_idle(struct rcu_sync_struct *rss)
        {
                return atomic_read(&rss->counter) == 0;
        }

	/* Tell readers to use slowpaths. */
        static inline void rcu_sync_enter(struct rcu_sync_struct *rss)
        {
                atomic_inc(&rss->counter);
                synchronize_sched();
        }

	/* Allow readers to once again use fastpaths. */
        static inline void rcu_sync_exit(struct rcu_sync_struct *rss)
        {
                synchronize_sched();
                atomic_dec(&rss->counter);
        }

The main difference is that it records the state and only calls
synchronize_sched() if required.  At least some of the calls to
synchronize_sched() will be optimized away when rcu_sync_enter() and
rcu_sync_exit() are invoked repeatedly in quick succession.
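
How the grace-period waits get optimized away can be illustrated with
a single-threaded userspace model. This is a sketch under stated
assumptions, not the kernel code: all names (sync_model, model_enter()
and so on) are hypothetical, a counter stands in for the blocking
synchronize_sched() call, model_grace_period() stands in for the
queued callback running after a grace period, and the exit and
callback transitions are this sketch's simplified reading of the state
machine. There is no locking and no GP_PENDING state because nothing
runs concurrently here.

```c
#include <assert.h>
#include <stdbool.h>

enum model_gp { GP_IDLE = 0, GP_PASSED };
enum model_cb { CB_IDLE = 0, CB_PENDING, CB_REPLAY };

struct sync_model {
	enum model_gp gp_state;	/* GP_IDLE: readers may use fastpaths */
	enum model_cb cb_state;	/* state of the deferred callback */
	int gp_count;		/* number of active updaters */
	int nr_syncs;		/* blocking grace-period waits so far */
};

static void model_init(struct sync_model *m)
{
	*m = (struct sync_model){ 0 };
}

static bool model_is_idle(const struct sync_model *m)
{
	return m->gp_state == GP_IDLE;
}

/* Updater begins: wait for a grace period only when coming from idle. */
static void model_enter(struct sync_model *m)
{
	m->gp_count++;
	if (m->gp_state == GP_IDLE) {
		m->nr_syncs++;		/* stands in for synchronize_sched() */
		m->gp_state = GP_PASSED;
	}
}

/* Updater ends: the last one out defers the return to idle. */
static void model_exit(struct sync_model *m)
{
	assert(m->gp_count > 0);
	if (--m->gp_count == 0) {
		if (m->cb_state == CB_IDLE)
			m->cb_state = CB_PENDING;	/* queue callback */
		else if (m->cb_state == CB_PENDING)
			m->cb_state = CB_REPLAY;	/* need a later GP */
	}
}

/* The queued callback runs once a grace period has elapsed. */
static void model_grace_period(struct sync_model *m)
{
	if (m->cb_state == CB_IDLE)
		return;				/* nothing was queued */
	if (m->gp_count > 0) {
		m->cb_state = CB_IDLE;		/* updater active: drop */
	} else if (m->cb_state == CB_REPLAY) {
		m->cb_state = CB_PENDING;	/* requeue for a later GP */
	} else {
		m->cb_state = CB_IDLE;
		m->gp_state = GP_IDLE;		/* fastpaths allowed again */
	}
}
```

In this model, an enter/exit/enter sequence with no intervening
model_grace_period() call performs only one blocking wait, because the
second enter finds the state still at GP_PASSED.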

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcu_sync.h |  94 +++++++++++++++++++++++++
 kernel/rcu/Makefile      |   2 +-
 kernel/rcu/sync.c        | 175 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 270 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/rcu_sync.h
 create mode 100644 kernel/rcu/sync.c

diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
new file mode 100644
index 000000000000..cb044df2e21c
--- /dev/null
+++ b/include/linux/rcu_sync.h
@@ -0,0 +1,94 @@
+/*
+ * RCU-based infrastructure for lightweight reader-writer locking
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ *
+ * Copyright (c) 2015, Red Hat, Inc.
+ *
+ * Author: Oleg Nesterov <oleg@redhat.com>
+ */
+
+#ifndef _LINUX_RCU_SYNC_H_
+#define _LINUX_RCU_SYNC_H_
+
+#include <linux/wait.h>
+#include <linux/rcupdate.h>
+
+/* Structure to mediate between updaters and fastpath-using readers.  */
+struct rcu_sync {
+	int			gp_state;
+	int			gp_count;
+	wait_queue_head_t	gp_wait;
+
+	int			cb_state;
+	struct rcu_head		cb_head;
+
+	void (*sync)(void);
+	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
+};
+
+#define ___RCU_SYNC_INIT(name)						\
+	.gp_state = 0,							\
+	.gp_count = 0,							\
+	.gp_wait = __WAIT_QUEUE_HEAD_INITIALIZER(name.gp_wait),		\
+	.cb_state = 0
+
+#define __RCU_SCHED_SYNC_INIT(name) {					\
+	___RCU_SYNC_INIT(name),						\
+	.sync = synchronize_sched,					\
+	.call = call_rcu_sched,						\
+}
+
+#define __RCU_BH_SYNC_INIT(name) {					\
+	___RCU_SYNC_INIT(name),						\
+	.sync = synchronize_rcu_bh,					\
+	.call = call_rcu_bh,						\
+}
+
+#define __RCU_SYNC_INIT(name) {						\
+	___RCU_SYNC_INIT(name),						\
+	.sync = synchronize_rcu,					\
+	.call = call_rcu,						\
+}
+
+#define DEFINE_RCU_SCHED_SYNC(name)					\
+	struct rcu_sync name = __RCU_SCHED_SYNC_INIT(name)
+
+#define DEFINE_RCU_BH_SYNC(name)					\
+	struct rcu_sync name = __RCU_BH_SYNC_INIT(name)
+
+#define DEFINE_RCU_SYNC(name)						\
+	struct rcu_sync name = __RCU_SYNC_INIT(name)
+
+/**
+ * rcu_sync_is_idle() - Are readers permitted to use their fastpaths?
+ * @rsp: Pointer to rcu_sync structure to use for synchronization
+ *
+ * Returns true if readers are permitted to use their fastpaths.
+ * Must be invoked within an RCU read-side critical section whose
+ * flavor matches that of the rcu_sync structure.
+ */
+static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
+{
+	return !rsp->gp_state; /* GP_IDLE */
+}
+
+enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
+
+extern void rcu_sync_init(struct rcu_sync *, enum rcu_sync_type);
+extern void rcu_sync_enter(struct rcu_sync *);
+extern void rcu_sync_exit(struct rcu_sync *);
+
+#endif /* _LINUX_RCU_SYNC_H_ */
diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 50a808424b06..61a16569ffbf 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -1,4 +1,4 @@
-obj-y += update.o
+obj-y += update.o sync.o
 obj-$(CONFIG_SRCU) += srcu.o
 obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
 obj-$(CONFIG_TREE_RCU) += tree.o
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
new file mode 100644
index 000000000000..0a11df43be23
--- /dev/null
+++ b/kernel/rcu/sync.c
@@ -0,0 +1,175 @@
+/*
+ * RCU-based infrastructure for lightweight reader-writer locking
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, you can access it online at
+ * http://www.gnu.org/licenses/gpl-2.0.html.
+ *
+ * Copyright (c) 2015, Red Hat, Inc.
+ *
+ * Author: Oleg Nesterov <oleg@redhat.com>
+ */
+
+#include <linux/rcu_sync.h>
+#include <linux/sched.h>
+
+enum { GP_IDLE = 0, GP_PENDING, GP_PASSED };
+enum { CB_IDLE = 0, CB_PENDING, CB_REPLAY };
+
+#define	rss_lock	gp_wait.lock
+
+/**
+ * rcu_sync_init() - Initialize an rcu_sync structure
+ * @rsp: Pointer to rcu_sync structure to be initialized
+ * @type: Flavor of RCU with which to synchronize rcu_sync structure
+ */
+void rcu_sync_init(struct rcu_sync *rsp, enum rcu_sync_type type)
+{
+	memset(rsp, 0, sizeof(*rsp));
+	init_waitqueue_head(&rsp->gp_wait);
+
+	switch (type) {
+	case RCU_SYNC:
+		rsp->sync = synchronize_rcu;
+		rsp->call = call_rcu;
+		break;
+
+	case RCU_SCHED_SYNC:
+		rsp->sync = synchronize_sched;
+		rsp->call = call_rcu_sched;
+		break;
+
+	case RCU_BH_SYNC:
+		rsp->sync = synchronize_rcu_bh;
+		rsp->call = call_rcu_bh;
+		break;
+	}
+}
+
+/**
+ * rcu_sync_enter() - Force readers onto slowpath
+ * @rsp: Pointer to rcu_sync structure to use for synchronization
+ *
+ * This function is used by updaters who need readers to make use of
+ * a slowpath during the update.  After this function returns, all
+ * subsequent calls to rcu_sync_is_idle() will return false, which
+ * tells readers to stay off their fastpaths.  A later call to
+ * rcu_sync_exit() re-enables reader fastpaths.
+ *
+ * When called in isolation, rcu_sync_enter() must wait for a grace
+ * period; however, closely spaced calls to rcu_sync_enter() can
+ * optimize away the grace-period wait via a state machine implemented
+ * by rcu_sync_enter(), rcu_sync_exit(), and rcu_sync_func().
+ */
+void rcu_sync_enter(struct rcu_sync *rsp)
+{
+	bool need_wait, need_sync;
+
+	spin_lock_irq(&rsp->rss_lock);
+	need_wait = rsp->gp_count++;
+	need_sync = rsp->gp_state == GP_IDLE;
+	if (need_sync)
+		rsp->gp_state = GP_PENDING;
+	spin_unlock_irq(&rsp->rss_lock);
+
+	BUG_ON(need_wait && need_sync);
+
+	if (need_sync) {
+		rsp->sync();
+		rsp->gp_state = GP_PASSED;
+		wake_up_all(&rsp->gp_wait);
+	} else if (need_wait) {
+		wait_event(rsp->gp_wait, rsp->gp_state == GP_PASSED);
+	} else {
+		/*
+		 * Possible when there's a pending CB from a rcu_sync_exit().
+		 * Nobody has yet been allowed the 'fast' path and thus we can
+		 * avoid doing any sync(). The callback will get 'dropped'.
+		 */
+		BUG_ON(rsp->gp_state != GP_PASSED);
+	}
+}
+
+/**
+ * rcu_sync_func() - Callback function managing reader access to fastpath
+ * @rsp: Pointer to rcu_sync structure to use for synchronization
+ *
+ * This function is passed to one of the call_rcu() functions by
+ * rcu_sync_exit(), so that it is invoked after a grace period following
+ * that invocation of rcu_sync_exit().  It takes action based on events that
+ * have taken place in the meantime, so that closely spaced rcu_sync_enter()
+ * and rcu_sync_exit() pairs need not wait for a grace period.
+ *
+ * If another rcu_sync_enter() is invoked before the grace period
+ * ended, reset state to allow the next rcu_sync_exit() to let the
+ * readers back onto their fastpaths (after a grace period).  If both
+ * another rcu_sync_enter() and its matching rcu_sync_exit() are invoked
+ * before the grace period ended, re-invoke call_rcu() on behalf of that
+ * rcu_sync_exit().  Otherwise, set all state back to idle so that readers
+ * can again use their fastpaths.
+ */
+static void rcu_sync_func(struct rcu_head *rcu)
+{
+	struct rcu_sync *rsp = container_of(rcu, struct rcu_sync, cb_head);
+	unsigned long flags;
+
+	BUG_ON(rsp->gp_state != GP_PASSED);
+	BUG_ON(rsp->cb_state == CB_IDLE);
+
+	spin_lock_irqsave(&rsp->rss_lock, flags);
+	if (rsp->gp_count) {
+		/*
+		 * A new rcu_sync_enter() has happened; drop the callback.
+		 */
+		rsp->cb_state = CB_IDLE;
+	} else if (rsp->cb_state == CB_REPLAY) {
+		/*
+		 * A new rcu_sync_exit() has happened; requeue the callback
+		 * to catch a later GP.
+		 */
+		rsp->cb_state = CB_PENDING;
+		rsp->call(&rsp->cb_head, rcu_sync_func);
+	} else {
+		/*
+		 * We're at least a GP after rcu_sync_exit(); everybody will now
+		 * have observed the write side critical section. Let 'em rip!
+		 */
+		rsp->cb_state = CB_IDLE;
+		rsp->gp_state = GP_IDLE;
+	}
+	spin_unlock_irqrestore(&rsp->rss_lock, flags);
+}
+
+/**
+ * rcu_sync_exit() - Allow readers back onto fastpath after grace period
+ * @rsp: Pointer to rcu_sync structure to use for synchronization
+ *
+ * This function is used by updaters who have completed, and can therefore
+ * now allow readers to make use of their fastpaths after a grace period
+ * has elapsed.  After this grace period has completed, all subsequent
+ * calls to rcu_sync_is_idle() will return true, which tells readers that
+ * they can once again use their fastpaths.
+ */
+void rcu_sync_exit(struct rcu_sync *rsp)
+{
+	spin_lock_irq(&rsp->rss_lock);
+	if (!--rsp->gp_count) {
+		if (rsp->cb_state == CB_IDLE) {
+			rsp->cb_state = CB_PENDING;
+			rsp->call(&rsp->cb_head, rcu_sync_func);
+		} else if (rsp->cb_state == CB_PENDING) {
+			rsp->cb_state = CB_REPLAY;
+		}
+	}
+	spin_unlock_irq(&rsp->rss_lock);
+}
-- 
2.5.2



* [PATCH tip/core/rcu 06/13] rcu_sync: Simplify rcu_sync using new rcu_sync_ops structure
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (3 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 05/13] rcu: Create rcu_sync infrastructure Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 07/13] rcu_sync: Add CONFIG_PROVE_RCU checks Paul E. McKenney
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

This commit adds the new struct rcu_sync_ops which holds sync/call
methods, and turns the function pointers in rcu_sync_struct into an array
of struct rcu_sync_ops.  This simplifies the "init" helpers by collapsing
a switch statement and explicit multiple definitions into a simple
assignment and a helper macro, respectively.
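
As a rough illustration of the table-driven approach described above,
here is a minimal userspace sketch, not the kernel code.  The names
demo_sync_type, demo_ops, demo_obj and the stub flavor functions are
hypothetical stand-ins; only the idea of indexing a const array of
function pointers by an enum is taken from this patch.

```c
#include <assert.h>

/* Hypothetical flavors standing in for RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC. */
enum demo_sync_type { DEMO_SYNC, DEMO_SCHED_SYNC, DEMO_BH_SYNC };

/* Records which stub ran last, so that callers can observe the dispatch. */
static int demo_last_sync = -1;

static void demo_sync(void)       { demo_last_sync = DEMO_SYNC; }
static void demo_sync_sched(void) { demo_last_sync = DEMO_SCHED_SYNC; }
static void demo_sync_bh(void)    { demo_last_sync = DEMO_BH_SYNC; }

/* One table entry per flavor, selected by designated initializer. */
static const struct {
	void (*sync)(void);
} demo_ops[] = {
	[DEMO_SYNC]       = { .sync = demo_sync },
	[DEMO_SCHED_SYNC] = { .sync = demo_sync_sched },
	[DEMO_BH_SYNC]    = { .sync = demo_sync_bh },
};

/* The object stores only the flavor, not the function pointers. */
struct demo_obj {
	enum demo_sync_type gp_type;
};

/* Init shrinks to a single assignment; no switch statement is needed. */
static void demo_obj_init(struct demo_obj *obj, enum demo_sync_type type)
{
	obj->gp_type = type;
}

/* Dispatch through the shared table. */
static void demo_obj_sync(struct demo_obj *obj)
{
	demo_ops[obj->gp_type].sync();
}
```

Storing the enum rather than the pointers also keeps static
initializers flavor-agnostic, which is presumably why a single
initializer macro taking a type argument suffices.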

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcu_sync.h | 60 +++++++++++++++++++-----------------------------
 kernel/rcu/sync.c        | 42 +++++++++++++++++----------------
 2 files changed, 45 insertions(+), 57 deletions(-)

diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
index cb044df2e21c..c6d2272c4459 100644
--- a/include/linux/rcu_sync.h
+++ b/include/linux/rcu_sync.h
@@ -26,6 +26,8 @@
 #include <linux/wait.h>
 #include <linux/rcupdate.h>
 
+enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
+
 /* Structure to mediate between updaters and fastpath-using readers.  */
 struct rcu_sync {
 	int			gp_state;
@@ -35,43 +37,9 @@ struct rcu_sync {
 	int			cb_state;
 	struct rcu_head		cb_head;
 
-	void (*sync)(void);
-	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
+	enum rcu_sync_type	gp_type;
 };
 
-#define ___RCU_SYNC_INIT(name)						\
-	.gp_state = 0,							\
-	.gp_count = 0,							\
-	.gp_wait = __WAIT_QUEUE_HEAD_INITIALIZER(name.gp_wait),		\
-	.cb_state = 0
-
-#define __RCU_SCHED_SYNC_INIT(name) {					\
-	___RCU_SYNC_INIT(name),						\
-	.sync = synchronize_sched,					\
-	.call = call_rcu_sched,						\
-}
-
-#define __RCU_BH_SYNC_INIT(name) {					\
-	___RCU_SYNC_INIT(name),						\
-	.sync = synchronize_rcu_bh,					\
-	.call = call_rcu_bh,						\
-}
-
-#define __RCU_SYNC_INIT(name) {						\
-	___RCU_SYNC_INIT(name),						\
-	.sync = synchronize_rcu,					\
-	.call = call_rcu,						\
-}
-
-#define DEFINE_RCU_SCHED_SYNC(name)					\
-	struct rcu_sync name = __RCU_SCHED_SYNC_INIT(name)
-
-#define DEFINE_RCU_BH_SYNC(name)					\
-	struct rcu_sync name = __RCU_BH_SYNC_INIT(name)
-
-#define DEFINE_RCU_SYNC(name)						\
-	struct rcu_sync name = __RCU_SYNC_INIT(name)
-
 /**
  * rcu_sync_is_idle() - Are readers permitted to use their fastpaths?
  * @rsp: Pointer to rcu_sync structure to use for synchronization
@@ -85,10 +53,28 @@ static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
 	return !rsp->gp_state; /* GP_IDLE */
 }
 
-enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };
-
 extern void rcu_sync_init(struct rcu_sync *, enum rcu_sync_type);
 extern void rcu_sync_enter(struct rcu_sync *);
 extern void rcu_sync_exit(struct rcu_sync *);
 
+#define __RCU_SYNC_INITIALIZER(name, type) {				\
+		.gp_state = 0,						\
+		.gp_count = 0,						\
+		.gp_wait = __WAIT_QUEUE_HEAD_INITIALIZER(name.gp_wait),	\
+		.cb_state = 0,						\
+		.gp_type = type,					\
+	}
+
+#define	__DEFINE_RCU_SYNC(name, type)	\
+	struct rcu_sync name = __RCU_SYNC_INITIALIZER(name, type)
+
+#define DEFINE_RCU_SYNC(name)		\
+	__DEFINE_RCU_SYNC(name, RCU_SYNC)
+
+#define DEFINE_RCU_SCHED_SYNC(name)	\
+	__DEFINE_RCU_SYNC(name, RCU_SCHED_SYNC)
+
+#define DEFINE_RCU_BH_SYNC(name)	\
+	__DEFINE_RCU_SYNC(name, RCU_BH_SYNC)
+
 #endif /* _LINUX_RCU_SYNC_H_ */
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 0a11df43be23..5a9aa4c394f1 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -23,6 +23,24 @@
 #include <linux/rcu_sync.h>
 #include <linux/sched.h>
 
+static const struct {
+	void (*sync)(void);
+	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
+} gp_ops[] = {
+	[RCU_SYNC] = {
+		.sync = synchronize_rcu,
+		.call = call_rcu,
+	},
+	[RCU_SCHED_SYNC] = {
+		.sync = synchronize_sched,
+		.call = call_rcu_sched,
+	},
+	[RCU_BH_SYNC] = {
+		.sync = synchronize_rcu_bh,
+		.call = call_rcu_bh,
+	},
+};
+
 enum { GP_IDLE = 0, GP_PENDING, GP_PASSED };
 enum { CB_IDLE = 0, CB_PENDING, CB_REPLAY };
 
@@ -37,23 +55,7 @@ void rcu_sync_init(struct rcu_sync *rsp, enum rcu_sync_type type)
 {
 	memset(rsp, 0, sizeof(*rsp));
 	init_waitqueue_head(&rsp->gp_wait);
-
-	switch (type) {
-	case RCU_SYNC:
-		rsp->sync = synchronize_rcu;
-		rsp->call = call_rcu;
-		break;
-
-	case RCU_SCHED_SYNC:
-		rsp->sync = synchronize_sched;
-		rsp->call = call_rcu_sched;
-		break;
-
-	case RCU_BH_SYNC:
-		rsp->sync = synchronize_rcu_bh;
-		rsp->call = call_rcu_bh;
-		break;
-	}
+	rsp->gp_type = type;
 }
 
 /**
@@ -85,7 +87,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
 	BUG_ON(need_wait && need_sync);
 
 	if (need_sync) {
-		rsp->sync();
+		gp_ops[rsp->gp_type].sync();
 		rsp->gp_state = GP_PASSED;
 		wake_up_all(&rsp->gp_wait);
 	} else if (need_wait) {
@@ -138,7 +140,7 @@ static void rcu_sync_func(struct rcu_head *rcu)
 		 * to catch a later GP.
 		 */
 		rsp->cb_state = CB_PENDING;
-		rsp->call(&rsp->cb_head, rcu_sync_func);
+		gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
 	} else {
 		/*
 		 * We're at least a GP after rcu_sync_exit(); everybody will now
@@ -166,7 +168,7 @@ void rcu_sync_exit(struct rcu_sync *rsp)
 	if (!--rsp->gp_count) {
 		if (rsp->cb_state == CB_IDLE) {
 			rsp->cb_state = CB_PENDING;
-			rsp->call(&rsp->cb_head, rcu_sync_func);
+			gp_ops[rsp->gp_type].call(&rsp->cb_head, rcu_sync_func);
 		} else if (rsp->cb_state == CB_PENDING) {
 			rsp->cb_state = CB_REPLAY;
 		}
-- 
2.5.2



* [PATCH tip/core/rcu 07/13] rcu_sync: Add CONFIG_PROVE_RCU checks
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (4 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 06/13] rcu_sync: Simplify rcu_sync using new rcu_sync_ops structure Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 08/13] rcu_sync: Introduce rcu_sync_dtor() Paul E. McKenney
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

This commit validates that the caller of rcu_sync_is_idle() holds the
corresponding type of RCU read-side lock, but only in kernels built
with CONFIG_PROVE_RCU=y.  This validation is carried out via a new
rcu_sync_ops->held() method that is checked within rcu_sync_is_idle().

Note that although this does add code to the fast path, it only does so
in kernels built with CONFIG_PROVE_RCU=y.
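
The compile-time gating can be pictured with the following userspace
sketch.  It is an assumption-laden model rather than the kernel code:
DEMO_PROVE is a hypothetical stand-in for CONFIG_PROVE_RCU, and
demo_read_held(), demo_in_reader and demo_warnings replace the real
lockdep query and WARN_ON().

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PROVE 1	/* hypothetical stand-in for CONFIG_PROVE_RCU */

#if DEMO_PROVE
#define DEMO_INIT_HELD(func)	.held = func,
#else
#define DEMO_INIT_HELD(func)
#endif

static int demo_in_reader;	/* nonzero while "inside" a read-side section */
static int demo_warnings;	/* counts checks that failed */

static int demo_read_held(void)
{
	return demo_in_reader;
}

/* The ->held member and its initializer exist only in checking builds. */
static const struct {
	const char *name;
#if DEMO_PROVE
	int (*held)(void);
#endif
} demo_ops = {
	.name = "demo",
	DEMO_INIT_HELD(demo_read_held)
};

/* Returns true when the state is idle (0); checks the caller if enabled. */
static bool demo_is_idle(int gp_state)
{
#if DEMO_PROVE
	if (!demo_ops.held())
		demo_warnings++;	/* the patch uses WARN_ON() here */
#endif
	return gp_state == 0;
}
```

With DEMO_PROVE set to 0, both the member and the check compile away,
so the fastpath reduces to the comparison alone.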

Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcu_sync.h |  6 ++++++
 kernel/rcu/sync.c        | 20 ++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
index c6d2272c4459..1f2d4fc30b04 100644
--- a/include/linux/rcu_sync.h
+++ b/include/linux/rcu_sync.h
@@ -40,6 +40,8 @@ struct rcu_sync {
 	enum rcu_sync_type	gp_type;
 };
 
+extern bool __rcu_sync_is_idle(struct rcu_sync *);
+
 /**
  * rcu_sync_is_idle() - Are readers permitted to use their fastpaths?
  * @rsp: Pointer to rcu_sync structure to use for synchronization
@@ -50,7 +52,11 @@ struct rcu_sync {
  */
 static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
 {
+#ifdef CONFIG_PROVE_RCU
+	return __rcu_sync_is_idle(rsp);
+#else
 	return !rsp->gp_state; /* GP_IDLE */
+#endif
 }
 
 extern void rcu_sync_init(struct rcu_sync *, enum rcu_sync_type);
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 5a9aa4c394f1..01c9807a7f73 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -23,21 +23,33 @@
 #include <linux/rcu_sync.h>
 #include <linux/sched.h>
 
+#ifdef CONFIG_PROVE_RCU
+#define __INIT_HELD(func)	.held = func,
+#else
+#define __INIT_HELD(func)
+#endif
+
 static const struct {
 	void (*sync)(void);
 	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
+#ifdef CONFIG_PROVE_RCU
+	int  (*held)(void);
+#endif
 } gp_ops[] = {
 	[RCU_SYNC] = {
 		.sync = synchronize_rcu,
 		.call = call_rcu,
+		__INIT_HELD(rcu_read_lock_held)
 	},
 	[RCU_SCHED_SYNC] = {
 		.sync = synchronize_sched,
 		.call = call_rcu_sched,
+		__INIT_HELD(rcu_read_lock_sched_held)
 	},
 	[RCU_BH_SYNC] = {
 		.sync = synchronize_rcu_bh,
 		.call = call_rcu_bh,
+		__INIT_HELD(rcu_read_lock_bh_held)
 	},
 };
 
@@ -46,6 +58,14 @@ enum { CB_IDLE = 0, CB_PENDING, CB_REPLAY };
 
 #define	rss_lock	gp_wait.lock
 
+#ifdef CONFIG_PROVE_RCU
+bool __rcu_sync_is_idle(struct rcu_sync *rsp)
+{
+	WARN_ON(!gp_ops[rsp->gp_type].held());
+	return rsp->gp_state == GP_IDLE;
+}
+#endif
+
 /**
  * rcu_sync_init() - Initialize an rcu_sync structure
  * @rsp: Pointer to rcu_sync structure to be initialized
-- 
2.5.2



* [PATCH tip/core/rcu 08/13] rcu_sync: Introduce rcu_sync_dtor()
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (5 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 07/13] rcu_sync: Add CONFIG_PROVE_RCU checks Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 09/13] locking/percpu-rwsem: Make percpu_free_rwsem() after kzalloc() safe Paul E. McKenney
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

This commit allows rcu_sync structures to be safely deallocated.
The trick is to add a new ->wait field to the gp_ops array.
This field is a pointer to the rcu_barrier() function corresponding
to the flavor of RCU in question.  This allows a new rcu_sync_dtor()
to wait for any outstanding callbacks before freeing the rcu_sync
structure.
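
The reason for the barrier can be seen in a small single-threaded
model, sketched below under stated assumptions.  None of this is
kernel code: demo_call() and demo_barrier() are hypothetical
stand-ins for call_rcu() and rcu_barrier(), and the barrier here
simply runs the queued callbacks rather than waiting for them.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_CB_IDLE = 0, DEMO_CB_PENDING };

/* Hypothetical stand-in for struct rcu_head. */
struct demo_head {
	struct demo_head *next;
	void (*func)(struct demo_head *);
};

static struct demo_head *demo_pending;	/* callbacks not yet invoked */

/* Models call_rcu(): queue a callback for later invocation. */
static void demo_call(struct demo_head *head, void (*func)(struct demo_head *))
{
	head->func = func;
	head->next = demo_pending;
	demo_pending = head;
}

/* Models rcu_barrier(): return only once every queued callback has run. */
static void demo_barrier(void)
{
	while (demo_pending) {
		struct demo_head *head = demo_pending;

		demo_pending = head->next;
		head->func(head);
	}
}

struct demo_sync {
	int cb_state;
	struct demo_head cb_head;	/* embedded, so freeing early is unsafe */
};

/* The callback touches the enclosing structure, like rcu_sync_func(). */
static void demo_sync_func(struct demo_head *head)
{
	struct demo_sync *s = (struct demo_sync *)
		((char *)head - offsetof(struct demo_sync, cb_head));

	s->cb_state = DEMO_CB_IDLE;
}

static void demo_sync_exit(struct demo_sync *s)
{
	if (s->cb_state == DEMO_CB_IDLE) {
		s->cb_state = DEMO_CB_PENDING;
		demo_call(&s->cb_head, demo_sync_func);
	}
}

/* Only after this returns may the caller free the structure. */
static void demo_sync_dtor(struct demo_sync *s)
{
	if (s->cb_state != DEMO_CB_IDLE) {
		demo_barrier();
		assert(s->cb_state == DEMO_CB_IDLE);
	}
}
```

Because the callback head is embedded in the structure, freeing the
structure while a callback is still queued would leave the queue
pointing at freed memory; the barrier closes that window.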

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcu_sync.h |  1 +
 kernel/rcu/sync.c        | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
index 1f2d4fc30b04..8069d6468bc4 100644
--- a/include/linux/rcu_sync.h
+++ b/include/linux/rcu_sync.h
@@ -62,6 +62,7 @@ static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
 extern void rcu_sync_init(struct rcu_sync *, enum rcu_sync_type);
 extern void rcu_sync_enter(struct rcu_sync *);
 extern void rcu_sync_exit(struct rcu_sync *);
+extern void rcu_sync_dtor(struct rcu_sync *);
 
 #define __RCU_SYNC_INITIALIZER(name, type) {				\
 		.gp_state = 0,						\
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 01c9807a7f73..1e353f0a2b66 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -32,6 +32,7 @@
 static const struct {
 	void (*sync)(void);
 	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
+	void (*wait)(void);
 #ifdef CONFIG_PROVE_RCU
 	int  (*held)(void);
 #endif
@@ -39,16 +40,19 @@ static const struct {
 	[RCU_SYNC] = {
 		.sync = synchronize_rcu,
 		.call = call_rcu,
+		.wait = rcu_barrier,
 		__INIT_HELD(rcu_read_lock_held)
 	},
 	[RCU_SCHED_SYNC] = {
 		.sync = synchronize_sched,
 		.call = call_rcu_sched,
+		.wait = rcu_barrier_sched,
 		__INIT_HELD(rcu_read_lock_sched_held)
 	},
 	[RCU_BH_SYNC] = {
 		.sync = synchronize_rcu_bh,
 		.call = call_rcu_bh,
+		.wait = rcu_barrier_bh,
 		__INIT_HELD(rcu_read_lock_bh_held)
 	},
 };
@@ -195,3 +199,25 @@ void rcu_sync_exit(struct rcu_sync *rsp)
 	}
 	spin_unlock_irq(&rsp->rss_lock);
 }
+
+/**
+ * rcu_sync_dtor() - Clean up an rcu_sync structure
+ * @rsp: Pointer to rcu_sync structure to be cleaned up
+ */
+void rcu_sync_dtor(struct rcu_sync *rsp)
+{
+	int cb_state;
+
+	BUG_ON(rsp->gp_count);
+
+	spin_lock_irq(&rsp->rss_lock);
+	if (rsp->cb_state == CB_REPLAY)
+		rsp->cb_state = CB_PENDING;
+	cb_state = rsp->cb_state;
+	spin_unlock_irq(&rsp->rss_lock);
+
+	if (cb_state != CB_IDLE) {
+		gp_ops[rsp->gp_type].wait();
+		BUG_ON(rsp->cb_state != CB_IDLE);
+	}
+}
-- 
2.5.2



* [PATCH tip/core/rcu 09/13] locking/percpu-rwsem: Make percpu_free_rwsem() after kzalloc() safe
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (6 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 08/13] rcu_sync: Introduce rcu_sync_dtor() Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 10/13] locking/percpu-rwsem: Make use of the rcu_sync infrastructure Paul E. McKenney
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

This is a temporary ugly hack that will be reverted later. We only
need it to ensure that the next patch will not break "change sb_writers
to use percpu_rw_semaphore" patches routed via the VFS tree.

The alloc_super()->destroy_super() error path assumes that it is safe
to call percpu_free_rwsem() after kzalloc() without percpu_init_rwsem(),
so let's not disappoint it.
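
The pattern amounts to treating an all-zero structure as "never
initialized".  A minimal userspace sketch follows; demo_sem,
demo_init() and demo_free() are hypothetical names, with calloc() and
free() standing in for kzalloc(), alloc_percpu() and free_percpu().

```c
#include <assert.h>
#include <stdlib.h>

struct demo_sem {
	unsigned int *fast_read_ctr;	/* NULL until demo_init() succeeds */
};

/* Returns 0 on success, -1 if the counter allocation failed. */
static int demo_init(struct demo_sem *sem)
{
	sem->fast_read_ctr = calloc(4, sizeof(*sem->fast_read_ctr));
	return sem->fast_read_ctr ? 0 : -1;
}

/*
 * Safe on a zeroed, never-initialized structure: returns 0 when there
 * was nothing to free and 1 when the counters were released.
 */
static int demo_free(struct demo_sem *sem)
{
	if (!sem->fast_read_ctr)
		return 0;

	free(sem->fast_read_ctr);
	sem->fast_read_ctr = NULL;	/* catch use-after-free bugs */
	return 1;
}
```

An error path can then call the free routine unconditionally,
whether or not initialization was ever reached.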

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/locking/percpu-rwsem.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index e2621fbbcbd1..9529a30ec57b 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -26,6 +26,13 @@ EXPORT_SYMBOL_GPL(__percpu_init_rwsem);
 
 void percpu_free_rwsem(struct percpu_rw_semaphore *brw)
 {
+	/*
+	 * XXX: temporary kludge. The error path in alloc_super()
+	 * assumes that percpu_free_rwsem() is safe after kzalloc().
+	 */
+	if (!brw->fast_read_ctr)
+		return;
+
 	free_percpu(brw->fast_read_ctr);
 	brw->fast_read_ctr = NULL; /* catch use after free bugs */
 }
-- 
2.5.2



* [PATCH tip/core/rcu 10/13] locking/percpu-rwsem: Make use of the rcu_sync infrastructure
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (7 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 09/13] locking/percpu-rwsem: Make percpu_free_rwsem() after kzalloc() safe Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 11/13] locking/percpu-rwsem: Fix the comments outdated by rcu_sync Paul E. McKenney
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

Currently down_write/up_write calls synchronize_sched_expedited()
twice, which is evil.  Change this code to rely on rcu-sync primitives.
This avoids the _expedited "big hammer", and this can be faster in
the contended case or even in the case when a single thread does
down_write/up_write in a loop.
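
To see why the loop case can be cheaper, consider this single-threaded
model of the rcu_sync state machine.  It is only a sketch: the demo_
names are hypothetical, locking and wait queues are omitted, the
GP_PENDING state is collapsed away, nr_syncs counts what would be
blocking grace-period waits, and demo_gp_elapsed() stands in for the
RCU callback being invoked.

```c
#include <assert.h>
#include <stdbool.h>

enum { DEMO_GP_IDLE = 0, DEMO_GP_PASSED };
enum { DEMO_CB_IDLE = 0, DEMO_CB_PENDING, DEMO_CB_REPLAY };

struct demo_rss {
	int gp_state;
	int gp_count;
	int cb_state;
	int nr_syncs;	/* number of blocking grace-period waits */
};

/* Readers may use their fastpath only while this returns true. */
static bool demo_is_idle(const struct demo_rss *r)
{
	return r->gp_state == DEMO_GP_IDLE;
}

/* Writer entry: waits for a grace period only when starting from idle. */
static void demo_enter(struct demo_rss *r)
{
	r->gp_count++;
	if (r->gp_state == DEMO_GP_IDLE) {
		r->nr_syncs++;		/* stands in for the blocking sync() */
		r->gp_state = DEMO_GP_PASSED;
	}
}

/* Writer exit: never blocks, it only arranges for a callback. */
static void demo_exit(struct demo_rss *r)
{
	if (--r->gp_count)
		return;
	if (r->cb_state == DEMO_CB_IDLE)
		r->cb_state = DEMO_CB_PENDING;	/* callback queued */
	else if (r->cb_state == DEMO_CB_PENDING)
		r->cb_state = DEMO_CB_REPLAY;	/* ask for one more GP */
}

/* Models the callback running after a grace period has elapsed. */
static void demo_gp_elapsed(struct demo_rss *r)
{
	if (r->cb_state == DEMO_CB_IDLE)
		return;				/* nothing was queued */
	if (r->gp_count) {
		r->cb_state = DEMO_CB_IDLE;	/* a writer is back; drop it */
	} else if (r->cb_state == DEMO_CB_REPLAY) {
		r->cb_state = DEMO_CB_PENDING;	/* requeue for a later GP */
	} else {
		r->cb_state = DEMO_CB_IDLE;
		r->gp_state = DEMO_GP_IDLE;	/* readers get the fastpath back */
	}
}
```

In this model, three back-to-back enter/exit pairs incur one blocking
wait in total, where the old code would have issued six expedited
grace periods.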

Of course, a single down_write() will take more time, but otoh it
will be much more friendly to the whole system.

To simplify review, this patch does not update the comments; the next
change fixes them.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/percpu-rwsem.h  |  3 ++-
 kernel/locking/percpu-rwsem.c | 18 +++++++-----------
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h
index 834c4e52cb2d..c2fa3ecb0dce 100644
--- a/include/linux/percpu-rwsem.h
+++ b/include/linux/percpu-rwsem.h
@@ -5,11 +5,12 @@
 #include <linux/rwsem.h>
 #include <linux/percpu.h>
 #include <linux/wait.h>
+#include <linux/rcu_sync.h>
 #include <linux/lockdep.h>
 
 struct percpu_rw_semaphore {
+	struct rcu_sync		rss;
 	unsigned int __percpu	*fast_read_ctr;
-	atomic_t		write_ctr;
 	struct rw_semaphore	rw_sem;
 	atomic_t		slow_read_ctr;
 	wait_queue_head_t	write_waitq;
diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index 9529a30ec57b..183a71151ac0 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -17,7 +17,7 @@ int __percpu_init_rwsem(struct percpu_rw_semaphore *brw,
 
 	/* ->rw_sem represents the whole percpu_rw_semaphore for lockdep */
 	__init_rwsem(&brw->rw_sem, name, rwsem_key);
-	atomic_set(&brw->write_ctr, 0);
+	rcu_sync_init(&brw->rss, RCU_SCHED_SYNC);
 	atomic_set(&brw->slow_read_ctr, 0);
 	init_waitqueue_head(&brw->write_waitq);
 	return 0;
@@ -33,6 +33,7 @@ void percpu_free_rwsem(struct percpu_rw_semaphore *brw)
 	if (!brw->fast_read_ctr)
 		return;
 
+	rcu_sync_dtor(&brw->rss);
 	free_percpu(brw->fast_read_ctr);
 	brw->fast_read_ctr = NULL; /* catch use after free bugs */
 }
@@ -62,13 +63,12 @@ void percpu_free_rwsem(struct percpu_rw_semaphore *brw)
  */
 static bool update_fast_ctr(struct percpu_rw_semaphore *brw, unsigned int val)
 {
-	bool success = false;
+	bool success;
 
 	preempt_disable();
-	if (likely(!atomic_read(&brw->write_ctr))) {
+	success = rcu_sync_is_idle(&brw->rss);
+	if (likely(success))
 		__this_cpu_add(*brw->fast_read_ctr, val);
-		success = true;
-	}
 	preempt_enable();
 
 	return success;
@@ -149,8 +149,6 @@ static int clear_fast_ctr(struct percpu_rw_semaphore *brw)
  */
 void percpu_down_write(struct percpu_rw_semaphore *brw)
 {
-	/* tell update_fast_ctr() there is a pending writer */
-	atomic_inc(&brw->write_ctr);
 	/*
 	 * 1. Ensures that write_ctr != 0 is visible to any down_read/up_read
 	 *    so that update_fast_ctr() can't succeed.
@@ -162,7 +160,7 @@ void percpu_down_write(struct percpu_rw_semaphore *brw)
 	 *    fast-path, it executes a full memory barrier before we return.
 	 *    See R_W case in the comment above update_fast_ctr().
 	 */
-	synchronize_sched_expedited();
+	rcu_sync_enter(&brw->rss);
 
 	/* exclude other writers, and block the new readers completely */
 	down_write(&brw->rw_sem);
@@ -183,8 +181,6 @@ void percpu_up_write(struct percpu_rw_semaphore *brw)
 	 * Insert the barrier before the next fast-path in down_read,
 	 * see W_R case in the comment above update_fast_ctr().
 	 */
-	synchronize_sched_expedited();
-	/* the last writer unblocks update_fast_ctr() */
-	atomic_dec(&brw->write_ctr);
+	rcu_sync_exit(&brw->rss);
 }
 EXPORT_SYMBOL_GPL(percpu_up_write);
-- 
2.5.2



* [PATCH tip/core/rcu 11/13] locking/percpu-rwsem: Fix the comments outdated by rcu_sync
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (8 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 10/13] locking/percpu-rwsem: Make use of the rcu_sync infrastructure Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 12/13] locking/percpu-rwsem: Clean up the lockdep annotations in percpu_down_read() Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 13/13] rcu_sync: Cleanup the CONFIG_PROVE_RCU checks Paul E. McKenney
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

Update the comments broken by the previous change.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/locking/percpu-rwsem.c | 50 ++++++++++---------------------------------
 1 file changed, 11 insertions(+), 39 deletions(-)

diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index 183a71151ac0..02a726dd9adc 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -39,27 +39,12 @@ void percpu_free_rwsem(struct percpu_rw_semaphore *brw)
 }
 
 /*
- * This is the fast-path for down_read/up_read, it only needs to ensure
- * there is no pending writer (atomic_read(write_ctr) == 0) and inc/dec the
- * fast per-cpu counter. The writer uses synchronize_sched_expedited() to
- * serialize with the preempt-disabled section below.
- *
- * The nontrivial part is that we should guarantee acquire/release semantics
- * in case when
- *
- *	R_W: down_write() comes after up_read(), the writer should see all
- *	     changes done by the reader
- * or
- *	W_R: down_read() comes after up_write(), the reader should see all
- *	     changes done by the writer
+ * This is the fast-path for down_read/up_read. If it succeeds we rely
+ * on the barriers provided by rcu_sync_enter/exit; see the comments in
+ * percpu_down_write() and percpu_up_write().
  *
  * If this helper fails the callers rely on the normal rw_semaphore and
  * atomic_dec_and_test(), so in this case we have the necessary barriers.
- *
- * But if it succeeds we do not have any barriers, atomic_read(write_ctr) or
- * __this_cpu_add() below can be reordered with any LOAD/STORE done by the
- * reader inside the critical section. See the comments in down_write and
- * up_write below.
  */
 static bool update_fast_ctr(struct percpu_rw_semaphore *brw, unsigned int val)
 {
@@ -136,29 +121,15 @@ static int clear_fast_ctr(struct percpu_rw_semaphore *brw)
 	return sum;
 }
 
-/*
- * A writer increments ->write_ctr to force the readers to switch to the
- * slow mode, note the atomic_read() check in update_fast_ctr().
- *
- * After that the readers can only inc/dec the slow ->slow_read_ctr counter,
- * ->fast_read_ctr is stable. Once the writer moves its sum into the slow
- * counter it represents the number of active readers.
- *
- * Finally the writer takes ->rw_sem for writing and blocks the new readers,
- * then waits until the slow counter becomes zero.
- */
 void percpu_down_write(struct percpu_rw_semaphore *brw)
 {
 	/*
-	 * 1. Ensures that write_ctr != 0 is visible to any down_read/up_read
-	 *    so that update_fast_ctr() can't succeed.
-	 *
-	 * 2. Ensures we see the result of every previous this_cpu_add() in
-	 *    update_fast_ctr().
+	 * Make rcu_sync_is_idle() == F and thus disable the fast-path in
+	 * percpu_down_read() and percpu_up_read(), and wait for gp pass.
 	 *
-	 * 3. Ensures that if any reader has exited its critical section via
-	 *    fast-path, it executes a full memory barrier before we return.
-	 *    See R_W case in the comment above update_fast_ctr().
+	 * The latter synchronises us with the preceding readers which used
+	 * the fast-path, so we can not miss the result of __this_cpu_add()
+	 * or anything else inside their critical sections.
 	 */
 	rcu_sync_enter(&brw->rss);
 
@@ -178,8 +149,9 @@ void percpu_up_write(struct percpu_rw_semaphore *brw)
 	/* release the lock, but the readers can't use the fast-path */
 	up_write(&brw->rw_sem);
 	/*
-	 * Insert the barrier before the next fast-path in down_read,
-	 * see W_R case in the comment above update_fast_ctr().
+	 * Enable the fast-path in percpu_down_read() and percpu_up_read()
+	 * but only after another gp pass; this adds the necessary barrier
+	 * to ensure the reader can't miss the changes done by us.
 	 */
 	rcu_sync_exit(&brw->rss);
 }
-- 
2.5.2



* [PATCH tip/core/rcu 12/13] locking/percpu-rwsem: Clean up the lockdep annotations in percpu_down_read()
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (9 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 11/13] locking/percpu-rwsem: Fix the comments outdated by rcu_sync Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  2015-10-06 16:45   ` [PATCH tip/core/rcu 13/13] rcu_sync: Cleanup the CONFIG_PROVE_RCU checks Paul E. McKenney
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

Based on Peter Zijlstra's earlier patch.

Change percpu_down_read() to use __down_read(); this way we can
do rwsem_acquire_read() unconditionally at the start to make this
code more symmetric and clean.

Originally-From: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/locking/percpu-rwsem.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
index 02a726dd9adc..f231e0bb311c 100644
--- a/kernel/locking/percpu-rwsem.c
+++ b/kernel/locking/percpu-rwsem.c
@@ -70,14 +70,14 @@ static bool update_fast_ctr(struct percpu_rw_semaphore *brw, unsigned int val)
 void percpu_down_read(struct percpu_rw_semaphore *brw)
 {
 	might_sleep();
-	if (likely(update_fast_ctr(brw, +1))) {
-		rwsem_acquire_read(&brw->rw_sem.dep_map, 0, 0, _RET_IP_);
+	rwsem_acquire_read(&brw->rw_sem.dep_map, 0, 0, _RET_IP_);
+
+	if (likely(update_fast_ctr(brw, +1)))
 		return;
-	}
 
-	down_read(&brw->rw_sem);
+	/* Avoid rwsem_acquire_read() and rwsem_release() */
+	__down_read(&brw->rw_sem);
 	atomic_inc(&brw->slow_read_ctr);
-	/* avoid up_read()->rwsem_release() */
 	__up_read(&brw->rw_sem);
 }
 EXPORT_SYMBOL_GPL(percpu_down_read);
-- 
2.5.2



* [PATCH tip/core/rcu 13/13] rcu_sync: Cleanup the CONFIG_PROVE_RCU checks
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
                     ` (10 preceding siblings ...)
  2015-10-06 16:45   ` [PATCH tip/core/rcu 12/13] locking/percpu-rwsem: Clean up the lockdep annotations in percpu_down_read() Paul E. McKenney
@ 2015-10-06 16:45   ` Paul E. McKenney
  11 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 16:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh,
	tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
	oleg, bobby.prani, Paul E. McKenney

From: Oleg Nesterov <oleg@redhat.com>

1. Rename __rcu_sync_is_idle() to rcu_sync_lockdep_assert() and
   change it to use RCU_LOCKDEP_WARN().

2. Change rcu_sync_is_idle() to return rsp->gp_state == GP_IDLE
   unconditionally.  This way we can remove the same check from
   rcu_sync_lockdep_assert() and clearly isolate the debugging
   code.

Note: rcu_sync_enter()->wait_event(gp_state == GP_PASSED) needs
another CONFIG_PROVE_RCU check, the same as is done in ->sync(); but
this needs some simple preparations in the core RCU code to avoid the
code duplication.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rcu_sync.h | 7 +++----
 kernel/rcu/sync.c        | 6 +++---
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/include/linux/rcu_sync.h b/include/linux/rcu_sync.h
index 8069d6468bc4..a63a33e6196e 100644
--- a/include/linux/rcu_sync.h
+++ b/include/linux/rcu_sync.h
@@ -40,7 +40,7 @@ struct rcu_sync {
 	enum rcu_sync_type	gp_type;
 };
 
-extern bool __rcu_sync_is_idle(struct rcu_sync *);
+extern void rcu_sync_lockdep_assert(struct rcu_sync *);
 
 /**
  * rcu_sync_is_idle() - Are readers permitted to use their fastpaths?
@@ -53,10 +53,9 @@ extern bool __rcu_sync_is_idle(struct rcu_sync *);
 static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
 {
 #ifdef CONFIG_PROVE_RCU
-	return __rcu_sync_is_idle(rsp);
-#else
-	return !rsp->gp_state; /* GP_IDLE */
+	rcu_sync_lockdep_assert(rsp);
 #endif
+	return !rsp->gp_state; /* GP_IDLE */
 }
 
 extern void rcu_sync_init(struct rcu_sync *, enum rcu_sync_type);
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index 1e353f0a2b66..be922c9f3d37 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -63,10 +63,10 @@ enum { CB_IDLE = 0, CB_PENDING, CB_REPLAY };
 #define	rss_lock	gp_wait.lock
 
 #ifdef CONFIG_PROVE_RCU
-bool __rcu_sync_is_idle(struct rcu_sync *rsp)
+void rcu_sync_lockdep_assert(struct rcu_sync *rsp)
 {
-	WARN_ON(!gp_ops[rsp->gp_type].held());
-	return rsp->gp_state == GP_IDLE;
+	RCU_LOCKDEP_WARN(!gp_ops[rsp->gp_type].held(),
+			 "suspicious rcu_sync_is_idle() usage");
 }
 #endif
 
-- 
2.5.2
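
The split between the state check and the debugging check can be
sketched in userspace as follows.  This is a model under stated
assumptions, not the kernel implementation: all *_model names and the
PROVE_RCU_MODEL macro are hypothetical, "read_side_held" stands in for
gp_ops[rsp->gp_type].held(), and a counter stands in for the lockdep
warning.  It shows that the return value no longer depends on whether
the debugging code is compiled in.

```c
/* Userspace model only; all *_model names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Define to model CONFIG_PROVE_RCU=y; remove to model =n. */
#define PROVE_RCU_MODEL 1

enum { GP_IDLE_MODEL = 0, GP_PENDING_MODEL, GP_PASSED_MODEL };

struct rcu_sync_model {
	int gp_state;		/* one of the GP_*_MODEL values */
	bool read_side_held;	/* stands in for gp_ops[...].held() */
	int warnings;		/* counts would-be lockdep warnings */
};

#ifdef PROVE_RCU_MODEL
/* Debugging only: complain if called outside a read-side section. */
static void rcu_sync_lockdep_assert_model(struct rcu_sync_model *rsp)
{
	if (!rsp->read_side_held)
		rsp->warnings++;
}
#endif

static bool rcu_sync_is_idle_model(struct rcu_sync_model *rsp)
{
	assert(rsp);
#ifdef PROVE_RCU_MODEL
	rcu_sync_lockdep_assert_model(rsp);
#endif
	/* The state check is the same with or without debugging. */
	return !rsp->gp_state;	/* GP_IDLE_MODEL */
}
```

With the debugging helper reduced to a pure assertion, building without
PROVE_RCU_MODEL changes only whether warnings are counted, not what
rcu_sync_is_idle_model() returns.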



* Re: [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4
  2015-10-06 16:45 [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4 Paul E. McKenney
  2015-10-06 16:45 ` [PATCH tip/core/rcu 01/13] locktorture: Support rtmutex torturing Paul E. McKenney
@ 2015-10-06 17:42 ` Josh Triplett
  2015-10-06 18:50   ` Oleg Nesterov
  1 sibling, 1 reply; 18+ messages in thread
From: Josh Triplett @ 2015-10-06 17:42 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, peterz, rostedt, dhowells, edumazet,
	dvhart, fweisbec, oleg, bobby.prani

On Tue, Oct 06, 2015 at 09:45:14AM -0700, Paul E. McKenney wrote:
> Hello!
> 
> This series contains performance improvements and locktorture testing
> for percpu-rwsem:
> 
> 1.	Add rtmutex torturing to locktorture, courtesy of Davidlohr Bueso.
> 
> 2.	Add exports to allow locktorture to be built as a module.
> 
> 3.	Add torture tests for percpu-rwsem.
> 
> 4.	Consolidate cond_resched_rcu_qs() into stutter_wait().
> 
> 5.	Create rcu_sync infrastructure, courtesy of Oleg Nesterov.
> 
> 6.	Simplify rcu_sync using new rcu_sync_ops structure, courtesy
> 	of Oleg Nesterov.
> 
> 7.	Add CONFIG_PROVE_RCU checks for rcu_sync, courtesy of Oleg Nesterov.
> 
> 8.	Introduce rcu_sync_dtor(), courtesy of Oleg Nesterov.
> 
> 9.	Make percpu_free_rwsem() after kzalloc() safe, courtesy of Oleg
> 	Nesterov.
> 
> 10.	Make percpu-rwsem make use of rcu_sync, courtesy of Oleg Nesterov.
> 
> 11.	Fix the comments outdated by rcu_sync, courtesy of Oleg Nesterov.
> 
> 12.	Clean up the lockdep annotations in percpu_down_read(), courtesy
> 	Peter Zijlstra and of Oleg Nesterov.
> 
> 13.	Cleanup the CONFIG_PROVE_RCU checks, courtesy of Oleg Nesterov.

For all 13:
Reviewed-by: Josh Triplett <josh@joshtriplett.org>

Regarding the rcu_sync infrastructure: odd that an atomic read
on the reader side proves lighter weight than
rcu_read_lock()/rcu_read_unlock().


* Re: [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4
  2015-10-06 17:42 ` [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4 Josh Triplett
@ 2015-10-06 18:50   ` Oleg Nesterov
  2015-10-06 19:33     ` Paul E. McKenney
  0 siblings, 1 reply; 18+ messages in thread
From: Oleg Nesterov @ 2015-10-06 18:50 UTC (permalink / raw)
  To: Josh Triplett
  Cc: Paul E. McKenney, linux-kernel, mingo, jiangshanlai, dipankar,
	akpm, mathieu.desnoyers, tglx, peterz, rostedt, dhowells,
	edumazet, dvhart, fweisbec, bobby.prani

On 10/06, Josh Triplett wrote:
>
> For all 13:
> Reviewed-by: Josh Triplett <josh@joshtriplett.org>

Thanks!

> Regarding the rcu_sync infrastructure: odd that an atomic read
> on the reader side proves lighter weight than
> rcu_read_lock()/rcu_read_unlock().

Cough... Could you spell it out? ;) I am just curious, and I can't
understand what you mean.

Oleg.



* Re: [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4
  2015-10-06 18:50   ` Oleg Nesterov
@ 2015-10-06 19:33     ` Paul E. McKenney
  2015-10-06 20:36       ` Josh Triplett
  0 siblings, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2015-10-06 19:33 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Josh Triplett, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, peterz, rostedt, dhowells, edumazet,
	dvhart, fweisbec, bobby.prani

On Tue, Oct 06, 2015 at 08:50:56PM +0200, Oleg Nesterov wrote:
> On 10/06, Josh Triplett wrote:
> >
> > For all 13:
> > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> 
> Thanks!
> 
> > Regarding the rcu_sync infrastructure: odd that an atomic read
> > on the reader side proves lighter weight than
> > rcu_read_lock()/rcu_read_unlock().
> 
> Cough... Could you spell it out? ;) I am just curious, and I can't
> understand what you mean.

I would guess that Josh is thinking of CONFIG_PREEMPT=n and that you
worked with CONFIG_PREEMPT=y.  But it would be good to get this fully
understood.

							Thanx, Paul



* Re: [PATCH tip/core/rcu 0/13] percpu-rwsem patches for 4.4
  2015-10-06 19:33     ` Paul E. McKenney
@ 2015-10-06 20:36       ` Josh Triplett
  0 siblings, 0 replies; 18+ messages in thread
From: Josh Triplett @ 2015-10-06 20:36 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Oleg Nesterov, linux-kernel, mingo, jiangshanlai, dipankar, akpm,
	mathieu.desnoyers, tglx, peterz, rostedt, dhowells, edumazet,
	dvhart, fweisbec, bobby.prani

On Tue, Oct 06, 2015 at 12:33:11PM -0700, Paul E. McKenney wrote:
> On Tue, Oct 06, 2015 at 08:50:56PM +0200, Oleg Nesterov wrote:
> > On 10/06, Josh Triplett wrote:
> > >
> > > For all 13:
> > > Reviewed-by: Josh Triplett <josh@joshtriplett.org>
> > 
> > Thanks!
> > 
> > > Regarding the rcu_sync infrastructure: odd that an atomic read
> > > on the reader side proves lighter weight than
> > > rcu_read_lock()/rcu_read_unlock().
> > 
> > Cough... Could you spell it out? ;) I am just curious, and I can't
> > understand what you mean.
> 
> I would guess that Josh is thinking of CONFIG_PREEMPT=n and that you
> worked with CONFIG_PREEMPT=y.  But it would be good to get this fully
> understood.

Right.

- Josh Triplett
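
The configuration difference guessed at above can be sketched in
userspace as follows.  This is a rough model and not the kernel's
implementation: the *_model names and the PREEMPT_MODEL macro are
hypothetical, and the real preemptible-RCU unlock path does more work
than shown.  The sketch only illustrates the general point that a
non-preemptible build can make the read-side markers nearly free, while
a preemptible build has to track per-task nesting.

```c
/* Userspace model only; all *_model names are hypothetical. */
#include <assert.h>

/* Define to model CONFIG_PREEMPT=y; remove to model CONFIG_PREEMPT=n. */
#define PREEMPT_MODEL 1

struct task_model {
	int rcu_read_lock_nesting;	/* per-task read-side nesting depth */
};

#ifdef PREEMPT_MODEL
/* Preemptible model: each marker updates the per-task nesting count. */
static void rcu_read_lock_model(struct task_model *t)
{
	t->rcu_read_lock_nesting++;
}

static void rcu_read_unlock_model(struct task_model *t)
{
	assert(t->rcu_read_lock_nesting > 0);
	t->rcu_read_lock_nesting--;
}
#else
/* Non-preemptible model: the markers do no per-task bookkeeping. */
static void rcu_read_lock_model(struct task_model *t)
{
	(void)t;
}

static void rcu_read_unlock_model(struct task_model *t)
{
	(void)t;
}
#endif
```

Under this model, the relative cost of a reader-side check against a
lock/unlock pair depends on which variant is built, which would be
consistent with the two reviewers having different configurations in
mind.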


