* [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol
@ 2018-01-23  7:59 lianglihao
  2018-01-23  7:59 ` [PATCH RFC 01/16] prcu: Add PRCU implementation lianglihao
                   ` (16 more replies)
  0 siblings, 17 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Dear Paul,

This patch set implements a preemptive version of RCU (PRCU) based on the following paper:

Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
https://dl.acm.org/citation.cfm?id=3024114.3024143

We have also added preliminary callback-handling support.  Thus, the current version
provides APIs prcu_read_lock(), prcu_read_unlock(), synchronize_prcu(), call_prcu(),
and prcu_barrier().

This is an experimental patch, so it would be good to have some feedback.

A known shortcoming is that the grace-period version is incremented only in
synchronize_prcu().  If call_prcu() or prcu_barrier() is called but synchronize_prcu()
is never invoked, the queued callbacks are never invoked.  A later version should address
this issue, e.g. by adding a grace-period expediting mechanism.  Other possible improvements
include using a hierarchical structure that takes the NUMA topology into account when
sending IPIs in synchronize_prcu().
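
For readers who want to see the version scheme in isolation, here is a minimal
userspace sketch of it (not part of the patch set).  It is single-threaded: "CPUs"
are array slots and the IPI is a direct loop.  It ignores the online flag and the
context-switch path of patch 01.  All model_* names are hypothetical, and the
callback rule (a callback becomes ready once every CPU has passed the version at
which it was queued) is an assumption made for illustration, not the code of the
later callback patches.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

struct model_local {
	unsigned int locked;		/* read-side nesting depth */
	unsigned long long version;	/* last global version seen */
};

struct model_cb {			/* hypothetical callback record */
	unsigned long long version;	/* global version when queued */
	bool invoked;
};

static struct model_local model_cpu[MODEL_NR_CPUS];
static unsigned long long model_global_version;

static void model_read_lock(int cpu)
{
	model_cpu[cpu].locked++;
}

/* The outermost unlock reports the current global version. */
static void model_read_unlock(int cpu)
{
	struct model_local *l = &model_cpu[cpu];

	if (--l->locked == 0 && l->version < model_global_version)
		l->version = model_global_version;
}

/* First half of a grace period: bump the version, then "IPI" idle CPUs. */
static unsigned long long model_start_gp(void)
{
	int cpu;

	model_global_version++;
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (!model_cpu[cpu].locked)
			model_cpu[cpu].version = model_global_version;
	return model_global_version;
}

/* Second half: the grace period is over once every CPU reached @version. */
static bool model_gp_done(unsigned long long version)
{
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_cpu[cpu].version < version)
			return false;
	return true;
}

static void model_call(struct model_cb *cb)
{
	cb->version = model_global_version;
	cb->invoked = false;
}

/* Assumed rule: invoke once a full grace period has elapsed since queueing. */
static void model_process(struct model_cb *cb)
{
	if (!cb->invoked && model_gp_done(cb->version + 1))
		cb->invoked = true;
}

static void model_demo(void)
{
	struct model_cb cb;
	unsigned long long v;

	model_read_lock(1);
	model_call(&cb);
	model_process(&cb);
	assert(!cb.invoked);		/* no grace period was ever started */

	v = model_start_gp();
	assert(!model_gp_done(v));	/* CPU 1 is still inside a reader */

	model_read_unlock(1);
	model_process(&cb);
	assert(cb.invoked);		/* version advanced and all CPUs caught up */
}
```

In this model the first model_process() call can never succeed, because nothing
other than model_start_gp() advances the global version.  That is the shortcoming
described above.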

We have tested the implementation using rcutorture on both an x86 and an ARM64 machine.
PRCU passed 1h and 3h tests on all the newly added config files, except that PRCU07
reported a BUG in a 1h run:

[ 1593.604201] ---[ end trace b3bae911bec86152 ]---
[ 1594.629450] prcu-torture:torture_onoff task: offlining 14
[ 1594.755553] smpboot: CPU 14 is now offline
[ 1594.757732] prcu-torture:torture_onoff task: offlined 14
[ 1597.765149] prcu-torture:torture_onoff task: onlining 11
[ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
[ 1597.804102] prcu-torture:torture_onoff task: onlined 11
[ 1599.365098] prcu-torture: rtc: ffffffffb0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0 
rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418 
onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
[ 1599.367946] prcu-torture: !!!
[ 1599.367966] ------------[ cut here ]------------


We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to true, that is,
synchronize_rcu_expedited() was tested.

The rcuperf results are as follows (average grace-period duration in ms of ten 10min runs):

16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic

CPUs      2       4       8      12      15       16
PRCU   0.14    1.07    4.15    8.02   10.79    15.16 
TREE  49.30  104.75  277.55  390.82  620.82  1381.54

64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic

CPUs       2       4        8      16      32       48       63        64
PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27
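
To make the two tables easier to compare, the TREE/PRCU ratio per CPU count can be
computed with a small standalone program such as the sketch below.  The numbers are
copied from the tables above.  The struct and function names are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct gp_sample {
	int cpus;
	double prcu_ms;		/* average grace-period duration, PRCU */
	double tree_ms;		/* average grace-period duration, TREE */
};

static const struct gp_sample x86_samples[] = {
	{  2,  0.14,   49.30 }, {  4,  1.07,  104.75 }, {  8,  4.15,  277.55 },
	{ 12,  8.02,  390.82 }, { 15, 10.79,  620.82 }, { 16, 15.16, 1381.54 },
};

static const struct gp_sample arm64_samples[] = {
	{  2,   0.23,   416.73 }, {  4,  19.69,   901.89 },
	{  8,  38.28,  1060.86 }, { 16,  63.21,   743.00 },
	{ 32,  95.41,   920.66 }, { 48, 167.18,  1325.21 },
	{ 63, 252.01,  1646.20 }, { 64, 1841.44, 23806.27 },
};

/* How many times longer a TREE grace period is than a PRCU one. */
static double gp_ratio(const struct gp_sample *s)
{
	assert(s->prcu_ms > 0.0);	/* guard the division */
	return s->tree_ms / s->prcu_ms;
}

static void print_ratios(const char *name, const struct gp_sample *s, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		printf("%-6s %2d CPUs: TREE/PRCU = %.1f\n",
		       name, s[i].cpus, gp_ratio(&s[i]));
}
```

Calling print_ratios("x86", x86_samples, 6) and print_ratios("arm64", arm64_samples, 8)
prints one ratio per column of the corresponding table.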

Best wishes,
Lihao.


Lihao Liang (15):
  rcutorture: Add PRCU rcu_torture_ops
  rcutorture: Add PRCU test config files
  rcuperf: Add PRCU rcu_perf_ops
  rcuperf: Add PRCU test config files
  rcuperf: Set gp_exp to true for tests to run
  prcu: Implement call_prcu() API
  prcu: Implement PRCU callback processing
  prcu: Implement prcu_barrier() API
  rcutorture: Test call_prcu() and prcu_barrier()
  rcutorture: Add basic ARM64 support to run scripts
  prcu: Add PRCU Kconfig parameter
  prcu: Comment source code
  rcuperf: Add config files with various CONFIG_NR_CPUS
  rcutorture: Add scripts to run experiments
  Add GPLv2 license

Heng Zhang (1):
  prcu: Add PRCU implementation

 include/linux/interrupt.h                          |   3 +
 include/linux/prcu.h                               | 122 +++++
 include/linux/rcupdate.h                           |   1 +
 init/Kconfig                                       |   7 +
 init/main.c                                        |   2 +
 kernel/rcu/Makefile                                |   1 +
 kernel/rcu/prcu.c                                  | 497 +++++++++++++++++++++
 kernel/rcu/rcuperf.c                               |  33 +-
 kernel/rcu/rcutorture.c                            |  40 +-
 kernel/rcu/tree.c                                  |   1 +
 kernel/sched/core.c                                |   2 +
 kernel/time/timer.c                                |   2 +
 kvm.sh                                             | 452 +++++++++++++++++++
 run-rcuperf.sh                                     |  26 ++
 .../testing/selftests/rcutorture/bin/functions.sh  |  17 +-
 .../selftests/rcutorture/configs/rcu/CFLIST        |   5 +
 .../selftests/rcutorture/configs/rcu/PRCU02        |  27 ++
 .../selftests/rcutorture/configs/rcu/PRCU02.boot   |   1 +
 .../selftests/rcutorture/configs/rcu/PRCU03        |  23 +
 .../selftests/rcutorture/configs/rcu/PRCU03.boot   |   2 +
 .../selftests/rcutorture/configs/rcu/PRCU06        |  26 ++
 .../selftests/rcutorture/configs/rcu/PRCU06.boot   |   5 +
 .../selftests/rcutorture/configs/rcu/PRCU07        |  25 ++
 .../selftests/rcutorture/configs/rcu/PRCU07.boot   |   2 +
 .../selftests/rcutorture/configs/rcu/PRCU09        |  19 +
 .../selftests/rcutorture/configs/rcu/PRCU09.boot   |   1 +
 .../selftests/rcutorture/configs/rcuperf/CFLIST    |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU      |  20 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-12   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-12.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-14   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-14.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-15   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-15.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-16   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-16.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-2    |  21 +
 .../rcutorture/configs/rcuperf/PRCU-2.boot         |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-32   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-32.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-4    |  21 +
 .../rcutorture/configs/rcuperf/PRCU-4.boot         |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-48   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-48.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-56   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-56.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-60   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-60.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-62   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-62.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-64   |  21 +
 .../rcutorture/configs/rcuperf/PRCU-64.boot        |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-8    |  21 +
 .../rcutorture/configs/rcuperf/PRCU-8.boot         |   1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU.boot |   1 +
 .../selftests/rcutorture/configs/rcuperf/TREE-12   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-14   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-15   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-16   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-2    |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-32   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-4    |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-48   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-56   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-60   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-62   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-64   |  21 +
 .../selftests/rcutorture/configs/rcuperf/TREE-8    |  21 +
 68 files changed, 1918 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/prcu.h
 create mode 100644 kernel/rcu/prcu.c
 create mode 100755 kvm.sh
 create mode 100755 run-rcuperf.sh
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8

-- 
2.14.1.729.g59c0ea183


* [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-24 11:26   ` Peter Zijlstra
                     ` (2 more replies)
  2018-01-23  7:59 ` [PATCH RFC 02/16] rcutorture: Add PRCU rcu_torture_ops lianglihao
                   ` (15 subsequent siblings)
  16 siblings, 3 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Heng Zhang <heng.z@huawei.com>

This RCU implementation (PRCU) is based on a fast consensus protocol
published in the following paper:

Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
https://dl.acm.org/citation.cfm?id=3024114.3024143

Signed-off-by: Heng Zhang <heng.z@huawei.com>
Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 include/linux/prcu.h |  37 +++++++++++++++
 kernel/rcu/Makefile  |   2 +-
 kernel/rcu/prcu.c    | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/core.c  |   2 +
 4 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/prcu.h
 create mode 100644 kernel/rcu/prcu.c

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
new file mode 100644
index 00000000..653b4633
--- /dev/null
+++ b/include/linux/prcu.h
@@ -0,0 +1,37 @@
+#ifndef __LINUX_PRCU_H
+#define __LINUX_PRCU_H
+
+#include <linux/atomic.h>
+#include <linux/mutex.h>
+#include <linux/wait.h>
+
+#define CONFIG_PRCU
+
+struct prcu_local_struct {
+	unsigned int locked;
+	unsigned int online;
+	unsigned long long version;
+};
+
+struct prcu_struct {
+	atomic64_t global_version;
+	atomic_t active_ctr;
+	struct mutex mtx;
+	wait_queue_head_t wait_q;
+};
+
+#ifdef CONFIG_PRCU
+void prcu_read_lock(void);
+void prcu_read_unlock(void);
+void synchronize_prcu(void);
+void prcu_note_context_switch(void);
+
+#else /* #ifdef CONFIG_PRCU */
+
+#define prcu_read_lock() do {} while (0)
+#define prcu_read_unlock() do {} while (0)
+#define synchronize_prcu() do {} while (0)
+#define prcu_note_context_switch() do {} while (0)
+
+#endif /* #ifdef CONFIG_PRCU */
+#endif /* __LINUX_PRCU_H */
diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 23803c7d..8791419c 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -2,7 +2,7 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
-obj-y += update.o sync.o
+obj-y += update.o sync.o prcu.o
 obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
 obj-$(CONFIG_TREE_SRCU) += srcutree.o
 obj-$(CONFIG_TINY_SRCU) += srcutiny.o
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
new file mode 100644
index 00000000..a00b9420
--- /dev/null
+++ b/kernel/rcu/prcu.c
@@ -0,0 +1,125 @@
+#include <linux/smp.h>
+#include <linux/prcu.h>
+#include <linux/percpu.h>
+#include <linux/compiler.h>
+#include <linux/sched.h>
+
+#include <asm/barrier.h>
+
+DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
+
+struct prcu_struct global_prcu = {
+	.global_version = ATOMIC64_INIT(0),
+	.active_ctr = ATOMIC_INIT(0),
+	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
+	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
+};
+struct prcu_struct *prcu = &global_prcu;
+
+static inline void prcu_report(struct prcu_local_struct *local)
+{
+	unsigned long long global_version;
+	unsigned long long local_version;
+
+	global_version = atomic64_read(&prcu->global_version);
+	local_version = local->version;
+	if (global_version > local_version)
+		cmpxchg(&local->version, local_version, global_version);
+}
+
+void prcu_read_lock(void)
+{
+	struct prcu_local_struct *local;
+
+	local = get_cpu_ptr(&prcu_local);
+	if (!local->online) {
+		WRITE_ONCE(local->online, 1);
+		smp_mb();
+	}
+
+	local->locked++;
+	put_cpu_ptr(&prcu_local);
+}
+EXPORT_SYMBOL(prcu_read_lock);
+
+void prcu_read_unlock(void)
+{
+	int locked;
+	struct prcu_local_struct *local;
+
+	barrier();
+	local = get_cpu_ptr(&prcu_local);
+	locked = local->locked;
+	if (locked) {
+		local->locked--;
+		if (locked == 1)
+			prcu_report(local);
+		put_cpu_ptr(&prcu_local);
+	} else {
+		put_cpu_ptr(&prcu_local);
+		if (!atomic_dec_return(&prcu->active_ctr))
+			wake_up(&prcu->wait_q);
+	}
+}
+EXPORT_SYMBOL(prcu_read_unlock);
+
+static void prcu_handler(void *info)
+{
+	struct prcu_local_struct *local;
+
+	local = this_cpu_ptr(&prcu_local);
+	if (!local->locked)
+		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
+}
+
+void synchronize_prcu(void)
+{
+	int cpu;
+	cpumask_t cpus;
+	unsigned long long version;
+	struct prcu_local_struct *local;
+
+	version = atomic64_add_return(1, &prcu->global_version);
+	mutex_lock(&prcu->mtx);
+
+	local = get_cpu_ptr(&prcu_local);
+	local->version = version;
+	put_cpu_ptr(&prcu_local);
+
+	cpumask_clear(&cpus);
+	for_each_possible_cpu(cpu) {
+		local = per_cpu_ptr(&prcu_local, cpu);
+		if (!READ_ONCE(local->online))
+			continue;
+		if (READ_ONCE(local->version) < version) {
+			smp_call_function_single(cpu, prcu_handler, NULL, 0);
+			cpumask_set_cpu(cpu, &cpus);
+		}
+	}
+
+	for_each_cpu(cpu, &cpus) {
+		local = per_cpu_ptr(&prcu_local, cpu);
+		while (READ_ONCE(local->version) < version)
+			cpu_relax();
+	}
+
+	if (atomic_read(&prcu->active_ctr))
+		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
+
+	mutex_unlock(&prcu->mtx);
+}
+EXPORT_SYMBOL(synchronize_prcu);
+
+void prcu_note_context_switch(void)
+{
+	struct prcu_local_struct *local;
+
+	local = get_cpu_ptr(&prcu_local);
+	if (local->locked) {
+		atomic_add(local->locked, &prcu->active_ctr);
+		local->locked = 0;
+	}
+	local->online = 0;
+	prcu_report(local);
+	put_cpu_ptr(&prcu_local);
+}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 326d4f88..a308581b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -15,6 +15,7 @@
 #include <linux/init_task.h>
 #include <linux/context_tracking.h>
 #include <linux/rcupdate_wait.h>
+#include <linux/prcu.h>
 
 #include <linux/blkdev.h>
 #include <linux/kprobes.h>
@@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
 
 	local_irq_disable();
 	rcu_note_context_switch(preempt);
+	prcu_note_context_switch();
 
 	/*
 	 * Make sure that signal_pending_state()->signal_pending() below
-- 
2.14.1.729.g59c0ea183


* [PATCH RFC 02/16] rcutorture: Add PRCU rcu_torture_ops
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
  2018-01-23  7:59 ` [PATCH RFC 01/16] prcu: Add PRCU implementation lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 03/16] rcutorture: Add PRCU test config files lianglihao
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Reviewed-by: Heng Zhang <heng.z@huawei.com>
Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 include/linux/rcupdate.h |  1 +
 kernel/rcu/rcutorture.c  | 40 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e1e5d002..12df9709 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -84,6 +84,7 @@ enum rcutorture_type {
 	RCU_SCHED_FLAVOR,
 	RCU_TASKS_FLAVOR,
 	SRCU_FLAVOR,
+	PRCU_FLAVOR,
 	INVALID_RCU_FLAVOR
 };
 
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index ae6e574d..7d65bf0c 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -46,6 +46,7 @@
 #include <linux/delay.h>
 #include <linux/stat.h>
 #include <linux/srcu.h>
+#include <linux/prcu.h>
 #include <linux/slab.h>
 #include <linux/trace_clock.h>
 #include <asm/byteorder.h>
@@ -768,6 +769,43 @@ static bool __maybe_unused torturing_tasks(void)
 
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
+/*
+ * Definitions for prcu torture testing.
+ */
+
+static int prcu_torture_read_lock(void) __acquires(RCU)
+{
+	prcu_read_lock();
+	return 0;
+}
+
+static void prcu_torture_read_unlock(int idx) __releases(RCU)
+{
+	prcu_read_unlock();
+}
+
+static struct rcu_torture_ops prcu_ops = {
+	.ttype		= PRCU_FLAVOR,
+	.init		= rcu_sync_torture_init,
+	.readlock	= prcu_torture_read_lock,
+	.read_delay	= rcu_read_delay,  /* just reuse rcu's version. */
+	.readunlock	= prcu_torture_read_unlock,
+	.started	= rcu_no_completed,
+	.completed	= rcu_no_completed,
+	.deferred_free	= NULL,
+	.sync		= synchronize_prcu,
+	.exp_sync	= synchronize_prcu,
+	.get_state	= NULL,
+	.cond_sync	= NULL,
+	.call		= NULL,
+	.cb_barrier	= NULL,
+	.fqs		= NULL,
+	.stats		= NULL,
+	.irq_capable	= 1,
+	.can_boost	= 0,
+	.name		= "prcu"
+};
+
 /*
  * RCU torture priority-boost testing.  Runs one real-time thread per
  * CPU for moderate bursts, repeatedly registering RCU callbacks and
@@ -1764,7 +1802,7 @@ rcu_torture_init(void)
 	int firsterr = 0;
 	static struct rcu_torture_ops *torture_ops[] = {
 		&rcu_ops, &rcu_bh_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops,
-		&sched_ops, RCUTORTURE_TASKS_OPS
+		&sched_ops, &prcu_ops, RCUTORTURE_TASKS_OPS
 	};
 
 	if (!torture_init_begin(torture_type, verbose, &torture_runnable))
-- 
2.14.1.729.g59c0ea183


* [PATCH RFC 03/16] rcutorture: Add PRCU test config files
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
  2018-01-23  7:59 ` [PATCH RFC 01/16] prcu: Add PRCU implementation lianglihao
  2018-01-23  7:59 ` [PATCH RFC 02/16] rcutorture: Add PRCU rcu_torture_ops lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-25  6:27   ` Paul E. McKenney
  2018-01-23  7:59 ` [PATCH RFC 04/16] rcuperf: Add PRCU rcu_perf_ops lianglihao
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Use the same config files as TREE02, TREE03, TREE06, TREE07, and TREE09.

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 .../selftests/rcutorture/configs/rcu/CFLIST        |  5 ++++
 .../selftests/rcutorture/configs/rcu/PRCU02        | 27 ++++++++++++++++++++++
 .../selftests/rcutorture/configs/rcu/PRCU02.boot   |  1 +
 .../selftests/rcutorture/configs/rcu/PRCU03        | 23 ++++++++++++++++++
 .../selftests/rcutorture/configs/rcu/PRCU03.boot   |  2 ++
 .../selftests/rcutorture/configs/rcu/PRCU06        | 26 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcu/PRCU06.boot   |  5 ++++
 .../selftests/rcutorture/configs/rcu/PRCU07        | 25 ++++++++++++++++++++
 .../selftests/rcutorture/configs/rcu/PRCU07.boot   |  2 ++
 .../selftests/rcutorture/configs/rcu/PRCU09        | 19 +++++++++++++++
 .../selftests/rcutorture/configs/rcu/PRCU09.boot   |  1 +
 11 files changed, 136 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot

diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
index a3a1a05a..7359e194 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
+++ b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
@@ -1,3 +1,8 @@
+PRCU02
+PRCU03
+PRCU06
+PRCU07
+PRCU09
 TREE01
 TREE02
 TREE03
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU02 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02
new file mode 100644
index 00000000..5f532f05
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02
@@ -0,0 +1,27 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=8
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_PRCU=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_FANOUT=3
+CONFIG_RCU_FANOUT_LEAF=3
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=y
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
+CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
new file mode 100644
index 00000000..6c5e626f
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
@@ -0,0 +1 @@
+rcutorture.torture_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU03 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03
new file mode 100644
index 00000000..869cadc8
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03
@@ -0,0 +1,23 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=16
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_PRCU=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=y
+CONFIG_NO_HZ_IDLE=n
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_TRACE=y
+CONFIG_HOTPLUG_CPU=y
+CONFIG_RCU_FANOUT=2
+CONFIG_RCU_FANOUT_LEAF=2
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_RCU_BOOST=y
+CONFIG_RCU_KTHREAD_PRIO=2
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
+CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
new file mode 100644
index 00000000..0be10cba
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
@@ -0,0 +1,2 @@
+rcutorture.onoff_interval=1 rcutorture.onoff_holdoff=30
+rcutorture.torture_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU06 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06
new file mode 100644
index 00000000..b1480963
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06
@@ -0,0 +1,26 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=8
+CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=n
+CONFIG_PRCU=y
+#CHECK#CONFIG_TREE_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_FANOUT=6
+CONFIG_RCU_FANOUT_LEAF=6
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=y
+CONFIG_PROVE_LOCKING=y
+#CHECK#CONFIG_PROVE_RCU=y
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
+CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
new file mode 100644
index 00000000..00787e68
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
@@ -0,0 +1,5 @@
+rcupdate.rcu_self_test=1
+rcupdate.rcu_self_test_bh=1
+rcupdate.rcu_self_test_sched=1
+rcutree.rcu_fanout_exact=1
+rcutorture.torture_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU07 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU07
new file mode 100644
index 00000000..14f74c68
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU07
@@ -0,0 +1,25 @@
+CONFIG_SMP=y
+CONFIG_NR_CPUS=16
+CONFIG_CPUMASK_OFFSTACK=y
+CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=n
+CONFIG_PRCU=y
+#CHECK#CONFIG_TREE_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=n
+CONFIG_NO_HZ_FULL=y
+CONFIG_NO_HZ_FULL_ALL=n
+CONFIG_NO_HZ_FULL_SYSIDLE=y
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=y
+CONFIG_HOTPLUG_CPU=y
+CONFIG_RCU_FANOUT=2
+CONFIG_RCU_FANOUT_LEAF=2
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
+CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
+CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
new file mode 100644
index 00000000..43dac30b
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
@@ -0,0 +1,2 @@
+nohz_full=2-9
+rcutorture.torture_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU09 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU09
new file mode 100644
index 00000000..43d4718d
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU09
@@ -0,0 +1,19 @@
+CONFIG_SMP=n
+CONFIG_NR_CPUS=1
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_PRCU=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+#CHECK#CONFIG_RCU_EXPERT=n
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
new file mode 100644
index 00000000..6c5e626f
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
@@ -0,0 +1 @@
+rcutorture.torture_type=prcu
-- 
2.14.1.729.g59c0ea183


* [PATCH RFC 04/16] rcuperf: Add PRCU rcu_perf_ops
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (2 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 03/16] rcutorture: Add PRCU test config files lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 05/16] rcuperf: Add PRCU test config files lianglihao
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 kernel/rcu/rcuperf.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index a4a86fb4..ea80fa3e 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -28,6 +28,7 @@
 #include <linux/spinlock.h>
 #include <linux/smp.h>
 #include <linux/rcupdate.h>
+#include <linux/prcu.h>
 #include <linux/interrupt.h>
 #include <linux/sched.h>
 #include <uapi/linux/sched/types.h>
@@ -304,6 +305,34 @@ static bool __maybe_unused torturing_tasks(void)
 
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
+/*
+ * Definitions for prcu perf testing.
+ */
+
+static int prcu_perf_read_lock(void) __acquires(RCU)
+{
+	prcu_read_lock();
+	return 0;
+}
+
+static void prcu_perf_read_unlock(int idx) __releases(RCU)
+{
+	prcu_read_unlock();
+}
+
+static struct rcu_perf_ops prcu_ops = {
+	.ptype		= PRCU_FLAVOR,
+	.init		= rcu_sync_perf_init,
+	.readlock	= prcu_perf_read_lock,
+	.readunlock	= prcu_perf_read_unlock,
+	.started	= rcu_no_completed,
+	.completed	= rcu_no_completed,
+	.exp_completed	= rcu_no_completed,
+	.sync		= synchronize_prcu,
+	.exp_sync	= synchronize_prcu,
+	.name		= "prcu"
+};
+
 /*
  * If performance tests complete, wait for shutdown to commence.
  */
@@ -554,7 +583,7 @@ rcu_perf_init(void)
 	long i;
 	int firsterr = 0;
 	static struct rcu_perf_ops *perf_ops[] = {
-		&rcu_ops, &rcu_bh_ops, &srcu_ops, &sched_ops,
+		&rcu_ops, &rcu_bh_ops, &srcu_ops, &sched_ops, &prcu_ops,
 		RCUPERF_TASKS_OPS
 	};
 
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 05/16] rcuperf: Add PRCU test config files
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (3 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 04/16] rcuperf: Add PRCU rcu_perf_ops lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run lianglihao
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Use the same config file as TREE.

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 .../selftests/rcutorture/configs/rcuperf/CFLIST      |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU        | 20 ++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/PRCU.boot   |  1 +
 3 files changed, 22 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot

diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST b/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST
index c9f56cf2..4b80917a 100644
--- a/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/CFLIST
@@ -1 +1,2 @@
 TREE
+PRCU
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
new file mode 100644
index 00000000..a312f671
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
@@ -0,0 +1,20 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (4 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 05/16] rcuperf: Add PRCU test config files lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-25  6:18   ` Paul E. McKenney
  2018-01-23  7:59 ` [PATCH RFC 07/16] prcu: Implement call_prcu() API lianglihao
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 kernel/rcu/rcuperf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index ea80fa3e..baccc123 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -60,7 +60,7 @@ MODULE_AUTHOR("Paul E. McKenney <paulmck@linux.vnet.ibm.com>");
 #define VERBOSE_PERFOUT_ERRSTRING(s) \
 	do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
 
-torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
+torture_param(bool, gp_exp, true, "Use expedited GP wait primitives");
 torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
 torture_param(int, nreaders, -1, "Number of RCU reader threads");
 torture_param(int, nwriters, -1, "Number of RCU updater threads");
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 07/16] prcu: Implement call_prcu() API
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (5 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-25  6:20   ` Paul E. McKenney
  2018-01-23  7:59 ` [PATCH RFC 08/16] prcu: Implement PRCU callback processing lianglihao
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

This is PRCU's counterpart of RCU's call_rcu() API.

Reviewed-by: Heng Zhang <heng.z@huawei.com>
Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
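A userspace sketch of the enqueue step, for readers who want to see the
tail-pointer bookkeeping in isolation.  It is not the kernel code: every
name ending in _sketch is hypothetical, malloc() stands in for
kmalloc(GFP_ATOMIC), the version is passed in by the caller instead of
being read from the per-CPU state, and nothing disables interrupts.  It
only illustrates how the callback list and the parallel version list
grow in lockstep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct rcu_head. */
struct cb_head_sketch {
	struct cb_head_sketch *next;
	void (*func)(struct cb_head_sketch *head);
};

/* Mirrors the shape of struct prcu_version_head in the patch. */
struct version_head_sketch {
	unsigned long long version;
	struct version_head_sketch *next;
};

/* Two singly linked lists kept in lockstep, each with a tail pointer. */
struct cblist_sketch {
	struct cb_head_sketch *head;
	struct cb_head_sketch **tail;
	struct version_head_sketch *version_head;
	struct version_head_sketch **version_tail;
	long len;
};

static void cblist_init_sketch(struct cblist_sketch *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->version_head = NULL;
	l->version_tail = &l->version_head;
	l->len = 0;
}

/* Append one callback and its version; returns 0, or -1 on allocation failure. */
static int enqueue_sketch(struct cblist_sketch *l, struct cb_head_sketch *head,
			  void (*func)(struct cb_head_sketch *),
			  unsigned long long version)
{
	struct version_head_sketch *vhp = malloc(sizeof(*vhp));

	if (!vhp)
		return -1;

	head->func = func;
	head->next = NULL;
	vhp->version = version;
	vhp->next = NULL;

	/* The tail pointers make each append O(1) with no empty-list special case. */
	l->len++;
	*l->tail = head;
	l->tail = &head->next;
	*l->version_tail = vhp;
	l->version_tail = &vhp->next;

	/* Both lists must be non-empty after a successful append. */
	assert(l->head && l->version_head);
	return 0;
}
```

Unlike the patch, the sketch reports an allocation failure to its caller
instead of returning silently.
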
 include/linux/prcu.h | 25 ++++++++++++++++++++
 init/main.c          |  2 ++
 kernel/rcu/prcu.c    | 67 +++++++++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index 653b4633..e5e09c9b 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -2,15 +2,36 @@
 #define __LINUX_PRCU_H
 
 #include <linux/atomic.h>
+#include <linux/types.h>
 #include <linux/mutex.h>
 #include <linux/wait.h>
 
 #define CONFIG_PRCU
 
+struct prcu_version_head {
+	unsigned long long version;
+	struct prcu_version_head *next;
+};
+
+/* Simple unsegmented callback list for PRCU. */
+struct prcu_cblist {
+	struct rcu_head *head;
+	struct rcu_head **tail;
+	struct prcu_version_head *version_head;
+	struct prcu_version_head **version_tail;
+	long len;
+};
+
+#define PRCU_CBLIST_INITIALIZER(n) { \
+	.head = NULL, .tail = &n.head, \
+	.version_head = NULL, .version_tail = &n.version_head, \
+}
+
 struct prcu_local_struct {
 	unsigned int locked;
 	unsigned int online;
 	unsigned long long version;
+	struct prcu_cblist cblist;
 };
 
 struct prcu_struct {
@@ -24,6 +45,8 @@ struct prcu_struct {
 void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
+void call_prcu(struct rcu_head *head, rcu_callback_t func);
+void prcu_init(void);
 void prcu_note_context_switch(void);
 
 #else /* #ifdef CONFIG_PRCU */
@@ -31,6 +54,8 @@ void prcu_note_context_switch(void);
 #define prcu_read_lock() do {} while (0)
 #define prcu_read_unlock() do {} while (0)
 #define synchronize_prcu() do {} while (0)
+#define call_prcu() do {} while (0)
+#define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 
 #endif /* #ifdef CONFIG_PRCU */
diff --git a/init/main.c b/init/main.c
index f8665104..4925964e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -38,6 +38,7 @@
 #include <linux/smp.h>
 #include <linux/profile.h>
 #include <linux/rcupdate.h>
+#include <linux/prcu.h>
 #include <linux/moduleparam.h>
 #include <linux/kallsyms.h>
 #include <linux/writeback.h>
@@ -574,6 +575,7 @@ asmlinkage __visible void __init start_kernel(void)
 	workqueue_init_early();
 
 	rcu_init();
+	prcu_init();
 
 	/* Trace events are available after this */
 	trace_init();
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index a00b9420..f198285c 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -1,11 +1,12 @@
 #include <linux/smp.h>
-#include <linux/prcu.h>
 #include <linux/percpu.h>
-#include <linux/compiler.h>
+#include <linux/prcu.h>
 #include <linux/sched.h>
-
+#include <linux/slab.h>
 #include <asm/barrier.h>
 
+#include "rcu.h"
+
 DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
 
 struct prcu_struct global_prcu = {
@@ -16,6 +17,16 @@ struct prcu_struct global_prcu = {
 };
 struct prcu_struct *prcu = &global_prcu;
 
+/* Initialize simple callback list. */
+static void prcu_cblist_init(struct prcu_cblist *rclp)
+{
+	rclp->head = NULL;
+	rclp->tail = &rclp->head;
+	rclp->version_head = NULL;
+	rclp->version_tail = &rclp->version_head;
+	rclp->len = 0;
+}
+
 static inline void prcu_report(struct prcu_local_struct *local)
 {
 	unsigned long long global_version;
@@ -123,3 +134,53 @@ void prcu_note_context_switch(void)
 	prcu_report(local);
 	put_cpu_ptr(&prcu_local);
 }
+
+void call_prcu(struct rcu_head *head, rcu_callback_t func)
+{
+	unsigned long flags;
+	struct prcu_local_struct *local;
+	struct prcu_cblist *rclp;
+	struct prcu_version_head *vhp;
+
+	debug_rcu_head_queue(head);
+
+	/* Use GFP_ATOMIC with IRQs disabled */
+	vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
+	if (!vhp)
+		return;
+
+	head->func = func;
+	head->next = NULL;
+	vhp->next = NULL;
+
+	local_irq_save(flags);
+	local = this_cpu_ptr(&prcu_local);
+	vhp->version = local->version;
+	rclp = &local->cblist;
+	rclp->len++;
+	*rclp->tail = head;
+	rclp->tail = &head->next;
+	*rclp->version_tail = vhp;
+	rclp->version_tail = &vhp->next;
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL(call_prcu);
+
+void prcu_init_local_struct(int cpu)
+{
+	struct prcu_local_struct *local;
+
+	local = per_cpu_ptr(&prcu_local, cpu);
+	local->locked = 0;
+	local->online = 0;
+	local->version = 0;
+	prcu_cblist_init(&local->cblist);
+}
+
+void __init prcu_init(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		prcu_init_local_struct(cpu);
+}
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 08/16] prcu: Implement PRCU callback processing
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (6 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 07/16] prcu: Implement call_prcu() API lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 09/16] prcu: Implement prcu_barrier() API lianglihao
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Currently, PRCU core processing only consists of callback processing
in prcu_process_callbacks(), which is triggered by the scheduling-clock
interrupt.

Reviewed-by: Heng Zhang <heng.z@huawei.com>
Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
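The drain loop described above can be sketched in userspace as follows.
This is an illustration under stated assumptions, not the patch itself:
every name ending in _sketch is hypothetical, the version is folded into
the callback node (the patch keeps it in a parallel list), and there is
no softirq or interrupt handling.  A callback is treated as ready when
its recorded version is strictly older than the global callback version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an rcu_head with its version folded in. */
struct cb_sketch {
	struct cb_sketch *next;
	unsigned long long version;
	void (*func)(struct cb_sketch *cb);
};

struct cbq_sketch {
	struct cb_sketch *head;
	struct cb_sketch **tail;
	long len;
};

/* Example callback that only counts how often it was invoked. */
static long invoked_count_sketch;

static void count_cb_sketch(struct cb_sketch *cb)
{
	(void)cb;
	invoked_count_sketch++;
}

static void cbq_init_sketch(struct cbq_sketch *q)
{
	q->head = NULL;
	q->tail = &q->head;
	q->len = 0;
}

static void cbq_enqueue_sketch(struct cbq_sketch *q, struct cb_sketch *cb,
			       void (*func)(struct cb_sketch *),
			       unsigned long long version)
{
	cb->next = NULL;
	cb->func = func;
	cb->version = version;
	*q->tail = cb;
	q->tail = &cb->next;
	q->len++;
}

/* Remove and return the oldest callback, or NULL if the queue is empty. */
static struct cb_sketch *cbq_dequeue_sketch(struct cbq_sketch *q)
{
	struct cb_sketch *cb = q->head;

	if (!cb) {
		assert(q->len == 0);
		return NULL;
	}
	q->head = cb->next;
	q->len--;
	if (!q->head)
		q->tail = &q->head;	/* queue became empty: reset the tail */
	return cb;
}

/* Invoke every callback older than cb_version; returns how many ran. */
static long process_ready_sketch(struct cbq_sketch *q,
				 unsigned long long cb_version)
{
	long invoked = 0;

	while (q->head && q->head->version < cb_version) {
		struct cb_sketch *cb = cbq_dequeue_sketch(q);

		cb->func(cb);	/* cb may be freed by func; do not touch it after */
		invoked++;
	}
	return invoked;
}
```

Because callbacks are queued in version order, the loop can stop at the
first callback that is not yet ready.
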
 include/linux/interrupt.h |  3 ++
 include/linux/prcu.h      |  8 +++++
 kernel/rcu/prcu.c         | 86 +++++++++++++++++++++++++++++++++++++++++++++++
 kernel/rcu/tree.c         |  1 +
 kernel/time/timer.c       |  2 ++
 5 files changed, 100 insertions(+)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 0991f973..f05ef62a 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -456,6 +456,9 @@ enum
 	SCHED_SOFTIRQ,
 	HRTIMER_SOFTIRQ, /* Unused, but kept as tools rely on the
 			    numbering. Sigh! */
+#ifdef CONFIG_PRCU
+	PRCU_SOFTIRQ,
+#endif
 	RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */
 
 	NR_SOFTIRQS
diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index e5e09c9b..4e7d5d65 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -31,11 +31,13 @@ struct prcu_local_struct {
 	unsigned int locked;
 	unsigned int online;
 	unsigned long long version;
+	unsigned long long cb_version;
 	struct prcu_cblist cblist;
 };
 
 struct prcu_struct {
 	atomic64_t global_version;
+	atomic64_t cb_version;
 	atomic_t active_ctr;
 	struct mutex mtx;
 	wait_queue_head_t wait_q;
@@ -48,6 +50,9 @@ void synchronize_prcu(void);
 void call_prcu(struct rcu_head *head, rcu_callback_t func);
 void prcu_init(void);
 void prcu_note_context_switch(void);
+int prcu_pending(void);
+void invoke_prcu_core(void);
+void prcu_check_callbacks(void);
 
 #else /* #ifdef CONFIG_PRCU */
 
@@ -57,6 +62,9 @@ void prcu_note_context_switch(void);
 #define call_prcu() do {} while (0)
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
+#define prcu_pending() 0
+#define invoke_prcu_core() do {} while (0)
+#define prcu_check_callbacks() do {} while (0)
 
 #endif /* #ifdef CONFIG_PRCU */
 #endif /* __LINUX_PRCU_H */
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index f198285c..373039c5 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -1,6 +1,7 @@
 #include <linux/smp.h>
 #include <linux/percpu.h>
 #include <linux/prcu.h>
+#include <linux/interrupt.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <asm/barrier.h>
@@ -11,6 +12,7 @@ DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
 
 struct prcu_struct global_prcu = {
 	.global_version = ATOMIC64_INIT(0),
+	.cb_version = ATOMIC64_INIT(0),
 	.active_ctr = ATOMIC_INIT(0),
 	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
 	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
@@ -27,6 +29,35 @@ static void prcu_cblist_init(struct prcu_cblist *rclp)
 	rclp->len = 0;
 }
 
+/*
+ * Dequeue the oldest rcu_head structure from the specified callback list
+ * and unlink its entry from the parallel version list.
+ */
+static struct rcu_head *prcu_cblist_dequeue(struct prcu_cblist *rclp)
+{
+	struct rcu_head *rhp;
+	struct prcu_version_head *vhp;
+
+	rhp = rclp->head;
+	if (!rhp) {
+		WARN_ON(rclp->version_head);
+		WARN_ON(rclp->len);
+		return NULL;
+	}
+
+	vhp = rclp->version_head;
+	rclp->version_head = vhp->next;
+	rclp->head = rhp->next;
+	rclp->len--;
+
+	if (!rclp->head) {
+		rclp->tail = &rclp->head;
+		rclp->version_tail = &rclp->version_head;
+	}
+
+	return rhp;
+}
+
 static inline void prcu_report(struct prcu_local_struct *local)
 {
 	unsigned long long global_version;
@@ -117,6 +148,7 @@ void synchronize_prcu(void)
 	if (atomic_read(&prcu->active_ctr))
 		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
 
+	atomic64_set(&prcu->cb_version, version);
 	mutex_unlock(&prcu->mtx);
 }
 EXPORT_SYMBOL(synchronize_prcu);
@@ -166,6 +198,58 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func)
 }
 EXPORT_SYMBOL(call_prcu);
 
+int prcu_pending(void)
+{
+	struct prcu_local_struct *local = get_cpu_ptr(&prcu_local);
+	unsigned long long cb_version = local->cb_version;
+	struct prcu_cblist *rclp = &local->cblist;
+
+	put_cpu_ptr(&prcu_local);
+	return cb_version < atomic64_read(&prcu->cb_version) && rclp->head;
+}
+
+void invoke_prcu_core(void)
+{
+	if (cpu_online(smp_processor_id()))
+		raise_softirq(PRCU_SOFTIRQ);
+}
+
+void prcu_check_callbacks(void)
+{
+	if (prcu_pending())
+		invoke_prcu_core();
+}
+
+static __latent_entropy void prcu_process_callbacks(struct softirq_action *unused)
+{
+	unsigned long flags;
+	unsigned long long cb_version;
+	struct prcu_local_struct *local;
+	struct prcu_cblist *rclp;
+	struct rcu_head *rhp;
+	struct prcu_version_head *vhp;
+
+	if (cpu_is_offline(smp_processor_id()))
+		return;
+
+	cb_version = atomic64_read(&prcu->cb_version);
+
+	/* Disable interrupts to prevent races with call_prcu() */
+	local_irq_save(flags);
+	local = this_cpu_ptr(&prcu_local);
+	rclp = &local->cblist;
+	rhp = rclp->head;
+	vhp = rclp->version_head;
+	for (; rhp && vhp && vhp->version < cb_version;
+	     rhp = rclp->head, vhp = rclp->version_head) {
+		rhp = prcu_cblist_dequeue(rclp);
+		debug_rcu_head_unqueue(rhp);
+		rhp->func(rhp);
+	}
+	local->cb_version = cb_version;
+	local_irq_restore(flags);
+}
+
 void prcu_init_local_struct(int cpu)
 {
 	struct prcu_local_struct *local;
@@ -174,6 +258,7 @@ void prcu_init_local_struct(int cpu)
 	local->locked = 0;
 	local->online = 0;
 	local->version = 0;
+	local->cb_version = 0;
 	prcu_cblist_init(&local->cblist);
 }
 
@@ -181,6 +266,7 @@ void __init prcu_init(void)
 {
 	int cpu;
 
+	open_softirq(PRCU_SOFTIRQ, prcu_process_callbacks);
 	for_each_possible_cpu(cpu)
 		prcu_init_local_struct(cpu);
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e354e475..46910114 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2852,6 +2852,7 @@ void rcu_check_callbacks(int user)
 {
 	trace_rcu_utilization(TPS("Start scheduler-tick"));
 	increment_cpu_stall_ticks();
+
 	if (user || rcu_is_cpu_rrupt_from_idle()) {
 
 		/*
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index d3f33020..ed863e63 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -44,6 +44,7 @@
 #include <linux/sched/debug.h>
 #include <linux/slab.h>
 #include <linux/compat.h>
+#include <linux/prcu.h>
 
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
@@ -1568,6 +1569,7 @@ void update_process_times(int user_tick)
 	/* Note: this timer irq context must be accounted for as well. */
 	account_process_tick(p, user_tick);
 	run_local_timers();
+	prcu_check_callbacks();
 	rcu_check_callbacks(user_tick);
 #ifdef CONFIG_IRQ_WORK
 	if (in_irq())
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 09/16] prcu: Implement prcu_barrier() API
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (7 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 08/16] prcu: Implement PRCU callback processing lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-25  6:24   ` Paul E. McKenney
  2018-01-23  7:59 ` [PATCH RFC 10/16] rcutorture: Test call_prcu() and prcu_barrier() lianglihao
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

This is PRCU's counterpart of RCU's rcu_barrier() API.

Reviewed-by: Heng Zhang <heng.z@huawei.com>
Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
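The count-starts-at-one idiom that prcu_barrier() relies on can be shown
in a small single-threaded model.  This is only a sketch: every name
ending in _sketch is hypothetical, a bool stands in for struct
completion, and direct function calls stand in for the IPIs and for the
callbacks that run after a grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct barrier_sketch {
	atomic_int count;
	bool completed;		/* stand-in for struct completion */
};

/* Start a barrier: the initial count of one belongs to the waiter. */
static void barrier_begin_sketch(struct barrier_sketch *b)
{
	b->completed = false;
	atomic_store(&b->count, 1);
}

/* Models prcu_barrier_func(): account for one queued barrier callback. */
static void barrier_register_sketch(struct barrier_sketch *b)
{
	atomic_fetch_add(&b->count, 1);
}

/*
 * Models both prcu_barrier_callback() and the waiter dropping its
 * initial count: whoever brings the count to zero signals completion.
 */
static void barrier_put_sketch(struct barrier_sketch *b)
{
	int old = atomic_fetch_sub(&b->count, 1);

	assert(old > 0);	/* more puts than registrations is a bug */
	if (old == 1)
		b->completed = true;
}
```

If the count started at zero, a callback that ran before the remaining
CPUs had registered would bring it back to zero and signal completion
too early; the waiter's own reference prevents that.
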
 include/linux/prcu.h |  7 ++++++
 kernel/rcu/prcu.c    | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index 4e7d5d65..cce967fd 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -5,6 +5,7 @@
 #include <linux/types.h>
 #include <linux/mutex.h>
 #include <linux/wait.h>
+#include <linux/completion.h>
 
 #define CONFIG_PRCU
 
@@ -32,6 +33,7 @@ struct prcu_local_struct {
 	unsigned int online;
 	unsigned long long version;
 	unsigned long long cb_version;
+	struct rcu_head barrier_head;
 	struct prcu_cblist cblist;
 };
 
@@ -39,8 +41,11 @@ struct prcu_struct {
 	atomic64_t global_version;
 	atomic64_t cb_version;
 	atomic_t active_ctr;
+	atomic_t barrier_cpu_count;
 	struct mutex mtx;
+	struct mutex barrier_mtx;
 	wait_queue_head_t wait_q;
+	struct completion barrier_completion;
 };
 
 #ifdef CONFIG_PRCU
@@ -48,6 +53,7 @@ void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
 void call_prcu(struct rcu_head *head, rcu_callback_t func);
+void prcu_barrier(void);
 void prcu_init(void);
 void prcu_note_context_switch(void);
 int prcu_pending(void);
@@ -60,6 +66,7 @@ void prcu_check_callbacks(void);
 #define prcu_read_unlock() do {} while (0)
 #define synchronize_prcu() do {} while (0)
 #define call_prcu() do {} while (0)
+#define prcu_barrier() do {} while (0)
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 #define prcu_pending() 0
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index 373039c5..2664d091 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -15,6 +15,7 @@ struct prcu_struct global_prcu = {
 	.cb_version = ATOMIC64_INIT(0),
 	.active_ctr = ATOMIC_INIT(0),
 	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
+	.barrier_mtx = __MUTEX_INITIALIZER(global_prcu.barrier_mtx),
 	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
 };
 struct prcu_struct *prcu = &global_prcu;
@@ -250,6 +251,68 @@ static __latent_entropy void prcu_process_callbacks(struct softirq_action *unuse
 	local_irq_restore(flags);
 }
 
+/*
+ * PRCU callback function for prcu_barrier().
+ * If we are last, wake up the task executing prcu_barrier().
+ */
+static void prcu_barrier_callback(struct rcu_head *rhp)
+{
+	if (atomic_dec_and_test(&prcu->barrier_cpu_count))
+		complete(&prcu->barrier_completion);
+}
+
+/*
+ * Called with preemption disabled, and from cross-cpu IRQ context.
+ */
+static void prcu_barrier_func(void *info)
+{
+	struct prcu_local_struct *local = this_cpu_ptr(&prcu_local);
+
+	atomic_inc(&prcu->barrier_cpu_count);
+	call_prcu(&local->barrier_head, prcu_barrier_callback);
+}
+
+/* Wait for all PRCU callbacks to complete. */
+void prcu_barrier(void)
+{
+	int cpu;
+
+	/* Take mutex to serialize concurrent prcu_barrier() requests. */
+	mutex_lock(&prcu->barrier_mtx);
+
+	/*
+	 * Initialize the count to one rather than to zero in order to
+	 * avoid a too-soon return to zero in case of a short grace period
+	 * (or preemption of this task).
+	 */
+	init_completion(&prcu->barrier_completion);
+	atomic_set(&prcu->barrier_cpu_count, 1);
+
+	/*
+	 * Register a new callback on each CPU using IPI to prevent races
+	 * with call_prcu(). When that callback is invoked, we will know
+	 * that all of the corresponding CPU's preceding callbacks have
+	 * been invoked.
+	 */
+	for_each_possible_cpu(cpu)
+		smp_call_function_single(cpu, prcu_barrier_func, NULL, 1);
+
+	/* Decrement the count as we initialize it to one. */
+	if (atomic_dec_and_test(&prcu->barrier_cpu_count))
+		complete(&prcu->barrier_completion);
+
+	/*
+	 * Now that we have a prcu_barrier_callback() callback on each
+	 * CPU, and thus each counted, remove the initial count.
+	 * Wait for all prcu_barrier_callback() callbacks to be invoked.
+	 */
+	wait_for_completion(&prcu->barrier_completion);
+
+	/* Other prcu_barrier() invocations can now safely proceed. */
+	mutex_unlock(&prcu->barrier_mtx);
+}
+EXPORT_SYMBOL(prcu_barrier);
+
 void prcu_init_local_struct(int cpu)
 {
 	struct prcu_local_struct *local;
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 10/16] rcutorture: Test call_prcu() and prcu_barrier()
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (8 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 09/16] prcu: Implement prcu_barrier() API lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 11/16] rcutorture: Add basic ARM64 support to run scripts lianglihao
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 kernel/rcu/prcu.c       | 4 +++-
 kernel/rcu/rcutorture.c | 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index 2664d091..49cb70e6 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -179,8 +179,10 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func)
 
 	/* Use GFP_ATOMIC with IRQs disabled */
 	vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
-	if (!vhp)
+	if (!vhp) {
+		WARN_ON(1);
 		return;
+	}
 
 	head->func = func;
 	head->next = NULL;
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 7d65bf0c..9215ebb0 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -797,8 +797,8 @@ static struct rcu_torture_ops prcu_ops = {
 	.exp_sync	= synchronize_prcu,
 	.get_state	= NULL,
 	.cond_sync	= NULL,
-	.call		= NULL,
-	.cb_barrier	= NULL,
+	.call		= call_prcu,
+	.cb_barrier	= prcu_barrier,
 	.fqs		= NULL,
 	.stats		= NULL,
 	.irq_capable	= 1,
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 11/16] rcutorture: Add basic ARM64 support to run scripts
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (9 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 10/16] rcutorture: Test call_prcu() and prcu_barrier() lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 12/16] prcu: Add PRCU Kconfig parameter lianglihao
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

This commit adds support for the qemu-system-aarch64 command to the
rcutorture scripts.

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 tools/testing/selftests/rcutorture/bin/functions.sh | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
index 1426a9b9..4a24b873 100644
--- a/tools/testing/selftests/rcutorture/bin/functions.sh
+++ b/tools/testing/selftests/rcutorture/bin/functions.sh
@@ -111,6 +111,9 @@ identify_boot_image () {
 		qemu-system-x86_64|qemu-system-i386)
 			echo arch/x86/boot/bzImage
 			;;
+		qemu-system-aarch64)
+			echo arch/arm64/boot/Image
+			;;
 		*)
 			echo vmlinux
 			;;
@@ -133,6 +136,9 @@ identify_qemu () {
 	elif echo $u | grep -q "Intel 80386"
 	then
 		echo qemu-system-i386
+	elif echo $u | grep -q aarch64
+	then
+		echo qemu-system-aarch64
 	elif uname -a | grep -q ppc64
 	then
 		echo qemu-system-ppc64
@@ -151,16 +157,20 @@ identify_qemu () {
 # Output arguments for the qemu "-append" string based on CPU type
 # and the TORTURE_QEMU_INTERACTIVE environment variable.
 identify_qemu_append () {
+	local console=ttyS0
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
 		echo noapic selinux=0 initcall_debug debug
 		;;
+	qemu-system-aarch64)
+		console=ttyAMA0
+		;;
 	esac
 	if test -n "$TORTURE_QEMU_INTERACTIVE"
 	then
 		echo root=/dev/sda
 	else
-		echo console=ttyS0
+		echo console=$console
 	fi
 }
 
@@ -172,6 +182,9 @@ identify_qemu_args () {
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
 		;;
+	qemu-system-arm|qemu-system-aarch64)
+		echo -machine virt,gic-version=host -cpu host
+		;;
 	qemu-system-ppc64)
 		echo -enable-kvm -M pseries -nodefaults
 		echo -device spapr-vscsi
@@ -229,7 +242,7 @@ specify_qemu_cpus () {
 		echo $2
 	else
 		case "$1" in
-		qemu-system-x86_64|qemu-system-i386)
+		qemu-system-x86_64|qemu-system-i386|qemu-system-aarch64)
 			echo $2 -smp $3
 			;;
 		qemu-system-ppc64)
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 12/16] prcu: Add PRCU Kconfig parameter
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (10 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 11/16] rcutorture: Add basic ARM64 support to run scripts lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 13/16] prcu: Comment source code lianglihao
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 include/linux/prcu.h | 14 ++++++--------
 init/Kconfig         |  7 +++++++
 kernel/rcu/Makefile  |  3 ++-
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index cce967fd..bb20fa40 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -7,8 +7,7 @@
 #include <linux/wait.h>
 #include <linux/completion.h>
 
-#define CONFIG_PRCU
-
+#ifdef CONFIG_PRCU
 struct prcu_version_head {
 	unsigned long long version;
 	struct prcu_version_head *next;
@@ -48,7 +47,6 @@ struct prcu_struct {
 	struct completion barrier_completion;
 };
 
-#ifdef CONFIG_PRCU
 void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
@@ -62,11 +60,11 @@ void prcu_check_callbacks(void);
 
 #else /* #ifdef CONFIG_PRCU */
 
-#define prcu_read_lock() do {} while (0)
-#define prcu_read_unlock() do {} while (0)
-#define synchronize_prcu() do {} while (0)
-#define call_prcu() do {} while (0)
-#define prcu_barrier() do {} while (0)
+#define prcu_read_lock rcu_read_lock
+#define prcu_read_unlock rcu_read_unlock
+#define synchronize_prcu synchronize_rcu
+#define call_prcu call_rcu
+#define prcu_barrier rcu_barrier
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 #define prcu_pending() 0
diff --git a/init/Kconfig b/init/Kconfig
index 1d3475fc..c1fd80f9 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -565,6 +565,13 @@ config TASKS_RCU
 	  only voluntary context switch (not preemption!), idle, and
 	  user-mode execution as quiescent states.
 
+config PRCU
+	bool
+	default y
+	help
+	  This option selects the PRCU implementation based on a fast
+	  consensus protocol.
+
 config RCU_STALL_COMMON
 	def_bool ( TREE_RCU || PREEMPT_RCU || RCU_TRACE )
 	help
diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
index 8791419c..9074b395 100644
--- a/kernel/rcu/Makefile
+++ b/kernel/rcu/Makefile
@@ -2,7 +2,7 @@
 # and is generally not a function of system call inputs.
 KCOV_INSTRUMENT := n
 
-obj-y += update.o sync.o prcu.o
+obj-y += update.o sync.o
 obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
 obj-$(CONFIG_TREE_SRCU) += srcutree.o
 obj-$(CONFIG_TINY_SRCU) += srcutiny.o
@@ -12,4 +12,5 @@ obj-$(CONFIG_TREE_RCU) += tree.o
 obj-$(CONFIG_PREEMPT_RCU) += tree.o
 obj-$(CONFIG_TREE_RCU_TRACE) += tree_trace.o
 obj-$(CONFIG_TINY_RCU) += tiny.o
+obj-$(CONFIG_PRCU) += prcu.o
 obj-$(CONFIG_RCU_NEED_SEGCBLIST) += rcu_segcblist.o
-- 
2.14.1.729.g59c0ea183

* [PATCH RFC 13/16] prcu: Comment source code
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (11 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 12/16] prcu: Add PRCU Kconfig parameter lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 14/16] rcuperf: Add config files with various CONFIG_NR_CPUS lianglihao
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 include/linux/prcu.h |  73 ++++++++++++++++-----
 kernel/rcu/prcu.c    | 178 +++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 225 insertions(+), 26 deletions(-)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index bb20fa40..9f740985 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -1,3 +1,11 @@
+/*
+ * Read-Copy Update mechanism for mutual exclusion (PRCU version).
+ * PRCU public definitions.
+ *
+ * Authors: Heng Zhang <heng.z@huawei.com>
+ *          Lihao Liang <lianglihao@huawei.com>
+ */
+
 #ifndef __LINUX_PRCU_H
 #define __LINUX_PRCU_H
 
@@ -8,12 +16,26 @@
 #include <linux/completion.h>
 
 #ifdef CONFIG_PRCU
+
+/*
+ * Simple list structure of callback versions.
+ *
+ * Note: Ideally, we would like to add the version field
+ * to the rcu_head struct.  But if we do so, other users of
+ * rcu_head in the Linux kernel will complain hard and loudly.
+ */
 struct prcu_version_head {
 	unsigned long long version;
 	struct prcu_version_head *next;
 };
 
-/* Simple unsegmented callback list for PRCU. */
+/*
+ * Simple unsegmented callback list for PRCU.
+ *
+ * Note: Since we can't add a new version field to rcu_head,
+ * we have to make our own callback list for PRCU instead of
+ * using the existing rcu_cblist. Sigh!
+ */
 struct prcu_cblist {
 	struct rcu_head *head;
 	struct rcu_head **tail;
@@ -27,31 +49,47 @@ struct prcu_cblist {
 	.version_head = NULL, .version_tail = &n.version_head, \
 }
 
+/*
+ * PRCU's per-CPU state.
+ */
 struct prcu_local_struct {
-	unsigned int locked;
-	unsigned int online;
-	unsigned long long version;
-	unsigned long long cb_version;
-	struct rcu_head barrier_head;
-	struct prcu_cblist cblist;
+	unsigned int locked;	       /* Nesting level of PRCU read-side */
+				       /*  critical sections */
+	unsigned int online;	       /* Indicates whether a context-switch */
+				       /*  has occurred on this CPU */
+	unsigned long long version;    /* Local grace-period version */
+	unsigned long long cb_version; /* Local callback version */
+	struct rcu_head barrier_head;  /* Callback head for prcu_barrier() */
+	struct prcu_cblist cblist;     /* PRCU callback list (with versions) */
 };
 
+/*
+ * PRCU's global state.
+ */
 struct prcu_struct {
-	atomic64_t global_version;
-	atomic64_t cb_version;
-	atomic_t active_ctr;
-	atomic_t barrier_cpu_count;
-	struct mutex mtx;
-	struct mutex barrier_mtx;
-	wait_queue_head_t wait_q;
-	struct completion barrier_completion;
+	atomic64_t global_version;	      /* Global grace-period version */
+	atomic64_t cb_version;		      /* Global callback version */
+	atomic_t active_ctr;		      /* Outstanding PRCU tasks */
+					      /*  being context-switched */
+	atomic_t barrier_cpu_count;	      /* # CPUs waiting on prcu_barrier() */
+	struct mutex mtx;		      /* Serialize synchronize_prcu() */
+	struct mutex barrier_mtx;	      /* Serialize prcu_barrier() */
+	wait_queue_head_t wait_q;             /* Wait for synchronize_prcu() */
+	struct completion barrier_completion; /* Wait for prcu_barrier() */
 };
 
+/*
+ * PRCU APIs.
+ */
 void prcu_read_lock(void);
 void prcu_read_unlock(void);
 void synchronize_prcu(void);
 void call_prcu(struct rcu_head *head, rcu_callback_t func);
 void prcu_barrier(void);
+
+/*
+ * Internal non-public functions.
+ */
 void prcu_init(void);
 void prcu_note_context_switch(void);
 int prcu_pending(void);
@@ -60,11 +98,16 @@ void prcu_check_callbacks(void);
 
 #else /* #ifdef CONFIG_PRCU */
 
+/*
+ * If CONFIG_PRCU is not defined,
+ * map its APIs to RCU's counterparts.
+ */
 #define prcu_read_lock rcu_read_lock
 #define prcu_read_unlock rcu_read_unlock
 #define synchronize_prcu synchronize_rcu
 #define call_prcu call_rcu
 #define prcu_barrier rcu_barrier
+
 #define prcu_init() do {} while (0)
 #define prcu_note_context_switch() do {} while (0)
 #define prcu_pending() 0
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index 49cb70e6..ef2c7730 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -1,3 +1,17 @@
+/*
+ * Read-Copy Update mechanism for mutual exclusion (PRCU version).
+ * This PRCU implementation is based on a fast consensus protocol
+ * published in the following paper:
+ *
+ * Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
+ * Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
+ * IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
+ * https://dl.acm.org/citation.cfm?id=3024114.3024143
+ *
+ * Authors: Heng Zhang <heng.z@huawei.com>
+ *          Lihao Liang <lianglihao@huawei.com>
+ */
+
 #include <linux/smp.h>
 #include <linux/percpu.h>
 #include <linux/prcu.h>
@@ -8,8 +22,16 @@
 
 #include "rcu.h"
 
+/* Data structures. */
+
+/*
+ * Define PRCU's per-CPU local structure.
+ */
 DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
 
+/*
+ * Initialize PRCU's global structure.
+ */
 struct prcu_struct global_prcu = {
 	.global_version = ATOMIC64_INIT(0),
 	.cb_version = ATOMIC64_INIT(0),
@@ -20,7 +42,9 @@ struct prcu_struct global_prcu = {
 };
 struct prcu_struct *prcu = &global_prcu;
 
-/* Initialize simple callback list. */
+/*
+ * Initialize simple PRCU callback list.
+ */
 static void prcu_cblist_init(struct prcu_cblist *rclp)
 {
 	rclp->head = NULL;
@@ -31,8 +55,8 @@ static void prcu_cblist_init(struct prcu_cblist *rclp)
 }
 
 /*
- * Dequeue the oldest rcu_head structure from the specified callback list;
- * store the callback grace period version number into the version pointer.
+ * Dequeue the oldest rcu_head structure from the specified callback list.
+ * Store the callback version number into the version pointer.
  */
 static struct rcu_head *prcu_cblist_dequeue(struct prcu_cblist *rclp)
 {
@@ -59,6 +83,11 @@ static struct rcu_head *prcu_cblist_dequeue(struct prcu_cblist *rclp)
 	return rhp;
 }
 
+/* PRCU function implementations. */
+
+/*
+ * Update local PRCU state of the current CPU.
+ */
 static inline void prcu_report(struct prcu_local_struct *local)
 {
 	unsigned long long global_version;
@@ -70,6 +99,15 @@ static inline void prcu_report(struct prcu_local_struct *local)
 		cmpxchg(&local->version, local_version, global_version);
 }
 
+/*
+ * Mark the beginning of a PRCU read-side critical section.
+ *
+ * A CPU is in a PRCU quiescent state when both its local ->locked
+ * and ->online variables are 0.
+ *
+ * See prcu_read_unlock() and synchronize_prcu() for more information.
+ * Also see rcu_read_lock() comment header.
+ */
 void prcu_read_lock(void)
 {
 	struct prcu_local_struct *local;
@@ -77,29 +115,50 @@ void prcu_read_lock(void)
 	local = get_cpu_ptr(&prcu_local);
 	if (!local->online) {
 		WRITE_ONCE(local->online, 1);
+		/*
+		 * Memory barrier is needed for PRCU writers
+		 * to see the updated local->online value.
+		 */
 		smp_mb();
 	}
-
 	local->locked++;
+	/*
+	 * Critical section after entry code.
+	 * put_cpu_ptr() provides the needed barrier().
+	 */
 	put_cpu_ptr(&prcu_local);
 }
 EXPORT_SYMBOL(prcu_read_lock);
 
+/*
+ * Mark the end of a PRCU read-side critical section.
+ *
+ * See prcu_read_lock() and synchronize_prcu() for more information.
+ * Also see rcu_read_unlock() comment header.
+ */
 void prcu_read_unlock(void)
 {
 	int locked;
 	struct prcu_local_struct *local;
 
-	barrier();
+	barrier(); /* Critical section before exit code. */
 	local = get_cpu_ptr(&prcu_local);
 	locked = local->locked;
 	if (locked) {
 		local->locked--;
+		/*
+		 * If this was the last outstanding PRCU read-side
+		 * critical section on this CPU, update its PRCU state.
+		 */
 		if (locked == 1)
 			prcu_report(local);
 		put_cpu_ptr(&prcu_local);
 	} else {
 		put_cpu_ptr(&prcu_local);
+		/*
+		 * If we are executing the last outstanding
+		 * PRCU task, wake up synchronize_prcu().
+		 */
 		if (!atomic_dec_return(&prcu->active_ctr))
 			wake_up(&prcu->wait_q);
 	}
@@ -111,10 +170,25 @@ static void prcu_handler(void *info)
 	struct prcu_local_struct *local;
 
 	local = this_cpu_ptr(&prcu_local);
+	/*
+	 * We need to do this check locally on the current CPU
+	 * because no memory barrier is used for ->locked so
+	 * PRCU writers may not see its latest local value.
+	 */
 	if (!local->locked)
 		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
 }
 
+/*
+ * Wait until a grace period has completed.
+ *
+ * A PRCU grace period can end once each CPU has passed through a PRCU
+ * quiescent state -and- the global variable ->active_ctr is 0, that is,
+ * all pre-existing PRCU read-side critical sections have completed.
+ *
+ * See prcu_read_lock() and prcu_read_unlock() for more information.
+ * Also see synchronize_rcu() comment header.
+ */
 void synchronize_prcu(void)
 {
 	int cpu;
@@ -122,7 +196,13 @@ void synchronize_prcu(void)
 	unsigned long long version;
 	struct prcu_local_struct *local;
 
+	/*
+	 * Get the new global grace-period version before taking the mutex,
+	 * so that multiple concurrent synchronize_prcu() calls waiting on
+	 * PRCU readers can return in a timely fashion.
+	 */
 	version = atomic64_add_return(1, &prcu->global_version);
+	/* Take mutex to serialize concurrent synchronize_prcu() calls. */
 	mutex_lock(&prcu->mtx);
 
 	local = get_cpu_ptr(&prcu_local);
@@ -130,8 +210,14 @@ void synchronize_prcu(void)
 	put_cpu_ptr(&prcu_local);
 
 	cpumask_clear(&cpus);
+	/* Send an IPI to force straggling CPUs to update their PRCU state. */
 	for_each_possible_cpu(cpu) {
 		local = per_cpu_ptr(&prcu_local, cpu);
+		/*
+		 * If no PRCU tasks are currently running on this CPU
+		 * or a context-switch has occurred, the CPU-local PRCU
+		 * state has already been updated.
+		 */
 		if (!READ_ONCE(local->online))
 			continue;
 		if (READ_ONCE(local->version) < version) {
@@ -140,34 +226,46 @@ void synchronize_prcu(void)
 		}
 	}
 
+	/* Wait for outstanding CPUs to commit. */
 	for_each_cpu(cpu, &cpus) {
 		local = per_cpu_ptr(&prcu_local, cpu);
 		while (READ_ONCE(local->version) < version)
 			cpu_relax();
 	}
 
+	/* Wait for outstanding PRCU tasks to finish. */
 	if (atomic_read(&prcu->active_ctr))
 		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
-
+	/* Update the global callback version to its grace-period version. */
 	atomic64_set(&prcu->cb_version, version);
 	mutex_unlock(&prcu->mtx);
 }
 EXPORT_SYMBOL(synchronize_prcu);
 
+/*
+ * Update PRCU state when a context-switch occurs.
+ */
 void prcu_note_context_switch(void)
 {
 	struct prcu_local_struct *local;
 
 	local = get_cpu_ptr(&prcu_local);
+	/* Move this CPU's outstanding reader count to global ->active_ctr. */
 	if (local->locked) {
 		atomic_add(local->locked, &prcu->active_ctr);
 		local->locked = 0;
 	}
+	/* Indicate a context-switch has occurred on this CPU. */
 	local->online = 0;
+	/* Update this CPU's local PRCU state. */
 	prcu_report(local);
 	put_cpu_ptr(&prcu_local);
 }
 
+/*
+ * Queue a PRCU callback to the current CPU for invocation
+ * after a grace period.
+ */
 void call_prcu(struct rcu_head *head, rcu_callback_t func)
 {
 	unsigned long flags;
@@ -177,8 +275,12 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func)
 
 	debug_rcu_head_queue(head);
 
-	/* Use GFP_ATOMIC with IRQs disabled */
+	/* Use GFP_ATOMIC with IRQs disabled. */
 	vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
+	/*
+	 * Complain about kmalloc() failure.  This could be handled
+	 * in a different way, e.g. return -1 to inform the caller.
+	 */
 	if (!vhp) {
 		WARN_ON(1);
 		return;
@@ -188,8 +290,13 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func)
 	head->next = NULL;
 	vhp->next = NULL;
 
+	/* Disable IRQs to prevent races with prcu_process_callbacks(). */
 	local_irq_save(flags);
 	local = this_cpu_ptr(&prcu_local);
+	/*
+	 * Tag the callback with this CPU's local grace-period version
+	 * and add it to the PRCU callback list of the current CPU.
+	 */
 	vhp->version = local->version;
 	rclp = &local->cblist;
 	rclp->len++;
@@ -201,6 +308,13 @@ void call_prcu(struct rcu_head *head, rcu_callback_t func)
 }
 EXPORT_SYMBOL(call_prcu);
 
+/*
+ * Check to see if there is any immediate PRCU-related work
+ * to be done by the current CPU, returning 1 if so.
+ *
+ * Currently, it only checks whether this CPU has callbacks
+ * that are ready to invoke.
+ */
 int prcu_pending(void)
 {
 	struct prcu_local_struct *local = get_cpu_ptr(&prcu_local);
@@ -211,18 +325,33 @@ int prcu_pending(void)
 	return cb_version < atomic64_read(&prcu->cb_version) && rclp->head;
 }
 
+/*
+ * Perform PRCU core processing for the current CPU using softirq.
+ */
 void invoke_prcu_core(void)
 {
 	if (cpu_online(smp_processor_id()))
 		raise_softirq(PRCU_SOFTIRQ);
 }
 
+/*
+ * Schedule PRCU core processing.
+ *
+ * This function must be called from hardirq context.
+ * It is normally invoked from the scheduling-clock interrupt.
+ */
 void prcu_check_callbacks(void)
 {
 	if (prcu_pending())
 		invoke_prcu_core();
 }
 
+/*
+ * Process PRCU callbacks whose grace period has completed.
+ * Do this using softirq for each CPU.
+ *
+ * Also see the prcu_barrier() comment header.
+ */
 static __latent_entropy void prcu_process_callbacks(struct softirq_action *unused)
 {
 	unsigned long flags;
@@ -237,18 +366,24 @@ static __latent_entropy void prcu_process_callbacks(struct softirq_action *unuse
 
 	cb_version = atomic64_read(&prcu->cb_version);
 
-	/* Disable interrupts to prevent races with call_prcu() */
+	/* Disable IRQs to prevent races with call_prcu(). */
 	local_irq_save(flags);
 	local = this_cpu_ptr(&prcu_local);
 	rclp = &local->cblist;
 	rhp = rclp->head;
 	vhp = rclp->version_head;
+	/*
+	 * Invoke PRCU callbacks whose version number is smaller than
+	 * the global PRCU callback version, that is, callbacks whose
+	 * associated grace periods have completed.
+	 */
 	for (; rhp && vhp && vhp->version < cb_version;
 	     rhp = rclp->head, vhp = rclp->version_head) {
 		rhp = prcu_cblist_dequeue(rclp);
 		debug_rcu_head_unqueue(rhp);
 		rhp->func(rhp);
 	}
+	/* Record the callback version up to which callbacks were invoked. */
 	local->cb_version = cb_version;
 	local_irq_restore(flags);
 }
@@ -274,7 +409,18 @@ static void prcu_barrier_func(void *info)
 	call_prcu(&local->barrier_head, prcu_barrier_callback);
 }
 
-/* Waiting for all PRCU callbacks to complete. */
+/*
+ * Wait for all PRCU callbacks to complete.
+ *
+ * NOTE: The current PRCU implementation relies on synchronize_prcu()
+ * to update its global grace-period and callback version numbers.
+ * If there is no synchronize_prcu() running and call_prcu() is called,
+ * prcu_process_callbacks() won't make progress and prcu_barrier() will
+ * -not- return.
+ *
+ * This needs to be fixed, e.g. using a grace-period expediting mechanism
+ * as found in the Linux-kernel RCU implementation.
+ */
 void prcu_barrier(void)
 {
 	int cpu;
@@ -292,9 +438,13 @@ void prcu_barrier(void)
 
 	/*
 	 * Register a new callback on each CPU using IPI to prevent races
-	 * with call_prcu(). When that callback is invoked, we will know
+	 * with call_prcu().  When that callback is invoked, we will know
 	 * that all of the corresponding CPU's preceding callbacks have
-	 * been invoked.
+	 * been invoked. Note that we must use the wait version of
+	 * smp_call_function_single().  Otherwise prcu_barrier_func()
+	 * might not finish incrementing prcu->barrier_cpu_count and
+	 * registering prcu_barrier_callback() on -each- CPU before
+	 * we exit the loop and wait for completion. Hence a bug!
 	 */
 	for_each_possible_cpu(cpu)
 		smp_call_function_single(cpu, prcu_barrier_func, NULL, 1);
@@ -315,6 +465,9 @@ void prcu_barrier(void)
 }
 EXPORT_SYMBOL(prcu_barrier);
 
+/*
+ * Helper function for prcu_init() to initialize PRCU's CPU-local structure.
+ */
 void prcu_init_local_struct(int cpu)
 {
 	struct prcu_local_struct *local;
@@ -327,6 +480,9 @@ void prcu_init_local_struct(int cpu)
 	prcu_cblist_init(&local->cblist);
 }
 
+/*
+ * Initialize PRCU at boot time.
+ */
 void __init prcu_init(void)
 {
 	int cpu;
-- 
2.14.1.729.g59c0ea183
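
The reader-side bookkeeping described in the comments above (per-CPU
->locked and ->online, the global ->active_ctr, and the version check in
synchronize_prcu()) can be modelled in a single-threaded user-space
sketch.  This is an assumption-laden illustration, not the kernel code:
every demo_* name is hypothetical, there are no memory barriers, atomics
or real IPIs, and the IPI handler is folded into demo_gp_can_end() as
"a CPU with no readers would update its version when asked".

```c
#include <assert.h>

#define DEMO_NR_CPUS 2

/* Hypothetical per-CPU state, mirroring the fields commented above. */
struct demo_cpu {
	unsigned int locked;        /* outstanding readers on this CPU */
	unsigned int online;        /* a reader ran since the last switch */
	unsigned long long version; /* local grace-period version */
};

static struct demo_cpu demo_cpus[DEMO_NR_CPUS];
static unsigned long long demo_global_version;
static int demo_active_ctr;         /* readers that were switched out */

/* Catch the local version up with the global one. */
static void demo_report(struct demo_cpu *c)
{
	if (c->version < demo_global_version)
		c->version = demo_global_version;
}

static void demo_read_lock(struct demo_cpu *c)
{
	if (!c->online)
		c->online = 1;
	c->locked++;
}

static void demo_read_unlock(struct demo_cpu *c)
{
	if (c->locked) {
		/* Reader never left this CPU. */
		if (--c->locked == 0)
			demo_report(c);
	} else {
		/* Reader was switched out earlier: it lives in active_ctr. */
		demo_active_ctr--;
		assert(demo_active_ctr >= 0);
	}
}

static void demo_context_switch(struct demo_cpu *c)
{
	/* Hand the CPU's outstanding readers over to the global counter. */
	demo_active_ctr += (int)c->locked;
	c->locked = 0;
	c->online = 0;
	demo_report(c);
}

/* Could a grace period targeting 'target' end right now? */
static int demo_gp_can_end(unsigned long long target)
{
	for (int i = 0; i < DEMO_NR_CPUS; i++) {
		struct demo_cpu *c = &demo_cpus[i];

		/* Only a CPU with running readers and a stale version blocks. */
		if (c->online && c->locked && c->version < target)
			return 0;
	}
	return demo_active_ctr == 0;
}
```

A reader that starts on CPU 0, gets context-switched, and finishes on
CPU 1 keeps the grace period open through ->active_ctr until its
unlock, which is the case the two-branch prcu_read_unlock() handles.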

^ permalink raw reply related	[flat|nested] 43+ messages in thread
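
The callback rule used by call_prcu() and prcu_process_callbacks() in
the patch above (tag each callback with a version at enqueue time,
invoke it once the global callback version has moved past that tag) can
be sketched as follows.  This is a simplified user-space model under
stated assumptions: the demo_* names are hypothetical, and the version
is stored directly in the list node, whereas the patch keeps a separate
prcu_version_head list because rcu_head cannot be extended.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback node carrying its own version tag. */
struct demo_cb {
	unsigned long long version;       /* version at enqueue time */
	void (*func)(struct demo_cb *cb);
	struct demo_cb *next;
};

/* Simple unsegmented list with a tail pointer for O(1) enqueue. */
struct demo_cblist {
	struct demo_cb *head;
	struct demo_cb **tail;
	long len;
};

static void demo_cblist_init(struct demo_cblist *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

/* Enqueue a callback, tagging it with the caller's local version. */
static void demo_call(struct demo_cblist *l, struct demo_cb *cb,
		      unsigned long long local_version,
		      void (*func)(struct demo_cb *cb))
{
	cb->version = local_version;
	cb->func = func;
	cb->next = NULL;
	*l->tail = cb;
	l->tail = &cb->next;
	l->len++;
}

/* Invoke, oldest first, every callback with version < cb_version. */
static int demo_process(struct demo_cblist *l, unsigned long long cb_version)
{
	int invoked = 0;

	while (l->head && l->head->version < cb_version) {
		struct demo_cb *cb = l->head;

		l->head = cb->next;
		if (!l->head)
			l->tail = &l->head;
		l->len--;
		cb->func(cb);
		invoked++;
	}
	return invoked;
}

static int demo_invocations;

/* Example callback that only counts how often it ran. */
static void demo_count(struct demo_cb *cb)
{
	(void)cb;
	demo_invocations++;
}
```

In this model, a callback tagged with version 0 stays queued for as long
as the processing side is handed callback version 0, which mirrors the
shortcoming noted in the prcu_barrier() comment: nothing is invoked
until something advances the global callback version.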

* [PATCH RFC 14/16] rcuperf: Add config files with various CONFIG_NR_CPUS
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (12 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 13/16] prcu: Comment source code lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-23  7:59 ` [PATCH RFC 15/16] rcutorture: Add scripts to run experiments lianglihao
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 .../selftests/rcutorture/configs/rcuperf/PRCU-12    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-12.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-14    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-14.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-15    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-15.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-16    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-16.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-2     | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-2.boot          |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-32    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-32.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-4     | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-4.boot          |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-48    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-48.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-56    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-56.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-60    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-60.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-62    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-62.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-64    | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-64.boot         |  1 +
 .../selftests/rcutorture/configs/rcuperf/PRCU-8     | 21 +++++++++++++++++++++
 .../rcutorture/configs/rcuperf/PRCU-8.boot          |  1 +
 .../selftests/rcutorture/configs/rcuperf/TREE-12    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-14    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-15    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-16    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-2     | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-32    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-4     | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-48    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-56    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-60    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-62    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-64    | 21 +++++++++++++++++++++
 .../selftests/rcutorture/configs/rcuperf/TREE-8     | 21 +++++++++++++++++++++
 39 files changed, 559 insertions(+)
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64
 create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8

diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
new file mode 100644
index 00000000..4ba9bf0d
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=12
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
new file mode 100644
index 00000000..9e3999c5
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=14
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
new file mode 100644
index 00000000..5faf3c94
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=15
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
new file mode 100644
index 00000000..2b1fc756
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=16
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
new file mode 100644
index 00000000..7447ccc3
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=2
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
new file mode 100644
index 00000000..b7586093
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=32
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
new file mode 100644
index 00000000..d14698ba
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=4
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
new file mode 100644
index 00000000..99d9f4aa
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=48
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
new file mode 100644
index 00000000..c77bed56
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=56
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
new file mode 100644
index 00000000..131e99ae
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=60
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
new file mode 100644
index 00000000..24a550b2
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=62
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
new file mode 100644
index 00000000..257ace8b
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=64
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8 b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8
new file mode 100644
index 00000000..35d313ef
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=8
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot
new file mode 100644
index 00000000..7e54ea55
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot
@@ -0,0 +1 @@
+rcuperf.perf_type=prcu
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12
new file mode 100644
index 00000000..4ba9bf0d
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=12
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14
new file mode 100644
index 00000000..9e3999c5
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=14
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15
new file mode 100644
index 00000000..5faf3c94
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=15
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16
new file mode 100644
index 00000000..2b1fc756
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=16
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2
new file mode 100644
index 00000000..7447ccc3
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=2
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32
new file mode 100644
index 00000000..b7586093
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=32
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4
new file mode 100644
index 00000000..d14698ba
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=4
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48
new file mode 100644
index 00000000..99d9f4aa
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=48
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56
new file mode 100644
index 00000000..c77bed56
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=56
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60
new file mode 100644
index 00000000..131e99ae
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=60
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62
new file mode 100644
index 00000000..24a550b2
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=62
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64
new file mode 100644
index 00000000..257ace8b
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=64
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8 b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8
new file mode 100644
index 00000000..35d313ef
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8
@@ -0,0 +1,21 @@
+CONFIG_SMP=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_NR_CPUS=8
+#CHECK#CONFIG_PREEMPT_RCU=y
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_RCU_FAST_NO_HZ=n
+CONFIG_RCU_TRACE=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_RCU_TRACE=y
-- 
2.14.1.729.g59c0ea183
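
Editorial note on the config fragments above: in the hunks shown here the
PRCU-N and TREE-N files appear to differ only in CONFIG_NR_CPUS (the index
hashes of PRCU-48 and TREE-48 match), and only the PRCU-N files come with a
.boot file.  A minimal sketch of generating such a set follows.  The function
name gen_rcuperf_configs and its arguments are hypothetical and not part of
the patch; the fragment body is copied from the hunks as-is, including the
CONFIG_RCU_TRACE entry that the patch lists twice.

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch.
# gen_rcuperf_configs DIR N...: for each N, write DIR/TREE-N, DIR/PRCU-N
# and DIR/PRCU-N.boot, mirroring the fragments in the patch.
gen_rcuperf_configs () {
	local dir=$1 n template
	shift
	# @N@ is a placeholder for the CPU count.  CONFIG_RCU_TRACE appears
	# twice because the fragments in the patch list it twice.
	template='CONFIG_SMP=y
CONFIG_PREEMPT_NONE=n
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=y
CONFIG_NR_CPUS=@N@
#CHECK#CONFIG_PREEMPT_RCU=y
CONFIG_HZ_PERIODIC=n
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ_FULL=n
CONFIG_RCU_FAST_NO_HZ=n
CONFIG_RCU_TRACE=n
CONFIG_HOTPLUG_CPU=n
CONFIG_SUSPEND=n
CONFIG_HIBERNATION=n
CONFIG_RCU_NOCB_CPU=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_PROVE_LOCKING=n
CONFIG_RCU_BOOST=n
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
CONFIG_RCU_EXPERT=y
CONFIG_RCU_TRACE=y'
	for n in "$@"
	do
		# Substitute the CPU count into the shared template.
		printf '%s\n' "${template/@N@/$n}" > "$dir/TREE-$n"
		cp "$dir/TREE-$n" "$dir/PRCU-$n"
		# Only the PRCU variant selects the prcu perf type at boot.
		echo "rcuperf.perf_type=prcu" > "$dir/PRCU-$n.boot"
	done
}

# Example (not run here): gen_rcuperf_configs configs/rcuperf 4 8 16
```

Keeping one template makes it harder for the per-CPU-count files to drift
apart, at the cost of one more script to maintain.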

* [PATCH RFC 15/16] rcutorture: Add scripts to run experiments
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (13 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 14/16] rcuperf: Add config files with various CONFIG_NR_CPUS lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-25  6:28   ` Paul E. McKenney
  2018-01-23  7:59 ` [PATCH RFC 16/16] Add GPLv2 license lianglihao
  2018-01-25  5:53 ` [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol Paul E. McKenney
  16 siblings, 1 reply; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 kvm.sh         | 452 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 run-rcuperf.sh |  26 ++++
 2 files changed, 478 insertions(+)
 create mode 100755 kvm.sh
 create mode 100755 run-rcuperf.sh

diff --git a/kvm.sh b/kvm.sh
new file mode 100755
index 00000000..3b3c1b69
--- /dev/null
+++ b/kvm.sh
@@ -0,0 +1,452 @@
+#!/bin/bash
+#
+# Run a series of 14 tests under KVM.  These are not particularly
+# well-selected or well-tuned, but are the current set.  Run from the
+# top level of the source tree.
+#
+# Edit the definitions below to set the locations of the various directories,
+# as well as the test duration.
+#
+# Usage: kvm.sh [ options ]
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+# Copyright (C) IBM Corporation, 2011
+#
+# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
+
+scriptname=$0
+args="$*"
+
+T=/tmp/kvm.sh.$$
+trap 'rm -rf $T' 0
+mkdir $T
+
+dur=$((30*60))
+dryrun=""
+KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM
+PATH=${KVM}/bin:$PATH; export PATH
+TORTURE_DEFCONFIG=defconfig
+TORTURE_BOOT_IMAGE=""
+TORTURE_INITRD="$KVM/initrd"; export TORTURE_INITRD
+TORTURE_KMAKE_ARG=""
+TORTURE_SHUTDOWN_GRACE=180
+TORTURE_SUITE=rcu
+resdir=""
+configs=""
+cpus=0
+ds=`date +%Y.%m.%d-%H:%M:%S`
+jitter="-1"
+
+. functions.sh
+
+usage () {
+	echo "Usage: $scriptname optional arguments:"
+	echo "       --bootargs kernel-boot-arguments"
+	echo "       --bootimage relative-path-to-kernel-boot-image"
+	echo "       --buildonly"
+	echo "       --configs \"config-file list w/ repeat factor (3*TINY01)\""
+	echo "       --cpus N"
+	echo "       --datestamp string"
+	echo "       --defconfig string"
+	echo "       --dryrun sched|script"
+	echo "       --duration minutes"
+	echo "       --interactive"
+	echo "       --jitter N [ maxsleep (us) [ maxspin (us) ] ]"
+	echo "       --kmake-arg kernel-make-arguments"
+	echo "       --mac nn:nn:nn:nn:nn:nn"
+	echo "       --no-initrd"
+	echo "       --qemu-args qemu-system-..."
+	echo "       --qemu-cmd qemu-system-..."
+	echo "       --results absolute-pathname"
+	echo "       --torture rcu"
+	exit 1
+}
+
+while test $# -gt 0
+do
+	case "$1" in
+	--bootargs|--bootarg)
+		checkarg --bootargs "(list of kernel boot arguments)" "$#" "$2" '.*' '^--'
+		TORTURE_BOOTARGS="$2"
+		shift
+		;;
+	--bootimage)
+		checkarg --bootimage "(relative path to kernel boot image)" "$#" "$2" '[a-zA-Z0-9][a-zA-Z0-9_]*' '^--'
+		TORTURE_BOOT_IMAGE="$2"
+		shift
+		;;
+	--buildonly)
+		TORTURE_BUILDONLY=1
+		;;
+	--configs|--config)
+		checkarg --configs "(list of config files)" "$#" "$2" '^[^/]*$' '^--'
+		configs="$2"
+		shift
+		;;
+	--cpus)
+		checkarg --cpus "(number)" "$#" "$2" '^[0-9]*$' '^--'
+		cpus=$2
+		shift
+		;;
+	--datestamp)
+		checkarg --datestamp "(relative pathname)" "$#" "$2" '^[^/]*$' '^--'
+		ds=$2
+		shift
+		;;
+	--defconfig)
+		checkarg --defconfig "defconfigtype" "$#" "$2" '^[^/][^/]*$' '^--'
+		TORTURE_DEFCONFIG=$2
+		shift
+		;;
+	--dryrun)
+		checkarg --dryrun "sched|script" $# "$2" 'sched\|script' '^--'
+		dryrun=$2
+		shift
+		;;
+	--duration)
+		checkarg --duration "(minutes)" $# "$2" '^[0-9]*$' '^error'
+		dur=$(($2*60))
+		shift
+		;;
+	--interactive)
+		TORTURE_QEMU_INTERACTIVE=1; export TORTURE_QEMU_INTERACTIVE
+		;;
+	--jitter)
+		checkarg --jitter "(# threads [ sleep [ spin ] ])" $# "$2" '^-\{,1\}[0-9]\+\( \+[0-9]\+\)\{,2\} *$' '^error$'
+		jitter="$2"
+		shift
+		;;
+	--kmake-arg)
+		checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
+		TORTURE_KMAKE_ARG="$2"
+		shift
+		;;
+	--mac)
+		checkarg --mac "(MAC address)" $# "$2" '^\([0-9a-fA-F]\{2\}:\)\{5\}[0-9a-fA-F]\{2\}$' error
+		TORTURE_QEMU_MAC=$2
+		shift
+		;;
+	--no-initrd)
+		TORTURE_INITRD=""; export TORTURE_INITRD
+		;;
+	--qemu-args|--qemu-arg)
+		checkarg --qemu-args "-qemu args" $# "$2" '^-' '^error'
+		TORTURE_QEMU_ARG="$2"
+		shift
+		;;
+	--qemu-cmd)
+		checkarg --qemu-cmd "(qemu-system-...)" $# "$2" 'qemu-system-' '^--'
+		TORTURE_QEMU_CMD="$2"
+		shift
+		;;
+	--results)
+		checkarg --results "(absolute pathname)" "$#" "$2" '^/' '^error'
+		resdir=$2
+		shift
+		;;
+	--shutdown-grace)
+		checkarg --shutdown-grace "(seconds)" "$#" "$2" '^[0-9]*$' '^error'
+		TORTURE_SHUTDOWN_GRACE=$2
+		shift
+		;;
+	--torture)
+		checkarg --torture "(suite name)" "$#" "$2" '^\(lock\|rcu\|rcuperf\)$' '^--'
+		TORTURE_SUITE=$2
+		shift
+		;;
+	*)
+		echo Unknown argument $1
+		usage
+		;;
+	esac
+	shift
+done
+
+CONFIGFRAG=${KVM}/configs/${TORTURE_SUITE}; export CONFIGFRAG
+
+if test -z "$configs"
+then
+	configs="`cat $CONFIGFRAG/CFLIST`"
+fi
+
+if test -z "$resdir"
+then
+	resdir=$KVM/res
+fi
+
+# Create a file of test-name/#cpus pairs, sorted by decreasing #cpus.
+touch $T/cfgcpu
+for CF in $configs
+do
+	case $CF in
+	[0-9]\**|[0-9][0-9]\**|[0-9][0-9][0-9]\**)
+		config_reps=`echo $CF | sed -e 's/\*.*$//'`
+		CF1=`echo $CF | sed -e 's/^[^*]*\*//'`
+		;;
+	*)
+		config_reps=1
+		CF1=$CF
+		;;
+	esac
+	if test -f "$CONFIGFRAG/$CF1"
+	then
+		cpu_count=`configNR_CPUS.sh $CONFIGFRAG/$CF1`
+		cpu_count=`configfrag_boot_cpus "$TORTURE_BOOTARGS" "$CONFIGFRAG/$CF1" "$cpu_count"`
+		for ((cur_rep=0;cur_rep<$config_reps;cur_rep++))
+		do
+			echo $CF1 $cpu_count >> $T/cfgcpu
+		done
+	else
+		echo "The --configs file $CF1 does not exist, terminating."
+		exit 1
+	fi
+done
+sort -k2nr $T/cfgcpu > $T/cfgcpu.sort
+
+# Use a greedy bin-packing algorithm, sorting the list accordingly.
+awk < $T/cfgcpu.sort > $T/cfgcpu.pack -v ncpus=$cpus '
+BEGIN {
+	njobs = 0;
+}
+
+{
+	# Read file of tests and corresponding required numbers of CPUs.
+	cf[njobs] = $1;
+	cpus[njobs] = $2;
+	njobs++;
+}
+
+END {
+	alldone = 0;
+	batch = 0;
+	nc = -1;
+
+	# Each pass through the following loop creates one test batch
+	# that can be executed concurrently given ncpus.  Note that a
+	# given test that requires more than the available CPUs will run in
+	# its own batch.  Such tests just have to make do with what
+	# is available.
+	while (nc != ncpus) {
+		batch++;
+		nc = ncpus;
+
+		# Each pass through the following loop considers one
+		# test for inclusion in the current batch.
+		for (i = 0; i < njobs; i++) {
+			if (done[i])
+				continue; # Already part of a batch.
+			if (nc >= cpus[i] || nc == ncpus) {
+
+				# This test fits into the current batch.
+				done[i] = batch;
+				nc -= cpus[i];
+				if (nc <= 0)
+					break; # Too-big test in its own batch.
+			}
+		}
+	}
+
+	# Dump out the tests in batch order.
+	for (b = 1; b <= batch; b++)
+		for (i = 0; i < njobs; i++)
+			if (done[i] == b)
+				print cf[i], cpus[i];
+}'
+
+# Generate a script to execute the tests in appropriate batches.
+cat << ___EOF___ > $T/script
+CONFIGFRAG="$CONFIGFRAG"; export CONFIGFRAG
+KVM="$KVM"; export KVM
+PATH="$PATH"; export PATH
+TORTURE_BOOT_IMAGE="$TORTURE_BOOT_IMAGE"; export TORTURE_BOOT_IMAGE
+TORTURE_BUILDONLY="$TORTURE_BUILDONLY"; export TORTURE_BUILDONLY
+TORTURE_DEFCONFIG="$TORTURE_DEFCONFIG"; export TORTURE_DEFCONFIG
+TORTURE_INITRD="$TORTURE_INITRD"; export TORTURE_INITRD
+TORTURE_KMAKE_ARG="$TORTURE_KMAKE_ARG"; export TORTURE_KMAKE_ARG
+TORTURE_QEMU_CMD="$TORTURE_QEMU_CMD"; export TORTURE_QEMU_CMD
+TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE"; export TORTURE_QEMU_INTERACTIVE
+TORTURE_QEMU_MAC="$TORTURE_QEMU_MAC"; export TORTURE_QEMU_MAC
+TORTURE_SHUTDOWN_GRACE="$TORTURE_SHUTDOWN_GRACE"; export TORTURE_SHUTDOWN_GRACE
+TORTURE_SUITE="$TORTURE_SUITE"; export TORTURE_SUITE
+if ! test -e $resdir
+then
+	mkdir -p "$resdir" || :
+fi
+mkdir $resdir/$ds
+echo Results directory: $resdir/$ds
+echo $scriptname $args
+touch $resdir/$ds/log
+echo $scriptname $args >> $resdir/$ds/log
+echo ${TORTURE_SUITE} > $resdir/$ds/TORTURE_SUITE
+pwd > $resdir/$ds/testid.txt
+if test -d .git
+then
+	git status >> $resdir/$ds/testid.txt
+	git rev-parse HEAD >> $resdir/$ds/testid.txt
+	if ! git diff HEAD > $T/git-diff 2>&1
+	then
+		cp $T/git-diff $resdir/$ds
+	fi
+fi
+___EOF___
+awk < $T/cfgcpu.pack \
+	-v TORTURE_BUILDONLY="$TORTURE_BUILDONLY" \
+	-v CONFIGDIR="$CONFIGFRAG/" \
+	-v KVM="$KVM" \
+	-v ncpus=$cpus \
+	-v jitter="$jitter" \
+	-v rd=$resdir/$ds/ \
+	-v dur=$dur \
+	-v TORTURE_QEMU_ARG="$TORTURE_QEMU_ARG" \
+	-v TORTURE_BOOTARGS="$TORTURE_BOOTARGS" \
+'BEGIN {
+	i = 0;
+}
+
+{
+	cf[i] = $1;
+	cpus[i] = $2;
+	i++;
+}
+
+# Dump out the scripting required to run one test batch.
+function dump(first, pastlast, batchnum)
+{
+	print "echo ----Start batch " batchnum ": `date`";
+	print "echo ----Start batch " batchnum ": `date` >> " rd "/log";
+	jn=1
+	for (j = first; j < pastlast; j++) {
+		builddir=KVM "/b" jn
+		cpusr[jn] = cpus[j];
+		if (cfrep[cf[j]] == "") {
+			cfr[jn] = cf[j];
+			cfrep[cf[j]] = 1;
+		} else {
+			cfrep[cf[j]]++;
+			cfr[jn] = cf[j] "." cfrep[cf[j]];
+		}
+		if (cpusr[jn] > ncpus && ncpus != 0)
+			ovf = "-ovf";
+		else
+			ovf = "";
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date`";
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` >> " rd "/log";
+		print "rm -f " builddir ".*";
+		print "touch " builddir ".wait";
+		print "mkdir " builddir " > /dev/null 2>&1 || :";
+		print "mkdir " rd cfr[jn] " || :";
+		print "kvm-test-1-run.sh " CONFIGDIR cf[j], builddir, rd cfr[jn], dur " \"" TORTURE_QEMU_ARG "\" \"" TORTURE_BOOTARGS "\" > " rd cfr[jn]  "/kvm-test-1-run.sh.out 2>&1 &"
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date`";
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` >> " rd "/log";
+		print "while test -f " builddir ".wait"
+		print "do"
+		print "\tsleep 1"
+		print "done"
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date`";
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date` >> " rd "/log";
+		jn++;
+	}
+	for (j = 1; j < jn; j++) {
+		builddir=KVM "/b" j
+		print "rm -f " builddir ".ready"
+		print "if test -z \"$TORTURE_BUILDONLY\""
+		print "then"
+		print "\techo ----", cfr[j], cpusr[j] ovf ": Starting kernel. `date`";
+		print "\techo ----", cfr[j], cpusr[j] ovf ": Starting kernel. `date` >> " rd "/log";
+		print "fi"
+	}
+	njitter = 0;
+	split(jitter, ja);
+	if (ja[1] == -1 && ncpus == 0)
+		njitter = 1;
+	else if (ja[1] == -1)
+		njitter = ncpus;
+	else
+		njitter = ja[1];
+	if (TORTURE_BUILDONLY && njitter != 0) {
+		njitter = 0;
+		print "echo Build-only run, so suppressing jitter >> " rd "/log"
+	}
+	for (j = 0; j < njitter; j++)
+		print "jitter.sh " j " " dur " " ja[2] " " ja[3] "&"
+	print "wait"
+	print "if test -z \"$TORTURE_BUILDONLY\""
+	print "then"
+	print "\techo ---- All kernel runs complete. `date`";
+	print "\techo ---- All kernel runs complete. `date` >> " rd "/log";
+	print "fi"
+	for (j = 1; j < jn; j++) {
+		builddir=KVM "/b" j
+		print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results:";
+		print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results: >> " rd "/log";
+		print "cat " rd cfr[j]  "/kvm-test-1-run.sh.out";
+		print "cat " rd cfr[j]  "/kvm-test-1-run.sh.out >> " rd "/log";
+	}
+}
+
+END {
+	njobs = i;
+	nc = ncpus;
+	first = 0;
+	batchnum = 1;
+
+	# Each pass through the following loop considers one test.
+	for (i = 0; i < njobs; i++) {
+		if (ncpus == 0) {
+			# Sequential test specified, each test its own batch.
+			dump(i, i + 1, batchnum);
+			first = i;
+			batchnum++;
+		} else if (nc < cpus[i] && i != 0) {
+			# Out of CPUs, dump out a batch.
+			dump(first, i, batchnum);
+			first = i;
+			nc = ncpus;
+			batchnum++;
+		}
+		# Account for the CPUs needed by the current test.
+		nc -= cpus[i];
+	}
+	# Dump the last batch.
+	if (ncpus != 0)
+		dump(first, i, batchnum);
+}' >> $T/script
+
+cat << ___EOF___ >> $T/script
+echo
+echo
+echo " --- `date` Test summary:"
+echo Results directory: $resdir/$ds
+kvm-recheck.sh $resdir/$ds
+___EOF___
+
+if test "$dryrun" = script
+then
+	cat $T/script
+	exit 0
+elif test "$dryrun" = sched
+then
+	# Extract the test run schedule from the script.
+	egrep 'Start batch|Starting build\.' $T/script |
+		grep -v ">>" |
+		sed -e 's/:.*$//' -e 's/^echo //'
+	exit 0
+else
+	# Not a dryrun, so run the script.
+	sh $T/script
+fi
+
+# Tracing: trace_event=rcu:rcu_grace_period,rcu:rcu_future_grace_period,rcu:rcu_grace_period_init,rcu:rcu_nocb_wake,rcu:rcu_preempt_task,rcu:rcu_unlock_preempted_task,rcu:rcu_quiescent_state_report,rcu:rcu_fqs,rcu:rcu_callback,rcu:rcu_kfree_callback,rcu:rcu_batch_start,rcu:rcu_invoke_callback,rcu:rcu_invoke_kfree_callback,rcu:rcu_batch_end,rcu:rcu_torture_read,rcu:rcu_barrier
diff --git a/run-rcuperf.sh b/run-rcuperf.sh
new file mode 100755
index 00000000..0526fff1
--- /dev/null
+++ b/run-rcuperf.sh
@@ -0,0 +1,26 @@
+#!/bin/bash
+
+dur=10
+run=10
+torture="rcuperf"
+rest="1m"
+path=`pwd`
+
+for cpu in 4 8 16
+do
+	for type in PRCU TREE
+	do
+		folder="$path/res/rcuperf/cpu-$cpu/$type"
+		if ! test -d "$folder"
+		then
+			echo "$folder does not exist..."
+			exit 1
+		fi
+
+		echo "Running rcuperf-$type-${cpu}cpus..."
+		./kvm.sh --torture $torture --duration $dur --configs "${run}*${type}-${cpu}" --results "$folder" &> "$folder/$type-cpu${cpu}-${dur}min.out"
+
+		echo "Sleep $rest..."
+		sleep $rest
+	done
+done
-- 
2.14.1.729.g59c0ea183
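
Editorial note on the batching step in kvm.sh above: the first awk program
reads "test-name cpus" pairs sorted by decreasing CPU count and greedily packs
them into batches that fit within the --cpus limit; a test that needs more
CPUs than the limit gets a batch to itself.  The same logic can be sketched in
plain bash as below.  This is an illustration under assumptions, not code from
the patch: the function name pack_batches and its "batch name cpus" output
format are hypothetical.

```shell
#!/bin/bash
# Hypothetical re-statement of the greedy packing loop in kvm.sh.
# pack_batches NCPUS: read "name cpus" pairs from stdin (already sorted
# by decreasing cpus) and print "batch name cpus" in batch order.
pack_batches () {
	local ncpus=$1 njobs=0 batch=0 nc=-1 i b name n
	local -a cf cpus placed
	while read -r name n
	do
		cf[njobs]=$name
		cpus[njobs]=$n
		placed[njobs]=0		# 0 means "not yet in any batch"
		njobs=$((njobs + 1))
	done
	# Each pass builds one batch.  A pass that places nothing leaves
	# nc equal to ncpus, which ends the loop.
	while test "$nc" -ne "$ncpus"
	do
		batch=$((batch + 1))
		nc=$ncpus
		for ((i = 0; i < njobs; i++))
		do
			test "${placed[i]}" -ne 0 && continue
			# Take the test if it fits, or if the batch is still
			# empty (a too-big test must run somewhere).
			if test "$nc" -ge "${cpus[i]}" || test "$nc" -eq "$ncpus"
			then
				placed[i]=$batch
				nc=$((nc - cpus[i]))
				test "$nc" -le 0 && break
			fi
		done
	done
	for ((b = 1; b <= batch; b++))
	do
		for ((i = 0; i < njobs; i++))
		do
			test "${placed[i]}" -eq "$b" && echo "$b ${cf[i]} ${cpus[i]}"
		done
	done
	return 0
}
```

With a limit of 8 CPUs, the input "TREE-8 8", "PRCU-4 4", "TREE-4 4",
"TREE-2 2" gives three batches: TREE-8 alone, then PRCU-4 with TREE-4, then
TREE-2.  The packing is first-fit over a sorted list, so it is simple rather
than optimal.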

* [PATCH RFC 16/16] Add GPLv2 license
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (14 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 15/16] rcutorture: Add scripts to run experiments lianglihao
@ 2018-01-23  7:59 ` lianglihao
  2018-01-25  5:53 ` [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol Paul E. McKenney
  16 siblings, 0 replies; 43+ messages in thread
From: lianglihao @ 2018-01-23  7:59 UTC (permalink / raw)
  To: paulmck; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

From: Lihao Liang <lianglihao@huawei.com>

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
---
 include/linux/prcu.h | 4 ++++
 kernel/rcu/prcu.c    | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/prcu.h b/include/linux/prcu.h
index 9f740985..9fa74dac 100644
--- a/include/linux/prcu.h
+++ b/include/linux/prcu.h
@@ -4,6 +4,10 @@
  *
  * Authors: Heng Zhang <heng.z@huawei.com>
  *          Lihao Liang <lianglihao@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
  */
 
 #ifndef __LINUX_PRCU_H
diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
index ef2c7730..06375ee6 100644
--- a/kernel/rcu/prcu.c
+++ b/kernel/rcu/prcu.c
@@ -10,6 +10,10 @@
  *
  * Authors: Heng Zhang <heng.z@huawei.com>
  *          Lihao Liang <lianglihao@huawei.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
  */
 
 #include <linux/smp.h>
-- 
2.14.1.729.g59c0ea183

* Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-23  7:59 ` [PATCH RFC 01/16] prcu: Add PRCU implementation lianglihao
@ 2018-01-24 11:26   ` Peter Zijlstra
  2018-01-24 17:15     ` Lihao Liang
  2018-01-25  6:16   ` Paul E. McKenney
  2018-01-29  9:10   ` Lai Jiangshan
  2 siblings, 1 reply; 43+ messages in thread
From: Peter Zijlstra @ 2018-01-24 11:26 UTC (permalink / raw)
  To: lianglihao; +Cc: paulmck, guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Tue, Jan 23, 2018 at 03:59:26PM +0800, lianglihao@huawei.com wrote:
> From: Heng Zhang <heng.z@huawei.com>
> 
> This RCU implementation (PRCU) is based on a fast consensus protocol
> published in the following paper:
> 
> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
> https://dl.acm.org/citation.cfm?id=3024114.3024143

That's an utterly useless changelog for something like a new RCU
implementation.

You fail to describe why you're proposing a new RCU implementation; what
problems does it fix?, how is it better?

All you provide is a paywalled link to some paper that we can't read.

Please write a real changelog that describes things properly and
provide, if at all possible, a readily accessible link to your paper.


* Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-24 11:26   ` Peter Zijlstra
@ 2018-01-24 17:15     ` Lihao Liang
  2018-01-24 20:19       ` Peter Zijlstra
  0 siblings, 1 reply; 43+ messages in thread
From: Lihao Liang @ 2018-01-24 17:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: lianglihao, Paul McKenney, Guohanjun (Hanjun Guo),
	heng.z, hb.chen, linux-kernel

Dear Peter,

Many thanks for your comments. I will provide a proper changelog.
Alternatively, the paper can be found at

http://ipads.se.sjtu.edu.cn/lib/exe/fetch.php?media=publications:consensus-tpds16.pdf

Best,
Lihao.

On Wed, Jan 24, 2018 at 11:26 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Jan 23, 2018 at 03:59:26PM +0800, lianglihao@huawei.com wrote:
>> From: Heng Zhang <heng.z@huawei.com>
>>
>> This RCU implementation (PRCU) is based on a fast consensus protocol
>> published in the following paper:
>>
>> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
>> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> https://dl.acm.org/citation.cfm?id=3024114.3024143
>
> That's an utterly useless changelog for something like a new RCU
> implementation.
>
> You fail to describe why you're proposing a new RCU implementation; what
> problems does it fix?, how is it better?
>
> All you provide is a paywalled link to some paper that we can't read.
>
> Please write a real changelog that describes things properly and
> provide, if at all possible, a readily accessible link to your paper.


* Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-24 17:15     ` Lihao Liang
@ 2018-01-24 20:19       ` Peter Zijlstra
  0 siblings, 0 replies; 43+ messages in thread
From: Peter Zijlstra @ 2018-01-24 20:19 UTC (permalink / raw)
  To: Lihao Liang
  Cc: lianglihao, Paul McKenney, Guohanjun (Hanjun Guo),
	heng.z, hb.chen, linux-kernel

On Wed, Jan 24, 2018 at 05:15:30PM +0000, Lihao Liang wrote:
> Alternatively, the paper can be found at
> 
> http://ipads.se.sjtu.edu.cn/lib/exe/fetch.php?media=publications:consensus-tpds16.pdf

Thanks, I'll try and have a read.


* Re: [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol
  2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
                   ` (15 preceding siblings ...)
  2018-01-23  7:59 ` [PATCH RFC 16/16] Add GPLv2 license lianglihao
@ 2018-01-25  5:53 ` Paul E. McKenney
  2018-01-27  7:22   ` Lihao Liang
  16 siblings, 1 reply; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-25  5:53 UTC (permalink / raw)
  To: lianglihao; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Tue, Jan 23, 2018 at 03:59:25PM +0800, lianglihao@huawei.com wrote:
> From: Lihao Liang <lianglihao@huawei.com>
> 
> Dear Paul,
> 
> This patch set implements a preemptive version of RCU (PRCU) based on the following paper:
> 
> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
> https://dl.acm.org/citation.cfm?id=3024114.3024143
> 
> We have also added preliminary callback-handling support.  Thus, the current version
> provides APIs prcu_read_lock(), prcu_read_unlock(), synchronize_prcu(), call_prcu(),
> and prcu_barrier().
> 
> This is an experimental patch, so it would be good to have some feedback.
> 
> Known shortcoming is that the grace-period version is incremented in synchronize_prcu().
> If call_prcu() or prcu_barrier() is called but there is no synchronized_prcu() invoked,
> callbacks cannot be invoked.  Later version should address this issue, e.g. adding a
> grace-period expedition mechanism.  Others include to use a a hierarchical structure,
> taking into account the NUMA topology, to send IPI in synchronize_prcu().
> 
> We have tested the implementation using rcutorture on both an x86 and ARM64 machine.
> PRCU passed 1h and 3h tests on all the newly added config files except PRCU07 reported BUG 
> in a 1h run.
> 
> [ 1593.604201] ---[ end trace b3bae911bec86152 ]---
> [ 1594.629450] prcu-torture:torture_onoff task: offlining 14
> [ 1594.755553] smpboot: CPU 14 is now offline
> [ 1594.757732] prcu-torture:torture_onoff task: offlined 14
> [ 1597.765149] prcu-torture:torture_onoff task: onlining 11
> [ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
> [ 1597.804102] prcu-torture:torture_onoff task: onlined 11
> [ 1599.365098] prcu-torture: rtc: ffffffffb0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0 
> rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418 
> onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
> [ 1599.367946] prcu-torture: !!!
> [ 1599.367966] ------------[ cut here ]------------

The "rtbe: 1" indicates that your implementation of prcu_barrier()
failed to wait for all preceding call_prcu() callbacks to be invoked.

Does the "Reader Pipe:" list that immediately follows have any non-zero
numbers other than the first two?

> We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to true, that is
> synchronize_rcu_expedited was tested.
> 
> The rcuperf results are as follows (average grace-period duration in ms of ten 10min runs):
> 
> 16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic
> 
> CPUs      2       4       8      12      15       16
> PRCU   0.14    1.07    4.15    8.02   10.79    15.16 
> TREE  49.30  104.75  277.55  390.82  620.82  1381.54
> 
> 64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic
> 
> CPUs       2       4        8      16      32       48       63        64
> PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
> TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27

Well, at the very least, this is a bug report on either expedited RCU
grace-period latency or on rcuperf's measurements, and thank you for that.
I will look into this.  In the meantime, could you please let me know
exactly how you invoked rcuperf?

I have a few comments on some of your patches based on a quick scan
through them.

							Thanx, Paul

> Best wishes,
> Lihao.
> 
> 
> Lihao Liang (15):
>   rcutorture: Add PRCU rcu_torture_ops
>   rcutorture: Add PRCU test config files
>   rcuperf: Add PRCU rcu_perf_ops
>   rcuperf: Add PRCU test config files
>   rcuperf: Set gp_exp to true for tests to run
>   prcu: Implement call_prcu() API
>   prcu: Implement PRCU callback processing
>   prcu: Implement prcu_barrier() API
>   rcutorture: Test call_prcu() and prcu_barrier()
>   rcutorture: Add basic ARM64 support to run scripts
>   prcu: Add PRCU Kconfig parameter
>   prcu: Comment source code
>   rcuperf: Add config files with various CONFIG_NR_CPUS
>   rcutorture: Add scripts to run experiments
>   Add GPLv2 license
> 
> Heng Zhang (1):
>   prcu: Add PRCU implementation
> 
>  include/linux/interrupt.h                          |   3 +
>  include/linux/prcu.h                               | 122 +++++
>  include/linux/rcupdate.h                           |   1 +
>  init/Kconfig                                       |   7 +
>  init/main.c                                        |   2 +
>  kernel/rcu/Makefile                                |   1 +
>  kernel/rcu/prcu.c                                  | 497 +++++++++++++++++++++
>  kernel/rcu/rcuperf.c                               |  33 +-
>  kernel/rcu/rcutorture.c                            |  40 +-
>  kernel/rcu/tree.c                                  |   1 +
>  kernel/sched/core.c                                |   2 +
>  kernel/time/timer.c                                |   2 +
>  kvm.sh                                             | 452 +++++++++++++++++++
>  run-rcuperf.sh                                     |  26 ++
>  .../testing/selftests/rcutorture/bin/functions.sh  |  17 +-
>  .../selftests/rcutorture/configs/rcu/CFLIST        |   5 +
>  .../selftests/rcutorture/configs/rcu/PRCU02        |  27 ++
>  .../selftests/rcutorture/configs/rcu/PRCU02.boot   |   1 +
>  .../selftests/rcutorture/configs/rcu/PRCU03        |  23 +
>  .../selftests/rcutorture/configs/rcu/PRCU03.boot   |   2 +
>  .../selftests/rcutorture/configs/rcu/PRCU06        |  26 ++
>  .../selftests/rcutorture/configs/rcu/PRCU06.boot   |   5 +
>  .../selftests/rcutorture/configs/rcu/PRCU07        |  25 ++
>  .../selftests/rcutorture/configs/rcu/PRCU07.boot   |   2 +
>  .../selftests/rcutorture/configs/rcu/PRCU09        |  19 +
>  .../selftests/rcutorture/configs/rcu/PRCU09.boot   |   1 +
>  .../selftests/rcutorture/configs/rcuperf/CFLIST    |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU      |  20 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-12   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-12.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-14   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-14.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-15   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-15.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-16   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-16.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-2    |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-2.boot         |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-32   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-32.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-4    |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-4.boot         |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-48   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-48.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-56   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-56.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-60   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-60.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-62   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-62.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-64   |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-64.boot        |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU-8    |  21 +
>  .../rcutorture/configs/rcuperf/PRCU-8.boot         |   1 +
>  .../selftests/rcutorture/configs/rcuperf/PRCU.boot |   1 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-12   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-14   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-15   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-16   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-2    |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-32   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-4    |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-48   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-56   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-60   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-62   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-64   |  21 +
>  .../selftests/rcutorture/configs/rcuperf/TREE-8    |  21 +
>  68 files changed, 1918 insertions(+), 5 deletions(-)
>  create mode 100644 include/linux/prcu.h
>  create mode 100644 kernel/rcu/prcu.c
>  create mode 100755 kvm.sh
>  create mode 100755 run-rcuperf.sh
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8
> 
> -- 
> 2.14.1.729.g59c0ea183
> 


* Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-23  7:59 ` [PATCH RFC 01/16] prcu: Add PRCU implementation lianglihao
  2018-01-24 11:26   ` Peter Zijlstra
@ 2018-01-25  6:16   ` Paul E. McKenney
  2018-01-25  7:30     ` Boqun Feng
                       ` (2 more replies)
  2018-01-29  9:10   ` Lai Jiangshan
  2 siblings, 3 replies; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-25  6:16 UTC (permalink / raw)
  To: lianglihao; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Tue, Jan 23, 2018 at 03:59:26PM +0800, lianglihao@huawei.com wrote:
> From: Heng Zhang <heng.z@huawei.com>
> 
> This RCU implementation (PRCU) is based on a fast consensus protocol
> published in the following paper:
> 
> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
> https://dl.acm.org/citation.cfm?id=3024114.3024143
> 
> Signed-off-by: Heng Zhang <heng.z@huawei.com>
> Signed-off-by: Lihao Liang <lianglihao@huawei.com>

A few comments and questions interspersed.

							Thanx, Paul

> ---
>  include/linux/prcu.h |  37 +++++++++++++++
>  kernel/rcu/Makefile  |   2 +-
>  kernel/rcu/prcu.c    | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  kernel/sched/core.c  |   2 +
>  4 files changed, 165 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/prcu.h
>  create mode 100644 kernel/rcu/prcu.c
> 
> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
> new file mode 100644
> index 00000000..653b4633
> --- /dev/null
> +++ b/include/linux/prcu.h
> @@ -0,0 +1,37 @@
> +#ifndef __LINUX_PRCU_H
> +#define __LINUX_PRCU_H
> +
> +#include <linux/atomic.h>
> +#include <linux/mutex.h>
> +#include <linux/wait.h>
> +
> +#define CONFIG_PRCU
> +
> +struct prcu_local_struct {
> +	unsigned int locked;
> +	unsigned int online;
> +	unsigned long long version;
> +};
> +
> +struct prcu_struct {
> +	atomic64_t global_version;
> +	atomic_t active_ctr;
> +	struct mutex mtx;
> +	wait_queue_head_t wait_q;
> +};
> +
> +#ifdef CONFIG_PRCU
> +void prcu_read_lock(void);
> +void prcu_read_unlock(void);
> +void synchronize_prcu(void);
> +void prcu_note_context_switch(void);
> +
> +#else /* #ifdef CONFIG_PRCU */
> +
> +#define prcu_read_lock() do {} while (0)
> +#define prcu_read_unlock() do {} while (0)
> +#define synchronize_prcu() do {} while (0)
> +#define prcu_note_context_switch() do {} while (0)

If CONFIG_PRCU=n and some code is built that uses PRCU, shouldn't you
get a build error rather than an error-free but inoperative PRCU?

Of course, Peter's question about purpose of the patch set applies
here as well.

> +
> +#endif /* #ifdef CONFIG_PRCU */
> +#endif /* __LINUX_PRCU_H */
> diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
> index 23803c7d..8791419c 100644
> --- a/kernel/rcu/Makefile
> +++ b/kernel/rcu/Makefile
> @@ -2,7 +2,7 @@
>  # and is generally not a function of system call inputs.
>  KCOV_INSTRUMENT := n
> 
> -obj-y += update.o sync.o
> +obj-y += update.o sync.o prcu.o
>  obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
>  obj-$(CONFIG_TREE_SRCU) += srcutree.o
>  obj-$(CONFIG_TINY_SRCU) += srcutiny.o
> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
> new file mode 100644
> index 00000000..a00b9420
> --- /dev/null
> +++ b/kernel/rcu/prcu.c
> @@ -0,0 +1,125 @@
> +#include <linux/smp.h>
> +#include <linux/prcu.h>
> +#include <linux/percpu.h>
> +#include <linux/compiler.h>
> +#include <linux/sched.h>
> +
> +#include <asm/barrier.h>
> +
> +DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
> +
> +struct prcu_struct global_prcu = {
> +	.global_version = ATOMIC64_INIT(0),
> +	.active_ctr = ATOMIC_INIT(0),
> +	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
> +	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
> +};
> +struct prcu_struct *prcu = &global_prcu;
> +
> +static inline void prcu_report(struct prcu_local_struct *local)
> +{
> +	unsigned long long global_version;
> +	unsigned long long local_version;
> +
> +	global_version = atomic64_read(&prcu->global_version);
> +	local_version = local->version;
> +	if (global_version > local_version)
> +		cmpxchg(&local->version, local_version, global_version);
> +}
> +
> +void prcu_read_lock(void)
> +{
> +	struct prcu_local_struct *local;
> +
> +	local = get_cpu_ptr(&prcu_local);
> +	if (!local->online) {
> +		WRITE_ONCE(local->online, 1);
> +		smp_mb();
> +	}
> +
> +	local->locked++;
> +	put_cpu_ptr(&prcu_local);
> +}
> +EXPORT_SYMBOL(prcu_read_lock);
> +
> +void prcu_read_unlock(void)
> +{
> +	int locked;
> +	struct prcu_local_struct *local;
> +
> +	barrier();
> +	local = get_cpu_ptr(&prcu_local);
> +	locked = local->locked;
> +	if (locked) {
> +		local->locked--;
> +		if (locked == 1)
> +			prcu_report(local);

Is ordering important here?  It looks to me that the compiler could
rearrange some of the accesses within prcu_report() with the local->locked
decrement.  There appears to be some potential for load and store tearing,
though perhaps you have verified that your compiler avoids this on
the architecture that you are using.
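
To make the concern concrete, here is a minimal userspace sketch (not
kernel code) of an unlock path in which the decrement and the version
accesses are all marked.  C11 relaxed atomics stand in for READ_ONCE()
and WRITE_ONCE(), atomic_signal_fence() stands in for barrier(), and the
names sketch_local, sketch_global_version, sketch_report(), and
sketch_read_unlock() are hypothetical.  Whether this is the ordering the
algorithm actually needs is a question for the authors.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-CPU state; only one "CPU" is modeled here. */
struct sketch_local {
	_Atomic unsigned int locked;	/* read-side nesting depth */
	_Atomic uint64_t version;	/* last grace-period version observed */
};

static _Atomic uint64_t sketch_global_version;

/* Marked accesses: C11 atomic loads and stores are never torn. */
static void sketch_report(struct sketch_local *l)
{
	uint64_t gv = atomic_load_explicit(&sketch_global_version,
					   memory_order_relaxed);
	uint64_t lv = atomic_load_explicit(&l->version, memory_order_relaxed);

	if (gv > lv)
		atomic_compare_exchange_strong(&l->version, &lv, gv);
}

static void sketch_read_unlock(struct sketch_local *l)
{
	unsigned int locked = atomic_load_explicit(&l->locked,
						   memory_order_relaxed);

	if (!locked)
		return;	/* preempted-reader path omitted from this sketch */

	atomic_store_explicit(&l->locked, locked - 1, memory_order_relaxed);
	/* Compiler-only fence: keep the decrement ahead of the report. */
	atomic_signal_fence(memory_order_seq_cst);
	if (locked == 1)
		sketch_report(l);
}
```

With ->locked at 1 and the global version at 5, one call leaves ->locked
at 0 and ->version at 5.  With ->locked at 2, the version is left alone.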

> +		put_cpu_ptr(&prcu_local);
> +	} else {

Hmmm...  We get here if the RCU read-side critical section was preempted.
If none of them are preempted, ->active_ctr remains zero.

> +		put_cpu_ptr(&prcu_local);
> +		if (!atomic_dec_return(&prcu->active_ctr))
> +			wake_up(&prcu->wait_q);
> +	}
> +}
> +EXPORT_SYMBOL(prcu_read_unlock);
> +
> +static void prcu_handler(void *info)
> +{
> +	struct prcu_local_struct *local;
> +
> +	local = this_cpu_ptr(&prcu_local);
> +	if (!local->locked)
> +		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
> +}
> +
> +void synchronize_prcu(void)
> +{
> +	int cpu;
> +	cpumask_t cpus;
> +	unsigned long long version;
> +	struct prcu_local_struct *local;
> +
> +	version = atomic64_add_return(1, &prcu->global_version);
> +	mutex_lock(&prcu->mtx);
> +
> +	local = get_cpu_ptr(&prcu_local);
> +	local->version = version;
> +	put_cpu_ptr(&prcu_local);
> +
> +	cpumask_clear(&cpus);
> +	for_each_possible_cpu(cpu) {
> +		local = per_cpu_ptr(&prcu_local, cpu);
> +		if (!READ_ONCE(local->online))
> +			continue;
> +		if (READ_ONCE(local->version) < version) {

On 32-bit systems, given that ->version is long long, you might see
load tearing.  And on some 32-bit systems, the cmpxchg() in prcu_report()
might not build.

Or is the idea that only prcu_handler() updates ->version?  But in that
case, you wouldn't need the READ_ONCE() above.  What am I missing here?
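
To illustrate the tearing hazard, here is a minimal userspace sketch in
which a 64-bit version is modeled explicitly as two 32-bit words, the
way a 32-bit CPU might load it.  The helper names are hypothetical, and
the interleaving is forced by hand rather than produced by a real race.

```c
#include <stdint.h>

/* A 64-bit counter as a 32-bit machine might store it: two words. */
struct split64 {
	uint32_t lo;
	uint32_t hi;
};

static void split64_store(struct split64 *s, uint64_t v)
{
	s->lo = (uint32_t)v;
	s->hi = (uint32_t)(v >> 32);
}

static uint64_t split64_join(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Model a torn load: the reader fetches the low word, a writer then
 * replaces the whole value with next, and the reader fetches the high
 * word afterwards.
 */
static uint64_t torn_load(struct split64 *s, uint64_t next)
{
	uint32_t lo = s->lo;		/* first half: old value */

	split64_store(s, next);		/* concurrent update lands here */
	return split64_join(lo, s->hi);	/* second half: new value */
}
```

With the counter at 0x00000000ffffffff and an update to
0x0000000100000000, torn_load() returns 0x00000001ffffffff, a version
that was never stored and that is larger than both real values.  A
comparison of the form "->version < version" could then come out false
before the CPU has really caught up.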

> +			smp_call_function_single(cpu, prcu_handler, NULL, 0);
> +			cpumask_set_cpu(cpu, &cpus);
> +		}
> +	}
> +
> +	for_each_cpu(cpu, &cpus) {
> +		local = per_cpu_ptr(&prcu_local, cpu);
> +		while (READ_ONCE(local->version) < version)

This ->version read can also tear on some 32-bit systems, and this
one most definitely can race with the prcu_handler() above.  Does the
algorithm operate correctly in that case?  (It doesn't look that way
to me, but I might be missing something.) Or are 32-bit systems excluded?

> +			cpu_relax();
> +	}

I might be missing something, but I believe we need a memory barrier
here on non-TSO systems.  Without that, couldn't we miss a preemption?
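
A rough userspace sketch of the pairing I have in mind, assuming the
intent is that an updater that sees a CPU's new version must also see
that CPU's contribution to ->active_ctr.  C11 atomic_thread_fence()
stands in for smp_mb(), and every name below is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint64_t sketch_cpu_version;	/* stands in for local->version */
static _Atomic int sketch_active_ctr;		/* stands in for prcu->active_ctr */

/* Context-switch side: publish preempted readers, then report the version. */
static void sketch_note_context_switch(int nested, uint64_t global_version)
{
	atomic_fetch_add_explicit(&sketch_active_ctr, nested,
				  memory_order_relaxed);
	/* Full fence: the counter update is ordered before the version store. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&sketch_cpu_version, global_version,
			      memory_order_relaxed);
}

/* Updater side: returns true if preempted readers must still be waited for. */
static bool sketch_must_wait(uint64_t version)
{
	while (atomic_load_explicit(&sketch_cpu_version,
				    memory_order_relaxed) < version)
		;	/* cpu_relax() in the kernel */
	/* Full fence: order the version loads before the counter load. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&sketch_active_ctr,
				    memory_order_relaxed) != 0;
}
```

Without the second fence, a weakly ordered CPU could satisfy the counter
load before the final version load, and so read a stale zero.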

> +
> +	if (atomic_read(&prcu->active_ctr))
> +		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
> +
> +	mutex_unlock(&prcu->mtx);
> +}
> +EXPORT_SYMBOL(synchronize_prcu);
> +
> +void prcu_note_context_switch(void)
> +{
> +	struct prcu_local_struct *local;
> +
> +	local = get_cpu_ptr(&prcu_local);
> +	if (local->locked) {
> +		atomic_add(local->locked, &prcu->active_ctr);
> +		local->locked = 0;
> +	}
> +	local->online = 0;
> +	prcu_report(local);
> +	put_cpu_ptr(&prcu_local);
> +}
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 326d4f88..a308581b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -15,6 +15,7 @@
>  #include <linux/init_task.h>
>  #include <linux/context_tracking.h>
>  #include <linux/rcupdate_wait.h>
> +#include <linux/prcu.h>
> 
>  #include <linux/blkdev.h>
>  #include <linux/kprobes.h>
> @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
> 
>  	local_irq_disable();
>  	rcu_note_context_switch(preempt);
> +	prcu_note_context_switch();
> 
>  	/*
>  	 * Make sure that signal_pending_state()->signal_pending() below
> -- 
> 2.14.1.729.g59c0ea183
> 


* Re: [PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run
  2018-01-23  7:59 ` [PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run lianglihao
@ 2018-01-25  6:18   ` Paul E. McKenney
  2018-01-26  8:33     ` Lihao Liang
  0 siblings, 1 reply; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-25  6:18 UTC (permalink / raw)
  To: lianglihao; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Tue, Jan 23, 2018 at 03:59:31PM +0800, lianglihao@huawei.com wrote:
> From: Lihao Liang <lianglihao@huawei.com>
> 
> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
> ---
>  kernel/rcu/rcuperf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
> index ea80fa3e..baccc123 100644
> --- a/kernel/rcu/rcuperf.c
> +++ b/kernel/rcu/rcuperf.c
> @@ -60,7 +60,7 @@ MODULE_AUTHOR("Paul E. McKenney <paulmck@linux.vnet.ibm.com>");
>  #define VERBOSE_PERFOUT_ERRSTRING(s) \
>  	do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
> 
> -torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
> +torture_param(bool, gp_exp, true, "Use expedited GP wait primitives");

This is fine as a convenience for internal testing, but the usual way
to make this happen is using the rcuperf.gp_exp kernel boot parameter.
Or was that not working for you?

							Thanx, Paul

>  torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
>  torture_param(int, nreaders, -1, "Number of RCU reader threads");
>  torture_param(int, nwriters, -1, "Number of RCU updater threads");
> -- 
> 2.14.1.729.g59c0ea183
> 


* Re: [PATCH RFC 07/16] prcu: Implement call_prcu() API
  2018-01-23  7:59 ` [PATCH RFC 07/16] prcu: Implement call_prcu() API lianglihao
@ 2018-01-25  6:20   ` Paul E. McKenney
  2018-01-26  8:44     ` Lihao Liang
  0 siblings, 1 reply; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-25  6:20 UTC (permalink / raw)
  To: lianglihao; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Tue, Jan 23, 2018 at 03:59:32PM +0800, lianglihao@huawei.com wrote:
> From: Lihao Liang <lianglihao@huawei.com>
> 
> This is PRCU's counterpart of RCU's call_rcu() API.
> 
> Reviewed-by: Heng Zhang <heng.z@huawei.com>
> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
> ---
>  include/linux/prcu.h | 25 ++++++++++++++++++++
>  init/main.c          |  2 ++
>  kernel/rcu/prcu.c    | 67 +++++++++++++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 91 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
> index 653b4633..e5e09c9b 100644
> --- a/include/linux/prcu.h
> +++ b/include/linux/prcu.h
> @@ -2,15 +2,36 @@
>  #define __LINUX_PRCU_H
> 
>  #include <linux/atomic.h>
> +#include <linux/types.h>
>  #include <linux/mutex.h>
>  #include <linux/wait.h>
> 
>  #define CONFIG_PRCU
> 
> +struct prcu_version_head {
> +	unsigned long long version;
> +	struct prcu_version_head *next;
> +};
> +
> +/* Simple unsegmented callback list for PRCU. */
> +struct prcu_cblist {
> +	struct rcu_head *head;
> +	struct rcu_head **tail;
> +	struct prcu_version_head *version_head;
> +	struct prcu_version_head **version_tail;
> +	long len;
> +};
> +
> +#define PRCU_CBLIST_INITIALIZER(n) { \
> +	.head = NULL, .tail = &n.head, \
> +	.version_head = NULL, .version_tail = &n.version_head, \
> +}
> +
>  struct prcu_local_struct {
>  	unsigned int locked;
>  	unsigned int online;
>  	unsigned long long version;
> +	struct prcu_cblist cblist;
>  };
> 
>  struct prcu_struct {
> @@ -24,6 +45,8 @@ struct prcu_struct {
>  void prcu_read_lock(void);
>  void prcu_read_unlock(void);
>  void synchronize_prcu(void);
> +void call_prcu(struct rcu_head *head, rcu_callback_t func);
> +void prcu_init(void);
>  void prcu_note_context_switch(void);
> 
>  #else /* #ifdef CONFIG_PRCU */
> @@ -31,6 +54,8 @@ void prcu_note_context_switch(void);
>  #define prcu_read_lock() do {} while (0)
>  #define prcu_read_unlock() do {} while (0)
>  #define synchronize_prcu() do {} while (0)
> +#define call_prcu() do {} while (0)
> +#define prcu_init() do {} while (0)
>  #define prcu_note_context_switch() do {} while (0)
> 
>  #endif /* #ifdef CONFIG_PRCU */
> diff --git a/init/main.c b/init/main.c
> index f8665104..4925964e 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -38,6 +38,7 @@
>  #include <linux/smp.h>
>  #include <linux/profile.h>
>  #include <linux/rcupdate.h>
> +#include <linux/prcu.h>
>  #include <linux/moduleparam.h>
>  #include <linux/kallsyms.h>
>  #include <linux/writeback.h>
> @@ -574,6 +575,7 @@ asmlinkage __visible void __init start_kernel(void)
>  	workqueue_init_early();
> 
>  	rcu_init();
> +	prcu_init();
> 
>  	/* Trace events are available after this */
>  	trace_init();
> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
> index a00b9420..f198285c 100644
> --- a/kernel/rcu/prcu.c
> +++ b/kernel/rcu/prcu.c
> @@ -1,11 +1,12 @@
>  #include <linux/smp.h>
> -#include <linux/prcu.h>
>  #include <linux/percpu.h>
> -#include <linux/compiler.h>
> +#include <linux/prcu.h>
>  #include <linux/sched.h>
> -
> +#include <linux/slab.h>
>  #include <asm/barrier.h>
> 
> +#include "rcu.h"
> +
>  DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
> 
>  struct prcu_struct global_prcu = {
> @@ -16,6 +17,16 @@ struct prcu_struct global_prcu = {
>  };
>  struct prcu_struct *prcu = &global_prcu;
> 
> +/* Initialize simple callback list. */
> +static void prcu_cblist_init(struct prcu_cblist *rclp)
> +{
> +	rclp->head = NULL;
> +	rclp->tail = &rclp->head;
> +	rclp->version_head = NULL;
> +	rclp->version_tail = &rclp->version_head;
> +	rclp->len = 0;
> +}
> +
>  static inline void prcu_report(struct prcu_local_struct *local)
>  {
>  	unsigned long long global_version;
> @@ -123,3 +134,53 @@ void prcu_note_context_switch(void)
>  	prcu_report(local);
>  	put_cpu_ptr(&prcu_local);
>  }
> +
> +void call_prcu(struct rcu_head *head, rcu_callback_t func)
> +{
> +	unsigned long flags;
> +	struct prcu_local_struct *local;
> +	struct prcu_cblist *rclp;
> +	struct prcu_version_head *vhp;
> +
> +	debug_rcu_head_queue(head);
> +
> +	/* Use GFP_ATOMIC with IRQs disabled */
> +	vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
> +	if (!vhp)
> +		return;

Silently failing to post the callback can cause system hangs.  I suggest
finding some way to avoid allocating on the call_prcu() code path.
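
One possible shape, as a userspace sketch only: keep the version stamp
in the structure the caller already supplies, so that enqueueing has no
failure path.  The types and functions below are hypothetical.  The
kernel's struct rcu_head holds only ->next and ->func, so this assumes
either a PRCU-specific head type or some other home for the version.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback head that carries its own version stamp. */
struct sketch_head {
	struct sketch_head *next;
	void (*func)(struct sketch_head *head);
	uint64_t version;	/* replaces the separately allocated node */
};

/* Simple singly linked callback list with a tail pointer. */
struct sketch_cblist {
	struct sketch_head *head;
	struct sketch_head **tail;
	long len;
};

static void sketch_cblist_init(struct sketch_cblist *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

/* Placeholder callback used only to exercise the list. */
static void sketch_noop(struct sketch_head *head)
{
	(void)head;
}

/* Enqueue cannot fail: all storage is supplied by the caller. */
static void sketch_call(struct sketch_cblist *l, struct sketch_head *h,
			void (*func)(struct sketch_head *head),
			uint64_t version)
{
	h->func = func;
	h->next = NULL;
	h->version = version;
	*l->tail = h;
	l->tail = &h->next;
	l->len++;
}
```

Queueing two heads with versions 3 and 4 leaves them linked in order
with ->len at 2, and nothing on that path can return early.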

							Thanx, Paul

> +
> +	head->func = func;
> +	head->next = NULL;
> +	vhp->next = NULL;
> +
> +	local_irq_save(flags);
> +	local = this_cpu_ptr(&prcu_local);
> +	vhp->version = local->version;
> +	rclp = &local->cblist;
> +	rclp->len++;
> +	*rclp->tail = head;
> +	rclp->tail = &head->next;
> +	*rclp->version_tail = vhp;
> +	rclp->version_tail = &vhp->next;
> +	local_irq_restore(flags);
> +}
> +EXPORT_SYMBOL(call_prcu);
> +
> +void prcu_init_local_struct(int cpu)
> +{
> +	struct prcu_local_struct *local;
> +
> +	local = per_cpu_ptr(&prcu_local, cpu);
> +	local->locked = 0;
> +	local->online = 0;
> +	local->version = 0;
> +	prcu_cblist_init(&local->cblist);
> +}
> +
> +void __init prcu_init(void)
> +{
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu)
> +		prcu_init_local_struct(cpu);
> +}
> -- 
> 2.14.1.729.g59c0ea183
> 


* Re: [PATCH RFC 09/16] prcu: Implement prcu_barrier() API
  2018-01-23  7:59 ` [PATCH RFC 09/16] prcu: Implement prcu_barrier() API lianglihao
@ 2018-01-25  6:24   ` Paul E. McKenney
  0 siblings, 0 replies; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-25  6:24 UTC (permalink / raw)
  To: lianglihao; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Tue, Jan 23, 2018 at 03:59:34PM +0800, lianglihao@huawei.com wrote:
> From: Lihao Liang <lianglihao@huawei.com>
> 
> This is PRCU's counterpart of RCU's rcu_barrier() API.
> 
> Reviewed-by: Heng Zhang <heng.z@huawei.com>
> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
> ---
>  include/linux/prcu.h |  7 ++++++
>  kernel/rcu/prcu.c    | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 70 insertions(+)
> 
> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
> index 4e7d5d65..cce967fd 100644
> --- a/include/linux/prcu.h
> +++ b/include/linux/prcu.h
> @@ -5,6 +5,7 @@
>  #include <linux/types.h>
>  #include <linux/mutex.h>
>  #include <linux/wait.h>
> +#include <linux/completion.h>
> 
>  #define CONFIG_PRCU
> 
> @@ -32,6 +33,7 @@ struct prcu_local_struct {
>  	unsigned int online;
>  	unsigned long long version;
>  	unsigned long long cb_version;
> +	struct rcu_head barrier_head;
>  	struct prcu_cblist cblist;
>  };
> 
> @@ -39,8 +41,11 @@ struct prcu_struct {
>  	atomic64_t global_version;
>  	atomic64_t cb_version;
>  	atomic_t active_ctr;
> +	atomic_t barrier_cpu_count;
>  	struct mutex mtx;
> +	struct mutex barrier_mtx;
>  	wait_queue_head_t wait_q;
> +	struct completion barrier_completion;
>  };
> 
>  #ifdef CONFIG_PRCU
> @@ -48,6 +53,7 @@ void prcu_read_lock(void);
>  void prcu_read_unlock(void);
>  void synchronize_prcu(void);
>  void call_prcu(struct rcu_head *head, rcu_callback_t func);
> +void prcu_barrier(void);
>  void prcu_init(void);
>  void prcu_note_context_switch(void);
>  int prcu_pending(void);
> @@ -60,6 +66,7 @@ void prcu_check_callbacks(void);
>  #define prcu_read_unlock() do {} while (0)
>  #define synchronize_prcu() do {} while (0)
>  #define call_prcu() do {} while (0)
> +#define prcu_barrier() do {} while (0)
>  #define prcu_init() do {} while (0)
>  #define prcu_note_context_switch() do {} while (0)
>  #define prcu_pending() 0
> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
> index 373039c5..2664d091 100644
> --- a/kernel/rcu/prcu.c
> +++ b/kernel/rcu/prcu.c
> @@ -15,6 +15,7 @@ struct prcu_struct global_prcu = {
>  	.cb_version = ATOMIC64_INIT(0),
>  	.active_ctr = ATOMIC_INIT(0),
>  	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
> +	.barrier_mtx = __MUTEX_INITIALIZER(global_prcu.barrier_mtx),
>  	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
>  };
>  struct prcu_struct *prcu = &global_prcu;
> @@ -250,6 +251,68 @@ static __latent_entropy void prcu_process_callbacks(struct softirq_action *unuse
>  	local_irq_restore(flags);
>  }
> 
> +/*
> + * PRCU callback function for prcu_barrier().
> + * If we are last, wake up the task executing prcu_barrier().
> + */
> +static void prcu_barrier_callback(struct rcu_head *rhp)
> +{
> +	if (atomic_dec_and_test(&prcu->barrier_cpu_count))
> +		complete(&prcu->barrier_completion);
> +}
> +
> +/*
> + * Called with preemption disabled, and from cross-cpu IRQ context.
> + */
> +static void prcu_barrier_func(void *info)
> +{
> +	struct prcu_local_struct *local = this_cpu_ptr(&prcu_local);
> +
> +	atomic_inc(&prcu->barrier_cpu_count);
> +	call_prcu(&local->barrier_head, prcu_barrier_callback);
> +}
> +
> +/* Waiting for all PRCU callbacks to complete. */
> +void prcu_barrier(void)
> +{
> +	int cpu;
> +
> +	/* Take mutex to serialize concurrent prcu_barrier() requests. */
> +	mutex_lock(&prcu->barrier_mtx);
> +
> +	/*
> +	 * Initialize the count to one rather than to zero in order to
> +	 * avoid a too-soon return to zero in case of a short grace period
> +	 * (or preemption of this task).
> +	 */
> +	init_completion(&prcu->barrier_completion);
> +	atomic_set(&prcu->barrier_cpu_count, 1);
> +
> +	/*
> +	 * Register a new callback on each CPU using IPI to prevent races
> +	 * with call_prcu(). When that callback is invoked, we will know
> +	 * that all of the corresponding CPU's preceding callbacks have
> +	 * been invoked.
> +	 */
> +	for_each_possible_cpu(cpu)
> +		smp_call_function_single(cpu, prcu_barrier_func, NULL, 1);

This code seems to be assuming CONFIG_HOTPLUG_CPU=n.  This might explain
your rcutorture failure.
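
To illustrate the concern with a userspace model (a sketch under stated
assumptions, not PRCU code): if a CPU goes offline with callbacks still
queued, a barrier that can only reach online CPUs never waits for them.
One common remedy is to hand the outgoing CPU's callbacks to a surviving
CPU.  NR_CPUS, cpu_state, model_cpu_offline() and
model_stranded_callbacks() are hypothetical names, and version tags are
left out to keep the model small.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* Hypothetical CPU count for the model. */

struct cb_head {
	struct cb_head *next;
};

struct cb_list {
	struct cb_head *head;
	struct cb_head **tail;
	long len;
};

struct cpu_state {
	int online;
	struct cb_list cblist;
};

static struct cpu_state cpus[NR_CPUS];

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

static void cb_enqueue(struct cb_list *l, struct cb_head *h)
{
	h->next = NULL;
	*l->tail = h;
	l->tail = &h->next;
	l->len++;
}

/* Move all of src's callbacks to the end of dst, preserving order. */
static void cb_list_splice_tail(struct cb_list *dst, struct cb_list *src)
{
	if (!src->head)
		return;
	*dst->tail = src->head;
	dst->tail = src->tail;
	dst->len += src->len;
	cb_list_init(src);
}

/* Mark every modeled CPU online with an empty callback list. */
static void model_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpus[cpu].online = 1;
		cb_list_init(&cpus[cpu].cblist);
	}
}

/* Model of an offline hook: give the dying CPU's callbacks to a survivor. */
static void model_cpu_offline(int dying, int survivor)
{
	assert(dying != survivor && cpus[survivor].online);
	cpus[dying].online = 0;
	cb_list_splice_tail(&cpus[survivor].cblist, &cpus[dying].cblist);
}

/* Callbacks that a barrier visiting only online CPUs would miss. */
static long model_stranded_callbacks(void)
{
	long n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (!cpus[cpu].online)
			n += cpus[cpu].cblist.len;
	return n;
}
```

In the model, merely clearing the online flag leaves callbacks stranded,
while going through model_cpu_offline() leaves none behind, so a barrier
callback queued on the survivor afterwards is ordered after them.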

> +	/* Decrement the count as we initialize it to one. */
> +	if (atomic_dec_and_test(&prcu->barrier_cpu_count))
> +		complete(&prcu->barrier_completion);
> +
> +	/*
> +	 * Now that we have a prcu_barrier_callback() callback on each
> +	 * CPU, and thus each counted, remove the initial count.
> +	 * Wait for all prcu_barrier_callback() callbacks to be invoked.
> +	 */
> +	wait_for_completion(&prcu->barrier_completion);
> +
> +	/* Other prcu_barrier() invocations can now safely proceed. */
> +	mutex_unlock(&prcu->barrier_mtx);
> +}
> +EXPORT_SYMBOL(prcu_barrier);
> +
>  void prcu_init_local_struct(int cpu)
>  {
>  	struct prcu_local_struct *local;
> -- 
> 2.14.1.729.g59c0ea183
> 

* Re: [PATCH RFC 03/16] rcutorture: Add PRCU test config files
  2018-01-23  7:59 ` [PATCH RFC 03/16] rcutorture: Add PRCU test config files lianglihao
@ 2018-01-25  6:27   ` Paul E. McKenney
  0 siblings, 0 replies; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-25  6:27 UTC (permalink / raw)
  To: lianglihao; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Tue, Jan 23, 2018 at 03:59:28PM +0800, lianglihao@huawei.com wrote:
> From: Lihao Liang <lianglihao@huawei.com>
> 
> Use the same config files as TREE02, TREE03, TREE06, TREE07, and TREE09.
> 
> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
> ---
>  .../selftests/rcutorture/configs/rcu/CFLIST        |  5 ++++
>  .../selftests/rcutorture/configs/rcu/PRCU02        | 27 ++++++++++++++++++++++
>  .../selftests/rcutorture/configs/rcu/PRCU02.boot   |  1 +
>  .../selftests/rcutorture/configs/rcu/PRCU03        | 23 ++++++++++++++++++
>  .../selftests/rcutorture/configs/rcu/PRCU03.boot   |  2 ++
>  .../selftests/rcutorture/configs/rcu/PRCU06        | 26 +++++++++++++++++++++
>  .../selftests/rcutorture/configs/rcu/PRCU06.boot   |  5 ++++
>  .../selftests/rcutorture/configs/rcu/PRCU07        | 25 ++++++++++++++++++++
>  .../selftests/rcutorture/configs/rcu/PRCU07.boot   |  2 ++
>  .../selftests/rcutorture/configs/rcu/PRCU09        | 19 +++++++++++++++
>  .../selftests/rcutorture/configs/rcu/PRCU09.boot   |  1 +
>  11 files changed, 136 insertions(+)
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
> 
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
> index a3a1a05a..7359e194 100644
> --- a/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/CFLIST
> @@ -1,3 +1,8 @@
> +PRCU02
> +PRCU03
> +PRCU06
> +PRCU07
> +PRCU09
>  TREE01
>  TREE02
>  TREE03
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU02 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02
> new file mode 100644
> index 00000000..5f532f05
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02
> @@ -0,0 +1,27 @@
> +CONFIG_SMP=y
> +CONFIG_NR_CPUS=8
> +CONFIG_PREEMPT_NONE=n
> +CONFIG_PREEMPT_VOLUNTARY=n
> +CONFIG_PREEMPT=y
> +CONFIG_PRCU=y
> +#CHECK#CONFIG_PREEMPT_RCU=y
> +CONFIG_HZ_PERIODIC=n
> +CONFIG_NO_HZ_IDLE=y
> +CONFIG_NO_HZ_FULL=n
> +CONFIG_RCU_FAST_NO_HZ=n
> +CONFIG_RCU_TRACE=n
> +CONFIG_HOTPLUG_CPU=n
> +CONFIG_SUSPEND=n
> +CONFIG_HIBERNATION=n
> +CONFIG_RCU_FANOUT=3
> +CONFIG_RCU_FANOUT_LEAF=3
> +CONFIG_RCU_NOCB_CPU=n
> +CONFIG_DEBUG_LOCK_ALLOC=y
> +CONFIG_PROVE_LOCKING=n
> +CONFIG_RCU_BOOST=n
> +CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> +CONFIG_RCU_EXPERT=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
> +CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
> new file mode 100644
> index 00000000..6c5e626f
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
> @@ -0,0 +1 @@
> +rcutorture.torture_type=prcu
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU03 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03
> new file mode 100644
> index 00000000..869cadc8
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03
> @@ -0,0 +1,23 @@
> +CONFIG_SMP=y
> +CONFIG_NR_CPUS=16
> +CONFIG_PREEMPT_NONE=n
> +CONFIG_PREEMPT_VOLUNTARY=n
> +CONFIG_PREEMPT=y
> +CONFIG_PRCU=y
> +#CHECK#CONFIG_PREEMPT_RCU=y
> +CONFIG_HZ_PERIODIC=y
> +CONFIG_NO_HZ_IDLE=n
> +CONFIG_NO_HZ_FULL=n
> +CONFIG_RCU_TRACE=y
> +CONFIG_HOTPLUG_CPU=y

And from what I can see, PRCU doesn't handle CPU hotplug.  I would not
be surprised to see rcutorture failures when running this scenario.

> +CONFIG_RCU_FANOUT=2
> +CONFIG_RCU_FANOUT_LEAF=2
> +CONFIG_RCU_NOCB_CPU=n
> +CONFIG_DEBUG_LOCK_ALLOC=n
> +CONFIG_RCU_BOOST=y
> +CONFIG_RCU_KTHREAD_PRIO=2
> +CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> +CONFIG_RCU_EXPERT=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
> new file mode 100644
> index 00000000..0be10cba
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
> @@ -0,0 +1,2 @@
> +rcutorture.onoff_interval=1 rcutorture.onoff_holdoff=30
> +rcutorture.torture_type=prcu
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU06 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06
> new file mode 100644
> index 00000000..b1480963
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06
> @@ -0,0 +1,26 @@
> +CONFIG_SMP=y
> +CONFIG_NR_CPUS=8
> +CONFIG_PREEMPT_NONE=y
> +CONFIG_PREEMPT_VOLUNTARY=n
> +CONFIG_PREEMPT=n
> +CONFIG_PRCU=y
> +#CHECK#CONFIG_TREE_RCU=y
> +CONFIG_HZ_PERIODIC=n
> +CONFIG_NO_HZ_IDLE=y
> +CONFIG_NO_HZ_FULL=n
> +CONFIG_RCU_FAST_NO_HZ=n
> +CONFIG_RCU_TRACE=n
> +CONFIG_HOTPLUG_CPU=n
> +CONFIG_SUSPEND=n
> +CONFIG_HIBERNATION=n
> +CONFIG_RCU_FANOUT=6
> +CONFIG_RCU_FANOUT_LEAF=6
> +CONFIG_RCU_NOCB_CPU=n
> +CONFIG_DEBUG_LOCK_ALLOC=y
> +CONFIG_PROVE_LOCKING=y
> +#CHECK#CONFIG_PROVE_RCU=y
> +CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
> +CONFIG_RCU_EXPERT=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
> new file mode 100644
> index 00000000..00787e68
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
> @@ -0,0 +1,5 @@
> +rcupdate.rcu_self_test=1
> +rcupdate.rcu_self_test_bh=1
> +rcupdate.rcu_self_test_sched=1
> +rcutree.rcu_fanout_exact=1
> +rcutorture.torture_type=prcu
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU07 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU07
> new file mode 100644
> index 00000000..14f74c68
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU07
> @@ -0,0 +1,25 @@
> +CONFIG_SMP=y
> +CONFIG_NR_CPUS=16
> +CONFIG_CPUMASK_OFFSTACK=y
> +CONFIG_PREEMPT_NONE=y
> +CONFIG_PREEMPT_VOLUNTARY=n
> +CONFIG_PREEMPT=n
> +CONFIG_PRCU=y
> +#CHECK#CONFIG_TREE_RCU=y
> +CONFIG_HZ_PERIODIC=n
> +CONFIG_NO_HZ_IDLE=n
> +CONFIG_NO_HZ_FULL=y
> +CONFIG_NO_HZ_FULL_ALL=n
> +CONFIG_NO_HZ_FULL_SYSIDLE=y
> +CONFIG_RCU_FAST_NO_HZ=n
> +CONFIG_RCU_TRACE=y
> +CONFIG_HOTPLUG_CPU=y

And this one.

> +CONFIG_RCU_FANOUT=2
> +CONFIG_RCU_FANOUT_LEAF=2
> +CONFIG_RCU_NOCB_CPU=n
> +CONFIG_DEBUG_LOCK_ALLOC=n
> +CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> +CONFIG_RCU_EXPERT=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_CLEANUP=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_INIT=y
> +CONFIG_RCU_TORTURE_TEST_SLOW_PREINIT=y
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
> new file mode 100644
> index 00000000..43dac30b
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
> @@ -0,0 +1,2 @@
> +nohz_full=2-9
> +rcutorture.torture_type=prcu
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU09 b/tools/testing/selftests/rcutorture/configs/rcu/PRCU09
> new file mode 100644
> index 00000000..43d4718d
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU09
> @@ -0,0 +1,19 @@
> +CONFIG_SMP=n
> +CONFIG_NR_CPUS=1
> +CONFIG_PREEMPT_NONE=n
> +CONFIG_PREEMPT_VOLUNTARY=n
> +CONFIG_PREEMPT=y
> +CONFIG_PRCU=y
> +#CHECK#CONFIG_PREEMPT_RCU=y
> +CONFIG_HZ_PERIODIC=n
> +CONFIG_NO_HZ_IDLE=y
> +CONFIG_NO_HZ_FULL=n
> +CONFIG_RCU_TRACE=n
> +CONFIG_HOTPLUG_CPU=n
> +CONFIG_SUSPEND=n
> +CONFIG_HIBERNATION=n
> +CONFIG_RCU_NOCB_CPU=n
> +CONFIG_DEBUG_LOCK_ALLOC=n
> +CONFIG_RCU_BOOST=n
> +CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
> +#CHECK#CONFIG_RCU_EXPERT=n
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot b/tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
> new file mode 100644
> index 00000000..6c5e626f
> --- /dev/null
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
> @@ -0,0 +1 @@
> +rcutorture.torture_type=prcu
> -- 
> 2.14.1.729.g59c0ea183
> 

* Re: [PATCH RFC 15/16] rcutorture: Add scripts to run experiments
  2018-01-23  7:59 ` [PATCH RFC 15/16] rcutorture: Add scripts to run experiments lianglihao
@ 2018-01-25  6:28   ` Paul E. McKenney
  0 siblings, 0 replies; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-25  6:28 UTC (permalink / raw)
  To: lianglihao; +Cc: guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Tue, Jan 23, 2018 at 03:59:40PM +0800, lianglihao@huawei.com wrote:
> From: Lihao Liang <lianglihao@huawei.com>
> 
> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
> ---
>  kvm.sh         | 452 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  run-rcuperf.sh |  26 ++++

The usual approach would be to add what you need to the existing kvm.sh...

								Thanx, Paul

>  2 files changed, 478 insertions(+)
>  create mode 100755 kvm.sh
>  create mode 100755 run-rcuperf.sh
> 
> diff --git a/kvm.sh b/kvm.sh
> new file mode 100755
> index 00000000..3b3c1b69
> --- /dev/null
> +++ b/kvm.sh
> @@ -0,0 +1,452 @@
> +#!/bin/bash
> +#
> +# Run a series of 14 tests under KVM.  These are not particularly
> +# well-selected or well-tuned, but are the current set.  Run from the
> +# top level of the source tree.
> +#
> +# Edit the definitions below to set the locations of the various directories,
> +# as well as the test duration.
> +#
> +# Usage: kvm.sh [ options ]
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, you can access it online at
> +# http://www.gnu.org/licenses/gpl-2.0.html.
> +#
> +# Copyright (C) IBM Corporation, 2011
> +#
> +# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> +
> +scriptname=$0
> +args="$*"
> +
> +T=/tmp/kvm.sh.$$
> +trap 'rm -rf $T' 0
> +mkdir $T
> +
> +dur=$((30*60))
> +dryrun=""
> +KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM
> +PATH=${KVM}/bin:$PATH; export PATH
> +TORTURE_DEFCONFIG=defconfig
> +TORTURE_BOOT_IMAGE=""
> +TORTURE_INITRD="$KVM/initrd"; export TORTURE_INITRD
> +TORTURE_KMAKE_ARG=""
> +TORTURE_SHUTDOWN_GRACE=180
> +TORTURE_SUITE=rcu
> +resdir=""
> +configs=""
> +cpus=0
> +ds=`date +%Y.%m.%d-%H:%M:%S`
> +jitter="-1"
> +
> +. functions.sh
> +
> +usage () {
> +	echo "Usage: $scriptname optional arguments:"
> +	echo "       --bootargs kernel-boot-arguments"
> +	echo "       --bootimage relative-path-to-kernel-boot-image"
> +	echo "       --buildonly"
> +	echo "       --configs \"config-file list w/ repeat factor (3*TINY01)\""
> +	echo "       --cpus N"
> +	echo "       --datestamp string"
> +	echo "       --defconfig string"
> +	echo "       --dryrun sched|script"
> +	echo "       --duration minutes"
> +	echo "       --interactive"
> +	echo "       --jitter N [ maxsleep (us) [ maxspin (us) ] ]"
> +	echo "       --kmake-arg kernel-make-arguments"
> +	echo "       --mac nn:nn:nn:nn:nn:nn"
> +	echo "       --no-initrd"
> +	echo "       --qemu-args qemu-system-..."
> +	echo "       --qemu-cmd qemu-system-..."
> +	echo "       --results absolute-pathname"
> +	echo "       --torture rcu"
> +	exit 1
> +}
> +
> +while test $# -gt 0
> +do
> +	case "$1" in
> +	--bootargs|--bootarg)
> +		checkarg --bootargs "(list of kernel boot arguments)" "$#" "$2" '.*' '^--'
> +		TORTURE_BOOTARGS="$2"
> +		shift
> +		;;
> +	--bootimage)
> +		checkarg --bootimage "(relative path to kernel boot image)" "$#" "$2" '[a-zA-Z0-9][a-zA-Z0-9_]*' '^--'
> +		TORTURE_BOOT_IMAGE="$2"
> +		shift
> +		;;
> +	--buildonly)
> +		TORTURE_BUILDONLY=1
> +		;;
> +	--configs|--config)
> +		checkarg --configs "(list of config files)" "$#" "$2" '^[^/]*$' '^--'
> +		configs="$2"
> +		shift
> +		;;
> +	--cpus)
> +		checkarg --cpus "(number)" "$#" "$2" '^[0-9]*$' '^--'
> +		cpus=$2
> +		shift
> +		;;
> +	--datestamp)
> +		checkarg --datestamp "(relative pathname)" "$#" "$2" '^[^/]*$' '^--'
> +		ds=$2
> +		shift
> +		;;
> +	--defconfig)
> +		checkarg --defconfig "defconfigtype" "$#" "$2" '^[^/][^/]*$' '^--'
> +		TORTURE_DEFCONFIG=$2
> +		shift
> +		;;
> +	--dryrun)
> +		checkarg --dryrun "sched|script" $# "$2" 'sched\|script' '^--'
> +		dryrun=$2
> +		shift
> +		;;
> +	--duration)
> +		checkarg --duration "(minutes)" $# "$2" '^[0-9]*$' '^error'
> +		dur=$(($2*60))
> +		shift
> +		;;
> +	--interactive)
> +		TORTURE_QEMU_INTERACTIVE=1; export TORTURE_QEMU_INTERACTIVE
> +		;;
> +	--jitter)
> +		checkarg --jitter "(# threads [ sleep [ spin ] ])" $# "$2" '^-\{,1\}[0-9]\+\( \+[0-9]\+\)\{,2\} *$' '^error$'
> +		jitter="$2"
> +		shift
> +		;;
> +	--kmake-arg)
> +		checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
> +		TORTURE_KMAKE_ARG="$2"
> +		shift
> +		;;
> +	--mac)
> +		checkarg --mac "(MAC address)" $# "$2" '^\([0-9a-fA-F]\{2\}:\)\{5\}[0-9a-fA-F]\{2\}$' error
> +		TORTURE_QEMU_MAC=$2
> +		shift
> +		;;
> +	--no-initrd)
> +		TORTURE_INITRD=""; export TORTURE_INITRD
> +		;;
> +	--qemu-args|--qemu-arg)
> +		checkarg --qemu-args "-qemu args" $# "$2" '^-' '^error'
> +		TORTURE_QEMU_ARG="$2"
> +		shift
> +		;;
> +	--qemu-cmd)
> +		checkarg --qemu-cmd "(qemu-system-...)" $# "$2" 'qemu-system-' '^--'
> +		TORTURE_QEMU_CMD="$2"
> +		shift
> +		;;
> +	--results)
> +		checkarg --results "(absolute pathname)" "$#" "$2" '^/' '^error'
> +		resdir=$2
> +		shift
> +		;;
> +	--shutdown-grace)
> +		checkarg --shutdown-grace "(seconds)" "$#" "$2" '^[0-9]*$' '^error'
> +		TORTURE_SHUTDOWN_GRACE=$2
> +		shift
> +		;;
> +	--torture)
> +		checkarg --torture "(suite name)" "$#" "$2" '^\(lock\|rcu\|rcuperf\)$' '^--'
> +		TORTURE_SUITE=$2
> +		shift
> +		;;
> +	*)
> +		echo Unknown argument $1
> +		usage
> +		;;
> +	esac
> +	shift
> +done
> +
> +CONFIGFRAG=${KVM}/configs/${TORTURE_SUITE}; export CONFIGFRAG
> +
> +if test -z "$configs"
> +then
> +	configs="`cat $CONFIGFRAG/CFLIST`"
> +fi
> +
> +if test -z "$resdir"
> +then
> +	resdir=$KVM/res
> +fi
> +
> +# Create a file of test-name/#cpus pairs, sorted by decreasing #cpus.
> +touch $T/cfgcpu
> +for CF in $configs
> +do
> +	case $CF in
> +	[0-9]\**|[0-9][0-9]\**|[0-9][0-9][0-9]\**)
> +		config_reps=`echo $CF | sed -e 's/\*.*$//'`
> +		CF1=`echo $CF | sed -e 's/^[^*]*\*//'`
> +		;;
> +	*)
> +		config_reps=1
> +		CF1=$CF
> +		;;
> +	esac
> +	if test -f "$CONFIGFRAG/$CF1"
> +	then
> +		cpu_count=`configNR_CPUS.sh $CONFIGFRAG/$CF1`
> +		cpu_count=`configfrag_boot_cpus "$TORTURE_BOOTARGS" "$CONFIGFRAG/$CF1" "$cpu_count"`
> +		for ((cur_rep=0;cur_rep<$config_reps;cur_rep++))
> +		do
> +			echo $CF1 $cpu_count >> $T/cfgcpu
> +		done
> +	else
> +		echo "The --configs file $CF1 does not exist, terminating."
> +		exit 1
> +	fi
> +done
> +sort -k2nr $T/cfgcpu > $T/cfgcpu.sort
> +
> +# Use a greedy bin-packing algorithm, sorting the list accordingly.
> +awk < $T/cfgcpu.sort > $T/cfgcpu.pack -v ncpus=$cpus '
> +BEGIN {
> +	njobs = 0;
> +}
> +
> +{
> +	# Read file of tests and corresponding required numbers of CPUs.
> +	cf[njobs] = $1;
> +	cpus[njobs] = $2;
> +	njobs++;
> +}
> +
> +END {
> +	alldone = 0;
> +	batch = 0;
> +	nc = -1;
> +
> +	# Each pass through the following loop creates on test batch
> +	# that can be executed concurrently given ncpus.  Note that a
> +	# given test that requires more than the available CPUs will run in
> +	# their own batch.  Such tests just have to make do with what
> +	# is available.
> +	while (nc != ncpus) {
> +		batch++;
> +		nc = ncpus;
> +
> +		# Each pass through the following loop considers one
> +		# test for inclusion in the current batch.
> +		for (i = 0; i < njobs; i++) {
> +			if (done[i])
> +				continue; # Already part of a batch.
> +			if (nc >= cpus[i] || nc == ncpus) {
> +
> +				# This test fits into the current batch.
> +				done[i] = batch;
> +				nc -= cpus[i];
> +				if (nc <= 0)
> +					break; # Too-big test in its own batch.
> +			}
> +		}
> +	}
> +
> +	# Dump out the tests in batch order.
> +	for (b = 1; b <= batch; b++)
> +		for (i = 0; i < njobs; i++)
> +			if (done[i] == b)
> +				print cf[i], cpus[i];
> +}'
> +
> +# Generate a script to execute the tests in appropriate batches.
> +cat << ___EOF___ > $T/script
> +CONFIGFRAG="$CONFIGFRAG"; export CONFIGFRAG
> +KVM="$KVM"; export KVM
> +PATH="$PATH"; export PATH
> +TORTURE_BOOT_IMAGE="$TORTURE_BOOT_IMAGE"; export TORTURE_BOOT_IMAGE
> +TORTURE_BUILDONLY="$TORTURE_BUILDONLY"; export TORTURE_BUILDONLY
> +TORTURE_DEFCONFIG="$TORTURE_DEFCONFIG"; export TORTURE_DEFCONFIG
> +TORTURE_INITRD="$TORTURE_INITRD"; export TORTURE_INITRD
> +TORTURE_KMAKE_ARG="$TORTURE_KMAKE_ARG"; export TORTURE_KMAKE_ARG
> +TORTURE_QEMU_CMD="$TORTURE_QEMU_CMD"; export TORTURE_QEMU_CMD
> +TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE"; export TORTURE_QEMU_INTERACTIVE
> +TORTURE_QEMU_MAC="$TORTURE_QEMU_MAC"; export TORTURE_QEMU_MAC
> +TORTURE_SHUTDOWN_GRACE="$TORTURE_SHUTDOWN_GRACE"; export TORTURE_SHUTDOWN_GRACE
> +TORTURE_SUITE="$TORTURE_SUITE"; export TORTURE_SUITE
> +if ! test -e $resdir
> +then
> +	mkdir -p "$resdir" || :
> +fi
> +mkdir $resdir/$ds
> +echo Results directory: $resdir/$ds
> +echo $scriptname $args
> +touch $resdir/$ds/log
> +echo $scriptname $args >> $resdir/$ds/log
> +echo ${TORTURE_SUITE} > $resdir/$ds/TORTURE_SUITE
> +pwd > $resdir/$ds/testid.txt
> +if test -d .git
> +then
> +	git status >> $resdir/$ds/testid.txt
> +	git rev-parse HEAD >> $resdir/$ds/testid.txt
> +	if ! git diff HEAD > $T/git-diff 2>&1
> +	then
> +		cp $T/git-diff $resdir/$ds
> +	fi
> +fi
> +___EOF___
> +awk < $T/cfgcpu.pack \
> +	-v TORTURE_BUILDONLY="$TORTURE_BUILDONLY" \
> +	-v CONFIGDIR="$CONFIGFRAG/" \
> +	-v KVM="$KVM" \
> +	-v ncpus=$cpus \
> +	-v jitter="$jitter" \
> +	-v rd=$resdir/$ds/ \
> +	-v dur=$dur \
> +	-v TORTURE_QEMU_ARG="$TORTURE_QEMU_ARG" \
> +	-v TORTURE_BOOTARGS="$TORTURE_BOOTARGS" \
> +'BEGIN {
> +	i = 0;
> +}
> +
> +{
> +	cf[i] = $1;
> +	cpus[i] = $2;
> +	i++;
> +}
> +
> +# Dump out the scripting required to run one test batch.
> +function dump(first, pastlast, batchnum)
> +{
> +	print "echo ----Start batch " batchnum ": `date`";
> +	print "echo ----Start batch " batchnum ": `date` >> " rd "/log";
> +	jn=1
> +	for (j = first; j < pastlast; j++) {
> +		builddir=KVM "/b" jn
> +		cpusr[jn] = cpus[j];
> +		if (cfrep[cf[j]] == "") {
> +			cfr[jn] = cf[j];
> +			cfrep[cf[j]] = 1;
> +		} else {
> +			cfrep[cf[j]]++;
> +			cfr[jn] = cf[j] "." cfrep[cf[j]];
> +		}
> +		if (cpusr[jn] > ncpus && ncpus != 0)
> +			ovf = "-ovf";
> +		else
> +			ovf = "";
> +		print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date`";
> +		print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` >> " rd "/log";
> +		print "rm -f " builddir ".*";
> +		print "touch " builddir ".wait";
> +		print "mkdir " builddir " > /dev/null 2>&1 || :";
> +		print "mkdir " rd cfr[jn] " || :";
> +		print "kvm-test-1-run.sh " CONFIGDIR cf[j], builddir, rd cfr[jn], dur " \"" TORTURE_QEMU_ARG "\" \"" TORTURE_BOOTARGS "\" > " rd cfr[jn]  "/kvm-test-1-run.sh.out 2>&1 &"
> +		print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date`";
> +		print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` >> " rd "/log";
> +		print "while test -f " builddir ".wait"
> +		print "do"
> +		print "\tsleep 1"
> +		print "done"
> +		print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date`";
> +		print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date` >> " rd "/log";
> +		jn++;
> +	}
> +	for (j = 1; j < jn; j++) {
> +		builddir=KVM "/b" j
> +		print "rm -f " builddir ".ready"
> +		print "if test -z \"$TORTURE_BUILDONLY\""
> +		print "then"
> +		print "\techo ----", cfr[j], cpusr[j] ovf ": Starting kernel. `date`";
> +		print "\techo ----", cfr[j], cpusr[j] ovf ": Starting kernel. `date` >> " rd "/log";
> +		print "fi"
> +	}
> +	njitter = 0;
> +	split(jitter, ja);
> +	if (ja[1] == -1 && ncpus == 0)
> +		njitter = 1;
> +	else if (ja[1] == -1)
> +		njitter = ncpus;
> +	else
> +		njitter = ja[1];
> +	if (TORTURE_BUILDONLY && njitter != 0) {
> +		njitter = 0;
> +		print "echo Build-only run, so suppressing jitter >> " rd "/log"
> +	}
> +	for (j = 0; j < njitter; j++)
> +		print "jitter.sh " j " " dur " " ja[2] " " ja[3] "&"
> +	print "wait"
> +	print "if test -z \"$TORTURE_BUILDONLY\""
> +	print "then"
> +	print "\techo ---- All kernel runs complete. `date`";
> +	print "\techo ---- All kernel runs complete. `date` >> " rd "/log";
> +	print "fi"
> +	for (j = 1; j < jn; j++) {
> +		builddir=KVM "/b" j
> +		print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results:";
> +		print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results: >> " rd "/log";
> +		print "cat " rd cfr[j]  "/kvm-test-1-run.sh.out";
> +		print "cat " rd cfr[j]  "/kvm-test-1-run.sh.out >> " rd "/log";
> +	}
> +}
> +
> +END {
> +	njobs = i;
> +	nc = ncpus;
> +	first = 0;
> +	batchnum = 1;
> +
> +	# Each pass through the following loop considers one test.
> +	for (i = 0; i < njobs; i++) {
> +		if (ncpus == 0) {
> +			# Sequential test specified, each test its own batch.
> +			dump(i, i + 1, batchnum);
> +			first = i;
> +			batchnum++;
> +		} else if (nc < cpus[i] && i != 0) {
> +			# Out of CPUs, dump out a batch.
> +			dump(first, i, batchnum);
> +			first = i;
> +			nc = ncpus;
> +			batchnum++;
> +		}
> +		# Account for the CPUs needed by the current test.
> +		nc -= cpus[i];
> +	}
> +	# Dump the last batch.
> +	if (ncpus != 0)
> +		dump(first, i, batchnum);
> +}' >> $T/script
> +
> +cat << ___EOF___ >> $T/script
> +echo
> +echo
> +echo " --- `date` Test summary:"
> +echo Results directory: $resdir/$ds
> +kvm-recheck.sh $resdir/$ds
> +___EOF___
> +
> +if test "$dryrun" = script
> +then
> +	cat $T/script
> +	exit 0
> +elif test "$dryrun" = sched
> +then
> +	# Extract the test run schedule from the script.
> +	egrep 'Start batch|Starting build\.' $T/script |
> +		grep -v ">>" |
> +		sed -e 's/:.*$//' -e 's/^echo //'
> +	exit 0
> +else
> +	# Not a dryrun, so run the script.
> +	sh $T/script
> +fi
> +
> +# Tracing: trace_event=rcu:rcu_grace_period,rcu:rcu_future_grace_period,rcu:rcu_grace_period_init,rcu:rcu_nocb_wake,rcu:rcu_preempt_task,rcu:rcu_unlock_preempted_task,rcu:rcu_quiescent_state_report,rcu:rcu_fqs,rcu:rcu_callback,rcu:rcu_kfree_callback,rcu:rcu_batch_start,rcu:rcu_invoke_callback,rcu:rcu_invoke_kfree_callback,rcu:rcu_batch_end,rcu:rcu_torture_read,rcu:rcu_barrier
> diff --git a/run-rcuperf.sh b/run-rcuperf.sh
> new file mode 100755
> index 00000000..0526fff1
> --- /dev/null
> +++ b/run-rcuperf.sh
> @@ -0,0 +1,26 @@
> +#!/bin/bash
> +
> +dur=10
> +run=10
> +torture="rcuperf"
> +rest="1m"
> +path=`pwd`
> +
> +for cpu in 4 8 16
> +do
> +	for type in PRCU TREE
> +	do
> +		folder="$path/res/rcuperf/cpu-$cpu/$type"
> +		if ! test -d $folder
> +		then
> +			echo "$folder does not exist..."
> +			exit
> +		fi
> +
> +		echo "Running rcuperf-$type-${cpu}cpus..."
> +		`./kvm.sh --torture $torture --duration $dur --configs ${run}*${type}-${cpu} --results $folder &> $folder/$type-cpu${cpu}-${dur}min.out`
> +
> +		echo "Sleep $rest..."
> +		`sleep $rest`
> +	done
> +done
> -- 
> 2.14.1.729.g59c0ea183
> 

* Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-25  6:16   ` Paul E. McKenney
@ 2018-01-25  7:30     ` Boqun Feng
  2018-01-30  5:34       ` zhangheng (AC)
  2018-01-27  7:35     ` Lihao Liang
  2018-01-30  3:58     ` zhangheng (AC)
  2 siblings, 1 reply; 43+ messages in thread
From: Boqun Feng @ 2018-01-25  7:30 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: lianglihao, guohanjun, heng.z, hb.chen, lihao.liang, linux-kernel

On Wed, Jan 24, 2018 at 10:16:18PM -0800, Paul E. McKenney wrote:
> On Tue, Jan 23, 2018 at 03:59:26PM +0800, lianglihao@huawei.com wrote:
> > From: Heng Zhang <heng.z@huawei.com>
> > 
> > This RCU implementation (PRCU) is based on a fast consensus protocol
> > published in the following paper:
> > 
> > Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
> > Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
> > IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
> > https://dl.acm.org/citation.cfm?id=3024114.3024143
> > 
> > Signed-off-by: Heng Zhang <heng.z@huawei.com>
> > Signed-off-by: Lihao Liang <lianglihao@huawei.com>
> 
> A few comments and questions interspersed.
> 
> 							Thanx, Paul
> 
> > ---
> >  include/linux/prcu.h |  37 +++++++++++++++
> >  kernel/rcu/Makefile  |   2 +-
> >  kernel/rcu/prcu.c    | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  kernel/sched/core.c  |   2 +
> >  4 files changed, 165 insertions(+), 1 deletion(-)
> >  create mode 100644 include/linux/prcu.h
> >  create mode 100644 kernel/rcu/prcu.c
> > 
> > diff --git a/include/linux/prcu.h b/include/linux/prcu.h
> > new file mode 100644
> > index 00000000..653b4633
> > --- /dev/null
> > +++ b/include/linux/prcu.h
> > @@ -0,0 +1,37 @@
> > +#ifndef __LINUX_PRCU_H
> > +#define __LINUX_PRCU_H
> > +
> > +#include <linux/atomic.h>
> > +#include <linux/mutex.h>
> > +#include <linux/wait.h>
> > +
> > +#define CONFIG_PRCU
> > +
> > +struct prcu_local_struct {
> > +	unsigned int locked;
> > +	unsigned int online;
> > +	unsigned long long version;
> > +};
> > +
> > +struct prcu_struct {
> > +	atomic64_t global_version;
> > +	atomic_t active_ctr;
> > +	struct mutex mtx;
> > +	wait_queue_head_t wait_q;
> > +};
> > +
> > +#ifdef CONFIG_PRCU
> > +void prcu_read_lock(void);
> > +void prcu_read_unlock(void);
> > +void synchronize_prcu(void);
> > +void prcu_note_context_switch(void);
> > +
> > +#else /* #ifdef CONFIG_PRCU */
> > +
> > +#define prcu_read_lock() do {} while (0)
> > +#define prcu_read_unlock() do {} while (0)
> > +#define synchronize_prcu() do {} while (0)
> > +#define prcu_note_context_switch() do {} while (0)
> 
> If CONFIG_PRCU=n and some code is built that uses PRCU, shouldn't you
> get a build error rather than an error-free but inoperative PRCU?
> 
> Of course, Peter's question about purpose of the patch set applies
> here as well.
> 
> > +
> > +#endif /* #ifdef CONFIG_PRCU */
> > +#endif /* __LINUX_PRCU_H */
> > diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
> > index 23803c7d..8791419c 100644
> > --- a/kernel/rcu/Makefile
> > +++ b/kernel/rcu/Makefile
> > @@ -2,7 +2,7 @@
> >  # and is generally not a function of system call inputs.
> >  KCOV_INSTRUMENT := n
> > 
> > -obj-y += update.o sync.o
> > +obj-y += update.o sync.o prcu.o
> >  obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
> >  obj-$(CONFIG_TREE_SRCU) += srcutree.o
> >  obj-$(CONFIG_TINY_SRCU) += srcutiny.o
> > diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
> > new file mode 100644
> > index 00000000..a00b9420
> > --- /dev/null
> > +++ b/kernel/rcu/prcu.c
> > @@ -0,0 +1,125 @@
> > +#include <linux/smp.h>
> > +#include <linux/prcu.h>
> > +#include <linux/percpu.h>
> > +#include <linux/compiler.h>
> > +#include <linux/sched.h>
> > +
> > +#include <asm/barrier.h>
> > +
> > +DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
> > +
> > +struct prcu_struct global_prcu = {
> > +	.global_version = ATOMIC64_INIT(0),
> > +	.active_ctr = ATOMIC_INIT(0),
> > +	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
> > +	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
> > +};
> > +struct prcu_struct *prcu = &global_prcu;
> > +
> > +static inline void prcu_report(struct prcu_local_struct *local)
> > +{
> > +	unsigned long long global_version;
> > +	unsigned long long local_version;
> > +
> > +	global_version = atomic64_read(&prcu->global_version);
> > +	local_version = local->version;
> > +	if (global_version > local_version)
> > +		cmpxchg(&local->version, local_version, global_version);
> > +}
> > +
> > +void prcu_read_lock(void)
> > +{
> > +	struct prcu_local_struct *local;
> > +
> > +	local = get_cpu_ptr(&prcu_local);
> > +	if (!local->online) {
> > +		WRITE_ONCE(local->online, 1);
> > +		smp_mb();
> > +	}
> > +
> > +	local->locked++;
> > +	put_cpu_ptr(&prcu_local);
> > +}
> > +EXPORT_SYMBOL(prcu_read_lock);
> > +
> > +void prcu_read_unlock(void)
> > +{
> > +	int locked;
> > +	struct prcu_local_struct *local;
> > +
> > +	barrier();
> > +	local = get_cpu_ptr(&prcu_local);
> > +	locked = local->locked;
> > +	if (locked) {
> > +		local->locked--;
> > +		if (locked == 1)
> > +			prcu_report(local);
> 
> Is ordering important here?  It looks to me that the compiler could
> rearrange some of the accesses within prcu_report() with the local->locked
> decrement.  There appears to be some potential for load and store tearing,
> though perhaps you have verified that your compiler avoids this on
> the architecture that you are using.
> 
> > +		put_cpu_ptr(&prcu_local);
> > +	} else {
> 
> Hmmm...  We get here if the RCU read-side critical section was preempted.
> If none of them are preempted, ->active_ctr remains zero.
> 
> > +		put_cpu_ptr(&prcu_local);
> > +		if (!atomic_dec_return(&prcu->active_ctr))
> > +			wake_up(&prcu->wait_q);
> > +	}
> > +}
> > +EXPORT_SYMBOL(prcu_read_unlock);
> > +
> > +static void prcu_handler(void *info)
> > +{
> > +	struct prcu_local_struct *local;
> > +
> > +	local = this_cpu_ptr(&prcu_local);
> > +	if (!local->locked)

And I think an smp_mb() is needed here, because otherwise the following
interleaving is possible:

	CPU 0				CPU 1
	==================		==========================
	{X is initially 0}

	WRITE_ONCE(X, 1);

	prcu_read_unlock(void):
	  if (locked) {
					synchronize_prcu(void):
					  ...
					  <send IPI to CPU 0>
	    local->locked--;
	# switch to IPI
	  WRITE_ONCE(local->version,....)
					  <read CPU 0 version to be latest>
					  <return>

					r1 = READ_ONCE(X);

r1 could be 0, which breaks RCU guarantees.
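
To make the ordering requirement concrete, here is a minimal userspace
sketch using C11 atomics and pthreads.  It is not kernel code: the seq_cst
fences stand in for smp_mb(), and X, version, cpu0(), cpu1_wait_and_read()
and run_litmus() are names made up for this illustration.  With a fence on
both sides, an updater that has seen the new version must also see X == 1.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int X;		/* data written inside the read-side section */
static atomic_ullong version;	/* stands in for CPU 0's local->version */

/* CPU 0: reader that finishes its critical section, then reports. */
static void *cpu0(void *unused)
{
	(void)unused;
	atomic_store_explicit(&X, 1, memory_order_relaxed);
	/* local->locked-- and the IPI entry would happen here. */
	atomic_thread_fence(memory_order_seq_cst);	/* the proposed smp_mb() */
	atomic_store_explicit(&version, 1, memory_order_relaxed);
	return NULL;
}

/* CPU 1: updater that waits for the report, then reads X. */
static int cpu1_wait_and_read(void)
{
	while (atomic_load_explicit(&version, memory_order_relaxed) < 1)
		;	/* cpu_relax() in the kernel */
	atomic_thread_fence(memory_order_seq_cst);	/* pairs with cpu0's fence */
	return atomic_load_explicit(&X, memory_order_relaxed);
}

/* Run both sides once and return r1 (or -1 if the thread cannot start). */
static int run_litmus(void)
{
	pthread_t t;
	int r1;

	atomic_store(&X, 0);
	atomic_store(&version, 0);
	if (pthread_create(&t, NULL, cpu0, NULL))
		return -1;
	r1 = cpu1_wait_and_read();
	pthread_join(t, NULL);
	return r1;
}
```

Dropping either fence makes r1 == 0 a permitted outcome under the C11
model, which is the failure described above.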

> > +		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
> > +}
> > +
> > +void synchronize_prcu(void)
> > +{
> > +	int cpu;
> > +	cpumask_t cpus;
> > +	unsigned long long version;
> > +	struct prcu_local_struct *local;
> > +
> > +	version = atomic64_add_return(1, &prcu->global_version);
> > +	mutex_lock(&prcu->mtx);
> > +
> > +	local = get_cpu_ptr(&prcu_local);
> > +	local->version = version;
> > +	put_cpu_ptr(&prcu_local);
> > +
> > +	cpumask_clear(&cpus);
> > +	for_each_possible_cpu(cpu) {
> > +		local = per_cpu_ptr(&prcu_local, cpu);
> > +		if (!READ_ONCE(local->online))
> > +			continue;
> > +		if (READ_ONCE(local->version) < version) {
> 
> On 32-bit systems, given that ->version is long long, you might see
> load tearing.  And on some 32-bit systems, the cmpxchg() in prcu_report()
> might not build.
> 

/me is curious why an atomic64_t is used here for the global version.  I
think a 32-bit global version might still suffice.
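
One caveat: a 32-bit version can wrap, so the plain "<" comparisons in
synchronize_prcu() would need to become wraparound-safe.  A minimal sketch
of such a comparison, in the style of time_before(); version_before() is a
hypothetical helper, not something in the patch, and it assumes the two
versions are never more than 2^31 apart:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if version a is strictly older than version b, even when the
 * 32-bit counter has wrapped between the two samples.  The unsigned
 * difference is reinterpreted as signed, so a negative result means
 * that a is behind b.
 */
static inline bool version_before(uint32_t a, uint32_t b)
{
	return (int32_t)(uint32_t)(a - b) < 0;
}
```

For example, version_before(0xffffffff, 0) is true, whereas a plain
"0xffffffff < 0" would say that the CPU is already up to date.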

Regards,
Boqun

> Or is the idea that only prcu_handler() updates ->version?  But in that
> case, you wouldn't need the READ_ONCE() above.  What am I missing here?
> 
> > +			smp_call_function_single(cpu, prcu_handler, NULL, 0);
> > +			cpumask_set_cpu(cpu, &cpus);
> > +		}
> > +	}
> > +
> > +	for_each_cpu(cpu, &cpus) {
> > +		local = per_cpu_ptr(&prcu_local, cpu);
> > +		while (READ_ONCE(local->version) < version)
> 
> This ->version read can also tear on some 32-bit systems, and this
> one most definitely can race with the prcu_handler() above.  Does the
> algorithm operate correctly in that case?  (It doesn't look that way
> to me, but I might be missing something.) Or are 32-bit systems excluded?
> 
> > +			cpu_relax();
> > +	}
> 
> I might be missing something, but I believe we need a memory barrier
> here on non-TSO systems.  Without that, couldn't we miss a preemption?
> 
> > +
> > +	if (atomic_read(&prcu->active_ctr))
> > +		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
> > +
> > +	mutex_unlock(&prcu->mtx);
> > +}
> > +EXPORT_SYMBOL(synchronize_prcu);
> > +
> > +void prcu_note_context_switch(void)
> > +{
> > +	struct prcu_local_struct *local;
> > +
> > +	local = get_cpu_ptr(&prcu_local);
> > +	if (local->locked) {
> > +		atomic_add(local->locked, &prcu->active_ctr);
> > +		local->locked = 0;
> > +	}
> > +	local->online = 0;
> > +	prcu_report(local);
> > +	put_cpu_ptr(&prcu_local);
> > +}
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 326d4f88..a308581b 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -15,6 +15,7 @@
> >  #include <linux/init_task.h>
> >  #include <linux/context_tracking.h>
> >  #include <linux/rcupdate_wait.h>
> > +#include <linux/prcu.h>
> > 
> >  #include <linux/blkdev.h>
> >  #include <linux/kprobes.h>
> > @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
> > 
> >  	local_irq_disable();
> >  	rcu_note_context_switch(preempt);
> > +	prcu_note_context_switch();
> > 
> >  	/*
> >  	 * Make sure that signal_pending_state()->signal_pending() below
> > -- 
> > 2.14.1.729.g59c0ea183
> > 
> 


* Re: [PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run
  2018-01-25  6:18   ` Paul E. McKenney
@ 2018-01-26  8:33     ` Lihao Liang
  0 siblings, 0 replies; 43+ messages in thread
From: Lihao Liang @ 2018-01-26  8:33 UTC (permalink / raw)
  To: Paul McKenney; +Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Thu, Jan 25, 2018 at 6:18 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Jan 23, 2018 at 03:59:31PM +0800, lianglihao@huawei.com wrote:
>> From: Lihao Liang <lianglihao@huawei.com>
>>
>> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
>> ---
>>  kernel/rcu/rcuperf.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
>> index ea80fa3e..baccc123 100644
>> --- a/kernel/rcu/rcuperf.c
>> +++ b/kernel/rcu/rcuperf.c
>> @@ -60,7 +60,7 @@ MODULE_AUTHOR("Paul E. McKenney <paulmck@linux.vnet.ibm.com>");
>>  #define VERBOSE_PERFOUT_ERRSTRING(s) \
>>       do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
>>
>> -torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
>> +torture_param(bool, gp_exp, true, "Use expedited GP wait primitives");
>
> This is fine as a convenience for internal testing, but the usual way
> to make this happen is using the rcuperf.gp_exp kernel boot parameter.
> Or was that not working for you?
>

Sure. It should work if rcuperf.gp_exp=1 is added to the .boot files
(it wouldn't work if rcuperf.gp_exp=false is used).

Thanks,
Lihao.

>                                                         Thanx, Paul
>
>>  torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
>>  torture_param(int, nreaders, -1, "Number of RCU reader threads");
>>  torture_param(int, nwriters, -1, "Number of RCU updater threads");
>> --
>> 2.14.1.729.g59c0ea183
>>
>


* Re: [PATCH RFC 07/16] prcu: Implement call_prcu() API
  2018-01-25  6:20   ` Paul E. McKenney
@ 2018-01-26  8:44     ` Lihao Liang
  2018-01-26 22:22       ` Paul E. McKenney
  0 siblings, 1 reply; 43+ messages in thread
From: Lihao Liang @ 2018-01-26  8:44 UTC (permalink / raw)
  To: Paul McKenney; +Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Thu, Jan 25, 2018 at 6:20 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Jan 23, 2018 at 03:59:32PM +0800, lianglihao@huawei.com wrote:
>> From: Lihao Liang <lianglihao@huawei.com>
>>
>> This is PRCU's counterpart of RCU's call_rcu() API.
>>
>> Reviewed-by: Heng Zhang <heng.z@huawei.com>
>> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
>> ---
>>  include/linux/prcu.h | 25 ++++++++++++++++++++
>>  init/main.c          |  2 ++
>>  kernel/rcu/prcu.c    | 67 +++++++++++++++++++++++++++++++++++++++++++++++++---
>>  3 files changed, 91 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
>> index 653b4633..e5e09c9b 100644
>> --- a/include/linux/prcu.h
>> +++ b/include/linux/prcu.h
>> @@ -2,15 +2,36 @@
>>  #define __LINUX_PRCU_H
>>
>>  #include <linux/atomic.h>
>> +#include <linux/types.h>
>>  #include <linux/mutex.h>
>>  #include <linux/wait.h>
>>
>>  #define CONFIG_PRCU
>>
>> +struct prcu_version_head {
>> +     unsigned long long version;
>> +     struct prcu_version_head *next;
>> +};
>> +
>> +/* Simple unsegmented callback list for PRCU. */
>> +struct prcu_cblist {
>> +     struct rcu_head *head;
>> +     struct rcu_head **tail;
>> +     struct prcu_version_head *version_head;
>> +     struct prcu_version_head **version_tail;
>> +     long len;
>> +};
>> +
>> +#define PRCU_CBLIST_INITIALIZER(n) { \
>> +     .head = NULL, .tail = &n.head, \
>> +     .version_head = NULL, .version_tail = &n.version_head, \
>> +}
>> +
>>  struct prcu_local_struct {
>>       unsigned int locked;
>>       unsigned int online;
>>       unsigned long long version;
>> +     struct prcu_cblist cblist;
>>  };
>>
>>  struct prcu_struct {
>> @@ -24,6 +45,8 @@ struct prcu_struct {
>>  void prcu_read_lock(void);
>>  void prcu_read_unlock(void);
>>  void synchronize_prcu(void);
>> +void call_prcu(struct rcu_head *head, rcu_callback_t func);
>> +void prcu_init(void);
>>  void prcu_note_context_switch(void);
>>
>>  #else /* #ifdef CONFIG_PRCU */
>> @@ -31,6 +54,8 @@ void prcu_note_context_switch(void);
>>  #define prcu_read_lock() do {} while (0)
>>  #define prcu_read_unlock() do {} while (0)
>>  #define synchronize_prcu() do {} while (0)
>> +#define call_prcu() do {} while (0)
>> +#define prcu_init() do {} while (0)
>>  #define prcu_note_context_switch() do {} while (0)
>>
>>  #endif /* #ifdef CONFIG_PRCU */
>> diff --git a/init/main.c b/init/main.c
>> index f8665104..4925964e 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -38,6 +38,7 @@
>>  #include <linux/smp.h>
>>  #include <linux/profile.h>
>>  #include <linux/rcupdate.h>
>> +#include <linux/prcu.h>
>>  #include <linux/moduleparam.h>
>>  #include <linux/kallsyms.h>
>>  #include <linux/writeback.h>
>> @@ -574,6 +575,7 @@ asmlinkage __visible void __init start_kernel(void)
>>       workqueue_init_early();
>>
>>       rcu_init();
>> +     prcu_init();
>>
>>       /* Trace events are available after this */
>>       trace_init();
>> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
>> index a00b9420..f198285c 100644
>> --- a/kernel/rcu/prcu.c
>> +++ b/kernel/rcu/prcu.c
>> @@ -1,11 +1,12 @@
>>  #include <linux/smp.h>
>> -#include <linux/prcu.h>
>>  #include <linux/percpu.h>
>> -#include <linux/compiler.h>
>> +#include <linux/prcu.h>
>>  #include <linux/sched.h>
>> -
>> +#include <linux/slab.h>
>>  #include <asm/barrier.h>
>>
>> +#include "rcu.h"
>> +
>>  DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
>>
>>  struct prcu_struct global_prcu = {
>> @@ -16,6 +17,16 @@ struct prcu_struct global_prcu = {
>>  };
>>  struct prcu_struct *prcu = &global_prcu;
>>
>> +/* Initialize simple callback list. */
>> +static void prcu_cblist_init(struct prcu_cblist *rclp)
>> +{
>> +     rclp->head = NULL;
>> +     rclp->tail = &rclp->head;
>> +     rclp->version_head = NULL;
>> +     rclp->version_tail = &rclp->version_head;
>> +     rclp->len = 0;
>> +}
>> +
>>  static inline void prcu_report(struct prcu_local_struct *local)
>>  {
>>       unsigned long long global_version;
>> @@ -123,3 +134,53 @@ void prcu_note_context_switch(void)
>>       prcu_report(local);
>>       put_cpu_ptr(&prcu_local);
>>  }
>> +
>> +void call_prcu(struct rcu_head *head, rcu_callback_t func)
>> +{
>> +     unsigned long flags;
>> +     struct prcu_local_struct *local;
>> +     struct prcu_cblist *rclp;
>> +     struct prcu_version_head *vhp;
>> +
>> +     debug_rcu_head_queue(head);
>> +
>> +     /* Use GFP_ATOMIC with IRQs disabled */
>> +     vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
>> +     if (!vhp)
>> +             return;
>
> Silently failing to post the callback can cause system hangs.  I suggest
> finding some way to avoid allocating on the call_prcu() code path.
>

You're absolutely right. We were also thinking of changing the
function return type from void to int to indicate whether the memory
allocation is successful or not.

Best,
Lihao.

>                                                         Thanx, Paul
>
>> +
>> +     head->func = func;
>> +     head->next = NULL;
>> +     vhp->next = NULL;
>> +
>> +     local_irq_save(flags);
>> +     local = this_cpu_ptr(&prcu_local);
>> +     vhp->version = local->version;
>> +     rclp = &local->cblist;
>> +     rclp->len++;
>> +     *rclp->tail = head;
>> +     rclp->tail = &head->next;
>> +     *rclp->version_tail = vhp;
>> +     rclp->version_tail = &vhp->next;
>> +     local_irq_restore(flags);
>> +}
>> +EXPORT_SYMBOL(call_prcu);
>> +
>> +void prcu_init_local_struct(int cpu)
>> +{
>> +     struct prcu_local_struct *local;
>> +
>> +     local = per_cpu_ptr(&prcu_local, cpu);
>> +     local->locked = 0;
>> +     local->online = 0;
>> +     local->version = 0;
>> +     prcu_cblist_init(&local->cblist);
>> +}
>> +
>> +void __init prcu_init(void)
>> +{
>> +     int cpu;
>> +
>> +     for_each_possible_cpu(cpu)
>> +             prcu_init_local_struct(cpu);
>> +}
>> --
>> 2.14.1.729.g59c0ea183
>>
>


* Re: [PATCH RFC 07/16] prcu: Implement call_prcu() API
  2018-01-26  8:44     ` Lihao Liang
@ 2018-01-26 22:22       ` Paul E. McKenney
  0 siblings, 0 replies; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-26 22:22 UTC (permalink / raw)
  To: Lihao Liang; +Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Fri, Jan 26, 2018 at 08:44:50AM +0000, Lihao Liang wrote:
> On Thu, Jan 25, 2018 at 6:20 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Jan 23, 2018 at 03:59:32PM +0800, lianglihao@huawei.com wrote:
> >> From: Lihao Liang <lianglihao@huawei.com>
> >>
> >> This is PRCU's counterpart of RCU's call_rcu() API.
> >>
> >> Reviewed-by: Heng Zhang <heng.z@huawei.com>
> >> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
> >> ---
> >>  include/linux/prcu.h | 25 ++++++++++++++++++++
> >>  init/main.c          |  2 ++
> >>  kernel/rcu/prcu.c    | 67 +++++++++++++++++++++++++++++++++++++++++++++++++---
> >>  3 files changed, 91 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
> >> index 653b4633..e5e09c9b 100644
> >> --- a/include/linux/prcu.h
> >> +++ b/include/linux/prcu.h
> >> @@ -2,15 +2,36 @@
> >>  #define __LINUX_PRCU_H
> >>
> >>  #include <linux/atomic.h>
> >> +#include <linux/types.h>
> >>  #include <linux/mutex.h>
> >>  #include <linux/wait.h>
> >>
> >>  #define CONFIG_PRCU
> >>
> >> +struct prcu_version_head {
> >> +     unsigned long long version;
> >> +     struct prcu_version_head *next;
> >> +};
> >> +
> >> +/* Simple unsegmented callback list for PRCU. */
> >> +struct prcu_cblist {
> >> +     struct rcu_head *head;
> >> +     struct rcu_head **tail;
> >> +     struct prcu_version_head *version_head;
> >> +     struct prcu_version_head **version_tail;
> >> +     long len;
> >> +};
> >> +
> >> +#define PRCU_CBLIST_INITIALIZER(n) { \
> >> +     .head = NULL, .tail = &n.head, \
> >> +     .version_head = NULL, .version_tail = &n.version_head, \
> >> +}
> >> +
> >>  struct prcu_local_struct {
> >>       unsigned int locked;
> >>       unsigned int online;
> >>       unsigned long long version;
> >> +     struct prcu_cblist cblist;
> >>  };
> >>
> >>  struct prcu_struct {
> >> @@ -24,6 +45,8 @@ struct prcu_struct {
> >>  void prcu_read_lock(void);
> >>  void prcu_read_unlock(void);
> >>  void synchronize_prcu(void);
> >> +void call_prcu(struct rcu_head *head, rcu_callback_t func);
> >> +void prcu_init(void);
> >>  void prcu_note_context_switch(void);
> >>
> >>  #else /* #ifdef CONFIG_PRCU */
> >> @@ -31,6 +54,8 @@ void prcu_note_context_switch(void);
> >>  #define prcu_read_lock() do {} while (0)
> >>  #define prcu_read_unlock() do {} while (0)
> >>  #define synchronize_prcu() do {} while (0)
> >> +#define call_prcu() do {} while (0)
> >> +#define prcu_init() do {} while (0)
> >>  #define prcu_note_context_switch() do {} while (0)
> >>
> >>  #endif /* #ifdef CONFIG_PRCU */
> >> diff --git a/init/main.c b/init/main.c
> >> index f8665104..4925964e 100644
> >> --- a/init/main.c
> >> +++ b/init/main.c
> >> @@ -38,6 +38,7 @@
> >>  #include <linux/smp.h>
> >>  #include <linux/profile.h>
> >>  #include <linux/rcupdate.h>
> >> +#include <linux/prcu.h>
> >>  #include <linux/moduleparam.h>
> >>  #include <linux/kallsyms.h>
> >>  #include <linux/writeback.h>
> >> @@ -574,6 +575,7 @@ asmlinkage __visible void __init start_kernel(void)
> >>       workqueue_init_early();
> >>
> >>       rcu_init();
> >> +     prcu_init();
> >>
> >>       /* Trace events are available after this */
> >>       trace_init();
> >> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
> >> index a00b9420..f198285c 100644
> >> --- a/kernel/rcu/prcu.c
> >> +++ b/kernel/rcu/prcu.c
> >> @@ -1,11 +1,12 @@
> >>  #include <linux/smp.h>
> >> -#include <linux/prcu.h>
> >>  #include <linux/percpu.h>
> >> -#include <linux/compiler.h>
> >> +#include <linux/prcu.h>
> >>  #include <linux/sched.h>
> >> -
> >> +#include <linux/slab.h>
> >>  #include <asm/barrier.h>
> >>
> >> +#include "rcu.h"
> >> +
> >>  DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
> >>
> >>  struct prcu_struct global_prcu = {
> >> @@ -16,6 +17,16 @@ struct prcu_struct global_prcu = {
> >>  };
> >>  struct prcu_struct *prcu = &global_prcu;
> >>
> >> +/* Initialize simple callback list. */
> >> +static void prcu_cblist_init(struct prcu_cblist *rclp)
> >> +{
> >> +     rclp->head = NULL;
> >> +     rclp->tail = &rclp->head;
> >> +     rclp->version_head = NULL;
> >> +     rclp->version_tail = &rclp->version_head;
> >> +     rclp->len = 0;
> >> +}
> >> +
> >>  static inline void prcu_report(struct prcu_local_struct *local)
> >>  {
> >>       unsigned long long global_version;
> >> @@ -123,3 +134,53 @@ void prcu_note_context_switch(void)
> >>       prcu_report(local);
> >>       put_cpu_ptr(&prcu_local);
> >>  }
> >> +
> >> +void call_prcu(struct rcu_head *head, rcu_callback_t func)
> >> +{
> >> +     unsigned long flags;
> >> +     struct prcu_local_struct *local;
> >> +     struct prcu_cblist *rclp;
> >> +     struct prcu_version_head *vhp;
> >> +
> >> +     debug_rcu_head_queue(head);
> >> +
> >> +     /* Use GFP_ATOMIC with IRQs disabled */
> >> +     vhp = kmalloc(sizeof(struct prcu_version_head), GFP_ATOMIC);
> >> +     if (!vhp)
> >> +             return;
> >
> > Silently failing to post the callback can cause system hangs.  I suggest
> > finding some way to avoid allocating on the call_prcu() code path.
> >
> 
> You're absolutely right. We were also thinking of changing the
> function return type from void to int to indicate whether the memory
> allocation is successful or not.

Suppose that you are a user of such a function.  When it returns indicating
failure, what are you supposed to do?  What would be the complexity of
the resulting failure-handling code?

Having it simply unconditionally succeed is much friendlier to the user,
especially given that it is not all that hard to make it do so.
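
As a rough illustration of one way to do that, here is a userspace sketch
in which the caller's own structure carries the version, so enqueueing
needs no allocation and cannot fail.  All of the *_sketch names, count_cb()
and the "version < done" readiness rule are assumptions made for this
example and are not taken from the patch.  Real code would more likely
avoid growing struct rcu_head, for instance by segmenting the list by
grace period.

```c
#include <stddef.h>

/* Caller-embedded callback header; the version lives in the header. */
struct prcu_head_sketch {
	struct prcu_head_sketch *next;
	void (*func)(struct prcu_head_sketch *head);
	unsigned long long version;
};

/* Simple unsegmented list with a tail pointer for O(1) enqueue. */
struct prcu_cblist_sketch {
	struct prcu_head_sketch *head;
	struct prcu_head_sketch **tail;
	long len;
};

static void cblist_init(struct prcu_cblist_sketch *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

/* Cannot fail: all storage is supplied by the caller. */
static void cblist_enqueue(struct prcu_cblist_sketch *l,
			   struct prcu_head_sketch *head,
			   void (*func)(struct prcu_head_sketch *head),
			   unsigned long long version)
{
	head->func = func;
	head->next = NULL;
	head->version = version;
	*l->tail = head;
	l->tail = &head->next;
	l->len++;
}

/* Dequeue and invoke, in order, callbacks queued before version 'done'. */
static long cblist_invoke_ready(struct prcu_cblist_sketch *l,
				unsigned long long done)
{
	long n = 0;

	while (l->head && l->head->version < done) {
		struct prcu_head_sketch *h = l->head;

		l->head = h->next;
		if (!l->head)
			l->tail = &l->head;	/* list became empty */
		l->len--;
		h->func(h);
		n++;
	}
	return n;
}

/* Example callback that only counts its invocations. */
static int invoked;

static void count_cb(struct prcu_head_sketch *head)
{
	(void)head;
	invoked++;
}
```

The kernel version would still need the local_irq_save() protection around
the list manipulation that the patch already has.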

							Thanx, Paul


* Re: [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol
  2018-01-25  5:53 ` [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol Paul E. McKenney
@ 2018-01-27  7:22   ` Lihao Liang
  2018-01-27  7:57     ` Paul E. McKenney
  0 siblings, 1 reply; 43+ messages in thread
From: Lihao Liang @ 2018-01-27  7:22 UTC (permalink / raw)
  To: Paul McKenney; +Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Thu, Jan 25, 2018 at 5:53 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Jan 23, 2018 at 03:59:25PM +0800, lianglihao@huawei.com wrote:
>> From: Lihao Liang <lianglihao@huawei.com>
>>
>> Dear Paul,
>>
>> This patch set implements a preemptive version of RCU (PRCU) based on the following paper:
>>
>> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
>> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> https://dl.acm.org/citation.cfm?id=3024114.3024143
>>
>> We have also added preliminary callback-handling support.  Thus, the current version
>> provides APIs prcu_read_lock(), prcu_read_unlock(), synchronize_prcu(), call_prcu(),
>> and prcu_barrier().
>>
>> This is an experimental patch, so it would be good to have some feedback.
>>
>> A known shortcoming is that the grace-period version is incremented in synchronize_prcu().
>> If call_prcu() or prcu_barrier() is called but synchronize_prcu() is never invoked,
>> callbacks cannot be invoked.  A later version should address this issue, e.g. by adding a
>> grace-period expedition mechanism.  Other improvements include using a hierarchical structure,
>> taking into account the NUMA topology, to send IPIs in synchronize_prcu().
>>
>> We have tested the implementation using rcutorture on both an x86 and an ARM64 machine.
>> PRCU passed 1h and 3h tests on all the newly added config files, except that PRCU07
>> reported a BUG in a 1h run.
>>
>> [ 1593.604201] ---[ end trace b3bae911bec86152 ]---
>> [ 1594.629450] prcu-torture:torture_onoff task: offlining 14
>> [ 1594.755553] smpboot: CPU 14 is now offline
>> [ 1594.757732] prcu-torture:torture_onoff task: offlined 14
>> [ 1597.765149] prcu-torture:torture_onoff task: onlining 11
>> [ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
>> [ 1597.804102] prcu-torture:torture_onoff task: onlined 11
>> [ 1599.365098] prcu-torture: rtc: ffffffffb0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0
>> rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418
>> onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
>> [ 1599.367946] prcu-torture: !!!
>> [ 1599.367966] ------------[ cut here ]------------
>
> The "rtbe: 1" indicates that your implementation of prcu_barrier()
> failed to wait for all preceding call_prcu() callbacks to be invoked.
>
> Does the immediately following "Reader Pipe:" list have any but the
> first two numbers non-zero?
>

Yes.

>> We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to true, that is,
>> synchronize_rcu_expedited() was tested.
>>
>> The rcuperf results are as follows (average grace-period duration in ms of ten 10min runs):
>>
>> 16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic
>>
>> CPUs      2       4       8      12      15       16
>> PRCU   0.14    1.07    4.15    8.02   10.79    15.16
>> TREE  49.30  104.75  277.55  390.82  620.82  1381.54
>>
>> 64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic
>>
>> CPUs       2       4        8      16      32       48       63        64
>> PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
>> TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27
>
> Well, at the very least, this is a bug report on either expedited RCU
> grace-period latency or on rcuperf's measurements, and thank you for that.
> I will look into this.  In the meantime, could you please let me know
> exactly how you invoked rcuperf?
>

We used the following command to invoke rcuperf:

sudo ./kvm.sh --torture rcuperf --duration 10 --configs 10*TREE

The actual script run-rcuperf.sh to run the experiments can be found
in the following email of this patch series:

[PATCH RFC 15/16] rcutorture: Add scripts to run experiments

Please let us know how it goes.

Many thanks,
Lihao.

> I have a few comments on some of your patches based on a quick scan
> through them.
>
>                                                         Thanx, Paul
>
>> Best wishes,
>> Lihao.
>>
>>
>> Lihao Liang (15):
>>   rcutorture: Add PRCU rcu_torture_ops
>>   rcutorture: Add PRCU test config files
>>   rcuperf: Add PRCU rcu_perf_ops
>>   rcuperf: Add PRCU test config files
>>   rcuperf: Set gp_exp to true for tests to run
>>   prcu: Implement call_prcu() API
>>   prcu: Implement PRCU callback processing
>>   prcu: Implement prcu_barrier() API
>>   rcutorture: Test call_prcu() and prcu_barrier()
>>   rcutorture: Add basic ARM64 support to run scripts
>>   prcu: Add PRCU Kconfig parameter
>>   prcu: Comment source code
>>   rcuperf: Add config files with various CONFIG_NR_CPUS
>>   rcutorture: Add scripts to run experiments
>>   Add GPLv2 license
>>
>> Heng Zhang (1):
>>   prcu: Add PRCU implementation
>>
>>  include/linux/interrupt.h                          |   3 +
>>  include/linux/prcu.h                               | 122 +++++
>>  include/linux/rcupdate.h                           |   1 +
>>  init/Kconfig                                       |   7 +
>>  init/main.c                                        |   2 +
>>  kernel/rcu/Makefile                                |   1 +
>>  kernel/rcu/prcu.c                                  | 497 +++++++++++++++++++++
>>  kernel/rcu/rcuperf.c                               |  33 +-
>>  kernel/rcu/rcutorture.c                            |  40 +-
>>  kernel/rcu/tree.c                                  |   1 +
>>  kernel/sched/core.c                                |   2 +
>>  kernel/time/timer.c                                |   2 +
>>  kvm.sh                                             | 452 +++++++++++++++++++
>>  run-rcuperf.sh                                     |  26 ++
>>  .../testing/selftests/rcutorture/bin/functions.sh  |  17 +-
>>  .../selftests/rcutorture/configs/rcu/CFLIST        |   5 +
>>  .../selftests/rcutorture/configs/rcu/PRCU02        |  27 ++
>>  .../selftests/rcutorture/configs/rcu/PRCU02.boot   |   1 +
>>  .../selftests/rcutorture/configs/rcu/PRCU03        |  23 +
>>  .../selftests/rcutorture/configs/rcu/PRCU03.boot   |   2 +
>>  .../selftests/rcutorture/configs/rcu/PRCU06        |  26 ++
>>  .../selftests/rcutorture/configs/rcu/PRCU06.boot   |   5 +
>>  .../selftests/rcutorture/configs/rcu/PRCU07        |  25 ++
>>  .../selftests/rcutorture/configs/rcu/PRCU07.boot   |   2 +
>>  .../selftests/rcutorture/configs/rcu/PRCU09        |  19 +
>>  .../selftests/rcutorture/configs/rcu/PRCU09.boot   |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/CFLIST    |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU      |  20 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-12   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-12.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-14   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-14.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-15   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-15.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-16   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-16.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-2    |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-2.boot         |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-32   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-32.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-4    |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-4.boot         |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-48   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-48.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-56   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-56.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-60   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-60.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-62   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-62.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-64   |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-64.boot        |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU-8    |  21 +
>>  .../rcutorture/configs/rcuperf/PRCU-8.boot         |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/PRCU.boot |   1 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-12   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-14   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-15   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-16   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-2    |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-32   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-4    |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-48   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-56   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-60   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-62   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-64   |  21 +
>>  .../selftests/rcutorture/configs/rcuperf/TREE-8    |  21 +
>>  68 files changed, 1918 insertions(+), 5 deletions(-)
>>  create mode 100644 include/linux/prcu.h
>>  create mode 100644 kernel/rcu/prcu.c
>>  create mode 100755 kvm.sh
>>  create mode 100755 run-rcuperf.sh
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64
>>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8
>>
>> --
>> 2.14.1.729.g59c0ea183
>>
>


* Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-25  6:16   ` Paul E. McKenney
  2018-01-25  7:30     ` Boqun Feng
@ 2018-01-27  7:35     ` Lihao Liang
  2018-01-30  3:58     ` zhangheng (AC)
  2 siblings, 0 replies; 43+ messages in thread
From: Lihao Liang @ 2018-01-27  7:35 UTC (permalink / raw)
  To: Paul McKenney, Peter Zijlstra
  Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Thu, Jan 25, 2018 at 6:16 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Jan 23, 2018 at 03:59:26PM +0800, lianglihao@huawei.com wrote:
>> From: Heng Zhang <heng.z@huawei.com>
>>
>> This RCU implementation (PRCU) is based on a fast consensus protocol
>> published in the following paper:
>>
>> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
>> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> https://dl.acm.org/citation.cfm?id=3024114.3024143
>>
>> Signed-off-by: Heng Zhang <heng.z@huawei.com>
>> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
>
> A few comments and questions interspersed.
>
>                                                         Thanx, Paul
>
>> ---
>>  include/linux/prcu.h |  37 +++++++++++++++
>>  kernel/rcu/Makefile  |   2 +-
>>  kernel/rcu/prcu.c    | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  kernel/sched/core.c  |   2 +
>>  4 files changed, 165 insertions(+), 1 deletion(-)
>>  create mode 100644 include/linux/prcu.h
>>  create mode 100644 kernel/rcu/prcu.c
>>
>> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
>> new file mode 100644
>> index 00000000..653b4633
>> --- /dev/null
>> +++ b/include/linux/prcu.h
>> @@ -0,0 +1,37 @@
>> +#ifndef __LINUX_PRCU_H
>> +#define __LINUX_PRCU_H
>> +
>> +#include <linux/atomic.h>
>> +#include <linux/mutex.h>
>> +#include <linux/wait.h>
>> +
>> +#define CONFIG_PRCU
>> +
>> +struct prcu_local_struct {
>> +     unsigned int locked;
>> +     unsigned int online;
>> +     unsigned long long version;
>> +};
>> +
>> +struct prcu_struct {
>> +     atomic64_t global_version;
>> +     atomic_t active_ctr;
>> +     struct mutex mtx;
>> +     wait_queue_head_t wait_q;
>> +};
>> +
>> +#ifdef CONFIG_PRCU
>> +void prcu_read_lock(void);
>> +void prcu_read_unlock(void);
>> +void synchronize_prcu(void);
>> +void prcu_note_context_switch(void);
>> +
>> +#else /* #ifdef CONFIG_PRCU */
>> +
>> +#define prcu_read_lock() do {} while (0)
>> +#define prcu_read_unlock() do {} while (0)
>> +#define synchronize_prcu() do {} while (0)
>> +#define prcu_note_context_switch() do {} while (0)
>
> If CONFIG_PRCU=n and some code is built that uses PRCU, shouldn't you
> get a build error rather than an error-free but inoperative PRCU?
>

Very good point, thank you!
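
One way to get that build error, sketched here as a small user-space model
rather than as the actual header, is to keep the definitions under the config
symbol and provide no empty fallback macros.  PRCU_MODEL_ENABLED and the
prcu_model_*() names are hypothetical stand-ins for CONFIG_PRCU and the real
API, and the nesting counter exists only to make the model observable.

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_PRCU; remove this line to model =n. */
#define PRCU_MODEL_ENABLED 1

#ifdef PRCU_MODEL_ENABLED
static int prcu_model_nesting;	/* hypothetical read-side nesting depth */

static inline void prcu_model_read_lock(void)
{
	prcu_model_nesting++;
}

static inline void prcu_model_read_unlock(void)
{
	assert(prcu_model_nesting > 0);	/* unbalanced unlock is a bug */
	prcu_model_nesting--;
}
#endif /* #ifdef PRCU_MODEL_ENABLED */

/*
 * Deliberately no #else branch with empty "do {} while (0)" macros.
 * With PRCU_MODEL_ENABLED undefined, the caller below fails to build,
 * so a misconfigured user is caught at build time instead of silently
 * running without read-side protection.
 */
static inline int prcu_model_reader(void)
{
	int depth;

	prcu_model_read_lock();
	depth = prcu_model_nesting;	/* depth seen inside the critical section */
	prcu_model_read_unlock();
	return depth;
}
```

In the real header the same effect would presumably come from dropping the
no-op macros, or from making the code that uses PRCU depend on the Kconfig
symbol.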

> Of course, Peter's question about purpose of the patch set applies
> here as well.
>

The main motivation for this patch set is the rcuperf comparison between
PRCU and Tree RCU, in which PRCU outperformed Tree RCU by a large
margin.
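
To put a number on that margin, here is a minimal sketch that computes the
Tree RCU to PRCU ratio from the x86 figures in the cover letter (average
grace-period duration in ms).  The struct and function names are made up for
illustration.  With these figures the ratio ranges from roughly 49 (12 CPUs)
to roughly 352 (2 CPUs).

```c
#include <assert.h>
#include <stddef.h>

/* One column of the rcuperf table in the cover letter (durations in ms). */
struct gp_sample {
	int cpus;
	double prcu_ms;
	double tree_ms;
};

/* x86 figures copied from the cover letter of this series. */
static const struct gp_sample x86_samples[] = {
	{  2,  0.14,   49.30 },
	{  4,  1.07,  104.75 },
	{  8,  4.15,  277.55 },
	{ 12,  8.02,  390.82 },
	{ 15, 10.79,  620.82 },
	{ 16, 15.16, 1381.54 },
};

/* Ratio of Tree RCU to PRCU grace-period duration for one sample. */
static double gp_ratio(const struct gp_sample *s)
{
	assert(s->prcu_ms > 0.0);
	return s->tree_ms / s->prcu_ms;
}

/* Smallest ratio over all samples, i.e. the least favourable case for PRCU. */
static double gp_min_ratio(const struct gp_sample *s, size_t n)
{
	double min;
	size_t i;

	assert(n > 0);
	min = gp_ratio(&s[0]);
	for (i = 1; i < n; i++)
		if (gp_ratio(&s[i]) < min)
			min = gp_ratio(&s[i]);
	return min;
}
```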

As you indicated in your reply to the cover letter of this series,

[PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol

this may be a bug in either expedited RCU grace-period latency or in
rcuperf's measurements.

Many thanks,
Lihao.

>> +
>> +#endif /* #ifdef CONFIG_PRCU */
>> +#endif /* __LINUX_PRCU_H */
>> diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
>> index 23803c7d..8791419c 100644
>> --- a/kernel/rcu/Makefile
>> +++ b/kernel/rcu/Makefile
>> @@ -2,7 +2,7 @@
>>  # and is generally not a function of system call inputs.
>>  KCOV_INSTRUMENT := n
>>
>> -obj-y += update.o sync.o
>> +obj-y += update.o sync.o prcu.o
>>  obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
>>  obj-$(CONFIG_TREE_SRCU) += srcutree.o
>>  obj-$(CONFIG_TINY_SRCU) += srcutiny.o
>> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
>> new file mode 100644
>> index 00000000..a00b9420
>> --- /dev/null
>> +++ b/kernel/rcu/prcu.c
>> @@ -0,0 +1,125 @@
>> +#include <linux/smp.h>
>> +#include <linux/prcu.h>
>> +#include <linux/percpu.h>
>> +#include <linux/compiler.h>
>> +#include <linux/sched.h>
>> +
>> +#include <asm/barrier.h>
>> +
>> +DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
>> +
>> +struct prcu_struct global_prcu = {
>> +     .global_version = ATOMIC64_INIT(0),
>> +     .active_ctr = ATOMIC_INIT(0),
>> +     .mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
>> +     .wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
>> +};
>> +struct prcu_struct *prcu = &global_prcu;
>> +
>> +static inline void prcu_report(struct prcu_local_struct *local)
>> +{
>> +     unsigned long long global_version;
>> +     unsigned long long local_version;
>> +
>> +     global_version = atomic64_read(&prcu->global_version);
>> +     local_version = local->version;
>> +     if (global_version > local_version)
>> +             cmpxchg(&local->version, local_version, global_version);
>> +}
>> +
>> +void prcu_read_lock(void)
>> +{
>> +     struct prcu_local_struct *local;
>> +
>> +     local = get_cpu_ptr(&prcu_local);
>> +     if (!local->online) {
>> +             WRITE_ONCE(local->online, 1);
>> +             smp_mb();
>> +     }
>> +
>> +     local->locked++;
>> +     put_cpu_ptr(&prcu_local);
>> +}
>> +EXPORT_SYMBOL(prcu_read_lock);
>> +
>> +void prcu_read_unlock(void)
>> +{
>> +     int locked;
>> +     struct prcu_local_struct *local;
>> +
>> +     barrier();
>> +     local = get_cpu_ptr(&prcu_local);
>> +     locked = local->locked;
>> +     if (locked) {
>> +             local->locked--;
>> +             if (locked == 1)
>> +                     prcu_report(local);
>
> Is ordering important here?  It looks to me that the compiler could
> rearrange some of the accesses within prcu_report() with the local->locked
> decrement.  There appears to be some potential for load and store tearing,
> though perhaps you have verified that your compiler avoids this on
> the architecture that you are using.
>
>> +             put_cpu_ptr(&prcu_local);
>> +     } else {
>
> Hmmm...  We get here if the RCU read-side critical section was preempted.
> If none of them are preempted, ->active_ctr remains zero.
>
>> +             put_cpu_ptr(&prcu_local);
>> +             if (!atomic_dec_return(&prcu->active_ctr))
>> +                     wake_up(&prcu->wait_q);
>> +     }
>> +}
>> +EXPORT_SYMBOL(prcu_read_unlock);
>> +
>> +static void prcu_handler(void *info)
>> +{
>> +     struct prcu_local_struct *local;
>> +
>> +     local = this_cpu_ptr(&prcu_local);
>> +     if (!local->locked)
>> +             WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
>> +}
>> +
>> +void synchronize_prcu(void)
>> +{
>> +     int cpu;
>> +     cpumask_t cpus;
>> +     unsigned long long version;
>> +     struct prcu_local_struct *local;
>> +
>> +     version = atomic64_add_return(1, &prcu->global_version);
>> +     mutex_lock(&prcu->mtx);
>> +
>> +     local = get_cpu_ptr(&prcu_local);
>> +     local->version = version;
>> +     put_cpu_ptr(&prcu_local);
>> +
>> +     cpumask_clear(&cpus);
>> +     for_each_possible_cpu(cpu) {
>> +             local = per_cpu_ptr(&prcu_local, cpu);
>> +             if (!READ_ONCE(local->online))
>> +                     continue;
>> +             if (READ_ONCE(local->version) < version) {
>
> On 32-bit systems, given that ->version is long long, you might see
> load tearing.  And on some 32-bit systems, the cmpxchg() in prcu_handler()
> might not build.
>
> Or is the idea that only prcu_handler() updates ->version?  But in that
> case, you wouldn't need the READ_ONCE() above.  What am I missing here?
>
>> +                     smp_call_function_single(cpu, prcu_handler, NULL, 0);
>> +                     cpumask_set_cpu(cpu, &cpus);
>> +             }
>> +     }
>> +
>> +     for_each_cpu(cpu, &cpus) {
>> +             local = per_cpu_ptr(&prcu_local, cpu);
>> +             while (READ_ONCE(local->version) < version)
>
> This ->version read can also tear on some 32-bit systems, and this
> one most definitely can race with the prcu_handler() above.  Does the
> algorithm operate correctly in that case?  (It doesn't look that way
> to me, but I might be missing something.) Or are 32-bit systems excluded?
>
>> +                     cpu_relax();
>> +     }
>
> I might be missing something, but I believe we need a memory barrier
> here on non-TSO systems.  Without that, couldn't we miss a preemption?
>
>> +
>> +     if (atomic_read(&prcu->active_ctr))
>> +             wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
>> +
>> +     mutex_unlock(&prcu->mtx);
>> +}
>> +EXPORT_SYMBOL(synchronize_prcu);
>> +
>> +void prcu_note_context_switch(void)
>> +{
>> +     struct prcu_local_struct *local;
>> +
>> +     local = get_cpu_ptr(&prcu_local);
>> +     if (local->locked) {
>> +             atomic_add(local->locked, &prcu->active_ctr);
>> +             local->locked = 0;
>> +     }
>> +     local->online = 0;
>> +     prcu_report(local);
>> +     put_cpu_ptr(&prcu_local);
>> +}
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 326d4f88..a308581b 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -15,6 +15,7 @@
>>  #include <linux/init_task.h>
>>  #include <linux/context_tracking.h>
>>  #include <linux/rcupdate_wait.h>
>> +#include <linux/prcu.h>
>>
>>  #include <linux/blkdev.h>
>>  #include <linux/kprobes.h>
>> @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
>>
>>       local_irq_disable();
>>       rcu_note_context_switch(preempt);
>> +     prcu_note_context_switch();
>>
>>       /*
>>        * Make sure that signal_pending_state()->signal_pending() below
>> --
>> 2.14.1.729.g59c0ea183
>>
>
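
On the ordering and tearing questions raised above for prcu_read_unlock() and
the 64-bit ->version field, the following is a user-space sketch using C11
atomics, not a proposed kernel change.  The model_* names are hypothetical.
The assumption is that every ->version access goes through an atomic 64-bit
operation (in the kernel, atomic64_t would presumably play that role), so it
cannot tear, and that a compiler-only fence keeps the ->locked decrement from
being reordered with the version accesses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for prcu->global_version. */
static _Atomic uint64_t model_global_version;

/* Hypothetical stand-in for struct prcu_local_struct. */
struct model_local {
	_Atomic uint64_t version;
	unsigned int locked;
};

/*
 * Model of prcu_report(): advance the local version to the global one,
 * never moving it backwards.  Both 64-bit accesses are atomic operations.
 */
static void model_report(struct model_local *local)
{
	uint64_t global = atomic_load_explicit(&model_global_version,
					       memory_order_acquire);
	uint64_t old = atomic_load_explicit(&local->version,
					    memory_order_relaxed);

	if (global > old)
		atomic_compare_exchange_strong_explicit(&local->version,
							&old, global,
							memory_order_release,
							memory_order_relaxed);
}

/* Model of the non-preempted prcu_read_unlock() path. */
static void model_read_unlock(struct model_local *local)
{
	assert(local->locked > 0);
	local->locked--;
	if (local->locked == 0) {
		/* Compiler-only fence, similar in intent to barrier(). */
		atomic_signal_fence(memory_order_seq_cst);
		model_report(local);
	}
}
```

Whether a compiler barrier is sufficient here, or whether a full memory
barrier is needed on non-TSO systems, is exactly the open question in the
review, so the fence choice above should be read as a placeholder.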


* Re: [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol
  2018-01-27  7:22   ` Lihao Liang
@ 2018-01-27  7:57     ` Paul E. McKenney
  2018-01-27  9:57       ` Lihao Liang
  2018-01-27 23:41       ` Paul E. McKenney
  0 siblings, 2 replies; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-27  7:57 UTC (permalink / raw)
  To: Lihao Liang; +Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Sat, Jan 27, 2018 at 07:22:27AM +0000, Lihao Liang wrote:
> On Thu, Jan 25, 2018 at 5:53 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Jan 23, 2018 at 03:59:25PM +0800, lianglihao@huawei.com wrote:
> >> From: Lihao Liang <lianglihao@huawei.com>
> >>
> >> Dear Paul,
> >>
> >> This patch set implements a preemptive version of RCU (PRCU) based on the following paper:
> >>
> >> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
> >> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
> >> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
> >> https://dl.acm.org/citation.cfm?id=3024114.3024143
> >>
> >> We have also added preliminary callback-handling support.  Thus, the current version
> >> provides APIs prcu_read_lock(), prcu_read_unlock(), synchronize_prcu(), call_prcu(),
> >> and prcu_barrier().
> >>
> >> This is an experimental patch, so it would be good to have some feedback.
> >>
> >> A known shortcoming is that the grace-period version is incremented in synchronize_prcu().
> >> If call_prcu() or prcu_barrier() is called but synchronize_prcu() is never invoked,
> >> callbacks cannot be invoked.  A later version should address this issue, e.g. by adding a
> >> grace-period expedition mechanism.  Other possible improvements include using a hierarchical
> >> structure that takes the NUMA topology into account to send IPIs in synchronize_prcu().
> >>
> >> We have tested the implementation using rcutorture on both an x86 and an ARM64 machine.
> >> PRCU passed 1h and 3h tests on all the newly added config files, except that PRCU07
> >> reported a BUG in a 1h run.
> >>
> >> [ 1593.604201] ---[ end trace b3bae911bec86152 ]---
> >> [ 1594.629450] prcu-torture:torture_onoff task: offlining 14
> >> [ 1594.755553] smpboot: CPU 14 is now offline
> >> [ 1594.757732] prcu-torture:torture_onoff task: offlined 14
> >> [ 1597.765149] prcu-torture:torture_onoff task: onlining 11
> >> [ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
> >> [ 1597.804102] prcu-torture:torture_onoff task: onlined 11
> >> [ 1599.365098] prcu-torture: rtc: ffffffffb0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0
> >> rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418
> >> onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
> >> [ 1599.367946] prcu-torture: !!!
> >> [ 1599.367966] ------------[ cut here ]------------
> >
> > The "rtbe: 1" indicates that your implementation of prcu_barrier()
> > failed to wait for all preceding call_prcu() callbacks to be invoked.
> >
> > Does the immediately following "Reader Pipe:" list have any but the
> > first two numbers non-zero?
> 
> Yes.

If the third or subsequent numbers are non-zero, that would indicate
too-short grace periods.  This would be a critical bug in PRCU.
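
The check described above can be sketched as a small helper, assuming the
numbers from the "Reader Pipe:" line have already been parsed into an array.
The function name and the array representation are illustrative and are not
part of rcutorture.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if any "Reader Pipe:" entry beyond the first two is non-zero,
 * which per the discussion above would indicate too-short grace periods.
 * Return 0 otherwise.
 */
static int reader_pipe_too_short_gp(const long *pipe, size_t n)
{
	size_t i;

	assert(pipe != NULL || n == 0);
	for (i = 2; i < n; i++)
		if (pipe[i] != 0)
			return 1;
	return 0;
}
```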

> >> We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to true, that is,
> >> synchronize_rcu_expedited() was tested.
> >>
> >> The rcuperf results are as follows (average grace-period duration in ms of ten 10min runs):
> >>
> >> 16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic
> >>
> >> CPUs      2       4       8      12      15       16
> >> PRCU   0.14    1.07    4.15    8.02   10.79    15.16
> >> TREE  49.30  104.75  277.55  390.82  620.82  1381.54
> >>
> >> 64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic
> >>
> >> CPUs       2       4        8      16      32       48       63        64
> >> PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
> >> TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27
> >
> > Well, at the very least, this is a bug report on either expedited RCU
> > grace-period latency or on rcuperf's measurements, and thank you for that.
> > I will look into this.  In the meantime, could you please let me know
> > exactly how you invoked rcuperf?
> 
> We used the following command to invoke rcuperf:
> 
> sudo ./kvm.sh --torture rcuperf --duration 10 --configs 10*TREE
> 
> The actual script run-rcuperf.sh to run the experiments can be found
> in the following email of this patch series:
> 
> [PATCH RFC 15/16] rcutorture: Add scripts to run experiments
> 
> Please let us know how it goes.

Will do!

As I said before, at the very least you have identified a performance bug
in RCU expedited grace periods.

							Thanx, Paul

> Many thanks,
> Lihao.
> 
> > I have a few comments on some of your patches based on a quick scan
> > through them.
> >
> >                                                         Thanx, Paul
> >
> >> Best wishes,
> >> Lihao.
> >>
> >>
> >> Lihao Liang (15):
> >>   rcutorture: Add PRCU rcu_torture_ops
> >>   rcutorture: Add PRCU test config files
> >>   rcuperf: Add PRCU rcu_perf_ops
> >>   rcuperf: Add PRCU test config files
> >>   rcuperf: Set gp_exp to true for tests to run
> >>   prcu: Implement call_prcu() API
> >>   prcu: Implement PRCU callback processing
> >>   prcu: Implement prcu_barrier() API
> >>   rcutorture: Test call_prcu() and prcu_barrier()
> >>   rcutorture: Add basic ARM64 support to run scripts
> >>   prcu: Add PRCU Kconfig parameter
> >>   prcu: Comment source code
> >>   rcuperf: Add config files with various CONFIG_NR_CPUS
> >>   rcutorture: Add scripts to run experiments
> >>   Add GPLv2 license
> >>
> >> Heng Zhang (1):
> >>   prcu: Add PRCU implementation
> >>
> >>  include/linux/interrupt.h                          |   3 +
> >>  include/linux/prcu.h                               | 122 +++++
> >>  include/linux/rcupdate.h                           |   1 +
> >>  init/Kconfig                                       |   7 +
> >>  init/main.c                                        |   2 +
> >>  kernel/rcu/Makefile                                |   1 +
> >>  kernel/rcu/prcu.c                                  | 497 +++++++++++++++++++++
> >>  kernel/rcu/rcuperf.c                               |  33 +-
> >>  kernel/rcu/rcutorture.c                            |  40 +-
> >>  kernel/rcu/tree.c                                  |   1 +
> >>  kernel/sched/core.c                                |   2 +
> >>  kernel/time/timer.c                                |   2 +
> >>  kvm.sh                                             | 452 +++++++++++++++++++
> >>  run-rcuperf.sh                                     |  26 ++
> >>  .../testing/selftests/rcutorture/bin/functions.sh  |  17 +-
> >>  .../selftests/rcutorture/configs/rcu/CFLIST        |   5 +
> >>  .../selftests/rcutorture/configs/rcu/PRCU02        |  27 ++
> >>  .../selftests/rcutorture/configs/rcu/PRCU02.boot   |   1 +
> >>  .../selftests/rcutorture/configs/rcu/PRCU03        |  23 +
> >>  .../selftests/rcutorture/configs/rcu/PRCU03.boot   |   2 +
> >>  .../selftests/rcutorture/configs/rcu/PRCU06        |  26 ++
> >>  .../selftests/rcutorture/configs/rcu/PRCU06.boot   |   5 +
> >>  .../selftests/rcutorture/configs/rcu/PRCU07        |  25 ++
> >>  .../selftests/rcutorture/configs/rcu/PRCU07.boot   |   2 +
> >>  .../selftests/rcutorture/configs/rcu/PRCU09        |  19 +
> >>  .../selftests/rcutorture/configs/rcu/PRCU09.boot   |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/CFLIST    |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU      |  20 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-12   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-12.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-14   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-14.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-15   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-15.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-16   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-16.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-2    |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-2.boot         |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-32   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-32.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-4    |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-4.boot         |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-48   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-48.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-56   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-56.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-60   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-60.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-62   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-62.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-64   |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-64.boot        |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-8    |  21 +
> >>  .../rcutorture/configs/rcuperf/PRCU-8.boot         |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/PRCU.boot |   1 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-12   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-14   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-15   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-16   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-2    |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-32   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-4    |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-48   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-56   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-60   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-62   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-64   |  21 +
> >>  .../selftests/rcutorture/configs/rcuperf/TREE-8    |  21 +
> >>  68 files changed, 1918 insertions(+), 5 deletions(-)
> >>  create mode 100644 include/linux/prcu.h
> >>  create mode 100644 kernel/rcu/prcu.c
> >>  create mode 100755 kvm.sh
> >>  create mode 100755 run-rcuperf.sh
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64
> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8
> >>
> >> --
> >> 2.14.1.729.g59c0ea183
> >>
> >
> 


* Re: [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol
  2018-01-27  7:57     ` Paul E. McKenney
@ 2018-01-27  9:57       ` Lihao Liang
  2018-01-27 23:46         ` Paul E. McKenney
  2018-01-27 23:41       ` Paul E. McKenney
  1 sibling, 1 reply; 43+ messages in thread
From: Lihao Liang @ 2018-01-27  9:57 UTC (permalink / raw)
  To: Paul McKenney; +Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Sat, Jan 27, 2018 at 7:57 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Sat, Jan 27, 2018 at 07:22:27AM +0000, Lihao Liang wrote:
>> On Thu, Jan 25, 2018 at 5:53 AM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Tue, Jan 23, 2018 at 03:59:25PM +0800, lianglihao@huawei.com wrote:
>> >> From: Lihao Liang <lianglihao@huawei.com>
>> >>
>> >> Dear Paul,
>> >>
>> >> This patch set implements a preemptive version of RCU (PRCU) based on the following paper:
>> >>
>> >> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
>> >> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> >> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> >> https://dl.acm.org/citation.cfm?id=3024114.3024143
>> >>
>> >> We have also added preliminary callback-handling support.  Thus, the current version
>> >> provides APIs prcu_read_lock(), prcu_read_unlock(), synchronize_prcu(), call_prcu(),
>> >> and prcu_barrier().
>> >>
>> >> This is an experimental patch, so it would be good to have some feedback.
>> >>
>> >> A known shortcoming is that the grace-period version is incremented in synchronize_prcu().
>> >> If call_prcu() or prcu_barrier() is called but synchronize_prcu() is never invoked,
>> >> callbacks cannot be invoked.  A later version should address this issue, e.g. by adding a
>> >> grace-period expedition mechanism.  Other possible improvements include using a hierarchical
>> >> structure that takes the NUMA topology into account to send IPIs in synchronize_prcu().
>> >>
>> >> We have tested the implementation using rcutorture on both an x86 and an ARM64 machine.
>> >> PRCU passed 1h and 3h tests on all the newly added config files, except that PRCU07
>> >> reported a BUG in a 1h run.
>> >>
>> >> [ 1593.604201] ---[ end trace b3bae911bec86152 ]---
>> >> [ 1594.629450] prcu-torture:torture_onoff task: offlining 14
>> >> [ 1594.755553] smpboot: CPU 14 is now offline
>> >> [ 1594.757732] prcu-torture:torture_onoff task: offlined 14
>> >> [ 1597.765149] prcu-torture:torture_onoff task: onlining 11
>> >> [ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
>> >> [ 1597.804102] prcu-torture:torture_onoff task: onlined 11
>> >> [ 1599.365098] prcu-torture: rtc: ffffffffb0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0
>> >> rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418
>> >> onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
>> >> [ 1599.367946] prcu-torture: !!!
>> >> [ 1599.367966] ------------[ cut here ]------------
>> >
>> > The "rtbe: 1" indicates that your implementation of prcu_barrier()
>> > failed to wait for all preceding call_prcu() callbacks to be invoked.
>> >
>> > Does the immediately following "Reader Pipe:" list have any but the
>> > first two numbers non-zero?
>>
>> Yes.
>
> If the third or subsequent numbers are non-zero, that would indicate
> too-short grace periods.  This would be a critical bug in PRCU.
>
>> >> We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to true, that is,
>> >> synchronize_rcu_expedited() was tested.
>> >>
>> >> The rcuperf results are as follows (average grace-period duration in ms of ten 10min runs):
>> >>
>> >> 16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic
>> >>
>> >> CPUs      2       4       8      12      15       16
>> >> PRCU   0.14    1.07    4.15    8.02   10.79    15.16
>> >> TREE  49.30  104.75  277.55  390.82  620.82  1381.54
>> >>
>> >> 64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic
>> >>
>> >> CPUs       2       4        8      16      32       48       63        64
>> >> PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
>> >> TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27
>> >
>> > Well, at the very least, this is a bug report on either expedited RCU
>> > grace-period latency or on rcuperf's measurements, and thank you for that.
>> > I will look into this.  In the meantime, could you please let me know
>> > exactly how you invoked rcuperf?
>>
>> We used the following command to invoke rcuperf:
>>
>> sudo ./kvm.sh --torture rcuperf --duration 10 --configs 10*TREE
>>
>> The actual script run-rcuperf.sh to run the experiments can be found
>> in the following email of this patch series:
>>
>> [PATCH RFC 15/16] rcutorture: Add scripts to run experiments
>>
>> Please let us know how it goes.
>
> Will do!
>
> As I said before, at the very least you have identified a performance bug
> in RCU expedited grace periods.
>

I should add that we also tested the normal synchronize_rcu() on the
same x86 machine, and the grace-period durations reported by rcuperf
were about 10 times longer than those of synchronize_rcu_expedited().

Is this expected for synchronize_rcu()?

Best,
Lihao.

>                                                         Thanx, Paul
>
>> Many thanks,
>> Lihao.
>>
>> > I have a few comments on some of your patches based on a quick scan
>> > through them.
>> >
>> >                                                         Thanx, Paul
>> >
>> >> Best wishes,
>> >> Lihao.
>> >>
>> >
>>
>


* Re: [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol
  2018-01-27  7:57     ` Paul E. McKenney
  2018-01-27  9:57       ` Lihao Liang
@ 2018-01-27 23:41       ` Paul E. McKenney
  1 sibling, 0 replies; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-27 23:41 UTC (permalink / raw)
  To: Lihao Liang; +Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Fri, Jan 26, 2018 at 11:57:44PM -0800, Paul E. McKenney wrote:
> On Sat, Jan 27, 2018 at 07:22:27AM +0000, Lihao Liang wrote:
> > On Thu, Jan 25, 2018 at 5:53 AM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > > On Tue, Jan 23, 2018 at 03:59:25PM +0800, lianglihao@huawei.com wrote:
> > >> From: Lihao Liang <lianglihao@huawei.com>
> > >>
> > >> Dear Paul,
> > >>
> > >> This patch set implements a preemptive version of RCU (PRCU) based on the following paper:
> > >>
> > >> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
> > >> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
> > >> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
> > >> https://dl.acm.org/citation.cfm?id=3024114.3024143
> > >>
> > >> We have also added preliminary callback-handling support.  Thus, the current version
> > >> provides APIs prcu_read_lock(), prcu_read_unlock(), synchronize_prcu(), call_prcu(),
> > >> and prcu_barrier().
> > >>
> > >> This is an experimental patch, so it would be good to have some feedback.
> > >>
> > >> A known shortcoming is that the grace-period version is incremented only in
> > >> synchronize_prcu().  If call_prcu() or prcu_barrier() is called but synchronize_prcu()
> > >> is never invoked, callbacks cannot be invoked.  A later version should address this
> > >> issue, e.g. by adding a grace-period expedition mechanism.  Another possible improvement
> > >> is to use a hierarchical structure, taking the NUMA topology into account, to send
> > >> IPIs in synchronize_prcu().
> > >>
> > >> We have tested the implementation using rcutorture on both an x86 and an ARM64 machine.
> > >> PRCU passed 1h and 3h tests on all the newly added config files, except that PRCU07
> > >> reported a BUG in a 1h run.
> > >>
> > >> [ 1593.604201] ---[ end trace b3bae911bec86152 ]---
> > >> [ 1594.629450] prcu-torture:torture_onoff task: offlining 14
> > >> [ 1594.755553] smpboot: CPU 14 is now offline
> > >> [ 1594.757732] prcu-torture:torture_onoff task: offlined 14
> > >> [ 1597.765149] prcu-torture:torture_onoff task: onlining 11
> > >> [ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
> > >> [ 1597.804102] prcu-torture:torture_onoff task: onlined 11
> > >> [ 1599.365098] prcu-torture: rtc: ffffffffb0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0
> > >> rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418
> > >> onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
> > >> [ 1599.367946] prcu-torture: !!!
> > >> [ 1599.367966] ------------[ cut here ]------------
> > >
> > > The "rtbe: 1" indicates that your implementation of prcu_barrier()
> > > failed to wait for all preceding call_prcu() callbacks to be invoked.

And my guess is that the "rtbe: 1" is happening because you aren't doing
anything with CPU hotplug.  The way this could happen is as follows:

1.	rcutorture does call_prcu() with a test callback on CPU 5.

2.	CPU 5 goes offline, but its callbacks stay there.

3.	rcutorture does prcu_barrier(), which queues a callback on each
	sufficiently online CPU.  No callback is enqueued on CPU 5.

4.	The enqueued callbacks are all invoked, so prcu_barrier()
	returns.

5.	rcutorture complains because the test callback stranded on CPU 5
	never did get invoked.

A slightly different sequence of events could strand one of prcu_barrier()'s
callbacks on an offline CPU, in which case prcu_barrier() would hang until
that CPU came back online.  There are probably other failure scenarios, but
those two should do for now.  ;-)
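
The five steps above can be illustrated with a small user-space model.
This is only a sketch and not PRCU code: every toy_* name, the CPU count,
and the per-CPU counters are made up for illustration.  Migrating callbacks
away from the outgoing CPU is shown as one possible remedy, not as what the
patch set does or must do.

```c
#include <stdbool.h>
#include <stdio.h>

#define TOY_NR_CPUS 8	/* hypothetical CPU count for this sketch */

/* Toy per-CPU state; none of these names come from the PRCU patches. */
struct toy_cpu {
	bool online;
	int test_cbs;		/* callbacks queued via toy_call_prcu() */
	bool barrier_cb;	/* barrier callback queued on this CPU? */
};

static struct toy_cpu toy_cpus[TOY_NR_CPUS];

static void toy_init(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		toy_cpus[cpu] = (struct toy_cpu){ .online = true };
}

/* Step 1: queue a test callback on the given CPU. */
static void toy_call_prcu(int cpu)
{
	toy_cpus[cpu].test_cbs++;
}

/* Step 2: the CPU goes offline and its callbacks stay where they are. */
static void toy_cpu_offline(int cpu)
{
	toy_cpus[cpu].online = false;
}

/* Possible remedy (an assumption): hand the callbacks to a surviving CPU. */
static void toy_cpu_offline_migrate(int cpu, int dest)
{
	toy_cpus[dest].test_cbs += toy_cpus[cpu].test_cbs;
	toy_cpus[cpu].test_cbs = 0;
	toy_cpus[cpu].online = false;
}

/*
 * Steps 3-5: a barrier that only considers online CPUs.  Returns the
 * number of test callbacks still pending when the barrier returns,
 * which a correct barrier would always leave at zero.
 */
static int toy_naive_barrier(void)
{
	int stranded = 0;

	/* Step 3: queue a barrier callback on each online CPU only. */
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_cpus[cpu].online)
			toy_cpus[cpu].barrier_cb = true;

	/* Step 4: online CPUs drain their queues, barrier callback last. */
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!toy_cpus[cpu].online)
			continue;
		toy_cpus[cpu].test_cbs = 0;
		toy_cpus[cpu].barrier_cb = false;
	}

	/* Step 5: anything left over was stranded on an offline CPU. */
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		stranded += toy_cpus[cpu].test_cbs;
	return stranded;
}

#ifdef TOY_DEMO_MAIN	/* build with -DTOY_DEMO_MAIN for a standalone run */
int main(void)
{
	toy_init();
	toy_call_prcu(5);
	toy_cpu_offline(5);
	printf("stranded without migration: %d\n", toy_naive_barrier());

	toy_init();
	toy_call_prcu(5);
	toy_cpu_offline_migrate(5, 0);
	printf("stranded with migration: %d\n", toy_naive_barrier());
	return 0;
}
#endif
```

In this model the naive barrier returns while the callback queued on CPU 5
is still pending, which corresponds to the "rtbe" count.  Once the callbacks
are moved to an online CPU before the offline completes, nothing is left
behind.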

							Thanx, Paul

> > > Does the immediately following "Reader Pipe:" list have any but the
> > > first two numbers non-zero?
> > 
> > Yes.
> 
> If the third or subsequent numbers are non-zero, that would indicate
> too-short grace periods.  This would be a critical bug in PRCU.
> 
> > >> We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to true, that is,
> > >> synchronize_rcu_expedited() was tested.
> > >>
> > >> The rcuperf results are as follows (average grace-period duration in ms of ten 10min runs):
> > >>
> > >> 16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic
> > >>
> > >> CPUs      2       4       8      12      15       16
> > >> PRCU   0.14    1.07    4.15    8.02   10.79    15.16
> > >> TREE  49.30  104.75  277.55  390.82  620.82  1381.54
> > >>
> > >> 64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic
> > >>
> > >> CPUs       2       4        8      16      32       48       63        64
> > >> PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
> > >> TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27
> > >
> > > Well, at the very least, this is a bug report on either expedited RCU
> > > grace-period latency or on rcuperf's measurements, and thank you for that.
> > > I will look into this.  In the meantime, could you please let me know
> > > exactly how you invoked rcuperf?
> > 
> > We used the following command to invoke rcuperf:
> > 
> > sudo ./kvm.sh --torture rcuperf --duration 10 --configs 10*TREE
> > 
> > The actual script run-rcuperf.sh to run the experiments can be found
> > in the following email of this patch series:
> > 
> > [PATCH RFC 15/16] rcutorture: Add scripts to run experiments
> > 
> > Please let us know how it goes.
> 
> Will do!
> 
> As I said before, at the very least you have identified a performance bug
> in RCU expedited grace periods.
> 
> 							Thanx, Paul
> 
> > Many thanks,
> > Lihao.
> > 
> > > I have a few comments on some of your patches based on a quick scan
> > > through them.
> > >
> > >                                                         Thanx, Paul
> > >
> > >> Best wishes,
> > >> Lihao.
> > >>
> > >
> > 


* Re: [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol
  2018-01-27  9:57       ` Lihao Liang
@ 2018-01-27 23:46         ` Paul E. McKenney
  0 siblings, 0 replies; 43+ messages in thread
From: Paul E. McKenney @ 2018-01-27 23:46 UTC (permalink / raw)
  To: Lihao Liang; +Cc: Guohanjun (Hanjun Guo), heng.z, hb.chen, linux-kernel

On Sat, Jan 27, 2018 at 09:57:03AM +0000, Lihao Liang wrote:
> On Sat, Jan 27, 2018 at 7:57 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Sat, Jan 27, 2018 at 07:22:27AM +0000, Lihao Liang wrote:
> >> On Thu, Jan 25, 2018 at 5:53 AM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Tue, Jan 23, 2018 at 03:59:25PM +0800, lianglihao@huawei.com wrote:
> >> >> From: Lihao Liang <lianglihao@huawei.com>
> >> >>
> >> >> Dear Paul,
> >> >>
> >> >> This patch set implements a preemptive version of RCU (PRCU) based on the following paper:
> >> >>
> >> >> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
> >> >> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
> >> >> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
> >> >> https://dl.acm.org/citation.cfm?id=3024114.3024143
> >> >>
> >> >> We have also added preliminary callback-handling support.  Thus, the current version
> >> >> provides APIs prcu_read_lock(), prcu_read_unlock(), synchronize_prcu(), call_prcu(),
> >> >> and prcu_barrier().
> >> >>
> >> >> This is an experimental patch, so it would be good to have some feedback.
> >> >>
> >> >> A known shortcoming is that the grace-period version is incremented only in
> >> >> synchronize_prcu().  If call_prcu() or prcu_barrier() is called but synchronize_prcu()
> >> >> is never invoked, callbacks cannot be invoked.  A later version should address this
> >> >> issue, e.g. by adding a grace-period expedition mechanism.  Another possible improvement
> >> >> is to use a hierarchical structure, taking the NUMA topology into account, to send
> >> >> IPIs in synchronize_prcu().
> >> >>
> >> >> We have tested the implementation using rcutorture on both an x86 and an ARM64 machine.
> >> >> PRCU passed 1h and 3h tests on all the newly added config files, except that PRCU07
> >> >> reported a BUG in a 1h run.
> >> >>
> >> >> [ 1593.604201] ---[ end trace b3bae911bec86152 ]---
> >> >> [ 1594.629450] prcu-torture:torture_onoff task: offlining 14
> >> >> [ 1594.755553] smpboot: CPU 14 is now offline
> >> >> [ 1594.757732] prcu-torture:torture_onoff task: offlined 14
> >> >> [ 1597.765149] prcu-torture:torture_onoff task: onlining 11
> >> >> [ 1597.766795] smpboot: Booting Node 0 Processor 11 APIC 0xb
> >> >> [ 1597.804102] prcu-torture:torture_onoff task: onlined 11
> >> >> [ 1599.365098] prcu-torture: rtc: ffffffffb0277b90 ver: 66358 tfle: 0 rta: 66358 rtaf: 0
> >> >> rtf: 66349 rtmbe: 0 rtbe: 1 rtbke: 0 rtbre: 0 rtbf: 0 rtb: 0 nt: 2233418
> >> >> onoff: 191/191:199/199 34,199:59,5102 10403:0 (HZ=1000) barrier: 188/189:1 cbflood: 225
> >> >> [ 1599.367946] prcu-torture: !!!
> >> >> [ 1599.367966] ------------[ cut here ]------------
> >> >
> >> > The "rtbe: 1" indicates that your implementation of prcu_barrier()
> >> > failed to wait for all preceding call_prcu() callbacks to be invoked.
> >> >
> >> > Does the immediately following "Reader Pipe:" list have any but the
> >> > first two numbers non-zero?
> >>
> >> Yes.
> >
> > If the third or subsequent numbers are non-zero, that would indicate
> > too-short grace periods.  This would be a critical bug in PRCU.
> >
> >> >> We have also compared PRCU with TREE RCU using rcuperf with gp_exp set to true, that is,
> >> >> synchronize_rcu_expedited() was tested.
> >> >>
> >> >> The rcuperf results are as follows (average grace-period duration in ms of ten 10min runs):
> >> >>
> >> >> 16*Intel Xeon CPU@2.4GHz, 16GB memory, Ubuntu Linux 3.13.0-47-generic
> >> >>
> >> >> CPUs      2       4       8      12      15       16
> >> >> PRCU   0.14    1.07    4.15    8.02   10.79    15.16
> >> >> TREE  49.30  104.75  277.55  390.82  620.82  1381.54
> >> >>
> >> >> 64*Cortex-A72 CPU@2.4GHz, 130GB memory, Ubuntu Linux 4.10.0-21.23-generic
> >> >>
> >> >> CPUs       2       4        8      16      32       48       63        64
> >> >> PRCU    0.23   19.69    38.28   63.21   95.41   167.18   252.01   1841.44
> >> >> TREE  416.73  901.89  1060.86  743.00  920.66  1325.21  1646.20  23806.27
> >> >
> >> > Well, at the very least, this is a bug report on either expedited RCU
> >> > grace-period latency or on rcuperf's measurements, and thank you for that.
> >> > I will look into this.  In the meantime, could you please let me know
> >> > exactly how you invoked rcuperf?
> >>
> >> We used the following command to invoke rcuperf:
> >>
> >> sudo ./kvm.sh --torture rcuperf --duration 10 --configs 10*TREE
> >>
> >> The actual script run-rcuperf.sh to run the experiments can be found
> >> in the following email of this patch series:
> >>
> >> [PATCH RFC 15/16] rcutorture: Add scripts to run experiments
> >>
> >> Please let us know how it goes.
> >
> > Will do!
> >
> > As I said before, at the very least you have identified a performance bug
> > in RCU expedited grace periods.
> >
> 
> I should add that we also tested the normal synchronize_rcu() on the
> same x86 machine, and the rcuperf figures were about 10 times slower
> than those of synchronize_rcu_expedited().
> 
> Is this expected for synchronize_rcu()?

Yes, this is a deliberate design decision.  The goal of
synchronize_rcu() is to minimize CPU overhead, which it does by
aggressively batching concurrent requests -- and also by batching
not-quite-so-concurrent requests.  The other end of this tradeoff is that
synchronize_rcu() has relatively high latency.  In contrast, the goal of
synchronize_rcu_expedited() is lower latency, with the penalty of higher
CPU utilization and more disturbance of real-time workloads.  Note that
synchronize_prcu()'s use of IPIs disturbs real-time workloads in a manner
similar to synchronize_rcu_expedited(), but that synchronize_prcu()
further disturbs idle CPUs.  In my experience, disturbing idle CPUs
doesn't make the people with battery-powered systems very happy.

							Thanx, Paul

> Best,
> Lihao.
> 
> >                                                         Thanx, Paul
> >
> >> Many thanks,
> >> Lihao.
> >>
> >> > I have a few comments on some of your patches based on a quick scan
> >> > through them.
> >> >
> >> >                                                         Thanx, Paul
> >> >
> >> >> Best wishes,
> >> >> Lihao.
> >> >>
> >> >>
> >> >> Lihao Liang (15):
> >> >>   rcutorture: Add PRCU rcu_torture_ops
> >> >>   rcutorture: Add PRCU test config files
> >> >>   rcuperf: Add PRCU rcu_perf_ops
> >> >>   rcuperf: Add PRCU test config files
> >> >>   rcuperf: Set gp_exp to true for tests to run
> >> >>   prcu: Implement call_prcu() API
> >> >>   prcu: Implement PRCU callback processing
> >> >>   prcu: Implement prcu_barrier() API
> >> >>   rcutorture: Test call_prcu() and prcu_barrier()
> >> >>   rcutorture: Add basic ARM64 support to run scripts
> >> >>   prcu: Add PRCU Kconfig parameter
> >> >>   prcu: Comment source code
> >> >>   rcuperf: Add config files with various CONFIG_NR_CPUS
> >> >>   rcutorture: Add scripts to run experiments
> >> >>   Add GPLv2 license
> >> >>
> >> >> Heng Zhang (1):
> >> >>   prcu: Add PRCU implementation
> >> >>
> >> >>  include/linux/interrupt.h                          |   3 +
> >> >>  include/linux/prcu.h                               | 122 +++++
> >> >>  include/linux/rcupdate.h                           |   1 +
> >> >>  init/Kconfig                                       |   7 +
> >> >>  init/main.c                                        |   2 +
> >> >>  kernel/rcu/Makefile                                |   1 +
> >> >>  kernel/rcu/prcu.c                                  | 497 +++++++++++++++++++++
> >> >>  kernel/rcu/rcuperf.c                               |  33 +-
> >> >>  kernel/rcu/rcutorture.c                            |  40 +-
> >> >>  kernel/rcu/tree.c                                  |   1 +
> >> >>  kernel/sched/core.c                                |   2 +
> >> >>  kernel/time/timer.c                                |   2 +
> >> >>  kvm.sh                                             | 452 +++++++++++++++++++
> >> >>  run-rcuperf.sh                                     |  26 ++
> >> >>  .../testing/selftests/rcutorture/bin/functions.sh  |  17 +-
> >> >>  .../selftests/rcutorture/configs/rcu/CFLIST        |   5 +
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU02        |  27 ++
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU02.boot   |   1 +
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU03        |  23 +
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU03.boot   |   2 +
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU06        |  26 ++
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU06.boot   |   5 +
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU07        |  25 ++
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU07.boot   |   2 +
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU09        |  19 +
> >> >>  .../selftests/rcutorture/configs/rcu/PRCU09.boot   |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/CFLIST    |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU      |  20 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-12   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-12.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-14   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-14.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-15   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-15.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-16   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-16.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-2    |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-2.boot         |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-32   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-32.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-4    |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-4.boot         |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-48   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-48.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-56   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-56.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-60   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-60.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-62   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-62.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-64   |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-64.boot        |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU-8    |  21 +
> >> >>  .../rcutorture/configs/rcuperf/PRCU-8.boot         |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/PRCU.boot |   1 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-12   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-14   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-15   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-16   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-2    |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-32   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-4    |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-48   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-56   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-60   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-62   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-64   |  21 +
> >> >>  .../selftests/rcutorture/configs/rcuperf/TREE-8    |  21 +
> >> >>  68 files changed, 1918 insertions(+), 5 deletions(-)
> >> >>  create mode 100644 include/linux/prcu.h
> >> >>  create mode 100644 kernel/rcu/prcu.c
> >> >>  create mode 100755 kvm.sh
> >> >>  create mode 100755 run-rcuperf.sh
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU02.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU03.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU06.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU07.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcu/PRCU09.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-12.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-14.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-15.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-16.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-2.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-32.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-4.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-48.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-56.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-60.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-62.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-64.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU-8.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/PRCU.boot
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-12
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-14
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-15
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-16
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-2
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-32
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-4
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-48
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-56
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-60
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-62
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-64
> >> >>  create mode 100644 tools/testing/selftests/rcutorture/configs/rcuperf/TREE-8
> >> >>
> >> >> --
> >> >> 2.14.1.729.g59c0ea183
> >> >>
> >> >
> >>
> >
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-23  7:59 ` [PATCH RFC 01/16] prcu: Add PRCU implementation lianglihao
  2018-01-24 11:26   ` Peter Zijlstra
  2018-01-25  6:16   ` Paul E. McKenney
@ 2018-01-29  9:10   ` Lai Jiangshan
  2018-01-30  6:21     ` zhangheng (AC)
  2 siblings, 1 reply; 43+ messages in thread
From: Lai Jiangshan @ 2018-01-29  9:10 UTC (permalink / raw)
  To: lianglihao
  Cc: Paul E. McKenney, guohanjun, heng.z, hb.chen, lihao.liang, LKML

On Tue, Jan 23, 2018 at 3:59 PM,  <lianglihao@huawei.com> wrote:
> From: Heng Zhang <heng.z@huawei.com>
>
> This RCU implementation (PRCU) is based on a fast consensus protocol
> published in the following paper:
>
> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
> https://dl.acm.org/citation.cfm?id=3024114.3024143
>
> Signed-off-by: Heng Zhang <heng.z@huawei.com>
> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
> ---
>  include/linux/prcu.h |  37 +++++++++++++++
>  kernel/rcu/Makefile  |   2 +-
>  kernel/rcu/prcu.c    | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  kernel/sched/core.c  |   2 +
>  4 files changed, 165 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/prcu.h
>  create mode 100644 kernel/rcu/prcu.c
>
> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
> new file mode 100644
> index 00000000..653b4633
> --- /dev/null
> +++ b/include/linux/prcu.h
> @@ -0,0 +1,37 @@
> +#ifndef __LINUX_PRCU_H
> +#define __LINUX_PRCU_H
> +
> +#include <linux/atomic.h>
> +#include <linux/mutex.h>
> +#include <linux/wait.h>
> +
> +#define CONFIG_PRCU
> +
> +struct prcu_local_struct {
> +       unsigned int locked;
> +       unsigned int online;
> +       unsigned long long version;
> +};
> +
> +struct prcu_struct {
> +       atomic64_t global_version;
> +       atomic_t active_ctr;
> +       struct mutex mtx;
> +       wait_queue_head_t wait_q;
> +};
> +
> +#ifdef CONFIG_PRCU
> +void prcu_read_lock(void);
> +void prcu_read_unlock(void);
> +void synchronize_prcu(void);
> +void prcu_note_context_switch(void);
> +
> +#else /* #ifdef CONFIG_PRCU */
> +
> +#define prcu_read_lock() do {} while (0)
> +#define prcu_read_unlock() do {} while (0)
> +#define synchronize_prcu() do {} while (0)
> +#define prcu_note_context_switch() do {} while (0)
> +
> +#endif /* #ifdef CONFIG_PRCU */
> +#endif /* __LINUX_PRCU_H */
> diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
> index 23803c7d..8791419c 100644
> --- a/kernel/rcu/Makefile
> +++ b/kernel/rcu/Makefile
> @@ -2,7 +2,7 @@
>  # and is generally not a function of system call inputs.
>  KCOV_INSTRUMENT := n
>
> -obj-y += update.o sync.o
> +obj-y += update.o sync.o prcu.o
>  obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
>  obj-$(CONFIG_TREE_SRCU) += srcutree.o
>  obj-$(CONFIG_TINY_SRCU) += srcutiny.o
> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
> new file mode 100644
> index 00000000..a00b9420
> --- /dev/null
> +++ b/kernel/rcu/prcu.c
> @@ -0,0 +1,125 @@
> +#include <linux/smp.h>
> +#include <linux/prcu.h>
> +#include <linux/percpu.h>
> +#include <linux/compiler.h>
> +#include <linux/sched.h>
> +
> +#include <asm/barrier.h>
> +
> +DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
> +
> +struct prcu_struct global_prcu = {
> +       .global_version = ATOMIC64_INIT(0),
> +       .active_ctr = ATOMIC_INIT(0),
> +       .mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
> +       .wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
> +};
> +struct prcu_struct *prcu = &global_prcu;
> +
> +static inline void prcu_report(struct prcu_local_struct *local)
> +{
> +       unsigned long long global_version;
> +       unsigned long long local_version;
> +
> +       global_version = atomic64_read(&prcu->global_version);
> +       local_version = local->version;
> +       if (global_version > local_version)
> +               cmpxchg(&local->version, local_version, global_version);

It is called with irqs disabled, and local->version can't be modified by
another CPU. Why is cmpxchg() needed?

> +}
> +
> +void prcu_read_lock(void)
> +{
> +       struct prcu_local_struct *local;
> +
> +       local = get_cpu_ptr(&prcu_local);
> +       if (!local->online) {
> +               WRITE_ONCE(local->online, 1);
> +               smp_mb();

What is the paired code?

> +       }
> +
> +       local->locked++;
> +       put_cpu_ptr(&prcu_local);
> +}
> +EXPORT_SYMBOL(prcu_read_lock);
> +
> +void prcu_read_unlock(void)
> +{
> +       int locked;
> +       struct prcu_local_struct *local;
> +
> +       barrier();
> +       local = get_cpu_ptr(&prcu_local);
> +       locked = local->locked;
> +       if (locked) {
> +               local->locked--;
> +               if (locked == 1)
> +                       prcu_report(local);
> +               put_cpu_ptr(&prcu_local);
> +       } else {
> +               put_cpu_ptr(&prcu_local);
> +               if (!atomic_dec_return(&prcu->active_ctr))
> +                       wake_up(&prcu->wait_q);
> +       }
> +}
> +EXPORT_SYMBOL(prcu_read_unlock);
> +
> +static void prcu_handler(void *info)
> +{
> +       struct prcu_local_struct *local;
> +
> +       local = this_cpu_ptr(&prcu_local);
> +       if (!local->locked)
> +               WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
> +}
> +
> +void synchronize_prcu(void)
> +{
> +       int cpu;
> +       cpumask_t cpus;

It might overflow the stack if the cpumask is large; please move it to
struct prcu_struct.

> +       unsigned long long version;
> +       struct prcu_local_struct *local;
> +
> +       version = atomic64_add_return(1, &prcu->global_version);

I think this line of code at least causes the following problem.

> +       mutex_lock(&prcu->mtx);
> +
> +       local = get_cpu_ptr(&prcu_local);
> +       local->version = version;

The order in which callers acquire the mutex might not match the order
of their atomic64_add_return() calls. In that case, local->version
will decrease.

prcu_report() can also run here now, so it is unclear which of the two
will succeed in changing local->version.
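
To make the scenario above concrete, here is a minimal user-space sketch (not code from the patch). The names cpu_state, publish_plain() and publish_forward() are hypothetical. It assumes two updaters that obtained versions 1 and 2 from the atomic increment but acquired the mutex in the opposite order. A plain store then moves the per-CPU version backwards, while a forward-only store does not. Whether a forward-only store is the right fix for PRCU is an open question; taking the version after acquiring the mutex would be another option.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU prcu_local_struct. */
struct cpu_state {
	uint64_t version;
};

/* Unconditional store, mirroring "local->version = version". */
static void publish_plain(struct cpu_state *c, uint64_t v)
{
	c->version = v;
}

/* Forward-only store: ignore versions that are not newer. */
static void publish_forward(struct cpu_state *c, uint64_t v)
{
	/* The signed difference keeps the comparison valid across wraparound. */
	if ((int64_t)(v - c->version) > 0)
		c->version = v;
}
```

Calling publish_plain() with 2 and then 1 leaves the version at 1; the same sequence through publish_forward() leaves it at 2.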

> +       put_cpu_ptr(&prcu_local);
> +
> +       cpumask_clear(&cpus);
> +       for_each_possible_cpu(cpu) {
> +               local = per_cpu_ptr(&prcu_local, cpu);
> +               if (!READ_ONCE(local->online))
> +                       continue;

Reading local->online here seems unreliable.

> +               if (READ_ONCE(local->version) < version) {

Please handle the case where version wraps around its maximum.
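
One common way to make such a comparison wrap-safe is to compare the signed difference of the two counters, in the spirit of the kernel's time_after() macro. The sketch below is a user-space illustration only; prcu_version_before() is a hypothetical name, and the cast relies on two's-complement conversion as implemented by GCC and Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if version a is older than version b,
 * treating the 64-bit counter as a circular sequence space.
 * Valid as long as the two values are less than 2^63 apart.
 */
static inline bool prcu_version_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}
```

With this helper, a counter that has just wrapped from UINT64_MAX to 0 still compares as newer than UINT64_MAX, which a plain "<" would get wrong.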

> +                       smp_call_function_single(cpu, prcu_handler, NULL, 0);

This smells bad inside a for_each_possible_cpu() loop.

> +                       cpumask_set_cpu(cpu, &cpus);
> +               }
> +       }
> +
> +       for_each_cpu(cpu, &cpus) {
> +               local = per_cpu_ptr(&prcu_local, cpu);
> +               while (READ_ONCE(local->version) < version)
> +                       cpu_relax();
> +       }
>

Ouch, the cpu_relax() loop could take a long time, since it waits
until all the relevant CPUs (those with active PRCU readers) have
scheduled. So this block of code is equivalent to synchronize_sched()
in many cases when PRCU is heavily used, isn't it?


smp_mb() /* A paired with B */

> +       if (atomic_read(&prcu->active_ctr))
> +               wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
> +
> +       mutex_unlock(&prcu->mtx);
> +}
> +EXPORT_SYMBOL(synchronize_prcu);
> +
> +void prcu_note_context_switch(void)
> +{
> +       struct prcu_local_struct *local;
> +
> +       local = get_cpu_ptr(&prcu_local);
> +       if (local->locked) {
> +               atomic_add(local->locked, &prcu->active_ctr);

smp_mb() /* B paired with A */

> +               local->locked = 0;
> +       }
> +       local->online = 0;
> +       prcu_report(local);
> +       put_cpu_ptr(&prcu_local);
> +}
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 326d4f88..a308581b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -15,6 +15,7 @@
>  #include <linux/init_task.h>
>  #include <linux/context_tracking.h>
>  #include <linux/rcupdate_wait.h>
> +#include <linux/prcu.h>
>
>  #include <linux/blkdev.h>
>  #include <linux/kprobes.h>
> @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
>
>         local_irq_disable();
>         rcu_note_context_switch(preempt);
> +       prcu_note_context_switch();
>
>         /*
>          * Make sure that signal_pending_state()->signal_pending() below
> --
> 2.14.1.729.g59c0ea183
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-25  6:16   ` Paul E. McKenney
  2018-01-25  7:30     ` Boqun Feng
  2018-01-27  7:35     ` Lihao Liang
@ 2018-01-30  3:58     ` zhangheng (AC)
  2 siblings, 0 replies; 43+ messages in thread
From: zhangheng (AC) @ 2018-01-30  3:58 UTC (permalink / raw)
  To: paulmck, lianglihao
  Cc: Guohanjun (Hanjun Guo), Chenhaibo (Haibo, OS Lab),
	lihao.liang, linux-kernel

Very sorry for the late response; I found that this email had been blocked after Lihao mentioned it this morning.

-----Original Message-----
From: zhangheng (AC) 
Sent: January 26, 2018 18:30
To: 'paulmck@linux.vnet.ibm.com' <paulmck@linux.vnet.ibm.com>; lianglihao@huawei.com
Cc: Guohanjun (Hanjun Guo) <guohanjun@huawei.com>; Chenhaibo (Haibo, OS Lab) <hb.chen@huawei.com>; lihao.liang@gmail.com; linux-kernel@vger.kernel.org
Subject: RE: [PATCH RFC 01/16] prcu: Add PRCU implementation

Hi Paul, thanks a lot for pointing out the problems of the implementation. Here's my understanding.

Best Regards,
Heng

-----Original Message-----
>From: Paul E. McKenney [mailto:paulmck@linux.vnet.ibm.com]
>Sent: January 25, 2018 14:16
>To: lianglihao@huawei.com
>Cc: Guohanjun (Hanjun Guo) <guohanjun@huawei.com>; zhangheng (AC) 
><heng.z@huawei.com>; Chenhaibo (Haibo, OS Lab) <hb.chen@huawei.com>; 
>lihao.liang@gmail.com; linux-kernel@vger.kernel.org
>Subject: Re: [PATCH RFC 01/16] prcu: Add PRCU implementation

>On Tue, Jan 23, 2018 at 03:59:26PM +0800, lianglihao@huawei.com wrote:
>> From: Heng Zhang <heng.z@huawei.com>
>> 
>> This RCU implementation (PRCU) is based on a fast consensus protocol 
>> published in the following paper:
>> 
>> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
>> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> https://dl.acm.org/citation.cfm?id=3024114.3024143
>> 
>> Signed-off-by: Heng Zhang <heng.z@huawei.com>
>> Signed-off-by: Lihao Liang <lianglihao@huawei.com>

>A few comments and questions interspersed.

>							Thanx, Paul

>> ---
>>  include/linux/prcu.h |  37 +++++++++++++++
>>  kernel/rcu/Makefile  |   2 +-
>>  kernel/rcu/prcu.c    | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  kernel/sched/core.c  |   2 +
>>  4 files changed, 165 insertions(+), 1 deletion(-)
>>  create mode 100644 include/linux/prcu.h
>>  create mode 100644 kernel/rcu/prcu.c
>> 
>> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
>> new file mode 100644
>> index 00000000..653b4633
>> --- /dev/null
>> +++ b/include/linux/prcu.h
>> @@ -0,0 +1,37 @@
>> +#ifndef __LINUX_PRCU_H
>> +#define __LINUX_PRCU_H
>> +
>> +#include <linux/atomic.h>
>> +#include <linux/mutex.h>
>> +#include <linux/wait.h>
>> +
>> +#define CONFIG_PRCU
>> +
>> +struct prcu_local_struct {
>> +	unsigned int locked;
>> +	unsigned int online;
>> +	unsigned long long version;
>> +};
>> +
>> +struct prcu_struct {
>> +	atomic64_t global_version;
>> +	atomic_t active_ctr;
>> +	struct mutex mtx;
>> +	wait_queue_head_t wait_q;
>> +};
>> +
>> +#ifdef CONFIG_PRCU
>> +void prcu_read_lock(void);
>> +void prcu_read_unlock(void);
>> +void synchronize_prcu(void);
>> +void prcu_note_context_switch(void);
>> +
>> +#else /* #ifdef CONFIG_PRCU */
>> +
>> +#define prcu_read_lock() do {} while (0)
>> +#define prcu_read_unlock() do {} while (0)
>> +#define synchronize_prcu() do {} while (0)
>> +#define prcu_note_context_switch() do {} while (0)

>If CONFIG_PRCU=n and some code is built that uses PRCU, shouldn't you get a build error rather than an error-free but inoperative PRCU?

>Of course, Peter's question about purpose of the patch set applies here as well.

Yes, we should handle this case more carefully.
In my personal opinion, PRCU is designed for modules that have a few threads and require low synchronization latency (because it only sends IPIs to online readers). I think a module that uses synchronize_rcu_expedited() may benefit from PRCU.

>> +
>> +#endif /* #ifdef CONFIG_PRCU */
>> +#endif /* __LINUX_PRCU_H */
>> diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
>> index 23803c7d..8791419c 100644
>> --- a/kernel/rcu/Makefile
>> +++ b/kernel/rcu/Makefile
>> @@ -2,7 +2,7 @@
>>  # and is generally not a function of system call inputs.
>>  KCOV_INSTRUMENT := n
>> 
>> -obj-y += update.o sync.o
>> +obj-y += update.o sync.o prcu.o
>>  obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
>>  obj-$(CONFIG_TREE_SRCU) += srcutree.o
>>  obj-$(CONFIG_TINY_SRCU) += srcutiny.o
>> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
>> new file mode 100644
>> index 00000000..a00b9420
>> --- /dev/null
>> +++ b/kernel/rcu/prcu.c
>> @@ -0,0 +1,125 @@
>> +#include <linux/smp.h>
>> +#include <linux/prcu.h>
>> +#include <linux/percpu.h>
>> +#include <linux/compiler.h>
>> +#include <linux/sched.h>
>> +
>> +#include <asm/barrier.h>
>> +
>> +DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
>> +
>> +struct prcu_struct global_prcu = {
>> +	.global_version = ATOMIC64_INIT(0),
>> +	.active_ctr = ATOMIC_INIT(0),
>> +	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
>> +	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
>> +};
>> +struct prcu_struct *prcu = &global_prcu;
>> +
>> +static inline void prcu_report(struct prcu_local_struct *local)
>> +{
>> +	unsigned long long global_version;
>> +	unsigned long long local_version;
>> +
>> +	global_version = atomic64_read(&prcu->global_version);
>> +	local_version = local->version;
>> +	if (global_version > local_version)
>> +		cmpxchg(&local->version, local_version, global_version);
>> +}
>> +
>> +void prcu_read_lock(void)
>> +{
>> +	struct prcu_local_struct *local;
>> +
>> +	local = get_cpu_ptr(&prcu_local);
>> +	if (!local->online) {
>> +		WRITE_ONCE(local->online, 1);
>> +		smp_mb();
>> +	}
>> +
>> +	local->locked++;
>> +	put_cpu_ptr(&prcu_local);
>> +}
>> +EXPORT_SYMBOL(prcu_read_lock);
>> +
>> +void prcu_read_unlock(void)
>> +{
>> +	int locked;
>> +	struct prcu_local_struct *local;
>> +
>> +	barrier();
>> +	local = get_cpu_ptr(&prcu_local);
>> +	locked = local->locked;
>> +	if (locked) {
>> +		local->locked--;
>> +		if (locked == 1)
>> +			prcu_report(local);

>Is ordering important here?  It looks to me that the compiler could rearrange some of the accesses within prcu_report() with the local->locked decrement.  There appears to be some potential for load and store tearing, though perhaps you have verified that your compiler avoids this on the architecture that you are using.

You are right; I should add a barrier() here. If the reordering lets prcu_report() take effect (update local->version) when locked != 1, PRCU will misbehave.
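
A minimal user-space sketch of the compiler barrier discussed above, under stated assumptions: struct reader and reader_unlock() are hypothetical stand-ins for the per-CPU state and the fast path of prcu_read_unlock(), and barrier() is defined here as the usual GNU C empty asm with a "memory" clobber. It only constrains compiler reordering; it does not address the load and store tearing that was also raised.

```c
/* GNU C compiler barrier: forbids the compiler from moving memory
 * accesses across this point. It emits no CPU instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the per-CPU reader state. */
struct reader {
	unsigned int locked;		/* read-side nesting depth */
	unsigned long long version;	/* last reported version */
};

/* Sketch of the unlock fast path: decrement first, then report. */
static void reader_unlock(struct reader *r, unsigned long long global_version)
{
	unsigned int locked = r->locked;

	if (!locked)
		return;		/* the preempted-reader path is not modeled */

	r->locked = locked - 1;
	barrier();		/* keep the report after the decrement */

	/* Only the outermost unlock reports a quiescent state. */
	if (locked == 1 && global_version > r->version)
		r->version = global_version;
}
```

With a nesting depth of 2, the first call only decrements the depth, and the second call is the one that updates the version.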

>> +		put_cpu_ptr(&prcu_local);
>> +	} else {

>Hmmm...  We get here if the RCU read-side critical section was preempted.
>If none of them are preempted, ->active_ctr remains zero.

Yes. Unless the user calls prcu_read_unlock() without a matching prcu_read_lock(), active_ctr must be non-zero here.

>> +		put_cpu_ptr(&prcu_local);
>> +		if (!atomic_dec_return(&prcu->active_ctr))
>> +			wake_up(&prcu->wait_q);
>> +	}
>> +}
>> +EXPORT_SYMBOL(prcu_read_unlock);
>> +
>> +static void prcu_handler(void *info)
>> +{
>> +	struct prcu_local_struct *local;
>> +
>> +	local = this_cpu_ptr(&prcu_local);
>> +	if (!local->locked)
>> +		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
>> +}
>> +
>> +void synchronize_prcu(void)
>> +{
>> +	int cpu;
>> +	cpumask_t cpus;
>> +	unsigned long long version;
>> +	struct prcu_local_struct *local;
>> +
>> +	version = atomic64_add_return(1, &prcu->global_version);
>> +	mutex_lock(&prcu->mtx);
>> +
>> +	local = get_cpu_ptr(&prcu_local);
>> +	local->version = version;
>> +	put_cpu_ptr(&prcu_local);
>> +
>> +	cpumask_clear(&cpus);
>> +	for_each_possible_cpu(cpu) {
>> +		local = per_cpu_ptr(&prcu_local, cpu);
>> +		if (!READ_ONCE(local->online))
>> +			continue;
>> +		if (READ_ONCE(local->version) < version) {

>On 32-bit systems, given that ->version is long long, you might see load tearing.  And on some 32-bit systems, the cmpxchg() in prcu_handler() might not build.

>Or is the idea that only prcu_handler() updates ->version?  But in that case, you wouldn't need the READ_ONCE() above.  What am I missing here?

Thanks, I didn't consider the problems on 32-bit systems, so these comments are really helpful.
I must use a 64-bit counter to avoid overflow. So if it doesn't work on 32-bit systems, I think I should add a spinlock for each local->version.
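
As a rough illustration of the spinlock idea, here is a user-space sketch using a C11 atomic_flag as a stand-in for a kernel spinlock. All names (locked_version, lv_read(), lv_write()) are hypothetical. Holding the lock around both the read and the write means a reader never observes a half-updated 64-bit value, even where 64-bit accesses are split into two 32-bit accesses. The extra cost on the read side is not measured here.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-CPU version protected by a tiny spinlock. */
struct locked_version {
	atomic_flag lock;
	uint64_t version;
};

static void lv_lock(struct locked_version *lv)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&lv->lock, memory_order_acquire))
		;
}

static void lv_unlock(struct locked_version *lv)
{
	atomic_flag_clear_explicit(&lv->lock, memory_order_release);
}

/* Read the whole 64-bit value under the lock, so it cannot tear. */
static uint64_t lv_read(struct locked_version *lv)
{
	uint64_t v;

	lv_lock(lv);
	v = lv->version;
	lv_unlock(lv);
	return v;
}

/* Write the whole 64-bit value under the lock. */
static void lv_write(struct locked_version *lv, uint64_t v)
{
	lv_lock(lv);
	lv->version = v;
	lv_unlock(lv);
}
```

A value can be initialized with { ATOMIC_FLAG_INIT, 0 } and then accessed only through lv_read() and lv_write().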

>> +			smp_call_function_single(cpu, prcu_handler, NULL, 0);
>> +			cpumask_set_cpu(cpu, &cpus);
>> +		}
>> +	}
>> +
>> +	for_each_cpu(cpu, &cpus) {
>> +		local = per_cpu_ptr(&prcu_local, cpu);
>> +		while (READ_ONCE(local->version) < version)

>This ->version read can also tear on some 32-bit systems, and this one most definitely can race with the prcu_handler() above.  Does the algorithm operate correctly in that case?  (It doesn't look that way to me, but I might be missing something.) Or are 32-bit systems excluded?

>> +			cpu_relax();
>> +	}

>I might be missing something, but I believe we need a memory barrier here on non-TSO systems.  Without that, couldn't we miss a preemption?

Yes, a memory barrier is reasonable here for non-TSO systems. Otherwise the following atomic_read() may be reordered before the reports from all CPUs.
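
The barrier pairing can be sketched in user space with C11 atomics. This is an illustration under stated assumptions, not the kernel code: cpu_version and active_ctr stand in for local->version and prcu->active_ctr, the function names are hypothetical, and atomic_thread_fence(memory_order_seq_cst) plays the role of smp_mb(). The context-switch side publishes the preempted-reader count before reporting its version; the updater side orders its read of the count after it has seen the report.

```c
#include <stdatomic.h>
#include <stdint.h>

static atomic_uint_fast64_t cpu_version;	/* stand-in for local->version */
static atomic_int active_ctr;			/* stand-in for prcu->active_ctr */

/* Context-switch side: hand off the nesting count, then report. */
static void note_context_switch(int nesting, uint64_t global_version)
{
	if (nesting)
		atomic_fetch_add_explicit(&active_ctr, nesting,
					  memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* B, pairs with A */
	atomic_store_explicit(&cpu_version, global_version,
			      memory_order_relaxed);
}

/* Updater side: wait for the report, then read the reader count. */
static int wait_for_report(uint64_t version)
{
	while (atomic_load_explicit(&cpu_version, memory_order_relaxed) < version)
		;	/* the kernel loop calls cpu_relax() here */
	atomic_thread_fence(memory_order_seq_cst);	/* A, pairs with B */
	return atomic_load_explicit(&active_ctr, memory_order_relaxed);
}
```

Without fence A, the load of active_ctr could be satisfied before the version load that ends the loop, so the updater might see a zero count even though a preempted reader had already been added.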

>> +
>> +	if (atomic_read(&prcu->active_ctr))
>> +		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
>> +
>> +	mutex_unlock(&prcu->mtx);
>> +}
>> +EXPORT_SYMBOL(synchronize_prcu);
>> +
>> +void prcu_note_context_switch(void)
>> +{
>> +	struct prcu_local_struct *local;
>> +
>> +	local = get_cpu_ptr(&prcu_local);
>> +	if (local->locked) {
>> +		atomic_add(local->locked, &prcu->active_ctr);
>> +		local->locked = 0;
>> +	}
>> +	local->online = 0;
>> +	prcu_report(local);
>> +	put_cpu_ptr(&prcu_local);
>> +}
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 326d4f88..a308581b 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -15,6 +15,7 @@
>>  #include <linux/init_task.h>
>>  #include <linux/context_tracking.h>
>>  #include <linux/rcupdate_wait.h>
>> +#include <linux/prcu.h>
>> 
>>  #include <linux/blkdev.h>
>>  #include <linux/kprobes.h>
>> @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
>> 
>>  	local_irq_disable();
>>  	rcu_note_context_switch(preempt);
>> +	prcu_note_context_switch();
>> 
>>  	/*
>>  	 * Make sure that signal_pending_state()->signal_pending() below
>> --
>> 2.14.1.729.g59c0ea183
>> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-25  7:30     ` Boqun Feng
@ 2018-01-30  5:34       ` zhangheng (AC)
  2018-01-30  6:40         ` Boqun Feng
  0 siblings, 1 reply; 43+ messages in thread
From: zhangheng (AC) @ 2018-01-30  5:34 UTC (permalink / raw)
  To: Boqun Feng, Paul E. McKenney
  Cc: lianglihao, Guohanjun (Hanjun Guo), Chenhaibo (Haibo, OS Lab),
	lihao.liang, linux-kernel

-----Original Message-----
>From: Boqun Feng [mailto:boqun.feng@gmail.com] 
>Sent: January 25, 2018 15:31
>To: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>Cc: lianglihao@huawei.com; Guohanjun (Hanjun Guo) <guohanjun@huawei.com>; zhangheng (AC) <heng.z@huawei.com>; Chenhaibo (Haibo, OS Lab) <hb.chen@huawei.com>; lihao.liang@gmail.com; linux-kernel@vger.kernel.org
>Subject: Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
>
>On Wed, Jan 24, 2018 at 10:16:18PM -0800, Paul E. McKenney wrote:
>> On Tue, Jan 23, 2018 at 03:59:26PM +0800, lianglihao@huawei.com wrote:
>> > From: Heng Zhang <heng.z@huawei.com>
>> > 
>> > This RCU implementation (PRCU) is based on a fast consensus protocol 
>> > published in the following paper:
>> > 
>> > Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
>> > Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> > IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> > https://dl.acm.org/citation.cfm?id=3024114.3024143
>> > 
>> > Signed-off-by: Heng Zhang <heng.z@huawei.com>
>> > Signed-off-by: Lihao Liang <lianglihao@huawei.com>
>> 
>> A few comments and questions interspersed.
>> 
>> 							Thanx, Paul
>> 
>> > ---
>> >  include/linux/prcu.h |  37 +++++++++++++++
>> >  kernel/rcu/Makefile  |   2 +-
>> >  kernel/rcu/prcu.c    | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++
>> >  kernel/sched/core.c  |   2 +
>> >  4 files changed, 165 insertions(+), 1 deletion(-)
>> >  create mode 100644 include/linux/prcu.h
>> >  create mode 100644 kernel/rcu/prcu.c
>> > 
>> > diff --git a/include/linux/prcu.h b/include/linux/prcu.h
>> > new file mode 100644
>> > index 00000000..653b4633
>> > --- /dev/null
>> > +++ b/include/linux/prcu.h
>> > @@ -0,0 +1,37 @@
>> > +#ifndef __LINUX_PRCU_H
>> > +#define __LINUX_PRCU_H
>> > +
>> > +#include <linux/atomic.h>
>> > +#include <linux/mutex.h>
>> > +#include <linux/wait.h>
>> > +
>> > +#define CONFIG_PRCU
>> > +
>> > +struct prcu_local_struct {
>> > +	unsigned int locked;
>> > +	unsigned int online;
>> > +	unsigned long long version;
>> > +};
>> > +
>> > +struct prcu_struct {
>> > +	atomic64_t global_version;
>> > +	atomic_t active_ctr;
>> > +	struct mutex mtx;
>> > +	wait_queue_head_t wait_q;
>> > +};
>> > +
>> > +#ifdef CONFIG_PRCU
>> > +void prcu_read_lock(void);
>> > +void prcu_read_unlock(void);
>> > +void synchronize_prcu(void);
>> > +void prcu_note_context_switch(void);
>> > +
>> > +#else /* #ifdef CONFIG_PRCU */
>> > +
>> > +#define prcu_read_lock() do {} while (0)
>> > +#define prcu_read_unlock() do {} while (0)
>> > +#define synchronize_prcu() do {} while (0)
>> > +#define prcu_note_context_switch() do {} while (0)
>> 
>> If CONFIG_PRCU=n and some code is built that uses PRCU, shouldn't you 
>> get a build error rather than an error-free but inoperative PRCU?
>> 
>> Of course, Peter's question about purpose of the patch set applies 
>> here as well.
>> 
>> > +
>> > +#endif /* #ifdef CONFIG_PRCU */
>> > +#endif /* __LINUX_PRCU_H */
>> > diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
>> > index 23803c7d..8791419c 100644
>> > --- a/kernel/rcu/Makefile
>> > +++ b/kernel/rcu/Makefile
>> > @@ -2,7 +2,7 @@
>> >  # and is generally not a function of system call inputs.
>> >  KCOV_INSTRUMENT := n
>> > 
>> > -obj-y += update.o sync.o
>> > +obj-y += update.o sync.o prcu.o
>> >  obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
>> >  obj-$(CONFIG_TREE_SRCU) += srcutree.o
>> >  obj-$(CONFIG_TINY_SRCU) += srcutiny.o
>> > diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
>> > new file mode 100644
>> > index 00000000..a00b9420
>> > --- /dev/null
>> > +++ b/kernel/rcu/prcu.c
>> > @@ -0,0 +1,125 @@
>> > +#include <linux/smp.h>
>> > +#include <linux/prcu.h>
>> > +#include <linux/percpu.h>
>> > +#include <linux/compiler.h>
>> > +#include <linux/sched.h>
>> > +
>> > +#include <asm/barrier.h>
>> > +
>> > +DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
>> > +
>> > +struct prcu_struct global_prcu = {
>> > +	.global_version = ATOMIC64_INIT(0),
>> > +	.active_ctr = ATOMIC_INIT(0),
>> > +	.mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
>> > +	.wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
>> > +};
>> > +struct prcu_struct *prcu = &global_prcu;
>> > +
>> > +static inline void prcu_report(struct prcu_local_struct *local)
>> > +{
>> > +	unsigned long long global_version;
>> > +	unsigned long long local_version;
>> > +
>> > +	global_version = atomic64_read(&prcu->global_version);
>> > +	local_version = local->version;
>> > +	if (global_version > local_version)
>> > +		cmpxchg(&local->version, local_version, global_version);
>> > +}
>> > +
>> > +void prcu_read_lock(void)
>> > +{
>> > +	struct prcu_local_struct *local;
>> > +
>> > +	local = get_cpu_ptr(&prcu_local);
>> > +	if (!local->online) {
>> > +		WRITE_ONCE(local->online, 1);
>> > +		smp_mb();
>> > +	}
>> > +
>> > +	local->locked++;
>> > +	put_cpu_ptr(&prcu_local);
>> > +}
>> > +EXPORT_SYMBOL(prcu_read_lock);
>> > +
>> > +void prcu_read_unlock(void)
>> > +{
>> > +	int locked;
>> > +	struct prcu_local_struct *local;
>> > +
>> > +	barrier();
>> > +	local = get_cpu_ptr(&prcu_local);
>> > +	locked = local->locked;
>> > +	if (locked) {
>> > +		local->locked--;
>> > +		if (locked == 1)
>> > +			prcu_report(local);
>> 
>> Is ordering important here?  It looks to me that the compiler could 
>> rearrange some of the accesses within prcu_report() with the 
>> local->locked decrement.  There appears to be some potential for load 
>> and store tearing, though perhaps you have verified that your compiler 
>> avoids this on the architecture that you are using.
>> 
>> > +		put_cpu_ptr(&prcu_local);
>> > +	} else {
>> 
>> Hmmm...  We get here if the RCU read-side critical section was preempted.
>> If none of them are preempted, ->active_ctr remains zero.
>> 
>> > +		put_cpu_ptr(&prcu_local);
>> > +		if (!atomic_dec_return(&prcu->active_ctr))
>> > +			wake_up(&prcu->wait_q);
>> > +	}
>> > +}
>> > +EXPORT_SYMBOL(prcu_read_unlock);
>> > +
>> > +static void prcu_handler(void *info)
>> > +{
>> > +	struct prcu_local_struct *local;
>> > +
>> > +	local = this_cpu_ptr(&prcu_local);
>> > +	if (!local->locked)
>
>And I think a smp_mb() is needed here, because in the following case:
>
>    CPU 0                               CPU 1
>    ==================                  ==========================
>    {X is initially 0}
>
>    WRITE_ONCE(X, 1);
>
>    prcu_read_unlock(void):
>    if (locked) {
>                                        synchronize_prcu(void):
>                                          ...
>                                          <send IPI to CPU 0>
>    local->locked--;
>    # switch to IPI
>    WRITE_ONCE(local->version,....)
>                                        <read CPU 0 version to be latest>
>                                        <return>
>
>                                        r1 = READ_ONCE(X);
>
>r1 could be 0, which breaks RCU guarantees.
>

Thank you.
As far as I know, x86 guarantees that an interrupt is handled only after all previously issued write instructions have completed.
So the smp_mb() is unnecessary on x86.
But I am not sure whether other architectures provide this guarantee. If not, we do need an smp_mb() here.

>> > +		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
>> > +}
>> > +
>> > +void synchronize_prcu(void)
>> > +{
>> > +	int cpu;
>> > +	cpumask_t cpus;
>> > +	unsigned long long version;
>> > +	struct prcu_local_struct *local;
>> > +
>> > +	version = atomic64_add_return(1, &prcu->global_version);
>> > +	mutex_lock(&prcu->mtx);
>> > +
>> > +	local = get_cpu_ptr(&prcu_local);
>> > +	local->version = version;
>> > +	put_cpu_ptr(&prcu_local);
>> > +
>> > +	cpumask_clear(&cpus);
>> > +	for_each_possible_cpu(cpu) {
>> > +		local = per_cpu_ptr(&prcu_local, cpu);
>> > +		if (!READ_ONCE(local->online))
>> > +			continue;
>> > +		if (READ_ONCE(local->version) < version) {
>> 
>> On 32-bit systems, given that ->version is long long, you might see 
>> load tearing.  And on some 32-bit systems, the cmpxchg() in 
>> prcu_handler() might not build.
>> 
>
>/me curious about why an atomic64_t is used here for global version. I think maybe 32bit global version still suffices.
>
>Regards,
>Boqun

Because the synchronization latency is low, PRCU can have a higher grace-period (GP) frequency.
With a 32-bit counter, it would work correctly for only several years at 20+ GPs per second.
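
As a rough check of the estimate above, here is a small userspace sketch (not part of the patch; the helper name years_until_wrap() is made up for illustration) that computes how long an N-bit counter lasts at a given grace-period rate:

```c
/*
 * Hypothetical helper, for illustration only: number of years until a
 * counter of the given width wraps when it is incremented gps_per_sec
 * times per second.
 */
static double years_until_wrap(unsigned int bits, double gps_per_sec)
{
	const double seconds_per_year = 365.25 * 24.0 * 3600.0;
	double count = 1.0;
	unsigned int i;

	/* count = 2^bits, which a double represents exactly for these widths */
	for (i = 0; i < bits; i++)
		count *= 2.0;

	return count / gps_per_sec / seconds_per_year;
}
```

At 20 GPs per second, years_until_wrap(32, 20.0) comes out at a little under 7 years, which matches "several years". The same counter at 20,000 GPs per second would wrap in about two and a half days, so the lifetime depends heavily on the assumed GP rate.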

>
>> Or is the idea that only prcu_handler() updates ->version?  But in 
>> that case, you wouldn't need the READ_ONCE() above.  What am I missing here?
>> 
>> > +			smp_call_function_single(cpu, prcu_handler, NULL, 0);
>> > +			cpumask_set_cpu(cpu, &cpus);
>> > +		}
>> > +	}
>> > +
>> > +	for_each_cpu(cpu, &cpus) {
>> > +		local = per_cpu_ptr(&prcu_local, cpu);
>> > +		while (READ_ONCE(local->version) < version)
>> 
>> This ->version read can also tear on some 32-bit systems, and this one 
>> most definitely can race with the prcu_handler() above.  Does the 
>> algorithm operate correctly in that case?  (It doesn't look that way 
>> to me, but I might be missing something.) Or are 32-bit systems excluded?
>> 
>> > +			cpu_relax();
>> > +	}
>> 
>> I might be missing something, but I believe we need a memory barrier 
>> here on non-TSO systems.  Without that, couldn't we miss a preemption?
>> 
>> > +
>> > +	if (atomic_read(&prcu->active_ctr))
>> > +		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
>> > +
>> > +	mutex_unlock(&prcu->mtx);
>> > +}
>> > +EXPORT_SYMBOL(synchronize_prcu);
>> > +
>> > +void prcu_note_context_switch(void)
>> > +{
>> > +	struct prcu_local_struct *local;
>> > +
>> > +	local = get_cpu_ptr(&prcu_local);
>> > +	if (local->locked) {
>> > +		atomic_add(local->locked, &prcu->active_ctr);
>> > +		local->locked = 0;
>> > +	}
>> > +	local->online = 0;
>> > +	prcu_report(local);
>> > +	put_cpu_ptr(&prcu_local);
>> > +}
>> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> > index 326d4f88..a308581b 100644
>> > --- a/kernel/sched/core.c
>> > +++ b/kernel/sched/core.c
>> > @@ -15,6 +15,7 @@
>> >  #include <linux/init_task.h>
>> >  #include <linux/context_tracking.h>
>> >  #include <linux/rcupdate_wait.h>
>> > +#include <linux/prcu.h>
>> > 
>> >  #include <linux/blkdev.h>
>> >  #include <linux/kprobes.h>
>> > @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
>> > 
>> >  	local_irq_disable();
>> >  	rcu_note_context_switch(preempt);
>> > +	prcu_note_context_switch();
>> > 
>> >  	/*
>> >  	 * Make sure that signal_pending_state()->signal_pending() below
>> > --
>> > 2.14.1.729.g59c0ea183
>> > 
>> 
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-29  9:10   ` Lai Jiangshan
@ 2018-01-30  6:21     ` zhangheng (AC)
  0 siblings, 0 replies; 43+ messages in thread
From: zhangheng (AC) @ 2018-01-30  6:21 UTC (permalink / raw)
  To: Lai Jiangshan, lianglihao
  Cc: Paul E. McKenney, Guohanjun (Hanjun Guo),
	Chenhaibo (Haibo, OS Lab),
	lihao.liang, LKML

>-----Original Message-----
>From: jiangshanlai@gmail.com [mailto:jiangshanlai@gmail.com] On Behalf Of Lai Jiangshan
>Sent: January 29, 2018 17:11
>To: lianglihao@huawei.com
>Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>; Guohanjun (Hanjun Guo) <guohanjun@huawei.com>; zhangheng (AC) <heng.z@huawei.com>; Chenhaibo (Haibo, OS Lab) <hb.chen@huawei.com>; lihao.liang@gmail.com; LKML <linux-kernel@vger.kernel.org>
>Subject: Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
>
>On Tue, Jan 23, 2018 at 3:59 PM,  <lianglihao@huawei.com> wrote:
>> From: Heng Zhang <heng.z@huawei.com>
>>
>> This RCU implementation (PRCU) is based on a fast consensus protocol 
>> published in the following paper:
>>
>> Fast Consensus Using Bounded Staleness for Scalable Read-mostly Synchronization.
>> Haibo Chen, Heng Zhang, Ran Liu, Binyu Zang, and Haibing Guan.
>> IEEE Transactions on Parallel and Distributed Systems (TPDS), 2016.
>> https://dl.acm.org/citation.cfm?id=3024114.3024143
>>
>> Signed-off-by: Heng Zhang <heng.z@huawei.com>
>> Signed-off-by: Lihao Liang <lianglihao@huawei.com>
>> ---
>>  include/linux/prcu.h |  37 +++++++++++++++
>>  kernel/rcu/Makefile  |   2 +-
>>  kernel/rcu/prcu.c    | 125 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  kernel/sched/core.c  |   2 +
>>  4 files changed, 165 insertions(+), 1 deletion(-)
>>  create mode 100644 include/linux/prcu.h
>>  create mode 100644 kernel/rcu/prcu.c
>>
>> diff --git a/include/linux/prcu.h b/include/linux/prcu.h
>> new file mode 100644
>> index 00000000..653b4633
>> --- /dev/null
>> +++ b/include/linux/prcu.h
>> @@ -0,0 +1,37 @@
>> +#ifndef __LINUX_PRCU_H
>> +#define __LINUX_PRCU_H
>> +
>> +#include <linux/atomic.h>
>> +#include <linux/mutex.h>
>> +#include <linux/wait.h>
>> +
>> +#define CONFIG_PRCU
>> +
>> +struct prcu_local_struct {
>> +       unsigned int locked;
>> +       unsigned int online;
>> +       unsigned long long version;
>> +};
>> +
>> +struct prcu_struct {
>> +       atomic64_t global_version;
>> +       atomic_t active_ctr;
>> +       struct mutex mtx;
>> +       wait_queue_head_t wait_q;
>> +};
>> +
>> +#ifdef CONFIG_PRCU
>> +void prcu_read_lock(void);
>> +void prcu_read_unlock(void);
>> +void synchronize_prcu(void);
>> +void prcu_note_context_switch(void);
>> +
>> +#else /* #ifdef CONFIG_PRCU */
>> +
>> +#define prcu_read_lock() do {} while (0)
>> +#define prcu_read_unlock() do {} while (0)
>> +#define synchronize_prcu() do {} while (0)
>> +#define prcu_note_context_switch() do {} while (0)
>> +
>> +#endif /* #ifdef CONFIG_PRCU */
>> +#endif /* __LINUX_PRCU_H */
>> diff --git a/kernel/rcu/Makefile b/kernel/rcu/Makefile
>> index 23803c7d..8791419c 100644
>> --- a/kernel/rcu/Makefile
>> +++ b/kernel/rcu/Makefile
>> @@ -2,7 +2,7 @@
>>  # and is generally not a function of system call inputs.
>>  KCOV_INSTRUMENT := n
>>
>> -obj-y += update.o sync.o
>> +obj-y += update.o sync.o prcu.o
>>  obj-$(CONFIG_CLASSIC_SRCU) += srcu.o
>>  obj-$(CONFIG_TREE_SRCU) += srcutree.o
>>  obj-$(CONFIG_TINY_SRCU) += srcutiny.o
>> diff --git a/kernel/rcu/prcu.c b/kernel/rcu/prcu.c
>> new file mode 100644
>> index 00000000..a00b9420
>> --- /dev/null
>> +++ b/kernel/rcu/prcu.c
>> @@ -0,0 +1,125 @@
>> +#include <linux/smp.h>
>> +#include <linux/prcu.h>
>> +#include <linux/percpu.h>
>> +#include <linux/compiler.h>
>> +#include <linux/sched.h>
>> +
>> +#include <asm/barrier.h>
>> +
>> +DEFINE_PER_CPU_SHARED_ALIGNED(struct prcu_local_struct, prcu_local);
>> +
>> +struct prcu_struct global_prcu = {
>> +       .global_version = ATOMIC64_INIT(0),
>> +       .active_ctr = ATOMIC_INIT(0),
>> +       .mtx = __MUTEX_INITIALIZER(global_prcu.mtx),
>> +       .wait_q = __WAIT_QUEUE_HEAD_INITIALIZER(global_prcu.wait_q)
>> +};
>> +struct prcu_struct *prcu = &global_prcu;
>> +
>> +static inline void prcu_report(struct prcu_local_struct *local)
>> +{
>> +       unsigned long long global_version;
>> +       unsigned long long local_version;
>> +
>> +       global_version = atomic64_read(&prcu->global_version);
>> +       local_version = local->version;
>> +       if (global_version > local_version)
>> +               cmpxchg(&local->version, local_version, global_version);
>
>It is called with IRQs disabled, and local->version can't be modified on another CPU. Why is cmpxchg() needed?

No, in this implementation it is also called by prcu_read_unlock().

>> +}
>> +
>> +void prcu_read_lock(void)
>> +{
>> +       struct prcu_local_struct *local;
>> +
>> +       local = get_cpu_ptr(&prcu_local);
>> +       if (!local->online) {
>> +               WRITE_ONCE(local->online, 1);
>> +               smp_mb();
>
>What is the paired code?

It is paired with the mutex_lock() in synchronize_prcu().
It ensures that if the writer sees online as false, there is no online reader on that core.

>
>> +       }
>> +
>> +       local->locked++;
>> +       put_cpu_ptr(&prcu_local);
>> +}
>> +EXPORT_SYMBOL(prcu_read_lock);
>> +
>> +void prcu_read_unlock(void)
>> +{
>> +       int locked;
>> +       struct prcu_local_struct *local;
>> +
>> +       barrier();
>> +       local = get_cpu_ptr(&prcu_local);
>> +       locked = local->locked;
>> +       if (locked) {
>> +               local->locked--;
>> +               if (locked == 1)
>> +                       prcu_report(local);
>> +               put_cpu_ptr(&prcu_local);
>> +       } else {
>> +               put_cpu_ptr(&prcu_local);
>> +               if (!atomic_dec_return(&prcu->active_ctr))
>> +                       wake_up(&prcu->wait_q);
>> +       }
>> +}
>> +EXPORT_SYMBOL(prcu_read_unlock);
>> +
>> +static void prcu_handler(void *info)
>> +{
>> +       struct prcu_local_struct *local;
>> +
>> +       local = this_cpu_ptr(&prcu_local);
>> +       if (!local->locked)
>> +               WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
>> +}
>> +
>> +void synchronize_prcu(void)
>> +{
>> +       int cpu;
>> +       cpumask_t cpus;
>
>It might overflow the stack if the cpumask is large, please move it to struct prcu.

OK, thank you.

>> +       unsigned long long version;
>> +       struct prcu_local_struct *local;
>> +
>> +       version = atomic64_add_return(1, &prcu->global_version);
>
>I think this line of code at least causes the following problem.
>
>> +       mutex_lock(&prcu->mtx);
>> +
>> +       local = get_cpu_ptr(&prcu_local);
>> +       local->version = version;
>
>The order in which callers acquire mutex_lock() might not match the order of their atomic64_add_return() calls. In this case,
>local->version will decrease.

Yes, it should read global_version again after taking the mutex.
But I think the current code is still correct.
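
One possible way to keep local->version from moving backwards, sketched here with userspace C11 atomics rather than kernel primitives, is to only ever advance it. The helper name version_advance() is hypothetical, and this is not what the patch does; it only illustrates the "never decrease" property under discussion:

```c
#include <stdatomic.h>

/*
 * Hypothetical helper: raise *v to new_version, but never lower it.
 * A concurrent updater that already stored a larger value wins.
 */
static void version_advance(_Atomic unsigned long long *v,
			    unsigned long long new_version)
{
	unsigned long long old = atomic_load_explicit(v, memory_order_relaxed);

	/* On CAS failure, old is reloaded with the current value and rechecked. */
	while (old < new_version &&
	       !atomic_compare_exchange_weak_explicit(v, &old, new_version,
						      memory_order_release,
						      memory_order_relaxed))
		;
}
```

With this shape, a caller that obtained an older value from atomic64_add_return() but acquired the mutex later would leave a newer local->version untouched.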

>
>prcu_report() can also happen here now. It is unclear which update to local->version will succeed.
>
>> +       put_cpu_ptr(&prcu_local);
>> +
>> +       cpumask_clear(&cpus);
>> +       for_each_possible_cpu(cpu) {
>> +               local = per_cpu_ptr(&prcu_local, cpu);
>> +               if (!READ_ONCE(local->online))
>> +                       continue;
>
>It seems that reading local->online here is unreliable.

Any problem?

>
>> +               if (READ_ONCE(local->version) < version) {
>
>please handle the cases when version wraps around the maximum.

That's why I want to use 64-bit counters.
Resetting the counters requires a full-system synchronization, which involves a complex protocol.

>
>> +                       smp_call_function_single(cpu, prcu_handler, NULL, 0);
>
>It smells bad to call this inside a for_each_possible_cpu() loop.
>
>> +                       cpumask_set_cpu(cpu, &cpus);
>> +               }
>> +       }
>> +
>> +       for_each_cpu(cpu, &cpus) {
>> +               local = per_cpu_ptr(&prcu_local, cpu);
>> +               while (READ_ONCE(local->version) < version)
>> +                       cpu_relax();
>> +       }
>>
>
>Ouch, the cpu_relax() loop could take a long time,
>since it waits until all the relevant CPUs have scheduled.
>Relevant CPUs: CPUs with active PRCU readers.
>So this block of code is equivalent to synchronize_sched() in many cases when PRCU is heavily used, isn't it?

No, it waits for all online readers to unlock. I think read-side critical sections should not be long.
If PRCU does become heavily used (which I doubt), I plan to split the singleton prcu into multiple allocatable instances.
Because each instance sends IPIs only to its own online readers, I think that would work better.

>
>
>smp_mb() /* A paired with B */
>
>> +       if (atomic_read(&prcu->active_ctr))
>> +               wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
>> +
>> +       mutex_unlock(&prcu->mtx);
>> +}
>> +EXPORT_SYMBOL(synchronize_prcu);
>> +
>> +void prcu_note_context_switch(void)
>> +{
>> +       struct prcu_local_struct *local;
>> +
>> +       local = get_cpu_ptr(&prcu_local);
>> +       if (local->locked) {
>> +               atomic_add(local->locked, &prcu->active_ctr);
>
>smp_mb() /* B paired with A */

Thank you, I should think more carefully about non-TSO architectures.

>
>> +               local->locked = 0;
>> +       }
>> +       local->online = 0;
>> +       prcu_report(local);
>> +       put_cpu_ptr(&prcu_local);
>> +}
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 326d4f88..a308581b 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -15,6 +15,7 @@
>>  #include <linux/init_task.h>
>>  #include <linux/context_tracking.h>
>>  #include <linux/rcupdate_wait.h>
>> +#include <linux/prcu.h>
>>
>>  #include <linux/blkdev.h>
>>  #include <linux/kprobes.h>
>> @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
>>
>>         local_irq_disable();
>>         rcu_note_context_switch(preempt);
>> +       prcu_note_context_switch();
>>
>>         /*
>>          * Make sure that signal_pending_state()->signal_pending() below
>> --
>> 2.14.1.729.g59c0ea183
>>
>

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-30  5:34       ` zhangheng (AC)
@ 2018-01-30  6:40         ` Boqun Feng
  2018-01-30 10:42           ` zhangheng (AC)
  0 siblings, 1 reply; 43+ messages in thread
From: Boqun Feng @ 2018-01-30  6:40 UTC (permalink / raw)
  To: zhangheng (AC)
  Cc: Paul E. McKenney, lianglihao, Guohanjun (Hanjun Guo),
	Chenhaibo (Haibo, OS Lab),
	lihao.liang, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5504 bytes --]

On Tue, Jan 30, 2018 at 05:34:03AM +0000, zhangheng (AC) wrote:
[...]
> >> > +static void prcu_handler(void *info)
> >> > +{
> >> > +	struct prcu_local_struct *local;
> >> > +
> >> > +	local = this_cpu_ptr(&prcu_local);
> >> > +	if (!local->locked)
> >
> >And I think a smp_mb() is needed here, because in the following case:
> >
> >    CPU 0                               CPU 1
> >    ==================                  ==========================
> >    {X is initially 0}
> >
> >    WRITE_ONCE(X, 1);
> >
> >    prcu_read_unlock(void):
> >    if (locked) {
> >                                        synchronize_prcu(void):
> >                                          ...
> >                                          <send IPI to CPU 0>
> >    local->locked--;
> >    # switch to IPI
> >    WRITE_ONCE(local->version,....)
> >                                        <read CPU 0 version to be latest>
> >                                        <return>
> >
> >                                        r1 = READ_ONCE(X);
> >
> >r1 could be 0, which breaks RCU guarantees.
> >
> 
> Thank you.
> As I know,
> it guarantees that the interrupt to be handled after all write instructions issued before have complete in x86 arch.
> So the smp_mb is meaningless in x86 arch.

Sure. x86 is TSO, and we are talking about reordering of two stores
here, and that can not happen on TSO.

> But I am not sure whether other archs guarantee this feature. If not, we do need a smp_mb here.
> 

I think most weak memory models don't have this guarantee, so you
need an smp_mb() or smp_store_release().
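
The store ordering discussed above can be sketched in userspace with C11 atomics, which are only an analogy for the kernel's smp_store_release() and smp_load_acquire(). This is an illustrative sketch, not the patch's code; x, local_version, reader_report() and updater_sees_report() are names assumed for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int x;	/* plain data written inside the read-side critical section */
static _Atomic unsigned long long local_version;

/*
 * Reader side (CPU 0 in the diagram): the release store keeps the
 * earlier write to x from being reordered after the version update.
 */
static void reader_report(unsigned long long global_version)
{
	x = 1;
	atomic_store_explicit(&local_version, global_version,
			      memory_order_release);
}

/*
 * Updater side (CPU 1): x is read only after the acquire load has
 * observed the reported version, so the read is guaranteed to see x == 1.
 */
static bool updater_sees_report(unsigned long long version, int *r1)
{
	if (atomic_load_explicit(&local_version, memory_order_acquire) < version)
		return false;
	*r1 = x;
	return true;
}
```

With a plain or relaxed store in reader_report(), a weakly ordered CPU could make the version update visible before the write to x, which is the r1 == 0 outcome in the diagram.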

> >> > +		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
> >> > +}
> >> > +
> >> > +void synchronize_prcu(void)
> >> > +{
> >> > +	int cpu;
> >> > +	cpumask_t cpus;
> >> > +	unsigned long long version;
> >> > +	struct prcu_local_struct *local;
> >> > +
> >> > +	version = atomic64_add_return(1, &prcu->global_version);
> >> > +	mutex_lock(&prcu->mtx);
> >> > +
> >> > +	local = get_cpu_ptr(&prcu_local);
> >> > +	local->version = version;
> >> > +	put_cpu_ptr(&prcu_local);
> >> > +
> >> > +	cpumask_clear(&cpus);
> >> > +	for_each_possible_cpu(cpu) {
> >> > +		local = per_cpu_ptr(&prcu_local, cpu);
> >> > +		if (!READ_ONCE(local->online))
> >> > +			continue;
> >> > +		if (READ_ONCE(local->version) < version) {
> >> 
> >> On 32-bit systems, given that ->version is long long, you might see 
> >> load tearing.  And on some 32-bit systems, the cmpxchg() in 
> >> prcu_handler() might not build.
> >> 
> >
> >/me curious about why an atomic64_t is used here for global version. I think maybe 32bit global version still suffices.
> >
> >Regards,
> >Boqun
> 
> Because the synchronization latency is low, it can have higher gp frequency.
> It seems that 32bit can only correctly work for several years if there are 20+ gps per second.
> 

Because PRCU doesn't handle gp number overflow? May I ask why this is
difficult? Currently RCU could tolerate counter wrap for grace period:

	https://lwn.net/Articles/652677/ (Details in "Parallelism facts of life")

Is there any subtle difference I'm missing?

Regards,
Boqun

> >
> >> Or is the idea that only prcu_handler() updates ->version?  But in 
> >> that case, you wouldn't need the READ_ONCE() above.  What am I missing here?
> >> 
> >> > +			smp_call_function_single(cpu, prcu_handler, NULL, 0);
> >> > +			cpumask_set_cpu(cpu, &cpus);
> >> > +		}
> >> > +	}
> >> > +
> >> > +	for_each_cpu(cpu, &cpus) {
> >> > +		local = per_cpu_ptr(&prcu_local, cpu);
> >> > +		while (READ_ONCE(local->version) < version)
> >> 
> >> This ->version read can also tear on some 32-bit systems, and this one 
> >> most definitely can race with the prcu_handler() above.  Does the 
> >> algorithm operate correctly in that case?  (It doesn't look that way 
> >> to me, but I might be missing something.) Or are 32-bit systems excluded?
> >> 
> >> > +			cpu_relax();
> >> > +	}
> >> 
> >> I might be missing something, but I believe we need a memory barrier 
> >> here on non-TSO systems.  Without that, couldn't we miss a preemption?
> >> 
> >> > +
> >> > +	if (atomic_read(&prcu->active_ctr))
> >> > +		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
> >> > +
> >> > +	mutex_unlock(&prcu->mtx);
> >> > +}
> >> > +EXPORT_SYMBOL(synchronize_prcu);
> >> > +
> >> > +void prcu_note_context_switch(void)
> >> > +{
> >> > +	struct prcu_local_struct *local;
> >> > +
> >> > +	local = get_cpu_ptr(&prcu_local);
> >> > +	if (local->locked) {
> >> > +		atomic_add(local->locked, &prcu->active_ctr);
> >> > +		local->locked = 0;
> >> > +	}
> >> > +	local->online = 0;
> >> > +	prcu_report(local);
> >> > +	put_cpu_ptr(&prcu_local);
> >> > +}
> >> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> > index 326d4f88..a308581b 100644
> >> > --- a/kernel/sched/core.c
> >> > +++ b/kernel/sched/core.c
> >> > @@ -15,6 +15,7 @@
> >> >  #include <linux/init_task.h>
> >> >  #include <linux/context_tracking.h>
> >> >  #include <linux/rcupdate_wait.h>
> >> > +#include <linux/prcu.h>
> >> > 
> >> >  #include <linux/blkdev.h>
> >> >  #include <linux/kprobes.h>
> >> > @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
> >> > 
> >> >  	local_irq_disable();
> >> >  	rcu_note_context_switch(preempt);
> >> > +	prcu_note_context_switch();
> >> > 
> >> >  	/*
> >> >  	 * Make sure that signal_pending_state()->signal_pending() below
> >> > --
> >> > 2.14.1.729.g59c0ea183
> >> > 
> >> 
> >

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* RE: [PATCH RFC 01/16] prcu: Add PRCU implementation
  2018-01-30  6:40         ` Boqun Feng
@ 2018-01-30 10:42           ` zhangheng (AC)
  0 siblings, 0 replies; 43+ messages in thread
From: zhangheng (AC) @ 2018-01-30 10:42 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Paul E. McKenney, lianglihao, Guohanjun (Hanjun Guo),
	Chenhaibo (Haibo, OS Lab),
	lihao.liang, linux-kernel

-----Original Message-----
>From: Boqun Feng [mailto:boqun.feng@gmail.com] 
>Sent: January 30, 2018 14:41
>To: zhangheng (AC) <heng.z@huawei.com>
>Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>; lianglihao@huawei.com; Guohanjun (Hanjun Guo) <guohanjun@huawei.com>; Chenhaibo (Haibo, OS Lab) <hb.chen@huawei.com>; lihao.liang@gmail.com; linux-kernel@vger.kernel.org
>Subject: Re: [PATCH RFC 01/16] prcu: Add PRCU implementation
>
>On Tue, Jan 30, 2018 at 05:34:03AM +0000, zhangheng (AC) wrote:
>[...]
>> >> > +static void prcu_handler(void *info)
>> >> > +{
>> >> > +	struct prcu_local_struct *local;
>> >> > +
>> >> > +	local = this_cpu_ptr(&prcu_local);
>> >> > +	if (!local->locked)
>> >
>> >And I think a smp_mb() is needed here, because in the following case:
>> >
>> >    CPU 0                               CPU 1
>> >    ==================                  ==========================
>> >    {X is initially 0}
>> >
>> >    WRITE_ONCE(X, 1);
>> >
>> >    prcu_read_unlock(void):
>> >    if (locked) {
>> >                                        synchronize_prcu(void):
>> >                                          ...
>> >                                          <send IPI to CPU 0>
>> >    local->locked--;
>> >    # switch to IPI
>> >    WRITE_ONCE(local->version,....)
>> >                                        <read CPU 0 version to be latest>
>> >                                        <return>
>> >
>> >                                        r1 = READ_ONCE(X);
>> >
>> >r1 could be 0, which breaks RCU guarantees.
>> >
>> 
>> Thank you.
>> As I know,
>> it guarantees that the interrupt to be handled after all write instructions issued before have complete in x86 arch.
>> So the smp_mb is meaningless in x86 arch.
>
>Sure. x86 is TSO, and we are talking about reordering of two stores here, and that can not happen on TSO.
>
>> But I am not sure whether other archs guarantee this feature. If not, we do need a smp_mb here.
>> 
>
>I think most weak memory models don't have this guarantee, so you need an smp_mb() or smp_store_release().

Agreed.

>
>> >> > +		WRITE_ONCE(local->version, atomic64_read(&prcu->global_version));
>> >> > +}
>> >> > +
>> >> > +void synchronize_prcu(void)
>> >> > +{
>> >> > +	int cpu;
>> >> > +	cpumask_t cpus;
>> >> > +	unsigned long long version;
>> >> > +	struct prcu_local_struct *local;
>> >> > +
>> >> > +	version = atomic64_add_return(1, &prcu->global_version);
>> >> > +	mutex_lock(&prcu->mtx);
>> >> > +
>> >> > +	local = get_cpu_ptr(&prcu_local);
>> >> > +	local->version = version;
>> >> > +	put_cpu_ptr(&prcu_local);
>> >> > +
>> >> > +	cpumask_clear(&cpus);
>> >> > +	for_each_possible_cpu(cpu) {
>> >> > +		local = per_cpu_ptr(&prcu_local, cpu);
>> >> > +		if (!READ_ONCE(local->online))
>> >> > +			continue;
>> >> > +		if (READ_ONCE(local->version) < version) {
>> >> 
>> >> On 32-bit systems, given that ->version is long long, you might see 
>> >> load tearing.  And on some 32-bit systems, the cmpxchg() in
>> >> prcu_handler() might not build.
>> >> 
>> >
>> >/me curious about why an atomic64_t is used here for global version. I think maybe 32bit global version still suffices.
>> >
>> >Regards,
>> >Boqun
>> 
>> Because the synchronization latency is low, it can have higher gp frequency.
>> It seems that 32bit can only correctly work for several years if there are 20+ gps per second.
>> 
>
>Because PRCU doesn't handle gp number overflow? May I ask why this is difficult? Currently RCU could tolerate counter wrap for grace period:
>
>	https://lwn.net/Articles/652677/ (Details in "Parallelism facts of life")
>
>Is there any subtle difference I'm missing?
>
>Regards,
>Boqun
>

Yes, you are right. PRCU currently has no solution for overflow, so it needs a 64-bit counter.
Providing one is not that difficult; I just didn't consider it when I chose the 64-bit counter.
Since a 64-bit counter isn't friendly to 32-bit systems, I agree that a 32-bit counter plus overflow handling is necessary.
Thank you.
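
As a rough sketch of what "32-bit counter plus overflow handling" could look like, the comparison below tolerates wraparound in the same spirit as the kernel's ULONG_CMP_LT() macro. The helper names version_before() and seconds_to_wrap() are hypothetical and not part of the patch, and the comparison assumes the two versions are never more than 2^31 apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if version a is older than version b, even when the counter
 * has wrapped in between (hypothetical helper, not from the patch).
 * Unsigned subtraction wraps modulo 2^32, so a "negative" distance
 * shows up as a value in the upper half of the range.
 */
static bool version_before(uint32_t a, uint32_t b)
{
	return (uint32_t)(a - b) > UINT32_MAX / 2;
}

/* Seconds until a 32-bit counter wraps at a given grace-period rate. */
static uint64_t seconds_to_wrap(uint32_t gps_per_second)
{
	return (UINT64_C(1) << 32) / gps_per_second;
}
```

For example, version_before(0xFFFFFFF0, 5) is true even though 0xFFFFFFF0 is numerically larger, and seconds_to_wrap(20) gives 214748364 seconds, roughly 6.8 years, which matches the "several years" estimate earlier in this thread.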

>> >
>> >> Or is the idea that only prcu_handler() updates ->version?  But in 
>> >> that case, you wouldn't need the READ_ONCE() above.  What am I missing here?
>> >> 
>> >> > +			smp_call_function_single(cpu, prcu_handler, NULL, 0);
>> >> > +			cpumask_set_cpu(cpu, &cpus);
>> >> > +		}
>> >> > +	}
>> >> > +
>> >> > +	for_each_cpu(cpu, &cpus) {
>> >> > +		local = per_cpu_ptr(&prcu_local, cpu);
>> >> > +		while (READ_ONCE(local->version) < version)
>> >> 
>> >> This ->version read can also tear on some 32-bit systems, and this 
>> >> one most definitely can race with the prcu_handler() above.  Does 
>> >> the algorithm operate correctly in that case?  (It doesn't look 
>> >> that way to me, but I might be missing something.) Or are 32-bit systems excluded?
>> >> 
>> >> > +			cpu_relax();
>> >> > +	}
>> >> 
>> >> I might be missing something, but I believe we need a memory 
>> >> barrier here on non-TSO systems.  Without that, couldn't we miss a preemption?
>> >> 
>> >> > +
>> >> > +	if (atomic_read(&prcu->active_ctr))
>> >> > +		wait_event(prcu->wait_q, !atomic_read(&prcu->active_ctr));
>> >> > +
>> >> > +	mutex_unlock(&prcu->mtx);
>> >> > +}
>> >> > +EXPORT_SYMBOL(synchronize_prcu);
>> >> > +
>> >> > +void prcu_note_context_switch(void) {
>> >> > +	struct prcu_local_struct *local;
>> >> > +
>> >> > +	local = get_cpu_ptr(&prcu_local);
>> >> > +	if (local->locked) {
>> >> > +		atomic_add(local->locked, &prcu->active_ctr);
>> >> > +		local->locked = 0;
>> >> > +	}
>> >> > +	local->online = 0;
>> >> > +	prcu_report(local);
>> >> > +	put_cpu_ptr(&prcu_local);
>> >> > +}
>> >> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> >> > index 326d4f88..a308581b 100644
>> >> > --- a/kernel/sched/core.c
>> >> > +++ b/kernel/sched/core.c
>> >> > @@ -15,6 +15,7 @@
>> >> >  #include <linux/init_task.h>
>> >> >  #include <linux/context_tracking.h>
>> >> >  #include <linux/rcupdate_wait.h>
>> >> > +#include <linux/prcu.h>
>> >> > 
>> >> >  #include <linux/blkdev.h>
>> >> >  #include <linux/kprobes.h>
>> >> > @@ -3383,6 +3384,7 @@ static void __sched notrace __schedule(bool preempt)
>> >> > 
>> >> >  	local_irq_disable();
>> >> >  	rcu_note_context_switch(preempt);
>> >> > +	prcu_note_context_switch();
>> >> > 
>> >> >  	/*
>> >> >  	 * Make sure that signal_pending_state()->signal_pending() below
>> >> > --
>> >> > 2.14.1.729.g59c0ea183
>> >> > 
>> >> 
>> >
>

Thread overview: 43+ messages
2018-01-23  7:59 [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol lianglihao
2018-01-23  7:59 ` [PATCH RFC 01/16] prcu: Add PRCU implementation lianglihao
2018-01-24 11:26   ` Peter Zijlstra
2018-01-24 17:15     ` Lihao Liang
2018-01-24 20:19       ` Peter Zijlstra
2018-01-25  6:16   ` Paul E. McKenney
2018-01-25  7:30     ` Boqun Feng
2018-01-30  5:34       ` zhangheng (AC)
2018-01-30  6:40         ` Boqun Feng
2018-01-30 10:42           ` zhangheng (AC)
2018-01-27  7:35     ` Lihao Liang
2018-01-30  3:58     ` zhangheng (AC)
2018-01-29  9:10   ` Lai Jiangshan
2018-01-30  6:21     ` zhangheng (AC)
2018-01-23  7:59 ` [PATCH RFC 02/16] rcutorture: Add PRCU rcu_torture_ops lianglihao
2018-01-23  7:59 ` [PATCH RFC 03/16] rcutorture: Add PRCU test config files lianglihao
2018-01-25  6:27   ` Paul E. McKenney
2018-01-23  7:59 ` [PATCH RFC 04/16] rcuperf: Add PRCU rcu_perf_ops lianglihao
2018-01-23  7:59 ` [PATCH RFC 05/16] rcuperf: Add PRCU test config files lianglihao
2018-01-23  7:59 ` [PATCH RFC 06/16] rcuperf: Set gp_exp to true for tests to run lianglihao
2018-01-25  6:18   ` Paul E. McKenney
2018-01-26  8:33     ` Lihao Liang
2018-01-23  7:59 ` [PATCH RFC 07/16] prcu: Implement call_prcu() API lianglihao
2018-01-25  6:20   ` Paul E. McKenney
2018-01-26  8:44     ` Lihao Liang
2018-01-26 22:22       ` Paul E. McKenney
2018-01-23  7:59 ` [PATCH RFC 08/16] prcu: Implement PRCU callback processing lianglihao
2018-01-23  7:59 ` [PATCH RFC 09/16] prcu: Implement prcu_barrier() API lianglihao
2018-01-25  6:24   ` Paul E. McKenney
2018-01-23  7:59 ` [PATCH RFC 10/16] rcutorture: Test call_prcu() and prcu_barrier() lianglihao
2018-01-23  7:59 ` [PATCH RFC 11/16] rcutorture: Add basic ARM64 support to run scripts lianglihao
2018-01-23  7:59 ` [PATCH RFC 12/16] prcu: Add PRCU Kconfig parameter lianglihao
2018-01-23  7:59 ` [PATCH RFC 13/16] prcu: Comment source code lianglihao
2018-01-23  7:59 ` [PATCH RFC 14/16] rcuperf: Add config files with various CONFIG_NR_CPUS lianglihao
2018-01-23  7:59 ` [PATCH RFC 15/16] rcutorture: Add scripts to run experiments lianglihao
2018-01-25  6:28   ` Paul E. McKenney
2018-01-23  7:59 ` [PATCH RFC 16/16] Add GPLv2 license lianglihao
2018-01-25  5:53 ` [PATCH RFC 00/16] A new RCU implementation based on a fast consensus protocol Paul E. McKenney
2018-01-27  7:22   ` Lihao Liang
2018-01-27  7:57     ` Paul E. McKenney
2018-01-27  9:57       ` Lihao Liang
2018-01-27 23:46         ` Paul E. McKenney
2018-01-27 23:41       ` Paul E. McKenney
