linux-kernel.vger.kernel.org archive mirror
* [PATCH tip/core/rcu 0/11] Fixes for 3.13
@ 2013-09-25  1:27 Paul E. McKenney
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
  0 siblings, 1 reply; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw

Hello!

This series provides the following miscellaneous fixes:

1.	Place a preemption point in do_mlockall().

2.	Use proper cpp macro for ->gp_flags instead of the constant "1".

3.	Convert a number of local functions to static.

4.	Fix a dubious "if" condition to use "||" rather than "|"
	(we were getting lucky...).

5.	Make list_splice_init_rcu() account for RCU readers.

6.	Replace __get_cpu_var() uses, courtesy of Christoph Lameter.

7.	Silence an unused-variables warning in rcu_eqs_enter_common()
	and rcu_eqs_exit_common().

8.	Micro-optimize rcu_cpu_has_callbacks().

9.	Reject memory-order-induced stall-warning false positives.

10.	Apply tracepoint_string() to rcutiny's trace events.

11.	Avoid a CONFIG_RCU_NOCB_CPU_ALL=y panic on systems with sparse
	CPU numbering, courtesy of Kirill Tkhai.

							Thanx, Paul


 b/include/linux/rculist.h |   23 +++++++++-
 b/kernel/rcu.h            |    7 +++
 b/kernel/rcupdate.c       |    2 
 b/kernel/rcutiny.c        |   17 ++++----
 b/kernel/rcutree.c        |   97 ++++++++++++++++++++++++++++++----------------
 b/kernel/rcutree_plugin.h |   23 ++++++----
 b/mm/mlock.c              |    1 
 7 files changed, 119 insertions(+), 51 deletions(-)


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop
  2013-09-25  1:27 [PATCH tip/core/rcu 0/11] Fixes for 3.13 Paul E. McKenney
@ 2013-09-25  1:29 ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 02/11] rcu: Use proper cpp macro for ->gp_flags Paul E. McKenney
                     ` (10 more replies)
  0 siblings, 11 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney, KOSAKI Motohiro, Michel Lespinasse,
	Linus Torvalds

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

There is a loop in do_mlockall() that lacks a preemption point, which
means that the following can happen on non-preemptible builds of the
kernel:

> My fuzz tester keeps hitting this. Every instance shows the non-irq stack
> came in from mlockall.  I'm only seeing this on one box, but that has more
> ram (8gb) than my other machines, which might explain it.
>
> 	Dave
>
> INFO: rcu_preempt self-detected stall on CPU { 3}  (t=6500 jiffies g=470344 c=470343 q=0)
> sending NMI to all CPUs:
> NMI backtrace for cpu 3
> CPU: 3 PID: 29664 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #32
> task: ffff88023e743fc0 ti: ffff88022f6f2000 task.ti: ffff88022f6f2000
> RIP: 0010:[<ffffffff810bf7d1>]  [<ffffffff810bf7d1>] trace_hardirqs_off_caller+0x21/0xb0
> RSP: 0018:ffff880244e03c30  EFLAGS: 00000046
> RAX: ffff88023e743fc0 RBX: 0000000000000001 RCX: 000000000000003c
> RDX: 000000000000000f RSI: 0000000000000004 RDI: ffffffff81033cab
> RBP: ffff880244e03c38 R08: ffff880243288a80 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000001 R12: ffff880243288a80
> R13: ffff8802437eda40 R14: 0000000000080000 R15: 000000000000d010
> FS:  00007f50ae33b740(0000) GS:ffff880244e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000000097f000 CR3: 0000000240fa0000 CR4: 00000000001407e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
> Stack:
>  ffffffff810bf86d ffff880244e03c98 ffffffff81033cab 0000000000000096
>  000000000000d008 0000000300000002 0000000000000004 0000000000000003
>  0000000000002710 ffffffff81c50d00 ffffffff81c50d00 ffff880244fcde00
> Call Trace:
>  <IRQ>
>  [<ffffffff810bf86d>] ? trace_hardirqs_off+0xd/0x10
>  [<ffffffff81033cab>] __x2apic_send_IPI_mask+0x1ab/0x1c0
>  [<ffffffff81033cdc>] x2apic_send_IPI_all+0x1c/0x20
>  [<ffffffff81030115>] arch_trigger_all_cpu_backtrace+0x65/0xa0
>  [<ffffffff811144b1>] rcu_check_callbacks+0x331/0x8e0
>  [<ffffffff8108bfa0>] ? hrtimer_run_queues+0x20/0x180
>  [<ffffffff8109e905>] ? sched_clock_cpu+0xb5/0x100
>  [<ffffffff81069557>] update_process_times+0x47/0x80
>  [<ffffffff810bd115>] tick_sched_handle.isra.16+0x25/0x60
>  [<ffffffff810bd231>] tick_sched_timer+0x41/0x60
>  [<ffffffff8108ace1>] __run_hrtimer+0x81/0x4e0
>  [<ffffffff810bd1f0>] ? tick_sched_do_timer+0x60/0x60
>  [<ffffffff8108b93f>] hrtimer_interrupt+0xff/0x240
>  [<ffffffff8102de84>] local_apic_timer_interrupt+0x34/0x60
>  [<ffffffff81718c5f>] smp_apic_timer_interrupt+0x3f/0x60
>  [<ffffffff817178ef>] apic_timer_interrupt+0x6f/0x80
>  [<ffffffff8170e8e0>] ? retint_restore_args+0xe/0xe
>  [<ffffffff8105f101>] ? __do_softirq+0xb1/0x440
>  [<ffffffff8105f64d>] irq_exit+0xcd/0xe0
>  [<ffffffff81718c65>] smp_apic_timer_interrupt+0x45/0x60
>  [<ffffffff817178ef>] apic_timer_interrupt+0x6f/0x80
>  <EOI>
>  [<ffffffff8170e8e0>] ? retint_restore_args+0xe/0xe
>  [<ffffffff8170b830>] ? wait_for_completion_killable+0x170/0x170
>  [<ffffffff8170c853>] ? preempt_schedule_irq+0x53/0x90
>  [<ffffffff8170e9f6>] retint_kernel+0x26/0x30
>  [<ffffffff8107a523>] ? queue_work_on+0x43/0x90
>  [<ffffffff8107c369>] schedule_on_each_cpu+0xc9/0x1a0
>  [<ffffffff81167770>] ? lru_add_drain+0x50/0x50
>  [<ffffffff811677c5>] lru_add_drain_all+0x15/0x20
>  [<ffffffff81186965>] SyS_mlockall+0xa5/0x1a0
>  [<ffffffff81716e94>] tracesys+0xdd/0xe2

This commit addresses this problem by inserting the required preemption
point.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 mm/mlock.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/mlock.c b/mm/mlock.c
index d638026..67ba6da 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -736,6 +736,7 @@ static int do_mlockall(int flags)
 
 		/* Ignore errors */
 		mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
+		cond_resched();
 	}
 out:
 	return 0;
-- 
1.8.1.5



* [PATCH tip/core/rcu 02/11] rcu: Use proper cpp macro for ->gp_flags
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 03/11] rcu: Convert local functions to static Paul E. McKenney
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

One of the ->gp_flags assignments used a raw number rather than the
cpp macro that was intended for this purpose, which this commit fixes.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 32618b3..e0fa192 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1452,7 +1452,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 	rdp = this_cpu_ptr(rsp->rda);
 	rcu_advance_cbs(rsp, rnp, rdp);  /* Reduce false positives below. */
 	if (cpu_needs_another_gp(rsp, rdp))
-		rsp->gp_flags = 1;
+		rsp->gp_flags = RCU_GP_FLAG_INIT;
 	raw_spin_unlock_irq(&rnp->lock);
 }
 
-- 
1.8.1.5



* [PATCH tip/core/rcu 03/11] rcu: Convert local functions to static
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 02/11] rcu: Use proper cpp macro for ->gp_flags Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 04/11] rcu: Fix dubious "if" condition in __call_rcu_nocb_enqueue() Paul E. McKenney
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The rcu_cpu_stall_timeout kernel parameter, the rcu_dynticks per-CPU
variable, and the rcu_gp_fqs() function are used only locally.  This
commit therefore marks them as static.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcupdate.c | 2 +-
 kernel/rcutree.c  | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index b02a339..3260a10 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -298,7 +298,7 @@ EXPORT_SYMBOL_GPL(do_trace_rcu_torture_read);
 #endif
 
 int rcu_cpu_stall_suppress __read_mostly; /* 1 = suppress stall warnings. */
-int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
+static int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
 
 module_param(rcu_cpu_stall_suppress, int, 0644);
 module_param(rcu_cpu_stall_timeout, int, 0644);
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e0fa192..2712b89 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -222,7 +222,7 @@ void rcu_note_context_switch(int cpu)
 }
 EXPORT_SYMBOL_GPL(rcu_note_context_switch);
 
-DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
+static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
 	.dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
 	.dynticks = ATOMIC_INIT(1),
 #ifdef CONFIG_NO_HZ_FULL_SYSIDLE
@@ -1366,7 +1366,7 @@ static int rcu_gp_init(struct rcu_state *rsp)
 /*
  * Do one round of quiescent-state forcing.
  */
-int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
+static int rcu_gp_fqs(struct rcu_state *rsp, int fqs_state_in)
 {
 	int fqs_state = fqs_state_in;
 	bool isidle = false;
-- 
1.8.1.5



* [PATCH tip/core/rcu 04/11] rcu: Fix dubious "if" condition in __call_rcu_nocb_enqueue()
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 02/11] rcu: Use proper cpp macro for ->gp_flags Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 03/11] rcu: Convert local functions to static Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 05/11] rcu: Make list_splice_init_rcu() account for RCU readers Paul E. McKenney
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This commit replaces an incorrect (but fortunately functional)
bitwise OR ("|") operator with the correct logical OR ("||").
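
As a rough userspace illustration of why the old code was functional
(every name below is invented for this sketch and is not a kernel API):
for integer operands, "a | b" is nonzero exactly when "a || b" is true,
so both forms agree on the result.  The difference is that only "||"
short-circuits, which the counter below makes visible.

```c
#include <stddef.h>

/* Hypothetical counter: how often the right-hand operand was evaluated. */
static int rhs_evals;

/* Stand-in for "!t": reports whether the kthread pointer is NULL. */
static int no_kthread(const void *t)
{
	rhs_evals++;
	return t == NULL;
}

/* Old form: bitwise OR always evaluates both operands. */
static int bail_out_bitwise(int poll, const void *t)
{
	return (poll | no_kthread(t)) != 0;
}

/* New form: logical OR skips the right operand when poll is nonzero. */
static int bail_out_logical(int poll, const void *t)
{
	return poll || no_kthread(t);
}

/* Helpers reporting how many right-hand evaluations each form needed. */
static int rhs_evals_bitwise(int poll, const void *t)
{
	rhs_evals = 0;
	(void)bail_out_bitwise(poll, t);
	return rhs_evals;
}

static int rhs_evals_logical(int poll, const void *t)
{
	rhs_evals = 0;
	(void)bail_out_logical(poll, t);
	return rhs_evals;
}
```

Both forms return the same truth value for every input; with poll set,
the bitwise form still evaluates the right-hand side once, whereas the
logical form does not evaluate it at all.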

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_plugin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 130c97b..6f9aece 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -2108,7 +2108,7 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp,
 
 	/* If we are not being polled and there is a kthread, awaken it ... */
 	t = ACCESS_ONCE(rdp->nocb_kthread);
-	if (rcu_nocb_poll | !t)
+	if (rcu_nocb_poll || !t)
 		return;
 	len = atomic_long_read(&rdp->nocb_q_count);
 	if (old_rhpp == &rdp->nocb_head) {
-- 
1.8.1.5



* [PATCH tip/core/rcu 05/11] rcu: Make list_splice_init_rcu() account for RCU readers
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
                     ` (2 preceding siblings ...)
  2013-09-25  1:29   ` [PATCH tip/core/rcu 04/11] rcu: Fix dubious "if" condition in __call_rcu_nocb_enqueue() Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 06/11] rcu: Replace __get_cpu_var() uses Paul E. McKenney
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The list_splice_init_rcu() function allows a list visible to RCU readers
to be spliced into another list visible to RCU readers.  This is OK,
except for the use of INIT_LIST_HEAD(), which does pointer updates
without doing anything to make those updates safe for concurrent readers.

Of course, most of the time INIT_LIST_HEAD() is being used in reader-free
contexts, such as initialization or cleanup, so it is OK for it to update
pointers in an unsafe-for-RCU-readers manner.  This commit therefore
creates an INIT_LIST_HEAD_RCU() that uses ACCESS_ONCE() to make the updates
reader-safe.  The reason that we can use ACCESS_ONCE() instead of the more
typical rcu_assign_pointer() is that list_splice_init_rcu() is updating the
pointers to reference something that is already visible to readers, so
that there is no problem with pre-initialized values.
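
For readers without a kernel tree at hand, the idea can be sketched in
userspace roughly as follows.  The ACCESS_ONCE() stand-in here is a
plain volatile access built on the GNU __typeof__ extension, and
reinit_demo() is a hypothetical helper that exists only for this
illustration.

```c
/* Userspace stand-in for the kernel's ACCESS_ONCE(): a volatile access. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct list_head {
	struct list_head *next, *prev;
};

/* Re-initialize a list head using volatile stores, one per pointer. */
static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
{
	ACCESS_ONCE(list->next) = list;
	ACCESS_ONCE(list->prev) = list;
}

static inline int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical demo: a head still pointing at an element is reset in place. */
static int reinit_demo(void)
{
	struct list_head head, elem;

	head.next = head.prev = &elem;
	elem.next = elem.prev = &head;
	INIT_LIST_HEAD_RCU(&head);
	return list_empty(&head) && head.prev == &head;
}
```

The volatile accesses keep the compiler from tearing or otherwise
rearranging each store into something a concurrent reader could observe
half-done.  They do not order the two stores against other CPUs, which
per the reasoning above is not needed here.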

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 include/linux/rculist.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 4106721..45a0a9e 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -19,6 +19,21 @@
  */
 
 /*
+ * INIT_LIST_HEAD_RCU - Initialize a list_head visible to RCU readers
+ * @list: list to be initialized
+ *
+ * You should instead use INIT_LIST_HEAD() for normal initialization and
+ * cleanup tasks, when readers have no access to the list being initialized.
+ * However, if the list being initialized is visible to readers, you
+ * need to keep the compiler from being too mischievous.
+ */
+static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
+{
+	ACCESS_ONCE(list->next) = list;
+	ACCESS_ONCE(list->prev) = list;
+}
+
+/*
  * return the ->next pointer of a list_head in an rcu safe
  * way, we must not access it directly
  */
@@ -191,9 +206,13 @@ static inline void list_splice_init_rcu(struct list_head *list,
 	if (list_empty(list))
 		return;
 
-	/* "first" and "last" tracking list, so initialize it. */
+	/*
+	 * "first" and "last" tracking list, so initialize it.  RCU readers
+	 * have access to this list, so we must use INIT_LIST_HEAD_RCU()
+	 * instead of INIT_LIST_HEAD().
+	 */
 
-	INIT_LIST_HEAD(list);
+	INIT_LIST_HEAD_RCU(list);
 
 	/*
 	 * At this point, the list body still points to the source list.
-- 
1.8.1.5



* [PATCH tip/core/rcu 06/11] rcu: Replace __get_cpu_var() uses
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
                     ` (3 preceding siblings ...)
  2013-09-25  1:29   ` [PATCH tip/core/rcu 05/11] rcu: Make list_splice_init_rcu() account for RCU readers Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 07/11] rcu: Silence unused-variable warnings Paul E. McKenney
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Christoph Lameter, Paul E. McKenney

From: Christoph Lameter <cl@linux.com>

__get_cpu_var() is used for multiple purposes in the kernel source. One
of them is address calculation via the form &__get_cpu_var(x). This
calculates the address for the instance of the percpu variable of the
current processor based on an offset.

Other use cases are for storing and retrieving data from the current
processor's percpu area.  __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.

__get_cpu_var() is defined as:

	#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However,
store and retrieve operations could use a segment prefix (or global
register on other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into
a percpu area and use optimized assembly code to read and write per
cpu variables.

This patch converts __get_cpu_var() into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations
that use the offset.  Address calculations are thereby avoided and fewer
registers are used when code is generated.

At the end of the patchset all uses of __get_cpu_var have been removed
so the macro is removed too.

The patchset includes passes over all arches as well. Once these
operations are used throughout, specialized macros can be defined in
non-x86 arches as well in order to optimize per cpu access by, e.g., using
a global register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

	DEFINE_PER_CPU(int, y);
	int *x = &__get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

	DEFINE_PER_CPU(int, y[20]);
	int *x = __get_cpu_var(y);

    Converts to

	int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processor's instance of a per cpu
   variable.

	DEFINE_PER_CPU(int, y);
	int x = __get_cpu_var(y);

   Converts to

	int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

	DEFINE_PER_CPU(struct mystruct, y);
	struct mystruct x = __get_cpu_var(y);

   Converts to

	memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y) = x;

   Converts to

	this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

	DEFINE_PER_CPU(int, y);
	__get_cpu_var(y)++;

   Converts to

	this_cpu_inc(y);
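
The equivalence of the old and new forms can be checked with a toy
userspace model along the following lines.  This is only a sketch: the
percpu_area layout, current_cpu, and the macro bodies are assumptions
made for illustration (plain C on a relocated address, built on the GNU
__typeof__ extension), not the kernel's implementation, which uses
per-arch offsets and optimized assembly.

```c
#define NR_CPUS 4

/* Hypothetical layout: each per-CPU variable is a field of this struct. */
struct percpu_area {
	int y;
};

static struct percpu_area percpu_template;	 /* "y" as named in source */
static struct percpu_area percpu_areas[NR_CPUS]; /* one instance per CPU */
static int current_cpu;			 /* stand-in for the running CPU */

/* Relocate an address within the template into the current CPU's area. */
#define this_cpu_ptr(p) \
	((__typeof__(p))((char *)&percpu_areas[current_cpu] + \
			 ((char *)(p) - (char *)&percpu_template)))

/* Old interface: address calculation followed by a dereference. */
#define __get_cpu_var(var)	(*this_cpu_ptr(&(var)))

/* New interface, modeled as plain C rather than optimized assembly. */
#define this_cpu_read(var)	(*this_cpu_ptr(&(var)))
#define this_cpu_write(var, v)	(*this_cpu_ptr(&(var)) = (v))
#define this_cpu_inc(var)	((*this_cpu_ptr(&(var)))++)

/* Applies transformations 1, 5, 6, and 3 in turn on the chosen CPU. */
static int converted_forms_demo(int cpu)
{
	int *x;

	current_cpu = cpu;
	x = this_cpu_ptr(&percpu_template.y);	/* was: &__get_cpu_var(y) */
	this_cpu_write(percpu_template.y, 41);	/* was: __get_cpu_var(y) = 41 */
	this_cpu_inc(percpu_template.y);	/* was: __get_cpu_var(y)++ */

	/* Old and new read forms must agree with the pointer form. */
	if (__get_cpu_var(percpu_template.y) !=
	    this_cpu_read(percpu_template.y))
		return -1;
	return *x;
}
```

In this model every operation reduces to the same relocated address, so
the conversions are behavior-preserving; what the model cannot show is
the code-generation benefit described above.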

Signed-off-by: Christoph Lameter <cl@linux.com>
[ paulmck: Address conflicts. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c        | 22 +++++++++++-----------
 kernel/rcutree_plugin.h | 14 +++++++-------
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 2712b89..8eb9cfd 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -407,7 +407,7 @@ static void rcu_eqs_enter(bool user)
 	long long oldval;
 	struct rcu_dynticks *rdtp;
 
-	rdtp = &__get_cpu_var(rcu_dynticks);
+	rdtp = this_cpu_ptr(&rcu_dynticks);
 	oldval = rdtp->dynticks_nesting;
 	WARN_ON_ONCE((oldval & DYNTICK_TASK_NEST_MASK) == 0);
 	if ((oldval & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE)
@@ -435,7 +435,7 @@ void rcu_idle_enter(void)
 
 	local_irq_save(flags);
 	rcu_eqs_enter(false);
-	rcu_sysidle_enter(&__get_cpu_var(rcu_dynticks), 0);
+	rcu_sysidle_enter(this_cpu_ptr(&rcu_dynticks), 0);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -478,7 +478,7 @@ void rcu_irq_exit(void)
 	struct rcu_dynticks *rdtp;
 
 	local_irq_save(flags);
-	rdtp = &__get_cpu_var(rcu_dynticks);
+	rdtp = this_cpu_ptr(&rcu_dynticks);
 	oldval = rdtp->dynticks_nesting;
 	rdtp->dynticks_nesting--;
 	WARN_ON_ONCE(rdtp->dynticks_nesting < 0);
@@ -528,7 +528,7 @@ static void rcu_eqs_exit(bool user)
 	struct rcu_dynticks *rdtp;
 	long long oldval;
 
-	rdtp = &__get_cpu_var(rcu_dynticks);
+	rdtp = this_cpu_ptr(&rcu_dynticks);
 	oldval = rdtp->dynticks_nesting;
 	WARN_ON_ONCE(oldval < 0);
 	if (oldval & DYNTICK_TASK_NEST_MASK)
@@ -555,7 +555,7 @@ void rcu_idle_exit(void)
 
 	local_irq_save(flags);
 	rcu_eqs_exit(false);
-	rcu_sysidle_exit(&__get_cpu_var(rcu_dynticks), 0);
+	rcu_sysidle_exit(this_cpu_ptr(&rcu_dynticks), 0);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_exit);
@@ -599,7 +599,7 @@ void rcu_irq_enter(void)
 	long long oldval;
 
 	local_irq_save(flags);
-	rdtp = &__get_cpu_var(rcu_dynticks);
+	rdtp = this_cpu_ptr(&rcu_dynticks);
 	oldval = rdtp->dynticks_nesting;
 	rdtp->dynticks_nesting++;
 	WARN_ON_ONCE(rdtp->dynticks_nesting == 0);
@@ -620,7 +620,7 @@ void rcu_irq_enter(void)
  */
 void rcu_nmi_enter(void)
 {
-	struct rcu_dynticks *rdtp = &__get_cpu_var(rcu_dynticks);
+	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
 
 	if (rdtp->dynticks_nmi_nesting == 0 &&
 	    (atomic_read(&rdtp->dynticks) & 0x1))
@@ -642,7 +642,7 @@ void rcu_nmi_enter(void)
  */
 void rcu_nmi_exit(void)
 {
-	struct rcu_dynticks *rdtp = &__get_cpu_var(rcu_dynticks);
+	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
 
 	if (rdtp->dynticks_nmi_nesting == 0 ||
 	    --rdtp->dynticks_nmi_nesting != 0)
@@ -665,7 +665,7 @@ int rcu_is_cpu_idle(void)
 	int ret;
 
 	preempt_disable();
-	ret = (atomic_read(&__get_cpu_var(rcu_dynticks).dynticks) & 0x1) == 0;
+	ret = (atomic_read(this_cpu_ptr(&rcu_dynticks.dynticks)) & 0x1) == 0;
 	preempt_enable();
 	return ret;
 }
@@ -703,7 +703,7 @@ bool rcu_lockdep_current_cpu_online(void)
 	if (in_nmi())
 		return 1;
 	preempt_disable();
-	rdp = &__get_cpu_var(rcu_sched_data);
+	rdp = this_cpu_ptr(&rcu_sched_data);
 	rnp = rdp->mynode;
 	ret = (rdp->grpmask & rnp->qsmaskinit) ||
 	      !rcu_scheduler_fully_active;
@@ -723,7 +723,7 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
  */
 static int rcu_is_cpu_rrupt_from_idle(void)
 {
-	return __get_cpu_var(rcu_dynticks).dynticks_nesting <= 1;
+	return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 1;
 }
 
 /*
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 6f9aece..c684f7a 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -660,7 +660,7 @@ static void rcu_preempt_check_callbacks(int cpu)
 
 static void rcu_preempt_do_callbacks(void)
 {
-	rcu_do_batch(&rcu_preempt_state, &__get_cpu_var(rcu_preempt_data));
+	rcu_do_batch(&rcu_preempt_state, this_cpu_ptr(&rcu_preempt_data));
 }
 
 #endif /* #ifdef CONFIG_RCU_BOOST */
@@ -1332,7 +1332,7 @@ static void invoke_rcu_callbacks_kthread(void)
  */
 static bool rcu_is_callbacks_kthread(void)
 {
-	return __get_cpu_var(rcu_cpu_kthread_task) == current;
+	return __this_cpu_read(rcu_cpu_kthread_task) == current;
 }
 
 #define RCU_BOOST_DELAY_JIFFIES DIV_ROUND_UP(CONFIG_RCU_BOOST_DELAY * HZ, 1000)
@@ -1382,8 +1382,8 @@ static int rcu_spawn_one_boost_kthread(struct rcu_state *rsp,
 
 static void rcu_kthread_do_work(void)
 {
-	rcu_do_batch(&rcu_sched_state, &__get_cpu_var(rcu_sched_data));
-	rcu_do_batch(&rcu_bh_state, &__get_cpu_var(rcu_bh_data));
+	rcu_do_batch(&rcu_sched_state, this_cpu_ptr(&rcu_sched_data));
+	rcu_do_batch(&rcu_bh_state, this_cpu_ptr(&rcu_bh_data));
 	rcu_preempt_do_callbacks();
 }
 
@@ -1402,7 +1402,7 @@ static void rcu_cpu_kthread_park(unsigned int cpu)
 
 static int rcu_cpu_kthread_should_run(unsigned int cpu)
 {
-	return __get_cpu_var(rcu_cpu_has_work);
+	return __this_cpu_read(rcu_cpu_has_work);
 }
 
 /*
@@ -1412,8 +1412,8 @@ static int rcu_cpu_kthread_should_run(unsigned int cpu)
  */
 static void rcu_cpu_kthread(unsigned int cpu)
 {
-	unsigned int *statusp = &__get_cpu_var(rcu_cpu_kthread_status);
-	char work, *workp = &__get_cpu_var(rcu_cpu_has_work);
+	unsigned int *statusp = this_cpu_ptr(&rcu_cpu_kthread_status);
+	char work, *workp = this_cpu_ptr(&rcu_cpu_has_work);
 	int spincnt;
 
 	for (spincnt = 0; spincnt < 10; spincnt++) {
-- 
1.8.1.5



* [PATCH tip/core/rcu 07/11] rcu: Silence unused-variable warnings
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
                     ` (4 preceding siblings ...)
  2013-09-25  1:29   ` [PATCH tip/core/rcu 06/11] rcu: Replace __get_cpu_var() uses Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks() Paul E. McKenney
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The "idle" variable in both rcu_eqs_enter_common() and
rcu_eqs_exit_common() is used only in a WARN_ON_ONCE().  If the kernel
is built with WARN_ON_ONCE() disabled, the compiler will (rightly)
complain that "idle" is unused.  This commit therefore adds a
__maybe_unused to the declaration of both variables.
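
The same situation can be reproduced in userspace with assert(), which
compiles away under -DNDEBUG much as a disabled WARN_ON_ONCE() does.
The following is a minimal sketch; idle_task_id() and check_not_idle()
are hypothetical names, and __maybe_unused is defined locally as the
GCC/Clang "unused" attribute.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's __maybe_unused annotation. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/* Hypothetical stand-in for idle_task(): maps a CPU to an idle-task ID. */
static int idle_task_id(int cpu)
{
	return 1000 + cpu;
}

/*
 * "idle" is consumed only by assert().  Building with -DNDEBUG makes
 * assert() expand to nothing, which would otherwise leave "idle" unused
 * and draw a compiler warning.
 */
static int check_not_idle(int cpu, int current_id)
{
	int idle __maybe_unused = idle_task_id(cpu);

	assert(current_id != idle);
	return current_id;
}
```

The annotation changes no generated code; it only tells the compiler
that an unused "idle" is expected in some configurations.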

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8eb9cfd..e6f2e8f 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -371,7 +371,8 @@ static void rcu_eqs_enter_common(struct rcu_dynticks *rdtp, long long oldval,
 {
 	trace_rcu_dyntick(TPS("Start"), oldval, rdtp->dynticks_nesting);
 	if (!user && !is_idle_task(current)) {
-		struct task_struct *idle = idle_task(smp_processor_id());
+		struct task_struct *idle __maybe_unused =
+			idle_task(smp_processor_id());
 
 		trace_rcu_dyntick(TPS("Error on entry: not idle task"), oldval, 0);
 		ftrace_dump(DUMP_ORIG);
@@ -508,7 +509,8 @@ static void rcu_eqs_exit_common(struct rcu_dynticks *rdtp, long long oldval,
 	rcu_cleanup_after_idle(smp_processor_id());
 	trace_rcu_dyntick(TPS("End"), oldval, rdtp->dynticks_nesting);
 	if (!user && !is_idle_task(current)) {
-		struct task_struct *idle = idle_task(smp_processor_id());
+		struct task_struct *idle __maybe_unused =
+			idle_task(smp_processor_id());
 
 		trace_rcu_dyntick(TPS("Error on exit: not idle task"),
 				  oldval, rdtp->dynticks_nesting);
-- 
1.8.1.5



* [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
                     ` (5 preceding siblings ...)
  2013-09-25  1:29   ` [PATCH tip/core/rcu 07/11] rcu: Silence unused-variable warnings Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  2:55     ` Chen Gang
  2013-09-25  1:29   ` [PATCH tip/core/rcu 09/11] rcu: Reject memory-order-induced stall-warning false positives Paul E. McKenney
                     ` (3 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The for_each_rcu_flavor() loop unconditionally scans all flavors, even
when the first flavor might have some non-lazy callbacks.  Once the
loop has seen a non-lazy callback, further passes through the loop
cannot change the state.  This is not a huge problem, given that there
can be at most three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched),
but this code is on the path to idle, so speeding it up even a small
amount would have some benefit.

This commit therefore does two things:

1.	Rearranges the order of the list of RCU flavors in order to
	place the most active flavor first in the list.  The most active
	RCU flavor is RCU-preempt, or, if there is no RCU-preempt,
	RCU-sched.

2.	Reworks the for_each_rcu_flavor() loop to exit early when the first
	non-lazy callback is seen, or, in the case where the caller
	does not care about non-lazy callbacks (RCU_FAST_NO_HZ=n),
	when the first callback is seen.
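
The early-exit logic of item 2 can be sketched outside the kernel as
shown below.  The flavor_data structure, the sample arrays, and the
"scanned" counter are assumptions made purely so that the early exit is
observable; only the loop body mirrors the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down per-flavor callback state for one CPU. */
struct flavor_data {
	bool has_cbs;		/* models rdp->nxtlist being non-NULL */
	long qlen;		/* all queued callbacks */
	long qlen_lazy;		/* lazy callbacks only */
};

/*
 * Returns true if any flavor has callbacks.  If all_lazy is non-NULL,
 * also reports whether every callback seen was lazy.  *scanned counts
 * loop iterations and exists only to make the early exit observable.
 */
static bool cpu_has_callbacks(const struct flavor_data *flavors, size_t n,
			      bool *all_lazy, size_t *scanned)
{
	bool al = true;
	bool hc = false;
	size_t i;

	*scanned = 0;
	for (i = 0; i < n; i++) {
		const struct flavor_data *rdp = &flavors[i];

		(*scanned)++;
		if (!rdp->has_cbs)
			continue;
		hc = true;
		if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
			al = false;
			break;	/* Later flavors cannot change the answer. */
		}
	}
	if (all_lazy)
		*all_lazy = al;
	return hc;
}

/* Sample data: the first flavor already holds a non-lazy callback. */
static const struct flavor_data busy_first[3] = {
	{ true, 2, 1 }, { true, 1, 1 }, { false, 0, 0 },
};

/* Sample data: only lazy callbacks, so every flavor must be scanned. */
static const struct flavor_data lazy_only[3] = {
	{ true, 1, 1 }, { false, 0, 0 }, { true, 3, 3 },
};

/* Convenience wrapper returning the number of flavors scanned. */
static size_t flavors_scanned(const struct flavor_data *flavors, size_t n,
			      bool want_lazy)
{
	bool al;
	size_t scanned;

	(void)cpu_has_callbacks(flavors, n, want_lazy ? &al : NULL, &scanned);
	return scanned;
}
```

With busy_first the loop stops after one flavor.  With lazy_only it must
visit all three when the caller asks about laziness, but stops after the
first flavor when the caller passes no all_lazy pointer.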

Reported-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e6f2e8f..49464ad 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2727,10 +2727,13 @@ static int rcu_cpu_has_callbacks(int cpu, bool *all_lazy)
 
 	for_each_rcu_flavor(rsp) {
 		rdp = per_cpu_ptr(rsp->rda, cpu);
-		if (rdp->qlen != rdp->qlen_lazy)
+		if (!rdp->nxtlist)
+			continue;
+		hc = true;
+		if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
 			al = false;
-		if (rdp->nxtlist)
-			hc = true;
+			break;
+		}
 	}
 	if (all_lazy)
 		*all_lazy = al;
@@ -3297,8 +3300,8 @@ void __init rcu_init(void)
 
 	rcu_bootup_announce();
 	rcu_init_geometry();
-	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
 	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
+	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
 	__rcu_init_preempt();
 	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
 
-- 
1.8.1.5



* [PATCH tip/core/rcu 09/11] rcu: Reject memory-order-induced stall-warning false positives
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
                     ` (6 preceding siblings ...)
  2013-09-25  1:29   ` [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks() Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 10/11] rcu: Have rcutiny tracepoints use tracepoint_string() Paul E. McKenney
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

If a system is idle from an RCU perspective for longer than specified
by CONFIG_RCU_CPU_STALL_TIMEOUT, and if one CPU starts a grace period
just as a second checks for CPU stalls, and if this second CPU happens
to see the old value of rsp->jiffies_stall, it will incorrectly report a
CPU stall.  This is quite rare, but apparently occurs deterministically
on systems with about 6TB of memory.

This commit therefore orders accesses to the data used to determine
whether or not a CPU stall is in progress.  Grace-period initialization
and cleanup first increment rsp->completed to mark the end of the
previous grace period, then record the current jiffies in rsp->gp_start,
then record the jiffies at which a stall can be expected to occur in
rsp->jiffies_stall, and finally increment rsp->gpnum to mark the start
of the new grace period.  Now, this ordering by itself does not prevent
false positives.  For example, if grace-period initialization was delayed
between recording rsp->gp_start and rsp->jiffies_stall, the CPU stall
warning code might still see an old value of rsp->jiffies_stall.
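
As a rough illustration of the update order described above, here is a
minimal user-space sketch.  It is a model, not the kernel code: struct
gp_state, gp_cleanup(), and gp_init() are hypothetical names, and C11
release fences stand in for smp_wmb() (an approximation, since a release
fence orders somewhat more than a pure store-store barrier).

```c
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the relevant rcu_state fields. */
struct gp_state {
	atomic_ulong completed;
	atomic_ulong gp_start;
	atomic_ulong jiffies_stall;
	atomic_ulong gpnum;
};

/* End the previous grace period: ->completed catches up with ->gpnum. */
static void gp_cleanup(struct gp_state *s)
{
	unsigned long gpnum;

	gpnum = atomic_load_explicit(&s->gpnum, memory_order_relaxed);
	atomic_store_explicit(&s->completed, gpnum, memory_order_relaxed);
	atomic_thread_fence(memory_order_release); /* ->completed before GP times. */
}

/* Start a new grace period at time "now" with the given stall timeout. */
static void gp_init(struct gp_state *s, unsigned long now,
		    unsigned long timeout)
{
	atomic_store_explicit(&s->gp_start, now, memory_order_relaxed);
	atomic_thread_fence(memory_order_release); /* Start time before stall time. */
	atomic_store_explicit(&s->jiffies_stall, now + timeout,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_release); /* GP times before starting GP. */
	atomic_fetch_add_explicit(&s->gpnum, 1, memory_order_relaxed);
}
```

A checker that reads these fields in the opposite order, with read
barriers between the loads, can then reason about which combinations of
old and new values are possible.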

Therefore, this commit also orders the CPU stall warning accesses,
loading rsp->gpnum and jiffies, then rsp->jiffies_stall, then
rsp->gp_start, and finally rsp->completed.  This ordering means that
the false-positive scenario in the previous paragraph would result
in rsp->completed being greater than or equal to rsp->gpnum, which is
never valid for a CPU stall, allowing the false positive to be rejected.
Furthermore, any fetch that gets an old value of rsp->jiffies_stall
must also get an old value of rsp->gpnum, which will again be rejected
by the comparison of rsp->gpnum and rsp->completed.  Situations where
rsp->gp_start is later than rsp->jiffies_stall are also rejected, as
are situations where jiffies is less than rsp->jiffies_stall.

Although use of unsynchronized accesses means that there are likely
still some false-positive scenarios (synchronization has proven to be
a very bad idea on large systems), this should get rid of a large class
of these scenarios.
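
The rejection logic described above can be sketched in user space as
follows.  This is only a model under stated assumptions: struct
stall_snapshot and stall_plausible() are hypothetical names, the snapshot
is assumed to have been loaded in the order given above, and the
comparison macros are written in the style of the kernel's wrap-safe
ULONG_CMP_GE() and ULONG_CMP_LT().

```c
#include <limits.h>
#include <stdbool.h>

/* Wrap-safe comparisons, in the style of the kernel's ULONG_CMP_*(). */
#define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))

/* Hypothetical snapshot of the values used by the stall check. */
struct stall_snapshot {
	unsigned long gpnum;		/* Loaded first. */
	unsigned long j;		/* Current jiffies. */
	unsigned long js;		/* ->jiffies_stall. */
	unsigned long gps;		/* ->gp_start. */
	unsigned long completed;	/* Loaded last. */
};

/* Return true only if the snapshot is consistent with a real stall. */
static bool stall_plausible(const struct stall_snapshot *s)
{
	if (ULONG_CMP_GE(s->completed, s->gpnum))
		return false;	/* No GP, or a GP ended during the loads. */
	if (ULONG_CMP_LT(s->j, s->js))
		return false;	/* Stall time not yet reached. */
	if (ULONG_CMP_GE(s->gps, s->js))
		return false;	/* New ->gp_start with old ->jiffies_stall. */
	return true;
}
```

Only a snapshot that passes all three comparisons goes on to the
remaining per-CPU checks; anything else is treated as a false positive.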

Reported-by: Fabian Herschel <fabian.herschel@suse.com>
Reported-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Jochen Striepe <jochen@tolot.escape.de>
---
 kernel/rcutree.c | 45 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 5 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 49464ad..b618d72 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -804,8 +804,11 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp,
 
 static void record_gp_stall_check_time(struct rcu_state *rsp)
 {
-	rsp->gp_start = jiffies;
-	rsp->jiffies_stall = jiffies + rcu_jiffies_till_stall_check();
+	unsigned long j = ACCESS_ONCE(jiffies);
+
+	rsp->gp_start = j;
+	smp_wmb(); /* Record start time before stall time. */
+	rsp->jiffies_stall = j + rcu_jiffies_till_stall_check();
 }
 
 /*
@@ -934,17 +937,48 @@ static void print_cpu_stall(struct rcu_state *rsp)
 
 static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 {
+	unsigned long completed;
+	unsigned long gpnum;
+	unsigned long gps;
 	unsigned long j;
 	unsigned long js;
 	struct rcu_node *rnp;
 
-	if (rcu_cpu_stall_suppress)
+	if (rcu_cpu_stall_suppress || !rcu_gp_in_progress(rsp))
 		return;
 	j = ACCESS_ONCE(jiffies);
+
+	/*
+	 * Lots of memory barriers to reject false positives.
+	 *
+	 * The idea is to pick up rsp->gpnum, then rsp->jiffies_stall,
+	 * then rsp->gp_start, and finally rsp->completed.  These values
+	 * are updated in the opposite order with memory barriers (or
+	 * equivalent) during grace-period initialization and cleanup.
+	 * Now, a false positive can occur if we get a new value of
+	 * rsp->gp_start and an old value of rsp->jiffies_stall.  But given
+	 * the memory barriers, the only way that this can happen is if one
+	 * grace period ends and another starts between these two fetches.
+	 * Detect this by comparing rsp->completed with the previous fetch
+	 * from rsp->gpnum.
+	 *
+	 * Given this check, comparisons of jiffies, rsp->jiffies_stall,
+	 * and rsp->gp_start suffice to forestall false positives.
+	 */
+	gpnum = ACCESS_ONCE(rsp->gpnum);
+	smp_rmb(); /* Pick up ->gpnum first... */
 	js = ACCESS_ONCE(rsp->jiffies_stall);
+	smp_rmb(); /* ...then ->jiffies_stall before the rest... */
+	gps = ACCESS_ONCE(rsp->gp_start);
+	smp_rmb(); /* ...and finally ->gp_start before ->completed. */
+	completed = ACCESS_ONCE(rsp->completed);
+	if (ULONG_CMP_GE(completed, gpnum) ||
+	    ULONG_CMP_LT(j, js) ||
+	    ULONG_CMP_GE(gps, js))
+		return; /* No stall or GP completed since entering function. */
 	rnp = rdp->mynode;
 	if (rcu_gp_in_progress(rsp) &&
-	    (ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
+	    (ACCESS_ONCE(rnp->qsmask) & rdp->grpmask)) {
 
 		/* We haven't checked in, so go dump stack. */
 		print_cpu_stall(rsp);
@@ -1317,9 +1351,10 @@ static int rcu_gp_init(struct rcu_state *rsp)
 	}
 
 	/* Advance to a new grace period and initialize state. */
+	record_gp_stall_check_time(rsp);
+	smp_wmb(); /* Record GP times before starting GP. */
 	rsp->gpnum++;
 	trace_rcu_grace_period(rsp->name, rsp->gpnum, TPS("start"));
-	record_gp_stall_check_time(rsp);
 	raw_spin_unlock_irq(&rnp->lock);
 
 	/* Exclude any concurrent CPU-hotplug operations. */
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH tip/core/rcu 10/11] rcu: Have rcutiny tracepoints use tracepoint_string()
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
                     ` (7 preceding siblings ...)
  2013-09-25  1:29   ` [PATCH tip/core/rcu 09/11] rcu: Reject memory-order-induced stall-warning false positives Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  1:29   ` [PATCH tip/core/rcu 11/11] rcu: Fix CONFIG_RCU_NOCB_CPU_ALL panic on machines with sparse CPU mask Paul E. McKenney
  2013-09-25  4:10   ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Andrew Morton
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

This commit extends the work done in f7f7bac9 (rcu: Have the RCU
tracepoints use the tracepoint_string infrastructure) to cover rcutiny.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/rcu.h     |  7 +++++++
 kernel/rcutiny.c | 17 ++++++++++-------
 kernel/rcutree.c |  7 -------
 3 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/kernel/rcu.h b/kernel/rcu.h
index 7713196..7859a0a 100644
--- a/kernel/rcu.h
+++ b/kernel/rcu.h
@@ -122,4 +122,11 @@ int rcu_jiffies_till_stall_check(void);
 
 #endif /* #ifdef CONFIG_RCU_STALL_COMMON */
 
+/*
+ * Strings used in tracepoints need to be exported via the
+ * tracing system such that tools like perf and trace-cmd can
+ * translate the string address pointers to actual text.
+ */
+#define TPS(x)  tracepoint_string(x)
+
 #endif /* __LINUX_RCU_H */
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 9ed6075..e99eb5f 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -35,6 +35,7 @@
 #include <linux/time.h>
 #include <linux/cpu.h>
 #include <linux/prefetch.h>
+#include <linux/ftrace_event.h>
 
 #ifdef CONFIG_RCU_TRACE
 #include <trace/events/rcu.h>
@@ -58,16 +59,17 @@ static long long rcu_dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
 static void rcu_idle_enter_common(long long newval)
 {
 	if (newval) {
-		RCU_TRACE(trace_rcu_dyntick("--=",
+		RCU_TRACE(trace_rcu_dyntick(TPS("--="),
 					    rcu_dynticks_nesting, newval));
 		rcu_dynticks_nesting = newval;
 		return;
 	}
-	RCU_TRACE(trace_rcu_dyntick("Start", rcu_dynticks_nesting, newval));
+	RCU_TRACE(trace_rcu_dyntick(TPS("Start"),
+				    rcu_dynticks_nesting, newval));
 	if (!is_idle_task(current)) {
 		struct task_struct *idle = idle_task(smp_processor_id());
 
-		RCU_TRACE(trace_rcu_dyntick("Error on entry: not idle task",
+		RCU_TRACE(trace_rcu_dyntick(TPS("Entry error: not idle task"),
 					    rcu_dynticks_nesting, newval));
 		ftrace_dump(DUMP_ALL);
 		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
@@ -120,15 +122,15 @@ EXPORT_SYMBOL_GPL(rcu_irq_exit);
 static void rcu_idle_exit_common(long long oldval)
 {
 	if (oldval) {
-		RCU_TRACE(trace_rcu_dyntick("++=",
+		RCU_TRACE(trace_rcu_dyntick(TPS("++="),
 					    oldval, rcu_dynticks_nesting));
 		return;
 	}
-	RCU_TRACE(trace_rcu_dyntick("End", oldval, rcu_dynticks_nesting));
+	RCU_TRACE(trace_rcu_dyntick(TPS("End"), oldval, rcu_dynticks_nesting));
 	if (!is_idle_task(current)) {
 		struct task_struct *idle = idle_task(smp_processor_id());
 
-		RCU_TRACE(trace_rcu_dyntick("Error on exit: not idle task",
+		RCU_TRACE(trace_rcu_dyntick(TPS("Exit error: not idle task"),
 			  oldval, rcu_dynticks_nesting));
 		ftrace_dump(DUMP_ALL);
 		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
@@ -304,7 +306,8 @@ static void __rcu_process_callbacks(struct rcu_ctrlblk *rcp)
 		RCU_TRACE(cb_count++);
 	}
 	RCU_TRACE(rcu_trace_sub_qlen(rcp, cb_count));
-	RCU_TRACE(trace_rcu_batch_end(rcp->name, cb_count, 0, need_resched(),
+	RCU_TRACE(trace_rcu_batch_end(rcp->name,
+				      cb_count, 0, need_resched(),
 				      is_idle_task(current),
 				      false));
 }
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index b618d72..62aab5c 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -61,13 +61,6 @@
 
 #include "rcu.h"
 
-/*
- * Strings used in tracepoints need to be exported via the
- * tracing system such that tools like perf and trace-cmd can
- * translate the string address pointers to actual text.
- */
-#define TPS(x)	tracepoint_string(x)
-
 /* Data structures. */
 
 static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH tip/core/rcu 11/11] rcu: Fix CONFIG_RCU_NOCB_CPU_ALL panic on machines with sparse CPU mask
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
                     ` (8 preceding siblings ...)
  2013-09-25  1:29   ` [PATCH tip/core/rcu 10/11] rcu: Have rcutiny tracepoints use tracepoint_string() Paul E. McKenney
@ 2013-09-25  1:29   ` Paul E. McKenney
  2013-09-25  4:10   ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Andrew Morton
  10 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25  1:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	Kirill Tkhai, Paul E. McKenney

From: Kirill Tkhai <tkhai@yandex.ru>

Some architectures have a sparse CPU mask.  UltraSparc's cpuinfo, for example:

CPU0: online
CPU2: online

So, set only possible CPUs when CONFIG_RCU_NOCB_CPU_ALL is enabled.

Also, check that the user passes a valid 'rcu_nocbs=' option.
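
To illustrate the idea, here is a small user-space sketch that models a
cpumask as a single unsigned long.  The names mask_t, mask_subset(), and
clamp_nocb_mask() are hypothetical and exist only for this example; the
kernel code uses cpumask_subset() and cpumask_and() on real cpumasks.

```c
#include <stdbool.h>

/* Hypothetical one-word stand-in for a cpumask: bit n set means CPU n. */
typedef unsigned long mask_t;

/* True if every CPU in "sub" is also in "super". */
static bool mask_subset(mask_t sub, mask_t super)
{
	return (sub & ~super) == 0;
}

/* Drop CPUs that cannot exist; report whether anything was dropped. */
static bool clamp_nocb_mask(mask_t *nocb, mask_t possible)
{
	bool had_bogus = !mask_subset(*nocb, possible);

	*nocb &= possible;
	return had_bogus;
}
```

With only CPU0 and CPU2 possible (mask 0x5), a request for CPUs 0-2
(mask 0x7) is reported as containing a nonexistent CPU and is reduced
to 0x5.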

Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
CC: Dipankar Sarma <dipankar@in.ibm.com>
[ paulmck: Fix pr_info() issue noted by scripts/checkpatch.pl. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree_plugin.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index c684f7a..1855d66 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -96,10 +96,15 @@ static void __init rcu_bootup_announce_oddness(void)
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU_ZERO */
 #ifdef CONFIG_RCU_NOCB_CPU_ALL
 	pr_info("\tOffload RCU callbacks from all CPUs\n");
-	cpumask_setall(rcu_nocb_mask);
+	cpumask_copy(rcu_nocb_mask, cpu_possible_mask);
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU_ALL */
 #endif /* #ifndef CONFIG_RCU_NOCB_CPU_NONE */
 	if (have_rcu_nocb_mask) {
+		if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
+			pr_info("\tNote: kernel parameter 'rcu_nocbs=' contains nonexistent CPUs.\n");
+			cpumask_and(rcu_nocb_mask, cpu_possible_mask,
+				    rcu_nocb_mask);
+		}
 		cpulist_scnprintf(nocb_buf, sizeof(nocb_buf), rcu_nocb_mask);
 		pr_info("\tOffload RCU callbacks from CPUs: %s.\n", nocb_buf);
 		if (rcu_nocb_poll)
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-25  1:29   ` [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks() Paul E. McKenney
@ 2013-09-25  2:55     ` Chen Gang
  2013-09-25 20:16       ` Paul E. McKenney
  0 siblings, 1 reply; 24+ messages in thread
From: Chen Gang @ 2013-09-25  2:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	linux-kernel


First of all, thank you for all of your work  :-).


Your suggestion about testing (from our discussion) is also valuable
to me.

I need to start on LTP in Q4. Following your suggestion, my first goal
in learning and using LTP is not mainly to find kernel issues, but to
test the kernel (and so improve my kernel testing efficiency).

When I look for issues by reading code, I will keep LTP in mind too
(I will try to find issues that can be tested by LTP).


On 09/25/2013 09:29 AM, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> The for_each_rcu_flavor() loop unconditionally scans all flavors, even
> when the first flavor might have some non-lazy callbacks.  Once the
> loop has seen a non-lazy callback, further passes through the loop
> cannot change the state.  This is not a huge problem, given that there
> can be at most three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched),
> but this code is on the path to idle, so speeding it up even a small
> amount would have some benefit.
> 
> This commit therefore does two things:
> 
> 1.	Rearranges the order of the list of RCU flavors in order to
> 	place the most active flavor first in the list.  The most active
> 	RCU flavor is RCU-preempt, or, if there is no RCU-preempt,
> 	RCU-sched.
> 
> 2.	Reworks the for_each_rcu_flavor() to exit early when the first
> 	non-lazy callback is seen, or, in the case where the caller
> 	does not care about non-lazy callbacks (RCU_FAST_NO_HZ=n),
> 	when the first callback is seen.
> 
> Reported-by: Chen Gang <gang.chen@asianux.com>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcutree.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index e6f2e8f..49464ad 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -2727,10 +2727,13 @@ static int rcu_cpu_has_callbacks(int cpu, bool *all_lazy)
>  
>  	for_each_rcu_flavor(rsp) {
>  		rdp = per_cpu_ptr(rsp->rda, cpu);
> -		if (rdp->qlen != rdp->qlen_lazy)
> +		if (!rdp->nxtlist)
> +			continue;
> +		hc = true;
> +		if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
>  			al = false;
> -		if (rdp->nxtlist)
> -			hc = true;
> +			break;
> +		}
>  	}
>  	if (all_lazy)
>  		*all_lazy = al;
> @@ -3297,8 +3300,8 @@ void __init rcu_init(void)
>  
>  	rcu_bootup_announce();
>  	rcu_init_geometry();
> -	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
>  	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
> +	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
>  	__rcu_init_preempt();
>  	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
>  
> 


-- 
Chen Gang


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop
  2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
                     ` (9 preceding siblings ...)
  2013-09-25  1:29   ` [PATCH tip/core/rcu 11/11] rcu: Fix CONFIG_RCU_NOCB_CPU_ALL panic on machines with sparse CPU mask Paul E. McKenney
@ 2013-09-25  4:10   ` Andrew Morton
  2013-09-25 13:48     ` Paul E. McKenney
  10 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2013-09-25  4:10 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, mingo, laijs, dipankar, mathieu.desnoyers, josh,
	niv, tglx, peterz, rostedt, dhowells, edumazet, darren, fweisbec,
	sbw, KOSAKI Motohiro, Michel Lespinasse, Linus Torvalds

On Tue, 24 Sep 2013 18:29:11 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -736,6 +736,7 @@ static int do_mlockall(int flags)
>  
>  		/* Ignore errors */
>  		mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
> +		cond_resched();
>  	}
>  out:
>  	return 0;

Might need one in munlock_vma_pages_range() as well - it's a matter of
finding the right test case.  This will be neverending :(

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop
  2013-09-25  4:10   ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Andrew Morton
@ 2013-09-25 13:48     ` Paul E. McKenney
  2013-09-25 19:35       ` Andrew Morton
  0 siblings, 1 reply; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25 13:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, mingo, laijs, dipankar, mathieu.desnoyers, josh,
	niv, tglx, peterz, rostedt, dhowells, edumazet, darren, fweisbec,
	sbw, KOSAKI Motohiro, Michel Lespinasse, Linus Torvalds

On Tue, Sep 24, 2013 at 09:10:47PM -0700, Andrew Morton wrote:
> On Tue, 24 Sep 2013 18:29:11 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > --- a/mm/mlock.c
> > +++ b/mm/mlock.c
> > @@ -736,6 +736,7 @@ static int do_mlockall(int flags)
> >  
> >  		/* Ignore errors */
> >  		mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
> > +		cond_resched();
> >  	}
> >  out:
> >  	return 0;
> 
> Might need one in munlock_vma_pages_range() as well - it's a matter of
> finding the right test case.  This will be neverending :(

Indeed...  I suspect that Trinity running on big-memory systems will
eventually find most of them via RCU CPU stall warnings, but as you say...

Would you like the corresponding change to munlock_vma_pages_range()
beforehand?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop
  2013-09-25 13:48     ` Paul E. McKenney
@ 2013-09-25 19:35       ` Andrew Morton
  2013-09-25 20:18         ` Paul E. McKenney
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2013-09-25 19:35 UTC (permalink / raw)
  To: paulmck
  Cc: linux-kernel, mingo, laijs, dipankar, mathieu.desnoyers, josh,
	niv, tglx, peterz, rostedt, dhowells, edumazet, darren, fweisbec,
	sbw, KOSAKI Motohiro, Michel Lespinasse, Linus Torvalds

On Wed, 25 Sep 2013 06:48:04 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Sep 24, 2013 at 09:10:47PM -0700, Andrew Morton wrote:
> > On Tue, 24 Sep 2013 18:29:11 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > --- a/mm/mlock.c
> > > +++ b/mm/mlock.c
> > > @@ -736,6 +736,7 @@ static int do_mlockall(int flags)
> > >  
> > >  		/* Ignore errors */
> > >  		mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
> > > +		cond_resched();
> > >  	}
> > >  out:
> > >  	return 0;
> > 
> > Might need one in munlock_vma_pages_range() as well - it's a matter of
> > finding the right test case.  This will be neverending :(
> 
> Indeed...  I suspect that Trinity running on big-memory systems will
> eventually find most of them via RCU CPU stall warnings, but as you say...
> 
> Would you like the corresponding change to munlock_vma_pages_range()
> beforehand?

Can't decide.  If we went and poked holes in every place which looks
like it loops for a long time, we'd be poking holes everywhere, some of
them unnecessary.  otoh if we wait around for people to say "hey" then
it will take a very long time to poke all the needed holes.

The best approach would be for someone to sit down, identify all the
potential problem spots, attempt to craft a userspace exploit to verify
that each one really is a problem, then fix it.  Nobody will bother
doing this.

So I dunno.  Stop asking difficult questions ;)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-25  2:55     ` Chen Gang
@ 2013-09-25 20:16       ` Paul E. McKenney
  2013-09-26  2:57         ` Chen Gang
  0 siblings, 1 reply; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25 20:16 UTC (permalink / raw)
  To: Chen Gang
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	linux-kernel

On Wed, Sep 25, 2013 at 10:55:30AM +0800, Chen Gang wrote:
> 
> First of all, thank you for all of your work  :-).
> 
> Your suggestion about testing (from our discussion) is also valuable
> to me.
> 
> I need to start on LTP in Q4. Following your suggestion, my first goal
> in learning and using LTP is not mainly to find kernel issues, but to
> test the kernel (and so improve my kernel testing efficiency).
> 
> When I look for issues by reading code, I will keep LTP in mind too
> (I will try to find issues that can be tested by LTP).

Doing more testing will be good!  You will probably need more tests
than just LTP, but you must of course start somewhere.

							Thanx, Paul

> On 09/25/2013 09:29 AM, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > The for_each_rcu_flavor() loop unconditionally scans all flavors, even
> > when the first flavor might have some non-lazy callbacks.  Once the
> > loop has seen a non-lazy callback, further passes through the loop
> > cannot change the state.  This is not a huge problem, given that there
> > can be at most three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched),
> > but this code is on the path to idle, so speeding it up even a small
> > amount would have some benefit.
> > 
> > This commit therefore does two things:
> > 
> > 1.	Rearranges the order of the list of RCU flavors in order to
> > 	place the most active flavor first in the list.  The most active
> > 	RCU flavor is RCU-preempt, or, if there is no RCU-preempt,
> > 	RCU-sched.
> > 
> > 2.	Reworks the for_each_rcu_flavor() to exit early when the first
> > 	non-lazy callback is seen, or, in the case where the caller
> > 	does not care about non-lazy callbacks (RCU_FAST_NO_HZ=n),
> > 	when the first callback is seen.
> > 
> > Reported-by: Chen Gang <gang.chen@asianux.com>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcutree.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index e6f2e8f..49464ad 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -2727,10 +2727,13 @@ static int rcu_cpu_has_callbacks(int cpu, bool *all_lazy)
> >  
> >  	for_each_rcu_flavor(rsp) {
> >  		rdp = per_cpu_ptr(rsp->rda, cpu);
> > -		if (rdp->qlen != rdp->qlen_lazy)
> > +		if (!rdp->nxtlist)
> > +			continue;
> > +		hc = true;
> > +		if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
> >  			al = false;
> > -		if (rdp->nxtlist)
> > -			hc = true;
> > +			break;
> > +		}
> >  	}
> >  	if (all_lazy)
> >  		*all_lazy = al;
> > @@ -3297,8 +3300,8 @@ void __init rcu_init(void)
> >  
> >  	rcu_bootup_announce();
> >  	rcu_init_geometry();
> > -	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
> >  	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
> > +	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
> >  	__rcu_init_preempt();
> >  	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
> >  
> > 
> 
> 
> -- 
> Chen Gang
> 
> -- 
> Chen Gang
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop
  2013-09-25 19:35       ` Andrew Morton
@ 2013-09-25 20:18         ` Paul E. McKenney
  0 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-25 20:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, mingo, laijs, dipankar, mathieu.desnoyers, josh,
	niv, tglx, peterz, rostedt, dhowells, edumazet, darren, fweisbec,
	sbw, KOSAKI Motohiro, Michel Lespinasse, Linus Torvalds

On Wed, Sep 25, 2013 at 12:35:37PM -0700, Andrew Morton wrote:
> On Wed, 25 Sep 2013 06:48:04 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Tue, Sep 24, 2013 at 09:10:47PM -0700, Andrew Morton wrote:
> > > On Tue, 24 Sep 2013 18:29:11 -0700 "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > --- a/mm/mlock.c
> > > > +++ b/mm/mlock.c
> > > > @@ -736,6 +736,7 @@ static int do_mlockall(int flags)
> > > >  
> > > >  		/* Ignore errors */
> > > >  		mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
> > > > +		cond_resched();
> > > >  	}
> > > >  out:
> > > >  	return 0;
> > > 
> > > Might need one in munlock_vma_pages_range() as well - it's a matter of
> > > finding the right test case.  This will be neverending :(
> > 
> > Indeed...  I suspect that Trinity running on big-memory systems will
> > eventually find most of them via RCU CPU stall warnings, but as you say...
> > 
> > Would you like the corresponding change to munlock_vma_pages_range()
> > beforehand?
> 
> Can't decide.  If we went and poked holes in every place which looks
> like it loops for a long time, we'd be poking holes everywhere, some of
> them unnecessary.  otoh if we wait around for people to say "hey" then
> it will take a very long time to poke all the needed holes.

Yep.

> The best approach would be for someone to sit down, identify all the
> potential problem spots, attempt to craft a userspace exploit to verify
> that each one really is a problem, then fix it.  Nobody will bother
> doing this.

And if someone does bother doing this, there will no doubt be some debate
about whether or not the exploit is reasonable.

> So I dunno.  Stop asking difficult questions ;)

;-) ;-) ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-25 20:16       ` Paul E. McKenney
@ 2013-09-26  2:57         ` Chen Gang
  2013-09-26 18:33           ` Paul E. McKenney
  0 siblings, 1 reply; 24+ messages in thread
From: Chen Gang @ 2013-09-26  2:57 UTC (permalink / raw)
  To: paulmck
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	linux-kernel

On 09/26/2013 04:16 AM, Paul E. McKenney wrote:
> On Wed, Sep 25, 2013 at 10:55:30AM +0800, Chen Gang wrote:
>>
>> First of all, thank you for all of your work  :-).
>>
>> Your suggestion about testing (from our discussion) is also valuable
>> to me.
>>
>> I need to start on LTP in Q4. Following your suggestion, my first goal
>> in learning and using LTP is not mainly to find kernel issues, but to
>> test the kernel (and so improve my kernel testing efficiency).
>>
>> When I look for issues by reading code, I will keep LTP in mind too
>> (I will try to find issues that can be tested by LTP).
> 
> Doing more testing will be good!  You will probably need more tests
> than just LTP, but you must of course start somewhere.
>

More testing is good, but it also costs more time and resources. If I
pay that 'cost', I also need to get additional 'contributions' out of it
(not only proof of an issue), or the 'efficiency' will not be 'acceptable'.


When "I need more tests than just LTP", I first need to run such a
test, and then also try to send the "test case" to LTP (I guess that
this kind of mail is welcomed by LTP).

LTP is also a way to find kernel issues. Although I will not mainly
depend on it now (but maybe in the future), it is better to become
familiar with it step by step.


LTP (Linux Test Project) is one of the main 'mad' users of the kernel
downstream. The tool chain (GCC/Binutils) is one of the kernel's main
'mad' tools upstream. If we are facing the whole kernel, I suggest
using them. ;-)


Thanks.

> 							Thanx, Paul
> 
>> On 09/25/2013 09:29 AM, Paul E. McKenney wrote:
>>> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
>>>
>>> The for_each_rcu_flavor() loop unconditionally scans all flavors, even
>>> when the first flavor might have some non-lazy callbacks.  Once the
>>> loop has seen a non-lazy callback, further passes through the loop
>>> cannot change the state.  This is not a huge problem, given that there
>>> can be at most three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched),
>>> but this code is on the path to idle, so speeding it up even a small
>>> amount would have some benefit.
>>>
>>> This commit therefore does two things:
>>>
>>> 1.	Rearranges the order of the list of RCU flavors in order to
>>> 	place the most active flavor first in the list.  The most active
>>> 	RCU flavor is RCU-preempt, or, if there is no RCU-preempt,
>>> 	RCU-sched.
>>>
>>> 2.	Reworks the for_each_rcu_flavor() to exit early when the first
>>> 	non-lazy callback is seen, or, in the case where the caller
>>> 	does not care about non-lazy callbacks (RCU_FAST_NO_HZ=n),
>>> 	when the first callback is seen.
>>>
>>> Reported-by: Chen Gang <gang.chen@asianux.com>
>>> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>>> ---
>>>  kernel/rcutree.c | 11 +++++++----
>>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
>>> index e6f2e8f..49464ad 100644
>>> --- a/kernel/rcutree.c
>>> +++ b/kernel/rcutree.c
>>> @@ -2727,10 +2727,13 @@ static int rcu_cpu_has_callbacks(int cpu, bool *all_lazy)
>>>  
>>>  	for_each_rcu_flavor(rsp) {
>>>  		rdp = per_cpu_ptr(rsp->rda, cpu);
>>> -		if (rdp->qlen != rdp->qlen_lazy)
>>> +		if (!rdp->nxtlist)
>>> +			continue;
>>> +		hc = true;
>>> +		if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
>>>  			al = false;
>>> -		if (rdp->nxtlist)
>>> -			hc = true;
>>> +			break;
>>> +		}
>>>  	}
>>>  	if (all_lazy)
>>>  		*all_lazy = al;
>>> @@ -3297,8 +3300,8 @@ void __init rcu_init(void)
>>>  
>>>  	rcu_bootup_announce();
>>>  	rcu_init_geometry();
>>> -	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
>>>  	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
>>> +	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
>>>  	__rcu_init_preempt();
>>>  	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
>>>  
>>>
>>
>>
>> -- 
>> Chen Gang
>>
>> -- 
>> Chen Gang
>>
> 
> 
> 


-- 
Chen Gang

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-26  2:57         ` Chen Gang
@ 2013-09-26 18:33           ` Paul E. McKenney
  2013-09-27  2:29             ` Chen Gang
  0 siblings, 1 reply; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-26 18:33 UTC (permalink / raw)
  To: Chen Gang
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	linux-kernel

On Thu, Sep 26, 2013 at 10:57:39AM +0800, Chen Gang wrote:
> On 09/26/2013 04:16 AM, Paul E. McKenney wrote:
> > On Wed, Sep 25, 2013 at 10:55:30AM +0800, Chen Gang wrote:
> >>
> >> First of all, thank you for all of your work  :-).
> >>
> >> Your suggestion about testing (from our discussion) is also valuable
> >> to me.
> >>
> >> I need to start on LTP in Q4. Following your suggestion, my first goal
> >> in learning and using LTP is not mainly to find kernel issues, but to
> >> test the kernel (and so improve my kernel testing efficiency).
> >>
> >> When I look for issues by reading code, I will keep LTP in mind too
> >> (I will try to find issues that can be tested by LTP).
> > 
> > Doing more testing will be good!  You will probably need more tests
> > than just LTP, but you must of course start somewhere.
> 
> More testing is good, but it also costs more time and resources. If I
> pay that 'cost', I also need to get additional 'contributions' out of it
> (not only proof of an issue), or the 'efficiency' will not be 'acceptable'.
> 
> When "I need more tests than just LTP", I first need to run such a
> test, and then also try to send the "test case" to LTP (I guess that
> this kind of mail is welcomed by LTP).
> 
> LTP is also a way to find kernel issues. Although I will not mainly
> depend on it now (but maybe in the future), it is better to become
> familiar with it step by step.
> 
> LTP (Linux Test Project) is one of the main 'mad' users of the kernel
> downstream. The tool chain (GCC/Binutils) is one of the kernel's main
> 'mad' tools upstream. If we are facing the whole kernel, I suggest
> using them. ;-)

Yep, starting with just LTP is OK.  But by this time next year you
really should be using more than just LTP.

							Thanx, Paul

> >> On 09/25/2013 09:29 AM, Paul E. McKenney wrote:
> >>> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> >>>
> >>> The for_each_rcu_flavor() loop unconditionally scans all flavors, even
> >>> when the first flavor might have some non-lazy callbacks.  Once the
> >>> loop has seen a non-lazy callback, further passes through the loop
> >>> cannot change the state.  This is not a huge problem, given that there
> >>> can be at most three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched),
> >>> but this code is on the path to idle, so speeding it up even a small
> >>> amount would have some benefit.
> >>>
> >>> This commit therefore does two things:
> >>>
> >>> 1.	Rearranges the order of the list of RCU flavors in order to
> >>> 	place the most active flavor first in the list.  The most active
> >>> 	RCU flavor is RCU-preempt, or, if there is no RCU-preempt,
> >>> 	RCU-sched.
> >>>
> >>> 2.	Reworks the for_each_rcu_flavor() to exit early when the first
> >>> 	non-lazy callback is seen, or, in the case where the caller
> >>> 	does not care about non-lazy callbacks (RCU_FAST_NO_HZ=n),
> >>> 	when the first callback is seen.
> >>>
> >>> Reported-by: Chen Gang <gang.chen@asianux.com>
> >>> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> >>> ---
> >>>  kernel/rcutree.c | 11 +++++++----
> >>>  1 file changed, 7 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> >>> index e6f2e8f..49464ad 100644
> >>> --- a/kernel/rcutree.c
> >>> +++ b/kernel/rcutree.c
> >>> @@ -2727,10 +2727,13 @@ static int rcu_cpu_has_callbacks(int cpu, bool *all_lazy)
> >>>  
> >>>  	for_each_rcu_flavor(rsp) {
> >>>  		rdp = per_cpu_ptr(rsp->rda, cpu);
> >>> -		if (rdp->qlen != rdp->qlen_lazy)
> >>> +		if (!rdp->nxtlist)
> >>> +			continue;
> >>> +		hc = true;
> >>> +		if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
> >>>  			al = false;
> >>> -		if (rdp->nxtlist)
> >>> -			hc = true;
> >>> +			break;
> >>> +		}
> >>>  	}
> >>>  	if (all_lazy)
> >>>  		*all_lazy = al;
> >>> @@ -3297,8 +3300,8 @@ void __init rcu_init(void)
> >>>  
> >>>  	rcu_bootup_announce();
> >>>  	rcu_init_geometry();
> >>> -	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
> >>>  	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
> >>> +	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
> >>>  	__rcu_init_preempt();
> >>>  	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
> >>>  
> >>>
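
A minimal userspace sketch of the early-exit loop from the quoted patch,
for readers who want to step through it outside the kernel.  The names
struct flavor_state, has_callbacks_sketch() and the visited counter are
hypothetical stand-ins and not kernel APIs; the array replaces
for_each_rcu_flavor(), and has_callbacks replaces the rdp->nxtlist test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one flavor's per-CPU callback state. */
struct flavor_state {
	bool has_callbacks;	/* models rdp->nxtlist != NULL */
	long qlen;		/* all queued callbacks */
	long qlen_lazy;		/* queued callbacks that are lazy */
};

/*
 * Returns true if any flavor has callbacks.  If all_lazy is non-NULL,
 * *all_lazy reports whether every callback seen was lazy.  If visited
 * is non-NULL, *visited counts the flavors examined (illustration only).
 */
static bool has_callbacks_sketch(const struct flavor_state *f, size_t n,
				 bool *all_lazy, size_t *visited)
{
	bool al = true;
	bool hc = false;
	size_t i;

	assert(f != NULL || n == 0);
	if (visited)
		*visited = 0;
	for (i = 0; i < n; i++) {
		if (visited)
			(*visited)++;
		if (!f[i].has_callbacks)
			continue;	/* nothing queued for this flavor */
		hc = true;
		/* Stop at a non-lazy callback, or at any callback if the
		 * caller does not care about laziness (all_lazy == NULL). */
		if (f[i].qlen != f[i].qlen_lazy || !all_lazy) {
			al = false;
			break;
		}
	}
	if (all_lazy)
		*all_lazy = al;
	return hc;
}
```

With the busiest flavor placed first in the array, the loop in this
sketch usually stops after one element, which is the point of
reordering the rcu_init_one() calls in the patch.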
> >>
> >>
> >> -- 
> >> Chen Gang
> >>
> >> -- 
> >> Chen Gang
> >>
> > 
> > 
> > 
> 
> 
> -- 
> Chen Gang
> 



* Re: [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-26 18:33           ` Paul E. McKenney
@ 2013-09-27  2:29             ` Chen Gang
  2013-09-29  4:24               ` Chen Gang
  0 siblings, 1 reply; 24+ messages in thread
From: Chen Gang @ 2013-09-27  2:29 UTC (permalink / raw)
  To: paulmck
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	linux-kernel

On 09/27/2013 02:33 AM, Paul E. McKenney wrote:
> On Thu, Sep 26, 2013 at 10:57:39AM +0800, Chen Gang wrote:
>> On 09/26/2013 04:16 AM, Paul E. McKenney wrote:
>>> On Wed, Sep 25, 2013 at 10:55:30AM +0800, Chen Gang wrote:
>>>>
>>>> Thank you for your whole work, firstly  :-).
>>>>
>>>> And your suggestion about testing (in our discussion) is also valuable
>>>> to me.
>>>>
>>>> I need start LTP in q4. After referenced your suggestion, my first step
>>>> for using/learning LTP is not mainly for finding kernel issues, but for
>>>> testing kernel (to improve my kernel testing efficiency).
>>>>
>>>> When I want to find issues by reading code, I will consider about LTP
>>>> too (I will try to find issues which can be tested by LTP).
>>>
>>> Doing more testing will be good!  You will probably need more tests
>>> than just LTP, but you must of course start somewhere.
>>
>> Give more testing is good, but also mean more time resources cost. If
>> spend the 'cost', also need get additional 'contributions' (not only
>> prove an issue), or the 'efficiency' can not be 'acceptable'.
>>
>> When "I need more tests than just LTP", firstly I need perform this
>> test, and then, also try to send "test case" to LTP (I guess, these
>> kinds of mails are welcomed by LTP).
>>
>> And LTP is also a way to find kernel issues, although I will not mainly
>> depend on it now (but maybe in future), it is better to familiar with it
>> step by step.
>>
>> LTP (Linux Test Project) is one of main kernel mad user at downstream.
>> Tool chain (GCC/Binutils) is one of kernel main mad tools at upstream.
>> If we face to the whole kernel, suggest to use them. ;-)
> 
> Yep, starting with just LTP is OK.  But if by this time next year you
> really should be using more than just LTP.
> 

Hmm... LTP is the "Linux Test Project".  If I write test cases that are
useful for the issues I find, I guess LTP will also welcome those test
cases.

Beyond testing, "I really should be using more than just LTP" (just
as you said).

e.g.

  Tool Chain: I am trying it now.

    Given my current time resources, I cannot finish allmodconfig on all architectures within this year. :-(
    I am in the middle of solving one gcc issue.  It does not seem very difficult, but at least for now I have no time for it. :-(

  Documents: I am trying it now.

    I am trying to discuss API definition comments, but it seems I am not doing it well. :-(
    I am also trying some trivial patches, but what I have done there does not seem good enough either. :-(
    Communicating and discussing related issues with other members: only this seems to be going not too badly. :-)

  LTP:  I will try in q4 2013.

    In fact, when I first came to our Public Kernel, I already used LTP (and discussed an nfs issue found by an LTP test), which is still pending. :-(
    In my original plan (not announced publicly), I wanted to start LTP in q3 2013, but failed (because of no time resources). :-(


  Bugzilla: plan to try next year.

    I also want to solve some issues that come from Bugzilla (especially issues that no one wants to try).
    But given my current results and time resources, I dare not promise this publicly for next year. :-(

  And I still have some company-internal things to do (which may sometimes be urgent); they consume 20-40% of my time resources. :-(


So, please let us understand each other: every member's time is
expensive, and we have to take care of it.  I also thank all members
who spend their time on my mail and discuss with me.


Thanks.

> 							Thanx, Paul
> 
>>>> On 09/25/2013 09:29 AM, Paul E. McKenney wrote:
>>>>> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
>>>>>
>>>>> The for_each_rcu_flavor() loop unconditionally scans all flavors, even
>>>>> when the first flavor might have some non-lazy callbacks.  Once the
>>>>> loop has seen a non-lazy callback, further passes through the loop
>>>>> cannot change the state.  This is not a huge problem, given that there
>>>>> can be at most three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched),
>>>>> but this code is on the path to idle, so speeding it up even a small
>>>>> amount would have some benefit.
>>>>>
>>>>> This commit therefore does two things:
>>>>>
>>>>> 1.	Rearranges the order of the list of RCU flavors in order to
>>>>> 	place the most active flavor first in the list.  The most active
>>>>> 	RCU flavor is RCU-preempt, or, if there is no RCU-preempt,
>>>>> 	RCU-sched.
>>>>>
>>>>> 2.	Reworks the for_each_rcu_flavor() to exit early when the first
>>>>> 	non-lazy callback is seen, or, in the case where the caller
>>>>> 	does not care about non-lazy callbacks (RCU_FAST_NO_HZ=n),
>>>>> 	when the first callback is seen.
>>>>>
>>>>> Reported-by: Chen Gang <gang.chen@asianux.com>
>>>>> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>>>>> ---
>>>>>  kernel/rcutree.c | 11 +++++++----
>>>>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
>>>>> index e6f2e8f..49464ad 100644
>>>>> --- a/kernel/rcutree.c
>>>>> +++ b/kernel/rcutree.c
>>>>> @@ -2727,10 +2727,13 @@ static int rcu_cpu_has_callbacks(int cpu, bool *all_lazy)
>>>>>  
>>>>>  	for_each_rcu_flavor(rsp) {
>>>>>  		rdp = per_cpu_ptr(rsp->rda, cpu);
>>>>> -		if (rdp->qlen != rdp->qlen_lazy)
>>>>> +		if (!rdp->nxtlist)
>>>>> +			continue;
>>>>> +		hc = true;
>>>>> +		if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
>>>>>  			al = false;
>>>>> -		if (rdp->nxtlist)
>>>>> -			hc = true;
>>>>> +			break;
>>>>> +		}
>>>>>  	}
>>>>>  	if (all_lazy)
>>>>>  		*all_lazy = al;
>>>>> @@ -3297,8 +3300,8 @@ void __init rcu_init(void)
>>>>>  
>>>>>  	rcu_bootup_announce();
>>>>>  	rcu_init_geometry();
>>>>> -	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
>>>>>  	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
>>>>> +	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
>>>>>  	__rcu_init_preempt();
>>>>>  	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
>>>>>  
>>>>>
>>>>
>>>>
>>>> -- 
>>>> Chen Gang
>>>>
>>>> -- 
>>>> Chen Gang
>>>>
>>>
>>>
>>>
>>
>>
>> -- 
>> Chen Gang
>>
> 
> 
> 


-- 
Chen Gang


* Re: [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-27  2:29             ` Chen Gang
@ 2013-09-29  4:24               ` Chen Gang
  2013-09-29 20:23                 ` Paul E. McKenney
  0 siblings, 1 reply; 24+ messages in thread
From: Chen Gang @ 2013-09-29  4:24 UTC (permalink / raw)
  To: paulmck
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	linux-kernel

On 09/27/2013 10:29 AM, Chen Gang wrote:
> On 09/27/2013 02:33 AM, Paul E. McKenney wrote:
>> On Thu, Sep 26, 2013 at 10:57:39AM +0800, Chen Gang wrote:
>>> On 09/26/2013 04:16 AM, Paul E. McKenney wrote:
>>>> On Wed, Sep 25, 2013 at 10:55:30AM +0800, Chen Gang wrote:
>>>>>
>>>>> Thank you for your whole work, firstly  :-).
>>>>>
>>>>> And your suggestion about testing (in our discussion) is also valuable
>>>>> to me.
>>>>>
>>>>> I need start LTP in q4. After referenced your suggestion, my first step
>>>>> for using/learning LTP is not mainly for finding kernel issues, but for
>>>>> testing kernel (to improve my kernel testing efficiency).
>>>>>
>>>>> When I want to find issues by reading code, I will consider about LTP
>>>>> too (I will try to find issues which can be tested by LTP).
>>>>
>>>> Doing more testing will be good!  You will probably need more tests
>>>> than just LTP, but you must of course start somewhere.
>>>
>>> Give more testing is good, but also mean more time resources cost. If
>>> spend the 'cost', also need get additional 'contributions' (not only
>>> prove an issue), or the 'efficiency' can not be 'acceptable'.
>>>
>>> When "I need more tests than just LTP", firstly I need perform this
>>> test, and then, also try to send "test case" to LTP (I guess, these
>>> kinds of mails are welcomed by LTP).
>>>
>>> And LTP is also a way to find kernel issues, although I will not mainly
>>> depend on it now (but maybe in future), it is better to familiar with it
>>> step by step.
>>>
>>> LTP (Linux Test Project) is one of main kernel mad user at downstream.
>>> Tool chain (GCC/Binutils) is one of kernel main mad tools at upstream.
>>> If we face to the whole kernel, suggest to use them. ;-)
>>
>> Yep, starting with just LTP is OK.  But if by this time next year you
>> really should be using more than just LTP.
>>

What I have done is try to make full use of other members' contributions, not try to replace them.


And the reasons why I want/try to 'open' my 'ideas' to the public:

  To get more suggestions and additions from other members.

  To share my ideas, which can let other members provide more contributions (e.g. I would be glad to find other members also trying 'allmodconfig' on all architectures).

  If some members duplicate my work, I can save my current time resources and devote them to other things (which are also based on other members' contributions).


In my opinion:

  "Open and Share" are both important and urgent to everyone, although this may not be noticed directly.  They are like "Air and Water", with which God has blessed everyone.


Thanks.

> 
> Hmm... LTP is "Linux Test Project", if I make some test cases which is
> useful for the issue which I find, I guess, these test cases are also
> welcomed by LTP.
> 
> Except testing, "I really should be using more than just LTP" (just
> like you said).
> 
> e.g.
> 
>   Tool Chain: just I am trying.
> 
>     According to my current time resources, within this year, I can not finish allmodconfig on all architectures. :-(
>     I am just solving one gcc issue, it seems it is not quite difficult, but at least now, I have no time on it. :-(
> 
>   Documents: just I am trying.
> 
>     I am trying to discuss API definition comments, but it seems I am not well done. :-(
>     I am also trying some of trivial patches, neither seems what I have done is well enough. :-(
>     Communicating and discussing related issues with other members. Only this, it seems not quite bad. :-)
> 
>   LTP:  I will try in q4 2013.
> 
>     In fact, when I first comes to our Public Kernel, I already use LTP (and disccus an nfs issue by LTP test), which is still suspending. :-(
>     In my original plan (not declare to outside), I want to start LTP in q3 2013, but fails (because of no time resources). :-(
> 
> 
>   Bugzilla: plan to try in next year.
> 
>     I also want to solve some issues which comes from Bugzilla (especially for some issues which no one wants to try).
>     but according to my current action result and time resources, I can not dare to declare it to outside in next year. :-(
> 
>   And I still have some company internal things to do (which may be urgent, sometimes), it will consume my 20-40% time resources. :-(
> 
> 
> So, please understand with each other: every members' time resource is
> expensive, we have to take care of it. and also, I thank all members
> who can spend their time resources on my mail and disccus with me.
> 
> 
> Thanks.
> 
>> 							Thanx, Paul
>>
>>>>> On 09/25/2013 09:29 AM, Paul E. McKenney wrote:
>>>>>> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
>>>>>>
>>>>>> The for_each_rcu_flavor() loop unconditionally scans all flavors, even
>>>>>> when the first flavor might have some non-lazy callbacks.  Once the
>>>>>> loop has seen a non-lazy callback, further passes through the loop
>>>>>> cannot change the state.  This is not a huge problem, given that there
>>>>>> can be at most three RCU flavors (RCU-bh, RCU-preempt, and RCU-sched),
>>>>>> but this code is on the path to idle, so speeding it up even a small
>>>>>> amount would have some benefit.
>>>>>>
>>>>>> This commit therefore does two things:
>>>>>>
>>>>>> 1.	Rearranges the order of the list of RCU flavors in order to
>>>>>> 	place the most active flavor first in the list.  The most active
>>>>>> 	RCU flavor is RCU-preempt, or, if there is no RCU-preempt,
>>>>>> 	RCU-sched.
>>>>>>
>>>>>> 2.	Reworks the for_each_rcu_flavor() to exit early when the first
>>>>>> 	non-lazy callback is seen, or, in the case where the caller
>>>>>> 	does not care about non-lazy callbacks (RCU_FAST_NO_HZ=n),
>>>>>> 	when the first callback is seen.
>>>>>>
>>>>>> Reported-by: Chen Gang <gang.chen@asianux.com>
>>>>>> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>>>>>> ---
>>>>>>  kernel/rcutree.c | 11 +++++++----
>>>>>>  1 file changed, 7 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
>>>>>> index e6f2e8f..49464ad 100644
>>>>>> --- a/kernel/rcutree.c
>>>>>> +++ b/kernel/rcutree.c
>>>>>> @@ -2727,10 +2727,13 @@ static int rcu_cpu_has_callbacks(int cpu, bool *all_lazy)
>>>>>>  
>>>>>>  	for_each_rcu_flavor(rsp) {
>>>>>>  		rdp = per_cpu_ptr(rsp->rda, cpu);
>>>>>> -		if (rdp->qlen != rdp->qlen_lazy)
>>>>>> +		if (!rdp->nxtlist)
>>>>>> +			continue;
>>>>>> +		hc = true;
>>>>>> +		if (rdp->qlen != rdp->qlen_lazy || !all_lazy) {
>>>>>>  			al = false;
>>>>>> -		if (rdp->nxtlist)
>>>>>> -			hc = true;
>>>>>> +			break;
>>>>>> +		}
>>>>>>  	}
>>>>>>  	if (all_lazy)
>>>>>>  		*all_lazy = al;
>>>>>> @@ -3297,8 +3300,8 @@ void __init rcu_init(void)
>>>>>>  
>>>>>>  	rcu_bootup_announce();
>>>>>>  	rcu_init_geometry();
>>>>>> -	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
>>>>>>  	rcu_init_one(&rcu_bh_state, &rcu_bh_data);
>>>>>> +	rcu_init_one(&rcu_sched_state, &rcu_sched_data);
>>>>>>  	__rcu_init_preempt();
>>>>>>  	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
>>>>>>  
>>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Chen Gang
>>>>>
>>>>> -- 
>>>>> Chen Gang
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> -- 
>>> Chen Gang
>>>
>>
>>
>>
> 
> 


-- 
Chen Gang


* Re: [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-29  4:24               ` Chen Gang
@ 2013-09-29 20:23                 ` Paul E. McKenney
  2013-09-30  1:33                   ` Chen Gang
  0 siblings, 1 reply; 24+ messages in thread
From: Paul E. McKenney @ 2013-09-29 20:23 UTC (permalink / raw)
  To: Chen Gang
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	linux-kernel

On Sun, Sep 29, 2013 at 12:24:52PM +0800, Chen Gang wrote:
> On 09/27/2013 10:29 AM, Chen Gang wrote:
> > On 09/27/2013 02:33 AM, Paul E. McKenney wrote:
> >> On Thu, Sep 26, 2013 at 10:57:39AM +0800, Chen Gang wrote:
> >>> On 09/26/2013 04:16 AM, Paul E. McKenney wrote:
> >>>> On Wed, Sep 25, 2013 at 10:55:30AM +0800, Chen Gang wrote:
> >>>>>
> >>>>> Thank you for your whole work, firstly  :-).
> >>>>>
> >>>>> And your suggestion about testing (in our discussion) is also valuable
> >>>>> to me.
> >>>>>
> >>>>> I need start LTP in q4. After referenced your suggestion, my first step
> >>>>> for using/learning LTP is not mainly for finding kernel issues, but for
> >>>>> testing kernel (to improve my kernel testing efficiency).
> >>>>>
> >>>>> When I want to find issues by reading code, I will consider about LTP
> >>>>> too (I will try to find issues which can be tested by LTP).
> >>>>
> >>>> Doing more testing will be good!  You will probably need more tests
> >>>> than just LTP, but you must of course start somewhere.
> >>>
> >>> Give more testing is good, but also mean more time resources cost. If
> >>> spend the 'cost', also need get additional 'contributions' (not only
> >>> prove an issue), or the 'efficiency' can not be 'acceptable'.
> >>>
> >>> When "I need more tests than just LTP", firstly I need perform this
> >>> test, and then, also try to send "test case" to LTP (I guess, these
> >>> kinds of mails are welcomed by LTP).
> >>>
> >>> And LTP is also a way to find kernel issues, although I will not mainly
> >>> depend on it now (but maybe in future), it is better to familiar with it
> >>> step by step.
> >>>
> >>> LTP (Linux Test Project) is one of main kernel mad user at downstream.
> >>> Tool chain (GCC/Binutils) is one of kernel main mad tools at upstream.
> >>> If we face to the whole kernel, suggest to use them. ;-)
> >>
> >> Yep, starting with just LTP is OK.  But if by this time next year you
> >> really should be using more than just LTP.
> >>
> 
> What I have done is trying to fully use other members contributions, not trying to instead of them.
> 
> 
> And the reason why I want/try to 'open' my 'ideas' to public:
> 
>   get more suggestions, and completions from other members.
> 
>   share my ideas, it can let other members provide more contributions (e.g. I am glad, if find other members also try 'allmodconfig' on all architectures).
> 
>   If some members replicate me, I will save my current time resources and devote them to another things (which also based on other members contributions).
> 
> 
> In my opinion:
> 
>   "Open and Share" are both important and urgent to everyone, although it may not be noticed directly. Like "Air and Water" which God have blessed to everyone.

In a general sense, of course.

In many specific cases, effective sharing can require quite a bit of
preparation.  For but one example, in Dipankar's and my case, it took
about two years of work (mostly Dipankar's work) to get the initial
implementation of RCU accepted into the Linux kernel.

In your case, you can invest an average of three days per accepted
patch if you are to achieve your goal of ten patches accepted per month
(if I remember correctly).  Of course, not every patch will be accepted,
which reduces your per-patch time.  For example, if 50% of your patches
are accepted, you can invest an average of about 1.5 days per patch.
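
The arithmetic above can be written as a small sketch.  It assumes a
30-day month (my reading of "three days per accepted patch" at ten
accepted patches per month); days_per_submitted_patch() is a
hypothetical helper name used only for illustration.

```c
#include <assert.h>

/*
 * Average number of days available per submitted patch, given a target
 * number of accepted patches per month and the fraction of submitted
 * patches that get accepted.
 */
static double days_per_submitted_patch(double days_per_month,
				       double accepted_per_month,
				       double acceptance_rate)
{
	assert(accepted_per_month > 0);
	assert(acceptance_rate > 0 && acceptance_rate <= 1);
	/* submitted = accepted / rate, so days per submitted patch is
	 * days_per_month / submitted = days_per_month * rate / accepted. */
	return days_per_month * acceptance_rate / accepted_per_month;
}
```

Calling it with (30, 10, 1.0) gives the three-day figure, and with
(30, 10, 0.5) the 1.5-day figure.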

Of course, investing in learning about test frameworks or specific
kernel subsystems further reduces your time available per patch.

But if you don't invest in your learning, you will be limited in what
you can effectively contribute.  This might be OK, for all I know.
After all, in the 15 million lines of Linux kernel code, there is
probably a very large number of point-problems waiting to be fixed.

But suppose that you run out of easily found point problems?  Or that
you want to do something more wide-ranging than fixes for point problems?
What can you do?

Here are a few options.  If you think more about it, I am sure that you
can come up with others.

1.	Put the ten-patches-per-month quota aside for a month (or two or
	three or whatever is required and appropriate).  Spend this time
	studying a given kernel subsystem or a given test framework.
	(Which kernel subsystem?  The best candidates would be those
	having bugs but no active maintainer, but which you have the
	hardware needed to adequately test.)

2.	Add a review and/or test component to your monthly quota, so
	that a given patch could be substituted for by some number of
	Reviewed-by or Tested-by flags.  Of course, this gives you
	a chicken-and-egg problem because you cannot adequately review
	or test without some understanding of the subsystem in question.
	(At least not efficiently enough to get enough Tested-by or
	Reviewed-by flags.)

3.	Set aside a fixed amount of time each week (or each month) to
	learn.  This time needs to be a contiguous block of at least
	four hours.  If you focus your learning appropriately, you might
	be able to contribute more deeply to whatever you learned about
	over time.

	For whatever it is worth, just staring at code is for most people
	an inefficient way to learn.  Exercising the code using tools
	like ftrace or userspace scaffolding can help speed up the
	learning.

4.	Your idea here...

Your current approach seems to be to submit patches and hope that the
maintainer takes it upon himself or herself to teach you.  Unfortunately,
as you might have noticed, a given maintainer might not have the time
or energy to take on full responsibility for your education.

							Thanx, Paul



* Re: [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks()
  2013-09-29 20:23                 ` Paul E. McKenney
@ 2013-09-30  1:33                   ` Chen Gang
  0 siblings, 0 replies; 24+ messages in thread
From: Chen Gang @ 2013-09-30  1:33 UTC (permalink / raw)
  To: paulmck
  Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, niv, tglx,
	peterz, rostedt, dhowells, edumazet, darren, fweisbec, sbw,
	linux-kernel

On 09/30/2013 04:23 AM, Paul E. McKenney wrote:
> On Sun, Sep 29, 2013 at 12:24:52PM +0800, Chen Gang wrote:
>> On 09/27/2013 10:29 AM, Chen Gang wrote:
>>> On 09/27/2013 02:33 AM, Paul E. McKenney wrote:
>>>> On Thu, Sep 26, 2013 at 10:57:39AM +0800, Chen Gang wrote:
>>>>> On 09/26/2013 04:16 AM, Paul E. McKenney wrote:
>>>>>> On Wed, Sep 25, 2013 at 10:55:30AM +0800, Chen Gang wrote:
>>>>>>>
>>>>>>> Thank you for your whole work, firstly  :-).
>>>>>>>
>>>>>>> And your suggestion about testing (in our discussion) is also valuable
>>>>>>> to me.
>>>>>>>
>>>>>>> I need start LTP in q4. After referenced your suggestion, my first step
>>>>>>> for using/learning LTP is not mainly for finding kernel issues, but for
>>>>>>> testing kernel (to improve my kernel testing efficiency).
>>>>>>>
>>>>>>> When I want to find issues by reading code, I will consider about LTP
>>>>>>> too (I will try to find issues which can be tested by LTP).
>>>>>>
>>>>>> Doing more testing will be good!  You will probably need more tests
>>>>>> than just LTP, but you must of course start somewhere.
>>>>>
>>>>> Give more testing is good, but also mean more time resources cost. If
>>>>> spend the 'cost', also need get additional 'contributions' (not only
>>>>> prove an issue), or the 'efficiency' can not be 'acceptable'.
>>>>>
>>>>> When "I need more tests than just LTP", firstly I need perform this
>>>>> test, and then, also try to send "test case" to LTP (I guess, these
>>>>> kinds of mails are welcomed by LTP).
>>>>>
>>>>> And LTP is also a way to find kernel issues, although I will not mainly
>>>>> depend on it now (but maybe in future), it is better to familiar with it
>>>>> step by step.
>>>>>
>>>>> LTP (Linux Test Project) is one of main kernel mad user at downstream.
>>>>> Tool chain (GCC/Binutils) is one of kernel main mad tools at upstream.
>>>>> If we face to the whole kernel, suggest to use them. ;-)
>>>>
>>>> Yep, starting with just LTP is OK.  But if by this time next year you
>>>> really should be using more than just LTP.
>>>>
>>
>> What I have done is trying to fully use other members contributions, not trying to instead of them.
>>
>>
>> And the reason why I want/try to 'open' my 'ideas' to public:
>>
>>   get more suggestions, and completions from other members.
>>
>>   share my ideas, it can let other members provide more contributions (e.g. I am glad, if find other members also try 'allmodconfig' on all architectures).
>>
>>   If some members replicate me, I will save my current time resources and devote them to another things (which also based on other members contributions).
>>
>>
>> In my opinion:
>>
>>   "Open and Share" are both important and urgent to everyone, although it may not be noticed directly. Like "Air and Water" which God have blessed to everyone.
> 

Firstly, thank you very much for your detailed reply.


> In a general sense, of course.
> 
> In many specific cases, effective sharing can require quite a bit of
> preparation.  For but one example, in Dipankar's and my case, it took
> about two years of work (mostly Dipankar's work) to get the initial
> implementation of RCU accepted into the Linux kernel.
> 
> In your case, you can invest an average of three days per accepted
> patch if you are to achieve your goal of ten patches accepted per month
> (if I remember correctly).  Of course, not every patch will be accepted,
> which reduces your per-patch time.  For example, if 50% of your patches
> are accepted, you can invest an average of about 1.5 days per patch.
> 
> Of course, investing in learning about test frameworks or specific
> kernel subsystems further reduces your time available per patch.
> 
> But if you don't invest in your learning, you will be limited in what
> you can effectively contribute.  This might be OK, for all I know.
> After all, in the 15 million lines of Linux kernel code, there is
> probably a very large number of point-problems waiting to be fixed.
> 
> But suppose that you run out of easily found point problems?  Or that
> you want to do something more wide-ranging than fixes for point problems?
> What can you do?
> 
> Here are a few options.  If you think more about it, I am sure that you
> can come up with others.
> 
> 1.	Put the ten-patches-per-month quota aside for a month (or two or
> 	three or whatever is required and appropriate).  Spend this time
> 	studying a given kernel subsystem or a given test framework.
> 	(Which kernel subsystem?  The best candidates would be those
> 	having bugs but no active maintainer, but which you have the
> 	hardware needed to adequately test.)
> 
> 2.	Add a review and/or test component to your monthly quota, so
> 	that a given patch could be substituted for by some number of
> 	Reviewed-by or Tested-by flags.  Of course, this gives your
> 	a chicken-and-egg problem because you cannot adequately review
> 	or test without some understanding of the subsystem in question.
> 	(At least not efficiently enough to get enough Tested-by or
> 	Reviewed-by flags.)
> 
> 3.	Set aside a fixed amount of time each week (or each month) to
> 	learn.  This time needs to be a contiguous block of at least
> 	four hours.  If you focus your learning appropriately, you might
> 	be able to contribute more deeply to whatever you learned about
> 	over time.
> 
> 	For whatever it is worth, just staring at code is for most people
> 	an inefficient way to learn.  Exercising the code using tools
> 	like ftrace or userspace scaffolding can help speed up the
> 	learning.
> 

At least for me, what you said is valuable.

In fact, I am already trying this approach for the Tool Chain (GCC/Binutils),
using the Linux Kernel as the test object for the Tool Chain. ;-)


> 4.	Your idea here...
> 

The 'way' depends on your goal.


For the Tool Chain and LTP, I only want to use them for the kernel, so I need
to become familiar with the details of their features that relate to the
Linux Kernel (in fact, GCC is not easy for me, either).

But for the Linux Kernel, I want to face the whole kernel (it is my main
goal), so I start from the interfaces: the kernel's upstream (e.g. Tool Chain),
the kernel's downstream (e.g. LTP), and reading code/docs.

So what I have done on the Linux kernel is only a start; it can be
followed by many, many next steps.


> Your current approach seems to be to submit patches and hope that the
> maintainer takes it upon himself or herself to teach you.  Unfortunately,
> as you might have noticed, a given maintainer might not have the time
> or energy to take on full responsibility for your education.
> 

In my opinion, teaching and educating are not very efficient: I did not
graduate from university (no bachelor's degree, and not a computer science
major, either), although I come from Zhejiang University in China.

When I send a patch to the related maintainer (or integrator), I do not
intend for them to 'teach' me (that is not very efficient); I only want
to work together in a way that improves overall efficiency.

  e.g. if the maintainers already know about an issue, we do not need to waste time on it again.
  e.g. if there is no related maintainer, I should try a fix and let the integrator check it and provide suggestions on what I have done.



> 							Thanx, Paul
> 
> 
> 

Thanks.
-- 
Chen Gang


end of thread, other threads:[~2013-09-30  1:34 UTC | newest]

Thread overview: 24+ messages
2013-09-25  1:27 [PATCH tip/core/rcu 0/11] Fixes for 3.13 Paul E. McKenney
2013-09-25  1:29 ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 02/11] rcu: Use proper cpp macro for ->gp_flags Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 03/11] rcu: Convert local functions to static Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 04/11] rcu: Fix dubious "if" condition in __call_rcu_nocb_enqueue() Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 05/11] rcu: Make list_splice_init_rcu() account for RCU readers Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 06/11] rcu: Replace __get_cpu_var() uses Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 07/11] rcu: Silence unused-variable warnings Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 08/11] rcu: Micro-optimize rcu_cpu_has_callbacks() Paul E. McKenney
2013-09-25  2:55     ` Chen Gang
2013-09-25 20:16       ` Paul E. McKenney
2013-09-26  2:57         ` Chen Gang
2013-09-26 18:33           ` Paul E. McKenney
2013-09-27  2:29             ` Chen Gang
2013-09-29  4:24               ` Chen Gang
2013-09-29 20:23                 ` Paul E. McKenney
2013-09-30  1:33                   ` Chen Gang
2013-09-25  1:29   ` [PATCH tip/core/rcu 09/11] rcu: Reject memory-order-induced stall-warning false positives Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 10/11] rcu: Have rcutiny tracepoints use tracepoint_string() Paul E. McKenney
2013-09-25  1:29   ` [PATCH tip/core/rcu 11/11] rcu: Fix CONFIG_RCU_NOCB_CPU_ALL panic on machines with sparse CPU mask Paul E. McKenney
2013-09-25  4:10   ` [PATCH tip/core/rcu 01/11] mm: Place preemption point in do_mlockall() loop Andrew Morton
2013-09-25 13:48     ` Paul E. McKenney
2013-09-25 19:35       ` Andrew Morton
2013-09-25 20:18         ` Paul E. McKenney
