* There is a Tasks RCU stall warning
@ 2017-04-11 21:18 Paul E. McKenney
  2017-04-11 21:21 ` Steven Rostedt
  2017-04-11 21:31 ` Steven Rostedt
  0 siblings, 2 replies; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-11 21:18 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel

Hello, Steve,

Network connectivity issues...  :-/

There is already a Tasks RCU stall warning, but it waits for
ten -minutes- before complaining.  You can change this with the
rcupdate.rcu_task_stall_timeout kernel boot parameter, if you wish.
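For example, something like this (a sketch: the parameter is in jiffies in
kernels of this era, and CONFIG_HZ=250 is an assumption; check your .config):

```shell
# Sketch: compute a one-minute Tasks RCU stall timeout in jiffies for
# the rcupdate.rcu_task_stall_timeout boot parameter.
# Assumption: CONFIG_HZ=250; check your kernel .config for the real value.
HZ=250
STALL_SECONDS=60
echo "rcupdate.rcu_task_stall_timeout=$((HZ * STALL_SECONDS))"
```

The printed value then goes on the kernel command line.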

Or did you wait longer than ten minutes?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-11 21:18 There is a Tasks RCU stall warning Paul E. McKenney
@ 2017-04-11 21:21 ` Steven Rostedt
  2017-04-11 21:32   ` Paul E. McKenney
  2017-04-11 21:31 ` Steven Rostedt
  1 sibling, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-11 21:21 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Tue, 11 Apr 2017 14:18:02 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> Hello, Steve,
> 
> Network connectivity issues...  :-/

Ugh.

> 
> There is already a Tasks RCU stall warning, but it waits for
> ten -minutes- before complaining.  You can change this with the
> rcupdate.rcu_task_stall_timeout kernel boot parameter, if you wish.
> 
> Or did you wait longer than ten minutes?

Hmm, I may have waited longer. But I can try again and shorten the
timeout.

-- Steve


* Re: There is a Tasks RCU stall warning
  2017-04-11 21:18 There is a Tasks RCU stall warning Paul E. McKenney
  2017-04-11 21:21 ` Steven Rostedt
@ 2017-04-11 21:31 ` Steven Rostedt
  2017-04-11 21:34   ` Steven Rostedt
  1 sibling, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-11 21:31 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Tue, 11 Apr 2017 14:18:02 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> Hello, Steve,
> 
> Network connectivity issues...  :-/
> 
> There is already a Tasks RCU stall warning, but it waits for
> ten -minutes- before complaining.  You can change this with the
> rcupdate.rcu_task_stall_timeout kernel boot parameter, if you wish.
> 
> Or did you wait longer than ten minutes?
> 

Remember when I said I had a bunch of debugging turned on in this
kernel? Well, I also included my ftrace trace_event benchmark code,
which appears to be what is triggering this.

[  205.159860] INFO: rcu_tasks detected stalls on tasks:
[  205.165042] ffff8800c40450c0: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
[  205.171627] event_benchmark R  running task    30224  1113      2 0x10000000
[  205.178829] Call Trace:
[  205.181379]  __schedule+0x574/0x1210
[  205.185061]  ? __schedule+0x574/0x1210
[  205.188917]  ? mark_held_locks+0x23/0xc0
[  205.192944]  ? mark_held_locks+0x23/0xc0
[  205.196964]  ? retint_kernel+0x2d/0x2d
[  205.200831]  ? retint_kernel+0x2d/0x2d
[  205.204679]  ? trace_hardirqs_on_caller+0x182/0x280
[  205.209660]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  205.214380]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  205.219097]  irq_exit+0x91/0x100
[  205.222418]  ? retint_kernel+0x2d/0x2d
[  205.226263]  ? retint_kernel+0x2d/0x2d
[  205.230113]  ? kthread_should_stop+0x3d/0x60
[  205.234486]  ? __asan_load8+0x11/0x70
[  205.238245]  ? ring_buffer_record_is_on+0x11/0x20
[  205.243052]  ? tracing_is_on+0x15/0x30
[  205.246895]  ? benchmark_event_kthread+0x4f/0x3c0
[  205.251708]  ? kthread+0x178/0x1d0
[  205.255200]  ? trace_benchmark_reg+0x80/0x80
[  205.259563]  ? kthread_create_on_node+0xa0/0xa0
[  205.264205]  ? ret_from_fork+0x2e/0x40

The thread gets created when I enable the benchmark tracepoint. It just
so happens that my test enables *all* tracepoints, which would of
course include this one as well.

I'll have to look at this code to see why it is getting missed.

-- Steve


* Re: There is a Tasks RCU stall warning
  2017-04-11 21:21 ` Steven Rostedt
@ 2017-04-11 21:32   ` Paul E. McKenney
  0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-11 21:32 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Tue, Apr 11, 2017 at 05:21:07PM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2017 14:18:02 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > Hello, Steve,
> > 
> > Network connectivity issues...  :-/
> 
> Ugh.
> 
> > There is already a Tasks RCU stall warning, but it waits for
> > ten -minutes- before complaining.  You can change this with the
> > rcupdate.rcu_task_stall_timeout kernel boot parameter, if you wish.
> > 
> > Or did you wait longer than ten minutes?
> 
> Hmm, I may have waited longer. But I can try again and shorten the
> timeout.

Failing that, you might try carefully placed printk() calls in
rcu_tasks_kthread().  The loop spinning on a non-empty rcu_tasks_holdouts
list should iterate at most once per second, so you can get away with
quite a bit of instrumentation.  Unlike much of the rest of RCU.  ;-)
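For instance (the loop and field names follow the kernel, but this exact
instrumentation is only a sketch, not actual kernel source):

```c
/* Hypothetical debug printk in rcu_tasks_kthread()'s holdout-scan loop
 * in kernel/rcu/update.c; placement and message are illustrative. */
list_for_each_entry_safe(t, t1, &rcu_tasks_holdouts,
			 rcu_tasks_holdout_list) {
	pr_info("rcu_tasks: holdout %s/%d nvcsw %lu/%lu holdout %d\n",
		t->comm, t->pid, t->rcu_tasks_nvcsw, t->nvcsw,
		READ_ONCE(t->rcu_tasks_holdout));
	check_holdout_task(t, needreport, &firstreport);
}
```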

							Thanx, Paul


* Re: There is a Tasks RCU stall warning
  2017-04-11 21:31 ` Steven Rostedt
@ 2017-04-11 21:34   ` Steven Rostedt
  2017-04-11 21:39     ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-11 21:34 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Tue, 11 Apr 2017 17:31:33 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> The thread gets created when I enable the benchmark tracepoint. It just
> so happens that my test enables *all* tracepoints, which would of
> course include this one as well.
> 
> I'll have to look at this code to see why it is getting missed.

Yep, this thread never goes to sleep, but it does call cond_resched()
periodically. This keeps Tasks RCU grace periods from finishing.

Should I add a direct "schedule()" in there instead of a
cond_resched(), or do you think rcu_tasks should have cond_resched() be
a quiescent state as well?

-- Steve


* Re: There is a Tasks RCU stall warning
  2017-04-11 21:34   ` Steven Rostedt
@ 2017-04-11 21:39     ` Steven Rostedt
  2017-04-11 21:44       ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-11 21:39 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Tue, 11 Apr 2017 17:34:47 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 11 Apr 2017 17:31:33 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > The thread gets created when I enable the benchmark tracepoint. It just
> > so happens that my test enables *all* tracepoints, which would of
> > course include this one as well.
> > 
> > I'll have to look at this code to see why it is getting missed.  
> 
> Yep, this thread never goes to sleep, but will call cond_resched()
> periodically. This keeps rcu_tasks() from finishing.
> 
> Should I add a direct "schedule()" in there instead of a
> cond_resched(), or do you think rcu_tasks should have cond_resched() be
> a quiescent state as well?

Actually, I believe this found a bug in my trace_event benchmark
thread. On a preempt kernel, cond_resched() is a nop, since the task is
expected simply to be preempted instead. Calling schedule() directly
should fix everything; I shouldn't depend on cond_resched() here.

-- Steve


* Re: There is a Tasks RCU stall warning
  2017-04-11 21:39     ` Steven Rostedt
@ 2017-04-11 21:44       ` Paul E. McKenney
  2017-04-11 21:49         ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-11 21:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Tue, Apr 11, 2017 at 05:39:00PM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2017 17:34:47 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Tue, 11 Apr 2017 17:31:33 -0400
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > The thread gets created when I enable the benchmark tracepoint. It just
> > > so happens that my test enables *all* tracepoints, which would of
> > > course include this one as well.
> > > 
> > > I'll have to look at this code to see why it is getting missed.  
> > 
> > Yep, this thread never goes to sleep, but will call cond_resched()
> > periodically. This keeps rcu_tasks() from finishing.
> > 
> > Should I add a direct "schedule()" in there instead of a
> > cond_resched(), or do you think rcu_tasks should have cond_resched() be
> > a quiescent state as well?
> 
> Actually, I believe this found a bug in my trace_event benchmark
> thread. On a preempt kernel, cond_resched() is a nop and expects to
> only be preempted. Calling schedule() directly should fix everything. I
> shouldn't depend on cond_resched() here.

Works for me!

Hopefully it will also work for your computer.  :-)

And whew!  Glad to see that the stall warnings worked!

							Thanx, Paul


* Re: There is a Tasks RCU stall warning
  2017-04-11 21:44       ` Paul E. McKenney
@ 2017-04-11 21:49         ` Steven Rostedt
  2017-04-11 21:56           ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-11 21:49 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Tue, 11 Apr 2017 14:44:43 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

 
> Works for me!
> 
> Hopefully it will also work for your computer.  :-)
> 
> And whew!  Glad to see that the stall warnings worked!

Ah! but I think I found a bug in synchronize_rcu_tasks()!

Calling schedule() isn't good enough. For rcu_tasks to make progress,
the task needs to actually schedule out. With my updated code, I just
triggered:

[  196.276868] INFO: rcu_tasks detected stalls on tasks:
[  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
[  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
[  196.302746] Call Trace:
[  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
[  196.314453]  __schedule+0x222/0x1210
[  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
[  196.327616]  ? preempt_count_add+0xb7/0xf0
[  196.334174]  ? __asan_store8+0x15/0x70
[  196.340384]  schedule+0x57/0xe0
[  196.345888]  benchmark_event_kthread+0x2e/0x3c0
[  196.352823]  kthread+0x178/0x1d0
[  196.358411]  ? trace_benchmark_reg+0x80/0x80
[  196.365073]  ? kthread_create_on_node+0xa0/0xa0
[  196.371999]  ret_from_fork+0x2e/0x40


And here my benchmark called schedule(), but nothing scheduled it out,
and it still fails on rcu_tasks.

-- Steve


* Re: There is a Tasks RCU stall warning
  2017-04-11 21:49         ` Steven Rostedt
@ 2017-04-11 21:56           ` Paul E. McKenney
  2017-04-11 22:15             ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-11 21:56 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2017 14:44:43 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> 
> > Works for me!
> > 
> > Hopefully it will also work for your computer.  :-)
> > 
> > And whew!  Glad to see that the stall warnings worked!
> 
> Ah! but I think I found a bug in synchronize_rcu_tasks()!
> 
> Calling schedule isn't good enough. For rcu_tasks to continue, the task
> needs to schedule out. With my updated code, I just triggered:
> 
> [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> [  196.302746] Call Trace:
> [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> [  196.314453]  __schedule+0x222/0x1210
> [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> [  196.327616]  ? preempt_count_add+0xb7/0xf0
> [  196.334174]  ? __asan_store8+0x15/0x70
> [  196.340384]  schedule+0x57/0xe0
> [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> [  196.352823]  kthread+0x178/0x1d0
> [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> [  196.371999]  ret_from_fork+0x2e/0x40
> 
> 
> And here my benchmark called schedule(), but nothing scheduled it out,
> and it still fails on rcu_tasks.

Good point!

Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
called for preemption as well as for voluntary context switches.

How about cond_resched_rcu_qs()?  Does that cover it?

							Thanx, Paul


* Re: There is a Tasks RCU stall warning
  2017-04-11 21:56           ` Paul E. McKenney
@ 2017-04-11 22:15             ` Steven Rostedt
  2017-04-11 23:01               ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-11 22:15 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Tue, 11 Apr 2017 14:56:56 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> > On Tue, 11 Apr 2017 14:44:43 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> >   
> > > Works for me!
> > > 
> > > Hopefully it will also work for your computer.  :-)
> > > 
> > > And whew!  Glad to see that the stall warnings worked!  
> > 
> > Ah! but I think I found a bug in synchronize_rcu_tasks()!
> > 
> > Calling schedule isn't good enough. For rcu_tasks to continue, the task
> > needs to schedule out. With my updated code, I just triggered:
> > 
> > [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> > [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> > [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> > [  196.302746] Call Trace:
> > [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> > [  196.314453]  __schedule+0x222/0x1210
> > [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> > [  196.327616]  ? preempt_count_add+0xb7/0xf0
> > [  196.334174]  ? __asan_store8+0x15/0x70
> > [  196.340384]  schedule+0x57/0xe0
> > [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> > [  196.352823]  kthread+0x178/0x1d0
> > [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> > [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> > [  196.371999]  ret_from_fork+0x2e/0x40
> > 
> > 
> > And here my benchmark called schedule(), but nothing scheduled it out,
> > and it still fails on rcu_tasks.  
> 
> Good point!
> 
> Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
> called for preemption as well as for voluntary context switches.

If you pass in the "preempt" variable, it would work. It's false when
__schedule() was called directly via schedule(), and true when called
by one of the preempt_schedule() functions.

-- Steve


> 
> How about cond_resched_rcu_qs()?  Does that cover it?
> 
> 							Thanx, Paul


* Re: There is a Tasks RCU stall warning
  2017-04-11 22:15             ` Steven Rostedt
@ 2017-04-11 23:01               ` Paul E. McKenney
  2017-04-11 23:04                 ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-11 23:01 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Tue, Apr 11, 2017 at 06:15:30PM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2017 14:56:56 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> > > On Tue, 11 Apr 2017 14:44:43 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > >   
> > > > Works for me!
> > > > 
> > > > Hopefully it will also work for your computer.  :-)
> > > > 
> > > > And whew!  Glad to see that the stall warnings worked!  
> > > 
> > > Ah! but I think I found a bug in synchronize_rcu_tasks()!
> > > 
> > > Calling schedule isn't good enough. For rcu_tasks to continue, the task
> > > needs to schedule out. With my updated code, I just triggered:
> > > 
> > > [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> > > [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> > > [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> > > [  196.302746] Call Trace:
> > > [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> > > [  196.314453]  __schedule+0x222/0x1210
> > > [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> > > [  196.327616]  ? preempt_count_add+0xb7/0xf0
> > > [  196.334174]  ? __asan_store8+0x15/0x70
> > > [  196.340384]  schedule+0x57/0xe0
> > > [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> > > [  196.352823]  kthread+0x178/0x1d0
> > > [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> > > [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> > > [  196.371999]  ret_from_fork+0x2e/0x40
> > > 
> > > 
> > > And here my benchmark called schedule(), but nothing scheduled it out,
> > > and it still fails on rcu_tasks.  
> > 
> > Good point!
> > 
> > Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
> > called for preemption as well as for voluntary context switches.
> 
> If you pass in the "preempt" variable, it would work. It's false when
> __schedule() was called by a "schedule()" directly, and true when
> called by one of the preempt_schedule() functions.

Like this?  (Untested, but builds at least some of the time.)

							Thanx, Paul

------------------------------------------------------------------------

commit 05c3e601dd8aa0e1030e6ed997b6d5b936e7e1c1
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Tue Apr 11 15:50:41 2017 -0700

    rcu: Make non-preemptive schedule be Tasks RCU quiescent state
    
    Currently, a call to schedule() acts as a Tasks RCU quiescent state
    only if a context switch actually takes place.  However, just the
    call to schedule() guarantees that the calling task has moved off of
    whatever tracing trampoline it might have been on previously.
    This commit therefore plumbs schedule()'s "preempt" parameter into
    rcu_note_context_switch(), which then records the Tasks RCU quiescent
    state, but only if this call to schedule() was -not- due to a preemption.
    
    Suggested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e6146d0074f8..f531b29207da 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -363,15 +363,20 @@ static inline void rcu_init_nohz(void)
 #ifdef CONFIG_TASKS_RCU
 #define TASKS_RCU(x) x
 extern struct srcu_struct tasks_rcu_exit_srcu;
-#define rcu_note_voluntary_context_switch(t) \
+#define rcu_note_voluntary_context_switch_lite(t) \
 	do { \
-		rcu_all_qs(); \
 		if (READ_ONCE((t)->rcu_tasks_holdout)) \
 			WRITE_ONCE((t)->rcu_tasks_holdout, false); \
 	} while (0)
+#define rcu_note_voluntary_context_switch(t) \
+	do { \
+		rcu_all_qs(); \
+		rcu_note_voluntary_context_switch_lite(t); \
+	} while (0)
 #else /* #ifdef CONFIG_TASKS_RCU */
 #define TASKS_RCU(x) do { } while (0)
-#define rcu_note_voluntary_context_switch(t)	rcu_all_qs()
+#define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
+#define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
 /**
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 5219be250f00..21ea52df651d 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -92,10 +92,11 @@ static inline void kfree_call_rcu(struct rcu_head *head,
 	call_rcu(head, func);
 }
 
-static inline void rcu_note_context_switch(void)
-{
-	rcu_sched_qs();
-}
+#define rcu_note_context_switch(preempt) \
+	do { \
+		rcu_sched_qs(); \
+		rcu_note_voluntary_context_switch_lite(current); \
+	} while (0)
 
 /*
  * Take advantage of the fact that there is only one CPU, which
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 63a4e4cf40a5..04a2caf3ab8b 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -30,7 +30,7 @@
 #ifndef __LINUX_RCUTREE_H
 #define __LINUX_RCUTREE_H
 
-void rcu_note_context_switch(void);
+void rcu_note_context_switch(bool preempt);
 int rcu_needs_cpu(u64 basem, u64 *nextevt);
 void rcu_cpu_stall_reset(void);
 
@@ -41,7 +41,7 @@ void rcu_cpu_stall_reset(void);
  */
 static inline void rcu_virt_note_context_switch(int cpu)
 {
-	rcu_note_context_switch();
+	rcu_note_context_switch(false);
 }
 
 void synchronize_rcu_bh(void);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f13fc4ab2f0d..afeee70f7762 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -458,14 +458,16 @@ static void rcu_momentary_dyntick_idle(void)
  * and requires special handling for preemptible RCU.
  * The caller must have disabled interrupts.
  */
-void rcu_note_context_switch(void)
+void rcu_note_context_switch(bool preempt)
 {
 	struct rcu_state *rsp;
 
 	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	trace_rcu_utilization(TPS("Start context switch"));
 	rcu_sched_qs();
-	rcu_preempt_note_context_switch();
+	if (!preempt)
+		rcu_preempt_note_context_switch();
+	rcu_note_voluntary_context_switch_lite(current);
 	/* Load rcu_urgent_qs before other flags. */
 	if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)))
 		goto out;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9c38e6be4f3e..653ea8cf89e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3378,7 +3378,7 @@ static void __sched notrace __schedule(bool preempt)
 		hrtick_clear(rq);
 
 	local_irq_disable();
-	rcu_note_context_switch();
+	rcu_note_context_switch(preempt);
 
 	/*
 	 * Make sure that signal_pending_state()->signal_pending() below


* Re: There is a Tasks RCU stall warning
  2017-04-11 23:01               ` Paul E. McKenney
@ 2017-04-11 23:04                 ` Paul E. McKenney
  2017-04-11 23:11                   ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-11 23:04 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Tue, Apr 11, 2017 at 04:01:54PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 11, 2017 at 06:15:30PM -0400, Steven Rostedt wrote:
> > On Tue, 11 Apr 2017 14:56:56 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> > > > On Tue, 11 Apr 2017 14:44:43 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > 
> > > >   
> > > > > Works for me!
> > > > > 
> > > > > Hopefully it will also work for your computer.  :-)
> > > > > 
> > > > > And whew!  Glad to see that the stall warnings worked!  
> > > > 
> > > > Ah! but I think I found a bug in synchronize_rcu_tasks()!
> > > > 
> > > > Calling schedule isn't good enough. For rcu_tasks to continue, the task
> > > > needs to schedule out. With my updated code, I just triggered:
> > > > 
> > > > [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> > > > [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> > > > [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> > > > [  196.302746] Call Trace:
> > > > [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> > > > [  196.314453]  __schedule+0x222/0x1210
> > > > [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> > > > [  196.327616]  ? preempt_count_add+0xb7/0xf0
> > > > [  196.334174]  ? __asan_store8+0x15/0x70
> > > > [  196.340384]  schedule+0x57/0xe0
> > > > [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> > > > [  196.352823]  kthread+0x178/0x1d0
> > > > [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> > > > [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> > > > [  196.371999]  ret_from_fork+0x2e/0x40
> > > > 
> > > > 
> > > > And here my benchmark called schedule(), but nothing scheduled it out,
> > > > and it still fails on rcu_tasks.  
> > > 
> > > Good point!
> > > 
> > > Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
> > > called for preemption as well as for voluntary context switches.
> > 
> > If you pass in the "preempt" variable, it would work. It's false when
> > __schedule() was called by a "schedule()" directly, and true when
> > called by one of the preempt_schedule() functions.
> 
> Like this?  (Untested, but builds at least some of the time.)

Not like that...  :-/  Update on its way.

							Thanx, Paul

> ------------------------------------------------------------------------
> 
> commit 05c3e601dd8aa0e1030e6ed997b6d5b936e7e1c1
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Tue Apr 11 15:50:41 2017 -0700
> 
>     rcu: Make non-preemptive schedule be Tasks RCU quiescent state
>     
>     Currently, a call to schedule() acts as a Tasks RCU quiescent state
>     only if a context switch actually takes place.  However, just the
>     call to schedule() guarantees that the calling task has moved off of
>     whatever tracing trampoline it might have been on previously.
>     This commit therefore plumbs schedule()'s "preempt" parameter into
>     rcu_note_context_switch(), which then records the Tasks RCU quiescent
>     state, but only if this call to schedule() was -not- due to a preemption.
>     
>     Suggested-by: Steven Rostedt <rostedt@goodmis.org>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index e6146d0074f8..f531b29207da 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -363,15 +363,20 @@ static inline void rcu_init_nohz(void)
>  #ifdef CONFIG_TASKS_RCU
>  #define TASKS_RCU(x) x
>  extern struct srcu_struct tasks_rcu_exit_srcu;
> -#define rcu_note_voluntary_context_switch(t) \
> +#define rcu_note_voluntary_context_switch_lite(t) \
>  	do { \
> -		rcu_all_qs(); \
>  		if (READ_ONCE((t)->rcu_tasks_holdout)) \
>  			WRITE_ONCE((t)->rcu_tasks_holdout, false); \
>  	} while (0)
> +#define rcu_note_voluntary_context_switch(t) \
> +	do { \
> +		rcu_all_qs(); \
> +		rcu_note_voluntary_context_switch_lite(t); \
> +	} while (0)
>  #else /* #ifdef CONFIG_TASKS_RCU */
>  #define TASKS_RCU(x) do { } while (0)
> -#define rcu_note_voluntary_context_switch(t)	rcu_all_qs()
> +#define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
> +#define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
>  #endif /* #else #ifdef CONFIG_TASKS_RCU */
>  
>  /**
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index 5219be250f00..21ea52df651d 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -92,10 +92,11 @@ static inline void kfree_call_rcu(struct rcu_head *head,
>  	call_rcu(head, func);
>  }
>  
> -static inline void rcu_note_context_switch(void)
> -{
> -	rcu_sched_qs();
> -}
> +#define rcu_note_context_switch(preempt) \
> +	do { \
> +		rcu_sched_qs(); \
> +		rcu_note_voluntary_context_switch_lite(current); \
> +	} while (0)
>  
>  /*
>   * Take advantage of the fact that there is only one CPU, which
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 63a4e4cf40a5..04a2caf3ab8b 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -30,7 +30,7 @@
>  #ifndef __LINUX_RCUTREE_H
>  #define __LINUX_RCUTREE_H
>  
> -void rcu_note_context_switch(void);
> +void rcu_note_context_switch(bool preempt);
>  int rcu_needs_cpu(u64 basem, u64 *nextevt);
>  void rcu_cpu_stall_reset(void);
>  
> @@ -41,7 +41,7 @@ void rcu_cpu_stall_reset(void);
>   */
>  static inline void rcu_virt_note_context_switch(int cpu)
>  {
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(false);
>  }
>  
>  void synchronize_rcu_bh(void);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f13fc4ab2f0d..afeee70f7762 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -458,14 +458,16 @@ static void rcu_momentary_dyntick_idle(void)
>   * and requires special handling for preemptible RCU.
>   * The caller must have disabled interrupts.
>   */
> -void rcu_note_context_switch(void)
> +void rcu_note_context_switch(bool preempt)
>  {
>  	struct rcu_state *rsp;
>  
>  	barrier(); /* Avoid RCU read-side critical sections leaking down. */
>  	trace_rcu_utilization(TPS("Start context switch"));
>  	rcu_sched_qs();
> -	rcu_preempt_note_context_switch();
> +	if (!preempt)
> +		rcu_preempt_note_context_switch();
> +	rcu_note_voluntary_context_switch_lite(current);
>  	/* Load rcu_urgent_qs before other flags. */
>  	if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)))
>  		goto out;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9c38e6be4f3e..653ea8cf89e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3378,7 +3378,7 @@ static void __sched notrace __schedule(bool preempt)
>  		hrtick_clear(rq);
>  
>  	local_irq_disable();
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(preempt);
>  
>  	/*
>  	 * Make sure that signal_pending_state()->signal_pending() below


* Re: There is a Tasks RCU stall warning
  2017-04-11 23:04                 ` Paul E. McKenney
@ 2017-04-11 23:11                   ` Paul E. McKenney
  2017-04-12  3:23                     ` Paul E. McKenney
  2017-04-12 14:48                     ` Paul E. McKenney
  0 siblings, 2 replies; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-11 23:11 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Tue, Apr 11, 2017 at 04:04:45PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 11, 2017 at 04:01:54PM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 11, 2017 at 06:15:30PM -0400, Steven Rostedt wrote:
> > > On Tue, 11 Apr 2017 14:56:56 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> > > > > On Tue, 11 Apr 2017 14:44:43 -0700
> > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > 
> > > > >   
> > > > > > Works for me!
> > > > > > 
> > > > > > Hopefully it will also work for your computer.  :-)
> > > > > > 
> > > > > > And whew!  Glad to see that the stall warnings worked!  
> > > > > 
> > > > > Ah! but I think I found a bug in synchronize_rcu_tasks()!
> > > > > 
> > > > > Calling schedule isn't good enough. For rcu_tasks to continue, the task
> > > > > needs to schedule out. With my updated code, I just triggered:
> > > > > 
> > > > > [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> > > > > [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> > > > > [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> > > > > [  196.302746] Call Trace:
> > > > > [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> > > > > [  196.314453]  __schedule+0x222/0x1210
> > > > > [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> > > > > [  196.327616]  ? preempt_count_add+0xb7/0xf0
> > > > > [  196.334174]  ? __asan_store8+0x15/0x70
> > > > > [  196.340384]  schedule+0x57/0xe0
> > > > > [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> > > > > [  196.352823]  kthread+0x178/0x1d0
> > > > > [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> > > > > [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> > > > > [  196.371999]  ret_from_fork+0x2e/0x40
> > > > > 
> > > > > 
> > > > > And here my benchmark called schedule(), but nothing scheduled it out,
> > > > > and it still fails on rcu_tasks.  
> > > > 
> > > > Good point!
> > > > 
> > > > Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
> > > > called for preemption as well as for voluntary context switches.
> > > 
> > > If you pass in the "preempt" variable, it would work. It's false when
> > > __schedule() was called by a "schedule()" directly, and true when
> > > called by one of the preempt_schedule() functions.
> > 
> > Like this?  (Untested, but builds at least some of the time.)
> 
> Not like that...  :-/  Update on its way.

Perhaps more like this.  Started rcutorture on it, will see how it goes.

							Thanx, Paul

------------------------------------------------------------------------

commit c9653896be9b79b7227e8b97710ad6475fc72dc1
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Tue Apr 11 15:50:41 2017 -0700

    rcu: Make non-preemptive schedule be Tasks RCU quiescent state
    
    Currently, a call to schedule() acts as a Tasks RCU quiescent state
    only if a context switch actually takes place.  However, just the
    call to schedule() guarantees that the calling task has moved off of
    whatever tracing trampoline that it might have been on previously.
    This commit therefore plumbs schedule()'s "preempt" parameter into
    rcu_note_context_switch(), which then records the Tasks RCU quiescent
    state, but only if this call to schedule() was -not- due to a preemption.
    
    Suggested-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index e6146d0074f8..f531b29207da 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -363,15 +363,20 @@ static inline void rcu_init_nohz(void)
 #ifdef CONFIG_TASKS_RCU
 #define TASKS_RCU(x) x
 extern struct srcu_struct tasks_rcu_exit_srcu;
-#define rcu_note_voluntary_context_switch(t) \
+#define rcu_note_voluntary_context_switch_lite(t) \
 	do { \
-		rcu_all_qs(); \
 		if (READ_ONCE((t)->rcu_tasks_holdout)) \
 			WRITE_ONCE((t)->rcu_tasks_holdout, false); \
 	} while (0)
+#define rcu_note_voluntary_context_switch(t) \
+	do { \
+		rcu_all_qs(); \
+		rcu_note_voluntary_context_switch_lite(t); \
+	} while (0)
 #else /* #ifdef CONFIG_TASKS_RCU */
 #define TASKS_RCU(x) do { } while (0)
-#define rcu_note_voluntary_context_switch(t)	rcu_all_qs()
+#define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
+#define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
 /**
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 5219be250f00..21ea52df651d 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -92,10 +92,11 @@ static inline void kfree_call_rcu(struct rcu_head *head,
 	call_rcu(head, func);
 }
 
-static inline void rcu_note_context_switch(void)
-{
-	rcu_sched_qs();
-}
+#define rcu_note_context_switch(preempt) \
+	do { \
+		rcu_sched_qs(); \
+		rcu_note_voluntary_context_switch_lite(current); \
+	} while (0)
 
 /*
  * Take advantage of the fact that there is only one CPU, which
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 63a4e4cf40a5..04a2caf3ab8b 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -30,7 +30,7 @@
 #ifndef __LINUX_RCUTREE_H
 #define __LINUX_RCUTREE_H
 
-void rcu_note_context_switch(void);
+void rcu_note_context_switch(bool preempt);
 int rcu_needs_cpu(u64 basem, u64 *nextevt);
 void rcu_cpu_stall_reset(void);
 
@@ -41,7 +41,7 @@ void rcu_cpu_stall_reset(void);
  */
 static inline void rcu_virt_note_context_switch(int cpu)
 {
-	rcu_note_context_switch();
+	rcu_note_context_switch(false);
 }
 
 void synchronize_rcu_bh(void);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f13fc4ab2f0d..92cf78fffd31 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -458,13 +458,15 @@ static void rcu_momentary_dyntick_idle(void)
  * and requires special handling for preemptible RCU.
  * The caller must have disabled interrupts.
  */
-void rcu_note_context_switch(void)
+void rcu_note_context_switch(bool preempt)
 {
 	struct rcu_state *rsp;
 
 	barrier(); /* Avoid RCU read-side critical sections leaking down. */
 	trace_rcu_utilization(TPS("Start context switch"));
 	rcu_sched_qs();
+	if (!preempt)
+		rcu_note_voluntary_context_switch_lite(current);
 	rcu_preempt_note_context_switch();
 	/* Load rcu_urgent_qs before other flags. */
 	if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)))
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9c38e6be4f3e..653ea8cf89e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3378,7 +3378,7 @@ static void __sched notrace __schedule(bool preempt)
 		hrtick_clear(rq);
 
 	local_irq_disable();
-	rcu_note_context_switch();
+	rcu_note_context_switch(preempt);
 
 	/*
 	 * Make sure that signal_pending_state()->signal_pending() below

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-11 23:11                   ` Paul E. McKenney
@ 2017-04-12  3:23                     ` Paul E. McKenney
  2017-04-12 13:18                       ` Steven Rostedt
  2017-04-12 14:48                     ` Paul E. McKenney
  1 sibling, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-12  3:23 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Tue, Apr 11, 2017 at 04:11:38PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 11, 2017 at 04:04:45PM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 11, 2017 at 04:01:54PM -0700, Paul E. McKenney wrote:
> > > On Tue, Apr 11, 2017 at 06:15:30PM -0400, Steven Rostedt wrote:
> > > > On Tue, 11 Apr 2017 14:56:56 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > 
> > > > > On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> > > > > > On Tue, 11 Apr 2017 14:44:43 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > > 
> > > > > >   
> > > > > > > Works for me!
> > > > > > > 
> > > > > > > Hopefully it will also work for your computer.  :-)
> > > > > > > 
> > > > > > > And whew!  Glad to see that the stall warnings worked!  
> > > > > > 
> > > > > > Ah! but I think I found a bug in synchronize_rcu_tasks()!
> > > > > > 
> > > > > > Calling schedule isn't good enough. For rcu_tasks to continue, the task
> > > > > > needs to schedule out. With my updated code, I just triggered:
> > > > > > 
> > > > > > [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> > > > > > [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> > > > > > [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> > > > > > [  196.302746] Call Trace:
> > > > > > [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> > > > > > [  196.314453]  __schedule+0x222/0x1210
> > > > > > [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> > > > > > [  196.327616]  ? preempt_count_add+0xb7/0xf0
> > > > > > [  196.334174]  ? __asan_store8+0x15/0x70
> > > > > > [  196.340384]  schedule+0x57/0xe0
> > > > > > [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> > > > > > [  196.352823]  kthread+0x178/0x1d0
> > > > > > [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> > > > > > [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> > > > > > [  196.371999]  ret_from_fork+0x2e/0x40
> > > > > > 
> > > > > > 
> > > > > > And here my benchmark called schedule(), but nothing scheduled it out,
> > > > > > and it still fails on rcu_tasks.  
> > > > > 
> > > > > Good point!
> > > > > 
> > > > > Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
> > > > > called for preemption as well as for voluntary context switches.
> > > > 
> > > > If you pass in the "preempt" variable, it would work. It's false when
> > > > __schedule() was called by a "schedule()" directly, and true when
> > > > called by one of the preempt_schedule() functions.
> > > 
> > > Like this?  (Untested, but builds at least some of the time.)
> > 
> > Not like that...  :-/  Update on its way.
> 
> Perhaps more like this.  Started rcutorture on it, will see how it goes.

And it passed moderate rcutorture testing, for whatever that is worth.

But another question...

Suppose someone traced or probed or whatever a call to (say)
cond_resched_rcu_qs().  Wouldn't that put the call to this
function in the trampoline itself?  Of course, if this happened,
life would be hard when the trampoline was freed due to
cond_resched_rcu_qs() being a quiescent state.

Or is there something that takes care to avoid putting calls to
this sort of function (and calls to any function calling this sort
of function, directly or indirectly) into a trampoline?

							Thanx, Paul

> ------------------------------------------------------------------------
> 
> commit c9653896be9b79b7227e8b97710ad6475fc72dc1
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Tue Apr 11 15:50:41 2017 -0700
> 
>     rcu: Make non-preemptive schedule be Tasks RCU quiescent state
>     
>     Currently, a call to schedule() acts as a Tasks RCU quiescent state
>     only if a context switch actually takes place.  However, just the
>     call to schedule() guarantees that the calling task has moved off of
>     whatever tracing trampoline that it might have been on previously.
>     This commit therefore plumbs schedule()'s "preempt" parameter into
>     rcu_note_context_switch(), which then records the Tasks RCU quiescent
>     state, but only if this call to schedule() was -not- due to a preemption.
>     
>     Suggested-by: Steven Rostedt <rostedt@goodmis.org>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index e6146d0074f8..f531b29207da 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -363,15 +363,20 @@ static inline void rcu_init_nohz(void)
>  #ifdef CONFIG_TASKS_RCU
>  #define TASKS_RCU(x) x
>  extern struct srcu_struct tasks_rcu_exit_srcu;
> -#define rcu_note_voluntary_context_switch(t) \
> +#define rcu_note_voluntary_context_switch_lite(t) \
>  	do { \
> -		rcu_all_qs(); \
>  		if (READ_ONCE((t)->rcu_tasks_holdout)) \
>  			WRITE_ONCE((t)->rcu_tasks_holdout, false); \
>  	} while (0)
> +#define rcu_note_voluntary_context_switch(t) \
> +	do { \
> +		rcu_all_qs(); \
> +		rcu_note_voluntary_context_switch_lite(t); \
> +	} while (0)
>  #else /* #ifdef CONFIG_TASKS_RCU */
>  #define TASKS_RCU(x) do { } while (0)
> -#define rcu_note_voluntary_context_switch(t)	rcu_all_qs()
> +#define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
> +#define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
>  #endif /* #else #ifdef CONFIG_TASKS_RCU */
>  
>  /**
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index 5219be250f00..21ea52df651d 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -92,10 +92,11 @@ static inline void kfree_call_rcu(struct rcu_head *head,
>  	call_rcu(head, func);
>  }
>  
> -static inline void rcu_note_context_switch(void)
> -{
> -	rcu_sched_qs();
> -}
> +#define rcu_note_context_switch(preempt) \
> +	do { \
> +		rcu_sched_qs(); \
> +		rcu_note_voluntary_context_switch_lite(current); \
> +	} while (0)
>  
>  /*
>   * Take advantage of the fact that there is only one CPU, which
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 63a4e4cf40a5..04a2caf3ab8b 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -30,7 +30,7 @@
>  #ifndef __LINUX_RCUTREE_H
>  #define __LINUX_RCUTREE_H
>  
> -void rcu_note_context_switch(void);
> +void rcu_note_context_switch(bool preempt);
>  int rcu_needs_cpu(u64 basem, u64 *nextevt);
>  void rcu_cpu_stall_reset(void);
>  
> @@ -41,7 +41,7 @@ void rcu_cpu_stall_reset(void);
>   */
>  static inline void rcu_virt_note_context_switch(int cpu)
>  {
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(false);
>  }
>  
>  void synchronize_rcu_bh(void);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f13fc4ab2f0d..92cf78fffd31 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -458,13 +458,15 @@ static void rcu_momentary_dyntick_idle(void)
>   * and requires special handling for preemptible RCU.
>   * The caller must have disabled interrupts.
>   */
> -void rcu_note_context_switch(void)
> +void rcu_note_context_switch(bool preempt)
>  {
>  	struct rcu_state *rsp;
>  
>  	barrier(); /* Avoid RCU read-side critical sections leaking down. */
>  	trace_rcu_utilization(TPS("Start context switch"));
>  	rcu_sched_qs();
> +	if (!preempt)
> +		rcu_note_voluntary_context_switch_lite(current);
>  	rcu_preempt_note_context_switch();
>  	/* Load rcu_urgent_qs before other flags. */
>  	if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)))
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9c38e6be4f3e..653ea8cf89e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3378,7 +3378,7 @@ static void __sched notrace __schedule(bool preempt)
>  		hrtick_clear(rq);
>  
>  	local_irq_disable();
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(preempt);
>  
>  	/*
>  	 * Make sure that signal_pending_state()->signal_pending() below


* Re: There is a Tasks RCU stall warning
  2017-04-12  3:23                     ` Paul E. McKenney
@ 2017-04-12 13:18                       ` Steven Rostedt
  2017-04-12 14:19                         ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-12 13:18 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Tue, 11 Apr 2017 20:23:07 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> But another question...
> 
> Suppose someone traced or probed or whatever a call to (say)
> cond_resched_rcu_qs().  Wouldn't that put the call to this
> function in the trampoline itself?  Of course, if this happened,
> life would be hard when the trampoline was freed due to
> cond_resched_rcu_qs() being a quiescent state.

Not at all, because the trampoline happens at the beginning of the
function. Not in the guts of it (unless something in the guts was
traced). But even then, it should be fine as the change was already
made.

	/* unhook trampoline from function calls */
	unregister_ftrace_function(my_ops);

	synchronize_rcu_tasks();

	kfree(my_ops->trampoline);


Thus, once the unregister_ftrace_function() is called, no new entries
into the trampoline can happen. The synchronize_rcu_tasks() is to move
those that are currently on a trampoline off.

Is there a way that a task could be in the middle of
cond_resched_rcu_qs() and get preempted by something while on the
ftrace trampoline, then the above "unregister_ftrace_function()" and
"synchronize_rcu_tasks()" can be called and finish, while the one task
is still on the trampoline and never finished the cond_resched_rcu_qs()?

> 
> Or is there something that takes care to avoid putting calls to
> this sort of function (and calls to any function calling this sort
> of function, directly or indirectly) into a trampoline?

The question is, if it's on the trampoline in one of these functions
when synchronize_rcu_tasks() is called, will it still be on the
trampoline when that returns?

-- Steve


* Re: There is a Tasks RCU stall warning
  2017-04-12 13:18                       ` Steven Rostedt
@ 2017-04-12 14:19                         ` Paul E. McKenney
  2017-04-12 14:42                           ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-12 14:19 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Wed, Apr 12, 2017 at 09:18:21AM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2017 20:23:07 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > But another question...
> > 
> > Suppose someone traced or probed or whatever a call to (say)
> > cond_resched_rcu_qs().  Wouldn't that put the call to this
> > function in the trampoline itself?  Of course, if this happened,
> > life would be hard when the trampoline was freed due to
> > cond_resched_rcu_qs() being a quiescent state.
> 
> Not at all, because the trampoline happens at the beginning of the
> function. Not in the guts of it (unless something in the guts was
> traced). But even then, it should be fine as the change was already
> made.
> 
> 	/* unhook trampoline from function calls */
> 	unregister_ftrace_function(my_ops);
> 
> 	synchronize_rcu_tasks();
> 
> 	kfree(my_ops->trampoline);
> 
> 
> Thus, once the unregister_ftrace_function() is called, no new entries
> into the trampoline can happen. The synchronize_rcu_tasks() is to move
> those that are currently on a trampoline off.

OK, good!  (I thought that these things could appear anywhere.)

If it ever becomes necessary, I suppose you could have a function
call as the very last thing on a trampoline.  Do the (off-trampoline)
return-address push, jump to the function, and that is the last need
for the trampoline.

Assuming that the called function doesn't try accessing the code
surrounding the call, but that would be a problem in any case.

> Is there a way that a task could be in the middle of
> cond_resched_rcu_qs() and get preempted by something while on the
> ftrace trampoline, then the above "unregister_ftrace_function()" and
> "synchronize_rcu_tasks()" can be called and finish, while the one task
> is still on the trampoline and never finished the cond_resched_rcu_qs()?

Well, if the kernel being ftraced is a guest OS and the hypervisor
preempts it at just that point...

> > Or is there something that takes care to avoid putting calls to
> > this sort of function (and calls to any function calling this sort
> > of function, directly or indirectly) into a trampoline?
> 
> > The question is, if it's on the trampoline in one of these functions
> when synchronize_rcu_tasks() is called, will it still be on the
> trampoline when that returns?

If the function's return address is within the trampoline, it seems to
me that bad things could happen.

							Thanx, Paul


* Re: There is a Tasks RCU stall warning
  2017-04-12 14:19                         ` Paul E. McKenney
@ 2017-04-12 14:42                           ` Steven Rostedt
  2017-04-12 15:18                             ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-12 14:42 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Wed, 12 Apr 2017 07:19:36 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Apr 12, 2017 at 09:18:21AM -0400, Steven Rostedt wrote:
> > On Tue, 11 Apr 2017 20:23:07 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > But another question...
> > > 
> > > Suppose someone traced or probed or whatever a call to (say)
> > > cond_resched_rcu_qs().  Wouldn't that put the call to this
> > > function in the trampoline itself?  Of course, if this happened,
> > > life would be hard when the trampoline was freed due to
> > > cond_resched_rcu_qs() being a quiescent state.  
> > 
> > Not at all, because the trampoline happens at the beginning of the
> > function. Not in the guts of it (unless something in the guts was
> > traced). But even then, it should be fine as the change was already
> > made.
> > 
> > 	/* unhook trampoline from function calls */
> > 	unregister_ftrace_function(my_ops);
> > 
> > 	synchronize_rcu_tasks();
> > 
> > 	kfree(my_ops->trampoline);
> > 
> > 
> > Thus, once the unregister_ftrace_function() is called, no new entries
> > into the trampoline can happen. The synchronize_rcu_tasks() is to move
> > those that are currently on a trampoline off.  
> 
> OK, good!  (I thought that these things could appear anywhere.)

Well the trampolines pretty much can, but they are removed before
calling synchronize_rcu_tasks(), and nothing can enter the trampoline
when that is called.

> 
> If it ever becomes necessary, I suppose you could have a function
> call as the very last thing on a trampoline.  Do the (off-trampoline)
> return-address push, jump to the function, and that is the last need
> for the trampoline.

The point of trampolines is to optimize the function hooks; added
features would kill that optimization. But then it gets even more
complex. The trampolines are written in assembly and do special
register saving in order to call C code. And they need to restore the
original state before calling back to the function being traced. Thus,
anything at the end of the trampoline would need to be written in
assembly. Not sure writing RCU code in assembly would be much fun.


> 
> Assuming that the called function doesn't try accessing the code
> surrounding the call, but that would be a problem in any case.
> 
> > Is there a way that a task could be in the middle of
> > cond_resched_rcu_qs() and get preempted by something while on the
> > ftrace trampoline, then the above "unregister_ftrace_function()" and
> > "synchronize_rcu_tasks()" can be called and finish, while the one task
> > is still on the trampoline and never finished the cond_resched_rcu_qs()?  
> 
> Well, if the kernel being ftraced is a guest OS and the hypervisor
> preempts it at just that point...

Not sure what you mean by the above. You mean the hypervisor running
ftrace on the guest OS? Or just a long pause on the guest OS (could
also be an NMI). But in any case, we don't care about long pauses. We
care about tasks going to sleep while on the trampoline, and the ftrace
code that does the schedule_on_each_cpu() missing that task, because it
was preempted, and not affected by the schedule_on_each_cpu() call.

> 
> > > Or is there something that takes care to avoid putting calls to
> > > this sort of function (and calls to any function calling this sort
> > > of function, directly or indirectly) into a trampoline?  
> > 
> > The question is, if it's on the trampoline in one of these functions
> > when synchronize_rcu_tasks() is called, will it still be on the
> > trampoline when that returns?  
> 
> If the function's return address is within the trampoline, it seems to
> me that bad things could happen.

Not sure what you mean by the above. One should never be tracing within
a trampoline, or calling synchronize_rcu_tasks() in one. The trampoline
could be called from any context, including NMI.

-- Steve


* Re: There is a Tasks RCU stall warning
  2017-04-11 23:11                   ` Paul E. McKenney
  2017-04-12  3:23                     ` Paul E. McKenney
@ 2017-04-12 14:48                     ` Paul E. McKenney
  2017-04-12 14:59                       ` Steven Rostedt
  1 sibling, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-12 14:48 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Tue, Apr 11, 2017 at 04:11:38PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 11, 2017 at 04:04:45PM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 11, 2017 at 04:01:54PM -0700, Paul E. McKenney wrote:
> > > On Tue, Apr 11, 2017 at 06:15:30PM -0400, Steven Rostedt wrote:
> > > > On Tue, 11 Apr 2017 14:56:56 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > 
> > > > > On Tue, Apr 11, 2017 at 05:49:53PM -0400, Steven Rostedt wrote:
> > > > > > On Tue, 11 Apr 2017 14:44:43 -0700
> > > > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > > > 
> > > > > >   
> > > > > > > Works for me!
> > > > > > > 
> > > > > > > Hopefully it will also work for your computer.  :-)
> > > > > > > 
> > > > > > > And whew!  Glad to see that the stall warnings worked!  
> > > > > > 
> > > > > > Ah! but I think I found a bug in synchronize_rcu_tasks()!
> > > > > > 
> > > > > > Calling schedule isn't good enough. For rcu_tasks to continue, the task
> > > > > > needs to schedule out. With my updated code, I just triggered:
> > > > > > 
> > > > > > [  196.276868] INFO: rcu_tasks detected stalls on tasks:
> > > > > > [  196.284294] ffff8800c26f8040: .. nvcsw: 2/2 holdout: 1 idle_cpu: -1/1
> > > > > > [  196.293175] event_benchmark R  running task    30536  1127      2 0x10000000
> > > > > > [  196.302746] Call Trace:
> > > > > > [  196.307640]  ? _raw_spin_unlock_irq+0x1f/0x50
> > > > > > [  196.314453]  __schedule+0x222/0x1210
> > > > > > [  196.320476]  ? pci_mmcfg_check_reserved+0xc0/0xc0
> > > > > > [  196.327616]  ? preempt_count_add+0xb7/0xf0
> > > > > > [  196.334174]  ? __asan_store8+0x15/0x70
> > > > > > [  196.340384]  schedule+0x57/0xe0
> > > > > > [  196.345888]  benchmark_event_kthread+0x2e/0x3c0
> > > > > > [  196.352823]  kthread+0x178/0x1d0
> > > > > > [  196.358411]  ? trace_benchmark_reg+0x80/0x80
> > > > > > [  196.365073]  ? kthread_create_on_node+0xa0/0xa0
> > > > > > [  196.371999]  ret_from_fork+0x2e/0x40
> > > > > > 
> > > > > > 
> > > > > > And here my benchmark called schedule(), but nothing scheduled it out,
> > > > > > and it still fails on rcu_tasks.  
> > > > > 
> > > > > Good point!
> > > > > 
> > > > > Hmmmm...  I cannot hook into rcu_note_context_switch() because that gets
> > > > > called for preemption as well as for voluntary context switches.
> > > > 
> > > > If you pass in the "preempt" variable, it would work. It's false when
> > > > __schedule() was called by a "schedule()" directly, and true when
> > > > called by one of the preempt_schedule() functions.
> > > 
> > > Like this?  (Untested, but builds at least some of the time.)
> > 
> > Not like that...  :-/  Update on its way.
> 
> Perhaps more like this.  Started rcutorture on it, will see how it goes.

Do you need this patch?  If so, I should do some more work on it to
eliminate the extra common-case branch on the scheduler fastpath.

							Thanx, Paul

> ------------------------------------------------------------------------
> 
> commit c9653896be9b79b7227e8b97710ad6475fc72dc1
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Tue Apr 11 15:50:41 2017 -0700
> 
>     rcu: Make non-preemptive schedule be Tasks RCU quiescent state
>     
>     Currently, a call to schedule() acts as a Tasks RCU quiescent state
>     only if a context switch actually takes place.  However, just the
>     call to schedule() guarantees that the calling task has moved off of
>     whatever tracing trampoline that it might have been on previously.
>     This commit therefore plumbs schedule()'s "preempt" parameter into
>     rcu_note_context_switch(), which then records the Tasks RCU quiescent
>     state, but only if this call to schedule() was -not- due to a preemption.
>     
>     Suggested-by: Steven Rostedt <rostedt@goodmis.org>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index e6146d0074f8..f531b29207da 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -363,15 +363,20 @@ static inline void rcu_init_nohz(void)
>  #ifdef CONFIG_TASKS_RCU
>  #define TASKS_RCU(x) x
>  extern struct srcu_struct tasks_rcu_exit_srcu;
> -#define rcu_note_voluntary_context_switch(t) \
> +#define rcu_note_voluntary_context_switch_lite(t) \
>  	do { \
> -		rcu_all_qs(); \
>  		if (READ_ONCE((t)->rcu_tasks_holdout)) \
>  			WRITE_ONCE((t)->rcu_tasks_holdout, false); \
>  	} while (0)
> +#define rcu_note_voluntary_context_switch(t) \
> +	do { \
> +		rcu_all_qs(); \
> +		rcu_note_voluntary_context_switch_lite(t); \
> +	} while (0)
>  #else /* #ifdef CONFIG_TASKS_RCU */
>  #define TASKS_RCU(x) do { } while (0)
> -#define rcu_note_voluntary_context_switch(t)	rcu_all_qs()
> +#define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
> +#define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
>  #endif /* #else #ifdef CONFIG_TASKS_RCU */
>  
>  /**
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index 5219be250f00..21ea52df651d 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -92,10 +92,11 @@ static inline void kfree_call_rcu(struct rcu_head *head,
>  	call_rcu(head, func);
>  }
>  
> -static inline void rcu_note_context_switch(void)
> -{
> -	rcu_sched_qs();
> -}
> +#define rcu_note_context_switch(preempt) \
> +	do { \
> +		rcu_sched_qs(); \
> +		rcu_note_voluntary_context_switch_lite(current); \
> +	} while (0)
>  
>  /*
>   * Take advantage of the fact that there is only one CPU, which
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 63a4e4cf40a5..04a2caf3ab8b 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -30,7 +30,7 @@
>  #ifndef __LINUX_RCUTREE_H
>  #define __LINUX_RCUTREE_H
>  
> -void rcu_note_context_switch(void);
> +void rcu_note_context_switch(bool preempt);
>  int rcu_needs_cpu(u64 basem, u64 *nextevt);
>  void rcu_cpu_stall_reset(void);
>  
> @@ -41,7 +41,7 @@ void rcu_cpu_stall_reset(void);
>   */
>  static inline void rcu_virt_note_context_switch(int cpu)
>  {
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(false);
>  }
>  
>  void synchronize_rcu_bh(void);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f13fc4ab2f0d..92cf78fffd31 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -458,13 +458,15 @@ static void rcu_momentary_dyntick_idle(void)
>   * and requires special handling for preemptible RCU.
>   * The caller must have disabled interrupts.
>   */
> -void rcu_note_context_switch(void)
> +void rcu_note_context_switch(bool preempt)
>  {
>  	struct rcu_state *rsp;
>  
>  	barrier(); /* Avoid RCU read-side critical sections leaking down. */
>  	trace_rcu_utilization(TPS("Start context switch"));
>  	rcu_sched_qs();
> +	if (!preempt)
> +		rcu_note_voluntary_context_switch_lite(current);
>  	rcu_preempt_note_context_switch();
>  	/* Load rcu_urgent_qs before other flags. */
>  	if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs)))
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9c38e6be4f3e..653ea8cf89e4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3378,7 +3378,7 @@ static void __sched notrace __schedule(bool preempt)
>  		hrtick_clear(rq);
>  
>  	local_irq_disable();
> -	rcu_note_context_switch();
> +	rcu_note_context_switch(preempt);
>  
>  	/*
>  	 * Make sure that signal_pending_state()->signal_pending() below


* Re: There is a Tasks RCU stall warning
  2017-04-12 14:48                     ` Paul E. McKenney
@ 2017-04-12 14:59                       ` Steven Rostedt
  2017-04-12 16:27                         ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-12 14:59 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Wed, 12 Apr 2017 07:48:00 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> > > > Like this?  (Untested, but builds at least some of the time.)  
> > > 
> > > Not like that...  :-/  Update on its way.  
> > 
> > Perhaps more like this.  Started rcutorture on it, will see how it goes.  

I just love the above discussion with yourself ;-)

> 
> Do you need this patch?  If so, I should do some more work on it to
> eliminate the extra common-case branch on the scheduler fastpath.
> 

Do I still need this patch? Maybe. :-)

I changed my benchmark test to call cond_resched_rcu_qs() instead and
that appears to fix the issue. But I'm not sure if there's any other
kthread out there that just calls cond_resched() or schedule().

Actually, I think it is still a good idea to have it. I believe that it
will still allow synchronize_rcu_tasks() to progress even if there's a
kthread task that is constantly being woken up, and never sleeps when
it calls schedule(), as it may always have the R state.

-- Steve


* Re: There is a Tasks RCU stall warning
  2017-04-12 14:42                           ` Steven Rostedt
@ 2017-04-12 15:18                             ` Paul E. McKenney
  2017-04-12 15:53                               ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-12 15:18 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Wed, Apr 12, 2017 at 10:42:55AM -0400, Steven Rostedt wrote:
> On Wed, 12 Apr 2017 07:19:36 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Apr 12, 2017 at 09:18:21AM -0400, Steven Rostedt wrote:
> > > On Tue, 11 Apr 2017 20:23:07 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > But another question...
> > > > 
> > > > Suppose someone traced or probed or whatever a call to (say)
> > > > cond_resched_rcu_qs().  Wouldn't that put the call to this
> > > > function in the trampoline itself?  Of course, if this happened,
> > > > life would be hard when the trampoline was freed due to
> > > > cond_resched_rcu_qs() being a quiescent state.  
> > > 
> > > Not at all, because the trampoline happens at the beginning of the
> > > function. Not in the guts of it (unless something in the guts was
> > > traced). But even then, it should be fine as the change was already
> > > made.
> > > 
> > > 	/* unhook trampoline from function calls */
> > > 	unregister_ftrace_function(my_ops);
> > > 
> > > 	synchronize_rcu_tasks();
> > > 
> > > 	kfree(my_ops->trampoline);
> > > 
> > > 
> > > Thus, once the unregister_ftrace_function() is called, no new entries
> > > into the trampoline can happen. The synchronize_rcu_tasks() is to move
> > > those that are currently on a trampoline off.  
> > 
> > OK, good!  (I thought that these things could appear anywhere.)
> 
> Well the trampolines pretty much can, but they are removed before
> calling synchronize_rcu_tasks(), and nothing can enter the trampoline
> when that is called.

Color me confused...

So you can have an arbitrary function call within a trampoline?

If not, agreed, no problem.  Otherwise, it seems like we have a big
problem remaining.  Unless the functions called from a trampoline are
guaranteed never to do a context switch.

So what exactly is the trampoline code allowed to do?  ;-)

> > If it ever becomes necessary, I suppose you could have a function
> > call as the very last thing on a trampoline.  Do the (off-trampoline)
> > return-address push, jump at the function, and that is the last need
> > for the trampoline.
> 
> The point of trampolines is to optimize the function hooks, added
> features will kill that optimization. But then it gets even more
> complex. The trampolines are written in assembly and do special reg
> savings in order to call C code. And it needs to restore back to the
> original state before calling back to the function being traced. Thus,
> anything at the end of the trampoline will need to be written in
> assembly. Not sure writing RCU code in assembly would be much fun.

Writing RCU code as assembly code would indeed not be my first choice!

> > Assuming that the called function doesn't try accessing the code
> > surrounding the call, but that would be a problem in any case.
> > 
> > > Is there a way that a task could be in the middle of
> > > cond_resched_rcu_qs() and get preempted by something while on the
> > > ftrace trampoline, then the above "unregister_ftrace_function()" and
> > > "synchronize_rcu_tasks()" can be called and finish, while the one task
> > > is still on the trampoline and never finished the cond_resched_rcu_qs()?  
> > 
> > Well, if the kernel being ftraced is a guest OS and the hypervisor
> > preempts it at just that point...
> 
> Not sure what you mean by the above. You mean the hypervisor running
> ftrace on the guest OS? Or just a long pause on the guest OS (could
> also be an NMI). But in any case, we don't care about long pauses. We
> care about tasks going to sleep while on the trampoline, and the ftrace
> code that does the schedule_on_each_cpu() missing that task, because it
> was preempted, and not affected by the schedule_on_each_cpu() call.

The guest doing ftrace and the hypervisor preempting it.  But yes,
same thing as NMI.

> > > > Or is there something that takes care to avoid putting calls to
> > > > this sort of function (and calls to any function calling this sort
> > > > of function, directly or indirectly) into a trampoline?  
> > > 
> > > The question is, if its on the trampoline in one of theses functions
> > > when synchronize_rcu_tasks() is called, will it still be on the
> > > trampoline when that returns?  
> > 
> > If the function's return address is within the trampoline, it seems to
> > me that bad things could happen.
> 
> Not sure what you mean by the above. One should never be tracing within
> a trampoline, or calling synchronize_rcu_tasks() in one. The trampoline
> could be called from any context, including NMI.

My problem is that I have no idea what can and cannot be included in
trampoline code.  In absence of that information, my RCU-honed reflexes
jump immediately to the worst case that I can think of.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-12 15:18                             ` Paul E. McKenney
@ 2017-04-12 15:53                               ` Steven Rostedt
  2017-04-12 16:26                                 ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-12 15:53 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Wed, 12 Apr 2017 08:18:17 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:


> > Well the trampolines pretty much can, but they are removed before
> > calling synchronize_rcu_tasks(), and nothing can enter the trampoline
> > when that is called.  
> 
> Color me confused...
> 
> So you can have an arbitrary function call within a trampoline?

Sorta.

When you do register_ftrace_function(ops), where ops has ops->func that
points to a function you want to have called when a function is traced,
the following happens (if there's no other ops registered). Let's use
an example where ops is filtered on just the schedule() function call:


 <schedule>:
    call trampoline ---+
    [..]               |
                       +--> <trampoline>:
                              push regs
                              call ops->func
                              pop regs
                              ret

But that ops->func() must be very limited in what it can do. Although,
it may actually call an rcu_read_lock()! But if that's the case, it
must either check if rcu is watching (which perf does), or enable rcu
via the rcu_irq_enter() with a check on rcu_irq_enter_disabled(), which
my stack tracer does.

Now this can be called even from NMI context! Thus what ops->func does
must be aware of that. The stack tracer func has an:

  if (in_nmi())
	return;

Because it needs to grab spin locks.

But one thing an op->func() is never allowed to do, is to call
schedule() directly, or even a cond_resched(). It may be preempted if
preemption was enabled when the trampoline was hit, but it must not
assume that it can do a voluntary schedule. That would break the
rcu_tasks as well if it did.
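
For concreteness, a sketch of an ops->func that obeys those constraints
might look like the following (my_lock and my_count are hypothetical;
the signature matches the ftrace_func_t callback type):

```c
/*
 * Bails from NMI because it takes a lock, uses an arch spinlock so it
 * never sleeps, and never calls schedule() or cond_resched().
 */
static void my_ops_func(unsigned long ip, unsigned long parent_ip,
			struct ftrace_ops *op, struct pt_regs *regs)
{
	if (in_nmi())			/* the lock below is not NMI-safe */
		return;

	arch_spin_lock(&my_lock);	/* raw/arch lock: no sleeping */
	my_count++;
	arch_spin_unlock(&my_lock);
}
```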


> 
> If not, agreed, no problem.  Otherwise, it seems like we have a big
> problem remaining.  Unless the functions called from a trampoline are
> guaranteed never to do a context switch.

Well, they can be preempted, but they should never do a voluntary
schedule. If they did, that would be bad.

> 
> So what exactly is the trampoline code allowed to do?  ;-)

Well, it must be able to work in an NMI context, or bail otherwise. And
it should never schedule on its own.

>
> My problem is that I have no idea what can and cannot be included in
> trampoline code.  In absence of that information, my RCU-honed reflexes
> jump immediately to the worst case that I can think of.  ;-)
>

Let's just say that it can't voluntarily sleep. Would that be good
enough? If someday in the future I decide to let it do so, I would add
a flag and force that ops not to be able to use a dynamic trampoline.

Currently, without the synchronize_rcu_tasks(), when a dynamic ops is
registered, the functions will point to a non dynamic trampoline. That
is, one that is never freed. It simply does:

	preempt_disable_notrace();

	do_for_each_ftrace_op(op, ftrace_ops_list) {
		/*
		 * Check the following for each ops before calling their func:
		 *  if RCU flag is set, then rcu_is_watching() must be true
		 *  if PER_CPU is set, then ftrace_function_local_disable()
		 *                          must be false
		 *  Otherwise test if the ip matches the ops filter
		 *
		 * If any of the above fails then the op->func() is not executed.
		 */
		if ((!(op->flags & FTRACE_OPS_FL_RCU) || rcu_is_watching()) &&
		    (!(op->flags & FTRACE_OPS_FL_PER_CPU) ||
		     !ftrace_function_local_disabled(op)) &&
		    ftrace_ops_test(op, ip, regs)) {
		    
			if (FTRACE_WARN_ON(!op->func)) {
				pr_warn("op=%p %pS\n", op, op);
				goto out;
			}
			op->func(ip, parent_ip, op, regs);
		}
	} while_for_each_ftrace_op(op);
out:
	preempt_enable_notrace();

-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-12 15:53                               ` Steven Rostedt
@ 2017-04-12 16:26                                 ` Paul E. McKenney
  2017-04-12 16:49                                   ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-12 16:26 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Wed, Apr 12, 2017 at 11:53:04AM -0400, Steven Rostedt wrote:
> On Wed, 12 Apr 2017 08:18:17 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> 
> > > Well the trampolines pretty much can, but they are removed before
> > > calling synchronize_rcu_tasks(), and nothing can enter the trampoline
> > > when that is called.  
> > 
> > Color me confused...
> > 
> > So you can have an arbitrary function call within a trampoline?
> 
> Sorta.
> 
> When you do register_ftrace_function(ops), where ops has ops->func that
> points to a function you want to have called when a function is traced,
> the following happens (if there's no other ops registered). Let's use
> an example where ops is filtered on just the schedule() function call:
> 
> 
>  <schedule>:
>     call trampoline ---+
>     [..]               |
>                        +--> <trampoline>:
>                               push regs
>                               call ops->func
>                               pop regs
>                               ret
> 
> But that ops->func() must be very limited in what it can do. Although,
> it may actually call an rcu_read_lock()! But if that's the case, it
> must either check if rcu is watching (which perf does), or enable rcu
> via the rcu_irq_enter() with a check on rcu_irq_enter_disabled(), which
> my stack tracer does.
> 
> Now this can be called even from NMI context! Thus what ops->func does
> must be aware of that. The stack tracer func has an:
> 
>   if (in_nmi())
> 	return;
> 
> Because it needs to grab spin locks.

But preemption is enabled within the trampoline?  If so, then if
CONFIG_RCU_BOOST is set, rcu_read_unlock() can call rt_mutex_unlock().
Which looks OK to me, but I thought I should mention it.

> But one thing an op->func() is never allowed to do, is to call
> schedule() directly, or even a cond_resched(). It may be preempted if
> preemption was enabled when the trampoline was hit, but it must not
> assume that it can do a voluntary schedule. That would break the
> rcu_tasks as well if it did.

OK, so it can call functions, but it is not permitted to call functions
that voluntarily block.  That should work.  (Fingers firmly crossed.)

> > If not, agreed, no problem.  Otherwise, it seems like we have a big
> > problem remaining.  Unless the functions called from a trampoline are
> > guaranteed never to do a context switch.
> 
> Well, they can be preempted, but they should never do a voluntary
> schedule. If they did, that would be bad.

OK, feeling better now.  ;-)

> > So what exactly is the trampoline code allowed to do?  ;-)
> 
> Well, it must be able to work in an NMI context, or bail otherwise. And
> it should never schedule on its own.

Good.

> > My problem is that I have no idea what can and cannot be included in
> > trampoline code.  In absence of that information, my RCU-honed reflexes
> > jump immediately to the worst case that I can think of.  ;-)
> 
> Let's just say that it can't voluntarily sleep. Would that be good
> enough? If someday in the future I decide to let it do so, I would add
> a flag and force that ops not to be able to use a dynamic trampoline.

That would work.  Again, feeling much better now.  ;-)

> Currently, without the synchronize_rcu_tasks(), when a dynamic ops is
> registered, the functions will point to a non dynamic trampoline. That
> is, one that is never freed. It simply does:
> 
> 	preempt_disable_notrace();
> 
> 	do_for_each_ftrace_op(op, ftrace_ops_list) {
> 		/*
> 		 * Check the following for each ops before calling their func:
> 		 *  if RCU flag is set, then rcu_is_watching() must be true
> 		 *  if PER_CPU is set, then ftrace_function_local_disable()
> 		 *                          must be false
> 		 *  Otherwise test if the ip matches the ops filter
> 		 *
> 		 * If any of the above fails then the op->func() is not executed.
> 		 */
> 		if ((!(op->flags & FTRACE_OPS_FL_RCU) || rcu_is_watching()) &&
> 		    (!(op->flags & FTRACE_OPS_FL_PER_CPU) ||
> 		     !ftrace_function_local_disabled(op)) &&
> 		    ftrace_ops_test(op, ip, regs)) {
> 		    
> 			if (FTRACE_WARN_ON(!op->func)) {
> 				pr_warn("op=%p %pS\n", op, op);
> 				goto out;
> 			}
> 			op->func(ip, parent_ip, op, regs);
> 		}
> 	} while_for_each_ftrace_op(op);
> out:
> 	preempt_enable_notrace();

Makes sense!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-12 14:59                       ` Steven Rostedt
@ 2017-04-12 16:27                         ` Paul E. McKenney
  2017-04-12 16:57                           ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-12 16:27 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Wed, Apr 12, 2017 at 10:59:37AM -0400, Steven Rostedt wrote:
> On Wed, 12 Apr 2017 07:48:00 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > > > Like this?  (Untested, but builds at least some of the time.)  
> > > > 
> > > > Not like that...  :-/  Update on its way.  
> > > 
> > > Perhaps more like this.  Started rcutorture on it, will see how it goes.  
> 
> I just love the above discussion with yourself ;-)

Talking to oneself used to cause passersby to get really bent out of
shape.  But one big benefit of ubiquitous cellphones is that now
people just assume that you are talking on a cellphone that they
cannot see.  ;-)

> > Do you need this patch?  If so, I should do some more work on it to
> > eliminate the extra common-case branch on the scheduler fastpath.
> 
> Do I still need this patch? Maybe. :-)
> 
> I changed my benchmark test to call cond_resched_rcu_qs() instead and
> that appears to fix the issue. But I'm not sure if there's any other
> kthread out there that just calls cond_resched() or schedule().
> 
> Actually, I think it is still a good idea to have it. I believe that it
> will still allow synchronize_rcu_tasks() to progress even if there's a
> kthread task that is constantly being woken up, and never sleeps when
> it calls schedule(), as it may always have the R state.

OK, will optimize it a bit.  When are you planning to get this in?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-12 16:26                                 ` Paul E. McKenney
@ 2017-04-12 16:49                                   ` Steven Rostedt
  0 siblings, 0 replies; 28+ messages in thread
From: Steven Rostedt @ 2017-04-12 16:49 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Wed, 12 Apr 2017 09:26:10 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Apr 12, 2017 at 11:53:04AM -0400, Steven Rostedt wrote:
> > On Wed, 12 Apr 2017 08:18:17 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> >   
> > > > Well the trampolines pretty much can, but they are removed before
> > > > calling synchronize_rcu_tasks(), and nothing can enter the trampoline
> > > > when that is called.    
> > > 
> > > Color me confused...
> > > 
> > > So you can have an arbitrary function call within a trampoline?  
> > 
> > Sorta.
> > 
> > When you do register_ftrace_function(ops), where ops has ops->func that
> > points to a function you want to have called when a function is traced,
> > the following happens (if there's no other ops registered). Let's use
> > an example where ops is filtered on just the schedule() function call:
> > 
> > 
> >  <schedule>:
> >     call trampoline ---+
> >     [..]               |
> >                        +--> <trampoline>:
> >                               push regs
> >                               call ops->func
> >                               pop regs
> >                               ret
> > 
> > But that ops->func() must be very limited in what it can do. Although,
> > it may actually call an rcu_read_lock()! But if that's the case, it
> > must either check if rcu is watching (which perf does), or enable rcu
> > via the rcu_irq_enter() with a check on rcu_irq_enter_disabled(), which
> > my stack tracer does.
> > 
> > Now this can be called even from NMI context! Thus what ops->func does
> > must be aware of that. The stack tracer func has an:
> > 
> >   if (in_nmi())
> > 	return;
> > 
> > Because it needs to grab spin locks.  
> 
> But preemption is enabled within the trampoline?  If so, then if

Yes it can be.

> CONFIG_RCU_BOOST is set, rcu_read_unlock() can call rt_mutex_unlock().
> Which looks OK to me, but I thought I should mention it.

The rt_mutex_unlock() shouldn't call schedule.

> 
> > But one thing an op->func() is never allowed to do, is to call
> > schedule() directly, or even a cond_resched(). It may be preempted if
> > preemption was enabled when the trampoline was hit, but it must not
> > assume that it can do a voluntary schedule. That would break the
> > rcu_tasks as well if it did.  
> 
> OK, so it can call functions, but it is not permitted to call functions
> that voluntarily block.  That should work.  (Fingers firmly crossed.)

Right, taking a mutex is bad. This is why all spinlocks must be of the
raw form as well, otherwise it breaks on PREEMPT_RT.

Actually, the stack tracer uses arch_spin_lock() because it has the
ability to break lockdep on top of this.

As you pointed out in one of my comments: function tracing is rude.

> 
> > > If not, agreed, no problem.  Otherwise, it seems like we have a big
> > > problem remaining.  Unless the functions called from a trampoline are
> > > guaranteed never to do a context switch.  
> > 
> > Well, they can be preempted, but they should never do a voluntary
> > schedule. If they did, that would be bad.  
> 
> OK, feeling better now.  ;-)
> 
> > > So what exactly is the trampoline code allowed to do?  ;-)  
> > 
> > Well, it must be able to work in an NMI context, or bail otherwise. And
> > it should never schedule on its own.  
> 
> Good.
> 
> > > My problem is that I have no idea what can and cannot be included in
> > > trampoline code.  In absence of that information, my RCU-honed reflexes
> > > jump immediately to the worst case that I can think of.  ;-)  
> > 
> > Let's just say that it can't voluntarily sleep. Would that be good
> > enough? If someday in the future I decide to let it do so, I would add
> > a flag and force that ops not to be able to use a dynamic trampoline.  
> 
> That would work.  Again, feeling much better now.  ;-)

Great!

> 
> > Currently, without the synchronize_rcu_tasks(), when a dynamic ops is
> > registered, the functions will point to a non dynamic trampoline. That
> > is, one that is never freed. It simply does:
> > 
> > 	preempt_disable_notrace();
> > 
> > 	do_for_each_ftrace_op(op, ftrace_ops_list) {
> > 		/*
> > 		 * Check the following for each ops before calling their func:
> > 		 *  if RCU flag is set, then rcu_is_watching() must be true
> > 		 *  if PER_CPU is set, then ftrace_function_local_disable()
> > 		 *                          must be false
> > 		 *  Otherwise test if the ip matches the ops filter
> > 		 *
> > 		 * If any of the above fails then the op->func() is not executed.
> > 		 */
> > 		if ((!(op->flags & FTRACE_OPS_FL_RCU) || rcu_is_watching()) &&
> > 		    (!(op->flags & FTRACE_OPS_FL_PER_CPU) ||
> > 		     !ftrace_function_local_disabled(op)) &&
> > 		    ftrace_ops_test(op, ip, regs)) {
> > 		    
> > 			if (FTRACE_WARN_ON(!op->func)) {
> > 				pr_warn("op=%p %pS\n", op, op);
> > 				goto out;
> > 			}
> > 			op->func(ip, parent_ip, op, regs);
> > 		}
> > 	} while_for_each_ftrace_op(op);
> > out:
> > 	preempt_enable_notrace();  
> 
> Makes sense!

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-12 16:27                         ` Paul E. McKenney
@ 2017-04-12 16:57                           ` Steven Rostedt
  2017-04-12 17:07                             ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-12 16:57 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Wed, 12 Apr 2017 09:27:54 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Wed, Apr 12, 2017 at 10:59:37AM -0400, Steven Rostedt wrote:
> > On Wed, 12 Apr 2017 07:48:00 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > > > > Like this?  (Untested, but builds at least some of the time.)    
> > > > > 
> > > > > Not like that...  :-/  Update on its way.    
> > > > 
> > > > Perhaps more like this.  Started rcutorture on it, will see how it goes.    
> > 
> > I just love the above discussion with yourself ;-)  
> 
> Talking to oneself used to cause passersby to get really bent out of
> shape.  But one big benefit of ubiquitous cellphones is that now
> people just assume that you are talking on a cellphone that they
> cannot see.  ;-)
> 
> > > Do you need this patch?  If so, I should do some more work on it to
> > > eliminate the extra common-case branch on the scheduler fastpath.  
> > 
> > Do I still need this patch? Maybe. :-)
> > 
> > I changed my benchmark test to call cond_resched_rcu_qs() instead and
> > that appears to fix the issue. But I'm not sure if there's any other
> > kthread out there that just calls cond_resched() or schedule().
> > 
> > Actually, I think it is still a good idea to have it. I believe that it
> > will still allow synchronize_rcu_tasks() to progress even if there's a
> > kthread task that is constantly being woken up, and never sleeps when
> > it calls schedule(), as it may always have the R state.  
> 
> OK, will optimize it a bit.  When are you planning to get this in?
> 

Well, I added the use case for synchronize_rcu_tasks() in my current
for-next. I'll have to make sure I get the schedule_idle() in, as well
as my update to the event benchmark thread.

I don't think anything will truly break without it yet. But that's
assuming there's not another kernel thread somewhere that just spins
calling schedule.

And this patch will still speed up those that do call
synchronize_rcu_tasks(). But that's an optimization and not really a
fix.

-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-12 16:57                           ` Steven Rostedt
@ 2017-04-12 17:07                             ` Paul E. McKenney
  2017-04-12 17:13                               ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-12 17:07 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Wed, Apr 12, 2017 at 12:57:22PM -0400, Steven Rostedt wrote:
> On Wed, 12 Apr 2017 09:27:54 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Apr 12, 2017 at 10:59:37AM -0400, Steven Rostedt wrote:
> > > On Wed, 12 Apr 2017 07:48:00 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >   
> > > > > > > Like this?  (Untested, but builds at least some of the time.)    
> > > > > > 
> > > > > > Not like that...  :-/  Update on its way.    
> > > > > 
> > > > > Perhaps more like this.  Started rcutorture on it, will see how it goes.    
> > > 
> > > I just love the above discussion with yourself ;-)  
> > 
> > Talking to oneself used to cause passersby to get really bent out of
> > shape.  But one big benefit of ubiquitous cellphones is that now
> > people just assume that you are talking on a cellphone that they
> > cannot see.  ;-)
> > 
> > > > Do you need this patch?  If so, I should do some more work on it to
> > > > eliminate the extra common-case branch on the scheduler fastpath.  
> > > 
> > > Do I still need this patch? Maybe. :-)
> > > 
> > > I changed my benchmark test to call cond_resched_rcu_qs() instead and
> > > that appears to fix the issue. But I'm not sure if there's any other
> > > kthread out there that just calls cond_resched() or schedule().
> > > 
> > > Actually, I think it is still a good idea to have it. I believe that it
> > > will still allow synchronize_rcu_tasks() to progress even if there's a
> > > kthread task that is constantly being woken up, and never sleeps when
> > > it calls schedule(), as it may always have the R state.  
> > 
> > OK, will optimize it a bit.  When are you planning to get this in?
> > 
> 
> Well, I added the use case for synchronize_rcu_tasks() in my current
> for-next. I'll have to make sure I get the schedule_idle() in as well
> as my update to the event benchmark thread as well.
> 
> I don't think anything will truly break without it yet. But that's
> assuming there's not another kernel thread somewhere that just spins
> calling schedule.
> 
> And this patch will still speed up those that do call
> synchronize_rcu_tasks(). But that's an optimization and not really a
> fix.

The upcoming v4.12 merge window, then?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-12 17:07                             ` Paul E. McKenney
@ 2017-04-12 17:13                               ` Steven Rostedt
  2017-04-12 20:02                                 ` Paul E. McKenney
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2017-04-12 17:13 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Wed, 12 Apr 2017 10:07:16 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:


> > > OK, will optimize it a bit.  When are you planning to get this in?
> > >   
> > 
> > Well, I added the use case for synchronize_rcu_tasks() in my current
> > for-next. I'll have to make sure I get the schedule_idle() in as well
> > as my update to the event benchmark thread as well.
> > 
> > I don't think anything will truly break without it yet. But that's
> > assuming there's not another kernel thread somewhere that just spins
> > calling schedule.
> > 
> > And this patch will still speed up those that do call
> > synchronize_rcu_tasks(). But that's an optimization and not really a
> > fix.  
> 
> The upcoming v4.12 merge window, then?

Yep.

-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: There is a Tasks RCU stall warning
  2017-04-12 17:13                               ` Steven Rostedt
@ 2017-04-12 20:02                                 ` Paul E. McKenney
  0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2017-04-12 20:02 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel

On Wed, Apr 12, 2017 at 01:13:25PM -0400, Steven Rostedt wrote:
> On Wed, 12 Apr 2017 10:07:16 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> 
> > > > OK, will optimize it a bit.  When are you planning to get this in?
> > > >   
> > > 
> > > Well, I added the use case for synchronize_rcu_tasks() in my current
> > > for-next. I'll have to make sure I get the schedule_idle() in as well
> > > as my update to the event benchmark thread as well.
> > > 
> > > I don't think anything will truly break without it yet. But that's
> > > assuming there's not another kernel thread somewhere that just spins
> > > calling schedule.
> > > 
> > > And this patch will still speed up those that do call
> > > synchronize_rcu_tasks(). But that's an optimization and not really a
> > > fix.  
> > 
> > The upcoming v4.12 merge window, then?
> 
> Yep.

Also, is the default 10-minute stall warning OK?  For purposes of
comparison, when running rcutorture, Tasks RCU grace periods take 2-3
seconds, but of course your mileage may vary.

For whatever it is worth, I set the default ratio of the Tasks RCU
stall-warning time to its average rcutorture grace period to be about
the same as the ratio for other RCU flavors.

And the kernel boot parameter rcupdate.rcu_task_stall_timeout allows
you to set whatever time you want (in jiffies) at boot time, and also
to modify it at runtime via sysfs.
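
For instance, something like the following (HZ value is hypothetical,
and the sysfs path follows the standard module_param layout; worth
verifying on the kernel in question):

```shell
# Hypothetical: a 60-second stall timeout on a HZ=1000 kernel.
HZ=1000
timeout_jiffies=$((60 * HZ))
echo "boot parameter: rcupdate.rcu_task_stall_timeout=$timeout_jiffies"
# At runtime (needs root):
#   echo $timeout_jiffies > /sys/module/rcupdate/parameters/rcu_task_stall_timeout
```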

							Thanx, Paul

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2017-04-12 20:02 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-11 21:18 There is a Tasks RCU stall warning Paul E. McKenney
2017-04-11 21:21 ` Steven Rostedt
2017-04-11 21:32   ` Paul E. McKenney
2017-04-11 21:31 ` Steven Rostedt
2017-04-11 21:34   ` Steven Rostedt
2017-04-11 21:39     ` Steven Rostedt
2017-04-11 21:44       ` Paul E. McKenney
2017-04-11 21:49         ` Steven Rostedt
2017-04-11 21:56           ` Paul E. McKenney
2017-04-11 22:15             ` Steven Rostedt
2017-04-11 23:01               ` Paul E. McKenney
2017-04-11 23:04                 ` Paul E. McKenney
2017-04-11 23:11                   ` Paul E. McKenney
2017-04-12  3:23                     ` Paul E. McKenney
2017-04-12 13:18                       ` Steven Rostedt
2017-04-12 14:19                         ` Paul E. McKenney
2017-04-12 14:42                           ` Steven Rostedt
2017-04-12 15:18                             ` Paul E. McKenney
2017-04-12 15:53                               ` Steven Rostedt
2017-04-12 16:26                                 ` Paul E. McKenney
2017-04-12 16:49                                   ` Steven Rostedt
2017-04-12 14:48                     ` Paul E. McKenney
2017-04-12 14:59                       ` Steven Rostedt
2017-04-12 16:27                         ` Paul E. McKenney
2017-04-12 16:57                           ` Steven Rostedt
2017-04-12 17:07                             ` Paul E. McKenney
2017-04-12 17:13                               ` Steven Rostedt
2017-04-12 20:02                                 ` Paul E. McKenney
