linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
@ 2014-05-27 20:08 Stephen Boyd
  2014-05-27 20:11 ` Arnd Bergmann
  0 siblings, 1 reply; 15+ messages in thread
From: Stephen Boyd @ 2014-05-27 20:08 UTC (permalink / raw)
  To: Steven Rostedt, Frederic Weisbecker, Ingo Molnar
  Cc: linux-kernel, Corey Minyard, Stanislav Meduna, Arnd Bergmann

The idle loop calls stop_critical_timings() in an attempt to stop
the irqsoff tracer from tracing a cpu's idle time. Unfortunately,
stop_critical_timings() doesn't do much besides stop the tracer
until the point that another irqsoff event happens. This
typically happens in rcu_idle_enter() or later on in the cpuidle
driver when we take a spinlock to notify the clockevent framework
about timers that lose state during deep idle (FEAT_C3_STOP).
This leads to worthless irqsoff traces that show a large amount
of time spent in idle:

 # tracer: irqsoff
 #
 # irqsoff latency trace v1.1.5 on 3.15.0-rc4-00002-geffda86cfdaa-dirty
 # --------------------------------------------------------------------
 # latency: 3997988 us, #88/88, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
 #    -----------------
 #    | task: swapper/1-0 (uid:0 nice:0 policy:0 rt_prio:0)
 #    -----------------
 #  => started at: rcu_idle_enter
 #  => ended at:   cpuidle_enter_state
 #
 #
 #                  _------=> CPU#
 #                 / _-----=> irqs-off
 #                | / _----=> need-resched
 #                || / _---=> hardirq/softirq
 #                ||| / _--=> preempt-depth
 #                |||| /     delay
 #  cmd     pid   ||||| time  |   caller
 #     \   /      |||||  \    |   /
   <idle>-0       1d..1    2us+: trace_hardirqs_off <-rcu_idle_enter
   <idle>-0       1d..2    7us+: rcu_eqs_enter_common.isra.48 <-rcu_idle_enter
   <idle>-0       1d..2   12us+: cpuidle_enabled <-cpu_startup_entry
   <idle>-0       1d..2   17us+: cpuidle_select <-cpu_startup_entry
   <idle>-0       1d..2   21us+: menu_select <-cpuidle_select
   <idle>-0       1d..2   26us+: pm_qos_request <-menu_select
   <idle>-0       1d..2   30us+: tick_nohz_get_sleep_length <-menu_select
   <idle>-0       1d..2   34us+: ns_to_timespec <-menu_select
   <idle>-0       1d..2   39us+: nr_iowait_cpu <-menu_select
   <idle>-0       1d..2   46us+: this_cpu_load <-menu_select
   <idle>-0       1d..2   50us+: nr_iowait_cpu <-menu_select
   <idle>-0       1d..2   55us+: clockevents_notify <-cpu_startup_entry
   <idle>-0       1d..2   59us+: _raw_spin_lock_irqsave <-clockevents_notify
   <idle>-0       1d..2   64us+: preempt_count_add <-_raw_spin_lock_irqsave
   <idle>-0       1d..3   69us+: do_raw_spin_lock <-_raw_spin_lock_irqsave
   <idle>-0       1d..3   74us+: tick_broadcast_oneshot_control <-clockevents_notify
   <idle>-0       1d..3   78us+: _raw_spin_lock_irqsave <-tick_broadcast_oneshot_control
   <idle>-0       1d..3   83us+: preempt_count_add <-_raw_spin_lock_irqsave
   <idle>-0       1d..4   88us+: do_raw_spin_lock <-_raw_spin_lock_irqsave
   <idle>-0       1d..4   93us+: clockevents_set_mode <-tick_broadcast_oneshot_control
   <idle>-0       1d..4   97us+: arch_timer_set_mode_virt <-clockevents_set_mode
   <idle>-0       1d..4  102us+: broadcast_needs_cpu <-tick_broadcast_oneshot_control
   <idle>-0       1d..4  106us+: _raw_spin_unlock_irqrestore <-tick_broadcast_oneshot_control
   <idle>-0       1d..4  110us+: do_raw_spin_unlock <-_raw_spin_unlock_irqrestore
   <idle>-0       1d..4  115us+: preempt_count_sub <-_raw_spin_unlock_irqrestore
   <idle>-0       1d..3  120us+: _raw_spin_unlock_irqrestore <-clockevents_notify
   <idle>-0       1d..3  124us+: do_raw_spin_unlock <-_raw_spin_unlock_irqrestore
   <idle>-0       1d..3  129us+: preempt_count_sub <-_raw_spin_unlock_irqrestore
   <idle>-0       1d..2  133us+: cpuidle_enter <-cpu_startup_entry
   <idle>-0       1d..2  137us+: cpuidle_enter_state <-cpuidle_enter
   <idle>-0       1d..2  141us+: ktime_get <-cpuidle_enter_state
   <idle>-0       1d..2  148us+: arch_counter_read <-ktime_get
   <idle>-0       1d..2  153us+: cpu_enter_idle <-cpuidle_enter_state
   <idle>-0       1d..2  157us+: cpu_pm_enter <-cpu_enter_idle
   <idle>-0       1d..2  161us+: _raw_read_lock <-cpu_pm_enter
   <idle>-0       1d..2  165us+: preempt_count_add <-_raw_read_lock
   <idle>-0       1d..3  170us+: do_raw_read_lock <-_raw_read_lock
   <idle>-0       1d..3  175us+: cpu_pm_notify <-cpu_pm_enter
   <idle>-0       1d..3  179us+: __raw_notifier_call_chain <-cpu_pm_notify
   <idle>-0       1d..3  183us+: notifier_call_chain <-__raw_notifier_call_chain
   <idle>-0       1d..3  187us+: gic_notifier <-notifier_call_chain
   <idle>-0       1d..3  192us+: arch_timer_cpu_pm_notify <-notifier_call_chain
   <idle>-0       1d..3  197us+: vfp_cpu_pm_notifier <-notifier_call_chain
   <idle>-0       1d..3  201us+: dbg_cpu_pm_notify <-notifier_call_chain
   <idle>-0       1d..3  206us+: _raw_read_unlock <-cpu_pm_enter
   <idle>-0       1d..3  210us+: do_raw_read_unlock <-_raw_read_unlock
   <idle>-0       1d..3  214us!: preempt_count_sub <-_raw_read_unlock
   <idle>-0       1d..2 3997820us+: cpu_pm_exit <-cpu_enter_idle
   <idle>-0       1d..2 3997824us+: _raw_read_lock <-cpu_pm_exit
   <idle>-0       1d..2 3997828us+: preempt_count_add <-_raw_read_lock
   <idle>-0       1d..3 3997833us+: do_raw_read_lock <-_raw_read_lock
   <idle>-0       1d..3 3997838us+: cpu_pm_notify <-cpu_pm_exit
   <idle>-0       1d..3 3997842us+: __raw_notifier_call_chain <-cpu_pm_notify
   <idle>-0       1d..3 3997846us+: notifier_call_chain <-__raw_notifier_call_chain
   <idle>-0       1d..3 3997850us+: gic_notifier <-notifier_call_chain
   <idle>-0       1d..3 3997856us+: arch_timer_cpu_pm_notify <-notifier_call_chain
   <idle>-0       1d..3 3997860us+: vfp_cpu_pm_notifier <-notifier_call_chain
   <idle>-0       1d..3 3997865us+: vfp_enable <-vfp_cpu_pm_notifier
   <idle>-0       1d..3 3997869us+: dbg_cpu_pm_notify <-notifier_call_chain
   <idle>-0       1d..3 3997873us+: reset_ctrl_regs <-dbg_cpu_pm_notify
   <idle>-0       1d..3 3997877us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997882us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997886us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997890us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997894us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997898us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997902us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997906us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997910us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997915us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997919us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997923us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997927us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997931us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997936us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997940us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997944us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997948us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997952us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997956us+: write_wb_reg <-reset_ctrl_regs
   <idle>-0       1d..3 3997960us+: _raw_read_unlock <-cpu_pm_exit
   <idle>-0       1d..3 3997965us+: do_raw_read_unlock <-_raw_read_unlock
   <idle>-0       1d..3 3997969us+: preempt_count_sub <-_raw_read_unlock
   <idle>-0       1d..2 3997973us+: ktime_get <-cpuidle_enter_state
   <idle>-0       1d..2 3997980us+: arch_counter_read <-ktime_get
   <idle>-0       1d..1 3997985us+: trace_hardirqs_on <-cpuidle_enter_state
   <idle>-0       1d..1 3997994us+: time_hardirqs_on <-cpuidle_enter_state
   <idle>-0       1d..1 3998053us : <stack trace>
  => time_hardirqs_on
  => trace_hardirqs_on_caller
  => trace_hardirqs_on
  => cpuidle_enter_state
  => cpuidle_enter
  => cpu_startup_entry
  => secondary_start_kernel

Let's make stop_critical_timings() actually stop the tracer until
start_critical_timings() is called. This way we don't have to
worry about adding more stop_critical_timings() calls deep down
in cpuidle drivers or later on in the idle loop.

Cc: Corey Minyard <cminyard@mvista.com>
Cc: Stanislav Meduna <stano@meduna.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---

Changes since v1:
 * Micro-optimize start_critical_timing() check
 * Fix comment
 * Lift out timings_stopped assignments from conditional checks

 kernel/trace/trace_irqsoff.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 8ff02cbb892f..278b5fe6a693 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -367,6 +367,8 @@ out:
 	__trace_function(tr, CALLER_ADDR0, parent_ip, flags, pc);
 }
 
+static DEFINE_PER_CPU(bool, timings_stopped);
+
 static inline void
 start_critical_timing(unsigned long ip, unsigned long parent_ip)
 {
@@ -380,7 +382,7 @@ start_critical_timing(unsigned long ip, unsigned long parent_ip)
 
 	cpu = raw_smp_processor_id();
 
-	if (per_cpu(tracing_cpu, cpu))
+	if (per_cpu(timings_stopped, cpu) || per_cpu(tracing_cpu, cpu))
 		return;
 
 	data = per_cpu_ptr(tr->trace_buffer.data, cpu);
@@ -418,7 +420,8 @@ stop_critical_timing(unsigned long ip, unsigned long parent_ip)
 	else
 		return;
 
-	if (!tracer_enabled || !tracing_is_enabled())
+	if (!tracer_enabled || !tracing_is_enabled() ||
+			per_cpu(timings_stopped, cpu))
 		return;
 
 	data = per_cpu_ptr(tr->trace_buffer.data, cpu);
@@ -436,9 +439,10 @@ stop_critical_timing(unsigned long ip, unsigned long parent_ip)
 	atomic_dec(&data->disabled);
 }
 
-/* start and stop critical timings used to for stoppage (in idle) */
+/* start and stop critical timings; used for stoppage (in idle) */
 void start_critical_timings(void)
 {
+	per_cpu(timings_stopped, raw_smp_processor_id()) = false;
 	if (preempt_trace() || irq_trace())
 		start_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
 }
@@ -448,6 +452,7 @@ void stop_critical_timings(void)
 {
 	if (preempt_trace() || irq_trace())
 		stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
+	per_cpu(timings_stopped, raw_smp_processor_id()) = true;
 }
 EXPORT_SYMBOL_GPL(stop_critical_timings);
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation



* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 20:08 [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers Stephen Boyd
@ 2014-05-27 20:11 ` Arnd Bergmann
  2014-05-27 21:48   ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Arnd Bergmann @ 2014-05-27 20:11 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Steven Rostedt, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna

On Tuesday 27 May 2014 13:08:04 Stephen Boyd wrote:
> @@ -380,7 +382,7 @@ start_critical_timing(unsigned long ip, unsigned long parent_ip)
>  
>         cpu = raw_smp_processor_id();
>  
> -       if (per_cpu(tracing_cpu, cpu))
> +       if (per_cpu(timings_stopped, cpu) || per_cpu(tracing_cpu, cpu))
>                 return;
>  
>         data = per_cpu_ptr(tr->trace_buffer.data, cpu);
> 

Where exactly do you see other code calling here while
per_cpu(timings_stopped) is set? Would it be possible to just
change that call site?

	Arnd


* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 20:11 ` Arnd Bergmann
@ 2014-05-27 21:48   ` Steven Rostedt
  2014-05-27 22:21     ` Stephen Boyd
  0 siblings, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2014-05-27 21:48 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Stephen Boyd, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna

On Tue, 27 May 2014 22:11:25 +0200
Arnd Bergmann <arnd@arndb.de> wrote:

> On Tuesday 27 May 2014 13:08:04 Stephen Boyd wrote:
> > @@ -380,7 +382,7 @@ start_critical_timing(unsigned long ip, unsigned long parent_ip)
> >  
> >         cpu = raw_smp_processor_id();
> >  
> > -       if (per_cpu(tracing_cpu, cpu))
> > +       if (per_cpu(timings_stopped, cpu) || per_cpu(tracing_cpu, cpu))
> >                 return;
> >  
> >         data = per_cpu_ptr(tr->trace_buffer.data, cpu);
> > 
> 
> Where exactly do you see other code calling here while
> per_cpu(timings_stopped) is set? Would it be possible to just
> change that call site?

Arnd brings up a good point. If we disable irqs off tracing completely,
we may be missing places in the idle path that disable interrupts for
long periods of time. We may want to move the stop down further.

The way it works (IIRC), and why tracing can start again, is that it can
nest. Perhaps we need to stop it further down if we can't move it
completely.

-- Steve


* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 21:48   ` Steven Rostedt
@ 2014-05-27 22:21     ` Stephen Boyd
  2014-05-27 23:30       ` Steven Rostedt
  0 siblings, 1 reply; 15+ messages in thread
From: Stephen Boyd @ 2014-05-27 22:21 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arnd Bergmann, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna

On 05/27/14 14:48, Steven Rostedt wrote:
> On Tue, 27 May 2014 22:11:25 +0200
> Arnd Bergmann <arnd@arndb.de> wrote:
>
>> On Tuesday 27 May 2014 13:08:04 Stephen Boyd wrote:
>>> @@ -380,7 +382,7 @@ start_critical_timing(unsigned long ip, unsigned long parent_ip)
>>>  
>>>         cpu = raw_smp_processor_id();
>>>  
>>> -       if (per_cpu(tracing_cpu, cpu))
>>> +       if (per_cpu(timings_stopped, cpu) || per_cpu(tracing_cpu, cpu))
>>>                 return;
>>>  
>>>         data = per_cpu_ptr(tr->trace_buffer.data, cpu);
>>>
>> Where exactly do you see other code calling here while
>> per_cpu(timings_stopped) is set? Would it be possible to just
>> change that call site?
> Arnd brings up a good point. 

Hrm.. still not getting Arnd's mails.

> If we disable irqs off tracing completely,
> we may be missing places in the idle path that disable interrupts for
> long periods of time. We may want to move the stop down further.
>
> The way it works (IIRC), and why tracing can start again is that it can
> nest. Perhaps we need to stop it further down if we can't move it
> completely.
>

I'm not sure how much deeper it can go and I'm afraid it will become a
game of whack-a-mole. I already see two places that disable and reenable
irqs after stop_critical_timings() is called (first in rcu_idle_enter()
and second in clockevents_notify()). Should rcu_idle_enter() move to
raw_local_irq_save()? It looks like that path calls rcu_sched_qs() and
on tiny RCU that again needs the raw_ treatment. We can probably call
stop_critical_timings() after rcu_idle_enter() to fix this.

What about clockevents_notify()? __raw_spin_lock_irqsave() should probably
use raw_local_irq_save().

If we go the raw route, do we even need stop/start_critical_timings()?
Can't we just use raw accessors in the idle paths
(tick_nohz_idle_{enter,exit}(), cpuidle_enter(), etc.) and get rid of
the stop/start stuff completely? I admit this patch is pretty much a big
sledge hammer that tries to make things simple, but if there is some
benefit to the raw accessors I'm willing to send patches to fix all the
call sites.
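
For reference, the traced and raw flavors differ roughly like this (a
sketch of include/linux/irqflags.h quoted from memory; the point is that
the raw_ variants skip the trace_hardirqs_*() hooks entirely, so the
irqsoff tracer never sees those sections):

/* with CONFIG_TRACE_IRQFLAGS */
#define local_irq_save(flags)				\
	do {						\
		raw_local_irq_save(flags);		\
		trace_hardirqs_off();			\
	} while (0)

#define raw_local_irq_save(flags)			\
	do {						\
		typecheck(unsigned long, flags);	\
		flags = arch_local_irq_save();		\
	} while (0)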

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 22:21     ` Stephen Boyd
@ 2014-05-27 23:30       ` Steven Rostedt
  2014-05-28  0:11         ` Stephen Boyd
                           ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Steven Rostedt @ 2014-05-27 23:30 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Arnd Bergmann, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	Peter Zijlstra

On Tue, 27 May 2014 15:21:39 -0700
Stephen Boyd <sboyd@codeaurora.org> wrote:


> > Arnd brings up a good point. 
> 
> Hrm.. still not getting Arnd's mails.

Strange. What mail service do you have? Could they be blocking him?

> 
> > If we disable irqs off tracing completely,
> > we may be missing places in the idle path that disable interrupts for
> > long periods of time. We may want to move the stop down further.
> >
> > The way it works (IIRC), and why tracing can start again is that it can
> > nest. Perhaps we need to stop it further down if we can't move it
> > completely.
> >
> 
> I'm not sure how much deeper it can go and I'm afraid it will become a
> game of whack-a-mole. I already see two places that disable and reenable
> irqs after stop_critical_timings() is called (first in rcu_idle_enter()
> and second in clockevents_notify()). Should rcu_idle_enter() move to
> raw_local_irq_save()? It looks like that path calls rcu_sched_qs() and
> on tiny RCU that again needs the raw_ treatement. We can probably call
> stop_critical_timings() after rcu_idle_enter() to fix this.

I don't think we need to whack-a-mole. The start stop should be around
where it goes to sleep.

> 
> What about clockevents_notify? __raw_spin_lock_irqsave() should probably
> use raw_local_irqsave().

No, that solution is even worse. We need lockdep working here.

> 
> If we go the raw route, do we even need stop/start_critical_timings()?
> Can't we just use raw accessors in the idle paths
> (tick_nohz_idle_{enter,exit}(), cpuidle_enter(), etc.) and get rid of
> the stop/start stuff completely? I admit this patch is pretty much a big
> sledge hammer that tries to make things simple, but if there is some
> benefit to the raw accessors I'm willing to send patches to fix all the
> call sites.
> 

How about the following? I don't see any reason stop_critical_timings()
can't be called from within rcu_idle code, as it doesn't use any rcu.

Paul, Peter, see anything wrong with this?

-- Steve

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 8f4390a..f5e6a64 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -88,12 +88,6 @@ static int cpuidle_idle_call(void)
 	}
 
 	/*
-	 * During the idle period, stop measuring the disabled irqs
-	 * critical sections latencies
-	 */
-	stop_critical_timings();
-
-	/*
 	 * Tell the RCU framework we are entering an idle section,
 	 * so no more rcu read side critical sections and one more
 	 * step to the grace period
@@ -144,6 +138,12 @@ static int cpuidle_idle_call(void)
 				trace_cpu_idle_rcuidle(next_state, dev->cpu);
 
 				/*
+				 * During the idle period, stop measuring the
+				 * disabled irqs critical sections latencies
+				 */
+				stop_critical_timings();
+
+				/*
 				 * Enter the idle state previously
 				 * returned by the governor
 				 * decision. This function will block
@@ -154,6 +154,8 @@ static int cpuidle_idle_call(void)
 				entered_state = cpuidle_enter(drv, dev,
 							      next_state);
 
+				start_critical_timings();
+
 				trace_cpu_idle_rcuidle(PWR_EVENT_EXIT,
 						       dev->cpu);
 
@@ -175,8 +177,11 @@ static int cpuidle_idle_call(void)
 	 * We can't use the cpuidle framework, let's use the default
 	 * idle routine
 	 */
-	if (ret)
+	if (ret) {
+		stop_critical_timings();
 		arch_cpu_idle();
+		start_critical_timings();
+	}
 
 	__current_set_polling();
 
@@ -188,7 +193,6 @@ static int cpuidle_idle_call(void)
 		local_irq_enable();
 
 	rcu_idle_exit();
-	start_critical_timings();
 
 	return 0;
 }


* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 23:30       ` Steven Rostedt
@ 2014-05-28  0:11         ` Stephen Boyd
  2014-05-28  0:23           ` Stephen Boyd
  2014-05-28  0:42           ` Steven Rostedt
  2014-05-28  0:25         ` Paul E. McKenney
                           ` (3 subsequent siblings)
  4 siblings, 2 replies; 15+ messages in thread
From: Stephen Boyd @ 2014-05-28  0:11 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arnd Bergmann, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	Peter Zijlstra

On 05/27/14 16:30, Steven Rostedt wrote:
> On Tue, 27 May 2014 15:21:39 -0700
> Stephen Boyd <sboyd@codeaurora.org> wrote:
>
>
>>> Arnd brings up a good point. 
>> Hrm.. still not getting Arnd's mails.
> Strange. What mail service do you have. Could they be blocking him?
>
>>> If we disable irqs off tracing completely,
>>> we may be missing places in the idle path that disable interrupts for
>>> long periods of time. We may want to move the stop down further.
>>>
>>> The way it works (IIRC), and why tracing can start again is that it can
>>> nest. Perhaps we need to stop it further down if we can't move it
>>> completely.
>>>
>> I'm not sure how much deeper it can go and I'm afraid it will become a
>> game of whack-a-mole. I already see two places that disable and reenable
>> irqs after stop_critical_timings() is called (first in rcu_idle_enter()
>> and second in clockevents_notify()). Should rcu_idle_enter() move to
>> raw_local_irq_save()? It looks like that path calls rcu_sched_qs() and
>> on tiny RCU that again needs the raw_ treatement. We can probably call
>> stop_critical_timings() after rcu_idle_enter() to fix this.
> I don't think we need to whack-a-mole. The start stop should be around
> where it goes to sleep.
>
>> What about clockevents_notify? __raw_spin_lock_irqsave() should probably
>> use raw_local_irqsave().
> No that solution is even worse. We need lockdep working here.
>
>> If we go the raw route, do we even need stop/start_critical_timings()?
>> Can't we just use raw accessors in the idle paths
>> (tick_nohz_idle_{enter,exit}(), cpuidle_enter(), etc.) and get rid of
>> the stop/start stuff completely? I admit this patch is pretty much a big
>> sledge hammer that tries to make things simple, but if there is some
>> benefit to the raw accessors I'm willing to send patches to fix all the
>> call sites.
>>
> How about the following. I don't see any reason stop_critical_timings()
> can't be called from within rcu_idle code, as it doesn't use any rcu.
>
> Paul, Peter, see anything wrong with this?
>

cpuidle_enter_state() calls ktime_get(), which on lockdep-enabled builds
calls seqcount_lockdep_reader_access(), which calls local_irq_save() and
turns the tracer back on. Perhaps the problem is that the irqsoff tracer
is triggered even when we aren't actually transitioning between irqs on
and irqs off? What about this patch? I assume there is a reason that
this is wrong, but I don't know what it is.

---8<-----

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index d176d658fe25..ac8e0a4968bd 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -93,7 +93,8 @@
 #define local_irq_save(flags)				\
 	do {						\
 		raw_local_irq_save(flags);		\
-		trace_hardirqs_off();			\
+		if (!raw_irqs_disabled_flags(flags))	\
+			trace_hardirqs_off();		\
 	} while (0)
 
 
@@ -101,7 +102,6 @@
 	do {						\
 		if (raw_irqs_disabled_flags(flags)) {	\
 			raw_local_irq_restore(flags);	\
-			trace_hardirqs_off();		\
 		} else {				\
 			trace_hardirqs_on();		\
 			raw_local_irq_restore(flags);	\

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-28  0:11         ` Stephen Boyd
@ 2014-05-28  0:23           ` Stephen Boyd
  2014-05-28  0:42           ` Steven Rostedt
  1 sibling, 0 replies; 15+ messages in thread
From: Stephen Boyd @ 2014-05-28  0:23 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arnd Bergmann, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	Peter Zijlstra

On 05/27/14 17:11, Stephen Boyd wrote:
> On 05/27/14 16:30, Steven Rostedt wrote:
>> On Tue, 27 May 2014 15:21:39 -0700
>> Stephen Boyd <sboyd@codeaurora.org> wrote:
>>
>>
>>>> Arnd brings up a good point. 
>>> Hrm.. still not getting Arnd's mails.
>> Strange. What mail service do you have. Could they be blocking him?
>>
>>>> If we disable irqs off tracing completely,
>>>> we may be missing places in the idle path that disable interrupts for
>>>> long periods of time. We may want to move the stop down further.
>>>>
>>>> The way it works (IIRC), and why tracing can start again is that it can
>>>> nest. Perhaps we need to stop it further down if we can't move it
>>>> completely.
>>>>
>>> I'm not sure how much deeper it can go and I'm afraid it will become a
>>> game of whack-a-mole. I already see two places that disable and reenable
>>> irqs after stop_critical_timings() is called (first in rcu_idle_enter()
>>> and second in clockevents_notify()). Should rcu_idle_enter() move to
>>> raw_local_irq_save()? It looks like that path calls rcu_sched_qs() and
>>> on tiny RCU that again needs the raw_ treatement. We can probably call
>>> stop_critical_timings() after rcu_idle_enter() to fix this.
>> I don't think we need to whack-a-mole. The start stop should be around
>> where it goes to sleep.
>>
>>> What about clockevents_notify? __raw_spin_lock_irqsave() should probably
>>> use raw_local_irqsave().
>> No that solution is even worse. We need lockdep working here.
>>
>>> If we go the raw route, do we even need stop/start_critical_timings()?
>>> Can't we just use raw accessors in the idle paths
>>> (tick_nohz_idle_{enter,exit}(), cpuidle_enter(), etc.) and get rid of
>>> the stop/start stuff completely? I admit this patch is pretty much a big
>>> sledge hammer that tries to make things simple, but if there is some
>>> benefit to the raw accessors I'm willing to send patches to fix all the
>>> call sites.
>>>
>> How about the following. I don't see any reason stop_critical_timings()
>> can't be called from within rcu_idle code, as it doesn't use any rcu.
>>
>> Paul, Peter, see anything wrong with this?
>>
> cpuidle_enter_state() calls ktime_get() which on lockdep enabled builds
> calls seqcount_lockdep_reader_access() which calls local_irq_save() that
> then turns on the tracer again. Perhaps the problem is that irqsoff
> tracer is triggered even when we aren't transitioning between irqs on
> and irqs off? What about this patch? I assume there is a reason that
> this is wrong, but I don't know what it is.
>
> ---8<-----
>
> diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
> index d176d658fe25..ac8e0a4968bd 100644
> --- a/include/linux/irqflags.h
> +++ b/include/linux/irqflags.h
> @@ -93,7 +93,8 @@
>  #define local_irq_save(flags)				\
>  	do {						\
>  		raw_local_irq_save(flags);		\
> -		trace_hardirqs_off();			\
> +		if (!raw_irqs_disabled_flags(flags))	\
> +			trace_hardirqs_off();		\
>  	} while (0)
>  
>  
> @@ -101,7 +102,6 @@
>  	do {						\
>  		if (raw_irqs_disabled_flags(flags)) {	\
>  			raw_local_irq_restore(flags);	\
> -			trace_hardirqs_off();		\
>  		} else {				\
>  			trace_hardirqs_on();		\
>  			raw_local_irq_restore(flags);	\
>

Aha, looks like lockdep already tracks redundant hardirq disables via
its redundant_hardirqs_off counter. Can we use the same information in
the irqsoff tracer to monitor the hardirq on/off state more accurately?
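
Something like this, purely as a sketch of the idea (not a tested patch,
and the per-cpu flag name is made up): skip redundant disables in the
tracer hooks the same way lockdep counts them:

/* Sketch only: don't restart the irqsoff timing when irqs are already off */
static DEFINE_PER_CPU(bool, hardirqs_off_traced);

void time_hardirqs_off(unsigned long a0, unsigned long a1)
{
	if (this_cpu_read(hardirqs_off_traced))
		return;				/* redundant disable */
	this_cpu_write(hardirqs_off_traced, true);

	if (!preempt_trace() && irq_trace())
		start_critical_timing(a0, a1);
}

void time_hardirqs_on(unsigned long a0, unsigned long a1)
{
	this_cpu_write(hardirqs_off_traced, false);

	if (!preempt_trace() && irq_trace())
		stop_critical_timing(a0, a1);
}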

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 23:30       ` Steven Rostedt
  2014-05-28  0:11         ` Stephen Boyd
@ 2014-05-28  0:25         ` Paul E. McKenney
  2014-05-28  6:40         ` Arnd Bergmann
                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Paul E. McKenney @ 2014-05-28  0:25 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Stephen Boyd, Arnd Bergmann, Frederic Weisbecker, Ingo Molnar,
	linux-kernel, Corey Minyard, Stanislav Meduna, Peter Zijlstra

On Tue, May 27, 2014 at 07:30:50PM -0400, Steven Rostedt wrote:
> On Tue, 27 May 2014 15:21:39 -0700
> Stephen Boyd <sboyd@codeaurora.org> wrote:
> 
> 
> > > Arnd brings up a good point. 
> > 
> > Hrm.. still not getting Arnd's mails.
> 
> Strange. What mail service do you have. Could they be blocking him?
> 
> > 
> > > If we disable irqs off tracing completely,
> > > we may be missing places in the idle path that disable interrupts for
> > > long periods of time. We may want to move the stop down further.
> > >
> > > The way it works (IIRC), and why tracing can start again is that it can
> > > nest. Perhaps we need to stop it further down if we can't move it
> > > completely.
> > >
> > 
> > I'm not sure how much deeper it can go and I'm afraid it will become a
> > game of whack-a-mole. I already see two places that disable and reenable
> > irqs after stop_critical_timings() is called (first in rcu_idle_enter()
> > and second in clockevents_notify()). Should rcu_idle_enter() move to
> > raw_local_irq_save()? It looks like that path calls rcu_sched_qs() and
> > on tiny RCU that again needs the raw_ treatement. We can probably call
> > stop_critical_timings() after rcu_idle_enter() to fix this.
> 
> I don't think we need to whack-a-mole. The start stop should be around
> where it goes to sleep.
> 
> > 
> > What about clockevents_notify? __raw_spin_lock_irqsave() should probably
> > use raw_local_irqsave().
> 
> No that solution is even worse. We need lockdep working here.
> 
> > 
> > If we go the raw route, do we even need stop/start_critical_timings()?
> > Can't we just use raw accessors in the idle paths
> > (tick_nohz_idle_{enter,exit}(), cpuidle_enter(), etc.) and get rid of
> > the stop/start stuff completely? I admit this patch is pretty much a big
> > sledge hammer that tries to make things simple, but if there is some
> > benefit to the raw accessors I'm willing to send patches to fix all the
> > call sites.
> > 
> 
> How about the following. I don't see any reason stop_critical_timings()
> can't be called from within rcu_idle code, as it doesn't use any rcu.
> 
> Paul, Peter, see anything wrong with this?

I don't immediately see any RCU use by stop_critical_timings(), but could
easily have missed something.  But CONFIG_PROVE_RCU=y will normally yell
if something using RCU shows up.

Looks plausible, but clearly needs testing across the usual array of
configs and arches.

							Thanx, Paul

> -- Steve
> 
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 8f4390a..f5e6a64 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -88,12 +88,6 @@ static int cpuidle_idle_call(void)
>  	}
> 
>  	/*
> -	 * During the idle period, stop measuring the disabled irqs
> -	 * critical sections latencies
> -	 */
> -	stop_critical_timings();
> -
> -	/*
>  	 * Tell the RCU framework we are entering an idle section,
>  	 * so no more rcu read side critical sections and one more
>  	 * step to the grace period
> @@ -144,6 +138,12 @@ static int cpuidle_idle_call(void)
>  				trace_cpu_idle_rcuidle(next_state, dev->cpu);
> 
>  				/*
> +				 * During the idle period, stop measuring the
> +				 * disabled irqs critical sections latencies
> +				 */
> +				stop_critical_timings();
> +
> +				/*
>  				 * Enter the idle state previously
>  				 * returned by the governor
>  				 * decision. This function will block
> @@ -154,6 +154,8 @@ static int cpuidle_idle_call(void)
>  				entered_state = cpuidle_enter(drv, dev,
>  							      next_state);
> 
> +				start_critical_timings();
> +
>  				trace_cpu_idle_rcuidle(PWR_EVENT_EXIT,
>  						       dev->cpu);
> 
> @@ -175,8 +177,11 @@ static int cpuidle_idle_call(void)
>  	 * We can't use the cpuidle framework, let's use the default
>  	 * idle routine
>  	 */
> -	if (ret)
> +	if (ret) {
> +		stop_critical_timings();
>  		arch_cpu_idle();
> +		start_critical_timings();
> +	}
> 
>  	__current_set_polling();
> 
> @@ -188,7 +193,6 @@ static int cpuidle_idle_call(void)
>  		local_irq_enable();
> 
>  	rcu_idle_exit();
> -	start_critical_timings();
> 
>  	return 0;
>  }
> 



* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-28  0:11         ` Stephen Boyd
  2014-05-28  0:23           ` Stephen Boyd
@ 2014-05-28  0:42           ` Steven Rostedt
  2014-05-28  7:24             ` Peter Zijlstra
  1 sibling, 1 reply; 15+ messages in thread
From: Steven Rostedt @ 2014-05-28  0:42 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Arnd Bergmann, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	Peter Zijlstra

On Tue, 2014-05-27 at 17:11 -0700, Stephen Boyd wrote:

> cpuidle_enter_state() calls ktime_get() which on lockdep enabled builds
> calls seqcount_lockdep_reader_access() which calls local_irq_save() that

seqcount_lockdep_reader_access()?? Ug, I wonder if that should call
raw_local_irq_save/restore() as it's a lockdep helper to begin with. If
it's wrong then it's the lockdep infrastructure that broke, not the core
kernel.

Peter?

-- Steve


> then turns on the tracer again. Perhaps the problem is that irqsoff
> tracer is triggered even when we aren't transitioning between irqs on
> and irqs off? What about this patch? I assume there is a reason that
> this is wrong, but I don't know what it is.
> 




* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 23:30       ` Steven Rostedt
  2014-05-28  0:11         ` Stephen Boyd
  2014-05-28  0:25         ` Paul E. McKenney
@ 2014-05-28  6:40         ` Arnd Bergmann
  2014-05-28  6:44         ` Arnd Bergmann
  2014-05-28  7:28         ` Peter Zijlstra
  4 siblings, 0 replies; 15+ messages in thread
From: Arnd Bergmann @ 2014-05-28  6:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Stephen Boyd, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	Peter Zijlstra

On Tuesday 27 May 2014 19:30:50 Steven Rostedt wrote:
> On Tue, 27 May 2014 15:21:39 -0700
> Stephen Boyd <sboyd@codeaurora.org> wrote:
> 
> 
> > > Arnd brings up a good point. 
> > 
> > Hrm.. still not getting Arnd's mails.
> 
> Strange. What mail service do you have. Could they be blocking him?
> 

I think it's United Internet getting blocked by some blacklist again;
I have had bounces from arm.com addresses today.

	Arnd


* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 23:30       ` Steven Rostedt
                           ` (2 preceding siblings ...)
  2014-05-28  6:40         ` Arnd Bergmann
@ 2014-05-28  6:44         ` Arnd Bergmann
  2014-05-28  7:28         ` Peter Zijlstra
  4 siblings, 0 replies; 15+ messages in thread
From: Arnd Bergmann @ 2014-05-28  6:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Stephen Boyd, Frederic Weisbecker, Ingo Molnar, linux-kernel,
	Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	Peter Zijlstra

On Tuesday 27 May 2014 19:30:50 Steven Rostedt wrote:
> @@ -144,6 +138,12 @@ static int cpuidle_idle_call(void)
>                                 trace_cpu_idle_rcuidle(next_state, dev->cpu);
>  
>                                 /*
> +                                * During the idle period, stop measuring the
> +                                * disabled irqs critical sections latencies
> +                                */
> +                               stop_critical_timings();
> +
> +                               /*
>                                  * Enter the idle state previously
>                                  * returned by the governor
>                                  * decision. This function will block
> @@ -154,6 +154,8 @@ static int cpuidle_idle_call(void)
>                                 entered_state = cpuidle_enter(drv, dev,
>                                                               next_state);
>  
> +                               start_critical_timings();
> +
>                                 trace_cpu_idle_rcuidle(PWR_EVENT_EXIT,
>                                                        dev->cpu);
>  

cpuidle_enter() can have quite complex implementations in drivers/cpuidle/,
so it's quite possible we will have to push the
stop_critical_timings()/start_critical_timings() calls further down into
the individual drivers.
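
Just to illustrate what pushing it down would mean, each driver's
->enter() callback would end up looking something like this (a completely
made-up driver, with cpu_do_idle() standing in for whatever the driver
really does to sleep):

/* hypothetical example, not an existing driver */
static int foo_enter_idle(struct cpuidle_device *dev,
			  struct cpuidle_driver *drv, int index)
{
	stop_critical_timings();
	cpu_do_idle();			/* arch-specific wait-for-interrupt */
	start_critical_timings();

	return index;
}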

> @@ -175,8 +177,11 @@ static int cpuidle_idle_call(void)
>          * We can't use the cpuidle framework, let's use the default
>          * idle routine
>          */
> -       if (ret)
> +       if (ret) {
> +               stop_critical_timings();
>                 arch_cpu_idle();
> +               start_critical_timings();
> +       }
>  
>         __current_set_polling();
>  
> 

This one seems fine on all architectures I've looked at.

	Arnd


* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-28  0:42           ` Steven Rostedt
@ 2014-05-28  7:24             ` Peter Zijlstra
  2014-05-28 17:22               ` John Stultz
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2014-05-28  7:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Stephen Boyd, Arnd Bergmann, Frederic Weisbecker, Ingo Molnar,
	linux-kernel, Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	john.stultz


On Tue, May 27, 2014 at 08:42:14PM -0400, Steven Rostedt wrote:
> On Tue, 2014-05-27 at 17:11 -0700, Stephen Boyd wrote:
> 
> > cpuidle_enter_state() calls ktime_get() which on lockdep enabled builds
> > calls seqcount_lockdep_reader_access() which calls local_irq_save() that
> 
> seqcount_lockdep_reader_access()?? Ug, I wonder if that should call
> raw_local_irq_save/restore() as it's a lockdep helper to begin with. If
> it's wrong then it's the lockdep infrastructure that broke, not the core
> kernel.
> 
> Peter?

Hurm... don't know, actually. From a lockdep POV it doesn't need to
do that, and we can simply remove the local_irq_{save,restore}() from
that function.

It could be John did it to avoid some IRQ recursion warning, but if so,
he failed to mention it.

John, remember why you typed those characters?



* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-27 23:30       ` Steven Rostedt
                           ` (3 preceding siblings ...)
  2014-05-28  6:44         ` Arnd Bergmann
@ 2014-05-28  7:28         ` Peter Zijlstra
  2014-05-28 14:09           ` Steven Rostedt
  4 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2014-05-28  7:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Stephen Boyd, Arnd Bergmann, Frederic Weisbecker, Ingo Molnar,
	linux-kernel, Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	rjw


On Tue, May 27, 2014 at 07:30:50PM -0400, Steven Rostedt wrote:
> Paul, Peter, see anything wrong with this?

You mean, aside from that it won't actually apply due to that code
having been massively overhauled?

I suppose the same thing Arnd mentioned: the cpuidle_enter() thing can
be huge, but if we're going to have to push this down into all cpuidle
drivers we're not going to be happy.

So I think we're going to have to draw a line and impose restrictions on
what cpuidle_enter() implementations can and can not do.

> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 8f4390a..f5e6a64 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -88,12 +88,6 @@ static int cpuidle_idle_call(void)
>  	}
>  
>  	/*
> -	 * During the idle period, stop measuring the disabled irqs
> -	 * critical sections latencies
> -	 */
> -	stop_critical_timings();
> -
> -	/*
>  	 * Tell the RCU framework we are entering an idle section,
>  	 * so no more rcu read side critical sections and one more
>  	 * step to the grace period
> @@ -144,6 +138,12 @@ static int cpuidle_idle_call(void)
>  				trace_cpu_idle_rcuidle(next_state, dev->cpu);
>  
>  				/*
> +				 * During the idle period, stop measuring the
> +				 * disabled irqs critical sections latencies
> +				 */
> +				stop_critical_timings();
> +
> +				/*
>  				 * Enter the idle state previously
>  				 * returned by the governor
>  				 * decision. This function will block
> @@ -154,6 +154,8 @@ static int cpuidle_idle_call(void)
>  				entered_state = cpuidle_enter(drv, dev,
>  							      next_state);
>  
> +				start_critical_timings();
> +
>  				trace_cpu_idle_rcuidle(PWR_EVENT_EXIT,
>  						       dev->cpu);
>  





* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-28  7:28         ` Peter Zijlstra
@ 2014-05-28 14:09           ` Steven Rostedt
  0 siblings, 0 replies; 15+ messages in thread
From: Steven Rostedt @ 2014-05-28 14:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Stephen Boyd, Arnd Bergmann, Frederic Weisbecker, Ingo Molnar,
	linux-kernel, Corey Minyard, Stanislav Meduna, Paul E. McKenney,
	rjw

On Wed, 28 May 2014 09:28:52 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, May 27, 2014 at 07:30:50PM -0400, Steven Rostedt wrote:
> > Paul, Peter, see anything wrong with this?
> 
> You mean, aside from that it won't actually apply due to that code
> having been massively overhauled?
> 
> I suppose the same thing Arnd mentioned, the cpuidle_enter() thing can
> be huge, but is we're going to have to push this down into all cpuidle
> drivers we're not going to be happy.
> 

I believe that the stop/start_critical_timings() can nest. If not, I
can make them do so. Thus, we could start adding them in the depths of
the cpuidle drivers. If we miss one, we may not notice until someone
reports a large latency due to it.  And that is only if the driver
itself calls local_irq_save/restore().
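
If they do need to nest, the simplest form would be a per-cpu depth
count along these lines (sketch only, not existing code;
start_critical_timing()/stop_critical_timing() would then skip their work
whenever the depth is non-zero):

static DEFINE_PER_CPU(int, timings_stop_depth);

void stop_critical_timings(void)
{
	this_cpu_inc(timings_stop_depth);
}

void start_critical_timings(void)
{
	/* only the outermost start re-enables the irqsoff timing */
	WARN_ON_ONCE(this_cpu_read(timings_stop_depth) <= 0);
	this_cpu_dec(timings_stop_depth);
}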

-- Steve


* Re: [PATCH v2] tracing: Don't account for cpu idle time with irqsoff tracers
  2014-05-28  7:24             ` Peter Zijlstra
@ 2014-05-28 17:22               ` John Stultz
  0 siblings, 0 replies; 15+ messages in thread
From: John Stultz @ 2014-05-28 17:22 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: Stephen Boyd, Arnd Bergmann, Frederic Weisbecker, Ingo Molnar,
	linux-kernel, Corey Minyard, Stanislav Meduna, Paul E. McKenney

On 05/28/2014 12:24 AM, Peter Zijlstra wrote:
> On Tue, May 27, 2014 at 08:42:14PM -0400, Steven Rostedt wrote:
>> On Tue, 2014-05-27 at 17:11 -0700, Stephen Boyd wrote:
>>
>>> cpuidle_enter_state() calls ktime_get() which on lockdep enabled builds
>>> calls seqcount_lockdep_reader_access() which calls local_irq_save() that
>>
>> seqcount_lockdep_reader_access()?? Ug, I wonder if that should call
>> raw_local_irq_save/restore() as it's a lockdep helper to begin with. If
>> it's wrong then it's the lockdep infrastructure that broke, not the core
>> kernel.
>>
>> Peter?
>
> Hurm,.. don't know actually.. so from a lockdep pov it doesn't need to
> do that and we can simply remove the local_irq_{save,restore}() from
> that function.
>
> It could be John did it to avoid some IRQ recursion warning, but if so,
> he failed to mention it.
>
> John, remember why you typed those characters?

So.. With seqlocks, we're trying to just make sure reads and writes
don't nest under a write. However we don't care if a write nests in a
read, because the read will be restarted. An example is someone hitting
gettimeofday over and over taking a read, and then an IRQ lands mid-read
and we take the write and update the data. This is expected normal
behavior.  So this was trying to make the read side lockdep
acquire/release combo atomic, so we don't create false warnings if an IRQ
landed right in-between.
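
For reference, the helper in question looks roughly like this (quoting
include/linux/seqlock.h from memory, so treat it as a sketch):

static inline void seqcount_lockdep_reader_access(const seqcount_t *s)
{
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	unsigned long flags;
	seqcount_t *l = (seqcount_t *)s;

	/* keep the lockdep acquire/release pair atomic wrt interrupts */
	local_irq_save(flags);
	seqcount_acquire_read(&l->dep_map, 0, 0, _RET_IP_);
	seqcount_release(&l->dep_map, 1, _RET_IP_);
	local_irq_restore(flags);
#endif
}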

Does that make sense?

thanks
-john


