* [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
@ 2017-04-12 18:27 Steven Rostedt
  2017-04-12 18:32 ` Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Steven Rostedt @ 2017-04-12 18:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Andrew Morton, Paul E. McKenney, Thomas Gleixner

From: Steven Rostedt (VMware) <rostedt@goodmis.org>

I finally got around to creating trampolines for dynamically allocated
ftrace_ops by using synchronize_rcu_tasks(). For users of the ftrace
function hook callbacks, like perf, that allocate the ftrace_ops
descriptor via kmalloc() and friends, ftrace was not able to optimize
the functions being traced to use a trampoline, because the trampoline
would also need to be allocated dynamically. The problem is that a
dynamic trampoline cannot be freed when CONFIG_PREEMPT is set, as
there's no way to tell if a task was preempted while executing on it.
That was before Paul McKenney implemented synchronize_rcu_tasks(),
which makes sure that all tasks (except idle) have either voluntarily
scheduled out or entered user space.
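
The cleanup then boils down to something like this (a simplified,
untested sketch; free_dynamic_trampoline() is just a made-up name and
this is not the exact ftrace code):

static void free_dynamic_trampoline(struct ftrace_ops *ops)
{
	/*
	 * Wait until every task has either voluntarily scheduled out
	 * or entered user space, so that no task can still be running
	 * on the trampoline text.
	 */
	synchronize_rcu_tasks();

	/* Now it is safe to release the trampoline. */
	arch_ftrace_trampoline_free(ops);
}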

While testing this, I triggered this bug:

 BUG: unable to handle kernel paging request at ffffffffa0230077
 IP: 0xffffffffa0230077
 PGD 2414067 
 PUD 2415063 
 PMD c463c067 
 PTE 0
 
 Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
 Modules linked in: ip6table_filter ip6_tables snd_hda_codec_hdmi
  snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec
  snd_hwdep snd_hda_core x86_pkg_temp_thermal kvm_intel i915 snd_seq kvm
  snd_seq_device snd_pcm i2c_algo_bit snd_timer drm_kms_helper irqbypass
  syscopyarea sysfillrect sysimgblt fb_sys_fops drm i2c_i801 snd soundcore
  wmi i2c_core video e1000e ptp pps_core
 CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3-test+ #356
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 task: ffff8800cebbd0c0 task.stack: ffff8800cebd8000
 RIP: 0010:0xffffffffa0230077
 RSP: 0018:ffff8800cebdfd80 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff8800cebbd0c0 RCX: ffffffff8121391e
 RDX: dffffc0000000000 RSI: ffffffff81ce7c78 RDI: ffff8800cebbf298
 RBP: ffff8800cebdfe28 R08: 0000000000000003 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800cebbd0c0
 R13: ffffffff828fbe90 R14: ffff8800d392c480 R15: ffffffff826cc380
 FS:  0000000000000000(0000) GS:ffff8800d3900000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffffa0230077 CR3: 0000000002413000 CR4: 00000000001406e0
 Call Trace:
  ? debug_smp_processor_id+0x17/0x20
  ? sched_ttwu_pending+0x79/0x190
  ? schedule+0x5/0xe0
  ? trace_hardirqs_on_caller+0x182/0x280
  schedule+0x5/0xe0
  schedule_preempt_disabled+0x18/0x30
  ? schedule+0x5/0xe0
  ? schedule_preempt_disabled+0x18/0x30
  do_idle+0x172/0x220

What happened was that the idle task was preempted on the trampoline.
As synchronize_rcu_tasks() ignores the idle thread, there's nothing
that lets ftrace know that the idle task was preempted on a trampoline.

The idle task should never need to enable preemption. The idle task is
simply a loop that calls schedule() or places the CPU into idle mode.
In fact, having preemption enabled there is inefficient, because idle
can then be preempted just as it is about to call schedule() anyway,
causing schedule() to be called twice: once when the interrupt returns
to normal context, and again in the normal path of the idle loop, which
is pointless, as it has already scheduled.
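
To illustrate the window (current code, trimmed and annotated; this is
not part of the patch):

	/* do_idle() ends with: */
	sched_ttwu_pending();
	schedule_preempt_disabled();

	/* and schedule_preempt_disabled() is: */
	sched_preempt_enable_no_resched();
	/*
	 * Preemption is enabled here; an interrupt arriving now can
	 * preempt idle (possibly on a trampoline) and schedule, even
	 * though we are about to call schedule() anyway.
	 */
	schedule();
	preempt_disable();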

Adding a new function local to kernel/sched/ that allows idle to call
the scheduler without enabling preemption fixes the
synchronize_rcu_tasks() issue, and also removes the pointless spurious
schedules caused by interrupts arriving in the brief window where
preemption is enabled just before idle calls schedule().

Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
Changes from v1:

  o Added Thomas to Cc as he wrote the original generic idle code

  o Added comment to why schedule_idle() exits

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3b31fc0..0788286 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3502,6 +3502,23 @@ asmlinkage __visible void __sched schedule(void)
 }
 EXPORT_SYMBOL(schedule);
 
+/*
+ * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
+ * state (have scheduled out non-voluntarily) by making sure that all
+ * tasks have either left the run queue or have gone into user space.
+ * As idle tasks do not do either, they must not ever be preempted
+ * (schedule out non-voluntarily).
+ *
+ * schedule_idle() is similar to schedule_preempt_disable() except
+ * that it never enables preemption.
+ */
+void __sched schedule_idle(void)
+{
+	do {
+		__schedule(false);
+	} while (need_resched());
+}
+
 #ifdef CONFIG_CONTEXT_TRACKING
 asmlinkage __visible void __sched schedule_user(void)
 {
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index ac6d517..229c17e 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -264,7 +264,7 @@ static void do_idle(void)
 	smp_mb__after_atomic();
 
 	sched_ttwu_pending();
-	schedule_preempt_disabled();
+	schedule_idle();
 }
 
 bool cpu_in_idle(unsigned long pc)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 5cbf922..c5ee02b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1465,6 +1465,8 @@ static inline struct cpuidle_state *idle_get_state(struct rq *rq)
 }
 #endif
 
+extern void schedule_idle(void);
+
 extern void sysrq_sched_debug_show(void);
 extern void sched_init_granularity(void);
 extern void update_max_interval(void);


* Re: [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
  2017-04-12 18:27 [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption Steven Rostedt
@ 2017-04-12 18:32 ` Steven Rostedt
  2017-04-12 23:48 ` Paul E. McKenney
  2017-04-13  8:44 ` Peter Zijlstra
  2 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2017-04-12 18:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Andrew Morton, Paul E. McKenney, Thomas Gleixner

On Wed, 12 Apr 2017 14:27:44 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> Changes from v1:
> 
>   o Added Thomas to Cc as he wrote the original generic idle code
> 
>   o Added comment to why schedule_idle() exits

s/exits/exists/

-- Steve


* Re: [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
  2017-04-12 18:27 [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption Steven Rostedt
  2017-04-12 18:32 ` Steven Rostedt
@ 2017-04-12 23:48 ` Paul E. McKenney
  2017-04-13  0:40   ` Steven Rostedt
  2017-04-13  8:44 ` Peter Zijlstra
  2 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2017-04-12 23:48 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, LKML, Ingo Molnar, Andrew Morton, Thomas Gleixner

On Wed, Apr 12, 2017 at 02:27:44PM -0400, Steven Rostedt wrote:
> From: Steven Rostedt (VMware) <rostedt@goodmis.org>
> 
> I finally got around to creating trampolines for dynamically allocated
> ftrace_ops by using synchronize_rcu_tasks(). For users of the ftrace
> function hook callbacks, like perf, that allocate the ftrace_ops
> descriptor via kmalloc() and friends, ftrace was not able to optimize
> the functions being traced to use a trampoline, because the trampoline
> would also need to be allocated dynamically. The problem is that a
> dynamic trampoline cannot be freed when CONFIG_PREEMPT is set, as
> there's no way to tell if a task was preempted while executing on it.
> That was before Paul McKenney implemented synchronize_rcu_tasks(),
> which makes sure that all tasks (except idle) have either voluntarily
> scheduled out or entered user space.
> 
> While testing this, I triggered this bug:
> 
>  BUG: unable to handle kernel paging request at ffffffffa0230077
>  IP: 0xffffffffa0230077
>  PGD 2414067 
>  PUD 2415063 
>  PMD c463c067 
>  PTE 0
> 
>  Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
>  Modules linked in: ip6table_filter ip6_tables snd_hda_codec_hdmi
>   snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec
>   snd_hwdep snd_hda_core x86_pkg_temp_thermal kvm_intel i915 snd_seq kvm
>   snd_seq_device snd_pcm i2c_algo_bit snd_timer drm_kms_helper irqbypass
>   syscopyarea sysfillrect sysimgblt fb_sys_fops drm i2c_i801 snd soundcore
>   wmi i2c_core video e1000e ptp pps_core
>  CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.11.0-rc3-test+ #356
>  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
>  task: ffff8800cebbd0c0 task.stack: ffff8800cebd8000
>  RIP: 0010:0xffffffffa0230077
>  RSP: 0018:ffff8800cebdfd80 EFLAGS: 00010286
>  RAX: 0000000000000000 RBX: ffff8800cebbd0c0 RCX: ffffffff8121391e
>  RDX: dffffc0000000000 RSI: ffffffff81ce7c78 RDI: ffff8800cebbf298
>  RBP: ffff8800cebdfe28 R08: 0000000000000003 R09: 0000000000000000
>  R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800cebbd0c0
>  R13: ffffffff828fbe90 R14: ffff8800d392c480 R15: ffffffff826cc380
>  FS:  0000000000000000(0000) GS:ffff8800d3900000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: ffffffffa0230077 CR3: 0000000002413000 CR4: 00000000001406e0
>  Call Trace:
>   ? debug_smp_processor_id+0x17/0x20
>   ? sched_ttwu_pending+0x79/0x190
>   ? schedule+0x5/0xe0
>   ? trace_hardirqs_on_caller+0x182/0x280
>   schedule+0x5/0xe0
>   schedule_preempt_disabled+0x18/0x30
>   ? schedule+0x5/0xe0
>   ? schedule_preempt_disabled+0x18/0x30
>   do_idle+0x172/0x220
> 
> What happened was that the idle task was preempted on the trampoline.
> As synchronize_rcu_tasks() ignores the idle thread, there's nothing
> that lets ftrace know that the idle task was preempted on a trampoline.
> 
> The idle task should never need to enable preemption. The idle task is
> simply a loop that calls schedule() or places the CPU into idle mode.
> In fact, having preemption enabled there is inefficient, because idle
> can then be preempted just as it is about to call schedule() anyway,
> causing schedule() to be called twice: once when the interrupt returns
> to normal context, and again in the normal path of the idle loop, which
> is pointless, as it has already scheduled.
> 
> Adding a new function local to kernel/sched/ that allows idle to call
> the scheduler without enabling preemption fixes the
> synchronize_rcu_tasks() issue, and also removes the pointless spurious
> schedules caused by interrupts arriving in the brief window where
> preemption is enabled just before idle calls schedule().
> 
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

I don't understand why schedule_idle() needs a loop given that
schedule_preempt_disabled() doesn't have one, but other than
that, for whatever it is worth, it looks good to me.

							Thanx, Paul

> ---
> Changes from v1:
> 
>   o Added Thomas to Cc as he wrote the original generic idle code
> 
>   o Added comment to why schedule_idle() exits
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3b31fc0..0788286 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3502,6 +3502,23 @@ asmlinkage __visible void __sched schedule(void)
>  }
>  EXPORT_SYMBOL(schedule);
> 
> +/*
> + * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
> + * state (have scheduled out non-voluntarily) by making sure that all
> + * tasks have either left the run queue or have gone into user space.
> + * As idle tasks do not do either, they must not ever be preempted
> + * (schedule out non-voluntarily).
> + *
> + * schedule_idle() is similar to schedule_preempt_disable() except
> + * that it never enables preemption.
> + */
> +void __sched schedule_idle(void)
> +{
> +	do {
> +		__schedule(false);
> +	} while (need_resched());
> +}
> +
>  #ifdef CONFIG_CONTEXT_TRACKING
>  asmlinkage __visible void __sched schedule_user(void)
>  {
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index ac6d517..229c17e 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -264,7 +264,7 @@ static void do_idle(void)
>  	smp_mb__after_atomic();
> 
>  	sched_ttwu_pending();
> -	schedule_preempt_disabled();
> +	schedule_idle();
>  }
> 
>  bool cpu_in_idle(unsigned long pc)
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 5cbf922..c5ee02b 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1465,6 +1465,8 @@ static inline struct cpuidle_state *idle_get_state(struct rq *rq)
>  }
>  #endif
> 
> +extern void schedule_idle(void);
> +
>  extern void sysrq_sched_debug_show(void);
>  extern void sched_init_granularity(void);
>  extern void update_max_interval(void);
> 


* Re: [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
  2017-04-12 23:48 ` Paul E. McKenney
@ 2017-04-13  0:40   ` Steven Rostedt
  2017-04-13  0:50     ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2017-04-13  0:40 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, LKML, Ingo Molnar, Andrew Morton, Thomas Gleixner

On Wed, 12 Apr 2017 16:48:59 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> I don't understand why schedule_idle() needs a loop given that
> schedule_preempt_disabled() doesn't have one, but other than

But it does:

void __sched schedule_preempt_disabled(void)
{
	sched_preempt_enable_no_resched();
	schedule();
	preempt_disable();
}

Where we have:

asmlinkage __visible void __sched schedule(void)
{
	struct task_struct *tsk = current;

	sched_submit_work(tsk);
	do {
		preempt_disable();
		__schedule(false);
		sched_preempt_enable_no_resched();
	} while (need_resched());
}

Which makes:

void __sched schedule_preempt_disabled(void)
{
	struct task_struct *tsk = current;

	sched_preempt_enable_no_resched();
	sched_submit_work(tsk);
	do {
		preempt_disable();
		__schedule(false);
		sched_preempt_enable_no_resched();
	} while (need_resched());
	preempt_disable();
}

My schedule_idle() is simply schedule_preempt_disabled() without the
change in preemption and no call to sched_submit_work().


> that, for whatever it is worth, it looks good to me.
> 

Does this mean I can change the Cc: to Acked-by: ?

-- Steve


> > +void __sched schedule_idle(void)
> > +{
> > +	do {
> > +		__schedule(false);
> > +	} while (need_resched());
> > +}
> > +


* Re: [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
  2017-04-13  0:40   ` Steven Rostedt
@ 2017-04-13  0:50     ` Paul E. McKenney
  0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2017-04-13  0:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, LKML, Ingo Molnar, Andrew Morton, Thomas Gleixner

On Wed, Apr 12, 2017 at 08:40:00PM -0400, Steven Rostedt wrote:
> On Wed, 12 Apr 2017 16:48:59 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > I don't understand why schedule_idle() needs a loop given that
> > schedule_preempt_disabled() doesn't have one, but other than
> 
> But it does:
> 
> void __sched schedule_preempt_disabled(void)
> {
> 	sched_preempt_enable_no_resched();
> 	schedule();
> 	preempt_disable();
> }
> 
> Where we have:
> 
> asmlinkage __visible void __sched schedule(void)
> {
> 	struct task_struct *tsk = current;
> 
> 	sched_submit_work(tsk);
> 	do {
> 		preempt_disable();
> 		__schedule(false);
> 		sched_preempt_enable_no_resched();
> 	} while (need_resched());
> }
> 
> Which makes:
> 
> void __sched schedule_preempt_disabled(void)
> {
> 	struct task_struct *tsk = current;
> 
> 	sched_preempt_enable_no_resched();
> 	sched_submit_work(tsk);
> 	do {
> 		preempt_disable();
> 		__schedule(false);
> 		sched_preempt_enable_no_resched();
> 	} while (need_resched());
> 	preempt_disable();
> }
> 
> My schedule_idle() is simply schedule_preempt_disabled() without the
> change in preemption and no call to sched_submit_work().

Ah, thank you!

> > that, for whatever it is worth, it looks good to me.
> 
> Does this mean I can change the Cc: to Acked-by: ?

Works for me!

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> -- Steve
> 
> 
> > > +void __sched schedule_idle(void)
> > > +{
> > > +	do {
> > > +		__schedule(false);
> > > +	} while (need_resched());
> > > +}
> > > +
> 


* Re: [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
  2017-04-12 18:27 [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption Steven Rostedt
  2017-04-12 18:32 ` Steven Rostedt
  2017-04-12 23:48 ` Paul E. McKenney
@ 2017-04-13  8:44 ` Peter Zijlstra
  2017-04-13 13:48   ` Steven Rostedt
  2 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2017-04-13  8:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Ingo Molnar, Andrew Morton, Paul E. McKenney, Thomas Gleixner

On Wed, Apr 12, 2017 at 02:27:44PM -0400, Steven Rostedt wrote:
> + * schedule_idle() is similar to schedule_preempt_disable() except
> + * that it never enables preemption.

That's not right. The primary distinction is that it doesn't call
sched_submit_work().

And because that function is a no-op for the idle thread, the idle
thread can do without calling that and therefore avoid the preemption
window.

You also need a few words about fake idle threads, search play_idle()
callers.

You could also make schedule_idle() more robust by adding a WARN for the
blk_schedule_flush_plug() condition.
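
Something like this, perhaps (untested, just to illustrate the idea,
not a tested patch):

void __sched schedule_idle(void)
{
	/* Idle should never have plugged IO waiting to be flushed. */
	WARN_ON_ONCE(blk_needs_flush_plug(current));
	do {
		__schedule(false);
	} while (need_resched());
}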


Your Changelog is still entirely too long and rambling but fails to
mention the fundamentally important stuff :-(


* Re: [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
  2017-04-13  8:44 ` Peter Zijlstra
@ 2017-04-13 13:48   ` Steven Rostedt
  2017-04-13 14:38     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2017-04-13 13:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Andrew Morton, Paul E. McKenney, Thomas Gleixner

On Thu, 13 Apr 2017 10:44:53 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, Apr 12, 2017 at 02:27:44PM -0400, Steven Rostedt wrote:
> > + * schedule_idle() is similar to schedule_preempt_disable() except
> > + * that it never enables preemption.  
> 
> That's not right. The primary distinction is that it doesn't call
> sched_submit_work().

That has nothing to do with fixing synchronize_rcu_tasks(), which is
the entire point of my patch, thus it is *not* the primary distinction.
Keeping schedule from enabling preemption and calling functions is the
bug fix. Not calling sched_submit_work() is just an added optimization
benefit.

The point of the patch is to stop idle from enabling preemption,
because it doesn't need to, as sched_submit_work() is a nop for it.
I'll update my change log to mention that.

> 
> And because that function is a no-op for the idle thread, the idle
> thread can do without calling that and therefore avoid the preemption
> window.
> 
> You also need a few words about fake idle threads, search play_idle()
> callers.

Thanks, this is the first I heard of these. I'll go look at them.

> 
> You could also make schedule_idle() more robust by adding a WARN for the
> blk_schedule_flush_plug() condition.

Why? The call to schedule_preempt_disabled() never got that far when
coming from do_idle().

static inline void sched_submit_work(struct task_struct *tsk)
{
	if (!tsk->state || tsk_is_pi_blocked(tsk))
		return;
	/*
	 * If we are going to sleep and we have plugged IO queued,
	 * make sure to submit it to avoid deadlocks.
	 */
	if (blk_needs_flush_plug(tsk))
		blk_schedule_flush_plug(tsk);
}

Isn't tsk->state always zero for the idle task?

A better case would be WARN_ON(tsk->state)
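
I.e. something like this at the top of schedule_idle() (untested):

	/* Idle should always be TASK_RUNNING (state == 0) here. */
	WARN_ON_ONCE(current->state);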


> 
> 
> You Changelog is still entirely long and rambling but fails to mention
> the fundamental important stuff :-(


Remember, this patch is to fix a bug and not to optimize idle, although
that is an added benefit. The bug I am fixing, which is in linux-next
now, is that the idle thread breaks synchronize_rcu_tasks() when
calling schedule() with preemption enabled. That's what my ramblings in
the change log are talking about.

-- Steve


* Re: [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
  2017-04-13 13:48   ` Steven Rostedt
@ 2017-04-13 14:38     ` Peter Zijlstra
  2017-04-13 14:54       ` Steven Rostedt
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2017-04-13 14:38 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Ingo Molnar, Andrew Morton, Paul E. McKenney, Thomas Gleixner

On Thu, Apr 13, 2017 at 09:48:40AM -0400, Steven Rostedt wrote:
> On Thu, 13 Apr 2017 10:44:53 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Wed, Apr 12, 2017 at 02:27:44PM -0400, Steven Rostedt wrote:
> > > + * schedule_idle() is similar to schedule_preempt_disable() except
> > > + * that it never enables preemption.  
> > 
> > That's not right. The primary distinction is that it doesn't call
> > sched_submit_work().
> 
> That has nothing to do with fixing synchronize_rcu_tasks(), which is
> the entire point of my patch, thus it is *not* the primary distinction.

It is. The only reason you _can_ avoid that preemption window is by not
doing sched_submit_work().

The moment you need to call that, the preemption window comes back.


* Re: [PATCH v2] sched: Have do_idle() call __schedule() without enabling preemption
  2017-04-13 14:38     ` Peter Zijlstra
@ 2017-04-13 14:54       ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2017-04-13 14:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Andrew Morton, Paul E. McKenney, Thomas Gleixner

On Thu, 13 Apr 2017 16:38:07 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Apr 13, 2017 at 09:48:40AM -0400, Steven Rostedt wrote:
> > On Thu, 13 Apr 2017 10:44:53 +0200
> > Peter Zijlstra <peterz@infradead.org> wrote:
> >   
> > > On Wed, Apr 12, 2017 at 02:27:44PM -0400, Steven Rostedt wrote:  
> > > > + * schedule_idle() is similar to schedule_preempt_disable() except
> > > > + * that it never enables preemption.    
> > > 
> > > That's not right. The primary distinction is that it doesn't call
> > > sched_submit_work().  
> > 
> > That has nothing to do with fixing synchronize_rcu_tasks(), which is
> > the entire point of my patch, thus it is *not* the primary distinction.  
> 
> It is. The only reason you _can_ avoid that preemption window is by not
> doing sched_submit_work().

We are talking apples and oranges here. It's a distinction for the
generic schedule_preempt_disabled() call, where only some of the
callers can actually have that work to run; it is not the idle case,
for which it is a nop. But we are now bikeshedding the issue.

> 
> The moment you need to call that, the preemption window comes back.

I'll update my change log and comment and repost.

Thanks,

-- Steve

