All of lore.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI
@ 2014-04-03 16:17 Frederic Weisbecker
  2014-04-03 16:17 ` [PATCH 1/2] smp: Non busy-waiting IPI queue Frederic Weisbecker
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Frederic Weisbecker @ 2014-04-03 16:17 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Paul E. McKenney, Andrew Morton,
	Jens Axboe, Kevin Hilman, Peter Zijlstra

Ingo, Thomas,

Please pull the timers/nohz-ipi-for-tip-v3 branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	timers/nohz-ipi-for-tip-v3

It is based on 7a48837732f87a574ee3e1855927dc250117f565 
("Merge branch 'for-3.15/core' of git://git.kernel.dk/linux-block")
due to dependencies on the block tree containing IPI core changes.

This v2 addresses Paul review and add his reviewed-by tags.

I'm trying to get these 2 patches in before the merge window ends
otherwise we'll have to do tricky branch split on the next cycle.

---
When a full dynticks CPU runs in single task mode then a new task gets
enqueued, we notify it through an IPI such that it restarts its tick.

The IPI used here is the scheduler IPI. There are a few reasons for that:
it can be called when interrupts are disabled, it can be called
concurrently... These convenient properties altogether aren't yet offered
by the IPI subsystem.

Meanwhile, bloating that way the scheduler IPI with scheduler unrelated
code is an abuse of this fast path. We certainly don't want to start a
big kernel IPI.

So this patchset adds a small helper to the IPI subsystem that allows
to queue an IPI from interrupt disabled code while handling concurrent
callers as well. Eventually the nohz kick gets converted to this new facility.

Partly inspired by a suggestion from Peter Zijlstra.

* Patch 1/2 brings the IPI infrastructure to support this
* Patch 2/2 does the nohz IPI conversion


Thanks,
	Frederic
---

Frederic Weisbecker (2):
      smp: Non busy-waiting IPI queue
      nohz: Move full nohz kick to its own IPI


 include/linux/smp.h      | 11 +++++++++++
 include/linux/tick.h     |  2 ++
 kernel/sched/core.c      |  5 +----
 kernel/sched/sched.h     |  2 +-
 kernel/smp.c             | 42 ++++++++++++++++++++++++++++++++++++++++++
 kernel/time/tick-sched.c | 21 +++++++++++++++++++++
 6 files changed, 78 insertions(+), 5 deletions(-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] smp: Non busy-waiting IPI queue
  2014-04-03 16:17 [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
@ 2014-04-03 16:17 ` Frederic Weisbecker
  2014-04-03 16:17 ` [PATCH 2/2] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Frederic Weisbecker @ 2014-04-03 16:17 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Andrew Morton, Jens Axboe,
	Kevin Hilman, Paul E. McKenney, Peter Zijlstra

Some IPI users, such as the nohz subsystem, need to be able to send
an async IPI (async = non waiting for any other IPI completion) on
contexts with disabled interrupts. And we want the IPI subsystem to handle
concurrent calls by itself.

Currently the nohz machinery uses the scheduler IPI for this purpose
because it can be triggered from any context and doesn't need any
serialization from the caller. But this is an abuse of a scheduler
fast path. We are bloating it with a job that should use its own IPI.

The current set of IPI functions can't be called when interrupts are
disabled otherwise we risk a deadlock when two CPUs wait for each
other's IPI completion.

OTOH smp_call_function_single_async() can be called when interrupts
are disabled. But then it's up to the caller to serialize the given
IPI. This can't be called concurrently without special care.

So we need a version of the async IPI that takes care of concurrent
calls.

The proposed solution is to synchronize the IPI with a specific flag
that prevents the IPI from being sent if it is already pending but not
yet executed. Ordering is maintained such that, if the IPI is not sent
because it's already pending, we guarantee it will see the new state of
the data we expect it to when it will execute.

This model is close to the irq_work design. It's also partly inspired by
suggestions from Peter Zijlstra.

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/smp.h | 11 +++++++++++
 kernel/smp.c        | 42 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 633f5ed..0de1eff 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -29,6 +29,17 @@ extern unsigned int total_cpus;
 int smp_call_function_single(int cpuid, smp_call_func_t func, void *info,
 			     int wait);
 
+struct queue_single_data;
+typedef void (*smp_queue_func_t)(struct queue_single_data *qsd);
+
+struct queue_single_data {
+	struct call_single_data data;
+	smp_queue_func_t func;
+	int pending;
+};
+
+int smp_queue_function_single(int cpu, struct queue_single_data *qsd);
+
 /*
  * Call a function on all processors
  */
diff --git a/kernel/smp.c b/kernel/smp.c
index 06d574e..7589be5 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -265,6 +265,48 @@ int smp_call_function_single_async(int cpu, struct call_single_data *csd)
 }
 EXPORT_SYMBOL_GPL(smp_call_function_single_async);
 
+void generic_smp_queue_function_single_interrupt(void *info)
+{
+	struct queue_single_data *qsd = info;
+
+	WARN_ON_ONCE(xchg(&qsd->pending, 0) != 1);
+	qsd->func(qsd);
+}
+
+/**
+ * smp_queue_function_single - Queue an asynchronous function to run on a
+ * 				specific CPU unless it's already pending.
+ * @cpu: The CPU to run on.
+ * @qsd: Pre-allocated and setup data structure
+ *
+ * Like smp_call_function_single_async() but the call to the function is
+ * serialized and won't be queued if it is already pending. In the latter case,
+ * ordering is still guaranteed such that the pending call will see the new
+ * data we expect it to.
+ *
+ * This must not be called on offline CPUs.
+ *
+ * Returns 0 when function is successfully queued or already pending, else a
+ * negative status code.
+ */
+int smp_queue_function_single(int cpu, struct queue_single_data *qsd)
+{
+	int err;
+
+	if (cmpxchg(&qsd->pending, 0, 1))
+		return 0;
+
+	preempt_disable();
+	err = generic_exec_single(cpu, &qsd->data, generic_smp_queue_function_single_interrupt, qsd, 0);
+	preempt_enable();
+
+	/* Reset in case of error. This must not be called on offline CPUs */
+	if (err)
+		qsd->pending = 0;
+
+	return err;
+}
+
 /*
  * smp_call_function_any - Run a function on any of the given cpus
  * @mask: The mask of cpus it can run on.
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] nohz: Move full nohz kick to its own IPI
  2014-04-03 16:17 [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
  2014-04-03 16:17 ` [PATCH 1/2] smp: Non busy-waiting IPI queue Frederic Weisbecker
@ 2014-04-03 16:17 ` Frederic Weisbecker
  2014-04-09 15:28 ` [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
  2014-04-16  7:37 ` Ingo Molnar
  3 siblings, 0 replies; 8+ messages in thread
From: Frederic Weisbecker @ 2014-04-03 16:17 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Andrew Morton, Jens Axboe,
	Kevin Hilman, Paul E. McKenney, Peter Zijlstra

Now that we have smp_queue_function_single() which can be used to
safely queue IPIs when interrupts are disabled and without worrying
about concurrent callers, lets use it for the full dynticks kick to
notify a CPU that it's exiting single task mode.

This unbloats a bit the scheduler IPI that the nohz code was abusing
for its cool "callable anywhere/anytime" properties.

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/tick.h     |  2 ++
 kernel/sched/core.c      |  5 +----
 kernel/sched/sched.h     |  2 +-
 kernel/time/tick-sched.c | 21 +++++++++++++++++++++
 4 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index b84773c..9d3fcc2 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -182,6 +182,7 @@ static inline bool tick_nohz_full_cpu(int cpu)
 extern void tick_nohz_init(void);
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
+extern void tick_nohz_full_kick_cpu(int cpu);
 extern void tick_nohz_full_kick_all(void);
 extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
@@ -190,6 +191,7 @@ static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick(void) { }
+static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick_all(void) { }
 static inline void __tick_nohz_task_switch(struct task_struct *tsk) { }
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9cae286..e4b344e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1499,9 +1499,7 @@ void scheduler_ipi(void)
 	 */
 	preempt_fold_need_resched();
 
-	if (llist_empty(&this_rq()->wake_list)
-			&& !tick_nohz_full_cpu(smp_processor_id())
-			&& !got_nohz_idle_kick())
+	if (llist_empty(&this_rq()->wake_list) && !got_nohz_idle_kick())
 		return;
 
 	/*
@@ -1518,7 +1516,6 @@ void scheduler_ipi(void)
 	 * somewhat pessimize the simple resched case.
 	 */
 	irq_enter();
-	tick_nohz_full_check();
 	sched_ttwu_pending();
 
 	/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c9007f2..4771063 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1225,7 +1225,7 @@ static inline void inc_nr_running(struct rq *rq)
 		if (tick_nohz_full_cpu(rq->cpu)) {
 			/* Order rq->nr_running write against the IPI */
 			smp_wmb();
-			smp_send_reschedule(rq->cpu);
+			tick_nohz_full_kick_cpu(rq->cpu);
 		}
        }
 #endif
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9f8af69..582d3f6 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -230,6 +230,27 @@ void tick_nohz_full_kick(void)
 		irq_work_queue(&__get_cpu_var(nohz_full_kick_work));
 }
 
+static void nohz_full_kick_queue(struct queue_single_data *qsd)
+{
+	__tick_nohz_full_check();
+}
+
+static DEFINE_PER_CPU(struct queue_single_data, nohz_full_kick_qsd) = {
+	.func = nohz_full_kick_queue,
+};
+
+void tick_nohz_full_kick_cpu(int cpu)
+{
+	if (!tick_nohz_full_cpu(cpu))
+		return;
+
+	if (cpu == smp_processor_id()) {
+		irq_work_queue(&__get_cpu_var(nohz_full_kick_work));
+	} else {
+		smp_queue_function_single(cpu, &per_cpu(nohz_full_kick_qsd, cpu));
+	}
+}
+
 static void nohz_full_kick_ipi(void *info)
 {
 	__tick_nohz_full_check();
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI
  2014-04-03 16:17 [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
  2014-04-03 16:17 ` [PATCH 1/2] smp: Non busy-waiting IPI queue Frederic Weisbecker
  2014-04-03 16:17 ` [PATCH 2/2] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
@ 2014-04-09 15:28 ` Frederic Weisbecker
  2014-04-14 11:22   ` Ingo Molnar
  2014-04-16  7:37 ` Ingo Molnar
  3 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2014-04-09 15:28 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner
  Cc: LKML, Paul E. McKenney, Andrew Morton, Jens Axboe, Kevin Hilman,
	Peter Zijlstra

On Thu, Apr 03, 2014 at 06:17:10PM +0200, Frederic Weisbecker wrote:
> Ingo, Thomas,
> 
> Please pull the timers/nohz-ipi-for-tip-v3 branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> 	timers/nohz-ipi-for-tip-v3

Ping?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI
  2014-04-09 15:28 ` [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
@ 2014-04-14 11:22   ` Ingo Molnar
  2014-04-14 23:04     ` Frederic Weisbecker
  0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2014-04-14 11:22 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, LKML, Paul E. McKenney, Andrew Morton,
	Jens Axboe, Kevin Hilman, Peter Zijlstra


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Thu, Apr 03, 2014 at 06:17:10PM +0200, Frederic Weisbecker wrote:
> > Ingo, Thomas,
> > 
> > Please pull the timers/nohz-ipi-for-tip-v3 branch that can be found at:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > 	timers/nohz-ipi-for-tip-v3
> 
> Ping?

So this came a bit late - will it be fine for v3.16 as-is?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI
  2014-04-14 11:22   ` Ingo Molnar
@ 2014-04-14 23:04     ` Frederic Weisbecker
  2014-04-15  6:05       ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2014-04-14 23:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, LKML, Paul E. McKenney, Andrew Morton,
	Jens Axboe, Kevin Hilman, Peter Zijlstra

On Mon, Apr 14, 2014 at 01:22:17PM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > On Thu, Apr 03, 2014 at 06:17:10PM +0200, Frederic Weisbecker wrote:
> > > Ingo, Thomas,
> > > 
> > > Please pull the timers/nohz-ipi-for-tip-v3 branch that can be found at:
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > > 	timers/nohz-ipi-for-tip-v3
> > 
> > Ping?
> 
> So this came a bit late - will it be fine for v3.16 as-is?

It was late on purpose due to dependencies on the block tree. But nevermind.

It's still fine to be pulled as is after -rc1 for 3.16.

Thanks!
 
> Thanks,
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI
  2014-04-14 23:04     ` Frederic Weisbecker
@ 2014-04-15  6:05       ` Ingo Molnar
  0 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2014-04-15  6:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, LKML, Paul E. McKenney, Andrew Morton,
	Jens Axboe, Kevin Hilman, Peter Zijlstra


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Mon, Apr 14, 2014 at 01:22:17PM +0200, Ingo Molnar wrote:
> > 
> > * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > 
> > > On Thu, Apr 03, 2014 at 06:17:10PM +0200, Frederic Weisbecker wrote:
> > > > Ingo, Thomas,
> > > > 
> > > > Please pull the timers/nohz-ipi-for-tip-v3 branch that can be found at:
> > > > 
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > > > 	timers/nohz-ipi-for-tip-v3
> > > 
> > > Ping?
> > 
> > So this came a bit late - will it be fine for v3.16 as-is?
> 
> It was late on purpose due to dependencies on the block tree. [...]

Well, changing the generic SMP IPI facilities via the block tree was 
indeed a problem.

> But nevermind.
> 
> It's still fine to be pulled as is after -rc1 for 3.16.

Ok, will do.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI
  2014-04-03 16:17 [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2014-04-09 15:28 ` [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
@ 2014-04-16  7:37 ` Ingo Molnar
  3 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2014-04-16  7:37 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, LKML, Paul E. McKenney, Andrew Morton,
	Jens Axboe, Kevin Hilman, Peter Zijlstra


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> Ingo, Thomas,
> 
> Please pull the timers/nohz-ipi-for-tip-v3 branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> 	timers/nohz-ipi-for-tip-v3
> 
> It is based on 7a48837732f87a574ee3e1855927dc250117f565 
> ("Merge branch 'for-3.15/core' of git://git.kernel.dk/linux-block")
> due to dependencies on the block tree containing IPI core changes.
> 
> This v2 addresses Paul review and add his reviewed-by tags.
> 
> I'm trying to get these 2 patches in before the merge window ends
> otherwise we'll have to do tricky branch split on the next cycle.
> 
> ---
> When a full dynticks CPU runs in single task mode then a new task gets
> enqueued, we notify it through an IPI such that it restarts its tick.
> 
> The IPI used here is the scheduler IPI. There are a few reasons for that:
> it can be called when interrupts are disabled, it can be called
> concurrently... These convenient properties altogether aren't yet offered
> by the IPI subsystem.
> 
> Meanwhile, bloating that way the scheduler IPI with scheduler unrelated
> code is an abuse of this fast path. We certainly don't want to start a
> big kernel IPI.
> 
> So this patchset adds a small helper to the IPI subsystem that allows
> to queue an IPI from interrupt disabled code while handling concurrent
> callers as well. Eventually the nohz kick gets converted to this new facility.
> 
> Partly inspired by a suggestion from Peter Zijlstra.
> 
> * Patch 1/2 brings the IPI infrastructure to support this
> * Patch 2/2 does the nohz IPI conversion
> 
> 
> Thanks,
> 	Frederic
> ---
> 
> Frederic Weisbecker (2):
>       smp: Non busy-waiting IPI queue
>       nohz: Move full nohz kick to its own IPI
> 
> 
>  include/linux/smp.h      | 11 +++++++++++
>  include/linux/tick.h     |  2 ++
>  kernel/sched/core.c      |  5 +----
>  kernel/sched/sched.h     |  2 +-
>  kernel/smp.c             | 42 ++++++++++++++++++++++++++++++++++++++++++
>  kernel/time/tick-sched.c | 21 +++++++++++++++++++++
>  6 files changed, 78 insertions(+), 5 deletions(-)

Pulled into tip:timers/nohz, thanks a lot Frederic!

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-04-16  7:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-03 16:17 [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
2014-04-03 16:17 ` [PATCH 1/2] smp: Non busy-waiting IPI queue Frederic Weisbecker
2014-04-03 16:17 ` [PATCH 2/2] nohz: Move full nohz kick to its own IPI Frederic Weisbecker
2014-04-09 15:28 ` [GIT PULL v2] nohz: Move nohz kick out of scheduler IPI Frederic Weisbecker
2014-04-14 11:22   ` Ingo Molnar
2014-04-14 23:04     ` Frederic Weisbecker
2014-04-15  6:05       ` Ingo Molnar
2014-04-16  7:37 ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.