* [patch 1/3] nohz: only wakeup a single target cpu when kicking a task
2021-01-28 20:21 [patch 0/3] nohz_full: only wakeup target CPUs when notifying new tick dependency (v5) Marcelo Tosatti
@ 2021-01-28 20:21 ` Marcelo Tosatti
2021-01-28 20:21 ` [patch 2/3] nohz: change signal tick dependency to wakeup CPUs of member tasks Marcelo Tosatti
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Marcelo Tosatti @ 2021-01-28 20:21 UTC (permalink / raw)
To: Peter Zijlstra, Frederic Weisbecker; +Cc: linux-kernel, Marcelo Tosatti
When adding a tick dependency to a task, its necessary to
wakeup the CPU where the task resides to reevaluate tick
dependencies on that CPU.
However the current code wakes up all nohz_full CPUs, which
is unnecessary.
Switch to waking up a single CPU, by using ordering of writes
to task->cpu and task->tick_dep_mask.
From: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Index: linux-2.6/kernel/time/tick-sched.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-sched.c
+++ linux-2.6/kernel/time/tick-sched.c
@@ -322,6 +322,31 @@ void tick_nohz_full_kick_cpu(int cpu)
irq_work_queue_on(&per_cpu(nohz_full_kick_work, cpu), cpu);
}
+static void tick_nohz_kick_task(struct task_struct *tsk)
+{
+ int cpu = task_cpu(tsk);
+
+ /*
+ * If the task concurrently migrates to another cpu,
+ * we guarantee it sees the new tick dependency upon
+ * schedule.
+ *
+ *
+ * set_task_cpu(p, cpu);
+ * STORE p->cpu = @cpu
+ * __schedule() (switch to task 'p')
+ * LOCK rq->lock
+ * smp_mb__after_spin_lock() STORE p->tick_dep_mask
+ * tick_nohz_task_switch() smp_mb() (atomic_fetch_or())
+ * LOAD p->tick_dep_mask LOAD p->cpu
+ */
+
+ preempt_disable();
+ if (cpu_online(cpu))
+ tick_nohz_full_kick_cpu(cpu);
+ preempt_enable();
+}
+
/*
* Kick all full dynticks CPUs in order to force these to re-evaluate
* their dependency on the tick and restart it if necessary.
@@ -404,19 +429,8 @@ EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_cp
*/
void tick_nohz_dep_set_task(struct task_struct *tsk, enum tick_dep_bits bit)
{
- if (!atomic_fetch_or(BIT(bit), &tsk->tick_dep_mask)) {
- if (tsk == current) {
- preempt_disable();
- tick_nohz_full_kick();
- preempt_enable();
- } else {
- /*
- * Some future tick_nohz_full_kick_task()
- * should optimize this.
- */
- tick_nohz_full_kick_all();
- }
- }
+ if (!atomic_fetch_or(BIT(bit), &tsk->tick_dep_mask))
+ tick_nohz_kick_task(tsk);
}
EXPORT_SYMBOL_GPL(tick_nohz_dep_set_task);
^ permalink raw reply [flat|nested] 9+ messages in thread
* [patch 2/3] nohz: change signal tick dependency to wakeup CPUs of member tasks
2021-01-28 20:21 [patch 0/3] nohz_full: only wakeup target CPUs when notifying new tick dependency (v5) Marcelo Tosatti
2021-01-28 20:21 ` [patch 1/3] nohz: only wakeup a single target cpu when kicking a task Marcelo Tosatti
@ 2021-01-28 20:21 ` Marcelo Tosatti
2021-02-12 12:25 ` Frederic Weisbecker
2021-01-28 20:21 ` [patch 3/3] nohz: tick_nohz_kick_task: only IPI if remote task is running Marcelo Tosatti
2021-02-12 14:19 ` [patch 0/3] nohz_full: only wakeup target CPUs when notifying new tick dependency (v5) Frederic Weisbecker
3 siblings, 1 reply; 9+ messages in thread
From: Marcelo Tosatti @ 2021-01-28 20:21 UTC (permalink / raw)
To: Peter Zijlstra, Frederic Weisbecker; +Cc: linux-kernel, Marcelo Tosatti
Rather than waking up all nohz_full CPUs on the system, only wakeup
the target CPUs of member threads of the signal.
Reduces interruptions to nohz_full CPUs.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Index: linux-2.6/kernel/time/tick-sched.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-sched.c
+++ linux-2.6/kernel/time/tick-sched.c
@@ -444,9 +444,20 @@ EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_ta
* Set a per-taskgroup tick dependency. Posix CPU timers need this in order to elapse
* per process timers.
*/
-void tick_nohz_dep_set_signal(struct signal_struct *sig, enum tick_dep_bits bit)
+void tick_nohz_dep_set_signal(struct task_struct *tsk,
+ enum tick_dep_bits bit)
{
- tick_nohz_dep_set_all(&sig->tick_dep_mask, bit);
+ int prev;
+ struct signal_struct *sig = tsk->signal;
+
+ prev = atomic_fetch_or(BIT(bit), &sig->tick_dep_mask);
+ if (!prev) {
+ struct task_struct *t;
+
+ lockdep_assert_held(&tsk->sighand->siglock);
+ __for_each_thread(sig, t)
+ tick_nohz_kick_task(t);
+ }
}
void tick_nohz_dep_clear_signal(struct signal_struct *sig, enum tick_dep_bits bit)
Index: linux-2.6/include/linux/tick.h
===================================================================
--- linux-2.6.orig/include/linux/tick.h
+++ linux-2.6/include/linux/tick.h
@@ -207,7 +207,7 @@ extern void tick_nohz_dep_set_task(struc
enum tick_dep_bits bit);
extern void tick_nohz_dep_clear_task(struct task_struct *tsk,
enum tick_dep_bits bit);
-extern void tick_nohz_dep_set_signal(struct signal_struct *signal,
+extern void tick_nohz_dep_set_signal(struct task_struct *tsk,
enum tick_dep_bits bit);
extern void tick_nohz_dep_clear_signal(struct signal_struct *signal,
enum tick_dep_bits bit);
@@ -252,11 +252,11 @@ static inline void tick_dep_clear_task(s
if (tick_nohz_full_enabled())
tick_nohz_dep_clear_task(tsk, bit);
}
-static inline void tick_dep_set_signal(struct signal_struct *signal,
+static inline void tick_dep_set_signal(struct task_struct *tsk,
enum tick_dep_bits bit)
{
if (tick_nohz_full_enabled())
- tick_nohz_dep_set_signal(signal, bit);
+ tick_nohz_dep_set_signal(tsk, bit);
}
static inline void tick_dep_clear_signal(struct signal_struct *signal,
enum tick_dep_bits bit)
@@ -284,7 +284,7 @@ static inline void tick_dep_set_task(str
enum tick_dep_bits bit) { }
static inline void tick_dep_clear_task(struct task_struct *tsk,
enum tick_dep_bits bit) { }
-static inline void tick_dep_set_signal(struct signal_struct *signal,
+static inline void tick_dep_set_signal(struct task_struct *tsk,
enum tick_dep_bits bit) { }
static inline void tick_dep_clear_signal(struct signal_struct *signal,
enum tick_dep_bits bit) { }
Index: linux-2.6/kernel/time/posix-cpu-timers.c
===================================================================
--- linux-2.6.orig/kernel/time/posix-cpu-timers.c
+++ linux-2.6/kernel/time/posix-cpu-timers.c
@@ -523,7 +523,7 @@ static void arm_timer(struct k_itimer *t
if (CPUCLOCK_PERTHREAD(timer->it_clock))
tick_dep_set_task(p, TICK_DEP_BIT_POSIX_TIMER);
else
- tick_dep_set_signal(p->signal, TICK_DEP_BIT_POSIX_TIMER);
+ tick_dep_set_signal(p, TICK_DEP_BIT_POSIX_TIMER);
}
/*
@@ -1358,7 +1358,7 @@ void set_process_cpu_timer(struct task_s
if (*newval < *nextevt)
*nextevt = *newval;
- tick_dep_set_signal(tsk->signal, TICK_DEP_BIT_POSIX_TIMER);
+ tick_dep_set_signal(tsk, TICK_DEP_BIT_POSIX_TIMER);
}
static int do_cpu_nanosleep(const clockid_t which_clock, int flags,
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 2/3] nohz: change signal tick dependency to wakeup CPUs of member tasks
2021-01-28 20:21 ` [patch 2/3] nohz: change signal tick dependency to wakeup CPUs of member tasks Marcelo Tosatti
@ 2021-02-12 12:25 ` Frederic Weisbecker
2021-02-12 14:00 ` Marcelo Tosatti
0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2021-02-12 12:25 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Peter Zijlstra, linux-kernel
On Thu, Jan 28, 2021 at 05:21:36PM -0300, Marcelo Tosatti wrote:
> Rather than waking up all nohz_full CPUs on the system, only wakeup
> the target CPUs of member threads of the signal.
>
> Reduces interruptions to nohz_full CPUs.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> Index: linux-2.6/kernel/time/tick-sched.c
> ===================================================================
> --- linux-2.6.orig/kernel/time/tick-sched.c
> +++ linux-2.6/kernel/time/tick-sched.c
> @@ -444,9 +444,20 @@ EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_ta
> * Set a per-taskgroup tick dependency. Posix CPU timers need this in order to elapse
> * per process timers.
> */
> -void tick_nohz_dep_set_signal(struct signal_struct *sig, enum tick_dep_bits
> bit)
Why not keeping the signal struct as a parameter?
Thanks.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 2/3] nohz: change signal tick dependency to wakeup CPUs of member tasks
2021-02-12 12:25 ` Frederic Weisbecker
@ 2021-02-12 14:00 ` Marcelo Tosatti
2021-02-12 14:18 ` Frederic Weisbecker
0 siblings, 1 reply; 9+ messages in thread
From: Marcelo Tosatti @ 2021-02-12 14:00 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Peter Zijlstra, linux-kernel
On Fri, Feb 12, 2021 at 01:25:21PM +0100, Frederic Weisbecker wrote:
> On Thu, Jan 28, 2021 at 05:21:36PM -0300, Marcelo Tosatti wrote:
> > Rather than waking up all nohz_full CPUs on the system, only wakeup
> > the target CPUs of member threads of the signal.
> >
> > Reduces interruptions to nohz_full CPUs.
> >
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> >
> > Index: linux-2.6/kernel/time/tick-sched.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/time/tick-sched.c
> > +++ linux-2.6/kernel/time/tick-sched.c
> > @@ -444,9 +444,20 @@ EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_ta
> > * Set a per-taskgroup tick dependency. Posix CPU timers need this in order to elapse
> > * per process timers.
> > */
> > -void tick_nohz_dep_set_signal(struct signal_struct *sig, enum tick_dep_bits
> > bit)
>
> Why not keeping the signal struct as a parameter?
>
> Thanks.
All callers use "struct signal_struct *sig = tsk->signal" as
signal parameter anyway...
Can change parameters to (task, signal, bit) if you prefer.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 2/3] nohz: change signal tick dependency to wakeup CPUs of member tasks
2021-02-12 14:00 ` Marcelo Tosatti
@ 2021-02-12 14:18 ` Frederic Weisbecker
0 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2021-02-12 14:18 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Peter Zijlstra, linux-kernel
On Fri, Feb 12, 2021 at 11:00:41AM -0300, Marcelo Tosatti wrote:
> On Fri, Feb 12, 2021 at 01:25:21PM +0100, Frederic Weisbecker wrote:
> > On Thu, Jan 28, 2021 at 05:21:36PM -0300, Marcelo Tosatti wrote:
> > > Rather than waking up all nohz_full CPUs on the system, only wakeup
> > > the target CPUs of member threads of the signal.
> > >
> > > Reduces interruptions to nohz_full CPUs.
> > >
> > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > >
> > > Index: linux-2.6/kernel/time/tick-sched.c
> > > ===================================================================
> > > --- linux-2.6.orig/kernel/time/tick-sched.c
> > > +++ linux-2.6/kernel/time/tick-sched.c
> > > @@ -444,9 +444,20 @@ EXPORT_SYMBOL_GPL(tick_nohz_dep_clear_ta
> > > * Set a per-taskgroup tick dependency. Posix CPU timers need this in order to elapse
> > > * per process timers.
> > > */
> > > -void tick_nohz_dep_set_signal(struct signal_struct *sig, enum tick_dep_bits
> > > bit)
> >
> > Why not keeping the signal struct as a parameter?
> >
> > Thanks.
>
> All callers use "struct signal_struct *sig = tsk->signal" as
> signal parameter anyway...
Sure, but that makes more sense with the function role and name.
>
> Can change parameters to (task, signal, bit) if you prefer.
That's ok I can do it, just wanted to make sure I wasn't missing
a reason behind it.
Thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread
* [patch 3/3] nohz: tick_nohz_kick_task: only IPI if remote task is running
2021-01-28 20:21 [patch 0/3] nohz_full: only wakeup target CPUs when notifying new tick dependency (v5) Marcelo Tosatti
2021-01-28 20:21 ` [patch 1/3] nohz: only wakeup a single target cpu when kicking a task Marcelo Tosatti
2021-01-28 20:21 ` [patch 2/3] nohz: change signal tick dependency to wakeup CPUs of member tasks Marcelo Tosatti
@ 2021-01-28 20:21 ` Marcelo Tosatti
2021-02-12 14:19 ` [patch 0/3] nohz_full: only wakeup target CPUs when notifying new tick dependency (v5) Frederic Weisbecker
3 siblings, 0 replies; 9+ messages in thread
From: Marcelo Tosatti @ 2021-01-28 20:21 UTC (permalink / raw)
To: Peter Zijlstra, Frederic Weisbecker; +Cc: linux-kernel, Marcelo Tosatti
If the task is not running, run_posix_cpu_timers has nothing
to elapsed, so spare IPI in that case.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Index: linux-2.6/kernel/sched/core.c
===================================================================
--- linux-2.6.orig/kernel/sched/core.c
+++ linux-2.6/kernel/sched/core.c
@@ -9182,3 +9182,9 @@ void call_trace_sched_update_nr_running(
{
trace_sched_update_nr_running_tp(rq, count);
}
+
+bool task_on_rq(struct task_struct *p)
+{
+ return p->on_rq == TASK_ON_RQ_QUEUED;
+}
+
Index: linux-2.6/include/linux/sched.h
===================================================================
--- linux-2.6.orig/include/linux/sched.h
+++ linux-2.6/include/linux/sched.h
@@ -232,6 +232,8 @@ extern void io_schedule_finish(int token
extern long io_schedule_timeout(long timeout);
extern void io_schedule(void);
+extern bool task_on_rq(struct task_struct *p);
+
/**
* struct prev_cputime - snapshot of system and user cputime
* @utime: time spent in user mode
Index: linux-2.6/kernel/time/tick-sched.c
===================================================================
--- linux-2.6.orig/kernel/time/tick-sched.c
+++ linux-2.6/kernel/time/tick-sched.c
@@ -324,8 +324,6 @@ void tick_nohz_full_kick_cpu(int cpu)
static void tick_nohz_kick_task(struct task_struct *tsk)
{
- int cpu = task_cpu(tsk);
-
/*
* If the task concurrently migrates to another cpu,
* we guarantee it sees the new tick dependency upon
@@ -340,6 +338,23 @@ static void tick_nohz_kick_task(struct t
* tick_nohz_task_switch() smp_mb() (atomic_fetch_or())
* LOAD p->tick_dep_mask LOAD p->cpu
*/
+ int cpu = task_cpu(tsk);
+
+ /*
+ * If the task is not running, run_posix_cpu_timers
+ * has nothing to elapsed, can spare IPI in that
+ * case.
+ *
+ * activate_task() STORE p->tick_dep_mask
+ * STORE p->task_on_rq
+ * __schedule() (switch to task 'p') smp_mb() (atomic_fetch_or())
+ * LOCK rq->lock LOAD p->task_on_rq
+ * smp_mb__after_spin_lock()
+ * tick_nohz_task_switch()
+ * LOAD p->tick_dep_mask
+ */
+ if (!task_on_rq(tsk))
+ return;
preempt_disable();
if (cpu_online(cpu))
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [patch 0/3] nohz_full: only wakeup target CPUs when notifying new tick dependency (v5)
2021-01-28 20:21 [patch 0/3] nohz_full: only wakeup target CPUs when notifying new tick dependency (v5) Marcelo Tosatti
` (2 preceding siblings ...)
2021-01-28 20:21 ` [patch 3/3] nohz: tick_nohz_kick_task: only IPI if remote task is running Marcelo Tosatti
@ 2021-02-12 14:19 ` Frederic Weisbecker
3 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2021-02-12 14:19 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Peter Zijlstra, linux-kernel
On Thu, Jan 28, 2021 at 05:21:34PM -0300, Marcelo Tosatti wrote:
> When enabling per-CPU posix timers, an IPI to nohz_full CPUs might be
> performed (to re-read the dependencies and possibly not re-enter
> nohz_full on a given CPU).
>
> A common case is for applications that run on nohz_full= CPUs
> to not use POSIX timers (eg DPDK). This patch changes the notification
> to only IPI the target CPUs where the task(s) whose tick dependencies
> are being updated are executing.
>
> This reduces interruptions to nohz_full= CPUs.
Looks good. I'm queueing the series if Peter doesn't object.
Thanks!
^ permalink raw reply [flat|nested] 9+ messages in thread