linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sched/nohz: Skip remote tick on idle task entirely
@ 2018-06-28 16:29 Frederic Weisbecker
  2018-07-02 12:44 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Frederic Weisbecker @ 2018-06-28 16:29 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Ingo Molnar, stable, Anna-Maria Gleixner,
	Jacek Tomaka

Some people have reported that the warning in sched_tick_remote()
occasionally triggers, especially in favour of some RCU-Torture
pressure:

	WARNING: CPU: 11 PID: 906 at kernel/sched/core.c:3138 sched_tick_remote+0xb6/0xc0
	Modules linked in:
	CPU: 11 PID: 906 Comm: kworker/u32:3 Not tainted 4.18.0-rc2+ #1
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
	Workqueue: events_unbound sched_tick_remote
	RIP: 0010:sched_tick_remote+0xb6/0xc0
	Code: e8 0f 06 b8 00 c6 03 00 fb eb 9d 8b 43 04 85 c0 75 8d 48 8b 83 e0 0a 00 00 48 85 c0 75 81 eb 88 48 89 df e8 bc fe ff ff eb aa <0f> 0b eb
	+c5 66 0f 1f 44 00 00 bf 17 00 00 00 e8 b6 2e fe ff 0f b6
	Call Trace:
	 process_one_work+0x1df/0x3b0
	 worker_thread+0x44/0x3d0
	 kthread+0xf3/0x130
	 ? set_worker_desc+0xb0/0xb0
	 ? kthread_create_worker_on_cpu+0x70/0x70
	 ret_from_fork+0x35/0x40

This happens when the remote tick applies on an idle task. Usually the
idle_cpu() check avoids that, but it is performed before we lock the
runqueue and it is therefore racy. It was intended to be that way in
order to prevent from useless runqueue locks since idle task tick
callback is a no-op.

Now if the racy check slips out of our hands and we end up remotely
ticking an idle task, the empty task_tick_idle() is harmless. Still
it won't pass the WARN_ON_ONCE() test that ensures rq_clock_task() is
not too far from curr->se.exec_start because update_curr_idle() doesn't
update the exec_start value like other scheduler policies. Hence the
reported false positive.

So let's have another check, while the rq is locked, to make sure we
don't remote tick on an idle task. The lockless idle_cpu() still applies
to avoid unecessary rq lock contention.

Reported-by: Jacek Tomaka <jacekt@dug.com>
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/sched/core.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8fac..da8f121 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3127,16 +3127,18 @@ static void sched_tick_remote(struct work_struct *work)
 		u64 delta;
 
 		rq_lock_irq(rq, &rf);
-		update_rq_clock(rq);
 		curr = rq->curr;
-		delta = rq_clock_task(rq) - curr->se.exec_start;
+		if (!is_idle_task(curr)) {
+			update_rq_clock(rq);
+			delta = rq_clock_task(rq) - curr->se.exec_start;
 
-		/*
-		 * Make sure the next tick runs within a reasonable
-		 * amount of time.
-		 */
-		WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
-		curr->sched_class->task_tick(rq, curr, 0);
+			/*
+			 * Make sure the next tick runs within a reasonable
+			 * amount of time.
+			 */
+			WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
+			curr->sched_class->task_tick(rq, curr, 0);
+		}
 		rq_unlock_irq(rq, &rf);
 	}
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] sched/nohz: Skip remote tick on idle task entirely
  2018-06-28 16:29 [PATCH] sched/nohz: Skip remote tick on idle task entirely Frederic Weisbecker
@ 2018-07-02 12:44 ` Peter Zijlstra
  2018-07-02 15:17 ` Paul E. McKenney
  2018-07-03  7:48 ` [tip:sched/core] " tip-bot for Frederic Weisbecker
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2018-07-02 12:44 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Thomas Gleixner, Paul E . McKenney, Ingo Molnar, stable,
	Anna-Maria Gleixner, Jacek Tomaka

On Thu, Jun 28, 2018 at 06:29:41PM +0200, Frederic Weisbecker wrote:
> ---
>  kernel/sched/core.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d8fac..da8f121 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3127,16 +3127,18 @@ static void sched_tick_remote(struct work_struct *work)
>  		u64 delta;
>  
>  		rq_lock_irq(rq, &rf);
> -		update_rq_clock(rq);
>  		curr = rq->curr;
> -		delta = rq_clock_task(rq) - curr->se.exec_start;
> +		if (!is_idle_task(curr)) {
> +			update_rq_clock(rq);
> +			delta = rq_clock_task(rq) - curr->se.exec_start;
>  
> -		/*
> -		 * Make sure the next tick runs within a reasonable
> -		 * amount of time.
> -		 */
> -		WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
> -		curr->sched_class->task_tick(rq, curr, 0);
> +			/*
> +			 * Make sure the next tick runs within a reasonable
> +			 * amount of time.
> +			 */
> +			WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
> +			curr->sched_class->task_tick(rq, curr, 0);
> +		}
>  		rq_unlock_irq(rq, &rf);
>  	}

Instead of deep indent, how about we write it like so: ?

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3113,7 +3113,9 @@ static void sched_tick_remote(struct wor
 	struct tick_work *twork = container_of(dwork, struct tick_work, work);
 	int cpu = twork->cpu;
 	struct rq *rq = cpu_rq(cpu);
+	struct task_struct *curr;
 	struct rq_flags rf;
+	u64 delta;
 
 	/*
 	 * Handle the tick only if it appears the remote CPU is running in full
@@ -3122,26 +3124,28 @@ static void sched_tick_remote(struct wor
 	 * statistics and checks timeslices in a time-independent way, regardless
 	 * of when exactly it is running.
 	 */
-	if (!idle_cpu(cpu) && tick_nohz_tick_stopped_cpu(cpu)) {
-		struct task_struct *curr;
-		u64 delta;
-
-		rq_lock_irq(rq, &rf);
-		curr = rq->curr;
-		if (!is_idle_task(curr)) {
-			update_rq_clock(rq);
-			delta = rq_clock_task(rq) - curr->se.exec_start;
-
-			/*
-			 * Make sure the next tick runs within a reasonable
-			 * amount of time.
-			 */
-			WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
-			curr->sched_class->task_tick(rq, curr, 0);
-		}
-		rq_unlock_irq(rq, &rf);
-	}
+	if (idle_cpu(cpu) || !tick_nohz_tick_stopped_cpu(cpu))
+		goto out_requeue;
 
+	rq_lock_irq(rq, &rf);
+	curr = rq->curr;
+	if (is_idle_task(curr))
+		goto out_unlock;
+
+	update_rq_clock(rq);
+	delta = rq_clock_task(rq) - curr->se.exec_start;
+
+	/*
+	 * Make sure the next tick runs within a reasonable
+	 * amount of time.
+	 */
+	WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
+	curr->sched_class->task_tick(rq, curr, 0);
+
+out_unlock:
+	rq_unlock_irq(rq, &rf);
+
+out_requeue:
 	/*
 	 * Run the remote tick once per second (1Hz). This arbitrary
 	 * frequency is large enough to avoid overload but short enough

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] sched/nohz: Skip remote tick on idle task entirely
  2018-06-28 16:29 [PATCH] sched/nohz: Skip remote tick on idle task entirely Frederic Weisbecker
  2018-07-02 12:44 ` Peter Zijlstra
@ 2018-07-02 15:17 ` Paul E. McKenney
  2018-07-03  7:48 ` [tip:sched/core] " tip-bot for Frederic Weisbecker
  2 siblings, 0 replies; 4+ messages in thread
From: Paul E. McKenney @ 2018-07-02 15:17 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, stable,
	Anna-Maria Gleixner, Jacek Tomaka

On Thu, Jun 28, 2018 at 06:29:41PM +0200, Frederic Weisbecker wrote:
> Some people have reported that the warning in sched_tick_remote()
> occasionally triggers, especially in favour of some RCU-Torture
> pressure:
> 
> 	WARNING: CPU: 11 PID: 906 at kernel/sched/core.c:3138 sched_tick_remote+0xb6/0xc0
> 	Modules linked in:
> 	CPU: 11 PID: 906 Comm: kworker/u32:3 Not tainted 4.18.0-rc2+ #1
> 	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> 	Workqueue: events_unbound sched_tick_remote
> 	RIP: 0010:sched_tick_remote+0xb6/0xc0
> 	Code: e8 0f 06 b8 00 c6 03 00 fb eb 9d 8b 43 04 85 c0 75 8d 48 8b 83 e0 0a 00 00 48 85 c0 75 81 eb 88 48 89 df e8 bc fe ff ff eb aa <0f> 0b eb
> 	+c5 66 0f 1f 44 00 00 bf 17 00 00 00 e8 b6 2e fe ff 0f b6
> 	Call Trace:
> 	 process_one_work+0x1df/0x3b0
> 	 worker_thread+0x44/0x3d0
> 	 kthread+0xf3/0x130
> 	 ? set_worker_desc+0xb0/0xb0
> 	 ? kthread_create_worker_on_cpu+0x70/0x70
> 	 ret_from_fork+0x35/0x40
> 
> This happens when the remote tick applies on an idle task. Usually the
> idle_cpu() check avoids that, but it is performed before we lock the
> runqueue and it is therefore racy. It was intended to be that way in
> order to prevent from useless runqueue locks since idle task tick
> callback is a no-op.
> 
> Now if the racy check slips out of our hands and we end up remotely
> ticking an idle task, the empty task_tick_idle() is harmless. Still
> it won't pass the WARN_ON_ONCE() test that ensures rq_clock_task() is
> not too far from curr->se.exec_start because update_curr_idle() doesn't
> update the exec_start value like other scheduler policies. Hence the
> reported false positive.
> 
> So let's have another check, while the rq is locked, to make sure we
> don't remote tick on an idle task. The lockless idle_cpu() still applies
> to avoid unecessary rq lock contention.
> 
> Reported-by: Jacek Tomaka <jacekt@dug.com>
> Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: stable@vger.kernel.org
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  kernel/sched/core.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d8fac..da8f121 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3127,16 +3127,18 @@ static void sched_tick_remote(struct work_struct *work)
>  		u64 delta;
> 
>  		rq_lock_irq(rq, &rf);
> -		update_rq_clock(rq);
>  		curr = rq->curr;
> -		delta = rq_clock_task(rq) - curr->se.exec_start;
> +		if (!is_idle_task(curr)) {
> +			update_rq_clock(rq);
> +			delta = rq_clock_task(rq) - curr->se.exec_start;
> 
> -		/*
> -		 * Make sure the next tick runs within a reasonable
> -		 * amount of time.
> -		 */
> -		WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
> -		curr->sched_class->task_tick(rq, curr, 0);
> +			/*
> +			 * Make sure the next tick runs within a reasonable
> +			 * amount of time.
> +			 */
> +			WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
> +			curr->sched_class->task_tick(rq, curr, 0);
> +		}
>  		rq_unlock_irq(rq, &rf);
>  	}
> 
> -- 
> 2.7.4
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [tip:sched/core] sched/nohz: Skip remote tick on idle task entirely
  2018-06-28 16:29 [PATCH] sched/nohz: Skip remote tick on idle task entirely Frederic Weisbecker
  2018-07-02 12:44 ` Peter Zijlstra
  2018-07-02 15:17 ` Paul E. McKenney
@ 2018-07-03  7:48 ` tip-bot for Frederic Weisbecker
  2 siblings, 0 replies; 4+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2018-07-03  7:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: paulmck, anna-maria, hpa, peterz, jacekt, frederic, tglx,
	torvalds, linux-kernel, mingo

Commit-ID:  d9c0ffcabd6aae7ff1e34e8078354c13bb9f1183
Gitweb:     https://git.kernel.org/tip/d9c0ffcabd6aae7ff1e34e8078354c13bb9f1183
Author:     Frederic Weisbecker <frederic@kernel.org>
AuthorDate: Thu, 28 Jun 2018 18:29:41 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 3 Jul 2018 09:17:28 +0200

sched/nohz: Skip remote tick on idle task entirely

Some people have reported that the warning in sched_tick_remote()
occasionally triggers, especially in favour of some RCU-Torture
pressure:

	WARNING: CPU: 11 PID: 906 at kernel/sched/core.c:3138 sched_tick_remote+0xb6/0xc0
	Modules linked in:
	CPU: 11 PID: 906 Comm: kworker/u32:3 Not tainted 4.18.0-rc2+ #1
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
	Workqueue: events_unbound sched_tick_remote
	RIP: 0010:sched_tick_remote+0xb6/0xc0
	Code: e8 0f 06 b8 00 c6 03 00 fb eb 9d 8b 43 04 85 c0 75 8d 48 8b 83 e0 0a 00 00 48 85 c0 75 81 eb 88 48 89 df e8 bc fe ff ff eb aa <0f> 0b eb
	+c5 66 0f 1f 44 00 00 bf 17 00 00 00 e8 b6 2e fe ff 0f b6
	Call Trace:
	 process_one_work+0x1df/0x3b0
	 worker_thread+0x44/0x3d0
	 kthread+0xf3/0x130
	 ? set_worker_desc+0xb0/0xb0
	 ? kthread_create_worker_on_cpu+0x70/0x70
	 ret_from_fork+0x35/0x40

This happens when the remote tick applies on an idle task. Usually the
idle_cpu() check avoids that, but it is performed before we lock the
runqueue and it is therefore racy. It was intended to be that way in
order to prevent from useless runqueue locks since idle task tick
callback is a no-op.

Now if the racy check slips out of our hands and we end up remotely
ticking an idle task, the empty task_tick_idle() is harmless. Still
it won't pass the WARN_ON_ONCE() test that ensures rq_clock_task() is
not too far from curr->se.exec_start because update_curr_idle() doesn't
update the exec_start value like other scheduler policies. Hence the
reported false positive.

So let's have another check, while the rq is locked, to make sure we
don't remote tick on an idle task. The lockless idle_cpu() still applies
to avoid unecessary rq lock contention.

Reported-by: Jacek Tomaka <jacekt@dug.com>
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1530203381-31234-1-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8facba456..22fce36426c0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3113,7 +3113,9 @@ static void sched_tick_remote(struct work_struct *work)
 	struct tick_work *twork = container_of(dwork, struct tick_work, work);
 	int cpu = twork->cpu;
 	struct rq *rq = cpu_rq(cpu);
+	struct task_struct *curr;
 	struct rq_flags rf;
+	u64 delta;
 
 	/*
 	 * Handle the tick only if it appears the remote CPU is running in full
@@ -3122,24 +3124,28 @@ static void sched_tick_remote(struct work_struct *work)
 	 * statistics and checks timeslices in a time-independent way, regardless
 	 * of when exactly it is running.
 	 */
-	if (!idle_cpu(cpu) && tick_nohz_tick_stopped_cpu(cpu)) {
-		struct task_struct *curr;
-		u64 delta;
+	if (idle_cpu(cpu) || !tick_nohz_tick_stopped_cpu(cpu))
+		goto out_requeue;
 
-		rq_lock_irq(rq, &rf);
-		update_rq_clock(rq);
-		curr = rq->curr;
-		delta = rq_clock_task(rq) - curr->se.exec_start;
+	rq_lock_irq(rq, &rf);
+	curr = rq->curr;
+	if (is_idle_task(curr))
+		goto out_unlock;
 
-		/*
-		 * Make sure the next tick runs within a reasonable
-		 * amount of time.
-		 */
-		WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
-		curr->sched_class->task_tick(rq, curr, 0);
-		rq_unlock_irq(rq, &rf);
-	}
+	update_rq_clock(rq);
+	delta = rq_clock_task(rq) - curr->se.exec_start;
+
+	/*
+	 * Make sure the next tick runs within a reasonable
+	 * amount of time.
+	 */
+	WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
+	curr->sched_class->task_tick(rq, curr, 0);
+
+out_unlock:
+	rq_unlock_irq(rq, &rf);
 
+out_requeue:
 	/*
 	 * Run the remote tick once per second (1Hz). This arbitrary
 	 * frequency is large enough to avoid overload but short enough

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-07-03  7:49 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-28 16:29 [PATCH] sched/nohz: Skip remote tick on idle task entirely Frederic Weisbecker
2018-07-02 12:44 ` Peter Zijlstra
2018-07-02 15:17 ` Paul E. McKenney
2018-07-03  7:48 ` [tip:sched/core] " tip-bot for Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).