* [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull
@ 2017-12-02 18:04 Steven Rostedt
  2017-12-04  7:45 ` Juri Lelli
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Steven Rostedt @ 2017-12-02 18:04 UTC (permalink / raw)
  To: LKML, linux-rt-users
  Cc: Ingo Molnar, Peter Zijlstra, Sebastian Andrzej Siewior,
	Daniel Wagner, Thomas Gleixner

Daniel Wagner reported a crash on the beaglebone black. This is a
single CPU architecture, and does not have a functional:
arch_send_call_function_single_ipi() and can crash if that is called.

As it only has one CPU, it shouldn't be called, but if the kernel is
compiled for SMP, the push/pull RT scheduling logic now calls it for
irq_work if the one CPU is overloaded, it can use that function to call
itself and crash the kernel.

Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
only has a single CPU. But SCHED_FEAT is a constant if sched debugging
is turned off. Another fix can also be used, and this should also help
with normal SMP machines. That is, do not initiate the pull code if
there's only one RT overloaded CPU, and that CPU happens to be the
current CPU that is scheduling in a lower priority task.

Even on a system with many CPUs, if there's many RT tasks waiting to
run on a single CPU, and that CPU schedules in another RT task of lower
priority, it will initiate the PULL logic in case there's a higher
priority RT task on another CPU that is waiting to run. But if there is
no other CPU with waiting RT tasks, it will initiate the RT pull logic
on itself (as it still has RT tasks waiting to run). This is a wasted
effort.

Not only does this help with SMP code where the current CPU is the only
one with RT overloaded tasks, it should also solve the issue that
Daniel encountered, because it will prevent the PULL logic from
executing, as there's only one CPU on the system, and the check added
here will cause it to exit the RT pull code.

Link: http://lkml.kernel.org/r/8c913cc2-b2e3-8c2e-e503-aff1428f8ff5@monom.org
Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
Cc: stable@vger.kernel.org
Reported-by: Daniel Wagner <wagi@monom.org>
---
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 4056c19ca3f0..665ace2fc558 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2034,8 +2034,9 @@ static void pull_rt_task(struct rq *this_rq)
 	bool resched = false;
 	struct task_struct *p;
 	struct rq *src_rq;
+	int rt_overload_count = rt_overloaded(this_rq);
 
-	if (likely(!rt_overloaded(this_rq)))
+	if (likely(!rt_overload_count))
 		return;
 
 	/*
@@ -2044,6 +2045,11 @@ static void pull_rt_task(struct rq *this_rq)
 	 */
 	smp_rmb();
 
+	/* If we are the only overloaded CPU do nothing */
+	if (rt_overload_count == 1 &&
+	    cpumask_test_cpu(this_rq->cpu, this_rq->rd->rto_mask))
+		return;
+
 #ifdef HAVE_RT_PUSH_IPI
 	if (sched_feat(RT_PUSH_IPI)) {
 		tell_cpu_to_push(this_rq);

* Re: [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull
  2017-12-02 18:04 [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull Steven Rostedt
@ 2017-12-04  7:45 ` Juri Lelli
  2017-12-04  8:09   ` Steven Rostedt
  2017-12-04 10:29 ` Daniel Wagner
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Juri Lelli @ 2017-12-04  7:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, linux-rt-users, Ingo Molnar, Peter Zijlstra,
	Sebastian Andrzej Siewior, Daniel Wagner, Thomas Gleixner

Hi Steve,

On 02/12/17 13:04, Steven Rostedt wrote:
> Daniel Wagner reported a crash on the beaglebone black. This is a
> single CPU architecture, and does not have a functional:
> arch_send_call_function_single_ipi() and can crash if that is called.
> 
> As it only has one CPU, it shouldn't be called, but if the kernel is
> compiled for SMP, the push/pull RT scheduling logic now calls it for
> irq_work if the one CPU is overloaded, it can use that function to call
> itself and crash the kernel.
> 
> Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
> only has a single CPU. But SCHED_FEAT is a constant if sched debugging
> is turned off. Another fix can also be used, and this should also help
> with normal SMP machines. That is, do not initiate the pull code if
> there's only one RT overloaded CPU, and that CPU happens to be the
> current CPU that is scheduling in a lower priority task.
> 
> Even on a system with many CPUs, if there's many RT tasks waiting to
> run on a single CPU, and that CPU schedules in another RT task of lower
> priority, it will initiate the PULL logic in case there's a higher
> priority RT task on another CPU that is waiting to run. But if there is
> no other CPU with waiting RT tasks, it will initiate the RT pull logic
> on itself (as it still has RT tasks waiting to run). This is a wasted
> effort.
> 
> Not only does this help with SMP code where the current CPU is the only
> one with RT overloaded tasks, it should also solve the issue that
> Daniel encountered, because it will prevent the PULL logic from
> executing, as there's only one CPU on the system, and the check added
> here will cause it to exit the RT pull code.
> 
> Link: http://lkml.kernel.org/r/8c913cc2-b2e3-8c2e-e503-aff1428f8ff5@monom.org
> Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
> Cc: stable@vger.kernel.org
> Reported-by: Daniel Wagner <wagi@monom.org>
> ---
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 4056c19ca3f0..665ace2fc558 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2034,8 +2034,9 @@ static void pull_rt_task(struct rq *this_rq)
>  	bool resched = false;
>  	struct task_struct *p;
>  	struct rq *src_rq;
> +	int rt_overload_count = rt_overloaded(this_rq);
>  
> -	if (likely(!rt_overloaded(this_rq)))
> +	if (likely(!rt_overload_count))
>  		return;
>  
>  	/*
> @@ -2044,6 +2045,11 @@ static void pull_rt_task(struct rq *this_rq)
>  	 */
>  	smp_rmb();
>  
> +	/* If we are the only overloaded CPU do nothing */
> +	if (rt_overload_count == 1 &&
> +	    cpumask_test_cpu(this_rq->cpu, this_rq->rd->rto_mask))
> +		return;
> +

Right. I was wondering however if for the truly UP case we shouldn't be
initiating/queueing callbacks (pull/push) at all?

DEADLINE doesn't use (yet?) the PUSH_IPI, but we will need a similar
patch to keep logics aligned.

Best,

Juri

* Re: [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull
  2017-12-04  7:45 ` Juri Lelli
@ 2017-12-04  8:09   ` Steven Rostedt
  2017-12-04  9:07     ` Juri Lelli
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2017-12-04  8:09 UTC (permalink / raw)
  To: Juri Lelli
  Cc: LKML, linux-rt-users, Ingo Molnar, Peter Zijlstra,
	Sebastian Andrzej Siewior, Daniel Wagner, Thomas Gleixner

On Mon, 4 Dec 2017 08:45:17 +0100
Juri Lelli <juri.lelli@redhat.com> wrote:

> Right. I was wondering however if for the truly UP case we shouldn't be
> initiating/queueing callbacks (pull/push) at all?

If !CONFIG_SMP then it's not compiled in. The issue came up when Daniel
ran a CONFIG_SMP kernel on an arch that only supports UP.

> 
> DEADLINE doesn't use (yet?) the PUSH_IPI, but we will need a similar
> patch to keep logics aligned.

Maybe.

-- Steve

* Re: [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull
  2017-12-04  8:09   ` Steven Rostedt
@ 2017-12-04  9:07     ` Juri Lelli
  2017-12-04  9:55       ` Steven Rostedt
  0 siblings, 1 reply; 10+ messages in thread
From: Juri Lelli @ 2017-12-04  9:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, linux-rt-users, Ingo Molnar, Peter Zijlstra,
	Sebastian Andrzej Siewior, Daniel Wagner, Thomas Gleixner

On 04/12/17 03:09, Steven Rostedt wrote:
> On Mon, 4 Dec 2017 08:45:17 +0100
> Juri Lelli <juri.lelli@redhat.com> wrote:
> 
> > Right. I was wondering however if for the truly UP case we shouldn't be
> > initiating/queueing callbacks (pull/push) at all?
> 
> If !CONFIG_SMP then it's not compiled in. The issue came up when Daniel
> ran a CONFIG_SMP kernel on an arch that only supports UP.
> 

Right, sorry. I meant num_online_cpus() == 1.

Best,

Juri

* Re: [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull
  2017-12-04  9:07     ` Juri Lelli
@ 2017-12-04  9:55       ` Steven Rostedt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2017-12-04  9:55 UTC (permalink / raw)
  To: Juri Lelli
  Cc: LKML, linux-rt-users, Ingo Molnar, Peter Zijlstra,
	Sebastian Andrzej Siewior, Daniel Wagner, Thomas Gleixner

On Mon, 4 Dec 2017 10:07:57 +0100
Juri Lelli <juri.lelli@redhat.com> wrote:

> On 04/12/17 03:09, Steven Rostedt wrote:
> > On Mon, 4 Dec 2017 08:45:17 +0100
> > Juri Lelli <juri.lelli@redhat.com> wrote:
> >   
> > > Right. I was wondering however if for the truly UP case we shouldn't be
> > > initiating/queueing callbacks (pull/push) at all?  
> > 
> > If !CONFIG_SMP then it's not compiled in. The issue came up when Daniel
> > ran a CONFIG_SMP kernel on an arch that only supports UP.
> >   
> 
> Right, sorry. I meant num_online_cpus() == 1.
>

Correct. But we need to disable the push/pull when CPUs go down to 1,
or if we see "num_possible_cpus() == 1" at boot up. It would need
to be re-enabled when CPUs are onlined and count goes greater than
one. Which we could also add, and I started going that route first. My
first patch had that check at each push/pull, but num_online_cpus() is
a weight of the cpumask, and for machines with more than 64 CPUs,
calculating that number becomes a bigger task and we want to keep that
out of the scheduler fast path, which push/pull logic happens to be in.

When looking at changing this code, I realized that rt_overloaded()
returns the count of overloaded CPUs, and the check to see if the
current CPU is overloaded is a single bit check of a cpumask (all very
quick). This not only fixes the issue with what Daniel found, but also
can help in certain cases on large CPU count machines.

-- Steve
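
The cost argument above can be illustrated with a rough userspace sketch.
This is not kernel code: the `model_` names, the 256-CPU size, and the
64-bit word layout are assumptions made purely for illustration. It only
shows why a mask weight has to visit every word of the mask, while a
membership test reads a single word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only so the mask spans several words. */
#define MODEL_NR_CPUS	256
#define MODEL_WORD_BITS	64
#define MODEL_WORDS	(MODEL_NR_CPUS / MODEL_WORD_BITS)

/* Simplified stand-in for a CPU mask: one bit per CPU. */
struct model_cpumask {
	uint64_t bits[MODEL_WORDS];
};

static void model_cpumask_set(struct model_cpumask *m, unsigned int cpu)
{
	m->bits[cpu / MODEL_WORD_BITS] |= (uint64_t)1 << (cpu % MODEL_WORD_BITS);
}

/* Weight of the mask: the loop touches every word, so cost grows with CPU count. */
static unsigned int model_cpumask_weight(const struct model_cpumask *m)
{
	unsigned int weight = 0;

	for (size_t i = 0; i < MODEL_WORDS; i++) {
		uint64_t w = m->bits[i];

		/* Clear the lowest set bit until the word is empty. */
		while (w) {
			w &= w - 1;
			weight++;
		}
	}
	return weight;
}

/* Membership test: reads exactly one word regardless of CPU count. */
static bool model_cpumask_test(const struct model_cpumask *m, unsigned int cpu)
{
	return (m->bits[cpu / MODEL_WORD_BITS] >> (cpu % MODEL_WORD_BITS)) & 1;
}

/* Small self-check showing how the helpers are meant to be used. */
static void model_demo(void)
{
	struct model_cpumask m = { { 0 } };

	model_cpumask_set(&m, 3);
	model_cpumask_set(&m, 200);
	assert(model_cpumask_weight(&m) == 2);
	assert(model_cpumask_test(&m, 200));
	assert(!model_cpumask_test(&m, 4));
}
```

In this model the weight walks all four words even when only two bits are
set, which is the kind of work the message argues should stay out of the
push/pull fast path.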

* Re: [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull
  2017-12-02 18:04 [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull Steven Rostedt
  2017-12-04  7:45 ` Juri Lelli
@ 2017-12-04 10:29 ` Daniel Wagner
  2017-12-11 16:47 ` Ingo Molnar
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Daniel Wagner @ 2017-12-04 10:29 UTC (permalink / raw)
  To: Steven Rostedt, LKML, linux-rt-users
  Cc: Ingo Molnar, Peter Zijlstra, Sebastian Andrzej Siewior, Thomas Gleixner

Hi Steven,

On 12/02/2017 07:04 PM, Steven Rostedt wrote:
> Daniel Wagner reported a crash on the beaglebone black. This is a
> single CPU architecture, and does not have a functional:
> arch_send_call_function_single_ipi() and can crash if that is called.
> 
> As it only has one CPU, it shouldn't be called, but if the kernel is
> compiled for SMP, the push/pull RT scheduling logic now calls it for
> irq_work if the one CPU is overloaded, it can use that function to call
> itself and crash the kernel.
> 
> Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
> only has a single CPU. But SCHED_FEAT is a constant if sched debugging
> is turned off. Another fix can also be used, and this should also help
> with normal SMP machines. That is, do not initiate the pull code if
> there's only one RT overloaded CPU, and that CPU happens to be the
> current CPU that is scheduling in a lower priority task.
> 
> Even on a system with many CPUs, if there's many RT tasks waiting to
> run on a single CPU, and that CPU schedules in another RT task of lower
> priority, it will initiate the PULL logic in case there's a higher
> priority RT task on another CPU that is waiting to run. But if there is
> no other CPU with waiting RT tasks, it will initiate the RT pull logic
> on itself (as it still has RT tasks waiting to run). This is a wasted
> effort.
> 
> Not only does this help with SMP code where the current CPU is the only
> one with RT overloaded tasks, it should also solve the issue that
> Daniel encountered, because it will prevent the PULL logic from
> executing, as there's only one CPU on the system, and the check added
> here will cause it to exit the RT pull code.
> 
> Link: http://lkml.kernel.org/r/8c913cc2-b2e3-8c2e-e503-aff1428f8ff5@monom.org
> Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
> Cc: stable@vger.kernel.org
> Reported-by: Daniel Wagner <wagi@monom.org>

Tested-by: Daniel Wagner <wagi@monom.org>

Thanks,
Daniel

* Re: [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull
  2017-12-02 18:04 [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull Steven Rostedt
  2017-12-04  7:45 ` Juri Lelli
  2017-12-04 10:29 ` Daniel Wagner
@ 2017-12-11 16:47 ` Ingo Molnar
  2017-12-11 19:34   ` Steven Rostedt
  2017-12-12 10:56 ` [tip:sched/urgent] " tip-bot for Steven Rostedt
  2017-12-15 15:39 ` [tip:sched/urgent] sched/rt: Do not pull from current CPU if only one CPU " tip-bot for Steven Rostedt
  4 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2017-12-11 16:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, linux-rt-users, Peter Zijlstra, Sebastian Andrzej Siewior,
	Daniel Wagner, Thomas Gleixner


* Steven Rostedt <rostedt@goodmis.org> wrote:

> Daniel Wagner reported a crash on the beaglebone black. This is a
> single CPU architecture, and does not have a functional:
> arch_send_call_function_single_ipi() and can crash if that is called.
> 
> As it only has one CPU, it shouldn't be called, but if the kernel is
> compiled for SMP, the push/pull RT scheduling logic now calls it for
> irq_work if the one CPU is overloaded, it can use that function to call
> itself and crash the kernel.
> 
> Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
> only has a single CPU. But SCHED_FEAT is a constant if sched debugging
> is turned off. Another fix can also be used, and this should also help
> with normal SMP machines. That is, do not initiate the pull code if
> there's only one RT overloaded CPU, and that CPU happens to be the
> current CPU that is scheduling in a lower priority task.
> 
> Even on a system with many CPUs, if there's many RT tasks waiting to
> run on a single CPU, and that CPU schedules in another RT task of lower
> priority, it will initiate the PULL logic in case there's a higher
> priority RT task on another CPU that is waiting to run. But if there is
> no other CPU with waiting RT tasks, it will initiate the RT pull logic
> on itself (as it still has RT tasks waiting to run). This is a wasted
> effort.
> 
> Not only does this help with SMP code where the current CPU is the only
> one with RT overloaded tasks, it should also solve the issue that
> Daniel encountered, because it will prevent the PULL logic from
> executing, as there's only one CPU on the system, and the check added
> here will cause it to exit the RT pull code.
> 
> Link: http://lkml.kernel.org/r/8c913cc2-b2e3-8c2e-e503-aff1428f8ff5@monom.org
> Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
> Cc: stable@vger.kernel.org
> Reported-by: Daniel Wagner <wagi@monom.org>

I've added:

  Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Which I suspect you just forgot to add?

Thanks,

	Ingo

* Re: [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull
  2017-12-11 16:47 ` Ingo Molnar
@ 2017-12-11 19:34   ` Steven Rostedt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2017-12-11 19:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, linux-rt-users, Peter Zijlstra, Sebastian Andrzej Siewior,
	Daniel Wagner, Thomas Gleixner

On Mon, 11 Dec 2017 17:47:32 +0100
Ingo Molnar <mingo@kernel.org> wrote:

> > Link: http://lkml.kernel.org/r/8c913cc2-b2e3-8c2e-e503-aff1428f8ff5@monom.org
> > Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
> > Cc: stable@vger.kernel.org
> > Reported-by: Daniel Wagner <wagi@monom.org>  
> 
> I've added:
> 
>   Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> 
> Which I suspect you just forgot to add?

Ug, not sure how that happened, but yes, I simply forgot to add that
(probably did it using quilt and expected to use git commit -s)

Thanks Ingo!

-- Steve

* [tip:sched/urgent] sched/rt: Do not pull from current CPU if only one cpu to pull
  2017-12-02 18:04 [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull Steven Rostedt
                   ` (2 preceding siblings ...)
  2017-12-11 16:47 ` Ingo Molnar
@ 2017-12-12 10:56 ` tip-bot for Steven Rostedt
  2017-12-15 15:39 ` [tip:sched/urgent] sched/rt: Do not pull from current CPU if only one CPU " tip-bot for Steven Rostedt
  4 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Steven Rostedt @ 2017-12-12 10:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, torvalds, mingo, peterz, wagi, rostedt, bigeasy,
	linux-kernel, linux-rt-users, hpa

Commit-ID:  f80a6308b1100d431e27afb61aa90d752df20aae
Gitweb:     https://git.kernel.org/tip/f80a6308b1100d431e27afb61aa90d752df20aae
Author:     Steven Rostedt <rostedt@goodmis.org>
AuthorDate: Sat, 2 Dec 2017 13:04:54 -0500
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 11 Dec 2017 17:46:58 +0100

sched/rt: Do not pull from current CPU if only one cpu to pull

Daniel Wagner reported a crash on the beaglebone black. This is a
single CPU architecture, and does not have a functional:
arch_send_call_function_single_ipi() and can crash if that is called.

As it only has one CPU, it shouldn't be called, but if the kernel is
compiled for SMP, the push/pull RT scheduling logic now calls it for
irq_work if the one CPU is overloaded, it can use that function to call
itself and crash the kernel.

Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
only has a single CPU. But SCHED_FEAT is a constant if sched debugging
is turned off. Another fix can also be used, and this should also help
with normal SMP machines. That is, do not initiate the pull code if
there's only one RT overloaded CPU, and that CPU happens to be the
current CPU that is scheduling in a lower priority task.

Even on a system with many CPUs, if there's many RT tasks waiting to
run on a single CPU, and that CPU schedules in another RT task of lower
priority, it will initiate the PULL logic in case there's a higher
priority RT task on another CPU that is waiting to run. But if there is
no other CPU with waiting RT tasks, it will initiate the RT pull logic
on itself (as it still has RT tasks waiting to run). This is a wasted
effort.

Not only does this help with SMP code where the current CPU is the only
one with RT overloaded tasks, it should also solve the issue that
Daniel encountered, because it will prevent the PULL logic from
executing, as there's only one CPU on the system, and the check added
here will cause it to exit the RT pull code.

Reported-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
Link: http://lkml.kernel.org/r/20171202130454.4cbbfe8d@vmware.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/rt.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 4056c19..665ace2 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2034,8 +2034,9 @@ static void pull_rt_task(struct rq *this_rq)
 	bool resched = false;
 	struct task_struct *p;
 	struct rq *src_rq;
+	int rt_overload_count = rt_overloaded(this_rq);
 
-	if (likely(!rt_overloaded(this_rq)))
+	if (likely(!rt_overload_count))
 		return;
 
 	/*
@@ -2044,6 +2045,11 @@ static void pull_rt_task(struct rq *this_rq)
 	 */
 	smp_rmb();
 
+	/* If we are the only overloaded CPU do nothing */
+	if (rt_overload_count == 1 &&
+	    cpumask_test_cpu(this_rq->cpu, this_rq->rd->rto_mask))
+		return;
+
 #ifdef HAVE_RT_PUSH_IPI
 	if (sched_feat(RT_PUSH_IPI)) {
 		tell_cpu_to_push(this_rq);

* [tip:sched/urgent] sched/rt: Do not pull from current CPU if only one CPU to pull
  2017-12-02 18:04 [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull Steven Rostedt
                   ` (3 preceding siblings ...)
  2017-12-12 10:56 ` [tip:sched/urgent] " tip-bot for Steven Rostedt
@ 2017-12-15 15:39 ` tip-bot for Steven Rostedt
  4 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Steven Rostedt @ 2017-12-15 15:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: wagi, torvalds, peterz, rostedt, tglx, linux-kernel, hpa, mingo,
	bigeasy, linux-rt-users

Commit-ID:  f73c52a5bcd1710994e53fbccc378c42b97a06b6
Gitweb:     https://git.kernel.org/tip/f73c52a5bcd1710994e53fbccc378c42b97a06b6
Author:     Steven Rostedt <rostedt@goodmis.org>
AuthorDate: Sat, 2 Dec 2017 13:04:54 -0500
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 15 Dec 2017 16:28:02 +0100

sched/rt: Do not pull from current CPU if only one CPU to pull

Daniel Wagner reported a crash on the BeagleBone Black SoC.

This is a single CPU architecture, and does not have a functional
arch_send_call_function_single_ipi() implementation which can crash
the kernel if that is called.

As it only has one CPU, it shouldn't be called, but if the kernel is
compiled for SMP, the push/pull RT scheduling logic now calls it for
irq_work if the one CPU is overloaded, it can use that function to call
itself and crash the kernel.

Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
only has a single CPU. But SCHED_FEAT is a constant if sched debugging
is turned off. Another fix can also be used, and this should also help
with normal SMP machines. That is, do not initiate the pull code if
there's only one RT overloaded CPU, and that CPU happens to be the
current CPU that is scheduling in a lower priority task.

Even on a system with many CPUs, if there's many RT tasks waiting to
run on a single CPU, and that CPU schedules in another RT task of lower
priority, it will initiate the PULL logic in case there's a higher
priority RT task on another CPU that is waiting to run. But if there is
no other CPU with waiting RT tasks, it will initiate the RT pull logic
on itself (as it still has RT tasks waiting to run). This is a wasted
effort.

Not only does this help with SMP code where the current CPU is the only
one with RT overloaded tasks, it should also solve the issue that
Daniel encountered, because it will prevent the PULL logic from
executing, as there's only one CPU on the system, and the check added
here will cause it to exit the RT pull code.

Reported-by: Daniel Wagner <wagi@monom.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 4bdced5c9 ("sched/rt: Simplify the IPI based RT balancing logic")
Link: http://lkml.kernel.org/r/20171202130454.4cbbfe8d@vmware.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/rt.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 4056c19..665ace2 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2034,8 +2034,9 @@ static void pull_rt_task(struct rq *this_rq)
 	bool resched = false;
 	struct task_struct *p;
 	struct rq *src_rq;
+	int rt_overload_count = rt_overloaded(this_rq);
 
-	if (likely(!rt_overloaded(this_rq)))
+	if (likely(!rt_overload_count))
 		return;
 
 	/*
@@ -2044,6 +2045,11 @@ static void pull_rt_task(struct rq *this_rq)
 	 */
 	smp_rmb();
 
+	/* If we are the only overloaded CPU do nothing */
+	if (rt_overload_count == 1 &&
+	    cpumask_test_cpu(this_rq->cpu, this_rq->rd->rto_mask))
+		return;
+
 #ifdef HAVE_RT_PUSH_IPI
 	if (sched_feat(RT_PUSH_IPI)) {
 		tell_cpu_to_push(this_rq);
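
The early-exit logic added by this patch can be modelled outside the kernel
as a minimal sketch. The struct and function names below are hypothetical,
the mask is limited to 64 CPUs, and the memory barrier and locking of the
real code are left out; only the decision itself is reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the root-domain state the patch reads. */
struct model_rd {
	int rto_count;		/* number of RT-overloaded CPUs */
	uint64_t rto_mask;	/* one bit per RT-overloaded CPU (model: up to 64 CPUs) */
};

/*
 * Returns true when the pull path should go on to look at other CPUs,
 * false when it should return early. this_cpu must be below 64 here.
 */
static bool model_should_pull(const struct model_rd *rd, unsigned int this_cpu)
{
	int rt_overload_count = rd->rto_count;

	/* Nothing is overloaded anywhere: the pre-existing early exit. */
	if (!rt_overload_count)
		return false;

	/* The new check: we are the only overloaded CPU, so there is nothing to pull. */
	if (rt_overload_count == 1 && ((rd->rto_mask >> this_cpu) & 1))
		return false;

	return true;
}

/* Walks through the cases discussed in the changelog. */
static void model_pull_demo(void)
{
	struct model_rd none = { 0, 0 };
	struct model_rd only_self = { 1, (uint64_t)1 << 0 };
	struct model_rd only_other = { 1, (uint64_t)1 << 3 };
	struct model_rd self_and_other = { 2, ((uint64_t)1 << 0) | ((uint64_t)1 << 3) };

	assert(!model_should_pull(&none, 0));		/* no overload at all */
	assert(!model_should_pull(&only_self, 0));	/* single-CPU case from the report */
	assert(model_should_pull(&only_other, 0));	/* another CPU has waiting RT tasks */
	assert(model_should_pull(&self_and_other, 0));	/* both are overloaded */
}
```

The `only_self` case corresponds to the single-CPU board running an SMP
kernel: the count is 1 and the current CPU's bit is set, so the model
returns before anything like tell_cpu_to_push() would be reached.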

Thread overview: 10+ messages
2017-12-02 18:04 [PATCH] sched/rt: Do not pull from current CPU if only one cpu to pull Steven Rostedt
2017-12-04  7:45 ` Juri Lelli
2017-12-04  8:09   ` Steven Rostedt
2017-12-04  9:07     ` Juri Lelli
2017-12-04  9:55       ` Steven Rostedt
2017-12-04 10:29 ` Daniel Wagner
2017-12-11 16:47 ` Ingo Molnar
2017-12-11 19:34   ` Steven Rostedt
2017-12-12 10:56 ` [tip:sched/urgent] " tip-bot for Steven Rostedt
2017-12-15 15:39 ` [tip:sched/urgent] sched/rt: Do not pull from current CPU if only one CPU " tip-bot for Steven Rostedt