* sched_core_balance() releasing interrupts with pi_lock held
From: Steven Rostedt @ 2022-03-08 21:14 UTC
  To: Peter Zijlstra; +Cc: LKML, Thomas Gleixner, Sebastian Andrzej Siewior

Hi Peter,

A ChromeOS bug report showed a lockdep splat that I first thought was a bad
backport. But when looking at upstream, I don't see how it would work there
either. The lockdep splat had:

[56064.673346] Call Trace:
[56064.676066]  dump_stack+0xb9/0x117
[56064.679861]  ? print_usage_bug+0x2af/0x2c2
[56064.684434]  mark_lock_irq+0x25e/0x27d
[56064.688618]  mark_lock+0x11a/0x16c
[56064.692412]  mark_held_locks+0x57/0x87
[56064.696595]  ? _raw_spin_unlock_irq+0x2c/0x40
[56064.701460]  lockdep_hardirqs_on+0xb1/0x19d
[56064.706130]  _raw_spin_unlock_irq+0x2c/0x40
[56064.710799]  sched_core_balance+0x8a/0x4af
[56064.715369]  ? __balance_callback+0x1f/0x9a
[56064.720030]  __balance_callback+0x4f/0x9a
[56064.724506]  rt_mutex_setprio+0x43a/0x48b
[56064.728982]  task_blocks_on_rt_mutex+0x14d/0x1d5

Where I see:

task_blocks_on_rt_mutex() {
  spin_lock(pi_lock);
  rt_mutex_setprio() {
    balance_callback() {
      sched_core_balance() {
        spin_unlock_irq(rq);

Where spin_unlock_irq() enables interrupts while holding the pi_lock, and
BOOM, lockdep (rightfully) complains.
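
Spelled out, the pattern lockdep is objecting to looks roughly like this
(an illustrative sketch following the trace, not the literal kernel code):

  raw_spin_lock_irq(&p->pi_lock);      /* IRQs off, pi_lock held */
  ...
      raw_spin_rq_unlock_irq(rq);      /* BOOM: enables IRQs while the
                                        * IRQ-safe pi_lock is still held */
  ...
  raw_spin_unlock_irq(&p->pi_lock);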

The above was me looking at mainline, not the kernel that blew up. So, I'm
guessing that this is a bug in mainline as well.

-- Steve


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Steven Rostedt @ 2022-03-15 21:46 UTC
  To: Peter Zijlstra; +Cc: LKML, Thomas Gleixner, Sebastian Andrzej Siewior

On Tue, 8 Mar 2022 16:14:55 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> Hi Peter,

Have you had time to look into this?

Thanks,

-- Steve




* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Sebastian Andrzej Siewior @ 2022-03-16 16:03 UTC
  To: Steven Rostedt; +Cc: Peter Zijlstra, LKML, Thomas Gleixner

On 2022-03-15 17:46:06 [-0400], Steven Rostedt wrote:
> On Tue, 8 Mar 2022 16:14:55 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > Hi Peter,
> 
> Have you had time to look into this?

Yes, I can confirm that it is a problem ;) So I did this:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 33ce5cd113d8..56c286aaa01f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5950,7 +5950,6 @@ static bool try_steal_cookie(int this, int that)
 	unsigned long cookie;
 	bool success = false;
 
-	local_irq_disable();
 	double_rq_lock(dst, src);
 
 	cookie = dst->core->core_cookie;
@@ -5989,7 +5988,6 @@ static bool try_steal_cookie(int this, int that)
 
 unlock:
 	double_rq_unlock(dst, src);
-	local_irq_enable();
 
 	return success;
 }
@@ -6019,7 +6017,7 @@ static void sched_core_balance(struct rq *rq)
 
 	preempt_disable();
 	rcu_read_lock();
-	raw_spin_rq_unlock_irq(rq);
+	raw_spin_rq_unlock(rq);
 	for_each_domain(cpu, sd) {
 		if (need_resched())
 			break;
@@ -6027,7 +6025,7 @@ static void sched_core_balance(struct rq *rq)
 		if (steal_cookie_task(cpu, sd))
 			break;
 	}
-	raw_spin_rq_lock_irq(rq);
+	raw_spin_rq_lock(rq);
 	rcu_read_unlock();
 	preempt_enable();
 }


which looked right (interrupts now simply stay disabled across the whole
balance loop), but RT still falls apart:

| =====================================
| WARNING: bad unlock balance detected!
| 5.17.0-rc8-rt14+ #10 Not tainted
| -------------------------------------
| gcc/2608 is trying to release lock ((lock)) at:
| [<ffffffff8135a150>] folio_add_lru+0x60/0x90
| but there are no more locks to release!
| 
| other info that might help us debug this:
| 4 locks held by gcc/2608:
|  #0: ffff88826ea6efe0 (&sb->s_type->i_mutex_key#12){++++}-{3:3}, at: xfs_ilock+0x90/0xd0
|  #1: ffff88826ea6f1a0 (mapping.invalidate_lock#2){++++}-{3:3}, at: page_cache_ra_unbounded+0x8e/0x1f0
|  #2: ffff88852aba8d18 ((lock)#3){+.+.}-{2:2}, at: folio_add_lru+0x2a/0x90
|  #3: ffffffff829a5140 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xe0
| 
| stack backtrace:
| CPU: 18 PID: 2608 Comm: gcc Not tainted 5.17.0-rc8-rt14+ #10
| Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
| Call Trace:
|  <TASK>
|  dump_stack_lvl+0x4a/0x62
|  lock_release.cold+0x32/0x37
|  rt_spin_unlock+0x17/0x80
|  folio_add_lru+0x60/0x90
|  filemap_add_folio+0x53/0xa0
|  page_cache_ra_unbounded+0x1c3/0x1f0
|  filemap_get_pages+0xe3/0x5b0
|  filemap_read+0xc5/0x2f0
|  xfs_file_buffered_read+0x6b/0x1a0
|  xfs_file_read_iter+0x6a/0xd0
|  new_sync_read+0x11b/0x1a0
|  vfs_read+0x134/0x1d0
|  ksys_read+0x68/0xf0
|  do_syscall_64+0x59/0x80
|  entry_SYSCALL_64_after_hwframe+0x44/0xae
| RIP: 0033:0x7f3feab7310e

It is always the local lock that breaks. Based on the "locks held" list
and the lock it tries to release, it looks like the lock was acquired on
CPU-A and released on CPU-B.
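
On RT a local_lock maps to a per-CPU lock, so correctness depends on the
task staying on the CPU where it acquired it. Roughly what goes wrong in
the lru path above (a sketch, not the actual mm/swap.c code):

  local_lock(&lru_pvecs.lock);    /* locks *this* CPU's lock: CPU-A      */
  ...                             /* task gets stolen over to CPU-B ...  */
  local_unlock(&lru_pvecs.lock);  /* unlocks CPU-B's instance instead:
                                   * "bad unlock balance"                */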


Sebastian


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Sebastian Andrzej Siewior @ 2022-03-16 16:18 UTC
  To: Steven Rostedt; +Cc: Peter Zijlstra, LKML, Thomas Gleixner

On 2022-03-16 17:03:14 [+0100], To Steven Rostedt wrote:
…
> which looked right, but RT still falls apart:
> | =====================================
> | WARNING: bad unlock balance detected!
> | 5.17.0-rc8-rt14+ #10 Not tainted
> | -------------------------------------
…
…
> It is always the local lock that breaks. Based on the "locks held" list
> and the lock it tries to release, it looks like the lock was acquired on
> CPU-A and released on CPU-B.

with

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 33ce5cd113d8..f4675bd8f878 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5968,6 +5967,9 @@ static bool try_steal_cookie(int this, int that)
 		if (p == src->core_pick || p == src->curr)
 			goto next;
 
+		if (p->migration_disabled)
+			goto next;
+
 		if (!cpumask_test_cpu(this, &p->cpus_mask))
 			goto next;
 
With this on top, my problems are gone. Let me do some testing and then I
will send a patch, unless PeterZ does the yelling :)
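
For background: a task inside a migrate_disable() section may be preempted
(and thus sit on a runqueue), but it must resume on the same CPU, because
it may hold references to per-CPU data. A minimal sketch of the contract
(my_data is a made-up per-CPU variable for illustration):

  migrate_disable();              /* may be preempted, but must resume
                                   * on this CPU                        */
  d = this_cpu_ptr(&my_data);     /* only safe because of that
                                   * guarantee                          */
  ...
  migrate_enable();

Stealing such a task onto another CPU, as try_steal_cookie() did, silently
breaks that guarantee.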

Sebastian


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Sebastian Andrzej Siewior @ 2022-03-16 17:05 UTC
  To: Steven Rostedt; +Cc: Peter Zijlstra, LKML, Thomas Gleixner

On 2022-03-16 17:18:41 [+0100], To Steven Rostedt wrote:
> With this on top, my problems are gone. Let me do some testing and then
> I will send a patch, unless PeterZ does the yelling :)

So the long irq-off section is latency-wise less than optimal: a testcase
went from ~40us to ~140us. And then there is this:
| NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #80!!!
| NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #10!!!

which might be RT-only.
I *think* we want the buggy part gone first and then the latency fixed.


Sebastian


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Peter Zijlstra @ 2022-03-16 20:27 UTC
  To: Steven Rostedt; +Cc: LKML, Thomas Gleixner, Sebastian Andrzej Siewior

On Tue, Mar 15, 2022 at 05:46:06PM -0400, Steven Rostedt wrote:
> On Tue, 8 Mar 2022 16:14:55 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > Hi Peter,
> 
> Have you had time to look into this?

Not since I talked to you on IRC about it last week.

Like I wrote, the balance_callback should be run under whichever
rq->lock instance it gets queued under. As per:

  565790d28b1e ("sched: Fix balance_callback()")

Now, we only do queue_core_balance() from set_next_task_idle(), which
*should* only happen from pick_next_task(), and as such the callback
should only ever get called from finish_lock_switch() or the 'prev ==
next' case in __schedule().

Neither of these two sites holds pi_lock.


This is about as far as I got explaining things, and it being late, it's
about as far as I got looking at things.

Now that also makes conceptual sense: we only want to pull a core-cookie
task when we're scheduling an idle task.

Now, clearly this gets triggered from the PI path, but that's not making
immediate sense to me, it would mean we're boosting the idle task, which
is wrong too.

So it would be useful for someone that can reproduce this to provide a
trace of where queue_core_balance() gets called, because that *should*
only be in __schedule().
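
In code terms, the invariant from that commit is roughly this (a sketch of
the intended flow, not verbatim kernel code):

  /* queueing side: always under rq->lock */
  rq_lock(rq, &rf);
  queue_balance_callback(rq, &per_cpu(core_balance_head, rq->cpu),
                         sched_core_balance);
  ...
  /* run side: under the same rq->lock, only from __schedule() or
   * finish_lock_switch(), never with pi_lock held */
  __balance_callbacks(rq);
  rq_unlock(rq, &rf);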


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Peter Zijlstra @ 2022-03-16 20:35 UTC
  To: Sebastian Andrzej Siewior; +Cc: Steven Rostedt, LKML, Thomas Gleixner

On Wed, Mar 16, 2022 at 05:18:40PM +0100, Sebastian Andrzej Siewior wrote:

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 33ce5cd113d8..f4675bd8f878 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5968,6 +5967,9 @@ static bool try_steal_cookie(int this, int that)
>  		if (p == src->core_pick || p == src->curr)
>  			goto next;
>  
> +		if (p->migration_disabled)
> +			goto next;
> +
>  		if (!cpumask_test_cpu(this, &p->cpus_mask))
>  			goto next;
>  
> on top my problems are gone. Let me do some testing and then I would
> patch unless PeterZ does the yelling :)

The previous thing is wrong because it tries to solve the wrong thing;
the above makes sense, except I would write it like so:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 83872f95a1ea..04c05bc4062e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5914,7 +5914,7 @@ static bool try_steal_cookie(int this, int that)
 		if (p == src->core_pick || p == src->curr)
 			goto next;
 
-		if (!cpumask_test_cpu(this, &p->cpus_mask))
+		if (!is_cpu_allowed(p, this))
 			goto next;
 
 		if (p->core_occupation > dst->idle->core_occupation)

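For reference, is_cpu_allowed() checks more than the raw affinity mask; in
kernels of this vintage it begins roughly like this (paraphrased, see
kernel/sched/core.c for the real thing):

  static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
  {
          /* cpus_ptr, unlike cpus_mask, is narrowed to the task's own
           * CPU while the task is migration disabled */
          if (!cpumask_test_cpu(cpu, p->cpus_ptr))
                  return false;

          /* a migrate_disable()d task must be left to finish where it is */
          if (is_migration_disabled(p))
                  return cpu_online(cpu);
          ...
  }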

* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Peter Zijlstra @ 2022-03-16 21:03 UTC
  To: Steven Rostedt; +Cc: LKML, Thomas Gleixner, Sebastian Andrzej Siewior, joel

On Wed, Mar 16, 2022 at 09:27:34PM +0100, Peter Zijlstra wrote:
> So it would be useful for someone that can reproduce this to provide a
> trace of where queue_core_balance() gets called, because that *should*
> only be in __schedule().

Does something like the below (untested in the extreme) help?

---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 83872f95a1ea..18163454bb47 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5665,6 +5665,8 @@ static inline struct task_struct *pick_task(struct rq *rq)
 
 extern void task_vruntime_update(struct rq *rq, struct task_struct *p, bool in_fi);
 
+static void queue_core_balance(struct rq *rq);
+
 static struct task_struct *
 pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 {
@@ -5714,7 +5716,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 		}
 
 		rq->core_pick = NULL;
-		return next;
+		goto out;
 	}
 
 	put_prev_task_balance(rq, prev, rf);
@@ -5764,7 +5766,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 			 */
 			WARN_ON_ONCE(fi_before);
 			task_vruntime_update(rq, next, false);
-			goto done;
+			goto out_set_next;
 		}
 	}
 
@@ -5884,8 +5886,12 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 		resched_curr(rq_i);
 	}
 
-done:
+out_set_next:
 	set_next_task(rq, next);
+out:
+	if (rq->core->core_forceidle_count && next == rq->idle)
+		queue_core_balance(rq);
+
 	return next;
 }
 
@@ -5914,7 +5920,7 @@ static bool try_steal_cookie(int this, int that)
 		if (p == src->core_pick || p == src->curr)
 			goto next;
 
-		if (!cpumask_test_cpu(this, &p->cpus_mask))
+		if (!is_cpu_allowed(p, this))
 			goto next;
 
 		if (p->core_occupation > dst->idle->core_occupation)
@@ -5980,7 +5986,7 @@ static void sched_core_balance(struct rq *rq)
 
 static DEFINE_PER_CPU(struct callback_head, core_balance_head);
 
-void queue_core_balance(struct rq *rq)
+static void queue_core_balance(struct rq *rq)
 {
 	if (!sched_core_enabled(rq))
 		return;
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index d17b0a5ce6ac..314c36fc9c42 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -437,7 +437,6 @@ static void set_next_task_idle(struct rq *rq, struct task_struct *next, bool fir
 {
 	update_idle_core(rq);
 	schedstat_inc(rq->sched_goidle);
-	queue_core_balance(rq);
 }
 
 #ifdef CONFIG_SMP
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index de53be905739..b85c9344779a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1247,8 +1247,6 @@ static inline bool sched_group_cookie_match(struct rq *rq,
 	return false;
 }
 
-extern void queue_core_balance(struct rq *rq);
-
 static inline bool sched_core_enqueued(struct task_struct *p)
 {
 	return !RB_EMPTY_NODE(&p->core_node);
@@ -1282,10 +1280,6 @@ static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
 	return &rq->__lock;
 }
 
-static inline void queue_core_balance(struct rq *rq)
-{
-}
-
 static inline bool sched_cpu_cookie_match(struct rq *rq, struct task_struct *p)
 {
 	return true;

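(The structural point of the patch: queue_core_balance() is no longer
called from set_next_task_idle(), which the PI-boost path can evidently
also reach, but only from the tail of pick_next_task(), i.e. from
__schedule(), where pi_lock is never held.)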

* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Sebastian Andrzej Siewior @ 2022-03-17 12:08 UTC
  To: Peter Zijlstra; +Cc: Steven Rostedt, LKML, Thomas Gleixner

On 2022-03-16 21:27:34 [+0100], Peter Zijlstra wrote:
> Now, we only do queue_core_balance() from set_next_task_idle(), which
> *should* only happen from pick_next_task(), and as such the callback
> should only ever get called from finish_lock_switch() or the 'prev ==
> next' case in __schedule().
> 
> Neither of these two sites holds pi_lock.

I've been trying to reproduce it and didn't manage to. I see only the
idle/scheduler path.

> This is about as far as I got explaining things, and it being late, it's
> about as far as I got looking at things.
> 
> Now that also makes conceptual sense: we only want to pull a core-cookie
> task when we're scheduling an idle task.
> 
> Now, clearly this gets triggered from the PI path, but that's not making
> immediate sense to me, it would mean we're boosting the idle task, which
> is wrong too.

Looking at the idle task, it shouldn't be possible for !RT due to the lack
of boostable locks, and I don't see any sleeping locks here on RT either.

> So it would be useful for someone that can reproduce this to provide a
> trace of where queue_core_balance() gets called, because that *should*
> only be in __schedule().

So far I have failed.

Sebastian


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Sebastian Andrzej Siewior @ 2022-03-17 12:09 UTC
  To: Peter Zijlstra; +Cc: Steven Rostedt, LKML, Thomas Gleixner

On 2022-03-16 21:35:24 [+0100], Peter Zijlstra wrote:
> 
> The previous thing is wrong because it tries to solve the wrong thing;
> the above makes sense, except I would write it like so:

Perfect. You want me to dress it as a patch or have you already done so?

Sebastian


* [PATCH] sched: Teach the forced-newidle balancer about CPU affinity limitation.
From: Sebastian Andrzej Siewior @ 2022-03-17 14:51 UTC
  To: Peter Zijlstra
  Cc: Steven Rostedt, LKML, Thomas Gleixner, Ingo Molnar, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

try_steal_cookie() looks at task_struct::cpus_mask to decide if the
task could be moved to `this' CPU. It ignores that the task might be in
a migration-disabled section while not on the CPU. In this case the task
must not be moved, otherwise per-CPU assumptions are broken.

Use is_cpu_allowed(), as suggested by Peter Zijlstra, to decide if a
task can be moved.

Fixes: d2dfa17bc7de6 ("sched: Trivial forced-newidle balancer")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a44414946de3d..421ace2e007de 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5923,7 +5923,7 @@ static bool try_steal_cookie(int this, int that)
 		if (p == src->core_pick || p == src->curr)
 			goto next;
 
-		if (!cpumask_test_cpu(this, &p->cpus_mask))
+		if (!is_cpu_allowed(p, this))
 			goto next;
 
 		if (p->core_occupation > dst->idle->core_occupation)
-- 
2.35.1



* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Steven Rostedt @ 2022-03-21 17:30 UTC
  To: Peter Zijlstra; +Cc: LKML, Thomas Gleixner, Sebastian Andrzej Siewior, joel

On Wed, 16 Mar 2022 22:03:41 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> Does something like the below (untested in the extreme) help?

Hi Peter,

This has been tested extensively by the ChromeOS team, and they said that
it does appear to fix the problem.

Could you get this into mainline, and tag it for stable so that it can be
backported to the appropriate stable releases?

Thanks for the fix!

-- Steve


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Steven Rostedt @ 2022-03-29 21:22 UTC
  To: Peter Zijlstra; +Cc: LKML, Thomas Gleixner, Sebastian Andrzej Siewior, joel

On Mon, 21 Mar 2022 13:30:37 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 16 Mar 2022 22:03:41 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > Does something like the below (untested in the extreme) help?  
> 
> Hi Peter,
> 
> This has been tested extensively by the ChromeOS team, and they said that
> it does appear to fix the problem.
> 
> Could you get this into mainline, and tag it for stable so that it can be
> backported to the appropriate stable releases?
> 
> Thanks for the fix!
>

Hi Peter,

I just don't want you to forget about this :-)

-- Steve


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: T.J. Alumbaugh @ 2022-04-04 20:17 UTC
  To: Steven Rostedt, Peter Zijlstra
  Cc: LKML, Thomas Gleixner, Sebastian Andrzej Siewior, joel


On 3/29/22 17:22, Steven Rostedt wrote:
> Hi Peter,
>
> I just don't want you to forget about this :-)
>
> -- Steve
>
Hi Peter,

Just a note that if/when you send this out as a patch, feel free to add:

Tested-by: T.J. Alumbaugh <talumbau@chromium.org>

Thanks,

- T.J.



* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Peter Zijlstra @ 2022-04-05  7:48 UTC
  To: T.J. Alumbaugh
  Cc: Steven Rostedt, LKML, Thomas Gleixner, Sebastian Andrzej Siewior, joel

On Mon, Apr 04, 2022 at 04:17:54PM -0400, T.J. Alumbaugh wrote:
> Hi Peter,
> 
> Just a note that if/when you send this out as a patch, feel free to add:
> 
> Tested-by: T.J. Alumbaugh <talumbau@chromium.org>

https://lkml.kernel.org/r/20220330160535.GN8939@worktop.programming.kicks-ass.net


* [tip: sched/urgent] sched: Teach the forced-newidle balancer about CPU affinity limitation.
From: tip-bot2 for Sebastian Andrzej Siewior @ 2022-04-05  8:22 UTC
  To: linux-tip-commits
  Cc: Sebastian Andrzej Siewior, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     386ef214c3c6ab111d05e1790e79475363abaa05
Gitweb:        https://git.kernel.org/tip/386ef214c3c6ab111d05e1790e79475363abaa05
Author:        Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate:    Thu, 17 Mar 2022 15:51:32 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 05 Apr 2022 09:59:36 +02:00

sched: Teach the forced-newidle balancer about CPU affinity limitation.

try_steal_cookie() looks at task_struct::cpus_mask to decide if the
task could be moved to `this' CPU. It ignores that the task might be in
a migration-disabled section while not on the CPU. In this case the task
must not be moved, otherwise per-CPU assumptions are broken.

Use is_cpu_allowed(), as suggested by Peter Zijlstra, to decide if a
task can be moved.

Fixes: d2dfa17bc7de6 ("sched: Trivial forced-newidle balancer")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/YjNK9El+3fzGmswf@linutronix.de
---
 kernel/sched/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 017ee78..51efaab 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6006,7 +6006,7 @@ static bool try_steal_cookie(int this, int that)
 		if (p == src->core_pick || p == src->curr)
 			goto next;
 
-		if (!cpumask_test_cpu(this, &p->cpus_mask))
+		if (!is_cpu_allowed(p, this))
 			goto next;
 
 		if (p->core_occupation > dst->idle->core_occupation)


* Re: sched_core_balance() releasing interrupts with pi_lock held
From: Dietmar Eggemann @ 2022-04-05 15:16 UTC
  To: Peter Zijlstra, T.J. Alumbaugh
  Cc: Steven Rostedt, LKML, Thomas Gleixner, Sebastian Andrzej Siewior, joel

On 05/04/2022 09:48, Peter Zijlstra wrote:
> https://lkml.kernel.org/r/20220330160535.GN8939@worktop.programming.kicks-ass.net

I still wonder if this issue happened on a system w/o:

     565790d28b1e ("sched: Fix balance_callback()")

Maybe chromeos-5.10 or earlier? In this case applying 565790d28b1e could
fix it as well.

The reason why I think the original issue happened on a system w/o
565790d28b1e is the call-stack in:

https://lkml.kernel.org/r/20220315174606.02959816@gandalf.local.home

[56064.673346] Call Trace:
[56064.676066]  dump_stack+0xb9/0x117
[56064.679861]  ? print_usage_bug+0x2af/0x2c2
[56064.684434]  mark_lock_irq+0x25e/0x27d
[56064.688618]  mark_lock+0x11a/0x16c
[56064.692412]  mark_held_locks+0x57/0x87
[56064.696595]  ? _raw_spin_unlock_irq+0x2c/0x40
[56064.701460]  lockdep_hardirqs_on+0xb1/0x19d
[56064.706130]  _raw_spin_unlock_irq+0x2c/0x40
[56064.710799]  sched_core_balance+0x8a/0x4af
[56064.715369]  ? __balance_callback+0x1f/0x9a        <--- !!!
[56064.720030]  __balance_callback+0x4f/0x9a
[56064.724506]  rt_mutex_setprio+0x43a/0x48b
[56064.728982]  task_blocks_on_rt_mutex+0x14d/0x1d5

has __balance_callback().

565790d28b1e changes __balance_callback() to __balance_callbacks() (note
the trailing 's'), so a trace showing __balance_callback() predates that
commit.
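
For the record, the pre-565790d28b1e flow that would produce this trace
looks roughly like this (a sketch, not the literal code of that era):

  rt_mutex_setprio(...)           /* reached with pi_lock held */
  {
          ...
  out_unlock:
          __task_rq_unlock(rq, &rf);
          balance_callback(rq);   /* old API: re-takes rq->lock and runs
                                   * sched_core_balance(), still under
                                   * the caller's pi_lock */
  }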

