* [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
@ 2020-03-13 23:29 Joel Fernandes (Google)
  2020-03-14  0:30 ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Joel Fernandes (Google) @ 2020-03-13 23:29 UTC (permalink / raw)
  To: linux-kernel
  Cc: Joel Fernandes (Google),
	vpillai, Aaron Lu, Aubrey Li, peterz, paulmck, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Juri Lelli, Mel Gorman,
	Steven Rostedt, Vincent Guittot

rcu_read_unlock() can incur an infrequent deadlock in
sched_core_balance(). Fix this by using the RCU-sched flavor instead.

This fixes the following spinlock recursion observed when testing the
core scheduling patches on PREEMPT=y kernel on ChromeOS:

[    3.240891] BUG: spinlock recursion on CPU#2, swapper/2/0
[    3.240900]  lock: 0xffff9cd1eeb28e40, .magic: dead4ead, .owner: swapper/2/0, .owner_cpu: 2
[    3.240905] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.4.22htcore #4
[    3.240908] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.174.0 05/29/2018
[    3.240910] Call Trace:
[    3.240919]  dump_stack+0x97/0xdb
[    3.240924]  ? spin_bug+0xa4/0xb1
[    3.240927]  do_raw_spin_lock+0x79/0x98
[    3.240931]  try_to_wake_up+0x367/0x61b
[    3.240935]  rcu_read_unlock_special+0xde/0x169
[    3.240938]  ? sched_core_balance+0xd9/0x11e
[    3.240941]  __rcu_read_unlock+0x48/0x4a
[    3.240945]  __balance_callback+0x50/0xa1
[    3.240949]  __schedule+0x55a/0x61e
[    3.240952]  schedule_idle+0x21/0x2d
[    3.240956]  do_idle+0x1d5/0x1f8
[    3.240960]  cpu_startup_entry+0x1d/0x1f
[    3.240964]  start_secondary+0x159/0x174
[    3.240967]  secondary_startup_64+0xa4/0xb0
[   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
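
The recursion comes from the ordering in sched_core_balance(): the rq lock
is re-acquired before rcu_read_unlock(), so if the read-side critical
section was preempted, the unlock may need to do a wakeup while the rq
lock is already held. A simplified sketch of the original ordering (for
illustration only, comments added; see the diff below for the actual code):

	rcu_read_lock();
	raw_spin_unlock_irq(rq_lockp(rq));	/* rq lock dropped; reader can be preempted here */
	for_each_domain(cpu, sd) {
		/* ... load balancing / steal_cookie_task() ... */
	}
	raw_spin_lock_irq(rq_lockp(rq));	/* rq lock held again */
	rcu_read_unlock();			/*
						 * If the reader was preempted, this can call
						 * rcu_read_unlock_special() -> try_to_wake_up(),
						 * which takes the rq lock -> spinlock recursion.
						 */

rcu_read_lock_sched() maps to preempt_disable(), so the reader cannot be
preempted and rcu_read_unlock_sched() never has to enter that wakeup path.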

Cc: vpillai <vpillai@digitalocean.com>
Cc: Aaron Lu <aaron.lwe@gmail.com>
Cc: Aubrey Li <aubrey.intel@gmail.com>
Cc: peterz@infradead.org
Cc: paulmck@kernel.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/sched/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3045bd50e249..037e8f2e2686 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4735,7 +4735,7 @@ static void sched_core_balance(struct rq *rq)
 	struct sched_domain *sd;
 	int cpu = cpu_of(rq);
 
-	rcu_read_lock();
+	rcu_read_lock_sched();
 	raw_spin_unlock_irq(rq_lockp(rq));
 	for_each_domain(cpu, sd) {
 		if (!(sd->flags & SD_LOAD_BALANCE))
@@ -4748,7 +4748,7 @@ static void sched_core_balance(struct rq *rq)
 			break;
 	}
 	raw_spin_lock_irq(rq_lockp(rq));
-	rcu_read_unlock();
+	rcu_read_unlock_sched();
 }
 
 static DEFINE_PER_CPU(struct callback_head, core_balance_head);
-- 
2.25.1.481.gfbce0eb801-goog



* Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
  2020-03-13 23:29 [PATCH] sched: Use RCU-sched in core-scheduling balancing logic Joel Fernandes (Google)
@ 2020-03-14  0:30 ` Paul E. McKenney
  2020-03-23  6:58   ` Li, Aubrey
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2020-03-14  0:30 UTC (permalink / raw)
  To: Joel Fernandes (Google)
  Cc: linux-kernel, vpillai, Aaron Lu, Aubrey Li, peterz, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Juri Lelli, Mel Gorman,
	Steven Rostedt, Vincent Guittot

On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
> rcu_read_unlock() can incur an infrequent deadlock in
> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
> 
> This fixes the following spinlock recursion observed when testing the
> core scheduling patches on PREEMPT=y kernel on ChromeOS:
> 
> [    3.240891] BUG: spinlock recursion on CPU#2, swapper/2/0
> [    3.240900]  lock: 0xffff9cd1eeb28e40, .magic: dead4ead, .owner: swapper/2/0, .owner_cpu: 2
> [    3.240905] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.4.22htcore #4
> [    3.240908] Hardware name: Google Eve/Eve, BIOS Google_Eve.9584.174.0 05/29/2018
> [    3.240910] Call Trace:
> [    3.240919]  dump_stack+0x97/0xdb
> [    3.240924]  ? spin_bug+0xa4/0xb1
> [    3.240927]  do_raw_spin_lock+0x79/0x98
> [    3.240931]  try_to_wake_up+0x367/0x61b
> [    3.240935]  rcu_read_unlock_special+0xde/0x169
> [    3.240938]  ? sched_core_balance+0xd9/0x11e
> [    3.240941]  __rcu_read_unlock+0x48/0x4a
> [    3.240945]  __balance_callback+0x50/0xa1
> [    3.240949]  __schedule+0x55a/0x61e
> [    3.240952]  schedule_idle+0x21/0x2d
> [    3.240956]  do_idle+0x1d5/0x1f8
> [    3.240960]  cpu_startup_entry+0x1d/0x1f
> [    3.240964]  start_secondary+0x159/0x174
> [    3.240967]  secondary_startup_64+0xa4/0xb0
> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
> 
> Cc: vpillai <vpillai@digitalocean.com>
> Cc: Aaron Lu <aaron.lwe@gmail.com>
> Cc: Aubrey Li <aubrey.intel@gmail.com>
> Cc: peterz@infradead.org
> Cc: paulmck@kernel.org
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

The original could indeed deadlock, and this would avoid that deadlock.
(The commit to solve this deadlock is sadly not yet in mainline.)

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  kernel/sched/core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3045bd50e249..037e8f2e2686 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4735,7 +4735,7 @@ static void sched_core_balance(struct rq *rq)
>  	struct sched_domain *sd;
>  	int cpu = cpu_of(rq);
>  
> -	rcu_read_lock();
> +	rcu_read_lock_sched();
>  	raw_spin_unlock_irq(rq_lockp(rq));
>  	for_each_domain(cpu, sd) {
>  		if (!(sd->flags & SD_LOAD_BALANCE))
> @@ -4748,7 +4748,7 @@ static void sched_core_balance(struct rq *rq)
>  			break;
>  	}
>  	raw_spin_lock_irq(rq_lockp(rq));
> -	rcu_read_unlock();
> +	rcu_read_unlock_sched();
>  }
>  
>  static DEFINE_PER_CPU(struct callback_head, core_balance_head);
> -- 
> 2.25.1.481.gfbce0eb801-goog
> 


* Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
  2020-03-14  0:30 ` Paul E. McKenney
@ 2020-03-23  6:58   ` Li, Aubrey
  2020-03-23 15:21     ` Joel Fernandes
  0 siblings, 1 reply; 9+ messages in thread
From: Li, Aubrey @ 2020-03-23  6:58 UTC (permalink / raw)
  To: paulmck, Joel Fernandes (Google)
  Cc: linux-kernel, vpillai, Aaron Lu, Aubrey Li, peterz, Ben Segall,
	Dietmar Eggemann, Ingo Molnar, Juri Lelli, Mel Gorman,
	Steven Rostedt, Vincent Guittot

On 2020/3/14 8:30, Paul E. McKenney wrote:
> On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
>> rcu_read_unlock() can incur an infrequent deadlock in
>> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
>>
>> This fixes the following spinlock recursion observed when testing the
>> core scheduling patches on PREEMPT=y kernel on ChromeOS:
>>
>> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
>>
> 
> The original could indeed deadlock, and this would avoid that deadlock.
> (The commit to solve this deadlock is sadly not yet in mainline.)
> 
> Acked-by: Paul E. McKenney <paulmck@kernel.org>

I saw this in dmesg with this patch, is it expected?

Thanks,
-Aubrey

[  117.000905] =============================
[  117.000907] WARNING: suspicious RCU usage
[  117.000911] 5.5.7+ #160 Not tainted
[  117.000913] -----------------------------
[  117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
[  117.000918] 
               other info that might help us debug this:

[  117.000921] 
               rcu_scheduler_active = 2, debug_locks = 1
[  117.000923] 1 lock held by swapper/52/0:
[  117.000925]  #0: ffffffff82670960 (rcu_read_lock_sched){....}, at: sched_core_balance+0x5/0x700
[  117.000937] 
               stack backtrace:
[  117.000940] CPU: 52 PID: 0 Comm: swapper/52 Kdump: loaded Not tainted 5.5.7+ #160
[  117.000943] Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0412.020920172159 02/09/2017
[  117.000945] Call Trace:
[  117.000955]  dump_stack+0x86/0xcb
[  117.000962]  sched_core_balance+0x634/0x700
[  117.000982]  __balance_callback+0x49/0xa0
[  117.000990]  __schedule+0x1416/0x1620
[  117.001000]  ? lockdep_hardirqs_off+0xa0/0xe0
[  117.001005]  ? _raw_spin_unlock_irqrestore+0x41/0x70
[  117.001024]  schedule_idle+0x28/0x40
[  117.001030]  do_idle+0x17e/0x2a0
[  117.001041]  cpu_startup_entry+0x19/0x20
[  117.001048]  start_secondary+0x16c/0x1c0
[  117.001055]  secondary_startup_64+0xa4/0xb0

> 
>> ---
>>  kernel/sched/core.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 3045bd50e249..037e8f2e2686 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -4735,7 +4735,7 @@ static void sched_core_balance(struct rq *rq)
>>  	struct sched_domain *sd;
>>  	int cpu = cpu_of(rq);
>>  
>> -	rcu_read_lock();
>> +	rcu_read_lock_sched();
>>  	raw_spin_unlock_irq(rq_lockp(rq));
>>  	for_each_domain(cpu, sd) {
>>  		if (!(sd->flags & SD_LOAD_BALANCE))
>> @@ -4748,7 +4748,7 @@ static void sched_core_balance(struct rq *rq)
>>  			break;
>>  	}
>>  	raw_spin_lock_irq(rq_lockp(rq));
>> -	rcu_read_unlock();
>> +	rcu_read_unlock_sched();
>>  }
>>  
>>  static DEFINE_PER_CPU(struct callback_head, core_balance_head);
>> -- 
>> 2.25.1.481.gfbce0eb801-goog
>>



* Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
  2020-03-23  6:58   ` Li, Aubrey
@ 2020-03-23 15:21     ` Joel Fernandes
  2020-03-24  3:01       ` Li, Aubrey
  0 siblings, 1 reply; 9+ messages in thread
From: Joel Fernandes @ 2020-03-23 15:21 UTC (permalink / raw)
  To: Li, Aubrey
  Cc: paulmck, linux-kernel, vpillai, Aaron Lu, Aubrey Li, peterz,
	Ben Segall, Dietmar Eggemann, Ingo Molnar, Juri Lelli,
	Mel Gorman, Steven Rostedt, Vincent Guittot

On Mon, Mar 23, 2020 at 02:58:18PM +0800, Li, Aubrey wrote:
> On 2020/3/14 8:30, Paul E. McKenney wrote:
> > On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
> >> rcu_read_unlock() can incur an infrequent deadlock in
> >> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
> >>
> >> This fixes the following spinlock recursion observed when testing the
> >> core scheduling patches on PREEMPT=y kernel on ChromeOS:
> >>
> >> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
> >>
> > 
> > The original could indeed deadlock, and this would avoid that deadlock.
> > (The commit to solve this deadlock is sadly not yet in mainline.)
> > 
> > Acked-by: Paul E. McKenney <paulmck@kernel.org>
> 
> I saw this in dmesg with this patch, is it expected?
> 
> [  117.000905] =============================
> [  117.000907] WARNING: suspicious RCU usage
> [  117.000911] 5.5.7+ #160 Not tainted
> [  117.000913] -----------------------------
> [  117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
> [  117.000918] 
>                other info that might help us debug this:

Sigh, this is because for_each_domain() expects rcu_read_lock(). From an RCU
PoV, the code is correct (warning doesn't cause any issue).
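
(for_each_domain() is essentially a loop over rcu_dereference()'d
sched_domain pointers; quoting loosely from memory, so the exact
definition in the tree may differ:

	#define for_each_domain(cpu, __sd) \
		for (__sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd); \
		     __sd; __sd = __sd->parent)

and that check is only satisfied by rcu_read_lock() or holding
sched_domains_mutex, not by rcu_read_lock_sched(), hence the lockdep
splat.)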

To silence warning, we could replace the rcu_read_lock_sched() in my patch with:
preempt_disable();
rcu_read_lock();

and replace the unlock with:

rcu_read_unlock();
preempt_enable();

That should take care of both the warning and the scheduler-related
deadlock. Thoughts?
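
In context, sched_core_balance() would then look roughly like this (an
untested sketch of the suggestion, based on the hunks above; the elided
part of the loop is unchanged):

	static void sched_core_balance(struct rq *rq)
	{
		struct sched_domain *sd;
		int cpu = cpu_of(rq);

		preempt_disable();		/* reader can't be preempted -> no unlock wakeups */
		rcu_read_lock();		/* proper rcu_read_lock() for for_each_domain()   */
		raw_spin_unlock_irq(rq_lockp(rq));
		for_each_domain(cpu, sd) {
			if (!(sd->flags & SD_LOAD_BALANCE))
				break;
			/* ... */
			if (steal_cookie_task(cpu, sd))
				break;
		}
		raw_spin_lock_irq(rq_lockp(rq));
		rcu_read_unlock();
		preempt_enable();
	}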

Does that fix the warning for you? 

thanks,

 - Joel

> 
> [  117.000921] 
>                rcu_scheduler_active = 2, debug_locks = 1
> [  117.000923] 1 lock held by swapper/52/0:
> [  117.000925]  #0: ffffffff82670960 (rcu_read_lock_sched){....}, at: sched_core_balance+0x5/0x700
> [  117.000937] 
>                stack backtrace:
> [  117.000940] CPU: 52 PID: 0 Comm: swapper/52 Kdump: loaded Not tainted 5.5.7+ #160
> [  117.000943] Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0412.020920172159 02/09/2017
> [  117.000945] Call Trace:
> [  117.000955]  dump_stack+0x86/0xcb
> [  117.000962]  sched_core_balance+0x634/0x700
> [  117.000982]  __balance_callback+0x49/0xa0
> [  117.000990]  __schedule+0x1416/0x1620
> [  117.001000]  ? lockdep_hardirqs_off+0xa0/0xe0
> [  117.001005]  ? _raw_spin_unlock_irqrestore+0x41/0x70
> [  117.001024]  schedule_idle+0x28/0x40
> [  117.001030]  do_idle+0x17e/0x2a0
> [  117.001041]  cpu_startup_entry+0x19/0x20
> [  117.001048]  start_secondary+0x16c/0x1c0
> [  117.001055]  secondary_startup_64+0xa4/0xb0
> 
> > 
> >> ---
> >>  kernel/sched/core.c | 4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index 3045bd50e249..037e8f2e2686 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -4735,7 +4735,7 @@ static void sched_core_balance(struct rq *rq)
> >>  	struct sched_domain *sd;
> >>  	int cpu = cpu_of(rq);
> >>  
> >> -	rcu_read_lock();
> >> +	rcu_read_lock_sched();
> >>  	raw_spin_unlock_irq(rq_lockp(rq));
> >>  	for_each_domain(cpu, sd) {
> >>  		if (!(sd->flags & SD_LOAD_BALANCE))
> >> @@ -4748,7 +4748,7 @@ static void sched_core_balance(struct rq *rq)
> >>  			break;
> >>  	}
> >>  	raw_spin_lock_irq(rq_lockp(rq));
> >> -	rcu_read_unlock();
> >> +	rcu_read_unlock_sched();
> >>  }
> >>  
> >>  static DEFINE_PER_CPU(struct callback_head, core_balance_head);
> >> -- 
> >> 2.25.1.481.gfbce0eb801-goog
> >>
> 


* Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
  2020-03-23 15:21     ` Joel Fernandes
@ 2020-03-24  3:01       ` Li, Aubrey
  2020-03-24 13:30         ` Paul E. McKenney
  2020-03-24 18:49         ` Joel Fernandes
  0 siblings, 2 replies; 9+ messages in thread
From: Li, Aubrey @ 2020-03-24  3:01 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: paulmck, linux-kernel, vpillai, Aaron Lu, Aubrey Li, peterz,
	Ben Segall, Dietmar Eggemann, Ingo Molnar, Juri Lelli,
	Mel Gorman, Steven Rostedt, Vincent Guittot

On 2020/3/23 23:21, Joel Fernandes wrote:
> On Mon, Mar 23, 2020 at 02:58:18PM +0800, Li, Aubrey wrote:
>> On 2020/3/14 8:30, Paul E. McKenney wrote:
>>> On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
>>>> rcu_read_unlock() can incur an infrequent deadlock in
>>>> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
>>>>
>>>> This fixes the following spinlock recursion observed when testing the
>>>> core scheduling patches on PREEMPT=y kernel on ChromeOS:
>>>>
>>>> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
>>>>
>>>
>>> The original could indeed deadlock, and this would avoid that deadlock.
>>> (The commit to solve this deadlock is sadly not yet in mainline.)
>>>
>>> Acked-by: Paul E. McKenney <paulmck@kernel.org>
>>
>> I saw this in dmesg with this patch, is it expected?
>>
>> [  117.000905] =============================
>> [  117.000907] WARNING: suspicious RCU usage
>> [  117.000911] 5.5.7+ #160 Not tainted
>> [  117.000913] -----------------------------
>> [  117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
>> [  117.000918] 
>>                other info that might help us debug this:
> 
> Sigh, this is because for_each_domain() expects rcu_read_lock(). From an RCU
> PoV, the code is correct (warning doesn't cause any issue).
> 
> To silence warning, we could replace the rcu_read_lock_sched() in my patch with:
> preempt_disable();
> rcu_read_lock();
> 
> and replace the unlock with:
> 
> rcu_read_unlock();
> preempt_enable();
> 
> That should take care of both the warning and the scheduler-related
> deadlock. Thoughts?
> 

How about this?

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a01df3e..7ff694e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4743,7 +4743,6 @@ static void sched_core_balance(struct rq *rq)
 	int cpu = cpu_of(rq);
 
 	rcu_read_lock();
-	raw_spin_unlock_irq(rq_lockp(rq));
 	for_each_domain(cpu, sd) {
 		if (!(sd->flags & SD_LOAD_BALANCE))
 			break;
@@ -4754,7 +4753,6 @@ static void sched_core_balance(struct rq *rq)
 		if (steal_cookie_task(cpu, sd))
 			break;
 	}
-	raw_spin_lock_irq(rq_lockp(rq));
 	rcu_read_unlock();
 }

Thanks,
-Aubrey


* Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
  2020-03-24  3:01       ` Li, Aubrey
@ 2020-03-24 13:30         ` Paul E. McKenney
  2020-03-24 15:12           ` Paul E. McKenney
  2020-03-24 18:49         ` Joel Fernandes
  1 sibling, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2020-03-24 13:30 UTC (permalink / raw)
  To: Li, Aubrey
  Cc: Joel Fernandes, linux-kernel, vpillai, Aaron Lu, Aubrey Li,
	peterz, Ben Segall, Dietmar Eggemann, Ingo Molnar, Juri Lelli,
	Mel Gorman, Steven Rostedt, Vincent Guittot

On Tue, Mar 24, 2020 at 11:01:27AM +0800, Li, Aubrey wrote:
> On 2020/3/23 23:21, Joel Fernandes wrote:
> > On Mon, Mar 23, 2020 at 02:58:18PM +0800, Li, Aubrey wrote:
> >> On 2020/3/14 8:30, Paul E. McKenney wrote:
> >>> On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
> >>>> rcu_read_unlock() can incur an infrequent deadlock in
> >>>> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
> >>>>
> >>>> This fixes the following spinlock recursion observed when testing the
> >>>> core scheduling patches on PREEMPT=y kernel on ChromeOS:
> >>>>
> >>>> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
> >>>>
> >>>
> >>> The original could indeed deadlock, and this would avoid that deadlock.
> >>> (The commit to solve this deadlock is sadly not yet in mainline.)
> >>>
> >>> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> >>
> >> I saw this in dmesg with this patch, is it expected?
> >>
> >> [  117.000905] =============================
> >> [  117.000907] WARNING: suspicious RCU usage
> >> [  117.000911] 5.5.7+ #160 Not tainted
> >> [  117.000913] -----------------------------
> >> [  117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
> >> [  117.000918] 
> >>                other info that might help us debug this:
> > 
> > Sigh, this is because for_each_domain() expects rcu_read_lock(). From an RCU
> > PoV, the code is correct (warning doesn't cause any issue).
> > 
> > To silence warning, we could replace the rcu_read_lock_sched() in my patch with:
> > preempt_disable();
> > rcu_read_lock();
> > 
> > and replace the unlock with:
> > 
> > rcu_read_unlock();
> > preempt_enable();
> > 
> > That should take care of both the warning and the scheduler-related
> > deadlock. Thoughts?
> > 
> 
> How about this?
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index a01df3e..7ff694e 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4743,7 +4743,6 @@ static void sched_core_balance(struct rq *rq)
>  	int cpu = cpu_of(rq);
>  
>  	rcu_read_lock();
> -	raw_spin_unlock_irq(rq_lockp(rq));
>  	for_each_domain(cpu, sd) {
>  		if (!(sd->flags & SD_LOAD_BALANCE))
>  			break;
> @@ -4754,7 +4753,6 @@ static void sched_core_balance(struct rq *rq)
>  		if (steal_cookie_task(cpu, sd))
>  			break;
>  	}
> -	raw_spin_lock_irq(rq_lockp(rq));
>  	rcu_read_unlock();
>  }

As an alternative, I am backporting the -rcu commit 2b5e19e20fc2 ("rcu:
Make rcu_read_unlock_special() safe for rq/pi locks") to v5.6-rc6 and
testing it.  The other support for this is already in mainline.  I just
now started testing it, and will let you know how it goes.

If that works for you, and if the bug you are looking to fix is also
in v5.5 or earlier, please let me know so that we can work out how to
deal with the older releases.

							Thanx, Paul


* Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
  2020-03-24 13:30         ` Paul E. McKenney
@ 2020-03-24 15:12           ` Paul E. McKenney
  0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2020-03-24 15:12 UTC (permalink / raw)
  To: Li, Aubrey
  Cc: Joel Fernandes, linux-kernel, vpillai, Aaron Lu, Aubrey Li,
	peterz, Ben Segall, Dietmar Eggemann, Ingo Molnar, Juri Lelli,
	Mel Gorman, Steven Rostedt, Vincent Guittot

On Tue, Mar 24, 2020 at 06:30:12AM -0700, Paul E. McKenney wrote:
> On Tue, Mar 24, 2020 at 11:01:27AM +0800, Li, Aubrey wrote:
> > On 2020/3/23 23:21, Joel Fernandes wrote:
> > > On Mon, Mar 23, 2020 at 02:58:18PM +0800, Li, Aubrey wrote:
> > >> On 2020/3/14 8:30, Paul E. McKenney wrote:
> > >>> On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
> > >>>> rcu_read_unlock() can incur an infrequent deadlock in
> > >>>> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
> > >>>>
> > >>>> This fixes the following spinlock recursion observed when testing the
> > >>>> core scheduling patches on PREEMPT=y kernel on ChromeOS:
> > >>>>
> > >>>> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
> > >>>>
> > >>>
> > >>> The original could indeed deadlock, and this would avoid that deadlock.
> > >>> (The commit to solve this deadlock is sadly not yet in mainline.)
> > >>>
> > >>> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> > >>
> > >> I saw this in dmesg with this patch, is it expected?
> > >>
> > >> [  117.000905] =============================
> > >> [  117.000907] WARNING: suspicious RCU usage
> > >> [  117.000911] 5.5.7+ #160 Not tainted
> > >> [  117.000913] -----------------------------
> > >> [  117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
> > >> [  117.000918] 
> > >>                other info that might help us debug this:
> > > 
> > > Sigh, this is because for_each_domain() expects rcu_read_lock(). From an RCU
> > > PoV, the code is correct (warning doesn't cause any issue).
> > > 
> > > To silence warning, we could replace the rcu_read_lock_sched() in my patch with:
> > > preempt_disable();
> > > rcu_read_lock();
> > > 
> > > and replace the unlock with:
> > > 
> > > rcu_read_unlock();
> > > preempt_enable();
> > > 
> > > That should take care of both the warning and the scheduler-related
> > > deadlock. Thoughts?
> > > 
> > 
> > How about this?
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index a01df3e..7ff694e 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4743,7 +4743,6 @@ static void sched_core_balance(struct rq *rq)
> >  	int cpu = cpu_of(rq);
> >  
> >  	rcu_read_lock();
> > -	raw_spin_unlock_irq(rq_lockp(rq));
> >  	for_each_domain(cpu, sd) {
> >  		if (!(sd->flags & SD_LOAD_BALANCE))
> >  			break;
> > @@ -4754,7 +4753,6 @@ static void sched_core_balance(struct rq *rq)
> >  		if (steal_cookie_task(cpu, sd))
> >  			break;
> >  	}
> > -	raw_spin_lock_irq(rq_lockp(rq));
> >  	rcu_read_unlock();
> >  }
> 
> As an alternative, I am backporting the -rcu commit 2b5e19e20fc2 ("rcu:
> Make rcu_read_unlock_special() safe for rq/pi locks") to v5.6-rc6 and
> testing it.  The other support for this is already in mainline.  I just
> now started testing it, and will let you know how it goes.
> 
> If that works for you, and if the bug you are looking to fix is also
> in v5.5 or earlier, please let me know so that we can work out how to
> deal with the older releases.

And the backported patch below passes moderate rcutorture testing.
But the big question...  Does it work for you?

							Thanx, Paul

------------------------------------------------------------------------

commit 0ed0648e23f6aca2a6543fee011d74ab674e08e8
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Tue Mar 24 06:20:14 2020 -0700

    rcu: Make rcu_read_unlock_special() safe for rq/pi locks
    
    The scheduler is currently required to hold rq/pi locks across the entire
    RCU read-side critical section or not at all.  This is inconvenient and
    leaves traps for the unwary, including the author of this commit.
    
    But now that excessively long grace periods enable scheduling-clock
    interrupts for holdout nohz_full CPUs, the nohz_full rescue logic in
    rcu_read_unlock_special() can be dispensed with.  In other words, the
    rcu_read_unlock_special() function can refrain from doing wakeups unless
    such wakeups are guaranteed safe.
    
    This commit therefore avoids unsafe wakeups, freeing the scheduler to
    hold rq/pi locks across rcu_read_unlock() even if the corresponding RCU
    read-side critical section might have been preempted.
    
    This commit is inspired by a patch from Lai Jiangshan:
    https://lore.kernel.org/lkml/20191102124559.1135-2-laijs@linux.alibaba.com
    This commit is further intended to be a step towards his goal of permitting
    the inlining of RCU-preempt's rcu_read_lock() and rcu_read_unlock().
    
    Cc: Lai Jiangshan <laijs@linux.alibaba.com>
    [ paulmck: Backported to v5.6-rc6. ]
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index c6ea81cd4189..2e1bfa0cc393 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -614,18 +614,16 @@ static void rcu_read_unlock_special(struct task_struct *t)
 		struct rcu_node *rnp = rdp->mynode;
 
 		exp = (t->rcu_blocked_node && t->rcu_blocked_node->exp_tasks) ||
-		      (rdp->grpmask & READ_ONCE(rnp->expmask)) ||
-		      tick_nohz_full_cpu(rdp->cpu);
+		      (rdp->grpmask & READ_ONCE(rnp->expmask));
 		// Need to defer quiescent state until everything is enabled.
-		if (irqs_were_disabled && use_softirq &&
-		    (in_interrupt() ||
-		     (exp && !t->rcu_read_unlock_special.b.deferred_qs))) {
-			// Using softirq, safe to awaken, and we get
-			// no help from enabling irqs, unlike bh/preempt.
+		if (use_softirq && (in_irq() || (exp && !irqs_were_disabled))) {
+			// Using softirq, safe to awaken, and either the
+			// wakeup is free or there is an expedited GP.
 			raise_softirq_irqoff(RCU_SOFTIRQ);
 		} else {
 			// Enabling BH or preempt does reschedule, so...
-			// Also if no expediting or NO_HZ_FULL, slow is OK.
+			// Also if no expediting, slow is OK.
+			// Plus nohz_full CPUs eventually get tick enabled.
 			set_tsk_need_resched(current);
 			set_preempt_need_resched();
 			if (IS_ENABLED(CONFIG_IRQ_WORK) && irqs_were_disabled &&


* Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
  2020-03-24  3:01       ` Li, Aubrey
  2020-03-24 13:30         ` Paul E. McKenney
@ 2020-03-24 18:49         ` Joel Fernandes
  2020-03-25  0:40           ` Li, Aubrey
  1 sibling, 1 reply; 9+ messages in thread
From: Joel Fernandes @ 2020-03-24 18:49 UTC (permalink / raw)
  To: Li, Aubrey
  Cc: paulmck, linux-kernel, vpillai, Aaron Lu, Aubrey Li, peterz,
	Ben Segall, Dietmar Eggemann, Ingo Molnar, Juri Lelli,
	Mel Gorman, Steven Rostedt, Vincent Guittot

On Tue, Mar 24, 2020 at 11:01:27AM +0800, Li, Aubrey wrote:
> On 2020/3/23 23:21, Joel Fernandes wrote:
> > On Mon, Mar 23, 2020 at 02:58:18PM +0800, Li, Aubrey wrote:
> >> On 2020/3/14 8:30, Paul E. McKenney wrote:
> >>> On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
> >>>> rcu_read_unlock() can incur an infrequent deadlock in
> >>>> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
> >>>>
> >>>> This fixes the following spinlock recursion observed when testing the
> >>>> core scheduling patches on PREEMPT=y kernel on ChromeOS:
> >>>>
> >>>> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
> >>>>
> >>>
> >>> The original could indeed deadlock, and this would avoid that deadlock.
> >>> (The commit to solve this deadlock is sadly not yet in mainline.)
> >>>
> >>> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> >>
> >> I saw this in dmesg with this patch, is it expected?
> >>
> >> [  117.000905] =============================
> >> [  117.000907] WARNING: suspicious RCU usage
> >> [  117.000911] 5.5.7+ #160 Not tainted
> >> [  117.000913] -----------------------------
> >> [  117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
> >> [  117.000918] 
> >>                other info that might help us debug this:
> > 
> > Sigh, this is because for_each_domain() expects rcu_read_lock(). From an RCU
> > PoV, the code is correct (warning doesn't cause any issue).
> > 
> > To silence warning, we could replace the rcu_read_lock_sched() in my patch with:
> > preempt_disable();
> > rcu_read_lock();
> > 
> > and replace the unlock with:
> > 
> > rcu_read_unlock();
> > preempt_enable();
> > 
> > That should take care of both the warning and the scheduler-related
> > deadlock. Thoughts?
> > 
> 
> How about this?
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index a01df3e..7ff694e 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4743,7 +4743,6 @@ static void sched_core_balance(struct rq *rq)
>  	int cpu = cpu_of(rq);
>  
>  	rcu_read_lock();
> -	raw_spin_unlock_irq(rq_lockp(rq));
>  	for_each_domain(cpu, sd) {
>  		if (!(sd->flags & SD_LOAD_BALANCE))
>  			break;
> @@ -4754,7 +4753,6 @@ static void sched_core_balance(struct rq *rq)
>  		if (steal_cookie_task(cpu, sd))
>  			break;
>  	}
> -	raw_spin_lock_irq(rq_lockp(rq));

try_steal_cookie() does a double_rq_lock(). Would this change not deadlock
with that?
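
(For illustration, assuming steal_cookie_task() ends up in
try_steal_cookie() and hence double_rq_lock() as mentioned above, the
concern with the proposed change is roughly:

	rcu_read_lock();
	/* rq->lock is still held here once the raw_spin_unlock_irq() is gone */
	for_each_domain(cpu, sd) {
		/* ... */
		if (steal_cookie_task(cpu, sd))	/* -> try_steal_cookie()         */
			break;			/*    -> double_rq_lock(), which */
						/*       wants the rq->lock we   */
						/*       already hold            */
	}
	rcu_read_unlock();

Just a sketch of the locking, not the actual code.)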

thanks,

 - Joel



* Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic
  2020-03-24 18:49         ` Joel Fernandes
@ 2020-03-25  0:40           ` Li, Aubrey
  0 siblings, 0 replies; 9+ messages in thread
From: Li, Aubrey @ 2020-03-25  0:40 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: paulmck, linux-kernel, vpillai, Aaron Lu, Aubrey Li, peterz,
	Ben Segall, Dietmar Eggemann, Ingo Molnar, Juri Lelli,
	Mel Gorman, Steven Rostedt, Vincent Guittot

On 2020/3/25 2:49, Joel Fernandes wrote:
> On Tue, Mar 24, 2020 at 11:01:27AM +0800, Li, Aubrey wrote:
>> On 2020/3/23 23:21, Joel Fernandes wrote:
>>> On Mon, Mar 23, 2020 at 02:58:18PM +0800, Li, Aubrey wrote:
>>>> On 2020/3/14 8:30, Paul E. McKenney wrote:
>>>>> On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
>>>>>> rcu_read_unlock() can incur an infrequent deadlock in
>>>>>> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
>>>>>>
>>>>>> This fixes the following spinlock recursion observed when testing the
>>>>>> core scheduling patches on PREEMPT=y kernel on ChromeOS:
>>>>>>
>>>>>> [   14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
>>>>>>
>>>>>
>>>>> The original could indeed deadlock, and this would avoid that deadlock.
>>>>> (The commit to solve this deadlock is sadly not yet in mainline.)
>>>>>
>>>>> Acked-by: Paul E. McKenney <paulmck@kernel.org>
>>>>
>>>> I saw this in dmesg with this patch, is it expected?
>>>>
>>>> [  117.000905] =============================
>>>> [  117.000907] WARNING: suspicious RCU usage
>>>> [  117.000911] 5.5.7+ #160 Not tainted
>>>> [  117.000913] -----------------------------
>>>> [  117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
>>>> [  117.000918] 
>>>>                other info that might help us debug this:
>>>
>>> Sigh, this is because for_each_domain() expects rcu_read_lock(). From an RCU
>>> PoV, the code is correct (warning doesn't cause any issue).
>>>
>>> To silence warning, we could replace the rcu_read_lock_sched() in my patch with:
>>> preempt_disable();
>>> rcu_read_lock();
>>>
>>> and replace the unlock with:
>>>
>>> rcu_read_unlock();
>>> preempt_enable();
>>>
>>> That should take care of both the warning and the scheduler-related
>>> deadlock. Thoughts?
>>>
>>
>> How about this?
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index a01df3e..7ff694e 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -4743,7 +4743,6 @@ static void sched_core_balance(struct rq *rq)
>>  	int cpu = cpu_of(rq);
>>  
>>  	rcu_read_lock();
>> -	raw_spin_unlock_irq(rq_lockp(rq));
>>  	for_each_domain(cpu, sd) {
>>  		if (!(sd->flags & SD_LOAD_BALANCE))
>>  			break;
>> @@ -4754,7 +4753,6 @@ static void sched_core_balance(struct rq *rq)
>>  		if (steal_cookie_task(cpu, sd))
>>  			break;
>>  	}
>> -	raw_spin_lock_irq(rq_lockp(rq));
> 
> try_steal_cookie() does a double_rq_lock(). Would this change not deadlock
> with that?
> 
Oh yes, I missed the double_rq_lock() inside; I just wanted to keep local
interrupts disabled to avoid preemption. I will try Paul's patch and report back.

Thanks,
-Aubrey

