linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
@ 2019-08-01 13:37 Phil Auld
  2019-08-05 14:06 ` Phil Auld
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Phil Auld @ 2019-08-01 13:37 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot

Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
warning to fire in update_rq_clock. This seems to be caused by onlining
a new fair sched group not using the rq lock wrappers.

[472978.683085] rq->clock_update_flags & RQCF_UPDATED
[472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150
[472978.758465] Modules linked in: fuse vfat msdos fat ext4 mbcache jbd2 sunrpc iTCO_wdt gpio_ich iTCO_vendor_support intel_powerclamp coretemp kvm_intel kvm ipmi_ssif irqbypass ipmi_si crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev pcspkr intel_cstate acpi_power_meter ipmi_devintf sg intel_uncore i7core_edac hpilo hpwdt lpc_ich ipmi_msghandler acpi_cpufreq ip_tables xfs libcrc32c sr_mod sd_mod cdrom ata_generic radeon i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ata_piix drm myri10ge hpsa libata serio_raw crc32c_intel scsi_transport_sas netxen_nic dca dm_mirror dm_region_hash dm_log dm_mod
[472979.050101] CPU: 5 PID: 54385 Comm: cgcreate Not tainted 5.2.0-rc6+ #135
[472979.089586] Hardware name: HP ProLiant DL580 G7, BIOS P65 08/16/2015
[472979.123638] RIP: 0010:update_rq_clock+0xec/0x150
[472979.146435] Code: a8 04 0f 84 55 ff ff ff 80 3d 93 34 2c 01 00 0f 85 48 ff ff ff 48 c7 c7 08 b9 a8 b7 31 c0 c6 05 7d 34 2c 01 01 e8 04 21 fd ff <0f> 0b 8b 83 b0 09 00 00 e9 26 ff ff ff e8 72 c3 f5 ff 66 90 4c 8b
[472979.247671] RSP: 0018:ffffa9addac67d48 EFLAGS: 00010086
[472979.277071] RAX: 0000000000000000 RBX: ffff9db3ff62a380 RCX: 0000000000000000
[472979.314842] RDX: 0000000000000005 RSI: ffffffffb8434a45 RDI: ffffffffb84325ec
[472979.352189] RBP: 0000000000000000 R08: ffffffffb8434a20 R09: ffffffb27562d751
[472979.389326] R10: 000000000000d471 R11: ffffa9addac67a58 R12: ffff9d8a1c9d5a00
[472979.425255] R13: ffff9db3fbbed400 R14: 000000000002a380 R15: 0000000000000000
[472979.462417] FS:  00007f6a8218cb80(0000) GS:ffff9db3ff740000(0000) knlGS:0000000000000000
[472979.511306] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[472979.543386] CR2: 00007f6a82198000 CR3: 0000003f33f16005 CR4: 00000000000206e0
[472979.578646] Call Trace:
[472979.591702]  online_fair_sched_group+0x53/0x100
[472979.618172]  cpu_cgroup_css_online+0x16/0x20
[472979.640264]  online_css+0x1c/0x60
[472979.657648]  cgroup_apply_control_enable+0x231/0x3b0
[472979.684979]  cgroup_mkdir+0x41b/0x530
[472979.704845]  kernfs_iop_mkdir+0x61/0xa0
[472979.726278]  vfs_mkdir+0x108/0x1a0
[472979.745665]  do_mkdirat+0x77/0xe0
[472979.764981]  do_syscall_64+0x55/0x1d0
[472979.785990]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Using the wrappers in online_fair_sched_group instead of the raw locking 
removes this warning. 

Signed-off-by: Phil Auld <pauld@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
---
 Resend with PATCH instead of CHANGE in subject, and more recent upstream x86 backtrace.

 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 036be95a87e9..5c1299a5675c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10242,17 +10242,17 @@ void online_fair_sched_group(struct task_group *tg)
 {
 	struct sched_entity *se;
 	struct rq *rq;
+	struct rq_flags rf;
 	int i;
 
 	for_each_possible_cpu(i) {
 		rq = cpu_rq(i);
 		se = tg->se[i];
-
-		raw_spin_lock_irq(&rq->lock);
+		rq_lock(rq, &rf);
 		update_rq_clock(rq);
 		attach_entity_cfs_rq(se);
 		sync_throttle(tg, i);
-		raw_spin_unlock_irq(&rq->lock);
+		rq_unlock(rq, &rf);
 	}
 }
 
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
  2019-08-01 13:37 [PATCH] sched: use rq_lock/unlock in online_fair_sched_group Phil Auld
@ 2019-08-05 14:06 ` Phil Auld
  2019-08-06 13:03 ` Peter Zijlstra
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Phil Auld @ 2019-08-05 14:06 UTC (permalink / raw)
  To: Hillf Danton; +Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Vincent Guittot

On Fri, Aug 02, 2019 at 05:20:38PM +0800 Hillf Danton wrote:
> 
> On Thu,  1 Aug 2019 09:37:49 -0400 Phil Auld wrote:
> >
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> > warning to fire in update_rq_clock. This seems to be caused by onlining
> > a new fair sched group not using the rq lock wrappers.
> > 
> > [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> > [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150
> 
> Another option perhaps only if that wrappers are not mandatory.
> 
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -212,10 +212,14 @@ void update_rq_clock(struct rq *rq)
>  #endif
>  
>  	delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
> -	if (delta < 0)
> -		return;
> -	rq->clock += delta;
> -	update_rq_clock_task(rq, delta);
> +	if (delta >= 0) {
> +		rq->clock += delta;
> +		update_rq_clock_task(rq, delta);
> +	}
> +
> +#ifdef CONFIG_SCHED_DEBUG
> +	rq->clock_update_flags &= ~RQCF_UPDATED;
> +#endif
>  }
>  
>  
> --
> 

I think that would silence the warning, but...

If we're to clear that flag right there, outside of the lock pinning code, 
then I think we might as well just remove the flag and all associated 
comments etc, no?


Cheers,
Phil

-- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
  2019-08-01 13:37 [PATCH] sched: use rq_lock/unlock in online_fair_sched_group Phil Auld
  2019-08-05 14:06 ` Phil Auld
@ 2019-08-06 13:03 ` Peter Zijlstra
  2019-08-06 13:58   ` Phil Auld
  2019-08-09 13:33   ` Phil Auld
  2019-08-08 11:01 ` [tip:sched/core] sched/fair: Use " tip-bot for Phil Auld
  2019-08-12 12:52 ` tip-bot for Phil Auld
  3 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2019-08-06 13:03 UTC (permalink / raw)
  To: Phil Auld; +Cc: linux-kernel, Ingo Molnar, Vincent Guittot

On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes

ISTR there were more issues; but it sure is good to start picking them
off.

> warning to fire in update_rq_clock. This seems to be caused by onlining
> a new fair sched group not using the rq lock wrappers.
> 
> [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150

> Using the wrappers in online_fair_sched_group instead of the raw locking 
> removes this warning. 

Yeah, that seems sane. Thanks!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
  2019-08-06 13:03 ` Peter Zijlstra
@ 2019-08-06 13:58   ` Phil Auld
  2019-08-09 13:33   ` Phil Auld
  1 sibling, 0 replies; 14+ messages in thread
From: Phil Auld @ 2019-08-06 13:58 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Vincent Guittot

On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> 
> ISTR there were more issues; but it sure is good to start picking them
> off.

I haven't hit any others but if/when I'll try to dig into them. 

> 
> > warning to fire in update_rq_clock. This seems to be caused by onlining
> > a new fair sched group not using the rq lock wrappers.
> > 
> > [472978.683085] rq->clock_update_flags & RQCF_UPDATED
> > [472978.683100] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150
> 
> > Using the wrappers in online_fair_sched_group instead of the raw locking 
> > removes this warning. 
> 
> Yeah, that seems sane. Thanks!

Thanks,
Phil

-- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [tip:sched/core] sched/fair: Use rq_lock/unlock in online_fair_sched_group
  2019-08-01 13:37 [PATCH] sched: use rq_lock/unlock in online_fair_sched_group Phil Auld
  2019-08-05 14:06 ` Phil Auld
  2019-08-06 13:03 ` Peter Zijlstra
@ 2019-08-08 11:01 ` tip-bot for Phil Auld
  2019-08-09 16:21   ` Dietmar Eggemann
  2019-08-12 12:52 ` tip-bot for Phil Auld
  3 siblings, 1 reply; 14+ messages in thread
From: tip-bot for Phil Auld @ 2019-08-08 11:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, mingo, linux-kernel, hpa, mingo, peterz, vincent.guittot, pauld

Commit-ID:  6b8fd01b21f5f2701b407a7118f236ba4c41226d
Gitweb:     https://git.kernel.org/tip/6b8fd01b21f5f2701b407a7118f236ba4c41226d
Author:     Phil Auld <pauld@redhat.com>
AuthorDate: Thu, 1 Aug 2019 09:37:49 -0400
Committer:  Peter Zijlstra <peterz@infradead.org>
CommitDate: Thu, 8 Aug 2019 09:09:31 +0200

sched/fair: Use rq_lock/unlock in online_fair_sched_group

Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
warning to fire in update_rq_clock. This seems to be caused by onlining
a new fair sched group not using the rq lock wrappers.

  [] rq->clock_update_flags & RQCF_UPDATED
  [] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150

  [] Call Trace:
  []  online_fair_sched_group+0x53/0x100
  []  cpu_cgroup_css_online+0x16/0x20
  []  online_css+0x1c/0x60
  []  cgroup_apply_control_enable+0x231/0x3b0
  []  cgroup_mkdir+0x41b/0x530
  []  kernfs_iop_mkdir+0x61/0xa0
  []  vfs_mkdir+0x108/0x1a0
  []  do_mkdirat+0x77/0xe0
  []  do_syscall_64+0x55/0x1d0
  []  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Using the wrappers in online_fair_sched_group instead of the raw locking
removes this warning.

Signed-off-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20190801133749.11033-1-pauld@redhat.com
---
 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 19c58599e967..d9407517dae9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10281,18 +10281,18 @@ err:
 void online_fair_sched_group(struct task_group *tg)
 {
 	struct sched_entity *se;
+	struct rq_flags rf;
 	struct rq *rq;
 	int i;
 
 	for_each_possible_cpu(i) {
 		rq = cpu_rq(i);
 		se = tg->se[i];
-
-		raw_spin_lock_irq(&rq->lock);
+		rq_lock(rq, &rf);
 		update_rq_clock(rq);
 		attach_entity_cfs_rq(se);
 		sync_throttle(tg, i);
-		raw_spin_unlock_irq(&rq->lock);
+		rq_unlock(rq, &rf);
 	}
 }
 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
  2019-08-06 13:03 ` Peter Zijlstra
  2019-08-06 13:58   ` Phil Auld
@ 2019-08-09 13:33   ` Phil Auld
  2019-08-09 17:43     ` Valentin Schneider
  1 sibling, 1 reply; 14+ messages in thread
From: Phil Auld @ 2019-08-09 13:33 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Vincent Guittot

On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> > Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> 
> ISTR there were more issues; but it sure is good to start picking them
> off.
> 

Following up on this I hit another in rt.c which looks like:

[  156.348854] Call Trace:
[  156.351301]  <IRQ>
[  156.353322]  sched_rt_period_timer+0x124/0x350
[  156.357766]  ? sched_rt_rq_enqueue+0x90/0x90
[  156.362037]  __hrtimer_run_queues+0xfb/0x270
[  156.366303]  hrtimer_interrupt+0x122/0x270
[  156.370403]  smp_apic_timer_interrupt+0x6a/0x140
[  156.375022]  apic_timer_interrupt+0xf/0x20
[  156.379119]  </IRQ>

It looks like the same issue of not using the rq_lock* wrappers and
hence not using the pinning. From looking at the code there is at 
least one potential hit in deadline.c in the push_dl_task path with 
find_lock_later_rq but I have not hit that in practice.

This commit, which introduced the warning, seems to imply that the use
of the rq_lock* wrappers is required, at least for any sections that will
call update_rq_clock:

commit 26ae58d23b94a075ae724fd18783a3773131cfbc
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Mon Oct 3 16:53:49 2016 +0200

    sched/core: Add WARNING for multiple update_rq_clock() calls
    
    Now that we have no missing calls, add a warning to find multiple
    calls.
    
    By having only a single update_rq_clock() call per rq-lock section,
    the section appears 'atomic' wrt time.


Is that the case? Otherwise we have these false positives.

I can spin up patches if so. 


Thanks,
Phil


-- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [tip:sched/core] sched/fair: Use rq_lock/unlock in online_fair_sched_group
  2019-08-08 11:01 ` [tip:sched/core] sched/fair: Use " tip-bot for Phil Auld
@ 2019-08-09 16:21   ` Dietmar Eggemann
  2019-08-09 17:28     ` Phil Auld
  0 siblings, 1 reply; 14+ messages in thread
From: Dietmar Eggemann @ 2019-08-09 16:21 UTC (permalink / raw)
  To: pauld, vincent.guittot, hpa, linux-kernel, peterz, mingo, tglx,
	mingo, linux-tip-commits

On 8/8/19 1:01 PM, tip-bot for Phil Auld wrote:

[...]

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 19c58599e967..d9407517dae9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10281,18 +10281,18 @@ err:
>  void online_fair_sched_group(struct task_group *tg)
>  {
>  	struct sched_entity *se;
> +	struct rq_flags rf;
>  	struct rq *rq;
>  	int i;
>  
>  	for_each_possible_cpu(i) {
>  		rq = cpu_rq(i);
>  		se = tg->se[i];
> -
> -		raw_spin_lock_irq(&rq->lock);
> +		rq_lock(rq, &rf);
>  		update_rq_clock(rq);
>  		attach_entity_cfs_rq(se);
>  		sync_throttle(tg, i);
> -		raw_spin_unlock_irq(&rq->lock);
> +		rq_unlock(rq, &rf);
>  	}
>  }

Shouldn't this be:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d9407517dae9..1054d2cf6aaa 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10288,11 +10288,11 @@ void online_fair_sched_group(struct task_group
*tg)
        for_each_possible_cpu(i) {
                rq = cpu_rq(i);
                se = tg->se[i];
-               rq_lock(rq, &rf);
+               rq_lock_irq(rq, &rf);
                update_rq_clock(rq);
                attach_entity_cfs_rq(se);
                sync_throttle(tg, i);
-               rq_unlock(rq, &rf);
+               rq_unlock_irq(rq, &rf);
        }
 }

Currently, you should get a 'inconsistent lock state' warning with
CONFIG_PROVE_LOCKING.

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [tip:sched/core] sched/fair: Use rq_lock/unlock in online_fair_sched_group
  2019-08-09 16:21   ` Dietmar Eggemann
@ 2019-08-09 17:28     ` Phil Auld
  2019-08-12  9:56       ` Dietmar Eggemann
  0 siblings, 1 reply; 14+ messages in thread
From: Phil Auld @ 2019-08-09 17:28 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: vincent.guittot, hpa, linux-kernel, peterz, mingo, tglx, mingo,
	linux-tip-commits

On Fri, Aug 09, 2019 at 06:21:22PM +0200 Dietmar Eggemann wrote:
> On 8/8/19 1:01 PM, tip-bot for Phil Auld wrote:
> 
> [...]
> 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 19c58599e967..d9407517dae9 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -10281,18 +10281,18 @@ err:
> >  void online_fair_sched_group(struct task_group *tg)
> >  {
> >  	struct sched_entity *se;
> > +	struct rq_flags rf;
> >  	struct rq *rq;
> >  	int i;
> >  
> >  	for_each_possible_cpu(i) {
> >  		rq = cpu_rq(i);
> >  		se = tg->se[i];
> > -
> > -		raw_spin_lock_irq(&rq->lock);
> > +		rq_lock(rq, &rf);
> >  		update_rq_clock(rq);
> >  		attach_entity_cfs_rq(se);
> >  		sync_throttle(tg, i);
> > -		raw_spin_unlock_irq(&rq->lock);
> > +		rq_unlock(rq, &rf);
> >  	}
> >  }
> 
> Shouldn't this be:
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d9407517dae9..1054d2cf6aaa 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10288,11 +10288,11 @@ void online_fair_sched_group(struct task_group
> *tg)
>         for_each_possible_cpu(i) {
>                 rq = cpu_rq(i);
>                 se = tg->se[i];
> -               rq_lock(rq, &rf);
> +               rq_lock_irq(rq, &rf);
>                 update_rq_clock(rq);
>                 attach_entity_cfs_rq(se);
>                 sync_throttle(tg, i);
> -               rq_unlock(rq, &rf);
> +               rq_unlock_irq(rq, &rf);
>         }
>  }
> 
> Currently, you should get a 'inconsistent lock state' warning with
> CONFIG_PROVE_LOCKING.

Yes, indeed. Sorry about that. Maybe it can be fixed in tip before 
it gets any farther?  Or do we need a new patch?


Cheers,
Phil

-- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
  2019-08-09 13:33   ` Phil Auld
@ 2019-08-09 17:43     ` Valentin Schneider
  2019-08-15 13:39       ` Phil Auld
  0 siblings, 1 reply; 14+ messages in thread
From: Valentin Schneider @ 2019-08-09 17:43 UTC (permalink / raw)
  To: Phil Auld, Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Vincent Guittot

On 09/08/2019 14:33, Phil Auld wrote:
> On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
>> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
>>> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
>>
>> ISTR there were more issues; but it sure is good to start picking them
>> off.
>>
> 
> Following up on this I hit another in rt.c which looks like:
> 
> [  156.348854] Call Trace:
> [  156.351301]  <IRQ>
> [  156.353322]  sched_rt_period_timer+0x124/0x350
> [  156.357766]  ? sched_rt_rq_enqueue+0x90/0x90
> [  156.362037]  __hrtimer_run_queues+0xfb/0x270
> [  156.366303]  hrtimer_interrupt+0x122/0x270
> [  156.370403]  smp_apic_timer_interrupt+0x6a/0x140
> [  156.375022]  apic_timer_interrupt+0xf/0x20
> [  156.379119]  </IRQ>
> 
> It looks like the same issue of not using the rq_lock* wrappers and
> hence not using the pinning. From looking at the code there is at 
> least one potential hit in deadline.c in the push_dl_task path with 
> find_lock_later_rq but I have not hit that in practice.
> 
> This commit, which introduced the warning, seems to imply that the use
> of the rq_lock* wrappers is required, at least for any sections that will
> call update_rq_clock:
> 
> commit 26ae58d23b94a075ae724fd18783a3773131cfbc
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Mon Oct 3 16:53:49 2016 +0200
> 
>     sched/core: Add WARNING for multiple update_rq_clock() calls
>     
>     Now that we have no missing calls, add a warning to find multiple
>     calls.
>     
>     By having only a single update_rq_clock() call per rq-lock section,
>     the section appears 'atomic' wrt time.
> 
> 
> Is that the case? Otherwise we have these false positives.
> 

Looks like it - only rq_pin_lock() clears RQCF_UPDATED, so any
update_rq_clock() that isn't preceded by that function will still have
RQCF_UPDATED set the second time it's executed and will trigger the warn.

Seeing as the wrappers boil down to raw_spin_*() when the debug bits are
disabled, I don't see why we wouldn't want to convert these callsites.

> I can spin up patches if so. 
> 
> 
> Thanks,
> Phil
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [tip:sched/core] sched/fair: Use rq_lock/unlock in online_fair_sched_group
  2019-08-09 17:28     ` Phil Auld
@ 2019-08-12  9:56       ` Dietmar Eggemann
  0 siblings, 0 replies; 14+ messages in thread
From: Dietmar Eggemann @ 2019-08-12  9:56 UTC (permalink / raw)
  To: Phil Auld
  Cc: vincent.guittot, hpa, linux-kernel, peterz, mingo, tglx, mingo,
	linux-tip-commits

On 8/9/19 7:28 PM, Phil Auld wrote:
> On Fri, Aug 09, 2019 at 06:21:22PM +0200 Dietmar Eggemann wrote:
>> On 8/8/19 1:01 PM, tip-bot for Phil Auld wrote:

[...]

>> Shouldn't this be:
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index d9407517dae9..1054d2cf6aaa 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -10288,11 +10288,11 @@ void online_fair_sched_group(struct task_group
>> *tg)
>>         for_each_possible_cpu(i) {
>>                 rq = cpu_rq(i);
>>                 se = tg->se[i];
>> -               rq_lock(rq, &rf);
>> +               rq_lock_irq(rq, &rf);
>>                 update_rq_clock(rq);
>>                 attach_entity_cfs_rq(se);
>>                 sync_throttle(tg, i);
>> -               rq_unlock(rq, &rf);
>> +               rq_unlock_irq(rq, &rf);
>>         }
>>  }
>>
>> Currently, you should get a 'inconsistent lock state' warning with
>> CONFIG_PROVE_LOCKING.
> 
> Yes, indeed. Sorry about that. Maybe it can be fixed in tip before 
> it gets any farther?  Or do we need a new patch?

I think Peter is on holiday so maybe you can send a new patch and ask
Ingo or Thomas to replace your original patch on tip sched/core?




^ permalink raw reply	[flat|nested] 14+ messages in thread

* [tip:sched/core] sched/fair: Use rq_lock/unlock in online_fair_sched_group
  2019-08-01 13:37 [PATCH] sched: use rq_lock/unlock in online_fair_sched_group Phil Auld
                   ` (2 preceding siblings ...)
  2019-08-08 11:01 ` [tip:sched/core] sched/fair: Use " tip-bot for Phil Auld
@ 2019-08-12 12:52 ` tip-bot for Phil Auld
  2019-08-12 13:13   ` Phil Auld
  3 siblings, 1 reply; 14+ messages in thread
From: tip-bot for Phil Auld @ 2019-08-12 12:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: pauld, tglx, mingo, peterz, vincent.guittot, mingo, linux-kernel, hpa

Commit-ID:  a46d14eca7b75fffe35603aa8b81df654353d80f
Gitweb:     https://git.kernel.org/tip/a46d14eca7b75fffe35603aa8b81df654353d80f
Author:     Phil Auld <pauld@redhat.com>
AuthorDate: Thu, 1 Aug 2019 09:37:49 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 12 Aug 2019 14:45:34 +0200

sched/fair: Use rq_lock/unlock in online_fair_sched_group

Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
warning to fire in update_rq_clock. This seems to be caused by onlining
a new fair sched group not using the rq lock wrappers.

  [] rq->clock_update_flags & RQCF_UPDATED
  [] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150

  [] Call Trace:
  []  online_fair_sched_group+0x53/0x100
  []  cpu_cgroup_css_online+0x16/0x20
  []  online_css+0x1c/0x60
  []  cgroup_apply_control_enable+0x231/0x3b0
  []  cgroup_mkdir+0x41b/0x530
  []  kernfs_iop_mkdir+0x61/0xa0
  []  vfs_mkdir+0x108/0x1a0
  []  do_mkdirat+0x77/0xe0
  []  do_syscall_64+0x55/0x1d0
  []  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Using the wrappers in online_fair_sched_group instead of the raw locking
removes this warning.

[ tglx: Use rq_*lock_irq() ]

Signed-off-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20190801133749.11033-1-pauld@redhat.com
---
 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 19c58599e967..1054d2cf6aaa 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10281,18 +10281,18 @@ err:
 void online_fair_sched_group(struct task_group *tg)
 {
 	struct sched_entity *se;
+	struct rq_flags rf;
 	struct rq *rq;
 	int i;
 
 	for_each_possible_cpu(i) {
 		rq = cpu_rq(i);
 		se = tg->se[i];
-
-		raw_spin_lock_irq(&rq->lock);
+		rq_lock_irq(rq, &rf);
 		update_rq_clock(rq);
 		attach_entity_cfs_rq(se);
 		sync_throttle(tg, i);
-		raw_spin_unlock_irq(&rq->lock);
+		rq_unlock_irq(rq, &rf);
 	}
 }
 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [tip:sched/core] sched/fair: Use rq_lock/unlock in online_fair_sched_group
  2019-08-12 12:52 ` tip-bot for Phil Auld
@ 2019-08-12 13:13   ` Phil Auld
  0 siblings, 0 replies; 14+ messages in thread
From: Phil Auld @ 2019-08-12 13:13 UTC (permalink / raw)
  To: tglx; +Cc: peterz, mingo, hpa, linux-kernel, mingo, vincent.guittot

On Mon, Aug 12, 2019 at 05:52:04AM -0700 tip-bot for Phil Auld wrote:
> Commit-ID:  a46d14eca7b75fffe35603aa8b81df654353d80f
> Gitweb:     https://git.kernel.org/tip/a46d14eca7b75fffe35603aa8b81df654353d80f
> Author:     Phil Auld <pauld@redhat.com>
> AuthorDate: Thu, 1 Aug 2019 09:37:49 -0400
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Mon, 12 Aug 2019 14:45:34 +0200
> 
> sched/fair: Use rq_lock/unlock in online_fair_sched_group
> 
> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> warning to fire in update_rq_clock. This seems to be caused by onlining
> a new fair sched group not using the rq lock wrappers.
> 
>   [] rq->clock_update_flags & RQCF_UPDATED
>   [] WARNING: CPU: 5 PID: 54385 at kernel/sched/core.c:210 update_rq_clock+0xec/0x150
> 
>   [] Call Trace:
>   []  online_fair_sched_group+0x53/0x100
>   []  cpu_cgroup_css_online+0x16/0x20
>   []  online_css+0x1c/0x60
>   []  cgroup_apply_control_enable+0x231/0x3b0
>   []  cgroup_mkdir+0x41b/0x530
>   []  kernfs_iop_mkdir+0x61/0xa0
>   []  vfs_mkdir+0x108/0x1a0
>   []  do_mkdirat+0x77/0xe0
>   []  do_syscall_64+0x55/0x1d0
>   []  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Using the wrappers in online_fair_sched_group instead of the raw locking
> removes this warning.
> 
> [ tglx: Use rq_*lock_irq() ]
> 
> Signed-off-by: Phil Auld <pauld@redhat.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Link: https://lkml.kernel.org/r/20190801133749.11033-1-pauld@redhat.com
> ---
>  kernel/sched/fair.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 19c58599e967..1054d2cf6aaa 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10281,18 +10281,18 @@ err:
>  void online_fair_sched_group(struct task_group *tg)
>  {
>  	struct sched_entity *se;
> +	struct rq_flags rf;
>  	struct rq *rq;
>  	int i;
>  
>  	for_each_possible_cpu(i) {
>  		rq = cpu_rq(i);
>  		se = tg->se[i];
> -
> -		raw_spin_lock_irq(&rq->lock);
> +		rq_lock_irq(rq, &rf);
>  		update_rq_clock(rq);
>  		attach_entity_cfs_rq(se);
>  		sync_throttle(tg, i);
> -		raw_spin_unlock_irq(&rq->lock);
> +		rq_unlock_irq(rq, &rf);
>  	}
>  }
>  

Thanks Thomas!

-- 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
  2019-08-09 17:43     ` Valentin Schneider
@ 2019-08-15 13:39       ` Phil Auld
  0 siblings, 0 replies; 14+ messages in thread
From: Phil Auld @ 2019-08-15 13:39 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Vincent Guittot

On Fri, Aug 09, 2019 at 06:43:09PM +0100 Valentin Schneider wrote:
> On 09/08/2019 14:33, Phil Auld wrote:
> > On Tue, Aug 06, 2019 at 03:03:34PM +0200 Peter Zijlstra wrote:
> >> On Thu, Aug 01, 2019 at 09:37:49AM -0400, Phil Auld wrote:
> >>> Enabling WARN_DOUBLE_CLOCK in /sys/kernel/debug/sched_features causes
> >>
> >> ISTR there were more issues; but it sure is good to start picking them
> >> off.
> >>
> > 
> > Following up on this I hit another in rt.c which looks like:
> > 
> > [  156.348854] Call Trace:
> > [  156.351301]  <IRQ>
> > [  156.353322]  sched_rt_period_timer+0x124/0x350
> > [  156.357766]  ? sched_rt_rq_enqueue+0x90/0x90
> > [  156.362037]  __hrtimer_run_queues+0xfb/0x270
> > [  156.366303]  hrtimer_interrupt+0x122/0x270
> > [  156.370403]  smp_apic_timer_interrupt+0x6a/0x140
> > [  156.375022]  apic_timer_interrupt+0xf/0x20
> > [  156.379119]  </IRQ>
> > 
> > It looks like the same issue of not using the rq_lock* wrappers and
> > hence not using the pinning. From looking at the code there is at 
> > least one potential hit in deadline.c in the push_dl_task path with 
> > find_lock_later_rq but I have not hit that in practice.
> > 
> > This commit, which introduced the warning, seems to imply that the use
> > of the rq_lock* wrappers is required, at least for any sections that will
> > call update_rq_clock:
> > 
> > commit 26ae58d23b94a075ae724fd18783a3773131cfbc
> > Author: Peter Zijlstra <peterz@infradead.org>
> > Date:   Mon Oct 3 16:53:49 2016 +0200
> > 
> >     sched/core: Add WARNING for multiple update_rq_clock() calls
> >     
> >     Now that we have no missing calls, add a warning to find multiple
> >     calls.
> >     
> >     By having only a single update_rq_clock() call per rq-lock section,
> >     the section appears 'atomic' wrt time.
> > 
> > 
> > Is that the case? Otherwise we have these false positives.
> > 
> 
> Looks like it - only rq_pin_lock() clears RQCF_UPDATED, so any
> update_rq_clock() that isn't preceded by that function will still have
> RQCF_UPDATED set the second time it's executed and will trigger the warn.
> 
> Seeing as the wrappers boil down to raw_spin_*() when the debug bits are
> disabled, I don't see why we wouldn't want to convert these callsites.
> 

The one above is easy enough.  After that I hit one related to the double_rq_lock
paths. Now I see why that was not cleaned up already. That's going to be a 
bit messier and will require some study. 

I'll post this trivial anyway. 

Cheers,
Phil

-- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched: use rq_lock/unlock in online_fair_sched_group
       [not found] <20190806060416.11440-1-hdanton@sina.com>
@ 2019-08-06 12:58 ` Phil Auld
  0 siblings, 0 replies; 14+ messages in thread
From: Phil Auld @ 2019-08-06 12:58 UTC (permalink / raw)
  To: Hillf Danton; +Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Vincent Guittot

On Tue, Aug 06, 2019 at 02:04:16PM +0800 Hillf Danton wrote:
> 
> On Mon, 5 Aug 2019 22:07:05 +0800 Phil Auld wrote:
> >
> > If we're to clear that flag right there, outside of the lock pinning code,
> > then I think we might as well just remove the flag and all associated
> > comments etc, no?
> 
> A diff may tell the Peter folks more about your thoughts?
> 

I provided a diff with my thoughts of how to remove this warning in
the original post :)

This comment was about your patch which, to my mind, makes the flag 
meaningless and so could just remove the whole thing. I was not 
proposing to actually do that. I assumed it was there because it was
thought to be useful. Although, if that is what people want I could 
certainly spin up a patch to that effect. 


Cheers,
Phil

> Hillf
> 

-- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-08-15 13:39 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-01 13:37 [PATCH] sched: use rq_lock/unlock in online_fair_sched_group Phil Auld
2019-08-05 14:06 ` Phil Auld
2019-08-06 13:03 ` Peter Zijlstra
2019-08-06 13:58   ` Phil Auld
2019-08-09 13:33   ` Phil Auld
2019-08-09 17:43     ` Valentin Schneider
2019-08-15 13:39       ` Phil Auld
2019-08-08 11:01 ` [tip:sched/core] sched/fair: Use " tip-bot for Phil Auld
2019-08-09 16:21   ` Dietmar Eggemann
2019-08-09 17:28     ` Phil Auld
2019-08-12  9:56       ` Dietmar Eggemann
2019-08-12 12:52 ` tip-bot for Phil Auld
2019-08-12 13:13   ` Phil Auld
     [not found] <20190806060416.11440-1-hdanton@sina.com>
2019-08-06 12:58 ` [PATCH] sched: use " Phil Auld

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).