* Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#895 (master - 2a3984b)
[not found] ` <5d4d7164795c7_43f9afa8b58b0242711@29afa0b1-fa00-407e-a40e-a8edb471126a.mail>
@ 2019-08-09 21:21 ` Nick Desaulniers
2019-08-10 3:58 ` Nathan Chancellor
2019-08-12 12:54 ` Will Deacon
0 siblings, 2 replies; 5+ messages in thread
From: Nick Desaulniers @ 2019-08-09 21:21 UTC
To: linux-next, Stephen Rothwell, Will Deacon, Catalin Marinas, pauld
Cc: Linux ARM, Mark Brown, Mark Rutland, Arnd Bergmann, Peter Zijlstra
Did anyone report any issue with last night's -next for arm64?
Some kind of deadlock in online_fair_sched_group.
[ 15.256790] ================================
[ 15.257025] WARNING: inconsistent lock state
[ 15.257243] 5.3.0-rc3-next-20190809 #1 Not tainted
[ 15.257393] --------------------------------
[ 15.257526] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 15.258096] init/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 15.258522] (____ptrval____) (&rq->lock){?.-.}, at:
online_fair_sched_group+0x78/0xe4
[ 15.259170] {IN-HARDIRQ-W} state was registered at:
[ 15.259658] lock_acquire+0x1dc/0x228
[ 15.259940] _raw_spin_lock+0x40/0x54
[ 15.260251] scheduler_tick+0x50/0xfc
[ 15.260491] update_process_times+0x80/0x98
[ 15.260677] tick_periodic+0xd8/0xf0
[ 15.260910] tick_handle_periodic+0x30/0x94
[ 15.261126] arch_timer_handler_virt+0x34/0x40
[ 15.261332] handle_percpu_devid_irq+0x1a8/0x3c4
[ 15.261495] __handle_domain_irq+0x7c/0xbc
[ 15.261689] gic_handle_irq+0x48/0xac
[ 15.261881] el1_irq+0xbc/0x180
[ 15.262024] _raw_spin_unlock_irqrestore+0x4c/0x80
[ 15.262263] tty_register_ldisc+0x58/0x6c
[ 15.262430] n_tty_init+0x18/0x20
[ 15.262615] console_init+0x20/0x3e4
[ 15.262820] start_kernel+0x248/0x3c4
[ 15.263079] irq event stamp: 220201
[ 15.263362] hardirqs last enabled at (220201):
[<ffff000010e1f334>] _raw_spin_unlock_irqrestore+0x48/0x80
[ 15.263731] hardirqs last disabled at (220200):
[<ffff000010e1f190>] _raw_spin_lock_irqsave+0x30/0x7c
[ 15.264046] softirqs last enabled at (220196):
[<ffff0000100f84c0>] irq_exit+0x114/0x134
[ 15.264419] softirqs last disabled at (220185):
[<ffff0000100f84c0>] irq_exit+0x114/0x134
[ 15.264751]
[ 15.264751] other info that might help us debug this:
[ 15.265044] Possible unsafe locking scenario:
[ 15.265044]
[ 15.265256] CPU0
[ 15.265458] ----
[ 15.265615] lock(&rq->lock);
[ 15.265898] <Interrupt>
[ 15.266087] lock(&rq->lock);
[ 15.266353]
[ 15.266353] *** DEADLOCK ***
[ 15.266353]
[ 15.266574] no locks held by init/1.
[ 15.266784]
[ 15.266784] stack backtrace:
[ 15.267120] CPU: 0 PID: 1 Comm: init Not tainted 5.3.0-rc3-next-20190809 #1
[ 15.267341] Hardware name: linux,dummy-virt (DT)
[ 15.267756] Call trace:
[ 15.267928] dump_backtrace+0x0/0x140
[ 15.268159] show_stack+0x14/0x1c
[ 15.268341] dump_stack+0xa8/0x104
[ 15.268482] mark_lock+0xda0/0xda8
[ 15.268728] __lock_acquire+0x300/0x858
[ 15.268869] lock_acquire+0x1dc/0x228
[ 15.269057] _raw_spin_lock+0x40/0x54
[ 15.269201] online_fair_sched_group+0x78/0xe4
[ 15.269392] sched_online_group+0x88/0xac
[ 15.269591] sched_autogroup_create_attach+0xcc/0x12c
[ 15.269765] ksys_setsid+0xe8/0xec
[ 15.269990] __arm64_sys_setsid+0xc/0x18
[ 15.270178] el0_svc_common+0x9c/0x15c
[ 15.270317] el0_svc_handler+0x5c/0x64
[ 15.270493] el0_svc+0x8/0xc
https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/223856448
Guessing related to
commit 6b8fd01b21f5 ("sched/fair: Use rq_lock/unlock in
online_fair_sched_group")
---------- Forwarded message ---------
From: Travis CI <builds@travis-ci.com>
Date: Fri, Aug 9, 2019 at 6:13 AM
Subject: [CRON] Broken: ClangBuiltLinux/continuous-integration#895
(master - 2a3984b)
To: <ndesaulniers@google.com>, <natechancellor@gmail.com>
ClangBuiltLinux / continuous-integration
master
Build #895 was broken
3 hrs, 29 mins, and 39 secs
Nathan Chancellor
2a3984b CHANGESET →
Merge pull request #196 from nathanchance/ppc64
PPC64 big endian
--
Thanks,
~Nick Desaulniers
* Re: Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#895 (master - 2a3984b)
2019-08-09 21:21 ` Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#895 (master - 2a3984b) Nick Desaulniers
@ 2019-08-10 3:58 ` Nathan Chancellor
2019-08-12 12:54 ` Will Deacon
1 sibling, 0 replies; 5+ messages in thread
From: Nathan Chancellor @ 2019-08-10 3:58 UTC
To: Nick Desaulniers
Cc: linux-next, Stephen Rothwell, Will Deacon, Catalin Marinas,
pauld, Linux ARM, Mark Brown, Mark Rutland, Arnd Bergmann,
Peter Zijlstra
On Fri, Aug 09, 2019 at 02:21:21PM -0700, Nick Desaulniers wrote:
> Did anyone report any issue with last night's -next for arm64?
>
> Some kind of deadlock in online_fair_sched_group.
>
> [ 15.256790] ================================
> [ 15.257025] WARNING: inconsistent lock state
> [ 15.257243] 5.3.0-rc3-next-20190809 #1 Not tainted
> [ 15.257393] --------------------------------
> [ 15.257526] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> [ 15.258096] init/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [ 15.258522] (____ptrval____) (&rq->lock){?.-.}, at:
> online_fair_sched_group+0x78/0xe4
> [ 15.259170] {IN-HARDIRQ-W} state was registered at:
> [ 15.259658] lock_acquire+0x1dc/0x228
> [ 15.259940] _raw_spin_lock+0x40/0x54
> [ 15.260251] scheduler_tick+0x50/0xfc
> [ 15.260491] update_process_times+0x80/0x98
> [ 15.260677] tick_periodic+0xd8/0xf0
> [ 15.260910] tick_handle_periodic+0x30/0x94
> [ 15.261126] arch_timer_handler_virt+0x34/0x40
> [ 15.261332] handle_percpu_devid_irq+0x1a8/0x3c4
> [ 15.261495] __handle_domain_irq+0x7c/0xbc
> [ 15.261689] gic_handle_irq+0x48/0xac
> [ 15.261881] el1_irq+0xbc/0x180
> [ 15.262024] _raw_spin_unlock_irqrestore+0x4c/0x80
> [ 15.262263] tty_register_ldisc+0x58/0x6c
> [ 15.262430] n_tty_init+0x18/0x20
> [ 15.262615] console_init+0x20/0x3e4
> [ 15.262820] start_kernel+0x248/0x3c4
> [ 15.263079] irq event stamp: 220201
> [ 15.263362] hardirqs last enabled at (220201):
> [<ffff000010e1f334>] _raw_spin_unlock_irqrestore+0x48/0x80
> [ 15.263731] hardirqs last disabled at (220200):
> [<ffff000010e1f190>] _raw_spin_lock_irqsave+0x30/0x7c
> [ 15.264046] softirqs last enabled at (220196):
> [<ffff0000100f84c0>] irq_exit+0x114/0x134
> [ 15.264419] softirqs last disabled at (220185):
> [<ffff0000100f84c0>] irq_exit+0x114/0x134
> [ 15.264751]
> [ 15.264751] other info that might help us debug this:
> [ 15.265044] Possible unsafe locking scenario:
> [ 15.265044]
> [ 15.265256] CPU0
> [ 15.265458] ----
> [ 15.265615] lock(&rq->lock);
> [ 15.265898] <Interrupt>
> [ 15.266087] lock(&rq->lock);
> [ 15.266353]
> [ 15.266353] *** DEADLOCK ***
> [ 15.266353]
> [ 15.266574] no locks held by init/1.
> [ 15.266784]
> [ 15.266784] stack backtrace:
> [ 15.267120] CPU: 0 PID: 1 Comm: init Not tainted 5.3.0-rc3-next-20190809 #1
> [ 15.267341] Hardware name: linux,dummy-virt (DT)
> [ 15.267756] Call trace:
> [ 15.267928] dump_backtrace+0x0/0x140
> [ 15.268159] show_stack+0x14/0x1c
> [ 15.268341] dump_stack+0xa8/0x104
> [ 15.268482] mark_lock+0xda0/0xda8
> [ 15.268728] __lock_acquire+0x300/0x858
> [ 15.268869] lock_acquire+0x1dc/0x228
> [ 15.269057] _raw_spin_lock+0x40/0x54
> [ 15.269201] online_fair_sched_group+0x78/0xe4
> [ 15.269392] sched_online_group+0x88/0xac
> [ 15.269591] sched_autogroup_create_attach+0xcc/0x12c
> [ 15.269765] ksys_setsid+0xe8/0xec
> [ 15.269990] __arm64_sys_setsid+0xc/0x18
> [ 15.270178] el0_svc_common+0x9c/0x15c
> [ 15.270317] el0_svc_handler+0x5c/0x64
> [ 15.270493] el0_svc+0x8/0xc
>
> https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/223856448
While that warning certainly looks like it needs to be dealt with, I just
redid the job and it boots fine; I also verified this locally. I think
the job just got stuck somewhere or the build simply took too long
so Travis killed the job.
Cheers,
Nathan
* Re: Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#895 (master - 2a3984b)
2019-08-09 21:21 ` Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#895 (master - 2a3984b) Nick Desaulniers
2019-08-10 3:58 ` Nathan Chancellor
@ 2019-08-12 12:54 ` Will Deacon
2019-08-12 12:55 ` Will Deacon
1 sibling, 1 reply; 5+ messages in thread
From: Will Deacon @ 2019-08-12 12:54 UTC
To: Nick Desaulniers
Cc: linux-next, Stephen Rothwell, Will Deacon, Catalin Marinas,
pauld, Linux ARM, Mark Brown, Mark Rutland, Arnd Bergmann,
Peter Zijlstra, tglx, dietmar.eggemann
Hi Nick,
On Fri, Aug 09, 2019 at 02:21:21PM -0700, Nick Desaulniers wrote:
> Did anyone report any issue with last night's -next for arm64?
>
> Some kind of deadlock in online_fair_sched_group.
>
> [ 15.256790] ================================
> [ 15.257025] WARNING: inconsistent lock state
> [ 15.257243] 5.3.0-rc3-next-20190809 #1 Not tainted
> [ 15.257393] --------------------------------
> [ 15.257526] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> [ 15.258096] init/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [ 15.258522] (____ptrval____) (&rq->lock){?.-.}, at:
> online_fair_sched_group+0x78/0xe4
> [ 15.259170] {IN-HARDIRQ-W} state was registered at:
> [ 15.259658] lock_acquire+0x1dc/0x228
> [ 15.259940] _raw_spin_lock+0x40/0x54
> [ 15.260251] scheduler_tick+0x50/0xfc
> [ 15.260491] update_process_times+0x80/0x98
> [ 15.260677] tick_periodic+0xd8/0xf0
> [ 15.260910] tick_handle_periodic+0x30/0x94
> [ 15.261126] arch_timer_handler_virt+0x34/0x40
> [ 15.261332] handle_percpu_devid_irq+0x1a8/0x3c4
> [ 15.261495] __handle_domain_irq+0x7c/0xbc
> [ 15.261689] gic_handle_irq+0x48/0xac
> [ 15.261881] el1_irq+0xbc/0x180
Ok, so we take rq_lock() off the back of a timer interrupt in irq context...
> [ 15.267928] dump_backtrace+0x0/0x140
> [ 15.268159] show_stack+0x14/0x1c
> [ 15.268341] dump_stack+0xa8/0x104
> [ 15.268482] mark_lock+0xda0/0xda8
> [ 15.268728] __lock_acquire+0x300/0x858
> [ 15.268869] lock_acquire+0x1dc/0x228
> [ 15.269057] _raw_spin_lock+0x40/0x54
... but also with irqs enabled when handling a syscall. Boom.
> [ 15.269201] online_fair_sched_group+0x78/0xe4
> [ 15.269392] sched_online_group+0x88/0xac
> [ 15.269591] sched_autogroup_create_attach+0xcc/0x12c
> [ 15.269765] ksys_setsid+0xe8/0xec
> [ 15.269990] __arm64_sys_setsid+0xc/0x18
> [ 15.270178] el0_svc_common+0x9c/0x15c
> [ 15.270317] el0_svc_handler+0x5c/0x64
> [ 15.270493] el0_svc+0x8/0xc
>
> https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/223856448
>
> Guessing related to
> commit 6b8fd01b21f5 ("sched/fair: Use rq_lock/unlock in
> online_fair_sched_group")
Agreed. I think that patch should be using rq_lock_{irqsave,irqrestore}().
Looking at the list archive, it seems that this was already spotted last
week:
https://lkml.kernel.org/r/dfc8f652-ca98-e30a-546f-e6a2df36e33a@arm.com
Although the proposal there disables irqs unconditionally, which matches
the old behaviour (prior to 6b8fd01b21f5) but feels a bit dodgy given that
the only caller (sched_online_group()) uses the save/restore variants.
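To illustrate the difference between a plain lock and the save/restore
variants discussed above, here is a minimal userspace sketch. It is a toy
single-CPU model, not kernel code and not the fix that was applied: every
name in it (toy_cpu, toy_lock, toy_lock_irqsave and so on) is hypothetical
and only stands in for the interrupt mask and rq->lock.

```c
#include <stdbool.h>

/* Toy single-CPU model. All names are hypothetical, not kernel APIs. */
struct toy_cpu {
    bool irqs_enabled; /* models the CPU's interrupt mask */
    bool lock_held;    /* models rq->lock held by the interrupted code */
};

/* Would a timer tick that takes the same lock spin forever? */
static bool tick_would_deadlock(const struct toy_cpu *cpu)
{
    /* The tick can only fire if irqs are enabled; it then spins on a
     * lock that the code it interrupted can never release. */
    return cpu->irqs_enabled && cpu->lock_held;
}

/* Plain lock: does not touch the interrupt mask. */
static void toy_lock(struct toy_cpu *cpu)   { cpu->lock_held = true; }
static void toy_unlock(struct toy_cpu *cpu) { cpu->lock_held = false; }

/* irqsave-style lock: save the mask, disable irqs, then lock. */
static bool toy_lock_irqsave(struct toy_cpu *cpu)
{
    bool flags = cpu->irqs_enabled;
    cpu->irqs_enabled = false;
    cpu->lock_held = true;
    return flags;
}

/* irqrestore-style unlock: drop the lock, then restore the saved mask. */
static void toy_unlock_irqrestore(struct toy_cpu *cpu, bool flags)
{
    cpu->lock_held = false;
    cpu->irqs_enabled = flags;
}

/* Syscall path with irqs on and a plain lock: the reported pattern. */
static bool plain_lock_is_unsafe(void)
{
    struct toy_cpu cpu = { .irqs_enabled = true, .lock_held = false };
    toy_lock(&cpu);
    bool unsafe = tick_would_deadlock(&cpu);
    toy_unlock(&cpu);
    return unsafe;
}

/* Same path with the irqsave-style lock; also checks the mask is restored. */
static bool irqsave_lock_is_unsafe(bool irqs_on_entry)
{
    struct toy_cpu cpu = { .irqs_enabled = irqs_on_entry, .lock_held = false };
    bool flags = toy_lock_irqsave(&cpu);
    bool unsafe = tick_would_deadlock(&cpu);
    toy_unlock_irqrestore(&cpu, flags);
    return unsafe || cpu.irqs_enabled != irqs_on_entry;
}
```

In this model plain_lock_is_unsafe() reports the hazard, while
irqsave_lock_is_unsafe() does not for either entry state, because the
saved mask is put back exactly as the caller left it. That is the
property that makes save/restore a natural fit when the caller may
already be running with irqs disabled.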
Phil -- is there a fix queued for this somewhere?
Will
* Re: Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#895 (master - 2a3984b)
2019-08-12 12:54 ` Will Deacon
@ 2019-08-12 12:55 ` Will Deacon
2019-08-12 13:16 ` Phil Auld
0 siblings, 1 reply; 5+ messages in thread
From: Will Deacon @ 2019-08-12 12:55 UTC
To: Nick Desaulniers
Cc: linux-next, Stephen Rothwell, Will Deacon, Catalin Marinas,
pauld, Linux ARM, Mark Brown, Mark Rutland, Arnd Bergmann,
Peter Zijlstra, tglx, dietmar.eggemann
On Mon, Aug 12, 2019 at 01:54:14PM +0100, Will Deacon wrote:
> Phil -- is there a fix queued for this somewhere?
Ha, tglx beat me by two minutes. This is now fixed in -tip.
Will
* Re: Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#895 (master - 2a3984b)
2019-08-12 12:55 ` Will Deacon
@ 2019-08-12 13:16 ` Phil Auld
0 siblings, 0 replies; 5+ messages in thread
From: Phil Auld @ 2019-08-12 13:16 UTC
To: Will Deacon
Cc: Nick Desaulniers, linux-next, Stephen Rothwell, Will Deacon,
Catalin Marinas, Linux ARM, Mark Brown, Mark Rutland,
Arnd Bergmann, Peter Zijlstra, tglx, dietmar.eggemann
On Mon, Aug 12, 2019 at 01:55:43PM +0100 Will Deacon wrote:
> On Mon, Aug 12, 2019 at 01:54:14PM +0100, Will Deacon wrote:
> > Phil -- is there a fix queued for this somewhere?
>
> Ha, tglx beat me by two minutes. This is now fixed in -tip.
>
> Will
Yeah, it's now fixed. Sorry about that...
Cheers,
Phil
--
Thread overview: 5+ messages
[not found] <ClangBuiltLinux/continuous-integration+122566420+broken@travis-ci.com>
[not found] ` <5d4d7164795c7_43f9afa8b58b0242711@29afa0b1-fa00-407e-a40e-a8edb471126a.mail>
2019-08-09 21:21 ` Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#895 (master - 2a3984b) Nick Desaulniers
2019-08-10 3:58 ` Nathan Chancellor
2019-08-12 12:54 ` Will Deacon
2019-08-12 12:55 ` Will Deacon
2019-08-12 13:16 ` Phil Auld