linux-rt-users.vger.kernel.org archive mirror
* complete_all warning with 3 waiters
@ 2020-03-11 16:32 Marouen Ghodhbane
  2020-03-12  8:10 ` Daniel Wagner
  2020-03-17 12:08 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 4+ messages in thread
From: Marouen Ghodhbane @ 2020-03-11 16:32 UTC (permalink / raw)
  To: linux-rt-users

Hello everyone,

I am running the Linux RT kernel 4.14.78-rt47 on an i.MX8MMini and encountered the swake_up_all_locked() warning "complete_all() with 3 waiters" during the sdma controller probe.


[ 9.455488] complete_all() with 3 waiters
[ 9.455498] -----------[ cut here ]-----------
[ 9.455515] WARNING: CPU: 1 PID: 3450 at /usr/src/kernel/kernel/sched/swait.c:49 swake_up_all_locked+0xa4/0xb8
[ 9.455529] CPU: 1 PID: 3450 Comm: systemd-udevd Tainted: G O 4.14.78-rt47+g66620c3 #1
[ 9.455531] Hardware name: FSL i.MX8MM EVK board (DT)
[ 9.455534] task: ffff800072895a00 task.stack: ffff00001a5f0000
[ 9.455539] PC is at swake_up_all_locked+0xa4/0xb8
[ 9.455543] LR is at swake_up_all_locked+0xa4/0xb8
[ 9.455548] pc : [<ffff0000081095bc>] lr : [<ffff0000081095bc>] pstate: 800001c5
[ 9.455550] sp : ffff00001a5f3cc0
[ 9.455552] x29: ffff00001a5f3cc0 x28: ffff800072895a00
[ 9.455559] x27: ffff000008d81000 x26: 0000000000000040
[ 9.455562] x25: 0000000000000124 x24: 0000000000000015
[ 9.455568] x23: 0000000000000002 x22: ffff800076c73430
[ 9.455572] x21: 0000000000000003 x20: ffff800076c73428
[ 9.455576] x19: ffff000009eabce0 x18: 0000000000000001
[ 9.455580] x17: 0000ffffb547bcb0 x16: ffff000008209700
[ 9.455584] x15: ffff0000094c3000 x14: 00000000fffffff0
[ 9.455588] x13: ffff000009651b18 x12: ffff0000094c3000
[ 9.455591] x11: 0000000000000000 x10: ffff000009651000
[ 9.455595] x9 : 0000000000000000 x8 : ffff0000096610a3
[ 9.455599] x7 : 0000000000000000 x6 : 0000000005ca3ab7
[ 9.455602] x5 : 0000000000000000 x4 : 0000000000000000
[ 9.455606] x3 : ffffffffffffffff x2 : 0000800074ae6000
[ 9.455610] x1 : ffff800072895a00 x0 : 000000000000001d
[ 9.455616] Call trace:
[ 9.455620] Exception stack(0xffff00001a5f3b80 to 0xffff00001a5f3cc0)
[ 9.455624] 3b80: 000000000000001d ffff800072895a00 0000800074ae6000 ffffffffffffffff
[ 9.455628] 3ba0: 0000000000000000 0000000000000000 0000000005ca3ab7 0000000000000000
[ 9.455632] 3bc0: ffff0000096610a3 0000000000000000 ffff000009651000 0000000000000000
[ 9.455637] 3be0: ffff0000094c3000 ffff000009651b18 00000000fffffff0 ffff0000094c3000
[ 9.455641] 3c00: ffff000008209700 0000ffffb547bcb0 0000000000000001 ffff000009eabce0
[ 9.455646] 3c20: ffff800076c73428 0000000000000003 ffff800076c73430 0000000000000002
[ 9.455652] 3c40: 0000000000000015 0000000000000124 0000000000000040 ffff000008d81000
[ 9.455657] 3c60: ffff800072895a00 ffff00001a5f3cc0 ffff0000081095bc ffff00001a5f3cc0
[ 9.455663] 3c80: ffff0000081095bc 00000000800001c5 ffff800076c73430 0000000000000000
[ 9.455668] 3ca0: 0000ffffffffffff 0000000000000004 ffff00001a5f3cc0 ffff0000081095bc
[ 9.455674] [<ffff0000081095bc>] swake_up_all_locked+0xa4/0xb8
[ 9.455681] [<ffff000008109bcc>] complete_all+0x34/0x50
[ 9.455690] [<ffff0000086cc338>] firmware_loading_store+0x168/0x218
[ 9.455696] [<ffff0000086ab7a0>] dev_attr_store+0x18/0x28
[ 9.455704] [<ffff000008288f6c>] sysfs_kf_write+0x3c/0x50
[ 9.455709] [<ffff000008288208>] kernfs_fop_write+0x118/0x1e8
[ 9.455714] [<ffff000008209190>] __vfs_write+0x18/0x118
[ 9.455718] [<ffff000008209484>] vfs_write+0xa4/0x1b0
[ 9.455724] [<ffff000008209748>] SyS_write+0x48/0xb0
[ 9.455729] Exception stack(0xffff00001a5f3ec0 to 0xffff00001a5f4000)
[ 9.455733] 3ec0: 000000000000000f 0000aaab040673f0 0000000000000002 0000ffffb556e190
[ 9.455738] 3ee0: 0000000000000040 0000000000000000 0000000000000000 000000000000000a
[ 9.455742] 3f00: 0000000000000040 0000ffffb53c8c90 0000000000000040 0000000000000000
[ 9.455746] 3f20: 0000000000000001 000000000000270f 0000000000000000 0000000000000000
[ 9.455750] 3f40: 0000aaaad73f9948 0000ffffb547bcb0 0000ffffb5569a70 000000000000000f
[ 9.455754] 3f60: 0000aaab040673f0 0000aaab0406a6e0 0000000000000002 0000aaab040673f0
[ 9.455758] 3f80: 0000000000000002 0000aaab040418c0 0000000000000001 0000000000000b12
[ 9.455763] 3fa0: 0000aaaad73b85a8 0000ffffcaab8510 0000ffffb54884dc 0000ffffcaab8510
[ 9.455767] 3fc0: 0000ffffb54dbd4c 0000000020000000 000000000000000f 0000000000000040
[ 9.455771] 3fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 9.455776] [<ffff000008083b58>] __sys_trace_return+0x0/0x4
[ 9.455779] --[ end trace 0000000000000002 ]--
[ 9.457052] imx-sdma 302b0000.dma-controller: loaded firmware 4.4
[ 9.458193] imx-sdma 302c0000.dma-controller: loaded firmware 4.4
[ 9.481511] imx-sdma 30bd0000.dma-controller: loaded firmware 4.4

In fact, there are 3 sdma controller devices on the target pointing to the same firmware file, and the imx-sdma driver requests the firmware asynchronously with request_firmware_nowait(). The request_firmware API in Linux makes all waiters requesting the same firmware file wait on the same completion, which triggers this warning.
I checked the linux-5.4.y-rt branch and the same warning is still there.
My question is: is there any good reason for the value of 2 in the warning condition? It looks to me like a warning added early on to detect code that triggers this condition, so that unresolved issues could be worked on. Do we need another filter like pm_in_action?
Any idea or suggestion is appreciated.

Best Regards,
Marouen.


* Re: complete_all warning with 3 waiters
  2020-03-11 16:32 complete_all warning with 3 waiters Marouen Ghodhbane
@ 2020-03-12  8:10 ` Daniel Wagner
  2020-03-17 12:08 ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 4+ messages in thread
From: Daniel Wagner @ 2020-03-12  8:10 UTC (permalink / raw)
  To: Marouen Ghodhbane; +Cc: linux-rt-users

Hi Marouen,

On Wed, Mar 11, 2020 at 04:32:03PM +0000, Marouen Ghodhbane wrote:
> My question is: is there any good reason for the value of 2 in the
> warning condition ? It seems to me like a warning put earlier to
> check if we trigger this condition and work on some unresolved
> issues. Do we need another filter like the pm_in_action ?

The check is there to find code paths which rely on waking many
waiters. The problem is that swake_up_all_locked() runs while
holding a raw spin lock with interrupts disabled, which can introduce
unbounded latencies on the system.
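
A minimal userspace sketch of the concern described above. It is not the kernel implementation: the names model_waiter, model_queue, model_add_waiter() and model_wake_all_locked() are hypothetical, and no real locking is done. It only shows that a wake-all loop does one unit of work per queued waiter, so the time spent inside the locked, interrupts-off section grows with the number of waiters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task sleeping on a completion. */
struct model_waiter {
	struct model_waiter *next;
	bool awake;
};

/* Hypothetical stand-in for the wait queue inside a completion. */
struct model_queue {
	struct model_waiter *head;
};

/* Queue a waiter at the head of the list. */
static void model_add_waiter(struct model_queue *q, struct model_waiter *w)
{
	w->awake = false;
	w->next = q->head;
	q->head = w;
}

/*
 * Wake every queued waiter and return how many were woken.
 * In the kernel, the equivalent loop runs with a raw spin lock held
 * and interrupts disabled, so every extra waiter lengthens the
 * non-preemptible section. Here the "wake" is just a flag.
 */
static int model_wake_all_locked(struct model_queue *q)
{
	int wakes = 0;

	while (q->head) {
		struct model_waiter *w = q->head;

		q->head = w->next;
		w->next = NULL;
		w->awake = true;
		wakes++;
	}
	return wakes;
}
```

With three waiters queued, as in the three imx-sdma instances from the report, the loop body runs three times before the lock would be released.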

> Any idea/suggestion is definitely appreciated

The warning is harmless in your case, though annoying.

Thanks,
Daniel


* Re: complete_all warning with 3 waiters
  2020-03-11 16:32 complete_all warning with 3 waiters Marouen Ghodhbane
  2020-03-12  8:10 ` Daniel Wagner
@ 2020-03-17 12:08 ` Sebastian Andrzej Siewior
  2020-03-17 23:51   ` [EXT] " Marouen Ghodhbane
  1 sibling, 1 reply; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2020-03-17 12:08 UTC (permalink / raw)
  To: Marouen Ghodhbane; +Cc: linux-rt-users

On 2020-03-11 16:32:03 [+0000], Marouen Ghodhbane wrote:
> Hello everyone,
Hi,

> In fact, there is 3 sdma controller devices on the target pointing to
> the same firmware file and the imx-sdma driver is requesting the
> firmware asynchronously with request_firmware_nowait(). The
> request_firmware API in linux is making all waiters, requesting the
> same firmware file, wait on the same completion which triggers this
> warning.

This looks like something that happens at boot / hardware setup time and
not while the system is running "production". Thanks for the feedback.

> Any idea/suggestion is definitely appreciated

I've been interested in cases which can stack up beyond 2 and can be
triggered by users because these may influence the RT workload.
The pm-case has been filtered out because nobody should run an RT workload
while PM is going up and down. The report I got was in the init-phase of
crypto, which does not usually trigger and shouldn't trigger at run-time
once everything is set up.
Feel free to raise the bar to avoid the warning in your case.
I'm currently thinking about removing the warning due to lack of new
cases.
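
A small sketch of what "raising the bar" amounts to, written as a standalone userspace function rather than as a patch. It assumes the check has the form "warn when more than a fixed number of waiters were woken, unless PM is in progress", which matches the value of 2 and the pm_in_action filter discussed in this thread. The names MODEL_WAITER_WARN_LIMIT and model_should_warn() are hypothetical.

```c
#include <stdbool.h>

/* Assumed default, taken from the "value of 2" discussed in this thread. */
#define MODEL_WAITER_WARN_LIMIT 2

/*
 * Decide whether waking `wakes` waiters at once should warn.
 * `pm_in_action` models the power-management filter: no warning while
 * PM transitions are running. `limit` is the bar that can be raised
 * locally to silence a known, boot-time-only case.
 */
static bool model_should_warn(int wakes, bool pm_in_action, int limit)
{
	if (pm_in_action)
		return false;
	return wakes > limit;
}
```

Under these assumptions, three waiters warn with the default limit of 2 and stay silent once the limit is raised to 3.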

> Best Regards,
> Marouen.

Sebastian


* Re: [EXT] Re: complete_all warning with 3 waiters
  2020-03-17 12:08 ` Sebastian Andrzej Siewior
@ 2020-03-17 23:51   ` Marouen Ghodhbane
  0 siblings, 0 replies; 4+ messages in thread
From: Marouen Ghodhbane @ 2020-03-17 23:51 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, Daniel Wagner

Thanks Sebastian and Daniel for your answers.

>> In fact, there is 3 sdma controller devices on the target pointing to
>> the same firmware file and the imx-sdma driver is requesting the
>> firmware asynchronously with request_firmware_nowait(). The
>> request_firmware API in linux is making all waiters, requesting the
>> same firmware file, wait on the same completion which triggers this
>> warning.


> This looks like something that happens at boot / hardware setup time and
> not while the system is running "production". Thanks for the feedback.

Indeed, this happens only on system boot (more precisely on the imx-sdma 
driver probe). So, no RT workload is impacted here.


> Feel free to raise the bar here to avoid the warning in your case here.
> I'm currently thinking about removing the warning due to lack of new
> cases.

That's exactly what I implemented for the moment.

Thanks,
Marouen. 
