RCU Archive on lore.kernel.org
 help / color / Atom feed
* BUG: sleeping function called from invalid context at kernel/locking/mutex.c db410c
@ 2020-07-22  7:43 Naresh Kamboju
  2020-07-22  8:04 ` Arnd Bergmann
  0 siblings, 1 reply; 4+ messages in thread
From: Naresh Kamboju @ 2020-07-22  7:43 UTC (permalink / raw)
  To: open list, rcu, Linux-Next Mailing List, lkft-triage
  Cc: Peter Zijlstra, Ingo Molnar, Paul E. McKenney, Will Deacon,
	Josh Triplett, Steven Rostedt, mathieu.desnoyers, jiangshanlai,
	Joel Fernandes, Stephen Rothwell, Arnd Bergmann, walter-zh.wu,
	neeraju, Vinod Koul

Kernel BUG noticed on arm64 db410c device while booting linux next 20200721 tag.
Continually popping up these BUG messages on the boot console and kernel warning
also noticed.

metadata:
  git branch: master
  git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
  git commit: de2e69cfe54a8f2ed4b75f09d3110c514f45d38e
  git describe: next-20200721
  kernel-config:
http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/dragonboard-410c/lkft/linux-next/818/config
  build-location:
http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/dragonboard-410c/lkft/linux-next/818

Crash log:
[    0.444349] BUG: sleeping function called from invalid context at
/usr/src/kernel/kernel/locking/mutex.c:935
[    0.444422] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
30, name: kworker/0:1
[    0.444458] 2 locks held by kworker/0:1/30:
[    0.444489]  #0: ffff00000eb15138
((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
[    0.444627]  #1: ffff800013713df0
((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
process_one_work+0x208/0x768
[    0.444761] CPU: 0 PID: 30 Comm: kworker/0:1 Not tainted
5.8.0-rc6-next-20200721 #1
[    0.444787] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[    0.444817] Workqueue: rcu_gp srcu_invoke_callbacks
[    0.444856] Call trace:
[    0.444883]  dump_backtrace+0x0/0x1f8
[    0.444910]  show_stack+0x2c/0x38
[    0.444939]  dump_stack+0xf0/0x16c
[    0.444969]  ___might_sleep+0x144/0x208
[    0.444996]  __might_sleep+0x54/0x90
[    0.445027]  __mutex_lock+0x64/0x970
[    0.445055]  mutex_lock_nested+0x54/0x70
[    0.445084]  device_del+0x44/0x3c0
[    0.445110]  device_unregister+0x24/0x78
[    0.445138]  __device_link_free_srcu+0x64/0x70
[    0.445164]  srcu_invoke_callbacks+0x10c/0x1a0
[    0.445191]  process_one_work+0x2b0/0x768
[    0.445218]  worker_thread+0x48/0x498
[    0.445246]  kthread+0x158/0x168
[    0.445274]  ret_from_fork+0x10/0x1c

<>

[   13.015528] BUG: sleeping function called from invalid context at
/usr/src/kernel/kernel/locking/mutex.c:935
[   13.015588] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
188, name: kworker/2:4
[   13.025268] 2 locks held by kworker/2:4/188:
[   13.032647]  #0: ffff00000eb15138
((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
[   13.036920]  #1: ffff80001426bdf0
((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
process_one_work+0x208/0x768
[   13.046032] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
      5.8.0-rc6-next-20200721 #1
[   13.056039] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[   13.065417] Workqueue: rcu_gp srcu_invoke_callbacks
[   13.072091] Call trace:
[   13.076695]  dump_backtrace+0x0/0x1f8
[   13.079124]  show_stack+0x2c/0x38
[   13.082944]  dump_stack+0xf0/0x16c
[   13.086243]  ___might_sleep+0x144/0x208
[   13.089542]  __might_sleep+0x54/0x90
[   13.093274]  __mutex_lock+0x64/0x970
[   13.097094]  mutex_lock_nested+0x54/0x70
[   13.100654]  device_del+0x44/0x3c0
[   13.104556]  device_unregister+0x24/0x78
[   13.107771]  __device_link_free_srcu+0x64/0x70
[   13.111850]  srcu_invoke_callbacks+0x10c/0x1a0
[   13.116103]  process_one_work+0x2b0/0x768
[   13.120530]  worker_thread+0x48/0x498
[   13.124610]  kthread+0x158/0x168
[   13.128254]  ret_from_fork+0x10/0x1c
[   13.131818] BUG: scheduling while atomic: kworker/2:4/188/0x00000201
[   13.135243] 3 locks held by kworker/2:4/188:
[   13.141558]  #0: ffff00000eb15138
((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
[   13.145834]  #1: ffff80001426bdf0
((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
process_one_work+0x208/0x768
[   13.154924]  #2: ffff8000127f2fd0 (kernfs_mutex){+.+.}-{3:3}, at:
__kernfs_remove+0x304/0x378
[   13.164952] Modules linked in: mdt_loader videobuf2_common
drm_kms_helper qcom_rng i2c_qcom_cci display_connector drm socinfo
rmtfs_mem rfkill qrtr ns fuse
[   13.173653] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
      5.8.0-rc6-next-20200721 #1
[   13.187296] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[   13.196666] Workqueue: rcu_gp srcu_invoke_callbacks
[   13.203340] Call trace:
[[0;32m  OK  [0m] Started Network Manager Script Dispatcher Servic[
13.207942]  dump_backtrace+0x0/0x1f8
e.[   13.221350]  __schedule_bug+0x74/0xa0
[   13.221386]  __schedule+0x890/0x938
[   13.224160]  schedule+0x48/0x110
[   13.227462]  schedule_preempt_disabled+0x1c/0x30
[   13.230933]  __mutex_lock+0x814/0x970
[   13.235533]  mutex_lock_nested+0x54/0x70
[   13.239091]  __kernfs_remove+0x304/0x378
[   13.243085]  kernfs_remove_by_name_ns+0x5c/0xc0
[   13.246991]  sysfs_remove_link+0x30/0x60
[   13.251246]  device_remove_class_symlinks+0x78/0xa8
[   13.255411]  device_del+0xb8/0x3c0

[   13.260011]  device_unregister+0x24/0x78
[   13.263744]  __device_link_free_srcu+0x64/0x70
[   13.267746]  srcu_invoke_callbacks+0x10c/0x1a0
[   13.271992]  process_one_work+0x2b0/0x768
[   13.276417]  worker_thread+0x48/0x498
[   13.280499]  kthread+0x158/0x168
[   13.284144]  ret_from_fork+0x10/0x1c
[   13.289062] BUG: workqueue leaked lock or atomic: kworker/2:4/0xfffffe00/188
[   13.289062]      last function: srcu_invoke_callbacks
[   13.291056] no locks held by kworker/2:4/188.
[   13.302935] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
      5.8.0-rc6-next-20200721 #1
[   13.305703] rtc-pm8xxx 200f000.spmi:pm8916@0:rtc@6000: registered as rtc0
[   13.307317] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[   13.307333] Workqueue: rcu_gp srcu_invoke_callbacks
[   13.307349] Call trace:
[   13.307360]  dump_backtrace+0x0/0x1f8
[   13.307369]  show_stack+0x2c/0x38
[   13.307381]  dump_stack+0xf0/0x16c
[   13.307392]  process_one_work+0x624/0x768
[   13.347424]  worker_thread+0x48/0x498
[   13.351501]  kthread+0x158/0x168
[   13.355146]  ret_from_fork+0x10/0x1c
[   13.358487] BUG: scheduling while atomic: kworker/2:4/188/0xfffffe01
[   13.362056] no locks held by kworker/2:4/188.
[   13.363536] rtc-pm8xxx 200f000.spmi:pm8916@0:rtc@6000: setting
system clock to 1970-01-01T00:00:19 UTC (19)
[   13.368361] Modules linked in: rtc_pm8xxx(+) videobuf2_memops
videobuf2_v4l2 mdt_loader videobuf2_common drm_kms_helper qcom_rng
i2c_qcom_cci display_connector drm socinfo rmtfs_mem rfkill qrtr ns
fuse
[   13.382197] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
      5.8.0-rc6-next-20200721 #1
[   13.400208] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[   13.409418] Workqueue:  0x0 (rcu_gp)
[   13.416084] Call trace:
[   13.419646]  dump_backtrace+0x0/0x1f8
[   13.421815]  show_stack+0x2c/0x38
[   13.425635]  dump_stack+0xf0/0x16c
[   13.428933]  __schedule_bug+0x74/0xa0
[   13.432234]  __schedule+0x890/0x938
[   13.435967]  schedule+0x48/0x110
[   13.439262]  worker_thread+0xc0/0x498
[   13.442735]  kthread+0x158/0x168
[   13.446294]  ret_from_fork+0x10/0x1c

Full test log,
https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20200721/testrun/2972385/suite/linux-log-parser/test/check-kernel-warning-1595062/log

-- 
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c db410c
  2020-07-22  7:43 BUG: sleeping function called from invalid context at kernel/locking/mutex.c db410c Naresh Kamboju
@ 2020-07-22  8:04 ` Arnd Bergmann
  2020-07-22 13:01   ` Paul E. McKenney
  0 siblings, 1 reply; 4+ messages in thread
From: Arnd Bergmann @ 2020-07-22  8:04 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: open list, rcu, Linux-Next Mailing List, lkft-triage,
	Peter Zijlstra, Ingo Molnar, Paul E. McKenney, Will Deacon,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, Stephen Rothwell, walter-zh.wu, neeraju,
	Vinod Koul, saravanak, Greg Kroah-Hartman

On Wed, Jul 22, 2020 at 9:44 AM Naresh Kamboju
<naresh.kamboju@linaro.org> wrote:
>
> Kernel BUG noticed on arm64 db410c device while booting linux next 20200721 tag.
> Continually popping up these BUG messages on the boot console and kernel warning
> also noticed.
>
> metadata:
>   git branch: master
>   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
>   git commit: de2e69cfe54a8f2ed4b75f09d3110c514f45d38e
>   git describe: next-20200721
>   kernel-config:
> http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/dragonboard-410c/lkft/linux-next/818/config
>   build-location:
> http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/dragonboard-410c/lkft/linux-next/818
>
> Crash log:
> [    0.444349] BUG: sleeping function called from invalid context at
> /usr/src/kernel/kernel/locking/mutex.c:935
> [    0.444422] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
> 30, name: kworker/0:1
> [    0.444458] 2 locks held by kworker/0:1/30:
> [    0.444489]  #0: ffff00000eb15138
> ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
> [    0.444627]  #1: ffff800013713df0
> ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
> process_one_work+0x208/0x768
> [    0.444761] CPU: 0 PID: 30 Comm: kworker/0:1 Not tainted
> 5.8.0-rc6-next-20200721 #1
> [    0.444787] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> [    0.444817] Workqueue: rcu_gp srcu_invoke_callbacks
> [    0.444856] Call trace:
> [    0.444883]  dump_backtrace+0x0/0x1f8
> [    0.444910]  show_stack+0x2c/0x38
> [    0.444939]  dump_stack+0xf0/0x16c
> [    0.444969]  ___might_sleep+0x144/0x208
> [    0.444996]  __might_sleep+0x54/0x90
> [    0.445027]  __mutex_lock+0x64/0x970
> [    0.445055]  mutex_lock_nested+0x54/0x70
> [    0.445084]  device_del+0x44/0x3c0
> [    0.445110]  device_unregister+0x24/0x78
> [    0.445138]  __device_link_free_srcu+0x64/0x70
> [    0.445164]  srcu_invoke_callbacks+0x10c/0x1a0

The device_unregister() was added to __device_link_free_srcu() as part
of commit 287905e68dd2 ("driver core: Expose device link details in sysfs")

I'm fairly sure this is what introduced the console output, though the
code before it is already suspicious:

call_srcu(&device_links_srcu, &link->rcu_head, __device_link_free_srcu);

According to the documentation, call_srcu() must not call any sleeping
functions, which is what the 'in_atomic(): 1' above is about.

However, from what I can tell, the put_device() and pm_runtime_put()
calls in device_link_free() can also potentially sleep even if they normally
don't warn about that.

       Arnd

> [    0.445191]  process_one_work+0x2b0/0x768
> [    0.445218]  worker_thread+0x48/0x498
> [    0.445246]  kthread+0x158/0x168
> [    0.445274]  ret_from_fork+0x10/0x1c
>
> <>
>
> [   13.015528] BUG: sleeping function called from invalid context at
> /usr/src/kernel/kernel/locking/mutex.c:935
> [   13.015588] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
> 188, name: kworker/2:4
> [   13.025268] 2 locks held by kworker/2:4/188:
> [   13.032647]  #0: ffff00000eb15138
> ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
> [   13.036920]  #1: ffff80001426bdf0
> ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
> process_one_work+0x208/0x768
> [   13.046032] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
>       5.8.0-rc6-next-20200721 #1
> [   13.056039] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> [   13.065417] Workqueue: rcu_gp srcu_invoke_callbacks
> [   13.072091] Call trace:
> [   13.076695]  dump_backtrace+0x0/0x1f8
> [   13.079124]  show_stack+0x2c/0x38
> [   13.082944]  dump_stack+0xf0/0x16c
> [   13.086243]  ___might_sleep+0x144/0x208
> [   13.089542]  __might_sleep+0x54/0x90
> [   13.093274]  __mutex_lock+0x64/0x970
> [   13.097094]  mutex_lock_nested+0x54/0x70
> [   13.100654]  device_del+0x44/0x3c0
> [   13.104556]  device_unregister+0x24/0x78
> [   13.107771]  __device_link_free_srcu+0x64/0x70
> [   13.111850]  srcu_invoke_callbacks+0x10c/0x1a0
> [   13.116103]  process_one_work+0x2b0/0x768
> [   13.120530]  worker_thread+0x48/0x498
> [   13.124610]  kthread+0x158/0x168
> [   13.128254]  ret_from_fork+0x10/0x1c
> [   13.131818] BUG: scheduling while atomic: kworker/2:4/188/0x00000201
> [   13.135243] 3 locks held by kworker/2:4/188:
> [   13.141558]  #0: ffff00000eb15138
> ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
> [   13.145834]  #1: ffff80001426bdf0
> ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
> process_one_work+0x208/0x768
> [   13.154924]  #2: ffff8000127f2fd0 (kernfs_mutex){+.+.}-{3:3}, at:
> __kernfs_remove+0x304/0x378
> [   13.164952] Modules linked in: mdt_loader videobuf2_common
> drm_kms_helper qcom_rng i2c_qcom_cci display_connector drm socinfo
> rmtfs_mem rfkill qrtr ns fuse
> [   13.173653] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
>       5.8.0-rc6-next-20200721 #1
> [   13.187296] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> [   13.196666] Workqueue: rcu_gp srcu_invoke_callbacks
> [   13.203340] Call trace:
> [[0;32m  OK  [0m] Started Network Manager Script Dispatcher Servic[
> 13.207942]  dump_backtrace+0x0/0x1f8
> e.[   13.221350]  __schedule_bug+0x74/0xa0
> [   13.221386]  __schedule+0x890/0x938
> [   13.224160]  schedule+0x48/0x110
> [   13.227462]  schedule_preempt_disabled+0x1c/0x30
> [   13.230933]  __mutex_lock+0x814/0x970
> [   13.235533]  mutex_lock_nested+0x54/0x70
> [   13.239091]  __kernfs_remove+0x304/0x378
> [   13.243085]  kernfs_remove_by_name_ns+0x5c/0xc0
> [   13.246991]  sysfs_remove_link+0x30/0x60
> [   13.251246]  device_remove_class_symlinks+0x78/0xa8
> [   13.255411]  device_del+0xb8/0x3c0
>
> [   13.260011]  device_unregister+0x24/0x78
> [   13.263744]  __device_link_free_srcu+0x64/0x70
> [   13.267746]  srcu_invoke_callbacks+0x10c/0x1a0
> [   13.271992]  process_one_work+0x2b0/0x768
> [   13.276417]  worker_thread+0x48/0x498
> [   13.280499]  kthread+0x158/0x168
> [   13.284144]  ret_from_fork+0x10/0x1c
> [   13.289062] BUG: workqueue leaked lock or atomic: kworker/2:4/0xfffffe00/188
> [   13.289062]      last function: srcu_invoke_callbacks
> [   13.291056] no locks held by kworker/2:4/188.
> [   13.302935] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
>       5.8.0-rc6-next-20200721 #1
> [   13.305703] rtc-pm8xxx 200f000.spmi:pm8916@0:rtc@6000: registered as rtc0
> [   13.307317] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> [   13.307333] Workqueue: rcu_gp srcu_invoke_callbacks
> [   13.307349] Call trace:
> [   13.307360]  dump_backtrace+0x0/0x1f8
> [   13.307369]  show_stack+0x2c/0x38
> [   13.307381]  dump_stack+0xf0/0x16c
> [   13.307392]  process_one_work+0x624/0x768
> [   13.347424]  worker_thread+0x48/0x498
> [   13.351501]  kthread+0x158/0x168
> [   13.355146]  ret_from_fork+0x10/0x1c
> [   13.358487] BUG: scheduling while atomic: kworker/2:4/188/0xfffffe01
> [   13.362056] no locks held by kworker/2:4/188.
> [   13.363536] rtc-pm8xxx 200f000.spmi:pm8916@0:rtc@6000: setting
> system clock to 1970-01-01T00:00:19 UTC (19)
> [   13.368361] Modules linked in: rtc_pm8xxx(+) videobuf2_memops
> videobuf2_v4l2 mdt_loader videobuf2_common drm_kms_helper qcom_rng
> i2c_qcom_cci display_connector drm socinfo rmtfs_mem rfkill qrtr ns
> fuse
> [   13.382197] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
>       5.8.0-rc6-next-20200721 #1
> [   13.400208] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> [   13.409418] Workqueue:  0x0 (rcu_gp)
> [   13.416084] Call trace:
> [   13.419646]  dump_backtrace+0x0/0x1f8
> [   13.421815]  show_stack+0x2c/0x38
> [   13.425635]  dump_stack+0xf0/0x16c
> [   13.428933]  __schedule_bug+0x74/0xa0
> [   13.432234]  __schedule+0x890/0x938
> [   13.435967]  schedule+0x48/0x110
> [   13.439262]  worker_thread+0xc0/0x498
> [   13.442735]  kthread+0x158/0x168
> [   13.446294]  ret_from_fork+0x10/0x1c
>
> Full test log,
> https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20200721/testrun/2972385/suite/linux-log-parser/test/check-kernel-warning-1595062/log
>
> --
> Linaro LKFT
> https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c db410c
  2020-07-22  8:04 ` Arnd Bergmann
@ 2020-07-22 13:01   ` Paul E. McKenney
  2020-07-22 15:25     ` Saravana Kannan
  0 siblings, 1 reply; 4+ messages in thread
From: Paul E. McKenney @ 2020-07-22 13:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Naresh Kamboju, open list, rcu, Linux-Next Mailing List,
	lkft-triage, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Joel Fernandes, Stephen Rothwell, walter-zh.wu, neeraju,
	Vinod Koul, saravanak, Greg Kroah-Hartman

On Wed, Jul 22, 2020 at 10:04:21AM +0200, Arnd Bergmann wrote:
> On Wed, Jul 22, 2020 at 9:44 AM Naresh Kamboju
> <naresh.kamboju@linaro.org> wrote:
> >
> > Kernel BUG noticed on arm64 db410c device while booting linux next 20200721 tag.
> > Continually popping up these BUG messages on the boot console and kernel warning
> > also noticed.
> >
> > metadata:
> >   git branch: master
> >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> >   git commit: de2e69cfe54a8f2ed4b75f09d3110c514f45d38e
> >   git describe: next-20200721
> >   kernel-config:
> > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/dragonboard-410c/lkft/linux-next/818/config
> >   build-location:
> > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/dragonboard-410c/lkft/linux-next/818
> >
> > Crash log:
> > [    0.444349] BUG: sleeping function called from invalid context at
> > /usr/src/kernel/kernel/locking/mutex.c:935
> > [    0.444422] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
> > 30, name: kworker/0:1
> > [    0.444458] 2 locks held by kworker/0:1/30:
> > [    0.444489]  #0: ffff00000eb15138
> > ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
> > [    0.444627]  #1: ffff800013713df0
> > ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
> > process_one_work+0x208/0x768
> > [    0.444761] CPU: 0 PID: 30 Comm: kworker/0:1 Not tainted
> > 5.8.0-rc6-next-20200721 #1
> > [    0.444787] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [    0.444817] Workqueue: rcu_gp srcu_invoke_callbacks
> > [    0.444856] Call trace:
> > [    0.444883]  dump_backtrace+0x0/0x1f8
> > [    0.444910]  show_stack+0x2c/0x38
> > [    0.444939]  dump_stack+0xf0/0x16c
> > [    0.444969]  ___might_sleep+0x144/0x208
> > [    0.444996]  __might_sleep+0x54/0x90
> > [    0.445027]  __mutex_lock+0x64/0x970
> > [    0.445055]  mutex_lock_nested+0x54/0x70
> > [    0.445084]  device_del+0x44/0x3c0
> > [    0.445110]  device_unregister+0x24/0x78
> > [    0.445138]  __device_link_free_srcu+0x64/0x70
> > [    0.445164]  srcu_invoke_callbacks+0x10c/0x1a0
> 
> The device_unregister() was added to __device_link_free_srcu() as part
> of commit 287905e68dd2 ("driver core: Expose device link details in sysfs")
> 
> I'm fairly sure this is what introduced the console output, though the
> code before it is already suspicious:
> 
> call_srcu(&device_links_srcu, &link->rcu_head, __device_link_free_srcu);
> 
> According to the documentation, call_srcu() must not call any sleeping
> functions, which is what the 'in_atomic(): 1' above is about.

Yes, the callback function, in this case __device_link_free_srcu(),
will execute from softirq context.

> However, from what I can tell, the put_device() and pm_runtime_put()
> calls in device_link_free() can also potentially sleep even if they normally
> don't warn about that.

The usual workaround is to have __device_link_free_srcu() schedule a
workqueue that invokes the potentially sleeping functions.

							Thanx, Paul

>        Arnd
> 
> > [    0.445191]  process_one_work+0x2b0/0x768
> > [    0.445218]  worker_thread+0x48/0x498
> > [    0.445246]  kthread+0x158/0x168
> > [    0.445274]  ret_from_fork+0x10/0x1c
> >
> > <>
> >
> > [   13.015528] BUG: sleeping function called from invalid context at
> > /usr/src/kernel/kernel/locking/mutex.c:935
> > [   13.015588] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
> > 188, name: kworker/2:4
> > [   13.025268] 2 locks held by kworker/2:4/188:
> > [   13.032647]  #0: ffff00000eb15138
> > ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
> > [   13.036920]  #1: ffff80001426bdf0
> > ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
> > process_one_work+0x208/0x768
> > [   13.046032] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
> >       5.8.0-rc6-next-20200721 #1
> > [   13.056039] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [   13.065417] Workqueue: rcu_gp srcu_invoke_callbacks
> > [   13.072091] Call trace:
> > [   13.076695]  dump_backtrace+0x0/0x1f8
> > [   13.079124]  show_stack+0x2c/0x38
> > [   13.082944]  dump_stack+0xf0/0x16c
> > [   13.086243]  ___might_sleep+0x144/0x208
> > [   13.089542]  __might_sleep+0x54/0x90
> > [   13.093274]  __mutex_lock+0x64/0x970
> > [   13.097094]  mutex_lock_nested+0x54/0x70
> > [   13.100654]  device_del+0x44/0x3c0
> > [   13.104556]  device_unregister+0x24/0x78
> > [   13.107771]  __device_link_free_srcu+0x64/0x70
> > [   13.111850]  srcu_invoke_callbacks+0x10c/0x1a0
> > [   13.116103]  process_one_work+0x2b0/0x768
> > [   13.120530]  worker_thread+0x48/0x498
> > [   13.124610]  kthread+0x158/0x168
> > [   13.128254]  ret_from_fork+0x10/0x1c
> > [   13.131818] BUG: scheduling while atomic: kworker/2:4/188/0x00000201
> > [   13.135243] 3 locks held by kworker/2:4/188:
> > [   13.141558]  #0: ffff00000eb15138
> > ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
> > [   13.145834]  #1: ffff80001426bdf0
> > ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
> > process_one_work+0x208/0x768
> > [   13.154924]  #2: ffff8000127f2fd0 (kernfs_mutex){+.+.}-{3:3}, at:
> > __kernfs_remove+0x304/0x378
> > [   13.164952] Modules linked in: mdt_loader videobuf2_common
> > drm_kms_helper qcom_rng i2c_qcom_cci display_connector drm socinfo
> > rmtfs_mem rfkill qrtr ns fuse
> > [   13.173653] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
> >       5.8.0-rc6-next-20200721 #1
> > [   13.187296] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [   13.196666] Workqueue: rcu_gp srcu_invoke_callbacks
> > [   13.203340] Call trace:
> > [[0;32m  OK  [0m] Started Network Manager Script Dispatcher Servic[
> > 13.207942]  dump_backtrace+0x0/0x1f8
> > e.[   13.221350]  __schedule_bug+0x74/0xa0
> > [   13.221386]  __schedule+0x890/0x938
> > [   13.224160]  schedule+0x48/0x110
> > [   13.227462]  schedule_preempt_disabled+0x1c/0x30
> > [   13.230933]  __mutex_lock+0x814/0x970
> > [   13.235533]  mutex_lock_nested+0x54/0x70
> > [   13.239091]  __kernfs_remove+0x304/0x378
> > [   13.243085]  kernfs_remove_by_name_ns+0x5c/0xc0
> > [   13.246991]  sysfs_remove_link+0x30/0x60
> > [   13.251246]  device_remove_class_symlinks+0x78/0xa8
> > [   13.255411]  device_del+0xb8/0x3c0
> >
> > [   13.260011]  device_unregister+0x24/0x78
> > [   13.263744]  __device_link_free_srcu+0x64/0x70
> > [   13.267746]  srcu_invoke_callbacks+0x10c/0x1a0
> > [   13.271992]  process_one_work+0x2b0/0x768
> > [   13.276417]  worker_thread+0x48/0x498
> > [   13.280499]  kthread+0x158/0x168
> > [   13.284144]  ret_from_fork+0x10/0x1c
> > [   13.289062] BUG: workqueue leaked lock or atomic: kworker/2:4/0xfffffe00/188
> > [   13.289062]      last function: srcu_invoke_callbacks
> > [   13.291056] no locks held by kworker/2:4/188.
> > [   13.302935] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
> >       5.8.0-rc6-next-20200721 #1
> > [   13.305703] rtc-pm8xxx 200f000.spmi:pm8916@0:rtc@6000: registered as rtc0
> > [   13.307317] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [   13.307333] Workqueue: rcu_gp srcu_invoke_callbacks
> > [   13.307349] Call trace:
> > [   13.307360]  dump_backtrace+0x0/0x1f8
> > [   13.307369]  show_stack+0x2c/0x38
> > [   13.307381]  dump_stack+0xf0/0x16c
> > [   13.307392]  process_one_work+0x624/0x768
> > [   13.347424]  worker_thread+0x48/0x498
> > [   13.351501]  kthread+0x158/0x168
> > [   13.355146]  ret_from_fork+0x10/0x1c
> > [   13.358487] BUG: scheduling while atomic: kworker/2:4/188/0xfffffe01
> > [   13.362056] no locks held by kworker/2:4/188.
> > [   13.363536] rtc-pm8xxx 200f000.spmi:pm8916@0:rtc@6000: setting
> > system clock to 1970-01-01T00:00:19 UTC (19)
> > [   13.368361] Modules linked in: rtc_pm8xxx(+) videobuf2_memops
> > videobuf2_v4l2 mdt_loader videobuf2_common drm_kms_helper qcom_rng
> > i2c_qcom_cci display_connector drm socinfo rmtfs_mem rfkill qrtr ns
> > fuse
> > [   13.382197] CPU: 2 PID: 188 Comm: kworker/2:4 Tainted: G        W
> >       5.8.0-rc6-next-20200721 #1
> > [   13.400208] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [   13.409418] Workqueue:  0x0 (rcu_gp)
> > [   13.416084] Call trace:
> > [   13.419646]  dump_backtrace+0x0/0x1f8
> > [   13.421815]  show_stack+0x2c/0x38
> > [   13.425635]  dump_stack+0xf0/0x16c
> > [   13.428933]  __schedule_bug+0x74/0xa0
> > [   13.432234]  __schedule+0x890/0x938
> > [   13.435967]  schedule+0x48/0x110
> > [   13.439262]  worker_thread+0xc0/0x498
> > [   13.442735]  kthread+0x158/0x168
> > [   13.446294]  ret_from_fork+0x10/0x1c
> >
> > Full test log,
> > https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20200721/testrun/2972385/suite/linux-log-parser/test/check-kernel-warning-1595062/log
> >
> > --
> > Linaro LKFT
> > https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: BUG: sleeping function called from invalid context at kernel/locking/mutex.c db410c
  2020-07-22 13:01   ` Paul E. McKenney
@ 2020-07-22 15:25     ` Saravana Kannan
  0 siblings, 0 replies; 4+ messages in thread
From: Saravana Kannan @ 2020-07-22 15:25 UTC (permalink / raw)
  To: paulmck
  Cc: Arnd Bergmann, Naresh Kamboju, open list, rcu,
	Linux-Next Mailing List, lkft-triage, Peter Zijlstra,
	Ingo Molnar, Will Deacon, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes,
	Stephen Rothwell, walter-zh.wu, neeraju, Vinod Koul,
	Greg Kroah-Hartman

On Wed, Jul 22, 2020 at 6:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Wed, Jul 22, 2020 at 10:04:21AM +0200, Arnd Bergmann wrote:
> > On Wed, Jul 22, 2020 at 9:44 AM Naresh Kamboju
> > <naresh.kamboju@linaro.org> wrote:
> > >
> > > Kernel BUG noticed on arm64 db410c device while booting linux next 20200721 tag.
> > > Continually popping up these BUG messages on the boot console and kernel warning
> > > also noticed.
> > >
> > > metadata:
> > >   git branch: master
> > >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > >   git commit: de2e69cfe54a8f2ed4b75f09d3110c514f45d38e
> > >   git describe: next-20200721
> > >   kernel-config:
> > > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/dragonboard-410c/lkft/linux-next/818/config
> > >   build-location:
> > > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/dragonboard-410c/lkft/linux-next/818
> > >
> > > Crash log:
> > > [    0.444349] BUG: sleeping function called from invalid context at
> > > /usr/src/kernel/kernel/locking/mutex.c:935
> > > [    0.444422] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
> > > 30, name: kworker/0:1
> > > [    0.444458] 2 locks held by kworker/0:1/30:
> > > [    0.444489]  #0: ffff00000eb15138
> > > ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x208/0x768
> > > [    0.444627]  #1: ffff800013713df0
> > > ((work_completion)(&sdp->work)){+.+.}-{0:0}, at:
> > > process_one_work+0x208/0x768
> > > [    0.444761] CPU: 0 PID: 30 Comm: kworker/0:1 Not tainted
> > > 5.8.0-rc6-next-20200721 #1
> > > [    0.444787] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > > [    0.444817] Workqueue: rcu_gp srcu_invoke_callbacks
> > > [    0.444856] Call trace:
> > > [    0.444883]  dump_backtrace+0x0/0x1f8
> > > [    0.444910]  show_stack+0x2c/0x38
> > > [    0.444939]  dump_stack+0xf0/0x16c
> > > [    0.444969]  ___might_sleep+0x144/0x208
> > > [    0.444996]  __might_sleep+0x54/0x90
> > > [    0.445027]  __mutex_lock+0x64/0x970
> > > [    0.445055]  mutex_lock_nested+0x54/0x70
> > > [    0.445084]  device_del+0x44/0x3c0
> > > [    0.445110]  device_unregister+0x24/0x78
> > > [    0.445138]  __device_link_free_srcu+0x64/0x70
> > > [    0.445164]  srcu_invoke_callbacks+0x10c/0x1a0
> >
> > The device_unregister() was added to __device_link_free_srcu() as part
> > of commit 287905e68dd2 ("driver core: Expose device link details in sysfs")
> >
> > I'm fairly sure this is what introduced the console output, though the
> > code before it is already suspicious:
> >
> > call_srcu(&device_links_srcu, &link->rcu_head, __device_link_free_srcu);
> >
> > According to the documentation, call_srcu() must not call any sleeping
> > functions, which is what the 'in_atomic(): 1' above is about.
>
> Yes, the callback function, in this case __device_link_free_srcu(),
> will execute from softirq context.
>
> > However, from what I can tell, the put_device() and pm_runtime_put()
> > calls in device_link_free() can also potentially sleep even if they normally
> > don't warn about that.
>
> The usual workaround is to have __device_link_free_srcu() schedule a
> workqueue that invokes the potentially sleeping functions.
>
>                                                         Thanx, Paul

Thanks for reporting. It was already reported and fixed last week.
Waiting on Greg to pull it in.
https://lore.kernel.org/lkml/CAGETcx9quqtV9TeD8WEcLLijGeV+7hXqnyiMv7F8xo+vyF_7iw@mail.gmail.com/#t


-Saravana

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-22  7:43 BUG: sleeping function called from invalid context at kernel/locking/mutex.c db410c Naresh Kamboju
2020-07-22  8:04 ` Arnd Bergmann
2020-07-22 13:01   ` Paul E. McKenney
2020-07-22 15:25     ` Saravana Kannan

RCU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/rcu/0 rcu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 rcu rcu/ https://lore.kernel.org/rcu \
		rcu@vger.kernel.org
	public-inbox-index rcu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.rcu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git