From: Marek Szyprowski <m.szyprowski@samsung.com>
To: Guenter Roeck <linux@roeck-us.net>,
	Saravana Kannan <saravanak@google.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>
Cc: kernel-team@android.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] driver core: Fix scheduling while atomic warnings during device link deletion
Date: Thu, 16 Jul 2020 07:48:42 +0200
Message-ID: <99bf8e73-1daa-34b8-ca7e-093a44fdba9b@samsung.com>
In-Reply-To: <e3f4469e-8467-1736-8a39-6539b9f542af@roeck-us.net>

Hi

On 16.07.2020 07:30, Guenter Roeck wrote:
> On 7/15/20 10:08 PM, Saravana Kannan wrote:
>> Marek and Guenter reported that commit 287905e68dd2 ("driver core:
>> Expose device link details in sysfs") caused sleeping/scheduling while
>> atomic warnings.
>>
>> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 12, name: kworker/0:1
>> 2 locks held by kworker/0:1/12:
>>    #0: ee8074a8 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x174/0x7dc
>>    #1: ee921f20 ((work_completion)(&sdp->work)){+.+.}-{0:0}, at: process_one_work+0x174/0x7dc
>> Preemption disabled at:
>> [<c01b10f0>] srcu_invoke_callbacks+0xc0/0x154
>> ----- 8< ----- SNIP
>> [<c064590c>] (device_del) from [<c0645c9c>] (device_unregister+0x24/0x64)
>> [<c0645c9c>] (device_unregister) from [<c01b10fc>] (srcu_invoke_callbacks+0xcc/0x154)
>> [<c01b10fc>] (srcu_invoke_callbacks) from [<c01493c4>] (process_one_work+0x234/0x7dc)
>> [<c01493c4>] (process_one_work) from [<c01499b0>] (worker_thread+0x44/0x51c)
>> [<c01499b0>] (worker_thread) from [<c0150bf4>] (kthread+0x158/0x1a0)
>> [<c0150bf4>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20)
>> Exception stack(0xee921fb0 to 0xee921ff8)
>>
>> This was caused by the device link device being released in the context
>> of srcu_invoke_callbacks().  There is no need to wait till the RCU
>> callback to release the device link device.  So release the device
>> earlier and revert the RCU callback code to what it was before
>> commit 287905e68dd2 ("driver core: Expose device link details in sysfs")
>>
>> Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs")
>> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> Reported-by: Guenter Roeck <linux@roeck-us.net>
>> Signed-off-by: Saravana Kannan <saravanak@google.com>
>> ---
>> Marek and Guenter,
>>
>> I haven't had a chance to test this yet. Can one of you please test it
>> and confirm it fixes the issue?
>>
> With this patch applied, the original warning is gone, but I get lots
> of other warnings.
>
> WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> Device 'regulators:regulator@0:50038000.ethernet' does not have a release() function, it is broken and must be fixed.
>
> WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> Device '53f9c000.gpio:50038000.ethernet' does not have a release() function, it is broken and must be fixed.
>
> WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0xa4
> Device '50030000.tscadc:50030400.tcq' does not have a release() function, it is broken and must be fixed.

I confirm that I also get such warnings for every platform device in the 
system with this patch applied to linux next-20200715:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/base/core.c:1790 device_release+0x94/0x98
Device '10023c40.power-domain:13620000.sysmmu' does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc5-next-20200715-00002-g0f637964c4b0 #1270
Hardware name: Samsung Exynos (Flattened Device Tree)
[<c011184c>] (unwind_backtrace) from [<c010d250>] (show_stack+0x10/0x14)
[<c010d250>] (show_stack) from [<c051b8fc>] (dump_stack+0xbc/0xe8)
[<c051b8fc>] (dump_stack) from [<c0126ed8>] (__warn+0xf0/0x108)
[<c0126ed8>] (__warn) from [<c0126f64>] (warn_slowpath_fmt+0x74/0xb8)
[<c0126f64>] (warn_slowpath_fmt) from [<c064a2a0>] (device_release+0x94/0x98)
[<c064a2a0>] (device_release) from [<c0522178>] (kobject_put+0x104/0x288)
[<c0522178>] (kobject_put) from [<c064b45c>] (__device_link_del+0x38/0xac)
[<c064b45c>] (__device_link_del) from [<c064c1f0>] (device_links_driver_bound+0x260/0x26c)
[<c064c1f0>] (device_links_driver_bound) from [<c0650af0>] (driver_bound+0x5c/0x110)
[<c0650af0>] (driver_bound) from [<c0651038>] (really_probe+0x2d4/0x4fc)
[<c0651038>] (really_probe) from [<c06513c8>] (driver_probe_device+0x78/0x1fc)
[<c06513c8>] (driver_probe_device) from [<c064ee00>] (bus_for_each_drv+0x74/0xb8)
[<c064ee00>] (bus_for_each_drv) from [<c0650cc4>] (__device_attach+0xd4/0x16c)
[<c0650cc4>] (__device_attach) from [<c064fdc4>] (bus_probe_device+0x88/0x90)
[<c064fdc4>] (bus_probe_device) from [<c064c604>] (fw_devlink_resume+0xa0/0x134)
[<c064c604>] (fw_devlink_resume) from [<c102bfd4>] (of_platform_default_populate_init+0xa8/0xc0)
[<c102bfd4>] (of_platform_default_populate_init) from [<c0102378>] (do_one_initcall+0x8c/0x424)
[<c0102378>] (do_one_initcall) from [<c1001158>] (kernel_init_freeable+0x190/0x204)
[<c1001158>] (kernel_init_freeable) from [<c0ac05d0>] (kernel_init+0x8/0x118)
[<c0ac05d0>] (kernel_init) from [<c0100114>] (ret_from_fork+0x14/0x20)
Exception stack(0xef0dffb0 to 0xef0dfff8)
ffa0:                                     00000000 00000000 00000000 00000000
ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ffe0: 00000000 00000000 00000000 00000000 00000013 00000000
irq event stamp: 40543
hardirqs last  enabled at (40551): [<c019d624>] console_unlock+0x430/0x6cc
hardirqs last disabled at (40568): [<c019d348>] console_unlock+0x154/0x6cc
softirqs last  enabled at (40584): [<c010174c>] __do_softirq+0x50c/0x608
softirqs last disabled at (40595): [<c0130218>] irq_exit+0x168/0x16c
---[ end trace 1d4780a89f63483a ]---

> and so on. I don't know if this is caused by this patch or by
> some other patch in -next.

This is caused by patch 287905e68dd2 ("driver core: Expose device link 
details in sysfs"). If you revert it, the warning will go away.
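
For illustration only, here is a minimal userspace sketch of the kind of
check that fires in the trace above: when the last reference to an object
is dropped and no release callback is set, a warning is emitted instead
of the object being freed. This is not kernel code. All names
(model_device, model_put_device and so on) are hypothetical and only
mimic the shape of the device/kobject refcounting; the real
device_release() is more involved than this model.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace model; NOT the kernel's struct device. */
struct model_device {
	const char *name;
	int refcount;
	void (*release)(struct model_device *dev);
};

int model_warnings;	/* how many "no release()" warnings were emitted */
int model_released;	/* how many objects were released via a callback */

/* Initialise the object with one reference held by the caller. */
void model_device_init(struct model_device *dev, const char *name,
		       void (*release)(struct model_device *dev))
{
	dev->name = name;
	dev->refcount = 1;
	dev->release = release;
}

/* Take an additional reference. */
void model_get_device(struct model_device *dev)
{
	dev->refcount++;
}

/*
 * Drop a reference. On the final put, call the release callback if one
 * is set; otherwise warn, because nothing can free the object.
 */
void model_put_device(struct model_device *dev)
{
	if (--dev->refcount > 0)
		return;

	if (dev->release) {
		dev->release(dev);
	} else {
		model_warnings++;
		fprintf(stderr,
			"Device '%s' does not have a release() function\n",
			dev->name);
	}
}

/* Example release callback for a heap-allocated object. */
void model_link_release(struct model_device *dev)
{
	model_released++;
	free(dev);
}
```

In this model, an object initialised with a NULL release callback warns
on its final put, while one initialised with model_link_release() is
freed silently once its last reference is dropped.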

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


