From: Li Zhong <zhong@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
gregkh@linuxfoundation.org, rafael.j.wysocki@intel.com,
toshi.kani@hp.com
Subject: [RFC PATCH v2] Use kernfs_break_active_protection() for device online store callbacks
Date: Fri, 11 Apr 2014 12:10:45 +0800
Message-ID: <1397189445.3649.14.camel@ThinkPad-T5421>
In-Reply-To: <20140410133116.GB25308@htj.dyndns.org>
I noticed the following lockdep warning when trying to hot-remove CPUs via ACPI:
[84154.204080] ======================================================
[84154.204080] [ INFO: possible circular locking dependency detected ]
[84154.204080] 3.14.0-next-20140408+ #24 Tainted: G W
[84154.204080] -------------------------------------------------------
[84154.204080] bash/777 is trying to acquire lock:
[84154.204080] (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810664a7>] cpu_maps_update_begin+0x17/0x20
[84154.213203]
[84154.213203] but task is already holding lock:
[84154.213203] (s_active#79){++++.+}, at: [<ffffffff81256e14>] kernfs_fop_write+0xe4/0x190
[84154.213203]
[84154.213203] which lock already depends on the new lock.
[84154.213203]
[84154.213203]
[84154.213203] the existing dependency chain (in reverse order) is:
[84154.213203]
-> #3 (s_active#79){++++.+}:
[84154.213203] [<ffffffff810c408b>] lock_acquire+0x9b/0x1d0
[84154.213203] [<ffffffff812552e0>] __kernfs_remove+0x250/0x310
[84154.213203] [<ffffffff81256470>] kernfs_remove_by_name_ns+0x50/0xc0
[84154.213203] [<ffffffff81257af5>] sysfs_remove_file_ns+0x15/0x20
[84154.213203] [<ffffffff813df339>] device_remove_file+0x19/0x20
[84154.213203] [<ffffffff813dff33>] device_remove_attrs+0x33/0x80
[84154.213203] [<ffffffff813e00a7>] device_del+0x127/0x1c0
[84154.213203] [<ffffffff813e0162>] device_unregister+0x22/0x60
[84154.213203] [<ffffffff813e66de>] unregister_cpu+0x1e/0x40
[84154.213203] [<ffffffff81009a33>] arch_unregister_cpu+0x23/0x30
[84154.213203] [<ffffffff8136f619>] acpi_processor_remove+0x8d/0xb2
[84154.213203] [<ffffffff8136cfff>] acpi_bus_trim+0x5a/0x8d
[84154.213203] [<ffffffff8136ec3b>] acpi_device_hotplug+0x1a8/0x3ec
[84154.213203] [<ffffffff81369002>] acpi_hotplug_work_fn+0x1f/0x2b
[84154.213203] [<ffffffff8108754b>] process_one_work+0x1eb/0x6b0
[84154.213203] [<ffffffff81087e9b>] worker_thread+0x11b/0x370
[84154.213203] [<ffffffff81090324>] kthread+0xe4/0x100
[84154.213203] [<ffffffff815d2f2c>] ret_from_fork+0x7c/0xb0
[84154.213203]
-> #2 (cpu_hotplug.lock#2){+.+.+.}:
[84154.213203] [<ffffffff810c408b>] lock_acquire+0x9b/0x1d0
[84154.213203] [<ffffffff815c7700>] mutex_lock_nested+0x50/0x3c0
[84154.213203] [<ffffffff810665bf>] cpu_hotplug_begin+0x4f/0x80
[84154.213203] [<ffffffff8106666f>] _cpu_up+0x3f/0x160
[84154.213203] [<ffffffff810667f9>] cpu_up+0x69/0x80
[84154.213203] [<ffffffff81b18f14>] smp_init+0x60/0x8c
[84154.213203] [<ffffffff81b00fd8>] kernel_init_freeable+0x126/0x23b
[84154.213203] [<ffffffff815b4a3e>] kernel_init+0xe/0xf0
[84154.213203] [<ffffffff815d2f2c>] ret_from_fork+0x7c/0xb0
[84154.213203]
-> #1 (cpu_hotplug.lock){++++++}:
[84154.213203] [<ffffffff810c408b>] lock_acquire+0x9b/0x1d0
[84154.213203] [<ffffffff810665b1>] cpu_hotplug_begin+0x41/0x80
[84154.213203] [<ffffffff8106666f>] _cpu_up+0x3f/0x160
[84154.213203] [<ffffffff810667f9>] cpu_up+0x69/0x80
[84154.213203] [<ffffffff81b18f14>] smp_init+0x60/0x8c
[84154.213203] [<ffffffff81b00fd8>] kernel_init_freeable+0x126/0x23b
[84154.213203] [<ffffffff815b4a3e>] kernel_init+0xe/0xf0
[84154.213203] [<ffffffff815d2f2c>] ret_from_fork+0x7c/0xb0
[84154.213203]
-> #0 (cpu_add_remove_lock){+.+.+.}:
[84154.213203] [<ffffffff810c397a>] __lock_acquire+0x1f2a/0x1f60
[84154.213203] [<ffffffff810c408b>] lock_acquire+0x9b/0x1d0
[84154.213203] [<ffffffff815c7700>] mutex_lock_nested+0x50/0x3c0
[84154.213203] [<ffffffff810664a7>] cpu_maps_update_begin+0x17/0x20
[84154.213203] [<ffffffff815b582d>] cpu_down+0x1d/0x50
[84154.213203] [<ffffffff813e63b4>] cpu_subsys_offline+0x14/0x20
[84154.213203] [<ffffffff813e145d>] device_offline+0xad/0xd0
[84154.213203] [<ffffffff813e1562>] online_store+0x42/0x80
[84154.213203] [<ffffffff813deab8>] dev_attr_store+0x18/0x30
[84154.213203] [<ffffffff81257bb9>] sysfs_kf_write+0x49/0x60
[84154.213203] [<ffffffff81256e39>] kernfs_fop_write+0x109/0x190
[84154.213203] [<ffffffff811d15be>] vfs_write+0xbe/0x1c0
[84154.213203] [<ffffffff811d1a52>] SyS_write+0x52/0xb0
[84154.213203] [<ffffffff815d3162>] tracesys+0xd0/0xd5
[84154.213203]
[84154.213203] other info that might help us debug this:
[84154.213203]
[84154.213203] Chain exists of:
cpu_add_remove_lock --> cpu_hotplug.lock#2 --> s_active#79
[84154.213203] Possible unsafe locking scenario:
[84154.213203]
[84154.213203]        CPU0                    CPU1
[84154.213203]        ----                    ----
[84154.213203]   lock(s_active#79);
[84154.213203]                                lock(cpu_hotplug.lock#2);
[84154.213203]                                lock(s_active#79);
[84154.213203]   lock(cpu_add_remove_lock);
[84154.213203]
[84154.213203] *** DEADLOCK ***
.............
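To make the reported chain easier to follow, here is a toy model of the ordering check, written as a small userspace C sketch. It is not lockdep's implementation; the enum, struct and function names are all hypothetical, and the two cpu_hotplug.lock classes from the report are folded into one for brevity. An edge "held -> acquired" is recorded whenever one lock is taken while another is held, and a new acquisition is flagged if it would close a cycle.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the ones in the report. */
enum lock_class {
	CPU_ADD_REMOVE_LOCK,
	CPU_HOTPLUG_LOCK,	/* stands in for both cpu_hotplug.lock classes */
	S_ACTIVE,
	NR_LOCK_CLASSES
};

/* dep[a][b] is true when b has been acquired while a was held. */
struct lock_graph {
	bool dep[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

void add_dependency(struct lock_graph *g, enum lock_class held,
		    enum lock_class acquired)
{
	g->dep[held][acquired] = true;
}

/* Depth-first search: is 'target' reachable from 'from'? */
bool reaches(const struct lock_graph *g, enum lock_class from,
	     enum lock_class target, bool *visited)
{
	if (from == target)
		return true;
	if (visited[from])
		return false;
	visited[from] = true;

	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (g->dep[from][next] &&
		    reaches(g, (enum lock_class)next, target, visited))
			return true;
	}
	return false;
}

/*
 * Taking 'acquired' while holding 'held' closes a cycle if 'held' is
 * already reachable from 'acquired' through the recorded dependencies.
 */
bool would_deadlock(const struct lock_graph *g, enum lock_class held,
		    enum lock_class acquired)
{
	bool visited[NR_LOCK_CLASSES] = { false };

	return reaches(g, acquired, held, visited);
}

/* The existing chain: cpu_add_remove_lock --> cpu_hotplug.lock --> s_active */
struct lock_graph report_graph(void)
{
	struct lock_graph g;

	memset(&g, 0, sizeof(g));
	add_dependency(&g, CPU_ADD_REMOVE_LOCK, CPU_HOTPLUG_LOCK);
	add_dependency(&g, CPU_HOTPLUG_LOCK, S_ACTIVE);
	return g;
}
```

With the graph from report_graph(), would_deadlock(&g, S_ACTIVE, CPU_ADD_REMOVE_LOCK) is true, which corresponds to the sysfs write path (holding s_active) calling into cpu_down().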
The deadlock itself seems to have already been fixed in commit 5e33bc41.
As Tejun suggested, to avoid this lockdep warning, call
kernfs_break_active_protection() before the online/offline callbacks so
that the s_active lock is taken out of the dependency chain during
online/offline operations.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
drivers/base/core.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 0dd6528..2b9f68e 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -439,6 +439,7 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr,
 {
 	bool val;
 	int ret;
+	struct kernfs_node *kn;
 
 	ret = strtobool(buf, &val);
 	if (ret < 0)
@@ -448,7 +449,15 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr,
 	if (ret)
 		return ret;
 
+	kn = kernfs_find_and_get(dev->kobj.sd, attr->attr.name);
+	if (WARN_ON_ONCE(!kn))
+		goto out;
+
+	kernfs_break_active_protection(kn);
+
 	ret = val ? device_online(dev) : device_offline(dev);
+	kernfs_unbreak_active_protection(kn);
+out:
 	unlock_device_hotplug();
 	return ret < 0 ? ret : count;
 }
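
For readers less familiar with active protection, the following is a minimal userspace sketch of the idea the patch relies on. It is only a model under stated assumptions: struct toy_node and every toy_*() function are hypothetical stand-ins, not kernfs code, and the "concurrent" removal is simulated by a direct call rather than a second thread. The write path holds an active reference while the store callback runs, and removal cannot finish until active references drain; breaking active protection drops that reference for the duration of the callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernfs node; not the kernel's struct. */
struct toy_node {
	int refcount;	/* keeps the object itself alive */
	int active;	/* in-flight operations that block removal */
	bool removed;
};

void toy_get(struct toy_node *n) { n->refcount++; }
void toy_put(struct toy_node *n) { assert(n->refcount > 0); n->refcount--; }

/* Taken by the write path before it calls the store callback. */
bool toy_get_active(struct toy_node *n)
{
	if (n->removed)
		return false;
	n->active++;
	return true;
}

void toy_put_active(struct toy_node *n)
{
	assert(n->active > 0);
	n->active--;
}

/* Removal can only finish once no active references remain. */
bool toy_try_remove(struct toy_node *n)
{
	if (n->active > 0)
		return false;	/* real code would sleep here instead */
	n->removed = true;
	return true;
}

/* Drop and later restore the caller's own active reference. */
void toy_break_active_protection(struct toy_node *n) { toy_put_active(n); }
void toy_unbreak_active_protection(struct toy_node *n) { n->active++; }

/*
 * Model of a store callback. Returns true if a removal racing with the
 * callback was able to complete while the callback was running.
 */
bool toy_store(struct toy_node *n, bool break_protection)
{
	bool removal_completed;

	if (!toy_get_active(n))
		return false;

	toy_get(n);		/* pin the node so it stays valid */
	if (break_protection)
		toy_break_active_protection(n);

	/* Simulate the hot-remove worker racing with this callback. */
	removal_completed = toy_try_remove(n);

	if (break_protection)
		toy_unbreak_active_protection(n);
	toy_put(n);		/* every get is paired with a put */

	toy_put_active(n);
	return removal_completed;
}
```

In this model toy_store(n, false) leaves the removal blocked, while toy_store(n, true) lets it complete. The sketch pairs toy_get() with toy_put(); in the kernel, the reference returned by kernfs_find_and_get() is likewise meant to be released with kernfs_put().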