From: Li Zhong <zhong@linux.vnet.ibm.com>
To: Tejun Heo <tj@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	gregkh@linuxfoundation.org, rafael.j.wysocki@intel.com,
	toshi.kani@hp.com
Subject: [RFC PATCH v2] Use kernfs_break_active_protection() for device online store callbacks
Date: Fri, 11 Apr 2014 12:10:45 +0800
Message-ID: <1397189445.3649.14.camel@ThinkPad-T5421>
In-Reply-To: <20140410133116.GB25308@htj.dyndns.org>

I noticed the following lockdep warning when trying to hot-remove CPUs via ACPI:

[84154.204080] ======================================================
[84154.204080] [ INFO: possible circular locking dependency detected ]
[84154.204080] 3.14.0-next-20140408+ #24 Tainted: G        W    
[84154.204080] -------------------------------------------------------
[84154.204080] bash/777 is trying to acquire lock:
[84154.204080]  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810664a7>] cpu_maps_update_begin+0x17/0x20
[84154.213203] 
[84154.213203] but task is already holding lock:
[84154.213203]  (s_active#79){++++.+}, at: [<ffffffff81256e14>] kernfs_fop_write+0xe4/0x190
[84154.213203] 
[84154.213203] which lock already depends on the new lock.
[84154.213203] 
[84154.213203] 
[84154.213203] the existing dependency chain (in reverse order) is:
[84154.213203] 
-> #3 (s_active#79){++++.+}:
[84154.213203]        [<ffffffff810c408b>] lock_acquire+0x9b/0x1d0
[84154.213203]        [<ffffffff812552e0>] __kernfs_remove+0x250/0x310
[84154.213203]        [<ffffffff81256470>] kernfs_remove_by_name_ns+0x50/0xc0
[84154.213203]        [<ffffffff81257af5>] sysfs_remove_file_ns+0x15/0x20
[84154.213203]        [<ffffffff813df339>] device_remove_file+0x19/0x20
[84154.213203]        [<ffffffff813dff33>] device_remove_attrs+0x33/0x80
[84154.213203]        [<ffffffff813e00a7>] device_del+0x127/0x1c0
[84154.213203]        [<ffffffff813e0162>] device_unregister+0x22/0x60
[84154.213203]        [<ffffffff813e66de>] unregister_cpu+0x1e/0x40
[84154.213203]        [<ffffffff81009a33>] arch_unregister_cpu+0x23/0x30
[84154.213203]        [<ffffffff8136f619>] acpi_processor_remove+0x8d/0xb2
[84154.213203]        [<ffffffff8136cfff>] acpi_bus_trim+0x5a/0x8d
[84154.213203]        [<ffffffff8136ec3b>] acpi_device_hotplug+0x1a8/0x3ec
[84154.213203]        [<ffffffff81369002>] acpi_hotplug_work_fn+0x1f/0x2b
[84154.213203]        [<ffffffff8108754b>] process_one_work+0x1eb/0x6b0
[84154.213203]        [<ffffffff81087e9b>] worker_thread+0x11b/0x370
[84154.213203]        [<ffffffff81090324>] kthread+0xe4/0x100
[84154.213203]        [<ffffffff815d2f2c>] ret_from_fork+0x7c/0xb0
[84154.213203] 
-> #2 (cpu_hotplug.lock#2){+.+.+.}:
[84154.213203]        [<ffffffff810c408b>] lock_acquire+0x9b/0x1d0
[84154.213203]        [<ffffffff815c7700>] mutex_lock_nested+0x50/0x3c0
[84154.213203]        [<ffffffff810665bf>] cpu_hotplug_begin+0x4f/0x80
[84154.213203]        [<ffffffff8106666f>] _cpu_up+0x3f/0x160
[84154.213203]        [<ffffffff810667f9>] cpu_up+0x69/0x80
[84154.213203]        [<ffffffff81b18f14>] smp_init+0x60/0x8c
[84154.213203]        [<ffffffff81b00fd8>] kernel_init_freeable+0x126/0x23b
[84154.213203]        [<ffffffff815b4a3e>] kernel_init+0xe/0xf0
[84154.213203]        [<ffffffff815d2f2c>] ret_from_fork+0x7c/0xb0
[84154.213203] 
-> #1 (cpu_hotplug.lock){++++++}:
[84154.213203]        [<ffffffff810c408b>] lock_acquire+0x9b/0x1d0
[84154.213203]        [<ffffffff810665b1>] cpu_hotplug_begin+0x41/0x80
[84154.213203]        [<ffffffff8106666f>] _cpu_up+0x3f/0x160
[84154.213203]        [<ffffffff810667f9>] cpu_up+0x69/0x80
[84154.213203]        [<ffffffff81b18f14>] smp_init+0x60/0x8c
[84154.213203]        [<ffffffff81b00fd8>] kernel_init_freeable+0x126/0x23b
[84154.213203]        [<ffffffff815b4a3e>] kernel_init+0xe/0xf0
[84154.213203]        [<ffffffff815d2f2c>] ret_from_fork+0x7c/0xb0
[84154.213203] 
-> #0 (cpu_add_remove_lock){+.+.+.}:
[84154.213203]        [<ffffffff810c397a>] __lock_acquire+0x1f2a/0x1f60
[84154.213203]        [<ffffffff810c408b>] lock_acquire+0x9b/0x1d0
[84154.213203]        [<ffffffff815c7700>] mutex_lock_nested+0x50/0x3c0
[84154.213203]        [<ffffffff810664a7>] cpu_maps_update_begin+0x17/0x20
[84154.213203]        [<ffffffff815b582d>] cpu_down+0x1d/0x50
[84154.213203]        [<ffffffff813e63b4>] cpu_subsys_offline+0x14/0x20
[84154.213203]        [<ffffffff813e145d>] device_offline+0xad/0xd0
[84154.213203]        [<ffffffff813e1562>] online_store+0x42/0x80
[84154.213203]        [<ffffffff813deab8>] dev_attr_store+0x18/0x30
[84154.213203]        [<ffffffff81257bb9>] sysfs_kf_write+0x49/0x60
[84154.213203]        [<ffffffff81256e39>] kernfs_fop_write+0x109/0x190
[84154.213203]        [<ffffffff811d15be>] vfs_write+0xbe/0x1c0
[84154.213203]        [<ffffffff811d1a52>] SyS_write+0x52/0xb0
[84154.213203]        [<ffffffff815d3162>] tracesys+0xd0/0xd5
[84154.213203] 
[84154.213203] other info that might help us debug this:
[84154.213203] 
[84154.213203] Chain exists of:
  cpu_add_remove_lock --> cpu_hotplug.lock#2 --> s_active#79

[84154.213203]  Possible unsafe locking scenario:
[84154.213203] 
[84154.213203]        CPU0                    CPU1
[84154.213203]        ----                    ----
[84154.213203]   lock(s_active#79);
[84154.213203]                                lock(cpu_hotplug.lock#2);
[84154.213203]                                lock(s_active#79);
[84154.213203]   lock(cpu_add_remove_lock);
[84154.213203] 
[84154.213203]  *** DEADLOCK ***
.............

The deadlock itself seems to have already been fixed by commit 5e33bc41.

As Tejun suggested, this patch avoids the lockdep warning by calling
kernfs_break_active_protection() before the online/offline callbacks,
which takes the s_active lock out of the dependency chain for the
duration of the online/offline operation.
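
To make the reasoning concrete, here is a small userspace sketch (not
kernel code) of the lock-order check that lockdep is doing. It records
the chain from the report above and asks whether online_store() taking
cpu_add_remove_lock closes a cycle, with and without active protection
broken. All names in it (lock_graph, record_dependency(),
would_deadlock(), online_store_closes_cycle()) are made up for
illustration, and it assumes that breaking active protection is
equivalent to not holding s_active across the callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the lockdep report (hypothetical model). */
enum lock_class {
	CPU_ADD_REMOVE_LOCK,
	CPU_HOTPLUG_LOCK,
	S_ACTIVE,
	NR_LOCK_CLASSES
};

/* dep[a][b] is true when some path acquired b while holding a. */
struct lock_graph {
	bool dep[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

static void record_dependency(struct lock_graph *g, enum lock_class held,
			      enum lock_class acquired)
{
	assert(held < NR_LOCK_CLASSES && acquired < NR_LOCK_CLASSES);
	g->dep[held][acquired] = true;
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, enum lock_class from,
		      enum lock_class to, bool *visited)
{
	if (from == to)
		return true;
	visited[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (g->dep[from][next] && !visited[next] &&
		    reachable(g, (enum lock_class)next, to, visited))
			return true;
	}
	return false;
}

/*
 * Taking 'acquired' while holding 'held' adds the edge held -> acquired.
 * That closes a cycle if 'held' is already reachable from 'acquired'.
 */
static bool would_deadlock(const struct lock_graph *g, enum lock_class held,
			   enum lock_class acquired)
{
	bool visited[NR_LOCK_CLASSES] = { false };

	return reachable(g, acquired, held, visited);
}

/* "Chain exists of" line from the report. */
static void init_existing_chain(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
	record_dependency(g, CPU_ADD_REMOVE_LOCK, CPU_HOTPLUG_LOCK);
	record_dependency(g, CPU_HOTPLUG_LOCK, S_ACTIVE);
}

/*
 * Model of online_store(): the write path holds s_active unless active
 * protection was broken, then cpu_down() takes cpu_add_remove_lock.
 */
static bool online_store_closes_cycle(bool break_active_protection)
{
	struct lock_graph g;
	bool holds_s_active = !break_active_protection;

	init_existing_chain(&g);
	return holds_s_active &&
	       would_deadlock(&g, S_ACTIVE, CPU_ADD_REMOVE_LOCK);
}
```

Calling online_store_closes_cycle(false) models the current code and
reports a cycle; online_store_closes_cycle(true) models the patched
code and does not, because no s_active -> cpu_add_remove_lock edge is
ever recorded.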

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 drivers/base/core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 0dd6528..2b9f68e 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -439,6 +439,7 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr,
 {
 	bool val;
 	int ret;
+	struct kernfs_node *kn;
 
 	ret = strtobool(buf, &val);
 	if (ret < 0)
@@ -448,7 +449,18 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr,
 	if (ret)
 		return ret;
 
+	kn = kernfs_find_and_get(dev->kobj.sd, attr->attr.name);
+	if (WARN_ON_ONCE(!kn)) {
+		ret = -ENODEV;
+		goto out;
+	}
+
+	kernfs_break_active_protection(kn);
+
 	ret = val ? device_online(dev) : device_offline(dev);
+	kernfs_unbreak_active_protection(kn);
+	kernfs_put(kn);
+out:
 	unlock_device_hotplug();
 	return ret < 0 ? ret : count;
 }


