All of lore.kernel.org
 help / color / mirror / Atom feed
From: Li Zhong <zhong@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Toshi Kani <toshi.kani@hp.com>,
	Li Zhong <zhong@linux.vnet.ibm.com>
Subject: [RFC PATCH v6 2/2] Implement lock_device_hotplug_sysfs() by breaking active protection
Date: Fri,  9 May 2014 16:40:16 +0800	[thread overview]
Message-ID: <1399624816-11127-2-git-send-email-zhong@linux.vnet.ibm.com> (raw)
In-Reply-To: <1399624816-11127-1-git-send-email-zhong@linux.vnet.ibm.com>

This patch tries to solve the device hot remove locking issues in a
different way from commit 5e33bc41, as kernfs already has a mechanism
to break active protection.

The active protection is there to keep the file alive by blocking
deletion while operations are on-going in the file.  This blocking
creates a dependency loop when an operation running off a sysfs knob
ends up grabbing a lock which may be held while removing the said sysfs
knob.

So the problem here is the order of s_active, and the series of hotplug
related locks.

commit 5e33bc41 solves it by taking out the first of the series of
hoplug related locks, device_hotplug_lock, with a try lock. And if that
try lock fails, it returns a restart syscall error, and drops s_active
temporarily to allow the other process to remove the sysfs knob.

This doesn't help with lockdep warnings reported against s_active and
other hotplug related locks in the series.

This patch instead tries to take s_active out of the lock dependency
graph using the active protection breaking mechanism.

lock_device_hotplug_sysfs() function name is kept here, three more
arguments are added, dev, attr, to help find the kernfs_node
corresponding to the sysfs knob (which should always be able to be
found, WARN if not); kn_out is used to record the found kernfs_node for
use in unlock. unlock_device_hotplug_sysfs() is created to help
cleanup the operations(get reference, break active, etc) done in
lock_device_hotplug_sysfs(), as well as unlock the lock.

As we break the active protection here, the device removing process
might remove the sysfs entries after that point. So after we grab the
device_hotplug_lock, we can check whether that really happens. If so,
we should abort and not invoke the online/offline callbacks anymore.
lockdep assertion is added in try_offline_node() to make sure
device_hotplug_lock is grabbed while removing cpu or memory.

As devcie_hotplug_lock is used to protect hotplug operations with multiple
subsystems. So there might be devices that won't be removed together with
devices of other subsystems, and thus device_hotplug_lock is not needed.
In this case, their specific online/offline callbacks needs to check whether
the device has been removed.

Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 drivers/base/core.c    | 93 +++++++++++++++++++++++++++++++++++++++++++++-----
 drivers/base/memory.c  |  5 +--
 include/linux/device.h |  7 +++-
 mm/memory_hotplug.c    |  2 ++
 4 files changed, 95 insertions(+), 12 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 20da3ad..329f3b4 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -61,14 +61,87 @@ void unlock_device_hotplug(void)
 	mutex_unlock(&device_hotplug_lock);
 }
 
-int lock_device_hotplug_sysfs(void)
+void device_hotplug_assert_held(void)
 {
-	if (mutex_trylock(&device_hotplug_lock))
-		return 0;
+	lockdep_assert_held(&device_hotplug_lock);
+}
+
+/*
+ * device_hotplug_lock is used to protect hotplugs which may involve multiple
+ * subsystems. This _sysfs() version is created to solve the deadlock with
+ * s_active of the device attribute file, which could be used to online/offline
+ * the device.
+ *
+ * Returns 0 on success. -ENODEV if the @dev is removed after we break the
+ * active protection. (We should always be able to find the kernfs_node
+ * related to the attribute, WARN if not, and also return -ENODEV).
+ *
+ * When success(return 0), the found kernfs_node is passed out in @kn_out,
+ * else pass NULL to @kn_out.
+ */
+int lock_device_hotplug_sysfs(struct device *dev,
+			      struct device_attribute *attr,
+			      struct kernfs_node **kn_out)
+{
+	struct kernfs_node *kn;
+
+	*kn_out = kn = kernfs_find_and_get(dev->kobj.sd, attr->attr.name);
+
+	if (WARN_ON_ONCE(!kn))
+		return -ENODEV;
+
+	/*
+	 * Break active protection here to avoid deadlocks with device
+	 * removing process, which tries to remove sysfs entries including this
+	 * "online" attribute while holding some hotplug related locks.
+	 *
+	 * @dev needs to be protected here, or it could go away any time after
+	 * dropping active protection.
+	 */
+	get_device(dev);
+	kernfs_break_active_protection(kn);
+
+	lock_device_hotplug();
+
+	/*
+	 * For devices which need device_hoplug_lock to protect the
+	 * online/offline operation, i.e. it might be removed together with
+	 * devices in some other subsystems, we can check here whether the
+	 * device has been removed, so we don't call device_{on|off}line
+	 * against removed device. lockdep assertion is added in
+	 * try_offline_node() to make sure the lock is held to remove cpu
+	 * or memory.
+	 *
+	 * If some devices won't be removed together with devices in other
+	 * subsystems, thus device_hotplug_lock is not needed. Then they need
+	 * check whether device has been removed in their devices' specific
+	 * online/offline callbacks.
+	 */
+	if (!dev->kobj.sd) {
+		/* device_del() already called on @dev, we should also abort */
+		unlock_device_hotplug();
+		kernfs_unbreak_active_protection(kn);
+		put_device(dev);
+		kernfs_put(kn);
+		*kn_out = NULL;
+		return -ENODEV;
+	}
 
-	/* Avoid busy looping (5 ms of sleep should do). */
-	msleep(5);
-	return restart_syscall();
+	return 0;
+}
+
+/*
+ * This _sysfs version of unlock_device_hotplug help to undo the
+ * operations (get reference, break active protection, etc) done in
+ * lock_device_hotplug_sysfs(), as well as unlock the device_hotplug_lock.
+ */
+void unlock_device_hotplug_sysfs(struct device *dev,
+				 struct kernfs_node *kn)
+{
+	unlock_device_hotplug();
+	kernfs_unbreak_active_protection(kn);
+	put_device(dev);
+	kernfs_put(kn);
 }
 
 #ifdef CONFIG_BLOCK
@@ -439,17 +512,19 @@ static ssize_t online_store(struct device *dev, struct device_attribute *attr,
 {
 	bool val;
 	int ret;
+	struct kernfs_node *kn = NULL;
 
 	ret = strtobool(buf, &val);
 	if (ret < 0)
 		return ret;
 
-	ret = lock_device_hotplug_sysfs();
-	if (ret)
+	ret = lock_device_hotplug_sysfs(dev, attr, &kn);
+	if (ret < 0)
 		return ret;
 
 	ret = val ? device_online(dev) : device_offline(dev);
-	unlock_device_hotplug();
+	unlock_device_hotplug_sysfs(dev, kn);
+
 	return ret < 0 ? ret : count;
 }
 static DEVICE_ATTR_RW(online);
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index bece691..f82dea4 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -320,8 +320,9 @@ store_mem_state(struct device *dev,
 {
 	struct memory_block *mem = to_memory_block(dev);
 	int ret, online_type;
+	struct kernfs_node *kn = NULL;
 
-	ret = lock_device_hotplug_sysfs();
+	ret = lock_device_hotplug_sysfs(dev, attr, &kn);
 	if (ret)
 		return ret;
 
@@ -360,7 +361,7 @@ store_mem_state(struct device *dev,
 	}
 
 err:
-	unlock_device_hotplug();
+	unlock_device_hotplug_sysfs(dev, kn);
 
 	if (ret)
 		return ret;
diff --git a/include/linux/device.h b/include/linux/device.h
index d1d1c05..19a24f4 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -917,7 +917,12 @@ static inline bool device_supports_offline(struct device *dev)
 
 extern void lock_device_hotplug(void);
 extern void unlock_device_hotplug(void);
-extern int lock_device_hotplug_sysfs(void);
+extern void device_hotplug_assert_held(void);
+extern int lock_device_hotplug_sysfs(struct device *dev,
+				     struct device_attribute *attr,
+				     struct kernfs_node **kn);
+extern void unlock_device_hotplug_sysfs(struct device *dev,
+					struct kernfs_node *parent);
 extern int device_offline(struct device *dev);
 extern int device_online(struct device *dev);
 /*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a650db2..bdec24d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1822,6 +1822,8 @@ void try_offline_node(int nid)
 	struct page *pgdat_page = virt_to_page(pgdat);
 	int i;
 
+	device_hotplug_assert_held();
+
 	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
 		unsigned long section_nr = pfn_to_section_nr(pfn);
 
-- 
1.8.1.4


  reply	other threads:[~2014-05-09  8:42 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-10  9:18 [RFC PATCH] Suppress a device hot remove related lockdep warning Li Zhong
2014-04-10 13:31 ` Tejun Heo
2014-04-11  4:10   ` [RFC PATCH v2] Use kernfs_break_active_protection() for device online store callbacks Li Zhong
2014-04-11 10:26     ` Tejun Heo
2014-04-14  7:47       ` [RFC PATCH v3] " Li Zhong
2014-04-14 20:13         ` Tejun Heo
2014-04-15  2:44           ` Li Zhong
2014-04-15 14:50             ` Tejun Heo
2014-04-16  1:41               ` Li Zhong
2014-04-16 15:17                 ` Tejun Heo
2014-04-17  3:05                   ` Li Zhong
2014-04-17 15:06                     ` Tejun Heo
2014-04-17  6:50                   ` [RFC PATCH v4] " Li Zhong
2014-04-17 15:17                     ` Tejun Heo
2014-04-18  8:33                       ` Li Zhong
2014-04-21  9:20                       ` [RFC PATCH v5 1/2] Use lock_device_hotplug() in cpu_probe_store() and cpu_release_store() Li Zhong
2014-04-21  9:23                         ` [RFC PATCH v5 2/2] Use kernfs_break_active_protection() for device online store callbacks Li Zhong
2014-04-21 22:46                           ` Tejun Heo
2014-04-22  3:34                             ` Li Zhong
2014-04-22 10:11                               ` Rafael J. Wysocki
2014-04-23  1:50                                 ` Li Zhong
2014-04-23 10:54                                   ` Rafael J. Wysocki
2014-04-24  1:13                                     ` Li Zhong
2014-04-22 20:44                               ` Tejun Heo
2014-04-22 22:21                                 ` Rafael J. Wysocki
2014-04-23 14:23                                   ` Tejun Heo
2014-04-23 16:12                                     ` Rafael J. Wysocki
2014-04-23 16:52                                       ` Tejun Heo
2014-04-24  8:59                                       ` Li Zhong
2014-04-24 10:02                                         ` Rafael J. Wysocki
2014-04-25  1:46                                           ` Li Zhong
2014-04-25 12:47                                             ` Rafael J. Wysocki
2014-04-28  1:49                                               ` Li Zhong
2014-04-23  5:03                                 ` Li Zhong
2014-04-23 10:58                                   ` Rafael J. Wysocki
2014-04-24  1:33                                     ` Li Zhong
2014-05-09  8:35                               ` Li Zhong
2014-05-09  8:40                                 ` [RFC PATCH v6 1/2 ] Use lock_device_hotplug() in cpu_probe_store() and cpu_release_store() Li Zhong
2014-05-09  8:40                                   ` Li Zhong [this message]
2014-04-21 22:38                         ` [RFC PATCH v5 1/2] " Tejun Heo
2014-04-22  2:29                           ` Li Zhong
2014-04-22 20:40                             ` Tejun Heo
2014-04-23  2:00                               ` Li Zhong
2014-04-23 14:39                                 ` Tejun Heo
2014-04-24  8:37                                   ` Li Zhong
2014-04-24 14:32                                     ` Tejun Heo
2014-04-25  1:56                                       ` Li Zhong
2014-04-25 12:28                                         ` Tejun Heo
2014-04-28  0:51                                           ` Li Zhong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1399624816-11127-2-git-send-email-zhong@linux.vnet.ibm.com \
    --to=zhong@linux.vnet.ibm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=tj@kernel.org \
    --cc=toshi.kani@hp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.