From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Hao Luo <haoluo@google.com>, Tejun Heo <tj@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 5.10 22/38] kernfs: Separate kernfs_pr_cont_buf and rename_lock.
Date: Tue,  7 Jun 2022 13:58:17 -0400
Message-ID: <20220607175835.480735-22-sashal@kernel.org>
In-Reply-To: <20220607175835.480735-1-sashal@kernel.org>

From: Hao Luo <haoluo@google.com>

[ Upstream commit 1a702dc88e150487c9c173a249b3d236498b9183 ]

Previously, kernfs_pr_cont_buf was protected by piggybacking on
rename_lock, which means that pr_cont() had to be called with
rename_lock held. This can cause circular lock dependencies.

If there is an OOM, we have the following call hierarchy:

 -> cpuset_print_current_mems_allowed()
   -> pr_cont_cgroup_name()
     -> pr_cont_kernfs_name()

pr_cont_kernfs_name() will grab rename_lock and call printk. So we have
the following lock dependencies:

 kernfs_rename_lock -> console_sem

Sometimes, printk does a wakeup before releasing console_sem, which
gives the dependency chain:

 console_sem -> p->pi_lock -> rq->lock

Now, imagine someone wants to read cgroup_name under rq->lock, for
example, to print cgroup_name in a tracepoint in the scheduler code.
They will be holding rq->lock when they take rename_lock:

 rq->lock -> kernfs_rename_lock

This closes the cycle, so the two paths can deadlock.
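
As an illustration only, the three chains above can be treated as edges
in a lock-order graph, where a deadlock is possible once the graph has a
cycle. The user-space sketch below is not kernel code and is not how
lockdep is implemented; the enum values and helper names are hypothetical
and exist only for this example.

```c
#include <stdbool.h>

/* Hypothetical lock identifiers, used only in this sketch. */
enum lock_id { RENAME_LOCK, CONSOLE_SEM, PI_LOCK, RQ_LOCK, NR_LOCKS };

/* edge[a][b] is true when lock b has been taken while holding lock a. */
static bool edge[NR_LOCKS][NR_LOCKS];

static void clear_dependencies(void)
{
	for (int a = 0; a < NR_LOCKS; a++)
		for (int b = 0; b < NR_LOCKS; b++)
			edge[a][b] = false;
}

static void add_dependency(enum lock_id held, enum lock_id taken)
{
	edge[held][taken] = true;
}

/* Depth-first search: true if 'target' is reachable from 'from'. */
static bool reaches(enum lock_id from, enum lock_id target, bool *visited)
{
	if (from == target)
		return true;
	if (visited[from])
		return false;
	visited[from] = true;

	for (int next = 0; next < NR_LOCKS; next++)
		if (edge[from][next] && reaches(next, target, visited))
			return true;
	return false;
}

/* A cycle through 'start' exists if a successor of 'start' reaches it. */
static bool has_cycle_through(enum lock_id start)
{
	bool visited[NR_LOCKS] = { false };

	for (int next = 0; next < NR_LOCKS; next++)
		if (edge[start][next] && reaches(next, start, visited))
			return true;
	return false;
}
```

Recording rename_lock -> console_sem, console_sem -> pi_lock and
pi_lock -> rq_lock leaves the graph acyclic; adding
rq_lock -> rename_lock makes has_cycle_through(RENAME_LOCK) return true.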

One way to prevent this circular lock dependency is to separate the
protection of pr_cont_buf from rename_lock. In principle, rename_lock
protects the integrity of the cgroup name while it is copied to buf.
Once pr_cont_buf holds its content, rename_lock can be dropped. So it's
safe to drop rename_lock after kernfs_name_locked (and
kernfs_path_from_node_locked) and rely on a dedicated pr_cont_lock
to protect pr_cont_buf.
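
The locking pattern described above can be sketched in user space with
pthread mutexes standing in for the kernel spinlocks. This is a
simplified illustration under that assumption, not the kernfs code:
node_name, copy_name(), print_name() and rename_node() are hypothetical
names, and fprintf() stands in for pr_cont().

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

#define BUF_SIZE 64

/* rename_lock protects node_name; pr_cont_lock protects pr_cont_buf. */
static pthread_mutex_t rename_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t pr_cont_lock = PTHREAD_MUTEX_INITIALIZER;
static char node_name[BUF_SIZE] = "cgroup-a";
static char pr_cont_buf[BUF_SIZE];

/* Copy the name while holding rename_lock, then drop it immediately. */
static size_t copy_name(char *buf, size_t buflen)
{
	size_t len;

	pthread_mutex_lock(&rename_lock);
	len = (size_t)snprintf(buf, buflen, "%s", node_name);
	pthread_mutex_unlock(&rename_lock);
	return len;
}

/* Rename takes only rename_lock, as before. */
static void rename_node(const char *new_name)
{
	pthread_mutex_lock(&rename_lock);
	snprintf(node_name, sizeof(node_name), "%s", new_name);
	pthread_mutex_unlock(&rename_lock);
}

/*
 * Print the name. The shared buffer is guarded by pr_cont_lock only;
 * rename_lock has already been released by the time output happens,
 * so the printing path never nests under rename_lock.
 */
static int print_name(FILE *out)
{
	int written;

	pthread_mutex_lock(&pr_cont_lock);
	copy_name(pr_cont_buf, sizeof(pr_cont_buf));
	written = fprintf(out, "%s", pr_cont_buf);
	pthread_mutex_unlock(&pr_cont_lock);
	return written;
}
```

The resulting order in this sketch is pr_cont_lock -> rename_lock for the
copy and pr_cont_lock -> output for the print, so a caller that already
holds another lock and then takes rename_lock no longer depends on
anything the output path acquires.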

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/r/20220516190951.3144144-1-haoluo@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/kernfs/dir.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 9aec80b9d7c6..afb39e1bbe3b 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -19,7 +19,15 @@
 
 DEFINE_MUTEX(kernfs_mutex);
 static DEFINE_SPINLOCK(kernfs_rename_lock);	/* kn->parent and ->name */
-static char kernfs_pr_cont_buf[PATH_MAX];	/* protected by rename_lock */
+/*
+ * Don't use rename_lock to piggy back on pr_cont_buf. We don't want to
+ * call pr_cont() while holding rename_lock. Because sometimes pr_cont()
+ * will perform wakeups when releasing console_sem. Holding rename_lock
+ * will introduce deadlock if the scheduler reads the kernfs_name in the
+ * wakeup path.
+ */
+static DEFINE_SPINLOCK(kernfs_pr_cont_lock);
+static char kernfs_pr_cont_buf[PATH_MAX];	/* protected by pr_cont_lock */
 static DEFINE_SPINLOCK(kernfs_idr_lock);	/* root->ino_idr */
 
 #define rb_to_kn(X) rb_entry((X), struct kernfs_node, rb)
@@ -230,12 +238,12 @@ void pr_cont_kernfs_name(struct kernfs_node *kn)
 {
 	unsigned long flags;
 
-	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	spin_lock_irqsave(&kernfs_pr_cont_lock, flags);
 
-	kernfs_name_locked(kn, kernfs_pr_cont_buf, sizeof(kernfs_pr_cont_buf));
+	kernfs_name(kn, kernfs_pr_cont_buf, sizeof(kernfs_pr_cont_buf));
 	pr_cont("%s", kernfs_pr_cont_buf);
 
-	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	spin_unlock_irqrestore(&kernfs_pr_cont_lock, flags);
 }
 
 /**
@@ -249,10 +257,10 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
 	unsigned long flags;
 	int sz;
 
-	spin_lock_irqsave(&kernfs_rename_lock, flags);
+	spin_lock_irqsave(&kernfs_pr_cont_lock, flags);
 
-	sz = kernfs_path_from_node_locked(kn, NULL, kernfs_pr_cont_buf,
-					  sizeof(kernfs_pr_cont_buf));
+	sz = kernfs_path_from_node(kn, NULL, kernfs_pr_cont_buf,
+				   sizeof(kernfs_pr_cont_buf));
 	if (sz < 0) {
 		pr_cont("(error)");
 		goto out;
@@ -266,7 +274,7 @@ void pr_cont_kernfs_path(struct kernfs_node *kn)
 	pr_cont("%s", kernfs_pr_cont_buf);
 
 out:
-	spin_unlock_irqrestore(&kernfs_rename_lock, flags);
+	spin_unlock_irqrestore(&kernfs_pr_cont_lock, flags);
 }
 
 /**
@@ -864,13 +872,12 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
 
 	lockdep_assert_held(&kernfs_mutex);
 
-	/* grab kernfs_rename_lock to piggy back on kernfs_pr_cont_buf */
-	spin_lock_irq(&kernfs_rename_lock);
+	spin_lock_irq(&kernfs_pr_cont_lock);
 
 	len = strlcpy(kernfs_pr_cont_buf, path, sizeof(kernfs_pr_cont_buf));
 
 	if (len >= sizeof(kernfs_pr_cont_buf)) {
-		spin_unlock_irq(&kernfs_rename_lock);
+		spin_unlock_irq(&kernfs_pr_cont_lock);
 		return NULL;
 	}
 
@@ -882,7 +889,7 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
 		parent = kernfs_find_ns(parent, name, ns);
 	}
 
-	spin_unlock_irq(&kernfs_rename_lock);
+	spin_unlock_irq(&kernfs_pr_cont_lock);
 
 	return parent;
 }
-- 
2.35.1

