linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: hch@infradead.org
Cc: axboe@kernel.dk, desmondcheongzx@gmail.com,
	gregkh@linuxfoundation.org, linux-block@vger.kernel.org,
	linux-kernel-mentees@lists.linuxfoundation.org,
	linux-kernel@vger.kernel.org, linux-mtd@lists.infradead.org,
	miquel.raynal@bootlin.com, richard@nod.at,
	Shuah Khan <skhan@linuxfoundation.org>,
	syzbot+6a8a0d93c91e8fbf2e80@syzkaller.appspotmail.com,
	vigneshr@ti.com
Subject: [PATCH v2] block: genhd: don't call probe function with major_names_lock held
Date: Sat, 19 Jun 2021 10:05:44 +0900	[thread overview]
Message-ID: <f790f8fb-5758-ea4e-a527-0ee4af82dd44@i-love.sakura.ne.jp> (raw)
In-Reply-To: 

syzbot is reporting circular locking problem at blk_request_module() [1],
for blk_request_module() is calling probe function with major_names_lock
held while major_names_lock is held during module's __init and __exit
functions.

                                         loop_exit() {
                                           mutex_lock(&loop_ctl_mutex);
  blk_request_module() {
    mutex_lock(&major_names_lock);
    loop_probe() {
      mutex_lock(&loop_ctl_mutex); // Blocked by loop_exit()
      mutex_unlock(&loop_ctl_mutex);
    }
    mutex_unlock(&major_names_lock);
                                           unregister_blkdev() {
                                             mutex_lock(&major_names_lock); // Blocked by blk_request_module()
                                             mutex_unlock(&major_names_lock);
                                           }
                                           mutex_unlock(&loop_ctl_mutex);
  }
                                         }

Based on an assumption that a probe callback passed to __register_blkdev()
belongs to a module which calls __register_blkdev(), drop major_names_lock
before calling probe function by holding a reference to that module which
contains that probe function. If there is a module where this assumption
does not hold, such module can call ____register_blkdev() directly.

  blk_request_module() {
    mutex_lock(&major_names_lock);
    // Block loop_exit()
    mutex_unlock(&major_names_lock);
    loop_probe() {
      mutex_lock(&loop_ctl_mutex);
      mutex_unlock(&loop_ctl_mutex);
    }
    // Unblock loop_exit()
  }
                                         loop_exit() {
                                           mutex_lock(&loop_ctl_mutex);
                                           unregister_blkdev() {
                                             mutex_lock(&major_names_lock);
                                             mutex_unlock(&major_names_lock);
                                           }
                                           mutex_unlock(&loop_ctl_mutex);
                                         }

Note that regardless of this patch, it is up to probe function to
serialize module's __init function and probe function in that module
by using e.g. a mutex. This patch simply makes sure that module's __exit
function won't be called when probe function is about to be called.

While Desmond Cheong Zhi Xi already proposed a patch for breaking ABCD
circular dependency [2], I consider that this patch is still needed for
safely breaking AB-BA dependency upon module unloading. (Note that syzbot
does not test module unloading code because the syzbot kernels are built
with almost all modules built-in. We need manual inspection.)

By doing kmalloc(GFP_KERNEL) in ____register_blkdev() before holding
major_names_lock, we could convert major_names_lock from a mutex to
a spinlock. But that is beyond the scope of this patch.

Link: https://syzkaller.appspot.com/bug?id=7bd106c28e846d1023d4ca915718b1a0905444cb [1]
Link: https://lkml.kernel.org/r/20210617160904.570111-1-desmondcheongzx@gmail.com [2]
Reported-by: syzbot <syzbot+6a8a0d93c91e8fbf2e80@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+6a8a0d93c91e8fbf2e80@syzkaller.appspotmail.com>
---
 block/genhd.c         | 36 +++++++++++++++++++++++++++---------
 include/linux/genhd.h |  8 +++++---
 2 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 9f8cb7beaad1..9577c70a6bd3 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -169,6 +169,7 @@ static struct blk_major_name {
 	int major;
 	char name[16];
 	void (*probe)(dev_t devt);
+	struct module *owner;
 } *major_names[BLKDEV_MAJOR_HASH_SIZE];
 static DEFINE_MUTEX(major_names_lock);
 
@@ -197,7 +198,8 @@ void blkdev_show(struct seq_file *seqf, off_t offset)
  * @major: the requested major device number [1..BLKDEV_MAJOR_MAX-1]. If
  *         @major = 0, try to allocate any unused major number.
  * @name: the name of the new block device as a zero terminated string
- * @probe: allback that is called on access to any minor number of @major
+ * @probe: callback that is called on access to any minor number of @major
+ * @owner: the owner of @probe function (i.e. THIS_MODULE or NULL).
  *
  * The @name must be unique within the system.
  *
@@ -214,8 +216,8 @@ void blkdev_show(struct seq_file *seqf, off_t offset)
  *
  * Use register_blkdev instead for any new code.
  */
-int __register_blkdev(unsigned int major, const char *name,
-		void (*probe)(dev_t devt))
+int ____register_blkdev(unsigned int major, const char *name,
+			void (*probe)(dev_t devt), struct module *owner)
 {
 	struct blk_major_name **n, *p;
 	int index, ret = 0;
@@ -255,6 +257,7 @@ int __register_blkdev(unsigned int major, const char *name,
 
 	p->major = major;
 	p->probe = probe;
+	p->owner = owner;
 	strlcpy(p->name, name, sizeof(p->name));
 	p->next = NULL;
 	index = major_to_index(major);
@@ -277,7 +280,7 @@ int __register_blkdev(unsigned int major, const char *name,
 	mutex_unlock(&major_names_lock);
 	return ret;
 }
-EXPORT_SYMBOL(__register_blkdev);
+EXPORT_SYMBOL(____register_blkdev);
 
 void unregister_blkdev(unsigned int major, const char *name)
 {
@@ -676,14 +679,29 @@ void blk_request_module(dev_t devt)
 {
 	unsigned int major = MAJOR(devt);
 	struct blk_major_name **n;
+	void (*probe_fn)(dev_t devt);
 
 	mutex_lock(&major_names_lock);
 	for (n = &major_names[major_to_index(major)]; *n; n = &(*n)->next) {
-		if ((*n)->major == major && (*n)->probe) {
-			(*n)->probe(devt);
-			mutex_unlock(&major_names_lock);
-			return;
-		}
+		if ((*n)->major != major || !(*n)->probe)
+			continue;
+		if (!try_module_get((*n)->owner))
+			break;
+		/*
+		 * Calling probe function with major_names_lock held causes
+		 * circular locking dependency problem. Thus, call it after
+		 * releasing major_names_lock.
+		 */
+		probe_fn = (*n)->probe;
+		mutex_unlock(&major_names_lock);
+		/*
+		 * Assuming that unregister_blkdev() is called from module's
+		 * __exit function, a module refcount taken above allows us
+		 * to safely call probe function without major_names_lock held.
+		 */
+		probe_fn(devt);
+		module_put((*n)->owner);
+		return;
 	}
 	mutex_unlock(&major_names_lock);
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 6fc26f7bdf71..070b73c043e6 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -277,10 +277,12 @@ extern void put_disk(struct gendisk *disk);
 
 #define alloc_disk(minors) alloc_disk_node(minors, NUMA_NO_NODE)
 
-int __register_blkdev(unsigned int major, const char *name,
-		void (*probe)(dev_t devt));
+int ____register_blkdev(unsigned int major, const char *name,
+			void (*probe)(dev_t devt), struct module *owner);
+#define __register_blkdev(major, name, probe) \
+	____register_blkdev(major, name, probe, THIS_MODULE)
 #define register_blkdev(major, name) \
-	__register_blkdev(major, name, NULL)
+	____register_blkdev(major, name, NULL, NULL)
 void unregister_blkdev(unsigned int major, const char *name);
 
 bool bdev_check_media_change(struct block_device *bdev);
-- 
2.18.4

             reply	other threads:[~2021-06-19  1:06 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-19  1:05 Tetsuo Handa [this message]
2021-06-19  3:24 ` [PATCH v2] block: genhd: don't call probe function with major_names_lock held kernel test robot
2021-06-19  6:14 ` kernel test robot
2021-06-19  6:44 ` Greg KH
2021-06-19  8:47   ` Tetsuo Handa
     [not found]   ` <20210620024403.820-1-hdanton@sina.com>
2021-06-20 13:54     ` Tetsuo Handa
2021-06-21  8:54       ` Greg KH
2021-06-21  6:18 ` Christoph Hellwig
     [not found] ` <4e153910-bf60-2cca-fa02-b46d22b6e2c5@i-love.sakura.ne.jp>
     [not found]   ` <20210816073313.GA27275@lst.de>
     [not found]     ` <20210817081045.3609-1-hdanton@sina.com>
2021-08-17 10:18       ` [PATCH v3] " Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f790f8fb-5758-ea4e-a527-0ee4af82dd44@i-love.sakura.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=axboe@kernel.dk \
    --cc=desmondcheongzx@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=hch@infradead.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel-mentees@lists.linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=miquel.raynal@bootlin.com \
    --cc=richard@nod.at \
    --cc=skhan@linuxfoundation.org \
    --cc=syzbot+6a8a0d93c91e8fbf2e80@syzkaller.appspotmail.com \
    --cc=vigneshr@ti.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).