From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>
Cc: Greg KH <gregkh@linuxfoundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Hannes Reinecke <hare@suse.de>,
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>,
Hillf Danton <hdanton@sina.com>,
Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>,
linux-block <linux-block@vger.kernel.org>
Subject: [PATCH v3] block: genhd: don't call probe function with major_names_lock held
Date: Sun, 15 Aug 2021 15:52:45 +0900
Message-ID: <4e153910-bf60-2cca-fa02-b46d22b6e2c5@i-love.sakura.ne.jp>
In-Reply-To: <f790f8fb-5758-ea4e-a527-0ee4af82dd44@i-love.sakura.ne.jp>
syzbot is reporting a circular locking problem at __loop_clr_fd() [1],
because commit a160c6159d4a0cf8 ("block: add an optional probe callback to
major_names") calls the module's probe function with major_names_lock
held.
When the content of /proc/devices is copied to another file via sendfile(),
sb_writers#$N => &p->lock => major_names_lock
dependency is recorded.
When loop_process_work() from WQ context performs a write request,
(wq_completion)loop$M => (work_completion)&lo->rootcg_work =>
sb_writers#$N
dependency is recorded.
When flush_workqueue() from drain_workqueue() from destroy_workqueue()
from __loop_clr_fd() from blkdev_put() from blkdev_close() from __fput()
is called,
&disk->open_mutex => &lo->lo_mutex => (wq_completion)loop$M
dependency is recorded.
When loop_control_remove() from loop_control_ioctl(LOOP_CTL_REMOVE) is
called,
loop_ctl_mutex => &lo->lo_mutex
dependency is recorded.
As a result, lockdep concludes that there is a
loop_ctl_mutex => &lo->lo_mutex => (wq_completion)loop$M =>
(work_completion)&lo->rootcg_work => sb_writers#$N => &p->lock =>
major_names_lock
dependency chain.
Then, if loop_add() from loop_probe() from blk_request_module() from
blkdev_get_no_open() from blkdev_get_by_dev() from blkdev_open() from
do_dentry_open() from path_openat() from do_filp_open() is called,
major_names_lock => loop_ctl_mutex
dependency is appended to the dependency chain, which closes the cycle.
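The way this last dependency closes the cycle can be illustrated with a small
reachability check over the recorded edges. This is only a userspace sketch,
not lockdep's actual algorithm: the enum, dep[][] and the helper names are
hypothetical, and only the lock class names come from the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes named in the report; the enum itself is only an illustration. */
enum lock_id {
	LOOP_CTL_MUTEX,
	LO_MUTEX,
	WQ_COMPLETION,
	WORK_COMPLETION,
	SB_WRITERS,
	P_LOCK,
	MAJOR_NAMES_LOCK,
	OPEN_MUTEX,
	NR_LOCKS
};

/* dep[a][b] is true when "a => b" (b acquired while a is held) was recorded. */
static bool dep[NR_LOCKS][NR_LOCKS];

static void add_dep(enum lock_id a, enum lock_id b)
{
	assert(a < NR_LOCKS && b < NR_LOCKS);
	dep[a][b] = true;
}

/* Depth-first search: is "to" reachable from "from" via recorded edges? */
static bool reaches(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	if (seen[from])
		return false;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (dep[from][next] && reaches(next, to, seen))
			return true;
	return false;
}

/* Recording "a => b" closes a cycle if b can already reach a. */
static bool closes_cycle(enum lock_id a, enum lock_id b)
{
	bool seen[NR_LOCKS] = { false };

	return reaches(b, a, seen);
}

/* The dependencies described above, before loop_probe() is called. */
static void record_reported_deps(void)
{
	memset(dep, 0, sizeof(dep));
	add_dep(SB_WRITERS, P_LOCK);
	add_dep(P_LOCK, MAJOR_NAMES_LOCK);
	add_dep(WQ_COMPLETION, WORK_COMPLETION);
	add_dep(WORK_COMPLETION, SB_WRITERS);
	add_dep(OPEN_MUTEX, LO_MUTEX);
	add_dep(LO_MUTEX, WQ_COMPLETION);
	add_dep(LOOP_CTL_MUTEX, LO_MUTEX);
}
```

After record_reported_deps(), closes_cycle(MAJOR_NAMES_LOCK, LOOP_CTL_MUTEX)
is true, because loop_ctl_mutex already reaches major_names_lock through the
chain above.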
There are two approaches to breaking this circular dependency.
One is to remove the loop_ctl_mutex => &lo->lo_mutex dependency. The other
is to remove the major_names_lock => loop_ctl_mutex dependency. This patch
implements the latter, for the following reasons.
(1) the sb_writers#$N => &p->lock => major_names_lock chain is unavoidable
(2) this patch can also fix a similar problem in other modules [2], which
is caused by calling the probe function with major_names_lock held
(3) I believe that this patch is in principle safer than e.g.
commit bd5c39edad535d9f ("loop: reduce loop_ctl_mutex coverage in
loop_exit"), which waits until the probe function finishes using a
global mutex in order to fix a deadlock reproducible by sleep
injection [3]
This patch adds an owner parameter (THIS_MODULE) to __register_blkdev(),
based on the assumption that a probe callback passed to __register_blkdev()
belongs to the module which calls __register_blkdev(). blk_request_module()
then takes a reference to the module which contains the probe function, and
drops major_names_lock before calling that probe function.
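The "pin the owner, drop the lock, then call" pattern can be sketched in
userspace as below. This is an illustration under stated assumptions, not the
kernel code: struct owner, owner_tryget(), owner_put(), register_entry() and
request_probe() are hypothetical stand-ins for struct module,
try_module_get(), module_put(), __register_blkdev() and blk_request_module(),
and the plain int refcount is only safe single-threaded. Unlike the hunk in
this patch, the sketch also copies the owner pointer into a local before
unlocking; that is a choice of the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module: a liveness flag and a refcount. */
struct owner {
	bool live;
	int refs;	/* real code would need an atomic counter */
};

struct entry {
	unsigned int major;
	void (*probe)(unsigned int minor);
	struct owner *owner;
	struct entry *next;
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry *table;
static bool table_locked;	/* tracks whether table_lock is held */
static int probe_calls;
static bool probe_saw_lock;	/* set if a probe ever ran under the lock */

static void lock_table(void)
{
	pthread_mutex_lock(&table_lock);
	table_locked = true;
}

static void unlock_table(void)
{
	table_locked = false;
	pthread_mutex_unlock(&table_lock);
}

/* Like try_module_get(): a NULL owner always succeeds. */
static bool owner_tryget(struct owner *o)
{
	if (!o)
		return true;
	if (!o->live)
		return false;
	o->refs++;
	return true;
}

static void owner_put(struct owner *o)
{
	if (!o)
		return;
	assert(o->refs > 0);
	o->refs--;
}

static void demo_probe(unsigned int minor)
{
	(void)minor;
	probe_calls++;
	if (table_locked)
		probe_saw_lock = true;
}

static void register_entry(struct entry *e)
{
	lock_table();
	e->next = table;
	table = e;
	unlock_table();
}

/* Returns true if a probe function was found, pinned and called. */
static bool request_probe(unsigned int major, unsigned int minor)
{
	void (*probe_fn)(unsigned int) = NULL;
	struct owner *owner = NULL;
	struct entry *e;

	lock_table();
	for (e = table; e; e = e->next) {
		if (e->major != major || !e->probe)
			continue;
		if (owner_tryget(e->owner)) {
			/* Copy what we need while the entry is still protected. */
			probe_fn = e->probe;
			owner = e->owner;
		}
		break;
	}
	unlock_table();

	if (!probe_fn)
		return false;
	probe_fn(minor);	/* runs without table_lock held */
	owner_put(owner);
	return true;
}
```

With an entry registered for some major and a live owner, request_probe()
calls demo_probe() once with table_locked false and leaves the owner's
refcount where it started. Once the owner is marked not live, the lookup
fails without calling the probe.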
It may sound strange to pass THIS_MODULE as a function argument, but
what this patch is doing is essentially the same as passing e.g. a
"struct file_system_type" argument initialized with .owner = THIS_MODULE
to register_filesystem(). To minimize the number of lines changed, this
patch does not define a new "struct" for __register_blkdev().
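The equivalence between the two calling styles can be sketched as follows.
Everything here is hypothetical and userspace-only: struct module and
THIS_MODULE are local stand-ins for the kernel definitions, and struct
blk_reg_desc, register_with_desc() and register_with_arg() do not exist in
the kernel. Normalizing the owner to NULL when no probe is given is a choice
of this sketch that mirrors the "@owner ... ignored if @probe is NULL"
kernel-doc line.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the kernel's struct module and THIS_MODULE. */
struct module {
	const char *name;
};
static struct module this_module = { "demo" };
#define THIS_MODULE (&this_module)

typedef void (*probe_fn_t)(unsigned int devt);

/* What a registration ends up remembering in either style. */
struct recorded {
	probe_fn_t probe;
	struct module *owner;
};

/* Style A: the owner travels inside a descriptor, like file_system_type. */
struct blk_reg_desc {
	unsigned int major;
	const char *name;
	probe_fn_t probe;
	struct module *owner;
};

static struct recorded register_with_desc(const struct blk_reg_desc *d)
{
	struct recorded r = { d->probe, d->probe ? d->owner : NULL };

	return r;
}

/* Style B: the owner is an explicit argument, as in this patch. */
static struct recorded register_with_arg(unsigned int major, const char *name,
					 probe_fn_t probe, struct module *owner)
{
	struct recorded r = { probe, probe ? owner : NULL };

	(void)major;
	(void)name;
	return r;
}

static void demo_probe(unsigned int devt)
{
	(void)devt;
}
```

Both styles record the same (probe, owner) pair; the explicit argument simply
avoids introducing a new structure for a handful of callers.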
Link: https://syzkaller.appspot.com/bug?id=7bb10e8b62f83e4d445cdf4c13d69e407e629558 [1]
Link: https://syzkaller.appspot.com/bug?id=7bd106c28e846d1023d4ca915718b1a0905444cb [2]
Link: https://lkml.kernel.org/r/c4edf07f-92e1-a350-2743-f0b0234a2b6c@i-love.sakura.ne.jp [3]
Reported-by: syzbot <syzbot+f61766d5763f9e7a118f@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+6a8a0d93c91e8fbf2e80@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+f61766d5763f9e7a118f@syzkaller.appspotmail.com>
Fixes: a160c6159d4a0cf8 ("block: add an optional probe callback to major_names")
---
block/genhd.c | 32 +++++++++++++++++++++++++-------
drivers/block/ataflop.c | 2 +-
drivers/block/brd.c | 2 +-
drivers/block/floppy.c | 2 +-
drivers/block/loop.c | 2 +-
drivers/md/md.c | 4 ++--
drivers/scsi/sd.c | 2 +-
include/linux/genhd.h | 4 ++--
8 files changed, 34 insertions(+), 16 deletions(-)
diff --git a/block/genhd.c b/block/genhd.c
index 298ee78c1bda..e7c75c5aa831 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -162,6 +162,7 @@ static struct blk_major_name {
int major;
char name[16];
void (*probe)(dev_t devt);
+ struct module *owner;
} *major_names[BLKDEV_MAJOR_HASH_SIZE];
static DEFINE_MUTEX(major_names_lock);
@@ -190,7 +191,8 @@ void blkdev_show(struct seq_file *seqf, off_t offset)
* @major: the requested major device number [1..BLKDEV_MAJOR_MAX-1]. If
* @major = 0, try to allocate any unused major number.
* @name: the name of the new block device as a zero terminated string
- * @probe: allback that is called on access to any minor number of @major
+ * @probe: callback that is called on access to any minor number of @major
+ * @owner: THIS_MODULE if @probe is not NULL, ignored if @probe is NULL.
*
* The @name must be unique within the system.
*
@@ -208,7 +210,7 @@ void blkdev_show(struct seq_file *seqf, off_t offset)
* Use register_blkdev instead for any new code.
*/
int __register_blkdev(unsigned int major, const char *name,
- void (*probe)(dev_t devt))
+ void (*probe)(dev_t devt), struct module *owner)
{
struct blk_major_name **n, *p;
int index, ret = 0;
@@ -248,6 +250,7 @@ int __register_blkdev(unsigned int major, const char *name,
p->major = major;
p->probe = probe;
+ p->owner = owner;
strlcpy(p->name, name, sizeof(p->name));
p->next = NULL;
index = major_to_index(major);
@@ -653,14 +656,29 @@ void blk_request_module(dev_t devt)
{
unsigned int major = MAJOR(devt);
struct blk_major_name **n;
+ void (*probe_fn)(dev_t devt);
mutex_lock(&major_names_lock);
for (n = &major_names[major_to_index(major)]; *n; n = &(*n)->next) {
- if ((*n)->major == major && (*n)->probe) {
- (*n)->probe(devt);
- mutex_unlock(&major_names_lock);
- return;
- }
+ if ((*n)->major != major || !(*n)->probe)
+ continue;
+ if (!try_module_get((*n)->owner))
+ break;
+ /*
+ * Calling probe function with major_names_lock held causes
+ * circular locking dependency problem. Thus, call it after
+ * releasing major_names_lock.
+ */
+ probe_fn = (*n)->probe;
+ mutex_unlock(&major_names_lock);
+ /*
+ * Assuming that unregister_blkdev() is called from module's
+ * __exit function, a module refcount taken above allows us
+ * to safely call probe function without major_names_lock held.
+ */
+ probe_fn(devt);
+ module_put((*n)->owner);
+ return;
}
mutex_unlock(&major_names_lock);
diff --git a/drivers/block/ataflop.c b/drivers/block/ataflop.c
index a093644ac39f..1b7fe10d49e7 100644
--- a/drivers/block/ataflop.c
+++ b/drivers/block/ataflop.c
@@ -2016,7 +2016,7 @@ static int __init atari_floppy_init (void)
return -ENODEV;
mutex_lock(&ataflop_probe_lock);
- ret = __register_blkdev(FLOPPY_MAJOR, "fd", ataflop_probe);
+ ret = __register_blkdev(FLOPPY_MAJOR, "fd", ataflop_probe, THIS_MODULE);
if (ret)
goto out_unlock;
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 95694113e38e..d0bdfd56dfc8 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -487,7 +487,7 @@ static int __init brd_init(void)
* dynamically.
*/
- if (__register_blkdev(RAMDISK_MAJOR, "ramdisk", brd_probe))
+ if (__register_blkdev(RAMDISK_MAJOR, "ramdisk", brd_probe, THIS_MODULE))
return -EIO;
brd_check_and_reset_par();
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 87460e0e5c72..ee33ba03e6bd 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -4570,7 +4570,7 @@ static int __init do_floppy_init(void)
timer_setup(&motor_off_timer[drive], motor_off_callback, 0);
}
- err = __register_blkdev(FLOPPY_MAJOR, "fd", floppy_probe);
+ err = __register_blkdev(FLOPPY_MAJOR, "fd", floppy_probe, THIS_MODULE);
if (err)
goto out_put_disk;
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f0cdff0c5fbf..d6606c3b7d74 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -2564,7 +2564,7 @@ static int __init loop_init(void)
goto err_out;
- if (__register_blkdev(LOOP_MAJOR, "loop", loop_probe)) {
+ if (__register_blkdev(LOOP_MAJOR, "loop", loop_probe, THIS_MODULE)) {
err = -EIO;
goto misc_out;
}
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ae8fe54ea358..c13f45c0f502 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9584,11 +9584,11 @@ static int __init md_init(void)
if (!md_rdev_misc_wq)
goto err_rdev_misc_wq;
- ret = __register_blkdev(MD_MAJOR, "md", md_probe);
+ ret = __register_blkdev(MD_MAJOR, "md", md_probe, THIS_MODULE);
if (ret < 0)
goto err_md;
- ret = __register_blkdev(0, "mdp", md_probe);
+ ret = __register_blkdev(0, "mdp", md_probe, THIS_MODULE);
if (ret < 0)
goto err_mdp;
mdp_major = ret;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index b8d55af763f9..ddd67a1045e7 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3729,7 +3729,7 @@ static int __init init_sd(void)
SCSI_LOG_HLQUEUE(3, printk("init_sd: sd driver entry point\n"));
for (i = 0; i < SD_MAJORS; i++) {
- if (__register_blkdev(sd_major(i), "sd", sd_default_probe))
+ if (__register_blkdev(sd_major(i), "sd", sd_default_probe, THIS_MODULE))
continue;
majors++;
}
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 13b34177cc85..b0948003071d 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -303,9 +303,9 @@ struct gendisk *__blk_alloc_disk(int node);
void blk_cleanup_disk(struct gendisk *disk);
int __register_blkdev(unsigned int major, const char *name,
- void (*probe)(dev_t devt));
+ void (*probe)(dev_t devt), struct module *owner);
#define register_blkdev(major, name) \
- __register_blkdev(major, name, NULL)
+ __register_blkdev(major, name, NULL, NULL)
void unregister_blkdev(unsigned int major, const char *name);
bool bdev_check_media_change(struct block_device *bdev);
--
2.18.4