linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v3 0/6] block: fix blktrace debugfs use after free
@ 2020-04-29  7:46 Luis Chamberlain
  2020-04-29  7:46 ` [PATCH v3 1/6] block: revert back to synchronous request_queue removal Luis Chamberlain
                   ` (5 more replies)
  0 siblings, 6 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29  7:46 UTC (permalink / raw)
  To: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Luis Chamberlain

Alrighty, here is v3 with all the BUG_*() crap removed. It now also
just creates the debugfs directories needed for the partitions at
initialization. This allows us to get rid of the pesky
debugfs_lookup() calls, which have made this code very awkward and
which allowed us to find surprising bugs when we went with an
asynchronous request_queue removal.

I'll note that I still see this:

debugfs: Directory 'loop0' with parent 'block' already present!

But only for break-blktrace [0] run_0004.sh. Since we don't
have any more races with blktrace, this has pushed me to look
into disk registration / deletion. I'll be posting patches soon
with some changes to the error handling to help with that.

If, after these patches, you find the root cause of this,
let me know!

Also, if folks don't disagree, I'll likely follow up to just merge
break-blktrace as a self-test for blktrace. We can later expand on it
upstream instead.

These patches are based on linux-next tag next-20200428, you can find
the code on my 20200428-blktrace-fixes branch [1].

[0] https://github.com/mcgrof/break-blktrace
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20200428-blktrace-fixes

Luis Chamberlain (6):
  block: revert back to synchronous request_queue removal
  block: move main block debugfs initialization to its own file
  blktrace: move blktrace debugfs creation to helper function
  blktrace: fix debugfs use after free
  blktrace: break out of blktrace setup on concurrent calls
  loop: be paranoid on exit and prevent new additions / removals

 block/Makefile               |  1 +
 block/blk-core.c             | 32 ++++++++++++----
 block/blk-debugfs.c          | 44 ++++++++++++++++++++++
 block/blk-mq-debugfs.c       |  5 ---
 block/blk-sysfs.c            | 47 ++++++++++++-----------
 block/blk.h                  | 18 +++++++++
 block/genhd.c                | 73 +++++++++++++++++++++++++++++++++++-
 block/partitions/core.c      |  3 ++
 drivers/block/loop.c         |  4 ++
 drivers/scsi/sg.c            |  2 +
 include/linux/blkdev.h       |  7 ++--
 include/linux/blktrace_api.h |  1 -
 include/linux/genhd.h        | 18 +++++++++
 kernel/trace/blktrace.c      | 39 ++++++++++++++++---
 14 files changed, 249 insertions(+), 45 deletions(-)
 create mode 100644 block/blk-debugfs.c

-- 
2.25.1



* [PATCH v3 1/6] block: revert back to synchronous request_queue removal
  2020-04-29  7:46 [PATCH v3 0/6] block: fix blktrace debugfs use after free Luis Chamberlain
@ 2020-04-29  7:46 ` Luis Chamberlain
  2020-04-29 11:15   ` Christoph Hellwig
  2020-05-02  0:22   ` Bart Van Assche
  2020-04-29  7:46 ` [PATCH v3 2/6] block: move main block debugfs initialization to its own file Luis Chamberlain
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29  7:46 UTC (permalink / raw)
  To: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Luis Chamberlain, Omar Sandoval, Hannes Reinecke,
	Michal Hocko

Commit dc9edc44de6c ("block: Fix a blk_exit_rl() regression"), merged in
v4.12, moved the work behind blk_release_queue() into a workqueue after a
splat floated around which indicated that some work in blk_release_queue()
could sleep in blk_exit_rl(). This splat would be possible when a driver
called blk_put_queue() or blk_cleanup_queue() (which calls blk_put_queue()
as its final call) from an atomic context.

blk_put_queue() decrements the refcount for the request_queue kobject,
and upon reaching 0 blk_release_queue() is called. Although blk_exit_rl()
was removed by commit db6d9952356 ("block: remove request_list code")
in v5.0, we reserve the right to be able to sleep within blk_release_queue()
context.
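
As an illustration of the refcount behaviour described above, here is a
minimal userspace sketch. It is not kernel code: struct toy_obj and the
toy_*() helpers are hypothetical stand-ins for the kobject embedded in
the request_queue, and the counter is a plain int, so it is only valid
single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kobject-style refcounted object. */
struct toy_obj {
	int refcount;
	int released;				/* set once release() has run */
	void (*release)(struct toy_obj *o);	/* the kernel counterpart may sleep */
};

static void toy_release(struct toy_obj *o)
{
	o->released = 1;
}

static void toy_init(struct toy_obj *o)
{
	o->refcount = 1;	/* the creator holds the first reference */
	o->released = 0;
	o->release = toy_release;
}

static void toy_get(struct toy_obj *o)
{
	o->refcount++;
}

/*
 * Drop one reference. release() runs synchronously, in the caller's
 * context, when the count reaches 0. Returns 1 if this call dropped
 * the last reference, 0 otherwise.
 */
static int toy_put(struct toy_obj *o)
{
	if (--o->refcount == 0) {
		o->release(o);
		return 1;
	}
	return 0;
}
```

Only the toy_put() call which takes the count to 0 runs release(), which
is why only that last put carries the "must not be atomic" restriction.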

The last reference to the request_queue must not be dropped from atomic
context. *When* the request_queue refcount reaches 0 varies,
and so let's take the opportunity to document when that is expected to
happen and also document the context of the related calls as best as possible
so we can avoid future issues, and with the hopes that the synchronous
request_queue removal sticks.

We revert back to synchronous request_queue removal because asynchronous
removal creates a regression with expected userspace interaction with
several drivers. An example is the loopback driver: one removes a device
using ioctls from userspace, and upon a successful return one expects the
device to be removed. Likewise, if one races to add another device, the
new one may not be added as the old one is still being removed. Synchronous
removal was the expected behaviour before, and with asynchronous removal
these cases now fail as the device is still present and busy. Moving to
asynchronous request_queue removal could have broken many scripts which
relied on the removal having completed if there was no error. Document this
expectation as well so that this doesn't regress userspace again.
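
The userspace-visible difference can be sketched with a toy device
table. This is a userspace model under stated assumptions, not the loop
driver: struct toy_table and the toy_*() functions are hypothetical, and
toy_run_deferred() stands in for whatever would eventually run the
deferred release.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NSLOTS 4

/* Hypothetical device table; the slot index plays the role of N in loopN. */
struct toy_table {
	bool present[TOY_NSLOTS];
	bool pending_release[TOY_NSLOTS];	/* release queued, not yet run */
};

static bool toy_valid(int n)
{
	return n >= 0 && n < TOY_NSLOTS;
}

/* Add device n; fails if the slot is still occupied. */
static int toy_add(struct toy_table *t, int n)
{
	if (!toy_valid(n) || t->present[n])
		return -1;
	t->present[n] = true;
	return 0;
}

/* Synchronous removal: the slot is free by the time we return. */
static int toy_remove_sync(struct toy_table *t, int n)
{
	if (!toy_valid(n) || !t->present[n])
		return -1;
	t->present[n] = false;
	return 0;
}

/* Deferred removal: reports success, but the slot stays busy for now. */
static int toy_remove_deferred(struct toy_table *t, int n)
{
	if (!toy_valid(n) || !t->present[n] || t->pending_release[n])
		return -1;
	t->pending_release[n] = true;
	return 0;
}

/* Stand-in for the deferred work finally running. */
static void toy_run_deferred(struct toy_table *t)
{
	for (int n = 0; n < TOY_NSLOTS; n++) {
		if (t->pending_release[n]) {
			t->present[n] = false;
			t->pending_release[n] = false;
		}
	}
}
```

With toy_remove_sync(), an immediate toy_add() of the same slot
succeeds. With toy_remove_deferred() it fails until toy_run_deferred()
has run, even though the removal itself reported success.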

Using asynchronous request_queue removal however has helped us find
other bugs. In the future we can test what could break with this
arrangement by enabling CONFIG_DEBUG_KOBJECT_RELEASE.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: yu kuai <yukuai3@huawei.com>
Suggested-by: Nicolai Stange <nstange@suse.de>
Fixes: dc9edc44de6c ("block: Fix a blk_exit_rl() regression")
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 block/blk-core.c       | 23 +++++++++++++
 block/blk-sysfs.c      | 43 +++++++++++++------------
 block/genhd.c          | 73 +++++++++++++++++++++++++++++++++++++++++-
 include/linux/blkdev.h |  2 --
 4 files changed, 117 insertions(+), 24 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 0641c2916d7e..8a27c772982e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -306,6 +306,16 @@ void blk_clear_pm_only(struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_clear_pm_only);
 
+/**
+ * blk_put_queue - decrement the request_queue refcount
+ * @q: the request_queue structure to decrement the refcount for
+ *
+ * Decrements the refcount to the request_queue kobject. When this reaches 0
+ * we'll have blk_release_queue() called.
+ *
+ * Context: Any context, but the last reference must not be dropped from
+ *          atomic context.
+ */
 void blk_put_queue(struct request_queue *q)
 {
 	kobject_put(&q->kobj);
@@ -337,9 +347,14 @@ EXPORT_SYMBOL_GPL(blk_set_queue_dying);
  *
  * Mark @q DYING, drain all pending requests, mark @q DEAD, destroy and
  * put it.  All future requests will be failed immediately with -ENODEV.
+ *
+ * Context: can sleep
  */
 void blk_cleanup_queue(struct request_queue *q)
 {
+	/* cannot be called from atomic context */
+	might_sleep();
+
 	WARN_ON_ONCE(blk_queue_registered(q));
 
 	/* mark @q DYING, no new request or merges will be allowed afterwards */
@@ -567,6 +582,14 @@ struct request_queue *blk_alloc_queue(make_request_fn make_request, int node_id)
 }
 EXPORT_SYMBOL(blk_alloc_queue);
 
+/**
+ * blk_get_queue - increment the request_queue refcount
+ * @q: the request_queue structure to increment the refcount for
+ *
+ * Increment the refcount to the request_queue kobject.
+ *
+ * Context: Any context.
+ */
 bool blk_get_queue(struct request_queue *q)
 {
 	if (likely(!blk_queue_dying(q))) {
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index fca9b158f4a0..eda8c4985511 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -860,22 +860,32 @@ static void blk_exit_queue(struct request_queue *q)
 	bdi_put(q->backing_dev_info);
 }
 
-
 /**
- * __blk_release_queue - release a request queue
- * @work: pointer to the release_work member of the request queue to be released
+ * blk_release_queue - releases all allocated resources of the request_queue
+ * @kobj: pointer to a kobject, whose container is a request_queue
+ *
+ * This function releases all allocated resources of the request queue.
+ *
+ * The struct request_queue refcount is incremented with blk_get_queue() and
+ * decremented with blk_put_queue(). Once the refcount reaches 0 this function
+ * is called.
+ *
+ * For drivers that have a request_queue on a gendisk and added with
+ * __device_add_disk() the refcount to request_queue will reach 0 with
+ * the last put_disk() called by the driver. For drivers which don't use
+ * __device_add_disk() this happens with blk_cleanup_queue().
  *
- * Description:
- *     This function is called when a block device is being unregistered. The
- *     process of releasing a request queue starts with blk_cleanup_queue, which
- *     set the appropriate flags and then calls blk_put_queue, that decrements
- *     the reference counter of the request queue. Once the reference counter
- *     of the request queue reaches zero, blk_release_queue is called to release
- *     all allocated resources of the request queue.
+ * Drivers exist which depend on the release of the request_queue to be
+ * synchronous, it should not be deferred.
+ *
+ * Context: can sleep
  */
-static void __blk_release_queue(struct work_struct *work)
+static void blk_release_queue(struct kobject *kobj)
 {
-	struct request_queue *q = container_of(work, typeof(*q), release_work);
+	struct request_queue *q =
+		container_of(kobj, struct request_queue, kobj);
+
+	might_sleep();
 
 	if (test_bit(QUEUE_FLAG_POLL_STATS, &q->queue_flags))
 		blk_stat_remove_callback(q, q->poll_cb);
@@ -904,15 +914,6 @@ static void __blk_release_queue(struct work_struct *work)
 	call_rcu(&q->rcu_head, blk_free_queue_rcu);
 }
 
-static void blk_release_queue(struct kobject *kobj)
-{
-	struct request_queue *q =
-		container_of(kobj, struct request_queue, kobj);
-
-	INIT_WORK(&q->release_work, __blk_release_queue);
-	schedule_work(&q->release_work);
-}
-
 static const struct sysfs_ops queue_sysfs_ops = {
 	.show	= queue_attr_show,
 	.store	= queue_attr_store,
diff --git a/block/genhd.c b/block/genhd.c
index c05d509877fa..a933cffbee2e 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -897,11 +897,32 @@ static void invalidate_partition(struct gendisk *disk, int partno)
 	bdput(bdev);
 }
 
+/**
+ * del_gendisk - remove the gendisk
+ * @disk: the struct gendisk to remove
+ *
+ * Removes the gendisk and all its associated resources. This deletes the
+ * partitions associated with the gendisk, and unregisters the associated
+ * request_queue.
+ *
+ * This is the counter to the respective __device_add_disk() call.
+ *
+ * The final removal of the struct gendisk happens when its refcount reaches 0
+ * with put_disk(), which should be called after del_gendisk(), if
+ * __device_add_disk() was used.
+ *
+ * Drivers exist which depend on the release of the gendisk to be synchronous,
+ * it should not be deferred.
+ *
+ * Context: can sleep
+ */
 void del_gendisk(struct gendisk *disk)
 {
 	struct disk_part_iter piter;
 	struct hd_struct *part;
 
+	might_sleep();
+
 	blk_integrity_del(disk);
 	disk_del_events(disk);
 
@@ -992,11 +1013,15 @@ static ssize_t disk_badblocks_store(struct device *dev,
  *
  * This function gets the structure containing partitioning
  * information for the given device @devt.
+ *
+ * Context: can sleep
  */
 struct gendisk *get_gendisk(dev_t devt, int *partno)
 {
 	struct gendisk *disk = NULL;
 
+	might_sleep();
+
 	if (MAJOR(devt) != BLOCK_EXT_MAJOR) {
 		struct kobject *kobj;
 
@@ -1528,10 +1553,31 @@ int disk_expand_part_tbl(struct gendisk *disk, int partno)
 	return 0;
 }
 
+/**
+ * disk_release - releases all allocated resources of the gendisk
+ * @dev: the device representing this disk
+ *
+ * This function releases all allocated resources of the gendisk.
+ *
+ * The struct gendisk refcount is incremented with get_gendisk() or
+ * get_disk_and_module(), and its refcount is decremented with
+ * put_disk_and_module() or put_disk(). Once the refcount reaches 0 this
+ * function is called.
+ *
+ * Drivers which used __device_add_disk() have a gendisk with a request_queue
+ * assigned. Since the request_queue sits on top of the gendisk for these
+ * drivers we also call blk_put_queue() for them, and we expect the
+ * request_queue refcount to reach 0 at this point, and so the request_queue
+ * will also be freed prior to the disk.
+ *
+ * Context: can sleep
+ */
 static void disk_release(struct device *dev)
 {
 	struct gendisk *disk = dev_to_disk(dev);
 
+	might_sleep();
+
 	blk_free_devt(dev->devt);
 	disk_release_events(disk);
 	kfree(disk->random);
@@ -1737,6 +1783,15 @@ struct gendisk *__alloc_disk_node(int minors, int node_id)
 }
 EXPORT_SYMBOL(__alloc_disk_node);
 
+/**
+ * get_disk_and_module - increments the gendisk and gendisk fops module refcount
+ * @disk: the struct gendisk to increment the refcount for
+ *
+ * This increments the refcount for the struct gendisk, and the gendisk's
+ * fops module owner.
+ *
+ * Context: Any context.
+ */
 struct kobject *get_disk_and_module(struct gendisk *disk)
 {
 	struct module *owner;
@@ -1757,6 +1812,16 @@ struct kobject *get_disk_and_module(struct gendisk *disk)
 }
 EXPORT_SYMBOL(get_disk_and_module);
 
+/**
+ * put_disk - decrements the gendisk refcount
+ * @disk: the struct gendisk to decrement the refcount for
+ *
+ * This decrements the refcount for the struct gendisk. When this reaches 0
+ * we'll have disk_release() called.
+ *
+ * Context: Any context, but the last reference must not be dropped from
+ *          atomic context.
+ */
 void put_disk(struct gendisk *disk)
 {
 	if (disk)
@@ -1764,9 +1829,15 @@ void put_disk(struct gendisk *disk)
 }
 EXPORT_SYMBOL(put_disk);
 
-/*
+/**
+ * put_disk_and_module - decrements the module and gendisk refcount
+ * @disk: the struct gendisk to decrement the refcount for
+ *
  * This is a counterpart of get_disk_and_module() and thus also of
  * get_gendisk().
+ *
+ * Context: Any context, but the last reference must not be dropped from
+ *          atomic context.
  */
 void put_disk_and_module(struct gendisk *disk)
 {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f00bd4042295..3122a93c7277 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -571,8 +571,6 @@ struct request_queue {
 
 	size_t			cmd_size;
 
-	struct work_struct	release_work;
-
 #define BLK_MAX_WRITE_HINTS	5
 	u64			write_hints[BLK_MAX_WRITE_HINTS];
 };
-- 
2.25.1



* [PATCH v3 2/6] block: move main block debugfs initialization to its own file
  2020-04-29  7:46 [PATCH v3 0/6] block: fix blktrace debugfs use after free Luis Chamberlain
  2020-04-29  7:46 ` [PATCH v3 1/6] block: revert back to synchronous request_queue removal Luis Chamberlain
@ 2020-04-29  7:46 ` Luis Chamberlain
  2020-04-29 11:15   ` Christoph Hellwig
  2020-04-29  7:46 ` [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function Luis Chamberlain
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29  7:46 UTC (permalink / raw)
  To: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Luis Chamberlain, Omar Sandoval, Hannes Reinecke,
	Michal Hocko

make_request-based drivers and request-based drivers share some
debugfs code. Moving it into its own file makes it easier
to expand and audit this shared code.

This patch contains no functional changes.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: yu kuai <yukuai3@huawei.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 block/Makefile      |  1 +
 block/blk-core.c    |  9 +--------
 block/blk-debugfs.c | 15 +++++++++++++++
 block/blk.h         |  7 +++++++
 4 files changed, 24 insertions(+), 8 deletions(-)
 create mode 100644 block/blk-debugfs.c

diff --git a/block/Makefile b/block/Makefile
index 206b96e9387f..1d3ab20505d8 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-sysfs.o \
 			blk-mq-sysfs.o blk-mq-cpumap.o blk-mq-sched.o ioctl.o \
 			genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o
 
+obj-$(CONFIG_DEBUG_FS)		+= blk-debugfs.o
 obj-$(CONFIG_BOUNCE)		+= bounce.o
 obj-$(CONFIG_BLK_SCSI_REQUEST)	+= scsi_ioctl.o
 obj-$(CONFIG_BLK_DEV_BSG)	+= bsg.o
diff --git a/block/blk-core.c b/block/blk-core.c
index 8a27c772982e..4b26f686e249 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -49,10 +49,6 @@
 #include "blk-pm.h"
 #include "blk-rq-qos.h"
 
-#ifdef CONFIG_DEBUG_FS
-struct dentry *blk_debugfs_root;
-#endif
-
 EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
 EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
 EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_complete);
@@ -1828,10 +1824,7 @@ int __init blk_dev_init(void)
 
 	blk_requestq_cachep = kmem_cache_create("request_queue",
 			sizeof(struct request_queue), 0, SLAB_PANIC, NULL);
-
-#ifdef CONFIG_DEBUG_FS
-	blk_debugfs_root = debugfs_create_dir("block", NULL);
-#endif
+	blk_debugfs_register();
 
 	return 0;
 }
diff --git a/block/blk-debugfs.c b/block/blk-debugfs.c
new file mode 100644
index 000000000000..19091e1effc0
--- /dev/null
+++ b/block/blk-debugfs.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Shared request-based / make_request-based functionality
+ */
+#include <linux/kernel.h>
+#include <linux/blkdev.h>
+#include <linux/debugfs.h>
+
+struct dentry *blk_debugfs_root;
+
+void blk_debugfs_register(void)
+{
+	blk_debugfs_root = debugfs_create_dir("block", NULL);
+}
diff --git a/block/blk.h b/block/blk.h
index 73bd3b1c6938..ec16e8a6049e 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -456,5 +456,12 @@ struct request_queue *__blk_alloc_queue(int node_id);
 int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		struct page *page, unsigned int len, unsigned int offset,
 		bool *same_page);
+#ifdef CONFIG_DEBUG_FS
+void blk_debugfs_register(void);
+#else
+static inline void blk_debugfs_register(void)
+{
+}
+#endif /* CONFIG_DEBUG_FS */
 
 #endif /* BLK_INTERNAL_H */
-- 
2.25.1



* [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function
  2020-04-29  7:46 [PATCH v3 0/6] block: fix blktrace debugfs use after free Luis Chamberlain
  2020-04-29  7:46 ` [PATCH v3 1/6] block: revert back to synchronous request_queue removal Luis Chamberlain
  2020-04-29  7:46 ` [PATCH v3 2/6] block: move main block debugfs initialization to its own file Luis Chamberlain
@ 2020-04-29  7:46 ` Luis Chamberlain
  2020-04-29 11:20   ` Christoph Hellwig
  2020-05-02  0:25   ` Bart Van Assche
  2020-04-29  7:46 ` [PATCH v3 4/6] blktrace: fix debugfs use after free Luis Chamberlain
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29  7:46 UTC (permalink / raw)
  To: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Luis Chamberlain

Move the work to create the debugfs directory used by blktrace into a
helper. It will make further checks easier to read. This commit
introduces no functional changes.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 kernel/trace/blktrace.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index ca39dc3230cb..2c6e6c386ace 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -468,6 +468,18 @@ static void blk_trace_setup_lba(struct blk_trace *bt,
 	}
 }
 
+static struct dentry *blk_trace_debugfs_dir(struct blk_user_trace_setup *buts,
+					    struct blk_trace *bt)
+{
+	struct dentry *dir = NULL;
+
+	dir = debugfs_lookup(buts->name, blk_debugfs_root);
+	if (!dir)
+		bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root);
+
+	return dir;
+}
+
 /*
  * Setup everything required to start tracing
  */
@@ -509,9 +521,7 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 
 	ret = -ENOENT;
 
-	dir = debugfs_lookup(buts->name, blk_debugfs_root);
-	if (!dir)
-		bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root);
+	dir = blk_trace_debugfs_dir(buts, bt);
 
 	bt->dev = dev;
 	atomic_set(&bt->dropped, 0);
-- 
2.25.1



* [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29  7:46 [PATCH v3 0/6] block: fix blktrace debugfs use after free Luis Chamberlain
                   ` (2 preceding siblings ...)
  2020-04-29  7:46 ` [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function Luis Chamberlain
@ 2020-04-29  7:46 ` Luis Chamberlain
  2020-04-29  9:47   ` Greg KH
  2020-04-29 11:26   ` Christoph Hellwig
  2020-04-29  7:46 ` [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls Luis Chamberlain
  2020-04-29  7:46 ` [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals Luis Chamberlain
  5 siblings, 2 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29  7:46 UTC (permalink / raw)
  To: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Luis Chamberlain, Omar Sandoval, Hannes Reinecke,
	Michal Hocko, syzbot+603294af2d01acfdd6da

In commit 6ac93117ab00 ("blktrace: use existing disk debugfs directory"),
merged in v4.12, Omar fixed the original blktrace code for request-based
drivers (multiqueue). This however left in place a possible crash if you
happen to abuse blktrace while racing to remove / add a device.

We used to use asynchronous removal of the request_queue, and with that
the issue was easier to reproduce. Now that we have reverted to
synchronous removal of the request_queue, the issue is still possible to
reproduce; it is, however, just a bit more difficult.

We essentially run two instances of break-blktrace in parallel; each
adds/removes a loop device, sets up a blktrace, and just never tears the
blktrace down. This is easily reproduced with the break-blktrace
run_0004.sh script.

We can end up with two types of panics, each reflecting where we
race. One is on a failed blktrace setup:

[  252.426751] debugfs: Directory 'loop0' with parent 'block' already present!
[  252.432265] BUG: kernel NULL pointer dereference, address: 00000000000000a0
[  252.436592] #PF: supervisor write access in kernel mode
[  252.439822] #PF: error_code(0x0002) - not-present page
[  252.442967] PGD 0 P4D 0
[  252.444656] Oops: 0002 [#1] SMP NOPTI
[  252.446972] CPU: 10 PID: 1153 Comm: break-blktrace Tainted: G            E     5.7.0-rc2-next-20200420+ #164
[  252.452673] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
[  252.456343] RIP: 0010:down_write+0x15/0x40
[  252.458146] Code: eb ca e8 ae 22 8d ff cc cc cc cc cc cc cc cc cc cc cc cc
               cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00
               00 00 <f0> 48 0f b1 55 00 75 0f 48 8b 04 25 c0 8b 01 00 48 89
               45 08 5d
[  252.463638] RSP: 0018:ffffa626415abcc8 EFLAGS: 00010246
[  252.464950] RAX: 0000000000000000 RBX: ffff958c25f0f5c0 RCX: ffffff8100000000
[  252.466727] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0
[  252.468482] RBP: 00000000000000a0 R08: 0000000000000000 R09: 0000000000000001
[  252.470014] R10: 0000000000000000 R11: ffff958d1f9227ff R12: 0000000000000000
[  252.471473] R13: ffff958c25ea5380 R14: ffffffff8cce15f1 R15: 00000000000000a0
[  252.473346] FS:  00007f2e69dee540(0000) GS:ffff958c2fc80000(0000) knlGS:0000000000000000
[  252.475225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  252.476267] CR2: 00000000000000a0 CR3: 0000000427d10004 CR4: 0000000000360ee0
[  252.477526] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  252.478776] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  252.479866] Call Trace:
[  252.480322]  simple_recursive_removal+0x4e/0x2e0
[  252.481078]  ? debugfs_remove+0x60/0x60
[  252.481725]  ? relay_destroy_buf+0x77/0xb0
[  252.482662]  debugfs_remove+0x40/0x60
[  252.483518]  blk_remove_buf_file_callback+0x5/0x10
[  252.484328]  relay_close_buf+0x2e/0x60
[  252.484930]  relay_open+0x1ce/0x2c0
[  252.485520]  do_blk_trace_setup+0x14f/0x2b0
[  252.486187]  __blk_trace_setup+0x54/0xb0
[  252.486803]  blk_trace_ioctl+0x90/0x140
[  252.487423]  ? do_sys_openat2+0x1ab/0x2d0
[  252.488053]  blkdev_ioctl+0x4d/0x260
[  252.488636]  block_ioctl+0x39/0x40
[  252.489139]  ksys_ioctl+0x87/0xc0
[  252.489675]  __x64_sys_ioctl+0x16/0x20
[  252.490380]  do_syscall_64+0x52/0x180
[  252.491032]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

And the other on the device removal:

[  128.528940] debugfs: Directory 'loop0' with parent 'block' already present!
[  128.615325] BUG: kernel NULL pointer dereference, address: 00000000000000a0
[  128.619537] #PF: supervisor write access in kernel mode
[  128.622700] #PF: error_code(0x0002) - not-present page
[  128.625842] PGD 0 P4D 0
[  128.627585] Oops: 0002 [#1] SMP NOPTI
[  128.629871] CPU: 12 PID: 544 Comm: break-blktrace Tainted: G            E     5.7.0-rc2-next-20200420+ #164
[  128.635595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
[  128.640471] RIP: 0010:down_write+0x15/0x40
[  128.643041] Code: eb ca e8 ae 22 8d ff cc cc cc cc cc cc cc cc cc cc cc cc
               cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00
               00 00 <f0> 48 0f b1 55 00 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89
               45 08 5d
[  128.650180] RSP: 0018:ffffa9c3c05ebd78 EFLAGS: 00010246
[  128.651820] RAX: 0000000000000000 RBX: ffff8ae9a6370240 RCX: ffffff8100000000
[  128.653942] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0
[  128.655720] RBP: 00000000000000a0 R08: 0000000000000002 R09: ffff8ae9afd2d3d0
[  128.657400] R10: 0000000000000056 R11: 0000000000000000 R12: 0000000000000000
[  128.659099] R13: 0000000000000000 R14: 0000000000000003 R15: 00000000000000a0
[  128.660500] FS:  00007febfd995540(0000) GS:ffff8ae9afd00000(0000) knlGS:0000000000000000
[  128.662204] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  128.663426] CR2: 00000000000000a0 CR3: 0000000420042003 CR4: 0000000000360ee0
[  128.664776] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  128.666022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  128.667282] Call Trace:
[  128.667801]  simple_recursive_removal+0x4e/0x2e0
[  128.668663]  ? debugfs_remove+0x60/0x60
[  128.669368]  debugfs_remove+0x40/0x60
[  128.669985]  blk_trace_free+0xd/0x50
[  128.670593]  __blk_trace_remove+0x27/0x40
[  128.671274]  blk_trace_shutdown+0x30/0x40
[  128.671935]  blk_release_queue+0x95/0xf0
[  128.672589]  kobject_put+0xa5/0x1b0
[  128.673188]  disk_release+0xa2/0xc0
[  128.673786]  device_release+0x28/0x80
[  128.674376]  kobject_put+0xa5/0x1b0
[  128.674915]  loop_remove+0x39/0x50 [loop]
[  128.675511]  loop_control_ioctl+0x113/0x130 [loop]
[  128.676199]  ksys_ioctl+0x87/0xc0
[  128.676708]  __x64_sys_ioctl+0x16/0x20
[  128.677274]  do_syscall_64+0x52/0x180
[  128.677823]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

The common theme here is:

debugfs: Directory 'loop0' with parent 'block' already present

This crash happens because of how blktrace uses the debugfs directory
where it places its files. Upon init we always create the same directory
which would be needed by blktrace, but we only do this for request-based
(multiqueue) block drivers, never for make_request-based block
drivers. Furthermore, that directory is only created on init for the
entire disk. This means that if you use blktrace on a partition, we'll
always be creating a new directory regardless of whether you are
doing blktrace on a request-based (multiqueue) or a make_request-based
block driver.

These directory creations are only associated with a path, and so
when a debugfs_remove() is called it removes everything in its way.
A device removal will remove all blktrace files, and so if a blktrace
is still present a cleanup of blktrace files later will end up trying
to remove dentries pointing to NULL.

We can fix the UAF by using a debugfs directory which moving forward
will always be accessible if debugfs is enabled, for both request-based
(multiqueue) and make_request-based block drivers, *and* for all
partitions upon creation. This ensures that removal of the directories
only happens on device removal, and removes the race where files are
removed underneath an active blktrace.
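
The ownership rule this relies on can be sketched in userspace as
follows. This is a toy model, not the patch itself: the toy_* types and
functions are hypothetical, the directory is just a heap object, and the
count of two files per trace is arbitrary. The teardown order (trace
first, then directory) mirrors blk_trace_shutdown() being called ahead
of blk_queue_debugfs_unregister() in blk_release_queue().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a debugfs directory. */
struct toy_dir {
	int nfiles;
};

/* The trace only borrows the directory; it never creates or frees it. */
struct toy_trace {
	struct toy_dir *dir;
};

struct toy_queue {
	struct toy_dir *debugfs_dir;	/* owned, created at registration */
	struct toy_trace *trace;	/* active trace, if any */
};

static int toy_queue_register(struct toy_queue *q)
{
	q->trace = NULL;
	q->debugfs_dir = calloc(1, sizeof(*q->debugfs_dir));
	return q->debugfs_dir ? 0 : -1;
}

static int toy_trace_setup(struct toy_queue *q, struct toy_trace *bt)
{
	if (!q->debugfs_dir)
		return -1;	/* no directory: refuse, do not create one */
	if (q->trace)
		return -1;	/* a trace is already set up */
	bt->dir = q->debugfs_dir;
	bt->dir->nfiles += 2;	/* arbitrary: two trace files */
	q->trace = bt;
	return 0;
}

static void toy_trace_remove(struct toy_queue *q)
{
	if (!q->trace)
		return;
	q->trace->dir->nfiles -= 2;
	q->trace->dir = NULL;
	q->trace = NULL;
}

static void toy_queue_unregister(struct toy_queue *q)
{
	toy_trace_remove(q);	/* tear the trace down first ... */
	free(q->debugfs_dir);	/* ... then remove the directory */
	q->debugfs_dir = NULL;
}
```

Because only the queue creates and frees the directory, no path exists
where a trace is left holding a pointer into a directory that has
already been removed.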

This also simplifies the code considerably, with the only penalty now
being that we're always creating the request queue debugfs directory for
the make_request-based block device drivers, and the partition debugfs
directories upon initialization for both types of drivers.

This patch is part of the work which disputes the severity of
CVE-2019-19770; it shows this issue is not a core debugfs issue, but
a misuse of debugfs within blktrace.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: yu kuai <yukuai3@huawei.com>
Reported-by: syzbot+603294af2d01acfdd6da@syzkaller.appspotmail.com
Fixes: 6ac93117ab00 ("blktrace: use existing disk debugfs directory")
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 block/blk-debugfs.c          | 29 +++++++++++++++++++++++++++++
 block/blk-mq-debugfs.c       |  5 -----
 block/blk-sysfs.c            |  4 ++++
 block/blk.h                  | 11 +++++++++++
 block/partitions/core.c      |  3 +++
 drivers/scsi/sg.c            |  2 ++
 include/linux/blkdev.h       |  5 ++++-
 include/linux/blktrace_api.h |  1 -
 include/linux/genhd.h        | 18 ++++++++++++++++++
 kernel/trace/blktrace.c      | 32 +++++++++++++++++++++-----------
 10 files changed, 92 insertions(+), 18 deletions(-)

diff --git a/block/blk-debugfs.c b/block/blk-debugfs.c
index 19091e1effc0..a0f4077d6959 100644
--- a/block/blk-debugfs.c
+++ b/block/blk-debugfs.c
@@ -13,3 +13,32 @@ void blk_debugfs_register(void)
 {
 	blk_debugfs_root = debugfs_create_dir("block", NULL);
 }
+
+static struct dentry *blk_debugfs_dir_register(const char *name)
+{
+	return debugfs_create_dir(name, blk_debugfs_root);
+}
+
+void blk_queue_debugfs_register(struct request_queue *q, const char *name)
+{
+	q->debugfs_dir = blk_debugfs_dir_register(name);
+}
+EXPORT_SYMBOL_GPL(blk_queue_debugfs_register);
+
+void blk_queue_debugfs_unregister(struct request_queue *q)
+{
+	debugfs_remove_recursive(q->debugfs_dir);
+	q->debugfs_dir = NULL;
+}
+EXPORT_SYMBOL_GPL(blk_queue_debugfs_unregister);
+
+void blk_part_debugfs_register(struct hd_struct *p, const char *name)
+{
+	p->debugfs_dir = blk_debugfs_dir_register(name);
+}
+
+void blk_part_debugfs_unregister(struct hd_struct *p)
+{
+	debugfs_remove_recursive(p->debugfs_dir);
+	p->debugfs_dir = NULL;
+}
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 96b7a35c898a..08edc3a54114 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -822,9 +822,6 @@ void blk_mq_debugfs_register(struct request_queue *q)
 	struct blk_mq_hw_ctx *hctx;
 	int i;
 
-	q->debugfs_dir = debugfs_create_dir(kobject_name(q->kobj.parent),
-					    blk_debugfs_root);
-
 	debugfs_create_files(q->debugfs_dir, q, blk_mq_debugfs_queue_attrs);
 
 	/*
@@ -855,9 +852,7 @@ void blk_mq_debugfs_register(struct request_queue *q)
 
 void blk_mq_debugfs_unregister(struct request_queue *q)
 {
-	debugfs_remove_recursive(q->debugfs_dir);
 	q->sched_debugfs_dir = NULL;
-	q->debugfs_dir = NULL;
 }
 
 static void blk_mq_debugfs_register_ctx(struct blk_mq_hw_ctx *hctx,
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index eda8c4985511..f758a7e06671 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -905,6 +905,7 @@ static void blk_release_queue(struct kobject *kobj)
 
 	blk_trace_shutdown(q);
 
+	blk_queue_debugfs_unregister(q);
 	if (queue_is_mq(q))
 		blk_mq_debugfs_unregister(q);
 
@@ -976,6 +977,8 @@ int blk_register_queue(struct gendisk *disk)
 		goto unlock;
 	}
 
+	blk_queue_debugfs_register(q, kobject_name(q->kobj.parent));
+
 	if (queue_is_mq(q)) {
 		__blk_mq_register_dev(dev, q);
 		blk_mq_debugfs_register(q);
@@ -986,6 +989,7 @@ int blk_register_queue(struct gendisk *disk)
 		ret = elv_register_queue(q, false);
 		if (ret) {
 			mutex_unlock(&q->sysfs_lock);
+			blk_queue_debugfs_unregister(q);
 			mutex_unlock(&q->sysfs_dir_lock);
 			kobject_del(&q->kobj);
 			blk_trace_remove_sysfs(dev);
diff --git a/block/blk.h b/block/blk.h
index ec16e8a6049e..46d867a7f5bc 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -458,10 +458,21 @@ int __bio_add_pc_page(struct request_queue *q, struct bio *bio,
 		bool *same_page);
 #ifdef CONFIG_DEBUG_FS
 void blk_debugfs_register(void);
+void blk_part_debugfs_register(struct hd_struct *p, const char *name);
+void blk_part_debugfs_unregister(struct hd_struct *p);
 #else
 static inline void blk_debugfs_register(void)
 {
 }
+
+static inline void blk_part_debugfs_register(struct hd_struct *p,
+					     const char *name)
+{
+}
+
+static inline void blk_part_debugfs_unregister(struct hd_struct *p)
+{
+}
 #endif /* CONFIG_DEBUG_FS */
 
 #endif /* BLK_INTERNAL_H */
diff --git a/block/partitions/core.c b/block/partitions/core.c
index c085bf85509b..ae395b3ec9cc 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -312,6 +312,7 @@ void delete_partition(struct gendisk *disk, struct hd_struct *part)
 	rcu_assign_pointer(ptbl->part[part->partno], NULL);
 	rcu_assign_pointer(ptbl->last_lookup, NULL);
 	kobject_put(part->holder_dir);
+	blk_part_debugfs_unregister(part);
 	device_del(part_to_dev(part));
 
 	/*
@@ -433,6 +434,8 @@ static struct hd_struct *add_partition(struct gendisk *disk, int partno,
 	if (!p->holder_dir)
 		goto out_del;
 
+	blk_part_debugfs_register(p, dev_name(pdev));
+
 	dev_set_uevent_suppress(pdev, 0);
 	if (flags & ADDPART_FLAG_WHOLEDISK) {
 		err = device_create_file(pdev, &dev_attr_whole_disk);
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 20472aaaf630..f21787611918 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1548,6 +1548,7 @@ sg_add_device(struct device *cl_dev, struct class_interface *cl_intf)
 		goto out;
 	}
 
+	blk_queue_debugfs_register(sdp->device->request_queue, disk->disk_name);
 	error = cdev_add(cdev, MKDEV(SCSI_GENERIC_MAJOR, sdp->index), 1);
 	if (error)
 		goto cdev_add_err;
@@ -1644,6 +1645,7 @@ sg_remove_device(struct device *cl_dev, struct class_interface *cl_intf)
 
 	sysfs_remove_link(&scsidp->sdev_gendev.kobj, "generic");
 	device_destroy(sg_sysfs_class, MKDEV(SCSI_GENERIC_MAJOR, sdp->index));
+	blk_queue_debugfs_unregister(sdp->device->request_queue);
 	cdev_del(sdp->cdev);
 	sdp->cdev = NULL;
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3122a93c7277..e7edd31bdf9a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -561,8 +561,11 @@ struct request_queue {
 	struct list_head	tag_set_list;
 	struct bio_set		bio_split;
 
-#ifdef CONFIG_BLK_DEBUG_FS
+#ifdef CONFIG_DEBUG_FS
+	/* Used by block/blk-*debugfs.c and kernel/trace/blktrace.c */
 	struct dentry		*debugfs_dir;
+#endif
+#ifdef CONFIG_BLK_DEBUG_FS
 	struct dentry		*sched_debugfs_dir;
 	struct dentry		*rqos_debugfs_dir;
 #endif
diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h
index 3b6ff5902edc..eb6db276e293 100644
--- a/include/linux/blktrace_api.h
+++ b/include/linux/blktrace_api.h
@@ -22,7 +22,6 @@ struct blk_trace {
 	u64 end_lba;
 	u32 pid;
 	u32 dev;
-	struct dentry *dir;
 	struct dentry *dropped_file;
 	struct dentry *msg_file;
 	struct list_head running_list;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 058d895544c7..899760cf8c37 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -86,6 +86,10 @@ struct hd_struct {
 #endif
 	struct percpu_ref ref;
 	struct rcu_work rcu_work;
+#ifdef CONFIG_DEBUG_FS
+	/* Currently only used by kernel/trace/blktrace.c */
+	struct dentry *debugfs_dir;
+#endif
 };
 
 /**
@@ -382,4 +386,18 @@ static inline dev_t blk_lookup_devt(const char *name, int partno)
 }
 #endif /* CONFIG_BLOCK */
 
+#ifdef CONFIG_DEBUG_FS
+void blk_queue_debugfs_register(struct request_queue *q, const char *name);
+void blk_queue_debugfs_unregister(struct request_queue *q);
+#else
+static inline void blk_queue_debugfs_register(struct request_queue *q,
+					      const char *name)
+{
+}
+
+static inline void blk_queue_debugfs_unregister(struct request_queue *q)
+{
+}
+#endif /* CONFIG_DEBUG_FS */
+
 #endif /* _LINUX_GENHD_H */
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 2c6e6c386ace..5c52976bd762 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -3,6 +3,7 @@
  * Copyright (C) 2006 Jens Axboe <axboe@kernel.dk>
  *
  */
+
 #include <linux/kernel.h>
 #include <linux/blkdev.h>
 #include <linux/blktrace_api.h>
@@ -311,7 +312,6 @@ static void blk_trace_free(struct blk_trace *bt)
 	debugfs_remove(bt->msg_file);
 	debugfs_remove(bt->dropped_file);
 	relay_close(bt->rchan);
-	debugfs_remove(bt->dir);
 	free_percpu(bt->sequence);
 	free_percpu(bt->msg_data);
 	kfree(bt);
@@ -468,16 +468,25 @@ static void blk_trace_setup_lba(struct blk_trace *bt,
 	}
 }
 
-static struct dentry *blk_trace_debugfs_dir(struct blk_user_trace_setup *buts,
-					    struct blk_trace *bt)
+static struct dentry *blk_trace_debugfs_dir(struct block_device *bdev,
+					    struct request_queue *q)
 {
-	struct dentry *dir = NULL;
+	struct hd_struct *p = NULL;
 
-	dir = debugfs_lookup(buts->name, blk_debugfs_root);
-	if (!dir)
-		bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root);
+	/*
+	 * Some drivers like scsi-generic use a NULL block device. For
+	 * other drivers when bdev != bdev->bd_contains we are doing a blktrace
+	 * on a partition, otherwise we know we are working on the whole
+	 * disk, and for that the request_queue already has its own debugfs_dir,
+	 * which we have been using for things other than blktrace.
+	 */
+	if (bdev && bdev != bdev->bd_contains)
+		p = bdev->bd_part;
 
-	return dir;
+	if (p)
+		return p->debugfs_dir;
+
+	return q->debugfs_dir;
 }
 
 /*
@@ -491,6 +500,7 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 	struct dentry *dir = NULL;
 	int ret;
 
+
 	if (!buts->buf_size || !buts->buf_nr)
 		return -EINVAL;
 
@@ -521,7 +531,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 
 	ret = -ENOENT;
 
-	dir = blk_trace_debugfs_dir(buts, bt);
+	dir = blk_trace_debugfs_dir(bdev, q);
+	if (WARN_ON(!dir))
+		goto err;
 
 	bt->dev = dev;
 	atomic_set(&bt->dropped, 0);
@@ -561,8 +573,6 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 
 	ret = 0;
 err:
-	if (dir && !bt->dir)
-		dput(dir);
 	if (ret)
 		blk_trace_free(bt);
 	return ret;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls
  2020-04-29  7:46 [PATCH v3 0/6] block: fix blktrace debugfs use after free Luis Chamberlain
                   ` (3 preceding siblings ...)
  2020-04-29  7:46 ` [PATCH v3 4/6] blktrace: fix debugfs use after free Luis Chamberlain
@ 2020-04-29  7:46 ` Luis Chamberlain
  2020-04-29  9:49   ` Greg KH
  2020-04-29  7:46 ` [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals Luis Chamberlain
  5 siblings, 1 reply; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29  7:46 UTC (permalink / raw)
  To: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Luis Chamberlain

We use one blktrace per request_queue, that means one per the entire
disk.  So we cannot run one blktrace on say /dev/vda and then /dev/vda1,
or just two calls on /dev/vda.

We check for concurrent setup only at the very end of the blktrace setup though.

If we try to run two concurrent blktraces on the same block device the
second one will fail, and the first one seems to go on. However, when
one tries to kill the first one, the kernel will show messages like
these:

```
debugfs: File 'dropped' in directory 'nvme1n1' already present!
debugfs: File 'msg' in directory 'nvme1n1' already present!
debugfs: File 'trace0' in directory 'nvme1n1' already present!
```

And userspace just sees this error message for the second call:

```
blktrace /dev/nvme1n1
BLKTRACESETUP(2) /dev/nvme1n1 failed: 5/Input/output error
```

Userspace process #1 will also claim that the files were taken from
underneath its nose. The files are taken away from the first process
because when the second blktrace fails, it follows up with a
BLKTRACESTOP and BLKTRACETEARDOWN. This means that even if go-happy
process #1 is waiting for blktrace data, we *have* been asked to tear
down the blktrace.

This can easily be reproduced with break-blktrace [0] run_0005.sh test.

Just break out early if we know we're already going to fail. This
prevents trying to create the files all over again when we know they
still exist.

[0] https://github.com/mcgrof/break-blktrace
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 kernel/trace/blktrace.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 5c52976bd762..383045f67cb8 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -4,6 +4,8 @@
  *
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/kernel.h>
 #include <linux/blkdev.h>
 #include <linux/blktrace_api.h>
@@ -516,6 +518,11 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 	 */
 	strreplace(buts->name, '/', '_');
 
+	if (q->blk_trace) {
+		pr_warn("Concurrent blktraces are not allowed\n");
+		return -EBUSY;
+	}
+
 	bt = kzalloc(sizeof(*bt), GFP_KERNEL);
 	if (!bt)
 		return -ENOMEM;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals
  2020-04-29  7:46 [PATCH v3 0/6] block: fix blktrace debugfs use after free Luis Chamberlain
                   ` (4 preceding siblings ...)
  2020-04-29  7:46 ` [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls Luis Chamberlain
@ 2020-04-29  7:46 ` Luis Chamberlain
  2020-04-29  9:50   ` Greg KH
  2020-04-29 14:05   ` Ming Lei
  5 siblings, 2 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29  7:46 UTC (permalink / raw)
  To: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Luis Chamberlain

Be pedantic on removal as well and hold the mutex.
This should prevent new devices from being added while we exit.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 drivers/block/loop.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index da693e6a834e..6dccba22c9b5 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -2333,6 +2333,8 @@ static void __exit loop_exit(void)
 
 	range = max_loop ? max_loop << part_shift : 1UL << MINORBITS;
 
+	mutex_lock(&loop_ctl_mutex);
+
 	idr_for_each(&loop_index_idr, &loop_exit_cb, NULL);
 	idr_destroy(&loop_index_idr);
 
@@ -2340,6 +2342,8 @@ static void __exit loop_exit(void)
 	unregister_blkdev(LOOP_MAJOR, "loop");
 
 	misc_deregister(&loop_misc);
+
+	mutex_unlock(&loop_ctl_mutex);
 }
 
 module_init(loop_init);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29  7:46 ` [PATCH v3 4/6] blktrace: fix debugfs use after free Luis Chamberlain
@ 2020-04-29  9:47   ` Greg KH
  2020-04-29 11:26   ` Christoph Hellwig
  1 sibling, 0 replies; 33+ messages in thread
From: Greg KH @ 2020-04-29  9:47 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, rostedt, mingo, jack, ming.lei, nstange,
	akpm, mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Omar Sandoval, Hannes Reinecke, Michal Hocko,
	syzbot+603294af2d01acfdd6da

On Wed, Apr 29, 2020 at 07:46:25AM +0000, Luis Chamberlain wrote:
> --- a/block/blk-debugfs.c
> +++ b/block/blk-debugfs.c
> @@ -13,3 +13,32 @@ void blk_debugfs_register(void)
>  {
>  	blk_debugfs_root = debugfs_create_dir("block", NULL);
>  }
> +
> +static struct dentry *blk_debugfs_dir_register(const char *name)
> +{
> +	return debugfs_create_dir(name, blk_debugfs_root);
> +}

Nit, that function is not needed at all, just spell out the call to
debugfs_create_dir() in the 2 places below you call it.  That will
result in less lines of code overall :)
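
A minimal userspace sketch of what open-coding the call in both places could
look like. Everything prefixed with `stub_` is a hypothetical stand-in:
`stub_create_dir()` plays the role of debugfs_create_dir(), and
`stub_blk_root` plays the role of blk_debugfs_root. None of this is kernel
code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a debugfs dentry. */
struct stub_dentry {
	const char *name;
	struct stub_dentry *parent;
};

#define STUB_MAX_DIRS 8
static struct stub_dentry stub_pool[STUB_MAX_DIRS];
static size_t stub_used;

/* Plays the role of blk_debugfs_root ("block") in this model. */
static struct stub_dentry stub_blk_root = { "block", NULL };

/* Stand-in for debugfs_create_dir(); it only records name and parent. */
static struct stub_dentry *stub_create_dir(const char *name,
					   struct stub_dentry *parent)
{
	struct stub_dentry *d;

	assert(name && strlen(name) > 0);
	if (stub_used == STUB_MAX_DIRS)
		return NULL;

	d = &stub_pool[stub_used++];
	d->name = name;
	d->parent = parent;
	return d;
}

struct stub_queue { struct stub_dentry *debugfs_dir; };
struct stub_part { struct stub_dentry *debugfs_dir; };

/* The create call is spelled out directly, with no wrapper in between. */
static void stub_queue_debugfs_register(struct stub_queue *q, const char *name)
{
	q->debugfs_dir = stub_create_dir(name, &stub_blk_root);
}

static void stub_part_debugfs_register(struct stub_part *p, const char *name)
{
	p->debugfs_dir = stub_create_dir(name, &stub_blk_root);
}
```

Both callers stay in the same file as the root directory in this model, so
dropping the wrapper does not require exposing the root anywhere else.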

> -	dir = blk_trace_debugfs_dir(buts, bt);
> +	dir = blk_trace_debugfs_dir(bdev, q);
> +	if (WARN_ON(!dir))
> +		goto err;

With panic-on-warn you just rebooted the box, lovely :(

I said previously, that if you _REALLY_ wanted to warn about this, or do
something different based on the result of a debugfs call, then you can,
but you need to comment the heck out of it as to why you are doing so,
otherwise I'm just going to catch it in my tree-wide sweeps and end up
removing it.

Other than those two nits, this looks _much_ better, thanks for doing
this.

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls
  2020-04-29  7:46 ` [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls Luis Chamberlain
@ 2020-04-29  9:49   ` Greg KH
  2020-05-01 15:06     ` Luis Chamberlain
  0 siblings, 1 reply; 33+ messages in thread
From: Greg KH @ 2020-04-29  9:49 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, rostedt, mingo, jack, ming.lei, nstange,
	akpm, mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel

On Wed, Apr 29, 2020 at 07:46:26AM +0000, Luis Chamberlain wrote:
> We use one blktrace per request_queue, that means one per the entire
> disk.  So we cannot run one blktrace on say /dev/vda and then /dev/vda1,
> or just two calls on /dev/vda.
> 
> We check for concurrent setup only at the very end of the blktrace setup though.
> 
> If we try to run two concurrent blktraces on the same block device the
> second one will fail, and the first one seems to go on. However, when
> one tries to kill the first one, the kernel will show messages like
> these:
> 
> ```
> debugfs: File 'dropped' in directory 'nvme1n1' already present!
> debugfs: File 'msg' in directory 'nvme1n1' already present!
> debugfs: File 'trace0' in directory 'nvme1n1' already present!
> ```
> 
> And userspace just sees this error message for the second call:
> 
> ```
> blktrace /dev/nvme1n1
> BLKTRACESETUP(2) /dev/nvme1n1 failed: 5/Input/output error
> ```
> 
> Userspace process #1 will also claim that the files were taken from
> underneath its nose. The files are taken away from the first process
> because when the second blktrace fails, it follows up with a
> BLKTRACESTOP and BLKTRACETEARDOWN. This means that even if go-happy
> process #1 is waiting for blktrace data, we *have* been asked to tear
> down the blktrace.
> 
> This can easily be reproduced with break-blktrace [0] run_0005.sh test.
> 
> Just break out early if we know we're already going to fail. This
> prevents trying to create the files all over again when we know they
> still exist.
> 
> [0] https://github.com/mcgrof/break-blktrace
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  kernel/trace/blktrace.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 5c52976bd762..383045f67cb8 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -4,6 +4,8 @@
>   *
>   */
>  
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
>  #include <linux/kernel.h>
>  #include <linux/blkdev.h>
>  #include <linux/blktrace_api.h>
> @@ -516,6 +518,11 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
>  	 */
>  	strreplace(buts->name, '/', '_');
>  
> +	if (q->blk_trace) {
> +		pr_warn("Concurrent blktraces are not allowed\n");
> +		return -EBUSY;

You have access to a block device here, please use dev_warn() instead
here for that, that makes it obvious as to what device a "concurrent
blktrace" was attempted for.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals
  2020-04-29  7:46 ` [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals Luis Chamberlain
@ 2020-04-29  9:50   ` Greg KH
  2020-05-03  9:09     ` Luis Chamberlain
  2020-04-29 14:05   ` Ming Lei
  1 sibling, 1 reply; 33+ messages in thread
From: Greg KH @ 2020-04-29  9:50 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, rostedt, mingo, jack, ming.lei, nstange,
	akpm, mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel

On Wed, Apr 29, 2020 at 07:46:27AM +0000, Luis Chamberlain wrote:
> Be pedantic on removal as well and hold the mutex.
> This should prevent new devices from being added while we exit.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  drivers/block/loop.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index da693e6a834e..6dccba22c9b5 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -2333,6 +2333,8 @@ static void __exit loop_exit(void)
>  
>  	range = max_loop ? max_loop << part_shift : 1UL << MINORBITS;
>  
> +	mutex_lock(&loop_ctl_mutex);
> +
>  	idr_for_each(&loop_index_idr, &loop_exit_cb, NULL);
>  	idr_destroy(&loop_index_idr);
>  
> @@ -2340,6 +2342,8 @@ static void __exit loop_exit(void)
>  	unregister_blkdev(LOOP_MAJOR, "loop");
>  
>  	misc_deregister(&loop_misc);
> +
> +	mutex_unlock(&loop_ctl_mutex);
>  }
>  
>  module_init(loop_init);

What type of issue is this helping with?  Can it be triggered today?  If
so, shouldn't it be backported to stable kernels?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 1/6] block: revert back to synchronous request_queue removal
  2020-04-29  7:46 ` [PATCH v3 1/6] block: revert back to synchronous request_queue removal Luis Chamberlain
@ 2020-04-29 11:15   ` Christoph Hellwig
  2020-05-02  0:22   ` Bart Van Assche
  1 sibling, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2020-04-29 11:15 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel, Omar Sandoval, Hannes Reinecke,
	Michal Hocko

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 2/6] block: move main block debugfs initialization to its own file
  2020-04-29  7:46 ` [PATCH v3 2/6] block: move main block debugfs initialization to its own file Luis Chamberlain
@ 2020-04-29 11:15   ` Christoph Hellwig
  0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2020-04-29 11:15 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel, Omar Sandoval, Hannes Reinecke,
	Michal Hocko

On Wed, Apr 29, 2020 at 07:46:23AM +0000, Luis Chamberlain wrote:
> make_request-based drivers and and request-based drivers share some
> debugfs code. By moving this into its own file it makes it easier
> to expand and audit this shared code.
> 
> This patch contains no functional changes.
> 
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Omar Sandoval <osandov@fb.com>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Nicolai Stange <nstange@suse.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: yu kuai <yukuai3@huawei.com>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function
  2020-04-29  7:46 ` [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function Luis Chamberlain
@ 2020-04-29 11:20   ` Christoph Hellwig
  2020-05-02  0:25   ` Bart Van Assche
  1 sibling, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2020-04-29 11:20 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel

On Wed, Apr 29, 2020 at 07:46:24AM +0000, Luis Chamberlain wrote:
> Move the work to create the debugfs directory used into a helper.
> It will make further checks easier to read. This commit introduces
> no functional changes.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29  7:46 ` [PATCH v3 4/6] blktrace: fix debugfs use after free Luis Chamberlain
  2020-04-29  9:47   ` Greg KH
@ 2020-04-29 11:26   ` Christoph Hellwig
  2020-04-29 11:45     ` Luis Chamberlain
  1 sibling, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2020-04-29 11:26 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel, Omar Sandoval, Hannes Reinecke,
	Michal Hocko, syzbot+603294af2d01acfdd6da

I can't say I'm a fan of all these long backtraces in commit logs..

> +static struct dentry *blk_debugfs_dir_register(const char *name)
> +{
> +	return debugfs_create_dir(name, blk_debugfs_root);
> +}

I don't think we really need this helper.

> +void blk_part_debugfs_unregister(struct hd_struct *p)
> +{
> +	debugfs_remove_recursive(p->debugfs_dir);
> +	p->debugfs_dir = NULL;
> +}

Why do we need to clear the pointer here?

> +#ifdef CONFIG_DEBUG_FS
> +	/* Currently only used by kernel/trace/blktrace.c */
> +	struct dentry *debugfs_dir;
> +#endif

Does that comment really add value?

> +static struct dentry *blk_trace_debugfs_dir(struct block_device *bdev,
> +					    struct request_queue *q)
>  {
> +	struct hd_struct *p = NULL;
>  
> +	 * Some drivers like scsi-generic use a NULL block device. For
> +	 * other drivers when bdev != bdev->bd_contain we are doing a blktrace
> +	 * on a parition, otherwise we know we are working on the whole
> +	 * disk, and for that the request_queue already has its own debugfs_dir.
> +	 * which we have been using for other things other than blktrace.
> +	 */
> +	if (bdev && bdev != bdev->bd_contains)
> +		p = bdev->bd_part;
>  
> +	if (p)
> +		return p->debugfs_dir;
> +
> +	return q->debugfs_dir;

This could be simplified down to:

	if (bdev && bdev != bdev->bd_contains)
		return bdev->bd_part->debugfs_dir;
	return q->debugfs_dir;

Given that bd_part is set in __blkdev_get very near bd_contains.
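
The selection logic above can be modelled in userspace as follows. This is a
sketch only: `model_bdev`, `model_part` and `model_queue` are hypothetical
stand-ins that keep just the fields named in this thread, and directories are
represented by plain strings rather than dentries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hd_struct. */
struct model_part {
	const char *debugfs_dir;
};

/* Hypothetical stand-in for struct block_device. */
struct model_bdev {
	struct model_bdev *bd_contains;	/* whole-disk device; self if whole */
	struct model_part *bd_part;	/* assumed valid for a partition */
};

/* Hypothetical stand-in for struct request_queue. */
struct model_queue {
	const char *debugfs_dir;
};

/*
 * A NULL bdev (the scsi-generic case) and a whole-disk bdev both fall
 * through to the queue's directory; only a partition uses its own.
 */
static const char *model_trace_dir(const struct model_bdev *bdev,
				   const struct model_queue *q)
{
	assert(q != NULL);

	if (bdev && bdev != bdev->bd_contains)
		return bdev->bd_part->debugfs_dir;
	return q->debugfs_dir;
}
```

The model assumes, as the simplification does, that bd_part is always set
whenever bd_contains points at a different device.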

Also given that this patch completely rewrites blk_trace_debugfs_dir is
there any point in the previous patch?

> @@ -491,6 +500,7 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
>  	struct dentry *dir = NULL;
>  	int ret;
>  
> +
>  	if (!buts->buf_size || !buts->buf_nr)
>  		return -EINVAL;
>  

Spurious whitespace change.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29 11:26   ` Christoph Hellwig
@ 2020-04-29 11:45     ` Luis Chamberlain
  2020-04-29 11:50       ` Christoph Hellwig
  0 siblings, 1 reply; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29 11:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel, Omar Sandoval, Hannes Reinecke,
	Michal Hocko, syzbot+603294af2d01acfdd6da

On Wed, Apr 29, 2020 at 04:26:37AM -0700, Christoph Hellwig wrote:
> I can't say I'm a fan of all these long backtraces in commit logs..
> 
> > +static struct dentry *blk_debugfs_dir_register(const char *name)
> > +{
> > +	return debugfs_create_dir(name, blk_debugfs_root);
> > +}
> 
> I don't think we really need this helper.

We don't export blk_debugfs_root, didn't think we'd want to, and
since only a few scew funky drivers would use the struct gendisk
and also support BLKTRACE, I didn't think we'd want to export it
now.

A new block private symbol namespace alright?

> > +void blk_part_debugfs_unregister(struct hd_struct *p)
> > +{
> > +	debugfs_remove_recursive(p->debugfs_dir);
> > +	p->debugfs_dir = NULL;
> > +}
> 
> Why do we need to clear the pointer here?

True, not needed for partition.

> > +#ifdef CONFIG_DEBUG_FS
> > +	/* Currently only used by kernel/trace/blktrace.c */
> > +	struct dentry *debugfs_dir;
> > +#endif
> 
> Does that comment really add value?

I'll nuke it.

> > +static struct dentry *blk_trace_debugfs_dir(struct block_device *bdev,
> > +					    struct request_queue *q)
> >  {
> > +	struct hd_struct *p = NULL;
> >  
> > +	 * Some drivers like scsi-generic use a NULL block device. For
> > +	 * other drivers when bdev != bdev->bd_contain we are doing a blktrace
> > +	 * on a parition, otherwise we know we are working on the whole
> > +	 * disk, and for that the request_queue already has its own debugfs_dir.
> > +	 * which we have been using for other things other than blktrace.
> > +	 */
> > +	if (bdev && bdev != bdev->bd_contains)
> > +		p = bdev->bd_part;
> >  
> > +	if (p)
> > +		return p->debugfs_dir;
> > +
> > +	return q->debugfs_dir;
> 
> This could be simplified down to:
> 
> 	if (bdev && bdev != bdev->bd_contains)
> 		return bdev->bd_part->debugfs_dir;
> 	return q->debugfs_dir;
>
> Given that bd_part is in __blkdev_get very near bd_contains.

Ah neat.

> Also given that this patch completely rewrites blk_trace_debugfs_dir is
> there any point in the previous patch?

Still think it helps with making this patch easier to read, but I don't
care, lemme know if I should just fold it.

> > @@ -491,6 +500,7 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> >  	struct dentry *dir = NULL;
> >  	int ret;
> >  
> > +
> >  	if (!buts->buf_size || !buts->buf_nr)
> >  		return -EINVAL;
> >  
> 
> Spurious whitespace change.

Will nuke.

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29 11:45     ` Luis Chamberlain
@ 2020-04-29 11:50       ` Christoph Hellwig
  2020-04-29 12:02         ` Luis Chamberlain
  0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2020-04-29 11:50 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Christoph Hellwig, axboe, viro, bvanassche, gregkh, rostedt,
	mingo, jack, ming.lei, nstange, akpm, mhocko, yukuai3,
	linux-block, linux-fsdevel, linux-mm, linux-kernel,
	Omar Sandoval, Hannes Reinecke, Michal Hocko,
	syzbot+603294af2d01acfdd6da

On Wed, Apr 29, 2020 at 11:45:42AM +0000, Luis Chamberlain wrote:
> On Wed, Apr 29, 2020 at 04:26:37AM -0700, Christoph Hellwig wrote:
> > I can't say I'm a fan of all these long backtraces in commit logs..
> > 
> > > +static struct dentry *blk_debugfs_dir_register(const char *name)
> > > +{
> > > +	return debugfs_create_dir(name, blk_debugfs_root);
> > > +}
> > 
> > I don't think we really need this helper.
> 
> We don't export blk_debugfs_root, didn't think we'd want to, and
> since only a few scew funky drivers would use the struct gendisk
> and also support BLKTRACE, I didn't think we'd want to export it
> now.
> 
> A new block private symbol namespace alright?

Err, that function is static and has two callers.

> > This could be simplified down to:
> > 
> > 	if (bdev && bdev != bdev->bd_contains)
> > 		return bdev->bd_part->debugfs_dir;
> > 	return q->debugfs_dir;
> >
> > Given that bd_part is in __blkdev_get very near bd_contains.
> 
> Ah neat.
> 
> > Also given that this patch completely rewrites blk_trace_debugfs_dir is
> > there any point in the previous patch?
> 
> Still think it helps with making this patch easier to read, but I don't
> care, lemme know if I should just fold it.

In fact I'm not even sure we need the helper.  Modulo the comment
this just becomes a:

	if (bdev && bdev != bdev->bd_contains)
 		dir = bdev->bd_part->debugfs_dir;
	else
	 	dir = q->debugfs_dir;

in do_blk_trace_setup.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29 11:50       ` Christoph Hellwig
@ 2020-04-29 12:02         ` Luis Chamberlain
  2020-04-29 12:04           ` Christoph Hellwig
  0 siblings, 1 reply; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29 12:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel, Omar Sandoval, Hannes Reinecke,
	Michal Hocko, syzbot+603294af2d01acfdd6da

On Wed, Apr 29, 2020 at 04:50:51AM -0700, Christoph Hellwig wrote:
> On Wed, Apr 29, 2020 at 11:45:42AM +0000, Luis Chamberlain wrote:
> > On Wed, Apr 29, 2020 at 04:26:37AM -0700, Christoph Hellwig wrote:
> > > I can't say I'm a fan of all these long backtraces in commit logs..
> > > 
> > > > +static struct dentry *blk_debugfs_dir_register(const char *name)
> > > > +{
> > > > +	return debugfs_create_dir(name, blk_debugfs_root);
> > > > +}
> > > 
> > > I don't think we really need this helper.
> > 
> > We don't export blk_debugfs_root, didn't think we'd want to, and
> > since only a few scew funky drivers would use the struct gendisk
> > and also support BLKTRACE, I didn't think we'd want to export it
> > now.
> > 
> > A new block private symbol namespace alright?
> 
> Err, that function is static and has two callers.

Yes but that is to make it easier to look for who is creating the
debugfs_dir for either the request_queue or partition. I'll export
blk_debugfs_root and we'll open code all this.

> > > This could be simplified down to:
> > > 
> > > 	if (bdev && bdev != bdev->bd_contains)
> > > 		return bdev->bd_part->debugfs_dir;
> > > 	return q->debugfs_dir;
> > >
> > > Given that bd_part is in __blkdev_get very near bd_contains.
> > 
> > Ah neat.
> > 
> > > Also given that this patch completely rewrites blk_trace_debugfs_dir is
> > > there any point in the previous patch?
> > 
> > Still think it helps with making this patch easier to read, but I don't
> > care, lemme know if I should just fold it.
> 
> In fact I'm not even sure we need the helper.  Modulo the comment
> this just becomes a:
> 
> 	if (bdev && bdev != bdev->bd_contains)
>  		dir = bdev->bd_part->debugfs_dir;
> 	else
> 	 	dir = q->debugfs_dir;
> 
> in do_blk_trace_setup.

True, alright will remove that patch.

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29 12:02         ` Luis Chamberlain
@ 2020-04-29 12:04           ` Christoph Hellwig
  2020-04-29 12:21             ` Luis Chamberlain
  0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2020-04-29 12:04 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Christoph Hellwig, axboe, viro, bvanassche, gregkh, rostedt,
	mingo, jack, ming.lei, nstange, akpm, mhocko, yukuai3,
	linux-block, linux-fsdevel, linux-mm, linux-kernel,
	Omar Sandoval, Hannes Reinecke, Michal Hocko,
	syzbot+603294af2d01acfdd6da

On Wed, Apr 29, 2020 at 12:02:30PM +0000, Luis Chamberlain wrote:
> > Err, that function is static and has two callers.
> 
> Yes but that is to make it easier to look for who is creating the
> debugfs_dir for either the request_queue or partition. I'll export
> blk_debugfs_root and we'll open code all this.

No, please not.  exported variables are usually a bad idea.  Just
skip the somewhat pointless trivial static function.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29 12:04           ` Christoph Hellwig
@ 2020-04-29 12:21             ` Luis Chamberlain
  2020-04-29 12:57               ` Greg KH
  0 siblings, 1 reply; 33+ messages in thread
From: Luis Chamberlain @ 2020-04-29 12:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel, Omar Sandoval, Hannes Reinecke,
	Michal Hocko, syzbot+603294af2d01acfdd6da

On Wed, Apr 29, 2020 at 05:04:06AM -0700, Christoph Hellwig wrote:
> On Wed, Apr 29, 2020 at 12:02:30PM +0000, Luis Chamberlain wrote:
> > > Err, that function is static and has two callers.
> > 
> > Yes but that is to make it easier to look for who is creating the
> > debugfs_dir for either the request_queue or partition. I'll export
> > blk_debugfs_root and we'll open code all this.
> 
> No, please not.  exported variables are usually a bad idea.  Just
> skip the somewhat pointless trivial static function.

Alrighty. It has me thinking we might want to only export those symbols
to a specific namespace. Thoughts, preferences?

BLOCK_GENHD_PRIVATE ?

The scsi-generic driver seems... rather unique, and I'd imagine we'd
want to discourage such concoctions in the future, and with them the
proliferation of these symbols.

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29 12:21             ` Luis Chamberlain
@ 2020-04-29 12:57               ` Greg KH
  2020-05-01 15:24                 ` Luis Chamberlain
  0 siblings, 1 reply; 33+ messages in thread
From: Greg KH @ 2020-04-29 12:57 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Christoph Hellwig, axboe, viro, bvanassche, rostedt, mingo, jack,
	ming.lei, nstange, akpm, mhocko, yukuai3, linux-block,
	linux-fsdevel, linux-mm, linux-kernel, Omar Sandoval,
	Hannes Reinecke, Michal Hocko, syzbot+603294af2d01acfdd6da

On Wed, Apr 29, 2020 at 12:21:52PM +0000, Luis Chamberlain wrote:
> On Wed, Apr 29, 2020 at 05:04:06AM -0700, Christoph Hellwig wrote:
> > On Wed, Apr 29, 2020 at 12:02:30PM +0000, Luis Chamberlain wrote:
> > > > Err, that function is static and has two callers.
> > > 
> > > Yes but that is to make it easier to look for who is creating the
> > > debugfs_dir for either the request_queue or partition. I'll export
> > > blk_debugfs_root and we'll open code all this.
> > 
> > No, please not.  exported variables are usually a bad idea.  Just
> > skip the somewhat pointless trivial static function.
> 
> Alrighty. It has me thinking we might want to only export those symbols
> to a specific namespace. Thoughts, preferences?
> 
> BLOCK_GENHD_PRIVATE ?

That's a nice add-on issue after this is fixed.  As Christoph and I
pointed out, you have _less_ code in the file if you remove the static
wrapper function.  Do that now and then worry about symbol namespaces
please.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals
  2020-04-29  7:46 ` [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals Luis Chamberlain
  2020-04-29  9:50   ` Greg KH
@ 2020-04-29 14:05   ` Ming Lei
  1 sibling, 0 replies; 33+ messages in thread
From: Ming Lei @ 2020-04-29 14:05 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, gregkh, rostedt, mingo, jack, nstange,
	akpm, mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel

On Wed, Apr 29, 2020 at 07:46:27AM +0000, Luis Chamberlain wrote:
> Be pedantic on removal as well and hold the mutex.
> This should prevent uses of addition while we exit.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  drivers/block/loop.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index da693e6a834e..6dccba22c9b5 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -2333,6 +2333,8 @@ static void __exit loop_exit(void)
>  
>  	range = max_loop ? max_loop << part_shift : 1UL << MINORBITS;
>  
> +	mutex_lock(&loop_ctl_mutex);
> +
>  	idr_for_each(&loop_index_idr, &loop_exit_cb, NULL);
>  	idr_destroy(&loop_index_idr);
>  
> @@ -2340,6 +2342,8 @@ static void __exit loop_exit(void)
>  	unregister_blkdev(LOOP_MAJOR, "loop");
>  
>  	misc_deregister(&loop_misc);
> +
> +	mutex_unlock(&loop_ctl_mutex);
>  }
>  
>  module_init(loop_init);
> -- 
> 2.25.1
> 

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls
  2020-04-29  9:49   ` Greg KH
@ 2020-05-01 15:06     ` Luis Chamberlain
  2020-05-01 15:34       ` Christoph Hellwig
  2020-05-01 16:51       ` Greg KH
  0 siblings, 2 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-05-01 15:06 UTC (permalink / raw)
  To: Greg KH
  Cc: axboe, viro, bvanassche, rostedt, mingo, jack, ming.lei, nstange,
	akpm, mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel

On Wed, Apr 29, 2020 at 11:49:37AM +0200, Greg KH wrote:
> On Wed, Apr 29, 2020 at 07:46:26AM +0000, Luis Chamberlain wrote:
> > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> > index 5c52976bd762..383045f67cb8 100644
> > --- a/kernel/trace/blktrace.c
> > +++ b/kernel/trace/blktrace.c
> > @@ -516,6 +518,11 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> >  	 */
> >  	strreplace(buts->name, '/', '_');
> >  
> > +	if (q->blk_trace) {
> > +		pr_warn("Concurrent blktraces are not allowed\n");
> > +		return -EBUSY;
> 
> You have access to a block device here, please use dev_warn() instead
> here for that, that makes it obvious as to what device a "concurrent
> blktrace" was attempted for.

The block device may be empty, one example is for scsi-generic, but I'll
use buts->name.
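
As a rough userspace sketch of what that could look like (not the actual
patch): the check stays the same, but the message carries the name passed
in with the setup request. The helper name blk_trace_check_busy() and the
msg/len buffer, which stands in for the kernel log, are assumptions made
for illustration; the structs are cut-down stand-ins.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Cut-down stand-ins for the kernel types. */
struct blk_trace {
	int unused;
};

struct request_queue {
	struct blk_trace *blk_trace;	/* non-NULL while a trace is set up */
};

/*
 * Hypothetical helper: refuse a second setup on the same queue and say
 * which name the attempt was made for. msg/len replace the kernel log.
 */
int blk_trace_check_busy(const struct request_queue *q, const char *name,
			 char *msg, size_t len)
{
	if (q->blk_trace) {
		snprintf(msg, len,
			 "%s: concurrent blktraces are not allowed", name);
		return -EBUSY;
	}

	return 0;
}
```

Since no block device is needed to build the message, the same path would
also cover callers that do not have one.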

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 4/6] blktrace: fix debugfs use after free
  2020-04-29 12:57               ` Greg KH
@ 2020-05-01 15:24                 ` Luis Chamberlain
  0 siblings, 0 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-05-01 15:24 UTC (permalink / raw)
  To: Greg KH
  Cc: Christoph Hellwig, axboe, viro, bvanassche, rostedt, mingo, jack,
	ming.lei, nstange, akpm, mhocko, yukuai3, linux-block,
	linux-fsdevel, linux-mm, linux-kernel, Omar Sandoval,
	Hannes Reinecke, Michal Hocko, syzbot+603294af2d01acfdd6da

On Wed, Apr 29, 2020 at 02:57:26PM +0200, Greg KH wrote:
> On Wed, Apr 29, 2020 at 12:21:52PM +0000, Luis Chamberlain wrote:
> > On Wed, Apr 29, 2020 at 05:04:06AM -0700, Christoph Hellwig wrote:
> > > On Wed, Apr 29, 2020 at 12:02:30PM +0000, Luis Chamberlain wrote:
> > > > > Err, that function is static and has two callers.
> > > > 
> > > > Yes but that is to make it easier to look for who is creating the
> > > > debugfs_dir for either the request_queue or partition. I'll export
> > > > blk_debugfs_root and we'll open code all this.
> > > 
> > > No, please not.  exported variables are usually a bad idea.  Just
> > > skip the somewhat pointless trivial static function.
> > 
> > Alrighty. It has me thinking we might want to only export those symbols
> > to a specific namespace. Thoughts, preferences?
> > 
> > BLOCK_GENHD_PRIVATE ?
> 
> That's a nice add-on issue after this is fixed.  As Christoph and I
> pointed out, you have _less_ code in the file if you remove the static
> wrapper function.  Do that now and then worry about symbol namespaces
> please.

So it turns out that in the old implementation, it was implicit that the
request_queue directory was shared with the scsi drive. So, the
q->debugfs_dir *will* be set, and as we have it here, we'd silently be
overwriting the old q->debugfs_dir, as the queue is the same. To keep
things working as they used to, with both, we just need to use a symlink
here. With the old way, we'd *always* create the sg directory and re-use
that. However, since we can only have one blktrace per request_queue, it
still had the same restriction; this was just implicit. Using a symlink
will make this much more obvious and keep the old functionality. We'll
only need to export one symbol. I'll roll this in.

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls
  2020-05-01 15:06     ` Luis Chamberlain
@ 2020-05-01 15:34       ` Christoph Hellwig
  2020-05-01 15:40         ` Luis Chamberlain
  2020-05-01 16:51       ` Greg KH
  1 sibling, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2020-05-01 15:34 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Greg KH, axboe, viro, bvanassche, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel

On Fri, May 01, 2020 at 03:06:26PM +0000, Luis Chamberlain wrote:
> > You have access to a block device here, please use dev_warn() instead
> > here for that, that makes it obvious as to what device a "concurrent
> > blktrace" was attempted for.
> 
> The block device may be empty, one example is for scsi-generic, but I'll
> use buts->name.

Is blktrace on /dev/sg something we intentionally support, or just by
some accident of history?  Given all the pains it causes I'd be tempted
to just remove the support and see if anyone screams.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls
  2020-05-01 15:34       ` Christoph Hellwig
@ 2020-05-01 15:40         ` Luis Chamberlain
  2020-05-01 15:50           ` Luis Chamberlain
  0 siblings, 1 reply; 33+ messages in thread
From: Luis Chamberlain @ 2020-05-01 15:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Greg KH, axboe, viro, bvanassche, rostedt, mingo, jack, ming.lei,
	nstange, akpm, mhocko, yukuai3, linux-block, linux-fsdevel,
	linux-mm, linux-kernel

On Fri, May 01, 2020 at 08:34:23AM -0700, Christoph Hellwig wrote:
> On Fri, May 01, 2020 at 03:06:26PM +0000, Luis Chamberlain wrote:
> > > You have access to a block device here, please use dev_warn() instead
> > > here for that, that makes it obvious as to what device a "concurrent
> > > blktrace" was attempted for.
> > 
> > The block device may be empty, one example is for scsi-generic, but I'll
> > use buts->name.
> 
> Is blktrace on /dev/sg something we intentionally support, or just by
> some accident of history?  Given all the pains it causes I'd be tempted
> to just remove the support and see if anyone screams.

From what I can tell I think it was a historic and brutal mistake. I am
more than happy to remove it.

Re-adding support would just be a symlink.

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls
  2020-05-01 15:40         ` Luis Chamberlain
@ 2020-05-01 15:50           ` Luis Chamberlain
  0 siblings, 0 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-05-01 15:50 UTC (permalink / raw)
  To: Christoph Hellwig, Christof Schmitt
  Cc: Greg KH, Jens Axboe, Al Viro, Bart Van Assche, Steven Rostedt,
	Ingo Molnar, Jan Kara, Ming Lei, Nicolai Stange, Andrew Morton,
	Michal Hocko, yu kuai, linux-block, Linux FS Devel, linux-mm,
	linux-kernel

On Fri, May 1, 2020 at 9:40 AM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Fri, May 01, 2020 at 08:34:23AM -0700, Christoph Hellwig wrote:
> > On Fri, May 01, 2020 at 03:06:26PM +0000, Luis Chamberlain wrote:
> > > > You have access to a block device here, please use dev_warn() instead
> > > > here for that, that makes it obvious as to what device a "concurrent
> > > > blktrace" was attempted for.
> > >
> > > The block device may be empty, one example is for scsi-generic, but I'll
> > > use buts->name.
> >
> > Is blktrace on /dev/sg something we intentionally support, or just by
> > some accident of history?  Given all the pains it causes I'd be tempted
> > to just remove the support and see if anyone screams.
>
> From what I can tell I think it was a historic and brutal mistake. I am
> more than happy to remove it.

I take that back:

commit 6da127ad0918f93ea93678dad62ce15ffed18797
Author: Christof Schmitt <christof.schmitt@de.ibm.com>
Date:   Fri Jan 11 10:09:43 2008 +0100

    blktrace: Add blktrace ioctls to SCSI generic devices

    Since the SCSI layer uses the request queues from the block layer, blktrace can
    also be used to trace the requests to all SCSI devices (like SCSI tape drives),
    not only disks. The only missing part is the ioctl interface to start and stop
    tracing.

    This patch adds the SETUP, START, STOP and TEARDOWN ioctls from blktrace to the
    sg device files. With this change, blktrace can be used for SCSI devices like
    for disks, e.g.: blktrace -d /dev/sg1 -o - | blkparse -i -
    Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Christof, any thoughts on removing this support?

 Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls
  2020-05-01 15:06     ` Luis Chamberlain
  2020-05-01 15:34       ` Christoph Hellwig
@ 2020-05-01 16:51       ` Greg KH
  1 sibling, 0 replies; 33+ messages in thread
From: Greg KH @ 2020-05-01 16:51 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: axboe, viro, bvanassche, rostedt, mingo, jack, ming.lei, nstange,
	akpm, mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel

On Fri, May 01, 2020 at 03:06:26PM +0000, Luis Chamberlain wrote:
> On Wed, Apr 29, 2020 at 11:49:37AM +0200, Greg KH wrote:
> > On Wed, Apr 29, 2020 at 07:46:26AM +0000, Luis Chamberlain wrote:
> > > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> > > index 5c52976bd762..383045f67cb8 100644
> > > --- a/kernel/trace/blktrace.c
> > > +++ b/kernel/trace/blktrace.c
> > > @@ -516,6 +518,11 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> > >  	 */
> > >  	strreplace(buts->name, '/', '_');
> > >  
> > > +	if (q->blk_trace) {
> > > +		pr_warn("Concurrent blktraces are not allowed\n");
> > > +		return -EBUSY;
> > 
> > You have access to a block device here, please use dev_warn() instead
> > here for that, that makes it obvious as to what device a "concurrent
> > blktrace" was attempted for.
> 
> The block device may be empty, one example is for scsi-generic, but I'll
> use buts->name.

That's fine, give us a chance to know what went wrong, your line as is
does not do that :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 1/6] block: revert back to synchronous request_queue removal
  2020-04-29  7:46 ` [PATCH v3 1/6] block: revert back to synchronous request_queue removal Luis Chamberlain
  2020-04-29 11:15   ` Christoph Hellwig
@ 2020-05-02  0:22   ` Bart Van Assche
  2020-05-03 10:32     ` Matthew Wilcox
  2020-05-04 16:16     ` Luis Chamberlain
  1 sibling, 2 replies; 33+ messages in thread
From: Bart Van Assche @ 2020-05-02  0:22 UTC (permalink / raw)
  To: Luis Chamberlain, axboe, viro, gregkh, rostedt, mingo, jack,
	ming.lei, nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Omar Sandoval, Hannes Reinecke, Michal Hocko

On 2020-04-29 00:46, Luis Chamberlain wrote:
> The last reference for the request_queue must not be called from atomic
> conext. *When* the last reference to the request_queue reaches 0 varies,
  ^^^^^^
  context?
> and so let's take the opportunity to document when that is expected to
> happen and also document the context of the related calls as best as possible
> so we can avoid future issues, and with the hopes that the synchronous
> request_queue removal sticks.
> 
> We revert back to synchronous request_queue removal because asynchronous
> removal creates a regression with expected userspace interaction with
> several drivers. An example is when removing the loopback driver, one
> uses ioctls from userspace to do so, but upon return and if successful,
> one expects the device to be removed. Likewise if one races to add another
> device the new one may not be added as it is still being removed. This was
> expected behaviour before and it now fails as the device is still present
           ^^^^^^^^^
           behavior?

> +/**
> + * blk_put_queue - decrement the request_queue refcount
> + * @q: the request_queue structure to decrement the refcount for
> + *
> + * Decrements the refcount to the request_queue kobject. When this reaches 0
                              ^^
                              of?

> +/**
> + * blk_get_queue - increment the request_queue refcount
> + * @q: the request_queue structure to incremenet the refcount for
                                         ^^^^^^^^^^
                                         increment?
> + *
> + * Increment the refcount to the request_queue kobject.
                             ^^
                             of?

>  /**
> - * __blk_release_queue - release a request queue
> - * @work: pointer to the release_work member of the request queue to be released
> + * blk_release_queue - releases all allocated resources of the request_queue
> + * @kobj: pointer to a kobject, who's container is a request_queue
                                   ^^^^^
                                   whose?

> +/**
> + * disk_release - releases all allocated resources of the gendisk
> + * @dev: the device representing this disk
> + *
> + * This function releases all allocated resources of the gendisk.
> + *
> + * The struct gendisk refcounted is incremeneted with get_gendisk() or
                         ^^^^^^^^^^    ^^^^^^^^^^^^
                         refcount?     incremented?

Please fix the spelling errors. Otherwise this patch looks good to me.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function
  2020-04-29  7:46 ` [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function Luis Chamberlain
  2020-04-29 11:20   ` Christoph Hellwig
@ 2020-05-02  0:25   ` Bart Van Assche
  1 sibling, 0 replies; 33+ messages in thread
From: Bart Van Assche @ 2020-05-02  0:25 UTC (permalink / raw)
  To: Luis Chamberlain, axboe, viro, gregkh, rostedt, mingo, jack,
	ming.lei, nstange, akpm
  Cc: mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm, linux-kernel

On 2020-04-29 00:46, Luis Chamberlain wrote:
> +static struct dentry *blk_trace_debugfs_dir(struct blk_user_trace_setup *buts,
> +					    struct blk_trace *bt)
> +{
> +	struct dentry *dir = NULL;
> +
> +	dir = debugfs_lookup(buts->name, blk_debugfs_root);
> +	if (!dir)
> +		bt->dir = dir = debugfs_create_dir(buts->name, blk_debugfs_root);
> +
> +	return dir;
> +}

Initializing 'dir' is not necessary since the first statement overwrites
'dir'. Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals
  2020-04-29  9:50   ` Greg KH
@ 2020-05-03  9:09     ` Luis Chamberlain
  0 siblings, 0 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-05-03  9:09 UTC (permalink / raw)
  To: Greg KH
  Cc: axboe, viro, bvanassche, rostedt, mingo, jack, ming.lei, nstange,
	akpm, mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel

On Wed, Apr 29, 2020 at 11:50:34AM +0200, Greg KH wrote:
> On Wed, Apr 29, 2020 at 07:46:27AM +0000, Luis Chamberlain wrote:
> > Be pedantic on removal as well and hold the mutex.
> > This should prevent uses of addition while we exit.
> > 
> > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> > ---
> >  drivers/block/loop.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index da693e6a834e..6dccba22c9b5 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -2333,6 +2333,8 @@ static void __exit loop_exit(void)
> >  
> >  	range = max_loop ? max_loop << part_shift : 1UL << MINORBITS;
> >  
> > +	mutex_lock(&loop_ctl_mutex);
> > +
> >  	idr_for_each(&loop_index_idr, &loop_exit_cb, NULL);
> >  	idr_destroy(&loop_index_idr);
> >  
> > @@ -2340,6 +2342,8 @@ static void __exit loop_exit(void)
> >  	unregister_blkdev(LOOP_MAJOR, "loop");
> >  
> >  	misc_deregister(&loop_misc);
> > +
> > +	mutex_unlock(&loop_ctl_mutex);
> >  }
> >  
> >  module_init(loop_init);
> 
> What type of issue is this helping with?  Can it be triggered today?  if
> so, shouldn't it be backported to stable kernels?

Just code inspection. I can't get a userspace test script to crash
the kernel yet, but I suspect a race still exists.
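
For what it's worth, the pattern the patch is after can be sketched in
userspace with a pthread mutex: the add path and the exit path take the
same lock, so an add cannot interleave with the teardown loop. Everything
here (ctl_mutex, devs[], dev_add(), dev_exit_all()) is hypothetical and
only mirrors the shape of the change. It does not show whether the loop
driver actually has a race.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_DEVS 8

/* Hypothetical control mutex and device table, standing in for the driver's. */
static pthread_mutex_t ctl_mutex = PTHREAD_MUTEX_INITIALIZER;
static int devs[MAX_DEVS];	/* 1 = slot in use */

/* Add path: takes the control mutex before touching the table. */
int dev_add(int idx)
{
	int ret = -1;

	pthread_mutex_lock(&ctl_mutex);
	if (idx >= 0 && idx < MAX_DEVS && !devs[idx]) {
		devs[idx] = 1;
		ret = 0;
	}
	pthread_mutex_unlock(&ctl_mutex);

	return ret;
}

/*
 * Exit path: hold the same mutex across the whole teardown so that no
 * add can slip in halfway through. Returns the number of devices removed.
 */
int dev_exit_all(void)
{
	int i, removed = 0;

	pthread_mutex_lock(&ctl_mutex);
	for (i = 0; i < MAX_DEVS; i++) {
		if (devs[i]) {
			devs[i] = 0;
			removed++;
		}
	}
	pthread_mutex_unlock(&ctl_mutex);

	return removed;
}
```

Note that the lock only serializes the two paths. Nothing in this sketch
stops an add from succeeding after the teardown has finished.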

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 1/6] block: revert back to synchronous request_queue removal
  2020-05-02  0:22   ` Bart Van Assche
@ 2020-05-03 10:32     ` Matthew Wilcox
  2020-05-04 16:18       ` Luis Chamberlain
  2020-05-04 16:16     ` Luis Chamberlain
  1 sibling, 1 reply; 33+ messages in thread
From: Matthew Wilcox @ 2020-05-03 10:32 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Luis Chamberlain, axboe, viro, gregkh, rostedt, mingo, jack,
	ming.lei, nstange, akpm, mhocko, yukuai3, linux-block,
	linux-fsdevel, linux-mm, linux-kernel, Omar Sandoval,
	Hannes Reinecke, Michal Hocko

On Fri, May 01, 2020 at 05:22:12PM -0700, Bart Van Assche wrote:
> > expected behaviour before and it now fails as the device is still present
>            ^^^^^^^^^
>            behavior?

That's UK/US spelling.  We do not "correct" one to the other.

Documentation/doc-guide/contributing.rst: - Both American and British English spellings are allowed within the
Documentation/doc-guide/contributing.rst-   kernel documentation.  There is no need to fix one by replacing it with
Documentation/doc-guide/contributing.rst-   the other.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 1/6] block: revert back to synchronous request_queue removal
  2020-05-02  0:22   ` Bart Van Assche
  2020-05-03 10:32     ` Matthew Wilcox
@ 2020-05-04 16:16     ` Luis Chamberlain
  1 sibling, 0 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-05-04 16:16 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: axboe, viro, gregkh, rostedt, mingo, jack, ming.lei, nstange,
	akpm, mhocko, yukuai3, linux-block, linux-fsdevel, linux-mm,
	linux-kernel, Omar Sandoval, Hannes Reinecke, Michal Hocko

On Fri, May 01, 2020 at 05:22:12PM -0700, Bart Van Assche wrote:
> Please fix the spelling errors. Otherwise this patch looks good to me.

Fixed, thanks for the review.

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 1/6] block: revert back to synchronous request_queue removal
  2020-05-03 10:32     ` Matthew Wilcox
@ 2020-05-04 16:18       ` Luis Chamberlain
  0 siblings, 0 replies; 33+ messages in thread
From: Luis Chamberlain @ 2020-05-04 16:18 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Bart Van Assche, axboe, viro, gregkh, rostedt, mingo, jack,
	ming.lei, nstange, akpm, mhocko, yukuai3, linux-block,
	linux-fsdevel, linux-mm, linux-kernel, Omar Sandoval,
	Hannes Reinecke, Michal Hocko

On Sun, May 03, 2020 at 03:32:45AM -0700, Matthew Wilcox wrote:
> On Fri, May 01, 2020 at 05:22:12PM -0700, Bart Van Assche wrote:
> > > expected behaviour before and it now fails as the device is still present
> >            ^^^^^^^^^
> >            behavior?
> 
> That's UK/US spelling.  We do not "correct" one to the other.
> 
> Documentation/doc-guide/contributing.rst: - Both American and British English spellings are allowed within the
> Documentation/doc-guide/contributing.rst-   kernel documentation.  There is no need to fix one by replacing it with
> Documentation/doc-guide/contributing.rst-   the other.

I already changed it at Bart's request. I'll leave it like that to honor
US as being the leader in COVID19 cases.

  Luis

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2020-05-04 16:18 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-29  7:46 [PATCH v3 0/6] block: fix blktrace debugfs use after free Luis Chamberlain
2020-04-29  7:46 ` [PATCH v3 1/6] block: revert back to synchronous request_queue removal Luis Chamberlain
2020-04-29 11:15   ` Christoph Hellwig
2020-05-02  0:22   ` Bart Van Assche
2020-05-03 10:32     ` Matthew Wilcox
2020-05-04 16:18       ` Luis Chamberlain
2020-05-04 16:16     ` Luis Chamberlain
2020-04-29  7:46 ` [PATCH v3 2/6] block: move main block debugfs initialization to its own file Luis Chamberlain
2020-04-29 11:15   ` Christoph Hellwig
2020-04-29  7:46 ` [PATCH v3 3/6] blktrace: move blktrace debugfs creation to helper function Luis Chamberlain
2020-04-29 11:20   ` Christoph Hellwig
2020-05-02  0:25   ` Bart Van Assche
2020-04-29  7:46 ` [PATCH v3 4/6] blktrace: fix debugfs use after free Luis Chamberlain
2020-04-29  9:47   ` Greg KH
2020-04-29 11:26   ` Christoph Hellwig
2020-04-29 11:45     ` Luis Chamberlain
2020-04-29 11:50       ` Christoph Hellwig
2020-04-29 12:02         ` Luis Chamberlain
2020-04-29 12:04           ` Christoph Hellwig
2020-04-29 12:21             ` Luis Chamberlain
2020-04-29 12:57               ` Greg KH
2020-05-01 15:24                 ` Luis Chamberlain
2020-04-29  7:46 ` [PATCH v3 5/6] blktrace: break out of blktrace setup on concurrent calls Luis Chamberlain
2020-04-29  9:49   ` Greg KH
2020-05-01 15:06     ` Luis Chamberlain
2020-05-01 15:34       ` Christoph Hellwig
2020-05-01 15:40         ` Luis Chamberlain
2020-05-01 15:50           ` Luis Chamberlain
2020-05-01 16:51       ` Greg KH
2020-04-29  7:46 ` [PATCH v3 6/6] loop: be paranoid on exit and prevent new additions / removals Luis Chamberlain
2020-04-29  9:50   ` Greg KH
2020-05-03  9:09     ` Luis Chamberlain
2020-04-29 14:05   ` Ming Lei
