linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] block: Fixes for bdi handling
@ 2017-03-08 16:48 Jan Kara
  2017-03-08 16:48 ` [PATCH 1/4] block: Allow bdi re-registration Jan Kara
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Jan Kara @ 2017-03-08 16:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi, Jan Kara

Hi!

patches in this series fix the most urgent bugs that were introduced by commit
165a5e22fafb "block: Move bdi_unregister() to del_gendisk()" and by
0dba1314d4f8 "scsi, block: fix duplicate bdi name registration crashes".
In fact before these commits we had a different set of problems in the
code but they were less visible :).

I'm still waiting for test confirmation from Omar and Arthur Marsh who reported
issues but I'm not able to hit any problem anymore in my testing.  I think it
would be nice to get the patches to rc2 so to speed up things I'm posting the
patches now so that review can happen in parallel with the testing.

Other BDI handling fixes I have in my queue can wait a bit more since they are
either theoretical or long-standing issues. So I'll repost them once these four
are sorted out.

								Honza

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] block: Allow bdi re-registration
  2017-03-08 16:48 [PATCH 0/4] block: Fixes for bdi handling Jan Kara
@ 2017-03-08 16:48 ` Jan Kara
  2017-03-08 22:55   ` Tejun Heo
  2017-03-08 16:48 ` [PATCH 2/4] bdi: Fix use-after-free in wb_congested_put() Jan Kara
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Jan Kara @ 2017-03-08 16:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi, Jan Kara

SCSI can call device_add_disk() several times for one request queue when
a device in unbound and bound, creating new gendisk each time. This will
lead to bdi being repeatedly registered and unregistered. This was not a
big problem until commit 165a5e22fafb "block: Move bdi_unregister() to
del_gendisk()" since bdi was only registered repeatedly (bdi_register()
handles repeated calls fine, only we ended up leaking reference to
gendisk due to overwriting bdi->owner) but unregistered only in
blk_cleanup_queue() which didn't get called repeatedly. After
165a5e22fafb we were doing correct bdi_register() - bdi_unregister()
cycles however bdi_unregister() is not prepared for it. So make sure
bdi_unregister() cleans up bdi in such a way that it is prepared for
a possible following bdi_register() call.

An easy way to provoke this behavior is to enable
CONFIG_DEBUG_TEST_DRIVER_REMOVE and use scsi_debug driver to create a
scsi disk which immediately hangs without this fix.

Fixes: 165a5e22fafb127ecb5914e12e8c32a1f0d3f820
Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/backing-dev.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 6d861d090e9f..6ac932210f56 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -710,6 +710,11 @@ static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
 	 */
 	atomic_dec(&bdi->usage_cnt);
 	wait_event(cgwb_release_wait, !atomic_read(&bdi->usage_cnt));
+	/*
+	 * Grab back our reference so that we hold it when @bdi gets
+	 * re-registered.
+	 */
+	atomic_inc(&bdi->usage_cnt);
 }
 
 /**
@@ -857,6 +862,8 @@ int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner)
 			MINOR(owner->devt));
 	if (rc)
 		return rc;
+	/* Leaking owner reference... */
+	WARN_ON(bdi->owner);
 	bdi->owner = owner;
 	get_device(owner);
 	return 0;
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] bdi: Fix use-after-free in wb_congested_put()
  2017-03-08 16:48 [PATCH 0/4] block: Fixes for bdi handling Jan Kara
  2017-03-08 16:48 ` [PATCH 1/4] block: Allow bdi re-registration Jan Kara
@ 2017-03-08 16:48 ` Jan Kara
  2017-03-08 22:56   ` Tejun Heo
  2017-03-08 16:48 ` [PATCH 3/4] block: Make del_gendisk() safer for disks without queues Jan Kara
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Jan Kara @ 2017-03-08 16:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi, Jan Kara

bdi_writeback_congested structures get created for each blkcg and bdi
regardless whether bdi is registered or not. When they are created in
unregistered bdi and the request queue (and thus bdi) is then destroyed
while blkg still holds reference to bdi_writeback_congested structure,
this structure will be referencing freed bdi and last wb_congested_put()
will try to remove the structure from already freed bdi.

With commit 165a5e22fafb "block: Move bdi_unregister() to
del_gendisk()", SCSI started to destroy bdis without calling
bdi_unregister() first (previously it was calling bdi_unregister() even
for unregistered bdis) and thus the code detaching
bdi_writeback_congested in cgwb_bdi_destroy() was not triggered and we
started hitting this use-after-free bug. It is enough to boot a KVM
instance with virtio-scsi device to trigger this behavior.

Fix the problem by detaching bdi_writeback_congested structures in
bdi_exit() instead of bdi_unregister(). This is also more logical as
they can get attached to bdi regardless whether it ever got registered
or not.

Fixes: 165a5e22fafb127ecb5914e12e8c32a1f0d3f820
Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/backing-dev.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 6ac932210f56..c6f2a37028c2 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -683,30 +683,18 @@ static int cgwb_bdi_init(struct backing_dev_info *bdi)
 static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
 {
 	struct radix_tree_iter iter;
-	struct rb_node *rbn;
 	void **slot;
 
 	WARN_ON(test_bit(WB_registered, &bdi->wb.state));
 
 	spin_lock_irq(&cgwb_lock);
-
 	radix_tree_for_each_slot(slot, &bdi->cgwb_tree, &iter, 0)
 		cgwb_kill(*slot);
-
-	while ((rbn = rb_first(&bdi->cgwb_congested_tree))) {
-		struct bdi_writeback_congested *congested =
-			rb_entry(rbn, struct bdi_writeback_congested, rb_node);
-
-		rb_erase(rbn, &bdi->cgwb_congested_tree);
-		congested->bdi = NULL;	/* mark @congested unlinked */
-	}
-
 	spin_unlock_irq(&cgwb_lock);
 
 	/*
-	 * All cgwb's and their congested states must be shutdown and
-	 * released before returning.  Drain the usage counter to wait for
-	 * all cgwb's and cgwb_congested's ever created on @bdi.
+	 * All cgwb's must be shutdown and released before returning.  Drain
+	 * the usage counter to wait for all cgwb's ever created on @bdi.
 	 */
 	atomic_dec(&bdi->usage_cnt);
 	wait_event(cgwb_release_wait, !atomic_read(&bdi->usage_cnt));
@@ -754,6 +742,21 @@ void wb_blkcg_offline(struct blkcg *blkcg)
 	spin_unlock_irq(&cgwb_lock);
 }
 
+static void cgwb_bdi_exit(struct backing_dev_info *bdi)
+{
+	struct rb_node *rbn;
+
+	spin_lock_irq(&cgwb_lock);
+	while ((rbn = rb_first(&bdi->cgwb_congested_tree))) {
+		struct bdi_writeback_congested *congested =
+			rb_entry(rbn, struct bdi_writeback_congested, rb_node);
+
+		rb_erase(rbn, &bdi->cgwb_congested_tree);
+		congested->bdi = NULL;	/* mark @congested unlinked */
+	}
+	spin_unlock_irq(&cgwb_lock);
+}
+
 #else	/* CONFIG_CGROUP_WRITEBACK */
 
 static int cgwb_bdi_init(struct backing_dev_info *bdi)
@@ -774,7 +777,9 @@ static int cgwb_bdi_init(struct backing_dev_info *bdi)
 	return 0;
 }
 
-static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
+static void cgwb_bdi_destroy(struct backing_dev_info *bdi) { }
+
+static void cgwb_bdi_exit(struct backing_dev_info *bdi)
 {
 	wb_congested_put(bdi->wb_congested);
 }
@@ -905,6 +910,7 @@ static void bdi_exit(struct backing_dev_info *bdi)
 {
 	WARN_ON_ONCE(bdi->dev);
 	wb_exit(&bdi->wb);
+	cgwb_bdi_exit(bdi);
 }
 
 static void release_bdi(struct kref *ref)
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] block: Make del_gendisk() safer for disks without queues
  2017-03-08 16:48 [PATCH 0/4] block: Fixes for bdi handling Jan Kara
  2017-03-08 16:48 ` [PATCH 1/4] block: Allow bdi re-registration Jan Kara
  2017-03-08 16:48 ` [PATCH 2/4] bdi: Fix use-after-free in wb_congested_put() Jan Kara
@ 2017-03-08 16:48 ` Jan Kara
  2017-03-08 22:57   ` Tejun Heo
  2017-03-08 16:48 ` [PATCH 4/4] Revert "scsi, block: fix duplicate bdi name registration crashes" Jan Kara
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Jan Kara @ 2017-03-08 16:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi, Jan Kara

Commit 165a5e22fafb "block: Move bdi_unregister() to del_gendisk()"
added disk->queue dereference to del_gendisk(). Although del_gendisk()
is not supposed to be called without disk->queue valid and
blk_unregister_queue() warns in that case, this change will make it oops
instead. Return to the old more robust behavior of just warning when
del_gendisk() gets called for gendisk with disk->queue being NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/genhd.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index b26a5ea115d0..94f323842b52 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -681,12 +681,16 @@ void del_gendisk(struct gendisk *disk)
 	disk->flags &= ~GENHD_FL_UP;
 
 	sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");
-	/*
-	 * Unregister bdi before releasing device numbers (as they can get
-	 * reused and we'd get clashes in sysfs).
-	 */
-	bdi_unregister(disk->queue->backing_dev_info);
-	blk_unregister_queue(disk);
+	if (disk->queue) {
+		/*
+		 * Unregister bdi before releasing device numbers (as they can
+		 * get reused and we'd get clashes in sysfs).
+		 */
+		bdi_unregister(disk->queue->backing_dev_info);
+		blk_unregister_queue(disk);
+	} else {
+		WARN_ON(1);
+	}
 	blk_unregister_region(disk_devt(disk), disk->minors);
 
 	part_stat_set_all(&disk->part0, 0);
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] Revert "scsi, block: fix duplicate bdi name registration crashes"
  2017-03-08 16:48 [PATCH 0/4] block: Fixes for bdi handling Jan Kara
                   ` (2 preceding siblings ...)
  2017-03-08 16:48 ` [PATCH 3/4] block: Make del_gendisk() safer for disks without queues Jan Kara
@ 2017-03-08 16:48 ` Jan Kara
  2017-03-08 17:12   ` Dan Williams
  2017-03-08 16:50 ` [PATCH 0/4] block: Fixes for bdi handling Omar Sandoval
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Jan Kara @ 2017-03-08 16:48 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi, Jan Kara

This reverts commit 0dba1314d4f81115dce711292ec7981d17231064. It causes
leaking of device numbers for SCSI when SCSI registers multiple gendisks
for one request_queue in succession. It can be easily reproduced using
Omar's script [1] on kernel with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
Furthermore the protection provided by this commit is not needed anymore
as the problem it was fixing got also fixed by commit 165a5e22fafb
"block: Move bdi_unregister() to del_gendisk()".

[1]: http://marc.info/?l=linux-block&m=148554717109098&w=2

Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/blk-core.c       |  2 --
 block/genhd.c          | 21 ---------------------
 drivers/scsi/sd.c      | 41 ++++++++---------------------------------
 include/linux/blkdev.h |  1 -
 include/linux/genhd.h  |  8 --------
 5 files changed, 8 insertions(+), 65 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1086dac8724c..a76895c9776d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -578,8 +578,6 @@ void blk_cleanup_queue(struct request_queue *q)
 		q->queue_lock = &q->__queue_lock;
 	spin_unlock_irq(lock);
 
-	put_disk_devt(q->disk_devt);
-
 	/* @q is and will stay empty, shutdown and put */
 	blk_put_queue(q);
 }
diff --git a/block/genhd.c b/block/genhd.c
index 94f323842b52..a9c516a8b37d 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -572,20 +572,6 @@ static void register_disk(struct device *parent, struct gendisk *disk)
 	disk_part_iter_exit(&piter);
 }
 
-void put_disk_devt(struct disk_devt *disk_devt)
-{
-	if (disk_devt && atomic_dec_and_test(&disk_devt->count))
-		disk_devt->release(disk_devt);
-}
-EXPORT_SYMBOL(put_disk_devt);
-
-void get_disk_devt(struct disk_devt *disk_devt)
-{
-	if (disk_devt)
-		atomic_inc(&disk_devt->count);
-}
-EXPORT_SYMBOL(get_disk_devt);
-
 /**
  * device_add_disk - add partitioning information to kernel list
  * @parent: parent device for the disk
@@ -626,13 +612,6 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 
 	disk_alloc_events(disk);
 
-	/*
-	 * Take a reference on the devt and assign it to queue since it
-	 * must not be reallocated while the bdi is registered
-	 */
-	disk->queue->disk_devt = disk->disk_devt;
-	get_disk_devt(disk->disk_devt);
-
 	/* Register BDI before referencing it from bdev */
 	bdi = disk->queue->backing_dev_info;
 	bdi_register_owner(bdi, disk_to_dev(disk));
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index c7839f6c35cc..d277e8620e3e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3075,23 +3075,6 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
 	put_device(&sdkp->dev);
 }
 
-struct sd_devt {
-	int idx;
-	struct disk_devt disk_devt;
-};
-
-static void sd_devt_release(struct disk_devt *disk_devt)
-{
-	struct sd_devt *sd_devt = container_of(disk_devt, struct sd_devt,
-			disk_devt);
-
-	spin_lock(&sd_index_lock);
-	ida_remove(&sd_index_ida, sd_devt->idx);
-	spin_unlock(&sd_index_lock);
-
-	kfree(sd_devt);
-}
-
 /**
  *	sd_probe - called during driver initialization and whenever a
  *	new scsi device is attached to the system. It is called once
@@ -3113,7 +3096,6 @@ static void sd_devt_release(struct disk_devt *disk_devt)
 static int sd_probe(struct device *dev)
 {
 	struct scsi_device *sdp = to_scsi_device(dev);
-	struct sd_devt *sd_devt;
 	struct scsi_disk *sdkp;
 	struct gendisk *gd;
 	int index;
@@ -3139,13 +3121,9 @@ static int sd_probe(struct device *dev)
 	if (!sdkp)
 		goto out;
 
-	sd_devt = kzalloc(sizeof(*sd_devt), GFP_KERNEL);
-	if (!sd_devt)
-		goto out_free;
-
 	gd = alloc_disk(SD_MINORS);
 	if (!gd)
-		goto out_free_devt;
+		goto out_free;
 
 	do {
 		if (!ida_pre_get(&sd_index_ida, GFP_KERNEL))
@@ -3161,11 +3139,6 @@ static int sd_probe(struct device *dev)
 		goto out_put;
 	}
 
-	atomic_set(&sd_devt->disk_devt.count, 1);
-	sd_devt->disk_devt.release = sd_devt_release;
-	sd_devt->idx = index;
-	gd->disk_devt = &sd_devt->disk_devt;
-
 	error = sd_format_disk_name("sd", index, gd->disk_name, DISK_NAME_LEN);
 	if (error) {
 		sdev_printk(KERN_WARNING, sdp, "SCSI disk (sd) name length exceeded.\n");
@@ -3205,12 +3178,11 @@ static int sd_probe(struct device *dev)
 	return 0;
 
  out_free_index:
-	put_disk_devt(&sd_devt->disk_devt);
-	sd_devt = NULL;
+	spin_lock(&sd_index_lock);
+	ida_remove(&sd_index_ida, index);
+	spin_unlock(&sd_index_lock);
  out_put:
 	put_disk(gd);
- out_free_devt:
-	kfree(sd_devt);
  out_free:
 	kfree(sdkp);
  out:
@@ -3271,7 +3243,10 @@ static void scsi_disk_release(struct device *dev)
 	struct scsi_disk *sdkp = to_scsi_disk(dev);
 	struct gendisk *disk = sdkp->disk;
 	
-	put_disk_devt(disk->disk_devt);
+	spin_lock(&sd_index_lock);
+	ida_remove(&sd_index_ida, sdkp->index);
+	spin_unlock(&sd_index_lock);
+
 	disk->private_data = NULL;
 	put_disk(disk);
 	put_device(&sdkp->device->sdev_gendev);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 796016e63c1d..5a7da607ca04 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -435,7 +435,6 @@ struct request_queue {
 	struct delayed_work	delay_work;
 
 	struct backing_dev_info	*backing_dev_info;
-	struct disk_devt	*disk_devt;
 
 	/*
 	 * The queue owner gets to use this for whatever they like.
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index a999d281a2f1..76f39754e7b0 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -167,13 +167,6 @@ struct blk_integrity {
 };
 
 #endif	/* CONFIG_BLK_DEV_INTEGRITY */
-struct disk_devt {
-	atomic_t count;
-	void (*release)(struct disk_devt *disk_devt);
-};
-
-void put_disk_devt(struct disk_devt *disk_devt);
-void get_disk_devt(struct disk_devt *disk_devt);
 
 struct gendisk {
 	/* major, first_minor and minors are input parameters only,
@@ -183,7 +176,6 @@ struct gendisk {
 	int first_minor;
 	int minors;                     /* maximum number of minors, =1 for
                                          * disks that can't be partitioned. */
-	struct disk_devt *disk_devt;
 
 	char disk_name[DISK_NAME_LEN];	/* name of major driver */
 	char *(*devnode)(struct gendisk *gd, umode_t *mode);
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] block: Fixes for bdi handling
  2017-03-08 16:48 [PATCH 0/4] block: Fixes for bdi handling Jan Kara
                   ` (3 preceding siblings ...)
  2017-03-08 16:48 ` [PATCH 4/4] Revert "scsi, block: fix duplicate bdi name registration crashes" Jan Kara
@ 2017-03-08 16:50 ` Omar Sandoval
  2017-03-08 17:39   ` Omar Sandoval
  2017-03-08 16:55 ` Jens Axboe
  2017-03-09  9:05 ` Arthur Marsh
  6 siblings, 1 reply; 17+ messages in thread
From: Omar Sandoval @ 2017-03-08 16:50 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jens Axboe, Tejun Heo, linux-block, Dan Williams, Arthur Marsh,
	linux-scsi

On Wed, Mar 08, 2017 at 05:48:30PM +0100, Jan Kara wrote:
> Hi!
> 
> patches in this series fix the most urgent bugs that were introduced by commit
> 165a5e22fafb "block: Move bdi_unregister() to del_gendisk()" and by
> 0dba1314d4f8 "scsi, block: fix duplicate bdi name registration crashes".
> In fact before these commits we had a different set of problems in the
> code but they were less visible :).
> 
> I'm still waiting for test confirmation from Omar and Arthur Marsh who reported
> issues but I'm not able to hit any problem anymore in my testing.  I think it
> would be nice to get the patches to rc2 so to speed up things I'm posting the
> patches now so that review can happen in parallel with the testing.
> 
> Other BDI handling fixes I have in my queue can wait a bit more since they are
> either theoretical or long-standing issues. So I'll repost them once these four
> are sorted out.
> 
> 								Honza

Applying patches 1 and 2 indeed fixed my virtio-scsi problem. I'll apply
the whole series and test that my sd/sr test cases still work.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] block: Fixes for bdi handling
  2017-03-08 16:48 [PATCH 0/4] block: Fixes for bdi handling Jan Kara
                   ` (4 preceding siblings ...)
  2017-03-08 16:50 ` [PATCH 0/4] block: Fixes for bdi handling Omar Sandoval
@ 2017-03-08 16:55 ` Jens Axboe
  2017-03-09  9:05 ` Arthur Marsh
  6 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2017-03-08 16:55 UTC (permalink / raw)
  To: Jan Kara
  Cc: Tejun Heo, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi

On 03/08/2017 09:48 AM, Jan Kara wrote:
> Hi!
> 
> patches in this series fix the most urgent bugs that were introduced by commit
> 165a5e22fafb "block: Move bdi_unregister() to del_gendisk()" and by
> 0dba1314d4f8 "scsi, block: fix duplicate bdi name registration crashes".
> In fact before these commits we had a different set of problems in the
> code but they were less visible :).
> 
> I'm still waiting for test confirmation from Omar and Arthur Marsh who reported
> issues but I'm not able to hit any problem anymore in my testing.  I think it
> would be nice to get the patches to rc2 so to speed up things I'm posting the
> patches now so that review can happen in parallel with the testing.
> 
> Other BDI handling fixes I have in my queue can wait a bit more since they are
> either theoretical or long-standing issues. So I'll repost them once these four
> are sorted out.

Agree, we'll need to get this sorted for -rc2, thanks a lot for looking
into it. I'll queue these up for testing as well, and hopefully ship
them off tomorrow (or Friday, at the latest).

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] Revert "scsi, block: fix duplicate bdi name registration crashes"
  2017-03-08 16:48 ` [PATCH 4/4] Revert "scsi, block: fix duplicate bdi name registration crashes" Jan Kara
@ 2017-03-08 17:12   ` Dan Williams
  0 siblings, 0 replies; 17+ messages in thread
From: Dan Williams @ 2017-03-08 17:12 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jens Axboe, Tejun Heo, linux-block, Omar Sandoval, Arthur Marsh,
	linux-scsi

On Wed, Mar 8, 2017 at 8:48 AM, Jan Kara <jack@suse.cz> wrote:
> This reverts commit 0dba1314d4f81115dce711292ec7981d17231064. It causes
> leaking of device numbers for SCSI when SCSI registers multiple gendisks
> for one request_queue in succession. It can be easily reproduced using
> Omar's script [1] on kernel with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
> Furthermore the protection provided by this commit is not needed anymore
> as the problem it was fixing got also fixed by commit 165a5e22fafb
> "block: Move bdi_unregister() to del_gendisk()".
>
> [1]: http://marc.info/?l=linux-block&m=148554717109098&w=2
>
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] block: Fixes for bdi handling
  2017-03-08 16:50 ` [PATCH 0/4] block: Fixes for bdi handling Omar Sandoval
@ 2017-03-08 17:39   ` Omar Sandoval
  0 siblings, 0 replies; 17+ messages in thread
From: Omar Sandoval @ 2017-03-08 17:39 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jens Axboe, Tejun Heo, linux-block, Dan Williams, Arthur Marsh,
	linux-scsi

On Wed, Mar 08, 2017 at 08:50:51AM -0800, Omar Sandoval wrote:
> On Wed, Mar 08, 2017 at 05:48:30PM +0100, Jan Kara wrote:
> > Hi!
> > 
> > patches in this series fix the most urgent bugs that were introduced by commit
> > 165a5e22fafb "block: Move bdi_unregister() to del_gendisk()" and by
> > 0dba1314d4f8 "scsi, block: fix duplicate bdi name registration crashes".
> > In fact before these commits we had a different set of problems in the
> > code but they were less visible :).
> > 
> > I'm still waiting for test confirmation from Omar and Arthur Marsh who reported
> > issues but I'm not able to hit any problem anymore in my testing.  I think it
> > would be nice to get the patches to rc2 so to speed up things I'm posting the
> > patches now so that review can happen in parallel with the testing.
> > 
> > Other BDI handling fixes I have in my queue can wait a bit more since they are
> > either theoretical or long-standing issues. So I'll repost them once these four
> > are sorted out.
> > 
> > 								Honza
> 
> Applying patches 1 and 2 indeed fixed my virtio-scsi problem. I'll apply
> the whole series and test that my sd/sr test cases still work.

Yup, everything is working here. For the series:

Tested-by: Omar Sandoval <osandov@fb.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: Allow bdi re-registration
  2017-03-08 16:48 ` [PATCH 1/4] block: Allow bdi re-registration Jan Kara
@ 2017-03-08 22:55   ` Tejun Heo
  2017-03-08 23:17     ` James Bottomley
  2017-03-09  9:10     ` Jan Kara
  0 siblings, 2 replies; 17+ messages in thread
From: Tejun Heo @ 2017-03-08 22:55 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jens Axboe, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi

Hello,

On Wed, Mar 08, 2017 at 05:48:31PM +0100, Jan Kara wrote:
> @@ -710,6 +710,11 @@ static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
>  	 */
>  	atomic_dec(&bdi->usage_cnt);
>  	wait_event(cgwb_release_wait, !atomic_read(&bdi->usage_cnt));
> +	/*
> +	 * Grab back our reference so that we hold it when @bdi gets
> +	 * re-registered.
> +	 */
> +	atomic_inc(&bdi->usage_cnt);

So, this is more re-initializing the ref to the initial state so that
it can be re-used, right?  Maybe ATOMIC_INIT() is a better choice here
just to clarify what's going on?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] bdi: Fix use-after-free in wb_congested_put()
  2017-03-08 16:48 ` [PATCH 2/4] bdi: Fix use-after-free in wb_congested_put() Jan Kara
@ 2017-03-08 22:56   ` Tejun Heo
  0 siblings, 0 replies; 17+ messages in thread
From: Tejun Heo @ 2017-03-08 22:56 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jens Axboe, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi

On Wed, Mar 08, 2017 at 05:48:32PM +0100, Jan Kara wrote:
> bdi_writeback_congested structures get created for each blkcg and bdi
> regardless whether bdi is registered or not. When they are created in
> unregistered bdi and the request queue (and thus bdi) is then destroyed
> while blkg still holds reference to bdi_writeback_congested structure,
> this structure will be referencing freed bdi and last wb_congested_put()
> will try to remove the structure from already freed bdi.
> 
> With commit 165a5e22fafb "block: Move bdi_unregister() to
> del_gendisk()", SCSI started to destroy bdis without calling
> bdi_unregister() first (previously it was calling bdi_unregister() even
> for unregistered bdis) and thus the code detaching
> bdi_writeback_congested in cgwb_bdi_destroy() was not triggered and we
> started hitting this use-after-free bug. It is enough to boot a KVM
> instance with virtio-scsi device to trigger this behavior.
> 
> Fix the problem by detaching bdi_writeback_congested structures in
> bdi_exit() instead of bdi_unregister(). This is also more logical as
> they can get attached to bdi regardless whether it ever got registered
> or not.
> 
> Fixes: 165a5e22fafb127ecb5914e12e8c32a1f0d3f820
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] block: Make del_gendisk() safer for disks without queues
  2017-03-08 16:48 ` [PATCH 3/4] block: Make del_gendisk() safer for disks without queues Jan Kara
@ 2017-03-08 22:57   ` Tejun Heo
  0 siblings, 0 replies; 17+ messages in thread
From: Tejun Heo @ 2017-03-08 22:57 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jens Axboe, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi

On Wed, Mar 08, 2017 at 05:48:33PM +0100, Jan Kara wrote:
> Commit 165a5e22fafb "block: Move bdi_unregister() to del_gendisk()"
> added disk->queue dereference to del_gendisk(). Although del_gendisk()
> is not supposed to be called without disk->queue valid and
> blk_unregister_queue() warns in that case, this change will make it oops
> instead. Return to the old more robust behavior of just warning when
> del_gendisk() gets called for gendisk with disk->queue being NULL.
> 
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: Allow bdi re-registration
  2017-03-08 22:55   ` Tejun Heo
@ 2017-03-08 23:17     ` James Bottomley
  2017-03-09  9:10     ` Jan Kara
  1 sibling, 0 replies; 17+ messages in thread
From: James Bottomley @ 2017-03-08 23:17 UTC (permalink / raw)
  To: Tejun Heo, Jan Kara
  Cc: Jens Axboe, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi

On Wed, 2017-03-08 at 17:55 -0500, Tejun Heo wrote:
> Hello,
> 
> On Wed, Mar 08, 2017 at 05:48:31PM +0100, Jan Kara wrote:
> > @@ -710,6 +710,11 @@ static void cgwb_bdi_destroy(struct
> > backing_dev_info *bdi)
> >  	 */
> >  	atomic_dec(&bdi->usage_cnt);
> >  	wait_event(cgwb_release_wait, !atomic_read(&bdi
> > ->usage_cnt));
> > +	/*
> > +	 * Grab back our reference so that we hold it when @bdi
> > gets
> > +	 * re-registered.
> > +	 */
> > +	atomic_inc(&bdi->usage_cnt);
> 
> So, this is more re-initializing the ref to the initial state so that
> it can be re-used, right?  Maybe ATOMIC_INIT() is a better choice 
> here just to clarify what's going on?

Seconded.  Eventually this is going to get converted to a refcount_t
and it will dump a spurious warning on the 0->1 transition.  We can
avoid that by making this a proper initialization.

James

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 0/4] block: Fixes for bdi handling
  2017-03-08 16:48 [PATCH 0/4] block: Fixes for bdi handling Jan Kara
                   ` (5 preceding siblings ...)
  2017-03-08 16:55 ` Jens Axboe
@ 2017-03-09  9:05 ` Arthur Marsh
  6 siblings, 0 replies; 17+ messages in thread
From: Arthur Marsh @ 2017-03-09  9:05 UTC (permalink / raw)
  To: Jan Kara, Jens Axboe
  Cc: Tejun Heo, linux-block, Dan Williams, Omar Sandoval, linux-scsi



Jan Kara wrote on 09/03/17 03:18:
> Hi!
>
> patches in this series fix the most urgent bugs that were introduced by commit
> 165a5e22fafb "block: Move bdi_unregister() to del_gendisk()" and by
> 0dba1314d4f8 "scsi, block: fix duplicate bdi name registration crashes".
> In fact before these commits we had a different set of problems in the
> code but they were less visible :).
>
> I'm still waiting for test confirmation from Omar and Arthur Marsh who reported
> issues but I'm not able to hit any problem anymore in my testing.  I think it
> would be nice to get the patches to rc2 so to speed up things I'm posting the
> patches now so that review can happen in parallel with the testing.
>
> Other BDI handling fixes I have in my queue can wait a bit more since they are
> either theoretical or long-standing issues. So I'll repost them once these four
> are sorted out.
>
> 								Honza
>

Sorry for the delay in replying, I had to leave the kernel with all 4 
patches applied rebuilding while I was at work and just booted it.

I've only done a kexec reboot so far but there were no problems - no 
errors in dmesg, all disks were recognised and all attempted mounts worked.

Thanks very much for the quick fix!

Arthur.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: Allow bdi re-registration
  2017-03-08 22:55   ` Tejun Heo
  2017-03-08 23:17     ` James Bottomley
@ 2017-03-09  9:10     ` Jan Kara
  1 sibling, 0 replies; 17+ messages in thread
From: Jan Kara @ 2017-03-09  9:10 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jan Kara, Jens Axboe, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi

On Wed 08-03-17 17:55:42, Tejun Heo wrote:
> Hello,
> 
> On Wed, Mar 08, 2017 at 05:48:31PM +0100, Jan Kara wrote:
> > @@ -710,6 +710,11 @@ static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
> >  	 */
> >  	atomic_dec(&bdi->usage_cnt);
> >  	wait_event(cgwb_release_wait, !atomic_read(&bdi->usage_cnt));
> > +	/*
> > +	 * Grab back our reference so that we hold it when @bdi gets
> > +	 * re-registered.
> > +	 */
> > +	atomic_inc(&bdi->usage_cnt);
> 
> So, this is more re-initializing the ref to the initial state so that
> it can be re-used, right?  Maybe ATOMIC_INIT() is a better choice here
> just to clarify what's going on?

OK, I was somewhat undecided between these two option but you and James are
probably right that re-init is clearer.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] block: Allow bdi re-registration
  2017-03-09 10:16 ` [PATCH 1/4] block: Allow bdi re-registration Jan Kara
@ 2017-03-09 16:39   ` Tejun Heo
  0 siblings, 0 replies; 17+ messages in thread
From: Tejun Heo @ 2017-03-09 16:39 UTC (permalink / raw)
  To: Jan Kara
  Cc: Jens Axboe, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi

On Thu, Mar 09, 2017 at 11:16:21AM +0100, Jan Kara wrote:
> SCSI can call device_add_disk() several times for one request queue when
> a device in unbound and bound, creating new gendisk each time. This will
> lead to bdi being repeatedly registered and unregistered. This was not a
> big problem until commit 165a5e22fafb "block: Move bdi_unregister() to
> del_gendisk()" since bdi was only registered repeatedly (bdi_register()
> handles repeated calls fine, only we ended up leaking reference to
> gendisk due to overwriting bdi->owner) but unregistered only in
> blk_cleanup_queue() which didn't get called repeatedly. After
> 165a5e22fafb we were doing correct bdi_register() - bdi_unregister()
> cycles however bdi_unregister() is not prepared for it. So make sure
> bdi_unregister() cleans up bdi in such a way that it is prepared for
> a possible following bdi_register() call.
> 
> An easy way to provoke this behavior is to enable
> CONFIG_DEBUG_TEST_DRIVER_REMOVE and use scsi_debug driver to create a
> scsi disk which immediately hangs without this fix.
> 
> Fixes: 165a5e22fafb127ecb5914e12e8c32a1f0d3f820
> Tested-by: Omar Sandoval <osandov@fb.com>
> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks!

-- 
tejun

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] block: Allow bdi re-registration
  2017-03-09 10:16 [PATCH 0/4 v2] " Jan Kara
@ 2017-03-09 10:16 ` Jan Kara
  2017-03-09 16:39   ` Tejun Heo
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kara @ 2017-03-09 10:16 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Tejun Heo, linux-block, Dan Williams, Omar Sandoval,
	Arthur Marsh, linux-scsi, Jan Kara

SCSI can call device_add_disk() several times for one request queue when
a device in unbound and bound, creating new gendisk each time. This will
lead to bdi being repeatedly registered and unregistered. This was not a
big problem until commit 165a5e22fafb "block: Move bdi_unregister() to
del_gendisk()" since bdi was only registered repeatedly (bdi_register()
handles repeated calls fine, only we ended up leaking reference to
gendisk due to overwriting bdi->owner) but unregistered only in
blk_cleanup_queue() which didn't get called repeatedly. After
165a5e22fafb we were doing correct bdi_register() - bdi_unregister()
cycles however bdi_unregister() is not prepared for it. So make sure
bdi_unregister() cleans up bdi in such a way that it is prepared for
a possible following bdi_register() call.

An easy way to provoke this behavior is to enable
CONFIG_DEBUG_TEST_DRIVER_REMOVE and use scsi_debug driver to create a
scsi disk which immediately hangs without this fix.

Fixes: 165a5e22fafb127ecb5914e12e8c32a1f0d3f820
Tested-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/backing-dev.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 6d861d090e9f..51325489aae5 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -710,6 +710,11 @@ static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
 	 */
 	atomic_dec(&bdi->usage_cnt);
 	wait_event(cgwb_release_wait, !atomic_read(&bdi->usage_cnt));
+	/*
+	 * Reinitialize usage_cnt so that we hold reference when @bdi gets
+	 * re-registered.
+	 */
+	atomic_set(&bdi->usage_cnt, 1);
 }
 
 /**
@@ -857,6 +862,8 @@ int bdi_register_owner(struct backing_dev_info *bdi, struct device *owner)
 			MINOR(owner->devt));
 	if (rc)
 		return rc;
+	/* Leaking owner reference... */
+	WARN_ON(bdi->owner);
 	bdi->owner = owner;
 	get_device(owner);
 	return 0;
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-03-09 16:39 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-08 16:48 [PATCH 0/4] block: Fixes for bdi handling Jan Kara
2017-03-08 16:48 ` [PATCH 1/4] block: Allow bdi re-registration Jan Kara
2017-03-08 22:55   ` Tejun Heo
2017-03-08 23:17     ` James Bottomley
2017-03-09  9:10     ` Jan Kara
2017-03-08 16:48 ` [PATCH 2/4] bdi: Fix use-after-free in wb_congested_put() Jan Kara
2017-03-08 22:56   ` Tejun Heo
2017-03-08 16:48 ` [PATCH 3/4] block: Make del_gendisk() safer for disks without queues Jan Kara
2017-03-08 22:57   ` Tejun Heo
2017-03-08 16:48 ` [PATCH 4/4] Revert "scsi, block: fix duplicate bdi name registration crashes" Jan Kara
2017-03-08 17:12   ` Dan Williams
2017-03-08 16:50 ` [PATCH 0/4] block: Fixes for bdi handling Omar Sandoval
2017-03-08 17:39   ` Omar Sandoval
2017-03-08 16:55 ` Jens Axboe
2017-03-09  9:05 ` Arthur Marsh
2017-03-09 10:16 [PATCH 0/4 v2] " Jan Kara
2017-03-09 10:16 ` [PATCH 1/4] block: Allow bdi re-registration Jan Kara
2017-03-09 16:39   ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).