All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] zram: fix two races
@ 2021-10-15 12:16 Ming Lei
  2021-10-15 12:16 ` [PATCH 1/3] zram: replace fsync_bdev with sync_blockdev Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Ming Lei @ 2021-10-15 12:16 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Luis Chamberlain, Minchan Kim, Greg Kroah-Hartman, Ming Lei

Hello,

Fixes the following two issues reported by Luis Chamberlain by simpler
approach, meantime it is sort of simplification of handling resetting/removing
device vs. open().

- zram leak during unloading module, which is one race between resetting
and removing device

- zram resource leak in case that disksize_store is done after resetting
and before deleting gendisk in zram_remove()


Ming Lei (3):
  zram: replace fsync_bdev with sync_blockdev
  zram: simplify handling reset_store/remove vs. open
  zram: avoid race between zram_remove and disksize_store

 drivers/block/zram/zram_drv.c | 62 ++++++++++++++++-------------------
 drivers/block/zram/zram_drv.h |  4 ---
 2 files changed, 28 insertions(+), 38 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] zram: replace fsync_bdev with sync_blockdev
  2021-10-15 12:16 [PATCH 0/3] zram: fix two races Ming Lei
@ 2021-10-15 12:16 ` Ming Lei
  2021-10-15 12:16 ` [PATCH 2/3] zram: simplify handling reset_store/remove vs. open Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2021-10-15 12:16 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Luis Chamberlain, Minchan Kim, Greg Kroah-Hartman, Ming Lei

When calling fsync_bdev(), zram driver guarantees that the bdev won't be
opened by anyone, then there can't be one active fs/superblock over the
zram bdev, so replace fsync_bdev with sync_blockdev.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/zram/zram_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fcaf2750f68f..6f6f6a3fee0e 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1793,7 +1793,7 @@ static ssize_t reset_store(struct device *dev,
 	mutex_unlock(&bdev->bd_disk->open_mutex);
 
 	/* Make sure all the pending I/O are finished */
-	fsync_bdev(bdev);
+	sync_blockdev(bdev);
 	zram_reset_device(zram);
 
 	mutex_lock(&bdev->bd_disk->open_mutex);
@@ -1984,7 +1984,7 @@ static int zram_remove(struct zram *zram)
 	zram_debugfs_unregister(zram);
 
 	/* Make sure all the pending I/O are finished */
-	fsync_bdev(bdev);
+	sync_blockdev(bdev);
 	zram_reset_device(zram);
 
 	pr_info("Removed device: %s\n", zram->disk->disk_name);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] zram: simplify handling reset_store/remove vs. open
  2021-10-15 12:16 [PATCH 0/3] zram: fix two races Ming Lei
  2021-10-15 12:16 ` [PATCH 1/3] zram: replace fsync_bdev with sync_blockdev Ming Lei
@ 2021-10-15 12:16 ` Ming Lei
  2021-10-15 12:16 ` [PATCH 3/3] zram: avoid race between zram_remove and disksize_store Ming Lei
  2021-10-15 17:32 ` [PATCH 0/3] zram: fix two races Luis Chamberlain
  3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2021-10-15 12:16 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Luis Chamberlain, Minchan Kim, Greg Kroah-Hartman, Ming Lei

Before resetting and removing one zram disk, disk->open_mutex is acquired,
and ->claim is set to prevent new open and zram_remove from being done,
then disk->open_mutex is released, and flush dirty pages and reset the zram
disk.

Turns out it needn't to be so complicated, we can simply flush dirty
pages and reset the zram disk with holding disk->open_mutex. Meantime
we can't fail to remove zram in case that it is running from module
exit, otherwise we will leak the zram device.

bdev_disk_changed() already runs sync_blockdev() with holding
disk->open_mutex, so this way is safe. Also it is safe to call
zram_reset_device() with ->open_mutex too, just we need to deal with
lockdep false warning since backing_dev_store() holds init lock before
acquiring backing device's open_mutex.

More importantly, we can avoid to fail removing zram when unloading module,
then zram device won't be leaked and the warning of 'Error: Removing state
63 which has instances left.' can be fixed.

Reported-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/zram/zram_drv.c | 53 +++++++++++++----------------------
 drivers/block/zram/zram_drv.h |  4 ---
 2 files changed, 20 insertions(+), 37 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 6f6f6a3fee0e..d374c0f46681 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -55,6 +55,8 @@ static size_t huge_class_size;
 static const struct block_device_operations zram_devops;
 static const struct block_device_operations zram_wb_devops;
 
+static struct lock_class_key zram_open_lock_key;
+
 static void zram_free_page(struct zram *zram, size_t index);
 static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
 				u32 index, int offset, struct bio *bio);
@@ -1782,44 +1784,21 @@ static ssize_t reset_store(struct device *dev,
 	bdev = zram->disk->part0;
 
 	mutex_lock(&bdev->bd_disk->open_mutex);
-	/* Do not reset an active device or claimed device */
-	if (bdev->bd_openers || zram->claim) {
+	/* Do not reset an active device */
+	if (bdev->bd_openers) {
 		mutex_unlock(&bdev->bd_disk->open_mutex);
 		return -EBUSY;
 	}
 
-	/* From now on, anyone can't open /dev/zram[0-9] */
-	zram->claim = true;
-	mutex_unlock(&bdev->bd_disk->open_mutex);
-
 	/* Make sure all the pending I/O are finished */
 	sync_blockdev(bdev);
 	zram_reset_device(zram);
-
-	mutex_lock(&bdev->bd_disk->open_mutex);
-	zram->claim = false;
 	mutex_unlock(&bdev->bd_disk->open_mutex);
 
 	return len;
 }
 
-static int zram_open(struct block_device *bdev, fmode_t mode)
-{
-	int ret = 0;
-	struct zram *zram;
-
-	WARN_ON(!mutex_is_locked(&bdev->bd_disk->open_mutex));
-
-	zram = bdev->bd_disk->private_data;
-	/* zram was claimed to reset so open request fails */
-	if (zram->claim)
-		ret = -EBUSY;
-
-	return ret;
-}
-
 static const struct block_device_operations zram_devops = {
-	.open = zram_open,
 	.submit_bio = zram_submit_bio,
 	.swap_slot_free_notify = zram_slot_free_notify,
 	.rw_page = zram_rw_page,
@@ -1827,7 +1806,6 @@ static const struct block_device_operations zram_devops = {
 };
 
 static const struct block_device_operations zram_wb_devops = {
-	.open = zram_open,
 	.submit_bio = zram_submit_bio,
 	.swap_slot_free_notify = zram_slot_free_notify,
 	.owner = THIS_MODULE
@@ -1922,6 +1900,13 @@ static int zram_add(void)
 	zram->disk->private_data = zram;
 	snprintf(zram->disk->disk_name, 16, "zram%d", device_id);
 
+	/*
+	 * avoid false warning on disk->open_mutex because we hold
+	 * init_lock before acquiring backing disk's open_mutex, see
+	 * backing_dev_store()
+	 */
+	lockdep_set_class(&zram->disk->open_mutex, &zram_open_lock_key);
+
 	/* Actual capacity set using syfs (/sys/block/zram<id>/disksize */
 	set_capacity(zram->disk, 0);
 	/* zram devices sort of resembles non-rotational disks */
@@ -1973,19 +1958,17 @@ static int zram_remove(struct zram *zram)
 	struct block_device *bdev = zram->disk->part0;
 
 	mutex_lock(&bdev->bd_disk->open_mutex);
-	if (bdev->bd_openers || zram->claim) {
+	if (bdev->bd_openers) {
 		mutex_unlock(&bdev->bd_disk->open_mutex);
 		return -EBUSY;
 	}
 
-	zram->claim = true;
-	mutex_unlock(&bdev->bd_disk->open_mutex);
-
-	zram_debugfs_unregister(zram);
-
 	/* Make sure all the pending I/O are finished */
 	sync_blockdev(bdev);
 	zram_reset_device(zram);
+	mutex_unlock(&bdev->bd_disk->open_mutex);
+
+	zram_debugfs_unregister(zram);
 
 	pr_info("Removed device: %s\n", zram->disk->disk_name);
 
@@ -2066,7 +2049,11 @@ static struct class zram_control_class = {
 
 static int zram_remove_cb(int id, void *ptr, void *data)
 {
-	zram_remove(ptr);
+	/*
+	 * No one should own the device during module_exit, otherwise zram
+	 * module refcount has to be handled wrong by block layer.
+	 */
+	WARN_ON_ONCE(zram_remove(ptr));
 	return 0;
 }
 
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 80c3b43b4828..ad8564c8a52a 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -109,10 +109,6 @@ struct zram {
 	 */
 	u64 disksize;	/* bytes */
 	char compressor[CRYPTO_MAX_ALG_NAME];
-	/*
-	 * zram is claimed so open request will be failed
-	 */
-	bool claim; /* Protected by disk->open_mutex */
 #ifdef CONFIG_ZRAM_WRITEBACK
 	struct file *backing_dev;
 	spinlock_t wb_limit_lock;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] zram: avoid race between zram_remove and disksize_store
  2021-10-15 12:16 [PATCH 0/3] zram: fix two races Ming Lei
  2021-10-15 12:16 ` [PATCH 1/3] zram: replace fsync_bdev with sync_blockdev Ming Lei
  2021-10-15 12:16 ` [PATCH 2/3] zram: simplify handling reset_store/remove vs. open Ming Lei
@ 2021-10-15 12:16 ` Ming Lei
  2021-10-15 17:32 ` [PATCH 0/3] zram: fix two races Luis Chamberlain
  3 siblings, 0 replies; 5+ messages in thread
From: Ming Lei @ 2021-10-15 12:16 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Luis Chamberlain, Minchan Kim, Greg Kroah-Hartman, Ming Lei

After resetting device in zram_remove(), disksize_store still may come and
allocate resources again before deleting gendisk, fix the race by resetting
zram after del_gendisk() returns. At that time, disksize_store can't come
any more.

Reported-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/zram/zram_drv.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d374c0f46681..71c4ac26c57b 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1973,6 +1973,13 @@ static int zram_remove(struct zram *zram)
 	pr_info("Removed device: %s\n", zram->disk->disk_name);
 
 	del_gendisk(zram->disk);
+
+	/*
+	 * disksize store may come after the above zram_reset_device
+	 * returns, so run the last reset to avoid the race
+	 */
+	zram_reset_device(zram);
+
 	blk_cleanup_disk(zram->disk);
 	kfree(zram);
 	return 0;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] zram: fix two races
  2021-10-15 12:16 [PATCH 0/3] zram: fix two races Ming Lei
                   ` (2 preceding siblings ...)
  2021-10-15 12:16 ` [PATCH 3/3] zram: avoid race between zram_remove and disksize_store Ming Lei
@ 2021-10-15 17:32 ` Luis Chamberlain
  3 siblings, 0 replies; 5+ messages in thread
From: Luis Chamberlain @ 2021-10-15 17:32 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Minchan Kim, Greg Kroah-Hartman

On Fri, Oct 15, 2021 at 08:16:49PM +0800, Ming Lei wrote:
> Hello,
> 
> Fixes the following two issues reported by Luis Chamberlain by simpler
> approach, meantime it is sort of simplification of handling resetting/removing
> device vs. open().
> 
> - zram leak during unloading module, which is one race between resetting
> and removing device
> 
> - zram resource leak in case that disksize_store is done after resetting
> and before deleting gendisk in zram_remove()

As noted in the other thread, unfortunately this is not enough, and can
leave the driver in a bad state. So either more work is needed or my
alternative patch can be considered. But I do understand the desire to
avoid the mutex on removal, if we can do that great.

  Luis

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-10-15 17:33 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-15 12:16 [PATCH 0/3] zram: fix two races Ming Lei
2021-10-15 12:16 ` [PATCH 1/3] zram: replace fsync_bdev with sync_blockdev Ming Lei
2021-10-15 12:16 ` [PATCH 2/3] zram: simplify handling reset_store/remove vs. open Ming Lei
2021-10-15 12:16 ` [PATCH 3/3] zram: avoid race between zram_remove and disksize_store Ming Lei
2021-10-15 17:32 ` [PATCH 0/3] zram: fix two races Luis Chamberlain

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.