linux-kernel.vger.kernel.org archive mirror
* [PATCH] zram: rework reset and destroy path
@ 2015-02-03 16:15 Sergey Senozhatsky
  2015-02-03 23:42 ` Minchan Kim
  0 siblings, 1 reply; 5+ messages in thread
From: Sergey Senozhatsky @ 2015-02-03 16:15 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Ganesh Mahendran, Jerome Marchand, Nitin Gupta,
	Sergey Senozhatsky, linux-kernel, Sergey Senozhatsky

We need to move set_capacity(disk, 0) from reset_store() back into
zram_reset_device(), a catch by Ganesh Mahendran: otherwise the
set_capacity() calls from the init and reset paths can potentially
race with each other.

The problem is that zram_reset_device() is also called from
zram_exit(), which performs operations in a misleadingly reversed
order -- we first create_device() and then init it, while zram_exit()
performs destroy_device() first and only then calls
zram_reset_device(). This is done so that the sysfs group is removed
before we reset the device, and the device reset/destruction cannot
be raced by a sysfs attr write (e.g. disksize).

Apart from that, destroy_device() releases zram->disk (while we still
hold the ->disk pointer), so we cannot access zram->disk in the later
zram_reset_device() call, which may cause additional errors in the
future.

So, this patch reworks and cleans up the destroy path:

1) remove several unneeded goto labels in zram_init()
2) factor out the zram_init() error path and zram_exit() into a
destroy_devices() function, which takes the number of devices to
destroy as its argument
3) remove the sysfs group in destroy_devices() first, so we can
reorder operations -- device reset (as expected) now goes before disk
destruction and queue cleanup, and we can always access ->disk in
zram_reset_device()
4) and, finally, move set_capacity() back under ->init_lock

Reported-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
---
 drivers/block/zram/zram_drv.c | 71 ++++++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 41 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a32069f..7d2e86f 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -734,8 +734,9 @@ static void zram_reset_device(struct zram *zram)
 	zram->meta = NULL;
 	/* Reset stats */
 	memset(&zram->stats, 0, sizeof(zram->stats));
-
 	zram->disksize = 0;
+	set_capacity(zram->disk, 0);
+
 	up_write(&zram->init_lock);
 }
 
@@ -828,7 +829,6 @@ static ssize_t reset_store(struct device *dev,
 	/* Make sure all pending I/O is finished */
 	fsync_bdev(bdev);
 	zram_reset_device(zram);
-	set_capacity(zram->disk, 0);
 
 	mutex_unlock(&bdev->bd_mutex);
 	revalidate_disk(zram->disk);
@@ -1114,15 +1114,29 @@ out:
 	return ret;
 }
 
-static void destroy_device(struct zram *zram)
+static void destroy_devices(unsigned int nr)
 {
-	sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
-			&zram_disk_attr_group);
+	struct zram *zram;
+	unsigned int i;
 
-	del_gendisk(zram->disk);
-	put_disk(zram->disk);
+	for (i = 0; i < nr; i++) {
+		zram = &zram_devices[i];
+		/* remove sysfs first, so no one will perform disksize
+		 * store while we destroying devices */
+		sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
+				&zram_disk_attr_group);
 
-	blk_cleanup_queue(zram->queue);
+		zram_reset_device(zram);
+
+		del_gendisk(zram->disk);
+		put_disk(zram->disk);
+
+		blk_cleanup_queue(zram->queue);
+	}
+
+	kfree(zram_devices);
+	unregister_blkdev(zram_major, "zram");
+	pr_debug("Destroyed %u device(s)\n", nr);
 }
 
 static int __init zram_init(void)
@@ -1132,64 +1146,39 @@ static int __init zram_init(void)
 	if (num_devices > max_num_devices) {
 		pr_warn("Invalid value for num_devices: %u\n",
 				num_devices);
-		ret = -EINVAL;
-		goto out;
+		return -EINVAL;
 	}
 
 	zram_major = register_blkdev(0, "zram");
 	if (zram_major <= 0) {
 		pr_warn("Unable to get major number\n");
-		ret = -EBUSY;
-		goto out;
+		return -EBUSY;
 	}
 
 	/* Allocate the device array and initialize each one */
 	zram_devices = kzalloc(num_devices * sizeof(struct zram), GFP_KERNEL);
 	if (!zram_devices) {
 		ret = -ENOMEM;
-		goto unregister;
+		goto out_error;
 	}
 
 	for (dev_id = 0; dev_id < num_devices; dev_id++) {
 		ret = create_device(&zram_devices[dev_id], dev_id);
 		if (ret)
-			goto free_devices;
+			goto out_error;
 	}
 
-	pr_info("Created %u device(s) ...\n", num_devices);
-
+	pr_info("Created %u device(s)\n", num_devices);
 	return 0;
 
-free_devices:
-	while (dev_id)
-		destroy_device(&zram_devices[--dev_id]);
-	kfree(zram_devices);
-unregister:
-	unregister_blkdev(zram_major, "zram");
-out:
+out_error:
+	destroy_devices(dev_id);
 	return ret;
 }
 
 static void __exit zram_exit(void)
 {
-	int i;
-	struct zram *zram;
-
-	for (i = 0; i < num_devices; i++) {
-		zram = &zram_devices[i];
-
-		destroy_device(zram);
-		/*
-		 * Shouldn't access zram->disk after destroy_device
-		 * because destroy_device already released zram->disk.
-		 */
-		zram_reset_device(zram);
-	}
-
-	unregister_blkdev(zram_major, "zram");
-
-	kfree(zram_devices);
-	pr_debug("Cleanup done!\n");
+	destroy_devices(num_devices);
 }
 
 module_init(zram_init);
-- 
2.3.0.rc2



* Re: [PATCH] zram: rework reset and destroy path
  2015-02-03 16:15 [PATCH] zram: rework reset and destroy path Sergey Senozhatsky
@ 2015-02-03 23:42 ` Minchan Kim
  2015-02-04  0:24   ` Sergey Senozhatsky
  0 siblings, 1 reply; 5+ messages in thread
From: Minchan Kim @ 2015-02-03 23:42 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Ganesh Mahendran, Jerome Marchand, Nitin Gupta,
	Sergey Senozhatsky, linux-kernel, nefelim4ag, eternaleye

Hello, Sergey,

On Wed, Feb 04, 2015 at 01:15:06AM +0900, Sergey Senozhatsky wrote:
> We need to move set_capacity(disk, 0) from reset_store() back into
> zram_reset_device(), a catch by Ganesh Mahendran: otherwise the
> set_capacity() calls from the init and reset paths can potentially
> race with each other.
> 
> The problem is that zram_reset_device() is also called from
> zram_exit(), which performs operations in a misleadingly reversed
> order -- we first create_device() and then init it, while zram_exit()
> performs destroy_device() first and only then calls
> zram_reset_device(). This is done so that the sysfs group is removed
> before we reset the device, and the device reset/destruction cannot
> be raced by a sysfs attr write (e.g. disksize).
> 
> Apart from that, destroy_device() releases zram->disk (while we still
> hold the ->disk pointer), so we cannot access zram->disk in the later
> zram_reset_device() call, which may cause additional errors in the
> future.
> 
> So, this patch reworks and cleans up the destroy path:
> 
> 1) remove several unneeded goto labels in zram_init()
> 2) factor out the zram_init() error path and zram_exit() into a
> destroy_devices() function, which takes the number of devices to
> destroy as its argument
> 3) remove the sysfs group in destroy_devices() first, so we can
> reorder operations -- device reset (as expected) now goes before disk
> destruction and queue cleanup, and we can always access ->disk in
> zram_reset_device()
> 4) and, finally, move set_capacity() back under ->init_lock
> 
> Reported-by: Ganesh Mahendran <opensource.ganesh@gmail.com>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

Looks good to me. Minor nit below.

> ---
>  drivers/block/zram/zram_drv.c | 71 ++++++++++++++++++-------------------------
>  1 file changed, 30 insertions(+), 41 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index a32069f..7d2e86f 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -734,8 +734,9 @@ static void zram_reset_device(struct zram *zram)
>  	zram->meta = NULL;
>  	/* Reset stats */
>  	memset(&zram->stats, 0, sizeof(zram->stats));
> -
>  	zram->disksize = 0;
> +	set_capacity(zram->disk, 0);
> +
>  	up_write(&zram->init_lock);
>  }
>  
> @@ -828,7 +829,6 @@ static ssize_t reset_store(struct device *dev,
>  	/* Make sure all pending I/O is finished */
>  	fsync_bdev(bdev);
>  	zram_reset_device(zram);
> -	set_capacity(zram->disk, 0);
>  
>  	mutex_unlock(&bdev->bd_mutex);
>  	revalidate_disk(zram->disk);
> @@ -1114,15 +1114,29 @@ out:
>  	return ret;
>  }
>  
> -static void destroy_device(struct zram *zram)
> +static void destroy_devices(unsigned int nr)
>  {
> -	sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
> -			&zram_disk_attr_group);
> +	struct zram *zram;
> +	unsigned int i;
>  
> -	del_gendisk(zram->disk);
> -	put_disk(zram->disk);
> +	for (i = 0; i < nr; i++) {
> +		zram = &zram_devices[i];
> +		/* remove sysfs first, so no one will perform disksize
> +		 * store while we destroying devices */
> +		sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
> +				&zram_disk_attr_group);
>  
> -	blk_cleanup_queue(zram->queue);
> +		zram_reset_device(zram);
> +
> +		del_gendisk(zram->disk);
> +		put_disk(zram->disk);
> +
> +		blk_cleanup_queue(zram->queue);
> +	}
> +
> +	kfree(zram_devices);
> +	unregister_blkdev(zram_major, "zram");
> +	pr_debug("Destroyed %u device(s)\n", nr);

create_device() just shows the number of created devices, so I
think it is not worth emitting per-device information in
destroy_devices(). Let's just print a "cleanup done" message,
like the old zram_exit() did, but use pr_info() instead of pr_debug().

Another concern: I'd like to keep a per-device interface (e.g.
create_device, destroy_device), because there was a requirement
to add new zram devices dynamically. I guess you remember that.
Although I didn't have enough time to respond, Alex finally
convinced me, so I hope a contributor who has the time and
interest will do it.
For that, a per-device create/destroy interface looks better:
https://lkml.org/lkml/2014/8/8/142
Anyway, I don't expect it to happen soon, so I'm not strongly
against your patch (i.e. create_device, destroy_devices), because
I think we can refactor it when we need it.

Thanks.


>  }
>  
>  static int __init zram_init(void)
> @@ -1132,64 +1146,39 @@ static int __init zram_init(void)
>  	if (num_devices > max_num_devices) {
>  		pr_warn("Invalid value for num_devices: %u\n",
>  				num_devices);
> -		ret = -EINVAL;
> -		goto out;
> +		return -EINVAL;
>  	}
>  
>  	zram_major = register_blkdev(0, "zram");
>  	if (zram_major <= 0) {
>  		pr_warn("Unable to get major number\n");
> -		ret = -EBUSY;
> -		goto out;
> +		return -EBUSY;
>  	}
>  
>  	/* Allocate the device array and initialize each one */
>  	zram_devices = kzalloc(num_devices * sizeof(struct zram), GFP_KERNEL);
>  	if (!zram_devices) {
>  		ret = -ENOMEM;
> -		goto unregister;
> +		goto out_error;
>  	}
>  
>  	for (dev_id = 0; dev_id < num_devices; dev_id++) {
>  		ret = create_device(&zram_devices[dev_id], dev_id);
>  		if (ret)
> -			goto free_devices;
> +			goto out_error;
>  	}
>  
> -	pr_info("Created %u device(s) ...\n", num_devices);
> -
> +	pr_info("Created %u device(s)\n", num_devices);
>  	return 0;
>  
> -free_devices:
> -	while (dev_id)
> -		destroy_device(&zram_devices[--dev_id]);
> -	kfree(zram_devices);
> -unregister:
> -	unregister_blkdev(zram_major, "zram");
> -out:
> +out_error:
> +	destroy_devices(dev_id);
>  	return ret;
>  }
>  
>  static void __exit zram_exit(void)
>  {
> -	int i;
> -	struct zram *zram;
> -
> -	for (i = 0; i < num_devices; i++) {
> -		zram = &zram_devices[i];
> -
> -		destroy_device(zram);
> -		/*
> -		 * Shouldn't access zram->disk after destroy_device
> -		 * because destroy_device already released zram->disk.
> -		 */
> -		zram_reset_device(zram);
> -	}
> -
> -	unregister_blkdev(zram_major, "zram");
> -
> -	kfree(zram_devices);
> -	pr_debug("Cleanup done!\n");
> +	destroy_devices(num_devices);
>  }
>  
>  module_init(zram_init);
> -- 
> 2.3.0.rc2
> 

-- 
Kind regards,
Minchan Kim


* Re: [PATCH] zram: rework reset and destroy path
  2015-02-03 23:42 ` Minchan Kim
@ 2015-02-04  0:24   ` Sergey Senozhatsky
  2015-02-04  0:35     ` Minchan Kim
  0 siblings, 1 reply; 5+ messages in thread
From: Sergey Senozhatsky @ 2015-02-04  0:24 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Andrew Morton, Ganesh Mahendran,
	Jerome Marchand, Nitin Gupta, Sergey Senozhatsky, linux-kernel,
	nefelim4ag, eternaleye

Hello Minchan,

On (02/04/15 08:42), Minchan Kim wrote:
> > +	kfree(zram_devices);
> > +	unregister_blkdev(zram_major, "zram");
> > +	pr_debug("Destroyed %u device(s)\n", nr);
> 
> create_device() just shows the number of created devices, so I
> think it is not worth emitting per-device information in
> destroy_devices(). Let's just print a "cleanup done" message,
> like the old zram_exit() did, but use pr_info() instead of pr_debug().

It's not critical; let's keep it as is (it just mirrors the message from
init()), and I wouldn't say it's totally useless now: we allocate space
for the devices, disks, queues, etc., and we destroy it all here.

please see below.

> Another concern: I'd like to keep a per-device interface (e.g.
> create_device, destroy_device), because there was a requirement
> to add new zram devices dynamically. I guess you remember that.
> Although I didn't have enough time to respond, Alex finally
> convinced me, so I hope a contributor who has the time and
> interest will do it.

Yes, I was going to tell you that perhaps I'll do that. I had several
discussions on Google+ and it seems that people want to see this
feature, so I was going to ask your opinion.

The solution I'm thinking about now is to replace the fixed-size
zram_devices array with a list, protected by a mutex or a spinlock
(it doesn't matter at this point). This will change destroy_devices()
from iterating over an array to destroying each list entry.
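
A rough, untested sketch of that idea (zram_list, zram_devices_lock
and zram_add() are made-up names, used here only for illustration;
create_device() and zram_reset_device() are the existing helpers):

/* illustrative sketch only -- not part of this patch */
static LIST_HEAD(zram_list);
static DEFINE_MUTEX(zram_devices_lock);

/* called when a new device is requested */
static int zram_add(int device_id)
{
	struct zram *zram;
	int ret;

	zram = kzalloc(sizeof(*zram), GFP_KERNEL);
	if (!zram)
		return -ENOMEM;

	ret = create_device(zram, device_id);
	if (ret) {
		kfree(zram);
		return ret;
	}

	mutex_lock(&zram_devices_lock);
	list_add_tail(&zram->list, &zram_list);
	mutex_unlock(&zram_devices_lock);
	return 0;
}

/* destroy_devices() then simply walks the list */
static void destroy_devices(void)
{
	struct zram *zram, *tmp;

	mutex_lock(&zram_devices_lock);
	list_for_each_entry_safe(zram, tmp, &zram_list, list) {
		list_del(&zram->list);
		/* same per-device teardown as in the patch above */
		sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
				&zram_disk_attr_group);
		zram_reset_device(zram);
		del_gendisk(zram->disk);
		put_disk(zram->disk);
		blk_cleanup_queue(zram->queue);
		kfree(zram);
	}
	mutex_unlock(&zram_devices_lock);
	unregister_blkdev(zram_major, "zram");
}

(struct zram would additionally need a struct list_head member, and
zram_add() assumes per-device allocation instead of the current
zram_devices array.)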

So:
a)	pr_debug("Destroyed %u device(s)\n", nr)
   will then show the actual number of active devices at that time.

b) I'll refactor destroy_devices().

This rework will not make it into the upcoming merge window, so we'll
have enough time.

I haven't decided yet whether I want to keep released zram devices on
the list (an idle zram devices list, picking the first available device
when a user requests a new zram device) or destroy abandoned devices,
so that the list represents only active devices.


I tend to select the latter -- destroy unused zram devices. I don't
want to add another sysfs knob to limit the number of idle devices,
waste memory on idle devices, etc. We already have a number of quite
complicated knobs.


	-ss

> For that, a per-device create/destroy interface looks better:
> https://lkml.org/lkml/2014/8/8/142
> Anyway, I don't expect it to happen soon, so I'm not strongly
> against your patch (i.e. create_device, destroy_devices), because
> I think we can refactor it when we need it.
> 


* Re: [PATCH] zram: rework reset and destroy path
  2015-02-04  0:24   ` Sergey Senozhatsky
@ 2015-02-04  0:35     ` Minchan Kim
  2015-02-04  0:45       ` Sergey Senozhatsky
  0 siblings, 1 reply; 5+ messages in thread
From: Minchan Kim @ 2015-02-04  0:35 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Andrew Morton, Ganesh Mahendran,
	Jerome Marchand, Nitin Gupta, linux-kernel, nefelim4ag,
	eternaleye

On Wed, Feb 04, 2015 at 09:24:51AM +0900, Sergey Senozhatsky wrote:
> Hello Minchan,
> 
> On (02/04/15 08:42), Minchan Kim wrote:
> > > +	kfree(zram_devices);
> > > +	unregister_blkdev(zram_major, "zram");
> > > +	pr_debug("Destroyed %u device(s)\n", nr);
> > 
> > create_device() just shows the number of created devices, so I
> > think it is not worth emitting per-device information in
> > destroy_devices(). Let's just print a "cleanup done" message,
> > like the old zram_exit() did, but use pr_info() instead of pr_debug().
> 
> It's not critical; let's keep it as is (it just mirrors the message from
> init()), and I wouldn't say it's totally useless now: we allocate space
> for the devices, disks, queues, etc., and we destroy it all here.

As I said, it is just a minor point, but I'd at least like to change
pr_debug() to pr_info().

> 
> please see below.
> 
> > Another concern: I'd like to keep a per-device interface (e.g.
> > create_device, destroy_device), because there was a requirement
> > to add new zram devices dynamically. I guess you remember that.
> > Although I didn't have enough time to respond, Alex finally
> > convinced me, so I hope a contributor who has the time and
> > interest will do it.
> 
> Yes, I was going to tell you that perhaps I'll do that. I had several
> discussions on Google+ and it seems that people want to see this
> feature, so I was going to ask your opinion.
> 
> The solution I'm thinking about now is to replace the fixed-size
> zram_devices array with a list, protected by a mutex or a spinlock
> (it doesn't matter at this point). This will change destroy_devices()
> from iterating over an array to destroying each list entry.

Yeb.

> 
> So:
> a)	pr_debug("Destroyed %u device(s)\n", nr)
>    will then show the actual number of active devices at that time.
> 
> b) I'll refactor destroy_devices().
> 
> This rework will not make it into the upcoming merge window, so we'll
> have enough time.
> 
> I haven't decided yet whether I want to keep released zram devices on
> the list (an idle zram devices list, picking the first available device
> when a user requests a new zram device) or destroy abandoned devices,
> so that the list represents only active devices.

I prefer destroying abandoned devices.

> 
> 
> I tend to select the latter -- destroy unused zram devices. I don't
> want to add another sysfs knob to limit the number of idle devices,
> waste memory on idle devices, etc. We already have a number of quite
> complicated knobs.

True.

-- 
Kind regards,
Minchan Kim


* Re: [PATCH] zram: rework reset and destroy path
  2015-02-04  0:35     ` Minchan Kim
@ 2015-02-04  0:45       ` Sergey Senozhatsky
  0 siblings, 0 replies; 5+ messages in thread
From: Sergey Senozhatsky @ 2015-02-04  0:45 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Andrew Morton,
	Ganesh Mahendran, Jerome Marchand, Nitin Gupta, linux-kernel,
	nefelim4ag, eternaleye

On (02/04/15 09:35), Minchan Kim wrote:
> > > create_device() just shows the number of created devices, so I
> > > think it is not worth emitting per-device information in
> > > destroy_devices(). Let's just print a "cleanup done" message,
> > > like the old zram_exit() did, but use pr_info() instead of pr_debug().
> > 
> > It's not critical; let's keep it as is (it just mirrors the message from
> > init()), and I wouldn't say it's totally useless now: we allocate space
> > for the devices, disks, queues, etc., and we destroy it all here.
> 
> As I said, it is just a minor point, but I'd at least like to change
> pr_debug() to pr_info().

Ah, I didn't catch your pr_info() <-> pr_debug() note; I need more coffee.
The original message was indeed printed with pr_debug().
I will send a one-liner.

	-ss

