Date: Mon, 4 May 2015 11:28:17 +0900
From: Minchan Kim
To: Sergey Senozhatsky
Cc: Andrew Morton, Nitin Gupta, linux-kernel@vger.kernel.org,
	Sergey Senozhatsky
Subject: Re: [PATCHv3 9/9] zram: add dynamic device add/remove functionality
Message-ID: <20150504022816.GB14452@blaptop>
References: <1430140911-7818-10-git-send-email-sergey.senozhatsky@gmail.com>
	<20150429001624.GA3917@swordfish>
	<20150429064858.GA5125@blaptop>
	<20150429070218.GA616@swordfish>
	<20150429072328.GA2987@swordfish>
	<20150430054702.GA21771@blaptop>
	<20150430063457.GA950@swordfish>
	<20150430064436.GB21771@blaptop>
	<20150430065111.GC950@swordfish>
	<20150504022008.GA14452@blaptop>
In-Reply-To: <20150504022008.GA14452@blaptop>

On Mon, May 04, 2015 at 11:20:08AM +0900, Minchan Kim wrote:
> Hello Sergey,
>
> On Thu, Apr 30, 2015 at 03:51:12PM +0900, Sergey Senozhatsky wrote:
> > On (04/30/15 15:44), Minchan Kim wrote:
> > > > > I think the problem of deadlock is that you are trying to remove sysfs file
> > > > > in sysfs handler.
> > > > >
> > > > > #> echo 1 > /sys/xxx/zram_remove
> > > > >
> > > > > kernfs_fop_write - hold s_active
> > > > >   -> zram_remove_store
> > > > >     -> zram_remove
> > > > >       -> sysfs_remove_group - hold s_active *again*
> > > > >
> > > > > Right?
> > > >
> > > >
> > > > are those same s_active locks?
> > > >
> > > >
> > > > we hold (s_active#163) and (&bdev->bd_mutex) and want to acquire (s_active#162)
> > >
> > > Thanks for sharing the message.
> > > You're right. It's another lock so it shouldn't be a reason.
> > > Okay, I will review it. Please give me time.
> >
> >
> > sure, no problem and no rush. thanks!
>
> I had a time to think over it.
>
> I think your patch is rather tricky so someone cannot see sysfs
> although he already opened /dev/zram but after a while he can see sysfs.
> It's weird.
>
> I want to fix it in a more generic way. Otherwise, we might have trouble
> with locking problems sometime. We have already experienced it with
> init_lock although we finally fixed it.
>
> I think we can fix it with below patch. I hope it's a more general and
> right approach. It's based on your [zram: return zram device_id from zram_add()]
>
> What do you think?
>
> From e943df5407b880f9262ef959b270226fdc81bc9f Mon Sep 17 00:00:00 2001
> From: Minchan Kim
> Date: Mon, 4 May 2015 08:36:07 +0900
> Subject: [PATCH 1/2] zram: close race by open overriding
>
> [1] introduced bdev->bd_mutex to protect a race between mount
> and reset. At that time, we didn't have dynamic zram-add/remove
> feature so it was okay.
>
> However, as we introduce dynamic device feature, bd_mutex became
> trouble.
>
> CPU 0
>
> echo 1 > /sys/block/zram/reset
>   -> kernfs->s_active(A)
>     -> zram:reset_store->bd_mutex(B)
>
> CPU 1
>
> echo > /sys/class/zram/zram-remove
>   -> zram:zram_remove: bd_mutex(B)
>     -> sysfs_remove_group
>       -> kernfs->s_active(A)
>
> IOW, AB -> BA deadlock
>
> The reason we are holding bd_mutex for zram_remove is to prevent
> any incoming open /dev/zram[0-9]. Otherwise, we could remove zram
> others already have opened. But it causes above deadlock problem.
>
> To fix the problem, this patch overrides block_device.open so that it
> returns -EBUSY once zram has claimed the device for reset. Any incoming
> open then fails, so we don't need to hold bd_mutex for zram_remove
> any more.
>
> This patch is to prepare for zram-add/remove feature.
>
> [1] ba6b17: zram: fix umount-reset_store-mount race condition
> Signed-off-by: Minchan Kim

If above has no problem, we could apply your last patch on top of it.

From 5bfa8a2e312a9c8493f574b1cf513ef4693a465c Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky
Date: Mon, 4 May 2015 09:02:23 +0900
Subject: [PATCH 2/2] zram: add dynamic device add/remove functionality

We currently don't support on-demand device creation. The one and only way
to have N zram devices is to specify num_devices module parameter (default
value: 1). IOW if, for some reason, at some point, user wants to have
N + 1 devices he/she must umount all the existing devices, unload the module,
load the module passing num_devices equal to N + 1. And do this again, if
needed.

This patch introduces zram control sysfs class, which has two sysfs
attrs:
- zram_add	-- add a new zram device
- zram_remove	-- remove a specific (device_id) zram device

zram_add sysfs attr is read-only and has only automatic device id
assignment mode (as requested by Minchan Kim). A read operation performed
on this attr creates a new zram device and returns back its device_id or
error status.

Usage example:
	# add a new zram device
	cat /sys/class/zram-control/zram_add
	2

	# remove a specific zram device
	echo 4 > /sys/class/zram-control/zram_remove

Returning zram_add() error code back to user (-ENOMEM in this case)

	cat /sys/class/zram-control/zram_add
	cat: /sys/class/zram-control/zram_add: Cannot allocate memory

NOTE, there might be users who already depend on the fact that at least
zram0 device gets always created by zram_init(). Preserve this behavior.
[minchan]: use zram->claim to avoid lockdep splat
Reported-by: Minchan Kim
Signed-off-by: Sergey Senozhatsky
---
 Documentation/blockdev/zram.txt | 23 ++++++++--
 drivers/block/zram/zram_drv.c   | 97 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 114 insertions(+), 6 deletions(-)

diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
index 65e9430..fc686d4 100644
--- a/Documentation/blockdev/zram.txt
+++ b/Documentation/blockdev/zram.txt
@@ -99,7 +99,24 @@ size of the disk when not in use so a huge zram is wasteful.
 	mkfs.ext4 /dev/zram1
 	mount /dev/zram1 /tmp
 
-7) Stats:
+7) Add/remove zram devices
+
+zram provides a control interface, which enables dynamic (on-demand) device
+addition and removal.
+
+In order to add a new /dev/zramX device, perform read operation on zram_add
+attribute. This will return either new device's device id (meaning that you
+can use /dev/zram) or error code.
+
+Example:
+	cat /sys/class/zram-control/zram_add
+	1
+
+To remove the existing /dev/zramX device (where X is a device id)
+execute
+	echo X > /sys/class/zram-control/zram_remove
+
+8) Stats:
 Per-device statistics are exported as various nodes under /sys/block/zram/
 
 A brief description of exported device attritbutes.
For more details please
@@ -174,11 +191,11 @@ line of text and contains the following stats separated by whitespace:
 	zero_pages
 	num_migrated
 
-8) Deactivate:
+9) Deactivate:
 	swapoff /dev/zram0
 	umount /dev/zram1
 
-9) Reset:
+10) Reset:
 	Write any positive value to 'reset' sysfs node
 	echo 1 > /sys/block/zram0/reset
 	echo 1 > /sys/block/zram1/reset
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 7fb72dc..97cd4f3 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -29,10 +29,14 @@
 #include
 #include
 #include
+#include
 
 #include "zram_drv.h"
 
 static DEFINE_IDR(zram_index_idr);
+/* idr index must be protected */
+static DEFINE_MUTEX(zram_index_mutex);
+
 static int zram_major;
 static const char *default_compressor = "lzo";
 
@@ -1278,24 +1282,101 @@ out_free_dev:
 	return ret;
 }
 
-static void zram_remove(struct zram *zram)
+static int zram_remove(struct zram *zram)
 {
-	pr_info("Removed device: %s\n", zram->disk->disk_name);
+	struct block_device *bdev;
+
+	bdev = bdget_disk(zram->disk, 0);
+	if (!bdev)
+		return -ENOMEM;
+
+	mutex_lock(&bdev->bd_mutex);
+	if (bdev->bd_openers || zram->claim) {
+		mutex_unlock(&bdev->bd_mutex);
+		return -EBUSY;
+	}
+
+	zram->claim = true;
+	mutex_unlock(&bdev->bd_mutex);
+
 	/*
 	 * Remove sysfs first, so no one will perform a disksize
-	 * store while we destroy the devices
+	 * store while we destroy the devices. This also helps during
+	 * zram_remove() -- device_reset() is the last holder of
+	 * ->init_lock.
 	 */
 	sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
 			&zram_disk_attr_group);
 
+	/* Make sure all pending I/O is finished */
+	fsync_bdev(bdev);
 	zram_reset_device(zram);
+	bdput(bdev);
+
+	pr_info("Removed device: %s\n", zram->disk->disk_name);
+
 	idr_remove(&zram_index_idr, zram->disk->first_minor);
 	blk_cleanup_queue(zram->disk->queue);
 	del_gendisk(zram->disk);
 	put_disk(zram->disk);
 	kfree(zram);
+
+	return 0;
 }
 
+/* zram module control sysfs attributes */
+static ssize_t zram_add_show(struct class *class,
+			struct class_attribute *attr,
+			char *buf)
+{
+	int ret;
+
+	mutex_lock(&zram_index_mutex);
+	ret = zram_add();
+	mutex_unlock(&zram_index_mutex);
+
+	if (ret < 0)
+		return ret;
+	return scnprintf(buf, PAGE_SIZE, "%d\n", ret);
+}
+
+static ssize_t zram_remove_store(struct class *class,
+			struct class_attribute *attr,
+			const char *buf,
+			size_t count)
+{
+	struct zram *zram;
+	int ret, dev_id;
+
+	/* dev_id is gendisk->first_minor, which is `int' */
+	ret = kstrtoint(buf, 10, &dev_id);
+	if (ret || dev_id < 0)
+		return -EINVAL;
+
+	mutex_lock(&zram_index_mutex);
+
+	zram = idr_find(&zram_index_idr, dev_id);
+	if (zram)
+		ret = zram_remove(zram);
+	else
+		ret = -ENODEV;
+
+	mutex_unlock(&zram_index_mutex);
+	return ret ?
ret : count;
+}
+
+static struct class_attribute zram_control_class_attrs[] = {
+	__ATTR_RO(zram_add),
+	__ATTR_WO(zram_remove),
+	__ATTR_NULL,
+};
+
+static struct class zram_control_class = {
+	.name		= "zram-control",
+	.owner		= THIS_MODULE,
+	.class_attrs	= zram_control_class_attrs,
+};
+
 static int zram_remove_cb(int id, void *ptr, void *data)
 {
 	zram_remove(ptr);
@@ -1304,6 +1385,7 @@ static int zram_remove_cb(int id, void *ptr, void *data)
 
 static void destroy_devices(void)
 {
+	class_unregister(&zram_control_class);
 	idr_for_each(&zram_index_idr, &zram_remove_cb, NULL);
 	idr_destroy(&zram_index_idr);
 	unregister_blkdev(zram_major, "zram");
@@ -1313,14 +1395,23 @@ static int __init zram_init(void)
 {
 	int ret;
 
+	ret = class_register(&zram_control_class);
+	if (ret) {
+		pr_warn("Unable to register zram-control class\n");
+		return ret;
+	}
+
 	zram_major = register_blkdev(0, "zram");
 	if (zram_major <= 0) {
 		pr_warn("Unable to get major number\n");
+		class_unregister(&zram_control_class);
 		return -EBUSY;
 	}
 
 	while (num_devices != 0) {
+		mutex_lock(&zram_index_mutex);
 		ret = zram_add();
+		mutex_unlock(&zram_index_mutex);
 		if (ret < 0)
 			goto out_error;
 		num_devices--;
--
1.9.3

--
Kind regards,
Minchan Kim
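
For illustration only, the claim idea from this thread can be sketched in
userspace. This is not the driver code: struct fake_dev, fake_open(),
fake_release() and fake_claim() are hypothetical names, and a pthread mutex
stands in for bdev->bd_mutex. The sketch assumes the behaviour described in
the changelogs: open returns -EBUSY once the device is claimed, and the
remove/reset path takes the claim under the lock and then drops the lock
before doing any teardown.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a zram device (not the kernel struct). */
struct fake_dev {
	pthread_mutex_t lock;	/* plays the role of bdev->bd_mutex */
	int openers;		/* plays the role of bdev->bd_openers */
	bool claim;		/* plays the role of zram->claim */
};

#define FAKE_DEV_INIT { PTHREAD_MUTEX_INITIALIZER, 0, false }

/* Open path: refuse new openers once the device has been claimed. */
int fake_open(struct fake_dev *dev)
{
	int ret = 0;

	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	if (dev->claim)
		ret = -EBUSY;
	else
		dev->openers++;
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

/* Release path: drop one opener. */
void fake_release(struct fake_dev *dev)
{
	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	if (dev->openers > 0)
		dev->openers--;
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Remove/reset path: check and set the claim under the lock, then
 * return with the lock released. The caller does its teardown without
 * holding the lock, so teardown never nests another lock inside it.
 */
int fake_claim(struct fake_dev *dev)
{
	int ret = 0;

	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	if (dev->openers || dev->claim)
		ret = -EBUSY;
	else
		dev->claim = true;
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

A caller would run fake_claim() first and tear the device down only if it
returns 0. From that point fake_open() fails with -EBUSY, so no new opener
can appear while teardown runs unlocked.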