From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751956AbcFNJdp (ORCPT ); Tue, 14 Jun 2016 05:33:45 -0400
Received: from mail-wm0-f67.google.com ([74.125.82.67]:36757 "EHLO
	mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750804AbcFNJdn (ORCPT ); Tue, 14 Jun 2016 05:33:43 -0400
MIME-Version: 1.0
In-Reply-To: <2280543.pBfpMHAWFW@adelgunde>
References: <3898019.JRqDjBPssX@adelgunde>
	<1464863101-16805-1-git-send-email-pranjas@gmail.com>
	<1464863101-16805-5-git-send-email-pranjas@gmail.com>
	<2280543.pBfpMHAWFW@adelgunde>
From: Pranay Srivastava
Date: Tue, 14 Jun 2016 15:03:40 +0530
Message-ID:
Subject: Re: [PATCH v2 4/5]nbd: make nbd device wait for its users.
To: Markus Pargmann
Cc: nbd-general@lists.sourceforge.net, linux-kernel@vger.kernel.org
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Markus,

On Tue, Jun 14, 2016 at 2:29 PM, Markus Pargmann wrote:
>
> On Thursday 02 June 2016 13:25:00 Pranay Kr. Srivastava wrote:
> > When a timeout occurs or a recv fails, then
> > instead of abruplty killing nbd block device
> > wait for it's users to finish.
> >
> > This is more required when filesystem(s) like
> > ext2 or ext3 don't expect their buffer heads to
> > disappear while the filesystem is mounted.
> >
> > Each open of a nbd device is refcounted, while
> > the userland program [nbd-client] doing the
> > NBD_DO_IT ioctl would now wait for any other users
> > of this device before invalidating the nbd device.
> >
> > Signed-off-by: Pranay Kr. Srivastava
> > ---
> >  drivers/block/nbd.c | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 58 insertions(+)
> >
> > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > index d1d898d..4da40dc 100644
> > --- a/drivers/block/nbd.c
> > +++ b/drivers/block/nbd.c
> > @@ -70,10 +70,13 @@ struct nbd_device {
> >  #if IS_ENABLED(CONFIG_DEBUG_FS)
> >  	struct dentry *dbg_dir;
> >  #endif
> > +	atomic_t inuse;
> >  	/*
> >  	 *This is specifically for calling sock_shutdown, for now.
> >  	 */
> >  	struct work_struct ws_shutdown;
> > +	struct kref users;
> > +	struct completion user_completion;
> >  };
> >
> >  #if IS_ENABLED(CONFIG_DEBUG_FS)
> > @@ -104,6 +107,7 @@ static DEFINE_SPINLOCK(nbd_lock);
> >   * Shutdown function for nbd_dev work struct.
> >   */
> >  static void nbd_ws_func_shutdown(struct work_struct *);
> > +static void nbd_kref_release(struct kref *);
> >
> >  static inline struct device *nbd_to_dev(struct nbd_device *nbd)
> >  {
> > @@ -682,6 +686,8 @@ static void nbd_reset(struct nbd_device *nbd)
> >  	nbd->flags = 0;
> >  	nbd->xmit_timeout = 0;
> >  	INIT_WORK(&nbd->ws_shutdown, nbd_ws_func_shutdown);
> > +	init_completion(&nbd->user_completion);
> > +	kref_init(&nbd->users);
> >  	queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, nbd->disk->queue);
> >  	del_timer_sync(&nbd->timeout_timer);
> >  }
> > @@ -815,6 +821,14 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
> >  		kthread_stop(thread);
> >
> >  		sock_shutdown(nbd);
> > +		/*
> > +		 * kref_init initializes with ref count as 1,
> > +		 * nbd_client, or the user-land program executing
> > +		 * this ioctl will make the refcount to 2[at least]
> > +		 * so subtracting 2 from refcount.
> > +		 */
> > +		kref_sub(&nbd->users, 2, nbd_kref_release);
>
> Why don't you use a kref_put?

Ok, so I'll try to explain as I've understood the problem.

When the module is loaded the kref is initialized to 1.
Suppose someone has now started an nbd-client [nbdC-1]; its open of the
device raises the refcount to 2. So far so good...

Now let's say the device is being shut down via a second nbd-client
[nbdC-2], whose own open of the device takes the refcount to 3.

nbdC-1 subtracts two from the refcount. It has to do this in NBD_DO_IT,
since its device file will not be closed until after the ioctl is over,
and it then calls wait_for_completion.

nbdC-2 now closes its use of the device file. This brings the refcount
to zero and triggers the completion, which lets nbdC-1 finish.

Now, we don't want to trigger a kref_put when nbdC-1 closes the device
file, so the kref_put needs to be conditional there; that is what inuse
is used for.

>
> > +		wait_for_completion(&nbd->user_completion);
> >  		mutex_lock(&nbd->tx_lock);
> >  		nbd_clear_que(nbd);
> >  		kill_bdev(bdev);
> > @@ -865,13 +879,56 @@ static int nbd_ioctl(struct block_device *bdev, fmode_t mode,
> >
> >  	return error;
> >  }
> > +static void nbd_kref_release(struct kref *kref_users)
> > +{
> > +	struct nbd_device *nbd = container_of(kref_users, struct nbd_device,
> > +			users);
>
> Not indented to opening bracket.
>
> > +	pr_debug("Releasing kref [%s]\n", __func__);
> > +	atomic_set(&nbd->inuse, 0);
> > +	complete(&nbd->user_completion);
> > +
> > +}
> > +
> > +static int nbd_open(struct block_device *bdev, fmode_t mode)
> > +{
> > +	struct nbd_device *nbd_dev = bdev->bd_disk->private_data;
> > +
> > +	if (kref_get_unless_zero(&nbd_dev->users))
> > +		atomic_set(&nbd_dev->inuse, 1);
> > +
> > +	pr_debug("Opening nbd_dev %s. Active users = %u\n",
> > +			bdev->bd_disk->disk_name,
> > +			atomic_read(&nbd_dev->users.refcount) - 1);
>
> Indent to opening bracket.
>
> > +	return 0;
> > +}
> > +
> > +static void nbd_release(struct gendisk *disk, fmode_t mode)
> > +{
> > +	struct nbd_device *nbd_dev = disk->private_data;
> > +	/*
> > +	 *kref_init initializes ref count to 1, so we
> > +	 *we check for refcount to be 2 for a final put.
> > +	 *
> > +	 *kref needs to be re-initialized just here as the
> > +	 *other process holding it must see the ref count as 2.
> > +	 */
> > +	if (atomic_read(&nbd_dev->inuse))
> > +		kref_put(&nbd_dev->users, nbd_kref_release);
>
> What is this inuse atomic for? Everyone that releases the nbd device
> will need to execute a kref_put().

To do away with inuse, perhaps we could do a kref_get just before
leaving NBD_DO_IT, so that everyone does a kref_put when the device
file is closed?

However, there is a small race window while the kref is being
initialized and another process [not just nbd-client] is trying to
open the device. Do you think it would be better to handle this by
introducing a spin_lock instead of the atomic?

Let me know if my understanding is correct.

>
> Best Regards,
>
> Markus
>
> > +
> > +	pr_debug("Closing nbd_dev %s. Active users = %u\n",
> > +			disk->disk_name,
> > +			atomic_read(&nbd_dev->users.refcount) - 1);
> > +}
> >
> >  static const struct block_device_operations nbd_fops = {
> >  	.owner = THIS_MODULE,
> >  	.ioctl = nbd_ioctl,
> >  	.compat_ioctl = nbd_ioctl,
> > +	.open = nbd_open,
> > +	.release = nbd_release
> >  };
> >
> > +
> >  static void nbd_ws_func_shutdown(struct work_struct *ws_nbd)
> >  {
> >  	struct nbd_device *nbd_dev = container_of(ws_nbd, struct nbd_device,
> > @@ -1107,6 +1164,7 @@ static int __init nbd_init(void)
> >  		disk->fops = &nbd_fops;
> >  		disk->private_data = &nbd_dev[i];
> >  		sprintf(disk->disk_name, "nbd%d", i);
> > +		atomic_set(&nbd_dev[i].inuse, 0);
> >  		nbd_reset(&nbd_dev[i]);
> >  		add_disk(disk);
> >  	}
> >
>
> --
> Pengutronix e.K.                           |                             |
> Industrial Linux Solutions                 | http://www.pengutronix.de/  |
> Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

--
        ---P.K.S
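
P.S. For anyone following the thread, the refcount sequence described
above can be replayed in user space. This is only a rough sketch under
stated assumptions: struct nbd_model and the model_* helpers are
hypothetical names that do not exist in nbd.c, plain ints and bools
stand in for the kref and the atomic, the completion is reduced to a
flag, and nothing here is concurrent, so it says nothing about the race
window mentioned earlier.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three fields the patch adds. */
struct nbd_model {
	int users;      /* models the kref count in nbd->users */
	bool inuse;     /* models the atomic nbd->inuse flag */
	bool completed; /* models complete(&nbd->user_completion) */
};

/* Models kref_init() + init_completion() in nbd_reset(). */
static void model_reset(struct nbd_model *m)
{
	m->users = 1;
	m->inuse = false;
	m->completed = false;
}

/* Models nbd_kref_release(): clear inuse and wake the waiter. */
static void model_release(struct nbd_model *m)
{
	m->inuse = false;
	m->completed = true;
}

/* Models kref_sub()/kref_put(): drop n refs, release on zero. */
static void model_sub(struct nbd_model *m, int n)
{
	assert(m->users >= n); /* an underflow would be a refcount bug */
	m->users -= n;
	if (m->users == 0)
		model_release(m);
}

/* Models nbd_open(): take a ref only if the count is non-zero. */
static void model_open(struct nbd_model *m)
{
	if (m->users != 0) {
		m->users++;
		m->inuse = true;
	}
}

/* Models nbd_release(): the put is conditional on inuse. */
static void model_close(struct nbd_model *m)
{
	if (m->inuse)
		model_sub(m, 1);
}

/* Replays the nbdC-1 / nbdC-2 sequence and returns the final state. */
static struct nbd_model model_replay(void)
{
	struct nbd_model m;

	model_reset(&m);  /* module load: count is 1 */
	model_open(&m);   /* nbdC-1 opens the device: 2 */
	model_open(&m);   /* nbdC-2 opens the device: 3 */
	model_sub(&m, 2); /* nbdC-1 in NBD_DO_IT: 1, it would now wait */
	model_close(&m);  /* nbdC-2 closes: 0, completion fires */
	model_close(&m);  /* nbdC-1 closes: inuse is clear, no put */
	return m;
}
```

Calling model_replay() from a small main() and inspecting the returned
struct shows the count ending at zero with the completion flag set. If
the inuse check in model_close() is dropped, the final close trips the
underflow assert, which is the situation the conditional kref_put is
meant to avoid.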