From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751794AbcFYRwL (ORCPT ); Sat, 25 Jun 2016 13:52:11 -0400 Received: from mail-wm0-f67.google.com ([74.125.82.67]:33928 "EHLO mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751547AbcFYRwJ (ORCPT ); Sat, 25 Jun 2016 13:52:09 -0400 MIME-Version: 1.0 In-Reply-To: <1466760574-1916-1-git-send-email-mpa@pengutronix.de> References: <1466760574-1916-1-git-send-email-mpa@pengutronix.de> From: Pranay Srivastava Date: Sat, 25 Jun 2016 23:22:06 +0530 Message-ID: Subject: Re: [PATCH 1/2] nbd: make nbd device wait for its users To: Markus Pargmann Cc: nbd-general@lists.sourceforge.net, linux-kernel@vger.kernel.org, Wouter Verhelst Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jun 24, 2016 at 2:59 PM, Markus Pargmann wrote: > From: "Pranay Kr. Srivastava" > > When a timeout occurs or a recv fails, then instead of abruplty killing > nbd block device wait for it's users to finish. > > This is more required when filesystem(s) like ext2 or ext3 don't expect > their buffer heads to disappear while the filesystem is mounted. > > Each open is counted. The blockdevice is kept open until the last user > closes the block device. This offers the possibility as well to open a > new socket to be used while the filesystems are mounted. > > Signed-off-by: Pranay Kr. Srivastava > > [mpa: Keep the blockdevice open until all users left] > Signed-off-by: Markus Pargmann > --- > Hi, > > I used your patch and changed it a bit based on my ideas. The general > difference is that this keeps the block device open. After all users left, the > device is reset. > > The followup patch then restricts access to ioctls after a disconnect. I wanted > to avoid that anyone sets up anything new without all the old users leaving. > > Please let me know what you think about this. > > Best Regards, > > Markus > > drivers/block/nbd.c | 62 ++++++++++++++++++++++++++++++++++++++--------------- > 1 file changed, 45 insertions(+), 17 deletions(-) > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c > index 1efc26bf1d21..620660f3ff0f 100644 > --- a/drivers/block/nbd.c > +++ b/drivers/block/nbd.c > @@ -69,6 +69,8 @@ struct nbd_device { > #if IS_ENABLED(CONFIG_DEBUG_FS) > struct dentry *dbg_dir; > #endif > + atomic_t users; /* Users that opened the block device */ We don't need to put bdev in nbd struct. We can get it via gendisk and bdget/bdput. > + struct block_device *bdev; > }; > > #if IS_ENABLED(CONFIG_DEBUG_FS) > @@ -655,9 +657,26 @@ static int nbd_set_socket(struct nbd_device *nbd, struct socket *sock) > return ret; > } > > +static void nbd_bdev_reset(struct block_device *bdev) > +{ > + set_device_ro(bdev, false); > + bdev->bd_inode->i_size = 0; i_size_write should be better I guess. > + if (max_part > 0) { > + blkdev_reread_part(bdev); > + bdev->bd_invalidated = 1; > + } > +} > + > /* Reset all properties of an NBD device */ > static void nbd_reset(struct nbd_device *nbd) > { > + sock_shutdown(nbd); > + nbd_clear_que(nbd); > + if (nbd->bdev) { > + kill_bdev(nbd->bdev); > + nbd_bdev_reset(nbd->bdev); > + } > + > nbd->disconnect = false; > nbd->timedout = false; > nbd->blksize = 1024; I actually forgot to ask about this blksize. We do set_blocksize call for bdev, but shouldn't we instead do blkdev_logicial_blocksize? Mount would set the blocksize depending on the filesystem's block size and the block device logical block size supported. Should we just get rid of this? > @@ -669,16 +688,6 @@ static void nbd_reset(struct nbd_device *nbd) > del_timer_sync(&nbd->timeout_timer); > } > > -static void nbd_bdev_reset(struct block_device *bdev) > -{ > - set_device_ro(bdev, false); > - bdev->bd_inode->i_size = 0; > - if (max_part > 0) { > - blkdev_reread_part(bdev); > - bdev->bd_invalidated = 1; > - } > -} > - > static void nbd_parse_flags(struct nbd_device *nbd, struct block_device *bdev) > { > if (nbd->flags & NBD_FLAG_READ_ONLY) > @@ -803,18 +812,11 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd, How about disallowing any ioctl until the nbd_device has been reset? Perhaps, throw error in open instead of checking here? > mutex_lock(&nbd->tx_lock); > nbd->task_recv = NULL; > > - sock_shutdown(nbd); > - nbd_clear_que(nbd); > - kill_bdev(bdev); > - nbd_bdev_reset(bdev); > - > if (nbd->disconnect) /* user requested, ignore socket errors */ > error = 0; > if (nbd->timedout) > error = -ETIMEDOUT; > Doesn't this change the semantics for user space? NBD_DO_IT is supposed to be over either on error or a user disconnect not before that right[?]. > - nbd_reset(nbd); > - > return error; > } > > @@ -853,10 +855,35 @@ static int nbd_ioctl(struct block_device *bdev, fmode_t mode, > return error; > } > > +static int nbd_open(struct block_device *bdev, fmode_t mode) > +{ > + struct nbd_device *nbd = bdev->bd_disk->private_data; > + > + atomic_inc(&nbd->users); > + > + if (!nbd->bdev) > + nbd->bdev = bdev; > + > + return 0; > +} > + > +static void nbd_release(struct gendisk *disk, fmode_t mode) > +{ > + struct nbd_device *nbd = disk->private_data; > + > + if (atomic_dec_and_test(&nbd->users)) { > + mutex_lock(&nbd->tx_lock); > + nbd_reset(nbd); > + mutex_unlock(&nbd->tx_lock); > + } > +} > + What if the following happens, 1) nbd-client [nbd-c1] attaches the block device 2) mount this nbd device 3) somebody goes crazy and does nbd-client -d Step 3) would cause a DISCONNECT and a CLEAR_SOCK ioctl to be fired, but in CLEAR_SOCK there's a call to kill_bdev, so I guess that needs to go as well[?]. > static const struct block_device_operations nbd_fops = { > .owner = THIS_MODULE, > .ioctl = nbd_ioctl, > .compat_ioctl = nbd_ioctl, > + .open = nbd_open, > + .release = nbd_release, > }; > > #if IS_ENABLED(CONFIG_DEBUG_FS) > @@ -1087,6 +1114,7 @@ static int __init nbd_init(void) > disk->private_data = &nbd_dev[i]; > sprintf(disk->disk_name, "nbd%d", i); > nbd_reset(&nbd_dev[i]); > + atomic_set(&nbd_dev[i].users, 0); > add_disk(disk); > } > > -- > 2.1.4 > -- ---P.K.S