All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sage Weil <sage@newdream.net>
To: Greg KH <greg@kroah.com>
Cc: Yehuda Sadeh <yehuda@hq.newdream.net>,
	ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rbd: replace the rbd sysfs interface
Date: Wed, 1 Dec 2010 11:25:16 -0800 (PST)	[thread overview]
Message-ID: <Pine.LNX.4.64.1012011117460.8183@cobra.newdream.net> (raw)
In-Reply-To: <1290558233.1792.73.camel@yehudasa-desktop>

Hi Greg,

I'm sure you're busy and as tired of this thread as we are, but I think
it's close and we have (I hope) just one remaining question.  The current
patch (see below) gives us

 /sys/bus/rbd/{add,remove}
 /sys/bus/rbd/devices/<devid>/                 <-- struct device
 /sys/bus/rbd/devices/<devid>/{some dev attrs}
 /sys/bus/rbd/devices/<devid>/snap_<snapid>/   <-- struct device
 /sys/bus/rbd/devices/<devid>/snap_<snapid>/{some snap attrs}

This works, and I is (I hope) using struct device properly.  The only 
problem, purely from a user interface standpoint, is that the snaps are 
mixed in with attributes, so anybody wanting to iterate over snaps needs 
to do something crufty like

 $ for snap in `ls /sys/bus/rbd/devices/$id | grep ^snap_ | cut -c 6-`; do ...

Adding an intermediate snaps/ subdir would let them instead do

 $ for snap in `ls /sys/bus/rbd/devices/$id/snaps/`; do ...

without worrying about the (arbitrarily named) snaps from colliding with 
device attributes.  Assuming that is a preferable interface, is the 
"right" way to do that to make "snaps" a struct device?  Or is there a 
good reason why that is not preferable?

Thanks!
sage



On Tue, 23 Nov 2010, Yehuda Sadeh wrote:

> On Mon, 2010-11-22 at 16:58 -0800, Greg KH wrote:
> On Mon, Nov 22, 2010 at 04:48:54PM -0800, Yehuda Sadeh Weinraub wrote:
> > > The reason we keep snapshots in a separate subdirectory is that they
> > > can have arbitrary name. So either we prefix them and put them in a
> > > common namespace with the devices, or we put them down the hierarchy.
> > 
> > Do either one.  I would suggest a prefix.
> > 
> > > In any case we don't do any operations on them, we just have them for
> > > informational use and we put them there so that we don't have one big
> > > file that lists them all.
> > 
> > But something cares about them, so treat them properly.
> > 
> > > >> Another way would be to create a group for (2) under (1) and create a
> > > >> kobject for (3), for which you can create group per snapshot.
> > > >>
> > > >> Am I missing something? We already have the first solution (kobjects
> > > >> only) implemented, is there some real benefit for using the third
> > > >> method? We'll have to manually add remove groups anyway, as snapshots
> > > >> can be removed and new snapshots can be added.
> > > >
> > > > Never add kobjects to a struct device, that is showing you that
> > > > something is wrong, and that userspace really will want to get that
> > > > create/destroy event of the sub child.
> > > >
> > > 
> > > But they're there as information device attributes, it's nothing like
> > > partitions in block devices. So we want to just be able to list them
> > > and their attributes easily (and nicely), without having to put them
> > > in one big file.
> > 
> > Then use a prefix and put everything in the same subdirectory underneath
> > the id and you should be fine, right?
> > 
> 
> The following patch puts the snapshots as subdirs on the device directory.
> Each snapshot is in a different device structure, whereas its parent is
> the device. This works, however, not very pretty and/or not very
> convenient. Can we add another intermediate device that'll serve as a
> container for the snapshots?
> 
> Thanks,
> Yehuda
> 
> --
> 
> Yehuda Sadeh (1):
>   rbd: replace the rbd sysfs interface
> 
>  Documentation/ABI/testing/sysfs-bus-rbd |   84 ++++
>  drivers/block/rbd.c                     |  748 +++++++++++++++++++------------
>  2 files changed, 555 insertions(+), 277 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-rbd
> 
> 
> --
> 
> From: Yehuda Sadeh <yehuda@hq.newdream.net>
> Date: Fri, 19 Nov 2010 14:51:04 -0800
> Subject: [PATCH 1/1] rbd: replace the rbd sysfs interface
> 
> The new interface creates directories per mapped image
> and under each it creates a subdir per available snapshot.
> This allows keeping a cleaner interface within the sysfs
> guidelines. The ABI documentation was updated too.
> 
> Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
> ---
>  Documentation/ABI/testing/sysfs-bus-rbd |   83 ++++
>  drivers/block/rbd.c                     |  748 +++++++++++++++++++------------
>  2 files changed, 554 insertions(+), 277 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-rbd b/Documentation/ABI/testing/sysfs-bus-rbd
> new file mode 100644
> index 0000000..90a87e2
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-rbd
> @@ -0,0 +1,83 @@
> +What:		/sys/bus/rbd/
> +Date:		November 2010
> +Contact:	Yehuda Sadeh <yehuda@hq.newdream.net>,
> +		Sage Weil <sage@newdream.net>
> +Description:
> +
> +Being used for adding and removing rbd block devices.
> +
> +Usage: <mon ip addr> <options> <pool name> <rbd image name> [snap name]
> +
> + $ echo "192.168.0.1 name=admin rbd foo" > /sys/bus/rbd/add
> +
> +The snapshot name can be "-" or omitted to map the image read/write. A <dev-id>
> +will be assigned for any registered block device. If snapshot is used, it will
> +be mapped read-only.
> +
> +Removal of a device:
> +
> +  $ echo <dev-id> > /sys/bus/rbd/remove
> +
> +Entries under /sys/bus/rbd/devices/<dev-id>/
> +--------------------------------------------
> +
> +client_id
> +
> +	The ceph unique client id that was assigned for this specific session.
> +
> +major
> +
> +	The block device major number.
> +
> +name
> +
> +	The name of the rbd image.
> +
> +pool
> +
> +	The pool where this rbd image resides. The pool-name pair is unique
> +	per rados system.
> +
> +size
> +
> +	The size (in bytes) of the mapped block device.
> +
> +refresh
> +
> +	Writing to this file will reread the image header data and set
> +	all relevant datastructures accordingly.
> +
> +current_snap
> +
> +	The current snapshot for which the device is mapped.
> +
> +create_snap
> +
> +	Create a snapshot:
> +
> +	 $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_create
> +
> +rollback_snap
> +
> +	Rolls back data to the specified snapshot. This goes over the entire
> +	list of rados blocks and sends a rollback command to each.
> +
> +	 $ echo <snap-name> > /sys/bus/rbd/devices/<dev-id>/snap_rollback
> +
> +snap_*
> +
> +	A directory per each snapshot
> +
> +
> +Entries under /sys/bus/rbd/devices/<dev-id>/snap_<snap-name>
> +-------------------------------------------------------------
> +
> +id
> +
> +	The rados internal snapshot id assigned for this snapshot
> +
> +size
> +
> +	The size of the image when this snapshot was taken.
> +
> +
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 6ec9d53..008d4a0 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -21,80 +21,9 @@
>  
>  
>  
> -   Instructions for use
> -   --------------------
> +   For usage instructions, please refer to:
>  
> -   1) Map a Linux block device to an existing rbd image.
> -
> -      Usage: <mon ip addr> <options> <pool name> <rbd image name> [snap name]
> -
> -      $ echo "192.168.0.1 name=admin rbd foo" > /sys/class/rbd/add
> -
> -      The snapshot name can be "-" or omitted to map the image read/write.
> -
> -   2) List all active blkdev<->object mappings.
> -
> -      In this example, we have performed step #1 twice, creating two blkdevs,
> -      mapped to two separate rados objects in the rados rbd pool
> -
> -      $ cat /sys/class/rbd/list
> -      #id     major   client_name     pool    name    snap    KB
> -      0       254     client4143      rbd     foo     -      1024000
> -
> -      The columns, in order, are:
> -      - blkdev unique id
> -      - blkdev assigned major
> -      - rados client id
> -      - rados pool name
> -      - rados block device name
> -      - mapped snapshot ("-" if none)
> -      - device size in KB
> -
> -
> -   3) Create a snapshot.
> -
> -      Usage: <blkdev id> <snapname>
> -
> -      $ echo "0 mysnap" > /sys/class/rbd/snap_create
> -
> -
> -   4) Listing a snapshot.
> -
> -      $ cat /sys/class/rbd/snaps_list
> -      #id     snap    KB
> -      0       -       1024000 (*)
> -      0       foo     1024000
> -
> -      The columns, in order, are:
> -      - blkdev unique id
> -      - snapshot name, '-' means none (active read/write version)
> -      - size of device at time of snapshot
> -      - the (*) indicates this is the active version
> -
> -   5) Rollback to snapshot.
> -
> -      Usage: <blkdev id> <snapname>
> -
> -      $ echo "0 mysnap" > /sys/class/rbd/snap_rollback
> -
> -
> -   6) Mapping an image using snapshot.
> -
> -      A snapshot mapping is read-only. This is being done by passing
> -      snap=<snapname> to the options when adding a device.
> -
> -      $ echo "192.168.0.1 name=admin,snap=mysnap rbd foo" > /sys/class/rbd/add
> -
> -
> -   7) Remove an active blkdev<->rbd image mapping.
> -
> -      In this example, we remove the mapping with blkdev unique id 1.
> -
> -      $ echo 1 > /sys/class/rbd/remove
> -
> -
> -   NOTE:  The actual creation and deletion of rados objects is outside the scope
> -   of this driver.
> +                 Documentation/ABI/testing/sysfs-bus-rbd
>  
>   */
>  
> @@ -163,6 +92,14 @@ struct rbd_request {
>  	u64			len;
>  };
>  
> +struct rbd_snap {
> +	struct	device		dev;
> +	const char		*name;
> +	size_t			size;
> +	struct list_head	node;
> +	u64			id;
> +};
> +
>  /*
>   * a single device
>   */
> @@ -193,21 +130,60 @@ struct rbd_device {
>  	int read_only;
>  
>  	struct list_head	node;
> +
> +	/* list of snapshots */
> +	struct list_head	snaps;
> +
> +	/* sysfs related */
> +	struct device		dev;
> +};
> +
> +static struct bus_type rbd_bus_type = {
> +	.name		= "rbd",
>  };
>  
>  static spinlock_t node_lock;      /* protects client get/put */
>  
> -static struct class *class_rbd;	  /* /sys/class/rbd */
>  static DEFINE_MUTEX(ctl_mutex);	  /* Serialize open/close/setup/teardown */
>  static LIST_HEAD(rbd_dev_list);    /* devices */
>  static LIST_HEAD(rbd_client_list);      /* clients */
>  
> +static int __rbd_init_snaps_header(struct rbd_device *rbd_dev);
> +static void rbd_dev_release(struct device *dev);
> +static ssize_t rbd_snap_rollback(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf,
> +				 size_t size);
> +static ssize_t rbd_snap_add(struct device *dev,
> +			    struct device_attribute *attr,
> +			    const char *buf,
> +			    size_t count);
> +static void __rbd_remove_snap_dev(struct rbd_device *rbd_dev,
> +				  struct rbd_snap *snap);;
> +
> +
> +static struct rbd_device *dev_to_rbd(struct device *dev)
> +{
> +	return container_of(dev, struct rbd_device, dev);
> +}
> +
> +static struct device *rbd_get_dev(struct rbd_device *rbd_dev)
> +{
> +	return get_device(&rbd_dev->dev);
> +}
> +
> +static void rbd_put_dev(struct rbd_device *rbd_dev)
> +{
> +	put_device(&rbd_dev->dev);
> +}
>  
>  static int rbd_open(struct block_device *bdev, fmode_t mode)
>  {
>  	struct gendisk *disk = bdev->bd_disk;
>  	struct rbd_device *rbd_dev = disk->private_data;
>  
> +	rbd_get_dev(rbd_dev);
> +
>  	set_device_ro(bdev, rbd_dev->read_only);
>  
>  	if ((mode & FMODE_WRITE) && rbd_dev->read_only)
> @@ -216,9 +192,19 @@ static int rbd_open(struct block_device *bdev, fmode_t mode)
>  	return 0;
>  }
>  
> +static int rbd_release(struct gendisk *disk, fmode_t mode)
> +{
> +	struct rbd_device *rbd_dev = disk->private_data;
> +
> +	rbd_put_dev(rbd_dev);
> +
> +	return 0;
> +}
> +
>  static const struct block_device_operations rbd_bd_ops = {
>  	.owner			= THIS_MODULE,
>  	.open			= rbd_open,
> +	.release		= rbd_release,
>  };
>  
>  /*
> @@ -361,7 +347,6 @@ static int rbd_header_from_disk(struct rbd_image_header *header,
>  	int ret = -ENOMEM;
>  
>  	init_rwsem(&header->snap_rwsem);
> -
>  	header->snap_names_len = le64_to_cpu(ondisk->snap_names_len);
>  	header->snapc = kmalloc(sizeof(struct ceph_snap_context) +
>  				snap_count *
> @@ -1256,10 +1241,20 @@ bad:
>  	return -ERANGE;
>  }
>  
> +static void __rbd_remove_all_snaps(struct rbd_device *rbd_dev)
> +{
> +	struct rbd_snap *snap;
> +
> +	while (!list_empty(&rbd_dev->snaps)) {
> +		snap = list_first_entry(&rbd_dev->snaps, struct rbd_snap, node);
> +		__rbd_remove_snap_dev(rbd_dev, snap);
> +	}
> +}
> +
>  /*
>   * only read the first part of the ondisk header, without the snaps info
>   */
> -static int rbd_update_snaps(struct rbd_device *rbd_dev)
> +static int __rbd_update_snaps(struct rbd_device *rbd_dev)
>  {
>  	int ret;
>  	struct rbd_image_header h;
> @@ -1280,12 +1275,15 @@ static int rbd_update_snaps(struct rbd_device *rbd_dev)
>  	rbd_dev->header.total_snaps = h.total_snaps;
>  	rbd_dev->header.snapc = h.snapc;
>  	rbd_dev->header.snap_names = h.snap_names;
> +	rbd_dev->header.snap_names_len = h.snap_names_len;
>  	rbd_dev->header.snap_sizes = h.snap_sizes;
>  	rbd_dev->header.snapc->seq = snap_seq;
>  
> +	ret = __rbd_init_snaps_header(rbd_dev);
> +
>  	up_write(&rbd_dev->header.snap_rwsem);
>  
> -	return 0;
> +	return ret;
>  }
>  
>  static int rbd_init_disk(struct rbd_device *rbd_dev)
> @@ -1300,6 +1298,11 @@ static int rbd_init_disk(struct rbd_device *rbd_dev)
>  	if (rc)
>  		return rc;
>  
> +	/* no need to lock here, as rbd_dev is not registered yet */
> +	rc = __rbd_init_snaps_header(rbd_dev);
> +	if (rc)
> +		return rc;
> +
>  	rc = rbd_header_set_snap(rbd_dev, rbd_dev->snap_name, &total_size);
>  	if (rc)
>  		return rc;
> @@ -1343,54 +1346,360 @@ out:
>  	return rc;
>  }
>  
> -/********************************************************************
> - * /sys/class/rbd/
> - *                   add	map rados objects to blkdev
> - *                   remove	unmap rados objects
> - *                   list	show mappings
> - *******************************************************************/
> +/*
> +  sysfs
> +*/
> +
> +static ssize_t rbd_size_show(struct device *dev,
> +			     struct device_attribute *attr, char *buf)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "%llu\n", (unsigned long long)rbd_dev->header.image_size);
> +}
> +
> +static ssize_t rbd_major_show(struct device *dev,
> +			      struct device_attribute *attr, char *buf)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
>  
> -static void class_rbd_release(struct class *cls)
> +	return sprintf(buf, "%d\n", rbd_dev->major);
> +}
> +
> +static ssize_t rbd_client_id_show(struct device *dev,
> +				  struct device_attribute *attr, char *buf)
>  {
> -	kfree(cls);
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "client%lld\n", ceph_client_id(rbd_dev->client));
>  }
>  
> -static ssize_t class_rbd_list(struct class *c,
> -			      struct class_attribute *attr,
> -			      char *data)
> +static ssize_t rbd_pool_show(struct device *dev,
> +			     struct device_attribute *attr, char *buf)
>  {
> -	int n = 0;
> -	struct list_head *tmp;
> -	int max = PAGE_SIZE;
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "%s\n", rbd_dev->pool_name);
> +}
> +
> +static ssize_t rbd_name_show(struct device *dev,
> +			     struct device_attribute *attr, char *buf)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "%s\n", rbd_dev->obj);
> +}
> +
> +static ssize_t rbd_snap_show(struct device *dev,
> +			     struct device_attribute *attr,
> +			     char *buf)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +
> +	return sprintf(buf, "%s\n", rbd_dev->snap_name);
> +}
> +
> +static ssize_t rbd_image_refresh(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf,
> +				 size_t size)
> +{
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +	int rc;
> +	int ret = size;
>  
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
>  
> -	n += snprintf(data, max,
> -		      "#id\tmajor\tclient_name\tpool\tname\tsnap\tKB\n");
> +	rc = __rbd_update_snaps(rbd_dev);
> +	if (rc < 0)
> +		ret = rc;
>  
> -	list_for_each(tmp, &rbd_dev_list) {
> -		struct rbd_device *rbd_dev;
> +	mutex_unlock(&ctl_mutex);
> +	return ret;
> +}
>  
> -		rbd_dev = list_entry(tmp, struct rbd_device, node);
> -		n += snprintf(data+n, max-n,
> -			      "%d\t%d\tclient%lld\t%s\t%s\t%s\t%lld\n",
> -			      rbd_dev->id,
> -			      rbd_dev->major,
> -			      ceph_client_id(rbd_dev->client),
> -			      rbd_dev->pool_name,
> -			      rbd_dev->obj, rbd_dev->snap_name,
> -			      rbd_dev->header.image_size >> 10);
> -		if (n == max)
> +static DEVICE_ATTR(size, S_IRUGO, rbd_size_show, NULL);
> +static DEVICE_ATTR(major, S_IRUGO, rbd_major_show, NULL);
> +static DEVICE_ATTR(client_id, S_IRUGO, rbd_client_id_show, NULL);
> +static DEVICE_ATTR(pool, S_IRUGO, rbd_pool_show, NULL);
> +static DEVICE_ATTR(name, S_IRUGO, rbd_name_show, NULL);
> +static DEVICE_ATTR(refresh, S_IWUSR, NULL, rbd_image_refresh);
> +static DEVICE_ATTR(current_snap, S_IRUGO, rbd_snap_show, NULL);
> +static DEVICE_ATTR(create_snap, S_IWUSR, NULL, rbd_snap_add);
> +static DEVICE_ATTR(rollback_snap, S_IWUSR, NULL, rbd_snap_rollback);
> +
> +static struct attribute *rbd_attrs[] = {
> +	&dev_attr_size.attr,
> +	&dev_attr_major.attr,
> +	&dev_attr_client_id.attr,
> +	&dev_attr_pool.attr,
> +	&dev_attr_name.attr,
> +	&dev_attr_current_snap.attr,
> +	&dev_attr_refresh.attr,
> +	&dev_attr_create_snap.attr,
> +	&dev_attr_rollback_snap.attr,
> +	NULL
> +};
> +
> +static struct attribute_group rbd_attr_group = {
> +	.attrs = rbd_attrs,
> +};
> +
> +static const struct attribute_group *rbd_attr_groups[] = {
> +	&rbd_attr_group,
> +	NULL
> +};
> +
> +static void rbd_sysfs_dev_release(struct device *dev)
> +{
> +}
> +
> +static struct device_type rbd_device_type = {
> +	.name		= "rbd",
> +	.groups		= rbd_attr_groups,
> +	.release	= rbd_sysfs_dev_release,
> +};
> +
> +
> +/*
> +  sysfs - snapshots
> +*/
> +
> +static ssize_t rbd_snap_size_show(struct device *dev,
> +				  struct device_attribute *attr,
> +				  char *buf)
> +{
> +	struct rbd_snap *snap = container_of(dev, struct rbd_snap, dev);
> +
> +	return sprintf(buf, "%lld\n", (long long)snap->size);
> +}
> +
> +static ssize_t rbd_snap_id_show(struct device *dev,
> +				struct device_attribute *attr,
> +				char *buf)
> +{
> +	struct rbd_snap *snap = container_of(dev, struct rbd_snap, dev);
> +
> +	return sprintf(buf, "%lld\n", (long long)snap->id);
> +}
> +
> +static DEVICE_ATTR(snap_size, S_IRUGO, rbd_snap_size_show, NULL);
> +static DEVICE_ATTR(snap_id, S_IRUGO, rbd_snap_id_show, NULL);
> +
> +static struct attribute *rbd_snap_attrs[] = {
> +	&dev_attr_snap_size.attr,
> +	&dev_attr_snap_id.attr,
> +	NULL,
> +};
> +
> +static struct attribute_group rbd_snap_attr_group = {
> +	.attrs = rbd_snap_attrs,
> +};
> +
> +static void rbd_snap_dev_release(struct device *dev)
> +{
> +	struct rbd_snap *snap = container_of(dev, struct rbd_snap, dev);
> +	kfree(snap->name);
> +	kfree(snap);
> +}
> +
> +static const struct attribute_group *rbd_snap_attr_groups[] = {
> +	&rbd_snap_attr_group,
> +	NULL
> +};
> +
> +static struct device_type rbd_snap_device_type = {
> +	.groups		= rbd_snap_attr_groups,
> +	.release	= rbd_snap_dev_release,
> +};
> +
> +static void __rbd_remove_snap_dev(struct rbd_device *rbd_dev,
> +				  struct rbd_snap *snap)
> +{
> +	list_del(&snap->node);
> +	device_unregister(&snap->dev);
> +}
> +
> +static int rbd_register_snap_dev(struct rbd_device *rbd_dev,
> +				  struct rbd_snap *snap,
> +				  struct device *parent)
> +{
> +	struct device *dev = &snap->dev;
> +	int ret;
> +
> +	dev->type = &rbd_snap_device_type;
> +	dev->parent = parent;
> +	dev->release = rbd_snap_dev_release;
> +	dev_set_name(dev, "snap_%s", snap->name);
> +	ret = device_register(dev);
> +
> +	return ret;
> +}
> +
> +static int __rbd_add_snap_dev(struct rbd_device *rbd_dev,
> +			      int i, const char *name,
> +			      struct rbd_snap **snapp)
> +{
> +	int ret;
> +	struct rbd_snap *snap = kzalloc(sizeof(*snap), GFP_KERNEL);
> +	if (!snap)
> +		return -ENOMEM;
> +	snap->name = kstrdup(name, GFP_KERNEL);
> +	snap->size = rbd_dev->header.snap_sizes[i];
> +	snap->id = rbd_dev->header.snapc->snaps[i];
> +	if (device_is_registered(&rbd_dev->dev)) {
> +		ret = rbd_register_snap_dev(rbd_dev, snap,
> +					     &rbd_dev->dev);
> +		if (ret < 0)
> +			goto err;
> +	}
> +	*snapp = snap;
> +	return 0;
> +err:
> +	kfree(snap->name);
> +	kfree(snap);
> +	return ret;
> +}
> +
> +/*
> + * search for the previous snap in a null delimited string list
> + */
> +const char *rbd_prev_snap_name(const char *name, const char *start)
> +{
> +	if (name < start + 2)
> +		return NULL;
> +
> +	name -= 2;
> +	while (*name) {
> +		if (name == start)
> +			return start;
> +		name--;
> +	}
> +	return name + 1;
> +}
> +
> +/*
> + * compare the old list of snapshots that we have to what's in the header
> + * and update it accordingly. Note that the header holds the snapshots
> + * in a reverse order (from newest to oldest) and we need to go from
> + * older to new so that we don't get a duplicate snap name when
> + * doing the process (e.g., removed snapshot and recreated a new
> + * one with the same name.
> + */
> +static int __rbd_init_snaps_header(struct rbd_device *rbd_dev)
> +{
> +	const char *name, *first_name;
> +	int i = rbd_dev->header.total_snaps;
> +	struct rbd_snap *snap, *old_snap = NULL;
> +	int ret;
> +	struct list_head *p, *n;
> +
> +	first_name = rbd_dev->header.snap_names;
> +	name = first_name + rbd_dev->header.snap_names_len;
> +
> +	list_for_each_prev_safe(p, n, &rbd_dev->snaps) {
> +		u64 cur_id;
> +
> +		old_snap = list_entry(p, struct rbd_snap, node);
> +
> +		if (i)
> +			cur_id = rbd_dev->header.snapc->snaps[i - 1];
> +
> +		if (!i || old_snap->id < cur_id) {
> +			/* old_snap->id was skipped, thus was removed */
> +			__rbd_remove_snap_dev(rbd_dev, old_snap);
> +			continue;
> +		}
> +		if (old_snap->id == cur_id) {
> +			/* we have this snapshot already */
> +			i--;
> +			name = rbd_prev_snap_name(name, first_name);
> +			continue;
> +		}
> +		for (; i > 0;
> +		     i--, name = rbd_prev_snap_name(name, first_name)) {
> +			if (!name) {
> +				WARN_ON(1);
> +				return -EINVAL;
> +			}
> +			cur_id = rbd_dev->header.snapc->snaps[i];
> +			/* snapshot removal? handle it above */
> +			if (cur_id >= old_snap->id)
> +				break;
> +			/* a new snapshot */
> +			ret = __rbd_add_snap_dev(rbd_dev, i - 1, name, &snap);
> +			if (ret < 0)
> +				return ret;
> +
> +			/* note that we add it backward so using n and not p */
> +			list_add(&snap->node, n);
> +			p = &snap->node;
> +		}
> +	}
> +	/* we're done going over the old snap list, just add what's left */
> +	for (; i > 0; i--) {
> +		name = rbd_prev_snap_name(name, first_name);
> +		if (!name) {
> +			WARN_ON(1);
> +			return -EINVAL;
> +		}
> +		ret = __rbd_add_snap_dev(rbd_dev, i - 1, name, &snap);
> +		if (ret < 0)
> +			return ret;
> +		list_add(&snap->node, &rbd_dev->snaps);
> +	}
> +
> +	return 0;
> +}
> +
> +
> +static void rbd_root_dev_release(struct device *dev)
> +{
> +}
> +
> +static struct device rbd_root_dev = {
> +	.init_name =    "rbd",
> +	.release =      rbd_root_dev_release,
> +};
> +
> +static int rbd_bus_add_dev(struct rbd_device *rbd_dev)
> +{
> +	int ret = -ENOMEM;
> +	struct device *dev;
> +	struct rbd_snap *snap;
> +
> +	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> +	dev = &rbd_dev->dev;
> +
> +	dev->bus = &rbd_bus_type;
> +	dev->type = &rbd_device_type;
> +	dev->parent = &rbd_root_dev;
> +	dev->release = rbd_dev_release;
> +	dev_set_name(dev, "%d", rbd_dev->id);
> +	ret = device_register(dev);
> +	if (ret < 0)
> +		goto done_free;
> +
> +	list_for_each_entry(snap, &rbd_dev->snaps, node) {
> +		ret = rbd_register_snap_dev(rbd_dev, snap,
> +					     &rbd_dev->dev);
> +		if (ret < 0)
>  			break;
>  	}
>  
>  	mutex_unlock(&ctl_mutex);
> -	return n;
> +	return 0;
> +done_free:
> +	mutex_unlock(&ctl_mutex);
> +	return ret;
>  }
>  
> -static ssize_t class_rbd_add(struct class *c,
> -			     struct class_attribute *attr,
> -			     const char *buf, size_t count)
> +static void rbd_bus_del_dev(struct rbd_device *rbd_dev)
> +{
> +	device_unregister(&rbd_dev->dev);
> +}
> +
> +static ssize_t rbd_add(struct bus_type *bus, const char *buf, size_t count)
>  {
>  	struct ceph_osd_client *osdc;
>  	struct rbd_device *rbd_dev;
> @@ -1419,6 +1728,7 @@ static ssize_t class_rbd_add(struct class *c,
>  	/* static rbd_device initialization */
>  	spin_lock_init(&rbd_dev->lock);
>  	INIT_LIST_HEAD(&rbd_dev->node);
> +	INIT_LIST_HEAD(&rbd_dev->snaps);
>  
>  	/* generate unique id: find highest unique id, add one */
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> @@ -1478,6 +1788,9 @@ static ssize_t class_rbd_add(struct class *c,
>  	}
>  	rbd_dev->major = irc;
>  
> +	rc = rbd_bus_add_dev(rbd_dev);
> +	if (rc)
> +		goto err_out_disk;
>  	/* set up and announce blkdev mapping */
>  	rc = rbd_init_disk(rbd_dev);
>  	if (rc)
> @@ -1487,6 +1800,8 @@ static ssize_t class_rbd_add(struct class *c,
>  
>  err_out_blkdev:
>  	unregister_blkdev(rbd_dev->major, rbd_dev->name);
> +err_out_disk:
> +	rbd_free_disk(rbd_dev);
>  err_out_client:
>  	rbd_put_client(rbd_dev);
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> @@ -1518,35 +1833,10 @@ static struct rbd_device *__rbd_get_dev(unsigned long id)
>  	return NULL;
>  }
>  
> -static ssize_t class_rbd_remove(struct class *c,
> -				struct class_attribute *attr,
> -				const char *buf,
> -				size_t count)
> +static void rbd_dev_release(struct device *dev)
>  {
> -	struct rbd_device *rbd_dev = NULL;
> -	int target_id, rc;
> -	unsigned long ul;
> -
> -	rc = strict_strtoul(buf, 10, &ul);
> -	if (rc)
> -		return rc;
> -
> -	/* convert to int; abort if we lost anything in the conversion */
> -	target_id = (int) ul;
> -	if (target_id != ul)
> -		return -EINVAL;
> -
> -	/* remove object from list immediately */
> -	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> -
> -	rbd_dev = __rbd_get_dev(target_id);
> -	if (rbd_dev)
> -		list_del_init(&rbd_dev->node);
> -
> -	mutex_unlock(&ctl_mutex);
> -
> -	if (!rbd_dev)
> -		return -ENOENT;
> +	struct rbd_device *rbd_dev =
> +			container_of(dev, struct rbd_device, dev);
>  
>  	rbd_put_client(rbd_dev);
>  
> @@ -1557,67 +1847,11 @@ static ssize_t class_rbd_remove(struct class *c,
>  
>  	/* release module ref */
>  	module_put(THIS_MODULE);
> -
> -	return count;
>  }
>  
> -static ssize_t class_rbd_snaps_list(struct class *c,
> -			      struct class_attribute *attr,
> -			      char *data)
> -{
> -	struct rbd_device *rbd_dev = NULL;
> -	struct list_head *tmp;
> -	struct rbd_image_header *header;
> -	int i, n = 0, max = PAGE_SIZE;
> -	int ret;
> -
> -	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
> -
> -	n += snprintf(data, max, "#id\tsnap\tKB\n");
> -
> -	list_for_each(tmp, &rbd_dev_list) {
> -		char *names, *p;
> -		struct ceph_snap_context *snapc;
> -
> -		rbd_dev = list_entry(tmp, struct rbd_device, node);
> -		header = &rbd_dev->header;
> -
> -		down_read(&header->snap_rwsem);
> -
> -		names = header->snap_names;
> -		snapc = header->snapc;
> -
> -		n += snprintf(data + n, max - n, "%d\t%s\t%lld%s\n",
> -			      rbd_dev->id, RBD_SNAP_HEAD_NAME,
> -			      header->image_size >> 10,
> -			      (!rbd_dev->cur_snap ? " (*)" : ""));
> -		if (n == max)
> -			break;
> -
> -		p = names;
> -		for (i = 0; i < header->total_snaps; i++, p += strlen(p) + 1) {
> -			n += snprintf(data + n, max - n, "%d\t%s\t%lld%s\n",
> -			      rbd_dev->id, p, header->snap_sizes[i] >> 10,
> -			      (rbd_dev->cur_snap &&
> -			       (snap_index(header, i) == rbd_dev->cur_snap) ?
> -			       " (*)" : ""));
> -			if (n == max)
> -				break;
> -		}
> -
> -		up_read(&header->snap_rwsem);
> -	}
> -
> -
> -	ret = n;
> -	mutex_unlock(&ctl_mutex);
> -	return ret;
> -}
> -
> -static ssize_t class_rbd_snaps_refresh(struct class *c,
> -				struct class_attribute *attr,
> -				const char *buf,
> -				size_t count)
> +static ssize_t rbd_remove(struct bus_type *bus,
> +			  const char *buf,
> +			  size_t count)
>  {
>  	struct rbd_device *rbd_dev = NULL;
>  	int target_id, rc;
> @@ -1641,95 +1875,70 @@ static ssize_t class_rbd_snaps_refresh(struct class *c,
>  		goto done;
>  	}
>  
> -	rc = rbd_update_snaps(rbd_dev);
> -	if (rc < 0)
> -		ret = rc;
> +	list_del_init(&rbd_dev->node);
> +
> +	__rbd_remove_all_snaps(rbd_dev);
> +	rbd_bus_del_dev(rbd_dev);
>  
>  done:
>  	mutex_unlock(&ctl_mutex);
>  	return ret;
>  }
>  
> -static ssize_t class_rbd_snap_create(struct class *c,
> -				struct class_attribute *attr,
> -				const char *buf,
> -				size_t count)
> +static ssize_t rbd_snap_add(struct device *dev,
> +			    struct device_attribute *attr,
> +			    const char *buf,
> +			    size_t count)
>  {
> -	struct rbd_device *rbd_dev = NULL;
> -	int target_id, ret;
> -	char *name;
> -
> -	name = kmalloc(RBD_MAX_SNAP_NAME_LEN + 1, GFP_KERNEL);
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +	int ret;
> +	char *name = kmalloc(count + 1, GFP_KERNEL);
>  	if (!name)
>  		return -ENOMEM;
>  
> -	/* parse snaps add command */
> -	if (sscanf(buf, "%d "
> -		   "%" __stringify(RBD_MAX_SNAP_NAME_LEN) "s",
> -		   &target_id,
> -		   name) != 2) {
> -		ret = -EINVAL;
> -		goto done;
> -	}
> +	snprintf(name, count, "%s", buf);
>  
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
>  
> -	rbd_dev = __rbd_get_dev(target_id);
> -	if (!rbd_dev) {
> -		ret = -ENOENT;
> -		goto done_unlock;
> -	}
> -
>  	ret = rbd_header_add_snap(rbd_dev,
>  				  name, GFP_KERNEL);
>  	if (ret < 0)
>  		goto done_unlock;
>  
> -	ret = rbd_update_snaps(rbd_dev);
> +	ret = __rbd_update_snaps(rbd_dev);
>  	if (ret < 0)
>  		goto done_unlock;
>  
>  	ret = count;
>  done_unlock:
>  	mutex_unlock(&ctl_mutex);
> -done:
>  	kfree(name);
>  	return ret;
>  }
>  
> -static ssize_t class_rbd_rollback(struct class *c,
> -				struct class_attribute *attr,
> -				const char *buf,
> -				size_t count)
> +static ssize_t rbd_snap_rollback(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf,
> +				 size_t count)
>  {
> -	struct rbd_device *rbd_dev = NULL;
> -	int target_id, ret;
> +	struct rbd_device *rbd_dev = dev_to_rbd(dev);
> +	int ret;
>  	u64 snapid;
> -	char snap_name[RBD_MAX_SNAP_NAME_LEN];
>  	u64 cur_ofs;
> -	char *seg_name;
> +	char *seg_name = NULL;
> +	char *snap_name = kmalloc(count + 1, GFP_KERNEL);
> +	ret = -ENOMEM;
> +	if (!snap_name)
> +		return ret;
>  
>  	/* parse snaps add command */
> -	if (sscanf(buf, "%d "
> -		   "%" __stringify(RBD_MAX_SNAP_NAME_LEN) "s",
> -		   &target_id,
> -		   snap_name) != 2) {
> -		return -EINVAL;
> -	}
> -
> -	ret = -ENOMEM;
> +	snprintf(snap_name, count, "%s", buf);
>  	seg_name = kmalloc(RBD_MAX_SEG_NAME_LEN + 1, GFP_NOIO);
>  	if (!seg_name)
> -		return ret;
> +		goto done;
>  
>  	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
>  
> -	rbd_dev = __rbd_get_dev(target_id);
> -	if (!rbd_dev) {
> -		ret = -ENOENT;
> -		goto done_unlock;
> -	}
> -
>  	ret = snap_by_name(&rbd_dev->header, snap_name, &snapid, NULL);
>  	if (ret < 0)
>  		goto done_unlock;
> @@ -1750,7 +1959,7 @@ static ssize_t class_rbd_rollback(struct class *c,
>  				   seg_name, ret);
>  	}
>  
> -	ret = rbd_update_snaps(rbd_dev);
> +	ret = __rbd_update_snaps(rbd_dev);
>  	if (ret < 0)
>  		goto done_unlock;
>  
> @@ -1758,57 +1967,42 @@ static ssize_t class_rbd_rollback(struct class *c,
>  
>  done_unlock:
>  	mutex_unlock(&ctl_mutex);
> +done:
>  	kfree(seg_name);
> +	kfree(snap_name);
>  
>  	return ret;
>  }
>  
> -static struct class_attribute class_rbd_attrs[] = {
> -	__ATTR(add,		0200, NULL, class_rbd_add),
> -	__ATTR(remove,		0200, NULL, class_rbd_remove),
> -	__ATTR(list,		0444, class_rbd_list, NULL),
> -	__ATTR(snaps_refresh,	0200, NULL, class_rbd_snaps_refresh),
> -	__ATTR(snap_create,	0200, NULL, class_rbd_snap_create),
> -	__ATTR(snaps_list,	0444, class_rbd_snaps_list, NULL),
> -	__ATTR(snap_rollback,	0200, NULL, class_rbd_rollback),
> +static struct bus_attribute rbd_bus_attrs[] = {
> +	__ATTR(add, S_IWUSR, NULL, rbd_add),
> +	__ATTR(remove, S_IWUSR, NULL, rbd_remove),
>  	__ATTR_NULL
>  };
>  
>  /*
>   * create control files in sysfs
> - * /sys/class/rbd/...
> + * /sys/bus/rbd/...
>   */
>  static int rbd_sysfs_init(void)
>  {
> -	int ret = -ENOMEM;
> +	int ret;
>  
> -	class_rbd = kzalloc(sizeof(*class_rbd), GFP_KERNEL);
> -	if (!class_rbd)
> -		goto out;
> +	rbd_bus_type.bus_attrs = rbd_bus_attrs;
>  
> -	class_rbd->name = DRV_NAME;
> -	class_rbd->owner = THIS_MODULE;
> -	class_rbd->class_release = class_rbd_release;
> -	class_rbd->class_attrs = class_rbd_attrs;
> +	ret = bus_register(&rbd_bus_type);
> +	 if (ret < 0)
> +		return ret;
>  
> -	ret = class_register(class_rbd);
> -	if (ret)
> -		goto out_class;
> -	return 0;
> +	ret = device_register(&rbd_root_dev);
>  
> -out_class:
> -	kfree(class_rbd);
> -	class_rbd = NULL;
> -	pr_err(DRV_NAME ": failed to create class rbd\n");
> -out:
>  	return ret;
>  }
>  
>  static void rbd_sysfs_cleanup(void)
>  {
> -	if (class_rbd)
> -		class_destroy(class_rbd);
> -	class_rbd = NULL;
> +	device_unregister(&rbd_root_dev);
> +	bus_unregister(&rbd_bus_type);
>  }
>  
>  int __init rbd_init(void)
> -- 
> 1.5.6.5
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

  reply	other threads:[~2010-12-01 19:20 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-17  0:32 [PATCH] rbd: replace the rbd sysfs interface Yehuda Sadeh
2010-11-17 17:19 ` Greg KH
2010-11-17 23:00   ` Yehuda Sadeh Weinraub
2010-11-17 23:00     ` Yehuda Sadeh Weinraub
2010-11-18  1:30     ` Greg KH
2010-11-18 22:53       ` Yehuda Sadeh Weinraub
2010-11-18 22:53         ` Yehuda Sadeh Weinraub
2010-11-19  2:08         ` Greg KH
2010-11-19 20:42           ` Yehuda Sadeh Weinraub
2010-11-19 20:42             ` Yehuda Sadeh Weinraub
2010-11-23  0:14             ` Greg KH
2010-11-23  0:14               ` Greg KH
2010-11-23  0:48               ` Yehuda Sadeh Weinraub
2010-11-23  0:48                 ` Yehuda Sadeh Weinraub
2010-11-23  0:58                 ` Greg KH
2010-11-23  0:58                   ` Greg KH
2010-11-23  1:19                   ` Yehuda Sadeh Weinraub
2010-11-23  1:19                     ` Yehuda Sadeh Weinraub
2010-11-24  0:23                   ` Yehuda Sadeh
2010-12-01 19:25                     ` Sage Weil [this message]
2010-12-01 19:47                       ` Greg KH
2010-12-01 20:08                         ` Sage Weil
2010-12-01 20:23                           ` Greg KH
2010-12-02  0:11                             ` Sage Weil
2010-11-22 23:33           ` Yehuda Sadeh
2010-11-23  0:14             ` Greg KH
2010-11-23  0:45               ` Sage Weil
2010-11-23  0:56                 ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.1012011117460.8183@cobra.newdream.net \
    --to=sage@newdream.net \
    --cc=ceph-devel@vger.kernel.org \
    --cc=greg@kroah.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yehuda@hq.newdream.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.