All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] rbd: four minor changes
@ 2012-10-22 16:49 Alex Elder
  2012-10-22 16:51 ` [PATCH 1/4] rbd: verify rbd image order value Alex Elder
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Alex Elder @ 2012-10-22 16:49 UTC (permalink / raw)
  To: ceph-devel

This adds a few minor changes to rbd.  The first tightens up
the validity checks for an rbd image.  The second loosens a
restriction on the length of user-provided snapshot name (but
adds early check to see if it's exceeded).  The third refactors
and comments a function, and the last gets rid of a structure
that we don't need to carry around after initialization.

					-Alex

[PATCH 1/4] rbd: verify rbd image order value
[PATCH 2/4] rbd: increase maximum snapshot name length
[PATCH 3/4] rbd: simplify rbd_merge_bvec()
[PATCH 4/4] rbd: kill rbd_device->rbd_opts


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] rbd: verify rbd image order value
  2012-10-22 16:49 [PATCH 0/4] rbd: four minor changes Alex Elder
@ 2012-10-22 16:51 ` Alex Elder
  2012-10-22 22:43   ` Dan Mick
  2012-10-24 18:02   ` Josh Durgin
  2012-10-22 16:51 ` [PATCH 2/4] rbd: increase maximum snapshot name length Alex Elder
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Alex Elder @ 2012-10-22 16:51 UTC (permalink / raw)
  To: ceph-devel

This adds a verification that an rbd image's object order is
within the upper and lower bounds supported by this implementation.

It must be at least 9 (SECTOR_SHIFT), because the Linux bio system
assumes that minimum granularity.

It also must be less than 32 (at the moment anyway) because there
exist spots in the code that store the size of a "segment" (object
backing an rbd image) in a signed int variable, which can be 32 bits
including the sign.  We should be able to relax this limit once
we've verified the code uses 64-bit types where needed.

Note that the CLI tool already limits the order to the range 12-25.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index d032883..4734446 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -533,6 +533,16 @@ static bool rbd_dev_ondisk_valid(struct
rbd_image_header_ondisk *ondisk)
 	if (memcmp(&ondisk->text, RBD_HEADER_TEXT, sizeof (RBD_HEADER_TEXT)))
 		return false;

+	/* The bio layer requires at least sector-sized I/O */
+
+	if (ondisk->options.order < SECTOR_SHIFT)
+		return false;
+
+	/* If we use u64 in a few spots we may be able to loosen this */
+
+	if (ondisk->options.order > 8 * sizeof (int) - 1)
+		return false;
+
 	/*
 	 * The size of a snapshot header has to fit in a size_t, and
 	 * that limits the number of snapshots.
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/4] rbd: increase maximum snapshot name length
  2012-10-22 16:49 [PATCH 0/4] rbd: four minor changes Alex Elder
  2012-10-22 16:51 ` [PATCH 1/4] rbd: verify rbd image order value Alex Elder
@ 2012-10-22 16:51 ` Alex Elder
  2012-10-24 19:01   ` Josh Durgin
  2012-10-24 21:05   ` Dan Mick
  2012-10-22 16:51 ` [PATCH 3/4] rbd: simplify rbd_merge_bvec() Alex Elder
  2012-10-22 16:51 ` [PATCH 4/4] rbd: kill rbd_device->rbd_opts Alex Elder
  3 siblings, 2 replies; 14+ messages in thread
From: Alex Elder @ 2012-10-22 16:51 UTC (permalink / raw)
  To: ceph-devel

Change RBD_MAX_SNAP_NAME_LEN to be based on NAME_MAX.  That is a
practical limit for the length of a snapshot name (based on the
presence of a directory using the name under /sys/bus/rbd to
represent the snapshot).

The /sys entry is created by prefixing it with "snap_"; define that
prefix symbolically, and take its length into account in defining
the snapshot name length limit.

Enforce the limit in rbd_add_parse_args().  Also delete a dout()
call in that function that was not meant to be committed.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 4734446..4858d92 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -61,7 +61,10 @@

 #define RBD_MINORS_PER_MAJOR	256		/* max minors per blkdev */

-#define RBD_MAX_SNAP_NAME_LEN	32
+#define RBD_SNAP_DEV_NAME_PREFIX	"snap_"
+#define RBD_MAX_SNAP_NAME_LEN	\
+			(NAME_MAX - (sizeof (RBD_SNAP_DEV_NAME_PREFIX) - 1))
+
 #define RBD_MAX_SNAP_COUNT	510	/* allows max snapc to fit in 4KB */
 #define RBD_MAX_OPT_LEN		1024

@@ -2063,7 +2066,7 @@ static int rbd_register_snap_dev(struct rbd_snap
*snap,
 	dev->type = &rbd_snap_device_type;
 	dev->parent = parent;
 	dev->release = rbd_snap_dev_release;
-	dev_set_name(dev, "snap_%s", snap->name);
+	dev_set_name(dev, "%s%s", RBD_SNAP_DEV_NAME_PREFIX, snap->name);
 	dout("%s: registering device for snapshot %s\n", __func__, snap->name);

 	ret = device_register(dev);
@@ -2797,8 +2800,13 @@ static char *rbd_add_parse_args(struct rbd_device
*rbd_dev,
 	if (!rbd_dev->image_name)
 		goto out_err;

-	/* Snapshot name is optional */
+	/* Snapshot name is optional; default is to use "head" */
+
 	len = next_token(&buf);
+	if (len > RBD_MAX_SNAP_NAME_LEN) {
+		err_ptr = ERR_PTR(-ENAMETOOLONG);
+		goto out_err;
+	}
 	if (!len) {
 		buf = RBD_SNAP_HEAD_NAME; /* No snapshot supplied */
 		len = sizeof (RBD_SNAP_HEAD_NAME) - 1;
@@ -2809,8 +2817,6 @@ static char *rbd_add_parse_args(struct rbd_device
*rbd_dev,
 	memcpy(snap_name, buf, len);
 	*(snap_name + len) = '\0';

-dout("    SNAP_NAME is <%s>, len is %zd\n", snap_name, len);
-
 	return snap_name;

 out_err:
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/4] rbd: simplify rbd_merge_bvec()
  2012-10-22 16:49 [PATCH 0/4] rbd: four minor changes Alex Elder
  2012-10-22 16:51 ` [PATCH 1/4] rbd: verify rbd image order value Alex Elder
  2012-10-22 16:51 ` [PATCH 2/4] rbd: increase maximum snapshot name length Alex Elder
@ 2012-10-22 16:51 ` Alex Elder
  2012-10-24 22:23   ` Dan Mick
  2012-10-22 16:51 ` [PATCH 4/4] rbd: kill rbd_device->rbd_opts Alex Elder
  3 siblings, 1 reply; 14+ messages in thread
From: Alex Elder @ 2012-10-22 16:51 UTC (permalink / raw)
  To: ceph-devel

The aim of this patch is to make what's going on rbd_merge_bvec() a
bit more obvious than it was before.  This was an issue when a
recent btrfs bug led us to question whether the merge function was
working correctly.

Use "seg" rather than "chunk" to indicate the units whose boundaries
we care about we call "segments."

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   51
+++++++++++++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 4858d92..4ccce4d 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1566,22 +1566,41 @@ static int rbd_merge_bvec(struct request_queue
*q, struct bvec_merge_data *bmd,
 			  struct bio_vec *bvec)
 {
 	struct rbd_device *rbd_dev = q->queuedata;
-	unsigned int chunk_sectors;
-	sector_t sector;
-	unsigned int bio_sectors;
-	int max;
-
-	chunk_sectors = 1 << (rbd_dev->header.obj_order - SECTOR_SHIFT);
-	sector = bmd->bi_sector + get_start_sect(bmd->bi_bdev);
-	bio_sectors = bmd->bi_size >> SECTOR_SHIFT;
-
-	max =  (chunk_sectors - ((sector & (chunk_sectors - 1))
-				 + bio_sectors)) << SECTOR_SHIFT;
-	if (max < 0)
-		max = 0; /* bio_add cannot handle a negative return */
-	if (max <= bvec->bv_len && bio_sectors == 0)
-		return bvec->bv_len;
-	return max;
+	sector_t sector_offset;
+	sector_t sectors_per_seg;
+	sector_t seg_sector_offset;
+	int ret;
+
+	/*
+	 * Find how far into its rbd segment the partition-relative
+	 * bio start sector is to offset relative to the enclosing
+	 * device.
+	 */
+	sector_offset = get_start_sect(bmd->bi_bdev) + bmd->bi_sector;
+	sectors_per_seg = 1 << (rbd_dev->header.obj_order - SECTOR_SHIFT);
+	seg_sector_offset = sector_offset & (sectors_per_seg - 1);
+
+	/*
+	 * Compute the number of bytes from that offset to the end
+	 * of the segment.  Account for what's already used by the bio.
+	 */
+	ret = (int) (sectors_per_seg - seg_sector_offset) << SECTOR_SHIFT;
+	if (ret >= bmd->bi_size)
+		ret -= bmd->bi_size;
+	else
+		ret = 0;
+
+	/*
+	 * Don't send back more than was asked for.  And if the bio
+	 * was empty, let the whole thing through because:  "Note
+	 * that a block device *must* allow a single page to be
+	 * added to an empty bio."
+	 */
+	rbd_assert(bvec->bv_len <= PAGE_SIZE);
+	if (ret > (int) bvec->bv_len || !bmd->bi_size)
+		ret = (int) bvec->bv_len;
+
+	return ret;
 }

 static void rbd_free_disk(struct rbd_device *rbd_dev)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/4] rbd: kill rbd_device->rbd_opts
  2012-10-22 16:49 [PATCH 0/4] rbd: four minor changes Alex Elder
                   ` (2 preceding siblings ...)
  2012-10-22 16:51 ` [PATCH 3/4] rbd: simplify rbd_merge_bvec() Alex Elder
@ 2012-10-22 16:51 ` Alex Elder
  2012-10-24 22:30   ` Josh Durgin
  2012-10-24 22:37   ` Dan Mick
  3 siblings, 2 replies; 14+ messages in thread
From: Alex Elder @ 2012-10-22 16:51 UTC (permalink / raw)
  To: ceph-devel

The rbd_device structure has an embedded rbd_options structure.
Such a structure is needed to work with the generic ceph argument
parsing code, but there's no need to keep it around once argument
parsing is done.

Use a local variable to hold the rbd options used in parsing in
rbd_get_client(), and just transfer its content (it's just a
read_only flag) into the field in the rbd_mapping sub-structure
that requires that information.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 4ccce4d..3c131ab 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -184,7 +184,6 @@ struct rbd_device {
 	struct gendisk		*disk;		/* blkdev's gendisk and rq */

 	u32			image_format;	/* Either 1 or 2 */
-	struct rbd_options	rbd_opts;
 	struct rbd_client	*rbd_client;

 	char			name[DEV_NAME_LEN]; /* blkdev name, e.g. rbd3 */
@@ -456,18 +455,24 @@ static int parse_rbd_opts_token(char *c, void
*private)
 static int rbd_get_client(struct rbd_device *rbd_dev, const char *mon_addr,
 				size_t mon_addr_len, char *options)
 {
-	struct rbd_options *rbd_opts = &rbd_dev->rbd_opts;
+	struct rbd_options rbd_opts;
 	struct ceph_options *ceph_opts;
 	struct rbd_client *rbdc;

-	rbd_opts->read_only = RBD_READ_ONLY_DEFAULT;
+	/* Initialize all rbd options to the defaults */
+
+	rbd_opts.read_only = RBD_READ_ONLY_DEFAULT;

 	ceph_opts = ceph_parse_options(options, mon_addr,
 					mon_addr + mon_addr_len,
-					parse_rbd_opts_token, rbd_opts);
+					parse_rbd_opts_token, &rbd_opts);
 	if (IS_ERR(ceph_opts))
 		return PTR_ERR(ceph_opts);

+	/* Record the parsed rbd options */
+
+	rbd_dev->mapping.read_only = rbd_opts.read_only;
+
 	rbdc = rbd_client_find(ceph_opts);
 	if (rbdc) {
 		/* using an existing client */
@@ -685,7 +690,6 @@ static int rbd_dev_set_mapping(struct rbd_device
*rbd_dev, char *snap_name)
 		rbd_dev->mapping.size = rbd_dev->header.image_size;
 		rbd_dev->mapping.features = rbd_dev->header.features;
 		rbd_dev->mapping.snap_exists = false;
-		rbd_dev->mapping.read_only = rbd_dev->rbd_opts.read_only;
 		ret = 0;
 	} else {
 		ret = snap_by_name(rbd_dev, snap_name);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] rbd: verify rbd image order value
  2012-10-22 16:51 ` [PATCH 1/4] rbd: verify rbd image order value Alex Elder
@ 2012-10-22 22:43   ` Dan Mick
  2012-10-24 18:02   ` Josh Durgin
  1 sibling, 0 replies; 14+ messages in thread
From: Dan Mick @ 2012-10-22 22:43 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Personally I find that sizeof(int) doesn't really help here, but 
otherwise it's fine by me

On 10/22/2012 09:51 AM, Alex Elder wrote:
> This adds a verification that an rbd image's object order is
> within the upper and lower bounds supported by this implementation.
>
> It must be at least 9 (SECTOR_SHIFT), because the Linux bio system
> assumes that minimum granularity.
>
> It also must be less than 32 (at the moment anyway) because there
> exist spots in the code that store the size of a "segment" (object
> backing an rbd image) in a signed int variable, which can be 32 bits
> including the sign.  We should be able to relax this limit once
> we've verified the code uses 64-bit types where needed.
>
> Note that the CLI tool already limits the order to the range 12-25.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |   10 ++++++++++
>   1 file changed, 10 insertions(+)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index d032883..4734446 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -533,6 +533,16 @@ static bool rbd_dev_ondisk_valid(struct
> rbd_image_header_ondisk *ondisk)
>   	if (memcmp(&ondisk->text, RBD_HEADER_TEXT, sizeof (RBD_HEADER_TEXT)))
>   		return false;
>
> +	/* The bio layer requires at least sector-sized I/O */
> +
> +	if (ondisk->options.order < SECTOR_SHIFT)
> +		return false;
> +
> +	/* If we use u64 in a few spots we may be able to loosen this */
> +
> +	if (ondisk->options.order > 8 * sizeof (int) - 1)
> +		return false;
> +
>   	/*
>   	 * The size of a snapshot header has to fit in a size_t, and
>   	 * that limits the number of snapshots.
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] rbd: verify rbd image order value
  2012-10-22 16:51 ` [PATCH 1/4] rbd: verify rbd image order value Alex Elder
  2012-10-22 22:43   ` Dan Mick
@ 2012-10-24 18:02   ` Josh Durgin
  2012-10-26 22:06     ` Alex Elder
  1 sibling, 1 reply; 14+ messages in thread
From: Josh Durgin @ 2012-10-24 18:02 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Might want to add some output so we can tell which error
is being hit.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

On 10/22/2012 09:51 AM, Alex Elder wrote:
> This adds a verification that an rbd image's object order is
> within the upper and lower bounds supported by this implementation.
>
> It must be at least 9 (SECTOR_SHIFT), because the Linux bio system
> assumes that minimum granularity.
>
> It also must be less than 32 (at the moment anyway) because there
> exist spots in the code that store the size of a "segment" (object
> backing an rbd image) in a signed int variable, which can be 32 bits
> including the sign.  We should be able to relax this limit once
> we've verified the code uses 64-bit types where needed.
>
> Note that the CLI tool already limits the order to the range 12-25.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |   10 ++++++++++
>   1 file changed, 10 insertions(+)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index d032883..4734446 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -533,6 +533,16 @@ static bool rbd_dev_ondisk_valid(struct
> rbd_image_header_ondisk *ondisk)
>   	if (memcmp(&ondisk->text, RBD_HEADER_TEXT, sizeof (RBD_HEADER_TEXT)))
>   		return false;
>
> +	/* The bio layer requires at least sector-sized I/O */
> +
> +	if (ondisk->options.order < SECTOR_SHIFT)
> +		return false;
> +
> +	/* If we use u64 in a few spots we may be able to loosen this */
> +
> +	if (ondisk->options.order > 8 * sizeof (int) - 1)
> +		return false;
> +
>   	/*
>   	 * The size of a snapshot header has to fit in a size_t, and
>   	 * that limits the number of snapshots.
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] rbd: increase maximum snapshot name length
  2012-10-22 16:51 ` [PATCH 2/4] rbd: increase maximum snapshot name length Alex Elder
@ 2012-10-24 19:01   ` Josh Durgin
  2012-10-24 21:05   ` Dan Mick
  1 sibling, 0 replies; 14+ messages in thread
From: Josh Durgin @ 2012-10-24 19:01 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

This is a better limit than before. We still need to
do some clean up of these limits across projects:

http://www.tracker.newdream.net/issues/1701

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

On 10/22/2012 09:51 AM, Alex Elder wrote:
> Change RBD_MAX_SNAP_NAME_LEN to be based on NAME_MAX.  That is a
> practical limit for the length of a snapshot name (based on the
> presence of a directory using the name under /sys/bus/rbd to
> represent the snapshot).
>
> The /sys entry is created by prefixing it with "snap_"; define that
> prefix symbolically, and take its length into account in defining
> the snapshot name length limit.
>
> Enforce the limit in rbd_add_parse_args().  Also delete a dout()
> call in that function that was not meant to be committed.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |   16 +++++++++++-----
>   1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 4734446..4858d92 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -61,7 +61,10 @@
>
>   #define RBD_MINORS_PER_MAJOR	256		/* max minors per blkdev */
>
> -#define RBD_MAX_SNAP_NAME_LEN	32
> +#define RBD_SNAP_DEV_NAME_PREFIX	"snap_"
> +#define RBD_MAX_SNAP_NAME_LEN	\
> +			(NAME_MAX - (sizeof (RBD_SNAP_DEV_NAME_PREFIX) - 1))
> +
>   #define RBD_MAX_SNAP_COUNT	510	/* allows max snapc to fit in 4KB */
>   #define RBD_MAX_OPT_LEN		1024
>
> @@ -2063,7 +2066,7 @@ static int rbd_register_snap_dev(struct rbd_snap
> *snap,
>   	dev->type = &rbd_snap_device_type;
>   	dev->parent = parent;
>   	dev->release = rbd_snap_dev_release;
> -	dev_set_name(dev, "snap_%s", snap->name);
> +	dev_set_name(dev, "%s%s", RBD_SNAP_DEV_NAME_PREFIX, snap->name);
>   	dout("%s: registering device for snapshot %s\n", __func__, snap->name);
>
>   	ret = device_register(dev);
> @@ -2797,8 +2800,13 @@ static char *rbd_add_parse_args(struct rbd_device
> *rbd_dev,
>   	if (!rbd_dev->image_name)
>   		goto out_err;
>
> -	/* Snapshot name is optional */
> +	/* Snapshot name is optional; default is to use "head" */
> +
>   	len = next_token(&buf);
> +	if (len > RBD_MAX_SNAP_NAME_LEN) {
> +		err_ptr = ERR_PTR(-ENAMETOOLONG);
> +		goto out_err;
> +	}
>   	if (!len) {
>   		buf = RBD_SNAP_HEAD_NAME; /* No snapshot supplied */
>   		len = sizeof (RBD_SNAP_HEAD_NAME) - 1;
> @@ -2809,8 +2817,6 @@ static char *rbd_add_parse_args(struct rbd_device
> *rbd_dev,
>   	memcpy(snap_name, buf, len);
>   	*(snap_name + len) = '\0';
>
> -dout("    SNAP_NAME is <%s>, len is %zd\n", snap_name, len);
> -
>   	return snap_name;
>
>   out_err:
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] rbd: increase maximum snapshot name length
  2012-10-22 16:51 ` [PATCH 2/4] rbd: increase maximum snapshot name length Alex Elder
  2012-10-24 19:01   ` Josh Durgin
@ 2012-10-24 21:05   ` Dan Mick
  1 sibling, 0 replies; 14+ messages in thread
From: Dan Mick @ 2012-10-24 21:05 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Dan Mick <dan.mick@inktank.com>

On 10/22/2012 09:51 AM, Alex Elder wrote:
> Change RBD_MAX_SNAP_NAME_LEN to be based on NAME_MAX.  That is a
> practical limit for the length of a snapshot name (based on the
> presence of a directory using the name under /sys/bus/rbd to
> represent the snapshot).
>
> The /sys entry is created by prefixing it with "snap_"; define that
> prefix symbolically, and take its length into account in defining
> the snapshot name length limit.
>
> Enforce the limit in rbd_add_parse_args().  Also delete a dout()
> call in that function that was not meant to be committed.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |   16 +++++++++++-----
>   1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 4734446..4858d92 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -61,7 +61,10 @@
>
>   #define RBD_MINORS_PER_MAJOR	256		/* max minors per blkdev */
>
> -#define RBD_MAX_SNAP_NAME_LEN	32
> +#define RBD_SNAP_DEV_NAME_PREFIX	"snap_"
> +#define RBD_MAX_SNAP_NAME_LEN	\
> +			(NAME_MAX - (sizeof (RBD_SNAP_DEV_NAME_PREFIX) - 1))
> +
>   #define RBD_MAX_SNAP_COUNT	510	/* allows max snapc to fit in 4KB */
>   #define RBD_MAX_OPT_LEN		1024
>
> @@ -2063,7 +2066,7 @@ static int rbd_register_snap_dev(struct rbd_snap
> *snap,
>   	dev->type = &rbd_snap_device_type;
>   	dev->parent = parent;
>   	dev->release = rbd_snap_dev_release;
> -	dev_set_name(dev, "snap_%s", snap->name);
> +	dev_set_name(dev, "%s%s", RBD_SNAP_DEV_NAME_PREFIX, snap->name);
>   	dout("%s: registering device for snapshot %s\n", __func__, snap->name);
>
>   	ret = device_register(dev);
> @@ -2797,8 +2800,13 @@ static char *rbd_add_parse_args(struct rbd_device
> *rbd_dev,
>   	if (!rbd_dev->image_name)
>   		goto out_err;
>
> -	/* Snapshot name is optional */
> +	/* Snapshot name is optional; default is to use "head" */
> +
>   	len = next_token(&buf);
> +	if (len > RBD_MAX_SNAP_NAME_LEN) {
> +		err_ptr = ERR_PTR(-ENAMETOOLONG);
> +		goto out_err;
> +	}
>   	if (!len) {
>   		buf = RBD_SNAP_HEAD_NAME; /* No snapshot supplied */
>   		len = sizeof (RBD_SNAP_HEAD_NAME) - 1;
> @@ -2809,8 +2817,6 @@ static char *rbd_add_parse_args(struct rbd_device
> *rbd_dev,
>   	memcpy(snap_name, buf, len);
>   	*(snap_name + len) = '\0';
>
> -dout("    SNAP_NAME is <%s>, len is %zd\n", snap_name, len);
> -
>   	return snap_name;
>
>   out_err:
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] rbd: simplify rbd_merge_bvec()
  2012-10-22 16:51 ` [PATCH 3/4] rbd: simplify rbd_merge_bvec() Alex Elder
@ 2012-10-24 22:23   ` Dan Mick
  2012-10-24 22:31     ` Josh Durgin
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Mick @ 2012-10-24 22:23 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

'segment' is better than 'chunk', but as these are RADOS objects, I 
really prefer just "object" here to make that clear.  Maybe we can add
a small block comment explaining the terminology just to make it 
crystal-clear?

rbd_object would be fine with me too.  The problem with "segment" is it 
tends to make me think I'm missing a subdivision of the RADOS objects 
that make up an rbd image that I didn't know about.

Otherwise,

Reviewed-by: Dan Mick <dan.mick@inktank.com>

On 10/22/2012 09:51 AM, Alex Elder wrote:
> The aim of this patch is to make what's going on rbd_merge_bvec() a
> bit more obvious than it was before.  This was an issue when a
> recent btrfs bug led us to question whether the merge function was
> working correctly.
>
> Use "seg" rather than "chunk" to indicate the units whose boundaries
> we care about we call "segments."
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |   51
> +++++++++++++++++++++++++++++++++++----------------
>   1 file changed, 35 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 4858d92..4ccce4d 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -1566,22 +1566,41 @@ static int rbd_merge_bvec(struct request_queue
> *q, struct bvec_merge_data *bmd,
>   			  struct bio_vec *bvec)
>   {
>   	struct rbd_device *rbd_dev = q->queuedata;
> -	unsigned int chunk_sectors;
> -	sector_t sector;
> -	unsigned int bio_sectors;
> -	int max;
> -
> -	chunk_sectors = 1 << (rbd_dev->header.obj_order - SECTOR_SHIFT);
> -	sector = bmd->bi_sector + get_start_sect(bmd->bi_bdev);
> -	bio_sectors = bmd->bi_size >> SECTOR_SHIFT;
> -
> -	max =  (chunk_sectors - ((sector & (chunk_sectors - 1))
> -				 + bio_sectors)) << SECTOR_SHIFT;
> -	if (max < 0)
> -		max = 0; /* bio_add cannot handle a negative return */
> -	if (max <= bvec->bv_len && bio_sectors == 0)
> -		return bvec->bv_len;
> -	return max;
> +	sector_t sector_offset;
> +	sector_t sectors_per_seg;
> +	sector_t seg_sector_offset;
> +	int ret;
> +
> +	/*
> +	 * Find how far into its rbd segment the partition-relative
> +	 * bio start sector is to offset relative to the enclosing
> +	 * device.
> +	 */
> +	sector_offset = get_start_sect(bmd->bi_bdev) + bmd->bi_sector;
> +	sectors_per_seg = 1 << (rbd_dev->header.obj_order - SECTOR_SHIFT);
> +	seg_sector_offset = sector_offset & (sectors_per_seg - 1);
> +
> +	/*
> +	 * Compute the number of bytes from that offset to the end
> +	 * of the segment.  Account for what's already used by the bio.
> +	 */
> +	ret = (int) (sectors_per_seg - seg_sector_offset) << SECTOR_SHIFT;
> +	if (ret >= bmd->bi_size)
> +		ret -= bmd->bi_size;
> +	else
> +		ret = 0;
> +
> +	/*
> +	 * Don't send back more than was asked for.  And if the bio
> +	 * was empty, let the whole thing through because:  "Note
> +	 * that a block device *must* allow a single page to be
> +	 * added to an empty bio."
> +	 */
> +	rbd_assert(bvec->bv_len <= PAGE_SIZE);
> +	if (ret > (int) bvec->bv_len || !bmd->bi_size)
> +		ret = (int) bvec->bv_len;
> +
> +	return ret;
>   }
>
>   static void rbd_free_disk(struct rbd_device *rbd_dev)
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] rbd: kill rbd_device->rbd_opts
  2012-10-22 16:51 ` [PATCH 4/4] rbd: kill rbd_device->rbd_opts Alex Elder
@ 2012-10-24 22:30   ` Josh Durgin
  2012-10-24 22:37   ` Dan Mick
  1 sibling, 0 replies; 14+ messages in thread
From: Josh Durgin @ 2012-10-24 22:30 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>

On 10/22/2012 09:51 AM, Alex Elder wrote:
> The rbd_device structure has an embedded rbd_options structure.
> Such a structure is needed to work with the generic ceph argument
> parsing code, but there's no need to keep it around once argument
> parsing is done.
>
> Use a local variable to hold the rbd options used in parsing in
> rbd_get_client(), and just transfer its content (it's just a
> read_only flag) into the field in the rbd_mapping sub-structure
> that requires that information.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |   14 +++++++++-----
>   1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 4ccce4d..3c131ab 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -184,7 +184,6 @@ struct rbd_device {
>   	struct gendisk		*disk;		/* blkdev's gendisk and rq */
>
>   	u32			image_format;	/* Either 1 or 2 */
> -	struct rbd_options	rbd_opts;
>   	struct rbd_client	*rbd_client;
>
>   	char			name[DEV_NAME_LEN]; /* blkdev name, e.g. rbd3 */
> @@ -456,18 +455,24 @@ static int parse_rbd_opts_token(char *c, void
> *private)
>   static int rbd_get_client(struct rbd_device *rbd_dev, const char *mon_addr,
>   				size_t mon_addr_len, char *options)
>   {
> -	struct rbd_options *rbd_opts = &rbd_dev->rbd_opts;
> +	struct rbd_options rbd_opts;
>   	struct ceph_options *ceph_opts;
>   	struct rbd_client *rbdc;
>
> -	rbd_opts->read_only = RBD_READ_ONLY_DEFAULT;
> +	/* Initialize all rbd options to the defaults */
> +
> +	rbd_opts.read_only = RBD_READ_ONLY_DEFAULT;
>
>   	ceph_opts = ceph_parse_options(options, mon_addr,
>   					mon_addr + mon_addr_len,
> -					parse_rbd_opts_token, rbd_opts);
> +					parse_rbd_opts_token, &rbd_opts);
>   	if (IS_ERR(ceph_opts))
>   		return PTR_ERR(ceph_opts);
>
> +	/* Record the parsed rbd options */
> +
> +	rbd_dev->mapping.read_only = rbd_opts.read_only;
> +
>   	rbdc = rbd_client_find(ceph_opts);
>   	if (rbdc) {
>   		/* using an existing client */
> @@ -685,7 +690,6 @@ static int rbd_dev_set_mapping(struct rbd_device
> *rbd_dev, char *snap_name)
>   		rbd_dev->mapping.size = rbd_dev->header.image_size;
>   		rbd_dev->mapping.features = rbd_dev->header.features;
>   		rbd_dev->mapping.snap_exists = false;
> -		rbd_dev->mapping.read_only = rbd_dev->rbd_opts.read_only;
>   		ret = 0;
>   	} else {
>   		ret = snap_by_name(rbd_dev, snap_name);
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] rbd: simplify rbd_merge_bvec()
  2012-10-24 22:23   ` Dan Mick
@ 2012-10-24 22:31     ` Josh Durgin
  0 siblings, 0 replies; 14+ messages in thread
From: Josh Durgin @ 2012-10-24 22:31 UTC (permalink / raw)
  To: Dan Mick; +Cc: Alex Elder, ceph-devel


On 10/24/2012 03:23 PM, Dan Mick wrote:
> 'segment' is better than 'chunk', but as these are RADOS objects, I
> really prefer just "object" here to make that clear.  Maybe we can add
> a small block comment explaining the terminology just to make it
> crystal-clear?
>
> rbd_object would be fine with me too.  The problem with "segment" is it
> tends to make me think I'm missing a subdivision of the RADOS objects
> that make up an rbd image that I didn't know about.

I agree about the naming.

> Otherwise,
>
> Reviewed-by: Dan Mick <dan.mick@inktank.com>
>
> On 10/22/2012 09:51 AM, Alex Elder wrote:
>> The aim of this patch is to make what's going on rbd_merge_bvec() a
>> bit more obvious than it was before.  This was an issue when a
>> recent btrfs bug led us to question whether the merge function was
>> working correctly.
>>
>> Use "seg" rather than "chunk" to indicate the units whose boundaries
>> we care about we call "segments."
>>
>> Signed-off-by: Alex Elder <elder@inktank.com>
>> ---
>>   drivers/block/rbd.c |   51
>> +++++++++++++++++++++++++++++++++++----------------
>>   1 file changed, 35 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>> index 4858d92..4ccce4d 100644
>> --- a/drivers/block/rbd.c
>> +++ b/drivers/block/rbd.c
>> @@ -1566,22 +1566,41 @@ static int rbd_merge_bvec(struct request_queue
>> *q, struct bvec_merge_data *bmd,
>>                 struct bio_vec *bvec)
>>   {
>>       struct rbd_device *rbd_dev = q->queuedata;
>> -    unsigned int chunk_sectors;
>> -    sector_t sector;
>> -    unsigned int bio_sectors;
>> -    int max;
>> -
>> -    chunk_sectors = 1 << (rbd_dev->header.obj_order - SECTOR_SHIFT);
>> -    sector = bmd->bi_sector + get_start_sect(bmd->bi_bdev);
>> -    bio_sectors = bmd->bi_size >> SECTOR_SHIFT;
>> -
>> -    max =  (chunk_sectors - ((sector & (chunk_sectors - 1))
>> -                 + bio_sectors)) << SECTOR_SHIFT;
>> -    if (max < 0)
>> -        max = 0; /* bio_add cannot handle a negative return */
>> -    if (max <= bvec->bv_len && bio_sectors == 0)
>> -        return bvec->bv_len;
>> -    return max;
>> +    sector_t sector_offset;
>> +    sector_t sectors_per_seg;
>> +    sector_t seg_sector_offset;
>> +    int ret;
>> +
>> +    /*
>> +     * Find how far into its rbd segment the partition-relative
>> +     * bio start sector is to offset relative to the enclosing
>> +     * device.
>> +     */
>> +    sector_offset = get_start_sect(bmd->bi_bdev) + bmd->bi_sector;
>> +    sectors_per_seg = 1 << (rbd_dev->header.obj_order - SECTOR_SHIFT);
>> +    seg_sector_offset = sector_offset & (sectors_per_seg - 1);
>> +
>> +    /*
>> +     * Compute the number of bytes from that offset to the end
>> +     * of the segment.  Account for what's already used by the bio.
>> +     */
>> +    ret = (int) (sectors_per_seg - seg_sector_offset) << SECTOR_SHIFT;
>> +    if (ret >= bmd->bi_size)
>> +        ret -= bmd->bi_size;
>> +    else
>> +        ret = 0;
>> +
>> +    /*
>> +     * Don't send back more than was asked for.  And if the bio
>> +     * was empty, let the whole thing through because:  "Note
>> +     * that a block device *must* allow a single page to be
>> +     * added to an empty bio."
>> +     */
>> +    rbd_assert(bvec->bv_len <= PAGE_SIZE);
>> +    if (ret > (int) bvec->bv_len || !bmd->bi_size)
>> +        ret = (int) bvec->bv_len;
>> +
>> +    return ret;
>>   }
>>
>>   static void rbd_free_disk(struct rbd_device *rbd_dev)
>>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] rbd: kill rbd_device->rbd_opts
  2012-10-22 16:51 ` [PATCH 4/4] rbd: kill rbd_device->rbd_opts Alex Elder
  2012-10-24 22:30   ` Josh Durgin
@ 2012-10-24 22:37   ` Dan Mick
  1 sibling, 0 replies; 14+ messages in thread
From: Dan Mick @ 2012-10-24 22:37 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Dan Mick <dan.mick@inktank.com>

On 10/22/2012 09:51 AM, Alex Elder wrote:
> The rbd_device structure has an embedded rbd_options structure.
> Such a structure is needed to work with the generic ceph argument
> parsing code, but there's no need to keep it around once argument
> parsing is done.
>
> Use a local variable to hold the rbd options used in parsing in
> rbd_get_client(), and just transfer its content (it's just a
> read_only flag) into the field in the rbd_mapping sub-structure
> that requires that information.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>   drivers/block/rbd.c |   14 +++++++++-----
>   1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 4ccce4d..3c131ab 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -184,7 +184,6 @@ struct rbd_device {
>   	struct gendisk		*disk;		/* blkdev's gendisk and rq */
>
>   	u32			image_format;	/* Either 1 or 2 */
> -	struct rbd_options	rbd_opts;
>   	struct rbd_client	*rbd_client;
>
>   	char			name[DEV_NAME_LEN]; /* blkdev name, e.g. rbd3 */
> @@ -456,18 +455,24 @@ static int parse_rbd_opts_token(char *c, void
> *private)
>   static int rbd_get_client(struct rbd_device *rbd_dev, const char *mon_addr,
>   				size_t mon_addr_len, char *options)
>   {
> -	struct rbd_options *rbd_opts = &rbd_dev->rbd_opts;
> +	struct rbd_options rbd_opts;
>   	struct ceph_options *ceph_opts;
>   	struct rbd_client *rbdc;
>
> -	rbd_opts->read_only = RBD_READ_ONLY_DEFAULT;
> +	/* Initialize all rbd options to the defaults */
> +
> +	rbd_opts.read_only = RBD_READ_ONLY_DEFAULT;
>
>   	ceph_opts = ceph_parse_options(options, mon_addr,
>   					mon_addr + mon_addr_len,
> -					parse_rbd_opts_token, rbd_opts);
> +					parse_rbd_opts_token, &rbd_opts);
>   	if (IS_ERR(ceph_opts))
>   		return PTR_ERR(ceph_opts);
>
> +	/* Record the parsed rbd options */
> +
> +	rbd_dev->mapping.read_only = rbd_opts.read_only;
> +
>   	rbdc = rbd_client_find(ceph_opts);
>   	if (rbdc) {
>   		/* using an existing client */
> @@ -685,7 +690,6 @@ static int rbd_dev_set_mapping(struct rbd_device
> *rbd_dev, char *snap_name)
>   		rbd_dev->mapping.size = rbd_dev->header.image_size;
>   		rbd_dev->mapping.features = rbd_dev->header.features;
>   		rbd_dev->mapping.snap_exists = false;
> -		rbd_dev->mapping.read_only = rbd_dev->rbd_opts.read_only;
>   		ret = 0;
>   	} else {
>   		ret = snap_by_name(rbd_dev, snap_name);
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] rbd: verify rbd image order value
  2012-10-24 18:02   ` Josh Durgin
@ 2012-10-26 22:06     ` Alex Elder
  0 siblings, 0 replies; 14+ messages in thread
From: Alex Elder @ 2012-10-26 22:06 UTC (permalink / raw)
  To: Josh Durgin; +Cc: ceph-devel

On 10/24/2012 01:02 PM, Josh Durgin wrote:
> Might want to add some output so we can tell which error
> is being hit.

This is true for all of the cases that fail, not just those
I'm adding here.

I'll implement this suggestion separately too, probably next
week.

					-Alex

> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
> 
> On 10/22/2012 09:51 AM, Alex Elder wrote:
>> This adds a verification that an rbd image's object order is
>> within the upper and lower bounds supported by this implementation.
>>
>> It must be at least 9 (SECTOR_SHIFT), because the Linux bio system
>> assumes that minimum granularity.
>>
>> It also must be less than 32 (at the moment anyway) because there
>> exist spots in the code that store the size of a "segment" (object
>> backing an rbd image) in a signed int variable, which can be 32 bits
>> including the sign.  We should be able to relax this limit once
>> we've verified the code uses 64-bit types where needed.
>>
>> Note that the CLI tool already limits the order to the range 12-25.
>>
>> Signed-off-by: Alex Elder <elder@inktank.com>
>> ---
>>   drivers/block/rbd.c |   10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>> index d032883..4734446 100644
>> --- a/drivers/block/rbd.c
>> +++ b/drivers/block/rbd.c
>> @@ -533,6 +533,16 @@ static bool rbd_dev_ondisk_valid(struct
>> rbd_image_header_ondisk *ondisk)
>>       if (memcmp(&ondisk->text, RBD_HEADER_TEXT, sizeof
>> (RBD_HEADER_TEXT)))
>>           return false;
>>
>> +    /* The bio layer requires at least sector-sized I/O */
>> +
>> +    if (ondisk->options.order < SECTOR_SHIFT)
>> +        return false;
>> +
>> +    /* If we use u64 in a few spots we may be able to loosen this */
>> +
>> +    if (ondisk->options.order > 8 * sizeof (int) - 1)
>> +        return false;
>> +
>>       /*
>>        * The size of a snapshot header has to fit in a size_t, and
>>        * that limits the number of snapshots.
>>
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-10-26 22:06 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-22 16:49 [PATCH 0/4] rbd: four minor changes Alex Elder
2012-10-22 16:51 ` [PATCH 1/4] rbd: verify rbd image order value Alex Elder
2012-10-22 22:43   ` Dan Mick
2012-10-24 18:02   ` Josh Durgin
2012-10-26 22:06     ` Alex Elder
2012-10-22 16:51 ` [PATCH 2/4] rbd: increase maximum snapshot name length Alex Elder
2012-10-24 19:01   ` Josh Durgin
2012-10-24 21:05   ` Dan Mick
2012-10-22 16:51 ` [PATCH 3/4] rbd: simplify rbd_merge_bvec() Alex Elder
2012-10-24 22:23   ` Dan Mick
2012-10-24 22:31     ` Josh Durgin
2012-10-22 16:51 ` [PATCH 4/4] rbd: kill rbd_device->rbd_opts Alex Elder
2012-10-24 22:30   ` Josh Durgin
2012-10-24 22:37   ` Dan Mick

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.