All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] rbd: old patches from Josh
@ 2012-07-19 12:30 Alex Elder
  2012-07-19 12:34 ` [PATCH 1/6] rbd: return errors for mapped but deleted snapshot Alex Elder
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Alex Elder @ 2012-07-19 12:30 UTC (permalink / raw)
  To: ceph-devel

Late last year Josh Durgin had put together a series of
fixes for rbd that never got committed.  I told him I
would get them in, and this series represents the last
six that remain.

Here's a summary:
[PATCH 1/6] rbd: return errors for mapped but deleted snapshot
    This adds code to distinguish the result of attempting
    to read data from a deleted snapshot from the the result
    of reading a hole in a snapshot.  The former now produces
    ENXIO.
[PATCH 2/6] rbd: only reset capacity when pointing to head
    When an rbd header is refreshed, its capacity is set in
    case it has been changed.  This should not happen for
    mapped snapshots.
[PATCH 3/6] rbd: expose the correct size of the device in sysfs
    An rbd_dev--even one mapping a snashot--holds the size of
    it's base image in its header's image_size field.  The sysfs
    entry for the snapshot size was showing the wrong value.
[PATCH 4/6] rbd: set image size when header is updated
    The rbd image size was not getting updated when a header
    was refrehsed.
[PATCH 5/6] rbd: use reference counting for the snap context
    This makes sure the rbd code takes a reference to its
    snapshot context while a request related to that context
    is underway.
[PATCH 6/6] rbd: send header version when notifying
    This ensures the version that gets sent back on a watch
    notify acknowledgement is the one that got read as
    a result of refreshing the header.

I've reviewed them all, but am posting them for a chance for
others to comment before I commit them.

					-Alex

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/6] rbd: return errors for mapped but deleted snapshot
  2012-07-19 12:30 [PATCH 0/6] rbd: old patches from Josh Alex Elder
@ 2012-07-19 12:34 ` Alex Elder
  2012-07-19 12:34 ` [PATCH 2/6] rbd: only reset capacity when pointing to head Alex Elder
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-07-19 12:34 UTC (permalink / raw)
  To: ceph-devel

When a snapshot is deleted, the OSD will return ENOENT when reading
from it. This is normally interpreted as a hole by rbd, which will
return zeroes. To minimize the time in which this can happen, stop
requests early when we are notified that our snapshot no longer
exists.

[elder@inktank.com: updated __rbd_init_snaps_header() logic]

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   32 ++++++++++++++++++++++++++++++--
 1 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index b124442..730d0ce 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -172,9 +172,13 @@ struct rbd_device {

 	/* protects updating the header */
 	struct rw_semaphore     header_rwsem;
+	/* name of the snapshot this device reads from */
 	char                    *snap_name;
+	/* id of the snapshot this device reads from */
 	u64                     snap_id;	/* current snapshot id */
-	int read_only;
+	/* whether the snap_id this device reads from still exists */
+	bool                    snap_exists;
+	int                     read_only;

 	struct list_head	node;

@@ -597,6 +601,7 @@ static int rbd_header_set_snap(struct rbd_device
*rbd_dev, u64 *size)
 		else
 			snapc->seq = 0;
 		rbd_dev->snap_id = CEPH_NOSNAP;
+		rbd_dev->snap_exists = false;
 		rbd_dev->read_only = 0;
 		if (size)
 			*size = header->image_size;
@@ -606,6 +611,7 @@ static int rbd_header_set_snap(struct rbd_device
*rbd_dev, u64 *size)
 		if (ret < 0)
 			goto done;
 		rbd_dev->snap_id = snapc->seq;
+		rbd_dev->snap_exists = true;
 		rbd_dev->read_only = 1;
 	}

@@ -1468,6 +1474,21 @@ static void rbd_rq_fn(struct request_queue *q)

 		spin_unlock_irq(q->queue_lock);

+		if (rbd_dev->snap_id != CEPH_NOSNAP) {
+			bool snap_exists;
+
+			down_read(&rbd_dev->header_rwsem);
+			snap_exists = rbd_dev->snap_exists;
+			up_read(&rbd_dev->header_rwsem);
+
+			if (!snap_exists) {
+				dout("request for non-existent snapshot");
+				spin_lock_irq(q->queue_lock);
+				__blk_end_request_all(rq, -ENXIO);
+				continue;
+			}
+		}
+
 		dout("%s 0x%x bytes at 0x%llx\n",
 		     do_write ? "write" : "read",
 		     size, blk_rq_pos(rq) * SECTOR_SIZE);
@@ -2088,7 +2109,14 @@ static int __rbd_init_snaps_header(struct
rbd_device *rbd_dev)
 			cur_id = rbd_dev->header.snapc->snaps[i - 1];

 		if (!i || old_snap->id < cur_id) {
-			/* old_snap->id was skipped, thus was removed */
+			/*
+			 * old_snap->id was skipped, thus was
+			 * removed.  If this rbd_dev is mapped to
+			 * the removed snapshot, record that it no
+			 * longer exists, to prevent further I/O.
+			 */
+			if (rbd_dev->snap_id == old_snap->id)
+				rbd_dev->snap_exists = false;
 			__rbd_remove_snap_dev(rbd_dev, old_snap);
 			continue;
 		}
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/6] rbd: only reset capacity when pointing to head
  2012-07-19 12:30 [PATCH 0/6] rbd: old patches from Josh Alex Elder
  2012-07-19 12:34 ` [PATCH 1/6] rbd: return errors for mapped but deleted snapshot Alex Elder
@ 2012-07-19 12:34 ` Alex Elder
  2012-07-19 12:34 ` [PATCH 3/6] rbd: expose the correct size of the device in sysfs Alex Elder
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-07-19 12:34 UTC (permalink / raw)
  To: ceph-devel

Snapshots cannot be resized, and the new capacity of head should not
be reflected by the snapshot.

Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 730d0ce..f171ceb 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1725,7 +1725,12 @@ static int __rbd_refresh_header(struct rbd_device
*rbd_dev)
 		return ret;

 	/* resized? */
-	set_capacity(rbd_dev->disk, h.image_size / SECTOR_SIZE);
+	if (rbd_dev->snap_id == CEPH_NOSNAP) {
+		sector_t size = (sector_t) h.image_size / SECTOR_SIZE;
+
+		dout("setting size to %llu sectors", (unsigned long long) size);
+		set_capacity(rbd_dev->disk, size);
+	}

 	down_write(&rbd_dev->header_rwsem);

-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/6] rbd: expose the correct size of the device in sysfs
  2012-07-19 12:30 [PATCH 0/6] rbd: old patches from Josh Alex Elder
  2012-07-19 12:34 ` [PATCH 1/6] rbd: return errors for mapped but deleted snapshot Alex Elder
  2012-07-19 12:34 ` [PATCH 2/6] rbd: only reset capacity when pointing to head Alex Elder
@ 2012-07-19 12:34 ` Alex Elder
  2012-07-19 12:34 ` [PATCH 4/6] rbd: set image size when header is updated Alex Elder
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-07-19 12:34 UTC (permalink / raw)
  To: ceph-devel

If an image was mapped to a snapshot, the size of the head version
would be shown. Protect capacity with header_rwsem, since it may
change.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index f171ceb..9c3a1db 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1724,6 +1724,8 @@ static int __rbd_refresh_header(struct rbd_device
*rbd_dev)
 	if (ret < 0)
 		return ret;

+	down_write(&rbd_dev->header_rwsem);
+
 	/* resized? */
 	if (rbd_dev->snap_id == CEPH_NOSNAP) {
 		sector_t size = (sector_t) h.image_size / SECTOR_SIZE;
@@ -1732,8 +1734,6 @@ static int __rbd_refresh_header(struct rbd_device
*rbd_dev)
 		set_capacity(rbd_dev->disk, size);
 	}

-	down_write(&rbd_dev->header_rwsem);
-
 	snap_seq = rbd_dev->header.snapc->seq;
 	if (rbd_dev->header.total_snaps &&
 	    rbd_dev->header.snapc->snaps[0] == snap_seq)
@@ -1853,8 +1853,13 @@ static ssize_t rbd_size_show(struct device *dev,
 			     struct device_attribute *attr, char *buf)
 {
 	struct rbd_device *rbd_dev = dev_to_rbd_dev(dev);
+	sector_t size;
+
+	down_read(&rbd_dev->header_rwsem);
+	size = get_capacity(rbd_dev->disk);
+	up_read(&rbd_dev->header_rwsem);

-	return sprintf(buf, "%llu\n", (unsigned long
long)rbd_dev->header.image_size);
+	return sprintf(buf, "%llu\n", (unsigned long long) size * SECTOR_SIZE);
 }

 static ssize_t rbd_major_show(struct device *dev,
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 4/6] rbd: set image size when header is updated
  2012-07-19 12:30 [PATCH 0/6] rbd: old patches from Josh Alex Elder
                   ` (2 preceding siblings ...)
  2012-07-19 12:34 ` [PATCH 3/6] rbd: expose the correct size of the device in sysfs Alex Elder
@ 2012-07-19 12:34 ` Alex Elder
  2012-07-19 12:35 ` [PATCH 5/6] rbd: use reference counting for the snap context Alex Elder
  2012-07-19 12:35 ` [PATCH 6/6] rbd: send header version when notifying Alex Elder
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-07-19 12:34 UTC (permalink / raw)
  To: ceph-devel

The image may have been resized.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 9c3a1db..a6bbda2 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1746,6 +1746,7 @@ static int __rbd_refresh_header(struct rbd_device
*rbd_dev)
 	kfree(rbd_dev->header.snap_names);
 	kfree(rbd_dev->header.snapc);

+	rbd_dev->header.image_size = h.image_size;
 	rbd_dev->header.total_snaps = h.total_snaps;
 	rbd_dev->header.snapc = h.snapc;
 	rbd_dev->header.snap_names = h.snap_names;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 5/6] rbd: use reference counting for the snap context
  2012-07-19 12:30 [PATCH 0/6] rbd: old patches from Josh Alex Elder
                   ` (3 preceding siblings ...)
  2012-07-19 12:34 ` [PATCH 4/6] rbd: set image size when header is updated Alex Elder
@ 2012-07-19 12:35 ` Alex Elder
  2012-07-19 12:35 ` [PATCH 6/6] rbd: send header version when notifying Alex Elder
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-07-19 12:35 UTC (permalink / raw)
  To: ceph-devel

This prevents a race between requests with a given snap context and
header updates that free it. The osd client was already expecting the
snap context to be reference counted, since it get()s it in
ceph_osdc_build_request and put()s it when the request completes.

Also remove the second down_read()/up_read() on header_rwsem in
rbd_do_request, which wasn't actually preventing this race or
protecting any other data.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |   36 ++++++++++++++++++------------------
 1 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index a6bbda2..988f944 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -626,7 +626,7 @@ static void rbd_header_free(struct rbd_image_header
*header)
 	kfree(header->object_prefix);
 	kfree(header->snap_sizes);
 	kfree(header->snap_names);
-	kfree(header->snapc);
+	ceph_put_snap_context(header->snapc);
 }

 /*
@@ -902,13 +902,10 @@ static int rbd_do_request(struct request *rq,
 	dout("rbd_do_request object_name=%s ofs=%lld len=%lld\n",
 		object_name, len, ofs);

-	down_read(&rbd_dev->header_rwsem);
-
 	osdc = &rbd_dev->rbd_client->client->osdc;
 	req = ceph_osdc_alloc_request(osdc, flags, snapc, ops,
 					false, GFP_NOIO, pages, bio);
 	if (!req) {
-		up_read(&rbd_dev->header_rwsem);
 		ret = -ENOMEM;
 		goto done_pages;
 	}
@@ -942,7 +939,6 @@ static int rbd_do_request(struct request *rq,
 				snapc,
 				&mtime,
 				req->r_oid, req->r_oid_len);
-	up_read(&rbd_dev->header_rwsem);

 	if (linger_req) {
 		ceph_osdc_set_request_linger(osdc, req);
@@ -1448,6 +1444,7 @@ static void rbd_rq_fn(struct request_queue *q)
 		u64 ofs;
 		int num_segs, cur_seg = 0;
 		struct rbd_req_coll *coll;
+		struct ceph_snap_context *snapc;

 		/* peek at request from block layer */
 		if (!rq)
@@ -1474,21 +1471,20 @@ static void rbd_rq_fn(struct request_queue *q)

 		spin_unlock_irq(q->queue_lock);

-		if (rbd_dev->snap_id != CEPH_NOSNAP) {
-			bool snap_exists;
+		down_read(&rbd_dev->header_rwsem);

-			down_read(&rbd_dev->header_rwsem);
-			snap_exists = rbd_dev->snap_exists;
+		if (rbd_dev->snap_id != CEPH_NOSNAP && !rbd_dev->snap_exists) {
 			up_read(&rbd_dev->header_rwsem);
-
-			if (!snap_exists) {
-				dout("request for non-existent snapshot");
-				spin_lock_irq(q->queue_lock);
-				__blk_end_request_all(rq, -ENXIO);
-				continue;
-			}
+			dout("request for non-existent snapshot");
+			spin_lock_irq(q->queue_lock);
+			__blk_end_request_all(rq, -ENXIO);
+			continue;
 		}

+		snapc = ceph_get_snap_context(rbd_dev->header.snapc);
+
+		up_read(&rbd_dev->header_rwsem);
+
 		dout("%s 0x%x bytes at 0x%llx\n",
 		     do_write ? "write" : "read",
 		     size, blk_rq_pos(rq) * SECTOR_SIZE);
@@ -1498,6 +1494,7 @@ static void rbd_rq_fn(struct request_queue *q)
 		if (!coll) {
 			spin_lock_irq(q->queue_lock);
 			__blk_end_request_all(rq, -ENOMEM);
+			ceph_put_snap_context(snapc);
 			continue;
 		}

@@ -1521,7 +1518,7 @@ static void rbd_rq_fn(struct request_queue *q)
 			/* init OSD command: write or read */
 			if (do_write)
 				rbd_req_write(rq, rbd_dev,
-					      rbd_dev->header.snapc,
+					      snapc,
 					      ofs,
 					      op_size, bio,
 					      coll, cur_seg);
@@ -1544,6 +1541,8 @@ next_seg:
 		if (bp)
 			bio_pair_release(bp);
 		spin_lock_irq(q->queue_lock);
+
+		ceph_put_snap_context(snapc);
 	}
 }

@@ -1744,7 +1743,8 @@ static int __rbd_refresh_header(struct rbd_device
*rbd_dev)
 	/* rbd_dev->header.object_prefix shouldn't change */
 	kfree(rbd_dev->header.snap_sizes);
 	kfree(rbd_dev->header.snap_names);
-	kfree(rbd_dev->header.snapc);
+	/* osd requests may still refer to snapc */
+	ceph_put_snap_context(rbd_dev->header.snapc);

 	rbd_dev->header.image_size = h.image_size;
 	rbd_dev->header.total_snaps = h.total_snaps;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 6/6] rbd: send header version when notifying
  2012-07-19 12:30 [PATCH 0/6] rbd: old patches from Josh Alex Elder
                   ` (4 preceding siblings ...)
  2012-07-19 12:35 ` [PATCH 5/6] rbd: use reference counting for the snap context Alex Elder
@ 2012-07-19 12:35 ` Alex Elder
  5 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2012-07-19 12:35 UTC (permalink / raw)
  To: ceph-devel

Previously the original header version was sent. Now, we update it
when the header changes.

Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Alex Elder <elder@inktank.com>
---
 drivers/block/rbd.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 988f944..4d3a1e0 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1197,7 +1197,7 @@ static int rbd_req_sync_notify_ack(struct
rbd_device *rbd_dev,
 	if (ret < 0)
 		return ret;

-	ops[0].watch.ver = cpu_to_le64(rbd_dev->header.obj_version);
+	ops[0].watch.ver = cpu_to_le64(ver);
 	ops[0].watch.cookie = notify_id;
 	ops[0].watch.flag = 0;

@@ -1216,6 +1216,7 @@ static int rbd_req_sync_notify_ack(struct
rbd_device *rbd_dev,
 static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, void *data)
 {
 	struct rbd_device *rbd_dev = (struct rbd_device *)data;
+	u64 hver;
 	int rc;

 	if (!rbd_dev)
@@ -1225,12 +1226,13 @@ static void rbd_watch_cb(u64 ver, u64 notify_id,
u8 opcode, void *data)
 		rbd_dev->header_name, notify_id, (int) opcode);
 	mutex_lock_nested(&ctl_mutex, SINGLE_DEPTH_NESTING);
 	rc = __rbd_refresh_header(rbd_dev);
+	hver = rbd_dev->header.obj_version;
 	mutex_unlock(&ctl_mutex);
 	if (rc)
 		pr_warning(RBD_DRV_NAME "%d got notification but failed to "
 			   " update snaps: %d\n", rbd_dev->major, rc);

-	rbd_req_sync_notify_ack(rbd_dev, ver, notify_id, rbd_dev->header_name);
+	rbd_req_sync_notify_ack(rbd_dev, hver, notify_id, rbd_dev->header_name);
 }

 /*
@@ -1746,6 +1748,7 @@ static int __rbd_refresh_header(struct rbd_device
*rbd_dev)
 	/* osd requests may still refer to snapc */
 	ceph_put_snap_context(rbd_dev->header.snapc);

+	rbd_dev->header.obj_version = h.obj_version;
 	rbd_dev->header.image_size = h.image_size;
 	rbd_dev->header.total_snaps = h.total_snaps;
 	rbd_dev->header.snapc = h.snapc;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-07-19 12:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-19 12:30 [PATCH 0/6] rbd: old patches from Josh Alex Elder
2012-07-19 12:34 ` [PATCH 1/6] rbd: return errors for mapped but deleted snapshot Alex Elder
2012-07-19 12:34 ` [PATCH 2/6] rbd: only reset capacity when pointing to head Alex Elder
2012-07-19 12:34 ` [PATCH 3/6] rbd: expose the correct size of the device in sysfs Alex Elder
2012-07-19 12:34 ` [PATCH 4/6] rbd: set image size when header is updated Alex Elder
2012-07-19 12:35 ` [PATCH 5/6] rbd: use reference counting for the snap context Alex Elder
2012-07-19 12:35 ` [PATCH 6/6] rbd: send header version when notifying Alex Elder

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.