All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628)
@ 2014-06-25 17:16 Ilya Dryomov
  2014-06-25 17:16 ` [PATCH 01/14] libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item Ilya Dryomov
                   ` (13 more replies)
  0 siblings, 14 replies; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Hi,

This series is the result of hunting down and fixing #6628.  The core
patches are:

 7/14 libceph: unregister only registered linger requests
10/14 rbd: rbd_obj_request_wait() should cancel the request if interrupted
12/14 rbd: use rbd_obj_watch_request_helper() helper
14/14 libceph: drop osd ref when canceling con work

The rest are cleanups and miscellaneous fixes.  wip-remove-osd-6628 has
been force-pushed.

Thanks,

                Ilya


Ilya Dryomov (14):
  libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item
  libceph: add maybe_move_osd_to_lru() and switch to it
  libceph: move and add dout()s to ceph_msg_{get,put}()
  libceph: move and add dout()s to ceph_osdc_request_{get,put}()
  libceph: harden ceph_osdc_request_release() a bit
  libceph: assert both regular and lingering lists in __remove_osd()
  libceph: unregister only registered linger requests
  libceph: fix linger request check in __unregister_request()
  libceph: introduce ceph_osdc_cancel_request()
  rbd: rbd_obj_request_wait() should cancel the request if interrupted
  rbd: add rbd_obj_watch_request_helper() helper
  rbd: use rbd_obj_watch_request_helper() helper
  libceph: nuke ceph_osdc_unregister_linger_request()
  libceph: drop osd ref when canceling con work

 drivers/block/rbd.c             |  191 ++++++++++++++++++---------------------
 include/linux/ceph/messenger.h  |   14 +--
 include/linux/ceph/osd_client.h |   18 +---
 net/ceph/messenger.c            |   45 ++++++---
 net/ceph/osd_client.c           |  129 +++++++++++++++++---------
 5 files changed, 217 insertions(+), 180 deletions(-)

-- 
1.7.10.4


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 01/14] libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 12:16   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 02/14] libceph: add maybe_move_osd_to_lru() and switch to it Ilya Dryomov
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

So that:

req->r_osd_item --> osd->o_requests list
req->r_linger_osd_item --> osd->o_linger_requests list

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 include/linux/ceph/osd_client.h |    2 +-
 net/ceph/osd_client.c           |    8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index 94ec69672164..7490a03ac163 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -117,7 +117,7 @@ struct ceph_osd_request {
 	struct list_head r_req_lru_item;
 	struct list_head r_osd_item;
 	struct list_head r_linger_item;
-	struct list_head r_linger_osd;
+	struct list_head r_linger_osd_item;
 	struct ceph_osd *r_osd;
 	struct ceph_pg   r_pgid;
 	int              r_pg_osds[CEPH_PG_MAX_SIZE];
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 05be0c181695..d5d2be3bd113 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -364,7 +364,7 @@ struct ceph_osd_request *ceph_osdc_alloc_request(struct ceph_osd_client *osdc,
 	RB_CLEAR_NODE(&req->r_node);
 	INIT_LIST_HEAD(&req->r_unsafe_item);
 	INIT_LIST_HEAD(&req->r_linger_item);
-	INIT_LIST_HEAD(&req->r_linger_osd);
+	INIT_LIST_HEAD(&req->r_linger_osd_item);
 	INIT_LIST_HEAD(&req->r_req_lru_item);
 	INIT_LIST_HEAD(&req->r_osd_item);
 
@@ -916,7 +916,7 @@ static void __kick_osd_requests(struct ceph_osd_client *osdc,
 	 * list at the end to keep things in tid order.
 	 */
 	list_for_each_entry_safe(req, nreq, &osd->o_linger_requests,
-				 r_linger_osd) {
+				 r_linger_osd_item) {
 		/*
 		 * reregister request prior to unregistering linger so
 		 * that r_osd is preserved.
@@ -1218,7 +1218,7 @@ static void __register_linger_request(struct ceph_osd_client *osdc,
 	ceph_osdc_get_request(req);
 	list_add_tail(&req->r_linger_item, &osdc->req_linger);
 	if (req->r_osd)
-		list_add_tail(&req->r_linger_osd,
+		list_add_tail(&req->r_linger_osd_item,
 			      &req->r_osd->o_linger_requests);
 }
 
@@ -1228,7 +1228,7 @@ static void __unregister_linger_request(struct ceph_osd_client *osdc,
 	dout("__unregister_linger_request %p\n", req);
 	list_del_init(&req->r_linger_item);
 	if (req->r_osd) {
-		list_del_init(&req->r_linger_osd);
+		list_del_init(&req->r_linger_osd_item);
 
 		if (list_empty(&req->r_osd->o_requests) &&
 		    list_empty(&req->r_osd->o_linger_requests)) {
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 02/14] libceph: add maybe_move_osd_to_lru() and switch to it
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
  2014-06-25 17:16 ` [PATCH 01/14] libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 12:17   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 03/14] libceph: move and add dout()s to ceph_msg_{get,put}() Ilya Dryomov
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Abstract out __move_osd_to_lru() logic from __unregister_request() and
__unregister_linger_request().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 net/ceph/osd_client.c |   26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index d5d2be3bd113..6202923b41ff 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1029,12 +1029,23 @@ static void remove_all_osds(struct ceph_osd_client *osdc)
 static void __move_osd_to_lru(struct ceph_osd_client *osdc,
 			      struct ceph_osd *osd)
 {
-	dout("__move_osd_to_lru %p\n", osd);
+	dout("%s %p\n", __func__, osd);
 	BUG_ON(!list_empty(&osd->o_osd_lru));
+
 	list_add_tail(&osd->o_osd_lru, &osdc->osd_lru);
 	osd->lru_ttl = jiffies + osdc->client->options->osd_idle_ttl * HZ;
 }
 
+static void maybe_move_osd_to_lru(struct ceph_osd_client *osdc,
+				  struct ceph_osd *osd)
+{
+	dout("%s %p\n", __func__, osd);
+
+	if (list_empty(&osd->o_requests) &&
+	    list_empty(&osd->o_linger_requests))
+		__move_osd_to_lru(osdc, osd);
+}
+
 static void __remove_osd_from_lru(struct ceph_osd *osd)
 {
 	dout("__remove_osd_from_lru %p\n", osd);
@@ -1182,11 +1193,7 @@ static void __unregister_request(struct ceph_osd_client *osdc,
 		ceph_msg_revoke(req->r_request);
 
 		list_del_init(&req->r_osd_item);
-		if (list_empty(&req->r_osd->o_requests) &&
-		    list_empty(&req->r_osd->o_linger_requests)) {
-			dout("moving osd to %p lru\n", req->r_osd);
-			__move_osd_to_lru(osdc, req->r_osd);
-		}
+		maybe_move_osd_to_lru(osdc, req->r_osd);
 		if (list_empty(&req->r_linger_item))
 			req->r_osd = NULL;
 	}
@@ -1229,12 +1236,7 @@ static void __unregister_linger_request(struct ceph_osd_client *osdc,
 	list_del_init(&req->r_linger_item);
 	if (req->r_osd) {
 		list_del_init(&req->r_linger_osd_item);
-
-		if (list_empty(&req->r_osd->o_requests) &&
-		    list_empty(&req->r_osd->o_linger_requests)) {
-			dout("moving osd to %p lru\n", req->r_osd);
-			__move_osd_to_lru(osdc, req->r_osd);
-		}
+		maybe_move_osd_to_lru(osdc, req->r_osd);
 		if (list_empty(&req->r_osd_item))
 			req->r_osd = NULL;
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 03/14] libceph: move and add dout()s to ceph_msg_{get,put}()
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
  2014-06-25 17:16 ` [PATCH 01/14] libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item Ilya Dryomov
  2014-06-25 17:16 ` [PATCH 02/14] libceph: add maybe_move_osd_to_lru() and switch to it Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 12:29   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 04/14] libceph: move and add dout()s to ceph_osdc_request_{get,put}() Ilya Dryomov
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Add dout()s to ceph_msg_{get,put}().  Also move them to .c and turn
kref release callback into a static function.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 include/linux/ceph/messenger.h |   14 ++------------
 net/ceph/messenger.c           |   31 ++++++++++++++++++++++---------
 2 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index d21f2dba0731..40ae58e3e9db 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -285,19 +285,9 @@ extern void ceph_msg_data_add_bio(struct ceph_msg *msg, struct bio *bio,
 
 extern struct ceph_msg *ceph_msg_new(int type, int front_len, gfp_t flags,
 				     bool can_fail);
-extern void ceph_msg_kfree(struct ceph_msg *m);
 
-
-static inline struct ceph_msg *ceph_msg_get(struct ceph_msg *msg)
-{
-	kref_get(&msg->kref);
-	return msg;
-}
-extern void ceph_msg_last_put(struct kref *kref);
-static inline void ceph_msg_put(struct ceph_msg *msg)
-{
-	kref_put(&msg->kref, ceph_msg_last_put);
-}
+extern struct ceph_msg *ceph_msg_get(struct ceph_msg *msg);
+extern void ceph_msg_put(struct ceph_msg *msg);
 
 extern void ceph_msg_dump(struct ceph_msg *msg);
 
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 1948d592aa54..8bffa5b90fef 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -3269,24 +3269,21 @@ static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip)
 /*
  * Free a generically kmalloc'd message.
  */
-void ceph_msg_kfree(struct ceph_msg *m)
+static void ceph_msg_free(struct ceph_msg *m)
 {
-	dout("msg_kfree %p\n", m);
+	dout("%s %p\n", __func__, m);
 	ceph_kvfree(m->front.iov_base);
 	kmem_cache_free(ceph_msg_cache, m);
 }
 
-/*
- * Drop a msg ref.  Destroy as needed.
- */
-void ceph_msg_last_put(struct kref *kref)
+static void ceph_msg_release(struct kref *kref)
 {
 	struct ceph_msg *m = container_of(kref, struct ceph_msg, kref);
 	LIST_HEAD(data);
 	struct list_head *links;
 	struct list_head *next;
 
-	dout("ceph_msg_put last one on %p\n", m);
+	dout("%s %p\n", __func__, m);
 	WARN_ON(!list_empty(&m->list_head));
 
 	/* drop middle, data, if any */
@@ -3308,9 +3305,25 @@ void ceph_msg_last_put(struct kref *kref)
 	if (m->pool)
 		ceph_msgpool_put(m->pool, m);
 	else
-		ceph_msg_kfree(m);
+		ceph_msg_free(m);
+}
+
+struct ceph_msg *ceph_msg_get(struct ceph_msg *msg)
+{
+	dout("%s %p (was %d)\n", __func__, msg,
+	     atomic_read(&msg->kref.refcount));
+	kref_get(&msg->kref);
+	return msg;
+}
+EXPORT_SYMBOL(ceph_msg_get);
+
+void ceph_msg_put(struct ceph_msg *msg)
+{
+	dout("%s %p (was %d)\n", __func__, msg,
+	     atomic_read(&msg->kref.refcount));
+	kref_put(&msg->kref, ceph_msg_release);
 }
-EXPORT_SYMBOL(ceph_msg_last_put);
+EXPORT_SYMBOL(ceph_msg_put);
 
 void ceph_msg_dump(struct ceph_msg *msg)
 {
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 04/14] libceph: move and add dout()s to ceph_osdc_request_{get,put}()
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (2 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 03/14] libceph: move and add dout()s to ceph_msg_{get,put}() Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 12:32   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 05/14] libceph: harden ceph_osdc_request_release() a bit Ilya Dryomov
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Add dout()s to ceph_osdc_request_{get,put}().  Also move them to .c and
turn kref release callback into a static function.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 include/linux/ceph/osd_client.h |   11 ++---------
 net/ceph/osd_client.c           |   26 ++++++++++++++++++++++----
 2 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index 7490a03ac163..a8d5652f589d 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -328,15 +328,8 @@ extern void ceph_osdc_set_request_linger(struct ceph_osd_client *osdc,
 extern void ceph_osdc_unregister_linger_request(struct ceph_osd_client *osdc,
 						struct ceph_osd_request *req);
 
-static inline void ceph_osdc_get_request(struct ceph_osd_request *req)
-{
-	kref_get(&req->r_kref);
-}
-extern void ceph_osdc_release_request(struct kref *kref);
-static inline void ceph_osdc_put_request(struct ceph_osd_request *req)
-{
-	kref_put(&req->r_kref, ceph_osdc_release_request);
-}
+extern void ceph_osdc_get_request(struct ceph_osd_request *req);
+extern void ceph_osdc_put_request(struct ceph_osd_request *req);
 
 extern int ceph_osdc_start_request(struct ceph_osd_client *osdc,
 				   struct ceph_osd_request *req,
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 6202923b41ff..7406046212dc 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -297,12 +297,15 @@ static void osd_req_op_data_release(struct ceph_osd_request *osd_req,
 /*
  * requests
  */
-void ceph_osdc_release_request(struct kref *kref)
+static void ceph_osdc_release_request(struct kref *kref)
 {
-	struct ceph_osd_request *req;
+	struct ceph_osd_request *req = container_of(kref,
+					    struct ceph_osd_request, r_kref);
 	unsigned int which;
 
-	req = container_of(kref, struct ceph_osd_request, r_kref);
+	dout("%s %p (r_request %p r_reply %p)\n", __func__, req,
+	     req->r_request, req->r_reply);
+
 	if (req->r_request)
 		ceph_msg_put(req->r_request);
 	if (req->r_reply) {
@@ -320,7 +323,22 @@ void ceph_osdc_release_request(struct kref *kref)
 		kmem_cache_free(ceph_osd_request_cache, req);
 
 }
-EXPORT_SYMBOL(ceph_osdc_release_request);
+
+void ceph_osdc_get_request(struct ceph_osd_request *req)
+{
+	dout("%s %p (was %d)\n", __func__, req,
+	     atomic_read(&req->r_kref.refcount));
+	kref_get(&req->r_kref);
+}
+EXPORT_SYMBOL(ceph_osdc_get_request);
+
+void ceph_osdc_put_request(struct ceph_osd_request *req)
+{
+	dout("%s %p (was %d)\n", __func__, req,
+	     atomic_read(&req->r_kref.refcount));
+	kref_put(&req->r_kref, ceph_osdc_release_request);
+}
+EXPORT_SYMBOL(ceph_osdc_put_request);
 
 struct ceph_osd_request *ceph_osdc_alloc_request(struct ceph_osd_client *osdc,
 					       struct ceph_snap_context *snapc,
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 05/14] libceph: harden ceph_osdc_request_release() a bit
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (3 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 04/14] libceph: move and add dout()s to ceph_osdc_request_{get,put}() Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 12:36   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 06/14] libceph: assert both regular and lingering lists in __remove_osd() Ilya Dryomov
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Add some WARN_ONs to alert us when we try to destroy requests that are
still registered.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 net/ceph/osd_client.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 7406046212dc..052eb8bfcc74 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -305,6 +305,12 @@ static void ceph_osdc_release_request(struct kref *kref)
 
 	dout("%s %p (r_request %p r_reply %p)\n", __func__, req,
 	     req->r_request, req->r_reply);
+	WARN_ON(!RB_EMPTY_NODE(&req->r_node));
+	WARN_ON(!list_empty(&req->r_req_lru_item));
+	WARN_ON(!list_empty(&req->r_osd_item));
+	WARN_ON(!list_empty(&req->r_linger_item));
+	WARN_ON(!list_empty(&req->r_linger_osd_item));
+	WARN_ON(req->r_osd);
 
 	if (req->r_request)
 		ceph_msg_put(req->r_request);
@@ -1204,6 +1210,7 @@ static void __unregister_request(struct ceph_osd_client *osdc,
 
 	dout("__unregister_request %p tid %lld\n", req, req->r_tid);
 	rb_erase(&req->r_node, &osdc->requests);
+	RB_CLEAR_NODE(&req->r_node);
 	osdc->num_requests--;
 
 	if (req->r_osd) {
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 06/14] libceph: assert both regular and lingering lists in __remove_osd()
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (4 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 05/14] libceph: harden ceph_osdc_request_release() a bit Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 12:37   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 07/14] libceph: unregister only registered linger requests Ilya Dryomov
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

It is important that both regular and lingering requests lists are
empty when the OSD is removed.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 net/ceph/osd_client.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 052eb8bfcc74..a9b7ea7bfdc6 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1032,6 +1032,8 @@ static void __remove_osd(struct ceph_osd_client *osdc, struct ceph_osd *osd)
 {
 	dout("__remove_osd %p\n", osd);
 	BUG_ON(!list_empty(&osd->o_requests));
+	BUG_ON(!list_empty(&osd->o_linger_requests));
+
 	rb_erase(&osd->o_node, &osdc->osds);
 	list_del_init(&osd->o_osd_lru);
 	ceph_con_close(&osd->o_con);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 07/14] libceph: unregister only registered linger requests
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (5 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 06/14] libceph: assert both regular and lingering lists in __remove_osd() Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 13:05   ` Alex Elder
  2014-06-30 13:50   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 08/14] libceph: fix linger request check in __unregister_request() Ilya Dryomov
                   ` (6 subsequent siblings)
  13 siblings, 2 replies; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Linger requests that have not yet been registered should not be
unregistered by __unregister_linger_request().  This messes up ref
count and leads to use-after-free.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 net/ceph/osd_client.c |   15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index a9b7ea7bfdc6..12ec553a7e76 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1248,7 +1248,9 @@ static void __cancel_request(struct ceph_osd_request *req)
 static void __register_linger_request(struct ceph_osd_client *osdc,
 				    struct ceph_osd_request *req)
 {
-	dout("__register_linger_request %p\n", req);
+	dout("%s %p tid %llu\n", __func__, req, req->r_tid);
+	WARN_ON(!req->r_linger);
+
 	ceph_osdc_get_request(req);
 	list_add_tail(&req->r_linger_item, &osdc->req_linger);
 	if (req->r_osd)
@@ -1259,8 +1261,17 @@ static void __register_linger_request(struct ceph_osd_client *osdc,
 static void __unregister_linger_request(struct ceph_osd_client *osdc,
 					struct ceph_osd_request *req)
 {
-	dout("__unregister_linger_request %p\n", req);
+	WARN_ON(!req->r_linger);
+
+	if (list_empty(&req->r_linger_item)) {
+		dout("%s %p tid %llu not registered\n", __func__, req,
+		     req->r_tid);
+		return;
+	}
+
+	dout("%s %p tid %llu\n", __func__, req, req->r_tid);
 	list_del_init(&req->r_linger_item);
+
 	if (req->r_osd) {
 		list_del_init(&req->r_linger_osd_item);
 		maybe_move_osd_to_lru(osdc, req->r_osd);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 08/14] libceph: fix linger request check in __unregister_request()
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (6 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 07/14] libceph: unregister only registered linger requests Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 13:07   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request() Ilya Dryomov
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

We should check if request is on the linger request list of any of the
OSDs, not whether request is registered or not.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 net/ceph/osd_client.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 12ec553a7e76..8104db24bc09 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1221,7 +1221,7 @@ static void __unregister_request(struct ceph_osd_client *osdc,
 
 		list_del_init(&req->r_osd_item);
 		maybe_move_osd_to_lru(osdc, req->r_osd);
-		if (list_empty(&req->r_linger_item))
+		if (list_empty(&req->r_linger_osd_item))
 			req->r_osd = NULL;
 	}
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request()
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (7 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 08/14] libceph: fix linger request check in __unregister_request() Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-06-30 13:39   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 10/14] rbd: rbd_obj_request_wait() should cancel the request if interrupted Ilya Dryomov
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Introduce ceph_osdc_cancel_request() intended for canceling requests
from the higher layers (rbd and cephfs).  Because higher layers are in
charge and are supposed to know what and when they are canceling, the
request is not completed, only unref'ed and removed from the libceph
data structures.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 include/linux/ceph/osd_client.h |    1 +
 net/ceph/osd_client.c           |   31 +++++++++++++++++++++++++------
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index a8d5652f589d..de09cad7b7c7 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -334,6 +334,7 @@ extern void ceph_osdc_put_request(struct ceph_osd_request *req);
 extern int ceph_osdc_start_request(struct ceph_osd_client *osdc,
 				   struct ceph_osd_request *req,
 				   bool nofail);
+extern void ceph_osdc_cancel_request(struct ceph_osd_request *req);
 extern int ceph_osdc_wait_request(struct ceph_osd_client *osdc,
 				  struct ceph_osd_request *req);
 extern void ceph_osdc_sync(struct ceph_osd_client *osdc);
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 8104db24bc09..ef44cc0bbc6a 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -2470,6 +2470,25 @@ int ceph_osdc_start_request(struct ceph_osd_client *osdc,
 EXPORT_SYMBOL(ceph_osdc_start_request);
 
 /*
+ * Unregister a registered request.  The request is not completed (i.e.
+ * no callbacks or wakeups) - higher layers are supposed to know what
+ * they are canceling.
+ */
+void ceph_osdc_cancel_request(struct ceph_osd_request *req)
+{
+	struct ceph_osd_client *osdc = req->r_osdc;
+
+	mutex_lock(&osdc->request_mutex);
+	if (req->r_linger)
+		__unregister_linger_request(osdc, req);
+	__unregister_request(osdc, req);
+	mutex_unlock(&osdc->request_mutex);
+
+	dout("%s %p tid %llu canceled\n", __func__, req, req->r_tid);
+}
+EXPORT_SYMBOL(ceph_osdc_cancel_request);
+
+/*
  * wait for a request to complete
  */
 int ceph_osdc_wait_request(struct ceph_osd_client *osdc,
@@ -2477,18 +2496,18 @@ int ceph_osdc_wait_request(struct ceph_osd_client *osdc,
 {
 	int rc;
 
+	dout("%s %p tid %llu\n", __func__, req, req->r_tid);
+
 	rc = wait_for_completion_interruptible(&req->r_completion);
 	if (rc < 0) {
-		mutex_lock(&osdc->request_mutex);
-		__cancel_request(req);
-		__unregister_request(osdc, req);
-		mutex_unlock(&osdc->request_mutex);
+		dout("%s %p tid %llu interrupted\n", __func__, req, req->r_tid);
+		ceph_osdc_cancel_request(req);
 		complete_request(req);
-		dout("wait_request tid %llu canceled/timed out\n", req->r_tid);
 		return rc;
 	}
 
-	dout("wait_request tid %llu result %d\n", req->r_tid, req->r_result);
+	dout("%s %p tid %llu result %d\n", __func__, req, req->r_tid,
+	     req->r_result);
 	return req->r_result;
 }
 EXPORT_SYMBOL(ceph_osdc_wait_request);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 10/14] rbd: rbd_obj_request_wait() should cancel the request if interrupted
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (8 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request() Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-07-07 16:55   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 11/14] rbd: add rbd_obj_watch_request_helper() helper Ilya Dryomov
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

rbd_obj_request_wait() should cancel the underlying OSD request if
interrupted.  Otherwise libceph will hold onto it indefinitely, causing
assert failures or leaking the original object request.

This also adds an rbd wrapper around ceph_osdc_cancel_request() to
match rbd_obj_request_submit() and rbd_obj_request_wait().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 drivers/block/rbd.c |   39 ++++++++++++++++++++++++++++-----------
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index b2c98c1bc037..20147aec86f3 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1527,11 +1527,37 @@ static bool obj_request_type_valid(enum obj_request_type type)
 static int rbd_obj_request_submit(struct ceph_osd_client *osdc,
 				struct rbd_obj_request *obj_request)
 {
-	dout("%s: osdc %p obj %p\n", __func__, osdc, obj_request);
-
+	dout("%s %p\n", __func__, obj_request);
 	return ceph_osdc_start_request(osdc, obj_request->osd_req, false);
 }
 
+static void rbd_obj_request_end(struct rbd_obj_request *obj_request)
+{
+	dout("%s %p\n", __func__, obj_request);
+	ceph_osdc_cancel_request(obj_request->osd_req);
+}
+
+/*
+ * Wait for an object request to complete.  If interrupted, cancel the
+ * underlying osd request.
+ */
+static int rbd_obj_request_wait(struct rbd_obj_request *obj_request)
+{
+	int ret;
+
+	dout("%s %p\n", __func__, obj_request);
+
+	ret = wait_for_completion_interruptible(&obj_request->completion);
+	if (ret < 0) {
+		dout("%s %p interrupted\n", __func__, obj_request);
+		rbd_obj_request_end(obj_request);
+		return ret;
+	}
+
+	dout("%s %p done\n", __func__, obj_request);
+	return 0;
+}
+
 static void rbd_img_request_complete(struct rbd_img_request *img_request)
 {
 
@@ -1558,15 +1584,6 @@ static void rbd_img_request_complete(struct rbd_img_request *img_request)
 		rbd_img_request_put(img_request);
 }
 
-/* Caller is responsible for rbd_obj_request_destroy(obj_request) */
-
-static int rbd_obj_request_wait(struct rbd_obj_request *obj_request)
-{
-	dout("%s: obj %p\n", __func__, obj_request);
-
-	return wait_for_completion_interruptible(&obj_request->completion);
-}
-
 /*
  * The default/initial value for all image request flags is 0.  Each
  * is conditionally set to 1 at image request initialization time
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 11/14] rbd: add rbd_obj_watch_request_helper() helper
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (9 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 10/14] rbd: rbd_obj_request_wait() should cancel the request if interrupted Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-07-07 22:36   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 12/14] rbd: use " Ilya Dryomov
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

In the past, rbd_dev_header_watch_sync() used to handle both watch and
unwatch requests and was entangled and leaky.  Commit b30a01f2a307
("rbd: fix osd_request memory leak in __rbd_dev_header_watch_sync()")
split it into two separate functions.  This commit cleanly abstracts
the common bits, relying on the fixed rbd_obj_request_wait().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 drivers/block/rbd.c |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 20147aec86f3..02cf7aba7679 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -2971,6 +2971,59 @@ static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, void *data)
 }
 
 /*
+ * Send a (un)watch request and wait for the ack.  Return a request
+ * with a ref held on success or error.
+ */
+static struct rbd_obj_request *rbd_obj_watch_request_helper(
+						struct rbd_device *rbd_dev,
+						bool watch)
+{
+	struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
+	struct rbd_obj_request *obj_request;
+	int ret;
+
+	obj_request = rbd_obj_request_create(rbd_dev->header_name, 0, 0,
+					     OBJ_REQUEST_NODATA);
+	if (!obj_request)
+		return ERR_PTR(-ENOMEM);
+
+	obj_request->osd_req = rbd_osd_req_create(rbd_dev, true, 1,
+						  obj_request);
+	if (!obj_request->osd_req) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_WATCH,
+			      rbd_dev->watch_event->cookie, 0, watch);
+	rbd_osd_req_format_write(obj_request);
+
+	if (watch)
+		ceph_osdc_set_request_linger(osdc, obj_request->osd_req);
+
+	ret = rbd_obj_request_submit(osdc, obj_request);
+	if (ret)
+		goto out;
+
+	ret = rbd_obj_request_wait(obj_request);
+	if (ret)
+		goto out;
+
+	ret = obj_request->result;
+	if (ret) {
+		if (watch)
+			rbd_obj_request_end(obj_request);
+		goto out;
+	}
+
+	return obj_request;
+
+out:
+	rbd_obj_request_put(obj_request);
+	return ERR_PTR(ret);
+}
+
+/*
  * Initiate a watch request, synchronously.
  */
 static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 12/14] rbd: use rbd_obj_watch_request_helper() helper
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (10 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 11/14] rbd: add rbd_obj_watch_request_helper() helper Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-07-07 22:36   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 13/14] libceph: nuke ceph_osdc_unregister_linger_request() Ilya Dryomov
  2014-06-25 17:16 ` [PATCH 14/14] libceph: drop osd ref when canceling con work Ilya Dryomov
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Switch rbd_dev_header_{un,}watch_sync() to use the new helper and fix
rbd_dev_header_unwatch_sync() to destroy watch_request structures
before queuing watch-remove message while at it.  This mistake slipped
into commit b30a01f2a307 ("rbd: fix osd_request memory leak in
__rbd_dev_header_watch_sync()") and could lead to "image still in use"
errors on image removal.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 drivers/block/rbd.c |  115 ++++++++-------------------------------------------
 1 file changed, 17 insertions(+), 98 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 02cf7aba7679..d99aa81774f8 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3040,130 +3040,49 @@ static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev)
 	if (ret < 0)
 		return ret;
 
-	rbd_assert(rbd_dev->watch_event);
-
-	obj_request = rbd_obj_request_create(rbd_dev->header_name, 0, 0,
-					     OBJ_REQUEST_NODATA);
-	if (!obj_request) {
-		ret = -ENOMEM;
-		goto out_cancel;
+	obj_request = rbd_obj_watch_request_helper(rbd_dev, true);
+	if (IS_ERR(obj_request)) {
+		ceph_osdc_cancel_event(rbd_dev->watch_event);
+		rbd_dev->watch_event = NULL;
+		return PTR_ERR(obj_request);
 	}
 
-	obj_request->osd_req = rbd_osd_req_create(rbd_dev, true, 1,
-						  obj_request);
-	if (!obj_request->osd_req) {
-		ret = -ENOMEM;
-		goto out_put;
-	}
-
-	ceph_osdc_set_request_linger(osdc, obj_request->osd_req);
-
-	osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_WATCH,
-			      rbd_dev->watch_event->cookie, 0, 1);
-	rbd_osd_req_format_write(obj_request);
-
-	ret = rbd_obj_request_submit(osdc, obj_request);
-	if (ret)
-		goto out_linger;
-
-	ret = rbd_obj_request_wait(obj_request);
-	if (ret)
-		goto out_linger;
-
-	ret = obj_request->result;
-	if (ret)
-		goto out_linger;
-
 	/*
 	 * A watch request is set to linger, so the underlying osd
 	 * request won't go away until we unregister it.  We retain
 	 * a pointer to the object request during that time (in
-	 * rbd_dev->watch_request), so we'll keep a reference to
-	 * it.  We'll drop that reference (below) after we've
-	 * unregistered it.
+	 * rbd_dev->watch_request), so we'll keep a reference to it.
+	 * We'll drop that reference after we've unregistered it in
+	 * rbd_dev_header_unwatch_sync().
 	 */
 	rbd_dev->watch_request = obj_request;
 
 	return 0;
-
-out_linger:
-	ceph_osdc_unregister_linger_request(osdc, obj_request->osd_req);
-out_put:
-	rbd_obj_request_put(obj_request);
-out_cancel:
-	ceph_osdc_cancel_event(rbd_dev->watch_event);
-	rbd_dev->watch_event = NULL;
-
-	return ret;
 }
 
 /*
  * Tear down a watch request, synchronously.
  */
-static int __rbd_dev_header_unwatch_sync(struct rbd_device *rbd_dev)
+static void rbd_dev_header_unwatch_sync(struct rbd_device *rbd_dev)
 {
-	struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
 	struct rbd_obj_request *obj_request;
-	int ret;
 
 	rbd_assert(rbd_dev->watch_event);
 	rbd_assert(rbd_dev->watch_request);
 
-	obj_request = rbd_obj_request_create(rbd_dev->header_name, 0, 0,
-					     OBJ_REQUEST_NODATA);
-	if (!obj_request) {
-		ret = -ENOMEM;
-		goto out_cancel;
-	}
-
-	obj_request->osd_req = rbd_osd_req_create(rbd_dev, true, 1,
-						  obj_request);
-	if (!obj_request->osd_req) {
-		ret = -ENOMEM;
-		goto out_put;
-	}
-
-	osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_WATCH,
-			      rbd_dev->watch_event->cookie, 0, 0);
-	rbd_osd_req_format_write(obj_request);
-
-	ret = rbd_obj_request_submit(osdc, obj_request);
-	if (ret)
-		goto out_put;
-
-	ret = rbd_obj_request_wait(obj_request);
-	if (ret)
-		goto out_put;
-
-	ret = obj_request->result;
-	if (ret)
-		goto out_put;
-
-	/* We have successfully torn down the watch request */
-
-	ceph_osdc_unregister_linger_request(osdc,
-					    rbd_dev->watch_request->osd_req);
+	rbd_obj_request_end(rbd_dev->watch_request);
 	rbd_obj_request_put(rbd_dev->watch_request);
 	rbd_dev->watch_request = NULL;
 
-out_put:
-	rbd_obj_request_put(obj_request);
-out_cancel:
+	obj_request = rbd_obj_watch_request_helper(rbd_dev, false);
+	if (!IS_ERR(obj_request))
+		rbd_obj_request_put(obj_request);
+	else
+		rbd_warn(rbd_dev, "unable to tear down watch request (%ld)",
+			 PTR_ERR(obj_request));
+
 	ceph_osdc_cancel_event(rbd_dev->watch_event);
 	rbd_dev->watch_event = NULL;
-
-	return ret;
-}
-
-static void rbd_dev_header_unwatch_sync(struct rbd_device *rbd_dev)
-{
-	int ret;
-
-	ret = __rbd_dev_header_unwatch_sync(rbd_dev);
-	if (ret) {
-		rbd_warn(rbd_dev, "unable to tear down watch request: %d\n",
-			 ret);
-	}
 }
 
 /*
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 13/14] libceph: nuke ceph_osdc_unregister_linger_request()
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (11 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 12/14] rbd: use " Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-07-07 22:36   ` Alex Elder
  2014-06-25 17:16 ` [PATCH 14/14] libceph: drop osd ref when canceling con work Ilya Dryomov
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

Remove now unused ceph_osdc_unregister_linger_request().

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 include/linux/ceph/osd_client.h |    2 --
 net/ceph/osd_client.c           |   12 ------------
 2 files changed, 14 deletions(-)

diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index de09cad7b7c7..03aeb27fcc69 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -325,8 +325,6 @@ extern struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *,
 
 extern void ceph_osdc_set_request_linger(struct ceph_osd_client *osdc,
 					 struct ceph_osd_request *req);
-extern void ceph_osdc_unregister_linger_request(struct ceph_osd_client *osdc,
-						struct ceph_osd_request *req);
 
 extern void ceph_osdc_get_request(struct ceph_osd_request *req);
 extern void ceph_osdc_put_request(struct ceph_osd_request *req);
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index ef44cc0bbc6a..30f6faf3584f 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1281,18 +1281,6 @@ static void __unregister_linger_request(struct ceph_osd_client *osdc,
 	ceph_osdc_put_request(req);
 }
 
-void ceph_osdc_unregister_linger_request(struct ceph_osd_client *osdc,
-					 struct ceph_osd_request *req)
-{
-	mutex_lock(&osdc->request_mutex);
-	if (req->r_linger) {
-		req->r_linger = 0;
-		__unregister_linger_request(osdc, req);
-	}
-	mutex_unlock(&osdc->request_mutex);
-}
-EXPORT_SYMBOL(ceph_osdc_unregister_linger_request);
-
 void ceph_osdc_set_request_linger(struct ceph_osd_client *osdc,
 				  struct ceph_osd_request *req)
 {
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH 14/14] libceph: drop osd ref when canceling con work
  2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
                   ` (12 preceding siblings ...)
  2014-06-25 17:16 ` [PATCH 13/14] libceph: nuke ceph_osdc_unregister_linger_request() Ilya Dryomov
@ 2014-06-25 17:16 ` Ilya Dryomov
  2014-07-07 22:38   ` Alex Elder
  13 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-25 17:16 UTC (permalink / raw)
  To: ceph-devel

queue_con() bumps osd ref count.  We should do the reverse when
canceling con work.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
---
 net/ceph/messenger.c |   14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 8bffa5b90fef..e51cad0db580 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -174,6 +174,7 @@ static struct lock_class_key socket_class;
 #define SKIP_BUF_SIZE	1024
 
 static void queue_con(struct ceph_connection *con);
+static void cancel_con(struct ceph_connection *con);
 static void con_work(struct work_struct *);
 static void con_fault(struct ceph_connection *con);
 
@@ -680,7 +681,7 @@ void ceph_con_close(struct ceph_connection *con)
 
 	reset_connection(con);
 	con->peer_global_seq = 0;
-	cancel_delayed_work(&con->work);
+	cancel_con(con);
 	con_close_socket(con);
 	mutex_unlock(&con->mutex);
 }
@@ -2667,19 +2668,16 @@ static int queue_con_delay(struct ceph_connection *con, unsigned long delay)
 {
 	if (!con->ops->get(con)) {
 		dout("%s %p ref count 0\n", __func__, con);
-
 		return -ENOENT;
 	}
 
 	if (!queue_delayed_work(ceph_msgr_wq, &con->work, delay)) {
 		dout("%s %p - already queued\n", __func__, con);
 		con->ops->put(con);
-
 		return -EBUSY;
 	}
 
 	dout("%s %p %lu\n", __func__, con, delay);
-
 	return 0;
 }
 
@@ -2688,6 +2686,14 @@ static void queue_con(struct ceph_connection *con)
 	(void) queue_con_delay(con, 0);
 }
 
+static void cancel_con(struct ceph_connection *con)
+{
+	if (cancel_delayed_work(&con->work)) {
+		dout("%s %p\n", __func__, con);
+		con->ops->put(con);
+	}
+}
+
 static bool con_sock_closed(struct ceph_connection *con)
 {
 	if (!con_flag_test_and_clear(con, CON_FLAG_SOCK_CLOSED))
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH 01/14] libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item
  2014-06-25 17:16 ` [PATCH 01/14] libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item Ilya Dryomov
@ 2014-06-30 12:16   ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-06-30 12:16 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> So that:
> 
> req->r_osd_item --> osd->o_requests list
> req->r_linger_osd_item --> osd->o_linger_requests list
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>

This looks good and I prefer it too.

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>  include/linux/ceph/osd_client.h |    2 +-
>  net/ceph/osd_client.c           |    8 ++++----
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
> index 94ec69672164..7490a03ac163 100644
> --- a/include/linux/ceph/osd_client.h
> +++ b/include/linux/ceph/osd_client.h
> @@ -117,7 +117,7 @@ struct ceph_osd_request {
>  	struct list_head r_req_lru_item;
>  	struct list_head r_osd_item;
>  	struct list_head r_linger_item;
> -	struct list_head r_linger_osd;
> +	struct list_head r_linger_osd_item;
>  	struct ceph_osd *r_osd;
>  	struct ceph_pg   r_pgid;
>  	int              r_pg_osds[CEPH_PG_MAX_SIZE];
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 05be0c181695..d5d2be3bd113 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -364,7 +364,7 @@ struct ceph_osd_request *ceph_osdc_alloc_request(struct ceph_osd_client *osdc,
>  	RB_CLEAR_NODE(&req->r_node);
>  	INIT_LIST_HEAD(&req->r_unsafe_item);
>  	INIT_LIST_HEAD(&req->r_linger_item);
> -	INIT_LIST_HEAD(&req->r_linger_osd);
> +	INIT_LIST_HEAD(&req->r_linger_osd_item);
>  	INIT_LIST_HEAD(&req->r_req_lru_item);
>  	INIT_LIST_HEAD(&req->r_osd_item);
>  
> @@ -916,7 +916,7 @@ static void __kick_osd_requests(struct ceph_osd_client *osdc,
>  	 * list at the end to keep things in tid order.
>  	 */
>  	list_for_each_entry_safe(req, nreq, &osd->o_linger_requests,
> -				 r_linger_osd) {
> +				 r_linger_osd_item) {
>  		/*
>  		 * reregister request prior to unregistering linger so
>  		 * that r_osd is preserved.
> @@ -1218,7 +1218,7 @@ static void __register_linger_request(struct ceph_osd_client *osdc,
>  	ceph_osdc_get_request(req);
>  	list_add_tail(&req->r_linger_item, &osdc->req_linger);
>  	if (req->r_osd)
> -		list_add_tail(&req->r_linger_osd,
> +		list_add_tail(&req->r_linger_osd_item,
>  			      &req->r_osd->o_linger_requests);
>  }
>  
> @@ -1228,7 +1228,7 @@ static void __unregister_linger_request(struct ceph_osd_client *osdc,
>  	dout("__unregister_linger_request %p\n", req);
>  	list_del_init(&req->r_linger_item);
>  	if (req->r_osd) {
> -		list_del_init(&req->r_linger_osd);
> +		list_del_init(&req->r_linger_osd_item);
>  
>  		if (list_empty(&req->r_osd->o_requests) &&
>  		    list_empty(&req->r_osd->o_linger_requests)) {
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 02/14] libceph: add maybe_move_osd_to_lru() and switch to it
  2014-06-25 17:16 ` [PATCH 02/14] libceph: add maybe_move_osd_to_lru() and switch to it Ilya Dryomov
@ 2014-06-30 12:17   ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-06-30 12:17 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Abstract out __move_osd_to_lru() logic from __unregister_request() and
> __unregister_linger_request().
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>

Looks good.

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>  net/ceph/osd_client.c |   26 ++++++++++++++------------
>  1 file changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index d5d2be3bd113..6202923b41ff 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1029,12 +1029,23 @@ static void remove_all_osds(struct ceph_osd_client *osdc)
>  static void __move_osd_to_lru(struct ceph_osd_client *osdc,
>  			      struct ceph_osd *osd)
>  {
> -	dout("__move_osd_to_lru %p\n", osd);
> +	dout("%s %p\n", __func__, osd);
>  	BUG_ON(!list_empty(&osd->o_osd_lru));
> +
>  	list_add_tail(&osd->o_osd_lru, &osdc->osd_lru);
>  	osd->lru_ttl = jiffies + osdc->client->options->osd_idle_ttl * HZ;
>  }
>  
> +static void maybe_move_osd_to_lru(struct ceph_osd_client *osdc,
> +				  struct ceph_osd *osd)
> +{
> +	dout("%s %p\n", __func__, osd);
> +
> +	if (list_empty(&osd->o_requests) &&
> +	    list_empty(&osd->o_linger_requests))
> +		__move_osd_to_lru(osdc, osd);
> +}
> +
>  static void __remove_osd_from_lru(struct ceph_osd *osd)
>  {
>  	dout("__remove_osd_from_lru %p\n", osd);
> @@ -1182,11 +1193,7 @@ static void __unregister_request(struct ceph_osd_client *osdc,
>  		ceph_msg_revoke(req->r_request);
>  
>  		list_del_init(&req->r_osd_item);
> -		if (list_empty(&req->r_osd->o_requests) &&
> -		    list_empty(&req->r_osd->o_linger_requests)) {
> -			dout("moving osd to %p lru\n", req->r_osd);
> -			__move_osd_to_lru(osdc, req->r_osd);
> -		}
> +		maybe_move_osd_to_lru(osdc, req->r_osd);
>  		if (list_empty(&req->r_linger_item))
>  			req->r_osd = NULL;
>  	}
> @@ -1229,12 +1236,7 @@ static void __unregister_linger_request(struct ceph_osd_client *osdc,
>  	list_del_init(&req->r_linger_item);
>  	if (req->r_osd) {
>  		list_del_init(&req->r_linger_osd_item);
> -
> -		if (list_empty(&req->r_osd->o_requests) &&
> -		    list_empty(&req->r_osd->o_linger_requests)) {
> -			dout("moving osd to %p lru\n", req->r_osd);
> -			__move_osd_to_lru(osdc, req->r_osd);
> -		}
> +		maybe_move_osd_to_lru(osdc, req->r_osd);
>  		if (list_empty(&req->r_osd_item))
>  			req->r_osd = NULL;
>  	}
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 03/14] libceph: move and add dout()s to ceph_msg_{get,put}()
  2014-06-25 17:16 ` [PATCH 03/14] libceph: move and add dout()s to ceph_msg_{get,put}() Ilya Dryomov
@ 2014-06-30 12:29   ` Alex Elder
  2014-07-08 11:12     ` Ilya Dryomov
  0 siblings, 1 reply; 40+ messages in thread
From: Alex Elder @ 2014-06-30 12:29 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Add dout()s to ceph_msg_{get,put}().  Also move them to .c and turn
> kref release callback into a static function.
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>

This is all very good.

I have one suggestion though, below, but regardless:

Reviewed-by: Alex Elder <elder@linaro.org>


> ---
>  include/linux/ceph/messenger.h |   14 ++------------
>  net/ceph/messenger.c           |   31 ++++++++++++++++++++++---------
>  2 files changed, 24 insertions(+), 21 deletions(-)
> 
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index d21f2dba0731..40ae58e3e9db 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -285,19 +285,9 @@ extern void ceph_msg_data_add_bio(struct ceph_msg *msg, struct bio *bio,
>  
>  extern struct ceph_msg *ceph_msg_new(int type, int front_len, gfp_t flags,
>  				     bool can_fail);
> -extern void ceph_msg_kfree(struct ceph_msg *m);
>  
> -
> -static inline struct ceph_msg *ceph_msg_get(struct ceph_msg *msg)
> -{
> -	kref_get(&msg->kref);
> -	return msg;
> -}
> -extern void ceph_msg_last_put(struct kref *kref);
> -static inline void ceph_msg_put(struct ceph_msg *msg)
> -{
> -	kref_put(&msg->kref, ceph_msg_last_put);
> -}
> +extern struct ceph_msg *ceph_msg_get(struct ceph_msg *msg);
> +extern void ceph_msg_put(struct ceph_msg *msg);
>  
>  extern void ceph_msg_dump(struct ceph_msg *msg);
>  
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 1948d592aa54..8bffa5b90fef 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -3269,24 +3269,21 @@ static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip)
>  /*
>   * Free a generically kmalloc'd message.
>   */
> -void ceph_msg_kfree(struct ceph_msg *m)
> +static void ceph_msg_free(struct ceph_msg *m)
>  {
> -	dout("msg_kfree %p\n", m);
> +	dout("%s %p\n", __func__, m);
>  	ceph_kvfree(m->front.iov_base);
>  	kmem_cache_free(ceph_msg_cache, m);
>  }
>  
> -/*
> - * Drop a msg ref.  Destroy as needed.
> - */
> -void ceph_msg_last_put(struct kref *kref)
> +static void ceph_msg_release(struct kref *kref)
>  {
>  	struct ceph_msg *m = container_of(kref, struct ceph_msg, kref);
>  	LIST_HEAD(data);
>  	struct list_head *links;
>  	struct list_head *next;
>  
> -	dout("ceph_msg_put last one on %p\n", m);
> +	dout("%s %p\n", __func__, m);
>  	WARN_ON(!list_empty(&m->list_head));
>  
>  	/* drop middle, data, if any */
> @@ -3308,9 +3305,25 @@ void ceph_msg_last_put(struct kref *kref)
>  	if (m->pool)
>  		ceph_msgpool_put(m->pool, m);
>  	else
> -		ceph_msg_kfree(m);
> +		ceph_msg_free(m);
> +}
> +
> +struct ceph_msg *ceph_msg_get(struct ceph_msg *msg)
> +{
> +	dout("%s %p (was %d)\n", __func__, msg,
> +	     atomic_read(&msg->kref.refcount));
> +	kref_get(&msg->kref);

I suggest you do the dout() *after* you call kref_get().
I'm sure it doesn't matter in practice here, but my
reasoning is that you get the reference immediately, and
you'll have the reference when reading the refcount value.
It also  makes the dout() calls here and in ceph_msg_put()
end up getting symmetrically wrapped by the get kref
get and put.  (You have a race reading the updated
refcount value either way, but it's debug code.)

> +	return msg;
> +}
> +EXPORT_SYMBOL(ceph_msg_get);
> +
> +void ceph_msg_put(struct ceph_msg *msg)
> +{
> +	dout("%s %p (was %d)\n", __func__, msg,
> +	     atomic_read(&msg->kref.refcount));
> +	kref_put(&msg->kref, ceph_msg_release);
>  }
> -EXPORT_SYMBOL(ceph_msg_last_put);
> +EXPORT_SYMBOL(ceph_msg_put);
>  
>  void ceph_msg_dump(struct ceph_msg *msg)
>  {
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/14] libceph: move and add dout()s to ceph_osdc_request_{get,put}()
  2014-06-25 17:16 ` [PATCH 04/14] libceph: move and add dout()s to ceph_osdc_request_{get,put}() Ilya Dryomov
@ 2014-06-30 12:32   ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-06-30 12:32 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Add dout()s to ceph_osdc_request_{get,put}().  Also move them to .c and
> turn kref release callback into a static function.

You can pretty much take the identical comments from
what I said on [PATCH 03/14].

Reviewed-by: Alex Elder <elder@linaro.org>

> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  include/linux/ceph/osd_client.h |   11 ++---------
>  net/ceph/osd_client.c           |   26 ++++++++++++++++++++++----
>  2 files changed, 24 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
> index 7490a03ac163..a8d5652f589d 100644
> --- a/include/linux/ceph/osd_client.h
> +++ b/include/linux/ceph/osd_client.h
> @@ -328,15 +328,8 @@ extern void ceph_osdc_set_request_linger(struct ceph_osd_client *osdc,
>  extern void ceph_osdc_unregister_linger_request(struct ceph_osd_client *osdc,
>  						struct ceph_osd_request *req);
>  
> -static inline void ceph_osdc_get_request(struct ceph_osd_request *req)
> -{
> -	kref_get(&req->r_kref);
> -}
> -extern void ceph_osdc_release_request(struct kref *kref);
> -static inline void ceph_osdc_put_request(struct ceph_osd_request *req)
> -{
> -	kref_put(&req->r_kref, ceph_osdc_release_request);
> -}
> +extern void ceph_osdc_get_request(struct ceph_osd_request *req);
> +extern void ceph_osdc_put_request(struct ceph_osd_request *req);
>  
>  extern int ceph_osdc_start_request(struct ceph_osd_client *osdc,
>  				   struct ceph_osd_request *req,
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 6202923b41ff..7406046212dc 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -297,12 +297,15 @@ static void osd_req_op_data_release(struct ceph_osd_request *osd_req,
>  /*
>   * requests
>   */
> -void ceph_osdc_release_request(struct kref *kref)
> +static void ceph_osdc_release_request(struct kref *kref)
>  {
> -	struct ceph_osd_request *req;
> +	struct ceph_osd_request *req = container_of(kref,
> +					    struct ceph_osd_request, r_kref);
>  	unsigned int which;
>  
> -	req = container_of(kref, struct ceph_osd_request, r_kref);
> +	dout("%s %p (r_request %p r_reply %p)\n", __func__, req,
> +	     req->r_request, req->r_reply);
> +
>  	if (req->r_request)
>  		ceph_msg_put(req->r_request);
>  	if (req->r_reply) {
> @@ -320,7 +323,22 @@ void ceph_osdc_release_request(struct kref *kref)
>  		kmem_cache_free(ceph_osd_request_cache, req);
>  
>  }
> -EXPORT_SYMBOL(ceph_osdc_release_request);
> +
> +void ceph_osdc_get_request(struct ceph_osd_request *req)
> +{
> +	dout("%s %p (was %d)\n", __func__, req,
> +	     atomic_read(&req->r_kref.refcount));
> +	kref_get(&req->r_kref);
> +}
> +EXPORT_SYMBOL(ceph_osdc_get_request);
> +
> +void ceph_osdc_put_request(struct ceph_osd_request *req)
> +{
> +	dout("%s %p (was %d)\n", __func__, req,
> +	     atomic_read(&req->r_kref.refcount));
> +	kref_put(&req->r_kref, ceph_osdc_release_request);
> +}
> +EXPORT_SYMBOL(ceph_osdc_put_request);
>  
>  struct ceph_osd_request *ceph_osdc_alloc_request(struct ceph_osd_client *osdc,
>  					       struct ceph_snap_context *snapc,
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 05/14] libceph: harden ceph_osdc_request_release() a bit
  2014-06-25 17:16 ` [PATCH 05/14] libceph: harden ceph_osdc_request_release() a bit Ilya Dryomov
@ 2014-06-30 12:36   ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-06-30 12:36 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Add some WARN_ONs to alert us when we try to destroy requests that are
> still registered.
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>

Good idea.  Especially the RB_CLEAR_NODE() call.

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>  net/ceph/osd_client.c |    7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 7406046212dc..052eb8bfcc74 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -305,6 +305,12 @@ static void ceph_osdc_release_request(struct kref *kref)
>  
>  	dout("%s %p (r_request %p r_reply %p)\n", __func__, req,
>  	     req->r_request, req->r_reply);
> +	WARN_ON(!RB_EMPTY_NODE(&req->r_node));
> +	WARN_ON(!list_empty(&req->r_req_lru_item));
> +	WARN_ON(!list_empty(&req->r_osd_item));
> +	WARN_ON(!list_empty(&req->r_linger_item));
> +	WARN_ON(!list_empty(&req->r_linger_osd_item));
> +	WARN_ON(req->r_osd);
>  
>  	if (req->r_request)
>  		ceph_msg_put(req->r_request);
> @@ -1204,6 +1210,7 @@ static void __unregister_request(struct ceph_osd_client *osdc,
>  
>  	dout("__unregister_request %p tid %lld\n", req, req->r_tid);
>  	rb_erase(&req->r_node, &osdc->requests);
> +	RB_CLEAR_NODE(&req->r_node);
>  	osdc->num_requests--;
>  
>  	if (req->r_osd) {
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 06/14] libceph: assert both regular and lingering lists in __remove_osd()
  2014-06-25 17:16 ` [PATCH 06/14] libceph: assert both regular and lingering lists in __remove_osd() Ilya Dryomov
@ 2014-06-30 12:37   ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-06-30 12:37 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> It is important that both regular and lingering requests lists are
> empty when the OSD is removed.

Looks good.

Reviewed-by: Alex Elder <elder@linaro.org>

> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  net/ceph/osd_client.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 052eb8bfcc74..a9b7ea7bfdc6 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1032,6 +1032,8 @@ static void __remove_osd(struct ceph_osd_client *osdc, struct ceph_osd *osd)
>  {
>  	dout("__remove_osd %p\n", osd);
>  	BUG_ON(!list_empty(&osd->o_requests));
> +	BUG_ON(!list_empty(&osd->o_linger_requests));
> +
>  	rb_erase(&osd->o_node, &osdc->osds);
>  	list_del_init(&osd->o_osd_lru);
>  	ceph_con_close(&osd->o_con);
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 07/14] libceph: unregister only registered linger requests
  2014-06-25 17:16 ` [PATCH 07/14] libceph: unregister only registered linger requests Ilya Dryomov
@ 2014-06-30 13:05   ` Alex Elder
  2014-06-30 13:50   ` Alex Elder
  1 sibling, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-06-30 13:05 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Linger requests that have not yet been registered should not be
> unregistered by __unregister_linger_request().  This messes up ref
> count and leads to use-after-free.

This makes sense.  The problem can occur when updating the OSD
map.  An OSD *client* has its list of linger requests, but they
are not all necessarily registered as associated with the *OSD*.
So the __unregister_linger_request() call in kick_requests()
might pass a not-yet-registered linger request.

It could also occur if a client (like RBD) gets an error after
setting a request to linger but the request has completed
successfully.

Anyway, looks good.  This explains why the rename of the
r_linger_osd_item field was helpful.

Reviewed-by: Alex Elder <elder@linaro.org>

> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  net/ceph/osd_client.c |   15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index a9b7ea7bfdc6..12ec553a7e76 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1248,7 +1248,9 @@ static void __cancel_request(struct ceph_osd_request *req)
>  static void __register_linger_request(struct ceph_osd_client *osdc,
>  				    struct ceph_osd_request *req)
>  {
> -	dout("__register_linger_request %p\n", req);
> +	dout("%s %p tid %llu\n", __func__, req, req->r_tid);
> +	WARN_ON(!req->r_linger);
> +
>  	ceph_osdc_get_request(req);
>  	list_add_tail(&req->r_linger_item, &osdc->req_linger);
>  	if (req->r_osd)
> @@ -1259,8 +1261,17 @@ static void __register_linger_request(struct ceph_osd_client *osdc,
>  static void __unregister_linger_request(struct ceph_osd_client *osdc,
>  					struct ceph_osd_request *req)
>  {
> -	dout("__unregister_linger_request %p\n", req);
> +	WARN_ON(!req->r_linger);
> +
> +	if (list_empty(&req->r_linger_item)) {
> +		dout("%s %p tid %llu not registered\n", __func__, req,
> +		     req->r_tid);
> +		return;
> +	}
> +
> +	dout("%s %p tid %llu\n", __func__, req, req->r_tid);
>  	list_del_init(&req->r_linger_item);
> +
>  	if (req->r_osd) {
>  		list_del_init(&req->r_linger_osd_item);
>  		maybe_move_osd_to_lru(osdc, req->r_osd);
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 08/14] libceph: fix linger request check in __unregister_request()
  2014-06-25 17:16 ` [PATCH 08/14] libceph: fix linger request check in __unregister_request() Ilya Dryomov
@ 2014-06-30 13:07   ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-06-30 13:07 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> We should check if request is on the linger request list of any of the
> OSDs, not whether request is registered or not.
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>

That was a difficult to spot bug.  Very good.

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>  net/ceph/osd_client.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 12ec553a7e76..8104db24bc09 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1221,7 +1221,7 @@ static void __unregister_request(struct ceph_osd_client *osdc,
>  
>  		list_del_init(&req->r_osd_item);
>  		maybe_move_osd_to_lru(osdc, req->r_osd);
> -		if (list_empty(&req->r_linger_item))
> +		if (list_empty(&req->r_linger_osd_item))
>  			req->r_osd = NULL;
>  	}
>  
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request()
  2014-06-25 17:16 ` [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request() Ilya Dryomov
@ 2014-06-30 13:39   ` Alex Elder
  2014-06-30 14:34     ` Ilya Dryomov
  0 siblings, 1 reply; 40+ messages in thread
From: Alex Elder @ 2014-06-30 13:39 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Introduce ceph_osdc_cancel_request() intended for canceling requests
> from the higher layers (rbd and cephfs).  Because higher layers are in
> charge and are supposed to know what and when they are canceling, the
> request is not completed, only unref'ed and removed from the libceph
> data structures.

This seems reasonable.

But you make two changes here that I'd like to see a little
more explanation on.  They seem significant enough to warrant
a little more attention in the commit description.

- __cancel_request() is no longer called when
  ceph_osdc_wait_request() fails due to an
  an interrupt.  This is my main concern, and I
  plan to work through it but I'm in a small hurry
  right now.
- __unregister_linger_request() is now called when
  a request is canceled, but it wasn't before.  (Since
  we're consistent about setting the r_linger flag
  this should be fine, but it *is* a behavior change.)

I'm going to keep looking at this; I have about
20 minutes before I have to leave for a while.

					-Alex

> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  include/linux/ceph/osd_client.h |    1 +
>  net/ceph/osd_client.c           |   31 +++++++++++++++++++++++++------
>  2 files changed, 26 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
> index a8d5652f589d..de09cad7b7c7 100644
> --- a/include/linux/ceph/osd_client.h
> +++ b/include/linux/ceph/osd_client.h
> @@ -334,6 +334,7 @@ extern void ceph_osdc_put_request(struct ceph_osd_request *req);
>  extern int ceph_osdc_start_request(struct ceph_osd_client *osdc,
>  				   struct ceph_osd_request *req,
>  				   bool nofail);
> +extern void ceph_osdc_cancel_request(struct ceph_osd_request *req);
>  extern int ceph_osdc_wait_request(struct ceph_osd_client *osdc,
>  				  struct ceph_osd_request *req);
>  extern void ceph_osdc_sync(struct ceph_osd_client *osdc);
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 8104db24bc09..ef44cc0bbc6a 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -2470,6 +2470,25 @@ int ceph_osdc_start_request(struct ceph_osd_client *osdc,
>  EXPORT_SYMBOL(ceph_osdc_start_request);
>  
>  /*
> + * Unregister a registered request.  The request is not completed (i.e.
> + * no callbacks or wakeups) - higher layers are supposed to know what
> + * they are canceling.
> + */
> +void ceph_osdc_cancel_request(struct ceph_osd_request *req)
> +{
> +	struct ceph_osd_client *osdc = req->r_osdc;
> +
> +	mutex_lock(&osdc->request_mutex);
> +	if (req->r_linger)
> +		__unregister_linger_request(osdc, req);
> +	__unregister_request(osdc, req);
> +	mutex_unlock(&osdc->request_mutex);
> +
> +	dout("%s %p tid %llu canceled\n", __func__, req, req->r_tid);
> +}
> +EXPORT_SYMBOL(ceph_osdc_cancel_request);
> +
> +/*
>   * wait for a request to complete
>   */
>  int ceph_osdc_wait_request(struct ceph_osd_client *osdc,
> @@ -2477,18 +2496,18 @@ int ceph_osdc_wait_request(struct ceph_osd_client *osdc,
>  {
>  	int rc;
>  
> +	dout("%s %p tid %llu\n", __func__, req, req->r_tid);
> +
>  	rc = wait_for_completion_interruptible(&req->r_completion);
>  	if (rc < 0) {
> -		mutex_lock(&osdc->request_mutex);
> -		__cancel_request(req);
> -		__unregister_request(osdc, req);
> -		mutex_unlock(&osdc->request_mutex);
> +		dout("%s %p tid %llu interrupted\n", __func__, req, req->r_tid);
> +		ceph_osdc_cancel_request(req);
>  		complete_request(req);
> -		dout("wait_request tid %llu canceled/timed out\n", req->r_tid);
>  		return rc;
>  	}
>  
> -	dout("wait_request tid %llu result %d\n", req->r_tid, req->r_result);
> +	dout("%s %p tid %llu result %d\n", __func__, req, req->r_tid,
> +	     req->r_result);
>  	return req->r_result;
>  }
>  EXPORT_SYMBOL(ceph_osdc_wait_request);
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 07/14] libceph: unregister only registered linger requests
  2014-06-25 17:16 ` [PATCH 07/14] libceph: unregister only registered linger requests Ilya Dryomov
  2014-06-30 13:05   ` Alex Elder
@ 2014-06-30 13:50   ` Alex Elder
  2014-06-30 14:21     ` Ilya Dryomov
  1 sibling, 1 reply; 40+ messages in thread
From: Alex Elder @ 2014-06-30 13:50 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Linger requests that have not yet been registered should not be
> unregistered by __unregister_linger_request().  This messes up ref
> count and leads to use-after-free.
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  net/ceph/osd_client.c |   15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index a9b7ea7bfdc6..12ec553a7e76 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1248,7 +1248,9 @@ static void __cancel_request(struct ceph_osd_request *req)
>  static void __register_linger_request(struct ceph_osd_client *osdc,
>  				    struct ceph_osd_request *req)
>  {
> -	dout("__register_linger_request %p\n", req);
> +	dout("%s %p tid %llu\n", __func__, req, req->r_tid);
> +	WARN_ON(!req->r_linger);
> +
>  	ceph_osdc_get_request(req);
>  	list_add_tail(&req->r_linger_item, &osdc->req_linger);
>  	if (req->r_osd)
> @@ -1259,8 +1261,17 @@ static void __register_linger_request(struct ceph_osd_client *osdc,
>  static void __unregister_linger_request(struct ceph_osd_client *osdc,
>  					struct ceph_osd_request *req)
>  {
> -	dout("__unregister_linger_request %p\n", req);
> +	WARN_ON(!req->r_linger);


I just noticed something.  ceph_osdc_unregister_linger_request()
clears req->r_linger before calling __unregister_linger_request(),
which means this warning must be tripping a lot...

Just delete that assignment in ceph_osdc_unregister_linger_request()
as part of this commit.

					-Alex


> +
> +	if (list_empty(&req->r_linger_item)) {
> +		dout("%s %p tid %llu not registered\n", __func__, req,
> +		     req->r_tid);
> +		return;
> +	}
> +
> +	dout("%s %p tid %llu\n", __func__, req, req->r_tid);
>  	list_del_init(&req->r_linger_item);
> +
>  	if (req->r_osd) {
>  		list_del_init(&req->r_linger_osd_item);
>  		maybe_move_osd_to_lru(osdc, req->r_osd);
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 07/14] libceph: unregister only registered linger requests
  2014-06-30 13:50   ` Alex Elder
@ 2014-06-30 14:21     ` Ilya Dryomov
  0 siblings, 0 replies; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-30 14:21 UTC (permalink / raw)
  To: Alex Elder; +Cc: Ceph Development

On Mon, Jun 30, 2014 at 5:50 PM, Alex Elder <elder@ieee.org> wrote:
> On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
>> Linger requests that have not yet been registered should not be
>> unregistered by __unregister_linger_request().  This messes up ref
>> count and leads to use-after-free.
>>
>> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
>> ---
>>  net/ceph/osd_client.c |   15 +++++++++++++--
>>  1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
>> index a9b7ea7bfdc6..12ec553a7e76 100644
>> --- a/net/ceph/osd_client.c
>> +++ b/net/ceph/osd_client.c
>> @@ -1248,7 +1248,9 @@ static void __cancel_request(struct ceph_osd_request *req)
>>  static void __register_linger_request(struct ceph_osd_client *osdc,
>>                                   struct ceph_osd_request *req)
>>  {
>> -     dout("__register_linger_request %p\n", req);
>> +     dout("%s %p tid %llu\n", __func__, req, req->r_tid);
>> +     WARN_ON(!req->r_linger);
>> +
>>       ceph_osdc_get_request(req);
>>       list_add_tail(&req->r_linger_item, &osdc->req_linger);
>>       if (req->r_osd)
>> @@ -1259,8 +1261,17 @@ static void __register_linger_request(struct ceph_osd_client *osdc,
>>  static void __unregister_linger_request(struct ceph_osd_client *osdc,
>>                                       struct ceph_osd_request *req)
>>  {
>> -     dout("__unregister_linger_request %p\n", req);
>> +     WARN_ON(!req->r_linger);
>
>
> I just noticed something.  ceph_osdc_unregister_linger_request()
> clears req->r_linger before calling __unregister_linger_request(),
> which means this warning must be tripping a lot...
>
> Just delete that assignment in ceph_osdc_unregister_linger_request()
> as part of this commit.

ceph_osdc_unregister_linger_request() is removed entirely later in the
series.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request()
  2014-06-30 13:39   ` Alex Elder
@ 2014-06-30 14:34     ` Ilya Dryomov
  2014-07-07 13:47       ` Alex Elder
  0 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-06-30 14:34 UTC (permalink / raw)
  To: Alex Elder; +Cc: Ceph Development

On Mon, Jun 30, 2014 at 5:39 PM, Alex Elder <elder@ieee.org> wrote:
> On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
>> Introduce ceph_osdc_cancel_request() intended for canceling requests
>> from the higher layers (rbd and cephfs).  Because higher layers are in
>> charge and are supposed to know what and when they are canceling, the
>> request is not completed, only unref'ed and removed from the libceph
>> data structures.
>
> This seems reasonable.
>
> But you make two changes here that I'd like to see a little
> more explanation on.  They seem significant enough to warrant
> a little more attention in the commit description.
>
> - __cancel_request() is no longer called when
>   ceph_osdc_wait_request() fails due to an
>   an interrupt.  This is my main concern, and I
>   plan to work through it but I'm in a small hurry
>   right now.

Perhaps it should have been a separate commit.  __unregister_request()
revokes r_request, so I opted for not trying to do it twice.  As for
the r_sent condition and assignment, it doesn't seem to make much of
a difference, given that the request is about to be unregistered
anyway.

> - __unregister_linger_request() is now called when
>   a request is canceled, but it wasn't before.  (Since
>   we're consistent about setting the r_linger flag
>   this should be fine, but it *is* a behavior change.)

The general goal here is to make ceph_osdc_cancel_request() cancel
*any* request correctly, so if r_linger is set, which means that the
request in question *could* be lingering, __unregister_linger_request()
is called.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request()
  2014-06-30 14:34     ` Ilya Dryomov
@ 2014-07-07 13:47       ` Alex Elder
  2014-07-08 11:15         ` Ilya Dryomov
  0 siblings, 1 reply; 40+ messages in thread
From: Alex Elder @ 2014-07-07 13:47 UTC (permalink / raw)
  To: Ilya Dryomov; +Cc: Ceph Development

On 06/30/2014 09:34 AM, Ilya Dryomov wrote:
> On Mon, Jun 30, 2014 at 5:39 PM, Alex Elder <elder@ieee.org> wrote:
>> On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
>>> Introduce ceph_osdc_cancel_request() intended for canceling requests
>>> from the higher layers (rbd and cephfs).  Because higher layers are in
>>> charge and are supposed to know what and when they are canceling, the
>>> request is not completed, only unref'ed and removed from the libceph
>>> data structures.
>>
>> This seems reasonable.
>>
>> But you make two changes here that I'd like to see a little
>> more explanation on.  They seem significant enough to warrant
>> a little more attention in the commit description.
>>
>> - __cancel_request() is no longer called when
>>   ceph_osdc_wait_request() fails due to an
>>   an interrupt.  This is my main concern, and I
>>   plan to work through it but I'm in a small hurry
>>   right now.
> 
> Perhaps it should have been a separate commit.  __unregister_request()
> revokes r_request, so I opted for not trying to do it twice.  As for

OK, that makes sense--revoking the request is basically all
__cancel_request() does anyway.  You ought to have mentioned
that in the description anyway, even if it wasn't a separate
commit.

> the r_sent condition and assignment, it doesn't seem to make much of
> a difference, given that the request is about to be unregistered
> anyway.

If the request is getting canceled (from a caller outside libceph)
it can't take into account whether it was sent or not.  As you said,
it is getting revoked unconditionally by __unregister_request().
To be honest though, *that* revoke call should have been zeroing
the r_sent value also.  It apparently won't matter, but I think it's
wrong.  The revoke drops a reference, it doesn't necessarily free
the structure (though I expect it may always do so anyway).

>> - __unregister_linger_request() is now called when
>>   a request is canceled, but it wasn't before.  (Since
>>   we're consistent about setting the r_linger flag
>>   this should be fine, but it *is* a behavior change.)
> 
> The general goal here is to make ceph_osdc_cancel_request() cancel
> *any* request correctly, so if r_linger is set, which means that the
> request in question *could* be lingering, __unregister_linger_request()
> is called.

The goal is good.  Note that __unregister_linger_request() drops
a reference to the request though.  There are three other callers
of this function.  Two of them drop a reference that had just been
added by a call to __register_request().  The other one is in
ceph_osdc_unregister_linger_request(), initiated by a higher layer.
In that last case, r_linger will be zeroed, so calling it again
should be harmless.

I guess I'm going to move on...  I expect all of this discussion
will be moot and you will have made everything work right and
better by the time I get to the end of the series.

					-Alex

> 
> Thanks,
> 
>                 Ilya
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/14] rbd: rbd_obj_request_wait() should cancel the request if interrupted
  2014-06-25 17:16 ` [PATCH 10/14] rbd: rbd_obj_request_wait() should cancel the request if interrupted Ilya Dryomov
@ 2014-07-07 16:55   ` Alex Elder
  2014-07-08 11:18     ` Ilya Dryomov
  0 siblings, 1 reply; 40+ messages in thread
From: Alex Elder @ 2014-07-07 16:55 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> rbd_obj_request_wait() should cancel the underlying OSD request if
> interrupted.  Otherwise libceph will hold onto it indefinitely, causing
> assert failures or leaking the original object request.

At first I didn't understand this.  Let me see if I've got
it now, though.

Each OSD request has a completion associated with it.  An
OSD request is started via ceph_osdc_start_request(), which
registers the request and takes a reference to it.  One can
call ceph_osdc_wait_request() after the request has been
successfully started.  Whether the wait succeeds or not,
by the time ceph_osdc_wait_request() returns the request
should have been cleaned up, back to the state it was
in before the start_request call.  That means the request
needs to be unregistered and its reference dropped, etc.

Similarly, each RBD object request has a completion associated
with it.  It is distinct from the OSD request associated
with the RBD object request because there may be more to do
for RBD request to complete than just complete one object
request.  An RBD object request is started by a call to
rbd_obj_request_submit(), and once that's successfully
returned, one can wait for it to complete using a call to
rbd_obj_request_wait().  And as above, that call should
return state to (roughly) where it was before the submit
call, whether the wait request succeeded or not.

Now, RBD doesn't actually wait for its object requests
to complete--all its OSD requests complete asynchronously.
Instead, it relies on the OSD client to call the callback
function (always rbd_osd_req_callback()) when it has
completed.  That function will lead to the RBD request's
completion being signaled when appropriate.

So...  What happens when an interrupt occurs after
rbd_obj_request_submit() has returned successfully?  That
function is a simple wrapper for ceph_osdc_start_request(),
so a successful return means the request was mapped and
put on a target's unsent list (or the OSD client's no
target list).  Nobody waits for the OSD request, so an
interrupt won't get the benefit of the cleanup done in
ceph_osdc_wait_request().  Therefore the RBD layer needs
to be responsible for doing that.

In other words, when rbd_obj_request_wait() gets an
interrupt while waiting for the completion, it needs
to clean up (end) the interrupted request, and
rbd_obj_request_end() sounds right.  And what that
cleanup function should do is basically the same
as what ceph_osdc_wait_request() should do in that
situation, which is call ceph_osdc_cancel_request().

The only question that leaves me with is, does
ceph_osdc_cancel_request() need to include the
call to complete_request() that's present in
ceph_osdc_wait_request()?

I'll leave it at that.  I think I've convinced myself
this is a good change.  So I await your confirmation
that I understand it right, and your answer to my
question above.  But in any case you can consider this:

Reviewed-by: Alex Elder <elder@linaro.org>

Sorry it's taking me so long to get through these.



> 
> This also adds an rbd wrapper around ceph_osdc_cancel_request() to
> match rbd_obj_request_submit() and rbd_obj_request_wait().
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  drivers/block/rbd.c |   39 ++++++++++++++++++++++++++++-----------
>  1 file changed, 28 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index b2c98c1bc037..20147aec86f3 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -1527,11 +1527,37 @@ static bool obj_request_type_valid(enum obj_request_type type)
>  static int rbd_obj_request_submit(struct ceph_osd_client *osdc,
>  				struct rbd_obj_request *obj_request)
>  {
> -	dout("%s: osdc %p obj %p\n", __func__, osdc, obj_request);
> -
> +	dout("%s %p\n", __func__, obj_request);
>  	return ceph_osdc_start_request(osdc, obj_request->osd_req, false);
>  }
>  
> +static void rbd_obj_request_end(struct rbd_obj_request *obj_request)
> +{
> +	dout("%s %p\n", __func__, obj_request);
> +	ceph_osdc_cancel_request(obj_request->osd_req);
> +}
> +
> +/*
> + * Wait for an object request to complete.  If interrupted, cancel the
> + * underlying osd request.
> + */
> +static int rbd_obj_request_wait(struct rbd_obj_request *obj_request)
> +{
> +	int ret;
> +
> +	dout("%s %p\n", __func__, obj_request);
> +
> +	ret = wait_for_completion_interruptible(&obj_request->completion);
> +	if (ret < 0) {
> +		dout("%s %p interrupted\n", __func__, obj_request);
> +		rbd_obj_request_end(obj_request);
> +		return ret;
> +	}
> +
> +	dout("%s %p done\n", __func__, obj_request);
> +	return 0;
> +}
> +
>  static void rbd_img_request_complete(struct rbd_img_request *img_request)
>  {
>  
> @@ -1558,15 +1584,6 @@ static void rbd_img_request_complete(struct rbd_img_request *img_request)
>  		rbd_img_request_put(img_request);
>  }
>  
> -/* Caller is responsible for rbd_obj_request_destroy(obj_request) */
> -
> -static int rbd_obj_request_wait(struct rbd_obj_request *obj_request)
> -{
> -	dout("%s: obj %p\n", __func__, obj_request);
> -
> -	return wait_for_completion_interruptible(&obj_request->completion);
> -}
> -
>  /*
>   * The default/initial value for all image request flags is 0.  Each
>   * is conditionally set to 1 at image request initialization time
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 11/14] rbd: add rbd_obj_watch_request_helper() helper
  2014-06-25 17:16 ` [PATCH 11/14] rbd: add rbd_obj_watch_request_helper() helper Ilya Dryomov
@ 2014-07-07 22:36   ` Alex Elder
  2014-07-08 11:18     ` Ilya Dryomov
  0 siblings, 1 reply; 40+ messages in thread
From: Alex Elder @ 2014-07-07 22:36 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> In the past, rbd_dev_header_watch_sync() used to handle both watch and
> unwatch requests and was entangled and leaky.  Commit b30a01f2a307
> ("rbd: fix osd_request memory leak in __rbd_dev_header_watch_sync()")
> split it into two separate functions.  This commit cleanly abstracts
> the common bits, relying on the fixed rbd_obj_request_wait().

Adding this without calling it leads to an unused function
warning in the build, I'm sure.

You could probably squash this into the next patch.

Reviewed-by: Alex Elder <elder@linaro.org>

> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  drivers/block/rbd.c |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 53 insertions(+)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 20147aec86f3..02cf7aba7679 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -2971,6 +2971,59 @@ static void rbd_watch_cb(u64 ver, u64 notify_id, u8 opcode, void *data)
>  }
>  
>  /*
> + * Send a (un)watch request and wait for the ack.  Return a request
> + * with a ref held on success or error.
> + */
> +static struct rbd_obj_request *rbd_obj_watch_request_helper(
> +						struct rbd_device *rbd_dev,
> +						bool watch)
> +{
> +	struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
> +	struct rbd_obj_request *obj_request;
> +	int ret;
> +
> +	obj_request = rbd_obj_request_create(rbd_dev->header_name, 0, 0,
> +					     OBJ_REQUEST_NODATA);
> +	if (!obj_request)
> +		return ERR_PTR(-ENOMEM);
> +
> +	obj_request->osd_req = rbd_osd_req_create(rbd_dev, true, 1,
> +						  obj_request);
> +	if (!obj_request->osd_req) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_WATCH,
> +			      rbd_dev->watch_event->cookie, 0, watch);
> +	rbd_osd_req_format_write(obj_request);
> +
> +	if (watch)
> +		ceph_osdc_set_request_linger(osdc, obj_request->osd_req);
> +
> +	ret = rbd_obj_request_submit(osdc, obj_request);
> +	if (ret)
> +		goto out;
> +
> +	ret = rbd_obj_request_wait(obj_request);
> +	if (ret)
> +		goto out;
> +
> +	ret = obj_request->result;
> +	if (ret) {
> +		if (watch)
> +			rbd_obj_request_end(obj_request);
> +		goto out;
> +	}
> +
> +	return obj_request;
> +
> +out:
> +	rbd_obj_request_put(obj_request);
> +	return ERR_PTR(ret);
> +}
> +
> +/*
>   * Initiate a watch request, synchronously.
>   */
>  static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev)
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 12/14] rbd: use rbd_obj_watch_request_helper() helper
  2014-06-25 17:16 ` [PATCH 12/14] rbd: use " Ilya Dryomov
@ 2014-07-07 22:36   ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-07-07 22:36 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Switch rbd_dev_header_{un,}watch_sync() to use the new helper and fix
> rbd_dev_header_unwatch_sync() to destroy watch_request structures
> before queuing watch-remove message while at it.  This mistake slipped
> into commit b30a01f2a307 ("rbd: fix osd_request memory leak in
> __rbd_dev_header_watch_sync()") and could lead to "image still in use"
> errors on image removal.

You can see why--even if they were entangled--both setting up
and taking down the watch request were done in the same function.
So much of what they do is common, just slightly different.

Reviewed-by: Alex Elder <elder@linaro.org>

> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  drivers/block/rbd.c |  115 ++++++++-------------------------------------------
>  1 file changed, 17 insertions(+), 98 deletions(-)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 02cf7aba7679..d99aa81774f8 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -3040,130 +3040,49 @@ static int rbd_dev_header_watch_sync(struct rbd_device *rbd_dev)
>  	if (ret < 0)
>  		return ret;
>  
> -	rbd_assert(rbd_dev->watch_event);
> -
> -	obj_request = rbd_obj_request_create(rbd_dev->header_name, 0, 0,
> -					     OBJ_REQUEST_NODATA);
> -	if (!obj_request) {
> -		ret = -ENOMEM;
> -		goto out_cancel;
> +	obj_request = rbd_obj_watch_request_helper(rbd_dev, true);
> +	if (IS_ERR(obj_request)) {
> +		ceph_osdc_cancel_event(rbd_dev->watch_event);
> +		rbd_dev->watch_event = NULL;
> +		return PTR_ERR(obj_request);
>  	}
>  
> -	obj_request->osd_req = rbd_osd_req_create(rbd_dev, true, 1,
> -						  obj_request);
> -	if (!obj_request->osd_req) {
> -		ret = -ENOMEM;
> -		goto out_put;
> -	}
> -
> -	ceph_osdc_set_request_linger(osdc, obj_request->osd_req);
> -
> -	osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_WATCH,
> -			      rbd_dev->watch_event->cookie, 0, 1);
> -	rbd_osd_req_format_write(obj_request);
> -
> -	ret = rbd_obj_request_submit(osdc, obj_request);
> -	if (ret)
> -		goto out_linger;
> -
> -	ret = rbd_obj_request_wait(obj_request);
> -	if (ret)
> -		goto out_linger;
> -
> -	ret = obj_request->result;
> -	if (ret)
> -		goto out_linger;
> -
>  	/*
>  	 * A watch request is set to linger, so the underlying osd
>  	 * request won't go away until we unregister it.  We retain
>  	 * a pointer to the object request during that time (in
> -	 * rbd_dev->watch_request), so we'll keep a reference to
> -	 * it.  We'll drop that reference (below) after we've
> -	 * unregistered it.
> +	 * rbd_dev->watch_request), so we'll keep a reference to it.
> +	 * We'll drop that reference after we've unregistered it in
> +	 * rbd_dev_header_unwatch_sync().
>  	 */
>  	rbd_dev->watch_request = obj_request;
>  
>  	return 0;
> -
> -out_linger:
> -	ceph_osdc_unregister_linger_request(osdc, obj_request->osd_req);
> -out_put:
> -	rbd_obj_request_put(obj_request);
> -out_cancel:
> -	ceph_osdc_cancel_event(rbd_dev->watch_event);
> -	rbd_dev->watch_event = NULL;
> -
> -	return ret;
>  }
>  
>  /*
>   * Tear down a watch request, synchronously.
>   */
> -static int __rbd_dev_header_unwatch_sync(struct rbd_device *rbd_dev)
> +static void rbd_dev_header_unwatch_sync(struct rbd_device *rbd_dev)
>  {
> -	struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
>  	struct rbd_obj_request *obj_request;
> -	int ret;
>  
>  	rbd_assert(rbd_dev->watch_event);
>  	rbd_assert(rbd_dev->watch_request);
>  
> -	obj_request = rbd_obj_request_create(rbd_dev->header_name, 0, 0,
> -					     OBJ_REQUEST_NODATA);
> -	if (!obj_request) {
> -		ret = -ENOMEM;
> -		goto out_cancel;
> -	}
> -
> -	obj_request->osd_req = rbd_osd_req_create(rbd_dev, true, 1,
> -						  obj_request);
> -	if (!obj_request->osd_req) {
> -		ret = -ENOMEM;
> -		goto out_put;
> -	}
> -
> -	osd_req_op_watch_init(obj_request->osd_req, 0, CEPH_OSD_OP_WATCH,
> -			      rbd_dev->watch_event->cookie, 0, 0);
> -	rbd_osd_req_format_write(obj_request);
> -
> -	ret = rbd_obj_request_submit(osdc, obj_request);
> -	if (ret)
> -		goto out_put;
> -
> -	ret = rbd_obj_request_wait(obj_request);
> -	if (ret)
> -		goto out_put;
> -
> -	ret = obj_request->result;
> -	if (ret)
> -		goto out_put;
> -
> -	/* We have successfully torn down the watch request */
> -
> -	ceph_osdc_unregister_linger_request(osdc,
> -					    rbd_dev->watch_request->osd_req);
> +	rbd_obj_request_end(rbd_dev->watch_request);
>  	rbd_obj_request_put(rbd_dev->watch_request);
>  	rbd_dev->watch_request = NULL;
>  
> -out_put:
> -	rbd_obj_request_put(obj_request);
> -out_cancel:
> +	obj_request = rbd_obj_watch_request_helper(rbd_dev, false);
> +	if (!IS_ERR(obj_request))
> +		rbd_obj_request_put(obj_request);
> +	else
> +		rbd_warn(rbd_dev, "unable to tear down watch request (%ld)",
> +			 PTR_ERR(obj_request));
> +
>  	ceph_osdc_cancel_event(rbd_dev->watch_event);
>  	rbd_dev->watch_event = NULL;
> -
> -	return ret;
> -}
> -
> -static void rbd_dev_header_unwatch_sync(struct rbd_device *rbd_dev)
> -{
> -	int ret;
> -
> -	ret = __rbd_dev_header_unwatch_sync(rbd_dev);
> -	if (ret) {
> -		rbd_warn(rbd_dev, "unable to tear down watch request: %d\n",
> -			 ret);
> -	}
>  }
>  
>  /*
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 13/14] libceph: nuke ceph_osdc_unregister_linger_request()
  2014-06-25 17:16 ` [PATCH 13/14] libceph: nuke ceph_osdc_unregister_linger_request() Ilya Dryomov
@ 2014-07-07 22:36   ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-07-07 22:36 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> Remove now unused ceph_osdc_unregister_linger_request().
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>

Looks good.

Reviewed-by: Alex Elder <elder@linaro.org>

> ---
>  include/linux/ceph/osd_client.h |    2 --
>  net/ceph/osd_client.c           |   12 ------------
>  2 files changed, 14 deletions(-)
> 
> diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
> index de09cad7b7c7..03aeb27fcc69 100644
> --- a/include/linux/ceph/osd_client.h
> +++ b/include/linux/ceph/osd_client.h
> @@ -325,8 +325,6 @@ extern struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *,
>  
>  extern void ceph_osdc_set_request_linger(struct ceph_osd_client *osdc,
>  					 struct ceph_osd_request *req);
> -extern void ceph_osdc_unregister_linger_request(struct ceph_osd_client *osdc,
> -						struct ceph_osd_request *req);
>  
>  extern void ceph_osdc_get_request(struct ceph_osd_request *req);
>  extern void ceph_osdc_put_request(struct ceph_osd_request *req);
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index ef44cc0bbc6a..30f6faf3584f 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1281,18 +1281,6 @@ static void __unregister_linger_request(struct ceph_osd_client *osdc,
>  	ceph_osdc_put_request(req);
>  }
>  
> -void ceph_osdc_unregister_linger_request(struct ceph_osd_client *osdc,
> -					 struct ceph_osd_request *req)
> -{
> -	mutex_lock(&osdc->request_mutex);
> -	if (req->r_linger) {
> -		req->r_linger = 0;
> -		__unregister_linger_request(osdc, req);
> -	}
> -	mutex_unlock(&osdc->request_mutex);
> -}
> -EXPORT_SYMBOL(ceph_osdc_unregister_linger_request);
> -
>  void ceph_osdc_set_request_linger(struct ceph_osd_client *osdc,
>  				  struct ceph_osd_request *req)
>  {
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 14/14] libceph: drop osd ref when canceling con work
  2014-06-25 17:16 ` [PATCH 14/14] libceph: drop osd ref when canceling con work Ilya Dryomov
@ 2014-07-07 22:38   ` Alex Elder
  2014-07-08 11:22     ` Ilya Dryomov
  0 siblings, 1 reply; 40+ messages in thread
From: Alex Elder @ 2014-07-07 22:38 UTC (permalink / raw)
  To: Ilya Dryomov, ceph-devel

On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
> queue_con() bumps osd ref count.  We should do the reverse when
> canceling con work.

Kind of unrelated to the rest of the series, but it looks
good.  Good to have a same-level-of-abstraction function
for it as well.

Reviewed-by: Alex Elder <elder@linaro.org>
> 
> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
> ---
>  net/ceph/messenger.c |   14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 8bffa5b90fef..e51cad0db580 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -174,6 +174,7 @@ static struct lock_class_key socket_class;
>  #define SKIP_BUF_SIZE	1024
>  
>  static void queue_con(struct ceph_connection *con);
> +static void cancel_con(struct ceph_connection *con);
>  static void con_work(struct work_struct *);
>  static void con_fault(struct ceph_connection *con);
>  
> @@ -680,7 +681,7 @@ void ceph_con_close(struct ceph_connection *con)
>  
>  	reset_connection(con);
>  	con->peer_global_seq = 0;
> -	cancel_delayed_work(&con->work);
> +	cancel_con(con);
>  	con_close_socket(con);
>  	mutex_unlock(&con->mutex);
>  }
> @@ -2667,19 +2668,16 @@ static int queue_con_delay(struct ceph_connection *con, unsigned long delay)
>  {
>  	if (!con->ops->get(con)) {
>  		dout("%s %p ref count 0\n", __func__, con);
> -
>  		return -ENOENT;
>  	}
>  
>  	if (!queue_delayed_work(ceph_msgr_wq, &con->work, delay)) {
>  		dout("%s %p - already queued\n", __func__, con);
>  		con->ops->put(con);
> -
>  		return -EBUSY;
>  	}
>  
>  	dout("%s %p %lu\n", __func__, con, delay);
> -
>  	return 0;
>  }
>  
> @@ -2688,6 +2686,14 @@ static void queue_con(struct ceph_connection *con)
>  	(void) queue_con_delay(con, 0);
>  }
>  
> +static void cancel_con(struct ceph_connection *con)
> +{
> +	if (cancel_delayed_work(&con->work)) {
> +		dout("%s %p\n", __func__, con);
> +		con->ops->put(con);
> +	}
> +}
> +
>  static bool con_sock_closed(struct ceph_connection *con)
>  {
>  	if (!con_flag_test_and_clear(con, CON_FLAG_SOCK_CLOSED))
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 03/14] libceph: move and add dout()s to ceph_msg_{get,put}()
  2014-06-30 12:29   ` Alex Elder
@ 2014-07-08 11:12     ` Ilya Dryomov
  0 siblings, 0 replies; 40+ messages in thread
From: Ilya Dryomov @ 2014-07-08 11:12 UTC (permalink / raw)
  To: Alex Elder; +Cc: Ceph Development

On Mon, Jun 30, 2014 at 4:29 PM, Alex Elder <elder@ieee.org> wrote:
> On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
>> Add dout()s to ceph_msg_{get,put}().  Also move them to .c and turn
>> kref release callback into a static function.
>>
>> Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
>
> This is all very good.
>
> I have one suggestion though, below, but regardless:
>
> Reviewed-by: Alex Elder <elder@linaro.org>
>
>
>> ---
>>  include/linux/ceph/messenger.h |   14 ++------------
>>  net/ceph/messenger.c           |   31 ++++++++++++++++++++++---------
>>  2 files changed, 24 insertions(+), 21 deletions(-)
>>
>> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
>> index d21f2dba0731..40ae58e3e9db 100644
>> --- a/include/linux/ceph/messenger.h
>> +++ b/include/linux/ceph/messenger.h
>> @@ -285,19 +285,9 @@ extern void ceph_msg_data_add_bio(struct ceph_msg *msg, struct bio *bio,
>>
>>  extern struct ceph_msg *ceph_msg_new(int type, int front_len, gfp_t flags,
>>                                    bool can_fail);
>> -extern void ceph_msg_kfree(struct ceph_msg *m);
>>
>> -
>> -static inline struct ceph_msg *ceph_msg_get(struct ceph_msg *msg)
>> -{
>> -     kref_get(&msg->kref);
>> -     return msg;
>> -}
>> -extern void ceph_msg_last_put(struct kref *kref);
>> -static inline void ceph_msg_put(struct ceph_msg *msg)
>> -{
>> -     kref_put(&msg->kref, ceph_msg_last_put);
>> -}
>> +extern struct ceph_msg *ceph_msg_get(struct ceph_msg *msg);
>> +extern void ceph_msg_put(struct ceph_msg *msg);
>>
>>  extern void ceph_msg_dump(struct ceph_msg *msg);
>>
>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>> index 1948d592aa54..8bffa5b90fef 100644
>> --- a/net/ceph/messenger.c
>> +++ b/net/ceph/messenger.c
>> @@ -3269,24 +3269,21 @@ static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip)
>>  /*
>>   * Free a generically kmalloc'd message.
>>   */
>> -void ceph_msg_kfree(struct ceph_msg *m)
>> +static void ceph_msg_free(struct ceph_msg *m)
>>  {
>> -     dout("msg_kfree %p\n", m);
>> +     dout("%s %p\n", __func__, m);
>>       ceph_kvfree(m->front.iov_base);
>>       kmem_cache_free(ceph_msg_cache, m);
>>  }
>>
>> -/*
>> - * Drop a msg ref.  Destroy as needed.
>> - */
>> -void ceph_msg_last_put(struct kref *kref)
>> +static void ceph_msg_release(struct kref *kref)
>>  {
>>       struct ceph_msg *m = container_of(kref, struct ceph_msg, kref);
>>       LIST_HEAD(data);
>>       struct list_head *links;
>>       struct list_head *next;
>>
>> -     dout("ceph_msg_put last one on %p\n", m);
>> +     dout("%s %p\n", __func__, m);
>>       WARN_ON(!list_empty(&m->list_head));
>>
>>       /* drop middle, data, if any */
>> @@ -3308,9 +3305,25 @@ void ceph_msg_last_put(struct kref *kref)
>>       if (m->pool)
>>               ceph_msgpool_put(m->pool, m);
>>       else
>> -             ceph_msg_kfree(m);
>> +             ceph_msg_free(m);
>> +}
>> +
>> +struct ceph_msg *ceph_msg_get(struct ceph_msg *msg)
>> +{
>> +     dout("%s %p (was %d)\n", __func__, msg,
>> +          atomic_read(&msg->kref.refcount));
>> +     kref_get(&msg->kref);
>
> I suggest you do the dout() *after* you call kref_get().
> I'm sure it doesn't matter in practice here, but my
> reasoning is that you get the reference immediately, and
> you'll have the reference when reading the refcount value.
> It also  makes the dout() calls here and in ceph_msg_put()
> end up getting symmetrically wrapped by the get kref
> get and put.  (You have a race reading the updated
> refcount value either way, but it's debug code.)

My inspiration was rbd_{img,obj}_request_get().  kref_get() can't fail
(may spit out a WARN though) and it is racey anyway, so I'll leave it as
is for consistency.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request()
  2014-07-07 13:47       ` Alex Elder
@ 2014-07-08 11:15         ` Ilya Dryomov
  2014-07-08 12:58           ` Alex Elder
  0 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-07-08 11:15 UTC (permalink / raw)
  To: Alex Elder; +Cc: Ceph Development

On Mon, Jul 7, 2014 at 5:47 PM, Alex Elder <elder@ieee.org> wrote:
> On 06/30/2014 09:34 AM, Ilya Dryomov wrote:
>> On Mon, Jun 30, 2014 at 5:39 PM, Alex Elder <elder@ieee.org> wrote:
>>> On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
>>>> Introduce ceph_osdc_cancel_request() intended for canceling requests
>>>> from the higher layers (rbd and cephfs).  Because higher layers are in
>>>> charge and are supposed to know what and when they are canceling, the
>>>> request is not completed, only unref'ed and removed from the libceph
>>>> data structures.
>>>
>>> This seems reasonable.
>>>
>>> But you make two changes here that I'd like to see a little
>>> more explanation on.  They seem significant enough to warrant
>>> a little more attention in the commit description.
>>>
>>> - __cancel_request() is no longer called when
>>>   ceph_osdc_wait_request() fails due to an
>>>   an interrupt.  This is my main concern, and I
>>>   plan to work through it but I'm in a small hurry
>>>   right now.
>>
>> Perhaps it should have been a separate commit.  __unregister_request()
>> revokes r_request, so I opted for not trying to do it twice.  As for
>
> OK, that makes sense--revoking the request is basically all
> __cancel_request() does anyway.  You ought to have mentioned
> that in the description anyway, even if it wasn't a separate
> commit.

I have added the explanation to the commit message.

>
>> the r_sent condition and assignment, it doesn't seem to make much of
>> a difference, given that the request is about to be unregistered
>> anyway.
>
> If the request is getting canceled (from a caller outside libceph)
> it can't take into account whether it was sent or not.  As you said,
> it is getting revoked unconditionally by __unregister_request().
> To be honest though, *that* revoke call should have been zeroing
> the r_sent value also.  It apparently won't matter, but I think it's
> wrong.  The revoke drops a reference, it doesn't necessarily free
> the structure (though I expect it may always do so anyway).
>
>>> - __unregister_linger_request() is now called when
>>>   a request is canceled, but it wasn't before.  (Since
>>>   we're consistent about setting the r_linger flag
>>>   this should be fine, but it *is* a behavior change.)
>>
>> The general goal here is to make ceph_osdc_cancel_request() cancel
>> *any* request correctly, so if r_linger is set, which means that the
>> request in question *could* be lingering, __unregister_linger_request()
>> is called.
>
> The goal is good.  Note that __unregister_linger_request() drops
> a reference to the request though.  There are three other callers
> of this function.  Two of them drop a reference that had just been
> added by a call to __register_request().  The other one is in
> ceph_osdc_unregister_linger_request(), initiated by a higher layer.
> In that last case, r_linger will be zeroed, so calling it again
> should be harmless.

Yeah, ceph_osdc_unregister_linger_request() is removed in favor of
ceph_osdc_cancel_request() later in the series.  r_linger is now
treated is a sort of immutable field - it's never zeroed after it's
been set.  It's still safe to call __unregister_linger_request()
at any point in time though, because unless the request is *actually*
lingering it won't do a thing.

Are you OK with your Reviewed-by for this patch?

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/14] rbd: rbd_obj_request_wait() should cancel the request if interrupted
  2014-07-07 16:55   ` Alex Elder
@ 2014-07-08 11:18     ` Ilya Dryomov
  2014-07-08 12:17       ` Alex Elder
  0 siblings, 1 reply; 40+ messages in thread
From: Ilya Dryomov @ 2014-07-08 11:18 UTC (permalink / raw)
  To: Alex Elder; +Cc: Ceph Development

On Mon, Jul 7, 2014 at 8:55 PM, Alex Elder <elder@linaro.org> wrote:
> On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
>> rbd_obj_request_wait() should cancel the underlying OSD request if
>> interrupted.  Otherwise libceph will hold onto it indefinitely, causing
>> assert failures or leaking the original object request.
>
> At first I didn't understand this.  Let me see if I've got
> it now, though.
>
> Each OSD request has a completion associated with it.  An
> OSD request is started via ceph_osdc_start_request(), which
> registers the request and takes a reference to it.  One can
> call ceph_osdc_wait_request() after the request has been
> successfully started.  Whether the wait succeeds or not,
> by the time ceph_osdc_wait_request() returns the request
> should have been cleaned up, back to the state it was
> in before the start_request call.  That means the request
> needs to be unregistered and its reference dropped, etc.
>
> Similarly, each RBD object request has a completion associated
> with it.  It is distinct from the OSD request associated
> with the RBD object request because there may be more to do
> for RBD request to complete than just complete one object
> request.  An RBD object request is started by a call to
> rbd_obj_request_submit(), and once that's successfully
> returned, one can wait for it to complete using a call to
> rbd_obj_request_wait().  And as above, that call should
> return state to (roughly) where it was before the submit
> call, whether the wait request succeeded or not.
>
> Now, RBD doesn't actually wait for its object requests
> to complete--all its OSD requests complete asynchronously.
> Instead, it relies on the OSD client to call the callback
> function (always rbd_osd_req_callback()) when it has
> completed.  That function will lead to the RBD request's
> completion being signaled when appropriate.
>
> So...  What happens when an interrupt occurs after
> rbd_obj_request_submit() has returned successfully?  That
> function is a simple wrapper for ceph_osdc_start_request(),
> so a successful return means the request was mapped and
> put on a target's unsent list (or the OSD client's no
> target list).  Nobody waits for the OSD request, so an
> interrupt won't get the benefit of the cleanup done in
> ceph_osdc_wait_request().  Therefore the RBD layer needs
> to be responsible for doing that.
>
> In other words, when rbd_obj_request_wait() gets an
> interrupt while waiting for the completion, it needs
> to clean up (end) the interrupted request, and
> rbd_obj_request_end() sounds right.  And what that
> cleanup function should do is basically the same
> as what ceph_osdc_wait_request() should do in that
> situation, which is call ceph_osdc_cancel_request().

That's exactly right.

>
> The only question that leaves me with is, does
> ceph_osdc_cancel_request() need to include the
> call to complete_request() that's present in
> ceph_osdc_wait_request()?

I don't think so - I mentioned it in the ceph_osdc_cancel_request()
function comment.  ceph_osdc_cancel_request() is supposed to be used by
higher layers - rbd, cephfs - and exactly because their completion
logic is decoupled from libceph completions (as you have brilliantly
explained above) it's the higher layers who should be taking care of
it.  IOW higher layers are in charge and are supposed to know what and
when they are cancelling.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 11/14] rbd: add rbd_obj_watch_request_helper() helper
  2014-07-07 22:36   ` Alex Elder
@ 2014-07-08 11:18     ` Ilya Dryomov
  0 siblings, 0 replies; 40+ messages in thread
From: Ilya Dryomov @ 2014-07-08 11:18 UTC (permalink / raw)
  To: Alex Elder; +Cc: Ceph Development

On Tue, Jul 8, 2014 at 2:36 AM, Alex Elder <elder@ieee.org> wrote:
> On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
>> In the past, rbd_dev_header_watch_sync() used to handle both watch and
>> unwatch requests and was entangled and leaky.  Commit b30a01f2a307
>> ("rbd: fix osd_request memory leak in __rbd_dev_header_watch_sync()")
>> split it into two separate functions.  This commit cleanly abstracts
>> the common bits, relying on the fixed rbd_obj_request_wait().
>
> Adding this without calling it leads to an unused function
> warning in the build, I'm sure.
>
> You could probably squash this into the next patch.

It used to be a single patch in the previous version of this series,
but it was too hard to review even for myself, so I had to split it.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 14/14] libceph: drop osd ref when canceling con work
  2014-07-07 22:38   ` Alex Elder
@ 2014-07-08 11:22     ` Ilya Dryomov
  0 siblings, 0 replies; 40+ messages in thread
From: Ilya Dryomov @ 2014-07-08 11:22 UTC (permalink / raw)
  To: Alex Elder; +Cc: Ceph Development

On Tue, Jul 8, 2014 at 2:38 AM, Alex Elder <elder@ieee.org> wrote:
> On 06/25/2014 12:16 PM, Ilya Dryomov wrote:
>> queue_con() bumps osd ref count.  We should do the reverse when
>> canceling con work.
>
> Kind of unrelated to the rest of the series, but it looks
> good.  Good to have a same-level-of-abstraction function
> for it as well.

This series is really "everything I stumbled upon while fixing #6628" ;)

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 10/14] rbd: rbd_obj_request_wait() should cancel the request if interrupted
  2014-07-08 11:18     ` Ilya Dryomov
@ 2014-07-08 12:17       ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-07-08 12:17 UTC (permalink / raw)
  To: Ilya Dryomov; +Cc: Ceph Development

On 07/08/2014 06:18 AM, Ilya Dryomov wrote:
>> > The only question that leaves me with is, does
>> > ceph_osdc_cancel_request() need to include the
>> > call to complete_request() that's present in
>> > ceph_osdc_wait_request()?
> I don't think so - I mentioned it in the ceph_osdc_cancel_request()
> function comment.  ceph_osdc_cancel_request() is supposed to be used by
> higher layers - rbd, cephfs - and exactly because their completion
> logic is decoupled from libceph completions (as you have brilliantly
> explained above) it's the higher layers who should be taking care of
> it.  IOW higher layers are in charge and are supposed to know what and
> when they are cancelling.

I noticed that comment only after sending my message.

RBD doesn't use the safe completion, only the FS client
does, and I was pretty focused on RBD behavior while
looking at this.  I was trying to conceptualize how
(from the perspective of the upper layer) the safe
completion differs from the "normal" completion.

It's possible that an "I have your request" (normal
completion) *also* carries with it the "your request
has completed" (safe completion) indication, but
the higher layer caller has no way of knowing that.

Maybe I should flip my question around, and ask, why
should the ceph_osdc_cancel_request() include the call
to complete_request()?

The answer lies in details of the file system client,
and I'm not in a position right now to dive into that.
Whether it's called in ceph_osdc_cancel_request() or
not has no effect on RBD.

Anyway, your response is fine with me, thank you.

					-Alex

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request()
  2014-07-08 11:15         ` Ilya Dryomov
@ 2014-07-08 12:58           ` Alex Elder
  0 siblings, 0 replies; 40+ messages in thread
From: Alex Elder @ 2014-07-08 12:58 UTC (permalink / raw)
  To: Ilya Dryomov; +Cc: Ceph Development

On 07/08/2014 06:15 AM, Ilya Dryomov wrote:
> Are you OK with your Reviewed-by for this patch?

Reviewed-by: Alex Elder <elder@linaro.org>


^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2014-07-08 12:58 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-25 17:16 [PATCH 00/14] rbd: #6628 fixes (wip-remove-osd-6628) Ilya Dryomov
2014-06-25 17:16 ` [PATCH 01/14] libceph: rename ceph_osd_request::r_linger_osd to r_linger_osd_item Ilya Dryomov
2014-06-30 12:16   ` Alex Elder
2014-06-25 17:16 ` [PATCH 02/14] libceph: add maybe_move_osd_to_lru() and switch to it Ilya Dryomov
2014-06-30 12:17   ` Alex Elder
2014-06-25 17:16 ` [PATCH 03/14] libceph: move and add dout()s to ceph_msg_{get,put}() Ilya Dryomov
2014-06-30 12:29   ` Alex Elder
2014-07-08 11:12     ` Ilya Dryomov
2014-06-25 17:16 ` [PATCH 04/14] libceph: move and add dout()s to ceph_osdc_request_{get,put}() Ilya Dryomov
2014-06-30 12:32   ` Alex Elder
2014-06-25 17:16 ` [PATCH 05/14] libceph: harden ceph_osdc_request_release() a bit Ilya Dryomov
2014-06-30 12:36   ` Alex Elder
2014-06-25 17:16 ` [PATCH 06/14] libceph: assert both regular and lingering lists in __remove_osd() Ilya Dryomov
2014-06-30 12:37   ` Alex Elder
2014-06-25 17:16 ` [PATCH 07/14] libceph: unregister only registered linger requests Ilya Dryomov
2014-06-30 13:05   ` Alex Elder
2014-06-30 13:50   ` Alex Elder
2014-06-30 14:21     ` Ilya Dryomov
2014-06-25 17:16 ` [PATCH 08/14] libceph: fix linger request check in __unregister_request() Ilya Dryomov
2014-06-30 13:07   ` Alex Elder
2014-06-25 17:16 ` [PATCH 09/14] libceph: introduce ceph_osdc_cancel_request() Ilya Dryomov
2014-06-30 13:39   ` Alex Elder
2014-06-30 14:34     ` Ilya Dryomov
2014-07-07 13:47       ` Alex Elder
2014-07-08 11:15         ` Ilya Dryomov
2014-07-08 12:58           ` Alex Elder
2014-06-25 17:16 ` [PATCH 10/14] rbd: rbd_obj_request_wait() should cancel the request if interrupted Ilya Dryomov
2014-07-07 16:55   ` Alex Elder
2014-07-08 11:18     ` Ilya Dryomov
2014-07-08 12:17       ` Alex Elder
2014-06-25 17:16 ` [PATCH 11/14] rbd: add rbd_obj_watch_request_helper() helper Ilya Dryomov
2014-07-07 22:36   ` Alex Elder
2014-07-08 11:18     ` Ilya Dryomov
2014-06-25 17:16 ` [PATCH 12/14] rbd: use " Ilya Dryomov
2014-07-07 22:36   ` Alex Elder
2014-06-25 17:16 ` [PATCH 13/14] libceph: nuke ceph_osdc_unregister_linger_request() Ilya Dryomov
2014-07-07 22:36   ` Alex Elder
2014-06-25 17:16 ` [PATCH 14/14] libceph: drop osd ref when canceling con work Ilya Dryomov
2014-07-07 22:38   ` Alex Elder
2014-07-08 11:22     ` Ilya Dryomov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.