All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/13] libceph: cleanups preparing for state cleanup
@ 2012-05-30 19:24 Alex Elder
  2012-05-30 19:34 ` [PATCH 01/13] libceph: eliminate connection state "DEAD" Alex Elder
                   ` (12 more replies)
  0 siblings, 13 replies; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:24 UTC (permalink / raw)
  To: ceph-devel

I am working on some fairly big changes to the ceph/RADOS client
messenger code.  The ultimate goal is to simplify the code by
having it more obviously follow a state machine, but getting to
that point isn't necessarily easy.

My approach is to make evolutionary changes, making a long series
of small changes, each of which either produces code that works
identical to before, or which very explicitly fixes a bug or adds
or modifies a feature while continuing to provide functionality
that is equivalent and/or meets what is required.

This series contains a few small batches of changes to begin this
process.  In general they're dependent on each other, so they are
being provided in one series, but I group them below into some
smaller logical subsets.

					-Alex

[PATCH 01/13] libceph: eliminate connection state "DEAD"
[PATCH 02/13] libceph: kill bad_proto ceph connection op
[PATCH 03/13] libceph: delete useless SOCK_CLOSED manipulations
     These three delete dead/unused code

[PATCH 04/13] libceph: rename socket callbacks
[PATCH 05/13] libceph: rename kvec_reset and kvec_add functions
     These two simply rename some symbols.

[PATCH 06/13] libceph: embed ceph messenger structure in ceph_client
[PATCH 07/13] libceph: embed ceph connection structure in mon_client
     These two each change a structure definition so that what was
     once a pointer to a structure becomes instead an embedded
     structure of the pointed-to type.  Doing this makes it obvious
     that the relationship between the containing structure and the
     embedded one is purely one-to-one.

[PATCH 08/13] libceph: start separating connection flags from state
     This identifies a set of values kept in the "state" field and
     records them instead in a new "flags" field, so it is obvious
     the role each plays (whether it's a state diagram state, or
     whether it's a Boolean flag).

[PATCH 09/13] libceph: start tracking connection socket state
     This adds code to explicitly track the state of the socket
     used by a ceph connection.  It begins the process of trying
     to clean up some fuzziness in how the overall state of a
     ceph connection is tracked.

[PATCH 10/13] libceph: provide osd number when creating osd
[PATCH 11/13] libceph: init monitor connection when opening
[PATCH 12/13] libceph: fully initialize connection in con_init()
[PATCH 13/13] libceph: set CLOSED state bit in con_init
     This series moves things around a bit so that all ceph
     connection initialization is done at one time, by code
     defined with the "net/ceph/messenger.c" source file.  It
     also makes explicit that a newly initialized ceph connection
     is in CLOSED state.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH 01/13] libceph: eliminate connection state "DEAD"
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
@ 2012-05-30 19:34 ` Alex Elder
  2012-05-31 16:20   ` Yehuda Sadeh
  2012-05-30 19:34 ` [PATCH 02/13] libceph: kill bad_proto ceph connection op Alex Elder
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:34 UTC (permalink / raw)
  To: ceph-devel

The ceph connection state "DEAD" is never set and is therefore not
needed.  Eliminate it.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  include/linux/ceph/messenger.h |    1 -
  net/ceph/messenger.c           |    6 ------
  2 files changed, 0 insertions(+), 7 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 2521a95..aa506ca 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -119,7 +119,6 @@ struct ceph_msg_pos {
  #define CLOSED		10 /* we've closed the connection */
  #define SOCK_CLOSED	11 /* socket state changed to closed */
  #define OPENING         13 /* open connection w/ (possibly new) peer */
-#define DEAD            14 /* dead, about to kfree */
  #define BACKOFF         15

  /*
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 1a80907..42ca8aa 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2087,12 +2087,6 @@ bad_tag:
   */
  static void queue_con(struct ceph_connection *con)
  {
-	if (test_bit(DEAD, &con->state)) {
-		dout("queue_con %p ignoring: DEAD\n",
-		     con);
-		return;
-	}
-
  	if (!con->ops->get(con)) {
  		dout("queue_con %p ref count 0\n", con);
  		return;
-- 
1.7.5.4



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 02/13] libceph: kill bad_proto ceph connection op
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
  2012-05-30 19:34 ` [PATCH 01/13] libceph: eliminate connection state "DEAD" Alex Elder
@ 2012-05-30 19:34 ` Alex Elder
  2012-05-31 16:30   ` Yehuda Sadeh
  2012-05-30 19:34 ` [PATCH 03/13] libceph: delete useless SOCK_CLOSED manipulations Alex Elder
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:34 UTC (permalink / raw)
  To: ceph-devel

No code sets a bad_proto method in its ceph connection operations
vector, so just get rid of it.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  include/linux/ceph/messenger.h |    3 ---
  net/ceph/messenger.c           |    5 -----
  2 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index aa506ca..74f6c9b 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -31,9 +31,6 @@ struct ceph_connection_operations {
  	int (*verify_authorizer_reply) (struct ceph_connection *con, int len);
  	int (*invalidate_authorizer)(struct ceph_connection *con);

-	/* protocol version mismatch */
-	void (*bad_proto) (struct ceph_connection *con);
-
  	/* there was some error on the socket (disconnect, whatever) */
  	void (*fault) (struct ceph_connection *con);

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 42ca8aa..07af994 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1356,11 +1356,6 @@ static void fail_protocol(struct ceph_connection 
*con)
  {
  	reset_connection(con);
  	set_bit(CLOSED, &con->state);  /* in case there's queued work */
-
-	mutex_unlock(&con->mutex);
-	if (con->ops->bad_proto)
-		con->ops->bad_proto(con);
-	mutex_lock(&con->mutex);
  }

  static int process_connect(struct ceph_connection *con)
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 03/13] libceph: delete useless SOCK_CLOSED manipulations
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
  2012-05-30 19:34 ` [PATCH 01/13] libceph: eliminate connection state "DEAD" Alex Elder
  2012-05-30 19:34 ` [PATCH 02/13] libceph: kill bad_proto ceph connection op Alex Elder
@ 2012-05-30 19:34 ` Alex Elder
  2012-06-01 18:47   ` Alex Elder
  2012-05-30 19:34 ` [PATCH 04/13] libceph: rename socket callbacks Alex Elder
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:34 UTC (permalink / raw)
  To: ceph-devel

In con_close_socket(), SOCK_CLOSED is set in the connection state,
then cleared again after shutting down the socket.  Nothing between
the setting and clearing of that bit will ever be affected by it,
so there's no point in setting/clearing it at all.  So don't.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  net/ceph/messenger.c |    2 --
  1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 07af994..fe3c2a1 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -338,11 +338,9 @@ static int con_close_socket(struct ceph_connection 
*con)
  	dout("con_close_socket on %p sock %p\n", con, con->sock);
  	if (!con->sock)
  		return 0;
-	set_bit(SOCK_CLOSED, &con->state);
  	rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
  	sock_release(con->sock);
  	con->sock = NULL;
-	clear_bit(SOCK_CLOSED, &con->state);
  	return rc;
  }

-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 04/13] libceph: rename socket callbacks
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (2 preceding siblings ...)
  2012-05-30 19:34 ` [PATCH 03/13] libceph: delete useless SOCK_CLOSED manipulations Alex Elder
@ 2012-05-30 19:34 ` Alex Elder
  2012-05-31 16:33   ` Yehuda Sadeh Weinraub
  2012-06-01  4:02   ` Sage Weil
  2012-05-30 19:34 ` [PATCH 05/13] libceph: rename kvec_reset and kvec_add functions Alex Elder
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:34 UTC (permalink / raw)
  To: ceph-devel

Change the names of the three socket callback functions to make it
more obvious they're specifically associated with a connection's
socket (not the ceph connection that uses it).

Signed-off-by: Alex Elder <elder@inktank.com>
---
  net/ceph/messenger.c |   28 ++++++++++++++--------------
  1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index fe3c2a1..5ad1f0a 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -153,46 +153,46 @@ EXPORT_SYMBOL(ceph_msgr_flush);
   */

  /* data available on socket, or listen socket received a connect */
-static void ceph_data_ready(struct sock *sk, int count_unused)
+static void ceph_sock_data_ready(struct sock *sk, int count_unused)
  {
  	struct ceph_connection *con = sk->sk_user_data;

  	if (sk->sk_state != TCP_CLOSE_WAIT) {
-		dout("ceph_data_ready on %p state = %lu, queueing work\n",
+		dout("%s on %p state = %lu, queueing work\n", __func__,
  		     con, con->state);
  		queue_con(con);
  	}
  }

  /* socket has buffer space for writing */
-static void ceph_write_space(struct sock *sk)
+static void ceph_sock_write_space(struct sock *sk)
  {
  	struct ceph_connection *con = sk->sk_user_data;

  	/* only queue to workqueue if there is data we want to write,
  	 * and there is sufficient space in the socket buffer to accept
-	 * more data.  clear SOCK_NOSPACE so that ceph_write_space()
+	 * more data.  clear SOCK_NOSPACE so that ceph_sock_write_space()
  	 * doesn't get called again until try_write() fills the socket
  	 * buffer. See net/ipv4/tcp_input.c:tcp_check_space()
  	 * and net/core/stream.c:sk_stream_write_space().
  	 */
  	if (test_bit(WRITE_PENDING, &con->state)) {
  		if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) {
-			dout("ceph_write_space %p queueing write work\n", con);
+			dout("%s %p queueing write work\n", __func__, con);
  			clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
  			queue_con(con);
  		}
  	} else {
-		dout("ceph_write_space %p nothing to write\n", con);
+		dout("%s %p nothing to write\n", __func__, con);
  	}
  }

  /* socket's state has changed */
-static void ceph_state_change(struct sock *sk)
+static void ceph_sock_state_change(struct sock *sk)
  {
  	struct ceph_connection *con = sk->sk_user_data;

-	dout("ceph_state_change %p state = %lu sk_state = %u\n",
+	dout("%s %p state = %lu sk_state = %u\n", __func__,
  	     con, con->state, sk->sk_state);

  	if (test_bit(CLOSED, &con->state))
@@ -200,9 +200,9 @@ static void ceph_state_change(struct sock *sk)

  	switch (sk->sk_state) {
  	case TCP_CLOSE:
-		dout("ceph_state_change TCP_CLOSE\n");
+		dout("%s TCP_CLOSE\n", __func__);
  	case TCP_CLOSE_WAIT:
-		dout("ceph_state_change TCP_CLOSE_WAIT\n");
+		dout("%s TCP_CLOSE_WAIT\n", __func__);
  		if (test_and_set_bit(SOCK_CLOSED, &con->state) == 0) {
  			if (test_bit(CONNECTING, &con->state))
  				con->error_msg = "connection failed";
@@ -212,7 +212,7 @@ static void ceph_state_change(struct sock *sk)
  		}
  		break;
  	case TCP_ESTABLISHED:
-		dout("ceph_state_change TCP_ESTABLISHED\n");
+		dout("%s TCP_ESTABLISHED\n", __func__);
  		queue_con(con);
  		break;
  	default:	/* Everything else is uninteresting */
@@ -228,9 +228,9 @@ static void set_sock_callbacks(struct socket *sock,
  {
  	struct sock *sk = sock->sk;
  	sk->sk_user_data = con;
-	sk->sk_data_ready = ceph_data_ready;
-	sk->sk_write_space = ceph_write_space;
-	sk->sk_state_change = ceph_state_change;
+	sk->sk_data_ready = ceph_sock_data_ready;
+	sk->sk_write_space = ceph_sock_write_space;
+	sk->sk_state_change = ceph_sock_state_change;
  }


-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 05/13] libceph: rename kvec_reset and kvec_add functions
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (3 preceding siblings ...)
  2012-05-30 19:34 ` [PATCH 04/13] libceph: rename socket callbacks Alex Elder
@ 2012-05-30 19:34 ` Alex Elder
  2012-05-31 16:34   ` Yehuda Sadeh
  2012-06-01  4:02   ` Sage Weil
  2012-05-30 19:34 ` [PATCH 06/13] libceph: embed ceph messenger structure in ceph_client Alex Elder
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:34 UTC (permalink / raw)
  To: ceph-devel

The functions ceph_con_out_kvec_reset() and ceph_con_out_kvec_add()
are entirely private functions, so drop the "ceph_" prefix in their
name to make them slightly more wieldy.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  net/ceph/messenger.c |   48 
++++++++++++++++++++++++------------------------
  1 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 5ad1f0a..2e9054f 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -484,14 +484,14 @@ static u32 get_global_seq(struct ceph_messenger 
*msgr, u32 gt)
  	return ret;
  }

-static void ceph_con_out_kvec_reset(struct ceph_connection *con)
+static void con_out_kvec_reset(struct ceph_connection *con)
  {
  	con->out_kvec_left = 0;
  	con->out_kvec_bytes = 0;
  	con->out_kvec_cur = &con->out_kvec[0];
  }

-static void ceph_con_out_kvec_add(struct ceph_connection *con,
+static void con_out_kvec_add(struct ceph_connection *con,
  				size_t size, void *data)
  {
  	int index;
@@ -532,7 +532,7 @@ static void prepare_write_message(struct 
ceph_connection *con)
  	struct ceph_msg *m;
  	u32 crc;

-	ceph_con_out_kvec_reset(con);
+	con_out_kvec_reset(con);
  	con->out_kvec_is_msg = true;
  	con->out_msg_done = false;

@@ -540,9 +540,9 @@ static void prepare_write_message(struct 
ceph_connection *con)
  	 * TCP packet that's a good thing. */
  	if (con->in_seq > con->in_seq_acked) {
  		con->in_seq_acked = con->in_seq;
-		ceph_con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
+		con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
  		con->out_temp_ack = cpu_to_le64(con->in_seq_acked);
-		ceph_con_out_kvec_add(con, sizeof (con->out_temp_ack),
+		con_out_kvec_add(con, sizeof (con->out_temp_ack),
  			&con->out_temp_ack);
  	}

@@ -570,12 +570,12 @@ static void prepare_write_message(struct 
ceph_connection *con)
  	BUG_ON(le32_to_cpu(m->hdr.front_len) != m->front.iov_len);

  	/* tag + hdr + front + middle */
-	ceph_con_out_kvec_add(con, sizeof (tag_msg), &tag_msg);
-	ceph_con_out_kvec_add(con, sizeof (m->hdr), &m->hdr);
-	ceph_con_out_kvec_add(con, m->front.iov_len, m->front.iov_base);
+	con_out_kvec_add(con, sizeof (tag_msg), &tag_msg);
+	con_out_kvec_add(con, sizeof (m->hdr), &m->hdr);
+	con_out_kvec_add(con, m->front.iov_len, m->front.iov_base);

  	if (m->middle)
-		ceph_con_out_kvec_add(con, m->middle->vec.iov_len,
+		con_out_kvec_add(con, m->middle->vec.iov_len,
  			m->middle->vec.iov_base);

  	/* fill in crc (except data pages), footer */
@@ -624,12 +624,12 @@ static void prepare_write_ack(struct 
ceph_connection *con)
  	     con->in_seq_acked, con->in_seq);
  	con->in_seq_acked = con->in_seq;

-	ceph_con_out_kvec_reset(con);
+	con_out_kvec_reset(con);

-	ceph_con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
+	con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);

  	con->out_temp_ack = cpu_to_le64(con->in_seq_acked);
-	ceph_con_out_kvec_add(con, sizeof (con->out_temp_ack),
+	con_out_kvec_add(con, sizeof (con->out_temp_ack),
  				&con->out_temp_ack);

  	con->out_more = 1;  /* more will follow.. eventually.. */
@@ -642,8 +642,8 @@ static void prepare_write_ack(struct ceph_connection 
*con)
  static void prepare_write_keepalive(struct ceph_connection *con)
  {
  	dout("prepare_write_keepalive %p\n", con);
-	ceph_con_out_kvec_reset(con);
-	ceph_con_out_kvec_add(con, sizeof (tag_keepalive), &tag_keepalive);
+	con_out_kvec_reset(con);
+	con_out_kvec_add(con, sizeof (tag_keepalive), &tag_keepalive);
  	set_bit(WRITE_PENDING, &con->state);
  }

@@ -688,8 +688,8 @@ static struct ceph_auth_handshake 
*get_connect_authorizer(struct ceph_connection
   */
  static void prepare_write_banner(struct ceph_connection *con)
  {
-	ceph_con_out_kvec_add(con, strlen(CEPH_BANNER), CEPH_BANNER);
-	ceph_con_out_kvec_add(con, sizeof (con->msgr->my_enc_addr),
+	con_out_kvec_add(con, strlen(CEPH_BANNER), CEPH_BANNER);
+	con_out_kvec_add(con, sizeof (con->msgr->my_enc_addr),
  					&con->msgr->my_enc_addr);

  	con->out_more = 0;
@@ -736,10 +736,10 @@ static int prepare_write_connect(struct 
ceph_connection *con)
  	con->out_connect.authorizer_len = auth ?
  		cpu_to_le32(auth->authorizer_buf_len) : 0;

-	ceph_con_out_kvec_add(con, sizeof (con->out_connect),
+	con_out_kvec_add(con, sizeof (con->out_connect),
  					&con->out_connect);
  	if (auth && auth->authorizer_buf_len)
-		ceph_con_out_kvec_add(con, auth->authorizer_buf_len,
+		con_out_kvec_add(con, auth->authorizer_buf_len,
  					auth->authorizer_buf);

  	con->out_more = 0;
@@ -933,7 +933,7 @@ static int write_partial_msg_pages(struct 
ceph_connection *con)
  	/* prepare and queue up footer, too */
  	if (!do_datacrc)
  		con->out_msg->footer.flags |= CEPH_MSG_FOOTER_NOCRC;
-	ceph_con_out_kvec_reset(con);
+	con_out_kvec_reset(con);
  	prepare_write_message_footer(con);
  	ret = 1;
  out:
@@ -1396,7 +1396,7 @@ static int process_connect(struct ceph_connection 
*con)
  			return -1;
  		}
  		con->auth_retry = 1;
-		ceph_con_out_kvec_reset(con);
+		con_out_kvec_reset(con);
  		ret = prepare_write_connect(con);
  		if (ret < 0)
  			return ret;
@@ -1417,7 +1417,7 @@ static int process_connect(struct ceph_connection 
*con)
  		       ENTITY_NAME(con->peer_name),
  		       ceph_pr_addr(&con->peer_addr.in_addr));
  		reset_connection(con);
-		ceph_con_out_kvec_reset(con);
+		con_out_kvec_reset(con);
  		ret = prepare_write_connect(con);
  		if (ret < 0)
  			return ret;
@@ -1443,7 +1443,7 @@ static int process_connect(struct ceph_connection 
*con)
  		     le32_to_cpu(con->out_connect.connect_seq),
  		     le32_to_cpu(con->in_connect.connect_seq));
  		con->connect_seq = le32_to_cpu(con->in_connect.connect_seq);
-		ceph_con_out_kvec_reset(con);
+		con_out_kvec_reset(con);
  		ret = prepare_write_connect(con);
  		if (ret < 0)
  			return ret;
@@ -1460,7 +1460,7 @@ static int process_connect(struct ceph_connection 
*con)
  		     le32_to_cpu(con->in_connect.global_seq));
  		get_global_seq(con->msgr,
  			       le32_to_cpu(con->in_connect.global_seq));
-		ceph_con_out_kvec_reset(con);
+		con_out_kvec_reset(con);
  		ret = prepare_write_connect(con);
  		if (ret < 0)
  			return ret;
@@ -1867,7 +1867,7 @@ more:

  	/* open the socket first? */
  	if (con->sock == NULL) {
-		ceph_con_out_kvec_reset(con);
+		con_out_kvec_reset(con);
  		prepare_write_banner(con);
  		ret = prepare_write_connect(con);
  		if (ret < 0)
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 06/13] libceph: embed ceph messenger structure in ceph_client
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (4 preceding siblings ...)
  2012-05-30 19:34 ` [PATCH 05/13] libceph: rename kvec_reset and kvec_add functions Alex Elder
@ 2012-05-30 19:34 ` Alex Elder
  2012-05-31 16:44   ` Yehuda Sadeh
  2012-06-01  4:04   ` Sage Weil
  2012-05-30 19:34 ` [PATCH 07/13] libceph: embed ceph connection structure in mon_client Alex Elder
                   ` (6 subsequent siblings)
  12 siblings, 2 replies; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:34 UTC (permalink / raw)
  To: ceph-devel

A ceph client has a pointer to a ceph messenger structure in it.
There is always exactly one ceph messenger for a ceph client, so
there is no need to allocate it separate from the ceph client
structure.

Switch the ceph_client structure to embed its ceph_messenger
structure.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  fs/ceph/mds_client.c           |    2 +-
  include/linux/ceph/libceph.h   |    2 +-
  include/linux/ceph/messenger.h |    9 +++++----
  net/ceph/ceph_common.c         |   18 +++++-------------
  net/ceph/messenger.c           |   30 +++++++++---------------------
  net/ceph/mon_client.c          |    6 +++---
  net/ceph/osd_client.c          |    4 ++--
  7 files changed, 26 insertions(+), 45 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 200bc87..ad30261 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -394,7 +394,7 @@ static struct ceph_mds_session 
*register_session(struct ceph_mds_client *mdsc,
  	s->s_seq = 0;
  	mutex_init(&s->s_mutex);

-	ceph_con_init(mdsc->fsc->client->msgr, &s->s_con);
+	ceph_con_init(&mdsc->fsc->client->msgr, &s->s_con);
  	s->s_con.private = s;
  	s->s_con.ops = &mds_con_ops;
  	s->s_con.peer_name.type = CEPH_ENTITY_TYPE_MDS;
diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
index 92eef7c..927361c 100644
--- a/include/linux/ceph/libceph.h
+++ b/include/linux/ceph/libceph.h
@@ -131,7 +131,7 @@ struct ceph_client {
  	u32 supported_features;
  	u32 required_features;

-	struct ceph_messenger *msgr;   /* messenger instance */
+	struct ceph_messenger msgr;   /* messenger instance */
  	struct ceph_mon_client monc;
  	struct ceph_osd_client osdc;

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 74f6c9b..3fbd4be 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -211,10 +211,11 @@ extern int ceph_msgr_init(void);
  extern void ceph_msgr_exit(void);
  extern void ceph_msgr_flush(void);

-extern struct ceph_messenger *ceph_messenger_create(
-	struct ceph_entity_addr *myaddr,
-	u32 features, u32 required);
-extern void ceph_messenger_destroy(struct ceph_messenger *);
+extern void ceph_messenger_init(struct ceph_messenger *msgr,
+			struct ceph_entity_addr *myaddr,
+			u32 supported_features,
+			u32 required_features,
+			bool nocrc);

  extern void ceph_con_init(struct ceph_messenger *msgr,
  			  struct ceph_connection *con);
diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
index cc91319..2de3ea1 100644
--- a/net/ceph/ceph_common.c
+++ b/net/ceph/ceph_common.c
@@ -468,19 +468,15 @@ struct ceph_client *ceph_create_client(struct 
ceph_options *opt, void *private,
  	/* msgr */
  	if (ceph_test_opt(client, MYIP))
  		myaddr = &client->options->my_addr;
-	client->msgr = ceph_messenger_create(myaddr,
-					     client->supported_features,
-					     client->required_features);
-	if (IS_ERR(client->msgr)) {
-		err = PTR_ERR(client->msgr);
-		goto fail;
-	}
-	client->msgr->nocrc = ceph_test_opt(client, NOCRC);
+	ceph_messenger_init(&client->msgr, myaddr,
+		client->supported_features,
+		client->required_features,
+		ceph_test_opt(client, NOCRC));

  	/* subsystems */
  	err = ceph_monc_init(&client->monc, client);
  	if (err < 0)
-		goto fail_msgr;
+		goto fail;
  	err = ceph_osdc_init(&client->osdc, client);
  	if (err < 0)
  		goto fail_monc;
@@ -489,8 +485,6 @@ struct ceph_client *ceph_create_client(struct 
ceph_options *opt, void *private,

  fail_monc:
  	ceph_monc_stop(&client->monc);
-fail_msgr:
-	ceph_messenger_destroy(client->msgr);
  fail:
  	kfree(client);
  	return ERR_PTR(err);
@@ -515,8 +509,6 @@ void ceph_destroy_client(struct ceph_client *client)

  	ceph_debugfs_client_cleanup(client);

-	ceph_messenger_destroy(client->msgr);
-
  	ceph_destroy_options(client->options);

  	kfree(client);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 2e9054f..19f1948 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2243,18 +2243,14 @@ out:


  /*
- * create a new messenger instance
+ * initialize a new messenger instance
   */
-struct ceph_messenger *ceph_messenger_create(struct ceph_entity_addr 
*myaddr,
-					     u32 supported_features,
-					     u32 required_features)
+void ceph_messenger_init(struct ceph_messenger *msgr,
+			struct ceph_entity_addr *myaddr,
+			u32 supported_features,
+			u32 required_features,
+			bool nocrc)
  {
-	struct ceph_messenger *msgr;
-
-	msgr = kzalloc(sizeof(*msgr), GFP_KERNEL);
-	if (msgr == NULL)
-		return ERR_PTR(-ENOMEM);
-
  	msgr->supported_features = supported_features;
  	msgr->required_features = required_features;

@@ -2267,19 +2263,11 @@ struct ceph_messenger 
*ceph_messenger_create(struct ceph_entity_addr *myaddr,
  	msgr->inst.addr.type = 0;
  	get_random_bytes(&msgr->inst.addr.nonce, sizeof(msgr->inst.addr.nonce));
  	encode_my_addr(msgr);
+	msgr->nocrc = nocrc;

-	dout("messenger_create %p\n", msgr);
-	return msgr;
-}
-EXPORT_SYMBOL(ceph_messenger_create);
-
-void ceph_messenger_destroy(struct ceph_messenger *msgr)
-{
-	dout("destroy %p\n", msgr);
-	kfree(msgr);
-	dout("destroyed messenger %p\n", msgr);
+	dout("%s %p\n", __func__, msgr);
  }
-EXPORT_SYMBOL(ceph_messenger_destroy);
+EXPORT_SYMBOL(ceph_messenger_init);

  static void clear_standby(struct ceph_connection *con)
  {
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index 1845cde..704dc95 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -763,7 +763,7 @@ int ceph_monc_init(struct ceph_mon_client *monc, 
struct ceph_client *cl)
  	monc->con = kmalloc(sizeof(*monc->con), GFP_KERNEL);
  	if (!monc->con)
  		goto out_monmap;
-	ceph_con_init(monc->client->msgr, monc->con);
+	ceph_con_init(&monc->client->msgr, monc->con);
  	monc->con->private = monc;
  	monc->con->ops = &mon_con_ops;

@@ -880,8 +880,8 @@ static void handle_auth_reply(struct ceph_mon_client 
*monc,
  	} else if (!was_auth && monc->auth->ops->is_authenticated(monc->auth)) {
  		dout("authenticated, starting session\n");

-		monc->client->msgr->inst.name.type = CEPH_ENTITY_TYPE_CLIENT;
-		monc->client->msgr->inst.name.num =
+		monc->client->msgr.inst.name.type = CEPH_ENTITY_TYPE_CLIENT;
+		monc->client->msgr.inst.name.num =
  					cpu_to_le64(monc->auth->global_id);

  		__send_subscribe(monc);
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index b098e7b..cca4c7f 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -639,7 +639,7 @@ static struct ceph_osd *create_osd(struct 
ceph_osd_client *osdc)
  	INIT_LIST_HEAD(&osd->o_osd_lru);
  	osd->o_incarnation = 1;

-	ceph_con_init(osdc->client->msgr, &osd->o_con);
+	ceph_con_init(&osdc->client->msgr, &osd->o_con);
  	osd->o_con.private = osd;
  	osd->o_con.ops = &osd_con_ops;
  	osd->o_con.peer_name.type = CEPH_ENTITY_TYPE_OSD;
@@ -1391,7 +1391,7 @@ void ceph_osdc_handle_map(struct ceph_osd_client 
*osdc, struct ceph_msg *msg)
  			     epoch, maplen);
  			newmap = osdmap_apply_incremental(&p, next,
  							  osdc->osdmap,
-							  osdc->client->msgr);
+							  &osdc->client->msgr);
  			if (IS_ERR(newmap)) {
  				err = PTR_ERR(newmap);
  				goto bad;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (5 preceding siblings ...)
  2012-05-30 19:34 ` [PATCH 06/13] libceph: embed ceph messenger structure in ceph_client Alex Elder
@ 2012-05-30 19:34 ` Alex Elder
  2012-06-01  4:24   ` Sage Weil
  2012-05-30 19:35 ` [PATCH 08/13] libceph: start separating connection flags from state Alex Elder
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:34 UTC (permalink / raw)
  To: ceph-devel

A monitor client has a pointer to a ceph connection structure in it.
This is the only one of the three ceph client types that do it this
way; the OSD and MDS clients embed the connection into their main
structures.  There is always exactly one ceph connection for a
monitor client, so there is no need to allocate it separate from the
monitor client structure.

So switch the ceph_mon_client structure to embed its
ceph_connection structure.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  include/linux/ceph/mon_client.h |    2 +-
  net/ceph/mon_client.c           |   47 
++++++++++++++++----------------------
  2 files changed, 21 insertions(+), 28 deletions(-)

diff --git a/include/linux/ceph/mon_client.h 
b/include/linux/ceph/mon_client.h
index 545f859..2113e38 100644
--- a/include/linux/ceph/mon_client.h
+++ b/include/linux/ceph/mon_client.h
@@ -70,7 +70,7 @@ struct ceph_mon_client {
  	bool hunting;
  	int cur_mon;                       /* last monitor i contacted */
  	unsigned long sub_sent, sub_renew_after;
-	struct ceph_connection *con;
+	struct ceph_connection con;
  	bool have_fsid;

  	/* pending generic requests */
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index 704dc95..ac4d6b1 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -106,9 +106,9 @@ static void __send_prepared_auth_request(struct 
ceph_mon_client *monc, int len)
  	monc->pending_auth = 1;
  	monc->m_auth->front.iov_len = len;
  	monc->m_auth->hdr.front_len = cpu_to_le32(len);
-	ceph_con_revoke(monc->con, monc->m_auth);
+	ceph_con_revoke(&monc->con, monc->m_auth);
  	ceph_msg_get(monc->m_auth);  /* keep our ref */
-	ceph_con_send(monc->con, monc->m_auth);
+	ceph_con_send(&monc->con, monc->m_auth);
  }

  /*
@@ -117,8 +117,8 @@ static void __send_prepared_auth_request(struct 
ceph_mon_client *monc, int len)
  static void __close_session(struct ceph_mon_client *monc)
  {
  	dout("__close_session closing mon%d\n", monc->cur_mon);
-	ceph_con_revoke(monc->con, monc->m_auth);
-	ceph_con_close(monc->con);
+	ceph_con_revoke(&monc->con, monc->m_auth);
+	ceph_con_close(&monc->con);
  	monc->cur_mon = -1;
  	monc->pending_auth = 0;
  	ceph_auth_reset(monc->auth);
@@ -142,9 +142,9 @@ static int __open_session(struct ceph_mon_client *monc)
  		monc->want_next_osdmap = !!monc->want_next_osdmap;

  		dout("open_session mon%d opening\n", monc->cur_mon);
-		monc->con->peer_name.type = CEPH_ENTITY_TYPE_MON;
-		monc->con->peer_name.num = cpu_to_le64(monc->cur_mon);
-		ceph_con_open(monc->con,
+		monc->con.peer_name.type = CEPH_ENTITY_TYPE_MON;
+		monc->con.peer_name.num = cpu_to_le64(monc->cur_mon);
+		ceph_con_open(&monc->con,
  			      &monc->monmap->mon_inst[monc->cur_mon].addr);

  		/* initiatiate authentication handshake */
@@ -226,8 +226,8 @@ static void __send_subscribe(struct ceph_mon_client 
*monc)

  		msg->front.iov_len = p - msg->front.iov_base;
  		msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
-		ceph_con_revoke(monc->con, msg);
-		ceph_con_send(monc->con, ceph_msg_get(msg));
+		ceph_con_revoke(&monc->con, msg);
+		ceph_con_send(&monc->con, ceph_msg_get(msg));

  		monc->sub_sent = jiffies | 1;  /* never 0 */
  	}
@@ -247,7 +247,7 @@ static void handle_subscribe_ack(struct 
ceph_mon_client *monc,
  	if (monc->hunting) {
  		pr_info("mon%d %s session established\n",
  			monc->cur_mon,
-			ceph_pr_addr(&monc->con->peer_addr.in_addr));
+			ceph_pr_addr(&monc->con.peer_addr.in_addr));
  		monc->hunting = false;
  	}
  	dout("handle_subscribe_ack after %d seconds\n", seconds);
@@ -461,7 +461,7 @@ static int do_generic_request(struct ceph_mon_client 
*monc,
  	req->request->hdr.tid = cpu_to_le64(req->tid);
  	__insert_generic_request(monc, req);
  	monc->num_generic_requests++;
-	ceph_con_send(monc->con, ceph_msg_get(req->request));
+	ceph_con_send(&monc->con, ceph_msg_get(req->request));
  	mutex_unlock(&monc->mutex);

  	err = wait_for_completion_interruptible(&req->completion);
@@ -684,8 +684,8 @@ static void __resend_generic_request(struct 
ceph_mon_client *monc)

  	for (p = rb_first(&monc->generic_request_tree); p; p = rb_next(p)) {
  		req = rb_entry(p, struct ceph_mon_generic_request, node);
-		ceph_con_revoke(monc->con, req->request);
-		ceph_con_send(monc->con, ceph_msg_get(req->request));
+		ceph_con_revoke(&monc->con, req->request);
+		ceph_con_send(&monc->con, ceph_msg_get(req->request));
  	}
  }

@@ -705,7 +705,7 @@ static void delayed_work(struct work_struct *work)
  		__close_session(monc);
  		__open_session(monc);  /* continue hunting */
  	} else {
-		ceph_con_keepalive(monc->con);
+		ceph_con_keepalive(&monc->con);

  		__validate_auth(monc);

@@ -760,19 +760,16 @@ int ceph_monc_init(struct ceph_mon_client *monc, 
struct ceph_client *cl)
  		goto out;

  	/* connection */
-	monc->con = kmalloc(sizeof(*monc->con), GFP_KERNEL);
-	if (!monc->con)
-		goto out_monmap;
-	ceph_con_init(&monc->client->msgr, monc->con);
-	monc->con->private = monc;
-	monc->con->ops = &mon_con_ops;
+	ceph_con_init(&monc->client->msgr, &monc->con);
+	monc->con.private = monc;
+	monc->con.ops = &mon_con_ops;

  	/* authentication */
  	monc->auth = ceph_auth_init(cl->options->name,
  				    cl->options->key);
  	if (IS_ERR(monc->auth)) {
  		err = PTR_ERR(monc->auth);
-		goto out_con;
+		goto out_monmap;
  	}
  	monc->auth->want_keys =
  		CEPH_ENTITY_TYPE_AUTH | CEPH_ENTITY_TYPE_MON |
@@ -824,8 +821,6 @@ out_subscribe_ack:
  	ceph_msg_put(monc->m_subscribe_ack);
  out_auth:
  	ceph_auth_destroy(monc->auth);
-out_con:
-	monc->con->ops->put(monc->con);
  out_monmap:
  	kfree(monc->monmap);
  out:
@@ -841,9 +836,7 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
  	mutex_lock(&monc->mutex);
  	__close_session(monc);

-	monc->con->private = NULL;
-	monc->con->ops->put(monc->con);
-	monc->con = NULL;
+	monc->con.private = NULL;

  	mutex_unlock(&monc->mutex);

@@ -1021,7 +1014,7 @@ static void mon_fault(struct ceph_connection *con)
  	if (!monc->hunting)
  		pr_info("mon%d %s session lost, "
  			"hunting for new mon\n", monc->cur_mon,
-			ceph_pr_addr(&monc->con->peer_addr.in_addr));
+			ceph_pr_addr(&monc->con.peer_addr.in_addr));

  	__close_session(monc);
  	if (!monc->hunting) {
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 08/13] libceph: start separating connection flags from state
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (6 preceding siblings ...)
  2012-05-30 19:34 ` [PATCH 07/13] libceph: embed ceph connection structure in mon_client Alex Elder
@ 2012-05-30 19:35 ` Alex Elder
  2012-06-01  4:25   ` Sage Weil
  2012-05-30 19:35 ` [PATCH 09/13] libceph: start tracking connection socket state Alex Elder
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:35 UTC (permalink / raw)
  To: ceph-devel

A ceph_connection holds a mixture of connection state (as in "state
machine" state) and connection flags in a single "state" field.  To
make the distinction more clear, define a new "flags" field and use
it rather than the "state" field to hold Boolean flag values.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  include/linux/ceph/messenger.h |   18 +++++++++----
  net/ceph/messenger.c           |   50 
++++++++++++++++++++--------------------
  2 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 3fbd4be..920235e 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -103,20 +103,25 @@ struct ceph_msg_pos {
  #define MAX_DELAY_INTERVAL	(5 * 60 * HZ)

  /*
- * ceph_connection state bit flags
+ * ceph_connection flag bits
   */
+
  #define LOSSYTX         0  /* we can close channel or drop messages on 
errors */
-#define CONNECTING	1
-#define NEGOTIATING	2
  #define KEEPALIVE_PENDING      3
  #define WRITE_PENDING	4  /* we have data ready to send */
+#define SOCK_CLOSED	11 /* socket state changed to closed */
+#define BACKOFF         15
+
+/*
+ * ceph_connection states
+ */
+#define CONNECTING	1
+#define NEGOTIATING	2
  #define STANDBY		8  /* no outgoing messages, socket closed.  we keep
  			    * the ceph_connection around to maintain shared
  			    * state with the peer. */
  #define CLOSED		10 /* we've closed the connection */
-#define SOCK_CLOSED	11 /* socket state changed to closed */
  #define OPENING         13 /* open connection w/ (possibly new) peer */
-#define BACKOFF         15

  /*
   * A single connection with another host.
@@ -133,7 +138,8 @@ struct ceph_connection {

  	struct ceph_messenger *msgr;
  	struct socket *sock;
-	unsigned long state;	/* connection state (see flags above) */
+	unsigned long flags;
+	unsigned long state;
  	const char *error_msg;  /* error message, if any */

  	struct ceph_entity_addr peer_addr; /* peer address */
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 19f1948..29055df 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -176,7 +176,7 @@ static void ceph_sock_write_space(struct sock *sk)
  	 * buffer. See net/ipv4/tcp_input.c:tcp_check_space()
  	 * and net/core/stream.c:sk_stream_write_space().
  	 */
-	if (test_bit(WRITE_PENDING, &con->state)) {
+	if (test_bit(WRITE_PENDING, &con->flags)) {
  		if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) {
  			dout("%s %p queueing write work\n", __func__, con);
  			clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
@@ -203,7 +203,7 @@ static void ceph_sock_state_change(struct sock *sk)
  		dout("%s TCP_CLOSE\n", __func__);
  	case TCP_CLOSE_WAIT:
  		dout("%s TCP_CLOSE_WAIT\n", __func__);
-		if (test_and_set_bit(SOCK_CLOSED, &con->state) == 0) {
+		if (test_and_set_bit(SOCK_CLOSED, &con->flags) == 0) {
  			if (test_bit(CONNECTING, &con->state))
  				con->error_msg = "connection failed";
  			else
@@ -393,9 +393,9 @@ void ceph_con_close(struct ceph_connection *con)
  	     ceph_pr_addr(&con->peer_addr.in_addr));
  	set_bit(CLOSED, &con->state);  /* in case there's queued work */
  	clear_bit(STANDBY, &con->state);  /* avoid connect_seq bump */
-	clear_bit(LOSSYTX, &con->state);  /* so we retry next connect */
-	clear_bit(KEEPALIVE_PENDING, &con->state);
-	clear_bit(WRITE_PENDING, &con->state);
+	clear_bit(LOSSYTX, &con->flags);  /* so we retry next connect */
+	clear_bit(KEEPALIVE_PENDING, &con->flags);
+	clear_bit(WRITE_PENDING, &con->flags);
  	mutex_lock(&con->mutex);
  	reset_connection(con);
  	con->peer_global_seq = 0;
@@ -612,7 +612,7 @@ static void prepare_write_message(struct 
ceph_connection *con)
  		prepare_write_message_footer(con);
  	}

-	set_bit(WRITE_PENDING, &con->state);
+	set_bit(WRITE_PENDING, &con->flags);
  }

  /*
@@ -633,7 +633,7 @@ static void prepare_write_ack(struct ceph_connection 
*con)
  				&con->out_temp_ack);

  	con->out_more = 1;  /* more will follow.. eventually.. */
-	set_bit(WRITE_PENDING, &con->state);
+	set_bit(WRITE_PENDING, &con->flags);
  }

  /*
@@ -644,7 +644,7 @@ static void prepare_write_keepalive(struct 
ceph_connection *con)
  	dout("prepare_write_keepalive %p\n", con);
  	con_out_kvec_reset(con);
  	con_out_kvec_add(con, sizeof (tag_keepalive), &tag_keepalive);
-	set_bit(WRITE_PENDING, &con->state);
+	set_bit(WRITE_PENDING, &con->flags);
  }

  /*
@@ -673,7 +673,7 @@ static struct ceph_auth_handshake 
*get_connect_authorizer(struct ceph_connection

  	if (IS_ERR(auth))
  		return auth;
-	if (test_bit(CLOSED, &con->state) || test_bit(OPENING, &con->state))
+	if (test_bit(CLOSED, &con->state) || test_bit(OPENING, &con->flags))
  		return ERR_PTR(-EAGAIN);

  	con->auth_reply_buf = auth->authorizer_reply_buf;
@@ -693,7 +693,7 @@ static void prepare_write_banner(struct 
ceph_connection *con)
  					&con->msgr->my_enc_addr);

  	con->out_more = 0;
-	set_bit(WRITE_PENDING, &con->state);
+	set_bit(WRITE_PENDING, &con->flags);
  }

  static int prepare_write_connect(struct ceph_connection *con)
@@ -743,7 +743,7 @@ static int prepare_write_connect(struct 
ceph_connection *con)
  					auth->authorizer_buf);

  	con->out_more = 0;
-	set_bit(WRITE_PENDING, &con->state);
+	set_bit(WRITE_PENDING, &con->flags);

  	return 0;
  }
@@ -1490,7 +1490,7 @@ static int process_connect(struct ceph_connection 
*con)
  			le32_to_cpu(con->in_reply.connect_seq));

  		if (con->in_reply.flags & CEPH_MSG_CONNECT_LOSSY)
-			set_bit(LOSSYTX, &con->state);
+			set_bit(LOSSYTX, &con->flags);

  		prepare_read_tag(con);
  		break;
@@ -1931,14 +1931,14 @@ do_next:
  			prepare_write_ack(con);
  			goto more;
  		}
-		if (test_and_clear_bit(KEEPALIVE_PENDING, &con->state)) {
+		if (test_and_clear_bit(KEEPALIVE_PENDING, &con->flags)) {
  			prepare_write_keepalive(con);
  			goto more;
  		}
  	}

  	/* Nothing to do! */
-	clear_bit(WRITE_PENDING, &con->state);
+	clear_bit(WRITE_PENDING, &con->flags);
  	dout("try_write nothing else to write.\n");
  	ret = 0;
  out:
@@ -2104,7 +2104,7 @@ static void con_work(struct work_struct *work)

  	mutex_lock(&con->mutex);
  restart:
-	if (test_and_clear_bit(BACKOFF, &con->state)) {
+	if (test_and_clear_bit(BACKOFF, &con->flags)) {
  		dout("con_work %p backing off\n", con);
  		if (queue_delayed_work(ceph_msgr_wq, &con->work,
  				       round_jiffies_relative(con->delay))) {
@@ -2133,7 +2133,7 @@ restart:
  		con_close_socket(con);
  	}

-	if (test_and_clear_bit(SOCK_CLOSED, &con->state))
+	if (test_and_clear_bit(SOCK_CLOSED, &con->flags))
  		goto fault;

  	ret = try_read(con);
@@ -2172,7 +2172,7 @@ static void ceph_fault(struct ceph_connection *con)
  	dout("fault %p state %lu to peer %s\n",
  	     con, con->state, ceph_pr_addr(&con->peer_addr.in_addr));

-	if (test_bit(LOSSYTX, &con->state)) {
+	if (test_bit(LOSSYTX, &con->flags)) {
  		dout("fault on LOSSYTX channel\n");
  		goto out;
  	}
@@ -2194,9 +2194,9 @@ static void ceph_fault(struct ceph_connection *con)
  	/* If there are no messages queued or keepalive pending, place
  	 * the connection in a STANDBY state */
  	if (list_empty(&con->out_queue) &&
-	    !test_bit(KEEPALIVE_PENDING, &con->state)) {
+	    !test_bit(KEEPALIVE_PENDING, &con->flags)) {
  		dout("fault %p setting STANDBY clearing WRITE_PENDING\n", con);
-		clear_bit(WRITE_PENDING, &con->state);
+		clear_bit(WRITE_PENDING, &con->flags);
  		set_bit(STANDBY, &con->state);
  	} else {
  		/* retry after a delay. */
@@ -2220,7 +2220,7 @@ static void ceph_fault(struct ceph_connection *con)
  			 * that when con_work restarts we schedule the
  			 * delay then.
  			 */
-			set_bit(BACKOFF, &con->state);
+			set_bit(BACKOFF, &con->flags);
  		}
  	}

@@ -2276,8 +2276,8 @@ static void clear_standby(struct ceph_connection *con)
  		mutex_lock(&con->mutex);
  		dout("clear_standby %p and ++connect_seq\n", con);
  		con->connect_seq++;
-		WARN_ON(test_bit(WRITE_PENDING, &con->state));
-		WARN_ON(test_bit(KEEPALIVE_PENDING, &con->state));
+		WARN_ON(test_bit(WRITE_PENDING, &con->flags));
+		WARN_ON(test_bit(KEEPALIVE_PENDING, &con->flags));
  		mutex_unlock(&con->mutex);
  	}
  }
@@ -2315,7 +2315,7 @@ void ceph_con_send(struct ceph_connection *con, 
struct ceph_msg *msg)
  	/* if there wasn't anything waiting to send before, queue
  	 * new work */
  	clear_standby(con);
-	if (test_and_set_bit(WRITE_PENDING, &con->state) == 0)
+	if (test_and_set_bit(WRITE_PENDING, &con->flags) == 0)
  		queue_con(con);
  }
  EXPORT_SYMBOL(ceph_con_send);
@@ -2382,8 +2382,8 @@ void ceph_con_keepalive(struct ceph_connection *con)
  {
  	dout("con_keepalive %p\n", con);
  	clear_standby(con);
-	if (test_and_set_bit(KEEPALIVE_PENDING, &con->state) == 0 &&
-	    test_and_set_bit(WRITE_PENDING, &con->state) == 0)
+	if (test_and_set_bit(KEEPALIVE_PENDING, &con->flags) == 0 &&
+	    test_and_set_bit(WRITE_PENDING, &con->flags) == 0)
  		queue_con(con);
  }
  EXPORT_SYMBOL(ceph_con_keepalive);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 09/13] libceph: start tracking connection socket state
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (7 preceding siblings ...)
  2012-05-30 19:35 ` [PATCH 08/13] libceph: start separating connection flags from state Alex Elder
@ 2012-05-30 19:35 ` Alex Elder
  2012-06-01  4:28   ` Sage Weil
  2012-06-12  4:52   ` Yan, Zheng
  2012-05-30 19:35 ` [PATCH 10/13] libceph: provide osd number when creating osd Alex Elder
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:35 UTC (permalink / raw)
  To: ceph-devel

Start explicitly keeping track of the state of a ceph connection's
socket, separate from the state of the connection itself.  Create
placeholder functions to encapsulate the state transitions.

     --------
     | NEW* |  transient initial state
     --------
         | con_sock_state_init()
         v
     ----------
     | CLOSED |  initialized, but no socket (and no
     ----------  TCP connection)
      ^      \
      |       \ con_sock_state_connecting()
      |        ----------------------
      |                              \
      + con_sock_state_closed()       \
      |\                               \
      | \                               \
      |  -----------                     \
      |  | CLOSING |  socket event;       \
      |  -----------  await close          \
      |       ^                            |
      |       |                            |
      |       + con_sock_state_closing()   |
      |      / \                           |
      |     /   ---------------            |
      |    /                   \           v
      |   /                    --------------
      |  /    -----------------| CONNECTING |  socket created, TCP
      |  |   /                 --------------  connect initiated
      |  |   | con_sock_state_connected()
      |  |   v
     -------------
     | CONNECTED |  TCP connection established
     -------------

Make the socket state an atomic variable, reinforcing that it's a
distinct transtion with no possible "intermediate/both" states.
This is almost certainly overkill at this point, though the
transitions into CONNECTED and CLOSING state do get called via
socket callback (the rest of the transitions occur with the
connection mutex held).  We can back out the atomicity later.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  include/linux/ceph/messenger.h |    8 ++++-
  net/ceph/messenger.c           |   63 
++++++++++++++++++++++++++++++++++++++++
  2 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 920235e..5e852f4 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -137,14 +137,18 @@ struct ceph_connection {
  	const struct ceph_connection_operations *ops;

  	struct ceph_messenger *msgr;
+
+	atomic_t sock_state;
  	struct socket *sock;
+	struct ceph_entity_addr peer_addr; /* peer address */
+	struct ceph_entity_addr peer_addr_for_me;
+
  	unsigned long flags;
  	unsigned long state;
  	const char *error_msg;  /* error message, if any */

-	struct ceph_entity_addr peer_addr; /* peer address */
  	struct ceph_entity_name peer_name; /* peer name */
-	struct ceph_entity_addr peer_addr_for_me;
+
  	unsigned peer_features;
  	u32 connect_seq;      /* identify the most recent connection
  				 attempt for this connection, client */
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 29055df..7e11b07 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -29,6 +29,14 @@
   * the sender.
   */

+/* State values for ceph_connection->sock_state; NEW is assumed to be 0 */
+
+#define CON_SOCK_STATE_NEW		0	/* -> CLOSED */
+#define CON_SOCK_STATE_CLOSED		1	/* -> CONNECTING */
+#define CON_SOCK_STATE_CONNECTING	2	/* -> CONNECTED or -> CLOSING */
+#define CON_SOCK_STATE_CONNECTED	3	/* -> CLOSING or -> CLOSED */
+#define CON_SOCK_STATE_CLOSING		4	/* -> CLOSED */
+
  /* static tag bytes (protocol control messages) */
  static char tag_msg = CEPH_MSGR_TAG_MSG;
  static char tag_ack = CEPH_MSGR_TAG_ACK;
@@ -147,6 +155,54 @@ void ceph_msgr_flush(void)
  }
  EXPORT_SYMBOL(ceph_msgr_flush);

+/* Connection socket state transition functions */
+
+static void con_sock_state_init(struct ceph_connection *con)
+{
+	int old_state;
+
+	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
+	if (WARN_ON(old_state != CON_SOCK_STATE_NEW))
+		printk("%s: unexpected old state %d\n", __func__, old_state);
+}
+
+static void con_sock_state_connecting(struct ceph_connection *con)
+{
+	int old_state;
+
+	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTING);
+	if (WARN_ON(old_state != CON_SOCK_STATE_CLOSED))
+		printk("%s: unexpected old state %d\n", __func__, old_state);
+}
+
+static void con_sock_state_connected(struct ceph_connection *con)
+{
+	int old_state;
+
+	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTED);
+	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING))
+		printk("%s: unexpected old state %d\n", __func__, old_state);
+}
+
+static void con_sock_state_closing(struct ceph_connection *con)
+{
+	int old_state;
+
+	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSING);
+	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING &&
+			old_state != CON_SOCK_STATE_CONNECTED))
+		printk("%s: unexpected old state %d\n", __func__, old_state);
+}
+
+static void con_sock_state_closed(struct ceph_connection *con)
+{
+	int old_state;
+
+	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
+	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTED &&
+			old_state != CON_SOCK_STATE_CLOSING))
+		printk("%s: unexpected old state %d\n", __func__, old_state);
+}

  /*
   * socket callback functions
@@ -203,6 +259,7 @@ static void ceph_sock_state_change(struct sock *sk)
  		dout("%s TCP_CLOSE\n", __func__);
  	case TCP_CLOSE_WAIT:
  		dout("%s TCP_CLOSE_WAIT\n", __func__);
+		con_sock_state_closing(con);
  		if (test_and_set_bit(SOCK_CLOSED, &con->flags) == 0) {
  			if (test_bit(CONNECTING, &con->state))
  				con->error_msg = "connection failed";
@@ -213,6 +270,7 @@ static void ceph_sock_state_change(struct sock *sk)
  		break;
  	case TCP_ESTABLISHED:
  		dout("%s TCP_ESTABLISHED\n", __func__);
+		con_sock_state_connected(con);
  		queue_con(con);
  		break;
  	default:	/* Everything else is uninteresting */
@@ -277,6 +335,7 @@ static int ceph_tcp_connect(struct ceph_connection *con)
  		return ret;
  	}
  	con->sock = sock;
+	con_sock_state_connecting(con);

  	return 0;
  }
@@ -341,6 +400,7 @@ static int con_close_socket(struct ceph_connection *con)
  	rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
  	sock_release(con->sock);
  	con->sock = NULL;
+	con_sock_state_closed(con);
  	return rc;
  }

@@ -460,6 +520,9 @@ void ceph_con_init(struct ceph_messenger *msgr, 
struct ceph_connection *con)
  	memset(con, 0, sizeof(*con));
  	atomic_set(&con->nref, 1);
  	con->msgr = msgr;
+
+	con_sock_state_init(con);
+
  	mutex_init(&con->mutex);
  	INIT_LIST_HEAD(&con->out_queue);
  	INIT_LIST_HEAD(&con->out_sent);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 10/13] libceph: provide osd number when creating osd
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (8 preceding siblings ...)
  2012-05-30 19:35 ` [PATCH 09/13] libceph: start tracking connection socket state Alex Elder
@ 2012-05-30 19:35 ` Alex Elder
  2012-06-01  4:29   ` Sage Weil
  2012-05-30 19:35 ` [PATCH 11/13] libceph: init monitor connection when opening Alex Elder
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:35 UTC (permalink / raw)
  To: ceph-devel

Pass the osd number to the create_osd() routine, and move the
initialization of fields that depend on it therein.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  net/ceph/osd_client.c |    8 ++++----
  1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index cca4c7f..e30efbc 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -624,7 +624,7 @@ static void osd_reset(struct ceph_connection *con)
  /*
   * Track open sessions with osds.
   */
-static struct ceph_osd *create_osd(struct ceph_osd_client *osdc)
+static struct ceph_osd *create_osd(struct ceph_osd_client *osdc, int onum)
  {
  	struct ceph_osd *osd;

@@ -634,6 +634,7 @@ static struct ceph_osd *create_osd(struct 
ceph_osd_client *osdc)

  	atomic_set(&osd->o_ref, 1);
  	osd->o_osdc = osdc;
+	osd->o_osd = onum;
  	INIT_LIST_HEAD(&osd->o_requests);
  	INIT_LIST_HEAD(&osd->o_linger_requests);
  	INIT_LIST_HEAD(&osd->o_osd_lru);
@@ -643,6 +644,7 @@ static struct ceph_osd *create_osd(struct 
ceph_osd_client *osdc)
  	osd->o_con.private = osd;
  	osd->o_con.ops = &osd_con_ops;
  	osd->o_con.peer_name.type = CEPH_ENTITY_TYPE_OSD;
+	osd->o_con.peer_name.num = cpu_to_le64(onum);

  	INIT_LIST_HEAD(&osd->o_keepalive_item);
  	return osd;
@@ -998,15 +1000,13 @@ static int __map_request(struct ceph_osd_client 
*osdc,
  	req->r_osd = __lookup_osd(osdc, o);
  	if (!req->r_osd && o >= 0) {
  		err = -ENOMEM;
-		req->r_osd = create_osd(osdc);
+		req->r_osd = create_osd(osdc, o);
  		if (!req->r_osd) {
  			list_move(&req->r_req_lru_item, &osdc->req_notarget);
  			goto out;
  		}

  		dout("map_request osd %p is osd%d\n", req->r_osd, o);
-		req->r_osd->o_osd = o;
-		req->r_osd->o_con.peer_name.num = cpu_to_le64(o);
  		__insert_osd(osdc, req->r_osd);

  		ceph_con_open(&req->r_osd->o_con, &osdc->osdmap->osd_addr[o]);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 11/13] libceph: init monitor connection when opening
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (9 preceding siblings ...)
  2012-05-30 19:35 ` [PATCH 10/13] libceph: provide osd number when creating osd Alex Elder
@ 2012-05-30 19:35 ` Alex Elder
  2012-06-01  4:30   ` Sage Weil
  2012-05-30 19:35 ` [PATCH 12/13] libceph: fully initialize connection in con_init() Alex Elder
  2012-05-30 19:35 ` [PATCH 13/13] libceph: set CLOSED state bit in con_init Alex Elder
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:35 UTC (permalink / raw)
  To: ceph-devel

Hold off initializing a monitor client's connection until just
before it gets opened for use.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  net/ceph/mon_client.c |   13 ++++++-------
  1 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index ac4d6b1..77da480 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -119,6 +119,7 @@ static void __close_session(struct ceph_mon_client 
*monc)
  	dout("__close_session closing mon%d\n", monc->cur_mon);
  	ceph_con_revoke(&monc->con, monc->m_auth);
  	ceph_con_close(&monc->con);
+	monc->con.private = NULL;
  	monc->cur_mon = -1;
  	monc->pending_auth = 0;
  	ceph_auth_reset(monc->auth);
@@ -141,9 +142,13 @@ static int __open_session(struct ceph_mon_client *monc)
  		monc->sub_renew_after = jiffies;  /* i.e., expired */
  		monc->want_next_osdmap = !!monc->want_next_osdmap;

-		dout("open_session mon%d opening\n", monc->cur_mon);
+		ceph_con_init(&monc->client->msgr, &monc->con);
+		monc->con.private = monc;
+		monc->con.ops = &mon_con_ops;
  		monc->con.peer_name.type = CEPH_ENTITY_TYPE_MON;
  		monc->con.peer_name.num = cpu_to_le64(monc->cur_mon);
+
+		dout("open_session mon%d opening\n", monc->cur_mon);
  		ceph_con_open(&monc->con,
  			      &monc->monmap->mon_inst[monc->cur_mon].addr);

@@ -760,10 +765,6 @@ int ceph_monc_init(struct ceph_mon_client *monc, 
struct ceph_client *cl)
  		goto out;

  	/* connection */
-	ceph_con_init(&monc->client->msgr, &monc->con);
-	monc->con.private = monc;
-	monc->con.ops = &mon_con_ops;
-
  	/* authentication */
  	monc->auth = ceph_auth_init(cl->options->name,
  				    cl->options->key);
@@ -836,8 +837,6 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
  	mutex_lock(&monc->mutex);
  	__close_session(monc);

-	monc->con.private = NULL;
-
  	mutex_unlock(&monc->mutex);

  	ceph_auth_destroy(monc->auth);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 12/13] libceph: fully initialize connection in con_init()
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (10 preceding siblings ...)
  2012-05-30 19:35 ` [PATCH 11/13] libceph: init monitor connection when opening Alex Elder
@ 2012-05-30 19:35 ` Alex Elder
  2012-06-01  4:31   ` Sage Weil
  2012-05-30 19:35 ` [PATCH 13/13] libceph: set CLOSED state bit in con_init Alex Elder
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:35 UTC (permalink / raw)
  To: ceph-devel

Move the initialization of a ceph connection's private pointer,
operations vector pointer, and peer name information into
ceph_con_init().  Rearrange the arguments so the connection pointer
is first.  Hide the byte-swapping of the peer entity number inside
ceph_con_init()

Signed-off-by: Alex Elder <elder@inktank.com>
---
  fs/ceph/mds_client.c           |    7 ++-----
  include/linux/ceph/messenger.h |    6 ++++--
  net/ceph/messenger.c           |    9 ++++++++-
  net/ceph/mon_client.c          |    8 +++-----
  net/ceph/osd_client.c          |    7 ++-----
  5 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index ad30261..ecd7f15 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -394,11 +394,8 @@ static struct ceph_mds_session 
*register_session(struct ceph_mds_client *mdsc,
  	s->s_seq = 0;
  	mutex_init(&s->s_mutex);

-	ceph_con_init(&mdsc->fsc->client->msgr, &s->s_con);
-	s->s_con.private = s;
-	s->s_con.ops = &mds_con_ops;
-	s->s_con.peer_name.type = CEPH_ENTITY_TYPE_MDS;
-	s->s_con.peer_name.num = cpu_to_le64(mds);
+	ceph_con_init(&s->s_con, s, &mds_con_ops, &mdsc->fsc->client->msgr,
+		CEPH_ENTITY_TYPE_MDS, mds);

  	spin_lock_init(&s->s_gen_ttl_lock);
  	s->s_cap_gen = 0;
diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 5e852f4..dd27837 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -227,8 +227,10 @@ extern void ceph_messenger_init(struct 
ceph_messenger *msgr,
  			u32 required_features,
  			bool nocrc);

-extern void ceph_con_init(struct ceph_messenger *msgr,
-			  struct ceph_connection *con);
+extern void ceph_con_init(struct ceph_connection *con, void *private,
+			const struct ceph_connection_operations *ops,
+			struct ceph_messenger *msgr, __u8 entity_type,
+			__u64 entity_num);
  extern void ceph_con_open(struct ceph_connection *con,
  			  struct ceph_entity_addr *addr);
  extern bool ceph_con_opened(struct ceph_connection *con);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 7e11b07..cdf8299 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -514,15 +514,22 @@ void ceph_con_put(struct ceph_connection *con)
  /*
   * initialize a new connection.
   */
-void ceph_con_init(struct ceph_messenger *msgr, struct ceph_connection 
*con)
+void ceph_con_init(struct ceph_connection *con, void *private,
+	const struct ceph_connection_operations *ops,
+	struct ceph_messenger *msgr, __u8 entity_type, __u64 entity_num)
  {
  	dout("con_init %p\n", con);
  	memset(con, 0, sizeof(*con));
+	con->private = private;
  	atomic_set(&con->nref, 1);
+	con->ops = ops;
  	con->msgr = msgr;

  	con_sock_state_init(con);

+	con->peer_name.type = (__u8) entity_type;
+	con->peer_name.num = cpu_to_le64(entity_num);
+
  	mutex_init(&con->mutex);
  	INIT_LIST_HEAD(&con->out_queue);
  	INIT_LIST_HEAD(&con->out_sent);
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index 77da480..9b4cef9 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -142,11 +142,9 @@ static int __open_session(struct ceph_mon_client *monc)
  		monc->sub_renew_after = jiffies;  /* i.e., expired */
  		monc->want_next_osdmap = !!monc->want_next_osdmap;

-		ceph_con_init(&monc->client->msgr, &monc->con);
-		monc->con.private = monc;
-		monc->con.ops = &mon_con_ops;
-		monc->con.peer_name.type = CEPH_ENTITY_TYPE_MON;
-		monc->con.peer_name.num = cpu_to_le64(monc->cur_mon);
+		ceph_con_init(&monc->con, monc, &mon_con_ops,
+			&monc->client->msgr,
+			CEPH_ENTITY_TYPE_MON, monc->cur_mon);

  		dout("open_session mon%d opening\n", monc->cur_mon);
  		ceph_con_open(&monc->con,
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index e30efbc..1f3951a 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -640,11 +640,8 @@ static struct ceph_osd *create_osd(struct 
ceph_osd_client *osdc, int onum)
  	INIT_LIST_HEAD(&osd->o_osd_lru);
  	osd->o_incarnation = 1;

-	ceph_con_init(&osdc->client->msgr, &osd->o_con);
-	osd->o_con.private = osd;
-	osd->o_con.ops = &osd_con_ops;
-	osd->o_con.peer_name.type = CEPH_ENTITY_TYPE_OSD;
-	osd->o_con.peer_name.num = cpu_to_le64(onum);
+	ceph_con_init(&osd->o_con, osd, &osd_con_ops, &osdc->client->msgr,
+		CEPH_ENTITY_TYPE_OSD, onum);

  	INIT_LIST_HEAD(&osd->o_keepalive_item);
  	return osd;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH 13/13] libceph: set CLOSED state bit in con_init
  2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
                   ` (11 preceding siblings ...)
  2012-05-30 19:35 ` [PATCH 12/13] libceph: fully initialize connection in con_init() Alex Elder
@ 2012-05-30 19:35 ` Alex Elder
  2012-06-01  4:32   ` Sage Weil
  12 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-05-30 19:35 UTC (permalink / raw)
  To: ceph-devel

Once a connection is fully initialized, it is really in a CLOSED
state, so make that explicit by setting the bit in its state field.

It is possible for a connection in NEGOTIATING state to get a
failure, leading to ceph_fault() and ultimately ceph_con_close().
Clear that bits if it is set in that case, to reflect that the
connection truly is closed and is no longer participating in a
connect sequence.

Issue a warning if ceph_con_open() is called on a connection that
is not in CLOSED state.

Signed-off-by: Alex Elder <elder@inktank.com>
---
  net/ceph/messenger.c |    8 +++++++-
  1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index cdf8299..85bfe12 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -452,10 +452,13 @@ void ceph_con_close(struct ceph_connection *con)
  	dout("con_close %p peer %s\n", con,
  	     ceph_pr_addr(&con->peer_addr.in_addr));
  	set_bit(CLOSED, &con->state);  /* in case there's queued work */
+	clear_bit(NEGOTIATING, &con->state);
  	clear_bit(STANDBY, &con->state);  /* avoid connect_seq bump */
+
  	clear_bit(LOSSYTX, &con->flags);  /* so we retry next connect */
  	clear_bit(KEEPALIVE_PENDING, &con->flags);
  	clear_bit(WRITE_PENDING, &con->flags);
+
  	mutex_lock(&con->mutex);
  	reset_connection(con);
  	con->peer_global_seq = 0;
@@ -472,7 +475,8 @@ void ceph_con_open(struct ceph_connection *con, 
struct ceph_entity_addr *addr)
  {
  	dout("con_open %p %s\n", con, ceph_pr_addr(&addr->in_addr));
  	set_bit(OPENING, &con->state);
-	clear_bit(CLOSED, &con->state);
+	WARN_ON(!test_and_clear_bit(CLOSED, &con->state));
+
  	memcpy(&con->peer_addr, addr, sizeof(*addr));
  	con->delay = 0;      /* reset backoff memory */
  	queue_con(con);
@@ -534,6 +538,8 @@ void ceph_con_init(struct ceph_connection *con, void 
*private,
  	INIT_LIST_HEAD(&con->out_queue);
  	INIT_LIST_HEAD(&con->out_sent);
  	INIT_DELAYED_WORK(&con->work, con_work);
+
+	set_bit(CLOSED, &con->state);
  }
  EXPORT_SYMBOL(ceph_con_init);

-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH 01/13] libceph: eliminate connection state "DEAD"
  2012-05-30 19:34 ` [PATCH 01/13] libceph: eliminate connection state "DEAD" Alex Elder
@ 2012-05-31 16:20   ` Yehuda Sadeh
  0 siblings, 0 replies; 44+ messages in thread
From: Yehuda Sadeh @ 2012-05-31 16:20 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

On Wed, May 30, 2012 at 12:34 PM, Alex Elder <elder@inktank.com> wrote:
> The ceph connection state "DEAD" is never set and is therefore not
> needed.  Eliminate it.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  include/linux/ceph/messenger.h |    1 -
>  net/ceph/messenger.c           |    6 ------
>  2 files changed, 0 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 2521a95..aa506ca 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -119,7 +119,6 @@ struct ceph_msg_pos {
>  #define CLOSED         10 /* we've closed the connection */
>  #define SOCK_CLOSED    11 /* socket state changed to closed */
>  #define OPENING         13 /* open connection w/ (possibly new) peer */
> -#define DEAD            14 /* dead, about to kfree */
>  #define BACKOFF         15
>
>  /*
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 1a80907..42ca8aa 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2087,12 +2087,6 @@ bad_tag:
>  */
>  static void queue_con(struct ceph_connection *con)
>  {
> -       if (test_bit(DEAD, &con->state)) {
> -               dout("queue_con %p ignoring: DEAD\n",
> -                    con);
> -               return;
> -       }
> -
>        if (!con->ops->get(con)) {
>                dout("queue_con %p ref count 0\n", con);
>                return;
> --
> 1.7.5.4
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 02/13] libceph: kill bad_proto ceph connection op
  2012-05-30 19:34 ` [PATCH 02/13] libceph: kill bad_proto ceph connection op Alex Elder
@ 2012-05-31 16:30   ` Yehuda Sadeh
  0 siblings, 0 replies; 44+ messages in thread
From: Yehuda Sadeh @ 2012-05-31 16:30 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

On Wed, May 30, 2012 at 12:34 PM, Alex Elder <elder@inktank.com> wrote:
> No code sets a bad_proto method in its ceph connection operations
> vector, so just get rid of it.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  include/linux/ceph/messenger.h |    3 ---
>  net/ceph/messenger.c           |    5 -----
>  2 files changed, 0 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index aa506ca..74f6c9b 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -31,9 +31,6 @@ struct ceph_connection_operations {
>        int (*verify_authorizer_reply) (struct ceph_connection *con, int
> len);
>        int (*invalidate_authorizer)(struct ceph_connection *con);
>
> -       /* protocol version mismatch */
> -       void (*bad_proto) (struct ceph_connection *con);
> -
>        /* there was some error on the socket (disconnect, whatever) */
>        void (*fault) (struct ceph_connection *con);
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 42ca8aa..07af994 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -1356,11 +1356,6 @@ static void fail_protocol(struct ceph_connection
> *con)
>  {
>        reset_connection(con);
>        set_bit(CLOSED, &con->state);  /* in case there's queued work */
> -
> -       mutex_unlock(&con->mutex);
> -       if (con->ops->bad_proto)
> -               con->ops->bad_proto(con);
> -       mutex_lock(&con->mutex);
>  }
>
>  static int process_connect(struct ceph_connection *con)
> --
> 1.7.5.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 04/13] libceph: rename socket callbacks
  2012-05-30 19:34 ` [PATCH 04/13] libceph: rename socket callbacks Alex Elder
@ 2012-05-31 16:33   ` Yehuda Sadeh Weinraub
  2012-06-01  4:02   ` Sage Weil
  1 sibling, 0 replies; 44+ messages in thread
From: Yehuda Sadeh Weinraub @ 2012-05-31 16:33 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

On Wed, May 30, 2012 at 12:34 PM, Alex Elder <elder@inktank.com> wrote:
> Change the names of the three socket callback functions to make it
> more obvious they're specifically associated with a connection's
> socket (not the ceph connection that uses it).
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |   28 ++++++++++++++--------------
>  1 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index fe3c2a1..5ad1f0a 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -153,46 +153,46 @@ EXPORT_SYMBOL(ceph_msgr_flush);
>  */
>
>  /* data available on socket, or listen socket received a connect */
> -static void ceph_data_ready(struct sock *sk, int count_unused)
> +static void ceph_sock_data_ready(struct sock *sk, int count_unused)
>  {
>        struct ceph_connection *con = sk->sk_user_data;
>
>        if (sk->sk_state != TCP_CLOSE_WAIT) {
> -               dout("ceph_data_ready on %p state = %lu, queueing work\n",
> +               dout("%s on %p state = %lu, queueing work\n", __func__,
>                     con, con->state);
>                queue_con(con);
>        }
>  }
>
>  /* socket has buffer space for writing */
> -static void ceph_write_space(struct sock *sk)
> +static void ceph_sock_write_space(struct sock *sk)
>  {
>        struct ceph_connection *con = sk->sk_user_data;
>
>        /* only queue to workqueue if there is data we want to write,
>         * and there is sufficient space in the socket buffer to accept
> -        * more data.  clear SOCK_NOSPACE so that ceph_write_space()
> +        * more data.  clear SOCK_NOSPACE so that ceph_sock_write_space()
>         * doesn't get called again until try_write() fills the socket
>         * buffer. See net/ipv4/tcp_input.c:tcp_check_space()
>         * and net/core/stream.c:sk_stream_write_space().
>         */
>        if (test_bit(WRITE_PENDING, &con->state)) {
>                if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) {
> -                       dout("ceph_write_space %p queueing write work\n",
> con);
> +                       dout("%s %p queueing write work\n", __func__, con);
>                        clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
>                        queue_con(con);
>                }
>        } else {
> -               dout("ceph_write_space %p nothing to write\n", con);
> +               dout("%s %p nothing to write\n", __func__, con);
>        }
>  }
>
>  /* socket's state has changed */
> -static void ceph_state_change(struct sock *sk)
> +static void ceph_sock_state_change(struct sock *sk)
>  {
>        struct ceph_connection *con = sk->sk_user_data;
>
> -       dout("ceph_state_change %p state = %lu sk_state = %u\n",
> +       dout("%s %p state = %lu sk_state = %u\n", __func__,
>             con, con->state, sk->sk_state);
>
>        if (test_bit(CLOSED, &con->state))
> @@ -200,9 +200,9 @@ static void ceph_state_change(struct sock *sk)
>
>        switch (sk->sk_state) {
>        case TCP_CLOSE:
> -               dout("ceph_state_change TCP_CLOSE\n");
> +               dout("%s TCP_CLOSE\n", __func__);
>        case TCP_CLOSE_WAIT:
> -               dout("ceph_state_change TCP_CLOSE_WAIT\n");
> +               dout("%s TCP_CLOSE_WAIT\n", __func__);
>                if (test_and_set_bit(SOCK_CLOSED, &con->state) == 0) {
>                        if (test_bit(CONNECTING, &con->state))
>                                con->error_msg = "connection failed";
> @@ -212,7 +212,7 @@ static void ceph_state_change(struct sock *sk)
>                }
>                break;
>        case TCP_ESTABLISHED:
> -               dout("ceph_state_change TCP_ESTABLISHED\n");
> +               dout("%s TCP_ESTABLISHED\n", __func__);
>                queue_con(con);
>                break;
>        default:        /* Everything else is uninteresting */
> @@ -228,9 +228,9 @@ static void set_sock_callbacks(struct socket *sock,
>  {
>        struct sock *sk = sock->sk;
>        sk->sk_user_data = con;
> -       sk->sk_data_ready = ceph_data_ready;
> -       sk->sk_write_space = ceph_write_space;
> -       sk->sk_state_change = ceph_state_change;
> +       sk->sk_data_ready = ceph_sock_data_ready;
> +       sk->sk_write_space = ceph_sock_write_space;
> +       sk->sk_state_change = ceph_sock_state_change;
>  }
>
>
> --
> 1.7.5.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 05/13] libceph: rename kvec_reset and kvec_add functions
  2012-05-30 19:34 ` [PATCH 05/13] libceph: rename kvec_reset and kvec_add functions Alex Elder
@ 2012-05-31 16:34   ` Yehuda Sadeh
  2012-06-01  4:02   ` Sage Weil
  1 sibling, 0 replies; 44+ messages in thread
From: Yehuda Sadeh @ 2012-05-31 16:34 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

On Wed, May 30, 2012 at 12:34 PM, Alex Elder <elder@inktank.com> wrote:
> The functions ceph_con_out_kvec_reset() and ceph_con_out_kvec_add()
> are entirely private functions, so drop the "ceph_" prefix in their
> name to make them slightly more wieldy.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |   48
> ++++++++++++++++++++++++------------------------
>  1 files changed, 24 insertions(+), 24 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 5ad1f0a..2e9054f 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -484,14 +484,14 @@ static u32 get_global_seq(struct ceph_messenger *msgr,
> u32 gt)
>        return ret;
>  }
>
> -static void ceph_con_out_kvec_reset(struct ceph_connection *con)
> +static void con_out_kvec_reset(struct ceph_connection *con)
>  {
>        con->out_kvec_left = 0;
>        con->out_kvec_bytes = 0;
>        con->out_kvec_cur = &con->out_kvec[0];
>  }
>
> -static void ceph_con_out_kvec_add(struct ceph_connection *con,
> +static void con_out_kvec_add(struct ceph_connection *con,
>                                size_t size, void *data)
>  {
>        int index;
> @@ -532,7 +532,7 @@ static void prepare_write_message(struct ceph_connection
> *con)
>        struct ceph_msg *m;
>        u32 crc;
>
> -       ceph_con_out_kvec_reset(con);
> +       con_out_kvec_reset(con);
>        con->out_kvec_is_msg = true;
>        con->out_msg_done = false;
>
> @@ -540,9 +540,9 @@ static void prepare_write_message(struct ceph_connection
> *con)
>         * TCP packet that's a good thing. */
>        if (con->in_seq > con->in_seq_acked) {
>                con->in_seq_acked = con->in_seq;
> -               ceph_con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
> +               con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
>                con->out_temp_ack = cpu_to_le64(con->in_seq_acked);
> -               ceph_con_out_kvec_add(con, sizeof (con->out_temp_ack),
> +               con_out_kvec_add(con, sizeof (con->out_temp_ack),
>                        &con->out_temp_ack);
>        }
>
> @@ -570,12 +570,12 @@ static void prepare_write_message(struct
> ceph_connection *con)
>        BUG_ON(le32_to_cpu(m->hdr.front_len) != m->front.iov_len);
>
>        /* tag + hdr + front + middle */
> -       ceph_con_out_kvec_add(con, sizeof (tag_msg), &tag_msg);
> -       ceph_con_out_kvec_add(con, sizeof (m->hdr), &m->hdr);
> -       ceph_con_out_kvec_add(con, m->front.iov_len, m->front.iov_base);
> +       con_out_kvec_add(con, sizeof (tag_msg), &tag_msg);
> +       con_out_kvec_add(con, sizeof (m->hdr), &m->hdr);
> +       con_out_kvec_add(con, m->front.iov_len, m->front.iov_base);
>
>        if (m->middle)
> -               ceph_con_out_kvec_add(con, m->middle->vec.iov_len,
> +               con_out_kvec_add(con, m->middle->vec.iov_len,
>                        m->middle->vec.iov_base);
>
>        /* fill in crc (except data pages), footer */
> @@ -624,12 +624,12 @@ static void prepare_write_ack(struct ceph_connection
> *con)
>             con->in_seq_acked, con->in_seq);
>        con->in_seq_acked = con->in_seq;
>
> -       ceph_con_out_kvec_reset(con);
> +       con_out_kvec_reset(con);
>
> -       ceph_con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
> +       con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
>
>        con->out_temp_ack = cpu_to_le64(con->in_seq_acked);
> -       ceph_con_out_kvec_add(con, sizeof (con->out_temp_ack),
> +       con_out_kvec_add(con, sizeof (con->out_temp_ack),
>                                &con->out_temp_ack);
>
>        con->out_more = 1;  /* more will follow.. eventually.. */
> @@ -642,8 +642,8 @@ static void prepare_write_ack(struct ceph_connection
> *con)
>  static void prepare_write_keepalive(struct ceph_connection *con)
>  {
>        dout("prepare_write_keepalive %p\n", con);
> -       ceph_con_out_kvec_reset(con);
> -       ceph_con_out_kvec_add(con, sizeof (tag_keepalive), &tag_keepalive);
> +       con_out_kvec_reset(con);
> +       con_out_kvec_add(con, sizeof (tag_keepalive), &tag_keepalive);
>        set_bit(WRITE_PENDING, &con->state);
>  }
>
> @@ -688,8 +688,8 @@ static struct ceph_auth_handshake
> *get_connect_authorizer(struct ceph_connection
>  */
>  static void prepare_write_banner(struct ceph_connection *con)
>  {
> -       ceph_con_out_kvec_add(con, strlen(CEPH_BANNER), CEPH_BANNER);
> -       ceph_con_out_kvec_add(con, sizeof (con->msgr->my_enc_addr),
> +       con_out_kvec_add(con, strlen(CEPH_BANNER), CEPH_BANNER);
> +       con_out_kvec_add(con, sizeof (con->msgr->my_enc_addr),
>                                        &con->msgr->my_enc_addr);
>
>        con->out_more = 0;
> @@ -736,10 +736,10 @@ static int prepare_write_connect(struct
> ceph_connection *con)
>        con->out_connect.authorizer_len = auth ?
>                cpu_to_le32(auth->authorizer_buf_len) : 0;
>
> -       ceph_con_out_kvec_add(con, sizeof (con->out_connect),
> +       con_out_kvec_add(con, sizeof (con->out_connect),
>                                        &con->out_connect);
>        if (auth && auth->authorizer_buf_len)
> -               ceph_con_out_kvec_add(con, auth->authorizer_buf_len,
> +               con_out_kvec_add(con, auth->authorizer_buf_len,
>                                        auth->authorizer_buf);
>
>        con->out_more = 0;
> @@ -933,7 +933,7 @@ static int write_partial_msg_pages(struct
> ceph_connection *con)
>        /* prepare and queue up footer, too */
>        if (!do_datacrc)
>                con->out_msg->footer.flags |= CEPH_MSG_FOOTER_NOCRC;
> -       ceph_con_out_kvec_reset(con);
> +       con_out_kvec_reset(con);
>        prepare_write_message_footer(con);
>        ret = 1;
>  out:
> @@ -1396,7 +1396,7 @@ static int process_connect(struct ceph_connection
> *con)
>                        return -1;
>                }
>                con->auth_retry = 1;
> -               ceph_con_out_kvec_reset(con);
> +               con_out_kvec_reset(con);
>                ret = prepare_write_connect(con);
>                if (ret < 0)
>                        return ret;
> @@ -1417,7 +1417,7 @@ static int process_connect(struct ceph_connection
> *con)
>                       ENTITY_NAME(con->peer_name),
>                       ceph_pr_addr(&con->peer_addr.in_addr));
>                reset_connection(con);
> -               ceph_con_out_kvec_reset(con);
> +               con_out_kvec_reset(con);
>                ret = prepare_write_connect(con);
>                if (ret < 0)
>                        return ret;
> @@ -1443,7 +1443,7 @@ static int process_connect(struct ceph_connection
> *con)
>                     le32_to_cpu(con->out_connect.connect_seq),
>                     le32_to_cpu(con->in_connect.connect_seq));
>                con->connect_seq = le32_to_cpu(con->in_connect.connect_seq);
> -               ceph_con_out_kvec_reset(con);
> +               con_out_kvec_reset(con);
>                ret = prepare_write_connect(con);
>                if (ret < 0)
>                        return ret;
> @@ -1460,7 +1460,7 @@ static int process_connect(struct ceph_connection
> *con)
>                     le32_to_cpu(con->in_connect.global_seq));
>                get_global_seq(con->msgr,
>                               le32_to_cpu(con->in_connect.global_seq));
> -               ceph_con_out_kvec_reset(con);
> +               con_out_kvec_reset(con);
>                ret = prepare_write_connect(con);
>                if (ret < 0)
>                        return ret;
> @@ -1867,7 +1867,7 @@ more:
>
>        /* open the socket first? */
>        if (con->sock == NULL) {
> -               ceph_con_out_kvec_reset(con);
> +               con_out_kvec_reset(con);
>                prepare_write_banner(con);
>                ret = prepare_write_connect(con);
>                if (ret < 0)
> --
> 1.7.5.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 06/13] libceph: embed ceph messenger structure in ceph_client
  2012-05-30 19:34 ` [PATCH 06/13] libceph: embed ceph messenger structure in ceph_client Alex Elder
@ 2012-05-31 16:44   ` Yehuda Sadeh
  2012-06-01  4:04   ` Sage Weil
  1 sibling, 0 replies; 44+ messages in thread
From: Yehuda Sadeh @ 2012-05-31 16:44 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>

On Wed, May 30, 2012 at 12:34 PM, Alex Elder <elder@inktank.com> wrote:
> A ceph client has a pointer to a ceph messenger structure in it.
> There is always exactly one ceph messenger for a ceph client, so
> there is no need to allocate it separate from the ceph client
> structure.
>
> Switch the ceph_client structure to embed its ceph_messenger
> structure.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  fs/ceph/mds_client.c           |    2 +-
>  include/linux/ceph/libceph.h   |    2 +-
>  include/linux/ceph/messenger.h |    9 +++++----
>  net/ceph/ceph_common.c         |   18 +++++-------------
>  net/ceph/messenger.c           |   30 +++++++++---------------------
>  net/ceph/mon_client.c          |    6 +++---
>  net/ceph/osd_client.c          |    4 ++--
>  7 files changed, 26 insertions(+), 45 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 200bc87..ad30261 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -394,7 +394,7 @@ static struct ceph_mds_session *register_session(struct
> ceph_mds_client *mdsc,
>        s->s_seq = 0;
>        mutex_init(&s->s_mutex);
>
> -       ceph_con_init(mdsc->fsc->client->msgr, &s->s_con);
> +       ceph_con_init(&mdsc->fsc->client->msgr, &s->s_con);
>        s->s_con.private = s;
>        s->s_con.ops = &mds_con_ops;
>        s->s_con.peer_name.type = CEPH_ENTITY_TYPE_MDS;
> diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
> index 92eef7c..927361c 100644
> --- a/include/linux/ceph/libceph.h
> +++ b/include/linux/ceph/libceph.h
> @@ -131,7 +131,7 @@ struct ceph_client {
>        u32 supported_features;
>        u32 required_features;
>
> -       struct ceph_messenger *msgr;   /* messenger instance */
> +       struct ceph_messenger msgr;   /* messenger instance */
>        struct ceph_mon_client monc;
>        struct ceph_osd_client osdc;
>
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 74f6c9b..3fbd4be 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -211,10 +211,11 @@ extern int ceph_msgr_init(void);
>  extern void ceph_msgr_exit(void);
>  extern void ceph_msgr_flush(void);
>
> -extern struct ceph_messenger *ceph_messenger_create(
> -       struct ceph_entity_addr *myaddr,
> -       u32 features, u32 required);
> -extern void ceph_messenger_destroy(struct ceph_messenger *);
> +extern void ceph_messenger_init(struct ceph_messenger *msgr,
> +                       struct ceph_entity_addr *myaddr,
> +                       u32 supported_features,
> +                       u32 required_features,
> +                       bool nocrc);
>
>  extern void ceph_con_init(struct ceph_messenger *msgr,
>                          struct ceph_connection *con);
> diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
> index cc91319..2de3ea1 100644
> --- a/net/ceph/ceph_common.c
> +++ b/net/ceph/ceph_common.c
> @@ -468,19 +468,15 @@ struct ceph_client *ceph_create_client(struct
> ceph_options *opt, void *private,
>        /* msgr */
>        if (ceph_test_opt(client, MYIP))
>                myaddr = &client->options->my_addr;
> -       client->msgr = ceph_messenger_create(myaddr,
> -                                            client->supported_features,
> -                                            client->required_features);
> -       if (IS_ERR(client->msgr)) {
> -               err = PTR_ERR(client->msgr);
> -               goto fail;
> -       }
> -       client->msgr->nocrc = ceph_test_opt(client, NOCRC);
> +       ceph_messenger_init(&client->msgr, myaddr,
> +               client->supported_features,
> +               client->required_features,
> +               ceph_test_opt(client, NOCRC));
>
>        /* subsystems */
>        err = ceph_monc_init(&client->monc, client);
>        if (err < 0)
> -               goto fail_msgr;
> +               goto fail;
>        err = ceph_osdc_init(&client->osdc, client);
>        if (err < 0)
>                goto fail_monc;
> @@ -489,8 +485,6 @@ struct ceph_client *ceph_create_client(struct
> ceph_options *opt, void *private,
>
>  fail_monc:
>        ceph_monc_stop(&client->monc);
> -fail_msgr:
> -       ceph_messenger_destroy(client->msgr);
>  fail:
>        kfree(client);
>        return ERR_PTR(err);
> @@ -515,8 +509,6 @@ void ceph_destroy_client(struct ceph_client *client)
>
>        ceph_debugfs_client_cleanup(client);
>
> -       ceph_messenger_destroy(client->msgr);
> -
>        ceph_destroy_options(client->options);
>
>        kfree(client);
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 2e9054f..19f1948 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2243,18 +2243,14 @@ out:
>
>
>  /*
> - * create a new messenger instance
> + * initialize a new messenger instance
>  */
> -struct ceph_messenger *ceph_messenger_create(struct ceph_entity_addr
> *myaddr,
> -                                            u32 supported_features,
> -                                            u32 required_features)
> +void ceph_messenger_init(struct ceph_messenger *msgr,
> +                       struct ceph_entity_addr *myaddr,
> +                       u32 supported_features,
> +                       u32 required_features,
> +                       bool nocrc)
>  {
> -       struct ceph_messenger *msgr;
> -
> -       msgr = kzalloc(sizeof(*msgr), GFP_KERNEL);
> -       if (msgr == NULL)
> -               return ERR_PTR(-ENOMEM);
> -
>        msgr->supported_features = supported_features;
>        msgr->required_features = required_features;
>
> @@ -2267,19 +2263,11 @@ struct ceph_messenger *ceph_messenger_create(struct
> ceph_entity_addr *myaddr,
>        msgr->inst.addr.type = 0;
>        get_random_bytes(&msgr->inst.addr.nonce,
> sizeof(msgr->inst.addr.nonce));
>        encode_my_addr(msgr);
> +       msgr->nocrc = nocrc;
>
> -       dout("messenger_create %p\n", msgr);
> -       return msgr;
> -}
> -EXPORT_SYMBOL(ceph_messenger_create);
> -
> -void ceph_messenger_destroy(struct ceph_messenger *msgr)
> -{
> -       dout("destroy %p\n", msgr);
> -       kfree(msgr);
> -       dout("destroyed messenger %p\n", msgr);
> +       dout("%s %p\n", __func__, msgr);
>  }
> -EXPORT_SYMBOL(ceph_messenger_destroy);
> +EXPORT_SYMBOL(ceph_messenger_init);
>
>  static void clear_standby(struct ceph_connection *con)
>  {
> diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
> index 1845cde..704dc95 100644
> --- a/net/ceph/mon_client.c
> +++ b/net/ceph/mon_client.c
> @@ -763,7 +763,7 @@ int ceph_monc_init(struct ceph_mon_client *monc, struct
> ceph_client *cl)
>        monc->con = kmalloc(sizeof(*monc->con), GFP_KERNEL);
>        if (!monc->con)
>                goto out_monmap;
> -       ceph_con_init(monc->client->msgr, monc->con);
> +       ceph_con_init(&monc->client->msgr, monc->con);
>        monc->con->private = monc;
>        monc->con->ops = &mon_con_ops;
>
> @@ -880,8 +880,8 @@ static void handle_auth_reply(struct ceph_mon_client
> *monc,
>        } else if (!was_auth &&
> monc->auth->ops->is_authenticated(monc->auth)) {
>                dout("authenticated, starting session\n");
>
> -               monc->client->msgr->inst.name.type =
> CEPH_ENTITY_TYPE_CLIENT;
> -               monc->client->msgr->inst.name.num =
> +               monc->client->msgr.inst.name.type = CEPH_ENTITY_TYPE_CLIENT;
> +               monc->client->msgr.inst.name.num =
>                                        cpu_to_le64(monc->auth->global_id);
>
>                __send_subscribe(monc);
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index b098e7b..cca4c7f 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -639,7 +639,7 @@ static struct ceph_osd *create_osd(struct
> ceph_osd_client *osdc)
>        INIT_LIST_HEAD(&osd->o_osd_lru);
>        osd->o_incarnation = 1;
>
> -       ceph_con_init(osdc->client->msgr, &osd->o_con);
> +       ceph_con_init(&osdc->client->msgr, &osd->o_con);
>        osd->o_con.private = osd;
>        osd->o_con.ops = &osd_con_ops;
>        osd->o_con.peer_name.type = CEPH_ENTITY_TYPE_OSD;
> @@ -1391,7 +1391,7 @@ void ceph_osdc_handle_map(struct ceph_osd_client
> *osdc, struct ceph_msg *msg)
>                             epoch, maplen);
>                        newmap = osdmap_apply_incremental(&p, next,
>                                                          osdc->osdmap,
> -
> osdc->client->msgr);
> +
> &osdc->client->msgr);
>                        if (IS_ERR(newmap)) {
>                                err = PTR_ERR(newmap);
>                                goto bad;
> --
> 1.7.5.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 04/13] libceph: rename socket callbacks
  2012-05-30 19:34 ` [PATCH 04/13] libceph: rename socket callbacks Alex Elder
  2012-05-31 16:33   ` Yehuda Sadeh Weinraub
@ 2012-06-01  4:02   ` Sage Weil
  1 sibling, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:02 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On Wed, 30 May 2012, Alex Elder wrote:
> Change the names of the three socket callback functions to make it
> more obvious they're specifically associated with a connection's
> socket (not the ceph connection that uses it).
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |   28 ++++++++++++++--------------
>  1 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index fe3c2a1..5ad1f0a 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -153,46 +153,46 @@ EXPORT_SYMBOL(ceph_msgr_flush);
>   */
> 
>  /* data available on socket, or listen socket received a connect */
> -static void ceph_data_ready(struct sock *sk, int count_unused)
> +static void ceph_sock_data_ready(struct sock *sk, int count_unused)
>  {
>  	struct ceph_connection *con = sk->sk_user_data;
> 
>  	if (sk->sk_state != TCP_CLOSE_WAIT) {
> -		dout("ceph_data_ready on %p state = %lu, queueing work\n",
> +		dout("%s on %p state = %lu, queueing work\n", __func__,

I think it's marginally better to do

		dout(__func__ " on %p state = %lu, queueing work\n",

so that the concatenation happens at compile-time instead of runtime.

Otherwise, looks good!

Reviewed-by: Sage Weil <sage@inktank.com>

>  		     con, con->state);
>  		queue_con(con);
>  	}
>  }
> 
>  /* socket has buffer space for writing */
> -static void ceph_write_space(struct sock *sk)
> +static void ceph_sock_write_space(struct sock *sk)
>  {
>  	struct ceph_connection *con = sk->sk_user_data;
> 
>  	/* only queue to workqueue if there is data we want to write,
>  	 * and there is sufficient space in the socket buffer to accept
> -	 * more data.  clear SOCK_NOSPACE so that ceph_write_space()
> +	 * more data.  clear SOCK_NOSPACE so that ceph_sock_write_space()
>  	 * doesn't get called again until try_write() fills the socket
>  	 * buffer. See net/ipv4/tcp_input.c:tcp_check_space()
>  	 * and net/core/stream.c:sk_stream_write_space().
>  	 */
>  	if (test_bit(WRITE_PENDING, &con->state)) {
>  		if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) {
> -			dout("ceph_write_space %p queueing write work\n",
> con);
> +			dout("%s %p queueing write work\n", __func__, con);
>  			clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
>  			queue_con(con);
>  		}
>  	} else {
> -		dout("ceph_write_space %p nothing to write\n", con);
> +		dout("%s %p nothing to write\n", __func__, con);
>  	}
>  }
> 
>  /* socket's state has changed */
> -static void ceph_state_change(struct sock *sk)
> +static void ceph_sock_state_change(struct sock *sk)
>  {
>  	struct ceph_connection *con = sk->sk_user_data;
> 
> -	dout("ceph_state_change %p state = %lu sk_state = %u\n",
> +	dout("%s %p state = %lu sk_state = %u\n", __func__,
>  	     con, con->state, sk->sk_state);
> 
>  	if (test_bit(CLOSED, &con->state))
> @@ -200,9 +200,9 @@ static void ceph_state_change(struct sock *sk)
> 
>  	switch (sk->sk_state) {
>  	case TCP_CLOSE:
> -		dout("ceph_state_change TCP_CLOSE\n");
> +		dout("%s TCP_CLOSE\n", __func__);
>  	case TCP_CLOSE_WAIT:
> -		dout("ceph_state_change TCP_CLOSE_WAIT\n");
> +		dout("%s TCP_CLOSE_WAIT\n", __func__);
>  		if (test_and_set_bit(SOCK_CLOSED, &con->state) == 0) {
>  			if (test_bit(CONNECTING, &con->state))
>  				con->error_msg = "connection failed";
> @@ -212,7 +212,7 @@ static void ceph_state_change(struct sock *sk)
>  		}
>  		break;
>  	case TCP_ESTABLISHED:
> -		dout("ceph_state_change TCP_ESTABLISHED\n");
> +		dout("%s TCP_ESTABLISHED\n", __func__);
>  		queue_con(con);
>  		break;
>  	default:	/* Everything else is uninteresting */
> @@ -228,9 +228,9 @@ static void set_sock_callbacks(struct socket *sock,
>  {
>  	struct sock *sk = sock->sk;
>  	sk->sk_user_data = con;
> -	sk->sk_data_ready = ceph_data_ready;
> -	sk->sk_write_space = ceph_write_space;
> -	sk->sk_state_change = ceph_state_change;
> +	sk->sk_data_ready = ceph_sock_data_ready;
> +	sk->sk_write_space = ceph_sock_write_space;
> +	sk->sk_state_change = ceph_sock_state_change;
>  }
> 
> 
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 05/13] libceph: rename kvec_reset and kvec_add functions
  2012-05-30 19:34 ` [PATCH 05/13] libceph: rename kvec_reset and kvec_add functions Alex Elder
  2012-05-31 16:34   ` Yehuda Sadeh
@ 2012-06-01  4:02   ` Sage Weil
  1 sibling, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:02 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Yep

On Wed, 30 May 2012, Alex Elder wrote:

> The functions ceph_con_out_kvec_reset() and ceph_con_out_kvec_add()
> are entirely private functions, so drop the "ceph_" prefix in their
> name to make them slightly more wieldy.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |   48 ++++++++++++++++++++++++------------------------
>  1 files changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 5ad1f0a..2e9054f 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -484,14 +484,14 @@ static u32 get_global_seq(struct ceph_messenger *msgr,
> u32 gt)
>  	return ret;
>  }
> 
> -static void ceph_con_out_kvec_reset(struct ceph_connection *con)
> +static void con_out_kvec_reset(struct ceph_connection *con)
>  {
>  	con->out_kvec_left = 0;
>  	con->out_kvec_bytes = 0;
>  	con->out_kvec_cur = &con->out_kvec[0];
>  }
> 
> -static void ceph_con_out_kvec_add(struct ceph_connection *con,
> +static void con_out_kvec_add(struct ceph_connection *con,
>  				size_t size, void *data)
>  {
>  	int index;
> @@ -532,7 +532,7 @@ static void prepare_write_message(struct ceph_connection
> *con)
>  	struct ceph_msg *m;
>  	u32 crc;
> 
> -	ceph_con_out_kvec_reset(con);
> +	con_out_kvec_reset(con);
>  	con->out_kvec_is_msg = true;
>  	con->out_msg_done = false;
> 
> @@ -540,9 +540,9 @@ static void prepare_write_message(struct ceph_connection
> *con)
>  	 * TCP packet that's a good thing. */
>  	if (con->in_seq > con->in_seq_acked) {
>  		con->in_seq_acked = con->in_seq;
> -		ceph_con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
> +		con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
>  		con->out_temp_ack = cpu_to_le64(con->in_seq_acked);
> -		ceph_con_out_kvec_add(con, sizeof (con->out_temp_ack),
> +		con_out_kvec_add(con, sizeof (con->out_temp_ack),
>  			&con->out_temp_ack);
>  	}
> 
> @@ -570,12 +570,12 @@ static void prepare_write_message(struct ceph_connection
> *con)
>  	BUG_ON(le32_to_cpu(m->hdr.front_len) != m->front.iov_len);
> 
>  	/* tag + hdr + front + middle */
> -	ceph_con_out_kvec_add(con, sizeof (tag_msg), &tag_msg);
> -	ceph_con_out_kvec_add(con, sizeof (m->hdr), &m->hdr);
> -	ceph_con_out_kvec_add(con, m->front.iov_len, m->front.iov_base);
> +	con_out_kvec_add(con, sizeof (tag_msg), &tag_msg);
> +	con_out_kvec_add(con, sizeof (m->hdr), &m->hdr);
> +	con_out_kvec_add(con, m->front.iov_len, m->front.iov_base);
> 
>  	if (m->middle)
> -		ceph_con_out_kvec_add(con, m->middle->vec.iov_len,
> +		con_out_kvec_add(con, m->middle->vec.iov_len,
>  			m->middle->vec.iov_base);
> 
>  	/* fill in crc (except data pages), footer */
> @@ -624,12 +624,12 @@ static void prepare_write_ack(struct ceph_connection
> *con)
>  	     con->in_seq_acked, con->in_seq);
>  	con->in_seq_acked = con->in_seq;
> 
> -	ceph_con_out_kvec_reset(con);
> +	con_out_kvec_reset(con);
> 
> -	ceph_con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
> +	con_out_kvec_add(con, sizeof (tag_ack), &tag_ack);
> 
>  	con->out_temp_ack = cpu_to_le64(con->in_seq_acked);
> -	ceph_con_out_kvec_add(con, sizeof (con->out_temp_ack),
> +	con_out_kvec_add(con, sizeof (con->out_temp_ack),
>  				&con->out_temp_ack);
> 
>  	con->out_more = 1;  /* more will follow.. eventually.. */
> @@ -642,8 +642,8 @@ static void prepare_write_ack(struct ceph_connection *con)
>  static void prepare_write_keepalive(struct ceph_connection *con)
>  {
>  	dout("prepare_write_keepalive %p\n", con);
> -	ceph_con_out_kvec_reset(con);
> -	ceph_con_out_kvec_add(con, sizeof (tag_keepalive), &tag_keepalive);
> +	con_out_kvec_reset(con);
> +	con_out_kvec_add(con, sizeof (tag_keepalive), &tag_keepalive);
>  	set_bit(WRITE_PENDING, &con->state);
>  }
> 
> @@ -688,8 +688,8 @@ static struct ceph_auth_handshake
> *get_connect_authorizer(struct ceph_connection
>   */
>  static void prepare_write_banner(struct ceph_connection *con)
>  {
> -	ceph_con_out_kvec_add(con, strlen(CEPH_BANNER), CEPH_BANNER);
> -	ceph_con_out_kvec_add(con, sizeof (con->msgr->my_enc_addr),
> +	con_out_kvec_add(con, strlen(CEPH_BANNER), CEPH_BANNER);
> +	con_out_kvec_add(con, sizeof (con->msgr->my_enc_addr),
>  					&con->msgr->my_enc_addr);
> 
>  	con->out_more = 0;
> @@ -736,10 +736,10 @@ static int prepare_write_connect(struct ceph_connection
> *con)
>  	con->out_connect.authorizer_len = auth ?
>  		cpu_to_le32(auth->authorizer_buf_len) : 0;
> 
> -	ceph_con_out_kvec_add(con, sizeof (con->out_connect),
> +	con_out_kvec_add(con, sizeof (con->out_connect),
>  					&con->out_connect);
>  	if (auth && auth->authorizer_buf_len)
> -		ceph_con_out_kvec_add(con, auth->authorizer_buf_len,
> +		con_out_kvec_add(con, auth->authorizer_buf_len,
>  					auth->authorizer_buf);
> 
>  	con->out_more = 0;
> @@ -933,7 +933,7 @@ static int write_partial_msg_pages(struct ceph_connection
> *con)
>  	/* prepare and queue up footer, too */
>  	if (!do_datacrc)
>  		con->out_msg->footer.flags |= CEPH_MSG_FOOTER_NOCRC;
> -	ceph_con_out_kvec_reset(con);
> +	con_out_kvec_reset(con);
>  	prepare_write_message_footer(con);
>  	ret = 1;
>  out:
> @@ -1396,7 +1396,7 @@ static int process_connect(struct ceph_connection *con)
>  			return -1;
>  		}
>  		con->auth_retry = 1;
> -		ceph_con_out_kvec_reset(con);
> +		con_out_kvec_reset(con);
>  		ret = prepare_write_connect(con);
>  		if (ret < 0)
>  			return ret;
> @@ -1417,7 +1417,7 @@ static int process_connect(struct ceph_connection *con)
>  		       ENTITY_NAME(con->peer_name),
>  		       ceph_pr_addr(&con->peer_addr.in_addr));
>  		reset_connection(con);
> -		ceph_con_out_kvec_reset(con);
> +		con_out_kvec_reset(con);
>  		ret = prepare_write_connect(con);
>  		if (ret < 0)
>  			return ret;
> @@ -1443,7 +1443,7 @@ static int process_connect(struct ceph_connection *con)
>  		     le32_to_cpu(con->out_connect.connect_seq),
>  		     le32_to_cpu(con->in_connect.connect_seq));
>  		con->connect_seq = le32_to_cpu(con->in_connect.connect_seq);
> -		ceph_con_out_kvec_reset(con);
> +		con_out_kvec_reset(con);
>  		ret = prepare_write_connect(con);
>  		if (ret < 0)
>  			return ret;
> @@ -1460,7 +1460,7 @@ static int process_connect(struct ceph_connection *con)
>  		     le32_to_cpu(con->in_connect.global_seq));
>  		get_global_seq(con->msgr,
>  			       le32_to_cpu(con->in_connect.global_seq));
> -		ceph_con_out_kvec_reset(con);
> +		con_out_kvec_reset(con);
>  		ret = prepare_write_connect(con);
>  		if (ret < 0)
>  			return ret;
> @@ -1867,7 +1867,7 @@ more:
> 
>  	/* open the socket first? */
>  	if (con->sock == NULL) {
> -		ceph_con_out_kvec_reset(con);
> +		con_out_kvec_reset(con);
>  		prepare_write_banner(con);
>  		ret = prepare_write_connect(con);
>  		if (ret < 0)
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 06/13] libceph: embed ceph messenger structure in ceph_client
  2012-05-30 19:34 ` [PATCH 06/13] libceph: embed ceph messenger structure in ceph_client Alex Elder
  2012-05-31 16:44   ` Yehuda Sadeh
@ 2012-06-01  4:04   ` Sage Weil
  1 sibling, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:04 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Sage Weil <sage@inktank.com>

On Wed, 30 May 2012, Alex Elder wrote:

> A ceph client has a pointer to a ceph messenger structure in it.
> There is always exactly one ceph messenger for a ceph client, so
> there is no need to allocate it separate from the ceph client
> structure.
> 
> Switch the ceph_client structure to embed its ceph_messenger
> structure.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  fs/ceph/mds_client.c           |    2 +-
>  include/linux/ceph/libceph.h   |    2 +-
>  include/linux/ceph/messenger.h |    9 +++++----
>  net/ceph/ceph_common.c         |   18 +++++-------------
>  net/ceph/messenger.c           |   30 +++++++++---------------------
>  net/ceph/mon_client.c          |    6 +++---
>  net/ceph/osd_client.c          |    4 ++--
>  7 files changed, 26 insertions(+), 45 deletions(-)
> 
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 200bc87..ad30261 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -394,7 +394,7 @@ static struct ceph_mds_session *register_session(struct
> ceph_mds_client *mdsc,
>  	s->s_seq = 0;
>  	mutex_init(&s->s_mutex);
> 
> -	ceph_con_init(mdsc->fsc->client->msgr, &s->s_con);
> +	ceph_con_init(&mdsc->fsc->client->msgr, &s->s_con);
>  	s->s_con.private = s;
>  	s->s_con.ops = &mds_con_ops;
>  	s->s_con.peer_name.type = CEPH_ENTITY_TYPE_MDS;
> diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
> index 92eef7c..927361c 100644
> --- a/include/linux/ceph/libceph.h
> +++ b/include/linux/ceph/libceph.h
> @@ -131,7 +131,7 @@ struct ceph_client {
>  	u32 supported_features;
>  	u32 required_features;
> 
> -	struct ceph_messenger *msgr;   /* messenger instance */
> +	struct ceph_messenger msgr;   /* messenger instance */
>  	struct ceph_mon_client monc;
>  	struct ceph_osd_client osdc;
> 
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 74f6c9b..3fbd4be 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -211,10 +211,11 @@ extern int ceph_msgr_init(void);
>  extern void ceph_msgr_exit(void);
>  extern void ceph_msgr_flush(void);
> 
> -extern struct ceph_messenger *ceph_messenger_create(
> -	struct ceph_entity_addr *myaddr,
> -	u32 features, u32 required);
> -extern void ceph_messenger_destroy(struct ceph_messenger *);
> +extern void ceph_messenger_init(struct ceph_messenger *msgr,
> +			struct ceph_entity_addr *myaddr,
> +			u32 supported_features,
> +			u32 required_features,
> +			bool nocrc);
> 
>  extern void ceph_con_init(struct ceph_messenger *msgr,
>  			  struct ceph_connection *con);
> diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
> index cc91319..2de3ea1 100644
> --- a/net/ceph/ceph_common.c
> +++ b/net/ceph/ceph_common.c
> @@ -468,19 +468,15 @@ struct ceph_client *ceph_create_client(struct
> ceph_options *opt, void *private,
>  	/* msgr */
>  	if (ceph_test_opt(client, MYIP))
>  		myaddr = &client->options->my_addr;
> -	client->msgr = ceph_messenger_create(myaddr,
> -					     client->supported_features,
> -					     client->required_features);
> -	if (IS_ERR(client->msgr)) {
> -		err = PTR_ERR(client->msgr);
> -		goto fail;
> -	}
> -	client->msgr->nocrc = ceph_test_opt(client, NOCRC);
> +	ceph_messenger_init(&client->msgr, myaddr,
> +		client->supported_features,
> +		client->required_features,
> +		ceph_test_opt(client, NOCRC));
> 
>  	/* subsystems */
>  	err = ceph_monc_init(&client->monc, client);
>  	if (err < 0)
> -		goto fail_msgr;
> +		goto fail;
>  	err = ceph_osdc_init(&client->osdc, client);
>  	if (err < 0)
>  		goto fail_monc;
> @@ -489,8 +485,6 @@ struct ceph_client *ceph_create_client(struct ceph_options
> *opt, void *private,
> 
>  fail_monc:
>  	ceph_monc_stop(&client->monc);
> -fail_msgr:
> -	ceph_messenger_destroy(client->msgr);
>  fail:
>  	kfree(client);
>  	return ERR_PTR(err);
> @@ -515,8 +509,6 @@ void ceph_destroy_client(struct ceph_client *client)
> 
>  	ceph_debugfs_client_cleanup(client);
> 
> -	ceph_messenger_destroy(client->msgr);
> -
>  	ceph_destroy_options(client->options);
> 
>  	kfree(client);
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 2e9054f..19f1948 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2243,18 +2243,14 @@ out:
> 
> 
>  /*
> - * create a new messenger instance
> + * initialize a new messenger instance
>   */
> -struct ceph_messenger *ceph_messenger_create(struct ceph_entity_addr *myaddr,
> -					     u32 supported_features,
> -					     u32 required_features)
> +void ceph_messenger_init(struct ceph_messenger *msgr,
> +			struct ceph_entity_addr *myaddr,
> +			u32 supported_features,
> +			u32 required_features,
> +			bool nocrc)
>  {
> -	struct ceph_messenger *msgr;
> -
> -	msgr = kzalloc(sizeof(*msgr), GFP_KERNEL);
> -	if (msgr == NULL)
> -		return ERR_PTR(-ENOMEM);
> -
>  	msgr->supported_features = supported_features;
>  	msgr->required_features = required_features;
> 
> @@ -2267,19 +2263,11 @@ struct ceph_messenger *ceph_messenger_create(struct
> ceph_entity_addr *myaddr,
>  	msgr->inst.addr.type = 0;
>  	get_random_bytes(&msgr->inst.addr.nonce,
> sizeof(msgr->inst.addr.nonce));
>  	encode_my_addr(msgr);
> +	msgr->nocrc = nocrc;
> 
> -	dout("messenger_create %p\n", msgr);
> -	return msgr;
> -}
> -EXPORT_SYMBOL(ceph_messenger_create);
> -
> -void ceph_messenger_destroy(struct ceph_messenger *msgr)
> -{
> -	dout("destroy %p\n", msgr);
> -	kfree(msgr);
> -	dout("destroyed messenger %p\n", msgr);
> +	dout("%s %p\n", __func__, msgr);
>  }
> -EXPORT_SYMBOL(ceph_messenger_destroy);
> +EXPORT_SYMBOL(ceph_messenger_init);
> 
>  static void clear_standby(struct ceph_connection *con)
>  {
> diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
> index 1845cde..704dc95 100644
> --- a/net/ceph/mon_client.c
> +++ b/net/ceph/mon_client.c
> @@ -763,7 +763,7 @@ int ceph_monc_init(struct ceph_mon_client *monc, struct
> ceph_client *cl)
>  	monc->con = kmalloc(sizeof(*monc->con), GFP_KERNEL);
>  	if (!monc->con)
>  		goto out_monmap;
> -	ceph_con_init(monc->client->msgr, monc->con);
> +	ceph_con_init(&monc->client->msgr, monc->con);
>  	monc->con->private = monc;
>  	monc->con->ops = &mon_con_ops;
> 
> @@ -880,8 +880,8 @@ static void handle_auth_reply(struct ceph_mon_client
> *monc,
>  	} else if (!was_auth && monc->auth->ops->is_authenticated(monc->auth))
> {
>  		dout("authenticated, starting session\n");
> 
> -		monc->client->msgr->inst.name.type = CEPH_ENTITY_TYPE_CLIENT;
> -		monc->client->msgr->inst.name.num =
> +		monc->client->msgr.inst.name.type = CEPH_ENTITY_TYPE_CLIENT;
> +		monc->client->msgr.inst.name.num =
>  					cpu_to_le64(monc->auth->global_id);
> 
>  		__send_subscribe(monc);
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index b098e7b..cca4c7f 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -639,7 +639,7 @@ static struct ceph_osd *create_osd(struct ceph_osd_client
> *osdc)
>  	INIT_LIST_HEAD(&osd->o_osd_lru);
>  	osd->o_incarnation = 1;
> 
> -	ceph_con_init(osdc->client->msgr, &osd->o_con);
> +	ceph_con_init(&osdc->client->msgr, &osd->o_con);
>  	osd->o_con.private = osd;
>  	osd->o_con.ops = &osd_con_ops;
>  	osd->o_con.peer_name.type = CEPH_ENTITY_TYPE_OSD;
> @@ -1391,7 +1391,7 @@ void ceph_osdc_handle_map(struct ceph_osd_client *osdc,
> struct ceph_msg *msg)
>  			     epoch, maplen);
>  			newmap = osdmap_apply_incremental(&p, next,
>  							  osdc->osdmap,
> -							  osdc->client->msgr);
> +
> &osdc->client->msgr);
>  			if (IS_ERR(newmap)) {
>  				err = PTR_ERR(newmap);
>  				goto bad;
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-05-30 19:34 ` [PATCH 07/13] libceph: embed ceph connection structure in mon_client Alex Elder
@ 2012-06-01  4:24   ` Sage Weil
  2012-06-01 12:12     ` Alex Elder
  2012-06-01 17:09     ` Alex Elder
  0 siblings, 2 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:24 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On Wed, 30 May 2012, Alex Elder wrote:
> A monitor client has a pointer to a ceph connection structure in it.
> This is the only one of the three ceph client types that do it this
> way; the OSD and MDS clients embed the connection into their main
> structures.  There is always exactly one ceph connection for a
> monitor client, so there is no need to allocate it separate from the
> monitor client structure.
> 
> So switch the ceph_mon_client structure to embed its
> ceph_connection structure.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  include/linux/ceph/mon_client.h |    2 +-
>  net/ceph/mon_client.c           |   47 ++++++++++++++++----------------------
>  2 files changed, 21 insertions(+), 28 deletions(-)
> 
> diff --git a/include/linux/ceph/mon_client.h b/include/linux/ceph/mon_client.h
> index 545f859..2113e38 100644
> --- a/include/linux/ceph/mon_client.h
> +++ b/include/linux/ceph/mon_client.h
> @@ -70,7 +70,7 @@ struct ceph_mon_client {
>  	bool hunting;
>  	int cur_mon;                       /* last monitor i contacted */
>  	unsigned long sub_sent, sub_renew_after;
> -	struct ceph_connection *con;
> +	struct ceph_connection con;
>  	bool have_fsid;
> 
>  	/* pending generic requests */
> diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
> index 704dc95..ac4d6b1 100644
> --- a/net/ceph/mon_client.c
> +++ b/net/ceph/mon_client.c
> @@ -106,9 +106,9 @@ static void __send_prepared_auth_request(struct
> ceph_mon_client *monc, int len)
>  	monc->pending_auth = 1;
>  	monc->m_auth->front.iov_len = len;
>  	monc->m_auth->hdr.front_len = cpu_to_le32(len);
> -	ceph_con_revoke(monc->con, monc->m_auth);
> +	ceph_con_revoke(&monc->con, monc->m_auth);
>  	ceph_msg_get(monc->m_auth);  /* keep our ref */
> -	ceph_con_send(monc->con, monc->m_auth);
> +	ceph_con_send(&monc->con, monc->m_auth);
>  }
> 
>  /*
> @@ -117,8 +117,8 @@ static void __send_prepared_auth_request(struct
> ceph_mon_client *monc, int len)
>  static void __close_session(struct ceph_mon_client *monc)
>  {
>  	dout("__close_session closing mon%d\n", monc->cur_mon);
> -	ceph_con_revoke(monc->con, monc->m_auth);
> -	ceph_con_close(monc->con);
> +	ceph_con_revoke(&monc->con, monc->m_auth);
> +	ceph_con_close(&monc->con);
>  	monc->cur_mon = -1;
>  	monc->pending_auth = 0;
>  	ceph_auth_reset(monc->auth);
> @@ -142,9 +142,9 @@ static int __open_session(struct ceph_mon_client *monc)
>  		monc->want_next_osdmap = !!monc->want_next_osdmap;
> 
>  		dout("open_session mon%d opening\n", monc->cur_mon);
> -		monc->con->peer_name.type = CEPH_ENTITY_TYPE_MON;
> -		monc->con->peer_name.num = cpu_to_le64(monc->cur_mon);
> -		ceph_con_open(monc->con,
> +		monc->con.peer_name.type = CEPH_ENTITY_TYPE_MON;
> +		monc->con.peer_name.num = cpu_to_le64(monc->cur_mon);
> +		ceph_con_open(&monc->con,
>  			      &monc->monmap->mon_inst[monc->cur_mon].addr);
> 
>  		/* initiatiate authentication handshake */
> @@ -226,8 +226,8 @@ static void __send_subscribe(struct ceph_mon_client *monc)
> 
>  		msg->front.iov_len = p - msg->front.iov_base;
>  		msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
> -		ceph_con_revoke(monc->con, msg);
> -		ceph_con_send(monc->con, ceph_msg_get(msg));
> +		ceph_con_revoke(&monc->con, msg);
> +		ceph_con_send(&monc->con, ceph_msg_get(msg));
> 
>  		monc->sub_sent = jiffies | 1;  /* never 0 */
>  	}
> @@ -247,7 +247,7 @@ static void handle_subscribe_ack(struct ceph_mon_client
> *monc,
>  	if (monc->hunting) {
>  		pr_info("mon%d %s session established\n",
>  			monc->cur_mon,
> -			ceph_pr_addr(&monc->con->peer_addr.in_addr));
> +			ceph_pr_addr(&monc->con.peer_addr.in_addr));
>  		monc->hunting = false;
>  	}
>  	dout("handle_subscribe_ack after %d seconds\n", seconds);
> @@ -461,7 +461,7 @@ static int do_generic_request(struct ceph_mon_client
> *monc,
>  	req->request->hdr.tid = cpu_to_le64(req->tid);
>  	__insert_generic_request(monc, req);
>  	monc->num_generic_requests++;
> -	ceph_con_send(monc->con, ceph_msg_get(req->request));
> +	ceph_con_send(&monc->con, ceph_msg_get(req->request));
>  	mutex_unlock(&monc->mutex);
> 
>  	err = wait_for_completion_interruptible(&req->completion);
> @@ -684,8 +684,8 @@ static void __resend_generic_request(struct
> ceph_mon_client *monc)
> 
>  	for (p = rb_first(&monc->generic_request_tree); p; p = rb_next(p)) {
>  		req = rb_entry(p, struct ceph_mon_generic_request, node);
> -		ceph_con_revoke(monc->con, req->request);
> -		ceph_con_send(monc->con, ceph_msg_get(req->request));
> +		ceph_con_revoke(&monc->con, req->request);
> +		ceph_con_send(&monc->con, ceph_msg_get(req->request));
>  	}
>  }
> 
> @@ -705,7 +705,7 @@ static void delayed_work(struct work_struct *work)
>  		__close_session(monc);
>  		__open_session(monc);  /* continue hunting */
>  	} else {
> -		ceph_con_keepalive(monc->con);
> +		ceph_con_keepalive(&monc->con);
> 
>  		__validate_auth(monc);
> 
> @@ -760,19 +760,16 @@ int ceph_monc_init(struct ceph_mon_client *monc, struct
> ceph_client *cl)
>  		goto out;
> 
>  	/* connection */
> -	monc->con = kmalloc(sizeof(*monc->con), GFP_KERNEL);
> -	if (!monc->con)
> -		goto out_monmap;
> -	ceph_con_init(&monc->client->msgr, monc->con);
> -	monc->con->private = monc;
> -	monc->con->ops = &mon_con_ops;
> +	ceph_con_init(&monc->client->msgr, &monc->con);
> +	monc->con.private = monc;
> +	monc->con.ops = &mon_con_ops;
> 
>  	/* authentication */
>  	monc->auth = ceph_auth_init(cl->options->name,
>  				    cl->options->key);
>  	if (IS_ERR(monc->auth)) {
>  		err = PTR_ERR(monc->auth);
> -		goto out_con;
> +		goto out_monmap;
>  	}
>  	monc->auth->want_keys =
>  		CEPH_ENTITY_TYPE_AUTH | CEPH_ENTITY_TYPE_MON |
> @@ -824,8 +821,6 @@ out_subscribe_ack:
>  	ceph_msg_put(monc->m_subscribe_ack);
>  out_auth:
>  	ceph_auth_destroy(monc->auth);
> -out_con:
> -	monc->con->ops->put(monc->con);

AH!

This reminds me, these connections need to be refcounted.  There's a 
->get() and ->put() op defined so that you can refcount the containing 
structure.  That means that this patch needs to alo change 

static const struct ceph_connection_operations mon_con_ops = {
	.get = ceph_con_get,
	.put = ceph_con_put,

in mon_client.c.  Hopefully the mon_client itself is refcounted, *or* we 
can ensure that it won't go away before the msgr workqueue is drained and 
the get/put ops can turn to no-ops.

Also: when poking around, I noticed that ceph_con_get() and put() are 
called directly from osd_client.c... that's a bug!  Those connections have 
a get and put op defined that twiddles the containing ceph_osd struct's 
ref count.

I pushed several patches to your latest (wip-messenger-2) branch that fix 
these issues.  Compile tested only!  The first should probably be folded 
into this one, the others follow.


>  out_monmap:
>  	kfree(monc->monmap);
>  out:
> @@ -841,9 +836,7 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
>  	mutex_lock(&monc->mutex);
>  	__close_session(monc);
> 
> -	monc->con->private = NULL;
> -	monc->con->ops->put(monc->con);
> -	monc->con = NULL;
> +	monc->con.private = NULL;
> 
>  	mutex_unlock(&monc->mutex);
> 
> @@ -1021,7 +1014,7 @@ static void mon_fault(struct ceph_connection *con)
>  	if (!monc->hunting)
>  		pr_info("mon%d %s session lost, "
>  			"hunting for new mon\n", monc->cur_mon,
> -			ceph_pr_addr(&monc->con->peer_addr.in_addr));
> +			ceph_pr_addr(&monc->con.peer_addr.in_addr));
> 
>  	__close_session(monc);
>  	if (!monc->hunting) {
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 08/13] libceph: start separating connection flags from state
  2012-05-30 19:35 ` [PATCH 08/13] libceph: start separating connection flags from state Alex Elder
@ 2012-06-01  4:25   ` Sage Weil
  2012-06-01 12:13     ` Alex Elder
  0 siblings, 1 reply; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:25 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On Wed, 30 May 2012, Alex Elder wrote:
> A ceph_connection holds a mixture of connection state (as in "state
> machine" state) and connection flags in a single "state" field.  To
> make the distinction more clear, define a new "flags" field and use
> it rather than the "state" field to hold Boolean flag values.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  include/linux/ceph/messenger.h |   18 +++++++++----
>  net/ceph/messenger.c           |   50
> ++++++++++++++++++++--------------------
>  2 files changed, 37 insertions(+), 31 deletions(-)
> 
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 3fbd4be..920235e 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -103,20 +103,25 @@ struct ceph_msg_pos {
>  #define MAX_DELAY_INTERVAL	(5 * 60 * HZ)
> 
>  /*
> - * ceph_connection state bit flags
> + * ceph_connection flag bits
>   */
> +
>  #define LOSSYTX         0  /* we can close channel or drop messages on errors
> */
> -#define CONNECTING	1
> -#define NEGOTIATING	2
>  #define KEEPALIVE_PENDING      3
>  #define WRITE_PENDING	4  /* we have data ready to send */
> +#define SOCK_CLOSED	11 /* socket state changed to closed */
> +#define BACKOFF         15
> +
> +/*
> + * ceph_connection states
> + */
> +#define CONNECTING	1
> +#define NEGOTIATING	2
>  #define STANDBY		8  /* no outgoing messages, socket closed.  we
> keep
>  			    * the ceph_connection around to maintain shared
>  			    * state with the peer. */
>  #define CLOSED		10 /* we've closed the connection */
> -#define SOCK_CLOSED	11 /* socket state changed to closed */
>  #define OPENING         13 /* open connection w/ (possibly new) peer */
> -#define BACKOFF         15

Later it might be work prefixing these with FLAG_ and/or STATE_.

Reviewed-by: Sage Weil <sage@inktank.com>


> 
>  /*
>   * A single connection with another host.
> @@ -133,7 +138,8 @@ struct ceph_connection {
> 
>  	struct ceph_messenger *msgr;
>  	struct socket *sock;
> -	unsigned long state;	/* connection state (see flags above) */
> +	unsigned long flags;
> +	unsigned long state;
>  	const char *error_msg;  /* error message, if any */
> 
>  	struct ceph_entity_addr peer_addr; /* peer address */
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 19f1948..29055df 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -176,7 +176,7 @@ static void ceph_sock_write_space(struct sock *sk)
>  	 * buffer. See net/ipv4/tcp_input.c:tcp_check_space()
>  	 * and net/core/stream.c:sk_stream_write_space().
>  	 */
> -	if (test_bit(WRITE_PENDING, &con->state)) {
> +	if (test_bit(WRITE_PENDING, &con->flags)) {
>  		if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) {
>  			dout("%s %p queueing write work\n", __func__, con);
>  			clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> @@ -203,7 +203,7 @@ static void ceph_sock_state_change(struct sock *sk)
>  		dout("%s TCP_CLOSE\n", __func__);
>  	case TCP_CLOSE_WAIT:
>  		dout("%s TCP_CLOSE_WAIT\n", __func__);
> -		if (test_and_set_bit(SOCK_CLOSED, &con->state) == 0) {
> +		if (test_and_set_bit(SOCK_CLOSED, &con->flags) == 0) {
>  			if (test_bit(CONNECTING, &con->state))
>  				con->error_msg = "connection failed";
>  			else
> @@ -393,9 +393,9 @@ void ceph_con_close(struct ceph_connection *con)
>  	     ceph_pr_addr(&con->peer_addr.in_addr));
>  	set_bit(CLOSED, &con->state);  /* in case there's queued work */
>  	clear_bit(STANDBY, &con->state);  /* avoid connect_seq bump */
> -	clear_bit(LOSSYTX, &con->state);  /* so we retry next connect */
> -	clear_bit(KEEPALIVE_PENDING, &con->state);
> -	clear_bit(WRITE_PENDING, &con->state);
> +	clear_bit(LOSSYTX, &con->flags);  /* so we retry next connect */
> +	clear_bit(KEEPALIVE_PENDING, &con->flags);
> +	clear_bit(WRITE_PENDING, &con->flags);
>  	mutex_lock(&con->mutex);
>  	reset_connection(con);
>  	con->peer_global_seq = 0;
> @@ -612,7 +612,7 @@ static void prepare_write_message(struct ceph_connection
> *con)
>  		prepare_write_message_footer(con);
>  	}
> 
> -	set_bit(WRITE_PENDING, &con->state);
> +	set_bit(WRITE_PENDING, &con->flags);
>  }
> 
>  /*
> @@ -633,7 +633,7 @@ static void prepare_write_ack(struct ceph_connection *con)
>  				&con->out_temp_ack);
> 
>  	con->out_more = 1;  /* more will follow.. eventually.. */
> -	set_bit(WRITE_PENDING, &con->state);
> +	set_bit(WRITE_PENDING, &con->flags);
>  }
> 
>  /*
> @@ -644,7 +644,7 @@ static void prepare_write_keepalive(struct ceph_connection
> *con)
>  	dout("prepare_write_keepalive %p\n", con);
>  	con_out_kvec_reset(con);
>  	con_out_kvec_add(con, sizeof (tag_keepalive), &tag_keepalive);
> -	set_bit(WRITE_PENDING, &con->state);
> +	set_bit(WRITE_PENDING, &con->flags);
>  }
> 
>  /*
> @@ -673,7 +673,7 @@ static struct ceph_auth_handshake
> *get_connect_authorizer(struct ceph_connection
> 
>  	if (IS_ERR(auth))
>  		return auth;
> -	if (test_bit(CLOSED, &con->state) || test_bit(OPENING, &con->state))
> +	if (test_bit(CLOSED, &con->state) || test_bit(OPENING, &con->flags))
>  		return ERR_PTR(-EAGAIN);
> 
>  	con->auth_reply_buf = auth->authorizer_reply_buf;
> @@ -693,7 +693,7 @@ static void prepare_write_banner(struct ceph_connection
> *con)
>  					&con->msgr->my_enc_addr);
> 
>  	con->out_more = 0;
> -	set_bit(WRITE_PENDING, &con->state);
> +	set_bit(WRITE_PENDING, &con->flags);
>  }
> 
>  static int prepare_write_connect(struct ceph_connection *con)
> @@ -743,7 +743,7 @@ static int prepare_write_connect(struct ceph_connection
> *con)
>  					auth->authorizer_buf);
> 
>  	con->out_more = 0;
> -	set_bit(WRITE_PENDING, &con->state);
> +	set_bit(WRITE_PENDING, &con->flags);
> 
>  	return 0;
>  }
> @@ -1490,7 +1490,7 @@ static int process_connect(struct ceph_connection *con)
>  			le32_to_cpu(con->in_reply.connect_seq));
> 
>  		if (con->in_reply.flags & CEPH_MSG_CONNECT_LOSSY)
> -			set_bit(LOSSYTX, &con->state);
> +			set_bit(LOSSYTX, &con->flags);
> 
>  		prepare_read_tag(con);
>  		break;
> @@ -1931,14 +1931,14 @@ do_next:
>  			prepare_write_ack(con);
>  			goto more;
>  		}
> -		if (test_and_clear_bit(KEEPALIVE_PENDING, &con->state)) {
> +		if (test_and_clear_bit(KEEPALIVE_PENDING, &con->flags)) {
>  			prepare_write_keepalive(con);
>  			goto more;
>  		}
>  	}
> 
>  	/* Nothing to do! */
> -	clear_bit(WRITE_PENDING, &con->state);
> +	clear_bit(WRITE_PENDING, &con->flags);
>  	dout("try_write nothing else to write.\n");
>  	ret = 0;
>  out:
> @@ -2104,7 +2104,7 @@ static void con_work(struct work_struct *work)
> 
>  	mutex_lock(&con->mutex);
>  restart:
> -	if (test_and_clear_bit(BACKOFF, &con->state)) {
> +	if (test_and_clear_bit(BACKOFF, &con->flags)) {
>  		dout("con_work %p backing off\n", con);
>  		if (queue_delayed_work(ceph_msgr_wq, &con->work,
>  				       round_jiffies_relative(con->delay))) {
> @@ -2133,7 +2133,7 @@ restart:
>  		con_close_socket(con);
>  	}
> 
> -	if (test_and_clear_bit(SOCK_CLOSED, &con->state))
> +	if (test_and_clear_bit(SOCK_CLOSED, &con->flags))
>  		goto fault;
> 
>  	ret = try_read(con);
> @@ -2172,7 +2172,7 @@ static void ceph_fault(struct ceph_connection *con)
>  	dout("fault %p state %lu to peer %s\n",
>  	     con, con->state, ceph_pr_addr(&con->peer_addr.in_addr));
> 
> -	if (test_bit(LOSSYTX, &con->state)) {
> +	if (test_bit(LOSSYTX, &con->flags)) {
>  		dout("fault on LOSSYTX channel\n");
>  		goto out;
>  	}
> @@ -2194,9 +2194,9 @@ static void ceph_fault(struct ceph_connection *con)
>  	/* If there are no messages queued or keepalive pending, place
>  	 * the connection in a STANDBY state */
>  	if (list_empty(&con->out_queue) &&
> -	    !test_bit(KEEPALIVE_PENDING, &con->state)) {
> +	    !test_bit(KEEPALIVE_PENDING, &con->flags)) {
>  		dout("fault %p setting STANDBY clearing WRITE_PENDING\n",
> con);
> -		clear_bit(WRITE_PENDING, &con->state);
> +		clear_bit(WRITE_PENDING, &con->flags);
>  		set_bit(STANDBY, &con->state);
>  	} else {
>  		/* retry after a delay. */
> @@ -2220,7 +2220,7 @@ static void ceph_fault(struct ceph_connection *con)
>  			 * that when con_work restarts we schedule the
>  			 * delay then.
>  			 */
> -			set_bit(BACKOFF, &con->state);
> +			set_bit(BACKOFF, &con->flags);
>  		}
>  	}
> 
> @@ -2276,8 +2276,8 @@ static void clear_standby(struct ceph_connection *con)
>  		mutex_lock(&con->mutex);
>  		dout("clear_standby %p and ++connect_seq\n", con);
>  		con->connect_seq++;
> -		WARN_ON(test_bit(WRITE_PENDING, &con->state));
> -		WARN_ON(test_bit(KEEPALIVE_PENDING, &con->state));
> +		WARN_ON(test_bit(WRITE_PENDING, &con->flags));
> +		WARN_ON(test_bit(KEEPALIVE_PENDING, &con->flags));
>  		mutex_unlock(&con->mutex);
>  	}
>  }
> @@ -2315,7 +2315,7 @@ void ceph_con_send(struct ceph_connection *con, struct
> ceph_msg *msg)
>  	/* if there wasn't anything waiting to send before, queue
>  	 * new work */
>  	clear_standby(con);
> -	if (test_and_set_bit(WRITE_PENDING, &con->state) == 0)
> +	if (test_and_set_bit(WRITE_PENDING, &con->flags) == 0)
>  		queue_con(con);
>  }
>  EXPORT_SYMBOL(ceph_con_send);
> @@ -2382,8 +2382,8 @@ void ceph_con_keepalive(struct ceph_connection *con)
>  {
>  	dout("con_keepalive %p\n", con);
>  	clear_standby(con);
> -	if (test_and_set_bit(KEEPALIVE_PENDING, &con->state) == 0 &&
> -	    test_and_set_bit(WRITE_PENDING, &con->state) == 0)
> +	if (test_and_set_bit(KEEPALIVE_PENDING, &con->flags) == 0 &&
> +	    test_and_set_bit(WRITE_PENDING, &con->flags) == 0)
>  		queue_con(con);
>  }
>  EXPORT_SYMBOL(ceph_con_keepalive);
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/13] libceph: start tracking connection socket state
  2012-05-30 19:35 ` [PATCH 09/13] libceph: start tracking connection socket state Alex Elder
@ 2012-06-01  4:28   ` Sage Weil
  2012-06-01 12:15     ` Alex Elder
  2012-06-12  4:52   ` Yan, Zheng
  1 sibling, 1 reply; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:28 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On Wed, 30 May 2012, Alex Elder wrote:
> Start explicitly keeping track of the state of a ceph connection's
> socket, separate from the state of the connection itself.  Create
> placeholder functions to encapsulate the state transitions.
> 
>     --------
>     | NEW* |  transient initial state
>     --------
>         | con_sock_state_init()
>         v
>     ----------
>     | CLOSED |  initialized, but no socket (and no
>     ----------  TCP connection)
>      ^      \
>      |       \ con_sock_state_connecting()
>      |        ----------------------
>      |                              \
>      + con_sock_state_closed()       \
>      |\                               \
>      | \                               \
>      |  -----------                     \
>      |  | CLOSING |  socket event;       \
>      |  -----------  await close          \
>      |       ^                            |
>      |       |                            |
>      |       + con_sock_state_closing()   |
>      |      / \                           |
>      |     /   ---------------            |
>      |    /                   \           v
>      |   /                    --------------
>      |  /    -----------------| CONNECTING |  socket created, TCP
>      |  |   /                 --------------  connect initiated
>      |  |   | con_sock_state_connected()
>      |  |   v
>     -------------
>     | CONNECTED |  TCP connection established
>     -------------

Can we put this beautiful pictures in the header next to the states?

Reviewed-by: Sage Weil <sage@inktank.com>

> 
> Make the socket state an atomic variable, reinforcing that it's a
> distinct transtion with no possible "intermediate/both" states.
> This is almost certainly overkill at this point, though the
> transitions into CONNECTED and CLOSING state do get called via
> socket callback (the rest of the transitions occur with the
> connection mutex held).  We can back out the atomicity later.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  include/linux/ceph/messenger.h |    8 ++++-
>  net/ceph/messenger.c           |   63
> ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 920235e..5e852f4 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -137,14 +137,18 @@ struct ceph_connection {
>  	const struct ceph_connection_operations *ops;
> 
>  	struct ceph_messenger *msgr;
> +
> +	atomic_t sock_state;
>  	struct socket *sock;
> +	struct ceph_entity_addr peer_addr; /* peer address */
> +	struct ceph_entity_addr peer_addr_for_me;
> +
>  	unsigned long flags;
>  	unsigned long state;
>  	const char *error_msg;  /* error message, if any */
> 
> -	struct ceph_entity_addr peer_addr; /* peer address */
>  	struct ceph_entity_name peer_name; /* peer name */
> -	struct ceph_entity_addr peer_addr_for_me;
> +
>  	unsigned peer_features;
>  	u32 connect_seq;      /* identify the most recent connection
>  				 attempt for this connection, client */
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 29055df..7e11b07 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -29,6 +29,14 @@
>   * the sender.
>   */
> 
> +/* State values for ceph_connection->sock_state; NEW is assumed to be 0 */
> +
> +#define CON_SOCK_STATE_NEW		0	/* -> CLOSED */
> +#define CON_SOCK_STATE_CLOSED		1	/* -> CONNECTING */
> +#define CON_SOCK_STATE_CONNECTING	2	/* -> CONNECTED or -> CLOSING
> */
> +#define CON_SOCK_STATE_CONNECTED	3	/* -> CLOSING or -> CLOSED */
> +#define CON_SOCK_STATE_CLOSING		4	/* -> CLOSED */
> +
>  /* static tag bytes (protocol control messages) */
>  static char tag_msg = CEPH_MSGR_TAG_MSG;
>  static char tag_ack = CEPH_MSGR_TAG_ACK;
> @@ -147,6 +155,54 @@ void ceph_msgr_flush(void)
>  }
>  EXPORT_SYMBOL(ceph_msgr_flush);
> 
> +/* Connection socket state transition functions */
> +
> +static void con_sock_state_init(struct ceph_connection *con)
> +{
> +	int old_state;
> +
> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
> +	if (WARN_ON(old_state != CON_SOCK_STATE_NEW))
> +		printk("%s: unexpected old state %d\n", __func__, old_state);
> +}
> +
> +static void con_sock_state_connecting(struct ceph_connection *con)
> +{
> +	int old_state;
> +
> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTING);
> +	if (WARN_ON(old_state != CON_SOCK_STATE_CLOSED))
> +		printk("%s: unexpected old state %d\n", __func__, old_state);
> +}
> +
> +static void con_sock_state_connected(struct ceph_connection *con)
> +{
> +	int old_state;
> +
> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTED);
> +	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING))
> +		printk("%s: unexpected old state %d\n", __func__, old_state);
> +}
> +
> +static void con_sock_state_closing(struct ceph_connection *con)
> +{
> +	int old_state;
> +
> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSING);
> +	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING &&
> +			old_state != CON_SOCK_STATE_CONNECTED))
> +		printk("%s: unexpected old state %d\n", __func__, old_state);
> +}
> +
> +static void con_sock_state_closed(struct ceph_connection *con)
> +{
> +	int old_state;
> +
> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
> +	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTED &&
> +			old_state != CON_SOCK_STATE_CLOSING))
> +		printk("%s: unexpected old state %d\n", __func__, old_state);
> +}
> 
>  /*
>   * socket callback functions
> @@ -203,6 +259,7 @@ static void ceph_sock_state_change(struct sock *sk)
>  		dout("%s TCP_CLOSE\n", __func__);
>  	case TCP_CLOSE_WAIT:
>  		dout("%s TCP_CLOSE_WAIT\n", __func__);
> +		con_sock_state_closing(con);
>  		if (test_and_set_bit(SOCK_CLOSED, &con->flags) == 0) {
>  			if (test_bit(CONNECTING, &con->state))
>  				con->error_msg = "connection failed";
> @@ -213,6 +270,7 @@ static void ceph_sock_state_change(struct sock *sk)
>  		break;
>  	case TCP_ESTABLISHED:
>  		dout("%s TCP_ESTABLISHED\n", __func__);
> +		con_sock_state_connected(con);
>  		queue_con(con);
>  		break;
>  	default:	/* Everything else is uninteresting */
> @@ -277,6 +335,7 @@ static int ceph_tcp_connect(struct ceph_connection *con)
>  		return ret;
>  	}
>  	con->sock = sock;
> +	con_sock_state_connecting(con);
> 
>  	return 0;
>  }
> @@ -341,6 +400,7 @@ static int con_close_socket(struct ceph_connection *con)
>  	rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
>  	sock_release(con->sock);
>  	con->sock = NULL;
> +	con_sock_state_closed(con);
>  	return rc;
>  }
> 
> @@ -460,6 +520,9 @@ void ceph_con_init(struct ceph_messenger *msgr, struct
> ceph_connection *con)
>  	memset(con, 0, sizeof(*con));
>  	atomic_set(&con->nref, 1);
>  	con->msgr = msgr;
> +
> +	con_sock_state_init(con);
> +
>  	mutex_init(&con->mutex);
>  	INIT_LIST_HEAD(&con->out_queue);
>  	INIT_LIST_HEAD(&con->out_sent);
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 10/13] libceph: provide osd number when creating osd
  2012-05-30 19:35 ` [PATCH 10/13] libceph: provide osd number when creating osd Alex Elder
@ 2012-06-01  4:29   ` Sage Weil
  0 siblings, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:29 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Looks good!

Reviewed-by: Sage Weil <sage@inktank.com>

On Wed, 30 May 2012, Alex Elder wrote:

> Pass the osd number to the create_osd() routine, and move the
> initialization of fields that depend on it therein.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/osd_client.c |    8 ++++----
>  1 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index cca4c7f..e30efbc 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -624,7 +624,7 @@ static void osd_reset(struct ceph_connection *con)
>  /*
>   * Track open sessions with osds.
>   */
> -static struct ceph_osd *create_osd(struct ceph_osd_client *osdc)
> +static struct ceph_osd *create_osd(struct ceph_osd_client *osdc, int onum)
>  {
>  	struct ceph_osd *osd;
> 
> @@ -634,6 +634,7 @@ static struct ceph_osd *create_osd(struct ceph_osd_client
> *osdc)
> 
>  	atomic_set(&osd->o_ref, 1);
>  	osd->o_osdc = osdc;
> +	osd->o_osd = onum;
>  	INIT_LIST_HEAD(&osd->o_requests);
>  	INIT_LIST_HEAD(&osd->o_linger_requests);
>  	INIT_LIST_HEAD(&osd->o_osd_lru);
> @@ -643,6 +644,7 @@ static struct ceph_osd *create_osd(struct ceph_osd_client
> *osdc)
>  	osd->o_con.private = osd;
>  	osd->o_con.ops = &osd_con_ops;
>  	osd->o_con.peer_name.type = CEPH_ENTITY_TYPE_OSD;
> +	osd->o_con.peer_name.num = cpu_to_le64(onum);
> 
>  	INIT_LIST_HEAD(&osd->o_keepalive_item);
>  	return osd;
> @@ -998,15 +1000,13 @@ static int __map_request(struct ceph_osd_client *osdc,
>  	req->r_osd = __lookup_osd(osdc, o);
>  	if (!req->r_osd && o >= 0) {
>  		err = -ENOMEM;
> -		req->r_osd = create_osd(osdc);
> +		req->r_osd = create_osd(osdc, o);
>  		if (!req->r_osd) {
>  			list_move(&req->r_req_lru_item, &osdc->req_notarget);
>  			goto out;
>  		}
> 
>  		dout("map_request osd %p is osd%d\n", req->r_osd, o);
> -		req->r_osd->o_osd = o;
> -		req->r_osd->o_con.peer_name.num = cpu_to_le64(o);
>  		__insert_osd(osdc, req->r_osd);
> 
>  		ceph_con_open(&req->r_osd->o_con, &osdc->osdmap->osd_addr[o]);
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 11/13] libceph: init monitor connection when opening
  2012-05-30 19:35 ` [PATCH 11/13] libceph: init monitor connection when opening Alex Elder
@ 2012-06-01  4:30   ` Sage Weil
  0 siblings, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:30 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

yep!

On Wed, 30 May 2012, Alex Elder wrote:

> Hold off initializing a monitor client's connection until just
> before it gets opened for use.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/mon_client.c |   13 ++++++-------
>  1 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
> index ac4d6b1..77da480 100644
> --- a/net/ceph/mon_client.c
> +++ b/net/ceph/mon_client.c
> @@ -119,6 +119,7 @@ static void __close_session(struct ceph_mon_client *monc)
>  	dout("__close_session closing mon%d\n", monc->cur_mon);
>  	ceph_con_revoke(&monc->con, monc->m_auth);
>  	ceph_con_close(&monc->con);
> +	monc->con.private = NULL;
>  	monc->cur_mon = -1;
>  	monc->pending_auth = 0;
>  	ceph_auth_reset(monc->auth);
> @@ -141,9 +142,13 @@ static int __open_session(struct ceph_mon_client *monc)
>  		monc->sub_renew_after = jiffies;  /* i.e., expired */
>  		monc->want_next_osdmap = !!monc->want_next_osdmap;
> 
> -		dout("open_session mon%d opening\n", monc->cur_mon);
> +		ceph_con_init(&monc->client->msgr, &monc->con);
> +		monc->con.private = monc;
> +		monc->con.ops = &mon_con_ops;
>  		monc->con.peer_name.type = CEPH_ENTITY_TYPE_MON;
>  		monc->con.peer_name.num = cpu_to_le64(monc->cur_mon);
> +
> +		dout("open_session mon%d opening\n", monc->cur_mon);
>  		ceph_con_open(&monc->con,
>  			      &monc->monmap->mon_inst[monc->cur_mon].addr);
> 
> @@ -760,10 +765,6 @@ int ceph_monc_init(struct ceph_mon_client *monc, struct
> ceph_client *cl)
>  		goto out;
> 
>  	/* connection */
> -	ceph_con_init(&monc->client->msgr, &monc->con);
> -	monc->con.private = monc;
> -	monc->con.ops = &mon_con_ops;
> -
>  	/* authentication */
>  	monc->auth = ceph_auth_init(cl->options->name,
>  				    cl->options->key);
> @@ -836,8 +837,6 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
>  	mutex_lock(&monc->mutex);
>  	__close_session(monc);
> 
> -	monc->con.private = NULL;
> -
>  	mutex_unlock(&monc->mutex);
> 
>  	ceph_auth_destroy(monc->auth);
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 12/13] libceph: fully initialize connection in con_init()
  2012-05-30 19:35 ` [PATCH 12/13] libceph: fully initialize connection in con_init() Alex Elder
@ 2012-06-01  4:31   ` Sage Weil
  0 siblings, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:31 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Sage Weil <sage@inktank.com>

On Wed, 30 May 2012, Alex Elder wrote:

> Move the initialization of a ceph connection's private pointer,
> operations vector pointer, and peer name information into
> ceph_con_init().  Rearrange the arguments so the connection pointer
> is first.  Hide the byte-swapping of the peer entity number inside
> ceph_con_init()
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  fs/ceph/mds_client.c           |    7 ++-----
>  include/linux/ceph/messenger.h |    6 ++++--
>  net/ceph/messenger.c           |    9 ++++++++-
>  net/ceph/mon_client.c          |    8 +++-----
>  net/ceph/osd_client.c          |    7 ++-----
>  5 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index ad30261..ecd7f15 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -394,11 +394,8 @@ static struct ceph_mds_session *register_session(struct
> ceph_mds_client *mdsc,
>  	s->s_seq = 0;
>  	mutex_init(&s->s_mutex);
> 
> -	ceph_con_init(&mdsc->fsc->client->msgr, &s->s_con);
> -	s->s_con.private = s;
> -	s->s_con.ops = &mds_con_ops;
> -	s->s_con.peer_name.type = CEPH_ENTITY_TYPE_MDS;
> -	s->s_con.peer_name.num = cpu_to_le64(mds);
> +	ceph_con_init(&s->s_con, s, &mds_con_ops, &mdsc->fsc->client->msgr,
> +		CEPH_ENTITY_TYPE_MDS, mds);
> 
>  	spin_lock_init(&s->s_gen_ttl_lock);
>  	s->s_cap_gen = 0;
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 5e852f4..dd27837 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -227,8 +227,10 @@ extern void ceph_messenger_init(struct ceph_messenger
> *msgr,
>  			u32 required_features,
>  			bool nocrc);
> 
> -extern void ceph_con_init(struct ceph_messenger *msgr,
> -			  struct ceph_connection *con);
> +extern void ceph_con_init(struct ceph_connection *con, void *private,
> +			const struct ceph_connection_operations *ops,
> +			struct ceph_messenger *msgr, __u8 entity_type,
> +			__u64 entity_num);
>  extern void ceph_con_open(struct ceph_connection *con,
>  			  struct ceph_entity_addr *addr);
>  extern bool ceph_con_opened(struct ceph_connection *con);
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 7e11b07..cdf8299 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -514,15 +514,22 @@ void ceph_con_put(struct ceph_connection *con)
>  /*
>   * initialize a new connection.
>   */
> -void ceph_con_init(struct ceph_messenger *msgr, struct ceph_connection *con)
> +void ceph_con_init(struct ceph_connection *con, void *private,
> +	const struct ceph_connection_operations *ops,
> +	struct ceph_messenger *msgr, __u8 entity_type, __u64 entity_num)
>  {
>  	dout("con_init %p\n", con);
>  	memset(con, 0, sizeof(*con));
> +	con->private = private;
>  	atomic_set(&con->nref, 1);
> +	con->ops = ops;
>  	con->msgr = msgr;
> 
>  	con_sock_state_init(con);
> 
> +	con->peer_name.type = (__u8) entity_type;
> +	con->peer_name.num = cpu_to_le64(entity_num);
> +
>  	mutex_init(&con->mutex);
>  	INIT_LIST_HEAD(&con->out_queue);
>  	INIT_LIST_HEAD(&con->out_sent);
> diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
> index 77da480..9b4cef9 100644
> --- a/net/ceph/mon_client.c
> +++ b/net/ceph/mon_client.c
> @@ -142,11 +142,9 @@ static int __open_session(struct ceph_mon_client *monc)
>  		monc->sub_renew_after = jiffies;  /* i.e., expired */
>  		monc->want_next_osdmap = !!monc->want_next_osdmap;
> 
> -		ceph_con_init(&monc->client->msgr, &monc->con);
> -		monc->con.private = monc;
> -		monc->con.ops = &mon_con_ops;
> -		monc->con.peer_name.type = CEPH_ENTITY_TYPE_MON;
> -		monc->con.peer_name.num = cpu_to_le64(monc->cur_mon);
> +		ceph_con_init(&monc->con, monc, &mon_con_ops,
> +			&monc->client->msgr,
> +			CEPH_ENTITY_TYPE_MON, monc->cur_mon);
> 
>  		dout("open_session mon%d opening\n", monc->cur_mon);
>  		ceph_con_open(&monc->con,
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index e30efbc..1f3951a 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -640,11 +640,8 @@ static struct ceph_osd *create_osd(struct ceph_osd_client
> *osdc, int onum)
>  	INIT_LIST_HEAD(&osd->o_osd_lru);
>  	osd->o_incarnation = 1;
> 
> -	ceph_con_init(&osdc->client->msgr, &osd->o_con);
> -	osd->o_con.private = osd;
> -	osd->o_con.ops = &osd_con_ops;
> -	osd->o_con.peer_name.type = CEPH_ENTITY_TYPE_OSD;
> -	osd->o_con.peer_name.num = cpu_to_le64(onum);
> +	ceph_con_init(&osd->o_con, osd, &osd_con_ops, &osdc->client->msgr,
> +		CEPH_ENTITY_TYPE_OSD, onum);
> 
>  	INIT_LIST_HEAD(&osd->o_keepalive_item);
>  	return osd;
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 13/13] libceph: set CLOSED state bit in con_init
  2012-05-30 19:35 ` [PATCH 13/13] libceph: set CLOSED state bit in con_init Alex Elder
@ 2012-06-01  4:32   ` Sage Weil
  0 siblings, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01  4:32 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Sage Weil <sage@inktank.com>

On Wed, 30 May 2012, Alex Elder wrote:

> Once a connection is fully initialized, it is really in a CLOSED
> state, so make that explicit by setting the bit in its state field.
> 
> It is possible for a connection in NEGOTIATING state to get a
> failure, leading to ceph_fault() and ultimately ceph_con_close().
> Clear that bits if it is set in that case, to reflect that the
> connection truly is closed and is no longer participating in a
> connect sequence.
> 
> Issue a warning if ceph_con_open() is called on a connection that
> is not in CLOSED state.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |    8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index cdf8299..85bfe12 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -452,10 +452,13 @@ void ceph_con_close(struct ceph_connection *con)
>  	dout("con_close %p peer %s\n", con,
>  	     ceph_pr_addr(&con->peer_addr.in_addr));
>  	set_bit(CLOSED, &con->state);  /* in case there's queued work */
> +	clear_bit(NEGOTIATING, &con->state);
>  	clear_bit(STANDBY, &con->state);  /* avoid connect_seq bump */
> +
>  	clear_bit(LOSSYTX, &con->flags);  /* so we retry next connect */
>  	clear_bit(KEEPALIVE_PENDING, &con->flags);
>  	clear_bit(WRITE_PENDING, &con->flags);
> +
>  	mutex_lock(&con->mutex);
>  	reset_connection(con);
>  	con->peer_global_seq = 0;
> @@ -472,7 +475,8 @@ void ceph_con_open(struct ceph_connection *con, struct
> ceph_entity_addr *addr)
>  {
>  	dout("con_open %p %s\n", con, ceph_pr_addr(&addr->in_addr));
>  	set_bit(OPENING, &con->state);
> -	clear_bit(CLOSED, &con->state);
> +	WARN_ON(!test_and_clear_bit(CLOSED, &con->state));
> +
>  	memcpy(&con->peer_addr, addr, sizeof(*addr));
>  	con->delay = 0;      /* reset backoff memory */
>  	queue_con(con);
> @@ -534,6 +538,8 @@ void ceph_con_init(struct ceph_connection *con, void
> *private,
>  	INIT_LIST_HEAD(&con->out_queue);
>  	INIT_LIST_HEAD(&con->out_sent);
>  	INIT_DELAYED_WORK(&con->work, con_work);
> +
> +	set_bit(CLOSED, &con->state);
>  }
>  EXPORT_SYMBOL(ceph_con_init);
> 
> -- 
> 1.7.5.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-06-01  4:24   ` Sage Weil
@ 2012-06-01 12:12     ` Alex Elder
  2012-06-01 13:30       ` Alex Elder
  2012-06-01 17:09     ` Alex Elder
  1 sibling, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-06-01 12:12 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 05/31/2012 11:24 PM, Sage Weil wrote:
> On Wed, 30 May 2012, Alex Elder wrote:
>> A monitor client has a pointer to a ceph connection structure in it.
>> This is the only one of the three ceph client types that do it this
>> way; the OSD and MDS clients embed the connection into their main
>> structures.  There is always exactly one ceph connection for a
>> monitor client, so there is no need to allocate it separate from the
>> monitor client structure.
>>
>> So switch the ceph_mon_client structure to embed its
>> ceph_connection structure.
>>
>> Signed-off-by: Alex Elder<elder@inktank.com>
>> ---
>>   include/linux/ceph/mon_client.h |    2 +-
>>   net/ceph/mon_client.c           |   47 ++++++++++++++++----------------------
>>   2 files changed, 21 insertions(+), 28 deletions(-)
>>
>> diff --git a/include/linux/ceph/mon_client.h b/include/linux/ceph/mon_client.h
>> index 545f859..2113e38 100644
>> --- a/include/linux/ceph/mon_client.h
>> +++ b/include/linux/ceph/mon_client.h
>> @@ -70,7 +70,7 @@ struct ceph_mon_client {
>>   	bool hunting;
>>   	int cur_mon;                       /* last monitor i contacted */
>>   	unsigned long sub_sent, sub_renew_after;
>> -	struct ceph_connection *con;
>> +	struct ceph_connection con;
>>   	bool have_fsid;
>>
>>   	/* pending generic requests */
>> diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
>> index 704dc95..ac4d6b1 100644
>> --- a/net/ceph/mon_client.c
>> +++ b/net/ceph/mon_client.c
>> @@ -106,9 +106,9 @@ static void __send_prepared_auth_request(struct
>> ceph_mon_client *monc, int len)
>>   	monc->pending_auth = 1;
>>   	monc->m_auth->front.iov_len = len;
>>   	monc->m_auth->hdr.front_len = cpu_to_le32(len);
>> -	ceph_con_revoke(monc->con, monc->m_auth);
>> +	ceph_con_revoke(&monc->con, monc->m_auth);
>>   	ceph_msg_get(monc->m_auth);  /* keep our ref */
>> -	ceph_con_send(monc->con, monc->m_auth);
>> +	ceph_con_send(&monc->con, monc->m_auth);
>>   }
>>
>>   /*
>> @@ -117,8 +117,8 @@ static void __send_prepared_auth_request(struct
>> ceph_mon_client *monc, int len)
>>   static void __close_session(struct ceph_mon_client *monc)
>>   {
>>   	dout("__close_session closing mon%d\n", monc->cur_mon);
>> -	ceph_con_revoke(monc->con, monc->m_auth);
>> -	ceph_con_close(monc->con);
>> +	ceph_con_revoke(&monc->con, monc->m_auth);
>> +	ceph_con_close(&monc->con);
>>   	monc->cur_mon = -1;
>>   	monc->pending_auth = 0;
>>   	ceph_auth_reset(monc->auth);
>> @@ -142,9 +142,9 @@ static int __open_session(struct ceph_mon_client *monc)
>>   		monc->want_next_osdmap = !!monc->want_next_osdmap;
>>
>>   		dout("open_session mon%d opening\n", monc->cur_mon);
>> -		monc->con->peer_name.type = CEPH_ENTITY_TYPE_MON;
>> -		monc->con->peer_name.num = cpu_to_le64(monc->cur_mon);
>> -		ceph_con_open(monc->con,
>> +		monc->con.peer_name.type = CEPH_ENTITY_TYPE_MON;
>> +		monc->con.peer_name.num = cpu_to_le64(monc->cur_mon);
>> +		ceph_con_open(&monc->con,
>>   			&monc->monmap->mon_inst[monc->cur_mon].addr);
>>
>>   		/* initiatiate authentication handshake */
>> @@ -226,8 +226,8 @@ static void __send_subscribe(struct ceph_mon_client *monc)
>>
>>   		msg->front.iov_len = p - msg->front.iov_base;
>>   		msg->hdr.front_len = cpu_to_le32(msg->front.iov_len);
>> -		ceph_con_revoke(monc->con, msg);
>> -		ceph_con_send(monc->con, ceph_msg_get(msg));
>> +		ceph_con_revoke(&monc->con, msg);
>> +		ceph_con_send(&monc->con, ceph_msg_get(msg));
>>
>>   		monc->sub_sent = jiffies | 1;  /* never 0 */
>>   	}
>> @@ -247,7 +247,7 @@ static void handle_subscribe_ack(struct ceph_mon_client
>> *monc,
>>   	if (monc->hunting) {
>>   		pr_info("mon%d %s session established\n",
>>   			monc->cur_mon,
>> -			ceph_pr_addr(&monc->con->peer_addr.in_addr));
>> +			ceph_pr_addr(&monc->con.peer_addr.in_addr));
>>   		monc->hunting = false;
>>   	}
>>   	dout("handle_subscribe_ack after %d seconds\n", seconds);
>> @@ -461,7 +461,7 @@ static int do_generic_request(struct ceph_mon_client
>> *monc,
>>   	req->request->hdr.tid = cpu_to_le64(req->tid);
>>   	__insert_generic_request(monc, req);
>>   	monc->num_generic_requests++;
>> -	ceph_con_send(monc->con, ceph_msg_get(req->request));
>> +	ceph_con_send(&monc->con, ceph_msg_get(req->request));
>>   	mutex_unlock(&monc->mutex);
>>
>>   	err = wait_for_completion_interruptible(&req->completion);
>> @@ -684,8 +684,8 @@ static void __resend_generic_request(struct
>> ceph_mon_client *monc)
>>
>>   	for (p = rb_first(&monc->generic_request_tree); p; p = rb_next(p)) {
>>   		req = rb_entry(p, struct ceph_mon_generic_request, node);
>> -		ceph_con_revoke(monc->con, req->request);
>> -		ceph_con_send(monc->con, ceph_msg_get(req->request));
>> +		ceph_con_revoke(&monc->con, req->request);
>> +		ceph_con_send(&monc->con, ceph_msg_get(req->request));
>>   	}
>>   }
>>
>> @@ -705,7 +705,7 @@ static void delayed_work(struct work_struct *work)
>>   		__close_session(monc);
>>   		__open_session(monc);  /* continue hunting */
>>   	} else {
>> -		ceph_con_keepalive(monc->con);
>> +		ceph_con_keepalive(&monc->con);
>>
>>   		__validate_auth(monc);
>>
>> @@ -760,19 +760,16 @@ int ceph_monc_init(struct ceph_mon_client *monc, struct
>> ceph_client *cl)
>>   		goto out;
>>
>>   	/* connection */
>> -	monc->con = kmalloc(sizeof(*monc->con), GFP_KERNEL);
>> -	if (!monc->con)
>> -		goto out_monmap;
>> -	ceph_con_init(&monc->client->msgr, monc->con);
>> -	monc->con->private = monc;
>> -	monc->con->ops =&mon_con_ops;
>> +	ceph_con_init(&monc->client->msgr,&monc->con);
>> +	monc->con.private = monc;
>> +	monc->con.ops =&mon_con_ops;
>>
>>   	/* authentication */
>>   	monc->auth = ceph_auth_init(cl->options->name,
>>   				    cl->options->key);
>>   	if (IS_ERR(monc->auth)) {
>>   		err = PTR_ERR(monc->auth);
>> -		goto out_con;
>> +		goto out_monmap;
>>   	}
>>   	monc->auth->want_keys =
>>   		CEPH_ENTITY_TYPE_AUTH | CEPH_ENTITY_TYPE_MON |
>> @@ -824,8 +821,6 @@ out_subscribe_ack:
>>   	ceph_msg_put(monc->m_subscribe_ack);
>>   out_auth:
>>   	ceph_auth_destroy(monc->auth);
>> -out_con:
>> -	monc->con->ops->put(monc->con);
>
> AH!
>
> This reminds me, these connections need to be refcounted.  There's a
> ->get() and ->put() op defined so that you can refcount the containing
> structure.  That means that this patch needs to alo change
>
> static const struct ceph_connection_operations mon_con_ops = {
> 	.get = ceph_con_get,
> 	.put = ceph_con_put,
>
> in mon_client.c.  Hopefully the mon_client itself is refcounted, *or* we
> can ensure that it won't go away before the msgr workqueue is drained and
> the get/put ops can turn to no-ops.


Earlier I looked at the ref counting stuff a bit and stopped myself
from going off on that tangent.  But it didn't look like it was used
consistently and made a note to myself to revisit it.

> Also: when poking around, I noticed that ceph_con_get() and put() are
> called directly from osd_client.c... that's a bug!  Those connections have
> a get and put op defined that twiddles the containing ceph_osd struct's
> ref count.
>
> I pushed several patches to your latest (wip-messenger-2) branch that fix
> these issues.  Compile tested only!  The first should probably be folded
> into this one, the others follow.

I'll look at your patches and incorporate them as appropriate.  But at
the moment I don't see them; whenever you are back online again perhaps
you'll send me a link.

					-Alex

>
>>   out_monmap:
>>   	kfree(monc->monmap);
>>   out:
>> @@ -841,9 +836,7 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
>>   	mutex_lock(&monc->mutex);
>>   	__close_session(monc);
>>
>> -	monc->con->private = NULL;
>> -	monc->con->ops->put(monc->con);
>> -	monc->con = NULL;
>> +	monc->con.private = NULL;
>>
>>   	mutex_unlock(&monc->mutex);
>>
>> @@ -1021,7 +1014,7 @@ static void mon_fault(struct ceph_connection *con)
>>   	if (!monc->hunting)
>>   		pr_info("mon%d %s session lost, "
>>   			"hunting for new mon\n", monc->cur_mon,
>> -			ceph_pr_addr(&monc->con->peer_addr.in_addr));
>> +			ceph_pr_addr(&monc->con.peer_addr.in_addr));
>>
>>   	__close_session(monc);
>>   	if (!monc->hunting) {
>> --
>> 1.7.5.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 08/13] libceph: start separating connection flags from state
  2012-06-01  4:25   ` Sage Weil
@ 2012-06-01 12:13     ` Alex Elder
  0 siblings, 0 replies; 44+ messages in thread
From: Alex Elder @ 2012-06-01 12:13 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 05/31/2012 11:25 PM, Sage Weil wrote:
> On Wed, 30 May 2012, Alex Elder wrote:
>> A ceph_connection holds a mixture of connection state (as in "state
>> machine" state) and connection flags in a single "state" field.  To
>> make the distinction more clear, define a new "flags" field and use
>> it rather than the "state" field to hold Boolean flag values.
>>
>> Signed-off-by: Alex Elder<elder@inktank.com>
>> ---
>>   include/linux/ceph/messenger.h |   18 +++++++++----
>>   net/ceph/messenger.c           |   50
>> ++++++++++++++++++++--------------------
>>   2 files changed, 37 insertions(+), 31 deletions(-)
>>
>> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
>> index 3fbd4be..920235e 100644
>> --- a/include/linux/ceph/messenger.h
>> +++ b/include/linux/ceph/messenger.h
>> @@ -103,20 +103,25 @@ struct ceph_msg_pos {
>>   #define MAX_DELAY_INTERVAL	(5 * 60 * HZ)
>>
>>   /*
>> - * ceph_connection state bit flags
>> + * ceph_connection flag bits
>>    */
>> +
>>   #define LOSSYTX         0  /* we can close channel or drop messages on errors
>> */
>> -#define CONNECTING	1
>> -#define NEGOTIATING	2
>>   #define KEEPALIVE_PENDING      3
>>   #define WRITE_PENDING	4  /* we have data ready to send */
>> +#define SOCK_CLOSED	11 /* socket state changed to closed */
>> +#define BACKOFF         15
>> +
>> +/*
>> + * ceph_connection states
>> + */
>> +#define CONNECTING	1
>> +#define NEGOTIATING	2
>>   #define STANDBY		8  /* no outgoing messages, socket closed.  we
>> keep
>>   			    * the ceph_connection around to maintain shared
>>   			    * state with the peer. */
>>   #define CLOSED		10 /* we've closed the connection */
>> -#define SOCK_CLOSED	11 /* socket state changed to closed */
>>   #define OPENING         13 /* open connection w/ (possibly new) peer */
>> -#define BACKOFF         15
>
> Later it might be work prefixing these with FLAG_ and/or STATE_.

Absolutely, I'm saving that easy stuff for the end.  I'll move the
definitions into messenger.c as well if I can.

> Reviewed-by: Sage Weil<sage@inktank.com>

Thanks.

					-Alex

>>
>>   /*
>>    * A single connection with another host.
>> @@ -133,7 +138,8 @@ struct ceph_connection {
>>
>>   	struct ceph_messenger *msgr;
>>   	struct socket *sock;
>> -	unsigned long state;	/* connection state (see flags above) */
>> +	unsigned long flags;
>> +	unsigned long state;
>>   	const char *error_msg;  /* error message, if any */
>>
>>   	struct ceph_entity_addr peer_addr; /* peer address */
>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>> index 19f1948..29055df 100644
>> --- a/net/ceph/messenger.c
>> +++ b/net/ceph/messenger.c
>> @@ -176,7 +176,7 @@ static void ceph_sock_write_space(struct sock *sk)
>>   	 * buffer. See net/ipv4/tcp_input.c:tcp_check_space()
>>   	 * and net/core/stream.c:sk_stream_write_space().
>>   	 */
>> -	if (test_bit(WRITE_PENDING,&con->state)) {
>> +	if (test_bit(WRITE_PENDING,&con->flags)) {
>>   		if (sk_stream_wspace(sk)>= sk_stream_min_wspace(sk)) {
>>   			dout("%s %p queueing write work\n", __func__, con);
>>   			clear_bit(SOCK_NOSPACE,&sk->sk_socket->flags);
>> @@ -203,7 +203,7 @@ static void ceph_sock_state_change(struct sock *sk)
>>   		dout("%s TCP_CLOSE\n", __func__);
>>   	case TCP_CLOSE_WAIT:
>>   		dout("%s TCP_CLOSE_WAIT\n", __func__);
>> -		if (test_and_set_bit(SOCK_CLOSED,&con->state) == 0) {
>> +		if (test_and_set_bit(SOCK_CLOSED,&con->flags) == 0) {
>>   			if (test_bit(CONNECTING,&con->state))
>>   				con->error_msg = "connection failed";
>>   			else
>> @@ -393,9 +393,9 @@ void ceph_con_close(struct ceph_connection *con)
>>   	     ceph_pr_addr(&con->peer_addr.in_addr));
>>   	set_bit(CLOSED,&con->state);  /* in case there's queued work */
>>   	clear_bit(STANDBY,&con->state);  /* avoid connect_seq bump */
>> -	clear_bit(LOSSYTX,&con->state);  /* so we retry next connect */
>> -	clear_bit(KEEPALIVE_PENDING,&con->state);
>> -	clear_bit(WRITE_PENDING,&con->state);
>> +	clear_bit(LOSSYTX,&con->flags);  /* so we retry next connect */
>> +	clear_bit(KEEPALIVE_PENDING,&con->flags);
>> +	clear_bit(WRITE_PENDING,&con->flags);
>>   	mutex_lock(&con->mutex);
>>   	reset_connection(con);
>>   	con->peer_global_seq = 0;
>> @@ -612,7 +612,7 @@ static void prepare_write_message(struct ceph_connection
>> *con)
>>   		prepare_write_message_footer(con);
>>   	}
>>
>> -	set_bit(WRITE_PENDING,&con->state);
>> +	set_bit(WRITE_PENDING,&con->flags);
>>   }
>>
>>   /*
>> @@ -633,7 +633,7 @@ static void prepare_write_ack(struct ceph_connection *con)
>>   				&con->out_temp_ack);
>>
>>   	con->out_more = 1;  /* more will follow.. eventually.. */
>> -	set_bit(WRITE_PENDING,&con->state);
>> +	set_bit(WRITE_PENDING,&con->flags);
>>   }
>>
>>   /*
>> @@ -644,7 +644,7 @@ static void prepare_write_keepalive(struct ceph_connection
>> *con)
>>   	dout("prepare_write_keepalive %p\n", con);
>>   	con_out_kvec_reset(con);
>>   	con_out_kvec_add(con, sizeof (tag_keepalive),&tag_keepalive);
>> -	set_bit(WRITE_PENDING,&con->state);
>> +	set_bit(WRITE_PENDING,&con->flags);
>>   }
>>
>>   /*
>> @@ -673,7 +673,7 @@ static struct ceph_auth_handshake
>> *get_connect_authorizer(struct ceph_connection
>>
>>   	if (IS_ERR(auth))
>>   		return auth;
>> -	if (test_bit(CLOSED,&con->state) || test_bit(OPENING,&con->state))
>> +	if (test_bit(CLOSED,&con->state) || test_bit(OPENING,&con->flags))
>>   		return ERR_PTR(-EAGAIN);
>>
>>   	con->auth_reply_buf = auth->authorizer_reply_buf;
>> @@ -693,7 +693,7 @@ static void prepare_write_banner(struct ceph_connection
>> *con)
>>   					&con->msgr->my_enc_addr);
>>
>>   	con->out_more = 0;
>> -	set_bit(WRITE_PENDING,&con->state);
>> +	set_bit(WRITE_PENDING,&con->flags);
>>   }
>>
>>   static int prepare_write_connect(struct ceph_connection *con)
>> @@ -743,7 +743,7 @@ static int prepare_write_connect(struct ceph_connection
>> *con)
>>   					auth->authorizer_buf);
>>
>>   	con->out_more = 0;
>> -	set_bit(WRITE_PENDING,&con->state);
>> +	set_bit(WRITE_PENDING,&con->flags);
>>
>>   	return 0;
>>   }
>> @@ -1490,7 +1490,7 @@ static int process_connect(struct ceph_connection *con)
>>   			le32_to_cpu(con->in_reply.connect_seq));
>>
>>   		if (con->in_reply.flags&  CEPH_MSG_CONNECT_LOSSY)
>> -			set_bit(LOSSYTX,&con->state);
>> +			set_bit(LOSSYTX,&con->flags);
>>
>>   		prepare_read_tag(con);
>>   		break;
>> @@ -1931,14 +1931,14 @@ do_next:
>>   			prepare_write_ack(con);
>>   			goto more;
>>   		}
>> -		if (test_and_clear_bit(KEEPALIVE_PENDING,&con->state)) {
>> +		if (test_and_clear_bit(KEEPALIVE_PENDING,&con->flags)) {
>>   			prepare_write_keepalive(con);
>>   			goto more;
>>   		}
>>   	}
>>
>>   	/* Nothing to do! */
>> -	clear_bit(WRITE_PENDING,&con->state);
>> +	clear_bit(WRITE_PENDING,&con->flags);
>>   	dout("try_write nothing else to write.\n");
>>   	ret = 0;
>>   out:
>> @@ -2104,7 +2104,7 @@ static void con_work(struct work_struct *work)
>>
>>   	mutex_lock(&con->mutex);
>>   restart:
>> -	if (test_and_clear_bit(BACKOFF,&con->state)) {
>> +	if (test_and_clear_bit(BACKOFF,&con->flags)) {
>>   		dout("con_work %p backing off\n", con);
>>   		if (queue_delayed_work(ceph_msgr_wq,&con->work,
>>   				       round_jiffies_relative(con->delay))) {
>> @@ -2133,7 +2133,7 @@ restart:
>>   		con_close_socket(con);
>>   	}
>>
>> -	if (test_and_clear_bit(SOCK_CLOSED,&con->state))
>> +	if (test_and_clear_bit(SOCK_CLOSED,&con->flags))
>>   		goto fault;
>>
>>   	ret = try_read(con);
>> @@ -2172,7 +2172,7 @@ static void ceph_fault(struct ceph_connection *con)
>>   	dout("fault %p state %lu to peer %s\n",
>>   	     con, con->state, ceph_pr_addr(&con->peer_addr.in_addr));
>>
>> -	if (test_bit(LOSSYTX,&con->state)) {
>> +	if (test_bit(LOSSYTX,&con->flags)) {
>>   		dout("fault on LOSSYTX channel\n");
>>   		goto out;
>>   	}
>> @@ -2194,9 +2194,9 @@ static void ceph_fault(struct ceph_connection *con)
>>   	/* If there are no messages queued or keepalive pending, place
>>   	 * the connection in a STANDBY state */
>>   	if (list_empty(&con->out_queue)&&
>> -	    !test_bit(KEEPALIVE_PENDING,&con->state)) {
>> +	    !test_bit(KEEPALIVE_PENDING,&con->flags)) {
>>   		dout("fault %p setting STANDBY clearing WRITE_PENDING\n",
>> con);
>> -		clear_bit(WRITE_PENDING,&con->state);
>> +		clear_bit(WRITE_PENDING,&con->flags);
>>   		set_bit(STANDBY,&con->state);
>>   	} else {
>>   		/* retry after a delay. */
>> @@ -2220,7 +2220,7 @@ static void ceph_fault(struct ceph_connection *con)
>>   			 * that when con_work restarts we schedule the
>>   			 * delay then.
>>   			 */
>> -			set_bit(BACKOFF,&con->state);
>> +			set_bit(BACKOFF,&con->flags);
>>   		}
>>   	}
>>
>> @@ -2276,8 +2276,8 @@ static void clear_standby(struct ceph_connection *con)
>>   		mutex_lock(&con->mutex);
>>   		dout("clear_standby %p and ++connect_seq\n", con);
>>   		con->connect_seq++;
>> -		WARN_ON(test_bit(WRITE_PENDING,&con->state));
>> -		WARN_ON(test_bit(KEEPALIVE_PENDING,&con->state));
>> +		WARN_ON(test_bit(WRITE_PENDING,&con->flags));
>> +		WARN_ON(test_bit(KEEPALIVE_PENDING,&con->flags));
>>   		mutex_unlock(&con->mutex);
>>   	}
>>   }
>> @@ -2315,7 +2315,7 @@ void ceph_con_send(struct ceph_connection *con, struct
>> ceph_msg *msg)
>>   	/* if there wasn't anything waiting to send before, queue
>>   	 * new work */
>>   	clear_standby(con);
>> -	if (test_and_set_bit(WRITE_PENDING,&con->state) == 0)
>> +	if (test_and_set_bit(WRITE_PENDING,&con->flags) == 0)
>>   		queue_con(con);
>>   }
>>   EXPORT_SYMBOL(ceph_con_send);
>> @@ -2382,8 +2382,8 @@ void ceph_con_keepalive(struct ceph_connection *con)
>>   {
>>   	dout("con_keepalive %p\n", con);
>>   	clear_standby(con);
>> -	if (test_and_set_bit(KEEPALIVE_PENDING,&con->state) == 0&&
>> -	    test_and_set_bit(WRITE_PENDING,&con->state) == 0)
>> +	if (test_and_set_bit(KEEPALIVE_PENDING,&con->flags) == 0&&
>> +	    test_and_set_bit(WRITE_PENDING,&con->flags) == 0)
>>   		queue_con(con);
>>   }
>>   EXPORT_SYMBOL(ceph_con_keepalive);
>> --
>> 1.7.5.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/13] libceph: start tracking connection socket state
  2012-06-01  4:28   ` Sage Weil
@ 2012-06-01 12:15     ` Alex Elder
  0 siblings, 0 replies; 44+ messages in thread
From: Alex Elder @ 2012-06-01 12:15 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 05/31/2012 11:28 PM, Sage Weil wrote:
> On Wed, 30 May 2012, Alex Elder wrote:
>> Start explicitly keeping track of the state of a ceph connection's
>> socket, separate from the state of the connection itself.  Create
>> placeholder functions to encapsulate the state transitions.
>>
>>      --------
>>      | NEW* |  transient initial state
>>      --------
>>          | con_sock_state_init()
>>          v
>>      ----------
>>      | CLOSED |  initialized, but no socket (and no
>>      ----------  TCP connection)
>>       ^      \
>>       |       \ con_sock_state_connecting()
>>       |        ----------------------
>>       |                              \
>>       + con_sock_state_closed()       \
>>       |\                               \
>>       | \                               \
>>       |  -----------                     \
>>       |  | CLOSING |  socket event;       \
>>       |  -----------  await close          \
>>       |       ^                            |
>>       |       |                            |
>>       |       + con_sock_state_closing()   |
>>       |      / \                           |
>>       |     /   ---------------            |
>>       |    /                   \           v
>>       |   /                    --------------
>>       |  /    -----------------| CONNECTING |  socket created, TCP
>>       |  |   /                 --------------  connect initiated
>>       |  |   | con_sock_state_connected()
>>       |  |   v
>>      -------------
>>      | CONNECTED |  TCP connection established
>>      -------------
>
> Can we put this beautiful pictures in the header next to the states?

I can be quite the ASCII artist.  Yes, I will add this, when I update
the state definitions with better names and numbers.

					-Alex

> Reviewed-by: Sage Weil<sage@inktank.com>
>
>>
>> Make the socket state an atomic variable, reinforcing that it's a
>> distinct transtion with no possible "intermediate/both" states.
>> This is almost certainly overkill at this point, though the
>> transitions into CONNECTED and CLOSING state do get called via
>> socket callback (the rest of the transitions occur with the
>> connection mutex held).  We can back out the atomicity later.
>>
>> Signed-off-by: Alex Elder<elder@inktank.com>
>> ---
>>   include/linux/ceph/messenger.h |    8 ++++-
>>   net/ceph/messenger.c           |   63
>> ++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 69 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
>> index 920235e..5e852f4 100644
>> --- a/include/linux/ceph/messenger.h
>> +++ b/include/linux/ceph/messenger.h
>> @@ -137,14 +137,18 @@ struct ceph_connection {
>>   	const struct ceph_connection_operations *ops;
>>
>>   	struct ceph_messenger *msgr;
>> +
>> +	atomic_t sock_state;
>>   	struct socket *sock;
>> +	struct ceph_entity_addr peer_addr; /* peer address */
>> +	struct ceph_entity_addr peer_addr_for_me;
>> +
>>   	unsigned long flags;
>>   	unsigned long state;
>>   	const char *error_msg;  /* error message, if any */
>>
>> -	struct ceph_entity_addr peer_addr; /* peer address */
>>   	struct ceph_entity_name peer_name; /* peer name */
>> -	struct ceph_entity_addr peer_addr_for_me;
>> +
>>   	unsigned peer_features;
>>   	u32 connect_seq;      /* identify the most recent connection
>>   				 attempt for this connection, client */
>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>> index 29055df..7e11b07 100644
>> --- a/net/ceph/messenger.c
>> +++ b/net/ceph/messenger.c
>> @@ -29,6 +29,14 @@
>>    * the sender.
>>    */
>>
>> +/* State values for ceph_connection->sock_state; NEW is assumed to be 0 */
>> +
>> +#define CON_SOCK_STATE_NEW		0	/* ->  CLOSED */
>> +#define CON_SOCK_STATE_CLOSED		1	/* ->  CONNECTING */
>> +#define CON_SOCK_STATE_CONNECTING	2	/* ->  CONNECTED or ->  CLOSING
>> */
>> +#define CON_SOCK_STATE_CONNECTED	3	/* ->  CLOSING or ->  CLOSED */
>> +#define CON_SOCK_STATE_CLOSING		4	/* ->  CLOSED */
>> +
>>   /* static tag bytes (protocol control messages) */
>>   static char tag_msg = CEPH_MSGR_TAG_MSG;
>>   static char tag_ack = CEPH_MSGR_TAG_ACK;
>> @@ -147,6 +155,54 @@ void ceph_msgr_flush(void)
>>   }
>>   EXPORT_SYMBOL(ceph_msgr_flush);
>>
>> +/* Connection socket state transition functions */
>> +
>> +static void con_sock_state_init(struct ceph_connection *con)
>> +{
>> +	int old_state;
>> +
>> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
>> +	if (WARN_ON(old_state != CON_SOCK_STATE_NEW))
>> +		printk("%s: unexpected old state %d\n", __func__, old_state);
>> +}
>> +
>> +static void con_sock_state_connecting(struct ceph_connection *con)
>> +{
>> +	int old_state;
>> +
>> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTING);
>> +	if (WARN_ON(old_state != CON_SOCK_STATE_CLOSED))
>> +		printk("%s: unexpected old state %d\n", __func__, old_state);
>> +}
>> +
>> +static void con_sock_state_connected(struct ceph_connection *con)
>> +{
>> +	int old_state;
>> +
>> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTED);
>> +	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING))
>> +		printk("%s: unexpected old state %d\n", __func__, old_state);
>> +}
>> +
>> +static void con_sock_state_closing(struct ceph_connection *con)
>> +{
>> +	int old_state;
>> +
>> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSING);
>> +	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING&&
>> +			old_state != CON_SOCK_STATE_CONNECTED))
>> +		printk("%s: unexpected old state %d\n", __func__, old_state);
>> +}
>> +
>> +static void con_sock_state_closed(struct ceph_connection *con)
>> +{
>> +	int old_state;
>> +
>> +	old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
>> +	if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTED&&
>> +			old_state != CON_SOCK_STATE_CLOSING))
>> +		printk("%s: unexpected old state %d\n", __func__, old_state);
>> +}
>>
>>   /*
>>    * socket callback functions
>> @@ -203,6 +259,7 @@ static void ceph_sock_state_change(struct sock *sk)
>>   		dout("%s TCP_CLOSE\n", __func__);
>>   	case TCP_CLOSE_WAIT:
>>   		dout("%s TCP_CLOSE_WAIT\n", __func__);
>> +		con_sock_state_closing(con);
>>   		if (test_and_set_bit(SOCK_CLOSED,&con->flags) == 0) {
>>   			if (test_bit(CONNECTING,&con->state))
>>   				con->error_msg = "connection failed";
>> @@ -213,6 +270,7 @@ static void ceph_sock_state_change(struct sock *sk)
>>   		break;
>>   	case TCP_ESTABLISHED:
>>   		dout("%s TCP_ESTABLISHED\n", __func__);
>> +		con_sock_state_connected(con);
>>   		queue_con(con);
>>   		break;
>>   	default:	/* Everything else is uninteresting */
>> @@ -277,6 +335,7 @@ static int ceph_tcp_connect(struct ceph_connection *con)
>>   		return ret;
>>   	}
>>   	con->sock = sock;
>> +	con_sock_state_connecting(con);
>>
>>   	return 0;
>>   }
>> @@ -341,6 +400,7 @@ static int con_close_socket(struct ceph_connection *con)
>>   	rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
>>   	sock_release(con->sock);
>>   	con->sock = NULL;
>> +	con_sock_state_closed(con);
>>   	return rc;
>>   }
>>
>> @@ -460,6 +520,9 @@ void ceph_con_init(struct ceph_messenger *msgr, struct
>> ceph_connection *con)
>>   	memset(con, 0, sizeof(*con));
>>   	atomic_set(&con->nref, 1);
>>   	con->msgr = msgr;
>> +
>> +	con_sock_state_init(con);
>> +
>>   	mutex_init(&con->mutex);
>>   	INIT_LIST_HEAD(&con->out_queue);
>>   	INIT_LIST_HEAD(&con->out_sent);
>> --
>> 1.7.5.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-06-01 12:12     ` Alex Elder
@ 2012-06-01 13:30       ` Alex Elder
  2012-06-01 16:20         ` Sage Weil
  0 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-06-01 13:30 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 06/01/2012 07:12 AM, Alex Elder wrote:
> On 05/31/2012 11:24 PM, Sage Weil wrote:
>> On Wed, 30 May 2012, Alex Elder wrote:
>>> A monitor client has a pointer to a ceph connection structure in it.
>>> This is the only one of the three ceph client types that do it this
>>> way; the OSD and MDS clients embed the connection into their main
>>> structures. There is always exactly one ceph connection for a
>>> monitor client, so there is no need to allocate it separate from the
>>> monitor client structure.
>>>
>>> So switch the ceph_mon_client structure to embed its
>>> ceph_connection structure.
>>>
>>> Signed-off-by: Alex Elder<elder@inktank.com>

. . .

>>> /* authentication */
>>> monc->auth = ceph_auth_init(cl->options->name,
>>> cl->options->key);
>>> if (IS_ERR(monc->auth)) {
>>> err = PTR_ERR(monc->auth);
>>> - goto out_con;
>>> + goto out_monmap;
>>> }
>>> monc->auth->want_keys =
>>> CEPH_ENTITY_TYPE_AUTH | CEPH_ENTITY_TYPE_MON |
>>> @@ -824,8 +821,6 @@ out_subscribe_ack:
>>> ceph_msg_put(monc->m_subscribe_ack);
>>> out_auth:
>>> ceph_auth_destroy(monc->auth);
>>> -out_con:
>>> - monc->con->ops->put(monc->con);
>>
>> AH!
>>
>> This reminds me, these connections need to be refcounted. There's a
>> ->get() and ->put() op defined so that you can refcount the containing
>> structure. That means that this patch needs to alo change

Looking at this again.  Why do they need to be refcounted?  If
this patch is correct in embedding the connection into the
containing structure, then the last reference to the containing
structure is coincident with with the last reference to the
connection.  And the other connections are already embedded into
other containing structures.

So--again assuming it's OK to embed the connection--I would rather
see the ->get and ->put methods for the connection go away entirely
and have the containing structure take care of its own damned ref
counting...

This actually gets into another thing I wanted to do anyway (while
digging through raw memory trying to figure out what's going on).
I want every ceph_message to point back to the connection it is
associated with.  That way there's no need for the OSD (for example)
to keep track of the connection--a revoke is simply an operation
on the message, which would already know the connection from which
it is being revoked.

If you think the above approach is good I'll gladly do it now
rather than later.  I think it might eliminate the need for
any need to reference count the connections.

Anyway, when I'm done (if I ever finish!) the state of the connection
will tell you whether any legitimate uses of a connection remain.

					-Alex

>> static const struct ceph_connection_operations mon_con_ops = {
>> .get = ceph_con_get,
>> .put = ceph_con_put,
>>
>> in mon_client.c. Hopefully the mon_client itself is refcounted, *or* we
>> can ensure that it won't go away before the msgr workqueue is drained and
>> the get/put ops can turn to no-ops.
>
>
> Earlier I looked at the ref counting stuff a bit and stopped myself
> from going off on that tangent. But it didn't look like it was used
> consistently and made a note to myself to revisit it.
>
>> Also: when poking around, I noticed that ceph_con_get() and put() are
>> called directly from osd_client.c... that's a bug! Those connections have
>> a get and put op defined that twiddles the containing ceph_osd struct's
>> ref count.
>>
>> I pushed several patches to your latest (wip-messenger-2) branch that fix
>> these issues. Compile tested only! The first should probably be folded
>> into this one, the others follow.
>
> I'll look at your patches and incorporate them as appropriate. But at
> the moment I don't see them; whenever you are back online again perhaps
> you'll send me a link.
>
> -Alex
>
>>
>>> out_monmap:
>>> kfree(monc->monmap);
>>> out:
>>> @@ -841,9 +836,7 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
>>> mutex_lock(&monc->mutex);
>>> __close_session(monc);
>>>
>>> - monc->con->private = NULL;
>>> - monc->con->ops->put(monc->con);
>>> - monc->con = NULL;
>>> + monc->con.private = NULL;
>>>
>>> mutex_unlock(&monc->mutex);
>>>
>>> @@ -1021,7 +1014,7 @@ static void mon_fault(struct ceph_connection *con)
>>> if (!monc->hunting)
>>> pr_info("mon%d %s session lost, "
>>> "hunting for new mon\n", monc->cur_mon,
>>> - ceph_pr_addr(&monc->con->peer_addr.in_addr));
>>> + ceph_pr_addr(&monc->con.peer_addr.in_addr));
>>>
>>> __close_session(monc);
>>> if (!monc->hunting) {
>>> --
>>> 1.7.5.4
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>
>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-06-01 13:30       ` Alex Elder
@ 2012-06-01 16:20         ` Sage Weil
  2012-06-01 16:32           ` Alex Elder
  0 siblings, 1 reply; 44+ messages in thread
From: Sage Weil @ 2012-06-01 16:20 UTC (permalink / raw)
  To: Alex Elder; +Cc: Sage Weil, ceph-devel

On Fri, 1 Jun 2012, Alex Elder wrote:
> On 06/01/2012 07:12 AM, Alex Elder wrote:
> > On 05/31/2012 11:24 PM, Sage Weil wrote:
> > > On Wed, 30 May 2012, Alex Elder wrote:
> > > > A monitor client has a pointer to a ceph connection structure in it.
> > > > This is the only one of the three ceph client types that do it this
> > > > way; the OSD and MDS clients embed the connection into their main
> > > > structures. There is always exactly one ceph connection for a
> > > > monitor client, so there is no need to allocate it separate from the
> > > > monitor client structure.
> > > > 
> > > > So switch the ceph_mon_client structure to embed its
> > > > ceph_connection structure.
> > > > 
> > > > Signed-off-by: Alex Elder<elder@inktank.com>
> 
> . . .
> 
> > > > /* authentication */
> > > > monc->auth = ceph_auth_init(cl->options->name,
> > > > cl->options->key);
> > > > if (IS_ERR(monc->auth)) {
> > > > err = PTR_ERR(monc->auth);
> > > > - goto out_con;
> > > > + goto out_monmap;
> > > > }
> > > > monc->auth->want_keys =
> > > > CEPH_ENTITY_TYPE_AUTH | CEPH_ENTITY_TYPE_MON |
> > > > @@ -824,8 +821,6 @@ out_subscribe_ack:
> > > > ceph_msg_put(monc->m_subscribe_ack);
> > > > out_auth:
> > > > ceph_auth_destroy(monc->auth);
> > > > -out_con:
> > > > - monc->con->ops->put(monc->con);
> > > 
> > > AH!
> > > 
> > > This reminds me, these connections need to be refcounted. There's a
> > > ->get() and ->put() op defined so that you can refcount the containing
> > > structure. That means that this patch needs to alo change
> 
> Looking at this again.  Why do they need to be refcounted?  If
> this patch is correct in embedding the connection into the
> containing structure, then the last reference to the containing
> structure is coincident with with the last reference to the
> connection.  And the other connections are already embedded into
> other containing structures.
> 
> So--again assuming it's OK to embed the connection--I would rather
> see the ->get and ->put methods for the connection go away entirely
> and have the containing structure take care of its own damned ref
> counting...

The problem is that socket events queue work, which can take a while, and 
race with, say, osd_client getting an osdmap and dropping it's 
struct ceph_osd.  The ->get and ->put ops just twiddle the containing 
struct's refcount, in that case, so the con_work will find the (now 
closed) ceph_connection and do nothing...

> This actually gets into another thing I wanted to do anyway (while
> digging through raw memory trying to figure out what's going on).
> I want every ceph_message to point back to the connection it is
> associated with.  That way there's no need for the OSD (for example)
> to keep track of the connection--a revoke is simply an operation
> on the message, which would already know the connection from which
> it is being revoked.
> 
> If you think the above approach is good I'll gladly do it now
> rather than later.  I think it might eliminate the need for
> any need to reference count the connections.

That sounds reasonable.. but i'm pretty sure the con refcounts can't go 
away :)

sage


> 
> Anyway, when I'm done (if I ever finish!) the state of the connection
> will tell you whether any legitimate uses of a connection remain.
> 
> 					-Alex
> 
> > > static const struct ceph_connection_operations mon_con_ops = {
> > > .get = ceph_con_get,
> > > .put = ceph_con_put,
> > > 
> > > in mon_client.c. Hopefully the mon_client itself is refcounted, *or* we
> > > can ensure that it won't go away before the msgr workqueue is drained and
> > > the get/put ops can turn to no-ops.
> > 
> > 
> > Earlier I looked at the ref counting stuff a bit and stopped myself
> > from going off on that tangent. But it didn't look like it was used
> > consistently and made a note to myself to revisit it.
> > 
> > > Also: when poking around, I noticed that ceph_con_get() and put() are
> > > called directly from osd_client.c... that's a bug! Those connections have
> > > a get and put op defined that twiddles the containing ceph_osd struct's
> > > ref count.
> > > 
> > > I pushed several patches to your latest (wip-messenger-2) branch that fix
> > > these issues. Compile tested only! The first should probably be folded
> > > into this one, the others follow.
> > 
> > I'll look at your patches and incorporate them as appropriate. But at
> > the moment I don't see them; whenever you are back online again perhaps
> > you'll send me a link.
> > 
> > -Alex
> > 
> > > 
> > > > out_monmap:
> > > > kfree(monc->monmap);
> > > > out:
> > > > @@ -841,9 +836,7 @@ void ceph_monc_stop(struct ceph_mon_client *monc)
> > > > mutex_lock(&monc->mutex);
> > > > __close_session(monc);
> > > > 
> > > > - monc->con->private = NULL;
> > > > - monc->con->ops->put(monc->con);
> > > > - monc->con = NULL;
> > > > + monc->con.private = NULL;
> > > > 
> > > > mutex_unlock(&monc->mutex);
> > > > 
> > > > @@ -1021,7 +1014,7 @@ static void mon_fault(struct ceph_connection *con)
> > > > if (!monc->hunting)
> > > > pr_info("mon%d %s session lost, "
> > > > "hunting for new mon\n", monc->cur_mon,
> > > > - ceph_pr_addr(&monc->con->peer_addr.in_addr));
> > > > + ceph_pr_addr(&monc->con.peer_addr.in_addr));
> > > > 
> > > > __close_session(monc);
> > > > if (!monc->hunting) {
> > > > --
> > > > 1.7.5.4
> > > > 
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > 
> > > > 
> > > 
> > 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-06-01 16:20         ` Sage Weil
@ 2012-06-01 16:32           ` Alex Elder
  2012-06-01 16:39             ` Sage Weil
  0 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-06-01 16:32 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 06/01/2012 11:20 AM, Sage Weil wrote:
> The problem is that socket events queue work, which can take a while, and
> race with, say, osd_client getting an osdmap and dropping it's
> struct ceph_osd.  The ->get and ->put ops just twiddle the containing
> struct's refcount, in that case, so the con_work will find the (now
> closed) ceph_connection and do nothing...

I think you're saying that the connection (or its socket) needs to
be protected from its containing structure going away.  So the
connection needs to hold a reference to its container.  If that's
the case then the disposal of the ceph_osd needs to clean up
the connection fully before it goes away.

Anyway, I think I see why there might be a need for the ref counts
and they obviously won't go away if they're needed...

					-Alex

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-06-01 16:32           ` Alex Elder
@ 2012-06-01 16:39             ` Sage Weil
  0 siblings, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01 16:39 UTC (permalink / raw)
  To: Alex Elder; +Cc: Sage Weil, ceph-devel

On Fri, 1 Jun 2012, Alex Elder wrote:
> On 06/01/2012 11:20 AM, Sage Weil wrote:
> > The problem is that socket events queue work, which can take a while, and
> > race with, say, osd_client getting an osdmap and dropping it's
> > struct ceph_osd.  The ->get and ->put ops just twiddle the containing
> > struct's refcount, in that case, so the con_work will find the (now
> > closed) ceph_connection and do nothing...
> 
> I think you're saying that the connection (or its socket) needs to
> be protected from its containing structure going away.  So the
> connection needs to hold a reference to its container.  If that's
> the case then the disposal of the ceph_osd needs to clean up
> the connection fully before it goes away.

Yeah.  I think it happens already before we drop the ref:

static void __remove_osd(struct ceph_osd_client *osdc, struct ceph_osd *osd)
{
	dout("__remove_osd %p\n", osd);
	BUG_ON(!list_empty(&osd->o_requests));
	rb_erase(&osd->o_node, &osdc->osds);
	list_del_init(&osd->o_osd_lru);
	ceph_con_close(&osd->o_con);
	put_osd(osd);
}

So it's just the con reference in the workqueue that matters.

sage



> 
> Anyway, I think I see why there might be a need for the ref counts
> and they obviously won't go away if they're needed...
> 
> 					-Alex
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-06-01  4:24   ` Sage Weil
  2012-06-01 12:12     ` Alex Elder
@ 2012-06-01 17:09     ` Alex Elder
  2012-06-01 17:10       ` Sage Weil
  1 sibling, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-06-01 17:09 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 05/31/2012 11:24 PM, Sage Weil wrote:
> Also: when poking around, I noticed that ceph_con_get() and put() are
> called directly from osd_client.c... that's a bug!  Those connections have
> a get and put op defined that twiddles the containing ceph_osd struct's
> ref count.

So are you saying that the calls in "osd_client.c" to ceph_con_get()
and ceph_con_put() should instead be calls to get_osd_con() and
put_osd_con(), respectively?  (Or more generally con->ops->get()
and con->ops->put()?)

					-Alex

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 07/13] libceph: embed ceph connection structure in mon_client
  2012-06-01 17:09     ` Alex Elder
@ 2012-06-01 17:10       ` Sage Weil
  0 siblings, 0 replies; 44+ messages in thread
From: Sage Weil @ 2012-06-01 17:10 UTC (permalink / raw)
  To: Alex Elder; +Cc: Sage Weil, ceph-devel

On Fri, 1 Jun 2012, Alex Elder wrote:
> On 05/31/2012 11:24 PM, Sage Weil wrote:
> > Also: when poking around, I noticed that ceph_con_get() and put() are
> > called directly from osd_client.c... that's a bug!  Those connections have
> > a get and put op defined that twiddles the containing ceph_osd struct's
> > ref count.
> 
> So are you saying that the calls in "osd_client.c" to ceph_con_get()
> and ceph_con_put() should instead be calls to get_osd_con() and
> put_osd_con(), respectively?  (Or more generally con->ops->get()
> and con->ops->put()?)

Yeah.. one of the patches I pushed [er, and pushing now] fixes that.  

sage

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 03/13] libceph: delete useless SOCK_CLOSED manipulations
  2012-05-30 19:34 ` [PATCH 03/13] libceph: delete useless SOCK_CLOSED manipulations Alex Elder
@ 2012-06-01 18:47   ` Alex Elder
  0 siblings, 0 replies; 44+ messages in thread
From: Alex Elder @ 2012-06-01 18:47 UTC (permalink / raw)
  To: ceph-devel

On 05/30/2012 02:34 PM, Alex Elder wrote:
> In con_close_socket(), SOCK_CLOSED is set in the connection state,
> then cleared again after shutting down the socket. Nothing between
> the setting and clearing of that bit will ever be affected by it,
> so there's no point in setting/clearing it at all. So don't.
>
> Signed-off-by: Alex Elder <elder@inktank.com>

I am retracting this proposed change.

I believe it's possible for the con->sock->ops->shutdown()
call to trigger a TCP_CLOSE socket state change event,
which means that there *is* something that can be affected
by that state bit being set.

					-Alex

> ---
> net/ceph/messenger.c | 2 --
> 1 files changed, 0 insertions(+), 2 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 07af994..fe3c2a1 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -338,11 +338,9 @@ static int con_close_socket(struct ceph_connection
> *con)
> dout("con_close_socket on %p sock %p\n", con, con->sock);
> if (!con->sock)
> return 0;
> - set_bit(SOCK_CLOSED, &con->state);
> rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
> sock_release(con->sock);
> con->sock = NULL;
> - clear_bit(SOCK_CLOSED, &con->state);
> return rc;
> }
>


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/13] libceph: start tracking connection socket state
  2012-05-30 19:35 ` [PATCH 09/13] libceph: start tracking connection socket state Alex Elder
  2012-06-01  4:28   ` Sage Weil
@ 2012-06-12  4:52   ` Yan, Zheng
  2012-06-12  5:00     ` Sage Weil
  1 sibling, 1 reply; 44+ messages in thread
From: Yan, Zheng @ 2012-06-12  4:52 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On Thu, May 31, 2012 at 3:35 AM, Alex Elder <elder@inktank.com> wrote:
> Start explicitly keeping track of the state of a ceph connection's
> socket, separate from the state of the connection itself.  Create
> placeholder functions to encapsulate the state transitions.
>
>    --------
>    | NEW* |  transient initial state
>    --------
>        | con_sock_state_init()
>        v
>    ----------
>    | CLOSED |  initialized, but no socket (and no
>    ----------  TCP connection)
>     ^      \
>     |       \ con_sock_state_connecting()
>     |        ----------------------
>     |                              \
>     + con_sock_state_closed()       \
>     |\                               \
>     | \                               \
>     |  -----------                     \
>     |  | CLOSING |  socket event;       \
>     |  -----------  await close          \
>     |       ^                            |
>     |       |                            |
>     |       + con_sock_state_closing()   |
>     |      / \                           |
>     |     /   ---------------            |
>     |    /                   \           v
>     |   /                    --------------
>     |  /    -----------------| CONNECTING |  socket created, TCP
>     |  |   /                 --------------  connect initiated
>     |  |   | con_sock_state_connected()
>     |  |   v
>    -------------
>    | CONNECTED |  TCP connection established
>    -------------
>
> Make the socket state an atomic variable, reinforcing that it's a
> distinct transtion with no possible "intermediate/both" states.
> This is almost certainly overkill at this point, though the
> transitions into CONNECTED and CLOSING state do get called via
> socket callback (the rest of the transitions occur with the
> connection mutex held).  We can back out the atomicity later.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  include/linux/ceph/messenger.h |    8 ++++-
>  net/ceph/messenger.c           |   63
> ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> index 920235e..5e852f4 100644
> --- a/include/linux/ceph/messenger.h
> +++ b/include/linux/ceph/messenger.h
> @@ -137,14 +137,18 @@ struct ceph_connection {
>        const struct ceph_connection_operations *ops;
>
>        struct ceph_messenger *msgr;
> +
> +       atomic_t sock_state;
>        struct socket *sock;
> +       struct ceph_entity_addr peer_addr; /* peer address */
> +       struct ceph_entity_addr peer_addr_for_me;
> +
>        unsigned long flags;
>        unsigned long state;
>        const char *error_msg;  /* error message, if any */
>
> -       struct ceph_entity_addr peer_addr; /* peer address */
>        struct ceph_entity_name peer_name; /* peer name */
> -       struct ceph_entity_addr peer_addr_for_me;
> +
>        unsigned peer_features;
>        u32 connect_seq;      /* identify the most recent connection
>                                 attempt for this connection, client */
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 29055df..7e11b07 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -29,6 +29,14 @@
>  * the sender.
>  */
>
> +/* State values for ceph_connection->sock_state; NEW is assumed to be 0 */
> +
> +#define CON_SOCK_STATE_NEW             0       /* -> CLOSED */
> +#define CON_SOCK_STATE_CLOSED          1       /* -> CONNECTING */
> +#define CON_SOCK_STATE_CONNECTING      2       /* -> CONNECTED or ->
> CLOSING */
> +#define CON_SOCK_STATE_CONNECTED       3       /* -> CLOSING or -> CLOSED
> */
> +#define CON_SOCK_STATE_CLOSING         4       /* -> CLOSED */
> +
>  /* static tag bytes (protocol control messages) */
>  static char tag_msg = CEPH_MSGR_TAG_MSG;
>  static char tag_ack = CEPH_MSGR_TAG_ACK;
> @@ -147,6 +155,54 @@ void ceph_msgr_flush(void)
>  }
>  EXPORT_SYMBOL(ceph_msgr_flush);
>
> +/* Connection socket state transition functions */
> +
> +static void con_sock_state_init(struct ceph_connection *con)
> +{
> +       int old_state;
> +
> +       old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
> +       if (WARN_ON(old_state != CON_SOCK_STATE_NEW))
> +               printk("%s: unexpected old state %d\n", __func__,
> old_state);
> +}
> +
> +static void con_sock_state_connecting(struct ceph_connection *con)
> +{
> +       int old_state;
> +
> +       old_state = atomic_xchg(&con->sock_state,
> CON_SOCK_STATE_CONNECTING);
> +       if (WARN_ON(old_state != CON_SOCK_STATE_CLOSED))
> +               printk("%s: unexpected old state %d\n", __func__,
> old_state);
> +}
> +
> +static void con_sock_state_connected(struct ceph_connection *con)
> +{
> +       int old_state;
> +
> +       old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTED);
> +       if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING))
> +               printk("%s: unexpected old state %d\n", __func__,
> old_state);
> +}
> +
> +static void con_sock_state_closing(struct ceph_connection *con)
> +{
> +       int old_state;
> +
> +       old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSING);
> +       if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING &&
> +                       old_state != CON_SOCK_STATE_CONNECTED))
> +               printk("%s: unexpected old state %d\n", __func__,
> old_state);
> +}
> +
> +static void con_sock_state_closed(struct ceph_connection *con)
> +{
> +       int old_state;
> +
> +       old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
> +       if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTED &&
> +                       old_state != CON_SOCK_STATE_CLOSING))
> +               printk("%s: unexpected old state %d\n", __func__,
> old_state);
> +}
>
Hi Alex,

Single node setup can easily trigger above WARN_ON.  I think this is because
local TCP connection setup and shutdown is very fast, they finish immediately
after sock->connect and sock->shutdown return.

Yan, Zheng

>  /*
>  * socket callback functions
> @@ -203,6 +259,7 @@ static void ceph_sock_state_change(struct sock *sk)
>                dout("%s TCP_CLOSE\n", __func__);
>        case TCP_CLOSE_WAIT:
>                dout("%s TCP_CLOSE_WAIT\n", __func__);
> +               con_sock_state_closing(con);
>                if (test_and_set_bit(SOCK_CLOSED, &con->flags) == 0) {
>                        if (test_bit(CONNECTING, &con->state))
>                                con->error_msg = "connection failed";
> @@ -213,6 +270,7 @@ static void ceph_sock_state_change(struct sock *sk)
>                break;
>        case TCP_ESTABLISHED:
>                dout("%s TCP_ESTABLISHED\n", __func__);
> +               con_sock_state_connected(con);
>                queue_con(con);
>                break;
>        default:        /* Everything else is uninteresting */
> @@ -277,6 +335,7 @@ static int ceph_tcp_connect(struct ceph_connection *con)
>                return ret;
>        }
>        con->sock = sock;
> +       con_sock_state_connecting(con);
>
>        return 0;
>  }
> @@ -341,6 +400,7 @@ static int con_close_socket(struct ceph_connection *con)
>        rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
>        sock_release(con->sock);
>        con->sock = NULL;
> +       con_sock_state_closed(con);
>        return rc;
>  }
>
> @@ -460,6 +520,9 @@ void ceph_con_init(struct ceph_messenger *msgr, struct
> ceph_connection *con)
>        memset(con, 0, sizeof(*con));
>        atomic_set(&con->nref, 1);
>        con->msgr = msgr;
> +
> +       con_sock_state_init(con);
> +
>        mutex_init(&con->mutex);
>        INIT_LIST_HEAD(&con->out_queue);
>        INIT_LIST_HEAD(&con->out_sent);
> --
> 1.7.5.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/13] libceph: start tracking connection socket state
  2012-06-12  4:52   ` Yan, Zheng
@ 2012-06-12  5:00     ` Sage Weil
  2012-06-12  5:02       ` Yan, Zheng
  0 siblings, 1 reply; 44+ messages in thread
From: Sage Weil @ 2012-06-12  5:00 UTC (permalink / raw)
  To: Yan, Zheng; +Cc: Alex Elder, ceph-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 9068 bytes --]

On Tue, 12 Jun 2012, Yan, Zheng wrote:
> On Thu, May 31, 2012 at 3:35 AM, Alex Elder <elder@inktank.com> wrote:
> > Start explicitly keeping track of the state of a ceph connection's
> > socket, separate from the state of the connection itself.  Create
> > placeholder functions to encapsulate the state transitions.
> >
> >    --------
> >    | NEW* |  transient initial state
> >    --------
> >        | con_sock_state_init()
> >        v
> >    ----------
> >    | CLOSED |  initialized, but no socket (and no
> >    ----------  TCP connection)
> >     ^      \
> >     |       \ con_sock_state_connecting()
> >     |        ----------------------
> >     |                              \
> >     + con_sock_state_closed()       \
> >     |\                               \
> >     | \                               \
> >     |  -----------                     \
> >     |  | CLOSING |  socket event;       \
> >     |  -----------  await close          \
> >     |       ^                            |
> >     |       |                            |
> >     |       + con_sock_state_closing()   |
> >     |      / \                           |
> >     |     /   ---------------            |
> >     |    /                   \           v
> >     |   /                    --------------
> >     |  /    -----------------| CONNECTING |  socket created, TCP
> >     |  |   /                 --------------  connect initiated
> >     |  |   | con_sock_state_connected()
> >     |  |   v
> >    -------------
> >    | CONNECTED |  TCP connection established
> >    -------------
> >
> > Make the socket state an atomic variable, reinforcing that it's a
> > distinct transtion with no possible "intermediate/both" states.
> > This is almost certainly overkill at this point, though the
> > transitions into CONNECTED and CLOSING state do get called via
> > socket callback (the rest of the transitions occur with the
> > connection mutex held).  We can back out the atomicity later.
> >
> > Signed-off-by: Alex Elder <elder@inktank.com>
> > ---
> >  include/linux/ceph/messenger.h |    8 ++++-
> >  net/ceph/messenger.c           |   63
> > ++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 69 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
> > index 920235e..5e852f4 100644
> > --- a/include/linux/ceph/messenger.h
> > +++ b/include/linux/ceph/messenger.h
> > @@ -137,14 +137,18 @@ struct ceph_connection {
> >        const struct ceph_connection_operations *ops;
> >
> >        struct ceph_messenger *msgr;
> > +
> > +       atomic_t sock_state;
> >        struct socket *sock;
> > +       struct ceph_entity_addr peer_addr; /* peer address */
> > +       struct ceph_entity_addr peer_addr_for_me;
> > +
> >        unsigned long flags;
> >        unsigned long state;
> >        const char *error_msg;  /* error message, if any */
> >
> > -       struct ceph_entity_addr peer_addr; /* peer address */
> >        struct ceph_entity_name peer_name; /* peer name */
> > -       struct ceph_entity_addr peer_addr_for_me;
> > +
> >        unsigned peer_features;
> >        u32 connect_seq;      /* identify the most recent connection
> >                                 attempt for this connection, client */
> > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> > index 29055df..7e11b07 100644
> > --- a/net/ceph/messenger.c
> > +++ b/net/ceph/messenger.c
> > @@ -29,6 +29,14 @@
> >  * the sender.
> >  */
> >
> > +/* State values for ceph_connection->sock_state; NEW is assumed to be 0 */
> > +
> > +#define CON_SOCK_STATE_NEW             0       /* -> CLOSED */
> > +#define CON_SOCK_STATE_CLOSED          1       /* -> CONNECTING */
> > +#define CON_SOCK_STATE_CONNECTING      2       /* -> CONNECTED or ->
> > CLOSING */
> > +#define CON_SOCK_STATE_CONNECTED       3       /* -> CLOSING or -> CLOSED
> > */
> > +#define CON_SOCK_STATE_CLOSING         4       /* -> CLOSED */
> > +
> >  /* static tag bytes (protocol control messages) */
> >  static char tag_msg = CEPH_MSGR_TAG_MSG;
> >  static char tag_ack = CEPH_MSGR_TAG_ACK;
> > @@ -147,6 +155,54 @@ void ceph_msgr_flush(void)
> >  }
> >  EXPORT_SYMBOL(ceph_msgr_flush);
> >
> > +/* Connection socket state transition functions */
> > +
> > +static void con_sock_state_init(struct ceph_connection *con)
> > +{
> > +       int old_state;
> > +
> > +       old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
> > +       if (WARN_ON(old_state != CON_SOCK_STATE_NEW))
> > +               printk("%s: unexpected old state %d\n", __func__,
> > old_state);
> > +}
> > +
> > +static void con_sock_state_connecting(struct ceph_connection *con)
> > +{
> > +       int old_state;
> > +
> > +       old_state = atomic_xchg(&con->sock_state,
> > CON_SOCK_STATE_CONNECTING);
> > +       if (WARN_ON(old_state != CON_SOCK_STATE_CLOSED))
> > +               printk("%s: unexpected old state %d\n", __func__,
> > old_state);
> > +}
> > +
> > +static void con_sock_state_connected(struct ceph_connection *con)
> > +{
> > +       int old_state;
> > +
> > +       old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTED);
> > +       if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING))
> > +               printk("%s: unexpected old state %d\n", __func__,
> > old_state);
> > +}
> > +
> > +static void con_sock_state_closing(struct ceph_connection *con)
> > +{
> > +       int old_state;
> > +
> > +       old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSING);
> > +       if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING &&
> > +                       old_state != CON_SOCK_STATE_CONNECTED))
> > +               printk("%s: unexpected old state %d\n", __func__,
> > old_state);
> > +}
> > +
> > +static void con_sock_state_closed(struct ceph_connection *con)
> > +{
> > +       int old_state;
> > +
> > +       old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
> > +       if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTED &&
> > +                       old_state != CON_SOCK_STATE_CLOSING))
> > +               printk("%s: unexpected old state %d\n", __func__,
> > old_state);
> > +}
> >
> Hi Alex,
> 
> Single node setup can easily trigger above WARN_ON.  I think this is because
> local TCP connection setup and shutdown is very fast, they finish immediately
> after sock->connect and sock->shutdown return.

Yep.  This was just fixed yesterday, in the testing-next branch, by 
'libceph: transition socket state prior to actual connect'.

Are you still hitting the bio null deref?

Thanks-
sage


> 
> Yan, Zheng
> 
> >  /*
> >  * socket callback functions
> > @@ -203,6 +259,7 @@ static void ceph_sock_state_change(struct sock *sk)
> >                dout("%s TCP_CLOSE\n", __func__);
> >        case TCP_CLOSE_WAIT:
> >                dout("%s TCP_CLOSE_WAIT\n", __func__);
> > +               con_sock_state_closing(con);
> >                if (test_and_set_bit(SOCK_CLOSED, &con->flags) == 0) {
> >                        if (test_bit(CONNECTING, &con->state))
> >                                con->error_msg = "connection failed";
> > @@ -213,6 +270,7 @@ static void ceph_sock_state_change(struct sock *sk)
> >                break;
> >        case TCP_ESTABLISHED:
> >                dout("%s TCP_ESTABLISHED\n", __func__);
> > +               con_sock_state_connected(con);
> >                queue_con(con);
> >                break;
> >        default:        /* Everything else is uninteresting */
> > @@ -277,6 +335,7 @@ static int ceph_tcp_connect(struct ceph_connection *con)
> >                return ret;
> >        }
> >        con->sock = sock;
> > +       con_sock_state_connecting(con);
> >
> >        return 0;
> >  }
> > @@ -341,6 +400,7 @@ static int con_close_socket(struct ceph_connection *con)
> >        rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
> >        sock_release(con->sock);
> >        con->sock = NULL;
> > +       con_sock_state_closed(con);
> >        return rc;
> >  }
> >
> > @@ -460,6 +520,9 @@ void ceph_con_init(struct ceph_messenger *msgr, struct
> > ceph_connection *con)
> >        memset(con, 0, sizeof(*con));
> >        atomic_set(&con->nref, 1);
> >        con->msgr = msgr;
> > +
> > +       con_sock_state_init(con);
> > +
> >        mutex_init(&con->mutex);
> >        INIT_LIST_HEAD(&con->out_queue);
> >        INIT_LIST_HEAD(&con->out_sent);
> > --
> > 1.7.5.4
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/13] libceph: start tracking connection socket state
  2012-06-12  5:00     ` Sage Weil
@ 2012-06-12  5:02       ` Yan, Zheng
  2012-06-12 16:58         ` Alex Elder
  0 siblings, 1 reply; 44+ messages in thread
From: Yan, Zheng @ 2012-06-12  5:02 UTC (permalink / raw)
  To: Sage Weil; +Cc: Alex Elder, ceph-devel

On 06/12/2012 01:00 PM, Sage Weil wrote:
> Yep.  This was just fixed yesterday, in the testing-next branch, by 
> 'libceph: transition socket state prior to actual connect'.
> 
> Are you still hitting the bio null deref?
> 
No,

Cheers
Yan, Zheng



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/13] libceph: start tracking connection socket state
  2012-06-12  5:02       ` Yan, Zheng
@ 2012-06-12 16:58         ` Alex Elder
  2012-06-13  1:50           ` Yan, Zheng
  0 siblings, 1 reply; 44+ messages in thread
From: Alex Elder @ 2012-06-12 16:58 UTC (permalink / raw)
  To: Yan, Zheng; +Cc: Sage Weil, ceph-devel

On 06/12/2012 12:02 AM, Yan, Zheng wrote:
> On 06/12/2012 01:00 PM, Sage Weil wrote:
>> Yep.  This was just fixed yesterday, in the testing-next branch, by 
>> 'libceph: transition socket state prior to actual connect'.
>>
>> Are you still hitting the bio null deref?
>>
> No,
> 
> Cheers
> Yan, Zheng

Would you be able to narrow down exactly what fixed the bio null
problem?  Are you able to easily reproduce it?  Are you running
the master branch, or testing, or what?

Thank you.

					-Alex


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH 09/13] libceph: start tracking connection socket state
  2012-06-12 16:58         ` Alex Elder
@ 2012-06-13  1:50           ` Yan, Zheng
  0 siblings, 0 replies; 44+ messages in thread
From: Yan, Zheng @ 2012-06-13  1:50 UTC (permalink / raw)
  To: Alex Elder; +Cc: Sage Weil, ceph-devel

On 06/13/2012 12:58 AM, Alex Elder wrote:
> On 06/12/2012 12:02 AM, Yan, Zheng wrote:
>> On 06/12/2012 01:00 PM, Sage Weil wrote:
>>> Yep.  This was just fixed yesterday, in the testing-next branch, by 
>>> 'libceph: transition socket state prior to actual connect'.
>>>
>>> Are you still hitting the bio null deref?
>>>
>> No,
>>
>> Cheers
>> Yan, Zheng
> 
> Would you be able to narrow down exactly what fixed the bio null
> problem?  Are you able to easily reproduce it?  Are you running
> the master branch, or testing, or what?
> 
The 'clear msg->bio_iter' patch. Without it, I always got below Oops
when xfstest reached the 49th test case. 

---
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.610673] libceph: osd3 10.239.36.78:6807 socket closed
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.612784] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.614482] IP: [<ffffffffa062bc98>] con_work+0x19a8/0x2c80 [libceph]
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.616120] PGD 137d91067 PUD 137d92067 PMD 0
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.617749] Oops: 0000 [#1] SMP
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.619404] CPU 6
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.621059] Pid: 8239, comm: kworker/6:6 Not tainted 3.5.0-rc2+ #91 Dell Inc. Studio XPS 8000/0X231R
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.622700] RIP: 0010:[<ffffffffa062bc98>]  [<ffffffffa062bc98>] con_work+0x19a8/0x2c80 [libceph]
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.624371] RSP: 0018:ffff88000efb7cb0  EFLAGS: 00010246
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.626119] RAX: 0000000000000000 RBX: ffff880122370030 RCX: 000000000006a000
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.627694] RDX: 0000000000000000 RSI: 0000000000016000 RDI: ffff880122370420
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.629267] RBP: ffff88000efb7e00 R08: 00000000418d4bd6 R09: 0000000000000000
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.630811] R10: 0000000000000002 R11: 0000000000000dc7 R12: 0000000000080000
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.632441] R13: 0000000000000000 R14: ffff88003af1bd00 R15: ffffea00044aec40
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.633996] FS:  0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.635621] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.637239] CR2: 0000000000000048 CR3: 0000000137d90000 CR4: 00000000000007e0
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.638841] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.640493] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.642255] Process kworker/6:6 (pid: 8239, threadinfo ffff88000efb6000, task ffff880041242e60)
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.643917] Stack:
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.645588]  ffff880000000001 ffffffff00000006 ffff88000efb7dac 0000000600000002
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.647270]  ffff88000efb7d30 ffffffff810930d1 ffff88013b00a400 0000000000000087
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.648923]  0000000eed1e4f54 ffff880041242e60 ffff88000efb7d30 ffffffff00000000
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.650563] Call Trace:
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.652162]  [<ffffffff810930d1>] ? update_curr+0x141/0x1f0
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.653688]  [<ffffffffa062a2f0>] ? ceph_msg_new+0x2d0/0x2d0 [libceph]
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.655203]  [<ffffffff81075f6d>] process_one_work+0x11d/0x470
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.656714]  [<ffffffff81077069>] worker_thread+0x159/0x340
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.658278]  [<ffffffff81076f10>] ? manage_workers+0x230/0x230
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.659606]  [<ffffffff8107bee3>] kthread+0x93/0xa0
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.660932]  [<ffffffff8160d9e4>] kernel_thread_helper+0x4/0x10
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.662293]  [<ffffffff8107be50>] ? kthread_freezable_should_stop+0x70/0x70
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.663670]  [<ffffffff8160d9e0>] ? gs_change+0x13/0x13
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.665027] Code: ef fb ff ff 0f 1f 80 00 00 00 00 49 83 be 90 00 00 00 00 0f 84 7a 01 00 00 49 63 86 a0 00 00 00 49 8b 96 98 00 00 00 48 c1 e0 04 <48> 03 42 48 4c 8b 38 8b 48 0c 8b 50 08 e9 08 f8 ff ff 49 89 86
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.666570] RIP  [<ffffffffa062bc98>] con_work+0x19a8/0x2c80 [libceph]
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.668232]  RSP <ffff88000efb7cb0>
Jun 13 09:34:14 zyan5-desk kernel: [ 1192.669795] CR2: 0000000000000048


^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2012-06-13  1:50 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-30 19:24 [PATCH 00/13] libceph: cleanups preparing for state cleanup Alex Elder
2012-05-30 19:34 ` [PATCH 01/13] libceph: eliminate connection state "DEAD" Alex Elder
2012-05-31 16:20   ` Yehuda Sadeh
2012-05-30 19:34 ` [PATCH 02/13] libceph: kill bad_proto ceph connection op Alex Elder
2012-05-31 16:30   ` Yehuda Sadeh
2012-05-30 19:34 ` [PATCH 03/13] libceph: delete useless SOCK_CLOSED manipulations Alex Elder
2012-06-01 18:47   ` Alex Elder
2012-05-30 19:34 ` [PATCH 04/13] libceph: rename socket callbacks Alex Elder
2012-05-31 16:33   ` Yehuda Sadeh Weinraub
2012-06-01  4:02   ` Sage Weil
2012-05-30 19:34 ` [PATCH 05/13] libceph: rename kvec_reset and kvec_add functions Alex Elder
2012-05-31 16:34   ` Yehuda Sadeh
2012-06-01  4:02   ` Sage Weil
2012-05-30 19:34 ` [PATCH 06/13] libceph: embed ceph messenger structure in ceph_client Alex Elder
2012-05-31 16:44   ` Yehuda Sadeh
2012-06-01  4:04   ` Sage Weil
2012-05-30 19:34 ` [PATCH 07/13] libceph: embed ceph connection structure in mon_client Alex Elder
2012-06-01  4:24   ` Sage Weil
2012-06-01 12:12     ` Alex Elder
2012-06-01 13:30       ` Alex Elder
2012-06-01 16:20         ` Sage Weil
2012-06-01 16:32           ` Alex Elder
2012-06-01 16:39             ` Sage Weil
2012-06-01 17:09     ` Alex Elder
2012-06-01 17:10       ` Sage Weil
2012-05-30 19:35 ` [PATCH 08/13] libceph: start separating connection flags from state Alex Elder
2012-06-01  4:25   ` Sage Weil
2012-06-01 12:13     ` Alex Elder
2012-05-30 19:35 ` [PATCH 09/13] libceph: start tracking connection socket state Alex Elder
2012-06-01  4:28   ` Sage Weil
2012-06-01 12:15     ` Alex Elder
2012-06-12  4:52   ` Yan, Zheng
2012-06-12  5:00     ` Sage Weil
2012-06-12  5:02       ` Yan, Zheng
2012-06-12 16:58         ` Alex Elder
2012-06-13  1:50           ` Yan, Zheng
2012-05-30 19:35 ` [PATCH 10/13] libceph: provide osd number when creating osd Alex Elder
2012-06-01  4:29   ` Sage Weil
2012-05-30 19:35 ` [PATCH 11/13] libceph: init monitor connection when opening Alex Elder
2012-06-01  4:30   ` Sage Weil
2012-05-30 19:35 ` [PATCH 12/13] libceph: fully initialize connection in con_init() Alex Elder
2012-06-01  4:31   ` Sage Weil
2012-05-30 19:35 ` [PATCH 13/13] libceph: set CLOSED state bit in con_init Alex Elder
2012-06-01  4:32   ` Sage Weil

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.