* [PATCH 0/3] libceph: fix backoff handling
@ 2012-10-09 21:31 Alex Elder
  2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:31 UTC
  To: ceph-devel

These three patches fix some problems related to how backoff
is handled when a ceph connection faults.

					-Alex

     [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue
     [PATCH 2/3] libceph: let con_work() handle backoff
     [PATCH 3/3] libceph: define common queue_con_delay()


* [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue
  2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder
@ 2012-10-09 21:33 ` Alex Elder
  2012-10-09 21:33   ` Sage Weil
  2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder
  2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder
  2 siblings, 1 reply; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:33 UTC
  To: ceph-devel

If ceph_fault() is unable to queue work after a delay, it sets the
BACKOFF connection flag so con_work() will attempt to do so.

In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
result in newly-queued work, it simply ignores this condition and
proceeds as if no backoff delay were desired.  There are two
problems with this--one of which is a bug.

The first problem is simply that the intended behavior is to back
off, and if we aren't able to queue the work item to run after a
delay we're not doing that.

The only reason queue_delayed_work() won't queue work is if the
provided work item is already queued.  In the messenger, this
means that con_work() is already scheduled to be run again.  So
if we simply set the BACKOFF flag again when this occurs, we know
the next con_work() call will again attempt to hold off activity
on the connection until after the delay.

The second problem--the bug--is an unbalanced reference count.  If
queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
the connection reference held on entry to con_work().  However,
processing is (was) allowed to continue, and at the end of the
function a second con->ops->put() is called, dropping a reference
that con_work() no longer holds.

This patch fixes both problems.

Signed-off-by: Alex Elder <elder@inktank.com>
---
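Not part of the change itself: a minimal userspace sketch of the
reference accounting described above.  It assumes one reference is
held by the running con_work() and one by the work item that is
already queued.  All model_* names are hypothetical stand-ins, not
messenger code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_con {
	int refs;		/* stands in for the connection refcount */
	bool work_pending;	/* true while the work item is already queued */
	bool backoff;		/* stands in for CON_FLAG_BACKOFF */
};

/* Stands in for con->ops->put(). */
static void model_put(struct model_con *con)
{
	assert(con->refs > 0);
	con->refs--;
}

/* Mimics queue_delayed_work(): false if the item is already pending. */
static bool model_queue_delayed(struct model_con *con)
{
	if (con->work_pending)
		return false;
	con->work_pending = true;
	return true;
}

/* Backoff branch before this patch: the failure path puts, then falls through. */
static void model_con_work_old(struct model_con *con)
{
	if (con->backoff) {
		con->backoff = false;
		if (model_queue_delayed(con))
			return;		/* reference handed to the delayed work */
		model_put(con);		/* first put, on the failure path */
	}
	/* ... normal processing would continue here ... */
	model_put(con);			/* second put, at the end of the function */
}

/* Backoff branch after this patch: re-arm the flag and skip to the single put. */
static void model_con_work_new(struct model_con *con)
{
	if (con->backoff) {
		con->backoff = false;
		if (model_queue_delayed(con))
			return;		/* reference handed to the delayed work */
		con->backoff = true;	/* next con_work() run retries the backoff */
		goto done;
	}
	/* ... normal processing would continue here ... */
done:
	model_put(con);			/* the only put */
}
```

Starting from three references (owner, running con_work(), pending
work item) with the work already pending, the old path ends with one
reference and BACKOFF clear, while the new path ends with two
references and BACKOFF set again.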
  net/ceph/messenger.c |    3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index f9f65fe..ece06bc 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2300,10 +2300,11 @@ restart:
  			mutex_unlock(&con->mutex);
  			return;
  		} else {
-			con->ops->put(con);
  			dout("con_work %p FAILED to back off %lu\n", con,
  			     con->delay);
+			set_bit(CON_FLAG_BACKOFF, &con->flags);
  		}
+		goto done;
  	}

  	if (con->state == CON_STATE_STANDBY) {
-- 
1.7.9.5



* [PATCH 2/3] libceph: let con_work() handle backoff
  2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder
  2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
@ 2012-10-09 21:33 ` Alex Elder
  2012-10-09 21:34   ` Sage Weil
  2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder
  2 siblings, 1 reply; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:33 UTC
  To: ceph-devel

Both ceph_fault() and con_work() include handling for imposing a
delay before doing further processing on a faulted connection.
The latter is used only if ceph_fault() is unable to queue the
delayed work itself.

Instead, just let con_work() always be responsible for implementing
the delay.  After setting up the delay value, set the BACKOFF flag
on the connection unconditionally and call queue_con() to ensure
con_work() will get called to handle it.

Signed-off-by: Alex Elder <elder@inktank.com>
---
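Not part of the change itself: a rough userspace sketch of what
ceph_fault() does with the delay after this patch.  The constants
are illustrative values, not the ones in messenger.c, and the
"delay == 0" test is an assumption, since the hunk below does not
show that condition.  All fault_model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real intervals are defined in messenger.c. */
#define FAULT_MODEL_BASE_DELAY	1UL
#define FAULT_MODEL_MAX_DELAY	8UL

/* Hypothetical userspace model of the fields ceph_fault() touches. */
struct fault_model {
	unsigned long delay;	/* stands in for con->delay */
	bool backoff;		/* stands in for CON_FLAG_BACKOFF */
	int queued;		/* counts queue_con() style requests */
};

/* Start at the base interval, then double until the maximum is reached. */
static unsigned long fault_model_next_delay(unsigned long delay)
{
	if (delay == 0)			/* assumed first-fault condition */
		return FAULT_MODEL_BASE_DELAY;
	if (delay < FAULT_MODEL_MAX_DELAY)
		return delay * 2;
	return delay;
}

/* After this patch: compute the delay, set BACKOFF, queue immediate work. */
static void fault_model_fault(struct fault_model *con)
{
	assert(con);
	con->delay = fault_model_next_delay(con->delay);
	con->backoff = true;	/* con_work() will impose the delay */
	con->queued++;		/* models the unconditional queue_con() call */
}
```

With these values, successive faults yield delays of 1, 2, 4, 8, 8.
Because the doubling test is "less than the maximum", a base that
does not divide the maximum evenly can overshoot it by one step.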
  net/ceph/messenger.c |   20 ++------------------
  1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index ece06bc..9170c20 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2398,24 +2398,8 @@ static void ceph_fault(struct ceph_connection *con)
  			con->delay = BASE_DELAY_INTERVAL;
  		else if (con->delay < MAX_DELAY_INTERVAL)
  			con->delay *= 2;
-		con->ops->get(con);
-		if (queue_delayed_work(ceph_msgr_wq, &con->work,
-				       round_jiffies_relative(con->delay))) {
-			dout("fault queued %p delay %lu\n", con, con->delay);
-		} else {
-			con->ops->put(con);
-			dout("fault failed to queue %p delay %lu, backoff\n",
-			     con, con->delay);
-			/*
-			 * In many cases we see a socket state change
-			 * while con_work is running and end up
-			 * queuing (non-delayed) work, such that we
-			 * can't backoff with a delay.  Set a flag so
-			 * that when con_work restarts we schedule the
-			 * delay then.
-			 */
-			set_bit(CON_FLAG_BACKOFF, &con->flags);
-		}
+		set_bit(CON_FLAG_BACKOFF, &con->flags);
+		queue_con(con);
  	}

  out_unlock:
-- 
1.7.9.5



* [PATCH 3/3] libceph: define common queue_con_delay()
  2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder
  2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
  2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder
@ 2012-10-09 21:33 ` Alex Elder
  2012-10-09 21:37   ` Sage Weil
  2 siblings, 1 reply; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:33 UTC
  To: ceph-devel

This patch defines a single function, queue_con_delay(), to call
queue_delayed_work() for a connection.  It generalizes what was
previously queue_con() by adding a delay argument.  queue_con() is
now a simple helper that passes 0 for the delay.  queue_con_delay()
returns 0 if it queued work, or a negative errno if it did not:
-ENOENT if no reference could be taken, or -EBUSY if the work was
already queued.

If con_work() finds the BACKOFF flag set for a connection, it now
calls queue_con_delay() to handle arranging to start again after a
delay.


Note about connection reference counts:  con_work() only ever gets
called as a work item function.  At the time that work is scheduled,
a reference to the connection is acquired, and the corresponding
con_work() call is then responsible for dropping that reference
before it returns.

Previously, the backoff handling inside con_work() silently handed
off its reference to delayed work it scheduled.  Now that
queue_con_delay() is used, a new reference is acquired for the
newly-scheduled work, and the original reference is dropped by the
con->ops->put() call at the end of the function.

Signed-off-by: Alex Elder <elder@inktank.com>
---
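Not part of the change itself: a minimal userspace sketch of the
return values and reference handling of queue_con_delay() as
described above.  It assumes, as the debug message in the patch
suggests, that taking a reference fails once the count has reached
zero.  All qmodel_* names are hypothetical stand-ins.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct qmodel_con {
	int refs;			/* stands in for the connection refcount */
	bool work_pending;		/* true while the work item is queued */
	unsigned long pending_delay;	/* delay recorded for the queued work */
};

/* Stands in for con->ops->get(): fails if the count is already zero. */
static bool qmodel_get(struct qmodel_con *con)
{
	if (con->refs == 0)
		return false;
	con->refs++;
	return true;
}

/* Stands in for con->ops->put(). */
static void qmodel_put(struct qmodel_con *con)
{
	assert(con->refs > 0);
	con->refs--;
}

/* Returns 0 if work was queued, -ENOENT or -EBUSY otherwise. */
static int qmodel_queue_con_delay(struct qmodel_con *con, unsigned long delay)
{
	if (!qmodel_get(con))
		return -ENOENT;		/* connection is already going away */

	if (con->work_pending) {	/* queue_delayed_work() would return 0 */
		qmodel_put(con);	/* give back the reference just taken */
		return -EBUSY;
	}

	con->work_pending = true;	/* the new reference belongs to this work */
	con->pending_delay = delay;
	return 0;
}

/* Immediate variant, mirroring the new queue_con() wrapper. */
static void qmodel_queue_con(struct qmodel_con *con)
{
	(void) qmodel_queue_con_delay(con, 0);
}
```

In this model every successful call leaves the count one higher,
and both failure paths leave it unchanged, so a caller such as
con_work() can always drop its own reference on the way out.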
  net/ceph/messenger.c |   38 +++++++++++++++++++++++---------------
  1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 9170c20..77cc8b1 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2244,22 +2244,33 @@ bad_tag:


  /*
- * Atomically queue work on a connection.  Bump @con reference to
- * avoid races with connection teardown.
+ * Atomically queue work on a connection after the specified delay.
+ * Bump @con reference to avoid races with connection teardown.
+ * Returns 0 if work was queued, or an error code otherwise.
   */
-static void queue_con(struct ceph_connection *con)
+static int queue_con_delay(struct ceph_connection *con, unsigned long delay)
  {
  	if (!con->ops->get(con)) {
-		dout("queue_con %p ref count 0\n", con);
-		return;
+		dout("%s %p ref count 0\n", __func__, con);
+
+		return -ENOENT;
  	}

-	if (!queue_delayed_work(ceph_msgr_wq, &con->work, 0)) {
-		dout("queue_con %p - already queued\n", con);
+	if (!queue_delayed_work(ceph_msgr_wq, &con->work, delay)) {
+		dout("%s %p - already queued\n", __func__, con);
  		con->ops->put(con);
-	} else {
-		dout("queue_con %p\n", con);
+
+		return -EBUSY;
  	}
+
+	dout("%s %p %lu\n", __func__, con, delay);
+
+	return 0;
+}
+
+static void queue_con(struct ceph_connection *con)
+{
+	(void) queue_con_delay(con, 0);
  }

  /*
@@ -2294,14 +2305,11 @@ restart:

  	if (test_and_clear_bit(CON_FLAG_BACKOFF, &con->flags)) {
  		dout("con_work %p backing off\n", con);
-		if (queue_delayed_work(ceph_msgr_wq, &con->work,
-				       round_jiffies_relative(con->delay))) {
-			dout("con_work %p backoff %lu\n", con, con->delay);
-			mutex_unlock(&con->mutex);
-			return;
-		} else {
+		ret = queue_con_delay(con, round_jiffies_relative(con->delay));
+		if (ret) {
  			dout("con_work %p FAILED to back off %lu\n", con,
  			     con->delay);
+			BUG_ON(ret == -ENOENT);
  			set_bit(CON_FLAG_BACKOFF, &con->flags);
  		}
  		goto done;
-- 
1.7.9.5



* Re: [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue
  2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
@ 2012-10-09 21:33   ` Sage Weil
  0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2012-10-09 21:33 UTC
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Sage Weil <sage@inktank.com>

On Tue, 9 Oct 2012, Alex Elder wrote:

> If ceph_fault() is unable to queue work after a delay, it sets the
> BACKOFF connection flag so con_work() will attempt to do so.
> 
> In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
> result in newly-queued work, it simply ignores this condition and
> proceeds as if no backoff delay were desired.  There are two
> problems with this--one of which is a bug.
> 
> The first problem is simply that the intended behavior is to back
> off, and if we aren't able to queue the work item to run after a
> delay we're not doing that.
> 
> The only reason queue_delayed_work() won't queue work is if the
> provided work item is already queued.  In the messenger, this
> means that con_work() is already scheduled to be run again.  So
> if we simply set the BACKOFF flag again when this occurs, we know
> the next con_work() call will again attempt to hold off activity
> on the connection until after the delay.
> 
> The second problem--the bug--is an unbalanced reference count.  If
> queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
> the connection reference held on entry to con_work().  However,
> processing is (was) allowed to continue, and at the end of the
> function a second con->ops->put() is called, dropping a reference
> that con_work() no longer holds.
> 
> This patch fixes both problems.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index f9f65fe..ece06bc 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2300,10 +2300,11 @@ restart:
>  			mutex_unlock(&con->mutex);
>  			return;
>  		} else {
> -			con->ops->put(con);
>  			dout("con_work %p FAILED to back off %lu\n", con,
>  			     con->delay);
> +			set_bit(CON_FLAG_BACKOFF, &con->flags);
>  		}
> +		goto done;
>  	}
> 
>  	if (con->state == CON_STATE_STANDBY) {
> -- 
> 1.7.9.5
> 


* Re: [PATCH 2/3] libceph: let con_work() handle backoff
  2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder
@ 2012-10-09 21:34   ` Sage Weil
  0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2012-10-09 21:34 UTC
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Sage Weil <sage@inktank.com>


On Tue, 9 Oct 2012, Alex Elder wrote:

> Both ceph_fault() and con_work() include handling for imposing a
> delay before doing further processing on a faulted connection.
> The latter is used only if ceph_fault() is unable to queue the
> delayed work itself.
> 
> Instead, just let con_work() always be responsible for implementing
> the delay.  After setting up the delay value, set the BACKOFF flag
> on the connection unconditionally and call queue_con() to ensure
> con_work() will get called to handle it.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |   20 ++------------------
>  1 file changed, 2 insertions(+), 18 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index ece06bc..9170c20 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2398,24 +2398,8 @@ static void ceph_fault(struct ceph_connection *con)
>  			con->delay = BASE_DELAY_INTERVAL;
>  		else if (con->delay < MAX_DELAY_INTERVAL)
>  			con->delay *= 2;
> -		con->ops->get(con);
> -		if (queue_delayed_work(ceph_msgr_wq, &con->work,
> -				       round_jiffies_relative(con->delay))) {
> -			dout("fault queued %p delay %lu\n", con, con->delay);
> -		} else {
> -			con->ops->put(con);
> -			dout("fault failed to queue %p delay %lu, backoff\n",
> -			     con, con->delay);
> -			/*
> -			 * In many cases we see a socket state change
> -			 * while con_work is running and end up
> -			 * queuing (non-delayed) work, such that we
> -			 * can't backoff with a delay.  Set a flag so
> -			 * that when con_work restarts we schedule the
> -			 * delay then.
> -			 */
> -			set_bit(CON_FLAG_BACKOFF, &con->flags);
> -		}
> +		set_bit(CON_FLAG_BACKOFF, &con->flags);
> +		queue_con(con);
>  	}
> 
>  out_unlock:
> -- 
> 1.7.9.5
> 


* Re: [PATCH 3/3] libceph: define common queue_con_delay()
  2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder
@ 2012-10-09 21:37   ` Sage Weil
  0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2012-10-09 21:37 UTC
  To: Alex Elder; +Cc: ceph-devel

Reviewed-by: Sage Weil <sage@inktank.com>

On Tue, 9 Oct 2012, Alex Elder wrote:

> This patch defines a single function, queue_con_delay(), to call
> queue_delayed_work() for a connection.  It generalizes what was
> previously queue_con() by adding a delay argument.  queue_con() is
> now a simple helper that passes 0 for the delay.  queue_con_delay()
> returns 0 if it queued work, or a negative errno if it did not:
> -ENOENT if no reference could be taken, or -EBUSY if the work was
> already queued.
> 
> If con_work() finds the BACKOFF flag set for a connection, it now
> calls queue_con_delay() to handle arranging to start again after a
> delay.
> 
> 
> Note about connection reference counts:  con_work() only ever gets
> called as a work item function.  At the time that work is scheduled,
> a reference to the connection is acquired, and the corresponding
> con_work() call is then responsible for dropping that reference
> before it returns.
> 
> Previously, the backoff handling inside con_work() silently handed
> off its reference to delayed work it scheduled.  Now that
> queue_con_delay() is used, a new reference is acquired for the
> newly-scheduled work, and the original reference is dropped by the
> con->ops->put() call at the end of the function.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |   38 +++++++++++++++++++++++---------------
>  1 file changed, 23 insertions(+), 15 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 9170c20..77cc8b1 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2244,22 +2244,33 @@ bad_tag:
> 
> 
>  /*
> - * Atomically queue work on a connection.  Bump @con reference to
> - * avoid races with connection teardown.
> + * Atomically queue work on a connection after the specified delay.
> + * Bump @con reference to avoid races with connection teardown.
> + * Returns 0 if work was queued, or an error code otherwise.
>   */
> -static void queue_con(struct ceph_connection *con)
> +static int queue_con_delay(struct ceph_connection *con, unsigned long delay)
>  {
>  	if (!con->ops->get(con)) {
> -		dout("queue_con %p ref count 0\n", con);
> -		return;
> +		dout("%s %p ref count 0\n", __func__, con);
> +
> +		return -ENOENT;
>  	}
> 
> -	if (!queue_delayed_work(ceph_msgr_wq, &con->work, 0)) {
> -		dout("queue_con %p - already queued\n", con);
> +	if (!queue_delayed_work(ceph_msgr_wq, &con->work, delay)) {
> +		dout("%s %p - already queued\n", __func__, con);
>  		con->ops->put(con);
> -	} else {
> -		dout("queue_con %p\n", con);
> +
> +		return -EBUSY;
>  	}
> +
> +	dout("%s %p %lu\n", __func__, con, delay);
> +
> +	return 0;
> +}
> +
> +static void queue_con(struct ceph_connection *con)
> +{
> +	(void) queue_con_delay(con, 0);
>  }
> 
>  /*
> @@ -2294,14 +2305,11 @@ restart:
> 
>  	if (test_and_clear_bit(CON_FLAG_BACKOFF, &con->flags)) {
>  		dout("con_work %p backing off\n", con);
> -		if (queue_delayed_work(ceph_msgr_wq, &con->work,
> -				       round_jiffies_relative(con->delay))) {
> -			dout("con_work %p backoff %lu\n", con, con->delay);
> -			mutex_unlock(&con->mutex);
> -			return;
> -		} else {
> +		ret = queue_con_delay(con, round_jiffies_relative(con->delay));
> +		if (ret) {
>  			dout("con_work %p FAILED to back off %lu\n", con,
>  			     con->delay);
> +			BUG_ON(ret == -ENOENT);
>  			set_bit(CON_FLAG_BACKOFF, &con->flags);
>  		}
>  		goto done;
> -- 
> 1.7.9.5
> 


end of thread, other threads:[~2012-10-09 21:37 UTC | newest]

Thread overview: 7+ messages
2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder
2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
2012-10-09 21:33   ` Sage Weil
2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder
2012-10-09 21:34   ` Sage Weil
2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder
2012-10-09 21:37   ` Sage Weil
