* [PATCH 0/3] libceph: fix backoff handling
@ 2012-10-09 21:31 Alex Elder
2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:31 UTC (permalink / raw)
To: ceph-devel
These three patches fix some problems related to how backoff
is handled when a ceph connection faults.
-Alex
[PATCH 1/3] libceph: reset BACKOFF if unable to re-queue
[PATCH 2/3] libceph: let con_work() handle backoff
[PATCH 3/3] libceph: define common queue_con_delay()
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue
2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder
@ 2012-10-09 21:33 ` Alex Elder
2012-10-09 21:33 ` Sage Weil
2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder
2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder
2 siblings, 1 reply; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:33 UTC (permalink / raw)
To: ceph-devel
If ceph_fault() is unable to queue work after a delay, it sets the
BACKOFF connection flag so con_work() will attempt to do so.
In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
result in newly-queued work, it simply ignores this condition and
proceeds as if no backoff delay were desired. There are two
problems with this--one of which is a bug.
The first problem is simply that the intended behavior is to back
off, and if we aren't able queue the work item to run after a delay
we're not doing that.
The only reason queue_delayed_work() won't queue work is if the
provided work item is already queued. In the messenger, this
means that con_work() is already scheduled to be run again. So
if we simply set the BACKOFF flag again when this occurs, we know
the next con_work() call will again attempt to hold off activity
on the connection until after the delay.
The second problem--the bug--is a leak of a reference count. If
queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
the connection reference held on entry to con_work(). However,
processing is (was) allowed to continue, and at the end of the
function a second con->ops->put() is called.
This patch fixes both problems.
Signed-off-by: Alex Elder <elder@inktank.com>
---
net/ceph/messenger.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index f9f65fe..ece06bc 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2300,10 +2300,11 @@ restart:
mutex_unlock(&con->mutex);
return;
} else {
- con->ops->put(con);
dout("con_work %p FAILED to back off %lu\n", con,
con->delay);
+ set_bit(CON_FLAG_BACKOFF, &con->flags);
}
+ goto done;
}
if (con->state == CON_STATE_STANDBY) {
--
1.7.9.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/3] libceph: let con_work() handle backoff
2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder
2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
@ 2012-10-09 21:33 ` Alex Elder
2012-10-09 21:34 ` Sage Weil
2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder
2 siblings, 1 reply; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:33 UTC (permalink / raw)
To: ceph-devel
Both ceph_fault() and con_work() include handling for imposing a
delay before doing further processing on a faulted connection.
The latter is used only if ceph_fault() is unable to.
Instead, just let con_work() always be responsible for implementing
the delay. After setting up the delay value, set the BACKOFF flag
on the connection unconditionally and call queue_con() to ensure
con_work() will get called to handle it.
Signed-off-by: Alex Elder <elder@inktank.com>
---
net/ceph/messenger.c | 20 ++------------------
1 file changed, 2 insertions(+), 18 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index ece06bc..9170c20 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2398,24 +2398,8 @@ static void ceph_fault(struct ceph_connection *con)
con->delay = BASE_DELAY_INTERVAL;
else if (con->delay < MAX_DELAY_INTERVAL)
con->delay *= 2;
- con->ops->get(con);
- if (queue_delayed_work(ceph_msgr_wq, &con->work,
- round_jiffies_relative(con->delay))) {
- dout("fault queued %p delay %lu\n", con, con->delay);
- } else {
- con->ops->put(con);
- dout("fault failed to queue %p delay %lu, backoff\n",
- con, con->delay);
- /*
- * In many cases we see a socket state change
- * while con_work is running and end up
- * queuing (non-delayed) work, such that we
- * can't backoff with a delay. Set a flag so
- * that when con_work restarts we schedule the
- * delay then.
- */
- set_bit(CON_FLAG_BACKOFF, &con->flags);
- }
+ set_bit(CON_FLAG_BACKOFF, &con->flags);
+ queue_con(con);
}
out_unlock:
--
1.7.9.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/3] libceph: define common queue_con_delay()
2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder
2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder
@ 2012-10-09 21:33 ` Alex Elder
2012-10-09 21:37 ` Sage Weil
2 siblings, 1 reply; 7+ messages in thread
From: Alex Elder @ 2012-10-09 21:33 UTC (permalink / raw)
To: ceph-devel
This patch defines a single function, queue_con_delay() to call
queue_delayed_work() for a connection. It basically generalizes
what was previously queue_con() by adding the delay argument.
queue_con() is now a simple helper that passes 0 for its delay.
queue_con_delay() returns 0 if it queued work or an errno if it
did not for some reason.
If con_work() finds the BACKOFF flag set for a connection, it now
calls queue_con_delay() to handle arranging to start again after a
delay.
Note about connection reference counts: con_work() only ever gets
called as a work item function. At the time that work is scheduled,
a reference to the connection is acquired, and the corresponding
con_work() call is then responsible for dropping that reference
before it returns.
Previously, the backoff handling inside con_work() silently handed
off its reference to delayed work it scheduled. Now that
queue_con_delay() is used, a new reference is acquired for the
newly-scheduled work, and the original reference is dropped by the
con->ops->put() call at the end of the function.
Signed-off-by: Alex Elder <elder@inktank.com>
---
net/ceph/messenger.c | 38 +++++++++++++++++++++++---------------
1 file changed, 23 insertions(+), 15 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 9170c20..77cc8b1 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2244,22 +2244,33 @@ bad_tag:
/*
- * Atomically queue work on a connection. Bump @con reference to
- * avoid races with connection teardown.
+ * Atomically queue work on a connection after the specified delay.
+ * Bump @con reference to avoid races with connection teardown.
+ * Returns 0 if work was queued, or an error code otherwise.
*/
-static void queue_con(struct ceph_connection *con)
+static int queue_con_delay(struct ceph_connection *con, unsigned long
delay)
{
if (!con->ops->get(con)) {
- dout("queue_con %p ref count 0\n", con);
- return;
+ dout("%s %p ref count 0\n", __func__, con);
+
+ return -ENOENT;
}
- if (!queue_delayed_work(ceph_msgr_wq, &con->work, 0)) {
- dout("queue_con %p - already queued\n", con);
+ if (!queue_delayed_work(ceph_msgr_wq, &con->work, delay)) {
+ dout("%s %p - already queued\n", __func__, con);
con->ops->put(con);
- } else {
- dout("queue_con %p\n", con);
+
+ return -EBUSY;
}
+
+ dout("%s %p %lu\n", __func__, con, delay);
+
+ return 0;
+}
+
+static void queue_con(struct ceph_connection *con)
+{
+ (void) queue_con_delay(con, 0);
}
/*
@@ -2294,14 +2305,11 @@ restart:
if (test_and_clear_bit(CON_FLAG_BACKOFF, &con->flags)) {
dout("con_work %p backing off\n", con);
- if (queue_delayed_work(ceph_msgr_wq, &con->work,
- round_jiffies_relative(con->delay))) {
- dout("con_work %p backoff %lu\n", con, con->delay);
- mutex_unlock(&con->mutex);
- return;
- } else {
+ ret = queue_con_delay(con, round_jiffies_relative(con->delay));
+ if (ret) {
dout("con_work %p FAILED to back off %lu\n", con,
con->delay);
+ BUG_ON(ret == -ENOENT);
set_bit(CON_FLAG_BACKOFF, &con->flags);
}
goto done;
--
1.7.9.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue
2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
@ 2012-10-09 21:33 ` Sage Weil
0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2012-10-09 21:33 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
Reviewed-by: Sage Weil <sage@inktank.com>
On Tue, 9 Oct 2012, Alex Elder wrote:
> If ceph_fault() is unable to queue work after a delay, it sets the
> BACKOFF connection flag so con_work() will attempt to do so.
>
> In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
> result in newly-queued work, it simply ignores this condition and
> proceeds as if no backoff delay were desired. There are two
> problems with this--one of which is a bug.
>
> The first problem is simply that the intended behavior is to back
> off, and if we aren't able queue the work item to run after a delay
> we're not doing that.
>
> The only reason queue_delayed_work() won't queue work is if the
> provided work item is already queued. In the messenger, this
> means that con_work() is already scheduled to be run again. So
> if we simply set the BACKOFF flag again when this occurs, we know
> the next con_work() call will again attempt to hold off activity
> on the connection until after the delay.
>
> The second problem--the bug--is a leak of a reference count. If
> queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
> the connection reference held on entry to con_work(). However,
> processing is (was) allowed to continue, and at the end of the
> function a second con->ops->put() is called.
>
> This patch fixes both problems.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/messenger.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index f9f65fe..ece06bc 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2300,10 +2300,11 @@ restart:
> mutex_unlock(&con->mutex);
> return;
> } else {
> - con->ops->put(con);
> dout("con_work %p FAILED to back off %lu\n", con,
> con->delay);
> + set_bit(CON_FLAG_BACKOFF, &con->flags);
> }
> + goto done;
> }
>
> if (con->state == CON_STATE_STANDBY) {
> --
> 1.7.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/3] libceph: let con_work() handle backoff
2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder
@ 2012-10-09 21:34 ` Sage Weil
0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2012-10-09 21:34 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
Reviewed-by: Sage Weil <sage@inktank.com>
On Tue, 9 Oct 2012, Alex Elder wrote:
> Both ceph_fault() and con_work() include handling for imposing a
> delay before doing further processing on a faulted connection.
> The latter is used only if ceph_fault() is unable to.
>
> Instead, just let con_work() always be responsible for implementing
> the delay. After setting up the delay value, set the BACKOFF flag
> on the connection unconditionally and call queue_con() to ensure
> con_work() will get called to handle it.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/messenger.c | 20 ++------------------
> 1 file changed, 2 insertions(+), 18 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index ece06bc..9170c20 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2398,24 +2398,8 @@ static void ceph_fault(struct ceph_connection *con)
> con->delay = BASE_DELAY_INTERVAL;
> else if (con->delay < MAX_DELAY_INTERVAL)
> con->delay *= 2;
> - con->ops->get(con);
> - if (queue_delayed_work(ceph_msgr_wq, &con->work,
> - round_jiffies_relative(con->delay))) {
> - dout("fault queued %p delay %lu\n", con, con->delay);
> - } else {
> - con->ops->put(con);
> - dout("fault failed to queue %p delay %lu, backoff\n",
> - con, con->delay);
> - /*
> - * In many cases we see a socket state change
> - * while con_work is running and end up
> - * queuing (non-delayed) work, such that we
> - * can't backoff with a delay. Set a flag so
> - * that when con_work restarts we schedule the
> - * delay then.
> - */
> - set_bit(CON_FLAG_BACKOFF, &con->flags);
> - }
> + set_bit(CON_FLAG_BACKOFF, &con->flags);
> + queue_con(con);
> }
>
> out_unlock:
> --
> 1.7.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 3/3] libceph: define common queue_con_delay()
2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder
@ 2012-10-09 21:37 ` Sage Weil
0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2012-10-09 21:37 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
Reviewed-by: Sage Weil <sage@inktank.com>
On Tue, 9 Oct 2012, Alex Elder wrote:
> This patch defines a single function, queue_con_delay() to call
> queue_delayed_work() for a connection. It basically generalizes
> what was previously queue_con() by adding the delay argument.
> queue_con() is now a simple helper that passes 0 for its delay.
> queue_con_delay() returns 0 if it queued work or an errno if it
> did not for some reason.
>
> If con_work() finds the BACKOFF flag set for a connection, it now
> calls queue_con_delay() to handle arranging to start again after a
> delay.
>
>
> Note about connection reference counts: con_work() only ever gets
> called as a work item function. At the time that work is scheduled,
> a reference to the connection is acquired, and the corresponding
> con_work() call is then responsible for dropping that reference
> before it returns.
>
> Previously, the backoff handling inside con_work() silently handed
> off its reference to delayed work it scheduled. Now that
> queue_con_delay() is used, a new reference is acquired for the
> newly-scheduled work, and the original reference is dropped by the
> con->ops->put() call at the end of the function.
>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/messenger.c | 38 +++++++++++++++++++++++---------------
> 1 file changed, 23 insertions(+), 15 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 9170c20..77cc8b1 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2244,22 +2244,33 @@ bad_tag:
>
>
> /*
> - * Atomically queue work on a connection. Bump @con reference to
> - * avoid races with connection teardown.
> + * Atomically queue work on a connection after the specified delay.
> + * Bump @con reference to avoid races with connection teardown.
> + * Returns 0 if work was queued, or an error code otherwise.
> */
> -static void queue_con(struct ceph_connection *con)
> +static int queue_con_delay(struct ceph_connection *con, unsigned long delay)
> {
> if (!con->ops->get(con)) {
> - dout("queue_con %p ref count 0\n", con);
> - return;
> + dout("%s %p ref count 0\n", __func__, con);
> +
> + return -ENOENT;
> }
>
> - if (!queue_delayed_work(ceph_msgr_wq, &con->work, 0)) {
> - dout("queue_con %p - already queued\n", con);
> + if (!queue_delayed_work(ceph_msgr_wq, &con->work, delay)) {
> + dout("%s %p - already queued\n", __func__, con);
> con->ops->put(con);
> - } else {
> - dout("queue_con %p\n", con);
> +
> + return -EBUSY;
> }
> +
> + dout("%s %p %lu\n", __func__, con, delay);
> +
> + return 0;
> +}
> +
> +static void queue_con(struct ceph_connection *con)
> +{
> + (void) queue_con_delay(con, 0);
> }
>
> /*
> @@ -2294,14 +2305,11 @@ restart:
>
> if (test_and_clear_bit(CON_FLAG_BACKOFF, &con->flags)) {
> dout("con_work %p backing off\n", con);
> - if (queue_delayed_work(ceph_msgr_wq, &con->work,
> - round_jiffies_relative(con->delay))) {
> - dout("con_work %p backoff %lu\n", con, con->delay);
> - mutex_unlock(&con->mutex);
> - return;
> - } else {
> + ret = queue_con_delay(con,
> round_jiffies_relative(con->delay));
> + if (ret) {
> dout("con_work %p FAILED to back off %lu\n", con,
> con->delay);
> + BUG_ON(ret == -ENOENT);
> set_bit(CON_FLAG_BACKOFF, &con->flags);
> }
> goto done;
> --
> 1.7.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-10-09 21:37 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-09 21:31 [PATCH 0/3] libceph: fix backoff handling Alex Elder
2012-10-09 21:33 ` [PATCH 1/3] libceph: reset BACKOFF if unable to re-queue Alex Elder
2012-10-09 21:33 ` Sage Weil
2012-10-09 21:33 ` [PATCH 2/3] libceph: let con_work() handle backoff Alex Elder
2012-10-09 21:34 ` Sage Weil
2012-10-09 21:33 ` [PATCH 3/3] libceph: define common queue_con_delay() Alex Elder
2012-10-09 21:37 ` Sage Weil
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.