* [RFC PATCH] libceph: wait for con->work to finish when cancelling con
@ 2022-03-08  9:59 xiubli
  2022-03-08 11:45 ` Jeff Layton
  0 siblings, 1 reply; 3+ messages in thread
From: xiubli @ 2022-03-08  9:59 UTC (permalink / raw)
  To: jlayton, idryomov; +Cc: vshankar, ceph-devel, Xiubo Li

From: Xiubo Li <xiubli@redhat.com>

When the client reconnects to the MDS it reopens the con with a new
IP address, but when opening the con with the new address there is no
guarantee that the stale queued work has finished. So it's possible
that the stale work will run against the new connection state.

Switch to cancel_delayed_work_sync() so the cancel waits for any
in-flight work to finish.

URL: https://tracker.ceph.com/issues/54461
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 net/ceph/messenger.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index d3bb656308b4..32eb5dc00583 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1416,7 +1416,7 @@ static void queue_con(struct ceph_connection *con)
 
 static void cancel_con(struct ceph_connection *con)
 {
-	if (cancel_delayed_work(&con->work)) {
+	if (cancel_delayed_work_sync(&con->work)) {
 		dout("%s %p\n", __func__, con);
 		con->ops->put(con);
 	}
-- 
2.27.0



* Re: [RFC PATCH] libceph: wait for con->work to finish when cancelling con
  2022-03-08  9:59 [RFC PATCH] libceph: wait for con->work to finish when cancelling con xiubli
@ 2022-03-08 11:45 ` Jeff Layton
  2022-03-08 13:02   ` Xiubo Li
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Layton @ 2022-03-08 11:45 UTC (permalink / raw)
  To: xiubli, idryomov; +Cc: vshankar, ceph-devel

On Tue, 2022-03-08 at 17:59 +0800, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
> 
> When the client reconnects to the MDS it reopens the con with a new
> IP address, but when opening the con with the new address there is no
> guarantee that the stale queued work has finished. So it's possible
> that the stale work will run against the new connection state.
> 
> Switch to cancel_delayed_work_sync() so the cancel waits for any
> in-flight work to finish.
> 
> URL: https://tracker.ceph.com/issues/54461
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  net/ceph/messenger.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index d3bb656308b4..32eb5dc00583 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -1416,7 +1416,7 @@ static void queue_con(struct ceph_connection *con)
>  
>  static void cancel_con(struct ceph_connection *con)
>  {
> -	if (cancel_delayed_work(&con->work)) {
> +	if (cancel_delayed_work_sync(&con->work)) {
>  		dout("%s %p\n", __func__, con);
>  		con->ops->put(con);
>  	}

Won't this deadlock?

This function is called from ceph_con_close with the con->mutex held.
The work will try to take the same mutex and will get stuck. If you want
to do this, then you may also need to change it to call cancel_con after
dropping the mutex.
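
The deadlock Jeff describes can be sketched in userspace (a hedged analogy, not the actual kernel code): a pthread mutex stands in for con->mutex, and pthread_join stands in for the wait inside cancel_delayed_work_sync(). The names fake_con_work and run_close_demo are invented for illustration.

```c
#include <pthread.h>

static pthread_mutex_t con_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for con_work(): like the real work item, the first thing it
 * does is take the connection mutex. */
static void *fake_con_work(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&con_mutex);
	/* ... process the connection ... */
	pthread_mutex_unlock(&con_mutex);
	return NULL;
}

/* Stand-in for the close path: returns 0 once the worker has finished. */
int run_close_demo(void)
{
	pthread_t work;

	pthread_mutex_lock(&con_mutex);   /* ceph_con_close() holds con->mutex */
	pthread_create(&work, NULL, fake_con_work, NULL);

	/*
	 * Calling pthread_join(work, NULL) HERE would deadlock: the worker
	 * is blocked on con_mutex, which we hold. That is the situation
	 * cancel_delayed_work_sync() under con->mutex would create.
	 */

	pthread_mutex_unlock(&con_mutex); /* drop the mutex first ...         */
	return pthread_join(work, NULL);  /* ... then waiting is safe         */
}
```

Dropping the mutex before waiting is the whole fix: once the waiter no longer holds the lock the work item needs, the synchronous cancel can complete.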
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [RFC PATCH] libceph: wait for con->work to finish when cancelling con
  2022-03-08 11:45 ` Jeff Layton
@ 2022-03-08 13:02   ` Xiubo Li
  0 siblings, 0 replies; 3+ messages in thread
From: Xiubo Li @ 2022-03-08 13:02 UTC (permalink / raw)
  To: Jeff Layton, idryomov; +Cc: vshankar, ceph-devel


On 3/8/22 7:45 PM, Jeff Layton wrote:
> On Tue, 2022-03-08 at 17:59 +0800, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> When the client reconnects to the MDS it reopens the con with a new
>> IP address, but when opening the con with the new address there is no
>> guarantee that the stale queued work has finished. So it's possible
>> that the stale work will run against the new connection state.
>>
>> Switch to cancel_delayed_work_sync() so the cancel waits for any
>> in-flight work to finish.
>>
>> URL: https://tracker.ceph.com/issues/54461
>> Signed-off-by: Xiubo Li <xiubli@redhat.com>
>> ---
>>   net/ceph/messenger.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>> index d3bb656308b4..32eb5dc00583 100644
>> --- a/net/ceph/messenger.c
>> +++ b/net/ceph/messenger.c
>> @@ -1416,7 +1416,7 @@ static void queue_con(struct ceph_connection *con)
>>   
>>   static void cancel_con(struct ceph_connection *con)
>>   {
>> -	if (cancel_delayed_work(&con->work)) {
>> +	if (cancel_delayed_work_sync(&con->work)) {
>>   		dout("%s %p\n", __func__, con);
>>   		con->ops->put(con);
>>   	}
> Won't this deadlock?
>
> This function is called from ceph_con_close with the con->mutex held.
> The work will try to take the same mutex and will get stuck. If you want
> to do this, then you may also need to change it to call cancel_con after
> dropping the mutex.

Yeah, correct :-)

- Xiubo
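
The restructuring Jeff suggests might look like the sketch below (an untested, abbreviated sketch against net/ceph/messenger.c, not the patch that was actually merged; the real ceph_con_close() does more under the mutex than the comment indicates):

```c
/* Sketch only: wait for con->work after dropping con->mutex. */
void ceph_con_close(struct ceph_connection *con)
{
	mutex_lock(&con->mutex);
	/* ... mark the connection closed, reset state ... */
	mutex_unlock(&con->mutex);

	/*
	 * With con->mutex dropped, a queued con->work can take the mutex
	 * and run to completion, so a cancel_delayed_work_sync() inside
	 * cancel_con() can no longer deadlock against it.
	 */
	cancel_con(con);
}
```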



