* [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
From: Paolo Bonzini @ 2016-10-07 12:21 UTC
  To: qemu-devel; +Cc: wency, zhang.zhanghailiang, arei.gonglei, stefanha

Without this change, there is a race condition in tests/test-replication.
Depending on how fast the failover job (active commit) runs, there is a
chance of two bad things happening:

1) replication_done can be called after the secondary has been closed
and hence when the BDRVReplicationState is not valid anymore.

2) two copies of the active disk are present during the
/replication/secondary/stop test (that test runs immediately after
/replication/secondary/start, which tests failover).  This causes the
corruption detector to fire.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/replication.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/replication.c b/block/replication.c
index 3bd1cf1..5231a00 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs)
     if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
         replication_stop(s->rs, false, NULL);
     }
+    if (s->replication_state == BLOCK_REPLICATION_FAILOVER) {
+        block_job_cancel_sync(s->active_disk->bs->job);
+    }
 
     if (s->mode == REPLICATION_MODE_SECONDARY) {
         g_free(s->top_id);
-- 
2.7.4
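
The ordering that the hunk above enforces (cancel the failover job synchronously, then tear down the state) can be sketched in a standalone toy model. This is only an illustration under stated assumptions: ToyState, ToyJob and every toy_* function are hypothetical stand-ins, not QEMU types or APIs, and the model assumes that a synchronous cancel runs the job's completion callback before it returns. A "valid" flag stands in for the freed BDRVReplicationState so that the bad ordering can be observed without an actual use-after-free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device replication state. */
typedef struct ToyState {
    bool valid;               /* cleared by toy_close(); models the free */
} ToyState;

/* Hypothetical stand-in for the failover (active commit) job. */
typedef struct ToyJob {
    ToyState *owner;          /* state the completion callback will touch */
    bool finished;            /* completion callback has already run */
    bool cb_saw_valid_owner;  /* what the callback observed when it ran */
} ToyJob;

/* Completion path: plays the role of a "done" callback that reads the
 * owner's state. It runs at most once per job. */
static void toy_job_complete(ToyJob *job)
{
    assert(job->owner != NULL);
    if (job->finished) {
        return;
    }
    job->finished = true;
    job->cb_saw_valid_owner = job->owner->valid;
}

/* Assumption of this model: a synchronous cancel drives the job to
 * completion (including its callback) before returning. */
static void toy_job_cancel_sync(ToyJob *job)
{
    toy_job_complete(job);
}

/* Close path. With cancel_first set, the job is finished while the
 * state is still valid; without it, the job outlives the state. */
static void toy_close(ToyState *s, ToyJob *job, bool cancel_first)
{
    if (cancel_first && job != NULL && !job->finished) {
        toy_job_cancel_sync(job);
    }
    s->valid = false;
}

/* Runs one close-then-late-completion scenario and reports whether the
 * callback saw a valid owner. */
static bool toy_run(bool cancel_first)
{
    ToyState s = { .valid = true };
    ToyJob job = { .owner = &s };

    toy_close(&s, &job, cancel_first);
    toy_job_complete(&job);   /* the job finishing "late", after close */
    return job.cb_saw_valid_owner;
}
```

In this model, toy_run(true) corresponds to the patched close path, where the callback runs before teardown, and toy_run(false) to the unpatched one, where the callback runs against state that is no longer valid.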

* Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
From: Wen Congyang @ 2016-10-08 14:58 UTC
  To: Paolo Bonzini, qemu-devel; +Cc: arei.gonglei, stefanha, zhang.zhanghailiang

At 2016/10/7 20:21, Paolo Bonzini wrote:
> Without this change, there is a race condition in tests/test-replication.
> Depending on how fast the failover job (active commit) runs, there is a
> chance of two bad things happening:
>
> 1) replication_done can be called after the secondary has been closed
> and hence when the BDRVReplicationState is not valid anymore.
>
> 2) two copies of the active disk are present during the
> /replication/secondary/stop test (that test runs immediately after
> /replication/secondary/start, which tests failover).  This causes the
> corruption detector to fire.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

This patch looks fine to me.
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>

> ---
>   block/replication.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/block/replication.c b/block/replication.c
> index 3bd1cf1..5231a00 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs)
>       if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
>           replication_stop(s->rs, false, NULL);
>       }
> +    if (s->replication_state == BLOCK_REPLICATION_FAILOVER) {
> +        block_job_cancel_sync(s->active_disk->bs->job);
> +    }
>
>       if (s->mode == REPLICATION_MODE_SECONDARY) {
>           g_free(s->top_id);
>

* Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
From: Changlong Xie @ 2016-10-12  5:02 UTC
  To: Paolo Bonzini, qemu-devel; +Cc: arei.gonglei, stefanha, zhang.zhanghailiang

On 10/07/2016 08:21 PM, Paolo Bonzini wrote:
> Without this change, there is a race condition in tests/test-replication.
> Depending on how fast the failover job (active commit) runs, there is a
> chance of two bad things happening:
>
> 1) replication_done can be called after the secondary has been closed
> and hence when the BDRVReplicationState is not valid anymore.
>
> 2) two copies of the active disk are present during the
> /replication/secondary/stop test (that test runs immediately after
> /replication/secondary/start, which tests failover).  This causes the
> corruption detector to fire.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>

> ---
>   block/replication.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/block/replication.c b/block/replication.c
> index 3bd1cf1..5231a00 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs)
>       if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
>           replication_stop(s->rs, false, NULL);
>       }
> +    if (s->replication_state == BLOCK_REPLICATION_FAILOVER) {
> +        block_job_cancel_sync(s->active_disk->bs->job);
> +    }
>
>       if (s->mode == REPLICATION_MODE_SECONDARY) {
>           g_free(s->top_id);
>

* Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
From: Stefan Hajnoczi @ 2016-10-14 15:22 UTC
  To: Paolo Bonzini; +Cc: qemu-devel, arei.gonglei, stefanha, zhang.zhanghailiang

On Fri, Oct 07, 2016 at 02:21:33PM +0200, Paolo Bonzini wrote:
> Without this change, there is a race condition in tests/test-replication.
> Depending on how fast the failover job (active commit) runs, there is a
> chance of two bad things happening:
> 
> 1) replication_done can be called after the secondary has been closed
> and hence when the BDRVReplicationState is not valid anymore.
> 
> 2) two copies of the active disk are present during the
> /replication/secondary/stop test (that test runs immediately after
> /replication/secondary/start, which tests failover).  This causes the
> corruption detector to fire.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/replication.c | 3 +++
>  1 file changed, 3 insertions(+)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan
