* [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
From: Paolo Bonzini @ 2016-10-07 12:21 UTC
To: qemu-devel; +Cc: wency, zhang.zhanghailiang, arei.gonglei, stefanha

Without this change, there is a race condition in tests/test-replication.
Depending on how fast the failover job (active commit) runs, there is a
chance of two bad things happening:

1) replication_done can be called after the secondary has been closed
and hence when the BDRVReplicationState is not valid anymore.

2) two copies of the active disk are present during the
/replication/secondary/stop test (that test runs immediately after
/replication/secondary/start, which tests failover). This causes the
corruption detector to fire.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/replication.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/block/replication.c b/block/replication.c
index 3bd1cf1..5231a00 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs)
     if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
         replication_stop(s->rs, false, NULL);
     }
+    if (s->replication_state == BLOCK_REPLICATION_FAILOVER) {
+        block_job_cancel_sync(s->active_disk->bs->job);
+    }

     if (s->mode == REPLICATION_MODE_SECONDARY) {
         g_free(s->top_id);
--
2.7.4
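
For readers who want to see the first failure mode in isolation, here is a
minimal standalone sketch of the cancel-before-close pattern. It is not QEMU
code: Job, Owner, job_cancel_sync, owner_close and
late_callbacks_after_close are hypothetical stand-ins invented for this
illustration, and the "event loop" is reduced to a direct call. The owner is
kept on the stack and only flagged as closed, so a late callback is counted
instead of touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not QEMU types. */
typedef void JobDoneFn(void *opaque, int ret);

typedef struct Job {
    JobDoneFn *done;   /* completion callback */
    void *opaque;      /* owner state handed to the callback */
    bool finished;     /* set once the callback has been delivered */
} Job;

typedef struct Owner {
    bool closed;         /* set once the owner has been torn down */
    Job *job;            /* in-flight background job, or NULL */
    int late_callbacks;  /* callbacks that arrived after close */
} Owner;

/* Completion callback, playing the role of replication_done. */
static void owner_done(void *opaque, int ret)
{
    Owner *o = opaque;

    (void)ret;
    if (o->closed) {
        /* With a really freed owner this would be a use-after-free. */
        o->late_callbacks++;
    }
    o->job = NULL;
}

/* Deliver the completion callback at most once. */
static void job_complete(Job *job, int ret)
{
    assert(job->done != NULL);
    if (!job->finished) {
        job->finished = true;
        job->done(job->opaque, ret);
    }
}

/* Synchronous cancel: returns only after the callback has run. */
static void job_cancel_sync(Job *job)
{
    job_complete(job, -1);
}

/* Tear down the owner, optionally cancelling the in-flight job first. */
static void owner_close(Owner *o, bool cancel_on_close)
{
    if (cancel_on_close && o->job) {
        job_cancel_sync(o->job);
    }
    o->closed = true;
}

/* Returns how many callbacks were delivered after the owner was closed. */
int late_callbacks_after_close(bool cancel_on_close)
{
    Owner o = { false, NULL, 0 };
    Job job = { owner_done, &o, false };

    o.job = &job;
    owner_close(&o, cancel_on_close);
    /* The job only gets around to finishing after the close. */
    job_complete(&job, 0);
    return o.late_callbacks;
}
```

Calling late_callbacks_after_close(false) models the unpatched close path,
where the completion callback can run against an owner that is already gone.
Calling it with true models the ordering the patch aims for: the job is
cancelled synchronously before teardown, so no callback arrives afterwards.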
* Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
From: Wen Congyang @ 2016-10-08 14:58 UTC
To: Paolo Bonzini, qemu-devel; +Cc: arei.gonglei, stefanha, zhang.zhanghailiang
At 2016/10/7 20:21, Paolo Bonzini wrote:
> Without this change, there is a race condition in tests/test-replication.
> Depending on how fast the failover job (active commit) runs, there is a
> chance of two bad things happening:
>
> 1) replication_done can be called after the secondary has been closed
> and hence when the BDRVReplicationState is not valid anymore.
>
> 2) two copies of the active disk are present during the
> /replication/secondary/stop test (that test runs immediately after
> /replication/secondary/start, which tests failover). This causes the
> corruption detector to fire.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch looks fine to me.
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
> ---
> block/replication.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/block/replication.c b/block/replication.c
> index 3bd1cf1..5231a00 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs)
>      if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
>          replication_stop(s->rs, false, NULL);
>      }
> +    if (s->replication_state == BLOCK_REPLICATION_FAILOVER) {
> +        block_job_cancel_sync(s->active_disk->bs->job);
> +    }
>
>      if (s->mode == REPLICATION_MODE_SECONDARY) {
>          g_free(s->top_id);
>
* Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
From: Changlong Xie @ 2016-10-12 5:02 UTC
To: Paolo Bonzini, qemu-devel; +Cc: arei.gonglei, stefanha, zhang.zhanghailiang
On 10/07/2016 08:21 PM, Paolo Bonzini wrote:
> Without this change, there is a race condition in tests/test-replication.
> Depending on how fast the failover job (active commit) runs, there is a
> chance of two bad things happening:
>
> 1) replication_done can be called after the secondary has been closed
> and hence when the BDRVReplicationState is not valid anymore.
>
> 2) two copies of the active disk are present during the
> /replication/secondary/stop test (that test runs immediately after
> /replication/secondary/start, which tests failover). This causes the
> corruption detector to fire.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
> ---
> block/replication.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/block/replication.c b/block/replication.c
> index 3bd1cf1..5231a00 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -133,6 +133,9 @@ static void replication_close(BlockDriverState *bs)
>      if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
>          replication_stop(s->rs, false, NULL);
>      }
> +    if (s->replication_state == BLOCK_REPLICATION_FAILOVER) {
> +        block_job_cancel_sync(s->active_disk->bs->job);
> +    }
>
>      if (s->mode == REPLICATION_MODE_SECONDARY) {
>          g_free(s->top_id);
>
* Re: [Qemu-devel] [PATCH] replication: interrupt failover if the main device is closed
From: Stefan Hajnoczi @ 2016-10-14 15:22 UTC
To: Paolo Bonzini; +Cc: qemu-devel, arei.gonglei, stefanha, zhang.zhanghailiang
On Fri, Oct 07, 2016 at 02:21:33PM +0200, Paolo Bonzini wrote:
> Without this change, there is a race condition in tests/test-replication.
> Depending on how fast the failover job (active commit) runs, there is a
> chance of two bad things happening:
>
> 1) replication_done can be called after the secondary has been closed
> and hence when the BDRVReplicationState is not valid anymore.
>
> 2) two copies of the active disk are present during the
> /replication/secondary/stop test (that test runs immediately after
> /replication/secondary/start, which tests failover). This causes the
> corruption detector to fire.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/replication.c | 3 +++
> 1 file changed, 3 insertions(+)
Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block
Stefan