* [Qemu-devel] COLO: how to flip a secondary to a primary?
@ 2016-01-22 19:35 Dr. David Alan Gilbert
  2016-01-25  1:32 ` Wen Congyang
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2016-01-22 19:35 UTC
  To: Changlong Xie, zhanghailiang, Wen Congyang
  Cc: Simon Kollberg, Luis Tomas, qemu devel, qemu block, Abel Souza

Hi,
  I've been looking at what's needed to add a new secondary after
a primary failed; from the block side it doesn't look as hard
as I'd expected, perhaps you can tell me if I'm missing something!

The normal primary setup is:

   quorum
      Real disk
      nbd client

The normal secondary setup is:
   replication
      active-disk
      hidden-disk
      Real-disk

With a couple of minor code hacks, I changed the secondary to be:

   quorum
      replication
        active-disk
        hidden-disk
        Real-disk
      dummy-disk

and then after the primary fails, I start a new secondary
on another host and then on the old secondary do:

  nbd_server_stop
  stop
  x_block_change top-quorum -d children.0         # deletes use of real disk, leaves dummy
  drive_del active-disk0
  x_block_change top-quorum -a node-real-disk
  x_block_change top-quorum -d children.1         # Seems to have deleted the dummy?!, the disk is now child 0
  drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=ibpair,file.port=8889,file.export=colo-disk0,node-name=nbd-client,if=none,cache=none
  x_block_change top-quorum -a nbd-client
  c
  migrate_set_capability x-colo on
  migrate -d -b tcp:ibpair:8888

and I think that means what was the secondary now has the same disk
structure as a normal primary.
That's not quite happy yet, and I've not figured out why - but the
order/structure of the block devices looks right?

Notes:
   a) The dummy serves two purposes, 1) it works around the segfault
      I reported in the other mail, 2) when I delete the real disk in the
      first x_block_change it means the quorum still has 1 disk so doesn't
      get upset.
   b) I had to remove the restriction in quorum_start_replication
      on which mode it would run in. 
   c) I'm not really sure everything knows it's in secondary mode yet, and
      I'm not convinced whether the replication is doing the right thing.
   d) The migrate -d -b eventually fails on the destination; I've not
      worked out why yet.
   e) Adding/deleting children on quorum is hard when you have to use the
      children.0/1 notation even though you added the children by node name -
      it's worrying not knowing which number is which; is there a way to give
      them a name?
   f) I've not thought about the colo-proxy that much yet - I guess that
      existing connections need to keep their sequence number offset but
      new connections made by what is now the primary don't need to do anything
      special.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
  2016-01-22 19:35 [Qemu-devel] COLO: how to flip a secondary to a primary? Dr. David Alan Gilbert
@ 2016-01-25  1:32 ` Wen Congyang
  2016-01-25  2:11   ` Li Zhijian
  2016-01-25 18:59   ` Dr. David Alan Gilbert
  0 siblings, 2 replies; 7+ messages in thread
From: Wen Congyang @ 2016-01-25  1:32 UTC
  To: Dr. David Alan Gilbert, Changlong Xie, zhanghailiang
  Cc: qemu block, Li Zhijian, qemu devel, Luis Tomas, Simon Kollberg,
	Abel Souza

On 01/23/2016 03:35 AM, Dr. David Alan Gilbert wrote:
> Hi,
>   I've been looking at what's needed to add a new secondary after
> a primary failed; from the block side it doesn't look as hard
> as I'd expected, perhaps you can tell me if I'm missing something!
> 
> The normal primary setup is:
> 
>    quorum
>       Real disk
>       nbd client

quorum
   real disk
   replication
      nbd client

> 
> The normal secondary setup is:
>    replication
>       active-disk
>       hidden-disk
>       Real-disk

IIRC, we can do it like this:
quorum
   replication
      active-disk
      hidden-disk
      real-disk

> 
> With a couple of minor code hacks; I changed the secondary to be:
> 
>    quorum
>       replication
>         active-disk
>         hidden-disk
>         Real-disk
>       dummy-disk

after failover,
quorum
   replication(old, mode is secondary)
     active-disk
     hidden-disk*
     real-disk*
   replication(new, mode is primary)
     nbd-client

In the newest version, we active-commit active-disk into real-disk,
so it will be:
quorum
   replication(old, mode is secondary)
     active-disk(it is real disk now)
   replication(new, mode is primary)
     nbd-client

> 
> and then after the primary fails, I start a new secondary
> on another host and then on the old secondary do:
> 
>   nbd_server_stop
>   stop
>   x_block_change top-quorum -d children.0         # deletes use of real disk, leaves dummy
>   drive_del active-disk0
>   x_block_change top-quorum -a node-real-disk
>   x_block_change top-quorum -d children.1         # Seems to have deleted the dummy?!, the disk is now child 0
>   drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=ibpair,file.port=8889,file.export=colo-disk0,node-name=nbd-client,if=none,cache=none
>   x_block_change top-quorum -a nbd-client
>   c
>   migrate_set_capability x-colo on
>   migrate -d -b tcp:ibpair:8888
> 
> and I think that means what was the secondary, has the same disk
> structure as a normal primary.
> That's not quite happy yet, and I've not figured out why - but the
> order/structure of the block devices looks right?
> 
> Notes:
>    a) The dummy serves two purposes, 1) it works around the segfault
>       I reported in the other mail, 2) when I delete the real disk in the
>       first x_block_change it means the quorum still has 1 disk so doesn't
>       get upset.

I don't understand purpose 2.

>    b) I had to remove the restriction in quorum_start_replication
>       on which mode it would run in. 

IIRC, this check will be removed.

>    c) I'm not really sure everything knows it's in secondary mode yet, and
>       I'm not convinced whether the replication is doing the right thing.
>    d) The migrate -d -b   eventually fails on the destination, not worked out why
>       yet.

Can you give me the error message?

>    e) Adding/deleting children on quorum is hard having to use the children.0/1
>       notation when you've added children using node names - it's worrying
>       which number is which; is there a way to give them a name?

No. I think we can improve 'info block' output.

>    f) I've not thought about the colo-proxy that much yet - I guess that
>       existing connections need to keep their sequence number offset but
>       new connections made by what is now the primary dont need to do anything
>       special.

Hailiang or Zhijian can answer this question.

Thanks
Wen Congyang

> 
> Dave
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> 
> .
> 


* Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
  2016-01-25  1:32 ` Wen Congyang
@ 2016-01-25  2:11   ` Li Zhijian
  2016-01-25 20:20     ` Dr. David Alan Gilbert
  2016-01-25 18:59   ` Dr. David Alan Gilbert
  1 sibling, 1 reply; 7+ messages in thread
From: Li Zhijian @ 2016-01-25  2:11 UTC
  To: Wen Congyang, Dr. David Alan Gilbert, Changlong Xie, zhanghailiang
  Cc: Simon Kollberg, Luis Tomas, qemu devel, qemu block, Abel Souza



On 01/25/2016 09:32 AM, Wen Congyang wrote:
>> >    f) I've not thought about the colo-proxy that much yet - I guess that
>> >       existing connections need to keep their sequence number offset but

Strictly speaking, after failover we only need to keep servicing the TCP connections that were
established after the last checkpoint, not all existing connections. After a checkpoint
(with the primary and secondary nodes both working), the primary VM and secondary VM are identical,
which means the existing TCP connections have the same sequence numbers on both.
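
To make that concrete, here is a rough Python sketch (a toy model, not colo-proxy code). It assumes a connection opened between checkpoints can end up with different initial sequence numbers on the two guests, and that the proxy remembers the difference. The names `Conn`, `primary_isn`, `secondary_isn` and `needs_rewrite` are made up for the illustration.

```python
from dataclasses import dataclass

SEQ_MOD = 2**32  # TCP sequence numbers wrap at 32 bits


@dataclass
class Conn:
    name: str
    opened_at: int      # logical time at which the connection was established
    primary_isn: int    # initial sequence number chosen by the primary guest
    secondary_isn: int  # initial sequence number chosen by the secondary guest

    def offset(self) -> int:
        # Amount to add to the secondary's sequence numbers so that they
        # match what the peer has already seen from the primary.
        return (self.primary_isn - self.secondary_isn) % SEQ_MOD


def needs_rewrite(conns, last_checkpoint):
    """Connections the old secondary must keep rewriting after failover."""
    return [c for c in conns if c.opened_at > last_checkpoint]


conns = [
    # Opened before the checkpoint: both guests share the same TCP state.
    Conn("old", opened_at=5, primary_isn=1000, secondary_isn=1000),
    # Opened after the checkpoint: the guests picked different ISNs.
    Conn("new", opened_at=12, primary_isn=4_000_000_000, secondary_isn=700),
]

for c in needs_rewrite(conns, last_checkpoint=10):
    print(c.name, c.offset())
```

Only the connection opened after the checkpoint is selected; the one that already existed at the checkpoint has a zero offset and needs no rewriting.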

>> >       new connections made by what is now the primary dont need to do anything
>> >       special.
Yes, you are right.


> Hailiang or Zhijian can answer this question.


Thanks
Li Zhijian


* Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
  2016-01-25  1:32 ` Wen Congyang
  2016-01-25  2:11   ` Li Zhijian
@ 2016-01-25 18:59   ` Dr. David Alan Gilbert
  2016-01-26  1:06     ` Wen Congyang
  1 sibling, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2016-01-25 18:59 UTC
  To: Wen Congyang
  Cc: Li Zhijian, Changlong Xie, zhanghailiang, qemu block, qemu devel,
	Luis Tomas, Simon Kollberg, Abel Souza

* Wen Congyang (wency@cn.fujitsu.com) wrote:
> On 01/23/2016 03:35 AM, Dr. David Alan Gilbert wrote:
> > Hi,
> >   I've been looking at what's needed to add a new secondary after
> > a primary failed; from the block side it doesn't look as hard
> > as I'd expected, perhaps you can tell me if I'm missing something!
> > 
> > The normal primary setup is:
> > 
> >    quorum
> >       Real disk
> >       nbd client
> 
> quorum
>    real disk
>    replication
>       nbd client
> 
> > 
> > The normal secondary setup is:
> >    replication
> >       active-disk
> >       hidden-disk
> >       Real-disk
> 
> IIRC, we can do it like this:
> quorum
>    replication
>       active-disk
>       hidden-disk
>       real-disk

Yes.

> > With a couple of minor code hacks; I changed the secondary to be:
> > 
> >    quorum
> >       replication
> >         active-disk
> >         hidden-disk
> >         Real-disk
> >       dummy-disk
> 
> after failover,
> quorum
>    replication(old, mode is secondary)
>      active-disk
>      hidden-disk*
>      real-disk*
>    replication(new, mode is primary)
>      nbd-client

Do you need to keep the old secondary-replication?
Does that just pass straight through?

> In the newest version, we active commit active-disk to real-disk.
> So it will be:
> quorum
>    replication(old, mode is secondary)
>      active-disk(it is real disk now)
>    replication(new, mode is primary)
>      nbd-client

How does that active-commit work?  I didn't think you
could change the real disk until you had the full checkpoint,
since you don't know whether the primary's or the secondary's
changes need to be written?

> > and then after the primary fails, I start a new secondary
> > on another host and then on the old secondary do:
> > 
> >   nbd_server_stop
> >   stop
> >   x_block_change top-quorum -d children.0         # deletes use of real disk, leaves dummy
> >   drive_del active-disk0
> >   x_block_change top-quorum -a node-real-disk
> >   x_block_change top-quorum -d children.1         # Seems to have deleted the dummy?!, the disk is now child 0
> >   drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=ibpair,file.port=8889,file.export=colo-disk0,node-name=nbd-client,if=none,cache=none
> >   x_block_change top-quorum -a nbd-client
> >   c
> >   migrate_set_capability x-colo on
> >   migrate -d -b tcp:ibpair:8888
> > 
> > and I think that means what was the secondary, has the same disk
> > structure as a normal primary.
> > That's not quite happy yet, and I've not figured out why - but the
> > order/structure of the block devices looks right?
> > 
> > Notes:
> >    a) The dummy serves two purposes, 1) it works around the segfault
> >       I reported in the other mail, 2) when I delete the real disk in the
> >       first x_block_change it means the quorum still has 1 disk so doesn't
> >       get upset.
> 
> I don't understand the purpose 2.

quorum won't allow you to delete all its members ('The number of children cannot be lower than the vote threshold 1')
and it's very tricky getting the order correct with add/delete; for example
I tried:

drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=ibpair,file.port=8889,file.export=colo-disk0,node-name=nbd-client,if=none,cache=none
# gets children.1
x_block_change top-quorum -a nbd-client
# deletes the secondary replication
x_block_change top-quorum -d children.0
drive_del active-disk0
# ends up as children.0 but in the 2nd slot
x_block_change top-quorum -a node-real-disk

info block shows me:
top-quorum (#block615): json:{"children": [
    {"driver": "replication", "mode": "primary", "file": {"port": "8889", "host": "ibpair", "driver": "nbd", "export": "colo-disk0"}},
    {"driver": "raw", "file": {"driver": "file", "filename": "/home/localvms/bugzilla.raw"}}
   ],
   "driver": "quorum", "blkverify": false, "rewrite-corrupted": false, "vote-threshold": 1} (quorum)
    Cache mode:       writeback

that has the replication first and the file second; that's the opposite
from the normal primary startup - does it matter?
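
For checking the slot order without reading the JSON by eye, a small Python helper along these lines might do (a sketch only; it parses the `json:{...}` string copied from the `info block` output above, and `describe` and `child_order` are names invented here). Note it reports the position in the list, which is not necessarily the same thing as the children.N name.

```python
import json

# The "json:{...}" pseudo-filename printed by `info block` above.
INFO_BLOCK_FILENAME = (
    'json:{"children": ['
    '{"driver": "replication", "mode": "primary", "file": {"port": "8889", '
    '"host": "ibpair", "driver": "nbd", "export": "colo-disk0"}}, '
    '{"driver": "raw", "file": {"driver": "file", '
    '"filename": "/home/localvms/bugzilla.raw"}}], '
    '"driver": "quorum", "blkverify": false, "rewrite-corrupted": false, '
    '"vote-threshold": 1}'
)


def describe(node):
    """Summarise one child as its driver chain, outermost driver first."""
    chain = []
    while isinstance(node, dict):
        chain.append(node.get("driver", "?"))
        node = node.get("file")
    return " -> ".join(chain)


def child_order(filename):
    """Return the quorum children in the order `info block` lists them."""
    prefix = "json:"
    if not filename.startswith(prefix):
        raise ValueError("not a json: pseudo-filename")
    opts = json.loads(filename[len(prefix):])
    return [describe(child) for child in opts.get("children", [])]


for slot, summary in enumerate(child_order(INFO_BLOCK_FILENAME)):
    print(f"slot {slot}: {summary}")
```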

I can't add node-real-disk until I drive_del active-disk0 (which
previously used it);  and I can't drive_del until I remove
it from the quorum; but I can't remove that from the quorum first,
because that leaves an empty quorum.
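
The deadlock, and the way the dummy gets around it, can be sketched with a toy Python model (not QEMU code; it only models the vote-threshold rule quoted above, and the child labels are just strings chosen for the example):

```python
class Quorum:
    """Toy model of the 'not below the vote threshold' rule."""

    def __init__(self, children, vote_threshold=1):
        self.children = list(children)
        self.vote_threshold = vote_threshold

    def delete(self, child):
        if len(self.children) - 1 < self.vote_threshold:
            raise RuntimeError(
                "The number of children cannot be lower than the vote "
                f"threshold {self.vote_threshold}"
            )
        self.children.remove(child)

    def add(self, child):
        self.children.append(child)


def flip_to_primary(quorum):
    """Replay the add/delete order from the first mail on the toy model."""
    quorum.delete("secondary-replication")  # x_block_change -d children.0
    # drive_del active-disk0 happens here and releases node-real-disk
    quorum.add("node-real-disk")            # x_block_change -a node-real-disk
    quorum.delete("dummy-disk")             # x_block_change -d children.1
    quorum.add("nbd-client")                # x_block_change -a nbd-client
    return quorum.children


# Without the dummy, the very first delete is refused.
try:
    Quorum(["secondary-replication"]).delete("secondary-replication")
    refused = False
except RuntimeError as err:
    refused = True
    print("refused:", err)

# With the dummy, the same sequence goes through.
result = flip_to_primary(Quorum(["secondary-replication", "dummy-disk"]))
print(result)
```

In this model the dummy is what keeps the quorum at one child while the real disk is being detached and re-attached.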

> >    b) I had to remove the restriction in quorum_start_replication
> >       on which mode it would run in. 
> 
> IIRC, this check will be removed.
> 
> >    c) I'm not really sure everything knows it's in secondary mode yet, and
> >       I'm not convinced whether the replication is doing the right thing.
> >    d) The migrate -d -b   eventually fails on the destination, not worked out why
> >       yet.
> 
> Can you give me the error message?

I need to repeat it to check; it was something like a bad flag from the block migration
code; it happened after the block migration hit 100%.

> >    e) Adding/deleting children on quorum is hard having to use the children.0/1
> >       notation when you've added children using node names - it's worrying
> >       which number is which; is there a way to give them a name?
> 
> No. I think we can improve 'info block' output.

Yes, that would be good; I thought it was the order in the list; but after
debugging it today I'm not convinced it is; I think it always keeps the same
name - so for example if you start off with [children.0, children.1]; then
delete children.0 you now have [children.1];  if you then add a new
child I *think* that becomes children.0 but you end up with [children.1,children.0]
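
Here is that guess written down as a toy Python model, so the behaviour I think I'm seeing is unambiguous. The "lowest free index" rule is purely my assumption to reproduce the observation; I haven't confirmed it against the quorum code, and `NamedChildren` is a made-up class.

```python
class NamedChildren:
    """Toy model: names stick to children, new children go to the list end."""

    def __init__(self):
        self.slots = []  # (name, node) pairs in list order

    def add(self, node):
        used = {name for name, _ in self.slots}
        index = 0
        # Assumption: the new child gets the lowest unused children.N name.
        while f"children.{index}" in used:
            index += 1
        name = f"children.{index}"
        self.slots.append((name, node))
        return name

    def delete(self, name):
        self.slots = [(n, node) for n, node in self.slots if n != name]

    def names(self):
        return [n for n, _ in self.slots]


q = NamedChildren()
q.add("first")             # children.0
q.add("second")            # children.1
q.delete("children.0")
new_name = q.add("third")  # reuses the freed name, but lands at the end
print(new_name, q.names())
```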

> >    f) I've not thought about the colo-proxy that much yet - I guess that
> >       existing connections need to keep their sequence number offset but
> >       new connections made by what is now the primary dont need to do anything
> >       special.
> 
> Hailiang or Zhijian can answer this question.

Thanks,

> Thanks
> Wen Congyang
> 
> > 
> > Dave
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > 
> > 
> > .
> > 
> 
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
  2016-01-25  2:11   ` Li Zhijian
@ 2016-01-25 20:20     ` Dr. David Alan Gilbert
  2016-01-26  1:16       ` Li Zhijian
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2016-01-25 20:20 UTC
  To: Li Zhijian
  Cc: Changlong Xie, zhanghailiang, qemu block, qemu devel, Luis Tomas,
	Simon Kollberg, Abel Souza

* Li Zhijian (lizhijian@cn.fujitsu.com) wrote:
> 
> 
> On 01/25/2016 09:32 AM, Wen Congyang wrote:
> >>>    f) I've not thought about the colo-proxy that much yet - I guess that
> >>>       existing connections need to keep their sequence number offset but
> 
> Strictly speaking, after failover, we only need to keep servicing for the tcp connections which are
> established after the last checkpoint but not all existing connections. Because after a checkpoint
> (primary and secondary node works well), primary vm and secondary vm is same, that means the existing
> tcp connection has the same sequence.
> 
> >>>       new connections made by what is now the primary dont need to do anything
> >>>       special.
> Yes, you are right.

I wonder whether we need to do something special to the new-secondary;
consider this:

   1 primary (P1) & secondary (S1) run together
   2 New connection opened
   3    secondary records an offset
   4 <running OK for a while - no checkpoint>
   5 primary (P1) fails; do failover to secondary
   6 secondary (S1) still rewrites sequence for connection opened at (2)
   7 Start new-secondary (S2), send checkpoint from S1->S2
   8 S2 has same guest contents as S1; so the
     sequence numbers are still offset compared to the outside world.

So S2 needs to be sent the offsets for existing connections; otherwise,
if S1 were then to fail, S2 would send the wrong output on the existing
connection?

Dave
     
> 
> 
> >Hailiang or Zhijian can answer this question.
> 
> 
> Thanks
> Li Zhijian
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
  2016-01-25 18:59   ` Dr. David Alan Gilbert
@ 2016-01-26  1:06     ` Wen Congyang
  0 siblings, 0 replies; 7+ messages in thread
From: Wen Congyang @ 2016-01-26  1:06 UTC
  To: Dr. David Alan Gilbert
  Cc: Li Zhijian, Changlong Xie, zhanghailiang, qemu block, qemu devel,
	Luis Tomas, Simon Kollberg, Abel Souza

On 01/26/2016 02:59 AM, Dr. David Alan Gilbert wrote:
> * Wen Congyang (wency@cn.fujitsu.com) wrote:
>> On 01/23/2016 03:35 AM, Dr. David Alan Gilbert wrote:
>>> Hi,
>>>   I've been looking at what's needed to add a new secondary after
>>> a primary failed; from the block side it doesn't look as hard
>>> as I'd expected, perhaps you can tell me if I'm missing something!
>>>
>>> The normal primary setup is:
>>>
>>>    quorum
>>>       Real disk
>>>       nbd client
>>
>> quorum
>>    real disk
>>    replication
>>       nbd client
>>
>>>
>>> The normal secondary setup is:
>>>    replication
>>>       active-disk
>>>       hidden-disk
>>>       Real-disk
>>
>> IIRC, we can do it like this:
>> quorum
>>    replication
>>       active-disk
>>       hidden-disk
>>       real-disk
> 
> Yes.
> 
>>> With a couple of minor code hacks; I changed the secondary to be:
>>>
>>>    quorum
>>>       replication
>>>         active-disk
>>>         hidden-disk
>>>         Real-disk
>>>       dummy-disk
>>
>> after failover,
>> quorum
>>    replication(old, mode is secondary)
>>      active-disk
>>      hidden-disk*
>>      real-disk*
>>    replication(new, mode is primary)
>>      nbd-client
> 
> Do you need to keep the old secondary-replication?
> Does that just pass straight through?

Yes, the old secondary-replication can still work in the newest mode.
For example, if we don't start COLO again after failover, we do nothing.

> 
>> In the newest version, we active commit active-disk to real-disk.
>> So it will be:
>> quorum
>>    replication(old, mode is secondary)
>>      active-disk(it is real disk now)
>>    replication(new, mode is primary)
>>      nbd-client
> 
> How does that active-commit work?  I didn't think you
> could change the real disk until you had the full checkpoint,
> since you don't know whether the primary or secondaries
> changes need to be written?

I start the active commit when doing failover. After failover,
the primary's changes made after the last checkpoint should be dropped.
(How do we cancel the in-progress write ops?)

> 
>>> and then after the primary fails, I start a new secondary
>>> on another host and then on the old secondary do:
>>>
>>>   nbd_server_stop
>>>   stop
>>>   x_block_change top-quorum -d children.0         # deletes use of real disk, leaves dummy
>>>   drive_del active-disk0
>>>   x_block_change top-quorum -a node-real-disk
>>>   x_block_change top-quorum -d children.1         # Seems to have deleted the dummy?!, the disk is now child 0
>>>   drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=ibpair,file.port=8889,file.export=colo-disk0,node-name=nbd-client,if=none,cache=none
>>>   x_block_change top-quorum -a nbd-client
>>>   c
>>>   migrate_set_capability x-colo on
>>>   migrate -d -b tcp:ibpair:8888
>>>
>>> and I think that means what was the secondary, has the same disk
>>> structure as a normal primary.
>>> That's not quite happy yet, and I've not figured out why - but the
>>> order/structure of the block devices looks right?
>>>
>>> Notes:
>>>    a) The dummy serves two purposes, 1) it works around the segfault
>>>       I reported in the other mail, 2) when I delete the real disk in the
>>>       first x_block_change it means the quorum still has 1 disk so doesn't
>>>       get upset.
>>
>> I don't understand the purpose 2.
> 
> quorum wont allow you to delete all it's members ('The number of children cannot be lower than the vote threshold 1')
> and it's very tricky getting the order correct with add/delete; for example
> I tried:
> 
> drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=ibpair,file.port=8889,file.export=colo-disk0,node-name=nbd-client,if=none,cache=none
> # gets children.1
> x_block_change top-quorum -a nbd-client
> # deletes the secondary replication
> x_block_change top-quorum -d children.0
> drive_del active-disk0

active-disk0 contains some data, so you should not delete it.
If we do the active commit after failover, active-disk0 is the real disk.

> # ends up as children.0 but in the 2nd slot
> x_block_change top-quorum -a node-real-disk
> 
> info block shows me:
> top-quorum (#block615): json:{"children": [
>     {"driver": "replication", "mode": "primary", "file": {"port": "8889", "host": "ibpair", "driver": "nbd", "export": "colo-disk0"}},
>     {"driver": "raw", "file": {"driver": "file", "filename": "/home/localvms/bugzilla.raw"}}
>    ],
>    "driver": "quorum", "blkverify": false, "rewrite-corrupted": false, "vote-threshold": 1} (quorum)
>     Cache mode:       writeback
> 
> that has the replication first and the file second; that's the opposite
> from the normal primary startup - does it matter?

It is OK, but reading from children.0 always fails, so the data will be read from children.1.

> 
> I can't add node-real-disk until I drive_del active-disk0 (which
> previously used it);  and I can't drive_del until I remove
> it from the quorum; but I can't remove that from the quorum first,
> because that leaves an empty quorum.
> 
>>>    b) I had to remove the restriction in quorum_start_replication
>>>       on which mode it would run in. 
>>
>> IIRC, this check will be removed.
>>
>>>    c) I'm not really sure everything knows it's in secondary mode yet, and
>>>       I'm not convinced whether the replication is doing the right thing.
>>>    d) The migrate -d -b   eventually fails on the destination, not worked out why
>>>       yet.
>>
>> Can you give me the error message?
> 
> I need to repeat it to check; it was something like a bad flag from the block migration
> code; it happened after the block migration hit 100%.

IIRC, we found and fixed some block migration bugs. This may be a new one.

> 
>>>    e) Adding/deleting children on quorum is hard having to use the children.0/1
>>>       notation when you've added children using node names - it's worrying
>>>       which number is which; is there a way to give them a name?
>>
>> No. I think we can improve 'info block' output.
> 
> Yes, that would be good; I thought it was the order in the list; but after
> debugging it today I'm not convinced it is; I think it always keeps the same
> name - so for example if you start off with [children.0, children.1]; then
> delete children.0 you now have [children.1];  if you then add a new
> child I *think* that becomes children.0 but you end up with [children.1,children.0]

Note that quorum's FIFO mode cares about this order. I think it is better to read
from the older child first.
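
A rough Python sketch of why the order matters for FIFO-style reads (a toy model, not the quorum driver; the two read functions and `fifo_read` are invented for the illustration, and the failing read on the replication child follows what I described above):

```python
DISK = bytes(range(256))  # stand-in contents for the real disk


def replication_primary_read(offset, length):
    # Modelled on the behaviour described above: reads on this child fail.
    raise OSError("replication child cannot serve reads")


def real_disk_read(offset, length):
    return DISK[offset:offset + length]


def fifo_read(children, offset, length):
    """Try each child in list order; return the first successful read."""
    attempts = []
    for name, read in children:
        attempts.append(name)
        try:
            return read(offset, length), attempts
        except OSError:
            continue
    raise OSError(f"all children failed: {attempts}")


# Replication child first: every read costs one failed attempt.
data_a, tried_a = fifo_read(
    [("replication", replication_primary_read), ("real-disk", real_disk_read)],
    offset=4, length=3,
)
# Real disk first: the read succeeds on the first child.
data_b, tried_b = fifo_read(
    [("real-disk", real_disk_read), ("replication", replication_primary_read)],
    offset=4, length=3,
)
print(tried_a, tried_b)
```

Both orders return the same data in this model; the difference is only the extra failed attempt when the replication child comes first.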

Thanks
Wen Congyang

> 
>>>    f) I've not thought about the colo-proxy that much yet - I guess that
>>>       existing connections need to keep their sequence number offset but
>>>       new connections made by what is now the primary dont need to do anything
>>>       special.
>>
>> Hailiang or Zhijian can answer this question.
> 
> Thanks,
> 
>> Thanks
>> Wen Congyang
>>
>>>
>>> Dave
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>>
>>> .
>>>
>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> 
> .
> 


* Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
  2016-01-25 20:20     ` Dr. David Alan Gilbert
@ 2016-01-26  1:16       ` Li Zhijian
  0 siblings, 0 replies; 7+ messages in thread
From: Li Zhijian @ 2016-01-26  1:16 UTC
  To: Dr. David Alan Gilbert
  Cc: Changlong Xie, zhanghailiang, qemu block, qemu devel, Luis Tomas,
	Simon Kollberg, Abel Souza



On 01/26/2016 04:20 AM, Dr. David Alan Gilbert wrote:
> * Li Zhijian (lizhijian@cn.fujitsu.com) wrote:
>>
>>
>> On 01/25/2016 09:32 AM, Wen Congyang wrote:
>>>>>     f) I've not thought about the colo-proxy that much yet - I guess that
>>>>>        existing connections need to keep their sequence number offset but
>>
>> Strictly speaking, after failover, we only need to keep servicing for the tcp connections which are
>> established after the last checkpoint but not all existing connections. Because after a checkpoint
>> (primary and secondary node works well), primary vm and secondary vm is same, that means the existing
>> tcp connection has the same sequence.
>>
>>>>>        new connections made by what is now the primary dont need to do anything
>>>>>        special.
>> Yes, you are right.
>
> I wonder whether we need to do something special to the new-secondary;
> consider this:
>
>     1 primary (P1) & secondary (S1) run together
>     2 New connection opened
>     3    secondary records an offset
>     4 <running OK for a while - no checkpoint>
>     5 primary (P1) fails; do failover to secondary
>     6 secondary (S1) still rewrites sequence for connection opened at (2)
>     7 Start new-secondary (S2), send checkpoint from S1->S2
>     8 S2 has same guest contents as S1; so the
>       sequence numbers are still offset compared to the outside world.
>
> So S2 needs to be sent the offsets for existing connections, otherwise
> is S1 was then to fail, S2 would send the wrong output on the existing
> connection?

Thanks for the example.
Sure, if we support continuous FT, the colo proxy needs to implement migration_save and migration_load.
At the beginning of (7), we need to save the colo_proxy info (including connection info and sequence offsets) at S1
and load it at S2. S1/S2 need to keep doing the TCP sequence rewriting for the connections opened at (2)
until they are closed.
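
Roughly what I have in mind, as a Python sketch rather than real proxy code. `ProxyState`, its `save`/`load` methods and the connection key are made-up stand-ins for the proxy state and the migration_save/migration_load hooks mentioned above, and JSON is used only to keep the example short.

```python
import json

SEQ_MOD = 2**32  # TCP sequence numbers wrap at 32 bits


class ProxyState:
    """Toy stand-in for the per-connection colo-proxy state."""

    def __init__(self):
        self.offsets = {}  # connection key -> sequence offset

    def record(self, conn, offset):
        self.offsets[conn] = offset % SEQ_MOD

    def close(self, conn):
        # Once the connection is closed, no more rewriting is needed.
        self.offsets.pop(conn, None)

    def rewrite(self, conn, guest_seq):
        return (guest_seq + self.offsets.get(conn, 0)) % SEQ_MOD

    def save(self):
        return json.dumps(self.offsets)

    @classmethod
    def load(cls, blob):
        state = cls()
        state.offsets = json.loads(blob)
        return state


conn = "10.0.0.1:40000->10.0.0.2:80"  # hypothetical connection opened at (2)

s1 = ProxyState()
s1.record(conn, 3000)                  # step (3): the secondary records an offset
wire_s1 = s1.rewrite(conn, 500)        # step (6): S1 keeps rewriting

s2_without_state = ProxyState()        # S2 started with no proxy state
wire_wrong = s2_without_state.rewrite(conn, 500)

s2 = ProxyState.load(s1.save())        # step (7): save at S1, load at S2
wire_s2 = s2.rewrite(conn, 500)

print(wire_s1, wire_wrong, wire_s2)
```

With the state loaded, S2 produces the same on-the-wire sequence number as S1; without it, S2 would emit the guest's raw sequence number, which is the wrong output in your example.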

Thanks
Li Zhijian

>
> Dave
>
>>
>>
>>> Hailiang or Zhijian can answer this question.
>>
>>
>> Thanks
>> Li Zhijian
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
> .
>

-- 
Best regards.
Li Zhijian (8555)


Thread overview: 7+ messages
2016-01-22 19:35 [Qemu-devel] COLO: how to flip a secondary to a primary? Dr. David Alan Gilbert
2016-01-25  1:32 ` Wen Congyang
2016-01-25  2:11   ` Li Zhijian
2016-01-25 20:20     ` Dr. David Alan Gilbert
2016-01-26  1:16       ` Li Zhijian
2016-01-25 18:59   ` Dr. David Alan Gilbert
2016-01-26  1:06     ` Wen Congyang
