From: Daniel Cho
Date: Fri, 6 Dec 2019 14:31:13 +0800
Subject: Re: Network connection with COLO VM
To: "Zhang, Chen"
Cc: "lukasstraub2@web.de", "Dr. David Alan Gilbert", "qemu-devel@nongnu.org"

Hi Dave, Zhang,

Thanks for your help. I will try your recommendations.

Best Regard,
Daniel Cho

On Wed, Dec 4, 2019 at 4:32 PM, Zhang, Chen <chen.zhang@intel.com> wrote:
On 12/3/2019 9:25 PM, Dr. David Alan Gilbert wrote:
> * Daniel Cho (danielcho@qnap.com) wrote:
>> Hi Dave,
>>
>> We could use the existing interface to add the netfilter and chardev; it might not
>> have the problem you mentioned.
>>
>> However, if the netfilter and chardev must be on the primary from the start, does that
>> mean we cannot enable the COLO
>> feature on a VM dynamically?
> I wasn't expecting that to be possible - I'd expect you to be able
> to start in a state that looks the same as a primary+failed secondary;
> but I'm not sure.

Current COLO (with Lukas's patch) can support setting up a COLO system dynamically.
This state is the same as when the secondary has triggered failover: the
primary node needs to find a new secondary
node to form a new COLO pair.


>> We tried changing this chardev so that the primary VM does not get stuck waiting
>> for the secondary VM.
>>
>> -chardev socket,id=compare1,host=127.0.0.1,port=9004,server,wait \
>>
>> to
>>
>> -chardev socket,id=compare1,host=127.0.0.1,port=9004,server,nowait \
>>
>> But then the primary VM's network does not work (it can't get an IP address) until
>> the secondary VM connects.

I think you need to check whether port 9004 is already connected to the peer node.
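
One possible way to make that check on a Linux host is to look for an ESTABLISHED socket on the port in `/proc/net/tcp`. The sketch below is only an illustration, not something from the thread: the function name `port_has_peer` and the sample table are made up, and it covers IPv4 only (IPv6 sockets are listed in `/proc/net/tcp6`).

```python
# Sketch: report whether a local TCP port has an ESTABLISHED peer by parsing
# text in the Linux /proc/net/tcp format. port_has_peer is a hypothetical helper.
ESTABLISHED = "01"  # kernel state code for ESTABLISHED (0A is LISTEN)


def port_has_peer(proc_net_tcp, port):
    """Return True if any socket with local port `port` is ESTABLISHED."""
    for line in proc_net_tcp.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields[1] is "HEXADDR:HEXPORT", fields[3] is the hex state code
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port == port and fields[3] == ESTABLISHED:
            return True
    return False


# Made-up sample: port 9004 (0x232C) once listening and once connected.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:232C 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345
   1: 0100007F:232C 0100007F:A1B2 01 00000000:00000000 00:00000000 00000000     0        0 12346
"""

if __name__ == "__main__":
    # On a real host: port_has_peer(open("/proc/net/tcp").read(), 9004)
    print(port_has_peer(SAMPLE, 9004))
```

If only the listening row is present, the secondary has not connected to the compare1 chardev yet.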

> I'm not sure of the answer to this; I've not tried doing it - I'm not
> sure anyone has!
> But, the colo components do track the state of tcp connections, so I'm
> expecting that they have to already exist to have the state of those
> connections available for when you start the secondary.

Yes, you are right.

In this state we don't need to sync the state of TCP connections, because after failover
(or when the COLO primary node simply runs solo) all TCP connection state in the COLO module
has been emptied.

While building the new COLO pair, we sync all the VM state to the secondary node and rebuild
the connection tracking in the COLO module.


>
>
>> Also, the primary VM takes a very different time to boot with and without the
>> netfilter / chardev:
>> Without netfilter / chardev: about 1 min
>> With netfilter / chardev: about 5 mins
>> Is this an issue?
> that sounds like it needs investigating.
>
> Dave

Yes. In previous COLO use cases, we needed the primary node and secondary node to boot at the same time.

I didn't expect such a big difference from the netfilter/chardev.

I think you can try booting without the netfilter/chardev and, after the VM has booted, set up
the netfilter/chardev again with the chardev in server,nowait mode.
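
As a rough illustration of setting these up after boot, the sketch below builds QMP `chardev-add` and `object-add` messages. It is a sketch under assumptions, not a tested COLO setup: the chardev id `compare1` and port 9004 come from this thread, while the `filter-mirror` object and the ids `mirror0`, `m0` and `hn0` are illustrative names only. The `props` layout for `object-add` matches QEMU 4.x; later releases changed the argument layout, so check the QMP reference for your version.

```python
import json


def chardev_add_socket(chardev_id, host, port, server=True, wait=False):
    """Build a QMP chardev-add message for a TCP socket chardev."""
    return {
        "execute": "chardev-add",
        "arguments": {
            "id": chardev_id,
            "backend": {
                "type": "socket",
                "data": {
                    # QMP expects the port as a string
                    "addr": {"type": "inet",
                             "data": {"host": host, "port": str(port)}},
                    "server": server,
                    "wait": wait,  # False corresponds to "nowait"
                },
            },
        },
    }


def object_add(qom_type, obj_id, **props):
    """Build a QMP object-add message (QEMU 4.x layout with a 'props' dict)."""
    return {"execute": "object-add",
            "arguments": {"qom-type": qom_type, "id": obj_id, "props": props}}


if __name__ == "__main__":
    # Chardevs have to exist before the objects that refer to them.
    messages = [
        chardev_add_socket("compare1", "127.0.0.1", 9004),
        # Illustrative only: ids mirror0, m0 and hn0 are assumptions.
        chardev_add_socket("mirror0", "0.0.0.0", 9003),
        object_add("filter-mirror", "m0",
                   netdev="hn0", queue="tx", outdev="mirror0"),
    ]
    for msg in messages:
        print(json.dumps(msg))
```

Each printed line can be sent over an existing QMP monitor connection once capabilities have been negotiated.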


Thanks

Zhang Chen

>
>> Best regards,
>> Daniel Cho
>>
>>
>> On Mon, Dec 2, 2019 at 5:58 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>>
>>> * Daniel Cho (danielcho@qnap.com) wrote:
>>>> Hi Zhang,
>>>>
>>>> We use the qemu-4.1.0 release in this case.
>>>>
>>>> I think we need to use block mirror to sync the disk to the secondary node first,
>>>> then stop the primary VM and build the COLO system.
>>>>
>>>> At the moment of stopping, you need to add some netfilter and chardev socket nodes for
>>>> COLO; maybe you need to re-check this part.
>>>>
>>>>
>>>> Our test already follows those steps. Let me describe the test flow and the issues
>>>> in detail.
>>>>
>>>> Step 1:
>>>>
>>>> Create the primary VM without any netfilter or chardev for COLO, and ping the
>>>> primary VM continually from another host.
>>>>
>>>>
>>>> Step 2:
>>>>
>>>> Create the secondary VM (with the same devices/drives as the primary VM) and do the
>>>> block mirror sync (ping to the primary VM works normally).
>>>>
>>>>
>>>> Step 3:
>>>>
>>>> After the block mirror sync finishes, add the netfilter and chardev to the primary
>>>> VM and secondary VM for COLO (*can't* ping the primary VM, but those packets
>>>> are received later).
>>>>
>>>>
>>>> Step 4:
>>>>
>>>> Start migrating the primary VM to the secondary VM; the primary VM & secondary VM are
>>>> both running (ping to the primary VM works, and the packets from step 3 are
>>>> received).
>>>>
>>>>
>>>>
>>>>
>>>> Going from Step 3 to Step 4 takes 10~20 seconds in our environment.
>>>>
>>>> I can imagine this issue (delayed reply packets) is caused by the COLO
>>>> proxy being in a temporary state while it is set up,
>>>>
>>>> but we think 10~20 seconds is a little long. (If the primary VM is already
>>>> doing some work, it might lose data.)
>>>>
>>>>
>>>> Can we reduce that time, or does the delay depend on the VM?
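
To put a number on the outage between Step 3 and Step 4, one option is to log the ping with timestamps (iputils `ping -D` prefixes each line with a Unix timestamp) and look for the longest silence between replies. The sketch below assumes that output format; `longest_gap` is a hypothetical helper, and the sample lines and the address 192.0.2.10 are made up.

```python
import re

# Matches iputils "ping -D" reply lines: "[<unix time>] ... icmp_seq=<n> ..."
_REPLY = re.compile(r"^\[(\d+\.\d+)\].*icmp_seq=(\d+)")


def longest_gap(ping_lines):
    """Return (gap_seconds, seq_before, seq_after) for the longest silence
    between consecutive echo replies, or None with fewer than two replies."""
    replies = []
    for line in ping_lines:
        m = _REPLY.match(line)
        if m:
            replies.append((float(m.group(1)), int(m.group(2))))
    replies.sort()  # order by arrival time
    best = None
    for (t0, s0), (t1, s1) in zip(replies, replies[1:]):
        gap = t1 - t0
        if best is None or gap > best[0]:
            best = (gap, s0, s1)
    return best


# Made-up log: the reply to seq 3 is held back while the filters are set up.
SAMPLE_LOG = [
    "[1575613800.000000] 64 bytes from 192.0.2.10: icmp_seq=1 ttl=64 time=0.4 ms",
    "[1575613801.000000] 64 bytes from 192.0.2.10: icmp_seq=2 ttl=64 time=0.4 ms",
    "[1575613816.500000] 64 bytes from 192.0.2.10: icmp_seq=3 ttl=64 time=14500 ms",
    "[1575613816.600000] 64 bytes from 192.0.2.10: icmp_seq=4 ttl=64 time=13600 ms",
]

if __name__ == "__main__":
    print(longest_gap(SAMPLE_LOG))
```

Comparing this figure across runs (different guests, different disk sizes) would show whether the delay depends on the VM.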
>>> I think you need to set up the netfilter and chardev on the primary at
>>> the start; the filter contains the state of the TCP connections working
>>> with the VM, so adding it later can't gain that state for existing
>>> connections.
>>>
>>> Dave
>>>
>>>> Best Regard,
>>>>
>>>> Daniel Cho.
>>>>
>>>>
>>>>
>>>> On Sat, Nov 30, 2019 at 2:04 AM, Zhang, Chen <chen.zhang@intel.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From:* Daniel Cho <danielcho@qnap.com>
>>>>> *Sent:* Friday, November 29, 2019 10:43 AM
>>>>> *To:* Zhang, Chen <chen.zhang@intel.com>
>>>>> *Cc:* Dr. David Alan Gilbert <dgilbert@redhat.com>; lukasstraub2@web.de;
>>>>> qemu-devel@nongnu.org
>>>>> *Subject:* Re: Network connection with COLO VM
>>>>>
>>>>>
>>>>>
>>>>> Hi David, Zhang,
>>>>>
>>>>> Thanks for replying to my question.
>>>>> We now know why this issue occurs.
>>>>> As you said, the COLO VM's network needs
>>>>> colo-proxy to control packets, so the guest's
>>>>> interface must have the filter set to solve the problem.
>>>>>
>>>>> But we found another problem: when we enable the
>>>>> fault-tolerance feature on a guest (primary VM running,
>>>>> secondary VM paused), the guest's network does not
>>>>> respond to any request for a while (about 20~30 seconds
>>>>> in our environment) after the secondary VM starts running.
>>>>>
>>>>> Is this normal, or a known issue?
>>>>>
>>>>> In our test, we run the primary VM for a while, then create
>>>>> the secondary VM to enable the COLO feature.
>>>>>
>>>>>
>>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> Happy to hear you have solved the ssh disconnection issue.
>>>>>
>>>>> Do you use Lukas's patch in this case?
>>>>>
>>>>> I think we need to use block mirror to sync the disk to the secondary node first,
>>>>> then stop the primary VM and build the COLO system.
>>>>>
>>>>> At the moment of stopping, you need to add some netfilter and chardev socket nodes
>>>>> for COLO; maybe you need to re-check this part.
>>>>>
>>>>>
>>>>>
>>>>> Best Regard,
>>>>>
>>>>> Daniel Cho
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Nov 28, 2019 at 9:26 AM, Zhang, Chen <chen.zhang@intel.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>>>> Sent: Wednesday, November 27, 2019 6:51 PM
>>>>>> To: Daniel Cho <danielcho@qnap.com>; Zhang, Chen
>>>>>> <chen.zhang@intel.com>; lukasstraub2@web.de
>>>>>> Cc: qemu-devel@nongnu.org
>>>>>> Subject: Re: Network connection with COLO VM
>>>>>>
>>>>>> * Daniel Cho (danielcho@qnap.com) wrote:
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> Could we ssh to a COLO VM (that is, with the PVM & SVM both started)?
>>>>>>>
>>>>>> Let's cc in Zhang Chen and Lukas Straub.
>>>>> Thanks Dave.
>>>>>
>>>>>>> SSH stays connected to the COLO VM for a while, but then it disconnects with the
>>>>>>> error
>>>>>>> *client_loop: send disconnect: Broken pipe*
>>>>>>>
>>>>>>> It seems the COLO VM cannot keep a network session.
>>>>>>>
>>>>>>> Is this a known issue?
>>>>>> That sounds like the COLO proxy is getting upset; it's supposed to compare
>>>>>> packets sent by the primary and secondary and only send one to the outside
>>>>>> - you shouldn't be talking directly to the guest, but always via the proxy.  See
>>>>>> docs/colo-proxy.txt
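
The compare step described above can be pictured with a toy model. This is a deliberately simplified sketch, not QEMU's colo-compare code: `ToyCompare` and its methods are hypothetical, packets are plain byte strings, and a mismatch is modelled as "count a checkpoint, flush the primary queue, drop the stale secondary queue".

```python
from collections import deque


class ToyCompare:
    """Toy model: hold primary packets until the secondary's copy arrives."""

    def __init__(self):
        self.primary = deque()    # packets from the primary, not yet released
        self.secondary = deque()  # packets from the secondary, not yet compared
        self.released = []        # packets forwarded to the outside
        self.checkpoints = 0      # mismatches that forced a resync

    def from_primary(self, pkt):
        self.primary.append(pkt)
        self._drain()

    def from_secondary(self, pkt):
        self.secondary.append(pkt)
        self._drain()

    def _drain(self):
        # Compare packets pairwise; only one copy ever leaves the proxy.
        while self.primary and self.secondary:
            p = self.primary.popleft()
            s = self.secondary.popleft()
            self.released.append(p)
            if p != s:
                # Outputs diverged: resync, then flush what the primary queued.
                self.checkpoints += 1
                self.released.extend(self.primary)
                self.primary.clear()
                self.secondary.clear()


if __name__ == "__main__":
    proxy = ToyCompare()
    proxy.from_primary(b"SYN-ACK")
    print(proxy.released)   # nothing is released until the secondary answers
    proxy.from_secondary(b"SYN-ACK")
    print(proxy.released)
```

In this model a primary packet with no secondary counterpart is simply held, which is one way to picture why traffic stalls while the secondary side is not connected.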
>>>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> I tried ssh to a COLO guest for 8 hours and this issue did not occur.
>>>>> Please check your network/qemu configuration.
>>>>> But I found another problem that may be related to this issue: if there is no network
>>>>> communication for a period of time (maybe 10 min), the first message sent to the
>>>>> guest may be delayed (maybe 1-5 sec). I will try to fix it when I
>>>>> have time.
>>>>>
>>>>> Thanks
>>>>> Zhang Chen
>>>>>
>>>>>> Dave
>>>>>>
>>>>>>> Best Regard,
>>>>>>> Daniel Cho
>>>>>> --
>>>>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>