From: Jason Wang <jasowang@redhat.com>
To: "yangke (J)" <yangke27@huawei.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"wangxin (U)" <wangxinxin.wang@huawei.com>,
	"Maxime Coquelin" <maxime.coquelin@redhat.com>,
	"quintela@redhat.com" <quintela@redhat.com>
Subject: Re: [question] vhost-user: auto fix network link broken during migration
Date: Thu, 26 Mar 2020 17:45:50 +0800	[thread overview]
Message-ID: <bb51d1b8-522d-0c05-46ec-102cb8a917f7@redhat.com> (raw)
In-Reply-To: <0CC1E03725E48D478F815032182740230A42C15B@DGGEMM532-MBS.china.huawei.com>


On 2020/3/24 7:08 PM, yangke (J) wrote:
>>> We find an issue when host mce trigger openvswitch(dpdk) restart in
>>> source host during guest migration,
>>
>> Did you mean the vhost-user netdev was deleted from the source host?
>
> The vhost-user netdev was not deleted from the source host. I mean that:
> in the normal scenario, when OVS(DPDK) begins to restart, qemu_chr disconnects from OVS and the link status is set to link down; once OVS(DPDK) has started, qemu_chr reconnects to OVS and the link status is set to link up. But in our scenario, the migration finished before qemu_chr could reconnect to OVS. The frontend's link_down was loaded from n->status on the destination, so the network in the guest never comes up again.


I'm not sure we should fix this in qemu.

Generally, it's the task of management to make sure the destination 
device configuration is the same as the source.

E.g. in this case, management should bring the link up once the 
re-connection on the source has completed.

What's more, the qmp_set_link() done in vhost-user.c looks hacky, as it 
changes the link status without management's knowledge.
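If management were to own this fix-up, the mechanics are small: QMP already exposes a `set_link` command that toggles a netdev's link state by name. Below is a minimal Python sketch of a management-side helper issuing it after migration completes. The QMP socket path is a placeholder, and the netdev id "hostnet0" is only borrowed from the backtrace in this thread; a real management stack would already know both.

```python
import json
import socket

def qmp_set_link(sock_path, name, up):
    """Ask QEMU over its QMP socket to force a netdev's link state.

    sock_path is a placeholder for wherever the management layer put
    the QMP monitor socket; name is the netdev id (e.g. "hostnet0").
    """
    cmd = {"execute": "set_link", "arguments": {"name": name, "up": up}}
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.recv(4096)                                   # QMP greeting banner
        s.sendall(b'{"execute": "qmp_capabilities"}')  # leave negotiation mode
        s.recv(4096)                                   # {"return": {}}
        s.sendall(json.dumps(cmd).encode())
        return json.loads(s.recv(4096))

# The command itself is plain JSON, e.g. for the netdev in the trace:
wire_cmd = json.dumps({"execute": "set_link",
                       "arguments": {"name": "hostnet0", "up": True}})
print(wire_cmd)
```

The point is that the decision (bring the link up because reconnection on the source finished, or because the destination backend is healthy) stays with management; QEMU only executes it.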


>
> qemu_chr disconnect:
> #0  vhost_user_write (msg=msg@entry=0x7fff59ecb2b0, fds=fds@entry=0x0, fd_num=fd_num@entry=0, dev=0x295c730, dev=0x295c730)
>      at /usr/src/debug/qemu-kvm-2.8.1/hw/virtio/vhost_user.c:239
> #1  0x00000000004e6bad in vhost_user_get_vring_base (dev=0x295c730, ring=0x7fff59ecb510)
>      at /usr/src/debug/qemu-kvm-2.8.1/hw/virtio/vhost_user.c:497
> #2  0x00000000004e2e88 in vhost_virtqueue_stop (dev=dev@entry=0x295c730, vdev=vdev@entry=0x2ca36c0, vq=0x295c898, idx=0)
>      at /usr/src/debug/qemu-kvm-2.8.1/hw/virtio/vhost.c:1036
> #3  0x00000000004e45ab in vhost_dev_stop (hdev=hdev@entry=0x295c730, vdev=vdev@entry=0x2ca36c0)
>      at /usr/src/debug/qemu-kvm-2.8.1/hw/virtio/vhost.c:1556
> #4  0x00000000004bc56a in vhost_net_stop_one (net=0x295c730, dev=dev@entry=0x2ca36c0)
>      at /usr/src/debug/qemu-kvm-2.8.1/hw/net/vhost_net.c:326
> #5  0x00000000004bcc3b in vhost_net_stop (dev=dev@entry=0x2ca36c0, ncs=<optimized out>,	total_queues=4)
>      at /usr/src/debug/qemu-kvm-2.8.1/hw/net/vhost_net.c:407
> #6  0x00000000004b85f6 in virtio_net_vhost_status (n=n@entry=0x2ca36c0,	status=status@entry=7 '\a')
>      at /usr/src/debug/qemu-kvm-2.8.1/hw/net/virtio_net.c:177
> #7  0x00000000004b869f in virtio_net_set_status (vdev=<optimized out>, status=<optimized out>)
>      at /usr/src/debug/qemu-kvm-2.8.1/hw/net/virtio_net.c:243
> #8  0x000000000073d00d in qmp_set_link (name=name@entry=0x2956d40 "hostnet0", up=up@entry=false, errp=errp@entry=0x7fff59ecd718)
>      at net/net.c:1437
> #9  0x00000000007460c1 in net_vhost_user_event (opaque=0x2956d40, event=4) at net/vhost_user.c:217//qemu_chr_be_event
> #10 0x0000000000574f0d in tcp_chr_disconnect (chr=0x2951a40) at qemu_char.c:3220
> #11 0x000000000057511f in tcp_chr_hup (channel=<optimized out>,	cond=<optimized out>, opaque=<optimized out>) at qemu_char.c:3265
>
>
>>
>>> The VM is still link down in the frontend after migration, which causes the network in the VM to never come up again.
>>>
>>> virtio_net_load_device:
>>>       /* nc.link_down can't be migrated, so infer link_down according
>>>        * to link status bit in n->status */
>>>       link_down = (n->status & VIRTIO_NET_S_LINK_UP) == 0;
>>>       for (i = 0; i < n->max_queues; i++) {
>>>           qemu_get_subqueue(n->nic, i)->link_down = link_down;
>>>       }
>>>
>>> guest:               migrate begin -----> vCPU pause ---> vmstate load --> migrate finish
>>>                                       ^                ^                ^
>>>                                       |                |                |
>>> openvswitch in source host:   begin to restart   restarting        started
>>>                                       ^                ^                ^
>>>                                       |                |                |
>>> nc in frontend in source:        link down        link down        link down
>>>                                       ^                ^                ^
>>>                                       |                |                |
>>> nc in frontend in destination:   link up          link up          link down
>>>                                       ^                ^                ^
>>>                                       |                |                |
>>> guest network:                    broken           broken           broken
>>>                                       ^                ^                ^
>>>                                       |                |                |
>>> nc in backend in source:         link down        link down        link up
>>>                                       ^                ^                ^
>>>                                       |                |                |
>>> nc in backend in destination:    link up          link up          link up
>>>
>>> The frontend's link_down was loaded from n->status; n->status is link
>>> down in the source, so the frontend's link_down is true. The backend in the
>>> destination host is link up, but the frontend in the destination host is link down, which causes the network in the guest to never come up again until a guest cold reboot.
>>>
>>> Is there a way to auto-fix the link status? Or should we just abort the migration in the virtio-net device load?
>>
>> Maybe we can try to sync link status after migration?
>>
>> Thanks
>
> In an extreme scenario, the OVS(DPDK) on the source may still not have started after the migration.
>
>
> Our plan is to check the link state of the backend when loading the frontend's link_down:
>       /* nc.link_down can't be migrated, so infer link_down according
>        * to link status bit in n->status */
> -    link_down = (n->status & VIRTIO_NET_S_LINK_UP) == 0;
> +    if (qemu_get_queue(n->nic)->peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> +        link_down = ((n->status & VIRTIO_NET_S_LINK_UP) | !qemu_get_queue(n->nic)->peer->link_down) == 0;
> +    } else {
> +        link_down = (n->status & VIRTIO_NET_S_LINK_UP) == 0;
> +    }
>       for (i = 0; i < n->max_queues; i++) {
>           qemu_get_subqueue(n->nic, i)->link_down = link_down;
>       }
>
> Is this good enough to auto-fix the link status?


I still think it's the task of management. Trying to sync the status 
internally, as vhost-user currently does, may lead to bugs.

Thanks


>
> Thanks



Thread overview: 4+ messages
2020-03-23  8:17 [question] vhost-user: auto fix network link broken during migration yangke (J)
2020-03-24  5:49 ` Jason Wang
2020-03-24 11:08   ` Re: " yangke (J)
2020-03-26  9:45     ` Jason Wang [this message]
