All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Marc-André Lureau" <marcandre.lureau@gmail.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	QEMU <qemu-devel@nongnu.org>,
	"Wang, Zhihong" <zhihong.wang@intel.com>,
	Tetsuya Mukawa <mukawa@igel.co.jp>,
	"Kavanagh, Mark B" <mark.b.kavanagh@intel.com>,
	"Traynor, Kevin" <kevin.traynor@intel.com>
Subject: Re: [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and reconnection
Date: Tue, 29 Mar 2016 12:52:32 +0200	[thread overview]
Message-ID: <CAJ+F1CJziBRyq3SZJ7DYZij2fREY2PRMQtESy4xBoyKamnLXXw@mail.gmail.com> (raw)
In-Reply-To: <20160329081010.GA3080@yliu-dev.sh.intel.com>

Hi

On Tue, Mar 29, 2016 at 10:10 AM, Yuanhan Liu
<yuanhan.liu@linux.intel.com> wrote:
> Backend crash may be not that normal in production usage, it is in
> development stage. It would be handy if we can handle that as well,
> as it's a painful experience for me that every time I do a VM2VM test
> and once my changes to backend causes a crash (unfortunately, it
> happens often in early stage), I have to reboot the two VMs.
>
> If we have reconnect, life could be easier. I then don't need worry
> that I will mess the backend and crash it any more.
>
> And again, the reason I mentioned crash here again is not because
> we need handle it, specially. Instead, I was thinking we might be
> able to handle the two reasons both.

I think crash could be handle with queue reset Michael proposed, but
that would probably require guest side changes.

>> Conceptually, I think if we allow the backend to disconnect, it makes
>> sense that qemu is actually the socket server. But it doesn't matter
>> much, it's simple to teach qemu to reconnect a timer...
>
> We already have that, right? I mean, the char-dev "reconnect" option.

Sure, what I mean is that it sounds cleaner from a design pov for qemu
to be the server (since it is the one actually waiting for backend in
this case), beside a timer is often a pretty rough solution to
anything.

>> So we should
>> probably allow both cases anyway.
>
> Yes, I think both should work. I may be wrong (that I may miss
> something), but it seems it's (far) easier to me to keep QEMU
> as the client, and adding the "reconnect" option, judging that
> we have all stuff to make it work ready. In this way,
>
> - if backend crashes/quits, you just need restart the backend,
>   and QEMU will retry the connection.
>
> - if QEMU crashes/quits, you just need restart the QEMU, and
>   then QEMU will start a fresh connection.

It may all depend on use cases, it's not more obvious or easy than the
other to me.

> However, if let QEMU as the server, there are 2 major works need
> to be done in the backend side (take DPDK as example, that just
> acts as vhost-user server only so far):
>
> - Introducing a new API or extending current API to let it to
>   connect the server socket from QEMU.
>
> - If QEMU crashes/quits, we need add code to backend to keep
>   reconnecting unless connection is established, which is something
>   similar to the "reconnect" stuff in QEMU.
> As you can see, it needs more effort (though it's not something
> you care :). And it has duplicate work.


Ah, I am looking at this from qemu angle, backend may need to adapt if
it doesn't already handle both "socket role" (client & server).

>
>>
>> > - start DPDK vhost-switch example
>> >
>> > - start QEMU, which will connect to DPDK vhost-user
>> >
>> >   link is good now.
>> >
>> > - kill DPDK vhost-switch
>> >
>> >   link is broken at this stage
>> >
>> > - start DPDK vhost-switch again
>> >
>> >   you will find that the link is back again.
>> >
>> >
>> > Will that makes sense to you? If so, we may need do nothing (or just
>> > very few) changes at all to DPDK to get the reconnect work.
>>
>> The main issue with handling crashes (gone at any time) is that the
>> backend my not have time to sync the used idx (at the least). It may
>> already have processed incoming packets, so on reconnect, it may
>> duplicate the receiving/dispatching work.
>
> That's not the case for DPDK backend implementation: incoming packets
> won't be delivered for processing before we update the used idx.

It could be ok on incoming packet side (vm->dpdk), but I don't see how
you could avoid packet loss on the other side (dpdk->vm), since the
packets must be added to queues before updating used idx.

>
>> Similarly, on the backend
>> receiving end, some packets may be lost, never received by the VM, and
>> later overwritten by the backend after reconnect (for the same used
>> idx update reason). This may not be a big deal for unreliable
>> protocols, but I am not familiar enough with network usage to know if
>> that's fine in all cases. It may be fine for some packets, such as
>> udp.
>>
>> However, in general, vhost-user should not be specific to network
>> transmission, and it would be nice to have a reliable way for the the
>> backend to reconnect. That's what I try to do in this series.
>
> Agreed.
>
>> I'll
>> repost it after I have done more testing.
>
> Great, and thanks for the work. I will also take some time to give
> closer review.

thanks


-- 
Marc-André Lureau

  reply	other threads:[~2016-03-29 10:52 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-08 23:09 [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and reconnection marcandre.lureau
2015-09-08 23:09 ` [Qemu-devel] [PATCH RFC 01/14] vhost-user: Add ability to know vhost-user backend disconnection marcandre.lureau
2015-09-08 23:09 ` [Qemu-devel] [PATCH RFC 02/14] vhost-user: remove useless is_server field marcandre.lureau
2015-09-08 23:09 ` [Qemu-devel] [PATCH RFC 03/14] qemu-char: avoid potential double-free marcandre.lureau
2015-09-08 23:09 ` [Qemu-devel] [PATCH RFC 04/14] qemu-char: remove all msgfds on disconnect marcandre.lureau
2015-09-08 23:09 ` [Qemu-devel] [PATCH RFC 05/14] qemu-char: make tcp_chr_disconnect() reentrant-safe marcandre.lureau
2015-09-08 23:09 ` [Qemu-devel] [PATCH RFC 06/14] vhost-net: keep VIRTIO_NET_F_STATUS for vhost-user marcandre.lureau
2015-09-08 23:09 ` [Qemu-devel] [PATCH RFC 07/14] virtio-net: enable tx notification if up and vhost started marcandre.lureau
2015-09-08 23:10 ` [Qemu-devel] [PATCH RFC 08/14] vhost: add vhost_dev stop callback marcandre.lureau
2015-09-08 23:10 ` [Qemu-devel] [PATCH RFC 09/14] vhost-user: add vhost_user to hold the chr marcandre.lureau
2015-09-08 23:10 ` [Qemu-devel] [PATCH RFC 10/14] qemu-char: add qemu_chr_free() marcandre.lureau
2015-09-08 23:10 ` [Qemu-devel] [PATCH RFC 11/14] vhost-user: add slave-fd support marcandre.lureau
2015-09-08 23:10 ` [Qemu-devel] [PATCH RFC 12/14] vhost-user: add shutdown support marcandre.lureau
2015-09-08 23:10 ` [Qemu-devel] [PATCH RFC 13/14] qemu-char: Add qemu_chr_disconnect to close a fd accepted by listen fd marcandre.lureau
2015-09-08 23:10 ` [Qemu-devel] [PATCH RFC 14/14] test: start vhost-user reconnect test marcandre.lureau
2015-11-26 10:33 ` [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and reconnection Michael S. Tsirkin
2016-02-23 18:00   ` Marc-André Lureau
2016-03-02  6:44     ` Michael S. Tsirkin
2016-03-24  7:10   ` Yuanhan Liu
2016-03-25 18:00     ` Marc-André Lureau
2016-03-28  1:53       ` Tetsuya Mukawa
2016-03-28  2:06         ` Tetsuya Mukawa
2016-03-29  8:10       ` Yuanhan Liu
2016-03-29 10:52         ` Marc-André Lureau [this message]
2016-03-29 14:28           ` Yuanhan Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJ+F1CJziBRyq3SZJ7DYZij2fREY2PRMQtESy4xBoyKamnLXXw@mail.gmail.com \
    --to=marcandre.lureau@gmail.com \
    --cc=kevin.traynor@intel.com \
    --cc=mark.b.kavanagh@intel.com \
    --cc=mst@redhat.com \
    --cc=mukawa@igel.co.jp \
    --cc=qemu-devel@nongnu.org \
    --cc=yuanhan.liu@linux.intel.com \
    --cc=zhihong.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.