From: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: "virtio-fs@redhat.com" <virtio-fs@redhat.com>,
	Yongji Xie <xieyongji@bytedance.com>,
	"Boeuf, Sebastien" <sebastien.boeuf@intel.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	fam.zheng@bytedance.com
Subject: Re: [External] Re: [Virtio-fs] host-user reconnection and crash recovery
Date: Thu, 13 May 2021 16:51:09 +0800	[thread overview]
Message-ID: <CAFQAk7iDPOKyxR4X_smGH=mBqDccWJUXooMJ+gyXRoaBt2yg4g@mail.gmail.com> (raw)
In-Reply-To: <YJzirr/g1DlZr4X8@work-vm>


On Thu, May 13, 2021 at 4:26 PM Dr. David Alan Gilbert <dgilbert@redhat.com>
wrote:

> * Jiachen Zhang (zhangjiachen.jaycee@bytedance.com) wrote:
> > Hi Stefan and Sebastien,
> >
> > I think I should give some background context from my perspective.
> >
> > For the virtiofsd crash reconnection (recovery) to QEMU, as said by
> > Stefan, we discussed the possible implementation on the bi-weekly
> > virtio-fs call. I had also sent an RFC patch to the virtio-fs mail-list
> > (https://patchwork.kernel.org/project/qemu-devel/cover/20201215162119.27360-1-zhangjiachen.jaycee@bytedance.com/),
> > we also have some discussion on the further revision direction in that
> > mail.
> >
> > We also have some needs to support virtiofsd crash recovery when it is
> > used with cloud-hypervisor
> > (https://github.com/cloud-hypervisor/cloud-hypervisor).
> > However, the virtiofsd crash reconnection RFC patch relies on
> > QEMU's vhost-user socket reconnection feature and QEMU's vhost-user
> > inflight I/O tracking feature, which are both not supported by
> > cloud-hypervisor.
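
To make the inflight I/O tracking dependency concrete, here is a toy Python model. It is not the vhost-user wire format and not QEMU's implementation; it assumes a small file-backed region with one byte per descriptor slot, and `InflightRegion`, `QUEUE_SIZE`, and `demo` are hypothetical names. The only point is that the record of submitted-but-uncompleted requests lives outside the backend process, so a restarted backend can find what still needs to be handled.

```python
import mmap
import os
import tempfile

QUEUE_SIZE = 8  # hypothetical number of descriptor slots


class InflightRegion:
    """One byte per descriptor slot: 1 means submitted but not completed."""

    def __init__(self, path):
        # Create the backing file on first use; reuse it after a restart.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            if os.fstat(fd).st_size < QUEUE_SIZE:
                os.ftruncate(fd, QUEUE_SIZE)
            # Shared mapping, so writes outlive this process.
            self.buf = mmap.mmap(fd, QUEUE_SIZE)
        finally:
            os.close(fd)

    def submit(self, slot):
        self.buf[slot] = 1

    def complete(self, slot):
        self.buf[slot] = 0

    def pending(self):
        return [i for i in range(QUEUE_SIZE) if self.buf[i] == 1]

    def close(self):
        self.buf.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "inflight")
    old = InflightRegion(path)
    old.submit(2)
    old.submit(5)
    old.complete(2)
    old.close()  # stands in for the backend going away

    new = InflightRegion(path)  # the restarted backend maps the same region
    try:
        return new.pending()
    finally:
        new.close()
```

In this sketch the restarted instance sees slot 5 as still pending, which is the information a backend would need in order to resubmit or complete the request after reconnecting.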
> >
> > So I also issued an initial pull-request of cloud-hypervisor vhost-user
> > socket reconnection
> > (https://github.com/cloud-hypervisor/cloud-hypervisor/pull/2387), which
> > is reviewed by Sebastien. Based on vhost-user socket reconnection, we also
> > want to further develop vhost-user inflight I/O tracking feature for
> > cloud-hypervisor, and finally to support virtiofsd crash reconnection.
> >
> > I am sorry for the delayed patch-revision of the two patch sets. I hope I
> > can free up some time in these two months to make some further progress.
>
> I'm curious what your use case is for virtiofsd crash
> recovery/reconnection - is there some reason you expect the daemon to
> crash or need to be restarted more than the whole VM?
>
> In the case of vhost-user networking with dpdk I can see the case where
> there is a central networking switch process shared between many VMs; so
> wanting to restart that without restarting all the VMs makes sense to
> me; where each VM has its own virtiofsd I don't understand the use as
> much.
>
>
Hi Dave,

Yes, we want to restart virtiofsd without restarting the whole VM. One
reason is to avoid I/O hangs caused by a virtiofs daemon crash. Another
important reason is to support virtiofsd live upgrade for bug or security
fixes, built on top of virtiofsd reconnection.

All the best,
Jiachen



> Dave
>
> > All the best,
> > Jiachen
> >
> > On Tue, May 11, 2021 at 11:02 PM Boeuf, Sebastien <sebastien.boeuf@intel.com> wrote:
> >
> > > Hi Stefan,
> > >
> > > Thanks for the explanation.
> > >
> > > So reconnection for vhost-user is not a well defined behavior,
> > > and QEMU is doing its best to retry when possible, depending
> > > on each device.
> > >
> > > The guest does not know about it, so it's never notified that
> > > the device needs to be reset.
> > >
> > > But what about the vhost-user backend initialization? Does
> > > QEMU go again through initializing memory table, vrings, etc...
> > > since it can't assume anything from the backend?
> > >
> > > Thanks,
> > > Sebastien
> > >
> > > ------------------------------
> > > *From:* Stefan Hajnoczi
> > > *Sent:* Tuesday, May 11, 2021 2:45 PM
> > > *To:* Boeuf, Sebastien
> > > *Cc:* virtio-fs@redhat.com; qemu-devel@nongnu.org
> > > *Subject:* vhost-user reconnection and crash recovery
> > >
> > > Hi Sebastien,
> > > On #virtio-fs IRC you asked:
> > >
> > > >  I have a vhost-user question regarding disconnection/reconnection. How
> > > >  should this be handled? Let's say the vhost-user backend disconnects,
> > > >  and reconnects later on, does QEMU reset the virtio device by
> > > >  notifying the guest? Or does it simply reconnect to the backend
> > > >  without letting the guest know about what happened?
> > >
> > > The vhost-user protocol does not have a generic reconnection solution.
> > > Reconnection is handled on a case-by-case basis because device-specific
> > > and implementation-specific state is involved.
> > >
> > > The vhost-user-fs-pci device in QEMU has not been tested with
> > > reconnection as far as I know.
> > >
> > > The ideal reconnection behavior is to resume the device from its
> > > previous state without disrupting the guest. Device state must survive
> > > reconnection in order for this to work. Neither QEMU virtiofsd nor
> > > virtiofsd-rs implement this today.
> > >
> > > > virtiofs has a lot of state, making it particularly difficult to
> > > > support either DEVICE_NEEDS_RESET or transparent vhost-user
> > > > reconnection. We have discussed virtiofs crash recovery on the
> > > > bi-weekly virtiofs call
> > > > (https://etherpad.opendev.org/p/virtiofs-external-meeting). If you
> > > > want to work on this then joining the call would be a good starting
> > > > point to coordinate with others.
> > >
> > > > One approach for transparent crash recovery is for virtiofsd to keep
> > > > its state in tmpfs (e.g. inode/fd mappings) and open fds shared with a
> > > > clone(2) process via CLONE_FILES. This way the virtiofsd process can
> > > > terminate but its state persists in memory thanks to its clone process.
> > > > The clone can then be used to launch the new virtiofsd process from the
> > > > old state. This would allow the device to resume transparently with
> > > > QEMU only reconnecting the vhost-user UNIX domain socket. This is an
> > > > idea that we discussed in the bi-weekly virtiofs call.
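
As a rough illustration of the "state in tmpfs" half of this idea, the sketch below saves and reloads an inode table from a directory that would be tmpfs-backed in practice. It is an assumption-laden toy, not virtiofsd code: `save_state`, `load_state`, the `inodes.json` file name, and the inode-to-path layout are all hypothetical, and the open fds that the clone(2) process would keep alive via CLONE_FILES are not modeled at all.

```python
import json
import os
import tempfile


def save_state(state_dir, inode_map):
    """Atomically write a hypothetical inode -> host path table."""
    tmp = os.path.join(state_dir, "inodes.json.tmp")
    final = os.path.join(state_dir, "inodes.json")
    with open(tmp, "w") as f:
        json.dump(inode_map, f)
        f.flush()
        os.fsync(f.fileno())
    # rename within one filesystem is atomic, so a crash mid-save
    # leaves either the old table or the new one, never a partial file
    os.replace(tmp, final)


def load_state(state_dir):
    """Return the saved table, or an empty one on a cold start."""
    try:
        with open(os.path.join(state_dir, "inodes.json")) as f:
            # JSON object keys are strings; convert them back to ints
            return {int(k): v for k, v in json.load(f).items()}
    except FileNotFoundError:
        return {}


if __name__ == "__main__":
    state_dir = tempfile.mkdtemp()  # stands in for a tmpfs mount
    save_state(state_dir, {1: "/", 42: "/etc/hosts"})
    print(load_state(state_dir))
```

A restarted daemon would call `load_state` before serving requests again; whether a path-based table is sufficient, or whether the fds themselves must survive, is exactly the part that the CLONE_FILES process is meant to cover.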
> > >
> > > > You mentioned device reset. VIRTIO 1.1 has the Device Status Field
> > > > DEVICE_NEEDS_RESET flag that the device can use to tell the driver that
> > > > a reset is necessary. This feature is present in the specification but
> > > > not implemented in the Linux guest drivers. Again the reason is that
> > > > handling it requires driver-specific logic for restoring state after
> > > > reset...otherwise the device reset would be visible to userspace.
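
For reference, the status bits involved can be sketched as follows. The bit values are the ones listed for the device status field in the VIRTIO 1.1 specification; `needs_reset` and `recover` are hypothetical helpers, and the `reinit` callback stands in for the driver-specific state restoration that is described above as the missing piece.

```python
# Device status bits from the VIRTIO 1.1 device status field.
ACKNOWLEDGE = 0x01
DRIVER = 0x02
DRIVER_OK = 0x04
FEATURES_OK = 0x08
DEVICE_NEEDS_RESET = 0x40
FAILED = 0x80

LIVE = ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK


def needs_reset(status):
    """True if the device has asked the driver for a reset."""
    return bool(status & DEVICE_NEEDS_RESET)


def recover(status, saved_state, reinit):
    """Hypothetical driver-side handling of DEVICE_NEEDS_RESET.

    reinit(saved_state) is assumed to reset the device and replay
    whatever driver-specific state is needed; this sketch only models
    the resulting status value.
    """
    if not needs_reset(status):
        return status
    reinit(saved_state)
    return LIVE
```

Without something playing the role of `reinit`, the driver could only reset the device and lose its state, which is why the reset would become visible to userspace.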
> > >
> > > Stefan
> > >
> > > ---------------------------------------------------------------------
> > > Intel Corporation SAS (French simplified joint stock company)
> > > Registered headquarters: "Les Montalets"- 2, rue de Paris,
> > > 92196 Meudon Cedex, France
> > > Registration Number:  302 456 199 R.C.S. NANTERRE
> > > Capital: 4,572,000 Euros
> > >
> > > This e-mail and any attachments may contain confidential material for
> > > the sole use of the intended recipient(s). Any review or distribution
> > > by others is strictly prohibited. If you are not the intended
> > > recipient, please contact the sender and delete all copies.
> > > _______________________________________________
> > > Virtio-fs mailing list
> > > Virtio-fs@redhat.com
> > > https://listman.redhat.com/mailman/listinfo/virtio-fs
> > >
>
> > _______________________________________________
> > Virtio-fs mailing list
> > Virtio-fs@redhat.com
> > https://listman.redhat.com/mailman/listinfo/virtio-fs
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
>
