All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Kirti Wankhede <kwankhede@nvidia.com>,
	cjia@nvidia.com, kevin.tian@intel.com, ziye.yang@intel.com,
	changpeng.liu@intel.com, yi.l.liu@intel.com, mlevitsk@redhat.com,
	eskultet@redhat.com, cohuck@redhat.com,
	jonathan.davies@nutanix.com, eauger@redhat.com, aik@ozlabs.ru,
	pasic@linux.ibm.com, felipe@nutanix.com,
	Zhengxiao.zx@alibaba-inc.com, shuangtai.tst@alibaba-inc.com,
	Ken.Xue@amd.com, zhi.a.wang@intel.com, yan.y.zhao@intel.com,
	qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: Re: [PATCH v10 Kernel 1/5] vfio: KABI for migration interface for device state
Date: Mon, 6 Jan 2020 16:18:51 -0700	[thread overview]
Message-ID: <20200106161851.07871e28@w520.home> (raw)
In-Reply-To: <20200102182537.GK2927@work-vm>

On Thu, 2 Jan 2020 18:25:37 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Alex Williamson (alex.williamson@redhat.com) wrote:
> > On Fri, 20 Dec 2019 01:40:35 +0530
> > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >   
> > > On 12/19/2019 10:57 PM, Alex Williamson wrote:
> > > 
> > > <Snip>
> > >   
> 
> <snip>
> 
> > > 
> > > If device state it at pre-copy state (011b).
> > > Transition, i.e., write to device state as stop-and-copy state (010b) 
> > > failed, then by previous state I meant device should return pre-copy 
> > > state(011b), i.e. previous state which was successfully set, or as you 
> > > said current state which was successfully set.  
> > 
> > Yes, the point I'm trying to make is that this version of the spec
> > tries to tell the user what they should do upon error according to our
> > current interpretation of the QEMU migration protocol.  We're not
> > defining the QEMU migration protocol, we're defining something that can
> > be used in a way to support that protocol.  So I think we should be
> > concerned with defining our spec, for example my proposal would be: "If
> > a state transition fails the user can read device_state to determine the
> > current state of the device.  This should be the previous state of the
> > device unless the vendor driver has encountered an internal error, in
> > which case the device may report the invalid device_state 110b.  The
> > user must use the device reset ioctl in order to recover the device
> > from this state.  If the device is indicated in a valid device state
> > via reading device_state, the user may attempt to transition the device
> > to any valid state reachable from the current state."  
> 
> We might want to be able to distinguish between:
>   a) The device has failed and needs a reset
>   b) The migration has failed

I think the above provides this.  For Kirti's example above of
transitioning from pre-copy to stop-and-copy, the device could refuse
to transition to stop-and-copy, generating an error on the write() of
device_state.  The user re-reading device_state would allow them to
determine the current device state, still in pre-copy or failed.  Only
the latter would require a device reset.

> If some part of the devices mechanics for migration fail, but the device
> is otherwise operational then we should be able to decide to fail the
> migration without taking the device down, which might be very bad for
> the VM.
> Losing a VM during migration due to a problem with migration really
> annoys users; it's one thing the migration failing, but taking the VM
> out as well really gets to them.
> 
> Having the device automatically transition back to the 'running' state
> seems a bad idea to me; much better to tell the hypervisor and provide
> it with a way to clean up; for example, imagine a system with multiple
> devices that are being migrated, most of them have happily transitioned
> to stop-and-copy, but then the last device decides to fail - so now
> someone is going to have to take all of them back to running.

Right, unless I'm missing one, it seems invalid->running is the only
self transition the device should make, though still by way of user
interaction via the reset ioctl.  Thanks,

Alex


WARNING: multiple messages have this Message-ID (diff)
From: Alex Williamson <alex.williamson@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Zhengxiao.zx@alibaba-inc.com, kevin.tian@intel.com,
	yi.l.liu@intel.com, cjia@nvidia.com, kvm@vger.kernel.org,
	eskultet@redhat.com, ziye.yang@intel.com, cohuck@redhat.com,
	shuangtai.tst@alibaba-inc.com, qemu-devel@nongnu.org,
	zhi.a.wang@intel.com, mlevitsk@redhat.com, pasic@linux.ibm.com,
	aik@ozlabs.ru, Kirti Wankhede <kwankhede@nvidia.com>,
	eauger@redhat.com, felipe@nutanix.com,
	jonathan.davies@nutanix.com, yan.y.zhao@intel.com,
	changpeng.liu@intel.com, Ken.Xue@amd.com
Subject: Re: [PATCH v10 Kernel 1/5] vfio: KABI for migration interface for device state
Date: Mon, 6 Jan 2020 16:18:51 -0700	[thread overview]
Message-ID: <20200106161851.07871e28@w520.home> (raw)
In-Reply-To: <20200102182537.GK2927@work-vm>

On Thu, 2 Jan 2020 18:25:37 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Alex Williamson (alex.williamson@redhat.com) wrote:
> > On Fri, 20 Dec 2019 01:40:35 +0530
> > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >   
> > > On 12/19/2019 10:57 PM, Alex Williamson wrote:
> > > 
> > > <Snip>
> > >   
> 
> <snip>
> 
> > > 
> > > If device state it at pre-copy state (011b).
> > > Transition, i.e., write to device state as stop-and-copy state (010b) 
> > > failed, then by previous state I meant device should return pre-copy 
> > > state(011b), i.e. previous state which was successfully set, or as you 
> > > said current state which was successfully set.  
> > 
> > Yes, the point I'm trying to make is that this version of the spec
> > tries to tell the user what they should do upon error according to our
> > current interpretation of the QEMU migration protocol.  We're not
> > defining the QEMU migration protocol, we're defining something that can
> > be used in a way to support that protocol.  So I think we should be
> > concerned with defining our spec, for example my proposal would be: "If
> > a state transition fails the user can read device_state to determine the
> > current state of the device.  This should be the previous state of the
> > device unless the vendor driver has encountered an internal error, in
> > which case the device may report the invalid device_state 110b.  The
> > user must use the device reset ioctl in order to recover the device
> > from this state.  If the device is indicated in a valid device state
> > via reading device_state, the user may attempt to transition the device
> > to any valid state reachable from the current state."  
> 
> We might want to be able to distinguish between:
>   a) The device has failed and needs a reset
>   b) The migration has failed

I think the above provides this.  For Kirti's example above of
transitioning from pre-copy to stop-and-copy, the device could refuse
to transition to stop-and-copy, generating an error on the write() of
device_state.  The user re-reading device_state would allow them to
determine the current device state, still in pre-copy or failed.  Only
the latter would require a device reset.

> If some part of the devices mechanics for migration fail, but the device
> is otherwise operational then we should be able to decide to fail the
> migration without taking the device down, which might be very bad for
> the VM.
> Losing a VM during migration due to a problem with migration really
> annoys users; it's one thing the migration failing, but taking the VM
> out as well really gets to them.
> 
> Having the device automatically transition back to the 'running' state
> seems a bad idea to me; much better to tell the hypervisor and provide
> it with a way to clean up; for example, imagine a system with multiple
> devices that are being migrated, most of them have happily transitioned
> to stop-and-copy, but then the last device decides to fail - so now
> someone is going to have to take all of them back to running.

Right, unless I'm missing one, it seems invalid->running is the only
self transition the device should make, though still by way of user
interaction via the reset ioctl.  Thanks,

Alex



  reply	other threads:[~2020-01-06 23:19 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-16 20:21 [PATCH v10 Kernel 0/5] KABIs to support migration for VFIO devices Kirti Wankhede
2019-12-16 20:21 ` Kirti Wankhede
2019-12-16 20:21 ` [PATCH v10 Kernel 1/5] vfio: KABI for migration interface for device state Kirti Wankhede
2019-12-16 20:21   ` Kirti Wankhede
2019-12-16 22:44   ` Alex Williamson
2019-12-16 22:44     ` Alex Williamson
2019-12-17  6:28     ` Kirti Wankhede
2019-12-17  6:28       ` Kirti Wankhede
2019-12-17  7:12       ` Yan Zhao
2019-12-17  7:12         ` Yan Zhao
2019-12-17 18:43       ` Alex Williamson
2019-12-17 18:43         ` Alex Williamson
2019-12-19 16:08         ` Kirti Wankhede
2019-12-19 16:08           ` Kirti Wankhede
2019-12-19 17:27           ` Alex Williamson
2019-12-19 17:27             ` Alex Williamson
2019-12-19 20:10             ` Kirti Wankhede
2019-12-19 20:10               ` Kirti Wankhede
2019-12-19 21:09               ` Alex Williamson
2019-12-19 21:09                 ` Alex Williamson
2020-01-02 18:25                 ` Dr. David Alan Gilbert
2020-01-02 18:25                   ` Dr. David Alan Gilbert
2020-01-06 23:18                   ` Alex Williamson [this message]
2020-01-06 23:18                     ` Alex Williamson
2020-01-07  7:28                     ` Kirti Wankhede
2020-01-07  7:28                       ` Kirti Wankhede
2020-01-07 17:09                       ` Alex Williamson
2020-01-07 17:09                         ` Alex Williamson
2020-01-07 17:53                         ` Kirti Wankhede
2020-01-07 17:53                           ` Kirti Wankhede
2020-01-07 18:56                           ` Alex Williamson
2020-01-07 18:56                             ` Alex Williamson
2020-01-08 14:59                             ` Cornelia Huck
2020-01-08 14:59                               ` Cornelia Huck
2020-01-08 18:31                               ` Alex Williamson
2020-01-08 18:31                                 ` Alex Williamson
2020-01-08 20:41                                 ` Kirti Wankhede
2020-01-08 20:41                                   ` Kirti Wankhede
2020-01-08 22:44                                   ` Alex Williamson
2020-01-08 22:44                                     ` Alex Williamson
2020-01-10 14:21                                     ` Cornelia Huck
2020-01-10 14:21                                       ` Cornelia Huck
2020-01-07  9:57                     ` Dr. David Alan Gilbert
2020-01-07  9:57                       ` Dr. David Alan Gilbert
2020-01-07 16:54                       ` Alex Williamson
2020-01-07 16:54                         ` Alex Williamson
2020-01-07 17:50                         ` Dr. David Alan Gilbert
2020-01-07 17:50                           ` Dr. David Alan Gilbert
2019-12-16 20:21 ` [PATCH v10 Kernel 2/5] vfio iommu: Adds flag to indicate dirty pages tracking capability support Kirti Wankhede
2019-12-16 20:21   ` Kirti Wankhede
2019-12-16 23:16   ` Alex Williamson
2019-12-16 23:16     ` Alex Williamson
2019-12-17  6:32     ` Kirti Wankhede
2019-12-17  6:32       ` Kirti Wankhede
2019-12-16 20:21 ` [PATCH v10 Kernel 3/5] vfio iommu: Add ioctl defination for dirty pages tracking Kirti Wankhede
2019-12-16 20:21   ` Kirti Wankhede
2019-12-16 20:21 ` [PATCH v10 Kernel 4/5] vfio iommu: Implementation of ioctl to " Kirti Wankhede
2019-12-16 20:21   ` Kirti Wankhede
2019-12-17  5:15   ` Yan Zhao
2019-12-17  5:15     ` Yan Zhao
2019-12-17  9:24     ` Kirti Wankhede
2019-12-17  9:24       ` Kirti Wankhede
2019-12-17  9:51       ` Yan Zhao
2019-12-17  9:51         ` Yan Zhao
2019-12-17 11:47         ` Kirti Wankhede
2019-12-17 11:47           ` Kirti Wankhede
2019-12-18  1:04           ` Yan Zhao
2019-12-18  1:04             ` Yan Zhao
2019-12-18 20:05             ` Dr. David Alan Gilbert
2019-12-18 20:05               ` Dr. David Alan Gilbert
2019-12-19  0:57               ` Yan Zhao
2019-12-19  0:57                 ` Yan Zhao
2019-12-19 16:21                 ` Kirti Wankhede
2019-12-19 16:21                   ` Kirti Wankhede
2019-12-20  0:58                   ` Yan Zhao
2019-12-20  0:58                     ` Yan Zhao
2020-01-03 19:44                     ` Dr. David Alan Gilbert
2020-01-03 19:44                       ` Dr. David Alan Gilbert
2020-01-04  3:53                       ` Yan Zhao
2020-01-04  3:53                         ` Yan Zhao
2019-12-18 21:39       ` Alex Williamson
2019-12-18 21:39         ` Alex Williamson
2019-12-19 18:42         ` Kirti Wankhede
2019-12-19 18:42           ` Kirti Wankhede
2019-12-19 18:56           ` Alex Williamson
2019-12-19 18:56             ` Alex Williamson
2019-12-16 20:21 ` [PATCH v10 Kernel 5/5] vfio iommu: Update UNMAP_DMA ioctl to get dirty bitmap before unmap Kirti Wankhede
2019-12-16 20:21   ` Kirti Wankhede

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200106161851.07871e28@w520.home \
    --to=alex.williamson@redhat.com \
    --cc=Ken.Xue@amd.com \
    --cc=Zhengxiao.zx@alibaba-inc.com \
    --cc=aik@ozlabs.ru \
    --cc=changpeng.liu@intel.com \
    --cc=cjia@nvidia.com \
    --cc=cohuck@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eauger@redhat.com \
    --cc=eskultet@redhat.com \
    --cc=felipe@nutanix.com \
    --cc=jonathan.davies@nutanix.com \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=mlevitsk@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=shuangtai.tst@alibaba-inc.com \
    --cc=yan.y.zhao@intel.com \
    --cc=yi.l.liu@intel.com \
    --cc=zhi.a.wang@intel.com \
    --cc=ziye.yang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.