From: Kirti Wankhede <kwankhede@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, <cjia@nvidia.com>,
	<kevin.tian@intel.com>, <ziye.yang@intel.com>,
	<changpeng.liu@intel.com>, <yi.l.liu@intel.com>,
	<mlevitsk@redhat.com>, <eskultet@redhat.com>, <cohuck@redhat.com>,
	<jonathan.davies@nutanix.com>, <eauger@redhat.com>,
	<aik@ozlabs.ru>, <pasic@linux.ibm.com>, <felipe@nutanix.com>,
	<Zhengxiao.zx@alibaba-inc.com>, <shuangtai.tst@alibaba-inc.com>,
	<Ken.Xue@amd.com>, <zhi.a.wang@intel.com>, <yan.y.zhao@intel.com>,
	<qemu-devel@nongnu.org>, <kvm@vger.kernel.org>
Subject: Re: [PATCH v10 Kernel 1/5] vfio: KABI for migration interface for device state
Date: Tue, 7 Jan 2020 23:23:17 +0530
Message-ID: <08b7f953-6ac5-cd79-b1ff-54338da32d1e@nvidia.com>
In-Reply-To: <20200107100923.2f7b5597@w520.home>



On 1/7/2020 10:39 PM, Alex Williamson wrote:
> On Tue, 7 Jan 2020 12:58:22 +0530
> Kirti Wankhede <kwankhede@nvidia.com> wrote:
> 
>> On 1/7/2020 4:48 AM, Alex Williamson wrote:
>>> On Thu, 2 Jan 2020 18:25:37 +0000
>>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>>    
>>>> * Alex Williamson (alex.williamson@redhat.com) wrote:
>>>>> On Fri, 20 Dec 2019 01:40:35 +0530
>>>>> Kirti Wankhede <kwankhede@nvidia.com> wrote:
>>>>>       
>>>>>> On 12/19/2019 10:57 PM, Alex Williamson wrote:
>>>>>>
>>>>>> <Snip>
>>>>>>       
>>>>
>>>> <snip>
>>>>   
>>>>>>
>>>>>> If the device is in the pre-copy state (011b) and the transition,
>>>>>> i.e., a write to device state of stop-and-copy (010b), fails, then
>>>>>> by previous state I meant the device should return the pre-copy
>>>>>> state (011b), i.e. the previous state which was successfully set, or
>>>>>> as you said, the current state which was successfully set.
>>>>>
>>>>> Yes, the point I'm trying to make is that this version of the spec
>>>>> tries to tell the user what they should do upon error according to our
>>>>> current interpretation of the QEMU migration protocol.  We're not
>>>>> defining the QEMU migration protocol, we're defining something that can
>>>>> be used in a way to support that protocol.  So I think we should be
>>>>> concerned with defining our spec, for example my proposal would be: "If
>>>>> a state transition fails the user can read device_state to determine the
>>>>> current state of the device.  This should be the previous state of the
>>>>> device unless the vendor driver has encountered an internal error, in
>>>>> which case the device may report the invalid device_state 110b.  The
>>>>> user must use the device reset ioctl in order to recover the device
>>>>> from this state.  If the device is indicated in a valid device state
>>>>> via reading device_state, the user may attempt to transition the device
>>>>> to any valid state reachable from the current state."
>>>>
>>>> We might want to be able to distinguish between:
>>>>     a) The device has failed and needs a reset
>>>>     b) The migration has failed
>>>
>>> I think the above provides this.  For Kirti's example above of
>>> transitioning from pre-copy to stop-and-copy, the device could refuse
>>> to transition to stop-and-copy, generating an error on the write() of
>>> device_state.  The user re-reading device_state would allow them to
>>> determine the current device state, still in pre-copy or failed.  Only
>>> the latter would require a device reset.
>>>    
>>>> If some part of the device's mechanics for migration fails, but the device
>>>> is otherwise operational then we should be able to decide to fail the
>>>> migration without taking the device down, which might be very bad for
>>>> the VM.
>>>> Losing a VM during migration due to a problem with migration really
>>>> annoys users; it's one thing the migration failing, but taking the VM
>>>> out as well really gets to them.
>>>>
>>>> Having the device automatically transition back to the 'running' state
>>>> seems a bad idea to me; much better to tell the hypervisor and provide
>>>> it with a way to clean up; for example, imagine a system with multiple
>>>> devices that are being migrated, most of them have happily transitioned
>>>> to stop-and-copy, but then the last device decides to fail - so now
>>>> someone is going to have to take all of them back to running.
>>>
>>> Right, unless I'm missing one, it seems invalid->running is the only
>>> self transition the device should make, though still by way of user
>>> interaction via the reset ioctl.  Thanks,
>>>    
>>
>> Instead of the vendor driver using an invalid state on device failure, I
>> think it is better to reserve one bit in device state which the vendor
>> driver can set on device failure. When the error bit is set, the other
>> bits in device state should be ignored.
> 
> Why is a separate bit better?  Saving and Restoring states are mutually
> exclusive, so we have an unused and invalid device state already
> without burning another bit.  Thanks,
> 

There are 3 invalid states:
  *  101b => Invalid state
  *  110b => Invalid state
  *  111b => Invalid state
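
To make the encoding concrete, here is a minimal sketch of how the 3-bit
patterns discussed in this thread could be classified. The macro and
function names are assumptions for illustration only; the thread gives
just the bit patterns (011b pre-copy, 010b stop-and-copy, and the three
invalid patterns listed above), and the bit-to-name mapping is inferred
from those patterns.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names (assumed): one bit each for running, saving, resuming. */
#define DEV_STATE_RUNNING  (1u << 0)
#define DEV_STATE_SAVING   (1u << 1)
#define DEV_STATE_RESUMING (1u << 2)
#define DEV_STATE_MASK     (DEV_STATE_RUNNING | DEV_STATE_SAVING | DEV_STATE_RESUMING)

/*
 * Returns true for the five patterns not listed as invalid in this thread:
 * 000b, 001b, 010b (stop-and-copy), 011b (pre-copy) and 100b.
 * Returns false for 101b, 110b and 111b.
 */
static bool dev_state_valid(uint32_t device_state)
{
    uint32_t s = device_state & DEV_STATE_MASK;

    /* Once the resuming bit is set, no other bit may be set with it. */
    if (s & DEV_STATE_RESUMING)
        return s == DEV_STATE_RESUMING;

    return true;
}
```

All three invalid patterns have the resuming bit set together with at
least one other bit, which is why a single check on that bit is enough
in this sketch.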

Why should only 110b be used to report an error from the vendor driver?
Aren't we adding more confusion to the interface?

Only 3 of the 32 bits are used so far, so one bit can be spared to
represent an error state from the vendor driver.
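
As a rough sketch of the difference between the two options, the user's
check after a failed transition might look like the following. This is
not proposed header text: DEV_STATE_ERROR and its bit position (bit 3)
are hypothetical, as are the function names.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEV_STATE_MASK        0x7u       /* the 3 bits in use today */
#define DEV_STATE_INVALID_110 0x6u       /* invalid pattern reused as the failure code */
#define DEV_STATE_ERROR       (1u << 3)  /* hypothetical error bit; position is an assumption */

/* Option 1: the vendor driver reports failure as the invalid state 110b. */
static bool failed_by_invalid_state(uint32_t device_state)
{
    return (device_state & DEV_STATE_MASK) == DEV_STATE_INVALID_110;
}

/* Option 2: a dedicated error bit; all other bits are ignored when it is set. */
static bool failed_by_error_bit(uint32_t device_state)
{
    return (device_state & DEV_STATE_ERROR) != 0;
}
```

With option 1, the other invalid patterns 101b and 111b are not treated
as a driver failure, so the user has to know that 110b is special. With
option 2, the check is a single bit test regardless of the other bits.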

Thanks,
Kirti
