From: Kirti Wankhede <kwankhede@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, <cjia@nvidia.com>,
<kevin.tian@intel.com>, <ziye.yang@intel.com>,
<changpeng.liu@intel.com>, <yi.l.liu@intel.com>,
<mlevitsk@redhat.com>, <eskultet@redhat.com>, <cohuck@redhat.com>,
<jonathan.davies@nutanix.com>, <eauger@redhat.com>,
<aik@ozlabs.ru>, <pasic@linux.ibm.com>, <felipe@nutanix.com>,
<Zhengxiao.zx@alibaba-inc.com>, <shuangtai.tst@alibaba-inc.com>,
<Ken.Xue@amd.com>, <zhi.a.wang@intel.com>, <yan.y.zhao@intel.com>,
<qemu-devel@nongnu.org>, <kvm@vger.kernel.org>
Subject: Re: [PATCH v10 Kernel 1/5] vfio: KABI for migration interface for device state
Date: Tue, 7 Jan 2020 23:23:17 +0530 [thread overview]
Message-ID: <08b7f953-6ac5-cd79-b1ff-54338da32d1e@nvidia.com> (raw)
In-Reply-To: <20200107100923.2f7b5597@w520.home>
On 1/7/2020 10:39 PM, Alex Williamson wrote:
> On Tue, 7 Jan 2020 12:58:22 +0530
> Kirti Wankhede <kwankhede@nvidia.com> wrote:
>
>> On 1/7/2020 4:48 AM, Alex Williamson wrote:
>>> On Thu, 2 Jan 2020 18:25:37 +0000
>>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>>
>>>> * Alex Williamson (alex.williamson@redhat.com) wrote:
>>>>> On Fri, 20 Dec 2019 01:40:35 +0530
>>>>> Kirti Wankhede <kwankhede@nvidia.com> wrote:
>>>>>
>>>>>> On 12/19/2019 10:57 PM, Alex Williamson wrote:
>>>>>>
>>>>>> <Snip>
>>>>>>
>>>>
>>>> <snip>
>>>>
>>>>>>
>>>>>> If device state it at pre-copy state (011b).
>>>>>> Transition, i.e., write to device state as stop-and-copy state (010b)
>>>>>> failed, then by previous state I meant device should return pre-copy
>>>>>> state(011b), i.e. previous state which was successfully set, or as you
>>>>>> said current state which was successfully set.
>>>>>
>>>>> Yes, the point I'm trying to make is that this version of the spec
>>>>> tries to tell the user what they should do upon error according to our
>>>>> current interpretation of the QEMU migration protocol. We're not
>>>>> defining the QEMU migration protocol, we're defining something that can
>>>>> be used in a way to support that protocol. So I think we should be
>>>>> concerned with defining our spec, for example my proposal would be: "If
>>>>> a state transition fails the user can read device_state to determine the
>>>>> current state of the device. This should be the previous state of the
>>>>> device unless the vendor driver has encountered an internal error, in
>>>>> which case the device may report the invalid device_state 110b. The
>>>>> user must use the device reset ioctl in order to recover the device
>>>>> from this state. If the device is indicated in a valid device state
>>>>> via reading device_state, the user may attempt to transition the device
>>>>> to any valid state reachable from the current state."
>>>>
>>>> We might want to be able to distinguish between:
>>>> a) The device has failed and needs a reset
>>>> b) The migration has failed
>>>
>>> I think the above provides this. For Kirti's example above of
>>> transitioning from pre-copy to stop-and-copy, the device could refuse
>>> to transition to stop-and-copy, generating an error on the write() of
>>> device_state. The user re-reading device_state would allow them to
>>> determine the current device state, still in pre-copy or failed. Only
>>> the latter would require a device reset.
>>>
>>>> If some part of the devices mechanics for migration fail, but the device
>>>> is otherwise operational then we should be able to decide to fail the
>>>> migration without taking the device down, which might be very bad for
>>>> the VM.
>>>> Losing a VM during migration due to a problem with migration really
>>>> annoys users; it's one thing the migration failing, but taking the VM
>>>> out as well really gets to them.
>>>>
>>>> Having the device automatically transition back to the 'running' state
>>>> seems a bad idea to me; much better to tell the hypervisor and provide
>>>> it with a way to clean up; for example, imagine a system with multiple
>>>> devices that are being migrated, most of them have happily transitioned
>>>> to stop-and-copy, but then the last device decides to fail - so now
>>>> someone is going to have to take all of them back to running.
>>>
>>> Right, unless I'm missing one, it seems invalid->running is the only
>>> self transition the device should make, though still by way of user
>>> interaction via the reset ioctl. Thanks,
>>>
>>
>> Instead of using invalid state by vendor driver on device failure, I
>> think better to reserve one bit in device state which vendor driver can
>> set on device failure. When error bit is set, other bits in device state
>> should be ignored.
>
> Why is a separate bit better? Saving and Restoring states are mutually
> exclusive, so we have an unused and invalid device state already
> without burning another bit. Thanks,
>
There are 3 invalid states:
* 101b => Invalid state
* 110b => Invalid state
* 111b => Invalid state
why only 110b should be used to report error from vendor driver to
report error? Aren't we adding more confusions in the interface?
Only 3 bits from 32 bits are yet used, one bit can be spared to
represent error state from vendor driver.
Thanks,
Kirti
next prev parent reply other threads:[~2020-01-07 17:53 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-16 20:21 [PATCH v10 Kernel 0/5] KABIs to support migration for VFIO devices Kirti Wankhede
2019-12-16 20:21 ` [PATCH v10 Kernel 1/5] vfio: KABI for migration interface for device state Kirti Wankhede
2019-12-16 22:44 ` Alex Williamson
2019-12-17 6:28 ` Kirti Wankhede
2019-12-17 7:12 ` Yan Zhao
2019-12-17 18:43 ` Alex Williamson
2019-12-19 16:08 ` Kirti Wankhede
2019-12-19 17:27 ` Alex Williamson
2019-12-19 20:10 ` Kirti Wankhede
2019-12-19 21:09 ` Alex Williamson
2020-01-02 18:25 ` Dr. David Alan Gilbert
2020-01-06 23:18 ` Alex Williamson
2020-01-07 7:28 ` Kirti Wankhede
2020-01-07 17:09 ` Alex Williamson
2020-01-07 17:53 ` Kirti Wankhede [this message]
2020-01-07 18:56 ` Alex Williamson
2020-01-08 14:59 ` Cornelia Huck
2020-01-08 18:31 ` Alex Williamson
2020-01-08 20:41 ` Kirti Wankhede
2020-01-08 22:44 ` Alex Williamson
2020-01-10 14:21 ` Cornelia Huck
2020-01-07 9:57 ` Dr. David Alan Gilbert
2020-01-07 16:54 ` Alex Williamson
2020-01-07 17:50 ` Dr. David Alan Gilbert
2019-12-16 20:21 ` [PATCH v10 Kernel 2/5] vfio iommu: Adds flag to indicate dirty pages tracking capability support Kirti Wankhede
2019-12-16 23:16 ` Alex Williamson
2019-12-17 6:32 ` Kirti Wankhede
2019-12-16 20:21 ` [PATCH v10 Kernel 3/5] vfio iommu: Add ioctl defination for dirty pages tracking Kirti Wankhede
2019-12-16 20:21 ` [PATCH v10 Kernel 4/5] vfio iommu: Implementation of ioctl to " Kirti Wankhede
2019-12-17 5:15 ` Yan Zhao
2019-12-17 9:24 ` Kirti Wankhede
2019-12-17 9:51 ` Yan Zhao
2019-12-17 11:47 ` Kirti Wankhede
2019-12-18 1:04 ` Yan Zhao
2019-12-18 20:05 ` Dr. David Alan Gilbert
2019-12-19 0:57 ` Yan Zhao
2019-12-19 16:21 ` Kirti Wankhede
2019-12-20 0:58 ` Yan Zhao
2020-01-03 19:44 ` Dr. David Alan Gilbert
2020-01-04 3:53 ` Yan Zhao
2019-12-18 21:39 ` Alex Williamson
2019-12-19 18:42 ` Kirti Wankhede
2019-12-19 18:56 ` Alex Williamson
2019-12-16 20:21 ` [PATCH v10 Kernel 5/5] vfio iommu: Update UNMAP_DMA ioctl to get dirty bitmap before unmap Kirti Wankhede
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=08b7f953-6ac5-cd79-b1ff-54338da32d1e@nvidia.com \
--to=kwankhede@nvidia.com \
--cc=Ken.Xue@amd.com \
--cc=Zhengxiao.zx@alibaba-inc.com \
--cc=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=changpeng.liu@intel.com \
--cc=cjia@nvidia.com \
--cc=cohuck@redhat.com \
--cc=dgilbert@redhat.com \
--cc=eauger@redhat.com \
--cc=eskultet@redhat.com \
--cc=felipe@nutanix.com \
--cc=jonathan.davies@nutanix.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=mlevitsk@redhat.com \
--cc=pasic@linux.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=shuangtai.tst@alibaba-inc.com \
--cc=yan.y.zhao@intel.com \
--cc=yi.l.liu@intel.com \
--cc=zhi.a.wang@intel.com \
--cc=ziye.yang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).