From: Cornelia Huck <cohuck@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Kirti Wankhede <kwankhede@nvidia.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>, <cjia@nvidia.com>,
	<kevin.tian@intel.com>, <ziye.yang@intel.com>,
	<changpeng.liu@intel.com>, <yi.l.liu@intel.com>,
	<mlevitsk@redhat.com>, <eskultet@redhat.com>,
	<jonathan.davies@nutanix.com>, <eauger@redhat.com>,
	<aik@ozlabs.ru>, <pasic@linux.ibm.com>, <felipe@nutanix.com>,
	<Zhengxiao.zx@alibaba-inc.com>, <shuangtai.tst@alibaba-inc.com>,
	<Ken.Xue@amd.com>, <zhi.a.wang@intel.com>, <yan.y.zhao@intel.com>,
	<qemu-devel@nongnu.org>, <kvm@vger.kernel.org>
Subject: Re: [PATCH v10 Kernel 1/5] vfio: KABI for migration interface for device state
Date: Wed, 8 Jan 2020 15:59:55 +0100
Message-ID: <20200108155955.78e908c1.cohuck@redhat.com>
In-Reply-To: <20200107115602.25156c41@w520.home>

On Tue, 7 Jan 2020 11:56:02 -0700
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Tue, 7 Jan 2020 23:23:17 +0530
> Kirti Wankhede <kwankhede@nvidia.com> wrote:

> > There are 3 invalid states:
> >   *  101b => Invalid state
> >   *  110b => Invalid state
> >   *  111b => Invalid state
> > 
> > Why should only 110b be used by the vendor driver to report an
> > error? Aren't we adding more confusion to the interface?
> 
> I think the only chance of confusion is poor documentation.  If we
> define all of the above as invalid and then say any invalid state
> indicates an error condition, then the burden is on the user to
> enumerate all the invalid states.  That's not a good idea.  Instead we
> could say 101b (_RESUMING|_RUNNING) is reserved, it's not currently
> used but it might be useful some day.  Therefore there are no valid
> transitions into or out of this state.  A vendor driver should fail a
> write(2) attempting to enter this state.
> 
> That leaves 11Xb, where we consider _RESUMING and _SAVING as mutually
> exclusive, so neither are likely to ever be valid states.  Logically,
> if the device is in a failed state such that it needs to be reset to be
> recovered, I would hope the device is not running, so !_RUNNING (110b)
> seems appropriate.  I'm not sure we need that level of detail yet
> though, so I was actually just assuming both 11Xb states would indicate
> an error state and the undefined _RUNNING bit might differentiate
> something in the future.
> 
> Therefore, I think we'd have:
> 
>  * 101b => Reserved
>  * 11Xb => Error
> 
> Where the device can only self transition into the Error state on a
> failed device_state transition and the only exit from the Error state
> is via the reset ioctl.  The Reserved state is unreachable.  The vendor
> driver must error on device_state writes to enter or exit the Error
> state and must error on writes to enter Reserved states.  Is that still
> confusing?

I think one thing we could do is start to tie the meaning more to the
actual state (bit combination) and less to the individual bits. I.e.

- bit 0 indicates 'running',
- bit 1 indicates 'saving',
- bit 2 indicates 'resuming',
- bits 3-31 are reserved. [Aside: reserved-and-ignored or
  reserved-and-must-be-zero?]

[Note that I don't specify what happens when a bit is set or unset.]

States are then defined as:
000b => stopped state (not saving or resuming)
001b => running state (not saving or resuming)
010b => stop-and-copy state
011b => pre-copy state
100b => resuming state

[Transitions between these states defined, as before.]

101b => reserved [for post-copy; no transitions defined]
111b => reserved [state does not make sense; no transitions defined]
110b => error state [state does not make sense per se, but it does not
        indicate running; transitions into this state *are* possible]

To a 'reserved' state, we can later assign a different meaning (we
could even re-use 111b for a different error state, if needed); while
the error state must always stay the error state.
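
To make the table above a bit more concrete, here is a rough sketch in C
of how a state value could be classified and how a user write could be
checked against the rules Alex describes. This is only an illustration:
the names (STATE_RUNNING, classify_state(), check_user_write(), ...) are
made up and are not taken from the patch, and treating bits 3-31 as
must-be-zero is an assumption, since that question is still open.

```c
#include <stdint.h>

/* Hypothetical bit names; bit positions follow the list above. */
#define STATE_RUNNING   (1u << 0)
#define STATE_SAVING    (1u << 1)
#define STATE_RESUMING  (1u << 2)
#define STATE_MASK      (STATE_RUNNING | STATE_SAVING | STATE_RESUMING)

enum state_class {
	STATE_CLASS_VALID,     /* 000b, 001b, 010b, 011b, 100b */
	STATE_CLASS_RESERVED,  /* 101b, 111b, and anything > 111b */
	STATE_CLASS_ERROR,     /* 110b */
};

/* Classify by the whole bit combination, not by individual bits. */
static enum state_class classify_state(uint32_t state)
{
	/* Assumption: bits 3-31 are reserved-and-must-be-zero. */
	if (state & ~STATE_MASK)
		return STATE_CLASS_RESERVED;

	switch (state) {
	case 0x0: /* stopped */
	case 0x1: /* running */
	case 0x2: /* stop-and-copy */
	case 0x3: /* pre-copy */
	case 0x4: /* resuming */
		return STATE_CLASS_VALID;
	case 0x6: /* error */
		return STATE_CLASS_ERROR;
	default:  /* 101b and 111b */
		return STATE_CLASS_RESERVED;
	}
}

/*
 * Returns 0 if a user write of 'next' may be attempted while the device
 * is in 'cur', -1 otherwise. The error state is entered only by the
 * device itself and left only via reset, so user writes may neither
 * enter nor exit it; reserved states cannot be entered at all.
 * Which valid-to-valid transitions are allowed is not modelled here.
 */
static int check_user_write(uint32_t cur, uint32_t next)
{
	if (classify_state(cur) == STATE_CLASS_ERROR)
		return -1;
	if (classify_state(next) != STATE_CLASS_VALID)
		return -1;
	return 0;
}
```

With this, check_user_write(0x1, 0x3) (running to pre-copy) passes the
check, while any write from 110b, or to 101b, 110b or 111b, is rejected.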

We should probably use some kind of feature indication to signify
whether a 'reserved' state actually has a meaning. Maybe we should
also designate the states > 111b as 'reserved'.

Does that make sense?

