* [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
@ 2018-06-11  7:56 Tiwei Bie
  2018-06-11  8:43 ` [virtio-dev] " Cornelia Huck
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Tiwei Bie @ 2018-06-11  7:56 UTC (permalink / raw)
  To: mst, cohuck, stefanha, pbonzini, virtio-dev
  Cc: dan.daly, cunming.liang, zhihong.wang

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
---
v2:
- Refine the wording (Cornelia);

v3:
- Refine the wording (MST);

 content.tex | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/content.tex b/content.tex
index f996fad..3c7d67d 100644
--- a/content.tex
+++ b/content.tex
@@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
 of features the driver accepts, otherwise it MUST fail to set the
 FEATURES_OK \field{device status} bit when the driver writes it.
 
+If a device has successfully negotiated a set of features
+at least once (by accepting the FEATURES_OK \field{device
+status} bit during device initialization), then it SHOULD
+NOT fail re-negotiation of the same set of features after
+a device or system reset.  Failure to do so would interfere
+with resuming from suspend and error recovery.
+
 \subsection{Legacy Interface: A Note on Feature
 Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
 Bits / Legacy Interface: A Note on Feature Bits}
-- 
2.17.0


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org



* [virtio-dev] Re: [PATCH v3] content: enhance device requirements for feature bits
  2018-06-11  7:56 [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits Tiwei Bie
@ 2018-06-11  8:43 ` Cornelia Huck
  2018-06-11 13:24 ` Michael S. Tsirkin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Cornelia Huck @ 2018-06-11  8:43 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: mst, stefanha, pbonzini, virtio-dev, dan.daly, cunming.liang,
	zhihong.wang

On Mon, 11 Jun 2018 15:56:40 +0800
Tiwei Bie <tiwei.bie@intel.com> wrote:

> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> ---
> v2:
> - Refine the wording (Cornelia);
> 
> v3:
> - Refine the wording (MST);
> 
>  content.tex | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/content.tex b/content.tex
> index f996fad..3c7d67d 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>  of features the driver accepts, otherwise it MUST fail to set the
>  FEATURES_OK \field{device status} bit when the driver writes it.
>  
> +If a device has successfully negotiated a set of features
> +at least once (by accepting the FEATURES_OK \field{device
> +status} bit during device initialization), then it SHOULD
> +NOT fail re-negotiation of the same set of features after
> +a device or system reset.  Failure to do so would interfere
> +with resuming from suspend and error recovery.
> +
>  \subsection{Legacy Interface: A Note on Feature
>  Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
>  Bits / Legacy Interface: A Note on Feature Bits}

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


* [virtio-dev] Re: [PATCH v3] content: enhance device requirements for feature bits
  2018-06-11  7:56 [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits Tiwei Bie
  2018-06-11  8:43 ` [virtio-dev] " Cornelia Huck
@ 2018-06-11 13:24 ` Michael S. Tsirkin
  2018-06-11 13:29   ` Cornelia Huck
  2018-06-11 13:44 ` Michael S. Tsirkin
  2018-06-15 12:10 ` [virtio-dev] " Halil Pasic
  3 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-11 13:24 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: cohuck, stefanha, pbonzini, virtio-dev, dan.daly, cunming.liang,
	zhihong.wang

On Mon, Jun 11, 2018 at 03:56:40PM +0800, Tiwei Bie wrote:
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> ---
> v2:
> - Refine the wording (Cornelia);
> 
> v3:
> - Refine the wording (MST);
> 
>  content.tex | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/content.tex b/content.tex
> index f996fad..3c7d67d 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>  of features the driver accepts, otherwise it MUST fail to set the
>  FEATURES_OK \field{device status} bit when the driver writes it.
>  
> +If a device has successfully negotiated a set of features
> +at least once (by accepting the FEATURES_OK \field{device
> +status} bit during device initialization), then it SHOULD
> +NOT fail re-negotiation of the same set of features after
> +a device or system reset.  Failure to do so would interfere
> +with resuming from suspend and error recovery.
> +
>  \subsection{Legacy Interface: A Note on Feature
>  Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
>  Bits / Legacy Interface: A Note on Feature Bits}

OK but there's no \field{device status} anywhere
else I think.
We only have \field{status} in spec.


> -- 
> 2.17.0


* [virtio-dev] Re: [PATCH v3] content: enhance device requirements for feature bits
  2018-06-11 13:24 ` Michael S. Tsirkin
@ 2018-06-11 13:29   ` Cornelia Huck
  2018-06-11 13:44     ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Cornelia Huck @ 2018-06-11 13:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tiwei Bie, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang

On Mon, 11 Jun 2018 16:24:51 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Jun 11, 2018 at 03:56:40PM +0800, Tiwei Bie wrote:
> > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > ---
> > v2:
> > - Refine the wording (Cornelia);
> > 
> > v3:
> > - Refine the wording (MST);
> > 
> >  content.tex | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/content.tex b/content.tex
> > index f996fad..3c7d67d 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> >  of features the driver accepts, otherwise it MUST fail to set the
> >  FEATURES_OK \field{device status} bit when the driver writes it.
> >  
> > +If a device has successfully negotiated a set of features
> > +at least once (by accepting the FEATURES_OK \field{device
> > +status} bit during device initialization), then it SHOULD
> > +NOT fail re-negotiation of the same set of features after
> > +a device or system reset.  Failure to do so would interfere
> > +with resuming from suspend and error recovery.
> > +
> >  \subsection{Legacy Interface: A Note on Feature
> >  Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
> >  Bits / Legacy Interface: A Note on Feature Bits}  
> 
> OK but there's no \field{device status} anywhere
> else I think.
> We only have \field{status} in spec.

Should probably be "device \field{status} bit", then?


* [virtio-dev] Re: [PATCH v3] content: enhance device requirements for feature bits
  2018-06-11 13:29   ` Cornelia Huck
@ 2018-06-11 13:44     ` Michael S. Tsirkin
  0 siblings, 0 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-11 13:44 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Tiwei Bie, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang

On Mon, Jun 11, 2018 at 03:29:33PM +0200, Cornelia Huck wrote:
> On Mon, 11 Jun 2018 16:24:51 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Mon, Jun 11, 2018 at 03:56:40PM +0800, Tiwei Bie wrote:
> > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > > ---
> > > v2:
> > > - Refine the wording (Cornelia);
> > > 
> > > v3:
> > > - Refine the wording (MST);
> > > 
> > >  content.tex | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > > 
> > > diff --git a/content.tex b/content.tex
> > > index f996fad..3c7d67d 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> > >  of features the driver accepts, otherwise it MUST fail to set the
> > >  FEATURES_OK \field{device status} bit when the driver writes it.
> > >  
> > > +If a device has successfully negotiated a set of features
> > > +at least once (by accepting the FEATURES_OK \field{device
> > > +status} bit during device initialization), then it SHOULD
> > > +NOT fail re-negotiation of the same set of features after
> > > +a device or system reset.  Failure to do so would interfere
> > > +with resuming from suspend and error recovery.
> > > +
> > >  \subsection{Legacy Interface: A Note on Feature
> > >  Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
> > >  Bits / Legacy Interface: A Note on Feature Bits}  
> > 
> > OK but there's no \field{device status} anywhere
> > else I think.
> > We only have \field{status} in spec.
> 
> Should probably be "device \field{status} bit", then?

I'm actually wrong: it is called device status in "Basic Facilities of a
Virtio Device". We should probably fix the places that
call it \field{status}.



* [virtio-dev] Re: [PATCH v3] content: enhance device requirements for feature bits
  2018-06-11  7:56 [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits Tiwei Bie
  2018-06-11  8:43 ` [virtio-dev] " Cornelia Huck
  2018-06-11 13:24 ` Michael S. Tsirkin
@ 2018-06-11 13:44 ` Michael S. Tsirkin
  2018-06-15 12:10 ` [virtio-dev] " Halil Pasic
  3 siblings, 0 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-11 13:44 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: cohuck, stefanha, pbonzini, virtio-dev, dan.daly, cunming.liang,
	zhihong.wang

On Mon, Jun 11, 2018 at 03:56:40PM +0800, Tiwei Bie wrote:
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
> v2:
> - Refine the wording (Cornelia);
> 
> v3:
> - Refine the wording (MST);
> 
>  content.tex | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/content.tex b/content.tex
> index f996fad..3c7d67d 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>  of features the driver accepts, otherwise it MUST fail to set the
>  FEATURES_OK \field{device status} bit when the driver writes it.
>  
> +If a device has successfully negotiated a set of features
> +at least once (by accepting the FEATURES_OK \field{device
> +status} bit during device initialization), then it SHOULD
> +NOT fail re-negotiation of the same set of features after
> +a device or system reset.  Failure to do so would interfere
> +with resuming from suspend and error recovery.
> +
>  \subsection{Legacy Interface: A Note on Feature
>  Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
>  Bits / Legacy Interface: A Note on Feature Bits}
> -- 
> 2.17.0


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-11  7:56 [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits Tiwei Bie
                   ` (2 preceding siblings ...)
  2018-06-11 13:44 ` Michael S. Tsirkin
@ 2018-06-15 12:10 ` Halil Pasic
  2018-06-15 12:19   ` Michael S. Tsirkin
  3 siblings, 1 reply; 24+ messages in thread
From: Halil Pasic @ 2018-06-15 12:10 UTC (permalink / raw)
  To: Tiwei Bie, mst, cohuck, stefanha, pbonzini, virtio-dev
  Cc: dan.daly, cunming.liang, zhihong.wang



On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> ---
> v2:
> - Refine the wording (Cornelia);
> 
> v3:
> - Refine the wording (MST);
> 
>   content.tex | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/content.tex b/content.tex
> index f996fad..3c7d67d 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>   of features the driver accepts, otherwise it MUST fail to set the
>   FEATURES_OK \field{device status} bit when the driver writes it.
>   
> +If a device has successfully negotiated a set of features
> +at least once (by accepting the FEATURES_OK \field{device
> +status} bit during device initialization), then it SHOULD
> +NOT fail re-negotiation of the same set of features after
> +a device or system reset.  Failure to do so would interfere
> +with resuming from suspend and error recovery.
> +


Sorry people, but I don't get it. I mean, it is kind of reasonable
to assume that with a given device and a given driver (given, i.e.
nothing changes) the two will always negotiate the same features
(including the extremal case where the negotiation fails).

Either the device or the driver rolling dice to make feature negotiation
more fun seems quite unreasonable. So I assume this is not what we are
bothering to soft-prohibit here.

So the interesting scenario seems to be when stuff changes. When
migrating, the implementation of the device could change. Or something
changes regarding the resources used to provide the virtual device.

But then, if the device really cannot support the set of features
it used to be able to support, I guess the SHOULD does not take effect
(I guess that is the difference compared to MUST).

Bottom line is: I tried to figure out what this is about, but I failed.
I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too, but
it did not click. I would appreciate some assistance.


>   \subsection{Legacy Interface: A Note on Feature
>   Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
>   Bits / Legacy Interface: A Note on Feature Bits}
> 



* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 12:10 ` [virtio-dev] " Halil Pasic
@ 2018-06-15 12:19   ` Michael S. Tsirkin
  2018-06-15 12:42     ` Halil Pasic
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-15 12:19 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang

On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
> 
> 
> On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > ---
> > v2:
> > - Refine the wording (Cornelia);
> > 
> > v3:
> > - Refine the wording (MST);
> > 
> >   content.tex | 7 +++++++
> >   1 file changed, 7 insertions(+)
> > 
> > diff --git a/content.tex b/content.tex
> > index f996fad..3c7d67d 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> >   of features the driver accepts, otherwise it MUST fail to set the
> >   FEATURES_OK \field{device status} bit when the driver writes it.
> > +If a device has successfully negotiated a set of features
> > +at least once (by accepting the FEATURES_OK \field{device
> > +status} bit during device initialization), then it SHOULD
> > +NOT fail re-negotiation of the same set of features after
> > +a device or system reset.  Failure to do so would interfere
> > +with resuming from suspend and error recovery.
> > +
> 
> 
> Sorry people, but I don't get it. I mean, it is kind of reasonable
> to assume that with a given device and a given driver (given, i.e.
> nothing changes) the two will always negotiate the same features
> (including the extremal case where the negotiation fails).
>
> Either the device or the driver rolling dice to make feature negotiation
> more fun seems quite unreasonable. So I assume this is not what we are
> bothering to soft-prohibit here.
>
> So the interesting scenario seems to be when stuff changes. When
> migrating, the implementation of the device could change. Or something
> changes regarding the resources used to provide the virtual device.
>
> But then, if the device really cannot support the set of features
> it used to be able to support, I guess the SHOULD does not take effect
> (I guess that is the difference compared to MUST).
>
> Bottom line is: I tried to figure out what this is about, but I failed.
> I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too, but
> it did not click. I would appreciate some assistance.

It's exactly what it says. Let's say you negotiated a feature and then
the device sets NEED_RESET.  The driver must now reset the device and put
it back in the same state it had before the reset, then resubmit
requests that were available but never used.

What if any of the features changed? The device suddenly
needs to check for requests which do not match the
features.

Suspend is similar: guests tend to assume hardware
does not change across suspend/resume, and any changes
tend to make resume fail.
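
The recovery sequence described above can be sketched with a toy model. This is only an illustration in Python, not code from any real device or driver: the `ToyDevice` class and the `recover` helper are hypothetical names, and NEED_RESET is assumed to mean the DEVICE_NEEDS_RESET status bit. The status bit values (8 and 64) are the ones given in the virtio specification. The model shows that re-negotiating the same feature set after a reset succeeds only as long as the device still offers every feature it accepted before.

```python
# Toy model of virtio feature negotiation across a reset (illustrative only).
# Status bit values follow the virtio specification; the class and function
# names are hypothetical and do not come from any real implementation.

FEATURES_OK = 8          # device status bit: feature negotiation accepted
DEVICE_NEEDS_RESET = 64  # device status bit: device asks the driver to reset it


class ToyDevice:
    def __init__(self, offered):
        self.offered = offered  # feature bitmap the device offers
        self.status = 0         # device status field
        self.accepted = 0       # feature bitmap negotiated with the driver

    def reset(self):
        # A reset clears the status and the negotiated state.
        # The offered feature set is expected to stay the same.
        self.status = 0
        self.accepted = 0

    def write_features_ok(self, driver_features):
        # Accept only a subset of the offered features. Otherwise leave
        # FEATURES_OK clear, which the driver reads back as a failure.
        if driver_features & ~self.offered:
            return False
        self.accepted = driver_features
        self.status |= FEATURES_OK
        return True


def recover(device, previous_features):
    """Driver-side recovery: reset, then re-negotiate the same feature set."""
    device.reset()
    return device.write_features_ok(previous_features)
```

If the offered set shrinks between the first negotiation and the recovery, `recover` returns `False`, which is the situation the proposed SHOULD NOT is meant to discourage.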

> 
> >   \subsection{Legacy Interface: A Note on Feature
> >   Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
> >   Bits / Legacy Interface: A Note on Feature Bits}
> > 


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 12:19   ` Michael S. Tsirkin
@ 2018-06-15 12:42     ` Halil Pasic
  2018-06-15 13:38       ` Michael S. Tsirkin
  2018-06-15 13:39       ` Tiwei Bie
  0 siblings, 2 replies; 24+ messages in thread
From: Halil Pasic @ 2018-06-15 12:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang



On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
> On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
>>
>>
>> On 06/11/2018 09:56 AM, Tiwei Bie wrote:
>>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>>> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
>>> ---
>>> v2:
>>> - Refine the wording (Cornelia);
>>>
>>> v3:
>>> - Refine the wording (MST);
>>>
>>>    content.tex | 7 +++++++
>>>    1 file changed, 7 insertions(+)
>>>
>>> diff --git a/content.tex b/content.tex
>>> index f996fad..3c7d67d 100644
>>> --- a/content.tex
>>> +++ b/content.tex
>>> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>>>    of features the driver accepts, otherwise it MUST fail to set the
>>>    FEATURES_OK \field{device status} bit when the driver writes it.
>>> +If a device has successfully negotiated a set of features
>>> +at least once (by accepting the FEATURES_OK \field{device
>>> +status} bit during device initialization), then it SHOULD
>>> +NOT fail re-negotiation of the same set of features after
>>> +a device or system reset.  Failure to do so would interfere
>>> +with resuming from suspend and error recovery.
>>> +
>>
>>
>> Sorry people, but I don't get it. I mean, it is kind of reasonable
>> to assume that with a given device and a given driver (given, i.e.
>> nothing changes) the two will always negotiate the same features
>> (including the extremal case where the negotiation fails).
>>
>> Either the device or the driver rolling dice to make feature negotiation
>> more fun seems quite unreasonable. So I assume this is not what we are
>> bothering to soft-prohibit here.
>>
>> So the interesting scenario seems to be when stuff changes. When
>> migrating, the implementation of the device could change. Or something
>> changes regarding the resources used to provide the virtual device.
>>
>> But then, if the device really cannot support the set of features
>> it used to be able to support, I guess the SHOULD does not take effect
>> (I guess that is the difference compared to MUST).
>>
>> Bottom line is: I tried to figure out what this is about, but I failed.
>> I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too, but
>> it did not click. I would appreciate some assistance.
> 
> It's exactly what it says. Let's say you negotiated a feature and then
> the device sets NEED_RESET.  The driver must now reset the device and put
> it back in the same state it had before the reset, then resubmit
> requests that were available but never used.
>
> What if any of the features changed? The device suddenly
> needs to check for requests which do not match the
> features.
>
> Suspend is similar: guests tend to assume hardware
> does not change across suspend/resume, and any changes
> tend to make resume fail.
> 

Thank you very much! But it still does not answer why a device would
want to do that (fail to negotiate a feature that it was able
to negotiate before). So I'm still in the dark about what we are
trading for what.

Is there a patch somewhere that fixes such a bug? Maybe that would
help me understand what can be done at the device to avoid the
problem.

Regards,
Halil


>>
>>>    \subsection{Legacy Interface: A Note on Feature
>>>    Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
>>>    Bits / Legacy Interface: A Note on Feature Bits}
>>>
> 

* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 12:42     ` Halil Pasic
@ 2018-06-15 13:38       ` Michael S. Tsirkin
  2018-06-15 15:16         ` Halil Pasic
  2018-06-15 13:39       ` Tiwei Bie
  1 sibling, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-15 13:38 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang

On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
> 
> 
> On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
> > On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
> > > 
> > > 
> > > On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > > > ---
> > > > v2:
> > > > - Refine the wording (Cornelia);
> > > > 
> > > > v3:
> > > > - Refine the wording (MST);
> > > > 
> > > >    content.tex | 7 +++++++
> > > >    1 file changed, 7 insertions(+)
> > > > 
> > > > diff --git a/content.tex b/content.tex
> > > > index f996fad..3c7d67d 100644
> > > > --- a/content.tex
> > > > +++ b/content.tex
> > > > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> > > >    of features the driver accepts, otherwise it MUST fail to set the
> > > >    FEATURES_OK \field{device status} bit when the driver writes it.
> > > > +If a device has successfully negotiated a set of features
> > > > +at least once (by accepting the FEATURES_OK \field{device
> > > > +status} bit during device initialization), then it SHOULD
> > > > +NOT fail re-negotiation of the same set of features after
> > > > +a device or system reset.  Failure to do so would interfere
> > > > +with resuming from suspend and error recovery.
> > > > +
> > > 
> > > 
> > > > Sorry people, but I don't get it. I mean, it is kind of reasonable
> > > > to assume that with a given device and a given driver (given, i.e.
> > > > nothing changes) the two will always negotiate the same features
> > > > (including the extremal case where the negotiation fails).
> > > >
> > > > Either the device or the driver rolling dice to make feature negotiation
> > > > more fun seems quite unreasonable. So I assume this is not what we are
> > > > bothering to soft-prohibit here.
> > > >
> > > > So the interesting scenario seems to be when stuff changes. When
> > > > migrating, the implementation of the device could change. Or something
> > > > changes regarding the resources used to provide the virtual device.
> > > >
> > > > But then, if the device really cannot support the set of features
> > > > it used to be able to support, I guess the SHOULD does not take effect
> > > > (I guess that is the difference compared to MUST).
> > > >
> > > > Bottom line is: I tried to figure out what this is about, but I failed.
> > > > I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too, but
> > > > it did not click. I would appreciate some assistance.
> > 
> > > It's exactly what it says. Let's say you negotiated a feature and then
> > > the device sets NEED_RESET.  The driver must now reset the device and put
> > > it back in the same state it had before the reset, then resubmit
> > > requests that were available but never used.
> > >
> > > What if any of the features changed? The device suddenly
> > > needs to check for requests which do not match the
> > > features.
> > >
> > > Suspend is similar: guests tend to assume hardware does not change
> > > across suspend/resume, and any changes tend to make resume fail.
> > 
> 
> Thank you very much! But it still does not answer why a device would
> want to do that (fail to negotiate a feature that it was able to
> negotiate before). So I'm still in the dark about what we are trading
> for what.

It would be a misconfigured device.  For example, QEMU does not migrate
the device features, so if you misconfigure QEMU with different flags on
source and destination (not a supported configuration), features might
seem to change from the guest's POV.
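
As a rough illustration of the misconfiguration case, the check below computes which negotiated feature bits would be lost if the destination offers a different set than the source did. It is a minimal Python sketch: the function name and the plain integer bitmaps are assumptions made for this example and do not correspond to any QEMU interface.

```python
def missing_after_migration(negotiated_on_source, offered_on_destination):
    """Return the feature bits the guest negotiated that the destination lacks.

    Both arguments are plain integer bitmaps. A non-zero result means that
    re-negotiating the same feature set on the destination would fail.
    """
    return negotiated_on_source & ~offered_on_destination
```

A result of zero means the destination offers everything the guest negotiated, so the feature set looks unchanged from the guest's side.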

> Is there a patch somewhere that fixes such a bug? Maybe that would
> help me understand what can be done at the device to avoid the
> problem.
> 
> Regards,
> Halil
> 
> 
> > > 
> > > >    \subsection{Legacy Interface: A Note on Feature
> > > >    Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
> > > >    Bits / Legacy Interface: A Note on Feature Bits}
> > > > 
> > 

* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 12:42     ` Halil Pasic
  2018-06-15 13:38       ` Michael S. Tsirkin
@ 2018-06-15 13:39       ` Tiwei Bie
  2018-06-15 14:21         ` Halil Pasic
  1 sibling, 1 reply; 24+ messages in thread
From: Tiwei Bie @ 2018-06-15 13:39 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Michael S. Tsirkin, cohuck, stefanha, pbonzini, virtio-dev,
	dan.daly, cunming.liang, zhihong.wang

On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
> On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
> > On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
> > > 
> > > 
> > > On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > > > ---
> > > > v2:
> > > > - Refine the wording (Cornelia);
> > > > 
> > > > v3:
> > > > - Refine the wording (MST);
> > > > 
> > > >    content.tex | 7 +++++++
> > > >    1 file changed, 7 insertions(+)
> > > > 
> > > > diff --git a/content.tex b/content.tex
> > > > index f996fad..3c7d67d 100644
> > > > --- a/content.tex
> > > > +++ b/content.tex
> > > > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> > > >    of features the driver accepts, otherwise it MUST fail to set the
> > > >    FEATURES_OK \field{device status} bit when the driver writes it.
> > > > +If a device has successfully negotiated a set of features
> > > > +at least once (by accepting the FEATURES_OK \field{device
> > > > +status} bit during device initialization), then it SHOULD
> > > > +NOT fail re-negotiation of the same set of features after
> > > > +a device or system reset.  Failure to do so would interfere
> > > > +with resuming from suspend and error recovery.
> > > > +
> > > 
> > > 
> > > > Sorry people, but I don't get it. I mean, it is kind of reasonable
> > > > to assume that with a given device and a given driver (given, i.e.
> > > > nothing changes) the two will always negotiate the same features
> > > > (including the extremal case where the negotiation fails).
> > > >
> > > > Either the device or the driver rolling dice to make feature negotiation
> > > > more fun seems quite unreasonable. So I assume this is not what we are
> > > > bothering to soft-prohibit here.
> > > >
> > > > So the interesting scenario seems to be when stuff changes. When
> > > > migrating, the implementation of the device could change. Or something
> > > > changes regarding the resources used to provide the virtual device.
> > > >
> > > > But then, if the device really cannot support the set of features
> > > > it used to be able to support, I guess the SHOULD does not take effect
> > > > (I guess that is the difference compared to MUST).
> > > >
> > > > Bottom line is: I tried to figure out what this is about, but I failed.
> > > > I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too, but
> > > > it did not click. I would appreciate some assistance.
> > 
> > It's exactly what it says. Let's say you negotiated a feature and then
> > device sets NEED_RESET.  Driver must now reset the device and put it
> > back in the same state it had before the reset, then resubmit
> > requests that were available but never used.
> > 
> > What if any of the features changed? Device suddenly
> > needs to check for requests which do not match the
> > features.
> > 
> > Suspend is similar: guests tend to assume hardware
> > does not change across suspend/resume, any changes
> > tend to make resume fail.
> > 
> 
> Thank you very much! But it still does not answer why would a device
> want to do that (fail to negotiate a feature that it was able
> to negotiate before). So I'm still in the dark about what are we
> trading for what.

Hi Halil,

Just as you said, normally there is no reason
for a device to fail to negotiate a feature that it
was able to negotiate before. But the spec doesn't
forbid devices from doing this, i.e. the spec allows a
device to fail to negotiate a feature that it was
able to negotiate before, which could cause problems
in some cases. Although everything works fine in
practice because no device would really do
this, it would be better to make the spec explicitly
forbid devices from doing this in the necessary cases.
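
To illustrate the requirement, here is a minimal sketch of a device model
that never fails re-negotiation of a feature set it accepted before. It is
not taken from any real implementation: the class SketchDevice, the
max_features limit and the accepted_once record are hypothetical names made
up for this example. Only the FEATURES_OK value (8) comes from the spec.

```python
FEATURES_OK = 8  # device status bit value defined by the virtio spec


class SketchDevice:
    """Hypothetical device model; all names here are assumptions."""

    def __init__(self, offered, max_features):
        self.offered = offered            # bitmask of offered feature bits
        self.max_features = max_features  # made-up runtime limit that may change
        self.status = 0
        self.accepted_once = set()        # feature sets accepted at least once

    def reset(self):
        # A reset clears the device status, but the record of earlier
        # successful negotiations is deliberately kept.
        self.status = 0

    def _supportable(self, features):
        # Reject bits that were never offered, then apply the runtime limit.
        if features & ~self.offered:
            return False
        return bin(features).count("1") <= self.max_features

    def set_features_ok(self, features):
        # Driver writes FEATURES_OK; the device sets the bit only on success.
        if features in self.accepted_once or self._supportable(features):
            self.accepted_once.add(features)
            self.status |= FEATURES_OK
        return bool(self.status & FEATURES_OK)
```

In this sketch a set that was accepted once stays acceptable even if the
made-up runtime limit shrinks later, while a set that was never accepted can
still be refused.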

Best regards,
Tiwei Bie

> 
> Is there somewhere a patch that fixes such a bug? Maybe that would
> help me understand what can be done at the device to avoid the
> problem.
> 
> Regards,
> Halil
> 
> 
> > > 
> > > >    \subsection{Legacy Interface: A Note on Feature
> > > >    Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
> > > >    Bits / Legacy Interface: A Note on Feature Bits}
> > > > 
> > 
> 


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 13:39       ` Tiwei Bie
@ 2018-06-15 14:21         ` Halil Pasic
  2018-06-15 15:36           ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Halil Pasic @ 2018-06-15 14:21 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: Michael S. Tsirkin, cohuck, stefanha, pbonzini, virtio-dev,
	dan.daly, cunming.liang, zhihong.wang



On 06/15/2018 03:39 PM, Tiwei Bie wrote:
> On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
>> On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
>>>>
>>>>
>>>> On 06/11/2018 09:56 AM, Tiwei Bie wrote:
>>>>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>>>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>>>>> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
>>>>> ---
>>>>> v2:
>>>>> - Refine the wording (Cornelia);
>>>>>
>>>>> v3:
>>>>> - Refine the wording (MST);
>>>>>
>>>>>     content.tex | 7 +++++++
>>>>>     1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/content.tex b/content.tex
>>>>> index f996fad..3c7d67d 100644
>>>>> --- a/content.tex
>>>>> +++ b/content.tex
>>>>> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>>>>>     of features the driver accepts, otherwise it MUST fail to set the
>>>>>     FEATURES_OK \field{device status} bit when the driver writes it.
>>>>> +If a device has successfully negotiated a set of features
>>>>> +at least once (by accepting the FEATURES_OK \field{device
>>>>> +status} bit during device initialization), then it SHOULD
>>>>> +NOT fail re-negotiation of the same set of features after
>>>>> +a device or system reset.  Failure to do so would interfere
>>>>> +with resuming from suspend and error recovery.
>>>>> +
>>>>
>>>>
>>>> Sorry people but I don't get it. I mean it is kind of reasonable
>>>> to assume that with a given device and a given driver (given, i.e.
>>>> nothing changes) the two will always negotiate the same features
>>>> (including the extremal case where the negotiation fails).
>>>>
>>>> Either the device or a driver rolling a dice to make feature negotiation
>>>> more fun seems quite unreasonable. So I assume this is not what we are
>>>> bothering to soft prohibit here.
>>>>
>>>> So the interesting scenario seems to be when stuff changes. When
>>>> migrating the implementation of the device could change. Or something
>>>> changes regarding the resources used to provide the virtual device.
>>>>
>>>> But then, if the device really can not support the set of features
>>>> it used to be able, I guess the SHOULD does not take effect (I guess
>>>> that is the difference compared to MUST).
>>>>
>>>> Bottom line is: I tried to figure out what is this about, but I failed.
>>>> I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
>>>> it did not click. I would appreciate some assistance.
>>>
>>> It's exactly what it says. Let's say you negotiated a feature and then
>>> device sets NEED_RESET.  Driver must now reset the device and put it
>>> back in the same state it had before the reset, then resubmit
>>> requests that were available but never used.
>>>
>>> What if any of the features changed? Device suddenly
>>> needs to check for requests which do not match the
>>> features.
>>>
>>> Suspend is similar: guests tend to assume hardware
>>> does not change across suspend/resume, any changes
>>> tend to make resume fail.
>>>
>>
>> Thank you very much! But it still does not answer why would a device
>> want to do that (fail to negotiate a feature that it was able
>> to negotiate before). So I'm still in the dark about what are we
>> trading for what.
> 
> Hi Halil,
> 
> Just like what you said, normally there is no reason
> for a device to fail to negotiate a feature that it
> was able to negotiate before. But the spec doesn't
> forbid devices to do this , i.e. the spec allows a
> device to fail to negotiate a feature that it was
> able to negotiate before, which could cause problems
> in some cases. Although everything works fine in
> reality because there is no device would really do
> this, it would be better to make spec to explicitly
> forbid devices to do this in the necessary cases.
> 
> Best regards,
> Tiwei Bie
> 

I think we have most of it already covered with 'The device SHOULD
accept any valid subset of features the driver accepts'.

IMHO what we add with your proposed normative statement is that
if the device used to offer a feature bit it SHOULD keep offering it.
That's clearly not covered by what I've cited.

But it's kind of covered by a non-normative statement 'Each virtio
device offers all the features it understands.'

This seems most relevant in the case of migration. That is, device
implementation S(ource) and device implementation T(arget) are
migration compatible. But hey, features that are present
in S and not present in T are a concern for migration compatibility. AFAIK
the VIRTIO specification does not make claims about migration
compatibility.

So if I think of QEMU, and somebody (a maintainer) is deciding to remove
support for a certain feature bit of a certain device in the next version,
he had better think hard about how this could break migration. I don't think
the proposed normative statement with its SHOULD would make that
person more careful.

What is even more interesting is the scenario where the new version of
the device does not remove support for a feature, but adds support for
one, let's call it F_N.

The scenario is the following: we have systems O(ld) and N(ew). We
start on O, then we migrate to N. There, some reset of concern happens.
Features get re-negotiated and we start exploiting F_N. In my reading
of your addition, this is legit. But then we migrate back from N to O.
No re-negotiation happens (because it is not obligatory), and things
explode (hopefully, just migration fails, and not guest dies) because
O does not have support for F_N. Your normative statement was nowhere
violated as far as I can tell.
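
The scenario can be sketched as a small simulation. This is only an
illustration under stated assumptions: the bit values for BASE and F_N and
the helpers negotiate(), can_migrate() and run_scenario() are hypothetical,
and the migration check is a simplification, not how any particular
hypervisor decides compatibility.

```python
BASE = 0b011   # features both systems offer (hypothetical bits)
F_N = 1 << 5   # the newly added feature; bit number chosen arbitrarily


def negotiate(offered, driver_wants):
    """Features in effect after a successful negotiation."""
    return offered & driver_wants


def can_migrate(negotiated, target_offered):
    """Assume migration is safe only if the target offers every negotiated bit."""
    return (negotiated & ~target_offered) == 0


def run_scenario():
    old_host, new_host = BASE, BASE | F_N
    driver_wants = BASE | F_N
    steps = []
    negotiated = negotiate(old_host, driver_wants)   # start on O
    steps.append(("O -> N", can_migrate(negotiated, new_host)))
    negotiated = negotiate(new_host, driver_wants)   # reset happens on N
    steps.append(("uses F_N", bool(negotiated & F_N)))
    steps.append(("N -> O", can_migrate(negotiated, old_host)))
    return steps
```

Note that re-negotiating the original set (BASE) on N would still have
succeeded, so the proposed statement is not violated at any step even though
the migration back to O fails.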

Bottom line is, I still don't know what benefit this addition
to the standard has for the implementer of the standard. In my opinion
it's just another chunk of text that is hard to figure out. It's hard
to tell what is the device and what is before, what is system reset. If
we were to make the spec complete by spelling out every 'don't do
anything stupid', I'm under the impression there is a lot of work to
do. I had a discussion here on the completeness of this spec, and
completeness does not seem to be a primary goal. I'm still not
sold on this one.

Regards,
Halil

>>
>> Is there somewhere a patch that fixes such a bug? Maybe that would
>> help me understand what can be done at the device to avoid the
>> problem.
>>
>> Regards,
>> Halil
>>
>>
>>>>
>>>>>     \subsection{Legacy Interface: A Note on Feature
>>>>>     Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
>>>>>     Bits / Legacy Interface: A Note on Feature Bits}
>>>>>
>>>
>>
> 



* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 13:38       ` Michael S. Tsirkin
@ 2018-06-15 15:16         ` Halil Pasic
  2018-06-15 15:37           ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Halil Pasic @ 2018-06-15 15:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang



On 06/15/2018 03:38 PM, Michael S. Tsirkin wrote:
> On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
>>
>>
>> On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
>>>>
>>>>
>>>> On 06/11/2018 09:56 AM, Tiwei Bie wrote:
>>>>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>>>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>>>>> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
>>>>> ---
>>>>> v2:
>>>>> - Refine the wording (Cornelia);
>>>>>
>>>>> v3:
>>>>> - Refine the wording (MST);
>>>>>
>>>>>     content.tex | 7 +++++++
>>>>>     1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/content.tex b/content.tex
>>>>> index f996fad..3c7d67d 100644
>>>>> --- a/content.tex
>>>>> +++ b/content.tex
>>>>> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>>>>>     of features the driver accepts, otherwise it MUST fail to set the
>>>>>     FEATURES_OK \field{device status} bit when the driver writes it.
>>>>> +If a device has successfully negotiated a set of features
>>>>> +at least once (by accepting the FEATURES_OK \field{device
>>>>> +status} bit during device initialization), then it SHOULD
>>>>> +NOT fail re-negotiation of the same set of features after
>>>>> +a device or system reset.  Failure to do so would interfere
>>>>> +with resuming from suspend and error recovery.
>>>>> +
>>>>
>>>>
>>>> Sorry people but I don't get it. I mean it is kind of reasonable
>>>> to assume that with a given device and a given driver (given, i.e.
>>>> nothing changes) the two will always negotiate the same features
>>>> (including the extremal case where the negotiation fails).
>>>>
>>>> Either the device or a driver rolling a dice to make feature negotiation
>>>> more fun seems quite unreasonable. So I assume this is not what we are
>>>> bothering to soft prohibit here.
>>>>
>>>> So the interesting scenario seems to be when stuff changes. When
>>>> migrating the implementation of the device could change. Or something
>>>> changes regarding the resources used to provide the virtual device.
>>>>
>>>> But then, if the device really can not support the set of features
>>>> it used to be able, I guess the SHOULD does not take effect (I guess
>>>> that is the difference compared to MUST).
>>>>
>>>> Bottom line is: I tried to figure out what is this about, but I failed.
>>>> I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
>>>> it did not click. I would appreciate some assistance.
>>>
>>> It's exactly what it says. Let's say you negotiated a feature and then
>>> device sets NEED_RESET.  Driver must now reset the device and put it
>>> back in the same state it had before the reset, then resubmit
>>> requests that were available but never used.
>>>
>>> What if any of the features changed? Device suddenly
>>> needs to check for requests which do not match the
>>> features.
>>>
>>> Suspend is similar: guests tend to assume hardware does not change
>>> across suspend/resume, any changes tend to make resume fail.
>>>
>>
>> Thank you very much! But it still does not answer why would a device
>> want to do that (fail to negotiate a feature that it was able to
>> negotiate before). So I'm still in the dark about what are we trading
>> for what.
> 
> It would be a mis-configured device.  For example QEMU does not migrate
> the device features so if you misconfigure QEMU with different flags on
> source and destination (not a supported configuration), features might
> seem to change from guest POV.
> 

Do you mean set (or rather restrict) what QEMU calls the host_features?

AFAIR there is no reset right after the migration. But yes, if there is
then a reset and another migration. After a lot of thinking, it seems you
are speaking about the scenario I described in my answer to Tiwei Bie. But
there I also say that the statement you add here is not good enough for
that. Still puzzled.

>> Is there somewhere a patch that fixes such a bug? Maybe that would
>> help me understand what can be done at the device to avoid the
>> problem.
>>
>> Regards,
>> Halil
>>
>>
>>>>
>>>>>     \subsection{Legacy Interface: A Note on Feature
>>>>>     Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
>>>>>     Bits / Legacy Interface: A Note on Feature Bits}
>>>>>
>>>
> 



* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 14:21         ` Halil Pasic
@ 2018-06-15 15:36           ` Michael S. Tsirkin
  2018-06-15 18:06             ` Halil Pasic
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-15 15:36 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang

On Fri, Jun 15, 2018 at 04:21:32PM +0200, Halil Pasic wrote:
> 
> 
> On 06/15/2018 03:39 PM, Tiwei Bie wrote:
> > On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
> > > On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
> > > > > 
> > > > > 
> > > > > On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> > > > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > > > > > ---
> > > > > > v2:
> > > > > > - Refine the wording (Cornelia);
> > > > > > 
> > > > > > v3:
> > > > > > - Refine the wording (MST);
> > > > > > 
> > > > > >     content.tex | 7 +++++++
> > > > > >     1 file changed, 7 insertions(+)
> > > > > > 
> > > > > > diff --git a/content.tex b/content.tex
> > > > > > index f996fad..3c7d67d 100644
> > > > > > --- a/content.tex
> > > > > > +++ b/content.tex
> > > > > > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> > > > > >     of features the driver accepts, otherwise it MUST fail to set the
> > > > > >     FEATURES_OK \field{device status} bit when the driver writes it.
> > > > > > +If a device has successfully negotiated a set of features
> > > > > > +at least once (by accepting the FEATURES_OK \field{device
> > > > > > +status} bit during device initialization), then it SHOULD
> > > > > > +NOT fail re-negotiation of the same set of features after
> > > > > > +a device or system reset.  Failure to do so would interfere
> > > > > > +with resuming from suspend and error recovery.
> > > > > > +
> > > > > 
> > > > > 
> > > > > Sorry people but I don't get it. I mean it is kind of reasonable
> > > > > to assume that with a given device and a given driver (given, i.e.
> > > > > nothing changes) the two will always negotiate the same features
> > > > > (including the extremal case where the negotiation fails).
> > > > > 
> > > > > Either the device or a driver rolling a dice to make feature negotiation
> > > > > more fun seems quite unreasonable. So I assume this is not what we are
> > > > > bothering to soft prohibit here.
> > > > > 
> > > > > So the interesting scenario seems to be when stuff changes. When
> > > > > migrating the implementation of the device could change. Or something
> > > > > changes regarding the resources used to provide the virtual device.
> > > > > 
> > > > > But then, if the device really can not support the set of features
> > > > > it used to be able, I guess the SHOULD does not take effect (I guess
> > > > > that is the difference compared to MUST).
> > > > > 
> > > > > Bottom line is: I tried to figure out what is this about, but I failed.
> > > > > I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
> > > > > it did not click. I would appreciate some assistance.
> > > > 
> > > > It's exactly what it says. Let's say you negotiated a feature and then
> > > > device sets NEED_RESET.  Driver must now reset the device and put it
> > > > back in the same state it had before the reset, then resubmit
> > > > requests that were available but never used.
> > > > 
> > > > What if any of the features changed? Device suddenly
> > > > needs to check for requests which do not match the
> > > > features.
> > > > 
> > > > Suspend is similar: guests tend to assume hardware
> > > > does not change across suspend/resume, any changes
> > > > tend to make resume fail.
> > > > 
> > > 
> > > Thank you very much! But it still does not answer why would a device
> > > want to do that (fail to negotiate a feature that it was able
> > > to negotiate before). So I'm still in the dark about what are we
> > > trading for what.
> > 
> > Hi Halil,
> > 
> > Just like what you said, normally there is no reason
> > for a device to fail to negotiate a feature that it
> > was able to negotiate before. But the spec doesn't
> > forbid devices to do this , i.e. the spec allows a
> > device to fail to negotiate a feature that it was
> > able to negotiate before, which could cause problems
> > in some cases. Although everything works fine in
> > reality because there is no device would really do
> > this, it would be better to make spec to explicitly
> > forbid devices to do this in the necessary cases.
> > 
> > Best regards,
> > Tiwei Bie
> > 
> 
> I think we have most of it already covered with 'The device SHOULD
> accept any valid subset of features the driver accepts'.
> 
> IMHO what we add with your proposed normative statement is that
> if the device used to offer a feature bit it SHOULD keep offering it.
> That's clearly not covered by the by what I've cited.
> 
> But it's kind of covered by a non-normative statement 'Each virtio
> device offers all the features it understands.'

Well one has to squint very hard to understand it.
And note that "understands" is not the same as "supports". Device can
still fail to set FEATURES_OK.


> This seems most relevant in case of migration. That is device
> implementation S(ource) and device implementation T(arget) are
> migration compatible. But hey, features that are present
> in S and not present in T are of concern  for migration compatibility. AFAIK
> the VIRTIO specification does not make claims about migration
> compatibility.
> 
> So if I think QEMU, and somebody (maintainer) is deciding to remove support for
> of a certain device for a certain feature bit in the next version,
> he better thinks hard how could this breakmigration. I don't think
> the proposed normative statement with it's SHOULD would make the the
> guy more careful.
> 
> What is even more interesting is the scenario where the new version of
> the device does not remove support for a feature, but adds support for
> one, let's call it F_N.
> 
> The scenario is the following we have systems O(ld) and N(ew). We
> start on O then we migrate to new. There some reset of concern happens.
> Features get re-negotiated and we start exploiting F_N. In my reading
> of your addition, this is legit. But then we migrate back from N to O.
> No re-negotiation happens (because it is not obligatory), and things
> explode (hopefully, just migration fails, and not guest dies) because
> O does not have support for F_N. Your normative statement was nowhere
> violated as far as I can tell.

Oops I shouldn't even have started about migration.  Let's forget
migration. It's a simple question on what we can assume after we reset
device.

Some people want to be able to change features dynamically.
Is that OK? This text clarifies that no, it isn't.

> Bottom line is, I still don't know what benefit does this addition
> to the standard have to the implementer of the standard.

A question was asked. On suspend we save features and try to
restore them. Should driver handle device not offering some of these
features after resume? What this offers is a simple answer: don't
worry about it too much, devices have been warned that it's not a
good idea.
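
As a rough sketch of what that means on the driver side: the driver saves
the negotiated features on suspend and, on resume, simply gives up if any of
them is no longer offered. The names ResumeError and restore_features() are
made up for this example and do not correspond to any real driver code.

```python
class ResumeError(Exception):
    """Raised when the device no longer offers a previously used feature."""


def restore_features(saved, offered_after_resume):
    # saved: feature bits negotiated before suspend
    # offered_after_resume: feature bits the device offers after resume
    missing = saved & ~offered_after_resume
    if missing:
        # No attempt at graceful degradation: the device was warned
        # that changing features across a reset is not a good idea.
        raise ResumeError(f"device no longer offers feature bits {missing:#x}")
    return saved
```

The design choice here is to keep the driver simple: it re-negotiates the
same set or fails the resume, rather than trying to adapt to a changed
feature set.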



> In my opinion
> it's just another chunk of text that is hard to figure out. It's hard
> to tell what is the device

Most people know this I think

> and what is before

Sorry before what?


>, what is system reset.

I think many people do know what is a system reset.
It's an attempt to cover suspend to disk. How would you put it?


> If
> we were to make the spec complete with spelling out every 'don't make
> anything stupid' I'm under the impression there is a lot of work to
> do. I had a discussion here on the completeness of this spec, and
> completeness does not seem to be a primary goal. I'm still not
> sold on this one.
> 
> Regards,
> Halil

Yea, it's just that it's not clear that changing feature
bits when device is reset is all that stupid, since it
does after all lose its state.



> > > 
> > > Is there somewhere a patch that fixes such a bug? Maybe that would
> > > help me understand what can be done at the device to avoid the
> > > problem.
> > > 
> > > Regards,
> > > Halil
> > > 
> > > 
> > > > > 
> > > > > >     \subsection{Legacy Interface: A Note on Feature
> > > > > >     Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
> > > > > >     Bits / Legacy Interface: A Note on Feature Bits}
> > > > > > 
> > > > 
> > > 
> > 


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 15:16         ` Halil Pasic
@ 2018-06-15 15:37           ` Michael S. Tsirkin
  2018-06-18 15:08             ` Halil Pasic
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-15 15:37 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang

On Fri, Jun 15, 2018 at 05:16:10PM +0200, Halil Pasic wrote:
> 
> 
> On 06/15/2018 03:38 PM, Michael S. Tsirkin wrote:
> > On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
> > > 
> > > 
> > > On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
> > > > > 
> > > > > 
> > > > > On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> > > > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > > > > > ---
> > > > > > v2:
> > > > > > - Refine the wording (Cornelia);
> > > > > > 
> > > > > > v3:
> > > > > > - Refine the wording (MST);
> > > > > > 
> > > > > >     content.tex | 7 +++++++
> > > > > >     1 file changed, 7 insertions(+)
> > > > > > 
> > > > > > diff --git a/content.tex b/content.tex
> > > > > > index f996fad..3c7d67d 100644
> > > > > > --- a/content.tex
> > > > > > +++ b/content.tex
> > > > > > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> > > > > >     of features the driver accepts, otherwise it MUST fail to set the
> > > > > >     FEATURES_OK \field{device status} bit when the driver writes it.
> > > > > > +If a device has successfully negotiated a set of features
> > > > > > +at least once (by accepting the FEATURES_OK \field{device
> > > > > > +status} bit during device initialization), then it SHOULD
> > > > > > +NOT fail re-negotiation of the same set of features after
> > > > > > +a device or system reset.  Failure to do so would interfere
> > > > > > +with resuming from suspend and error recovery.
> > > > > > +
> > > > > 
> > > > > 
> > > > > Sorry people but I don't get it. I mean it is kind of reasonable
> > > > > to assume that with a given device and a given driver (given, i.e.
> > > > > nothing changes) the two will always negotiate the same features
> > > > > (including the extremal case where the negotiation fails).
> > > > > 
> > > > > Either the device or a driver rolling a dice to make feature negotiation
> > > > > more fun seems quite unreasonable. So I assume this is not what we are
> > > > > bothering to soft prohibit here.
> > > > > 
> > > > > So the interesting scenario seems to be when stuff changes. When
> > > > > migrating the implementation of the device could change. Or something
> > > > > changes regarding the resources used to provide the virtual device.
> > > > > 
> > > > > But then, if the device really can not support the set of features
> > > > > it used to be able, I guess the SHOULD does not take effect (I guess
> > > > > that is the difference compared to MUST).
> > > > > 
> > > > > Bottom line is: I tried to figure out what is this about, but I failed.
> > > > > I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
> > > > > it did not click. I would appreciate some assistance.
> > > > 
> > > > It's exactly what it says. Let's say you negotiated a feature and then
> > > > device sets NEED_RESET.  Driver must now reset the device and put it
> > > > back in the same state it had before the reset, then resubmit
> > > > requests that were available but never used.
> > > > 
> > > > What if any of the features changed? Device suddenly
> > > > needs to check for requests which do not match the
> > > > features.
> > > > 
> > > > Suspend is similar: guests tend to assume hardware does not change
> > > > across suspend/resume, any changes tend to make resume fail.
> > > > 
> > > 
> > > Thank you very much! But it still does not answer why would a device
> > > want to do that (fail to negotiate a feature that it was able to
> > > negotiate before). So I'm still in the dark about what are we trading
> > > for what.
> > 
> > It would be a mis-configured device.  For example QEMU does not migrate
> > the device features so if you misconfigure QEMU with different flags on
> > source and destination (not a supported configuration), features might
> > seem to change from guest POV.
> > 
> 
> Do you mean set (or rather restrict) what QEMU calls the host_features?
> 
> AFAIR there is no reset right after the migration. But yes if then there
> is a reset and another migration. After a lots of thinking, it seems you
> speak about the scenario I described in the answer to Tiwei Bie. But
> there I also say that this statement you add here is not good enough for
> that. Still puzzled.

What would a good enough statement look like?


> > > Is there somewhere a patch that fixes such a bug? Maybe that would
> > > help me understand what can be done at the device to avoid the
> > > problem.
> > > 
> > > Regards,
> > > Halil
> > > 
> > > 
> > > > > 
> > > > > >     \subsection{Legacy Interface: A Note on Feature
> > > > > >     Bits}\label{sec:Basic Facilities of a Virtio Device / Feature
> > > > > >     Bits / Legacy Interface: A Note on Feature Bits}
> > > > > > 
> > > > 
> > 


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 15:36           ` Michael S. Tsirkin
@ 2018-06-15 18:06             ` Halil Pasic
  0 siblings, 0 replies; 24+ messages in thread
From: Halil Pasic @ 2018-06-15 18:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang



On 06/15/2018 05:36 PM, Michael S. Tsirkin wrote:
> On Fri, Jun 15, 2018 at 04:21:32PM +0200, Halil Pasic wrote:
>>
>>
>> On 06/15/2018 03:39 PM, Tiwei Bie wrote:
>>> On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
>>>> On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
>>>>>>
>>>>>>
>>>>>> On 06/11/2018 09:56 AM, Tiwei Bie wrote:
>>>>>>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>>>>>>> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
>>>>>>> ---
>>>>>>> v2:
>>>>>>> - Refine the wording (Cornelia);
>>>>>>>
>>>>>>> v3:
>>>>>>> - Refine the wording (MST);
>>>>>>>
>>>>>>>      content.tex | 7 +++++++
>>>>>>>      1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/content.tex b/content.tex
>>>>>>> index f996fad..3c7d67d 100644
>>>>>>> --- a/content.tex
>>>>>>> +++ b/content.tex
>>>>>>> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>>>>>>>      of features the driver accepts, otherwise it MUST fail to set the
>>>>>>>      FEATURES_OK \field{device status} bit when the driver writes it.
>>>>>>> +If a device has successfully negotiated a set of features
>>>>>>> +at least once (by accepting the FEATURES_OK \field{device
>>>>>>> +status} bit during device initialization), then it SHOULD
>>>>>>> +NOT fail re-negotiation of the same set of features after
>>>>>>> +a device or system reset.  Failure to do so would interfere
>>>>>>> +with resuming from suspend and error recovery.
>>>>>>> +
>>>>>>
>>>>>>
>>>>>> Sorry people but I don't get it. I mean it is kind of reasonable
>>>>>> to assume that with a given device and a given driver (given, i.e.
>>>>>> nothing changes) the two will always negotiate the same features
>>>>>> (including the extremal case where the negotiation fails).
>>>>>>
>>>>>> Either the device or a driver rolling a dice to make feature negotiation
>>>>>> more fun seems quite unreasonable. So I assume this is not what we are
>>>>>> bothering to soft prohibit here.
>>>>>>
>>>>>> So the interesting scenario seems to be when stuff changes. When
>>>>>> migrating the implementation of the device could change. Or something
>>>>>> changes regarding the resources used to provide the virtual device.
>>>>>>
>>>>>> But then, if the device really can not support the set of features
>>>>>> it used to be able, I guess the SHOULD does not take effect (I guess
>>>>>> that is the difference compared to MUST).
>>>>>>
>>>>>> Bottom line is: I tried to figure out what is this about, but I failed.
>>>>>> I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
>>>>>> it did not click. I would appreciate some assistance.
>>>>>
>>>>> It's exactly what it says. Let's say you negotiated a feature and then
>>>>> device sets NEED_RESET.  Driver must now reset the device and put it
>>>>> back in the same state it had before the reset, then resubmit
>>>>> requests that were available but never used.
>>>>>
>>>>> What if any of the features changed? Device suddenly
>>>>> needs to check for requests which do not match the
>>>>> features.
>>>>>
>>>>> Suspend is similar: guests tend to assume hardware
>>>>> does not change across suspend/resume, any changes
>>>>> tend to make resume fail.
>>>>>
>>>>
>>>> Thank you very much! But it still does not answer why would a device
>>>> want to do that (fail to negotiate a feature that it was able
>>>> to negotiate before). So I'm still in the dark about what are we
>>>> trading for what.
>>>
>>> Hi Halil,
>>>
>>> Just like what you said, normally there is no reason
>>> for a device to fail to negotiate a feature that it
>>> was able to negotiate before. But the spec doesn't
>>> forbid devices to do this, i.e. the spec allows a
>>> device to fail to negotiate a feature that it was
>>> able to negotiate before, which could cause problems
>>> in some cases. Although everything works fine in
>>> reality because no device would really do this,
>>> it would be better to make the spec explicitly
>>> forbid devices to do this in the necessary cases.
>>>
>>> Best regards,
>>> Tiwei Bie
>>>
>>
>> I think we have most of it already covered with 'The device SHOULD
>> accept any valid subset of features the driver accepts'.
>>
>> IMHO what we add with your proposed normative statement is that
>> if the device used to offer a feature bit it SHOULD keep offering it.
>> That's clearly not covered by what I've cited.
>>
>> But it's kind of covered by a non-normative statement 'Each virtio
>> device offers all the features it understands.'
> 
> Well one has to squint very hard to understand it.
> And note that "understands" is not the same as "supports". Device can
> still fail to set FEATURES_OK.
> 

But I guess it should not. I don't know what the driver is supposed
to do in the scenario you describe: the device offered me (the driver) a set
of features, and I accepted them *all*. The device failed to
set FEATURES_OK because there was *one* feature that it "understands"
but does not "support". Should I (the driver) start a backtracking feature
negotiation to figure out the difference between "understands"
and "supports"?
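
To make the scenario concrete, here is a minimal C sketch of a device that
offers a bit it "understands" but cannot currently "support". It is a toy
model under stated assumptions, not code from QEMU or Linux: struct
toy_device and all toy_* helpers are made up for illustration; only the
FEATURES_OK status bit value (8) comes from the spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* FEATURES_OK device status bit, value as defined by the virtio spec. */
#define VIRTIO_CONFIG_S_FEATURES_OK 8u

/* Hypothetical toy device model (illustration only). */
struct toy_device {
    uint64_t offered;    /* bits the device "understands" and offers */
    uint64_t supported;  /* bits it can actually operate with right now */
    uint64_t accepted;   /* subset written back by the driver */
    uint8_t  status;     /* device status field */
};

/* Driver writes the subset of features it accepts. */
static void toy_write_driver_features(struct toy_device *d, uint64_t f)
{
    assert(d != NULL);
    d->accepted = f;
}

/* Driver writes FEATURES_OK; the device keeps the bit set only if it
 * accepts the written subset, otherwise it leaves the bit clear. */
static void toy_set_features_ok(struct toy_device *d)
{
    bool within_offer = (d->accepted & ~d->offered) == 0;
    bool usable = (d->accepted & ~d->supported) == 0;

    if (within_offer && usable)
        d->status = (uint8_t)(d->status | VIRTIO_CONFIG_S_FEATURES_OK);
    else
        d->status = (uint8_t)(d->status & ~VIRTIO_CONFIG_S_FEATURES_OK);
}

/* Driver side: accept everything offered that the driver implements,
 * then re-read the status to see whether the device agreed. */
static bool toy_negotiate(struct toy_device *d, uint64_t driver_features)
{
    toy_write_driver_features(d, d->offered & driver_features);
    toy_set_features_ok(d);
    return (d->status & VIRTIO_CONFIG_S_FEATURES_OK) != 0;
}
```

With offered = 0x7 and supported = 0x3, a driver that accepts everything
fails, and the only thing it learns is that FEATURES_OK did not stick; it
has no way to tell which bit was the problem short of retrying subsets.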

> 
>> This seems most relevant in case of migration. That is device
>> implementation S(ource) and device implementation T(arget) are
>> migration compatible. But hey, features that are present
>> in S and not present in T are of concern  for migration compatibility. AFAIK
>> the VIRTIO specification does not make claims about migration
>> compatibility.
>>
>> So if I think QEMU, and somebody (maintainer) is deciding to remove support
>> of a certain device for a certain feature bit in the next version,
>> he better thinks hard how could this break migration. I don't think
>> the proposed normative statement with its SHOULD would make the
>> guy more careful.
>>
>> What is even more interesting is the scenario where the new version of
>> the device does not remove support for a feature, but adds support for
>> one, let's call it F_N.
>>
>> The scenario is the following we have systems O(ld) and N(ew). We
>> start on O then we migrate to new. There some reset of concern happens.
>> Features get re-negotiated and we start exploiting F_N. In my reading
>> of your addition, this is legit. But then we migrate back from N to O.
>> No re-negotiation happens (because it is not obligatory), and things
>> explode (hopefully, just migration fails, and not guest dies) because
>> O does not have support for F_N. Your normative statement was nowhere
>> violated as far as I can tell.
> 
> Oops I shouldn't even have started about migration.  Let's forget
> migration. It's a simple question on what we can assume after we reset
> device.
> 
> Some people want to be able to change features dynamically.
> Is that OK? This text clarifies that no, it isn't.
> 

That's a very reasonable question, and a straight answer. Yet I think
the normative statement is not good enough, in the sense that it does
not say 'it is not OK to change features dynamically'. IMHO to express
that we should state something like: 'For the lifetime of a virtio device
(which transcends device resets) each subsequent feature re-negotiation
SHOULD result in the exact same set of features being negotiated as the
first successful negotiation.'

In my reading the normative statement discussed here says features are
not allowed to 'disappear' dynamically, but it does not say a thing about
new features 'appearing' dynamically.
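
The difference between the two wordings can be written down as two
predicates over feature bitmasks. This is a minimal sketch, assuming a
driver that simply accepts every offered bit it implements; the toy_* names
are hypothetical and not part of any spec or implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the set a driver ends up with when it accepts
 * every offered bit that it implements. */
static uint64_t toy_negotiated(uint64_t offered, uint64_t driver_supported)
{
    return offered & driver_supported;
}

/* Reading of the patch text: the first negotiated set can still be
 * negotiated, i.e. none of its bits has disappeared from the offer. */
static bool toy_old_set_still_available(uint64_t first, uint64_t offered_now)
{
    return (first & ~offered_now) == 0;
}

/* Stricter wording: re-negotiation yields exactly the first set, so
 * features may neither disappear nor appear. */
static bool toy_set_unchanged(uint64_t first, uint64_t offered_now,
                              uint64_t driver_supported)
{
    /* A driver cannot have negotiated a bit it does not implement. */
    assert((first & ~driver_supported) == 0);
    return toy_negotiated(offered_now, driver_supported) == first;
}
```

If the first negotiation gave 0x3 and the device later offers 0x7 to a
driver that implements 0x7, the first predicate holds while the second
does not: nothing disappeared, but a new feature appeared.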


About features 'appearing' dynamically: AFAIR there was a virtio-crypto feature
that changed the request format (if negotiated). So IIRC if we
were to re-submit the requests unchanged after gaining this feature,
we would end up having the problem you described.

However if both add and remove are unsafe, then 'The only way
to renegotiate is to reset the device.' is misleading IMHO.


>> Bottom line is, I still don't know what benefit does this addition
>> to the standard have to the implementer of the standard.
> 
> A question was asked. On suspend we save features and try to
> restore them. Should driver handle device not offering some of these
> features after resume? What this offers is a simple answer: don't
> worry about it too much, devices have been warned that it's not a
> good idea.
> 

I don't know enough about suspend/resume. I will try to catch up. But
I think I'm slowly starting to understand the problem. My guess is that
there is some sort of reset involved in the procedure that could affect
what QEMU calls the host_features, but would not affect the requests on
the queue.

The question still remains: Why would the device want to take away a
feature? What should the device do (respecting the warning given here)
instead of taking away the feature (if the need arises)?

> 
> 
>> In my opinion
>> it's just another chunk of text that is hard to figure out. It's hard
>> to tell what is the device
> 
> Most people know this I think
> 

I mean the same device. If I migrate back and forth, in the spirit of the
normative statement the device is still the same device. When I think QEMU,
however, we would end up realizing a device each time we spin up a QEMU at the
target host. So the life-cycle of the QEMU device and that of the virtio device
end up being different.

>> and what is before
> 
> Sorry before what?
> 

My bad. The original text does not use 'before' just 'after'. For some
strange reason I started thinking about sequences of re-negotiations and
there 'before' slipped in...

> 
>> , what is system reset.
> 
> I think many people do know what is a system reset.
> It's an attempt to cover suspend to disk. How would you put it?
> 

At this point I think I have enough understanding of what is behind this
to take a step back, and do the research and the thinking. Thanks
for your patience.

But let me carry on with my answer without doing the research
for now. Having a notion of system reset and specifying how virtio facilities
relate to it (affected, unaffected) seems very reasonable. But I think
it is a new thing in the spec. I don't think solely adding this
normative statement is sufficient to achieve that.

> 
>> If
>> we were to make the spec complete with spelling out every 'don't make
>> anything stupid' I'm under the impression there is a lot of work to
>> do. I had a discussion here on the completeness of this spec, and
>> completeness does not seem to be a primary goal. I'm still not
>> sold on this one.
>>
>> Regards,
>> Halil
> 
> Yea, it's just that it's not clear that changing feature
> bits when device is reset is all that stupid, since it
> does after all lose its state.
> 

My intuition was that this should be a part of describing
what a device reset is. It seems the device does not lose all
state though -- otherwise I don't understand the problem with the
available but not yet used requests.

Anyway many thanks for having this discussion with me. My initial
problem was that I could not relate this to anything sane. Now
I have to learn more about suspend/resume.

A quick recap at the end. This is about 'Should driver handle
device not offering some of these features after resume?' This paragraph
is supposed to tell the driver developer: don't bother. And I guess it's
also supposed to tell the device developer: fail to resume (e.g. migrate)
the device if you realize that some features negotiated before can not be
supported any more.

This way, if it is a suspend/resume, we still end up not being able to resume
the device or the whole guest. But at least no funny things will happen
if the driver does try to use the feature that went away.

My intuition is that handling such feature changes in the guest would be
cleaner. The guest has all the information it needs at its disposal (e.g.
are requests in flight, do these depend on some feature that went away or
the opposite, can we let the upper layer re-submit the requests and just
give up on the ones that are stuck available). But I have to fill the gaps in
my understanding before having any firm opinion.

Regards,
Halil



* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-15 15:37           ` Michael S. Tsirkin
@ 2018-06-18 15:08             ` Halil Pasic
  2018-06-18 16:28               ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Halil Pasic @ 2018-06-18 15:08 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang



On 06/15/2018 05:37 PM, Michael S. Tsirkin wrote:
> On Fri, Jun 15, 2018 at 05:16:10PM +0200, Halil Pasic wrote:
>>
>>
>> On 06/15/2018 03:38 PM, Michael S. Tsirkin wrote:
>>> On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
>>>>
>>>>
>>>> On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
>>>>> On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
>>>>>>
>>>>>>
>>>>>> On 06/11/2018 09:56 AM, Tiwei Bie wrote:
>>>>>>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
>>>>>>> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
>>>>>>> ---
>>>>>>> v2:
>>>>>>> - Refine the wording (Cornelia);
>>>>>>>
>>>>>>> v3:
>>>>>>> - Refine the wording (MST);
>>>>>>>
>>>>>>>      content.tex | 7 +++++++
>>>>>>>      1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/content.tex b/content.tex
>>>>>>> index f996fad..3c7d67d 100644
>>>>>>> --- a/content.tex
>>>>>>> +++ b/content.tex
>>>>>>> @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
>>>>>>>      of features the driver accepts, otherwise it MUST fail to set the
>>>>>>>      FEATURES_OK \field{device status} bit when the driver writes it.
>>>>>>> +If a device has successfully negotiated a set of features
>>>>>>> +at least once (by accepting the FEATURES_OK \field{device
>>>>>>> +status} bit during device initialization), then it SHOULD
>>>>>>> +NOT fail re-negotiation of the same set of features after
>>>>>>> +a device or system reset.  Failure to do so would interfere
>>>>>>> +with resuming from suspend and error recovery.
>>>>>>> +
>>>>>>
>>>>>>
>>>>>> Sorry people but I don't get it. I mean it is kind of reasonable
>>>>>> to assume that with a given device and a given driver (given, i.e.
>>>>>> nothing changes) the two will always negotiate the same features
>>>>>> (including the extremal case where the negotiation fails).
>>>>>>
>>>>>> Either the device or a driver rolling a dice to make feature negotiation
>>>>>> more fun seems quite unreasonable. So I assume this is not what we are
>>>>>> bothering to soft prohibit here.
>>>>>>
>>>>>> So the interesting scenario seems to be when stuff changes. When
>>>>>> migrating the implementation of the device could change. Or something
>>>>>> changes regarding the resources used to provide the virtual device.
>>>>>>
>>>>>> But then, if the device really can not support the set of features
>>>>>> it used to be able, I guess the SHOULD does not take effect (I guess
>>>>>> that is the difference compared to MUST).
>>>>>>
>>>>>> Bottom line is: I tried to figure out what is this about, but I failed.
>>>>>> I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
>>>>>> it did not click. I would appreciate some assistance.
>>>>>
>>>>> It's exactly what it says. Let's say you negotiated a feature and then
>>>>> device sets NEED_RESET.  Driver must now reset the device and put it
>>>>> back in the same state it had before the reset, then resubmit
>>>>> requests that were available but never used.
>>>>>
>>>>> What if any of the features changed? Device suddenly
>>>>> needs to check for requests which do not match the
>>>>> features.
>>>>>
>>>>> Suspend is similar: guests tend to assume hardware does not change
>>>>> across suspend/resume, any changes tend to make resume fail.
>>>>>
>>>>
>>>> Thank you very much! But it still does not answer why would a device
>>>> want to do that (fail to negotiate a feature that it was able to
>>>> negotiate before). So I'm still in the dark about what are we trading
>>>> for what.
>>>
>>> It would be a mis-configured device.  For example QEMU does not migrate
>>> the device features so if you misconfigure QEMU with different flags on
>>> source and destination (not a supported configuration), features might
>>> seem to change from guest POV.
>>>
>>
>> Do you mean set (or rather restrict) what QEMU calls the host_features?
>>
>> AFAIR there is no reset right after the migration. But yes if then there
>> is a reset and another migration. After a lots of thinking, it seems you
>> speak about the scenario I described in the answer to Tiwei Bie. But
>> there I also say that this statement you add here is not good enough for
>> that. Still puzzled.
> 
> What would a good enough statement look like?
> 
> 


I did some reading and some thinking on the weekend. AFAIU the situation
is tricky. To mitigate that let me establish the terminology I'm going to
use. For the VM life-cycle I'm going to use the definitions from libvirt as
defined by https://libvirt.org/guide/html/Application_Development_Guide-Guest_Domains-Lifecycle.html.

You explained, the motivation for this addition to the VIRTIO
specification is hibernate (aka suspend to disk).

(1) AFAIU on hibernate the VM goes from 'running' to (most likely)
'defined'.  The first step of the resume from hibernate is to start the
VM. From the guest OS life-cycle perspective however we don't start a
completely new cycle (like the VM life-cycle does) with complete
re-initialization. After resuming from hibernate the system is expected
to be put in essentially the same state (but not exactly) as it was
before hibernate.

(2) From VM (life-cycle) perspective we can not distinguish between a
'shutdown' as a part of a  hibernate and a 'plain shutdown'.

(3) Any rule we come up with for a device (e.g. the normative statement
proposed here) that regulates the effects of a 'system reset' that is a
part of the hibernate cycle equally affects the normal shutdown-start
cycle.

(4) Any change in the negotiated feature set can affect the validity of
requests that have been constructed under different assumptions (i.e.
not only features going away, but also features appearing can be a
problem).

(5) The Linux implementation already has reasonable handling for both
types of changes: the driver does not try to use the new features, and
fails cleanly if the old ones are not accepted.

(6) Because of (3), prohibiting devices dropping support for a set of
features within a hibernate cycle is only possible if we prohibit such
changes in general.

(7) If I read
https://www.kernel.org/doc/html/v4.14/driver-api/pm/devices.html
correctly the driver is expected to quiesce the device before going to
hibernate. AFAIU hibernating with requests in flight isn't a great idea.

(8) If there are no requests in flight (including on the
queues), then this whole 'feature changes might break requests' story seems
irrelevant.

(9) After a quick look the freeze in virtio (Linux implementation) does
not seem to do anything to prevent in-flight requests though.

(10) From a VM management perspective a 'save' seems much preferable to
hibernate.  A VM 'save' is migration-like, so even if some components of
the system change between 'save' and 'restore' (e.g. a QEMU up- or
downgrade) we still have mechanisms in place that (hopefully) ensure the
guest view of the system does not change. In this sense save/restore is
migration-like.

(11) The VIRTIO specification is a bit vague about how a reset is
supposed to be handled by the guest, but it certainly does not prohibit
the negotiated features from changing after reset. Here I will quote two
fragments that hint this is actually something foreseen by the VIRTIO
standard:
  * 'During device initialization, the driver reads this and tells the
     device the subset that it accepts.  The only way to renegotiate is to
     reset the device.'
  * 'If the driver sets the FAILED bit, the driver MUST later reset the
     device before attempting to re-initialize.' If re-initialize is in the
     sense of '3.1.1 Driver Requirements: Device Initialization' then full
     feature negotiation seems to be compulsory.  Linux does not do this. But
     since setting up queues seems to be a part of the 3.1.1 initialization
     sequence (even if formulated somewhat vaguely), my best guess is that
     after reset the driver is not supposed to perform 3.1.1 to the letter.

(12) If I were to hibernate my PC and then, let's say, replace my NIC with
a different model, the 'hardware does not change' assumption would not hold
for a non-virtualized system either. I'm not sure this problem is ours to
solve.
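
The behaviour described in (5) can be sketched as a driver-side restore
step that re-requests exactly the set saved before the freeze. This is a
minimal illustration under assumptions, not the actual Linux code: struct
toy_driver and toy_restore_features() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver-side state kept across a hibernate cycle. */
struct toy_driver {
    uint64_t negotiated;  /* feature set agreed on before the freeze */
};

/*
 * Hypothetical restore step: decide which feature set to write back
 * after the device has been reset and offers offered_after_reset.
 * Returns 0 and stores the set in *to_write on success, or -1 if a
 * previously negotiated feature is no longer offered.
 */
static int toy_restore_features(const struct toy_driver *drv,
                                uint64_t offered_after_reset,
                                uint64_t *to_write)
{
    assert(drv != NULL && to_write != NULL);

    /* An old feature went away: fail cleanly instead of guessing. */
    if (drv->negotiated & ~offered_after_reset)
        return -1;

    /* Newly offered bits are ignored; only the saved set is requested. */
    *to_write = drv->negotiated;
    return 0;
}
```

In this model a feature that appears after the reset is never picked up,
so requests built under the old set stay valid, while a feature that
disappears turns into a resume failure rather than undefined behaviour.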

My conclusion is the following. I think constraining feature changes
after system_reset is a bad idea. For 'normal' virtio reset some
clarifications would be welcome, but this one does not seem to be a very
good one. Regarding changing features, I think we are good enough with
what we have today (both standard and implementation). However if we want
to prohibit the features from changing after a reset in spite of my
arguments presented here, IMHO we need a driver normative statement too.

Regards,
Halil



* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-18 15:08             ` Halil Pasic
@ 2018-06-18 16:28               ` Michael S. Tsirkin
  2018-06-19  9:14                 ` Tiwei Bie
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-18 16:28 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Tiwei Bie, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang

On Mon, Jun 18, 2018 at 05:08:32PM +0200, Halil Pasic wrote:
> 
> 
> On 06/15/2018 05:37 PM, Michael S. Tsirkin wrote:
> > On Fri, Jun 15, 2018 at 05:16:10PM +0200, Halil Pasic wrote:
> > > 
> > > 
> > > On 06/15/2018 03:38 PM, Michael S. Tsirkin wrote:
> > > > On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
> > > > > 
> > > > > 
> > > > > On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
> > > > > > On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> > > > > > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > > > > > > > ---
> > > > > > > > v2:
> > > > > > > > - Refine the wording (Cornelia);
> > > > > > > > 
> > > > > > > > v3:
> > > > > > > > - Refine the wording (MST);
> > > > > > > > 
> > > > > > > >      content.tex | 7 +++++++
> > > > > > > >      1 file changed, 7 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/content.tex b/content.tex
> > > > > > > > index f996fad..3c7d67d 100644
> > > > > > > > --- a/content.tex
> > > > > > > > +++ b/content.tex
> > > > > > > > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> > > > > > > >      of features the driver accepts, otherwise it MUST fail to set the
> > > > > > > >      FEATURES_OK \field{device status} bit when the driver writes it.
> > > > > > > > +If a device has successfully negotiated a set of features
> > > > > > > > +at least once (by accepting the FEATURES_OK \field{device
> > > > > > > > +status} bit during device initialization), then it SHOULD
> > > > > > > > +NOT fail re-negotiation of the same set of features after
> > > > > > > > +a device or system reset.  Failure to do so would interfere
> > > > > > > > +with resuming from suspend and error recovery.
> > > > > > > > +
> > > > > > > 
> > > > > > > 
> > > > > > > Sorry people but I don't get it. I mean it is kind of reasonable
> > > > > > > to assume that with a given device and a given driver (given, i.e.
> > > > > > > nothing changes) the two will always negotiate the same features
> > > > > > > (including the extremal case where the negotiation fails).
> > > > > > > 
> > > > > > > Either the device or a driver rolling a dice to make feature negotiation
> > > > > > > more fun seems quite unreasonable. So I assume this is not what we are
> > > > > > > bothering to soft prohibit here.
> > > > > > > 
> > > > > > > So the interesting scenario seems to be when stuff changes. When
> > > > > > > migrating the implementation of the device could change. Or something
> > > > > > > changes regarding the resources used to provide the virtual device.
> > > > > > > 
> > > > > > > But then, if the device really can not support the set of features
> > > > > > > it used to be able, I guess the SHOULD does not take effect (I guess
> > > > > > > that is the difference compared to MUST).
> > > > > > > 
> > > > > > > Bottom line is: I tried to figure out what is this about, but I failed.
> > > > > > > I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
> > > > > > > it did not click. I would appreciate some assistance.
> > > > > > 
> > > > > > It's exactly what it says. Let's say you negotiated a feature and then
> > > > > > device sets NEED_RESET.  Driver must now reset the device and put it
> > > > > > back in the same state it had before the reset, then resubmit
> > > > > > requests that were available but never used.
> > > > > > 
> > > > > > What if any of the features changed? Device suddenly
> > > > > > needs to check for requests which do not match the
> > > > > > features.
> > > > > > 
> > > > > > Suspend is similar: guests tend to assume hardware does not change
> > > > > > across suspend/resume, any changes tend to make resume fail.
> > > > > > 
> > > > > 
> > > > > Thank you very much! But it still does not answer why would a device
> > > > > want to do that (fail to negotiate a feature that it was able to
> > > > > negotiate before). So I'm still in the dark about what are we trading
> > > > > for what.
> > > > 
> > > > It would be a mis-configured device.  For example QEMU does not migrate
> > > > the device features so if you misconfigure QEMU with different flags on
> > > > source and destination (not a supported configuration), features might
> > > > seem to change from guest POV.
> > > > 
> > > 
> > > Do you mean set (or rather restrict) what QEMU calls the host_features?
> > > 
> > > AFAIR there is no reset right after the migration. But yes if then there
> > > is a reset and another migration. After a lots of thinking, it seems you
> > > speak about the scenario I described in the answer to Tiwei Bie. But
> > > there I also say that this statement you add here is not good enough for
> > > that. Still puzzled.
> > 
> > What would a good enough statement look like?
> > 
> > 
> 
> 
> I did some reading and some thinking on the weekend. AFAIU the situation
> is tricky. To mitigate that let me establish the terminology I'm going to
> use. For the VM life-cycle I'm going to use the definitions from libvirt as
> defined by https://libvirt.org/guide/html/Application_Development_Guide-Guest_Domains-Lifecycle.html.
> 
> You explained, the motivation for this addition to the VIRTIO
> specification is hibernate (aka suspend to disk).
> 
> (1) AFAIU on hibernate the VM goes from 'running' to (most likely)
> 'defined'.  The first step of the resume from hibernate is to start the
> VM. From the guest OS life-cycle perspective however we don't start a
> completely new cycle (like the VM life-cycle does) with complete
> re-initialization. After resuming from hibernate the system is expected
> to be put in essentially the same state (but not exactly) as it was
> before hibernate.
> 
> (2) From VM (life-cycle) perspective we can not distinguish between a
> 'shutdown' as a part of a  hibernate and a 'plain shutdown'.
> 
> (3) Any rule we come up with for a device (e.g. the normative statement
> proposed here) that regulates the effects of a 'system reset' that is a
> part of the hibernate cycle equally affects the normal shutdown-start
> cycle.
> 
> (4) Any change in the negotiated feature set can affect the validity of
> requests that have been constructed under different assumptions (i.e.
> not only features going away, but also features appearing can be a
> problem).
> 
> (5) The Linux implementation already has reasonable handling for both
> types of changes: the driver does not try to use the new features, and
> fails cleanly if the old ones are not accepted.
> 
> (6) Because of (3), prohibiting devices dropping support for a set of
> features within a hibernate cycle is only possible if we prohibit such
> changes in general.
> 
> (7) If I read
> https://www.kernel.org/doc/html/v4.14/driver-api/pm/devices.html
> correctly the driver is expected to quiesce the device before going to
> hibernate. AFAIU hibernating with requests in flight isn't a great idea.
> 
> (8) If there are no requests in flight (including on the
> queues), then this whole 'feature changes might break requests' story seems
> irrelevant.
> 
> (9) After a quick look the freeze in virtio (Linux implementation) does
> not seem to do anything to prevent in-flight requests though.
> 
> (10) From a VM management perspective a 'save' seems much preferable to
> hibernate.  A VM 'save' is migration-like, so even if some components of
> the system change between 'save' and 'restore' (e.g. a QEMU up- or
> downgrade) we still have mechanisms in place that (hopefully) ensure the
> guest view of the system does not change. In this sense save/restore is
> migration-like.
> 
> (11) The VIRTIO specification is a bit vague about how a reset is
> supposed to be handled by the guest, but it certainly does not prohibit
> the negotiated features from changing after reset. Here I will quote two
> fragments that hint this is actually something foreseen by the VIRTIO
> standard:
>  * 'During device initialization, the driver reads this and tells the
>     device the subset that it accepts.  The only way to renegotiate is to
>     reset the device.'
>  * 'If the driver sets the FAILED bit, the driver MUST later reset the
>     device before attempting to re-initialize.' If re-initialize is in a
>     sense of '3.1.1 Driver Requirements: Device Initialization' then full
>     feature negotiation seems to be compulsory.  Linux does not do this. But
>     since setting up queues seems to be a part of the 3.1.1 initialization
>     sequence (even if formulated somewhat vague), my best guess after reset
>     the driver is not supposed to perform 3.1.1 to the letter.

I think frankly if we want dynamic features we should work on
a mechanism that allows changing them without a system reset.

And I think the use-case that triggered this is the SRIOV feature,
take a look at how that is handled across e.g. suspend/resume.

> 
> (12) If I were to hibernate my PC and then, let's say, replace my NIC with
> a different model, the 'hardware does not change' assumption would not hold
> for a non-virtualized system either. I'm not sure this problem is ours to
> solve.

Precisely and since we can't solve it, we warn people not to
create this kind of configuration unless they know exactly what they
are doing.

> My conclusion is the following. I think constraining feature changes
> after system_reset is a bad idea. For 'normal' virtio reset some
> clarifications would be welcome, but this one does not seem to be a very
> good one. Regarding changing features, I think we are good enough with
> what we have today (both standard and implementation). However if we want
> to prohibit the features from changing after a reset in spite of my
> arguments presented here, IMHO we need a driver normative statement too.
> 
> Regards,
> Halil

Well, the motion passed with 1 abstention and 5 in favor.  Tiwei was the one
who proposed it, so, as I have done in the past, I'll wait a day or
two for him to respond and let us know whether he'd like to drop the
patch; in the absence of such a response I'll have to push the proposed
wording.
In that case you will need to put in a motion to revert, or make some
other change on top.


-- 
MST


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-18 16:28               ` Michael S. Tsirkin
@ 2018-06-19  9:14                 ` Tiwei Bie
  2018-06-19 10:46                   ` Halil Pasic
  2018-06-20  1:06                   ` Michael S. Tsirkin
  0 siblings, 2 replies; 24+ messages in thread
From: Tiwei Bie @ 2018-06-19  9:14 UTC (permalink / raw)
  To: Michael S. Tsirkin, Halil Pasic
  Cc: cohuck, stefanha, pbonzini, virtio-dev, dan.daly, cunming.liang,
	zhihong.wang

On Mon, Jun 18, 2018 at 07:28:33PM +0300, Michael S. Tsirkin wrote:
> On Mon, Jun 18, 2018 at 05:08:32PM +0200, Halil Pasic wrote:
> > 
> > 
> > On 06/15/2018 05:37 PM, Michael S. Tsirkin wrote:
> > > On Fri, Jun 15, 2018 at 05:16:10PM +0200, Halil Pasic wrote:
> > > > 
> > > > 
> > > > On 06/15/2018 03:38 PM, Michael S. Tsirkin wrote:
> > > > > On Fri, Jun 15, 2018 at 02:42:58PM +0200, Halil Pasic wrote:
> > > > > > 
> > > > > > 
> > > > > > On 06/15/2018 02:19 PM, Michael S. Tsirkin wrote:
> > > > > > > On Fri, Jun 15, 2018 at 02:10:11PM +0200, Halil Pasic wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 06/11/2018 09:56 AM, Tiwei Bie wrote:
> > > > > > > > > Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > > > Fixes: https://github.com/oasis-tcs/virtio-spec/issues/14
> > > > > > > > > ---
> > > > > > > > > v2:
> > > > > > > > > - Refine the wording (Cornelia);
> > > > > > > > > 
> > > > > > > > > v3:
> > > > > > > > > - Refine the wording (MST);
> > > > > > > > > 
> > > > > > > > >      content.tex | 7 +++++++
> > > > > > > > >      1 file changed, 7 insertions(+)
> > > > > > > > > 
> > > > > > > > > diff --git a/content.tex b/content.tex
> > > > > > > > > index f996fad..3c7d67d 100644
> > > > > > > > > --- a/content.tex
> > > > > > > > > +++ b/content.tex
> > > > > > > > > @@ -125,6 +125,13 @@ which was not offered.  The device SHOULD accept any valid subset
> > > > > > > > >      of features the driver accepts, otherwise it MUST fail to set the
> > > > > > > > >      FEATURES_OK \field{device status} bit when the driver writes it.
> > > > > > > > > +If a device has successfully negotiated a set of features
> > > > > > > > > +at least once (by accepting the FEATURES_OK \field{device
> > > > > > > > > +status} bit during device initialization), then it SHOULD
> > > > > > > > > +NOT fail re-negotiation of the same set of features after
> > > > > > > > > +a device or system reset.  Failure to do so would interfere
> > > > > > > > > +with resuming from suspend and error recovery.
> > > > > > > > > +
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Sorry people but I don't get it. I mean it is kind of reasonable
> > > > > > > > to assume that with a given device and a given driver (given, i.e.
> > > > > > > > nothing changes) the two will always negotiate the same features
> > > > > > > > (including the extremal case where the negotiation fails).
> > > > > > > > 
> > > > > > > > Either the device or a driver rolling a dice to make feature negotiation
> > > > > > > > more fun seems quite unreasonable. So I assume this is not what we are
> > > > > > > > bothering to soft prohibit here.
> > > > > > > > 
> > > > > > > > So the interesting scenario seems to be when stuff changes. When
> > > > > > > > migrating the implementation of the device could change. Or something
> > > > > > > > changes regarding the resources used to provide the virtual device.
> > > > > > > > 
> > > > > > > > But then, if the device really can not support the set of features
> > > > > > > > it used to be able, I guess the SHOULD does not take effect (I guess
> > > > > > > > that is the difference compared to MUST).
> > > > > > > > 
> > > > > > > > Bottom line is: I tried to figure out what is this about, but I failed.
> > > > > > > > I've read https://github.com/oasis-tcs/virtio-spec/issues/14 too but
> > > > > > > > it did not click. I would appreciate some assistance.
> > > > > > > 
> > > > > > > It's exactly what it says. Let's say you negotiated a feature and then
> > > > > > > device sets NEED_RESET.  Driver must now reset the device and put it
> > > > > > > back in the same state it had before the reset, then resubmit
> > > > > > > requests that were available but never used.
> > > > > > > 
> > > > > > > What if any of the features changed? Device suddenly
> > > > > > > needs to check for requests which do not match the
> > > > > > > features.
> > > > > > > 
> > > > > > > Suspend is similar: guests tend to assume hardware does not change
> > > > > > > across suspend/resume, any changes tend to make resume fail.
> > > > > > > 
> > > > > > 
> > > > > > Thank you very much! But it still does not answer why would a device
> > > > > > want to do that (fail to negotiate a feature that it was able to
> > > > > > negotiate before). So I'm still in the dark about what are we trading
> > > > > > for what.
> > > > > 
> > > > > It would be a mis-configured device.  For example QEMU does not migrate
> > > > > the device features so if you misconfigure QEMU with different flags on
> > > > > source and destination (not a supported configuration), features might
> > > > > seem to change from guest POV.
> > > > > 
> > > > 
> > > > Do you mean set (or rather restrict) what QEMU calls the host_features?
> > > > 
> > > > AFAIR there is no reset right after the migration. But yes if then there
> > > > is a reset and another migration. After a lot of thinking, it seems you
> > > > speak about the scenario I described in the answer to Tiwei Bie. But
> > > > there I also say that this statement you add here is not good enough for
> > > > that. Still puzzled.
> > > 
> > > What would a good enough statement look like?
> > > 
> > > 
> > 
> > 
> > I did some reading and some thinking on the weekend. AFAIU the situation
> > is tricky. To mitigate that let me establish the terminology I'm going to
> > use. For vm lifecycle I'm going to use the definitions from libvirt as
> > defined by https://libvirt.org/guide/html/Application_Development_Guide-Guest_Domains-Lifecycle.html.
> > 
> > You explained, the motivation for this addition to the VIRTIO
> > specification is hibernate (aka suspend to disk).
> > 
> > (1) AFAIU on hibernate the VM goes from 'running' to (most likely)
> > 'defined'.  The first step of the resume from hibernate is to start the
> > VM. From the guest OS life-cycle perspective however we don't start a
> > completely new cycle (like the VM life-cycle does) with complete
> > re-initialization. After resuming from hibernate the system is expected
> > to be put in essentially the same state (but not exactly) as it was
> > before hibernate.
> > 
> > (2) From VM (life-cycle) perspective we can not distinguish between a
> > 'shutdown' as a part of a  hibernate and a 'plain shutdown'.
> > 
> > (3) Any rule we come up with for a device (e.g. the normative statement
> > proposed here) that regulates the effects of a 'system reset' that is a
> > part of the hibernate cycle equally affects the normal shutdown-start
> > cycle.
> > 
> > (4) Any change in the negotiated feature set can affect the validity of
> > requests that have been constructed under different assumptions (i.e.
> > not only features going away, but also features appearing can be a
> > problem).
> > 
> > (5) The Linux implementation already has reasonable handling for both
> > types of changes: the driver does not try to use the new features, and
> > fails cleanly if the old ones are not accepted.
> > 
> > (6) Because of (3), prohibiting devices dropping support for a set of
> > features within a hibernate cycle is only possible if we prohibit such
> > changes in general.
> > 
> > (7) If I read
> > https://www.kernel.org/doc/html/v4.14/driver-api/pm/devices.html
> > correctly the driver is expected to quiesce the device before going to
> > hibernate. AFAIU hibernating with requests in flight isn't a great idea.
> > 
> > (8) If there are no requests in flight (including on the
> > queues), then this whole 'feature changes might break requests' story
> > seems irrelevant.
> > 
> > (9) After a quick look the freeze in virtio (Linux implementation) does
> > not seem to do anything to prevent in-flight requests though.
> > 
> > (10) From a VM management perspective a 'save' seems much preferable to
> > hibernate.  A VM 'save' is migration-like, so even if some components of
> > the system change between 'save' and 'restore' (e.g. a QEMU up- or
> > downgrade) we still have mechanisms in place that (hopefully) ensure the
> > guest view of the system does not change. In this sense save/restore is
> > migration-like.
> > 
> > (11) The VIRTIO specification is a bit vague about how a reset is
> > supposed to be handled by the guest, but it certainly does not prohibit
> > the negotiated features from changing after reset. Here I will quote two
> > fragments that hint this is actually something foreseen by the VIRTIO
> > standard:
> >  * 'During device initialization, the driver reads this and tells the
> >     device the subset that it accepts.  The only way to renegotiate is to
> >     reset the device.'
> >  * 'If the driver sets the FAILED bit, the driver MUST later reset the
> >     device before attempting to re-initialize.' If re-initialize is in a
> >     sense of '3.1.1 Driver Requirements: Device Initialization' then full
> >     feature negotiation seems to be compulsory.  Linux does not do this. But
> >     since setting up queues seems to be a part of the 3.1.1 initialization
> >     sequence (even if formulated somewhat vaguely), my best guess is that
> >     after reset the driver is not supposed to perform 3.1.1 to the letter.
> 
> I think frankly if we want dynamic features we should work on
> a mechanism that allows changing them without a system reset.
> 
> And I think the use-case that triggered this is the SRIOV feature,
> take a look at how that is handled across e.g. suspend/resume.
> 
> > 
> > (12) If I were to hibernate my PC and then, let's say, replace my NIC with
> > a different model, the 'hardware does not change' assumption would not hold
> > for a non-virtualized system either. I'm not sure this problem is ours to
> > solve.
> 
> Precisely and since we can't solve it, we warn people not to
> create this kind of configuration unless they know exactly what they
> are doing.
> 
> > My conclusion is the following. I think constraining feature changes
> > after system_reset is a bad idea. For 'normal' virtio reset some
> > clarifications would be welcome, but this one does not seem to be a very
> > good one. Regarding changing features, I think we are good enough with
> > what we have today (both standard and implementation). However if we want
> > to prohibit the features from changing after a reset in spite of my
> > arguments presented here, IMHO we need a driver normative statement too.
> > 
> > Regards,
> > Halil
> 
> Well, the motion passed with 1 abstention and 5 in favor.  Tiwei was the one
> who proposed it, so, as I have done in the past, I'll wait a day or
> two for him to respond and let us know whether he'd like to drop the
> patch; in the absence of such a response I'll have to push the proposed
> wording.
> In that case you will need to put in a motion to revert, or make some
> other change on top.
> 

If it would be better to drop this patch,
I'm fine with dropping it. Thanks!

Best regards,
Tiwei Bie


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-19  9:14                 ` Tiwei Bie
@ 2018-06-19 10:46                   ` Halil Pasic
  2018-06-19 16:30                     ` Tiwei Bie
  2018-06-20  1:06                   ` Michael S. Tsirkin
  1 sibling, 1 reply; 24+ messages in thread
From: Halil Pasic @ 2018-06-19 10:46 UTC (permalink / raw)
  To: Tiwei Bie, Michael S. Tsirkin
  Cc: cohuck, stefanha, pbonzini, virtio-dev, dan.daly, cunming.liang,
	zhihong.wang



On 06/19/2018 11:14 AM, Tiwei Bie wrote:
> On Mon, Jun 18, 2018 at 07:28:33PM +0300, Michael S. Tsirkin wrote:
[..]
>>>
>>> (11) The VIRTIO specification is a bit vague about how a reset is
>>> supposed to be handled by the guest, but it certainly does not prohibit
>>> the negotiated features from changing after reset. Here I will quote two
>>> fragments that hint this is actually something foreseen by the VIRTIO
>>> standard:
>>>   * 'During device initialization, the driver reads this and tells the
>>>      device the subset that it accepts.  The only way to renegotiate is to
>>>      reset the device.'
>>>   * 'If the driver sets the FAILED bit, the driver MUST later reset the
>>>      device before attempting to re-initialize.' If re-initialize is in a
>>>      sense of '3.1.1 Driver Requirements: Device Initialization' then full
>>>      feature negotiation seems to be compulsory.  Linux does not do this. But
>>>      since setting up queues seems to be a part of the 3.1.1 initialization
>>>      sequence (even if formulated somewhat vaguely), my best guess is that
>>>      after reset the driver is not supposed to perform 3.1.1 to the letter.
>>
>> I think frankly if we want dynamic features we should work on
>> a mechanism that allows changing them without a system reset.
>>

@Michael
I was talking about normal virtio reset in (11). I think in Linux we
have dynamic features without system reset today if a virtio device driver
that is loaded as a module gets replaced (e.g. rmmod/insmod new) with a more
capable implementation of the same device driver.
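
As a rough illustration of the module-replacement case, here is a small
Python sketch. The bit values and the names DEVICE_OFFERED, OLD_DRIVER and
NEW_DRIVER are made up for this example and do not correspond to any real
device or driver; the only assumption taken from the spec is that the driver
accepts a subset of what the device offers.

```python
def negotiated(device_offered, driver_supported):
    """Features the driver ends up accepting: those offered by the device
    that the driver also understands (modelled as a bitwise AND)."""
    return device_offered & driver_supported


# Hypothetical feature masks, for illustration only.
DEVICE_OFFERED = 0b1111   # the device offers the same four features throughout
OLD_DRIVER = 0b0011       # the old module understands two of them
NEW_DRIVER = 0b0111       # the more capable replacement understands three

before = negotiated(DEVICE_OFFERED, OLD_DRIVER)
after = negotiated(DEVICE_OFFERED, NEW_DRIVER)
```

In this sketch the device never changes what it offers, yet the negotiated
set differs across the virtio reset that comes with reloading the driver.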

>> And I think the use-case that triggered this is the SRIOV feature,
>> take a look at how that is handled across e.g. suspend/resume.
>>
>>>
>>> (12) If I were to hibernate my PC and then, let's say, replace my NIC with
>>> a different model, the 'hardware does not change' assumption would not hold
>>> for a non-virtualized system either. I'm not sure this problem is ours to
>>> solve.
>>
>> Precisely and since we can't solve it, we warn people not to
>> create this kind of configuration unless they know exactly what they
>> are doing.

@Michael
I assume the various bus specifications don't bother to spell this out,
and I doubt manuals of HW components do either.

If our main goal is to warn the end user not to fiddle with the features
of a hibernated VM (e.g. via libvirt domain xml), and to hint that if the guest
is going to get hibernated, they had better configure the guest as migratable
even if it's not (e.g. machine type and cpu model should not be a moving target),
I doubt the VIRTIO spec is the right place.

IMHO neither QEMU nor KVM can detect the condition in question, and I don't
think higher level management software can help either. That's why I say
end-user.

Hibernate is IMHO an OS concept, and I guess some OSes don't have the concept of
hibernate. I see support for hibernate out of scope for the VIRTIO spec (much like
migration). But since the VIRTIO spec is supposed to be helpful above all, I'm
not opposed to a note that spells the warning out.

I still oppose a device normative statement, as this does not seem to be
something an implementer of the device should heed. And if we do want to place
a note, it needs to be more direct. I could not figure out what this is about.
I doubt end users have a better chance.

>>
>>> My conclusion is the following. I think constraining feature changes
>>> after system_reset is a bad idea. For 'normal' virtio reset some
>>> clarifications would be welcome, but this one does not seem to be a very
>>> good one. Regarding changing features, I think we are good enough with
>>> what we have today (both standard and implementation). However if we want
>>> to prohibit the features from changing after a reset in spite of my
>>> arguments presented here, IMHO we need a driver normative statement too.
>>>
>>> Regards,
>>> Halil
>>
>> Well, the motion passed with 1 abstention and 5 in favor.  Tiwei was the one
>> who proposed it, so, as I have done in the past, I'll wait a day or
>> two for him to respond and let us know whether he'd like to drop the
>> patch; in the absence of such a response I'll have to push the proposed
>> wording.
>> In that case you will need to put in a motion to revert, or make some
>> other change on top.
>>

@Michael
If I can not convince you, nor at least some of the committee people here,
I'm not willing to escalate this as a motion to revert. There is no point,
as I'm running out of arguments. While I'm still not convinced that this
is the way to go, I'm willing to bow my head in front of the opinion of
the majority. It is not like including this would have tragic consequences.
I think I mustered a fair effort to form an opinion and defend it. Thus
there is no shame in admitting defeat.


> 
> If it would be better to drop this patch,
> I'm fine with dropping it. Thanks!
> 

@Tiwei Bie
Thanks for your flexibility! What is your opinion (after considering the
arguments from my previous mail), is it better to include this patch in the spec or
is it better to drop it? Were you able to identify mistakes in my reasoning
(I mean points (1)-(12))?

Regards,
Halil



* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-19 10:46                   ` Halil Pasic
@ 2018-06-19 16:30                     ` Tiwei Bie
  2018-06-20  7:54                       ` Cornelia Huck
  2018-06-20 12:10                       ` Halil Pasic
  0 siblings, 2 replies; 24+ messages in thread
From: Tiwei Bie @ 2018-06-19 16:30 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Michael S. Tsirkin, cohuck, stefanha, pbonzini, virtio-dev,
	dan.daly, cunming.liang, zhihong.wang

On Tue, Jun 19, 2018 at 12:46:45PM +0200, Halil Pasic wrote:
> On 06/19/2018 11:14 AM, Tiwei Bie wrote:
> > On Mon, Jun 18, 2018 at 07:28:33PM +0300, Michael S. Tsirkin wrote:
[...]
> > 
> > If it would be better to drop this patch,
> > I'm fine with dropping it. Thanks!
> > 
> 
> @Tiwei Bie
> Thanks for your flexibility! What is your opinion (after considering the
> arguments from my previous mail), is it better to include this patch in the spec or
> is it better to drop it? Were you able to identify mistakes in my reasoning
> (I mean points (1)-(12))?
> 

Hi Halil,

I think maybe you thought too much about this proposal
(or maybe I really missed something obvious). In my
opinion, the device requirement proposed by this patch
is quite simple and straightforward:

- It's just to make the spec explicitly require that
  a given virtio device shouldn't fail re-negotiation
  of a feature set it has successfully accepted once.

- It covers the cases of virtio device reset and system
  reset (which includes normal shutdown and start).

I think the requirement is reasonable because for a
given virtio device, there is no reason for the
feature bits it offers to change (because it should
always offer all the features it understands). We
are just adding a device normative statement to make
the spec more explicit about that (because if a device
really changes the features it offers after a device or
system reset, something will go wrong). If the configs
of an emulated virtio device are changed, maybe we
shouldn't treat it as the same device any more, and
IMO this case is not related to this proposal.

Although we have 'Each virtio device offers all the
features it understands', it's not an explicit device
requirement. So I don't think it's a bad idea to
have an explicit device requirement about this.
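
To make the requirement concrete, here is a minimal Python sketch of a toy
device model. The class ToyDevice, its methods and the helper
negotiate_after_reset are hypothetical names invented for this example, and
the feature masks are arbitrary; only the FEATURES_OK status bit value (8)
is taken from the spec. It is an illustration of the idea, not a model of
any real implementation.

```python
FEATURES_OK = 0x08  # FEATURES_OK device status bit (value 8 in the virtio spec)


class ToyDevice:
    """Hypothetical device model, for illustration only."""

    def __init__(self, offered):
        self.offered = offered      # feature bits the device offers
        self.status = 0             # device status field
        self.driver_features = 0    # features written by the driver

    def reset(self):
        # A reset clears the status and the driver's selection;
        # it does not touch the set of offered features.
        self.status = 0
        self.driver_features = 0

    def set_features_ok(self, driver_features):
        # Accept the selection only if it is a subset of what is offered;
        # otherwise leave FEATURES_OK clear so the driver sees the failure.
        self.driver_features = driver_features
        if (driver_features & ~self.offered) == 0:
            self.status |= FEATURES_OK
        return bool(self.status & FEATURES_OK)


def negotiate_after_reset(dev, features):
    """Reset the device, then try to negotiate exactly `features`."""
    dev.reset()
    return dev.set_features_ok(features)
```

With a ToyDevice created as ToyDevice(offered=0b0111), calling
negotiate_after_reset(dev, 0b0101) succeeds every time as long as
dev.offered stays the same. If dev.offered is later changed to 0b0011
(the misconfiguration case discussed in this thread), the same call fails,
which is the behaviour the proposed wording asks devices to avoid.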

Best regards,
Tiwei Bie


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-19  9:14                 ` Tiwei Bie
  2018-06-19 10:46                   ` Halil Pasic
@ 2018-06-20  1:06                   ` Michael S. Tsirkin
  1 sibling, 0 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2018-06-20  1:06 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: Halil Pasic, cohuck, stefanha, pbonzini, virtio-dev, dan.daly,
	cunming.liang, zhihong.wang

On Tue, Jun 19, 2018 at 05:14:18PM +0800, Tiwei Bie wrote:
> On Mon, Jun 18, 2018 at 07:28:33PM +0300, Michael S. Tsirkin wrote:
[...]
> 
> If it would be better to drop this patch,
> I'm fine with dropping it. Thanks!
> 
> Best regards,
> Tiwei Bie


Tiwei, Halil, I suspect you are just trying to be nice, but it can't
really work like this - we have a process to follow.  Objections were
raised during the voting process.  The TC still voted unanimously for
the proposal to resolve your comments on the spec, with one abstention.

If the person who proposed the change no longer thinks it's a
good idea and requests that we postpone making the change, maybe it's
best to give everyone a chance to reconsider, and I am not going to stand
on formality and require re-voting on postponing the change.

For anything else, we'd have to re-vote.

But if you request that, please be explicit and say so.  So if you want
one of these, you could say for example "I withdraw my comments on the
specification, the issue does not need to be addressed. Please cancel
the change proposal." or "While my comments on the specification stand,
I withdraw my proposal to resolve the issue, please cancel the change
proposal".  Or if you want the TC to decide, then you'd say e.g.  "I
leave that decision to the TC".

We'd then either follow a previous voting decision or re-vote
if someone requests that.


-- 
MST


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-19 16:30                     ` Tiwei Bie
@ 2018-06-20  7:54                       ` Cornelia Huck
  2018-06-20 12:10                       ` Halil Pasic
  1 sibling, 0 replies; 24+ messages in thread
From: Cornelia Huck @ 2018-06-20  7:54 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: Halil Pasic, Michael S. Tsirkin, stefanha, pbonzini, virtio-dev,
	dan.daly, cunming.liang, zhihong.wang

On Wed, 20 Jun 2018 00:30:59 +0800
Tiwei Bie <tiwei.bie@intel.com> wrote:

> On Tue, Jun 19, 2018 at 12:46:45PM +0200, Halil Pasic wrote:
> > On 06/19/2018 11:14 AM, Tiwei Bie wrote:  
> > > On Mon, Jun 18, 2018 at 07:28:33PM +0300, Michael S. Tsirkin wrote:  
> [...]
> > > 
> > > If it would be better to drop this patch,
> > > I'm fine with dropping it. Thanks!
> > >   
> > 
> > @Tiwei Bie
> > Thanks for your flexibility! What is your opinion (after considering the
> > arguments from my previous mail), is it better to include this patch in the spec or
> > is it better to drop it? Were you able to identify mistakes in my reasoning
> > (I mean points (1)-(12))?
> >   
> 
> Hi Halil,
> 
> I think maybe you thought too much about this proposal
> (or maybe I really missed something obvious). In my
> opinion, the device requirement proposed by this patch
> is quite simple and straightforward:
> 
> - It's just to make the spec explicitly require that
>   a certain virtio device shouldn't fail re-negotiation
>   of a feature set it has successfully accepted once.
> 
> - It covers the cases of virtio device reset and system
>   reset (which includes normal shutdown and start).
> 
> I think the requirement is reasonable because for a
> certain virtio device, there is no reason that the
> feature bits it offers will change (because it should
> always offer all the features it understands). And we
> are just to add a device normative to make the spec be
> more explicit about that (because if a device really
> changes the features it offers after a device or
> system reset, something will go wrong). If the configs
> of an emulated virtio device are changed, maybe we
> shouldn't treat it as the same device any more, and
> IMO this case is not related to this proposal.
> 
> Although we have 'Each virtio device offers all the
> features it understands', it's not an explicit device
> requirement. So I don't think it's a bad idea to
> have an explicit device requirement about this.

I think this reasoning is sane and we really should not overthink it.
The update as voted on looks fine to me.
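
The requirement summarized in the quoted text above can be sketched
as a small device model. This is only an illustration, not code from
the spec or from any implementation: the struct vdev type and the
vdev_reset() and vdev_set_features_ok() helpers are hypothetical names.
The sketch assumes that the offered feature set is fixed for the
lifetime of the device, so a reset clears only per-session state.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_FEATURES_OK 0x08u  /* FEATURES_OK device status bit */

/* Hypothetical device model; all names here are illustrative. */
struct vdev {
    uint64_t offered;     /* features the device understands; survives reset */
    uint64_t negotiated;  /* features accepted in the current session */
    uint8_t  status;      /* device status field */
};

/* Device or system reset: clears session state, never the offered set. */
static void vdev_reset(struct vdev *d)
{
    d->negotiated = 0;
    d->status = 0;
}

/*
 * The driver writes its feature selection and then sets FEATURES_OK.
 * The device accepts only if the selection is a subset of what it
 * offers; otherwise it fails to set the FEATURES_OK bit.
 */
static bool vdev_set_features_ok(struct vdev *d, uint64_t driver_features)
{
    if (driver_features & ~d->offered)
        return false;
    d->negotiated = driver_features;
    d->status |= STATUS_FEATURES_OK;
    return true;
}
```

Because the acceptance check depends only on the offered set, and
vdev_reset() leaves that set untouched, a selection that was accepted
once is accepted again after a reset.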


* Re: [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits
  2018-06-19 16:30                     ` Tiwei Bie
  2018-06-20  7:54                       ` Cornelia Huck
@ 2018-06-20 12:10                       ` Halil Pasic
  1 sibling, 0 replies; 24+ messages in thread
From: Halil Pasic @ 2018-06-20 12:10 UTC (permalink / raw)
  To: Tiwei Bie
  Cc: Michael S. Tsirkin, cohuck, stefanha, pbonzini, virtio-dev,
	dan.daly, cunming.liang, zhihong.wang



On 06/19/2018 06:30 PM, Tiwei Bie wrote:
> On Tue, Jun 19, 2018 at 12:46:45PM +0200, Halil Pasic wrote:
>> On 06/19/2018 11:14 AM, Tiwei Bie wrote:
>>> On Mon, Jun 18, 2018 at 07:28:33PM +0300, Michael S. Tsirkin wrote:
> [...]
>>> If it would be better to drop this patch,
>>> I'm fine with dropping it. Thanks!
>>>
>> @Tiwei Bie
>> Thanks for your flexibility! What is your opinion (after considering the
>> arguments from my previous mail), is it better to include this patch in the spec or
>> is it better to drop it? Were you able to identify mistakes in my reasoning
>> (I mean points (1)-(12))?
>>
> Hi Halil,
> 
> I think maybe you thought too much about this proposal
> (or maybe I really missed something obvious). In my
> opinion, the device requirement proposed by this patch
> is quite simple and straightforward:
> 
> - It's just to make the spec explicitly require that
>    a certain virtio device shouldn't fail re-negotiation
>    of a feature set it has successfully accepted once.
> 
> - It covers the cases of virtio device reset and system
>    reset (which includes normal shutdown and start).
> 
> I think the requirement is reasonable because for a
> certain virtio device, there is no reason that the
> feature bits it offers will change (because it should
> always offer all the features it understands). And we
> are just to add a device normative to make the spec be
> more explicit about that (because if a device really
> changes the features it offers after a device or
> system reset, something will go wrong). If the configs
> of an emulated virtio device are changed, maybe we
> shouldn't treat it as the same device any more, and
> IMO this case is not related to this proposal.

Thanks for clarifying your position. I don't want to
take up any more of your valuable time. I'm not convinced,
but I've given up hope of convincing the opposition.

I'm giving up.

Regards,
Halil



end of thread, other threads:[~2018-06-20 12:11 UTC | newest]

Thread overview: 24+ messages
2018-06-11  7:56 [virtio-dev] [PATCH v3] content: enhance device requirements for feature bits Tiwei Bie
2018-06-11  8:43 ` [virtio-dev] " Cornelia Huck
2018-06-11 13:24 ` Michael S. Tsirkin
2018-06-11 13:29   ` Cornelia Huck
2018-06-11 13:44     ` Michael S. Tsirkin
2018-06-11 13:44 ` Michael S. Tsirkin
2018-06-15 12:10 ` [virtio-dev] " Halil Pasic
2018-06-15 12:19   ` Michael S. Tsirkin
2018-06-15 12:42     ` Halil Pasic
2018-06-15 13:38       ` Michael S. Tsirkin
2018-06-15 15:16         ` Halil Pasic
2018-06-15 15:37           ` Michael S. Tsirkin
2018-06-18 15:08             ` Halil Pasic
2018-06-18 16:28               ` Michael S. Tsirkin
2018-06-19  9:14                 ` Tiwei Bie
2018-06-19 10:46                   ` Halil Pasic
2018-06-19 16:30                     ` Tiwei Bie
2018-06-20  7:54                       ` Cornelia Huck
2018-06-20 12:10                       ` Halil Pasic
2018-06-20  1:06                   ` Michael S. Tsirkin
2018-06-15 13:39       ` Tiwei Bie
2018-06-15 14:21         ` Halil Pasic
2018-06-15 15:36           ` Michael S. Tsirkin
2018-06-15 18:06             ` Halil Pasic
