All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] virtio-mmio: Specify wait needed in driver during reset
@ 2021-08-31 13:57 Srivatsa Vaddagiri
  2021-08-31 14:45 ` Michael S. Tsirkin
       [not found] ` <20211125162349-mutt-send-email-mst@kernel.org>
  0 siblings, 2 replies; 8+ messages in thread
From: Srivatsa Vaddagiri @ 2021-08-31 13:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, Cornelia Huck, Jason Wang; +Cc: virtio-dev, tsoni

Reset of a virtio-mmio device is initiated by writing 0 to its Status register.
In case of some devices, the reset operation itself may not be completed
by the time write instruction completes and hence such devices would require
drivers to wait on reset operation to complete before they proceed with
remaining steps of initialization.

Update the specification to indicate which devices would need driver to block on
reset completion.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>
---
v3->v2:
- Introduce a new status bit, DEVICE_RESET_IN_PROGRESS
- Limit driver need to poll for reset completion only for MMIO devices having
  Version = 0x3 

Previous version can be found at:

https://lists.oasis-open.org/archives/virtio-dev/202108/msg00065.html

 content.tex | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/content.tex b/content.tex
index 7cec1c3..e10f7d3 100644
--- a/content.tex
+++ b/content.tex
@@ -49,6 +49,8 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
 
 \item[DEVICE_NEEDS_RESET (64)] Indicates that the device has experienced
   an error from which it can't recover.
+
+\item[DEVICE_RESET_IN_PROGRESS (256)] Indicates that the device reset is in progress.
 \end{description}
 
 The \field{device status} field starts out as 0, and is reinitialized to 0 by
@@ -1730,9 +1732,9 @@ \subsection{MMIO Device Register Layout}\label{sec:Virtio Transport Options / Vi
   } 
   \hline
   \mmioreg{Version}{Device version number}{0x004}{R}{%
-    0x2.
+    0x3.
     \begin{note}
-      Legacy devices (see \ref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}~\nameref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}) used 0x1.
+      Legacy devices (see \ref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}~\nameref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}) used 0x1. Devices that do not require drivers to poll for reset completion can use 0x2. See \ref{devicenormative:Virtio Transport Options / Virtio Over MMIO / MMIO Device Register Layout} for more details.
     \end{note}
   }
   \hline 
@@ -1916,7 +1918,15 @@ \subsection{MMIO Device Register Layout}\label{sec:Virtio Transport Options / Vi
 
 The device MUST return 0x74726976 in \field{MagicValue}.
 
-The device MUST return value 0x2 in \field{Version}.
+The device MUST return either value 0x3 or 0x2 in \field{Version} based on its reset behavior. Drivers trigger reset
+of a device by writing 0 to \field{Status}. The reset operation itself may or may not be completed by the time write
+operation is complete. Devices whose reset operation completes synchronously with the write operation are allowed to
+return value of 0x2 for \field{Version}. Other devices, whose reset operation can be incomplete by the time write
+operation completes MUST return value 0x3 as an indication for drivers to poll for reset completion. Such devices
+MUST indicate progress of reset operation in \field{Status}, with value of DEVICE_RESET_IN_PROGRESS indicating reset
+operation is still in progress, a value of 0x0 indicating reset operation is complete and a value of
+DEVICE_NEEDS_RESET indicating device has experienced an error and that reset operation could not be completed
+successfully.
 
 The device MUST present each event by setting the corresponding bit in \field{InterruptStatus} from the
 moment it takes place, until the driver acknowledges the interrupt
@@ -1947,9 +1957,14 @@ \subsection{MMIO Device Register Layout}\label{sec:Virtio Transport Options / Vi
 The driver MUST ignore a device with \field{MagicValue} which is not 0x74726976,
 although it MAY report an error.
 
-The driver MUST ignore a device with \field{Version} which is not 0x2,
+The driver MUST ignore a device with \field{Version} which is neither 0x2 nor 0x3,
 although it MAY report an error.
 
+When \field{Version} is 0x3, the driver, after initiating reset of device by
+writing 0 to \field{Status}, MUST wait for device reset operation to complete.
+Further, when \field{Version} is 0x3, the driver MUST not access any register other than
+reading \field{Status} while device reset is in progress.
+
 The driver MUST ignore a device with \field{DeviceID} 0x0,
 but MUST NOT report any error.
 


-- 
Qualcomm Innovation Center, Inc. is submitting the attached "feedback" as a
non-member to the virtio-dev mailing list for consideration and inclusion.


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] virtio-mmio: Specify wait needed in driver during reset
  2021-08-31 13:57 [PATCH v3] virtio-mmio: Specify wait needed in driver during reset Srivatsa Vaddagiri
@ 2021-08-31 14:45 ` Michael S. Tsirkin
  2021-08-31 15:56   ` [virtio-dev] " Srivatsa Vaddagiri
       [not found] ` <20211125162349-mutt-send-email-mst@kernel.org>
  1 sibling, 1 reply; 8+ messages in thread
From: Michael S. Tsirkin @ 2021-08-31 14:45 UTC (permalink / raw)
  To: Srivatsa Vaddagiri; +Cc: Cornelia Huck, Jason Wang, virtio-dev, tsoni

On Tue, Aug 31, 2021 at 07:27:53PM +0530, Srivatsa Vaddagiri wrote:
> Reset of a virtio-mmio device is initiated by writing 0 to its Status register.
> In case of some devices, the reset operation itself may not be completed
> by the time write instruction completes and hence such devices would require
> drivers to wait on reset operation to complete before they proceed with
> remaining steps of initialization.
> 
> Update the specification to indicate which devices would need driver to block on
> reset completion.
> 
> Suggested-by: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>


I am still of two minds on whether we
want such a drastic change as a version update for such a
minor thing. Yes, we did it for PCI but then PCI did
not break backwards compat like mmio did.

Let's see what needs to happen to make existing drivers work
1- reset starts the reset process
2- following writes into status are buffered by the device
  until reset completes
3- read from features completes after reset is complete

This does not sound like a difficult set of requirements
to implement. Maybe just write up 1-3 above in the spec
and be done with it?



> ---
> v3->v2:
> - Introduce a new status bit, DEVICE_RESET_IN_PROGRESS
> - Limit driver need to poll for reset completion only for MMIO devices having
>   Version = 0x3 
> 
> Previous version can be found at:
> 
> https://lists.oasis-open.org/archives/virtio-dev/202108/msg00065.html
> 
>  content.tex | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/content.tex b/content.tex
> index 7cec1c3..e10f7d3 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -49,6 +49,8 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
>  
>  \item[DEVICE_NEEDS_RESET (64)] Indicates that the device has experienced
>    an error from which it can't recover.
> +
> +\item[DEVICE_RESET_IN_PROGRESS (256)] Indicates that the device reset is in progress.
>  \end{description}
>  
>  The \field{device status} field starts out as 0, and is reinitialized to 0 by
> @@ -1730,9 +1732,9 @@ \subsection{MMIO Device Register Layout}\label{sec:Virtio Transport Options / Vi
>    } 
>    \hline
>    \mmioreg{Version}{Device version number}{0x004}{R}{%
> -    0x2.
> +    0x3.
>      \begin{note}
> -      Legacy devices (see \ref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}~\nameref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}) used 0x1.
> +      Legacy devices (see \ref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}~\nameref{sec:Virtio Transport Options / Virtio Over MMIO / Legacy interface}) used 0x1. Devices that do not require drivers to poll for reset completion can use 0x2. See \ref{devicenormative:Virtio Transport Options / Virtio Over MMIO / MMIO Device Register Layout} for more details.
>      \end{note}
>    }
>    \hline 
> @@ -1916,7 +1918,15 @@ \subsection{MMIO Device Register Layout}\label{sec:Virtio Transport Options / Vi
>  
>  The device MUST return 0x74726976 in \field{MagicValue}.
>  
> -The device MUST return value 0x2 in \field{Version}.
> +The device MUST return either value 0x3 or 0x2 in \field{Version} based on its reset behavior. Drivers trigger reset
> +of a device by writing 0 to \field{Status}. The reset operation itself may or may not be completed by the time write
> +operation is complete. Devices whose reset operation completes synchronously with the write operation are allowed to
> +return value of 0x2 for \field{Version}. Other devices, whose reset operation can be incomplete by the time write
> +operation completes MUST return value 0x3 as an indication for drivers to poll for reset completion. Such devices
> +MUST indicate progress of reset operation in \field{Status}, with value of DEVICE_RESET_IN_PROGRESS indicating reset
> +operation is still in progress, a value of 0x0 indicating reset operation is complete and a value of
> +DEVICE_NEEDS_RESET indicating device has experienced an error and that reset operation could not be completed
> +successfully.
>  
>  The device MUST present each event by setting the corresponding bit in \field{InterruptStatus} from the
>  moment it takes place, until the driver acknowledges the interrupt
> @@ -1947,9 +1957,14 @@ \subsection{MMIO Device Register Layout}\label{sec:Virtio Transport Options / Vi
>  The driver MUST ignore a device with \field{MagicValue} which is not 0x74726976,
>  although it MAY report an error.
>  
> -The driver MUST ignore a device with \field{Version} which is not 0x2,
> +The driver MUST ignore a device with \field{Version} which is neither 0x2 nor 0x3,
>  although it MAY report an error.
>  
> +When \field{Version} is 0x3, the driver, after initiating reset of device by
> +writing 0 to \field{Status}, MUST wait for device reset operation to complete.
> +Further, when \field{Version} is 0x3, the driver MUST not access any register other than
> +reading \field{Status} while device reset is in progress.
> +
>  The driver MUST ignore a device with \field{DeviceID} 0x0,
>  but MUST NOT report any error.
>  
> 
> 
> -- 
> Qualcomm Innovation Center, Inc. is submitting the attached "feedback" as a
> non-member to the virtio-dev mailing list for consideration and inclusion.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [virtio-dev] Re: [PATCH v3] virtio-mmio: Specify wait needed in driver during reset
  2021-08-31 14:45 ` Michael S. Tsirkin
@ 2021-08-31 15:56   ` Srivatsa Vaddagiri
  2021-09-01  1:19     ` Michael S. Tsirkin
  0 siblings, 1 reply; 8+ messages in thread
From: Srivatsa Vaddagiri @ 2021-08-31 15:56 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Cornelia Huck, Jason Wang, virtio-dev, tsoni

* Michael S. Tsirkin <mst@redhat.com> [2021-08-31 10:45:53]:

> On Tue, Aug 31, 2021 at 07:27:53PM +0530, Srivatsa Vaddagiri wrote:
> > Reset of a virtio-mmio device is initiated by writing 0 to its Status register.
> > In case of some devices, the reset operation itself may not be completed
> > by the time write instruction completes and hence such devices would require
> > drivers to wait on reset operation to complete before they proceed with
> > remaining steps of initialization.
> > 
> > Update the specification to indicate which devices would need driver to block on
> > reset completion.
> > 
> > Suggested-by: Jason Wang <jasowang@redhat.com>
> > Signed-off-by: Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>
> 
> 
> I am still of two minds on whether we
> want such a drastic change as a version update for such a
> minor thing. Yes, we did it for PCI but then PCI did
> not break backwards compat like mmio did.
> 
> Let's see what needs to happen to make existing drivers work
> 1- reset starts the reset process
> 2- following writes into status are buffered by the device
>   until reset completes
> 3- read from features completes after reset is complete

Couple of scenarios which we discussed in this regard earlier:

1) What if device reset encounters a failure? What should it return for the
'features' read in that case?

2) For untrusted devices, like in our case [A], it would require hypervisor to
stall vcpu until the untrusted backend responds to the features read request,
which could take a long time.  In worst case, the VM may get reset due to
watchdog firing. Requiring drivers to poll will avoid that situation and allow
drivers to fail probe gracefully.

Ref A: https://lists.oasis-open.org/archives/virtio-dev/202108/msg00090.html

- vatsa

-- 
Qualcomm Innovation Center, Inc. is submitting the attached "feedback" as a
non-member to the virtio-dev mailing list for consideration and inclusion.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [virtio-dev] Re: [PATCH v3] virtio-mmio: Specify wait needed in driver during reset
  2021-08-31 15:56   ` [virtio-dev] " Srivatsa Vaddagiri
@ 2021-09-01  1:19     ` Michael S. Tsirkin
  2021-09-01 13:31       ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 8+ messages in thread
From: Michael S. Tsirkin @ 2021-09-01  1:19 UTC (permalink / raw)
  To: Srivatsa Vaddagiri; +Cc: Cornelia Huck, Jason Wang, virtio-dev, tsoni

On Tue, Aug 31, 2021 at 09:26:03PM +0530, Srivatsa Vaddagiri wrote:
> * Michael S. Tsirkin <mst@redhat.com> [2021-08-31 10:45:53]:
> 
> > On Tue, Aug 31, 2021 at 07:27:53PM +0530, Srivatsa Vaddagiri wrote:
> > > Reset of a virtio-mmio device is initiated by writing 0 to its Status register.
> > > In case of some devices, the reset operation itself may not be completed
> > > by the time write instruction completes and hence such devices would require
> > > drivers to wait on reset operation to complete before they proceed with
> > > remaining steps of initialization.
> > > 
> > > Update the specification to indicate which devices would need driver to block on
> > > reset completion.
> > > 
> > > Suggested-by: Jason Wang <jasowang@redhat.com>
> > > Signed-off-by: Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>
> > 
> > 
> > I am still of two minds on whether we
> > want such a drastic change as a version update for such a
> > minor thing. Yes, we did it for PCI but then PCI did
> > not break backwards compat like mmio did.
> > 
> > Let's see what needs to happen to make existing drivers work
> > 1- reset starts the reset process
> > 2- following writes into status are buffered by the device
> >   until reset completes
> > 3- read from features completes after reset is complete
> 
> Couple of scenarios which we discussed in this regard earlier:
> 
> 1) What if device reset encounters a failure? What should it return for the
> 'features' read in that case?

I'd say the main thing is to fail to set FEATURES_OK.

> 2) For untrusted devices, like in our case [A], it would require hypervisor to
> stall vcpu until the untrusted backend responds to the features read request,

Wait a second, this is fundamental to reads anyway. They can't bypass
writes. E.g. this is the case when FEATURES_OK is written then read
back.

> which could take a long time.

This last is a reasonable argument. so it's not about hardware it's for
software where reset takes a long time and we do not
want to stall the VCPU. That's a reasonable requirement but
pls include it in the text though. And I wonder how you are handling
other cases where reads are ordered with writes.
Is there a reason to assume reset is special?


>  In worst case, the VM may get reset due to
> watchdog firing. Requiring drivers to poll will avoid that situation and allow
> drivers to fail probe gracefully.
>
> Ref A: https://lists.oasis-open.org/archives/virtio-dev/202108/msg00090.html
> 
> - vatsa
> 
> -- 
> Qualcomm Innovation Center, Inc. is submitting the attached "feedback" as a
> non-member to the virtio-dev mailing list for consideration and inclusion.
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [virtio-dev] Re: [PATCH v3] virtio-mmio: Specify wait needed in driver during reset
  2021-09-01  1:19     ` Michael S. Tsirkin
@ 2021-09-01 13:31       ` Srivatsa Vaddagiri
  2021-09-02  7:27         ` Jason Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Srivatsa Vaddagiri @ 2021-09-01 13:31 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Cornelia Huck, Jason Wang, virtio-dev, tsoni

* Michael S. Tsirkin <mst@redhat.com> [2021-08-31 21:19:47]:

> On Tue, Aug 31, 2021 at 09:26:03PM +0530, Srivatsa Vaddagiri wrote:
> > * Michael S. Tsirkin <mst@redhat.com> [2021-08-31 10:45:53]:
> > 
> > > On Tue, Aug 31, 2021 at 07:27:53PM +0530, Srivatsa Vaddagiri wrote:
> > > > Reset of a virtio-mmio device is initiated by writing 0 to its Status register.
> > > > In case of some devices, the reset operation itself may not be completed
> > > > by the time write instruction completes and hence such devices would require
> > > > drivers to wait on reset operation to complete before they proceed with
> > > > remaining steps of initialization.
> > > > 
> > > > Update the specification to indicate which devices would need driver to block on
> > > > reset completion.
> > > > 
> > > > Suggested-by: Jason Wang <jasowang@redhat.com>
> > > > Signed-off-by: Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>
> > > 
> > > 
> > > I am still of two minds on whether we
> > > want such a drastic change as a version update for such a
> > > minor thing. Yes, we did it for PCI but then PCI did
> > > not break backwards compat like mmio did.
> > > 
> > > Let's see what needs to happen to make existing drivers work
> > > 1- reset starts the reset process
> > > 2- following writes into status are buffered by the device
> > >   until reset completes
> > > 3- read from features completes after reset is complete
> > 
> > Couple of scenarios which we discussed in this regard earlier:
> > 
> > 1) What if device reset encounters a failure? What should it return for the
> > 'features' read in that case?
> 
> I'd say the main thing is to fail to set FEATURES_OK.

That would signal device does not support the features offered by driver, which
seems like very hackish way to fail probe in this case (when reset had actually
failed). Driver has no way to understand what happened - whether device did not
like the features offered or reset failed.

> > 2) For untrusted devices, like in our case [A], it would require hypervisor to
> > stall vcpu until the untrusted backend responds to the features read request,
> 
> Wait a second, this is fundamental to reads anyway. They can't bypass
> writes. E.g. this is the case when FEATURES_OK is written then read
> back.
> 
> > which could take a long time.
> 
> This last is a reasonable argument. so it's not about hardware it's for
> software where reset takes a long time and we do not
> want to stall the VCPU. That's a reasonable requirement but
> pls include it in the text though.

Sure will update commit text to indicate that better.

> And I wonder how you are handling
> other cases where reads are ordered with writes.
> Is there a reason to assume reset is special?

With what we have atm, which is working well for atleast block device,
hypervisor is able to handle read/writes on its own without needing synchronous
intervention from (untrusted) backend diver - except for reset that is.

Hypervisor is fed with virtio device configuration information even before VM
starts and in fact in our current prototype, reads don't even trap into
hypervisor. A config page is mapped as read-only into guest and so all
configuration information like device id, version etc can be read w/o causing a
trap. Writes are trapped and we have been able to have hypervisor handles all
writes (except reset) w/o needing synchronous intervention from backend driver.
For example, any write to driver features, QueueNumMax etc are cached within
hypervisor which is later passed to backend driver lazily - via some hypercall
interface - before it starts working on the first IO transaction.

Reset seems to be the only write that needs synchronous intervention from
backend, which we want to address by having guest loop on reset completion
(rather than stall vcpu until backend finishes reset handling).

Alternately, I could go by your suggestion and stall the vcpu for a certain
threshold period of time when reset is issued. If backend fails to ACK reset
within that threshold period, unblock vcpu and fail further initialization by
refusing to set FATURES_OK bit. Like I said before, that seems very
hackish way to fail probe - with no clear indication to driver on what failed.

How about merging a change like this to Linux driver w/o making any
spec changes for now?

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 56128b9..9db81e6 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -262,6 +262,9 @@ static void vm_reset(struct virtio_device *vdev)

        /* 0 status means a reset. */
        writel(0, vm_dev->base + VIRTIO_MMIO_STATUS);
+
+       while (vm_get_status(vdev)) {
+		/* Bail after some timeout */
+               msleep(1);
+	}
 }

- vatsa


> > Ref A: https://lists.oasis-open.org/archives/virtio-dev/202108/msg00090.html



-- 
Qualcomm Innovation Center, Inc. is submitting the attached "feedback" as a
non-member to the virtio-dev mailing list for consideration and inclusion.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [virtio-dev] Re: [PATCH v3] virtio-mmio: Specify wait needed in driver during reset
  2021-09-01 13:31       ` Srivatsa Vaddagiri
@ 2021-09-02  7:27         ` Jason Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Wang @ 2021-09-02  7:27 UTC (permalink / raw)
  To: Srivatsa Vaddagiri
  Cc: Michael S. Tsirkin, Cornelia Huck, Virtio-Dev, Trilok Soni

On Wed, Sep 1, 2021 at 9:31 PM Srivatsa Vaddagiri
<quic_svaddagi@quicinc.com> wrote:
>
> * Michael S. Tsirkin <mst@redhat.com> [2021-08-31 21:19:47]:
>
> > On Tue, Aug 31, 2021 at 09:26:03PM +0530, Srivatsa Vaddagiri wrote:
> > > * Michael S. Tsirkin <mst@redhat.com> [2021-08-31 10:45:53]:
> > >
> > > > On Tue, Aug 31, 2021 at 07:27:53PM +0530, Srivatsa Vaddagiri wrote:
> > > > > Reset of a virtio-mmio device is initiated by writing 0 to its Status register.
> > > > > In case of some devices, the reset operation itself may not be completed
> > > > > by the time write instruction completes and hence such devices would require
> > > > > drivers to wait on reset operation to complete before they proceed with
> > > > > remaining steps of initialization.
> > > > >
> > > > > Update the specification to indicate which devices would need driver to block on
> > > > > reset completion.
> > > > >
> > > > > Suggested-by: Jason Wang <jasowang@redhat.com>
> > > > > Signed-off-by: Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>
> > > >
> > > >
> > > > I am still of two minds on whether we
> > > > want such a drastic change as a version update for such a
> > > > minor thing. Yes, we did it for PCI but then PCI did
> > > > not break backwards compat like mmio did.
> > > >
> > > > Let's see what needs to happen to make existing drivers work
> > > > 1- reset starts the reset process
> > > > 2- following writes into status are buffered by the device
> > > >   until reset completes
> > > > 3- read from features completes after reset is complete
> > >
> > > Couple of scenarios which we discussed in this regard earlier:
> > >
> > > 1) What if device reset encounters a failure? What should it return for the
> > > 'features' read in that case?
> >
> > I'd say the main thing is to fail to set FEATURES_OK.
>
> That would signal device does not support the features offered by driver, which
> seems like very hackish way to fail probe in this case (when reset had actually
> failed). Driver has no way to understand what happened - whether device did not
> like the features offered or reset failed.
>
> > > 2) For untrusted devices, like in our case [A], it would require hypervisor to
> > > stall vcpu until the untrusted backend responds to the features read request,
> >
> > Wait a second, this is fundamental to reads anyway. They can't bypass
> > writes. E.g. this is the case when FEATURES_OK is written then read
> > back.
> >
> > > which could take a long time.
> >
> > This last is a reasonable argument. so it's not about hardware it's for
> > software where reset takes a long time and we do not
> > want to stall the VCPU. That's a reasonable requirement but
> > pls include it in the text though.
>
> Sure will update commit text to indicate that better.
>
> > And I wonder how you are handling
> > other cases where reads are ordered with writes.

A question, do we need to clarify the ordering in the spec?

> > Is there a reason to assume reset is special?
>
> With what we have atm, which is working well for atleast block device,
> hypervisor is able to handle read/writes on its own without needing synchronous
> intervention from (untrusted) backend diver - except for reset that is.
>
> Hypervisor is fed with virtio device configuration information even before VM
> starts and in fact in our current prototype, reads don't even trap into
> hypervisor. A config page is mapped as read-only into guest and so all
> configuration information like device id, version etc can be read w/o causing a
> trap. Writes are trapped and we have been able to have hypervisor handles all
> writes (except reset) w/o needing synchronous intervention from backend driver.
> For example, any write to driver features, QueueNumMax etc are cached within
> hypervisor which is later passed to backend driver lazily - via some hypercall
> interface - before it starts working on the first IO transaction.
>
> Reset seems to be the only write that needs synchronous intervention from
> backend, which we want to address by having guest loop on reset completion
> (rather than stall vcpu until backend finishes reset handling).

This sounds similar to what we meet with VDUSE, a untrusted
vDPA(virtio) device in the userspace.

Thanks

>
> Alternately, I could go by your suggestion and stall the vcpu for a certain
> threshold period of time when reset is issued. If backend fails to ACK reset
> within that threshold period, unblock vcpu and fail further initialization by
> refusing to set FATURES_OK bit. Like I said before, that seems very
> hackish way to fail probe - with no clear indication to driver on what failed.
>
> How about merging a change like this to Linux driver w/o making any
> spec changes for now?
>
> diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
> index 56128b9..9db81e6 100644
> --- a/drivers/virtio/virtio_mmio.c
> +++ b/drivers/virtio/virtio_mmio.c
> @@ -262,6 +262,9 @@ static void vm_reset(struct virtio_device *vdev)
>
>         /* 0 status means a reset. */
>         writel(0, vm_dev->base + VIRTIO_MMIO_STATUS);
> +
> +       while (vm_get_status(vdev)) {
> +               /* Bail after some timeout */
> +               msleep(1);
> +       }
>  }
>
> - vatsa
>
>
> > > Ref A: https://lists.oasis-open.org/archives/virtio-dev/202108/msg00090.html
>
>
>
> --
> Qualcomm Innovation Center, Inc. is submitting the attached "feedback" as a
> non-member to the virtio-dev mailing list for consideration and inclusion.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [virtio-dev] Re: [PATCH v3] virtio-mmio: Specify wait needed in driver during reset
       [not found]   ` <CACGkMEun-obsMTLAGCCtKpqNxWHBKk_QC_6WywsVsFb7mfW9qw@mail.gmail.com>
@ 2021-11-26  8:35     ` Cornelia Huck
  2021-11-29  2:40       ` Jason Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Cornelia Huck @ 2021-11-26  8:35 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: Srivatsa Vaddagiri, Virtio-Dev, Trilok Soni

On Fri, Nov 26 2021, Jason Wang <jasowang@redhat.com> wrote:

> On Fri, Nov 26, 2021 at 5:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>>
>> On Tue, Aug 31, 2021 at 07:27:53PM +0530, Srivatsa Vaddagiri wrote:
>> > Reset of a virtio-mmio device is initiated by writing 0 to its Status register.
>> > In case of some devices, the reset operation itself may not be completed
>> > by the time write instruction completes and hence such devices would require
>> > drivers to wait on reset operation to complete before they proceed with
>> > remaining steps of initialization.
>> >
>> > Update the specification to indicate which devices would need driver to block on
>> > reset completion.
>> >
>> > Suggested-by: Jason Wang <jasowang@redhat.com>
>> > Signed-off-by: Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>
>>
>> I have been thinking about this some more.
>>
>>
>> \section{Device Reset}\label{sec:Basic Facilities of a Virtio Device / Device Reset}
>>
>> The driver may want to initiate a device reset at various times; notably,
>> it is required to do so during device initialization and device cleanup.
>>
>> The mechanism used by the driver to initiate the reset is transport specific.
>>
>> \devicenormative{\subsection}{Device Reset}{Basic Facilities of a Virtio Device / Device Reset}
>>
>> A device MUST reinitialize \field{device status} to 0 after receiving a reset.
>>
>> A device MUST NOT send notifications or interact with the queues after
>> indicating completion of the reset by reinitializing \field{device status}
>> to 0, until the driver re-initializes the device.
>>
>> \drivernormative{\subsection}{Device Reset}{Basic Facilities of a Virtio Device / Device Reset}
>>
>> The driver SHOULD consider a driver-initiated reset complete when it
>> reads \field{device status} as 0.
>
> I wonder if we need to switch to using MUST here.

I'm not quite sure what advantage s/SHOULD/MUST/ would give us here.

I don't think we want to force the driver to read back the device status
to determine success if the transport already has a different way to
signal completion. (I probably should undust my ccw reset clarification
patch...)

>> So I guess we can just fix the mmio driver to read status
>> and declare victory? Do we really need all the spec changes?
>
> Probably, but one consideration of this change is to avoid the
> breaking of existing devices. Maybe it's over engineering.

How big is the chance that we use an unfixed driver with a device that
would get it into trouble? The proposed change is pretty complex, I'd
rather avoid it if possible.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3] virtio-mmio: Specify wait needed in driver during reset
  2021-11-26  8:35     ` Cornelia Huck
@ 2021-11-29  2:40       ` Jason Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Wang @ 2021-11-29  2:40 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, Srivatsa Vaddagiri, Virtio-Dev, Trilok Soni

On Fri, Nov 26, 2021 at 4:35 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Fri, Nov 26 2021, Jason Wang <jasowang@redhat.com> wrote:
>
> > On Fri, Nov 26, 2021 at 5:25 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>
> >> On Tue, Aug 31, 2021 at 07:27:53PM +0530, Srivatsa Vaddagiri wrote:
> >> > Reset of a virtio-mmio device is initiated by writing 0 to its Status register.
> >> > In case of some devices, the reset operation itself may not be completed
> >> > by the time write instruction completes and hence such devices would require
> >> > drivers to wait on reset operation to complete before they proceed with
> >> > remaining steps of initialization.
> >> >
> >> > Update the specification to indicate which devices would need driver to block on
> >> > reset completion.
> >> >
> >> > Suggested-by: Jason Wang <jasowang@redhat.com>
> >> > Signed-off-by: Srivatsa Vaddagiri <quic_svaddagi@quicinc.com>
> >>
> >> I have been thinking about this some more.
> >>
> >>
> >> \section{Device Reset}\label{sec:Basic Facilities of a Virtio Device / Device Reset}
> >>
> >> The driver may want to initiate a device reset at various times; notably,
> >> it is required to do so during device initialization and device cleanup.
> >>
> >> The mechanism used by the driver to initiate the reset is transport specific.
> >>
> >> \devicenormative{\subsection}{Device Reset}{Basic Facilities of a Virtio Device / Device Reset}
> >>
> >> A device MUST reinitialize \field{device status} to 0 after receiving a reset.
> >>
> >> A device MUST NOT send notifications or interact with the queues after
> >> indicating completion of the reset by reinitializing \field{device status}
> >> to 0, until the driver re-initializes the device.
> >>
> >> \drivernormative{\subsection}{Device Reset}{Basic Facilities of a Virtio Device / Device Reset}
> >>
> >> The driver SHOULD consider a driver-initiated reset complete when it
> >> reads \field{device status} as 0.
> >
> > I wonder if we need to switch to using MUST here.
>
> I'm not quite sure what advantage s/SHOULD/MUST/ would give us here.

It means the driver needs to validate the completion. Using SHOULD
means for some reason it can be ignored. So considering mmio doesn't
do the validation now, maybe SHOULD be better. Or I wonder if we can
change to use

"The driver SHOULD consider a driver-initiated reset complete in a
transport specific way"?

>
> I don't think we want to force the driver to read back the device status
> to determine success if the transport already has a different way to
> signal completion. (I probably should undust my ccw reset clarification
> patch...)

Right.

>
> >> So I guess we can just fix the mmio driver to read status
> >> and declare victory? Do we really need all the spec changes?
> >
> > Probably, but one consideration of this change is to avoid the
> > breaking of existing devices. Maybe it's over engineering.
>
> How big is the chance that we use an unfixed driver with a device that
> would get it into trouble?

The problem I guess is that it would be impossible to fix the device sometime.

>The proposed change is pretty complex, I'd
> rather avoid it if possible.

I agree, I think it may as simple as adding things like:

"After writing 0 to status, the driver MUST wait for a read of status
to return 0 before reinitializing the device."

For the version bumping, it's probably because we don't have a
capability enumeration like pci, so there's no more choices.

Thanks

>


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-11-29  2:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-31 13:57 [PATCH v3] virtio-mmio: Specify wait needed in driver during reset Srivatsa Vaddagiri
2021-08-31 14:45 ` Michael S. Tsirkin
2021-08-31 15:56   ` [virtio-dev] " Srivatsa Vaddagiri
2021-09-01  1:19     ` Michael S. Tsirkin
2021-09-01 13:31       ` Srivatsa Vaddagiri
2021-09-02  7:27         ` Jason Wang
     [not found] ` <20211125162349-mutt-send-email-mst@kernel.org>
     [not found]   ` <CACGkMEun-obsMTLAGCCtKpqNxWHBKk_QC_6WywsVsFb7mfW9qw@mail.gmail.com>
2021-11-26  8:35     ` Cornelia Huck
2021-11-29  2:40       ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.