All of lore.kernel.org
 help / color / mirror / Atom feed
* [virtio-dev] [PATCH v4 0/1] Define a low power mode for devices
@ 2024-02-16  8:24 ` David Stevens
  0 siblings, 0 replies; 18+ messages in thread
From: David Stevens @ 2024-02-16  8:24 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, virtio-comment, virtio-dev
  Cc: parav, David Stevens

The Linux patch [1] added support for suspending virtio devices using
native PCI power management. However, Linux does PCI power management
during the noirq phase of suspend and resume. Without a mechanism to
suspend/resume virtio devices when the driver is suspended/resumed in
the early phase of suspend/late phase of resume, there is a window where
interrupts can be lost.

[1] https://lore.kernel.org/lkml/20231208070754.3132339-1-stevensd@chromium.org/

v3 -> v4:
 - Define virtio-pci specific power management that can be used in
   conjunction with PCI power management.
v2 -> v3:
 - Use different words for some concepts to avoid conflicts with other
   parts of the spec.
 - Rewrite various sentences to improve clarity.
v1 -> v2:
 - Define virtio-pci support on top of PCI power management.
 - Add more conformance requirements.

David Stevens (1):
  Add suspend support for virtio PCI devices

 transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

-- 
2.44.0.rc0.258.g7320e95886-goog


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-comment] [PATCH v4 0/1] Define a low power mode for devices
@ 2024-02-16  8:24 ` David Stevens
  0 siblings, 0 replies; 18+ messages in thread
From: David Stevens @ 2024-02-16  8:24 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, virtio-comment, virtio-dev
  Cc: parav, David Stevens

The Linux patch [1] added support for suspending virtio devices using
native PCI power management. However, Linux does PCI power management
during the noirq phase of suspend and resume. Without a mechanism to
suspend/resume virtio devices when the driver is suspended/resumed in
the early phase of suspend/late phase of resume, there is a window where
interrupts can be lost.

[1] https://lore.kernel.org/lkml/20231208070754.3132339-1-stevensd@chromium.org/

v3 -> v4:
 - Define virtio-pci specific power management that can be used in
   conjunction with PCI power management.
v2 -> v3:
 - Use different words for some concepts to avoid conflicts with other
   parts of the spec.
 - Rewrite various sentences to improve clarity.
v1 -> v2:
 - Define virtio-pci support on top of PCI power management.
 - Add more conformance requirements.

David Stevens (1):
  Add suspend support for virtio PCI devices

 transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

-- 
2.44.0.rc0.258.g7320e95886-goog


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-dev] [PATCH v4 1/1] Add suspend support for virtio PCI devices
  2024-02-16  8:24 ` [virtio-comment] " David Stevens
@ 2024-02-16  8:24   ` David Stevens
  -1 siblings, 0 replies; 18+ messages in thread
From: David Stevens @ 2024-02-16  8:24 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, virtio-comment, virtio-dev
  Cc: parav, David Stevens

Add a virtio power management PCI capability to allow drivers to suspend
virtio PCI devices. This allows drivers to suspend devices at the virtio
level before suspending them at the PCI transport layer. This allows
drivers to do a two phase suspend, which prevents notifications from
being ignored or lost if interrupts are reconfigured at the PCI
transport layer immediately before or after the device is put into the
PCI PM D3 low power state.
---
 transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index a5c6719ea871..ce77708a9b69 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
 /* Vendor-specific data */
 #define VIRTIO_PCI_CAP_VENDOR_CFG        9
+/* Power management configuration */
+#define VIRTIO_PCI_CAP_PM_CFG            10
 \end{lstlisting}
 
         Any other value is reserved for future use.
@@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
+
+The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
+driver can write to the byte to set the power state of the device,
+and it can read from the byte to get the current power state of the
+device.
+
+The valid power states are:
+
+\begin{lstlisting}
+/* Device is operating normally */
+#define VIRTIO_PM_STATE_ACTIVE    0
+/* Device operation is suspended */
+#define VIRTIO_PM_STATE_SUSPENDED 1
+\end{lstlisting}
+
+The device power state has no effect when \field{device status} does
+not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
+set.
+
+\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
+
+A device MUST maintain its state while suspended such that all driver
+visible state after resuming exactly matches driver visible state
+before suspending.
+
+A device MUST NOT send notifications while suspended.
+
+A device MAY operate on any buffers in its virtqueue while suspended.
+
+A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
+
+A device SHOULD take steps to minimize its resource consumption while
+suspended, although what this involves is specific to the particular
+device implementation.
+
+\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
+
+A driver MUST NOT access a suspended device's BARs corresponding to
+any virtio structures, except for the power management byte.
+
+A driver MAY suspend a device that has buffers in one of its
+virtqueues, but it MUST NOT modify any such buffers while the device
+is suspended.
+
+A driver MUST read from the power management byte after writing to the
+byte to verify that the device successfully entered the target power
+state.
+
 \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
 
 Transitional devices MUST present part of configuration
-- 
2.44.0.rc0.258.g7320e95886-goog


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [virtio-comment] [PATCH v4 1/1] Add suspend support for virtio PCI devices
@ 2024-02-16  8:24   ` David Stevens
  0 siblings, 0 replies; 18+ messages in thread
From: David Stevens @ 2024-02-16  8:24 UTC (permalink / raw)
  To: Michael S . Tsirkin, Jason Wang, virtio-comment, virtio-dev
  Cc: parav, David Stevens

Add a virtio power management PCI capability to allow drivers to suspend
virtio PCI devices. This allows drivers to suspend devices at the virtio
level before suspending them at the PCI transport layer. This allows
drivers to do a two phase suspend, which prevents notifications from
being ignored or lost if interrupts are reconfigured at the PCI
transport layer immediately before or after the device is put into the
PCI PM D3 low power state.
---
 transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index a5c6719ea871..ce77708a9b69 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
 #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
 /* Vendor-specific data */
 #define VIRTIO_PCI_CAP_VENDOR_CFG        9
+/* Power management configuration */
+#define VIRTIO_PCI_CAP_PM_CFG            10
 \end{lstlisting}
 
         Any other value is reserved for future use.
@@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
+
+The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
+driver can write to the byte to set the power state of the device,
+and it can read from the byte to get the current power state of the
+device.
+
+The valid power states are:
+
+\begin{lstlisting}
+/* Device is operating normally */
+#define VIRTIO_PM_STATE_ACTIVE    0
+/* Device operation is suspended */
+#define VIRTIO_PM_STATE_SUSPENDED 1
+\end{lstlisting}
+
+The device power state has no effect when \field{device status} does
+not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
+set.
+
+\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
+
+A device MUST maintain its state while suspended such that all driver
+visible state after resuming exactly matches driver visible state
+before suspending.
+
+A device MUST NOT send notifications while suspended.
+
+A device MAY operate on any buffers in its virtqueue while suspended.
+
+A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
+
+A device SHOULD take steps to minimize its resource consumption while
+suspended, although what this involves is specific to the particular
+device implementation.
+
+\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
+
+A driver MUST NOT access a suspended device's BARs corresponding to
+any virtio structures, except for the power management byte.
+
+A driver MAY suspend a device that has buffers in one of its
+virtqueues, but it MUST NOT modify any such buffers while the device
+is suspended.
+
+A driver MUST read from the power management byte after writing to the
+byte to verify that the device successfully entered the target power
+state.
+
 \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
 
 Transitional devices MUST present part of configuration
-- 
2.44.0.rc0.258.g7320e95886-goog


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
  2024-02-16  8:24   ` [virtio-comment] " David Stevens
@ 2024-02-16  8:56     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2024-02-16  8:56 UTC (permalink / raw)
  To: David Stevens; +Cc: Jason Wang, virtio-comment, virtio-dev, parav

On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote:
> Add a virtio power management PCI capability to allow drivers to suspend
> virtio PCI devices. This allows drivers to suspend devices at the virtio
> level before suspending them at the PCI transport layer. This allows
> drivers to do a two phase suspend, which prevents notifications from
> being ignored or lost if interrupts are reconfigured at the PCI
> transport layer immediately before or after the device is put into the
> PCI PM D3 low power state.
> ---
>  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index a5c6719ea871..ce77708a9b69 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>  /* Vendor-specific data */
>  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> +/* Power management configuration */
> +#define VIRTIO_PCI_CAP_PM_CFG            10
>  \end{lstlisting}
>  
>          Any other value is reserved for future use.
> @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>  specified by some other Virtio Structure PCI Capability
>  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>  
> +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> +
> +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
> +driver can write to the byte to set the power state of the device,
> +and it can read from the byte to get the current power state of the
> +device.
> +
> +The valid power states are:
> +
> +\begin{lstlisting}
> +/* Device is operating normally */
> +#define VIRTIO_PM_STATE_ACTIVE    0
> +/* Device operation is suspended */
> +#define VIRTIO_PM_STATE_SUSPENDED 1
> +\end{lstlisting}
> +
> +The device power state has no effect when \field{device status} does
> +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
> +set.


Given this is only after DRIVER_OK, wouldn't a feature bit + status bit
make more sense? This will make it transport-independent and simplify
discovery.

> +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> +
> +A device MUST maintain its state while suspended such that all driver
> +visible state after resuming exactly matches driver visible state
> +before suspending.
> +
> +A device MUST NOT send notifications while suspended.
> +
> +A device MAY operate on any buffers in its virtqueue while suspended.

How is this reconsiled with state matching exactly? buffers are
driver-visible ...

> +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
> +
> +A device SHOULD take steps to minimize its resource consumption while
> +suspended, although what this involves is specific to the particular
> +device implementation.
> +
> +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> +
> +A driver MUST NOT access a suspended device's BARs corresponding to
> +any virtio structures, except for the power management byte.
> +
> +A driver MAY suspend a device that has buffers in one of its
> +virtqueues, but it MUST NOT modify any such buffers while the device
> +is suspended.
> +
> +A driver MUST read from the power management byte after writing to the
> +byte to verify that the device successfully entered the target power
> +state.

Verify how? By checking the value returned? And what should it do with the value 
does not match?


> +
>  \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
>  
>  Transitional devices MUST present part of configuration
> -- 
> 2.44.0.rc0.258.g7320e95886-goog


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
@ 2024-02-16  8:56     ` Michael S. Tsirkin
  0 siblings, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2024-02-16  8:56 UTC (permalink / raw)
  To: David Stevens; +Cc: Jason Wang, virtio-comment, virtio-dev, parav

On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote:
> Add a virtio power management PCI capability to allow drivers to suspend
> virtio PCI devices. This allows drivers to suspend devices at the virtio
> level before suspending them at the PCI transport layer. This allows
> drivers to do a two phase suspend, which prevents notifications from
> being ignored or lost if interrupts are reconfigured at the PCI
> transport layer immediately before or after the device is put into the
> PCI PM D3 low power state.
> ---
>  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 
> diff --git a/transport-pci.tex b/transport-pci.tex
> index a5c6719ea871..ce77708a9b69 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>  /* Vendor-specific data */
>  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> +/* Power management configuration */
> +#define VIRTIO_PCI_CAP_PM_CFG            10
>  \end{lstlisting}
>  
>          Any other value is reserved for future use.
> @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>  specified by some other Virtio Structure PCI Capability
>  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>  
> +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> +
> +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
> +driver can write to the byte to set the power state of the device,
> +and it can read from the byte to get the current power state of the
> +device.
> +
> +The valid power states are:
> +
> +\begin{lstlisting}
> +/* Device is operating normally */
> +#define VIRTIO_PM_STATE_ACTIVE    0
> +/* Device operation is suspended */
> +#define VIRTIO_PM_STATE_SUSPENDED 1
> +\end{lstlisting}
> +
> +The device power state has no effect when \field{device status} does
> +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
> +set.


Given this is only after DRIVER_OK, wouldn't a feature bit + status bit
make more sense? This will make it transport-independent and simplify
discovery.

> +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> +
> +A device MUST maintain its state while suspended such that all driver
> +visible state after resuming exactly matches driver visible state
> +before suspending.
> +
> +A device MUST NOT send notifications while suspended.
> +
> +A device MAY operate on any buffers in its virtqueue while suspended.

How is this reconsiled with state matching exactly? buffers are
driver-visible ...

> +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
> +
> +A device SHOULD take steps to minimize its resource consumption while
> +suspended, although what this involves is specific to the particular
> +device implementation.
> +
> +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> +
> +A driver MUST NOT access a suspended device's BARs corresponding to
> +any virtio structures, except for the power management byte.
> +
> +A driver MAY suspend a device that has buffers in one of its
> +virtqueues, but it MUST NOT modify any such buffers while the device
> +is suspended.
> +
> +A driver MUST read from the power management byte after writing to the
> +byte to verify that the device successfully entered the target power
> +state.

Verify how? By checking the value returned? And what should it do with the value 
does not match?


> +
>  \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
>  
>  Transitional devices MUST present part of configuration
> -- 
> 2.44.0.rc0.258.g7320e95886-goog


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
  2024-02-16  8:56     ` [virtio-comment] " Michael S. Tsirkin
@ 2024-02-18 10:59       ` Zhu, Lingshan
  -1 siblings, 0 replies; 18+ messages in thread
From: Zhu, Lingshan @ 2024-02-18 10:59 UTC (permalink / raw)
  To: Michael S. Tsirkin, David Stevens
  Cc: Jason Wang, virtio-comment, virtio-dev, parav



On 2/16/2024 4:56 PM, Michael S. Tsirkin wrote:
> On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote:
>> Add a virtio power management PCI capability to allow drivers to suspend
>> virtio PCI devices. This allows drivers to suspend devices at the virtio
>> level before suspending them at the PCI transport layer. This allows
>> drivers to do a two phase suspend, which prevents notifications from
>> being ignored or lost if interrupts are reconfigured at the PCI
>> transport layer immediately before or after the device is put into the
>> PCI PM D3 low power state.
>> ---
>>   transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 51 insertions(+)
>>
>> diff --git a/transport-pci.tex b/transport-pci.tex
>> index a5c6719ea871..ce77708a9b69 100644
>> --- a/transport-pci.tex
>> +++ b/transport-pci.tex
>> @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>>   #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>>   /* Vendor-specific data */
>>   #define VIRTIO_PCI_CAP_VENDOR_CFG        9
>> +/* Power management configuration */
>> +#define VIRTIO_PCI_CAP_PM_CFG            10
>>   \end{lstlisting}
>>   
>>           Any other value is reserved for future use.
>> @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>>   specified by some other Virtio Structure PCI Capability
>>   of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>>   
>> +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
>> +
>> +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
>> +driver can write to the byte to set the power state of the device,
>> +and it can read from the byte to get the current power state of the
>> +device.
>> +
>> +The valid power states are:
>> +
>> +\begin{lstlisting}
>> +/* Device is operating normally */
>> +#define VIRTIO_PM_STATE_ACTIVE    0
>> +/* Device operation is suspended */
>> +#define VIRTIO_PM_STATE_SUSPENDED 1
>> +\end{lstlisting}
>> +
>> +The device power state has no effect when \field{device status} does
>> +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
>> +set.
>
> Given this is only after DRIVER_OK, wouldn't a feature bit + status bit
> make more sense? This will make it transport-independent and simplify
> discovery.
I agree with Michael, I have ever proposed a solution suspending the device
by setting _SUSPEND in the device_status field. I will send this suspending
patch again(will cc you) in a few days.

Thanks
Zhu Lingshan
>
>> +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
>> +
>> +A device MUST maintain its state while suspended such that all driver
>> +visible state after resuming exactly matches driver visible state
>> +before suspending.
>> +
>> +A device MUST NOT send notifications while suspended.
>> +
>> +A device MAY operate on any buffers in its virtqueue while suspended.
> How is this reconsiled with state matching exactly? buffers are
> driver-visible ...
>
>> +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
>> +
>> +A device SHOULD take steps to minimize its resource consumption while
>> +suspended, although what this involves is specific to the particular
>> +device implementation.
>> +
>> +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
>> +
>> +A driver MUST NOT access a suspended device's BARs corresponding to
>> +any virtio structures, except for the power management byte.
>> +
>> +A driver MAY suspend a device that has buffers in one of its
>> +virtqueues, but it MUST NOT modify any such buffers while the device
>> +is suspended.
>> +
>> +A driver MUST read from the power management byte after writing to the
>> +byte to verify that the device successfully entered the target power
>> +state.
> Verify how? By checking the value returned? And what should it do with the value
> does not match?
>
>
>> +
>>   \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
>>   
>>   Transitional devices MUST present part of configuration
>> -- 
>> 2.44.0.rc0.258.g7320e95886-goog
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
@ 2024-02-18 10:59       ` Zhu, Lingshan
  0 siblings, 0 replies; 18+ messages in thread
From: Zhu, Lingshan @ 2024-02-18 10:59 UTC (permalink / raw)
  To: Michael S. Tsirkin, David Stevens
  Cc: Jason Wang, virtio-comment, virtio-dev, parav



On 2/16/2024 4:56 PM, Michael S. Tsirkin wrote:
> On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote:
>> Add a virtio power management PCI capability to allow drivers to suspend
>> virtio PCI devices. This allows drivers to suspend devices at the virtio
>> level before suspending them at the PCI transport layer. This allows
>> drivers to do a two phase suspend, which prevents notifications from
>> being ignored or lost if interrupts are reconfigured at the PCI
>> transport layer immediately before or after the device is put into the
>> PCI PM D3 low power state.
>> ---
>>   transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 51 insertions(+)
>>
>> diff --git a/transport-pci.tex b/transport-pci.tex
>> index a5c6719ea871..ce77708a9b69 100644
>> --- a/transport-pci.tex
>> +++ b/transport-pci.tex
>> @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
>>   #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
>>   /* Vendor-specific data */
>>   #define VIRTIO_PCI_CAP_VENDOR_CFG        9
>> +/* Power management configuration */
>> +#define VIRTIO_PCI_CAP_PM_CFG            10
>>   \end{lstlisting}
>>   
>>           Any other value is reserved for future use.
>> @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
>>   specified by some other Virtio Structure PCI Capability
>>   of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
>>   
>> +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
>> +
>> +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
>> +driver can write to the byte to set the power state of the device,
>> +and it can read from the byte to get the current power state of the
>> +device.
>> +
>> +The valid power states are:
>> +
>> +\begin{lstlisting}
>> +/* Device is operating normally */
>> +#define VIRTIO_PM_STATE_ACTIVE    0
>> +/* Device operation is suspended */
>> +#define VIRTIO_PM_STATE_SUSPENDED 1
>> +\end{lstlisting}
>> +
>> +The device power state has no effect when \field{device status} does
>> +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
>> +set.
>
> Given this is only after DRIVER_OK, wouldn't a feature bit + status bit
> make more sense? This will make it transport-independent and simplify
> discovery.
I agree with Michael, I have ever proposed a solution suspending the device
by setting _SUSPEND in the device_status field. I will send this suspending
patch again(will cc you) in a few days.

Thanks
Zhu Lingshan
>
>> +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
>> +
>> +A device MUST maintain its state while suspended such that all driver
>> +visible state after resuming exactly matches driver visible state
>> +before suspending.
>> +
>> +A device MUST NOT send notifications while suspended.
>> +
>> +A device MAY operate on any buffers in its virtqueue while suspended.
> How is this reconsiled with state matching exactly? buffers are
> driver-visible ...
>
>> +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
>> +
>> +A device SHOULD take steps to minimize its resource consumption while
>> +suspended, although what this involves is specific to the particular
>> +device implementation.
>> +
>> +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
>> +
>> +A driver MUST NOT access a suspended device's BARs corresponding to
>> +any virtio structures, except for the power management byte.
>> +
>> +A driver MAY suspend a device that has buffers in one of its
>> +virtqueues, but it MUST NOT modify any such buffers while the device
>> +is suspended.
>> +
>> +A driver MUST read from the power management byte after writing to the
>> +byte to verify that the device successfully entered the target power
>> +state.
> Verify how? By checking the value returned? And what should it do with the value
> does not match?
>
>
>> +
>>   \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
>>   
>>   Transitional devices MUST present part of configuration
>> -- 
>> 2.44.0.rc0.258.g7320e95886-goog
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
  2024-02-16  8:56     ` [virtio-comment] " Michael S. Tsirkin
@ 2024-02-19  6:46       ` David Stevens
  -1 siblings, 0 replies; 18+ messages in thread
From: David Stevens @ 2024-02-19  6:46 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jason Wang, virtio-comment, virtio-dev, parav

On Fri, Feb 16, 2024 at 5:56 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote:
> > Add a virtio power management PCI capability to allow drivers to suspend
> > virtio PCI devices. This allows drivers to suspend devices at the virtio
> > level before suspending them at the PCI transport layer. This allows
> > drivers to do a two phase suspend, which prevents notifications from
> > being ignored or lost if interrupts are reconfigured at the PCI
> > transport layer immediately before or after the device is put into the
> > PCI PM D3 low power state.
> > ---
> >  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 51 insertions(+)
> >
> > diff --git a/transport-pci.tex b/transport-pci.tex
> > index a5c6719ea871..ce77708a9b69 100644
> > --- a/transport-pci.tex
> > +++ b/transport-pci.tex
> > @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
> >  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
> >  /* Vendor-specific data */
> >  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> > +/* Power management configuration */
> > +#define VIRTIO_PCI_CAP_PM_CFG            10
> >  \end{lstlisting}
> >
> >          Any other value is reserved for future use.
> > @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
> >  specified by some other Virtio Structure PCI Capability
> >  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
> >
> > +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > +
> > +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
> > +driver can write to the byte to set the power state of the device,
> > +and it can read from the byte to get the current power state of the
> > +device.
> > +
> > +The valid power states are:
> > +
> > +\begin{lstlisting}
> > +/* Device is operating normally */
> > +#define VIRTIO_PM_STATE_ACTIVE    0
> > +/* Device operation is suspended */
> > +#define VIRTIO_PM_STATE_SUSPENDED 1
> > +\end{lstlisting}
> > +
> > +The device power state has no effect when \field{device status} does
> > +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
> > +set.
>
>
> Given this is only after DRIVER_OK, wouldn't a feature bit + status bit
> make more sense? This will make it transport-independent and simplify
> discovery.

A feature bit + status bit would work as well. I've posted some
questions on Zhu Lingshan's patch.

> > +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > +
> > +A device MUST maintain its state while suspended such that all driver
> > +visible state after resuming exactly matches driver visible state
> > +before suspending.
> > +
> > +A device MUST NOT send notifications while suspended.
> > +
> > +A device MAY operate on any buffers in its virtqueue while suspended.
>
> How is this reconsiled with state matching exactly? buffers are
> driver-visible ...

Drivers can't atomically access buffers and the PM byte, so the device
modifying a buffer while suspended is indistinguishable from the
device modifying a buffer in the window between when it is resumed by
the driver and when the driver accesses the buffer. But you are right
that the wording is contradictory. Maybe the earlier requirement could
be better phrased as:

A device MUST maintain its state while suspended such that after the
driver resumes the device, the driver can use the device as if it had
never been suspended in the first place.

> > +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
> > +
> > +A device SHOULD take steps to minimize its resource consumption while
> > +suspended, although what this involves is specific to the particular
> > +device implementation.
> > +
> > +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > +
> > +A driver MUST NOT access a suspended device's BARs corresponding to
> > +any virtio structures, except for the power management byte.
> > +
> > +A driver MAY suspend a device that has buffers in one of its
> > +virtqueues, but it MUST NOT modify any such buffers while the device
> > +is suspended.
> > +
> > +A driver MUST read from the power management byte after writing to the
> > +byte to verify that the device successfully entered the target power
> > +state.
>
> Verify how? By checking the value returned? And what should it do with the value
> does not match?

Yes, comparing the value returned to the one it just wrote. The three
options I can think of would be to abort the suspend, reset the
device, or retry. Retry only makes sense if suspend/resume might
succeed in the future. An API is really only retry-friendly if it has
a way to set a timeout, since the consumer is what should be deciding
how long to wait but can't make that decision without a timeout.
Personally, I would lean towards not allowing timeouts here, since
it's simpler. So maybe something like this:

After the driver writes a new value to the PM byte, if the device can
transition to the requested state, then subsequent reads of the PM
state byte MUST block until the transition completes. If the device
cannot transition to the requested state, it MUST immediately return
the prior value of the PM state byte for subsequent reads of the PM
state byte.

In that case, the only options are to abort the suspension or to reset
the device. That's really a policy decision of the driver, so I don't
know how it would fit into this spec.

-David


-David

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
@ 2024-02-19  6:46       ` David Stevens
  0 siblings, 0 replies; 18+ messages in thread
From: David Stevens @ 2024-02-19  6:46 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Jason Wang, virtio-comment, virtio-dev, parav

On Fri, Feb 16, 2024 at 5:56 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote:
> > Add a virtio power management PCI capability to allow drivers to suspend
> > virtio PCI devices. This allows drivers to suspend devices at the virtio
> > level before suspending them at the PCI transport layer. This allows
> > drivers to do a two phase suspend, which prevents notifications from
> > being ignored or lost if interrupts are reconfigured at the PCI
> > transport layer immediately before or after the device is put into the
> > PCI PM D3 low power state.
> > ---
> >  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 51 insertions(+)
> >
> > diff --git a/transport-pci.tex b/transport-pci.tex
> > index a5c6719ea871..ce77708a9b69 100644
> > --- a/transport-pci.tex
> > +++ b/transport-pci.tex
> > @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
> >  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
> >  /* Vendor-specific data */
> >  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> > +/* Power management configuration */
> > +#define VIRTIO_PCI_CAP_PM_CFG            10
> >  \end{lstlisting}
> >
> >          Any other value is reserved for future use.
> > @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
> >  specified by some other Virtio Structure PCI Capability
> >  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
> >
> > +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > +
> > +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
> > +driver can write to the byte to set the power state of the device,
> > +and it can read from the byte to get the current power state of the
> > +device.
> > +
> > +The valid power states are:
> > +
> > +\begin{lstlisting}
> > +/* Device is operating normally */
> > +#define VIRTIO_PM_STATE_ACTIVE    0
> > +/* Device operation is suspended */
> > +#define VIRTIO_PM_STATE_SUSPENDED 1
> > +\end{lstlisting}
> > +
> > +The device power state has no effect when \field{device status} does
> > +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
> > +set.
>
>
> Given this is only after DRIVER_OK, wouldn't a feature bit + status bit
> make more sense? This will make it transport-independent and simplify
> discovery.

A feature bit + status bit would work as well. I've posted some
questions on Zhu Lingshan's patch.

> > +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > +
> > +A device MUST maintain its state while suspended such that all driver
> > +visible state after resuming exactly matches driver visible state
> > +before suspending.
> > +
> > +A device MUST NOT send notifications while suspended.
> > +
> > +A device MAY operate on any buffers in its virtqueue while suspended.
>
> How is this reconsiled with state matching exactly? buffers are
> driver-visible ...

Drivers can't atomically access buffers and the PM byte, so the device
modifying a buffer while suspended is indistinguishable from the
device modifying a buffer in the window between when it is resumed by
the driver and when the driver accesses the buffer. But you are right
that the wording is contradictory. Maybe the earlier requirement could
be better phrased as:

A device MUST maintain its state while suspended such that after the
driver resumes the device, the driver can use the device as if it had
never been suspended in the first place.

> > +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
> > +
> > +A device SHOULD take steps to minimize its resource consumption while
> > +suspended, although what this involves is specific to the particular
> > +device implementation.
> > +
> > +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > +
> > +A driver MUST NOT access a suspended device's BARs corresponding to
> > +any virtio structures, except for the power management byte.
> > +
> > +A driver MAY suspend a device that has buffers in one of its
> > +virtqueues, but it MUST NOT modify any such buffers while the device
> > +is suspended.
> > +
> > +A driver MUST read from the power management byte after writing to the
> > +byte to verify that the device successfully entered the target power
> > +state.
>
> Verify how? By checking the value returned? And what should it do with the value
> does not match?

Yes, comparing the value returned to the one it just wrote. The three
options I can think of would be to abort the suspend, reset the
device, or retry. Retry only makes sense if suspend/resume might
succeed in the future. An API is really only retry-friendly if it has
a way to set a timeout, since the consumer is what should be deciding
how long to wait but can't make that decision without a timeout.
Personally, I would lean towards not allowing timeouts here, since
it's simpler. So maybe something like this:

After the driver writes a new value to the PM byte, if the device can
transition to the requested state, then subsequent reads of the PM
state byte MUST block until the transition completes. If the device
cannot transition to the requested state, it MUST immediately return
the prior value of the PM state byte for subsequent reads of the PM
state byte.

In that case, the only options are to abort the suspension or to reset
the device. That's really a policy decision of the driver, so I don't
know how it would fit into this spec.

-David


-David

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
  2024-02-19  6:46       ` [virtio-comment] " David Stevens
@ 2024-02-19  7:42         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2024-02-19  7:42 UTC (permalink / raw)
  To: David Stevens; +Cc: Jason Wang, virtio-comment, virtio-dev, parav

On Mon, Feb 19, 2024 at 03:46:37PM +0900, David Stevens wrote:
> On Fri, Feb 16, 2024 at 5:56 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote:
> > > Add a virtio power management PCI capability to allow drivers to suspend
> > > virtio PCI devices. This allows drivers to suspend devices at the virtio
> > > level before suspending them at the PCI transport layer. This allows
> > > drivers to do a two phase suspend, which prevents notifications from
> > > being ignored or lost if interrupts are reconfigured at the PCI
> > > transport layer immediately before or after the device is put into the
> > > PCI PM D3 low power state.
> > > ---
> > >  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 51 insertions(+)
> > >
> > > diff --git a/transport-pci.tex b/transport-pci.tex
> > > index a5c6719ea871..ce77708a9b69 100644
> > > --- a/transport-pci.tex
> > > +++ b/transport-pci.tex
> > > @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
> > >  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
> > >  /* Vendor-specific data */
> > >  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> > > +/* Power management configuration */
> > > +#define VIRTIO_PCI_CAP_PM_CFG            10
> > >  \end{lstlisting}
> > >
> > >          Any other value is reserved for future use.
> > > @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
> > >  specified by some other Virtio Structure PCI Capability
> > >  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
> > >
> > > +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > > +
> > > +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
> > > +driver can write to the byte to set the power state of the device,
> > > +and it can read from the byte to get the current power state of the
> > > +device.
> > > +
> > > +The valid power states are:
> > > +
> > > +\begin{lstlisting}
> > > +/* Device is operating normally */
> > > +#define VIRTIO_PM_STATE_ACTIVE    0
> > > +/* Device operation is suspended */
> > > +#define VIRTIO_PM_STATE_SUSPENDED 1
> > > +\end{lstlisting}
> > > +
> > > +The device power state has no effect when \field{device status} does
> > > +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
> > > +set.
> >
> >
> > Given this is only after DRIVER_OK, wouldn't a feature bit + status bit
> > make more sense? This will make it transport-independent and simplify
> > discovery.
> 
> A feature bit + status bit would work as well. I've posted some
> questions on Zhu Lingshan's patch.


Thanks! I like it that your patch is more specific but maybe something
can be figure out to get the best of both worlds.

> > > +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > > +
> > > +A device MUST maintain its state while suspended such that all driver
> > > +visible state after resuming exactly matches driver visible state
> > > +before suspending.
> > > +
> > > +A device MUST NOT send notifications while suspended.
> > > +
> > > +A device MAY operate on any buffers in its virtqueue while suspended.
> >
> > How is this reconsiled with state matching exactly? buffers are
> > driver-visible ...
> 
> Drivers can't atomically access buffers and the PM byte, so the device
> modifying a buffer while suspended is indistinguishable from the
> device modifying a buffer in the window between when it is resumed by
> the driver and when the driver accesses the buffer. But you are right
> that the wording is contradictory. Maybe the earlier requirement could
> be better phrased as:
> 
> A device MUST maintain its state while suspended such that after the
> driver resumes the device, the driver can use the device as if it had
> never been suspended in the first place.


I think you wording is fine, just make it clear how is the
contradiction resolved. E.g. exactly matches ... with the
exception of using buffers available in one of the virtqueues.

> > > +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
> > > +
> > > +A device SHOULD take steps to minimize its resource consumption while
> > > +suspended, although what this involves is specific to the particular
> > > +device implementation.
> > > +
> > > +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > > +
> > > +A driver MUST NOT access a suspended device's BARs corresponding to
> > > +any virtio structures, except for the power management byte.
> > > +
> > > +A driver MAY suspend a device that has buffers in one of its
> > > +virtqueues, but it MUST NOT modify any such buffers while the device
> > > +is suspended.
> > > +
> > > +A driver MUST read from the power management byte after writing to the
> > > +byte to verify that the device successfully entered the target power
> > > +state.
> >
> > Verify how? By checking the value returned? And what should it do with the value
> > does not match?
> 
> Yes, comparing the value returned to the one it just wrote. The three
> options I can think of would be to abort the suspend, reset the
> device, or retry. Retry only makes sense if suspend/resume might
> succeed in the future.

Well actually there is a reason to retry even without a timeout -
has to do with pci rules which limit how quickly device has
to respond to a read. So if you want to implement suspend
in firmware and not be bound to any timing limits you need
retry in software.


> An API is really only retry-friendly if it has
> a way to set a timeout, since the consumer is what should be deciding
> how long to wait but can't make that decision without a timeout.
> Personally, I would lean towards not allowing timeouts here, since
> it's simpler. So maybe something like this:
> 
> After the driver writes a new value to the PM byte, if the device can
> transition to the requested state, then subsequent reads of the PM
> state byte MUST block until the transition completes. If the device
> cannot transition to the requested state, it MUST immediately return
> the prior value of the PM state byte for subsequent reads of the PM
> state byte.

PCI has timing limitations on how long reads can block though.
So that could be a reason to retry.


> In that case, the only options are to abort the suspension or to reset
> the device. That's really a policy decision of the driver, so I don't
> know how it would fit into this spec.
> 
> -David
> 
> 
> -David


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
@ 2024-02-19  7:42         ` Michael S. Tsirkin
  0 siblings, 0 replies; 18+ messages in thread
From: Michael S. Tsirkin @ 2024-02-19  7:42 UTC (permalink / raw)
  To: David Stevens; +Cc: Jason Wang, virtio-comment, virtio-dev, parav

On Mon, Feb 19, 2024 at 03:46:37PM +0900, David Stevens wrote:
> On Fri, Feb 16, 2024 at 5:56 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote:
> > > Add a virtio power management PCI capability to allow drivers to suspend
> > > virtio PCI devices. This allows drivers to suspend devices at the virtio
> > > level before suspending them at the PCI transport layer. This allows
> > > drivers to do a two phase suspend, which prevents notifications from
> > > being ignored or lost if interrupts are reconfigured at the PCI
> > > transport layer immediately before or after the device is put into the
> > > PCI PM D3 low power state.
> > > ---
> > >  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 51 insertions(+)
> > >
> > > diff --git a/transport-pci.tex b/transport-pci.tex
> > > index a5c6719ea871..ce77708a9b69 100644
> > > --- a/transport-pci.tex
> > > +++ b/transport-pci.tex
> > > @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option
> > >  #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8
> > >  /* Vendor-specific data */
> > >  #define VIRTIO_PCI_CAP_VENDOR_CFG        9
> > > +/* Power management configuration */
> > > +#define VIRTIO_PCI_CAP_PM_CFG            10
> > >  \end{lstlisting}
> > >
> > >          Any other value is reserved for future use.
> > > @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O
> > >  specified by some other Virtio Structure PCI Capability
> > >  of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
> > >
> > > +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > > +
> > > +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The
> > > +driver can write to the byte to set the power state of the device,
> > > +and it can read from the byte to get the current power state of the
> > > +device.
> > > +
> > > +The valid power states are:
> > > +
> > > +\begin{lstlisting}
> > > +/* Device is operating normally */
> > > +#define VIRTIO_PM_STATE_ACTIVE    0
> > > +/* Device operation is suspended */
> > > +#define VIRTIO_PM_STATE_SUSPENDED 1
> > > +\end{lstlisting}
> > > +
> > > +The device power state has no effect when \field{device status} does
> > > +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit
> > > +set.
> >
> >
> > Given this is only after DRIVER_OK, wouldn't a feature bit + status bit
> > make more sense? This will make it transport-independent and simplify
> > discovery.
> 
> A feature bit + status bit would work as well. I've posted some
> questions on Zhu Lingshan's patch.


Thanks! I like it that your patch is more specific but maybe something
can be figure out to get the best of both worlds.

> > > +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > > +
> > > +A device MUST maintain its state while suspended such that all driver
> > > +visible state after resuming exactly matches driver visible state
> > > +before suspending.
> > > +
> > > +A device MUST NOT send notifications while suspended.
> > > +
> > > +A device MAY operate on any buffers in its virtqueue while suspended.
> >
> > How is this reconsiled with state matching exactly? buffers are
> > driver-visible ...
> 
> Drivers can't atomically access buffers and the PM byte, so the device
> modifying a buffer while suspended is indistinguishable from the
> device modifying a buffer in the window between when it is resumed by
> the driver and when the driver accesses the buffer. But you are right
> that the wording is contradictory. Maybe the earlier requirement could
> be better phrased as:
> 
> A device MUST maintain its state while suspended such that after the
> driver resumes the device, the driver can use the device as if it had
> never been suspended in the first place.


I think you wording is fine, just make it clear how is the
contradiction resolved. E.g. exactly matches ... with the
exception of using buffers available in one of the virtqueues.

> > > +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset.
> > > +
> > > +A device SHOULD take steps to minimize its resource consumption while
> > > +suspended, although what this involves is specific to the particular
> > > +device implementation.
> > > +
> > > +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability}
> > > +
> > > +A driver MUST NOT access a suspended device's BARs corresponding to
> > > +any virtio structures, except for the power management byte.
> > > +
> > > +A driver MAY suspend a device that has buffers in one of its
> > > +virtqueues, but it MUST NOT modify any such buffers while the device
> > > +is suspended.
> > > +
> > > +A driver MUST read from the power management byte after writing to the
> > > +byte to verify that the device successfully entered the target power
> > > +state.
> >
> > Verify how? By checking the value returned? And what should it do with the value
> > does not match?
> 
> Yes, comparing the value returned to the one it just wrote. The three
> options I can think of would be to abort the suspend, reset the
> device, or retry. Retry only makes sense if suspend/resume might
> succeed in the future.

Well actually there is a reason to retry even without a timeout -
has to do with pci rules which limit how quickly device has
to respond to a read. So if you want to implement suspend
in firmware and not be bound to any timing limits you need
retry in software.


> An API is really only retry-friendly if it has
> a way to set a timeout, since the consumer is what should be deciding
> how long to wait but can't make that decision without a timeout.
> Personally, I would lean towards not allowing timeouts here, since
> it's simpler. So maybe something like this:
> 
> After the driver writes a new value to the PM byte, if the device can
> transition to the requested state, then subsequent reads of the PM
> state byte MUST block until the transition completes. If the device
> cannot transition to the requested state, it MUST immediately return
> the prior value of the PM state byte for subsequent reads of the PM
> state byte.

PCI has timing limitations on how long reads can block though.
So that could be a reason to retry.


> In that case, the only options are to abort the suspension or to reset
> the device. That's really a policy decision of the driver, so I don't
> know how it would fit into this spec.
> 
> -David
> 
> 
> -David


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
  2024-02-16  8:24   ` [virtio-comment] " David Stevens
@ 2024-02-26  6:42     ` Jason Wang
  -1 siblings, 0 replies; 18+ messages in thread
From: Jason Wang @ 2024-02-26  6:42 UTC (permalink / raw)
  To: David Stevens; +Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav

On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote:
>
> Add a virtio power management PCI capability to allow drivers to suspend
> virtio PCI devices. This allows drivers to suspend devices at the virtio
> level before suspending them at the PCI transport layer. This allows
> drivers to do a two phase suspend, which prevents notifications from
> being ignored or lost if interrupts are reconfigured at the PCI
> transport layer immediately before or after the device is put into the
> PCI PM D3 low power state.
> ---
>  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
>

I would like to know how the two phase suspend can prevent the loss of
notification or interrupt. Or for example, if it can be workaround at
the driver level.

Thanks


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
@ 2024-02-26  6:42     ` Jason Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Jason Wang @ 2024-02-26  6:42 UTC (permalink / raw)
  To: David Stevens; +Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav

On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote:
>
> Add a virtio power management PCI capability to allow drivers to suspend
> virtio PCI devices. This allows drivers to suspend devices at the virtio
> level before suspending them at the PCI transport layer. This allows
> drivers to do a two phase suspend, which prevents notifications from
> being ignored or lost if interrupts are reconfigured at the PCI
> transport layer immediately before or after the device is put into the
> PCI PM D3 low power state.
> ---
>  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
>

I would like to know how the two phase suspend can prevent the loss of
notification or interrupt. Or for example, if it can be workaround at
the driver level.

Thanks


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
  2024-02-26  6:42     ` [virtio-comment] " Jason Wang
@ 2024-02-26  8:46       ` David Stevens
  -1 siblings, 0 replies; 18+ messages in thread
From: David Stevens @ 2024-02-26  8:46 UTC (permalink / raw)
  To: Jason Wang; +Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav

On Mon, Feb 26, 2024 at 3:42 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote:
> >
> > Add a virtio power management PCI capability to allow drivers to suspend
> > virtio PCI devices. This allows drivers to suspend devices at the virtio
> > level before suspending them at the PCI transport layer. This allows
> > drivers to do a two phase suspend, which prevents notifications from
> > being ignored or lost if interrupts are reconfigured at the PCI
> > transport layer immediately before or after the device is put into the
> > PCI PM D3 low power state.
> > ---
> >  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 51 insertions(+)
> >
>
> I would like to know how the two phase suspend can prevent the loss of
> notification or interrupt. Or for example, if it can be workaround at
> the driver level.

I'll preface this by saying I am not an expert in how Linux handles
interrupts or in PCI, so I might be missing some important
information.

Actually, after thinking about this some more, I'm not sure that the
two phase suspend is necessary per-se. Rather, I think relying only on
PCI power management is simply insufficient.

According to the kernel's system suspend and device interrupts
documentation [1], "... after the 'late' phase of device suspend there
is no legitimate reason why any interrupts from suspended devices
should trigger and if any devices have not been suspended properly
yet, it is better to block interrupts from them anyway." PCI power
management doesn't occur until the noirq phase of suspend, so that's
too late to be useful here. Other than that, as of right now, there's
nothing the driver can do to tell devices they need to stop sending
interrupts. Adding suspend support to the virtio spec would address
this deficiency.

More concretely, I ran into two problems when trying to get shallow
suspend working with only PCI power management. If there are
workarounds for this at the driver level, then that would be great.
But I can't think of any, at least.

First, there is a window between when Linux suspends interrupts with
suspend_device_irqs and when PCI power management occurs. If the
device triggers an interrupt in that window, then it won't be lost
since IRQS_PENDING gets added to the descriptor state. However, if the
originating event should have woken the guest up from sleep, then that
doesn't happen, since there is no way for the device to know it
actually needed to generate a PCI PME. I might be missing something,
but this seems like something that can't be worked around at the
driver level.

Second, the core interrupt code can migrate interrupts when CPUs are
offline'ed entering shallow suspend. Since PCI devices are already in
D3 at this point, __pci_write_msi_msg does not update the MSI routing.
When bringing the device back up, there is a window between when the
device is put into D0 and when the updated routing tables are written
by __pci_restore_msix_state. If the device sends an interrupt in this
window, then it will be sent to the completely wrong handler in the
guest (or dropped if there is no handler installed). I guess this
could be worked around in the core PCI code by masking MSIX interrupts
before suspending the device. However, the fact that Linux doesn't
already do that today suggests that we're wrong, rather than everybody
else happens to be wrong in just the right way.

[1] https://www.kernel.org/doc/Documentation/power/suspend-and-interrupts.rst

-David

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-dev] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
@ 2024-02-26  8:46       ` David Stevens
  0 siblings, 0 replies; 18+ messages in thread
From: David Stevens @ 2024-02-26  8:46 UTC (permalink / raw)
  To: Jason Wang; +Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav

On Mon, Feb 26, 2024 at 3:42 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote:
> >
> > Add a virtio power management PCI capability to allow drivers to suspend
> > virtio PCI devices. This allows drivers to suspend devices at the virtio
> > level before suspending them at the PCI transport layer. This allows
> > drivers to do a two phase suspend, which prevents notifications from
> > being ignored or lost if interrupts are reconfigured at the PCI
> > transport layer immediately before or after the device is put into the
> > PCI PM D3 low power state.
> > ---
> >  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 51 insertions(+)
> >
>
> I would like to know how the two phase suspend can prevent the loss of
> notification or interrupt. Or for example, if it can be workaround at
> the driver level.

I'll preface this by saying I am not an expert in how Linux handles
interrupts or in PCI, so I might be missing some important
information.

Actually, after thinking about this some more, I'm not sure that the
two phase suspend is necessary per-se. Rather, I think relying only on
PCI power management is simply insufficient.

According to the kernel's system suspend and device interrupts
documentation [1], "... after the 'late' phase of device suspend there
is no legitimate reason why any interrupts from suspended devices
should trigger and if any devices have not been suspended properly
yet, it is better to block interrupts from them anyway." PCI power
management doesn't occur until the noirq phase of suspend, so that's
too late to be useful here. Other than that, as of right now, there's
nothing the driver can do to tell devices they need to stop sending
interrupts. Adding suspend support to the virtio spec would address
this deficiency.

More concretely, I ran into two problems when trying to get shallow
suspend working with only PCI power management. If there are
workarounds for this at the driver level, then that would be great.
But I can't think of any, at least.

First, there is a window between when Linux suspends interrupts with
suspend_device_irqs and when PCI power management occurs. If the
device triggers an interrupt in that window, then it won't be lost
since IRQS_PENDING gets added to the descriptor state. However, if the
originating event should have woken the guest up from sleep, then that
doesn't happen, since there is no way for the device to know it
actually needed to generate a PCI PME. I might be missing something,
but this seems like something that can't be worked around at the
driver level.

Second, the core interrupt code can migrate interrupts when CPUs are
offline'ed entering shallow suspend. Since PCI devices are already in
D3 at this point, __pci_write_msi_msg does not update the MSI routing.
When bringing the device back up, there is a window between when the
device is put into D0 and when the updated routing tables are written
by __pci_restore_msix_state. If the device sends an interrupt in this
window, then it will be sent to the completely wrong handler in the
guest (or dropped if there is no handler installed). I guess this
could be worked around in the core PCI code by masking MSIX interrupts
before suspending the device. However, the fact that Linux doesn't
already do that today suggests that we're wrong, rather than everybody
else happens to be wrong in just the right way.

[1] https://www.kernel.org/doc/Documentation/power/suspend-and-interrupts.rst

-David

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [virtio-dev] Re: [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
  2024-02-26  8:46       ` [virtio-dev] " David Stevens
@ 2024-02-26 10:23         ` Zhu, Lingshan
  -1 siblings, 0 replies; 18+ messages in thread
From: Zhu, Lingshan @ 2024-02-26 10:23 UTC (permalink / raw)
  To: David Stevens, Jason Wang
  Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav



On 2/26/2024 4:46 PM, David Stevens wrote:
> On Mon, Feb 26, 2024 at 3:42 PM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote:
>>> Add a virtio power management PCI capability to allow drivers to suspend
>>> virtio PCI devices. This allows drivers to suspend devices at the virtio
>>> level before suspending them at the PCI transport layer. This allows
>>> drivers to do a two phase suspend, which prevents notifications from
>>> being ignored or lost if interrupts are reconfigured at the PCI
>>> transport layer immediately before or after the device is put into the
>>> PCI PM D3 low power state.
>>> ---
>>>   transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 51 insertions(+)
>>>
>> I would like to know how the two phase suspend can prevent the loss of
>> notification or interrupt. Or for example, if it can be workaround at
>> the driver level.
> I'll preface this by saying I am not an expert in how Linux handles
> interrupts or in PCI, so I might be missing some important
> information.
>
> Actually, after thinking about this some more, I'm not sure that the
> two phase suspend is necessary per-se. Rather, I think relying only on
> PCI power management is simply insufficient.
>
> According to the kernel's system suspend and device interrupts
> documentation [1], "... after the 'late' phase of device suspend there
> is no legitimate reason why any interrupts from suspended devices
> should trigger and if any devices have not been suspended properly
> yet, it is better to block interrupts from them anyway." PCI power
> management doesn't occur until the noirq phase of suspend, so that's
> too late to be useful here. Other than that, as of right now, there's
> nothing the driver can do to tell devices they need to stop sending
> interrupts. Adding suspend support to the virtio spec would address
> this deficiency.
I think the SUSPEND bit which is a virtio level suspending can help here.
>
> More concretely, I ran into two problems when trying to get shallow
> suspend working with only PCI power management. If there are
> workarounds for this at the driver level, then that would be great.
> But I can't think of any, at least.
>
> First, there is a window between when Linux suspends interrupts with
> suspend_device_irqs and when PCI power management occurs. If the
> device triggers an interrupt in that window, then it won't be lost
> since IRQS_PENDING gets added to the descriptor state. However, if the
> originating event should have woken the guest up from sleep, then that
> doesn't happen, since there is no way for the device to know it
> actually needed to generate a PCI PME. I might be missing something,
> but this seems like something that can't be worked around at the
> driver level.
>
> Second, the core interrupt code can migrate interrupts when CPUs are
> offline'ed entering shallow suspend. Since PCI devices are already in
> D3 at this point, __pci_write_msi_msg does not update the MSI routing.
> When bringing the device back up, there is a window between when the
> device is put into D0 and when the updated routing tables are written
> by __pci_restore_msix_state. If the device sends an interrupt in this
> window, then it will be sent to the completely wrong handler in the
> guest (or dropped if there is no handler installed). I guess this
> could be worked around in the core PCI code by masking MSIX interrupts
> before suspending the device. However, the fact that Linux doesn't
> already do that today suggests that we're wrong, rather than everybody
> else happens to be wrong in just the right way.
I am not a PCI expert, please correct me if I misunderstand anything.

MSI interrupt are in-band DMA writing, so the data is always secure,
and CPU should process them anyway even after waking up, the handler is
always there since the driver register it.

The device should not send any interrupts(both vq and config) once it
presents SUSPEND.

For PM, I think we should not SUSPEND a PCI device by PM,
PM should be supplementary steps for SUSPENDING.

While we are working on this patch, I will address the comments for
the SUSPENDING bit patch and work our a V2 to wider review and more
comments. Finally we will merge into a series.

Thanks
>
> [1] https://www.kernel.org/doc/Documentation/power/suspend-and-interrupts.rst
>
> -David
>
> This publicly archived list offers a means to provide input to the
>
> OASIS Virtual I/O Device (VIRTIO) TC.
>
>
>
> In order to verify user consent to the Feedback License terms and
>
> to minimize spam in the list archive, subscription is required
>
> before posting.
>
>
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>
> List help: virtio-comment-help@lists.oasis-open.org
>
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>
> Committee: https://www.oasis-open.org/committees/virtio/
>
> Join OASIS: https://www.oasis-open.org/join/
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
@ 2024-02-26 10:23         ` Zhu, Lingshan
  0 siblings, 0 replies; 18+ messages in thread
From: Zhu, Lingshan @ 2024-02-26 10:23 UTC (permalink / raw)
  To: David Stevens, Jason Wang
  Cc: Michael S . Tsirkin, virtio-comment, virtio-dev, parav



On 2/26/2024 4:46 PM, David Stevens wrote:
> On Mon, Feb 26, 2024 at 3:42 PM Jason Wang <jasowang@redhat.com> wrote:
>> On Fri, Feb 16, 2024 at 4:25 PM David Stevens <stevensd@chromium.org> wrote:
>>> Add a virtio power management PCI capability to allow drivers to suspend
>>> virtio PCI devices. This allows drivers to suspend devices at the virtio
>>> level before suspending them at the PCI transport layer. This allows
>>> drivers to do a two phase suspend, which prevents notifications from
>>> being ignored or lost if interrupts are reconfigured at the PCI
>>> transport layer immediately before or after the device is put into the
>>> PCI PM D3 low power state.
>>> ---
>>>   transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 51 insertions(+)
>>>
>> I would like to know how the two phase suspend can prevent the loss of
>> notification or interrupt. Or for example, if it can be workaround at
>> the driver level.
> I'll preface this by saying I am not an expert in how Linux handles
> interrupts or in PCI, so I might be missing some important
> information.
>
> Actually, after thinking about this some more, I'm not sure that the
> two phase suspend is necessary per-se. Rather, I think relying only on
> PCI power management is simply insufficient.
>
> According to the kernel's system suspend and device interrupts
> documentation [1], "... after the 'late' phase of device suspend there
> is no legitimate reason why any interrupts from suspended devices
> should trigger and if any devices have not been suspended properly
> yet, it is better to block interrupts from them anyway." PCI power
> management doesn't occur until the noirq phase of suspend, so that's
> too late to be useful here. Other than that, as of right now, there's
> nothing the driver can do to tell devices they need to stop sending
> interrupts. Adding suspend support to the virtio spec would address
> this deficiency.
I think the SUSPEND bit which is a virtio level suspending can help here.
>
> More concretely, I ran into two problems when trying to get shallow
> suspend working with only PCI power management. If there are
> workarounds for this at the driver level, then that would be great.
> But I can't think of any, at least.
>
> First, there is a window between when Linux suspends interrupts with
> suspend_device_irqs and when PCI power management occurs. If the
> device triggers an interrupt in that window, then it won't be lost
> since IRQS_PENDING gets added to the descriptor state. However, if the
> originating event should have woken the guest up from sleep, then that
> doesn't happen, since there is no way for the device to know it
> actually needed to generate a PCI PME. I might be missing something,
> but this seems like something that can't be worked around at the
> driver level.
>
> Second, the core interrupt code can migrate interrupts when CPUs are
> offline'ed entering shallow suspend. Since PCI devices are already in
> D3 at this point, __pci_write_msi_msg does not update the MSI routing.
> When bringing the device back up, there is a window between when the
> device is put into D0 and when the updated routing tables are written
> by __pci_restore_msix_state. If the device sends an interrupt in this
> window, then it will be sent to the completely wrong handler in the
> guest (or dropped if there is no handler installed). I guess this
> could be worked around in the core PCI code by masking MSIX interrupts
> before suspending the device. However, the fact that Linux doesn't
> already do that today suggests that we're wrong, rather than everybody
> else happens to be wrong in just the right way.
I am not a PCI expert, please correct me if I misunderstand anything.

MSI interrupt are in-band DMA writing, so the data is always secure,
and CPU should process them anyway even after waking up, the handler is
always there since the driver register it.

The device should not send any interrupts(both vq and config) once it
presents SUSPEND.

For PM, I think we should not SUSPEND a PCI device by PM,
PM should be supplementary steps for SUSPENDING.

While we are working on this patch, I will address the comments for
the SUSPENDING bit patch and work our a V2 to wider review and more
comments. Finally we will merge into a series.

Thanks
>
> [1] https://www.kernel.org/doc/Documentation/power/suspend-and-interrupts.rst
>
> -David
>
> This publicly archived list offers a means to provide input to the
>
> OASIS Virtual I/O Device (VIRTIO) TC.
>
>
>
> In order to verify user consent to the Feedback License terms and
>
> to minimize spam in the list archive, subscription is required
>
> before posting.
>
>
>
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
>
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
>
> List help: virtio-comment-help@lists.oasis-open.org
>
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
>
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
>
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
>
> Committee: https://www.oasis-open.org/committees/virtio/
>
> Join OASIS: https://www.oasis-open.org/join/
>


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-02-26 10:23 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-16  8:24 [virtio-dev] [PATCH v4 0/1] Define a low power mode for devices David Stevens
2024-02-16  8:24 ` [virtio-comment] " David Stevens
2024-02-16  8:24 ` [virtio-dev] [PATCH v4 1/1] Add suspend support for virtio PCI devices David Stevens
2024-02-16  8:24   ` [virtio-comment] " David Stevens
2024-02-16  8:56   ` [virtio-dev] " Michael S. Tsirkin
2024-02-16  8:56     ` [virtio-comment] " Michael S. Tsirkin
2024-02-18 10:59     ` [virtio-dev] " Zhu, Lingshan
2024-02-18 10:59       ` [virtio-comment] " Zhu, Lingshan
2024-02-19  6:46     ` David Stevens
2024-02-19  6:46       ` [virtio-comment] " David Stevens
2024-02-19  7:42       ` [virtio-dev] " Michael S. Tsirkin
2024-02-19  7:42         ` [virtio-comment] " Michael S. Tsirkin
2024-02-26  6:42   ` [virtio-dev] " Jason Wang
2024-02-26  6:42     ` [virtio-comment] " Jason Wang
2024-02-26  8:46     ` David Stevens
2024-02-26  8:46       ` [virtio-dev] " David Stevens
2024-02-26 10:23       ` [virtio-dev] Re: [virtio-comment] " Zhu, Lingshan
2024-02-26 10:23         ` Zhu, Lingshan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.