* [virtio-dev] [PATCH v6 0/1] Add new feature VIRTIO_F_PRESERVE_RESOURCES
From: Jiqian Chen @ 2023-10-21  3:51 UTC (permalink / raw)
  To: Michael S . Tsirkin, Gerd Hoffmann, Parav Pandit, Jason Wang,
	Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev
  Cc: Honglei Huang, Julia Zhang, Huang Rui, Jiqian Chen

Hi all,
This is version 6. Thanks to Michael S. Tsirkin, Parav Pandit and Jason Wang
for their suggestions. This version makes the following changes:
* Declare a new feature bit
* Add a conformance description
* Promote the mechanism to the virtio device level
* Replace the unsuitable wording: use "preserve resources" instead of
  "freeze mode" to avoid confusion with freezing. This patch protects
  resources that cannot be re-created but are still needed; it does not
  trigger any suspend or freeze behavior.

Best regards,
Jiqian Chen

v5:
Makes the following changes:
* This series adds a mechanism that lets virtio-gpu and QEMU negotiate
  their reset behavior, and reviewers asked me to raise this mechanism to
  the virtio PCI level so that other virtio devices can also benefit from
  it. So instead of adding a new feature flag VIRTIO_GPU_F_FREEZE_S3 that
  only serves virtio-gpu, v5 adds a new field named freeze_mode to struct
  virtio_pci_common_cfg. When the guest begins suspending, it sets
  freeze_mode to VIRTIO_PCI_FREEZE_MODE_FREEZE_S3; all virtio devices can
  then see this status, notice that the guest is suspending, and change
  their reset behavior accordingly (a sketch of this idea follows the
  links below).
V5 of Qemu patch:
https://lore.kernel.org/qemu-devel/20230919110225.2282914-1-Jiqian.Chen@amd.com/T/#t
V5 of kernel patch:
https://lore.kernel.org/lkml/20230919104607.2282248-1-Jiqian.Chen@amd.com/T/#t
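
Editor's note: a minimal C sketch of the v5 idea above. The field name
freeze_mode, the enum values, and the write-on-suspend flow are taken from
the v5 proposal (which was never merged), so treat all names here as
assumptions rather than a real interface.

/* Sketch of the v5 proposal: the guest driver writes freeze_mode in the
 * PCI common configuration before entering S3 so that every virtio
 * device learns that the guest is suspending. */
#include <stdint.h>

enum virtio_pci_freeze_mode {
    VIRTIO_PCI_FREEZE_MODE_UNFREEZE  = 0,
    VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 = 1,
};

struct virtio_pci_common_cfg_v5 {
    /* ... existing common configuration fields ... */
    uint16_t freeze_mode; /* read-write for the driver (little-endian) */
};

/* Called from the driver's suspend path before the guest enters S3. */
static void virtio_pci_notify_freeze(volatile struct virtio_pci_common_cfg_v5 *cfg)
{
    cfg->freeze_mode = VIRTIO_PCI_FREEZE_MODE_FREEZE_S3;
}

/* Called from the resume path once the guest is back. */
static void virtio_pci_notify_unfreeze(volatile struct virtio_pci_common_cfg_v5 *cfg)
{
    cfg->freeze_mode = VIRTIO_PCI_FREEZE_MODE_UNFREEZE;
}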


v4:
No v4 spec patches.
V4 of Qemu patch:
https://lore.kernel.org/qemu-devel/20230719074726.1613088-1-Jiqian.Chen@amd.com/T/#t
No v4 kernel patch.


v3:
Makes the following changes:
* Use an enum for the freeze mode, so it can be extended with more
  modes in the future.
* Rename functions and parameters with an "_S3" suffix.
* Explain things in more detail.
Link:
https://lists.oasis-open.org/archives/virtio-comment/202307/msg00209.html
V3 of Qemu patch:
https://lore.kernel.org/qemu-devel/20230720120816.8751-1-Jiqian.Chen@amd.com
V3 of Kernel patch: https://lore.kernel.org/lkml/20230720115805.8206-1-Jiqian.Chen@amd.com/T/#t


v2:
Makes the following changes:
* Elaborate on the types of resources.
* Add some descriptions for S3 and S4.
Link:
https://lists.oasis-open.org/archives/virtio-comment/202307/msg00160.html
V2 of Qemu patch:
https://lore.kernel.org/qemu-devel/20230630070016.841459-1-Jiqian.Chen@amd.com/T/#t
V2 of Kernel patch:
https://lore.kernel.org/lkml/20230630073448.842767-1-Jiqian.Chen@amd.com/T/#t


v1:
Hi all,
I am working to implement virtgpu S3 function on Xen.

Currently on Xen, if we start a guest through QEMU with virtio-gpu enabled and then suspend and resume the guest (S3), the guest kernel comes back, but the display doesn't. It just shows a black screen.

That is because while the guest was suspending, it called into QEMU, and QEMU destroyed all resources and reset the renderer. This made the display disappear after the guest resumed.

So I add a mechanism: when the guest is suspending, it notifies QEMU, and QEMU then does not destroy resources. That helps the guest's display come back.

This was discussed with and suggested by Robert Beckett and Gerd Hoffmann on the v1 QEMU mailing-list thread. The mechanism needs cooperation between guest and host; moreover, since virtio drivers are paravirtualized by design, it is reasonable for the guest to accept some cooperation with the host to manage suspend/resume. So I request adding a new feature flag, so that guest and host can negotiate whether freezing is supported (see the sketch after the links below).
Link:
https://lists.oasis-open.org/archives/virtio-comment/202306/msg00595.html
V1 of Qemu patch:
https://lore.kernel.org/qemu-devel/20230608025655.1674357-2-Jiqian.Chen@amd.com/
V1 of Kernel patch:
https://lore.kernel.org/lkml/20230608063857.1677973-1-Jiqian.Chen@amd.com/
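
Editor's note: a minimal, self-contained sketch of the feature negotiation
mentioned above. The bit number 42 matches the VIRTIO_F_PRESERVE_RESOURCES
value proposed later in this series; the helper below is illustrative only
and is not a real driver API.

/* The guest only relies on the host preserving resources across suspend
 * when the corresponding feature bit was offered by the device and
 * accepted by the driver. */
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_PRESERVE_RESOURCES 42

static bool feature_negotiated(uint64_t device_features,
                               uint64_t driver_features, unsigned int bit)
{
    uint64_t mask = UINT64_C(1) << bit;

    /* A feature is in effect only if both sides agreed to it. */
    return (device_features & driver_features & mask) != 0;
}

/* Usage: decide at suspend time whether the display state will survive. */
static bool can_rely_on_preserved_resources(uint64_t dev_f, uint64_t drv_f)
{
    return feature_negotiated(dev_f, drv_f, VIRTIO_F_PRESERVE_RESOURCES);
}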


Jiqian Chen (1):
  content: Add new feature VIRTIO_F_PRESERVE_RESOURCES

 conformance.tex   |  2 ++
 content.tex       | 25 +++++++++++++++++++++++++
 transport-pci.tex |  6 ++++++
 3 files changed, 33 insertions(+)

-- 
2.34.1



* [virtio-dev] [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
From: Jiqian Chen @ 2023-10-21  3:51 UTC (permalink / raw)
  To: Michael S . Tsirkin, Gerd Hoffmann, Parav Pandit, Jason Wang,
	Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev
  Cc: Honglei Huang, Julia Zhang, Huang Rui, Jiqian Chen

In some scenarios, QEMU may reset or destroy resources of a virtio
device, but some of them can't be re-created, which causes problems.

For example, when we do S3 for a guest, the guest sets device_status to
0, which causes QEMU to reset the virtio-gpu device; all render
resources of virtio-gpu are then destroyed. As a result, after the
guest resumes, the display can't come back, and we only see a black
screen.

To deal with the above scenario, we need a mechanism that allows the
guest and QEMU to negotiate their behavior for resources. So this
patch adds a new feature named VIRTIO_F_PRESERVE_RESOURCES. It allows
the guest to tell QEMU when there is a need to preserve resources; the
device must preserve them until 0 is written.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
 conformance.tex   |  2 ++
 content.tex       | 25 +++++++++++++++++++++++++
 transport-pci.tex |  6 ++++++
 3 files changed, 33 insertions(+)

diff --git a/conformance.tex b/conformance.tex
index dc00e84..60cc0b1 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -91,6 +91,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{drivernormative:Basic Facilities of a Virtio Device / Packed Virtqueues / The Virtqueue Descriptor Table / Indirect Descriptors}
 \item \ref{drivernormative:Basic Facilities of a Virtio Device / Packed Virtqueues / Supplying Buffers to The Device / Updating flags}
 \item \ref{drivernormative:Basic Facilities of a Virtio Device / Packed Virtqueues / Supplying Buffers to The Device / Sending Available Buffer Notifications}
+\item \ref{drivernormative:Basic Facilities of a Virtio Device / Preserve Resources}
 \item \ref{drivernormative:General Initialization And Device Operation / Device Initialization}
 \item \ref{drivernormative:General Initialization And Device Operation / Device Cleanup}
 \item \ref{drivernormative:Reserved Feature Bits}
@@ -172,6 +173,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{devicenormative:Basic Facilities of a Virtio Device / Packed Virtqueues / The Virtqueue Descriptor Table}
 \item \ref{devicenormative:Basic Facilities of a Virtio Device / Packed Virtqueues / Scatter-Gather Support}
 \item \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory Regions}
+\item \ref{devicenormative:Basic Facilities of a Virtio Device / Preserve Resources}
 \item \ref{devicenormative:Reserved Feature Bits}
 \end{itemize}
 
diff --git a/content.tex b/content.tex
index 0a62dce..b6b1859 100644
--- a/content.tex
+++ b/content.tex
@@ -502,6 +502,27 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo
 types. It is RECOMMENDED that devices generate version 4
 UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
 
+\section{Preserve Resources}\label{sec:Basic Facilities of a Virtio Device / Preserve Resources}
+
+As virtio devices are paravirtualization devices by design.
+There are various devices resources created by sending commands
+from frontend and stored in backend.
+
+In some scenes, resources may be destroyed or reset, some of
+them can be re-created since frontend has enough information
+, but some can't. At this case, we can set \field{Preserve Resources}
+to 1 by specific transport, to prevent resources being destroyed.
+
+Which kind of resources need to be preserved and how to preserve
+resources depend on specific devices.
+
+\drivernormative{\subsection}{Preserve Resources}{Basic Facilities of a Virtio Device / Preserve resources}
+A driver SHOULD set \field{Preserve Resources} to 1 when there is a need
+to preserve resources.
+
+\devicenormative{\subsection}{Preserve Resources}{Basic Facilities of a Virtio Device / Preserve resources}
+A device MUST NOT destroy resources until \field{Preserve Resources} is 0.
+
 \input{admin.tex}
 
 \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation}
@@ -872,6 +893,10 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 	\ref{devicenormative:Basic Facilities of a Virtio Device / Feature Bits} for
 	handling features reserved for future use.
 
+  \item[VIRTIO_F_PRESERVE_RESOURCES(42)] This feature indicates
+  that the device need to preserve resources.
+  See \ref{sec:Basic Facilities of a Virtio Device / Preserve Resources}.
+
 \end{description}
 
 \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits}
diff --git a/transport-pci.tex b/transport-pci.tex
index a5c6719..f6eea65 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -325,6 +325,7 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
         /* About the administration virtqueue. */
         le16 admin_queue_index;         /* read-only for driver */
         le16 admin_queue_num;         /* read-only for driver */
+        le16 preserve_resources;        /* read-write */
 };
 \end{lstlisting}
 
@@ -428,6 +429,11 @@ \subsubsection{Common configuration structure layout}\label{sec:Virtio Transport
 	The value 0 indicates no supported administration virtqueues.
 	This field is valid only if VIRTIO_F_ADMIN_VQ has been
 	negotiated.
+
+\item[\field{preserve_resources}]
+        The driver writes this to let device preserve resources whenever driver has demands.
+        1 - device need to preserve resources which can't be re-created, until 0 is set.
+        0 - all resources can be destroyed.
 \end{description}
 
 \devicenormative{\paragraph}{Common configuration structure layout}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common configuration structure layout}
-- 
2.34.1
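
Editor's note: to make the intended driver flow concrete, here is a hedged
C sketch of how a guest driver might drive the preserve_resources field
across suspend/resume, per the semantics in the hunk above. Only the field
and its 0/1 values come from the patch; the MMIO helper and the field
offset are assumptions for illustration.

#include <stdint.h>

#define PRESERVE_RESOURCES_OFF 0x38 /* assumed offset of the new le16 field */

/* Hypothetical MMIO helper; assumes a little-endian host for brevity. */
static void mmio_write16(volatile void *base, uint32_t off, uint16_t val)
{
    *(volatile uint16_t *)((volatile uint8_t *)base + off) = val;
}

static void virtio_suspend_prepare(volatile void *common_cfg)
{
    /* 1: the device must preserve resources that can't be re-created. */
    mmio_write16(common_cfg, PRESERVE_RESOURCES_OFF, 1);
}

static void virtio_resume_complete(volatile void *common_cfg)
{
    /* 0: all resources may be destroyed again. */
    mmio_write16(common_cfg, PRESERVE_RESOURCES_OFF, 0);
}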



* [virtio-dev] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
From: Parav Pandit @ 2023-10-23  6:00 UTC (permalink / raw)
  To: Jiqian Chen, Michael S . Tsirkin, Gerd Hoffmann, Jason Wang,
	Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev
  Cc: Honglei Huang, Julia Zhang, Huang Rui

Hi Jiqian,

> From: Jiqian Chen <Jiqian.Chen@amd.com>
> Sent: Saturday, October 21, 2023 9:21 AM
> To: Michael S . Tsirkin <mst@redhat.com>; Gerd Hoffmann
> <kraxel@redhat.com>; Parav Pandit <parav@nvidia.com>; Jason Wang
> <jasowang@redhat.com>; Xuan Zhuo <xuanzhuo@linux.alibaba.com>; David
> Airlie <airlied@redhat.com>; Gurchetan Singh
> <gurchetansingh@chromium.org>; Chia-I Wu <olvaffe@gmail.com>; Marc-
> André Lureau <marcandre.lureau@gmail.com>; Robert Beckett
> <bob.beckett@collabora.com>; Mikhail Golubev-Ciuchea <Mikhail.Golubev-
> Ciuchea@opensynergy.com>; virtio-comment@lists.oasis-open.org; virtio-
> dev@lists.oasis-open.org
> Cc: Honglei Huang <Honglei1.Huang@amd.com>; Julia Zhang
> <Julia.Zhang@amd.com>; Huang Rui <Ray.Huang@amd.com>; Jiqian Chen
> <Jiqian.Chen@amd.com>
> Subject: [PATCH v6 1/1] content: Add new feature
> VIRTIO_F_PRESERVE_RESOURCES
> 
> In some scenarios, QEMU may reset or destroy resources of a virtio device, but some
> of them can't be re-created, which causes problems.
> 
It can be re-created; it is just that the guest driver has lost the previous resource values.
So, combined with the paragraph below, a better wording for the motivation would be:

Currently guest drivers destroy and re-create virtio resources during the power management suspend/resume sequence.
For example, for a PCI transport, even if the device offers a D3-to-D0 state transition, a virtio guest driver has no way to know whether
the device can store its state during the D0->D3 transition and restore the same state on the D3->D0 transition.

> For example, when we do S3 for a guest, the guest sets device_status to 0, which
> causes QEMU to reset the virtio-gpu device; all render resources of virtio-gpu
> are then destroyed. As a result, after the guest resumes, the display can't come
> back, and we only see a black screen.
> 
To make guest drivers aware of the above device capability, introduce a new feature bit indicating that capability.
This patch adds...

> To deal with the above scenario, we need a mechanism that allows the guest
> and QEMU to negotiate their behavior for resources. So this patch adds a new
> feature named VIRTIO_F_PRESERVE_RESOURCES. It allows the guest to tell QEMU
> when there is a need to preserve resources; the device must preserve them until
> 0 is written.
> 
I think this can be done without introducing the new register.
Can you please check whether the PM register itself can serve the purpose instead of a new virtio-level register?

> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
>  conformance.tex   |  2 ++
>  content.tex       | 25 +++++++++++++++++++++++++
>  transport-pci.tex |  6 ++++++
>  3 files changed, 33 insertions(+)
> 
> diff --git a/conformance.tex b/conformance.tex index dc00e84..60cc0b1
> 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -91,6 +91,7 @@ \section{Conformance Targets}\label{sec:Conformance /
> Conformance Targets}  \item \ref{drivernormative:Basic Facilities of a Virtio
> Device / Packed Virtqueues / The Virtqueue Descriptor Table / Indirect
> Descriptors}  \item \ref{drivernormative:Basic Facilities of a Virtio Device /
> Packed Virtqueues / Supplying Buffers to The Device / Updating flags}  \item
> \ref{drivernormative:Basic Facilities of a Virtio Device / Packed Virtqueues /
> Supplying Buffers to The Device / Sending Available Buffer Notifications}
> +\item \ref{drivernormative:Basic Facilities of a Virtio Device /
> +Preserve Resources}
>  \item \ref{drivernormative:General Initialization And Device Operation /
> Device Initialization}  \item \ref{drivernormative:General Initialization And
> Device Operation / Device Cleanup}  \item \ref{drivernormative:Reserved
> Feature Bits} @@ -172,6 +173,7 @@ \section{Conformance
> Targets}\label{sec:Conformance / Conformance Targets}  \item
> \ref{devicenormative:Basic Facilities of a Virtio Device / Packed Virtqueues /
> The Virtqueue Descriptor Table}  \item \ref{devicenormative:Basic Facilities of
> a Virtio Device / Packed Virtqueues / Scatter-Gather Support}  \item
> \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory
> Regions}
> +\item \ref{devicenormative:Basic Facilities of a Virtio Device /
> +Preserve Resources}
>  \item \ref{devicenormative:Reserved Feature Bits}  \end{itemize}
> 
> diff --git a/content.tex b/content.tex
> index 0a62dce..b6b1859 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -502,6 +502,27 @@ \section{Exporting Objects}\label{sec:Basic Facilities
> of a Virtio Device / Expo  types. It is RECOMMENDED that devices generate
> version 4  UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
> 
> +\section{Preserve Resources}\label{sec:Basic Facilities of a Virtio
> +Device / Preserve Resources}
> +
> +As virtio devices are paravirtualization devices by design.
This is not true and not relevant for the spec. Please remove this line.

> +There are various devices resources created by sending commands from
> +frontend and stored in backend.
> +
> +In some scenes, resources may be destroyed or reset, some of them can
> +be re-created since frontend has enough information , but some can't.
> +At this case, we can set \field{Preserve Resources} to 1 by specific
> +transport, to prevent resources being destroyed.
> +
> +Which kind of resources need to be preserved and how to preserve
> +resources depend on specific devices.
s/on specific devices/on specific device type/

> +
> +\drivernormative{\subsection}{Preserve Resources}{Basic Facilities of a
> +Virtio Device / Preserve resources} A driver SHOULD set \field{Preserve
> +Resources} to 1 when there is a need to preserve resources.
> +
> +\devicenormative{\subsection}{Preserve Resources}{Basic Facilities of a
> +Virtio Device / Preserve resources} A device MUST NOT destroy resources until
> \field{Preserve Resources} is 0.
> +
>  \input{admin.tex}
> 
>  \chapter{General Initialization And Device Operation}\label{sec:General
> Initialization And Device Operation} @@ -872,6 +893,10 @@
> \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>  	\ref{devicenormative:Basic Facilities of a Virtio Device / Feature Bits}
> for
>  	handling features reserved for future use.
> 
> +  \item[VIRTIO_F_PRESERVE_RESOURCES(42)] This feature indicates  that
> + the device need to preserve resources.
> +  See \ref{sec:Basic Facilities of a Virtio Device / Preserve Resources}.
> +
>  \end{description}
> 
>  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits} diff --
> git a/transport-pci.tex b/transport-pci.tex index a5c6719..f6eea65 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -325,6 +325,7 @@ \subsubsection{Common configuration structure
> layout}\label{sec:Virtio Transport
>          /* About the administration virtqueue. */
>          le16 admin_queue_index;         /* read-only for driver */
>          le16 admin_queue_num;         /* read-only for driver */
> +        le16 preserve_resources;        /* read-write */
Preserving these resources in the device implementation takes a finite amount of time,
possibly more than 40 ns (the time of a PCIe write TLP).
Hence this must be a polling register that indicates when preservation is done.
This will tell the guest when preservation is done, and when restoration is done, so that it can resume the upper layers.

Please refer to the queue_reset definition to learn more about such a register definition.

Let's please make sure PCIe PM-level registers are sufficient or not before deciding on the addition of this register.

>  };
>  \end{lstlisting}
> 
> @@ -428,6 +429,11 @@ \subsubsection{Common configuration structure
> layout}\label{sec:Virtio Transport
>  	The value 0 indicates no supported administration virtqueues.
>  	This field is valid only if VIRTIO_F_ADMIN_VQ has been
>  	negotiated.
> +
> +\item[\field{preserve_resources}]
> +        The driver writes this to let device preserve resources whenever driver
> has demands.
> +        1 - device need to preserve resources which can't be re-created, until 0 is
> set.
> +        0 - all resources can be destroyed.
>  \end{description}
> 
>  \devicenormative{\paragraph}{Common configuration structure layout}{Virtio
> Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common
> configuration structure layout}
> --
> 2.34.1



* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
From: Chen, Jiqian @ 2023-10-23 10:38 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S . Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2023/10/23 14:00, Parav Pandit wrote:
> Hi Jiqian,
> 
>> From: Jiqian Chen <Jiqian.Chen@amd.com>
>> Sent: Saturday, October 21, 2023 9:21 AM
>> To: Michael S . Tsirkin <mst@redhat.com>; Gerd Hoffmann
>> <kraxel@redhat.com>; Parav Pandit <parav@nvidia.com>; Jason Wang
>> <jasowang@redhat.com>; Xuan Zhuo <xuanzhuo@linux.alibaba.com>; David
>> Airlie <airlied@redhat.com>; Gurchetan Singh
>> <gurchetansingh@chromium.org>; Chia-I Wu <olvaffe@gmail.com>; Marc-
>> André Lureau <marcandre.lureau@gmail.com>; Robert Beckett
>> <bob.beckett@collabora.com>; Mikhail Golubev-Ciuchea <Mikhail.Golubev-
>> Ciuchea@opensynergy.com>; virtio-comment@lists.oasis-open.org; virtio-
>> dev@lists.oasis-open.org
>> Cc: Honglei Huang <Honglei1.Huang@amd.com>; Julia Zhang
>> <Julia.Zhang@amd.com>; Huang Rui <Ray.Huang@amd.com>; Jiqian Chen
>> <Jiqian.Chen@amd.com>
>> Subject: [PATCH v6 1/1] content: Add new feature
>> VIRTIO_F_PRESERVE_RESOURCES
>>
>> In some scenarios, QEMU may reset or destroy resources of a virtio device, but some
>> of them can't be re-created, which causes problems.
>>
> It can be re-created; it is just that the guest driver has lost the previous resource values.
> So, combined with the paragraph below, a better wording for the motivation would be:
Yes, the guest driver doesn't have enough information or resource values to re-create them. I will change my description in the next version.

> 
> Currently guest drivers destroy and re-create virtio resources during the power management suspend/resume sequence.
> For example, for a PCI transport, even if the device offers a D3-to-D0 state transition, a virtio guest driver has no way to know whether
> the device can store its state during the D0->D3 transition and restore the same state on the D3->D0 transition.
Thanks, will change in next version.

> 
>> For example, when we do S3 for a guest, the guest sets device_status to 0, which
>> causes QEMU to reset the virtio-gpu device; all render resources of virtio-gpu
>> are then destroyed. As a result, after the guest resumes, the display can't come
>> back, and we only see a black screen.
>>
> To make guest drivers aware of the above device capability, introduce a new feature bit indicating that capability.
> This patch adds...
Thanks, will change in next version.

> 
>> To deal with the above scenario, we need a mechanism that allows the guest
>> and QEMU to negotiate their behavior for resources. So this patch adds a new
>> feature named VIRTIO_F_PRESERVE_RESOURCES. It allows the guest to tell QEMU
>> when there is a need to preserve resources; the device must preserve them until
>> 0 is written.
>>
> I think this can be done without introducing the new register.
> Can you please check whether the PM register itself can serve the purpose instead of a new virtio-level register?
Do you mean the system PM register? I think it is unreasonable to have a virtio device listen to the guest system's PM state. It is more suitable for each device to get a notification from its driver and then perform the preserve-resources operation.

> 
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>>  conformance.tex   |  2 ++
>>  content.tex       | 25 +++++++++++++++++++++++++
>>  transport-pci.tex |  6 ++++++
>>  3 files changed, 33 insertions(+)
>>
>> diff --git a/conformance.tex b/conformance.tex index dc00e84..60cc0b1
>> 100644
>> --- a/conformance.tex
>> +++ b/conformance.tex
>> @@ -91,6 +91,7 @@ \section{Conformance Targets}\label{sec:Conformance /
>> Conformance Targets}  \item \ref{drivernormative:Basic Facilities of a Virtio
>> Device / Packed Virtqueues / The Virtqueue Descriptor Table / Indirect
>> Descriptors}  \item \ref{drivernormative:Basic Facilities of a Virtio Device /
>> Packed Virtqueues / Supplying Buffers to The Device / Updating flags}  \item
>> \ref{drivernormative:Basic Facilities of a Virtio Device / Packed Virtqueues /
>> Supplying Buffers to The Device / Sending Available Buffer Notifications}
>> +\item \ref{drivernormative:Basic Facilities of a Virtio Device /
>> +Preserve Resources}
>>  \item \ref{drivernormative:General Initialization And Device Operation /
>> Device Initialization}  \item \ref{drivernormative:General Initialization And
>> Device Operation / Device Cleanup}  \item \ref{drivernormative:Reserved
>> Feature Bits} @@ -172,6 +173,7 @@ \section{Conformance
>> Targets}\label{sec:Conformance / Conformance Targets}  \item
>> \ref{devicenormative:Basic Facilities of a Virtio Device / Packed Virtqueues /
>> The Virtqueue Descriptor Table}  \item \ref{devicenormative:Basic Facilities of
>> a Virtio Device / Packed Virtqueues / Scatter-Gather Support}  \item
>> \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory
>> Regions}
>> +\item \ref{devicenormative:Basic Facilities of a Virtio Device /
>> +Preserve Resources}
>>  \item \ref{devicenormative:Reserved Feature Bits}  \end{itemize}
>>
>> diff --git a/content.tex b/content.tex
>> index 0a62dce..b6b1859 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -502,6 +502,27 @@ \section{Exporting Objects}\label{sec:Basic Facilities
>> of a Virtio Device / Expo  types. It is RECOMMENDED that devices generate
>> version 4  UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
>>
>> +\section{Preserve Resources}\label{sec:Basic Facilities of a Virtio
>> +Device / Preserve Resources}
>> +
>> +As virtio devices are paravirtualization devices by design.
> This is not true and not relevant for the spec. Please remove this line.
OK, will remove this line in next version.

> 
>> +There are various devices resources created by sending commands from
>> +frontend and stored in backend.
>> +
>> +In some scenes, resources may be destroyed or reset, some of them can
>> +be re-created since frontend has enough information , but some can't.
>> +At this case, we can set \field{Preserve Resources} to 1 by specific
>> +transport, to prevent resources being destroyed.
>> +
>> +Which kind of resources need to be preserved and how to preserve
>> +resources depend on specific devices.
> s/on specific devices/on specific device type/
Thanks, will change in next version.

> 
>> +
>> +\drivernormative{\subsection}{Preserve Resources}{Basic Facilities of a
>> +Virtio Device / Preserve resources} A driver SHOULD set \field{Preserve
>> +Resources} to 1 when there is a need to preserve resources.
>> +
>> +\devicenormative{\subsection}{Preserve Resources}{Basic Facilities of a
>> +Virtio Device / Preserve resources} A device MUST NOT destroy resources until
>> \field{Preserve Resources} is 0.
>> +
>>  \input{admin.tex}
>>
>>  \chapter{General Initialization And Device Operation}\label{sec:General
>> Initialization And Device Operation} @@ -872,6 +893,10 @@
>> \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>>  	\ref{devicenormative:Basic Facilities of a Virtio Device / Feature Bits}
>> for
>>  	handling features reserved for future use.
>>
>> +  \item[VIRTIO_F_PRESERVE_RESOURCES(42)] This feature indicates  that
>> + the device need to preserve resources.
>> +  See \ref{sec:Basic Facilities of a Virtio Device / Preserve Resources}.
>> +
>>  \end{description}
>>
>>  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits} diff --
>> git a/transport-pci.tex b/transport-pci.tex index a5c6719..f6eea65 100644
>> --- a/transport-pci.tex
>> +++ b/transport-pci.tex
>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration structure
>> layout}\label{sec:Virtio Transport
>>          /* About the administration virtqueue. */
>>          le16 admin_queue_index;         /* read-only for driver */
>>          le16 admin_queue_num;         /* read-only for driver */
>> +        le16 preserve_resources;        /* read-write */
> Preserving these resources in the device implementation takes a finite amount of time,
> possibly more than 40 ns (the time of a PCIe write TLP).
> Hence this must be a polling register that indicates when preservation is done.
> This will tell the guest when preservation is done, and when restoration is done, so that it can resume the upper layers.
> 
> Please refer to the queue_reset definition to learn more about such a register definition.
Thanks, I will refer to "queue_reset". So I need three values: the driver writes 1 to ask the device to preserve resources, the driver writes 2 to ask the device to restore resources, and the device writes 0 to tell the driver that preserving or restoring is done. Am I right? (A minimal sketch of this handshake follows below.)
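
Editor's note: a minimal C sketch of the three-value handshake described
above, modeled on the queue_reset-style polling Parav suggested. The values
(1 = preserve, 2 = restore, 0 = done) are the ones proposed in this reply
and are not part of any merged spec; all names are assumptions.

#include <stdint.h>

enum preserve_cmd {
    PRESERVE_DONE  = 0, /* written by the device when finished */
    PRESERVE_START = 1, /* driver: preserve resources */
    RESTORE_START  = 2, /* driver: restore resources */
};

static uint16_t reg_read16(volatile uint16_t *reg) { return *reg; }
static void reg_write16(volatile uint16_t *reg, uint16_t v) { *reg = v; }

/* Returns once the device has acknowledged completion by writing 0. */
static void preserve_resources_sync(volatile uint16_t *preserve_reg,
                                    enum preserve_cmd cmd)
{
    reg_write16(preserve_reg, (uint16_t)cmd);
    while (reg_read16(preserve_reg) != PRESERVE_DONE)
        ; /* a real driver would sleep and time out here */
}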

> 
> Let's please make sure PCIe PM-level registers are sufficient or not before deciding on the addition of this register.
But if the device is not a PCIe device, it has no PM capability, and then this will not work. Actually, in my local environment, pci_is_express() returns false in QEMU; the devices are not PCIe devices.

> 
>>  };
>>  \end{lstlisting}
>>
>> @@ -428,6 +429,11 @@ \subsubsection{Common configuration structure
>> layout}\label{sec:Virtio Transport
>>  	The value 0 indicates no supported administration virtqueues.
>>  	This field is valid only if VIRTIO_F_ADMIN_VQ has been
>>  	negotiated.
>> +
>> +\item[\field{preserve_resources}]
>> +        The driver writes this to let device preserve resources whenever driver
>> has demands.
>> +        1 - device need to preserve resources which can't be re-created, until 0 is
>> set.
>> +        0 - all resources can be destroyed.
>>  \end{description}
>>
>>  \devicenormative{\paragraph}{Common configuration structure layout}{Virtio
>> Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common
>> configuration structure layout}
>> --
>> 2.34.1
> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-comment] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
@ 2023-10-23 10:38       ` Chen, Jiqian
  0 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2023-10-23 10:38 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S . Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2023/10/23 14:00, Parav Pandit wrote:
> Hi Jiqian,
> 
>> From: Jiqian Chen <Jiqian.Chen@amd.com>
>> Sent: Saturday, October 21, 2023 9:21 AM
>> To: Michael S . Tsirkin <mst@redhat.com>; Gerd Hoffmann
>> <kraxel@redhat.com>; Parav Pandit <parav@nvidia.com>; Jason Wang
>> <jasowang@redhat.com>; Xuan Zhuo <xuanzhuo@linux.alibaba.com>; David
>> Airlie <airlied@redhat.com>; Gurchetan Singh
>> <gurchetansingh@chromium.org>; Chia-I Wu <olvaffe@gmail.com>; Marc-
>> André Lureau <marcandre.lureau@gmail.com>; Robert Beckett
>> <bob.beckett@collabora.com>; Mikhail Golubev-Ciuchea <Mikhail.Golubev-
>> Ciuchea@opensynergy.com>; virtio-comment@lists.oasis-open.org; virtio-
>> dev@lists.oasis-open.org
>> Cc: Honglei Huang <Honglei1.Huang@amd.com>; Julia Zhang
>> <Julia.Zhang@amd.com>; Huang Rui <Ray.Huang@amd.com>; Jiqian Chen
>> <Jiqian.Chen@amd.com>
>> Subject: [PATCH v6 1/1] content: Add new feature
>> VIRTIO_F_PRESERVE_RESOURCES
>>
>> In some scenes, Qemu may reset or destroy resources of virtio device, but some
>> of them can't be re-created, so that causes some problems.
>>
> It can be re-created. It is just that the guest driver lost the previous resource values.
> So a better wording for the motivation is combining with below line to me is,
Yes, guest driver hasn't enough information or values of resources to re-create them. I will change my description in next version.

> 
> Currently guest drivers destroy and re-create virtio resources during power management suspend/resume sequence respectively.
> For example, for a PCI transport, even if the device offers D3 to d0 state transition, a virtio guest driver has no way to know if 
> the device can store the state during D0->D3 transition and same state can be restored on D3->D0 transition.
Thanks, will change in next version.

> 
>> For example, when we do S3 for guest, guest will set device_status to 0, it
>> causes Qemu to reset virtioi-gpu device, and then all render resources of virtio-
>> gpu will be destroyed. As a result, after guest resuming, the display can't come
>> back, and we only see a black screen.
>>
> To make guest drivers aware of above device capability, introduce a new feature bit to indicate such device capability.
> This patch adds...
Thanks, will change in next version.

> 
>> In order to deal with the above scene, we need a mechanism that allows guest
>> and Qemu to negotiate their behaviors for resources. So, this patch adds a new
>> feature named VIRTIO_F_PRESERVE_RESOURCES. It allows guest to tell Qemu
>> when there is a need to preserve resources, guest must preserve resources until
>> 0 is set.
>>
> I think this can be done without introducing the new register.
> Can you please check if the PM register itself can serve the purpose instead of new virtio level register?
Do you mean the system PM register? I think it is unreasonable to let virtio-device listen the PM state of Guest system. It's more suitable that each device gets notifications from driver, and then do preserving resources operation.

> 
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>>  conformance.tex   |  2 ++
>>  content.tex       | 25 +++++++++++++++++++++++++
>>  transport-pci.tex |  6 ++++++
>>  3 files changed, 33 insertions(+)
>>
>> diff --git a/conformance.tex b/conformance.tex index dc00e84..60cc0b1
>> 100644
>> --- a/conformance.tex
>> +++ b/conformance.tex
>> @@ -91,6 +91,7 @@ \section{Conformance Targets}\label{sec:Conformance /
>> Conformance Targets}  \item \ref{drivernormative:Basic Facilities of a Virtio
>> Device / Packed Virtqueues / The Virtqueue Descriptor Table / Indirect
>> Descriptors}  \item \ref{drivernormative:Basic Facilities of a Virtio Device /
>> Packed Virtqueues / Supplying Buffers to The Device / Updating flags}  \item
>> \ref{drivernormative:Basic Facilities of a Virtio Device / Packed Virtqueues /
>> Supplying Buffers to The Device / Sending Available Buffer Notifications}
>> +\item \ref{drivernormative:Basic Facilities of a Virtio Device /
>> +Preserve Resources}
>>  \item \ref{drivernormative:General Initialization And Device Operation /
>> Device Initialization}  \item \ref{drivernormative:General Initialization And
>> Device Operation / Device Cleanup}  \item \ref{drivernormative:Reserved
>> Feature Bits} @@ -172,6 +173,7 @@ \section{Conformance
>> Targets}\label{sec:Conformance / Conformance Targets}  \item
>> \ref{devicenormative:Basic Facilities of a Virtio Device / Packed Virtqueues /
>> The Virtqueue Descriptor Table}  \item \ref{devicenormative:Basic Facilities of
>> a Virtio Device / Packed Virtqueues / Scatter-Gather Support}  \item
>> \ref{devicenormative:Basic Facilities of a Virtio Device / Shared Memory
>> Regions}
>> +\item \ref{devicenormative:Basic Facilities of a Virtio Device /
>> +Preserve Resources}
>>  \item \ref{devicenormative:Reserved Feature Bits}  \end{itemize}
>>
>> diff --git a/content.tex b/content.tex
>> index 0a62dce..b6b1859 100644
>> --- a/content.tex
>> +++ b/content.tex
>> @@ -502,6 +502,27 @@ \section{Exporting Objects}\label{sec:Basic Facilities
>> of a Virtio Device / Expo  types. It is RECOMMENDED that devices generate
>> version 4  UUIDs as specified by \hyperref[intro:rfc4122]{[RFC4122]}.
>>
>> +\section{Preserve Resources}\label{sec:Basic Facilities of a Virtio
>> +Device / Preserve Resources}
>> +
>> +As virtio devices are paravirtualization devices by design.
> This is not true and not relevant for the spec. Please remove this line.
OK, will remove this line in next version.

> 
>> +There are various device resources created by sending commands from
>> +the frontend and stored in the backend.
>> +
>> +In some scenarios, resources may be destroyed or reset. Some of them can
>> +be re-created, since the frontend has enough information, but some can't.
>> +In this case, we can set \field{Preserve Resources} to 1 via a specific
>> +transport, to prevent resources from being destroyed.
>> +
>> +Which kind of resources need to be preserved and how to preserve
>> +resources depend on specific devices.
> s/on specific devices/on specific device type/
Thanks, will change in next version.

> 
>> +
>> +\drivernormative{\subsection}{Preserve Resources}{Basic Facilities of a
>> +Virtio Device / Preserve resources} A driver SHOULD set \field{Preserve
>> +Resources} to 1 when there is a need to preserve resources.
>> +
>> +\devicenormative{\subsection}{Preserve Resources}{Basic Facilities of a
>> +Virtio Device / Preserve resources} A device MUST NOT destroy resources until
>> \field{Preserve Resources} is 0.
>> +
>>  \input{admin.tex}
>>
>>  \chapter{General Initialization And Device Operation}\label{sec:General
>> Initialization And Device Operation} @@ -872,6 +893,10 @@
>> \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>>  	\ref{devicenormative:Basic Facilities of a Virtio Device / Feature Bits}
>> for
>>  	handling features reserved for future use.
>>
>> +  \item[VIRTIO_F_PRESERVE_RESOURCES(42)] This feature indicates that
>> + the device needs to preserve resources.
>> +  See \ref{sec:Basic Facilities of a Virtio Device / Preserve Resources}.
>> +
>>  \end{description}
>>
>>  \drivernormative{\section}{Reserved Feature Bits}{Reserved Feature Bits} diff --
>> git a/transport-pci.tex b/transport-pci.tex index a5c6719..f6eea65 100644
>> --- a/transport-pci.tex
>> +++ b/transport-pci.tex
>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration structure
>> layout}\label{sec:Virtio Transport
>>          /* About the administration virtqueue. */
>>          le16 admin_queue_index;         /* read-only for driver */
>>          le16 admin_queue_num;         /* read-only for driver */
>> +        le16 preserve_resources;        /* read-write */
> Preserving these resources in the device implementation takes a finite amount of time.
> Possibly more than 40nsec (the time of a PCIe write TLP).
> Hence this register must be a polling register that indicates preservation_done.
> This will tell the guest when the preservation is done, and when restoration is done, so that it can resume upper layers.
> 
> Please refer to the queue_reset definition to learn more about such a register definition.
Thanks, I will refer to "queue_reset". So, I need three values: the driver writes 1 to make the device preserve resources, the driver writes 2 to make the device restore resources, and the device writes 0 to tell the driver that preserving or restoring is done. Am I right?
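
For readers skimming the thread, a minimal driver-side sketch of that three-value handshake follows. Everything here is illustrative: the register matches the field proposed above, but the constants and the helper function are hypothetical, not part of any spec or kernel code.

/* Hypothetical values, mirroring the queue_reset style:
 * the driver writes PRESERVE (1) or RESTORE (2); the device writes
 * DONE (0) once the requested operation has completed. */
#define VIRTIO_PRESERVE_RES_DONE      0
#define VIRTIO_PRESERVE_RES_PRESERVE  1
#define VIRTIO_PRESERVE_RES_RESTORE   2

/* preserve_res points at the proposed le16 field in the common config. */
static void virtio_preserve_resources(__le16 __iomem *preserve_res, u16 op)
{
        vp_iowrite16(op, preserve_res);          /* kick off the operation */
        while (vp_ioread16(preserve_res) != VIRTIO_PRESERVE_RES_DONE)
                cpu_relax();                     /* poll for completion */
}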

> 
> Let's please make sure that the PCIe PM-level registers are sufficient/not-sufficient to decide the addition of this register.
But if the device is not a PCIe device, it doesn't have the PM capability, so this will not work. Actually, in my local environment, pci_is_express() returns false in Qemu; the devices are not PCIe devices.
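
(As an aside, whether a given device exposes the PM capability can be probed from the guest with the standard Linux PCI API; a minimal sketch, with the wrapper function being purely illustrative:)

/* pci_find_capability() returns the config-space offset of the
 * capability, or 0 if the device does not expose it. */
static bool device_has_pm_cap(struct pci_dev *pdev)
{
        return pci_find_capability(pdev, PCI_CAP_ID_PM) != 0;
}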

> 
>>  };
>>  \end{lstlisting}
>>
>> @@ -428,6 +429,11 @@ \subsubsection{Common configuration structure
>> layout}\label{sec:Virtio Transport
>>  	The value 0 indicates no supported administration virtqueues.
>>  	This field is valid only if VIRTIO_F_ADMIN_VQ has been
>>  	negotiated.
>> +
>> +\item[\field{preserve_resources}]
>> +        The driver writes this to make the device preserve resources whenever the
>> driver needs to.
>> +        1 - the device must preserve resources which can't be re-created, until 0 is
>> written.
>> +        0 - all resources can be destroyed.
>>  \end{description}
>>
>>  \devicenormative{\paragraph}{Common configuration structure layout}{Virtio
>> Transport Options / Virtio Over PCI Bus / PCI Device Layout / Common
>> configuration structure layout}
>> --
>> 2.34.1
> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-23 10:38       ` [virtio-comment] " Chen, Jiqian
@ 2023-10-23 13:35         ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2023-10-23 13:35 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S . Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Monday, October 23, 2023 4:09 PM

> > I think this can be done without introducing the new register.
> > Can you please check if the PM register itself can serve the purpose instead
> of new virtio level register?
> Do you mean the system PM register? 
No, the device's PM register at transport level.

> I think it is unreasonable to let virtio-
> device listen the PM state of Guest system. 
The guest driver performs any such work in the guest system's PM callbacks, which live in the virtio driver.

> It's more suitable that each device
> gets notifications from driver, and then do preserving resources operation.
I agree that each device gets the notification from the driver.
The question is whether it should be the virtio driver, or whether the existing pci driver, which transitions the state from d0->d3 and d3->d0, is enough.
Can you please check that?

> >> --- a/transport-pci.tex
> >> +++ b/transport-pci.tex
> >> @@ -325,6 +325,7 @@ \subsubsection{Common configuration structure
> >> layout}\label{sec:Virtio Transport
> >>          /* About the administration virtqueue. */
> >>          le16 admin_queue_index;         /* read-only for driver */
> >>          le16 admin_queue_num;         /* read-only for driver */
> >> +        le16 preserve_resources;        /* read-write */
> > Preserving these resources in the device implementation takes finite amount
> of time.
> > Possibly more than 40nsec (time of PCIe write TLP).
> > Hence this register must be a polling register to indicate that
> preservation_done.
> > This will tell the guest when the preservation is done, and when restoration is
> done, so that it can resume upper layers.
> >
> > Please refer to queue_reset definition to learn more about such register
> definition.
> Thanks, I will refer to "queue_reset". So, I need three values, driver write 1 to
> let device do preserving resources, driver write 2 to let device do restoring
> resources, device write 0 to tell driver that preserving or restoring done, am I
> right?
> 
Right.

And if the existing pcie pm state bits can do the job, we can leverage that.
If they cannot be used, let's add that reasoning to the commit log to describe this register.

> >
> > Lets please make sure that PCIe PM level registers are sufficient/not-sufficient
> to decide the addition of this register.
> But if the device is not a PCIe device, it doesn't have PM capability, then this
> will not work. Actually in my local environment, pci_is_express() return false in
> Qemu, they are not PCIe device.
It is reasonable in 2023 to ask that the device be plugged in as a PCIe device to get new functionality, all the more so for a gpu device, 😊
which does not have a very long history of backward compatibility.
Please explore whether the d0<->d3 PM state bit can be used.

Thanks a lot.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-23 13:35         ` [virtio-comment] " Parav Pandit
@ 2023-10-24 10:35           ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2023-10-24 10:35 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S . Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2023/10/23 21:35, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, October 23, 2023 4:09 PM
> 
>>> I think this can be done without introducing the new register.
>>> Can you please check if the PM register itself can serve the purpose instead
>> of new virtio level register?
>> Do you mean the system PM register? 
> No, the device's PM register at transport level.
I tried to find this register (at the pci level, the virtio pci level, or the virtio driver level), but I didn't find it in the Linux kernel or Qemu code.
May I know which register you are referring to specifically? Or which PM state bit did you mention below?

> 
>> I think it is unreasonable to let virtio-
>> device listen the PM state of Guest system. 
> Guest driver performs any work on the guest systems PM callback events in the virtio driver.
I didn't find any PM state callback in the virtio driver.

> 
>> It's more suitable that each device
>> gets notifications from driver, and then do preserving resources operation.
> I agree that each device gets the notification from driver.
> The question is, should it be virtio driver, or existing pci driver which transitions the state from d0->d3 and d3->d0 is enough.
It seems there isn't an existing pci driver that transitions the d0 or d3 state. Could you please tell me which one it is specifically? I am very willing to give it a try.

> Can you please check that?
> 
>>>> --- a/transport-pci.tex
>>>> +++ b/transport-pci.tex
>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration structure
>>>> layout}\label{sec:Virtio Transport
>>>>          /* About the administration virtqueue. */
>>>>          le16 admin_queue_index;         /* read-only for driver */
>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>> +        le16 preserve_resources;        /* read-write */
>>> Preserving these resources in the device implementation takes finite amount
>> of time.
>>> Possibly more than 40nsec (time of PCIe write TLP).
>>> Hence this register must be a polling register to indicate that
>> preservation_done.
>>> This will tell the guest when the preservation is done, and when restoration is
>> done, so that it can resume upper layers.
>>>
>>> Please refer to queue_reset definition to learn more about such register
>> definition.
>> Thanks, I will refer to "queue_reset". So, I need three values, driver write 1 to
>> let device do preserving resources, driver write 2 to let device do restoring
>> resources, device write 0 to tell driver that preserving or restoring done, am I
>> right?
>>
> Right.
> 
> And if the existing pcie pm state bits can do, we can leverage that.
> If it cannot be used, lets add that reasoning in the commit log to describe this register.
> 
>>>
>>> Lets please make sure that PCIe PM level registers are sufficient/not-sufficient
>> to decide the addition of this register.
>> But if the device is not a PCIe device, it doesn't have PM capability, then this
>> will not work. Actually in my local environment, pci_is_express() return false in
>> Qemu, they are not PCIe device.
> It is reasonable to ask to plug in as PCIe device in 2023 to get new functionality that too you mentioned a gpu device. 😊
> Which does not have very long history of any backward compatibility.
Do you suggest that I add the PM capability for virtio-gpu, or change virtio-gpu to a PCIe device?

> Please explore the d0<->d3 PM state bit if can be used.
> 
> Thanks a lot.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-24 10:35           ` [virtio-comment] " Chen, Jiqian
@ 2023-10-24 10:51             ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2023-10-24 10:51 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S . Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Tuesday, October 24, 2023 4:06 PM
> 
> On 2023/10/23 21:35, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Monday, October 23, 2023 4:09 PM
> >
> >>> I think this can be done without introducing the new register.
> >>> Can you please check if the PM register itself can serve the purpose
> >>> instead
> >> of new virtio level register?
> >> Do you mean the system PM register?
> > No, the device's PM register at transport level.
> I tried to find this register(pci level or virtio pci level or virtio driver level), but I
> didn't find it in Linux kernel or Qemu codes.
> May I know which register you are referring to specifically? Or which PM state
> bit you mentioned below?
> 
PCI spec's PCI Power Management Capability Structure in section 7.5.2.
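
For reference, that capability has the following layout; the field names and widths are from the PCI PM specification, while the C struct itself is only an illustration:

/* PCI Power Management Capability, as laid out in config space:
 *   cap_id (0x01) | next_ptr | PMC (16-bit, read-only capabilities) |
 *   PMCSR (16-bit control/status; bits 1:0 select the power state,
 *   00b = D0, 11b = D3hot) | PMCSR_BSE | Data */
struct pci_pm_cap {
        uint8_t  cap_id;     /* PCI_CAP_ID_PM == 0x01 */
        uint8_t  next_ptr;
        uint16_t pmc;        /* Power Management Capabilities */
        uint16_t pmcsr;      /* Power Management Control/Status */
        uint8_t  pmcsr_bse;  /* bridge support extensions */
        uint8_t  data;
};
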
> >
> >> I think it is unreasonable to let virtio- device listen the PM state
> >> of Guest system.
> > Guest driver performs any work on the guest systems PM callback events in
> the virtio driver.
> I didn't find any PM state callback in the virtio driver.
> 
There are virtio_suspend and virtio_resume in case of Linux.

> >
> >> It's more suitable that each device
> >> gets notifications from driver, and then do preserving resources operation.
> > I agree that each device gets the notification from driver.
> > The question is, should it be virtio driver, or existing pci driver which
> transitions the state from d0->d3 and d3->d0 is enough.
> It seems there isn't existing pci driver to transitions d0 or d3 state. Could you
> please tell me which one it is specifically? I am very willing to give a try.
> 
The virtio-pci modern driver of Linux should be able to.

> > Can you please check that?
> >
> >>>> --- a/transport-pci.tex
> >>>> +++ b/transport-pci.tex
> >>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration structure
> >>>> layout}\label{sec:Virtio Transport
> >>>>          /* About the administration virtqueue. */
> >>>>          le16 admin_queue_index;         /* read-only for driver */
> >>>>          le16 admin_queue_num;         /* read-only for driver */
> >>>> +        le16 preserve_resources;        /* read-write */
> >>> Preserving these resources in the device implementation takes finite
> >>> amount
> >> of time.
> >>> Possibly more than 40nsec (time of PCIe write TLP).
> >>> Hence this register must be a polling register to indicate that
> >> preservation_done.
> >>> This will tell the guest when the preservation is done, and when
> >>> restoration is
> >> done, so that it can resume upper layers.
> >>>
> >>> Please refer to queue_reset definition to learn more about such
> >>> register
> >> definition.
> >> Thanks, I will refer to "queue_reset". So, I need three values,
> >> driver write 1 to let device do preserving resources, driver write 2
> >> to let device do restoring resources, device write 0 to tell driver
> >> that preserving or restoring done, am I right?
> >>
> > Right.
> >
> > And if the existing pcie pm state bits can do, we can leverage that.
> > If it cannot be used, lets add that reasoning in the commit log to describe this
> register.
> >
> >>>
> >>> Lets please make sure that PCIe PM level registers are
> >>> sufficient/not-sufficient
> >> to decide the addition of this register.
> >> But if the device is not a PCIe device, it doesn't have PM
> >> capability, then this will not work. Actually in my local
> >> environment, pci_is_express() return false in Qemu, they are not PCIe
> device.
> > It is reasonable to ask to plug in as PCIe device in 2023 to get new
> > functionality that too you mentioned a gpu device. 😊
> > Which does not have very long history of any backward compatibility.
> Do you suggest me to add PM capability for virtio-gpu or change virtio-gpu to a
> PCIe device?
> 
PCI Power Management Capability Structure does not seem to be limited to PCIe.


^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-24 10:51             ` [virtio-comment] " Parav Pandit
@ 2023-10-24 12:13               ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2023-10-24 12:13 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S . Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2023/10/24 18:51, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Tuesday, October 24, 2023 4:06 PM
>>
>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>
>>>>> I think this can be done without introducing the new register.
>>>>> Can you please check if the PM register itself can serve the purpose
>>>>> instead
>>>> of new virtio level register?
>>>> Do you mean the system PM register?
>>> No, the device's PM register at transport level.
>> I tried to find this register(pci level or virtio pci level or virtio driver level), but I
>> didn't find it in Linux kernel or Qemu codes.
>> May I know which register you are referring to specifically? Or which PM state
>> bit you mentioned below?
>>
> PCI spec's PCI Power Management Capability Structure in section 7.5.2.
Yes, what you point to is the PM capability for PCIe devices.
But the problem remains that the Qemu code checks the condition (pci_bus_is_express or pci_is_express) for all virtio-pci devices in virtio_pci_realize(); if a virtio device isn't a PCIe device, Qemu will not add the PM capability for it.
And another problem is: what about the MMIO transport devices? Preserving resources is needed for devices on all transport types.
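
(For context, a simplified sketch of the kind of check being described in virtio_pci_realize(); this paraphrases the Qemu logic rather than quoting it verbatim:)

/* Sketch: PCIe-only setup, including today's PM capability wiring,
 * is skipped unless both the bus and the device are express. */
if (pci_bus_is_express(pci_get_bus(pci_dev)) && pci_is_express(pci_dev)) {
    /* ... PCIe capability, PM capability, etc. ... */
}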

>>>
>>>> I think it is unreasonable to let virtio- device listen the PM state
>>>> of Guest system.
>>> Guest driver performs any work on the guest systems PM callback events in
>> the virtio driver.
>> I didn't find any PM state callback in the virtio driver.
>>
> There are virtio_suspend and virtio_resume in case of Linux.
I think what you call virtio_suspend/resume are the freeze/restore callbacks from "struct virtio_driver", or the suspend/resume callbacks from "static const struct dev_pm_ops virtio_pci_pm_ops".
And yes, I agree: if virtio devices have the PM capability, maybe we can set the PM state in those callback functions.
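
(A minimal sketch of what that could look like, assuming the device exposes the PM capability; the function names are placeholders for illustration, not the actual virtio-pci callbacks:)

/* Sketch: move the device to D3hot on suspend and back to D0 on
 * resume, using the standard Linux PCI PM helpers. */
static int example_virtio_pci_suspend(struct device *dev)
{
        struct pci_dev *pdev = to_pci_dev(dev);

        /* ... quiesce virtqueues first ... */
        return pci_set_power_state(pdev, PCI_D3hot);
}

static int example_virtio_pci_resume(struct device *dev)
{
        struct pci_dev *pdev = to_pci_dev(dev);
        int ret = pci_set_power_state(pdev, PCI_D0);

        /* ... restore virtqueues once the device is back in D0 ... */
        return ret;
}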

> 
>>>
>>>> It's more suitable that each device
>>>> gets notifications from driver, and then do preserving resources operation.
>>> I agree that each device gets the notification from driver.
>>> The question is, should it be virtio driver, or existing pci driver which
>> transitions the state from d0->d3 and d3->d0 is enough.
>> It seems there isn't existing pci driver to transitions d0 or d3 state. Could you
>> please tell me which one it is specifically? I am very willing to give a try.
>>
> Virtio-pci modern driver of Linux should be able to.
Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But the two problems I mentioned above still remain.

> 
>>> Can you please check that?
>>>
>>>>>> --- a/transport-pci.tex
>>>>>> +++ b/transport-pci.tex
>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration structure
>>>>>> layout}\label{sec:Virtio Transport
>>>>>>          /* About the administration virtqueue. */
>>>>>>          le16 admin_queue_index;         /* read-only for driver */
>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>> Preserving these resources in the device implementation takes finite
>>>>> amount
>>>> of time.
>>>>> Possibly more than 40nsec (time of PCIe write TLP).
>>>>> Hence this register must be a polling register to indicate that
>>>> preservation_done.
>>>>> This will tell the guest when the preservation is done, and when
>>>>> restoration is
>>>> done, so that it can resume upper layers.
>>>>>
>>>>> Please refer to queue_reset definition to learn more about such
>>>>> register
>>>> definition.
>>>> Thanks, I will refer to "queue_reset". So, I need three values,
>>>> driver write 1 to let device do preserving resources, driver write 2
>>>> to let device do restoring resources, device write 0 to tell driver
>>>> that preserving or restoring done, am I right?
>>>>
>>> Right.
>>>
>>> And if the existing pcie pm state bits can do, we can leverage that.
>>> If it cannot be used, lets add that reasoning in the commit log to describe this
>> register.
>>>
>>>>>
>>>>> Lets please make sure that PCIe PM level registers are
>>>>> sufficient/not-sufficient
>>>> to decide the addition of this register.
>>>> But if the device is not a PCIe device, it doesn't have PM
>>>> capability, then this will not work. Actually in my local
>>>> environment, pci_is_express() return false in Qemu, they are not PCIe
>> device.
>>> It is reasonable to ask to plug in as PCIe device in 2023 to get new
>>> functionality that too you mentioned a gpu device. 😊
>>> Which does not have very long history of any backward compatibility.
>> Do you suggest me to add PM capability for virtio-gpu or change virtio-gpu to a
>> PCIe device?
>>
> PCI Power Management Capability Structure does not seem to be limited to PCIe.
I am not sure, but in the current Qemu code I can see the "pci_is_express" check for all virtio-pci devices. If we want to add the PM capability for virtio-pci devices, I think we need to change them to PCIe devices.

> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-24 12:13               ` [virtio-comment] " Chen, Jiqian
@ 2023-10-25  3:51                 ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2023-10-25  3:51 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S . Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray



> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Tuesday, October 24, 2023 5:44 PM
> 
> On 2023/10/24 18:51, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Tuesday, October 24, 2023 4:06 PM
> >>
> >> On 2023/10/23 21:35, Parav Pandit wrote:
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Monday, October 23, 2023 4:09 PM
> >>>
> >>>>> I think this can be done without introducing the new register.
> >>>>> Can you please check if the PM register itself can serve the
> >>>>> purpose instead
> >>>> of new virtio level register?
> >>>> Do you mean the system PM register?
> >>> No, the device's PM register at transport level.
> >> I tried to find this register(pci level or virtio pci level or virtio
> >> driver level), but I didn't find it in Linux kernel or Qemu codes.
> >> May I know which register you are referring to specifically? Or which
> >> PM state bit you mentioned below?
> >>
> > PCI spec's PCI Power Management Capability Structure in section 7.5.2.
> Yes, what you point to is PM capability for PCIe device.
> But the problem is still that in Qemu code, it will check the
> condition(pci_bus_is_express or pci_is_express) of all virtio-pci devices in
> function virtio_pci_realize(), if the virtio devices aren't a PCIe device, it will not
> add PM capability for them.
The PCI PM capability is a must for PCIe devices, so maybe the QEMU code has put it only under the is_pcie check.
But it can be added outside of that check as well, because this capability has existed on plain PCI for a long time and is backward compatible.
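
(A sketch of that path, assuming Qemu's pci_add_capability() helper keeps its current shape and that PCI_PM_SIZEOF/PCI_PM_PMC come from the imported pci_regs.h; the PMC value written here is illustrative only:)

/* Add the PM capability regardless of is_pcie; offset 0 lets Qemu
 * pick a free slot in config space. */
int pm_cap = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF);
pci_set_word(pci_dev->config + pm_cap + PCI_PM_PMC, 0x0003 /* PM v1.2 */);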

> And another problem is how about the MMIO transport devices? Since
> preserve resources is need for all transport type devices.
> 
MMIO lacks such rich PM definitions. If MMIO wants to support this in the future, it will be extended to match the other transports, like PCI.

> >>>
> >>>> I think it is unreasonable to let virtio- device listen the PM
> >>>> state of Guest system.
> >>> Guest driver performs any work on the guest systems PM callback
> >>> events in
> >> the virtio driver.
> >> I didn't find any PM state callback in the virtio driver.
> >>
> > There are virtio_suspend and virtio_resume in case of Linux.
> I think what you said virtio_suspend/resume is freeze/restore callback from
> "struct virtio_driver" or suspend/resume callback from "static const struct
> dev_pm_ops virtio_pci_pm_ops".
> And yes, I agree, if virtio devices have PM capability, maybe we can set PM state
> in those callback functions.
> 
> >
> >>>
> >>>> It's more suitable that each device gets notifications from driver,
> >>>> and then do preserving resources operation.
> >>> I agree that each device gets the notification from driver.
> >>> The question is, should it be virtio driver, or existing pci driver
> >>> which
> >> transitions the state from d0->d3 and d3->d0 is enough.
> >> It seems there isn't existing pci driver to transitions d0 or d3
> >> state. Could you please tell me which one it is specifically? I am very willing to
> give a try.
> >>
> > Virtio-pci modern driver of Linux should be able to.
> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the two problems I
> said above.
>
Both can be resolved without switching to pcie.
 
> >
> >>> Can you please check that?
> >>>
> >>>>>> --- a/transport-pci.tex
> >>>>>> +++ b/transport-pci.tex
> >>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
> structure
> >>>>>> layout}\label{sec:Virtio Transport
> >>>>>>          /* About the administration virtqueue. */
> >>>>>>          le16 admin_queue_index;         /* read-only for driver */
> >>>>>>          le16 admin_queue_num;         /* read-only for driver */
> >>>>>> +        le16 preserve_resources;        /* read-write */
> >>>>> Preserving these resources in the device implementation takes
> >>>>> finite amount
> >>>> of time.
> >>>>> Possibly more than 40nsec (time of PCIe write TLP).
> >>>>> Hence this register must be a polling register to indicate that
> >>>> preservation_done.
> >>>>> This will tell the guest when the preservation is done, and when
> >>>>> restoration is
> >>>> done, so that it can resume upper layers.
> >>>>>
> >>>>> Please refer to queue_reset definition to learn more about such
> >>>>> register
> >>>> definition.
> >>>> Thanks, I will refer to "queue_reset". So, I need three values,
> >>>> driver write 1 to let device do preserving resources, driver write
> >>>> 2 to let device do restoring resources, device write 0 to tell
> >>>> driver that preserving or restoring done, am I right?
> >>>>
> >>> Right.
> >>>
> >>> And if the existing pcie pm state bits can do, we can leverage that.
> >>> If it cannot be used, lets add that reasoning in the commit log to
> >>> describe this
> >> register.
> >>>
> >>>>>
> >>>>> Lets please make sure that PCIe PM level registers are
> >>>>> sufficient/not-sufficient
> >>>> to decide the addition of this register.
> >>>> But if the device is not a PCIe device, it doesn't have PM
> >>>> capability, then this will not work. Actually in my local
> >>>> environment, pci_is_express() return false in Qemu, they are not
> >>>> PCIe
> >> device.
> >>> It is reasonable to ask to plug in as PCIe device in 2023 to get new
> >>> functionality that too you mentioned a gpu device. 😊
> >>> Which does not have very long history of any backward compatibility.
> >> Do you suggest me to add PM capability for virtio-gpu or change
> >> virtio-gpu to a PCIe device?
> >>
> > PCI Power Management Capability Structure does not seem to be limited to
> PCIe.
> I am not sure, but in current Qemu code, I can see the check "pci_is_express"
> for all virtio-pci devices. If we want to add PM capability for virtio-pci devices,
> we need to change them to PCIe device I think.
> 
That is one option.
The second option is to extend the PCI PM cap to non-PCIe devices, because that is supported.

> >
> 
> --
> Best regards,
> Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-25  3:51                 ` [virtio-comment] " Parav Pandit
@ 2023-10-26 10:24                   ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2023-10-26 10:24 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S . Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian


On 2023/10/25 11:51, Parav Pandit wrote:
> 
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Tuesday, October 24, 2023 5:44 PM
>>
>> On 2023/10/24 18:51, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Tuesday, October 24, 2023 4:06 PM
>>>>
>>>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>>>
>>>>>>> I think this can be done without introducing the new register.
>>>>>>> Can you please check if the PM register itself can serve the
>>>>>>> purpose instead
>>>>>> of new virtio level register?
>>>>>> Do you mean the system PM register?
>>>>> No, the device's PM register at transport level.
>>>> I tried to find this register(pci level or virtio pci level or virtio
>>>> driver level), but I didn't find it in Linux kernel or Qemu codes.
>>>> May I know which register you are referring to specifically? Or which
>>>> PM state bit you mentioned below?
>>>>
>>> PCI spec's PCI Power Management Capability Structure in section 7.5.2.
>> Yes, what you point to is PM capability for PCIe device.
>> But the problem is still that in Qemu code, it will check the
>> condition(pci_bus_is_express or pci_is_express) of all virtio-pci devices in
>> function virtio_pci_realize(), if the virtio devices aren't a PCIe device, it will not
>> add PM capability for them.
> PCI PM capability is must for PCIe devices. So may be QEMU code has put it only under is_pcie check.
> But it can be done outside of that check as well because this capability exists on PCI too for long time and it is backward compatible.
Do you suggest that I first implement the PM capability for virtio devices in Qemu, and then try whether the PM capability can work for this scenario?
If so, we would be complicating a simple problem. There is currently no other need to add the PM capability for virtio devices; adding it just for preserving resources seems unnecessary and unreasonable. And we are not sure whether there are other scenarios, outside the process of PM state changes, that also need to preserve resources; if there are, the PM register can't cover them, but a preserve_resources register can.
Can I add a note like "If the PM capability is implemented for virtio devices, it may cover this scenario, and if there are no other scenarios that PM can't cover, then we can remove this register" to the commit message or spec description, and let us continue to add the preserve_resources register?

> 
>> And another problem is how about the MMIO transport devices? Since
>> preserve resources is need for all transport type devices.
>>
> MMIO lacks such rich PM definitions. If in future MMIO wants to support, it will be extended to match to other transports like PCI.
> 
>>>>>
>>>>>> I think it is unreasonable to let virtio- device listen the PM
>>>>>> state of Guest system.
>>>>> Guest driver performs any work on the guest systems PM callback
>>>>> events in
>>>> the virtio driver.
>>>> I didn't find any PM state callback in the virtio driver.
>>>>
>>> There are virtio_suspend and virtio_resume in case of Linux.
>> I think what you said virtio_suspend/resume is freeze/restore callback from
>> "struct virtio_driver" or suspend/resume callback from "static const struct
>> dev_pm_ops virtio_pci_pm_ops".
>> And yes, I agree, if virtio devices have PM capability, maybe we can set PM state
>> in those callback functions.
>>
>>>
>>>>>
>>>>>> It's more suitable that each device gets notifications from driver,
>>>>>> and then do preserving resources operation.
>>>>> I agree that each device gets the notification from driver.
>>>>> The question is, should it be virtio driver, or existing pci driver
>>>>> which
>>>> transitions the state from d0->d3 and d3->d0 is enough.
>>>> It seems there isn't existing pci driver to transitions d0 or d3
>>>> state. Could you please tell me which one it is specifically? I am very willing to
>> give a try.
>>>>
>>> Virtio-pci modern driver of Linux should be able to.
>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the two problems I
>> said above.
>>
> Both can be resolved without switching to pcie.
>  
>>>
>>>>> Can you please check that?
>>>>>
>>>>>>>> --- a/transport-pci.tex
>>>>>>>> +++ b/transport-pci.tex
>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
>> structure
>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>          /* About the administration virtqueue. */
>>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>>>> Preserving these resources in the device implementation takes
>>>>>>> finite amount
>>>>>> of time.
>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
>>>>>>> Hence this register must be a polling register to indicate that
>>>>>> preservation_done.
>>>>>>> This will tell the guest when the preservation is done, and when
>>>>>>> restoration is
>>>>>> done, so that it can resume upper layers.
>>>>>>>
>>>>>>> Please refer to queue_reset definition to learn more about such
>>>>>>> register
>>>>>> definition.
>>>>>> Thanks, I will refer to "queue_reset". So, I need three values,
>>>>>> driver write 1 to let device do preserving resources, driver write
>>>>>> 2 to let device do restoring resources, device write 0 to tell
>>>>>> driver that preserving or restoring done, am I right?
>>>>>>
>>>>> Right.
>>>>>
>>>>> And if the existing pcie pm state bits can do, we can leverage that.
>>>>> If it cannot be used, lets add that reasoning in the commit log to
>>>>> describe this
>>>> register.
>>>>>
>>>>>>>
>>>>>>> Lets please make sure that PCIe PM level registers are
>>>>>>> sufficient/not-sufficient
>>>>>> to decide the addition of this register.
>>>>>> But if the device is not a PCIe device, it doesn't have PM
>>>>>> capability, then this will not work. Actually in my local
>>>>>> environment, pci_is_express() return false in Qemu, they are not
>>>>>> PCIe
>>>> device.
>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to get new
>>>>> functionality that too you mentioned a gpu device. 😊
>>>>> Which does not have very long history of any backward compatibility.
>>>> Do you suggest me to add PM capability for virtio-gpu or change
>>>> virtio-gpu to a PCIe device?
>>>>
>>> PCI Power Management Capability Structure does not seem to be limited to
>> PCIe.
>> I am not sure, but in current Qemu code, I can see the check "pci_is_express"
>> for all virtio-pci devices. If we want to add PM capability for virtio-pci devices,
>> we need to change them to PCIe device I think.
>>
> That is one option.
> Second option to extend PCI PM cap for non pci device because it is supported.
> 
>>>
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-26 10:24                   ` [virtio-comment] " Chen, Jiqian
@ 2023-10-26 10:30                     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 76+ messages in thread
From: Michael S. Tsirkin @ 2023-10-26 10:30 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Parav Pandit, Gerd Hoffmann, Jason Wang, Xuan Zhuo, David Airlie,
	Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray

On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
> 
> On 2023/10/25 11:51, Parav Pandit wrote:
> > 
> > 
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Tuesday, October 24, 2023 5:44 PM
> >>
> >> On 2023/10/24 18:51, Parav Pandit wrote:
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Tuesday, October 24, 2023 4:06 PM
> >>>>
> >>>> On 2023/10/23 21:35, Parav Pandit wrote:
> >>>>>
> >>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>> Sent: Monday, October 23, 2023 4:09 PM
> >>>>>
> >>>>>>> I think this can be done without introducing the new register.
> >>>>>>> Can you please check if the PM register itself can serve the
> >>>>>>> purpose instead
> >>>>>> of new virtio level register?
> >>>>>> Do you mean the system PM register?
> >>>>> No, the device's PM register at transport level.
> >>>> I tried to find this register(pci level or virtio pci level or virtio
> >>>> driver level), but I didn't find it in Linux kernel or Qemu codes.
> >>>> May I know which register you are referring to specifically? Or which
> >>>> PM state bit you mentioned below?
> >>>>
> >>> PCI spec's PCI Power Management Capability Structure in section 7.5.2.
> >> Yes, what you point to is PM capability for PCIe device.
> >> But the problem is still that in Qemu code, it will check the
> >> condition(pci_bus_is_express or pci_is_express) of all virtio-pci devices in
> >> function virtio_pci_realize(), if the virtio devices aren't a PCIe device, it will not
> >> add PM capability for them.
> > PCI PM capability is must for PCIe devices. So may be QEMU code has put it only under is_pcie check.
> > But it can be done outside of that check as well because this capability exists on PCI too for long time and it is backward compatible.
> Do you suggest me to implement PM capability for virtio devices in
> Qemu firstly, and then to try if the PM capability can work for this
> scenario?

virtio devices in qemu already have a PM capability.


> If so, we will complicate a simple problem. Because there are no other needs to add PM capability for virtio devices for now, if we add it just for preserving resources, it seems unnecessary and unreasonable. And we are not sure if there are other scenarios that are not during the process of PM state changing also need to preserve resources, if have, then the PM register can't cover, but preserve_resources register can.

One of the selling points of virtio is precisely reusing existing
platform capabilities as opposed to coming up with our own.
See abstract.tex
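
As an illustration of that reuse, a sketch of a virtio-pci driver suspend/resume pair that leans on the standard PCI PM capability alone (pci_set_power_state() is the existing Linux API; the function names here are illustrative):

#include <linux/pci.h>

/* The D3hot write through the standard PM capability is itself the
 * "preserve" cue to the device; no virtio-level register is involved. */
static int example_virtio_pci_suspend(struct pci_dev *pdev)
{
	return pci_set_power_state(pdev, PCI_D3hot);
}

static int example_virtio_pci_resume(struct pci_dev *pdev)
{
	return pci_set_power_state(pdev, PCI_D0);
}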


> Can I add some notes like "If PM capability is implemented for virtio devices, it may cover this scenario, and if there are no other scenarios that PM can't cover, then we can remove this register " in commit message or spec description and let us continue to add preserve_resources register?

We can't remove registers.

> > 
> >> And another problem is how about the MMIO transport devices? Since
> >> preserve resources is need for all transport type devices.
> >>
> > MMIO lacks such rich PM definitions. If in future MMIO wants to support, it will be extended to match to other transports like PCI.
> > 
> >>>>>
> >>>>>> I think it is unreasonable to let virtio- device listen the PM
> >>>>>> state of Guest system.
> >>>>> Guest driver performs any work on the guest systems PM callback
> >>>>> events in
> >>>> the virtio driver.
> >>>> I didn't find any PM state callback in the virtio driver.
> >>>>
> >>> There are virtio_suspend and virtio_resume in case of Linux.
> >> I think what you said virtio_suspend/resume is freeze/restore callback from
> >> "struct virtio_driver" or suspend/resume callback from "static const struct
> >> dev_pm_ops virtio_pci_pm_ops".
> >> And yes, I agree, if virtio devices have PM capability, maybe we can set PM state
> >> in those callback functions.
> >>
> >>>
> >>>>>
> >>>>>> It's more suitable that each device gets notifications from driver,
> >>>>>> and then do preserving resources operation.
> >>>>> I agree that each device gets the notification from driver.
> >>>>> The question is, should it be virtio driver, or existing pci driver
> >>>>> which
> >>>> transitions the state from d0->d3 and d3->d0 is enough.
> >>>> It seems there isn't existing pci driver to transitions d0 or d3
> >>>> state. Could you please tell me which one it is specifically? I am very willing to
> >> give a try.
> >>>>
> >>> Virtio-pci modern driver of Linux should be able to.
> >> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the two problems I
> >> said above.
> >>
> > Both can be resolved without switching to pcie.
> >  
> >>>
> >>>>> Can you please check that?
> >>>>>
> >>>>>>>> --- a/transport-pci.tex
> >>>>>>>> +++ b/transport-pci.tex
> >>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
> >> structure
> >>>>>>>> layout}\label{sec:Virtio Transport
> >>>>>>>>          /* About the administration virtqueue. */
> >>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
> >>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
> >>>>>>>> +        le16 preserve_resources;        /* read-write */
> >>>>>>> Preserving these resources in the device implementation takes
> >>>>>>> finite amount
> >>>>>> of time.
> >>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
> >>>>>>> Hence this register must be a polling register to indicate that
> >>>>>> preservation_done.
> >>>>>>> This will tell the guest when the preservation is done, and when
> >>>>>>> restoration is
> >>>>>> done, so that it can resume upper layers.
> >>>>>>>
> >>>>>>> Please refer to queue_reset definition to learn more about such
> >>>>>>> register
> >>>>>> definition.
> >>>>>> Thanks, I will refer to "queue_reset". So, I need three values,
> >>>>>> driver write 1 to let device do preserving resources, driver write
> >>>>>> 2 to let device do restoring resources, device write 0 to tell
> >>>>>> driver that preserving or restoring done, am I right?
> >>>>>>
> >>>>> Right.
> >>>>>
> >>>>> And if the existing pcie pm state bits can do, we can leverage that.
> >>>>> If it cannot be used, lets add that reasoning in the commit log to
> >>>>> describe this
> >>>> register.
> >>>>>
> >>>>>>>
> >>>>>>> Lets please make sure that PCIe PM level registers are
> >>>>>>> sufficient/not-sufficient
> >>>>>> to decide the addition of this register.
> >>>>>> But if the device is not a PCIe device, it doesn't have PM
> >>>>>> capability, then this will not work. Actually in my local
> >>>>>> environment, pci_is_express() return false in Qemu, they are not
> >>>>>> PCIe
> >>>> device.
> >>>>> It is reasonable to ask to plug in as PCIe device in 2023 to get new
> >>>>> functionality that too you mentioned a gpu device. 😊
> >>>>> Which does not have very long history of any backward compatibility.
> >>>> Do you suggest me to add PM capability for virtio-gpu or change
> >>>> virtio-gpu to a PCIe device?
> >>>>
> >>> PCI Power Management Capability Structure does not seem to be limited to
> >> PCIe.
> >> I am not sure, but in current Qemu code, I can see the check "pci_is_express"
> >> for all virtio-pci devices. If we want to add PM capability for virtio-pci devices,
> >> we need to change them to PCIe device I think.
> >>
> > That is one option.
> > Second option to extend PCI PM cap for non pci device because it is supported.
> > 
> >>>
> >>
> >> --
> >> Best regards,
> >> Jiqian Chen.
> 
> -- 
> Best regards,
> Jiqian Chen.


^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-26 10:30                     ` [virtio-comment] " Michael S. Tsirkin
@ 2023-10-27  3:03                       ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2023-10-27  3:03 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: Gerd Hoffmann, Jason Wang, Xuan Zhuo, David Airlie,
	Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

Hi Michael S. Tsirkin and Parav Pandit,
Thank you for your detailed explanation. I will try to use PM cap to fix this issue.

On 2023/10/26 18:30, Michael S. Tsirkin wrote:
> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
>>
>> On 2023/10/25 11:51, Parav Pandit wrote:
>>>
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Tuesday, October 24, 2023 5:44 PM
>>>>
>>>> On 2023/10/24 18:51, Parav Pandit wrote:
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
>>>>>>
>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>>>>>
>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>>>>>
>>>>>>>>> I think this can be done without introducing the new register.
>>>>>>>>> Can you please check if the PM register itself can serve the
>>>>>>>>> purpose instead
>>>>>>>> of new virtio level register?
>>>>>>>> Do you mean the system PM register?
>>>>>>> No, the device's PM register at transport level.
>>>>>> I tried to find this register(pci level or virtio pci level or virtio
>>>>>> driver level), but I didn't find it in Linux kernel or Qemu codes.
>>>>>> May I know which register you are referring to specifically? Or which
>>>>>> PM state bit you mentioned below?
>>>>>>
>>>>> PCI spec's PCI Power Management Capability Structure in section 7.5.2.
>>>> Yes, what you point to is PM capability for PCIe device.
>>>> But the problem is still that in Qemu code, it will check the
>>>> condition(pci_bus_is_express or pci_is_express) of all virtio-pci devices in
>>>> function virtio_pci_realize(), if the virtio devices aren't a PCIe device, it will not
>>>> add PM capability for them.
>>> PCI PM capability is must for PCIe devices. So may be QEMU code has put it only under is_pcie check.
>>> But it can be done outside of that check as well because this capability exists on PCI too for long time and it is backward compatible.
>> Do you suggest me to implement PM capability for virtio devices in
>> Qemu firstly, and then to try if the PM capability can work for this
>> scenario?
> 
> virtio devices in qemu already have a PM capability.
> 
> 
>> If so, we will complicate a simple problem. Because there are no other needs to add PM capability for virtio devices for now, if we add it just for preserving resources, it seems unnecessary and unreasonable. And we are not sure if there are other scenarios that are not during the process of PM state changing also need to preserve resources, if have, then the PM register can't cover, but preserve_resources register can.
> 
> One of the selling points of virtio is precisely reusing existing
> platform capabilities as opposed to coming up with our own.
> See abstract.tex
> 
> 
>> Can I add some notes like "If PM capability is implemented for virtio devices, it may cover this scenario, and if there are no other scenarios that PM can't cover, then we can remove this register " in commit message or spec description and let us continue to add preserve_resources register?
> 
> We can't remove registers.
> 
>>>
>>>> And another problem is how about the MMIO transport devices? Since
>>>> preserve resources is need for all transport type devices.
>>>>
>>> MMIO lacks such rich PM definitions. If in future MMIO wants to support, it will be extended to match to other transports like PCI.
>>>
>>>>>>>
>>>>>>>> I think it is unreasonable to let virtio- device listen the PM
>>>>>>>> state of Guest system.
>>>>>>> Guest driver performs any work on the guest systems PM callback
>>>>>>> events in
>>>>>> the virtio driver.
>>>>>> I didn't find any PM state callback in the virtio driver.
>>>>>>
>>>>> There are virtio_suspend and virtio_resume in case of Linux.
>>>> I think what you said virtio_suspend/resume is freeze/restore callback from
>>>> "struct virtio_driver" or suspend/resume callback from "static const struct
>>>> dev_pm_ops virtio_pci_pm_ops".
>>>> And yes, I agree, if virtio devices have PM capability, maybe we can set PM state
>>>> in those callback functions.
>>>>
>>>>>
>>>>>>>
>>>>>>>> It's more suitable that each device gets notifications from driver,
>>>>>>>> and then do preserving resources operation.
>>>>>>> I agree that each device gets the notification from driver.
>>>>>>> The question is, should it be virtio driver, or existing pci driver
>>>>>>> which
>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
>>>>>> It seems there isn't existing pci driver to transitions d0 or d3
>>>>>> state. Could you please tell me which one it is specifically? I am very willing to
>>>> give a try.
>>>>>>
>>>>> Virtio-pci modern driver of Linux should be able to.
>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the two problems I
>>>> said above.
>>>>
>>> Both can be resolved without switching to pcie.
>>>  
>>>>>
>>>>>>> Can you please check that?
>>>>>>>
>>>>>>>>>> --- a/transport-pci.tex
>>>>>>>>>> +++ b/transport-pci.tex
>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
>>>> structure
>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>          /* About the administration virtqueue. */
>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>>>>>> Preserving these resources in the device implementation takes
>>>>>>>>> finite amount
>>>>>>>> of time.
>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
>>>>>>>>> Hence this register must be a polling register to indicate that
>>>>>>>> preservation_done.
>>>>>>>>> This will tell the guest when the preservation is done, and when
>>>>>>>>> restoration is
>>>>>>>> done, so that it can resume upper layers.
>>>>>>>>>
>>>>>>>>> Please refer to queue_reset definition to learn more about such
>>>>>>>>> register
>>>>>>>> definition.
>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three values,
>>>>>>>> driver write 1 to let device do preserving resources, driver write
>>>>>>>> 2 to let device do restoring resources, device write 0 to tell
>>>>>>>> driver that preserving or restoring done, am I right?
>>>>>>>>
>>>>>>> Right.
>>>>>>>
>>>>>>> And if the existing pcie pm state bits can do, we can leverage that.
>>>>>>> If it cannot be used, lets add that reasoning in the commit log to
>>>>>>> describe this
>>>>>> register.
>>>>>>>
>>>>>>>>>
>>>>>>>>> Lets please make sure that PCIe PM level registers are
>>>>>>>>> sufficient/not-sufficient
>>>>>>>> to decide the addition of this register.
>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
>>>>>>>> capability, then this will not work. Actually in my local
>>>>>>>> environment, pci_is_express() return false in Qemu, they are not
>>>>>>>> PCIe
>>>>>> device.
>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to get new
>>>>>>> functionality that too you mentioned a gpu device. 😊
>>>>>>> Which does not have very long history of any backward compatibility.
>>>>>> Do you suggest me to add PM capability for virtio-gpu or change
>>>>>> virtio-gpu to a PCIe device?
>>>>>>
>>>>> PCI Power Management Capability Structure does not seem to be limited to
>>>> PCIe.
>>>> I am not sure, but in current Qemu code, I can see the check "pci_is_express"
>>>> for all virtio-pci devices. If we want to add PM capability for virtio-pci devices,
>>>> we need to change them to PCIe device I think.
>>>>
>>> That is one option.
>>> Second option to extend PCI PM cap for non pci device because it is supported.
>>>
>>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Jiqian Chen.
>>
>> -- 
>> Best regards,
>> Jiqian Chen.
> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2023-10-27  3:03                       ` [virtio-comment] " Chen, Jiqian
@ 2024-01-12  7:41                         ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-12  7:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray, Chen, Jiqian

Hi all,
Sorry to reply late.
In case you no longer remember this problem, let me briefly describe it.
I am working on implementing the virtgpu S3 function on Xen.
Currently on Xen, if we start a guest through qemu with virtgpu enabled and then suspend and resume the guest, the guest kernel comes back but the display does not; it just shows a black screen.
That is because, during suspend, the guest calls into qemu, and qemu destroys all display resources and resets the renderer. This makes the display disappear after the guest resumes.
So I added a new command to virtio-gpu: when the guest is suspending, it notifies qemu, which sets a parameter (preserver_resource) to 1 and preserves its resources; when resuming, the guest notifies qemu to set the parameter back to 0, after which qemu behaves as normal. That brings the guest's display back.
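
To make the shape of that change concrete, here is a rough sketch of the proposed command as a control-queue request, modeled on the existing virtio_gpu_ctrl_hdr-based commands (the command value and struct layout are illustrative only, not the actual patch):

#define VIRTIO_GPU_CMD_PRESERVE_RESOURCE 0x0210  /* value is hypothetical */

struct virtio_gpu_preserve_resource {
	struct virtio_gpu_ctrl_hdr hdr;
	__le32 preserve;  /* 1: preserve across suspend, 0: back to normal */
	__le32 padding;
};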
When I upstreamed the implementation above, Parav and MST suggested that I use the PM capability to fix this problem instead of adding a new command or state bit.
I have now tried the PM capability of virtio-gpu, and it cannot be used to solve this problem.
The reason is:
during guest suspend, the guest writes the D3 state through the PM cap, and at that point I can preserve the resources of virtio-gpu on the Qemu side (set preserver_resource to 1);
but during resume, the PM cap state is cleared by qemu's reset path (qemu_system_wakeup -> qemu_devices_reset -> virtio_vga_base_reset -> virtio_gpu_gl_reset);
as a result, when the guest reads the state from the PM cap, it finds that virtio-gpu is already in the D0 state, so the guest never writes D0 through the PM cap, and I have no way to know when to restore the resources (set preserver_resource to 0).
Do you have any other suggestions?
Or can I just fall back to the version that adds a new command (VIRTIO_GPU_CMD_PRESERVE_RESOURCE) to virtio-gpu? I think that approach is more reasonable and feasible for virtio-gpu to protect display resources during S3. As for other devices, if necessary, they can follow virtio-gpu's implementation and add new commands of their own to prevent resource loss during S3.
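
The failure mode above can be seen from the guest side with a sketch like this, using the standard Linux PCI PM register definitions (needs_d0_write() is a hypothetical helper for illustration):

#include <linux/pci.h>

/* On resume, qemu's wakeup reset has already cleared PMCSR, so this
 * reads back PCI_D0 and the driver never issues the D3->D0 write the
 * device could have used as its "restore resources" cue. */
static bool needs_d0_write(struct pci_dev *pdev)
{
	u16 pmcsr;

	if (!pdev->pm_cap)
		return false;
	pci_read_config_word(pdev, pdev->pm_cap + PCI_PM_CTRL, &pmcsr);
	return (pmcsr & PCI_PM_CTRL_STATE_MASK) != PCI_D0;
}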

On 2023/10/27 11:03, Chen, Jiqian wrote:
> Hi Michael S. Tsirkin and Parav Pandit,
> Thank you for your detailed explanation. I will try to use PM cap to fix this issue.
> 
> On 2023/10/26 18:30, Michael S. Tsirkin wrote:
>> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
>>>
>>> On 2023/10/25 11:51, Parav Pandit wrote:
>>>>
>>>>
>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>> Sent: Tuesday, October 24, 2023 5:44 PM
>>>>>
>>>>> On 2023/10/24 18:51, Parav Pandit wrote:
>>>>>>
>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
>>>>>>>
>>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>>>>>>
>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>>>>>>
>>>>>>>>>> I think this can be done without introducing the new register.
>>>>>>>>>> Can you please check if the PM register itself can serve the
>>>>>>>>>> purpose instead
>>>>>>>>> of new virtio level register?
>>>>>>>>> Do you mean the system PM register?
>>>>>>>> No, the device's PM register at transport level.
>>>>>>> I tried to find this register(pci level or virtio pci level or virtio
>>>>>>> driver level), but I didn't find it in Linux kernel or Qemu codes.
>>>>>>> May I know which register you are referring to specifically? Or which
>>>>>>> PM state bit you mentioned below?
>>>>>>>
>>>>>> PCI spec's PCI Power Management Capability Structure in section 7.5.2.
>>>>> Yes, what you point to is PM capability for PCIe device.
>>>>> But the problem is still that in Qemu code, it will check the
>>>>> condition(pci_bus_is_express or pci_is_express) of all virtio-pci devices in
>>>>> function virtio_pci_realize(), if the virtio devices aren't a PCIe device, it will not
>>>>> add PM capability for them.
>>>> PCI PM capability is must for PCIe devices. So may be QEMU code has put it only under is_pcie check.
>>>> But it can be done outside of that check as well because this capability exists on PCI too for long time and it is backward compatible.
>>> Do you suggest me to implement PM capability for virtio devices in
>>> Qemu firstly, and then to try if the PM capability can work for this
>>> scenario?
>>
>> virtio devices in qemu already have a PM capability.
>>
>>
>>> If so, we will complicate a simple problem. Because there are no other needs to add PM capability for virtio devices for now, if we add it just for preserving resources, it seems unnecessary and unreasonable. And we are not sure if there are other scenarios that are not during the process of PM state changing also need to preserve resources, if have, then the PM register can't cover, but preserve_resources register can.
>>
>> One of the selling points of virtio is precisely reusing existing
>> platform capabilities as opposed to coming up with our own.
>> See abstract.tex
>>
>>
>>> Can I add some notes like "If PM capability is implemented for virtio devices, it may cover this scenario, and if there are no other scenarios that PM can't cover, then we can remove this register " in commit message or spec description and let us continue to add preserve_resources register?
>>
>> We can't remove registers.
>>
>>>>
>>>>> And another problem is how about the MMIO transport devices? Since
>>>>> preserve resources is need for all transport type devices.
>>>>>
>>>> MMIO lacks such rich PM definitions. If in future MMIO wants to support, it will be extended to match to other transports like PCI.
>>>>
>>>>>>>>
>>>>>>>>> I think it is unreasonable to let virtio- device listen the PM
>>>>>>>>> state of Guest system.
>>>>>>>> Guest driver performs any work on the guest systems PM callback
>>>>>>>> events in
>>>>>>> the virtio driver.
>>>>>>> I didn't find any PM state callback in the virtio driver.
>>>>>>>
>>>>>> There are virtio_suspend and virtio_resume in case of Linux.
>>>>> I think what you said virtio_suspend/resume is freeze/restore callback from
>>>>> "struct virtio_driver" or suspend/resume callback from "static const struct
>>>>> dev_pm_ops virtio_pci_pm_ops".
>>>>> And yes, I agree, if virtio devices have PM capability, maybe we can set PM state
>>>>> in those callback functions.
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>> It's more suitable that each device gets notifications from driver,
>>>>>>>>> and then do preserving resources operation.
>>>>>>>> I agree that each device gets the notification from driver.
>>>>>>>> The question is, should it be virtio driver, or existing pci driver
>>>>>>>> which
>>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
>>>>>>> It seems there isn't existing pci driver to transitions d0 or d3
>>>>>>> state. Could you please tell me which one it is specifically? I am very willing to
>>>>> give a try.
>>>>>>>
>>>>>> Virtio-pci modern driver of Linux should be able to.
>>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the two problems I
>>>>> said above.
>>>>>
>>>> Both can be resolved without switching to pcie.
>>>>  
>>>>>>
>>>>>>>> Can you please check that?
>>>>>>>>
>>>>>>>>>>> --- a/transport-pci.tex
>>>>>>>>>>> +++ b/transport-pci.tex
>>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
>>>>> structure
>>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>>          /* About the administration virtqueue. */
>>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
>>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>>>>>>> Preserving these resources in the device implementation takes
>>>>>>>>>> finite amount
>>>>>>>>> of time.
>>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
>>>>>>>>>> Hence this register must be a polling register to indicate that
>>>>>>>>> preservation_done.
>>>>>>>>>> This will tell the guest when the preservation is done, and when
>>>>>>>>>> restoration is
>>>>>>>>> done, so that it can resume upper layers.
>>>>>>>>>>
>>>>>>>>>> Please refer to queue_reset definition to learn more about such
>>>>>>>>>> register
>>>>>>>>> definition.
>>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three values,
>>>>>>>>> driver write 1 to let device do preserving resources, driver write
>>>>>>>>> 2 to let device do restoring resources, device write 0 to tell
>>>>>>>>> driver that preserving or restoring done, am I right?
>>>>>>>>>
>>>>>>>> Right.
>>>>>>>>
>>>>>>>> And if the existing pcie pm state bits can do, we can leverage that.
>>>>>>>> If it cannot be used, lets add that reasoning in the commit log to
>>>>>>>> describe this
>>>>>>> register.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Lets please make sure that PCIe PM level registers are
>>>>>>>>>> sufficient/not-sufficient
>>>>>>>>> to decide the addition of this register.
>>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
>>>>>>>>> capability, then this will not work. Actually in my local
>>>>>>>>> environment, pci_is_express() return false in Qemu, they are not
>>>>>>>>> PCIe
>>>>>>> device.
>>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to get new
>>>>>>>> functionality that too you mentioned a gpu device. 😊
>>>>>>>> Which does not have very long history of any backward compatibility.
>>>>>>> Do you suggest me to add PM capability for virtio-gpu or change
>>>>>>> virtio-gpu to a PCIe device?
>>>>>>>
>>>>>> PCI Power Management Capability Structure does not seem to be limited to
>>>>> PCIe.
>>>>> I am not sure, but in current Qemu code, I can see the check "pci_is_express"
>>>>> for all virtio-pci devices. If we want to add PM capability for virtio-pci devices,
>>>>> we need to change them to PCIe device I think.
>>>>>
>>>> That is one option.
>>>> Second option to extend PCI PM cap for non pci device because it is supported.
>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Jiqian Chen.
>>>
>>> -- 
>>> Best regards,
>>> Jiqian Chen.
>>
> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-comment] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
@ 2024-01-12  7:41                         ` Chen, Jiqian
  0 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-12  7:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray, Chen, Jiqian

Hi all,
Sorry for the late reply.
I don't know if you still remember this problem, so let me briefly describe it.
I am working to implement virtgpu S3 function on Xen.
Currently on Xen, if we start a guest through qemu with virtgpu enabled, and then suspend and resume the guest, the guest kernel comes back but the display does not. It just shows a black screen.
That is because during suspend the guest calls into qemu, and qemu destroys all display resources and resets the renderer. This makes the display disappear after the guest resumes.
So I added a new command for virtio-gpu: when the guest is suspending, it notifies qemu to set a parameter (preserver_resource) to 1, and qemu then preserves the resources; when resuming, the guest notifies qemu to set the parameter back to 0, and qemu goes back to its normal behavior. That lets the guest's display come back.
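Roughly, the guest-side hooks look like this (a sketch only; virtio_gpu_cmd_preserve_resource() and the exact command layout are illustrative, not the final ABI):

/* Sketch: notify qemu from the virtio PM callbacks in the Linux guest. */
static int virtgpu_freeze(struct virtio_device *vdev)
{
        struct virtio_gpu_device *vgdev = vdev->priv;

        /* ask qemu to keep display resources across S3 */
        return virtio_gpu_cmd_preserve_resource(vgdev, 1);
}

static int virtgpu_restore(struct virtio_device *vdev)
{
        struct virtio_gpu_device *vgdev = vdev->priv;

        /* resume normal resource handling */
        return virtio_gpu_cmd_preserve_resource(vgdev, 0);
}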
When I upstreamed the above implementation, Parav and MST suggested that I use the PM capability to fix this problem instead of adding a new command or state bit.
Now I have tried the PM capability of virtio-gpu, and it cannot be used to solve this problem.
The reason is:
during suspend the guest writes the D3 state through the PM cap, so I can save the resources of virtio-gpu on the qemu side (set preserver_resource to 1),
but during resume the state of the PM cap is cleared by qemu's reset path (qemu_system_wakeup -> qemu_devices_reset -> virtio_vga_base_reset -> virtio_gpu_gl_reset),
so when the guest reads the state from the PM cap it finds the virtio-gpu is already in D0; the guest therefore never writes D0 through the PM cap, and I cannot know when to restore the resources (set preserver_resource to 0).
Do you have any other suggestions?
Or can I just fall back to the version that adds a new command (VIRTIO_GPU_CMD_PRESERVE_RESOURCE) to virtio-gpu? I think that way is more reasonable and feasible for virtio-gpu to protect display resources during S3. As for other devices, if necessary, they can refer to the virtio-gpu implementation and add their own commands to prevent resource loss during S3.
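For reference, the register variant suggested in the thread quoted below (driver writes 1 to start preserving, 2 to start restoring, device writes 0 when done, in the style of queue_reset) would look roughly like this on the driver side; the names and the busy-wait are only a sketch:

#include <stdint.h>

#define PRESERVE_START  1
#define RESTORE_START   2

/* reg points at the proposed le16 preserve_resources field in the
 * mapped common configuration structure. */
static void preserve_resources_sync(volatile uint16_t *reg, uint16_t op)
{
        *reg = op;              /* driver requests preserve (1) or restore (2) */
        while (*reg != 0)       /* device writes 0 when the operation is done */
                ;               /* poll; a real driver would sleep and time out here */
}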

On 2023/10/27 11:03, Chen, Jiqian wrote:
> Hi Michael S. Tsirkin and Parav Pandit,
> Thank you for your detailed explanation. I will try to use PM cap to fix this issue.
> 
> On 2023/10/26 18:30, Michael S. Tsirkin wrote:
>> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
>>>
>>> On 2023/10/25 11:51, Parav Pandit wrote:
>>>>
>>>>
>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>> Sent: Tuesday, October 24, 2023 5:44 PM
>>>>>
>>>>> On 2023/10/24 18:51, Parav Pandit wrote:
>>>>>>
>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
>>>>>>>
>>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>>>>>>
>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>>>>>>
>>>>>>>>>> I think this can be done without introducing the new register.
>>>>>>>>>> Can you please check if the PM register itself can serve the
>>>>>>>>>> purpose instead
>>>>>>>>> of new virtio level register?
>>>>>>>>> Do you mean the system PM register?
>>>>>>>> No, the device's PM register at transport level.
>>>>>>> I tried to find this register(pci level or virtio pci level or virtio
>>>>>>> driver level), but I didn't find it in Linux kernel or Qemu codes.
>>>>>>> May I know which register you are referring to specifically? Or which
>>>>>>> PM state bit you mentioned below?
>>>>>>>
>>>>>> PCI spec's PCI Power Management Capability Structure in section 7.5.2.
>>>>> Yes, what you point to is PM capability for PCIe device.
>>>>> But the problem is still that in Qemu code, it will check the
>>>>> condition(pci_bus_is_express or pci_is_express) of all virtio-pci devices in
>>>>> function virtio_pci_realize(), if the virtio devices aren't a PCIe device, it will not
>>>>> add PM capability for them.
>>>> PCI PM capability is must for PCIe devices. So may be QEMU code has put it only under is_pcie check.
>>>> But it can be done outside of that check as well because this capability exists on PCI too for long time and it is backward compatible.
>>> Do you suggest me to implement PM capability for virtio devices in
>>> Qemu firstly, and then to try if the PM capability can work for this
>>> scenario?
>>
>> virtio devices in qemu already have a PM capability.
>>
>>
>>> If so, we will complicate a simple problem. Because there are no other needs to add PM capability for virtio devices for now, if we add it just for preserving resources, it seems unnecessary and unreasonable. And we are not sure if there are other scenarios that are not during the process of PM state changing also need to preserve resources, if have, then the PM register can't cover, but preserve_resources register can.
>>
>> One of the selling points of virtio is precisely reusing existing
>> platform capabilities as opposed to coming up with our own.
>> See abstract.tex
>>
>>
>>> Can I add some notes like "If PM capability is implemented for virtio devices, it may cover this scenario, and if there are no other scenarios that PM can't cover, then we can remove this register " in commit message or spec description and let us continue to add preserve_resources register?
>>
>> We can't remove registers.
>>
>>>>
>>>>> And another problem is how about the MMIO transport devices? Since
>>>>> preserve resources is need for all transport type devices.
>>>>>
>>>> MMIO lacks such rich PM definitions. If in future MMIO wants to support, it will be extended to match to other transports like PCI.
>>>>
>>>>>>>>
>>>>>>>>> I think it is unreasonable to let virtio- device listen the PM
>>>>>>>>> state of Guest system.
>>>>>>>> Guest driver performs any work on the guest systems PM callback
>>>>>>>> events in
>>>>>>> the virtio driver.
>>>>>>> I didn't find any PM state callback in the virtio driver.
>>>>>>>
>>>>>> There are virtio_suspend and virtio_resume in case of Linux.
>>>>> I think what you said virtio_suspend/resume is freeze/restore callback from
>>>>> "struct virtio_driver" or suspend/resume callback from "static const struct
>>>>> dev_pm_ops virtio_pci_pm_ops".
>>>>> And yes, I agree, if virtio devices have PM capability, maybe we can set PM state
>>>>> in those callback functions.
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>>> It's more suitable that each device gets notifications from driver,
>>>>>>>>> and then do preserving resources operation.
>>>>>>>> I agree that each device gets the notification from driver.
>>>>>>>> The question is, should it be virtio driver, or existing pci driver
>>>>>>>> which
>>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
>>>>>>> It seems there isn't existing pci driver to transitions d0 or d3
>>>>>>> state. Could you please tell me which one it is specifically? I am very willing to
>>>>> give a try.
>>>>>>>
>>>>>> Virtio-pci modern driver of Linux should be able to.
>>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the two problems I
>>>>> said above.
>>>>>
>>>> Both can be resolved without switching to pcie.
>>>>  
>>>>>>
>>>>>>>> Can you please check that?
>>>>>>>>
>>>>>>>>>>> --- a/transport-pci.tex
>>>>>>>>>>> +++ b/transport-pci.tex
>>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
>>>>> structure
>>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>>          /* About the administration virtqueue. */
>>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
>>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>>>>>>> Preserving these resources in the device implementation takes
>>>>>>>>>> finite amount
>>>>>>>>> of time.
>>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
>>>>>>>>>> Hence this register must be a polling register to indicate that
>>>>>>>>> preservation_done.
>>>>>>>>>> This will tell the guest when the preservation is done, and when
>>>>>>>>>> restoration is
>>>>>>>>> done, so that it can resume upper layers.
>>>>>>>>>>
>>>>>>>>>> Please refer to queue_reset definition to learn more about such
>>>>>>>>>> register
>>>>>>>>> definition.
>>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three values,
>>>>>>>>> driver write 1 to let device do preserving resources, driver write
>>>>>>>>> 2 to let device do restoring resources, device write 0 to tell
>>>>>>>>> driver that preserving or restoring done, am I right?
>>>>>>>>>
>>>>>>>> Right.
>>>>>>>>
>>>>>>>> And if the existing pcie pm state bits can do, we can leverage that.
>>>>>>>> If it cannot be used, lets add that reasoning in the commit log to
>>>>>>>> describe this
>>>>>>> register.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Lets please make sure that PCIe PM level registers are
>>>>>>>>>> sufficient/not-sufficient
>>>>>>>>> to decide the addition of this register.
>>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
>>>>>>>>> capability, then this will not work. Actually in my local
>>>>>>>>> environment, pci_is_express() return false in Qemu, they are not
>>>>>>>>> PCIe
>>>>>>> device.
>>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to get new
>>>>>>>> functionality that too you mentioned a gpu device. 😊
>>>>>>>> Which does not have very long history of any backward compatibility.
>>>>>>> Do you suggest me to add PM capability for virtio-gpu or change
>>>>>>> virtio-gpu to a PCIe device?
>>>>>>>
>>>>>> PCI Power Management Capability Structure does not seem to be limited to
>>>>> PCIe.
>>>>> I am not sure, but in current Qemu code, I can see the check "pci_is_express"
>>>>> for all virtio-pci devices. If we want to add PM capability for virtio-pci devices,
>>>>> we need to change them to PCIe device I think.
>>>>>
>>>> That is one option.
>>>> Second option to extend PCI PM cap for non pci device because it is supported.
>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Jiqian Chen.
>>>
>>> -- 
>>> Best regards,
>>> Jiqian Chen.
>>
> 

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-12  7:41                         ` [virtio-comment] " Chen, Jiqian
@ 2024-01-12  8:02                           ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-12  8:02 UTC (permalink / raw)
  To: Chen, Jiqian, Michael S. Tsirkin, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray

Hi Jiqian,

> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Friday, January 12, 2024 1:11 PM

> 
> Hi all,
> Sorry to reply late.
> I don't know if you still remember this problem, let me briefly descript it.
> I am working to implement virtgpu S3 function on Xen.
> Currently on Xen, if we start a guest through qemu with enabling virtgpu, and
> then suspend and resume guest. We can find that the guest kernel comes
> back, but the display doesn't. It just shown a black screen.
> That is because during suspending, guest called into qemu and qemu
> destroyed all display resources and reset renderer. This made the display gone
> after guest resumed.
> So, I add a new command for virtio-gpu that when guest is suspending, it will
> notify qemu and set parameter(preserver_resource) to 1 and then qemu will
> preserve resources, and when resuming, guest will notify qemu to set
> parameter to 0, and then qemu will keep the normal actions. That can help
> guest's display come back.
> When I upstream above implementation, Parav and MST suggest me to use
> the PM capability to fix this problem instead of adding a new command or
> state bit.
> Now, I have tried the PM capability of virtio-gpu, it can't be used to solve this
> problem.
> The reason is:
> during guest suspending, it will write D3 state through PM cap, then I can save
> the resources of virtio-gpu on Qemu side(set preserver_resource to 1), but
> during the process of resuming, the state of PM cap will be cleared by qemu
> resetting(qemu_system_wakeup-> qemu_devices_reset->
> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when guest reads
> state from PM cap, it will find the virtio-gpu has already been D0 state, 
This behavior needs to be fixed. As the spec spells out, "This 2-bit field is used both to determine the current power state of a Function".

So the device needs to return D3 in the PowerState field. This is a must.

In addition, an extra busy_poll register that indicates when the device is ready to use would be helpful.
This is because 10 msec is the timeout set by the PCI spec.
That can be hard to meet if a large GPU state has to be read back from files or slow media, or for other hw devices with large amounts of state.

This limit comes from the following PCI spec normative text:

After transitioning a VF from D3Hot to D0, at least one of the following is true:
◦ At least 10 ms has passed since the request to enter D0 was issued.

So the Readiness Time Reporting capability is not so useful here.

Hence, after the D3->D0 PM state transition succeeds, a virtio-level PCI register is useful to ensure the device has fully resumed before the driver drives the rest of the registers.
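
In code terms, the resume flow would be roughly as below; DEVICE_READY is an illustrative virtio-level register, not something specified today:

/* Sketch of the guest driver resume path. */
#define DEVICE_READY 0x20 /* hypothetical offset in the common config */

static int virtio_pci_wait_resumed(struct pci_dev *pdev,
                                   void __iomem *common_cfg)
{
        int rc = pci_set_power_state(pdev, PCI_D0); /* PM D3hot -> D0 */

        if (rc)
                return rc;
        /* virtio-level handshake; not bound by the 10 ms PCI limit */
        while (ioread16(common_cfg + DEVICE_READY) == 0)
                msleep(1);
        return 0;
}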

> so
> guest will not write D0 through PM cap, so I can't know when to restore the
> resources(set preserver_resource to 0).
> Do you have any other suggestions?
> Or can I just fallback to the version that add a new
> command(VIRTIO_GPU_CMD_PRESERVE_RESOURCE) in virtio-gpu? I think
> that way is more reasonable and feasible for virtio-gpu to protect display
> resources during S3. As for other devices, if necessary, they can also refer to
> the implementation of the virtio-gpu to add new commands to prevent
> resource loss during S3.
> 
> On 2023/10/27 11:03, Chen, Jiqian wrote:
> > Hi Michael S. Tsirkin and Parav Pandit, Thank you for your detailed
> > explanation. I will try to use PM cap to fix this issue.
> >
> > On 2023/10/26 18:30, Michael S. Tsirkin wrote:
> >> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
> >>>
> >>> On 2023/10/25 11:51, Parav Pandit wrote:
> >>>>
> >>>>
> >>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>> Sent: Tuesday, October 24, 2023 5:44 PM
> >>>>>
> >>>>> On 2023/10/24 18:51, Parav Pandit wrote:
> >>>>>>
> >>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
> >>>>>>>
> >>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
> >>>>>>>>
> >>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
> >>>>>>>>
> >>>>>>>>>> I think this can be done without introducing the new register.
> >>>>>>>>>> Can you please check if the PM register itself can serve the
> >>>>>>>>>> purpose instead
> >>>>>>>>> of new virtio level register?
> >>>>>>>>> Do you mean the system PM register?
> >>>>>>>> No, the device's PM register at transport level.
> >>>>>>> I tried to find this register(pci level or virtio pci level or
> >>>>>>> virtio driver level), but I didn't find it in Linux kernel or Qemu codes.
> >>>>>>> May I know which register you are referring to specifically? Or
> >>>>>>> which PM state bit you mentioned below?
> >>>>>>>
> >>>>>> PCI spec's PCI Power Management Capability Structure in section
> 7.5.2.
> >>>>> Yes, what you point to is PM capability for PCIe device.
> >>>>> But the problem is still that in Qemu code, it will check the
> >>>>> condition(pci_bus_is_express or pci_is_express) of all virtio-pci
> >>>>> devices in function virtio_pci_realize(), if the virtio devices
> >>>>> aren't a PCIe device, it will not add PM capability for them.
> >>>> PCI PM capability is must for PCIe devices. So may be QEMU code has put
> it only under is_pcie check.
> >>>> But it can be done outside of that check as well because this capability
> exists on PCI too for long time and it is backward compatible.
> >>> Do you suggest me to implement PM capability for virtio devices in
> >>> Qemu firstly, and then to try if the PM capability can work for this
> >>> scenario?
> >>
> >> virtio devices in qemu already have a PM capability.
> >>
> >>
> >>> If so, we will complicate a simple problem. Because there are no other
> needs to add PM capability for virtio devices for now, if we add it just for
> preserving resources, it seems unnecessary and unreasonable. And we are not
> sure if there are other scenarios that are not during the process of PM state
> changing also need to preserve resources, if have, then the PM register can't
> cover, but preserve_resources register can.
> >>
> >> One of the selling points of virtio is precisely reusing existing
> >> platform capabilities as opposed to coming up with our own.
> >> See abstract.tex
> >>
> >>
> >>> Can I add some notes like "If PM capability is implemented for virtio
> devices, it may cover this scenario, and if there are no other scenarios that PM
> can't cover, then we can remove this register " in commit message or spec
> description and let us continue to add preserve_resources register?
> >>
> >> We can't remove registers.
> >>
> >>>>
> >>>>> And another problem is how about the MMIO transport devices? Since
> >>>>> preserve resources is need for all transport type devices.
> >>>>>
> >>>> MMIO lacks such rich PM definitions. If in future MMIO wants to
> support, it will be extended to match to other transports like PCI.
> >>>>
> >>>>>>>>
> >>>>>>>>> I think it is unreasonable to let virtio- device listen the PM
> >>>>>>>>> state of Guest system.
> >>>>>>>> Guest driver performs any work on the guest systems PM callback
> >>>>>>>> events in
> >>>>>>> the virtio driver.
> >>>>>>> I didn't find any PM state callback in the virtio driver.
> >>>>>>>
> >>>>>> There are virtio_suspend and virtio_resume in case of Linux.
> >>>>> I think what you said virtio_suspend/resume is freeze/restore
> >>>>> callback from "struct virtio_driver" or suspend/resume callback
> >>>>> from "static const struct dev_pm_ops virtio_pci_pm_ops".
> >>>>> And yes, I agree, if virtio devices have PM capability, maybe we
> >>>>> can set PM state in those callback functions.
> >>>>>
> >>>>>>
> >>>>>>>>
> >>>>>>>>> It's more suitable that each device gets notifications from
> >>>>>>>>> driver, and then do preserving resources operation.
> >>>>>>>> I agree that each device gets the notification from driver.
> >>>>>>>> The question is, should it be virtio driver, or existing pci
> >>>>>>>> driver which
> >>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
> >>>>>>> It seems there isn't existing pci driver to transitions d0 or d3
> >>>>>>> state. Could you please tell me which one it is specifically? I
> >>>>>>> am very willing to
> >>>>> give a try.
> >>>>>>>
> >>>>>> Virtio-pci modern driver of Linux should be able to.
> >>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the
> >>>>> two problems I said above.
> >>>>>
> >>>> Both can be resolved without switching to pcie.
> >>>>
> >>>>>>
> >>>>>>>> Can you please check that?
> >>>>>>>>
> >>>>>>>>>>> --- a/transport-pci.tex
> >>>>>>>>>>> +++ b/transport-pci.tex
> >>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
> >>>>> structure
> >>>>>>>>>>> layout}\label{sec:Virtio Transport
> >>>>>>>>>>>          /* About the administration virtqueue. */
> >>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
> >>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
> >>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
> >>>>>>>>>> Preserving these resources in the device implementation takes
> >>>>>>>>>> finite amount
> >>>>>>>>> of time.
> >>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
> >>>>>>>>>> Hence this register must be a polling register to indicate
> >>>>>>>>>> that
> >>>>>>>>> preservation_done.
> >>>>>>>>>> This will tell the guest when the preservation is done, and
> >>>>>>>>>> when restoration is
> >>>>>>>>> done, so that it can resume upper layers.
> >>>>>>>>>>
> >>>>>>>>>> Please refer to queue_reset definition to learn more about
> >>>>>>>>>> such register
> >>>>>>>>> definition.
> >>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three
> >>>>>>>>> values, driver write 1 to let device do preserving resources,
> >>>>>>>>> driver write
> >>>>>>>>> 2 to let device do restoring resources, device write 0 to tell
> >>>>>>>>> driver that preserving or restoring done, am I right?
> >>>>>>>>>
> >>>>>>>> Right.
> >>>>>>>>
> >>>>>>>> And if the existing pcie pm state bits can do, we can leverage that.
> >>>>>>>> If it cannot be used, lets add that reasoning in the commit log
> >>>>>>>> to describe this
> >>>>>>> register.
> >>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Lets please make sure that PCIe PM level registers are
> >>>>>>>>>> sufficient/not-sufficient
> >>>>>>>>> to decide the addition of this register.
> >>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
> >>>>>>>>> capability, then this will not work. Actually in my local
> >>>>>>>>> environment, pci_is_express() return false in Qemu, they are
> >>>>>>>>> not PCIe
> >>>>>>> device.
> >>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to
> >>>>>>>> get new functionality that too you mentioned a gpu device. 😊
> >>>>>>>> Which does not have very long history of any backward
> compatibility.
> >>>>>>> Do you suggest me to add PM capability for virtio-gpu or change
> >>>>>>> virtio-gpu to a PCIe device?
> >>>>>>>
> >>>>>> PCI Power Management Capability Structure does not seem to be
> >>>>>> limited to
> >>>>> PCIe.
> >>>>> I am not sure, but in current Qemu code, I can see the check
> "pci_is_express"
> >>>>> for all virtio-pci devices. If we want to add PM capability for
> >>>>> virtio-pci devices, we need to change them to PCIe device I think.
> >>>>>
> >>>> That is one option.
> >>>> Second option to extend PCI PM cap for non pci device because it is
> supported.
> >>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Best regards,
> >>>>> Jiqian Chen.
> >>>
> >>> --
> >>> Best regards,
> >>> Jiqian Chen.
> >>
> >
> 
> --
> Best regards,
> Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-12  8:02                           ` [virtio-comment] " Parav Pandit
@ 2024-01-12  8:25                             ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-12  8:25 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray, Chen, Jiqian


On 2024/1/12 16:02, Parav Pandit wrote:
> Hi Jiqian,
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Friday, January 12, 2024 1:11 PM
> 
>>
>> Hi all,
>> Sorry to reply late.
>> I don't know if you still remember this problem, let me briefly descript it.
>> I am working to implement virtgpu S3 function on Xen.
>> Currently on Xen, if we start a guest through qemu with enabling virtgpu, and
>> then suspend and resume guest. We can find that the guest kernel comes
>> back, but the display doesn't. It just shown a black screen.
>> That is because during suspending, guest called into qemu and qemu
>> destroyed all display resources and reset renderer. This made the display gone
>> after guest resumed.
>> So, I add a new command for virtio-gpu that when guest is suspending, it will
>> notify qemu and set parameter(preserver_resource) to 1 and then qemu will
>> preserve resources, and when resuming, guest will notify qemu to set
>> parameter to 0, and then qemu will keep the normal actions. That can help
>> guest's display come back.
>> When I upstream above implementation, Parav and MST suggest me to use
>> the PM capability to fix this problem instead of adding a new command or
>> state bit.
>> Now, I have tried the PM capability of virtio-gpu, it can't be used to solve this
>> problem.
>> The reason is:
>> during guest suspending, it will write D3 state through PM cap, then I can save
>> the resources of virtio-gpu on Qemu side(set preserver_resource to 1), but
>> during the process of resuming, the state of PM cap will be cleared by qemu
>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when guest reads
>> state from PM cap, it will find the virtio-gpu has already been D0 state, 
> This behavior needs to be fixed. As the spec listed out, " This 2-bit field is used both to determine the current power state of a Function"
Do you mean it is wrong for the current qemu code to reset the PM cap when virtio-gpu is reset? Why?
Shouldn't all device state, including the PM registers, be reset during a virtio-gpu reset?

> 
> So device needs to return D3 in the PowerState field.  This is must.
But the current behavior is a kind of soft reset, I think (!PCI_PM_CTRL_NO_SOFT_RESET), so it is reasonable for qemu to reset the PM cap.
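The bit I am referring to can be checked with the standard Linux helpers, for example (a sketch):

/* Does the device preserve its state across D3hot -> D0? */
int pm = pci_find_capability(pdev, PCI_CAP_ID_PM);
u16 pmcsr;

pci_read_config_word(pdev, pm + PCI_PM_CTRL, &pmcsr);
if (!(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET)) {
        /* No_Soft_Reset is clear: an internal reset on D3hot -> D0
         * (and hence losing the PM state) is permitted. */
}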

> 
> In addition to it, an additional busy_poll register is helpful that indicates the device is ready to use.
> This is because 10msec is the timeout set by the PCI spec.
> This can be hard for the devices to implement if the large GPU state is being read from files or slow media or for other hw devices in large quantities.
> 
> This limit comes from the PCI spec normative of below.
> 
> After transitioning a VF from D3Hot to D0, at least one of the following is true:
> ◦ At least 10 ms has passed since the request to enter D0 was issued.
> 
> So Readiness Time Reporting capability is not so useful.
> 
> Hence, after PM state to D3->D0 transition is successful, virtio level PCI register is useful to ensure that device is resumed to drive rest of the registers.
> 
>> so
>> guest will not write D0 through PM cap, so I can't know when to restore the
>> resources(set preserver_resource to 0).
>> Do you have any other suggestions?
>> Or can I just fallback to the version that add a new
>> command(VIRTIO_GPU_CMD_PRESERVE_RESOURCE) in virtio-gpu? I think
>> that way is more reasonable and feasible for virtio-gpu to protect display
>> resources during S3. As for other devices, if necessary, they can also refer to
>> the implementation of the virtio-gpu to add new commands to prevent
>> resource loss during S3.
>>
>> On 2023/10/27 11:03, Chen, Jiqian wrote:
>>> Hi Michael S. Tsirkin and Parav Pandit, Thank you for your detailed
>>> explanation. I will try to use PM cap to fix this issue.
>>>
>>> On 2023/10/26 18:30, Michael S. Tsirkin wrote:
>>>> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
>>>>>
>>>>> On 2023/10/25 11:51, Parav Pandit wrote:
>>>>>>
>>>>>>
>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>> Sent: Tuesday, October 24, 2023 5:44 PM
>>>>>>>
>>>>>>> On 2023/10/24 18:51, Parav Pandit wrote:
>>>>>>>>
>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
>>>>>>>>>
>>>>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>>>>>>>>
>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>>>>>>>>
>>>>>>>>>>>> I think this can be done without introducing the new register.
>>>>>>>>>>>> Can you please check if the PM register itself can serve the
>>>>>>>>>>>> purpose instead
>>>>>>>>>>> of new virtio level register?
>>>>>>>>>>> Do you mean the system PM register?
>>>>>>>>>> No, the device's PM register at transport level.
>>>>>>>>> I tried to find this register(pci level or virtio pci level or
>>>>>>>>> virtio driver level), but I didn't find it in Linux kernel or Qemu codes.
>>>>>>>>> May I know which register you are referring to specifically? Or
>>>>>>>>> which PM state bit you mentioned below?
>>>>>>>>>
>>>>>>>> PCI spec's PCI Power Management Capability Structure in section
>> 7.5.2.
>>>>>>> Yes, what you point to is PM capability for PCIe device.
>>>>>>> But the problem is still that in Qemu code, it will check the
>>>>>>> condition(pci_bus_is_express or pci_is_express) of all virtio-pci
>>>>>>> devices in function virtio_pci_realize(), if the virtio devices
>>>>>>> aren't a PCIe device, it will not add PM capability for them.
>>>>>> PCI PM capability is must for PCIe devices. So may be QEMU code has put
>> it only under is_pcie check.
>>>>>> But it can be done outside of that check as well because this capability
>> exists on PCI too for long time and it is backward compatible.
>>>>> Do you suggest me to implement PM capability for virtio devices in
>>>>> Qemu firstly, and then to try if the PM capability can work for this
>>>>> scenario?
>>>>
>>>> virtio devices in qemu already have a PM capability.
>>>>
>>>>
>>>>> If so, we will complicate a simple problem. Because there are no other
>> needs to add PM capability for virtio devices for now, if we add it just for
>> preserving resources, it seems unnecessary and unreasonable. And we are not
>> sure if there are other scenarios that are not during the process of PM state
>> changing also need to preserve resources, if have, then the PM register can't
>> cover, but preserve_resources register can.
>>>>
>>>> One of the selling points of virtio is precisely reusing existing
>>>> platform capabilities as opposed to coming up with our own.
>>>> See abstract.tex
>>>>
>>>>
>>>>> Can I add some notes like "If PM capability is implemented for virtio
>> devices, it may cover this scenario, and if there are no other scenarios that PM
>> can't cover, then we can remove this register " in commit message or spec
>> description and let us continue to add preserve_resources register?
>>>>
>>>> We can't remove registers.
>>>>
>>>>>>
>>>>>>> And another problem is how about the MMIO transport devices? Since
>>>>>>> preserve resources is need for all transport type devices.
>>>>>>>
>>>>>> MMIO lacks such rich PM definitions. If in future MMIO wants to
>> support, it will be extended to match to other transports like PCI.
>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I think it is unreasonable to let virtio- device listen the PM
>>>>>>>>>>> state of Guest system.
>>>>>>>>>> Guest driver performs any work on the guest systems PM callback
>>>>>>>>>> events in
>>>>>>>>> the virtio driver.
>>>>>>>>> I didn't find any PM state callback in the virtio driver.
>>>>>>>>>
>>>>>>>> There are virtio_suspend and virtio_resume in case of Linux.
>>>>>>> I think what you said virtio_suspend/resume is freeze/restore
>>>>>>> callback from "struct virtio_driver" or suspend/resume callback
>>>>>>> from "static const struct dev_pm_ops virtio_pci_pm_ops".
>>>>>>> And yes, I agree, if virtio devices have PM capability, maybe we
>>>>>>> can set PM state in those callback functions.
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> It's more suitable that each device gets notifications from
>>>>>>>>>>> driver, and then do preserving resources operation.
>>>>>>>>>> I agree that each device gets the notification from driver.
>>>>>>>>>> The question is, should it be virtio driver, or existing pci
>>>>>>>>>> driver which
>>>>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
>>>>>>>>> It seems there isn't existing pci driver to transitions d0 or d3
>>>>>>>>> state. Could you please tell me which one it is specifically? I
>>>>>>>>> am very willing to
>>>>>>> give a try.
>>>>>>>>>
>>>>>>>> Virtio-pci modern driver of Linux should be able to.
>>>>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the
>>>>>>> two problems I said above.
>>>>>>>
>>>>>> Both can be resolved without switching to pcie.
>>>>>>
>>>>>>>>
>>>>>>>>>> Can you please check that?
>>>>>>>>>>
>>>>>>>>>>>>> --- a/transport-pci.tex
>>>>>>>>>>>>> +++ b/transport-pci.tex
>>>>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
>>>>>>> structure
>>>>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>>>>          /* About the administration virtqueue. */
>>>>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
>>>>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>>>>>>>>> Preserving these resources in the device implementation takes
>>>>>>>>>>>> finite amount
>>>>>>>>>>> of time.
>>>>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
>>>>>>>>>>>> Hence this register must be a polling register to indicate
>>>>>>>>>>>> that
>>>>>>>>>>> preservation_done.
>>>>>>>>>>>> This will tell the guest when the preservation is done, and
>>>>>>>>>>>> when restoration is
>>>>>>>>>>> done, so that it can resume upper layers.
>>>>>>>>>>>>
>>>>>>>>>>>> Please refer to queue_reset definition to learn more about
>>>>>>>>>>>> such register
>>>>>>>>>>> definition.
>>>>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three
>>>>>>>>>>> values, driver write 1 to let device do preserving resources,
>>>>>>>>>>> driver write
>>>>>>>>>>> 2 to let device do restoring resources, device write 0 to tell
>>>>>>>>>>> driver that preserving or restoring done, am I right?
>>>>>>>>>>>
>>>>>>>>>> Right.
>>>>>>>>>>
>>>>>>>>>> And if the existing pcie pm state bits can do, we can leverage that.
>>>>>>>>>> If it cannot be used, lets add that reasoning in the commit log
>>>>>>>>>> to describe this
>>>>>>>>> register.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Lets please make sure that PCIe PM level registers are
>>>>>>>>>>>> sufficient/not-sufficient
>>>>>>>>>>> to decide the addition of this register.
>>>>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
>>>>>>>>>>> capability, then this will not work. Actually in my local
>>>>>>>>>>> environment, pci_is_express() return false in Qemu, they are
>>>>>>>>>>> not PCIe
>>>>>>>>> device.
>>>>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to
>>>>>>>>>> get new functionality that too you mentioned a gpu device. 😊
>>>>>>>>>> Which does not have very long history of any backward
>> compatibility.
>>>>>>>>> Do you suggest me to add PM capability for virtio-gpu or change
>>>>>>>>> virtio-gpu to a PCIe device?
>>>>>>>>>
>>>>>>>> PCI Power Management Capability Structure does not seem to be
>>>>>>>> limited to
>>>>>>> PCIe.
>>>>>>> I am not sure, but in current Qemu code, I can see the check
>> "pci_is_express"
>>>>>>> for all virtio-pci devices. If we want to add PM capability for
>>>>>>> virtio-pci devices, we need to change them to PCIe device I think.
>>>>>>>
>>>>>> That is one option.
>>>>>> Second option to extend PCI PM cap for non pci device because it is
>> supported.
>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Jiqian Chen.
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Jiqian Chen.
>>>>
>>>
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-comment] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
@ 2024-01-12  8:25                             ` Chen, Jiqian
  0 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-12  8:25 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray, Chen, Jiqian


On 2024/1/12 16:02, Parav Pandit wrote:
> Hi Jiqian,
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Friday, January 12, 2024 1:11 PM
> 
>>
>> Hi all,
>> Sorry to reply late.
>> I don't know if you still remember this problem, let me briefly descript it.
>> I am working to implement virtgpu S3 function on Xen.
>> Currently on Xen, if we start a guest through qemu with enabling virtgpu, and
>> then suspend and resume guest. We can find that the guest kernel comes
>> back, but the display doesn't. It just shown a black screen.
>> That is because during suspending, guest called into qemu and qemu
>> destroyed all display resources and reset renderer. This made the display gone
>> after guest resumed.
>> So, I add a new command for virtio-gpu that when guest is suspending, it will
>> notify qemu and set parameter(preserver_resource) to 1 and then qemu will
>> preserve resources, and when resuming, guest will notify qemu to set
>> parameter to 0, and then qemu will keep the normal actions. That can help
>> guest's display come back.
>> When I upstream above implementation, Parav and MST suggest me to use
>> the PM capability to fix this problem instead of adding a new command or
>> state bit.
>> Now, I have tried the PM capability of virtio-gpu, it can't be used to solve this
>> problem.
>> The reason is:
>> during guest suspending, it will write D3 state through PM cap, then I can save
>> the resources of virtio-gpu on Qemu side(set preserver_resource to 1), but
>> during the process of resuming, the state of PM cap will be cleared by qemu
>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when guest reads
>> state from PM cap, it will find the virtio-gpu has already been D0 state, 
> This behavior needs to be fixed. As the spec listed out, " This 2-bit field is used both to determine the current power state of a Function"
Do you mean it is wrong to reset PM cap when vritio_gpu reset in current qemu code? Why?
Shouldn't all device states, including PM registers, be reset during the process of virtio-gpu reset?

> 
> So device needs to return D3 in the PowerState field.  This is must.
But current behavior is a kind of soft reset, I think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu is reasonable.

> 
> In addition to it, an additional busy_poll register is helpful that indicates the device is ready to use.
> This is because 10msec is the timeout set by the PCI spec.
> This can be hard for the devices to implement if the large GPU state is being read from files or slow media or for other hw devices in large quantities.
> 
> This limit comes from the PCI spec normative of below.
> 
> After transitioning a VF from D3Hot to D0, at least one of the following is true:
> ◦ At least 10 ms has passed since the request to enter D0 was issued.
> 
> So Readiness Time Reporting capability is not so useful.
> 
> Hence, after PM state to D3->D0 transition is successful, virtio level PCI register is useful to ensure that device is resumed to drive rest of the registers.
> 
>> so
>> guest will not write D0 through PM cap, so I can't know when to restore the
>> resources(set preserver_resource to 0).
>> Do you have any other suggestions?
>> Or can I just fallback to the version that add a new
>> command(VIRTIO_GPU_CMD_PRESERVE_RESOURCE) in virtio-gpu? I think
>> that way is more reasonable and feasible for virtio-gpu to protect display
>> resources during S3. As for other devices, if necessary, they can also refer to
>> the implementation of the virtio-gpu to add new commands to prevent
>> resource loss during S3.
>>
>> On 2023/10/27 11:03, Chen, Jiqian wrote:
>>> Hi Michael S. Tsirkin and Parav Pandit, Thank you for your detailed
>>> explanation. I will try to use PM cap to fix this issue.
>>>
>>> On 2023/10/26 18:30, Michael S. Tsirkin wrote:
>>>> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
>>>>>
>>>>> On 2023/10/25 11:51, Parav Pandit wrote:
>>>>>>
>>>>>>
>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>> Sent: Tuesday, October 24, 2023 5:44 PM
>>>>>>>
>>>>>>> On 2023/10/24 18:51, Parav Pandit wrote:
>>>>>>>>
>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
>>>>>>>>>
>>>>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>>>>>>>>
>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>>>>>>>>
>>>>>>>>>>>> I think this can be done without introducing the new register.
>>>>>>>>>>>> Can you please check if the PM register itself can serve the
>>>>>>>>>>>> purpose instead
>>>>>>>>>>> of new virtio level register?
>>>>>>>>>>> Do you mean the system PM register?
>>>>>>>>>> No, the device's PM register at transport level.
>>>>>>>>> I tried to find this register(pci level or virtio pci level or
>>>>>>>>> virtio driver level), but I didn't find it in Linux kernel or Qemu codes.
>>>>>>>>> May I know which register you are referring to specifically? Or
>>>>>>>>> which PM state bit you mentioned below?
>>>>>>>>>
>>>>>>>> PCI spec's PCI Power Management Capability Structure in section
>> 7.5.2.
>>>>>>> Yes, what you point to is PM capability for PCIe device.
>>>>>>> But the problem is still that in Qemu code, it will check the
>>>>>>> condition(pci_bus_is_express or pci_is_express) of all virtio-pci
>>>>>>> devices in function virtio_pci_realize(), if the virtio devices
>>>>>>> aren't a PCIe device, it will not add PM capability for them.
>>>>>> PCI PM capability is must for PCIe devices. So may be QEMU code has put
>> it only under is_pcie check.
>>>>>> But it can be done outside of that check as well because this capability
>> exists on PCI too for long time and it is backward compatible.
>>>>> Do you suggest me to implement PM capability for virtio devices in
>>>>> Qemu firstly, and then to try if the PM capability can work for this
>>>>> scenario?
>>>>
>>>> virtio devices in qemu already have a PM capability.
>>>>
>>>>
>>>>> If so, we will complicate a simple problem. Because there are no other
>> needs to add PM capability for virtio devices for now, if we add it just for
>> preserving resources, it seems unnecessary and unreasonable. And we are not
>> sure if there are other scenarios that are not during the process of PM state
>> changing also need to preserve resources, if have, then the PM register can't
>> cover, but preserve_resources register can.
>>>>
>>>> One of the selling points of virtio is precisely reusing existing
>>>> platform capabilities as opposed to coming up with our own.
>>>> See abstract.tex
>>>>
>>>>
>>>>> Can I add some notes like "If PM capability is implemented for virtio
>> devices, it may cover this scenario, and if there are no other scenarios that PM
>> can't cover, then we can remove this register " in commit message or spec
>> description and let us continue to add preserve_resources register?
>>>>
>>>> We can't remove registers.
>>>>
>>>>>>
>>>>>>> And another problem is how about the MMIO transport devices? Since
>>>>>>> preserve resources is need for all transport type devices.
>>>>>>>
>>>>>> MMIO lacks such rich PM definitions. If in future MMIO wants to
>> support, it will be extended to match to other transports like PCI.
>>>>>>
>>>>>>>>>>
>>>>>>>>>>> I think it is unreasonable to let virtio- device listen the PM
>>>>>>>>>>> state of Guest system.
>>>>>>>>>> Guest driver performs any work on the guest systems PM callback
>>>>>>>>>> events in
>>>>>>>>> the virtio driver.
>>>>>>>>> I didn't find any PM state callback in the virtio driver.
>>>>>>>>>
>>>>>>>> There are virtio_suspend and virtio_resume in case of Linux.
>>>>>>> I think what you said virtio_suspend/resume is freeze/restore
>>>>>>> callback from "struct virtio_driver" or suspend/resume callback
>>>>>>> from "static const struct dev_pm_ops virtio_pci_pm_ops".
>>>>>>> And yes, I agree, if virtio devices have PM capability, maybe we
>>>>>>> can set PM state in those callback functions.
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> It's more suitable that each device gets notifications from
>>>>>>>>>>> driver, and then do preserving resources operation.
>>>>>>>>>> I agree that each device gets the notification from driver.
>>>>>>>>>> The question is, should it be virtio driver, or existing pci
>>>>>>>>>> driver which
>>>>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
>>>>>>>>> It seems there isn't existing pci driver to transitions d0 or d3
>>>>>>>>> state. Could you please tell me which one it is specifically? I
>>>>>>>>> am very willing to
>>>>>>> give a try.
>>>>>>>>>
>>>>>>>> Virtio-pci modern driver of Linux should be able to.
>>>>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the
>>>>>>> two problems I said above.
>>>>>>>
>>>>>> Both can be resolved without switching to pcie.
>>>>>>
>>>>>>>>
>>>>>>>>>> Can you please check that?
>>>>>>>>>>
>>>>>>>>>>>>> --- a/transport-pci.tex
>>>>>>>>>>>>> +++ b/transport-pci.tex
>>>>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
>>>>>>> structure
>>>>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>>>>          /* About the administration virtqueue. */
>>>>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
>>>>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>>>>>>>>> Preserving these resources in the device implementation takes
>>>>>>>>>>>> finite amount
>>>>>>>>>>> of time.
>>>>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
>>>>>>>>>>>> Hence this register must be a polling register to indicate
>>>>>>>>>>>> that
>>>>>>>>>>> preservation_done.
>>>>>>>>>>>> This will tell the guest when the preservation is done, and
>>>>>>>>>>>> when restoration is
>>>>>>>>>>> done, so that it can resume upper layers.
>>>>>>>>>>>>
>>>>>>>>>>>> Please refer to queue_reset definition to learn more about
>>>>>>>>>>>> such register
>>>>>>>>>>> definition.
>>>>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three
>>>>>>>>>>> values, driver write 1 to let device do preserving resources,
>>>>>>>>>>> driver write
>>>>>>>>>>> 2 to let device do restoring resources, device write 0 to tell
>>>>>>>>>>> driver that preserving or restoring done, am I right?
>>>>>>>>>>>
>>>>>>>>>> Right.
>>>>>>>>>>
>>>>>>>>>> And if the existing pcie pm state bits can do, we can leverage that.
>>>>>>>>>> If it cannot be used, lets add that reasoning in the commit log
>>>>>>>>>> to describe this
>>>>>>>>> register.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Lets please make sure that PCIe PM level registers are
>>>>>>>>>>>> sufficient/not-sufficient
>>>>>>>>>>> to decide the addition of this register.
>>>>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
>>>>>>>>>>> capability, then this will not work. Actually in my local
>>>>>>>>>>> environment, pci_is_express() return false in Qemu, they are
>>>>>>>>>>> not PCIe
>>>>>>>>> device.
>>>>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to
>>>>>>>>>> get new functionality that too you mentioned a gpu device. 😊
>>>>>>>>>> Which does not have very long history of any backward
>> compatibility.
>>>>>>>>> Do you suggest me to add PM capability for virtio-gpu or change
>>>>>>>>> virtio-gpu to a PCIe device?
>>>>>>>>>
>>>>>>>> PCI Power Management Capability Structure does not seem to be
>>>>>>>> limited to
>>>>>>> PCIe.
>>>>>>> I am not sure, but in current Qemu code, I can see the check
>> "pci_is_express"
>>>>>>> for all virtio-pci devices. If we want to add PM capability for
>>>>>>> virtio-pci devices, we need to change them to PCIe device I think.
>>>>>>>
>>>>>> That is one option.
>>>>>> Second option to extend PCI PM cap for non pci device because it is
>> supported.
>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Jiqian Chen.
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Jiqian Chen.
>>>>
>>>
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.

* [virtio-dev] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-12  8:25                             ` [virtio-comment] " Chen, Jiqian
@ 2024-01-12  8:47                               ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-12  8:47 UTC (permalink / raw)
  To: Chen, Jiqian, Michael S. Tsirkin, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray



> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Friday, January 12, 2024 1:55 PM
> 
> 
> On 2024/1/12 16:02, Parav Pandit wrote:
> > Hi Jiqian,
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Friday, January 12, 2024 1:11 PM
> >
> >>
> >> Hi all,
> >> Sorry to reply late.
> >> I don't know if you still remember this problem, let me briefly descript it.
> >> I am working to implement virtgpu S3 function on Xen.
> >> Currently on Xen, if we start a guest through qemu with enabling
> >> virtgpu, and then suspend and resume guest. We can find that the
> >> guest kernel comes back, but the display doesn't. It just shown a black
> screen.
> >> That is because during suspending, guest called into qemu and qemu
> >> destroyed all display resources and reset renderer. This made the
> >> display gone after guest resumed.
> >> So, I add a new command for virtio-gpu that when guest is suspending,
> >> it will notify qemu and set parameter(preserver_resource) to 1 and
> >> then qemu will preserve resources, and when resuming, guest will
> >> notify qemu to set parameter to 0, and then qemu will keep the normal
> >> actions. That can help guest's display come back.
> >> When I upstream above implementation, Parav and MST suggest me to
> use
> >> the PM capability to fix this problem instead of adding a new command
> >> or state bit.
> >> Now, I have tried the PM capability of virtio-gpu, it can't be used
> >> to solve this problem.
> >> The reason is:
> >> during guest suspending, it will write D3 state through PM cap, then
> >> I can save the resources of virtio-gpu on Qemu side(set
> >> preserver_resource to 1), but during the process of resuming, the
> >> state of PM cap will be cleared by qemu
> >> resetting(qemu_system_wakeup-> qemu_devices_reset->
> >> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when
> >> guest reads state from PM cap, it will find the virtio-gpu has
> >> already been D0 state,
> > This behavior needs to be fixed. As the spec listed out, " This 2-bit field is
> used both to determine the current power state of a Function"
> Do you mean it is wrong to reset PM cap when vritio_gpu reset in current
> qemu code? Why?
Because PM implements support for the D3->D0 transition, and if the device is in D3, it needs to report that it is in D3 to match the PCI spec.

> Shouldn't all device states, including PM registers, be reset during the
> process of virtio-gpu reset?
If No_Soft_Reset == 0, no device context is to be preserved.
If No_Soft_Reset == 1, the device is to preserve the minimal set of registers and other items listed in the PCI spec.
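For illustration, a minimal guest-side sketch (using only standard Linux PCI helpers; this is not from the virtio spec) of how a driver could read the PMCSR to learn the current power state and whether the function is No_Soft_Reset capable:

#include <linux/pci.h>

/* Sketch: returns true when D3hot->D0 preserves device context
 * (No_Soft_Reset == 1).  PCI_PM_CTRL_STATE_MASK in the same word
 * holds the current D-state (0..3). */
static bool dev_preserves_context(struct pci_dev *pdev)
{
	int pm = pci_find_capability(pdev, PCI_CAP_ID_PM);
	u16 pmcsr;

	if (!pm)
		return false;	/* no PM capability at all */

	pci_read_config_word(pdev, pm + PCI_PM_CTRL, &pmcsr);
	return pmcsr & PCI_PM_CTRL_NO_SOFT_RESET;
}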

> 
> >
> > So device needs to return D3 in the PowerState field.  This is must.
> But current behavior is a kind of soft reset, I
> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu is
> reasonable.
What you described means you need No_Soft_Reset = 1, so please set the capability accordingly to achieve the restore.

Also, in your case QEMU knows that it will have to resume the device soon.
Hence, it can prepare the GPU context before resuming the vCPUs.
In such a case, the extra register wouldn't be necessary.
Having the PM control register is needed anyway to drive the device properly.

Having the extra busy_poll register adds flexibility, so please evaluate whether you need it.
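To make the handshake concrete, here is a rough driver-side sketch of the polling scheme discussed earlier in this thread (write 1 to preserve, 2 to restore, then poll until the device writes 0). The values follow the proposal under discussion; the raw pointer access is a simplification, a real driver would go through the transport's le16 accessors and sleep between polls:

#include <stdint.h>

enum {
	PRESERVE_DONE    = 0,	/* device: preserve/restore finished    */
	PRESERVE_START   = 1,	/* driver: preserve resources (suspend) */
	PRESERVE_RESTORE = 2,	/* driver: restore resources (resume)   */
};

/* 'reg' points at the proposed le16 preserve_resources field of
 * struct virtio_pci_common_cfg. */
static void preserve_resources_op(volatile uint16_t *reg, uint16_t op)
{
	*reg = op;			/* request the operation    */
	while (*reg != PRESERVE_DONE)
		;			/* busy-poll for completion */
}

The suspend path would call preserve_resources_op(reg, PRESERVE_START) before entering D3, and the resume path preserve_resources_op(reg, PRESERVE_RESTORE) after wakeup.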

> 
> >
> > In addition to it, an additional busy_poll register is helpful that indicates the
> device is ready to use.
> > This is because 10msec is the timeout set by the PCI spec.
> > This can be hard for the devices to implement if the large GPU state is being
> read from files or slow media or for other hw devices in large quantities.
> >
> > This limit comes from the PCI spec normative of below.
> >
> > After transitioning a VF from D3Hot to D0, at least one of the following is
> true:
> > ◦ At least 10 ms has passed since the request to enter D0 was issued.
> >
> > So Readiness Time Reporting capability is not so useful.
> >
> > Hence, after PM state to D3->D0 transition is successful, virtio level PCI
> register is useful to ensure that device is resumed to drive rest of the
> registers.
> >
> >> so
> >> guest will not write D0 through PM cap, so I can't know when to
> >> restore the resources(set preserver_resource to 0).
> >> Do you have any other suggestions?
> >> Or can I just fallback to the version that add a new
> >> command(VIRTIO_GPU_CMD_PRESERVE_RESOURCE) in virtio-gpu? I think
> that
> >> way is more reasonable and feasible for virtio-gpu to protect display
> >> resources during S3. As for other devices, if necessary, they can
> >> also refer to the implementation of the virtio-gpu to add new
> >> commands to prevent resource loss during S3.
> >>
> >> On 2023/10/27 11:03, Chen, Jiqian wrote:
> >>> Hi Michael S. Tsirkin and Parav Pandit, Thank you for your detailed
> >>> explanation. I will try to use PM cap to fix this issue.
> >>>
> >>> On 2023/10/26 18:30, Michael S. Tsirkin wrote:
> >>>> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
> >>>>>
> >>>>> On 2023/10/25 11:51, Parav Pandit wrote:
> >>>>>>
> >>>>>>
> >>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>> Sent: Tuesday, October 24, 2023 5:44 PM
> >>>>>>>
> >>>>>>> On 2023/10/24 18:51, Parav Pandit wrote:
> >>>>>>>>
> >>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
> >>>>>>>>>
> >>>>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
> >>>>>>>>>>
> >>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
> >>>>>>>>>>
> >>>>>>>>>>>> I think this can be done without introducing the new register.
> >>>>>>>>>>>> Can you please check if the PM register itself can serve
> >>>>>>>>>>>> the purpose instead
> >>>>>>>>>>> of new virtio level register?
> >>>>>>>>>>> Do you mean the system PM register?
> >>>>>>>>>> No, the device's PM register at transport level.
> >>>>>>>>> I tried to find this register(pci level or virtio pci level or
> >>>>>>>>> virtio driver level), but I didn't find it in Linux kernel or Qemu
> codes.
> >>>>>>>>> May I know which register you are referring to specifically?
> >>>>>>>>> Or which PM state bit you mentioned below?
> >>>>>>>>>
> >>>>>>>> PCI spec's PCI Power Management Capability Structure in section
> >> 7.5.2.
> >>>>>>> Yes, what you point to is PM capability for PCIe device.
> >>>>>>> But the problem is still that in Qemu code, it will check the
> >>>>>>> condition(pci_bus_is_express or pci_is_express) of all
> >>>>>>> virtio-pci devices in function virtio_pci_realize(), if the
> >>>>>>> virtio devices aren't a PCIe device, it will not add PM capability for
> them.
> >>>>>> PCI PM capability is must for PCIe devices. So may be QEMU code
> >>>>>> has put
> >> it only under is_pcie check.
> >>>>>> But it can be done outside of that check as well because this
> >>>>>> capability
> >> exists on PCI too for long time and it is backward compatible.
> >>>>> Do you suggest me to implement PM capability for virtio devices in
> >>>>> Qemu firstly, and then to try if the PM capability can work for
> >>>>> this scenario?
> >>>>
> >>>> virtio devices in qemu already have a PM capability.
> >>>>
> >>>>
> >>>>> If so, we will complicate a simple problem. Because there are no
> >>>>> other
> >> needs to add PM capability for virtio devices for now, if we add it
> >> just for preserving resources, it seems unnecessary and unreasonable.
> >> And we are not sure if there are other scenarios that are not during
> >> the process of PM state changing also need to preserve resources, if
> >> have, then the PM register can't cover, but preserve_resources register
> can.
> >>>>
> >>>> One of the selling points of virtio is precisely reusing existing
> >>>> platform capabilities as opposed to coming up with our own.
> >>>> See abstract.tex
> >>>>
> >>>>
> >>>>> Can I add some notes like "If PM capability is implemented for
> >>>>> virtio
> >> devices, it may cover this scenario, and if there are no other
> >> scenarios that PM can't cover, then we can remove this register " in
> >> commit message or spec description and let us continue to add
> preserve_resources register?
> >>>>
> >>>> We can't remove registers.
> >>>>
> >>>>>>
> >>>>>>> And another problem is how about the MMIO transport devices?
> >>>>>>> Since preserve resources is need for all transport type devices.
> >>>>>>>
> >>>>>> MMIO lacks such rich PM definitions. If in future MMIO wants to
> >> support, it will be extended to match to other transports like PCI.
> >>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> I think it is unreasonable to let virtio- device listen the
> >>>>>>>>>>> PM state of Guest system.
> >>>>>>>>>> Guest driver performs any work on the guest systems PM
> >>>>>>>>>> callback events in
> >>>>>>>>> the virtio driver.
> >>>>>>>>> I didn't find any PM state callback in the virtio driver.
> >>>>>>>>>
> >>>>>>>> There are virtio_suspend and virtio_resume in case of Linux.
> >>>>>>> I think what you said virtio_suspend/resume is freeze/restore
> >>>>>>> callback from "struct virtio_driver" or suspend/resume callback
> >>>>>>> from "static const struct dev_pm_ops virtio_pci_pm_ops".
> >>>>>>> And yes, I agree, if virtio devices have PM capability, maybe we
> >>>>>>> can set PM state in those callback functions.
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> It's more suitable that each device gets notifications from
> >>>>>>>>>>> driver, and then do preserving resources operation.
> >>>>>>>>>> I agree that each device gets the notification from driver.
> >>>>>>>>>> The question is, should it be virtio driver, or existing pci
> >>>>>>>>>> driver which
> >>>>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
> >>>>>>>>> It seems there isn't existing pci driver to transitions d0 or
> >>>>>>>>> d3 state. Could you please tell me which one it is
> >>>>>>>>> specifically? I am very willing to
> >>>>>>> give a try.
> >>>>>>>>>
> >>>>>>>> Virtio-pci modern driver of Linux should be able to.
> >>>>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still
> >>>>>>> the two problems I said above.
> >>>>>>>
> >>>>>> Both can be resolved without switching to pcie.
> >>>>>>
> >>>>>>>>
> >>>>>>>>>> Can you please check that?
> >>>>>>>>>>
> >>>>>>>>>>>>> --- a/transport-pci.tex
> >>>>>>>>>>>>> +++ b/transport-pci.tex
> >>>>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common
> configuration
> >>>>>>> structure
> >>>>>>>>>>>>> layout}\label{sec:Virtio Transport
> >>>>>>>>>>>>>          /* About the administration virtqueue. */
> >>>>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver
> */
> >>>>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
> >>>>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
> >>>>>>>>>>>> Preserving these resources in the device implementation
> >>>>>>>>>>>> takes finite amount
> >>>>>>>>>>> of time.
> >>>>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
> >>>>>>>>>>>> Hence this register must be a polling register to indicate
> >>>>>>>>>>>> that
> >>>>>>>>>>> preservation_done.
> >>>>>>>>>>>> This will tell the guest when the preservation is done, and
> >>>>>>>>>>>> when restoration is
> >>>>>>>>>>> done, so that it can resume upper layers.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Please refer to queue_reset definition to learn more about
> >>>>>>>>>>>> such register
> >>>>>>>>>>> definition.
> >>>>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three
> >>>>>>>>>>> values, driver write 1 to let device do preserving
> >>>>>>>>>>> resources, driver write
> >>>>>>>>>>> 2 to let device do restoring resources, device write 0 to
> >>>>>>>>>>> tell driver that preserving or restoring done, am I right?
> >>>>>>>>>>>
> >>>>>>>>>> Right.
> >>>>>>>>>>
> >>>>>>>>>> And if the existing pcie pm state bits can do, we can leverage
> that.
> >>>>>>>>>> If it cannot be used, lets add that reasoning in the commit
> >>>>>>>>>> log to describe this
> >>>>>>>>> register.
> >>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Lets please make sure that PCIe PM level registers are
> >>>>>>>>>>>> sufficient/not-sufficient
> >>>>>>>>>>> to decide the addition of this register.
> >>>>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
> >>>>>>>>>>> capability, then this will not work. Actually in my local
> >>>>>>>>>>> environment, pci_is_express() return false in Qemu, they are
> >>>>>>>>>>> not PCIe
> >>>>>>>>> device.
> >>>>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to
> >>>>>>>>>> get new functionality that too you mentioned a gpu device. 😊
> >>>>>>>>>> Which does not have very long history of any backward
> >> compatibility.
> >>>>>>>>> Do you suggest me to add PM capability for virtio-gpu or
> >>>>>>>>> change virtio-gpu to a PCIe device?
> >>>>>>>>>
> >>>>>>>> PCI Power Management Capability Structure does not seem to be
> >>>>>>>> limited to
> >>>>>>> PCIe.
> >>>>>>> I am not sure, but in current Qemu code, I can see the check
> >> "pci_is_express"
> >>>>>>> for all virtio-pci devices. If we want to add PM capability for
> >>>>>>> virtio-pci devices, we need to change them to PCIe device I think.
> >>>>>>>
> >>>>>> That is one option.
> >>>>>> Second option to extend PCI PM cap for non pci device because it
> >>>>>> is
> >> supported.
> >>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best regards,
> >>>>>>> Jiqian Chen.
> >>>>>
> >>>>> --
> >>>>> Best regards,
> >>>>> Jiqian Chen.
> >>>>
> >>>
> >>
> >> --
> >> Best regards,
> >> Jiqian Chen.
> 
> --
> Best regards,
> Jiqian Chen.

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-12  8:47                               ` [virtio-comment] " Parav Pandit
@ 2024-01-12  9:24                                 ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-12  9:24 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray, Chen, Jiqian

On 2024/1/12 16:47, Parav Pandit wrote:
> 
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Friday, January 12, 2024 1:55 PM
>>
>>
>> On 2024/1/12 16:02, Parav Pandit wrote:
>>> Hi Jiqian,
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Friday, January 12, 2024 1:11 PM
>>>
>>>>
>>>> Hi all,
>>>> Sorry to reply late.
>>>> I don't know if you still remember this problem, let me briefly descript it.
>>>> I am working to implement virtgpu S3 function on Xen.
>>>> Currently on Xen, if we start a guest through qemu with enabling
>>>> virtgpu, and then suspend and resume guest. We can find that the
>>>> guest kernel comes back, but the display doesn't. It just shown a black
>> screen.
>>>> That is because during suspending, guest called into qemu and qemu
>>>> destroyed all display resources and reset renderer. This made the
>>>> display gone after guest resumed.
>>>> So, I add a new command for virtio-gpu that when guest is suspending,
>>>> it will notify qemu and set parameter(preserver_resource) to 1 and
>>>> then qemu will preserve resources, and when resuming, guest will
>>>> notify qemu to set parameter to 0, and then qemu will keep the normal
>>>> actions. That can help guest's display come back.
>>>> When I upstream above implementation, Parav and MST suggest me to
>> use
>>>> the PM capability to fix this problem instead of adding a new command
>>>> or state bit.
>>>> Now, I have tried the PM capability of virtio-gpu, it can't be used
>>>> to solve this problem.
>>>> The reason is:
>>>> during guest suspending, it will write D3 state through PM cap, then
>>>> I can save the resources of virtio-gpu on Qemu side(set
>>>> preserver_resource to 1), but during the process of resuming, the
>>>> state of PM cap will be cleared by qemu
>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when
>>>> guest reads state from PM cap, it will find the virtio-gpu has
>>>> already been D0 state,
>>> This behavior needs to be fixed. As the spec listed out, " This 2-bit field is
>> used both to determine the current power state of a Function"
>> Do you mean it is wrong to reset PM cap when vritio_gpu reset in current
>> qemu code? Why?
> Because PM implements support for the D3->D0 transition, and if the device is in D3, it needs to report that it is in D3 to match the PCI spec.
> 
>> Shouldn't all device states, including PM registers, be reset during the
>> process of virtio-gpu reset?
> If No_Soft_Reset == 0, no device context is to be preserved.
> If No_Soft_Reset == 1, the device is to preserve the minimal set of registers and other items listed in the PCI spec.
> 
>>
>>>
>>> So device needs to return D3 in the PowerState field.  This is must.
>> But current behavior is a kind of soft reset, I
>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu is
>> reasonable.
> What you described means you need No_Soft_Reset = 1, so please set the capability accordingly to achieve the restore.
> 
> Also, in your case QEMU knows that it will have to resume the device soon.
> Hence, it can prepare the GPU context before resuming the vCPUs.
> In such a case, the extra register wouldn't be necessary.
> Having the PM control register is needed anyway to drive the device properly.
Even if, as you said, the PM reset behavior in the current QEMU code is incorrect, fixing it still cannot solve this S3 problem.
That is because the device is reset twice during resume.
The first reset happens when we trigger a resume command to qemu: qemu resets the device (qemu_system_wakeup-> qemu_devices_reset-> virtio_vga_base_reset-> virtio_gpu_gl_reset).
The second happens in the guest kernel (virtio_device_restore-> virtio_reset_device). The PM state change (D3 to D0) happens between the two, so the display resources are still destroyed by the second device reset.
The most feasible way to fix this problem is simply to add a new virtio-gpu command, instead of an extra register or the PM cap, as sketched below.
Thanks.
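For discussion, a sketch of what that fallback command could look like; the command value and the request layout here are assumptions, not a spec excerpt (struct virtio_gpu_ctrl_hdr is the standard virtio-gpu control header):

#include <stdint.h>

/* Hypothetical command number, picked only for illustration. */
#define VIRTIO_GPU_CMD_PRESERVE_RESOURCE 0x0110

struct virtio_gpu_preserve_resource {
	struct virtio_gpu_ctrl_hdr hdr;
	uint32_t preserve;	/* 1: suspending, host keeps resources;
				 * 0: resumed, host resumes normal teardown */
	uint32_t padding;
};

The guest would submit it with preserve = 1 from its suspend path, before any device reset, and with preserve = 0 once resume has completed.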

> 
> Having the extra busy_poll register adds flexibility, so please evaluate whether you need it.
> 
>>
>>>
>>> In addition to it, an additional busy_poll register is helpful that indicates the
>> device is ready to use.
>>> This is because 10msec is the timeout set by the PCI spec.
>>> This can be hard for the devices to implement if the large GPU state is being
>> read from files or slow media or for other hw devices in large quantities.
>>>
>>> This limit comes from the PCI spec normative of below.
>>>
>>> After transitioning a VF from D3Hot to D0, at least one of the following is
>> true:
>>> ◦ At least 10 ms has passed since the request to enter D0 was issued.
>>>
>>> So Readiness Time Reporting capability is not so useful.
>>>
>>> Hence, after PM state to D3->D0 transition is successful, virtio level PCI
>> register is useful to ensure that device is resumed to drive rest of the
>> registers.
>>>
>>>> so
>>>> guest will not write D0 through PM cap, so I can't know when to
>>>> restore the resources(set preserver_resource to 0).
>>>> Do you have any other suggestions?
>>>> Or can I just fallback to the version that add a new
>>>> command(VIRTIO_GPU_CMD_PRESERVE_RESOURCE) in virtio-gpu? I think
>> that
>>>> way is more reasonable and feasible for virtio-gpu to protect display
>>>> resources during S3. As for other devices, if necessary, they can
>>>> also refer to the implementation of the virtio-gpu to add new
>>>> commands to prevent resource loss during S3.
>>>>
>>>> On 2023/10/27 11:03, Chen, Jiqian wrote:
>>>>> Hi Michael S. Tsirkin and Parav Pandit, Thank you for your detailed
>>>>> explanation. I will try to use PM cap to fix this issue.
>>>>>
>>>>> On 2023/10/26 18:30, Michael S. Tsirkin wrote:
>>>>>> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
>>>>>>>
>>>>>>> On 2023/10/25 11:51, Parav Pandit wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>> Sent: Tuesday, October 24, 2023 5:44 PM
>>>>>>>>>
>>>>>>>>> On 2023/10/24 18:51, Parav Pandit wrote:
>>>>>>>>>>
>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
>>>>>>>>>>>
>>>>>>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>>>>>>>>>>
>>>>>>>>>>>>>> I think this can be done without introducing the new register.
>>>>>>>>>>>>>> Can you please check if the PM register itself can serve
>>>>>>>>>>>>>> the purpose instead
>>>>>>>>>>>>> of new virtio level register?
>>>>>>>>>>>>> Do you mean the system PM register?
>>>>>>>>>>>> No, the device's PM register at transport level.
>>>>>>>>>>> I tried to find this register(pci level or virtio pci level or
>>>>>>>>>>> virtio driver level), but I didn't find it in Linux kernel or Qemu
>> codes.
>>>>>>>>>>> May I know which register you are referring to specifically?
>>>>>>>>>>> Or which PM state bit you mentioned below?
>>>>>>>>>>>
>>>>>>>>>> PCI spec's PCI Power Management Capability Structure in section
>>>> 7.5.2.
>>>>>>>>> Yes, what you point to is PM capability for PCIe device.
>>>>>>>>> But the problem is still that in Qemu code, it will check the
>>>>>>>>> condition(pci_bus_is_express or pci_is_express) of all
>>>>>>>>> virtio-pci devices in function virtio_pci_realize(), if the
>>>>>>>>> virtio devices aren't a PCIe device, it will not add PM capability for
>> them.
>>>>>>>> PCI PM capability is must for PCIe devices. So may be QEMU code
>>>>>>>> has put
>>>> it only under is_pcie check.
>>>>>>>> But it can be done outside of that check as well because this
>>>>>>>> capability
>>>> exists on PCI too for long time and it is backward compatible.
>>>>>>> Do you suggest me to implement PM capability for virtio devices in
>>>>>>> Qemu firstly, and then to try if the PM capability can work for
>>>>>>> this scenario?
>>>>>>
>>>>>> virtio devices in qemu already have a PM capability.
>>>>>>
>>>>>>
>>>>>>> If so, we will complicate a simple problem. Because there are no
>>>>>>> other
>>>> needs to add PM capability for virtio devices for now, if we add it
>>>> just for preserving resources, it seems unnecessary and unreasonable.
>>>> And we are not sure if there are other scenarios that are not during
>>>> the process of PM state changing also need to preserve resources, if
>>>> have, then the PM register can't cover, but preserve_resources register
>> can.
>>>>>>
>>>>>> One of the selling points of virtio is precisely reusing existing
>>>>>> platform capabilities as opposed to coming up with our own.
>>>>>> See abstract.tex
>>>>>>
>>>>>>
>>>>>>> Can I add some notes like "If PM capability is implemented for
>>>>>>> virtio
>>>> devices, it may cover this scenario, and if there are no other
>>>> scenarios that PM can't cover, then we can remove this register " in
>>>> commit message or spec description and let us continue to add
>> preserve_resources register?
>>>>>>
>>>>>> We can't remove registers.
>>>>>>
>>>>>>>>
>>>>>>>>> And another problem is how about the MMIO transport devices?
>>>>>>>>> Since preserve resources is need for all transport type devices.
>>>>>>>>>
>>>>>>>> MMIO lacks such rich PM definitions. If in future MMIO wants to
>>>> support, it will be extended to match to other transports like PCI.
>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> I think it is unreasonable to let virtio- device listen the
>>>>>>>>>>>>> PM state of Guest system.
>>>>>>>>>>>> Guest driver performs any work on the guest systems PM
>>>>>>>>>>>> callback events in
>>>>>>>>>>> the virtio driver.
>>>>>>>>>>> I didn't find any PM state callback in the virtio driver.
>>>>>>>>>>>
>>>>>>>>>> There are virtio_suspend and virtio_resume in case of Linux.
>>>>>>>>> I think what you said virtio_suspend/resume is freeze/restore
>>>>>>>>> callback from "struct virtio_driver" or suspend/resume callback
>>>>>>>>> from "static const struct dev_pm_ops virtio_pci_pm_ops".
>>>>>>>>> And yes, I agree, if virtio devices have PM capability, maybe we
>>>>>>>>> can set PM state in those callback functions.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> It's more suitable that each device gets notifications from
>>>>>>>>>>>>> driver, and then do preserving resources operation.
>>>>>>>>>>>> I agree that each device gets the notification from driver.
>>>>>>>>>>>> The question is, should it be virtio driver, or existing pci
>>>>>>>>>>>> driver which
>>>>>>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
>>>>>>>>>>> It seems there isn't existing pci driver to transitions d0 or
>>>>>>>>>>> d3 state. Could you please tell me which one it is
>>>>>>>>>>> specifically? I am very willing to
>>>>>>>>> give a try.
>>>>>>>>>>>
>>>>>>>>>> Virtio-pci modern driver of Linux should be able to.
>>>>>>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still
>>>>>>>>> the two problems I said above.
>>>>>>>>>
>>>>>>>> Both can be resolved without switching to pcie.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> Can you please check that?
>>>>>>>>>>>>
>>>>>>>>>>>>>>> --- a/transport-pci.tex
>>>>>>>>>>>>>>> +++ b/transport-pci.tex
>>>>>>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common
>> configuration
>>>>>>>>> structure
>>>>>>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>>>>>>          /* About the administration virtqueue. */
>>>>>>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver
>> */
>>>>>>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>>>>>>>>>>> Preserving these resources in the device implementation
>>>>>>>>>>>>>> takes finite amount
>>>>>>>>>>>>> of time.
>>>>>>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
>>>>>>>>>>>>>> Hence this register must be a polling register to indicate
>>>>>>>>>>>>>> that
>>>>>>>>>>>>> preservation_done.
>>>>>>>>>>>>>> This will tell the guest when the preservation is done, and
>>>>>>>>>>>>>> when restoration is
>>>>>>>>>>>>> done, so that it can resume upper layers.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please refer to queue_reset definition to learn more about
>>>>>>>>>>>>>> such register
>>>>>>>>>>>>> definition.
>>>>>>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three
>>>>>>>>>>>>> values, driver write 1 to let device do preserving
>>>>>>>>>>>>> resources, driver write
>>>>>>>>>>>>> 2 to let device do restoring resources, device write 0 to
>>>>>>>>>>>>> tell driver that preserving or restoring done, am I right?
>>>>>>>>>>>>>
>>>>>>>>>>>> Right.
>>>>>>>>>>>>
>>>>>>>>>>>> And if the existing pcie pm state bits can do, we can leverage
>> that.
>>>>>>>>>>>> If it cannot be used, lets add that reasoning in the commit
>>>>>>>>>>>> log to describe this
>>>>>>>>>>> register.
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Lets please make sure that PCIe PM level registers are
>>>>>>>>>>>>>> sufficient/not-sufficient
>>>>>>>>>>>>> to decide the addition of this register.
>>>>>>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
>>>>>>>>>>>>> capability, then this will not work. Actually in my local
>>>>>>>>>>>>> environment, pci_is_express() return false in Qemu, they are
>>>>>>>>>>>>> not PCIe
>>>>>>>>>>> device.
>>>>>>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to
>>>>>>>>>>>> get new functionality that too you mentioned a gpu device. 😊
>>>>>>>>>>>> Which does not have very long history of any backward
>>>> compatibility.
>>>>>>>>>>> Do you suggest me to add PM capability for virtio-gpu or
>>>>>>>>>>> change virtio-gpu to a PCIe device?
>>>>>>>>>>>
>>>>>>>>>> PCI Power Management Capability Structure does not seem to be
>>>>>>>>>> limited to
>>>>>>>>> PCIe.
>>>>>>>>> I am not sure, but in current Qemu code, I can see the check
>>>> "pci_is_express"
>>>>>>>>> for all virtio-pci devices. If we want to add PM capability for
>>>>>>>>> virtio-pci devices, we need to change them to PCIe device I think.
>>>>>>>>>
>>>>>>>> That is one option.
>>>>>>>> Second option to extend PCI PM cap for non pci device because it
>>>>>>>> is
>>>> supported.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Jiqian Chen.
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Jiqian Chen.
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Jiqian Chen.
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.

* [virtio-comment] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
@ 2024-01-12  9:24                                 ` Chen, Jiqian
  0 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-12  9:24 UTC (permalink / raw)
  To: Parav Pandit, Michael S. Tsirkin, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray, Chen, Jiqian

On 2024/1/12 16:47, Parav Pandit wrote:
> 
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Friday, January 12, 2024 1:55 PM
>>
>>
>> On 2024/1/12 16:02, Parav Pandit wrote:
>>> Hi Jiqian,
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Friday, January 12, 2024 1:11 PM
>>>
>>>>
>>>> Hi all,
>>>> Sorry to reply late.
>>>> I don't know if you still remember this problem, let me briefly descript it.
>>>> I am working to implement virtgpu S3 function on Xen.
>>>> Currently on Xen, if we start a guest through qemu with enabling
>>>> virtgpu, and then suspend and resume guest. We can find that the
>>>> guest kernel comes back, but the display doesn't. It just shown a black
>> screen.
>>>> That is because during suspending, guest called into qemu and qemu
>>>> destroyed all display resources and reset renderer. This made the
>>>> display gone after guest resumed.
>>>> So, I add a new command for virtio-gpu that when guest is suspending,
>>>> it will notify qemu and set parameter(preserver_resource) to 1 and
>>>> then qemu will preserve resources, and when resuming, guest will
>>>> notify qemu to set parameter to 0, and then qemu will keep the normal
>>>> actions. That can help guest's display come back.
>>>> When I upstream above implementation, Parav and MST suggest me to
>> use
>>>> the PM capability to fix this problem instead of adding a new command
>>>> or state bit.
>>>> Now, I have tried the PM capability of virtio-gpu, it can't be used
>>>> to solve this problem.
>>>> The reason is:
>>>> during guest suspending, it will write D3 state through PM cap, then
>>>> I can save the resources of virtio-gpu on Qemu side(set
>>>> preserver_resource to 1), but during the process of resuming, the
>>>> state of PM cap will be cleared by qemu
>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when
>>>> guest reads state from PM cap, it will find the virtio-gpu has
>>>> already been D0 state,
>>> This behavior needs to be fixed. As the spec listed out, " This 2-bit field is
>> used both to determine the current power state of a Function"
>> Do you mean it is wrong to reset PM cap when vritio_gpu reset in current
>> qemu code? Why?
> Because PM implements support for the D3->D0 transition, and if the device is in D3, it needs to report that it is in D3 to match the PCI spec.
> 
>> Shouldn't all device states, including PM registers, be reset during the
>> process of virtio-gpu reset?
> If No_Soft_Reset == 0, no device context is to be preserved.
> If No_Soft_Reset == 1, the device is to preserve the minimal set of registers and other items listed in the PCI spec.
> 
>>
>>>
>>> So device needs to return D3 in the PowerState field.  This is must.
>> But current behavior is a kind of soft reset, I
>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu is
>> reasonable.
> What you described means you need No_Soft_Reset = 1, so please set the capability accordingly to achieve the restore.
> 
> Also, in your case QEMU knows that it will have to resume the device soon.
> Hence, it can prepare the GPU context before resuming the vCPUs.
> In such a case, the extra register wouldn't be necessary.
> Having the PM control register is needed anyway to drive the device properly.
Even if, as you said, the PM reset behavior in the current QEMU code is incorrect, fixing it still cannot solve this S3 problem.
That is because the device is reset twice during resume.
The first reset happens when we trigger a resume command to qemu: qemu resets the device (qemu_system_wakeup-> qemu_devices_reset-> virtio_vga_base_reset-> virtio_gpu_gl_reset).
The second happens in the guest kernel (virtio_device_restore-> virtio_reset_device). The PM state change (D3 to D0) happens between the two, so the display resources are still destroyed by the second device reset.
The most feasible way to fix this problem is simply to add a new virtio-gpu command, instead of an extra register or the PM cap.
Thanks.

> 
> Having the extra busy_poll register adds flexibility, so please evaluate whether you need it.
> 
>>
>>>
>>> In addition to it, an additional busy_poll register is helpful that indicates the
>> device is ready to use.
>>> This is because 10msec is the timeout set by the PCI spec.
>>> This can be hard for the devices to implement if the large GPU state is being
>> read from files or slow media or for other hw devices in large quantities.
>>>
>>> This limit comes from the PCI spec normative of below.
>>>
>>> After transitioning a VF from D3Hot to D0, at least one of the following is
>> true:
>>> ◦ At least 10 ms has passed since the request to enter D0 was issued.
>>>
>>> So Readiness Time Reporting capability is not so useful.
>>>
>>> Hence, after PM state to D3->D0 transition is successful, virtio level PCI
>> register is useful to ensure that device is resumed to drive rest of the
>> registers.
>>>
>>>> so
>>>> guest will not write D0 through PM cap, so I can't know when to
>>>> restore the resources(set preserver_resource to 0).
>>>> Do you have any other suggestions?
>>>> Or can I just fallback to the version that add a new
>>>> command(VIRTIO_GPU_CMD_PRESERVE_RESOURCE) in virtio-gpu? I think
>> that
>>>> way is more reasonable and feasible for virtio-gpu to protect display
>>>> resources during S3. As for other devices, if necessary, they can
>>>> also refer to the implementation of the virtio-gpu to add new
>>>> commands to prevent resource loss during S3.
>>>>
>>>> On 2023/10/27 11:03, Chen, Jiqian wrote:
>>>>> Hi Michael S. Tsirkin and Parav Pandit, Thank you for your detailed
>>>>> explanation. I will try to use PM cap to fix this issue.
>>>>>
>>>>> On 2023/10/26 18:30, Michael S. Tsirkin wrote:
>>>>>> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
>>>>>>>
>>>>>>> On 2023/10/25 11:51, Parav Pandit wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>> Sent: Tuesday, October 24, 2023 5:44 PM
>>>>>>>>>
>>>>>>>>> On 2023/10/24 18:51, Parav Pandit wrote:
>>>>>>>>>>
>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
>>>>>>>>>>>
>>>>>>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
>>>>>>>>>>>>
>>>>>>>>>>>>>> I think this can be done without introducing the new register.
>>>>>>>>>>>>>> Can you please check if the PM register itself can serve
>>>>>>>>>>>>>> the purpose instead
>>>>>>>>>>>>> of new virtio level register?
>>>>>>>>>>>>> Do you mean the system PM register?
>>>>>>>>>>>> No, the device's PM register at transport level.
>>>>>>>>>>> I tried to find this register(pci level or virtio pci level or
>>>>>>>>>>> virtio driver level), but I didn't find it in Linux kernel or Qemu
>> codes.
>>>>>>>>>>> May I know which register you are referring to specifically?
>>>>>>>>>>> Or which PM state bit you mentioned below?
>>>>>>>>>>>
>>>>>>>>>> PCI spec's PCI Power Management Capability Structure in section
>>>> 7.5.2.
>>>>>>>>> Yes, what you point to is PM capability for PCIe device.
>>>>>>>>> But the problem is still that in Qemu code, it will check the
>>>>>>>>> condition(pci_bus_is_express or pci_is_express) of all
>>>>>>>>> virtio-pci devices in function virtio_pci_realize(), if the
>>>>>>>>> virtio devices aren't a PCIe device, it will not add PM capability for
>> them.
>>>>>>>> PCI PM capability is must for PCIe devices. So may be QEMU code
>>>>>>>> has put
>>>> it only under is_pcie check.
>>>>>>>> But it can be done outside of that check as well because this
>>>>>>>> capability
>>>> exists on PCI too for long time and it is backward compatible.
>>>>>>> Do you suggest me to implement PM capability for virtio devices in
>>>>>>> Qemu firstly, and then to try if the PM capability can work for
>>>>>>> this scenario?
>>>>>>
>>>>>> virtio devices in qemu already have a PM capability.
>>>>>>
>>>>>>
>>>>>>> If so, we will complicate a simple problem. Because there are no
>>>>>>> other
>>>> needs to add PM capability for virtio devices for now, if we add it
>>>> just for preserving resources, it seems unnecessary and unreasonable.
>>>> And we are not sure if there are other scenarios that are not during
>>>> the process of PM state changing also need to preserve resources, if
>>>> have, then the PM register can't cover, but preserve_resources register
>> can.
>>>>>>
>>>>>> One of the selling points of virtio is precisely reusing existing
>>>>>> platform capabilities as opposed to coming up with our own.
>>>>>> See abstract.tex
>>>>>>
>>>>>>
>>>>>>> Can I add some notes like "If PM capability is implemented for
>>>>>>> virtio
>>>> devices, it may cover this scenario, and if there are no other
>>>> scenarios that PM can't cover, then we can remove this register " in
>>>> commit message or spec description and let us continue to add
>> preserve_resources register?
>>>>>>
>>>>>> We can't remove registers.
>>>>>>
>>>>>>>>
>>>>>>>>> And another problem is how about the MMIO transport devices?
>>>>>>>>> Since preserve resources is need for all transport type devices.
>>>>>>>>>
>>>>>>>> MMIO lacks such rich PM definitions. If in future MMIO wants to
>>>> support, it will be extended to match to other transports like PCI.
>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> I think it is unreasonable to let virtio- device listen the
>>>>>>>>>>>>> PM state of Guest system.
>>>>>>>>>>>> Guest driver performs any work on the guest systems PM
>>>>>>>>>>>> callback events in
>>>>>>>>>>> the virtio driver.
>>>>>>>>>>> I didn't find any PM state callback in the virtio driver.
>>>>>>>>>>>
>>>>>>>>>> There are virtio_suspend and virtio_resume in case of Linux.
>>>>>>>>> I think the virtio_suspend/resume you mention are the freeze/restore
>>>>>>>>> callbacks from "struct virtio_driver" or the suspend/resume callbacks
>>>>>>>>> from "static const struct dev_pm_ops virtio_pci_pm_ops".
>>>>>>>>> And yes, I agree: if virtio devices have the PM capability, maybe we
>>>>>>>>> can set the PM state in those callback functions.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> It's more suitable for each device to get a notification from the
>>>>>>>>>>>>> driver and then perform the preserve-resources operation.
>>>>>>>>>>>> I agree that each device gets the notification from the driver.
>>>>>>>>>>>> The question is whether it should be the virtio driver, or whether
>>>>>>>>>>>> the existing pci driver which
>>>>>>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
>>>>>>>>>>> It seems there isn't an existing pci driver that transitions the d0
>>>>>>>>>>> or d3 state. Could you please tell me which one it is
>>>>>>>>>>> specifically? I am very willing to
>>>>>>>>> give it a try.
>>>>>>>>>>>
>>>>>>>>>> The Linux virtio-pci modern driver should be able to.
>>>>>>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But there are
>>>>>>>>> still the two problems I mentioned above.
>>>>>>>>>
>>>>>>>> Both can be resolved without switching to PCIe.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> Can you please check that?
>>>>>>>>>>>>
>>>>>>>>>>>>>>> --- a/transport-pci.tex
>>>>>>>>>>>>>>> +++ b/transport-pci.tex
>>>>>>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common
>> configuration
>>>>>>>>> structure
>>>>>>>>>>>>>>> layout}\label{sec:Virtio Transport
>>>>>>>>>>>>>>>          /* About the administration virtqueue. */
>>>>>>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver
>> */
>>>>>>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
>>>>>>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
>>>>>>>>>>>>>> Preserving these resources in the device implementation
>>>>>>>>>>>>>> takes a finite amount
>>>>>>>>>>>>> of time.
>>>>>>>>>>>>>> Possibly more than 40 nsec (the time of a PCIe write TLP).
>>>>>>>>>>>>>> Hence this register must be a polling register to indicate
>>>>>>>>>>>>>> that
>>>>>>>>>>>>> preservation_done.
>>>>>>>>>>>>>> This will tell the guest when the preservation is done, and
>>>>>>>>>>>>>> when restoration is
>>>>>>>>>>>>> done, so that it can resume upper layers.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please refer to queue_reset definition to learn more about
>>>>>>>>>>>>>> such register
>>>>>>>>>>>>> definition.
>>>>>>>>>>>>> Thanks, I will refer to "queue_reset". So I need three
>>>>>>>>>>>>> values: the driver writes 1 to make the device preserve
>>>>>>>>>>>>> resources, the driver writes
>>>>>>>>>>>>> 2 to make the device restore resources, and the device writes 0
>>>>>>>>>>>>> to tell the driver that preserving or restoring is done, am I
>>>>>>>>>>>>> right? (See the sketch after this message.)
>>>>>>>>>>>>>
>>>>>>>>>>>> Right.
>>>>>>>>>>>>
>>>>>>>>>>>> And if the existing PCIe PM state bits can do it, we can leverage
>> them.
>>>>>>>>>>>> If they cannot be used, let's add that reasoning to the commit
>>>>>>>>>>>> log to describe this
>>>>>>>>>>> register.
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let's please make sure whether the PCIe PM-level registers are
>>>>>>>>>>>>>> sufficient or not
>>>>>>>>>>>>> before deciding on the addition of this register.
>>>>>>>>>>>>> But if the device is not a PCIe device, it doesn't have the PM
>>>>>>>>>>>>> capability, so this will not work. Actually, in my local
>>>>>>>>>>>>> environment pci_is_express() returns false in QEMU; they are
>>>>>>>>>>>>> not PCIe
>>>>>>>>>>> devices.
>>>>>>>>>>>> It is reasonable to ask to plug in as a PCIe device in 2023 to
>>>>>>>>>>>> get new functionality, especially as you mentioned a gpu device. 😊
>>>>>>>>>>>> It does not have a very long history of backward
>>>> compatibility.
>>>>>>>>>>> Do you suggest that I add the PM capability to virtio-gpu, or
>>>>>>>>>>> change virtio-gpu to a PCIe device?
>>>>>>>>>>>
>>>>>>>>>> PCI Power Management Capability Structure does not seem to be
>>>>>>>>>> limited to
>>>>>>>>> PCIe.
>>>>>>>>> I am not sure, but in the current QEMU code, I can see the check
>>>> "pci_is_express"
>>>>>>>>> for all virtio-pci devices. If we want to add the PM capability for
>>>>>>>>> virtio-pci devices, we need to change them to PCIe devices, I think.
>>>>>>>>>
>>>>>>>> That is one option.
>>>>>>>> The second option is to extend the PCI PM cap to the non-PCIe
>>>>>>>> device, because it
>>>> is supported there as well.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Jiqian Chen.
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Jiqian Chen.
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Jiqian Chen.
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.
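
For illustration, the three-value handshake agreed on above could look like the
following driver-side sketch, modeled on queue_reset. It assumes the proposed
(and never standardized) preserve_resources field; the constant names and the
helper are hypothetical, while vp_iowrite16()/vp_ioread16() follow the accessor
style of the Linux virtio-pci modern driver.

#define VIRTIO_PRESERVE_DONE	0	/* device: preserve/restore finished */
#define VIRTIO_PRESERVE_START	1	/* driver: preserve resources (suspend) */
#define VIRTIO_RESTORE_START	2	/* driver: restore resources (resume) */

/* Ask the device to preserve (cmd = 1) or restore (cmd = 2) its resources,
 * then poll until the device writes 0 to signal completion. */
static void preserve_resources_sketch(__le16 __iomem *preserve_resources,
				      u16 cmd)
{
	vp_iowrite16(cmd, preserve_resources);
	while (vp_ioread16(preserve_resources) != VIRTIO_PRESERVE_DONE)
		cpu_relax();
}

In the Linux driver model, such a helper would naturally be called from the
freeze/restore callbacks of struct virtio_driver mentioned above.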

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-12  9:24                                 ` [virtio-comment] " Chen, Jiqian
@ 2024-01-12  9:44                                   ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-12  9:44 UTC (permalink / raw)
  To: Chen, Jiqian, Michael S. Tsirkin, Gerd Hoffmann
  Cc: Jason Wang, Xuan Zhuo, David Airlie, Gurchetan Singh, Chia-I Wu,
	Marc-André Lureau, Robert Beckett, Mikhail Golubev-Ciuchea,
	virtio-comment, virtio-dev, Huang, Honglei1, Zhang, Julia, Huang,
	Ray



> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Friday, January 12, 2024 2:55 PM
> 
> On 2024/1/12 16:47, Parav Pandit wrote:
> >
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Friday, January 12, 2024 1:55 PM
> >>
> >>
> >> On 2024/1/12 16:02, Parav Pandit wrote:
> >>> Hi Jiqian,
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Friday, January 12, 2024 1:11 PM
> >>>
> >>>>
> >>>> Hi all,
> >>>> Sorry to reply late.
> >>>> I don't know if you still remember this problem, so let me briefly
> describe it.
> >>>> I am working to implement the virtgpu S3 function on Xen.
> >>>> Currently on Xen, if we start a guest through QEMU with virtgpu
> >>>> enabled, and then suspend and resume the guest, we find that the
> >>>> guest kernel comes back, but the display doesn't. It just shows a
> >>>> black
> >> screen.
> >>>> That is because during suspend the guest called into QEMU, and QEMU
> >>>> destroyed all display resources and reset the renderer. This made the
> >>>> display disappear after the guest resumed.
> >>>> So I added a new command for virtio-gpu: when the guest is
> >>>> suspending, it notifies QEMU and sets a
> >>>> parameter (preserver_resource) to 1, and then QEMU will preserve
> >>>> resources; when resuming, the guest notifies QEMU to set the
> >>>> parameter to 0, and then QEMU resumes its normal actions. That can
> help the guest's display come back.
> >>>> When I upstreamed the above implementation, Parav and MST suggested I
> >> use
> >>>> the PM capability to fix this problem instead of adding a new
> >>>> command or state bit.
> >>>> Now I have tried the PM capability of virtio-gpu; it can't be used
> >>>> to solve this problem.
> >>>> The reason is:
> >>>> during guest suspend, the guest writes the D3 state through the PM cap,
> >>>> and then I can save the resources of virtio-gpu on the QEMU side (set
> >>>> preserver_resource to 1). But during the process of resuming, the
> >>>> state of the PM cap will be cleared by QEMU's
> >>>> resetting (qemu_system_wakeup-> qemu_devices_reset->
> >>>> virtio_vga_base_reset-> virtio_gpu_gl_reset). This causes that when the
> >>>> guest reads the state from the PM cap, it finds the virtio-gpu is
> >>>> already in the D0 state,
> >>> This behavior needs to be fixed. As the spec spells out, "This
> >>> 2-bit field is
> >> used both to determine the current power state of a Function"
> >> Do you mean it is wrong to reset the PM cap when virtio-gpu resets in
> >> the current QEMU code? Why?
> > Because PM implements the support for the D3->D0 transition, and if the
> device is in D3, the device needs to respond that it is in D3 to match the PCI
> spec.
> >
> >> Shouldn't all device state, including the PM registers, be reset during
> >> the process of virtio-gpu reset?
> > If No_Soft_Reset == 0, no device context is to be preserved.
> > If No_Soft_Reset == 1, the device is to preserve minimal registers and the
> other things listed in the PCI spec.
> >
> >>
> >>>
> >>> So the device needs to return D3 in the PowerState field. This is a must.
> >> But the current behavior is a kind of soft reset, I
> >> think (!PCI_PM_CTRL_NO_SOFT_RESET). That the PM cap is reset by QEMU is
> >> reasonable.
> > What you described means you need No_Soft_Reset=1, so please set the
> cap accordingly to achieve the restore.
> >
> > Also, in your case, QEMU knows that it will have to resume the
> device soon.
> > Hence, it can well prepare the gpu context before resuming the vcpus.
> > In such a case, the extra register wouldn't be necessary.
> > Having the PM control register is anyway needed to drive the device properly.
> Even if, as you said, the PM reset behavior in the current QEMU code is
> incorrect, modifying it still cannot fix this problem with S3.
You meant D3?

> Because the device is reset twice during resume.
> The first is when we trigger a resume command to QEMU; QEMU will reset the
> device (qemu_system_wakeup-> qemu_devices_reset->
> virtio_vga_base_reset-> virtio_gpu_gl_reset).
A device implementation that supports PM should not reset the device. Please fix it.

> After the first one, the second happens in the guest kernel,
> virtio_device_restore-> virtio_reset_device. But the PM state change (D3 to
> D0) happens between those two, so the display resources will still be
> destroyed by the second device reset.
A driver that understands that the device supports PM will skip the reset steps.
Hence the second reset will not occur either.
Therefore, the device state will be restored.
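
As a concrete illustration of the skip-reset behavior described above, here is
a minimal driver-side sketch. It uses only the standard PM capability constants
from <uapi/linux/pci_regs.h>; the helper names are hypothetical, not an
existing kernel API.

#include <linux/pci.h>

/* Returns true when the function advertises No_Soft_Reset, i.e. its
 * context survives a D3hot->D0 transition. */
static bool pm_context_preserved(struct pci_dev *dev)
{
	int pm = pci_find_capability(dev, PCI_CAP_ID_PM);
	u16 pmcsr;

	if (!pm)
		return false;
	pci_read_config_word(dev, pm + PCI_PM_CTRL, &pmcsr);
	return pmcsr & PCI_PM_CTRL_NO_SOFT_RESET;
}

/* Hypothetical resume path: when context was preserved, skip the
 * second reset that would otherwise destroy the display resources. */
static void resume_sketch(struct pci_dev *dev)
{
	if (pm_context_preserved(dev))
		return;	/* device context intact, nothing to rebuild */
	/* otherwise: full virtio reset and re-initialization */
}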

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-12  7:41                         ` [virtio-comment] " Chen, Jiqian
@ 2024-01-15  0:25                           ` Jason Wang
  -1 siblings, 0 replies; 76+ messages in thread
From: Jason Wang @ 2024-01-15  0:25 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Parav Pandit, Gerd Hoffmann, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray

On Fri, Jan 12, 2024 at 3:41 PM Chen, Jiqian <Jiqian.Chen@amd.com> wrote:
>
> Hi all,
> Sorry to reply late.
> I don't know if you still remember this problem, so let me briefly describe it.
> I am working to implement the virtgpu S3 function on Xen.
> Currently on Xen, if we start a guest through QEMU with virtgpu enabled, and then suspend and resume the guest, we find that the guest kernel comes back, but the display doesn't. It just shows a black screen.
> That is because during suspend the guest called into QEMU, and QEMU destroyed all display resources and reset the renderer. This made the display disappear after the guest resumed.
> So I added a new command for virtio-gpu: when the guest is suspending, it notifies QEMU and sets a parameter (preserver_resource) to 1, and then QEMU will preserve resources; when resuming, the guest notifies QEMU to set the parameter to 0, and then QEMU resumes its normal actions. That can help the guest's display come back.
> When I upstreamed the above implementation, Parav and MST suggested I use the PM capability to fix this problem instead of adding a new command or state bit.
> Now I have tried the PM capability of virtio-gpu; it can't be used to solve this problem.
> The reason is:
> during guest suspend, the guest writes the D3 state through the PM cap, and then I can save the resources of virtio-gpu on the QEMU side (set preserver_resource to 1),
> but during the process of resuming, the state of the PM cap will be cleared by QEMU resetting (qemu_system_wakeup-> qemu_devices_reset-> virtio_vga_base_reset-> virtio_gpu_gl_reset),
> it causes that when the guest reads the state from the PM cap, it finds the virtio-gpu is already in the D0 state, so the guest will not write D0 through the PM cap, so I can't know when to restore the resources (set preserver_resource to 0).

Looks like a bug to be fixed?

> Do you have any other suggestions?
> Or can I just fall back to the version that adds a new command (VIRTIO_GPU_CMD_PRESERVE_RESOURCE) to virtio-gpu? I think that way is more reasonable and feasible for virtio-gpu to protect display resources during S3. As for other devices, if necessary, they can also refer to the implementation in virtio-gpu and add new commands to prevent resource loss during S3.

Note that there's recently a fix for no_soft_reset; a well-behaved
device should not suffer from this issue anymore. If I understand this
correctly, there's no need for any spec extension either:

https://lore.kernel.org/lkml/CACGkMEs_MajTFxGVcK5R8oqQzTxuL4Pm=uUnOonDczWzqaucsw@mail.gmail.com/T/

Thanks
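
For reference, the PM-capability flow the quoted report describes (write D3hot
on suspend, D0 on resume) boils down to toggling the PowerState field of PMCSR.
A hedged sketch, using only standard constants from <uapi/linux/pci_regs.h>;
the helper name is illustrative:

#include <linux/pci.h>

/* Move a function between D0 (0) and D3hot (3) via the PM capability's
 * PMCSR PowerState field. */
static void pm_set_power_state(struct pci_dev *dev, u16 state)
{
	int pm = pci_find_capability(dev, PCI_CAP_ID_PM);
	u16 pmcsr;

	if (!pm)
		return;
	pci_read_config_word(dev, pm + PCI_PM_CTRL, &pmcsr);
	pmcsr = (pmcsr & ~PCI_PM_CTRL_STATE_MASK) | state;
	pci_write_config_word(dev, pm + PCI_PM_CTRL, pmcsr);
}

/* suspend: pm_set_power_state(dev, PCI_D3hot); resume: pm_set_power_state(dev, PCI_D0); */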

>
> On 2023/10/27 11:03, Chen, Jiqian wrote:
> > Hi Michael S. Tsirkin and Parav Pandit,
> > Thank you for your detailed explanation. I will try to use PM cap to fix this issue.
> >
> > On 2023/10/26 18:30, Michael S. Tsirkin wrote:
> >> On Thu, Oct 26, 2023 at 10:24:26AM +0000, Chen, Jiqian wrote:
> >>>
> >>> On 2023/10/25 11:51, Parav Pandit wrote:
> >>>>
> >>>>
> >>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>> Sent: Tuesday, October 24, 2023 5:44 PM
> >>>>>
> >>>>> On 2023/10/24 18:51, Parav Pandit wrote:
> >>>>>>
> >>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>> Sent: Tuesday, October 24, 2023 4:06 PM
> >>>>>>>
> >>>>>>> On 2023/10/23 21:35, Parav Pandit wrote:
> >>>>>>>>
> >>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>>> Sent: Monday, October 23, 2023 4:09 PM
> >>>>>>>>
> >>>>>>>>>> I think this can be done without introducing the new register.
> >>>>>>>>>> Can you please check if the PM register itself can serve the
> >>>>>>>>>> purpose instead
> >>>>>>>>> of new virtio level register?
> >>>>>>>>> Do you mean the system PM register?
> >>>>>>>> No, the device's PM register at transport level.
> >>>>>>> I tried to find this register(pci level or virtio pci level or virtio
> >>>>>>> driver level), but I didn't find it in Linux kernel or Qemu codes.
> >>>>>>> May I know which register you are referring to specifically? Or which
> >>>>>>> PM state bit you mentioned below?
> >>>>>>>
> >>>>>> PCI spec's PCI Power Management Capability Structure in section 7.5.2.
> >>>>> Yes, what you point to is PM capability for PCIe device.
> >>>>> But the problem is still that in Qemu code, it will check the
> >>>>> condition(pci_bus_is_express or pci_is_express) of all virtio-pci devices in
> >>>>> function virtio_pci_realize(), if the virtio devices aren't a PCIe device, it will not
> >>>>> add PM capability for them.
> >>>> PCI PM capability is must for PCIe devices. So may be QEMU code has put it only under is_pcie check.
> >>>> But it can be done outside of that check as well because this capability exists on PCI too for long time and it is backward compatible.
> >>> Do you suggest me to implement PM capability for virtio devices in
> >>> Qemu firstly, and then to try if the PM capability can work for this
> >>> scenario?
> >>
> >> virtio devices in qemu already have a PM capability.
> >>
> >>
> >>> If so, we will complicate a simple problem. Because there are no other needs to add PM capability for virtio devices for now, if we add it just for preserving resources, it seems unnecessary and unreasonable. And we are not sure if there are other scenarios that are not during the process of PM state changing also need to preserve resources, if have, then the PM register can't cover, but preserve_resources register can.
> >>
> >> One of the selling points of virtio is precisely reusing existing
> >> platform capabilities as opposed to coming up with our own.
> >> See abstract.tex
> >>
> >>
> >>> Can I add some notes like "If PM capability is implemented for virtio devices, it may cover this scenario, and if there are no other scenarios that PM can't cover, then we can remove this register " in commit message or spec description and let us continue to add preserve_resources register?
> >>
> >> We can't remove registers.
> >>
> >>>>
> >>>>> And another problem is how about the MMIO transport devices? Since
> >>>>> preserve resources is need for all transport type devices.
> >>>>>
> >>>> MMIO lacks such rich PM definitions. If in future MMIO wants to support, it will be extended to match to other transports like PCI.
> >>>>
> >>>>>>>>
> >>>>>>>>> I think it is unreasonable to let virtio- device listen the PM
> >>>>>>>>> state of Guest system.
> >>>>>>>> Guest driver performs any work on the guest systems PM callback
> >>>>>>>> events in
> >>>>>>> the virtio driver.
> >>>>>>> I didn't find any PM state callback in the virtio driver.
> >>>>>>>
> >>>>>> There are virtio_suspend and virtio_resume in case of Linux.
> >>>>> I think what you said virtio_suspend/resume is freeze/restore callback from
> >>>>> "struct virtio_driver" or suspend/resume callback from "static const struct
> >>>>> dev_pm_ops virtio_pci_pm_ops".
> >>>>> And yes, I agree, if virtio devices have PM capability, maybe we can set PM state
> >>>>> in those callback functions.
> >>>>>
> >>>>>>
> >>>>>>>>
> >>>>>>>>> It's more suitable that each device gets notifications from driver,
> >>>>>>>>> and then do preserving resources operation.
> >>>>>>>> I agree that each device gets the notification from driver.
> >>>>>>>> The question is, should it be virtio driver, or existing pci driver
> >>>>>>>> which
> >>>>>>> transitions the state from d0->d3 and d3->d0 is enough.
> >>>>>>> It seems there isn't existing pci driver to transitions d0 or d3
> >>>>>>> state. Could you please tell me which one it is specifically? I am very willing to
> >>>>> give a try.
> >>>>>>>
> >>>>>> Virtio-pci modern driver of Linux should be able to.
> >>>>> Yes, I know, it is the VIRTIO_PCI_FLAG_INIT_PM_BIT. But still the two problems I
> >>>>> said above.
> >>>>>
> >>>> Both can be resolved without switching to pcie.
> >>>>
> >>>>>>
> >>>>>>>> Can you please check that?
> >>>>>>>>
> >>>>>>>>>>> --- a/transport-pci.tex
> >>>>>>>>>>> +++ b/transport-pci.tex
> >>>>>>>>>>> @@ -325,6 +325,7 @@ \subsubsection{Common configuration
> >>>>> structure
> >>>>>>>>>>> layout}\label{sec:Virtio Transport
> >>>>>>>>>>>          /* About the administration virtqueue. */
> >>>>>>>>>>>          le16 admin_queue_index;         /* read-only for driver */
> >>>>>>>>>>>          le16 admin_queue_num;         /* read-only for driver */
> >>>>>>>>>>> +        le16 preserve_resources;        /* read-write */
> >>>>>>>>>> Preserving these resources in the device implementation takes
> >>>>>>>>>> finite amount
> >>>>>>>>> of time.
> >>>>>>>>>> Possibly more than 40nsec (time of PCIe write TLP).
> >>>>>>>>>> Hence this register must be a polling register to indicate that
> >>>>>>>>> preservation_done.
> >>>>>>>>>> This will tell the guest when the preservation is done, and when
> >>>>>>>>>> restoration is
> >>>>>>>>> done, so that it can resume upper layers.
> >>>>>>>>>>
> >>>>>>>>>> Please refer to queue_reset definition to learn more about such
> >>>>>>>>>> register
> >>>>>>>>> definition.
> >>>>>>>>> Thanks, I will refer to "queue_reset". So, I need three values,
> >>>>>>>>> driver write 1 to let device do preserving resources, driver write
> >>>>>>>>> 2 to let device do restoring resources, device write 0 to tell
> >>>>>>>>> driver that preserving or restoring done, am I right?
> >>>>>>>>>
> >>>>>>>> Right.
> >>>>>>>>
> >>>>>>>> And if the existing pcie pm state bits can do, we can leverage that.
> >>>>>>>> If it cannot be used, lets add that reasoning in the commit log to
> >>>>>>>> describe this
> >>>>>>> register.
> >>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Lets please make sure that PCIe PM level registers are
> >>>>>>>>>> sufficient/not-sufficient
> >>>>>>>>> to decide the addition of this register.
> >>>>>>>>> But if the device is not a PCIe device, it doesn't have PM
> >>>>>>>>> capability, then this will not work. Actually in my local
> >>>>>>>>> environment, pci_is_express() return false in Qemu, they are not
> >>>>>>>>> PCIe
> >>>>>>> device.
> >>>>>>>> It is reasonable to ask to plug in as PCIe device in 2023 to get new
> >>>>>>>> functionality that too you mentioned a gpu device. 😊
> >>>>>>>> Which does not have very long history of any backward compatibility.
> >>>>>>> Do you suggest me to add PM capability for virtio-gpu or change
> >>>>>>> virtio-gpu to a PCIe device?
> >>>>>>>
> >>>>>> PCI Power Management Capability Structure does not seem to be limited to
> >>>>> PCIe.
> >>>>> I am not sure, but in current Qemu code, I can see the check "pci_is_express"
> >>>>> for all virtio-pci devices. If we want to add PM capability for virtio-pci devices,
> >>>>> we need to change them to PCIe device I think.
> >>>>>
> >>>> That is one option.
> >>>> Second option to extend PCI PM cap for non pci device because it is supported.
> >>>>
> >>>>>>
> >>>>>
> >>>>> --
> >>>>> Best regards,
> >>>>> Jiqian Chen.
> >>>
> >>> --
> >>> Best regards,
> >>> Jiqian Chen.
> >>
> >
>
> --
> Best regards,
> Jiqian Chen.




^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-comment] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  0:25                           ` [virtio-comment] " Jason Wang
@ 2024-01-15  7:25                             ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-15  7:25 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, Parav Pandit, Gerd Hoffmann, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/15 08:25, Jason Wang wrote:
> On Fri, Jan 12, 2024 at 3:41 PM Chen, Jiqian <Jiqian.Chen@amd.com> wrote:
>>
>> Hi all,
>> Sorry to reply late.
>> I don't know if you still remember this problem, so let me briefly describe it.
>> I am working to implement virtgpu S3 function on Xen.
>> Currently on Xen, if we start a guest through qemu with enabling virtgpu, and then suspend and resume guest. We can find that the guest kernel comes back, but the display doesn't. It just shown a black screen.
>> That is because during suspending, guest called into qemu and qemu destroyed all display resources and reset renderer. This made the display gone after guest resumed.
>> So, I add a new command for virtio-gpu that when guest is suspending, it will notify qemu and set parameter(preserver_resource) to 1 and then qemu will preserve resources, and when resuming, guest will notify qemu to set parameter to 0, and then qemu will keep the normal actions. That can help guest's display come back.
>> When I upstream above implementation, Parav and MST suggest me to use the PM capability to fix this problem instead of adding a new command or state bit.
>> Now, I have tried the PM capability of virtio-gpu, it can't be used to solve this problem.
>> The reason is:
>> during guest suspending, it will write D3 state through PM cap, then I can save the resources of virtio-gpu on Qemu side(set preserver_resource to 1),
>> but during the process of resuming, the state of PM cap will be cleared by qemu resetting(qemu_system_wakeup-> qemu_devices_reset-> virtio_vga_base_reset-> virtio_gpu_gl_reset),
>> it causes that when guest reads state from PM cap, it will find the virtio-gpu has already been D0 state, so guest will not write D0 through PM cap, so I can't know when to restore the resources(set preserver_resource to 0).
> 
> Looks like a bug to be fixed?
Do you think the devices_reset behavior of QEMU shouldn't happen? Or just that the state of the PM cap shouldn't be cleared?

> 
>> Do you have any other suggestions?
>> Or can I just fallback to the version that add a new command(VIRTIO_GPU_CMD_PRESERVE_RESOURCE) in virtio-gpu? I think that way is more reasonable and feasible for virtio-gpu to protect display resources during S3. As for other devices, if necessary, they can also refer to the implementation of the virtio-gpu to add new commands to prevent resource loss during S3.
> 
> Note that there's recently a fix for no_soft_reset, a well behaved
> device should not suffer from this issue anymore. If I understand this
> correctly, there's no need for any extension for the spec as well.:
But in the current QEMU code, the PCI_PM_CTRL_NO_SOFT_RESET bit of virtio-pci devices isn't set; is that a bug?
Even with the fix you mentioned, the reset can still happen because no_soft_reset == 0.

> 
> https://lore.kernel.org/lkml/CACGkMEs_MajTFxGVcK5R8oqQzTxuL4Pm=uUnOonDczWzqaucsw@mail.gmail.com/T/
> 
> Thanks
> 

Best Regards,
Jiqian Chen.
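
If QEMU were to set that bit, the device-side change might look roughly like
the sketch below. pci_add_capability() and pci_set_word() are real QEMU
helpers (exact signatures vary across QEMU versions), but the function, its
placement, and the wmask choice here are hypothetical, not the actual
virtio-pci patch.

#include "hw/pci/pci.h"

/* Hypothetical: expose a PM capability that advertises No_Soft_Reset,
 * so a D3hot->D0 transition implies preserved device context. */
static void virtio_pci_init_pm_sketch(PCIDevice *pci_dev)
{
    int pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF);

    /* Device context is preserved across D3hot->D0 (No_Soft_Reset = 1). */
    pci_set_word(pci_dev->config + pos + PCI_PM_CTRL,
                 PCI_PM_CTRL_NO_SOFT_RESET);
    /* Allow the guest to toggle only the PowerState field. */
    pci_set_word(pci_dev->wmask + pos + PCI_PM_CTRL,
                 PCI_PM_CTRL_STATE_MASK);
}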

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-12  9:44                                   ` [virtio-comment] " Parav Pandit
@ 2024-01-15  7:33                                     ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-15  7:33 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/12 17:44, Parav Pandit wrote:
> 
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Friday, January 12, 2024 2:55 PM
>>
>> On 2024/1/12 16:47, Parav Pandit wrote:
>>>
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Friday, January 12, 2024 1:55 PM
>>>>
>>>>
>>>> On 2024/1/12 16:02, Parav Pandit wrote:
>>>>> Hi Jiqian,
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Friday, January 12, 2024 1:11 PM
>>>>>
>>>>>>
>>>>>> Hi all,
>>>>>> Sorry to reply late.
>>>>>> I don't know if you still remember this problem, so let me briefly
>> describe it.
>>>>>> I am working to implement virtgpu S3 function on Xen.
>>>>>> Currently on Xen, if we start a guest through qemu with enabling
>>>>>> virtgpu, and then suspend and resume guest. We can find that the
>>>>>> guest kernel comes back, but the display doesn't. It just shown a
>>>>>> black
>>>> screen.
>>>>>> That is because during suspending, guest called into qemu and qemu
>>>>>> destroyed all display resources and reset renderer. This made the
>>>>>> display gone after guest resumed.
>>>>>> So, I add a new command for virtio-gpu that when guest is
>>>>>> suspending, it will notify qemu and set
>>>>>> parameter(preserver_resource) to 1 and then qemu will preserve
>>>>>> resources, and when resuming, guest will notify qemu to set
>>>>>> parameter to 0, and then qemu will keep the normal actions. That can
>> help guest's display come back.
>>>>>> When I upstream above implementation, Parav and MST suggest me to
>>>> use
>>>>>> the PM capability to fix this problem instead of adding a new
>>>>>> command or state bit.
>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't be used
>>>>>> to solve this problem.
>>>>>> The reason is:
>>>>>> during guest suspending, it will write D3 state through PM cap,
>>>>>> then I can save the resources of virtio-gpu on Qemu side(set
>>>>>> preserver_resource to 1), but during the process of resuming, the
>>>>>> state of PM cap will be cleared by qemu
>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when
>>>>>> guest reads state from PM cap, it will find the virtio-gpu has
>>>>>> already been D0 state,
>>>>> This behavior needs to be fixed. As the spec listed out, " This
>>>>> 2-bit field is
>>>> used both to determine the current power state of a Function"
>>>> Do you mean it is wrong to reset PM cap when vritio_gpu reset in
>>>> current qemu code? Why?
>>> Because PM is implementing the support from D3->d0 transition and if it
>> device in D3, the device needs to respond that it is in D3 to match the PCI
>> spec.
>>>
>>>> Shouldn't all device states, including PM registers, be reset during
>>>> the process of virtio-gpu reset?
>>> If No_Soft_Reset== 0, no device context to be preserved.
>>> If No_Soft_Reset == 1, device to preserve minimal registers and other
>> things listed in pci spec.
>>>
>>>>
>>>>>
>>>>> So device needs to return D3 in the PowerState field.  This is must.
>>>> But current behavior is a kind of soft reset, I
>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu is
>>>> reasonable.
>>> What you described is, you need to do No_Soft_Reset=1, so please set the
>> cap accordingly to achieve the restore.
>>>
>>> Also in your case the if the QEMU knows that it will have to resume the
>> device soon.
>>> Hence, it can well prepare the gpu context before resuming the vcpus.
>>> In such case, the extra register wouldn’t be necessary.
>>> Having PM control register is anyway needed to drive the device properly.
>> Even as you said, the reset behavior of PM in the current QEMU code is
>> incorrect, and even if it is modified, it also cannot fix this problem with S3.
> You meant D3?
> 
>> Because there is twice device reset during resuming.
>> First is when we trigger a resume command to qemu, qemu will reset
>> device(qemu_system_wakeup-> qemu_devices_reset->
>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
> A device implementation that supports PM should not reset the device. Please fix it.
Do you mean the devices_reset behavior of QEMU shouldn't happen when a device has the PM cap?
Or just that the state of the PM cap shouldn't be reset?

> 
>> After the first time, then second time happens in guest kernel,
>> virtio_device_restore-> virtio_reset_device. But the PM state changes (D3 to
>> D0) happens between those two, so the display resources will be still
>> destroyed by the second time of device reset.
> The driver that understands that device supports PM, will skip the reset steps.
But it does happen today.
In the current QEMU code, the PCI_PM_CTRL_NO_SOFT_RESET bit of virtio-pci devices isn't set; is that a bug? It means no_soft_reset == 0.

> Hence the 2nd reset will also not occur.
> Therefore, device state will be restored.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  7:33                                     ` [virtio-comment] " Chen, Jiqian
@ 2024-01-15  7:37                                       ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-15  7:37 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Monday, January 15, 2024 1:03 PM
> 
> On 2024/1/12 17:44, Parav Pandit wrote:
> >
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Friday, January 12, 2024 2:55 PM
> >>
> >> On 2024/1/12 16:47, Parav Pandit wrote:
> >>>
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Friday, January 12, 2024 1:55 PM
> >>>>
> >>>>
> >>>> On 2024/1/12 16:02, Parav Pandit wrote:
> >>>>> Hi Jiqian,
> >>>>>
> >>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>> Sent: Friday, January 12, 2024 1:11 PM
> >>>>>
> >>>>>>
> >>>>>> Hi all,
> >>>>>> Sorry to reply late.
> >>>>>> I don't know if you still remember this problem, let me briefly
> >>>>>> describe
> >> it.
> >>>>>> I am working to implement virtgpu S3 function on Xen.
> >>>>>> Currently on Xen, if we start a guest through qemu with enabling
> >>>>>> virtgpu, and then suspend and resume guest. We can find that the
> >>>>>> guest kernel comes back, but the display doesn't. It just shows a
> >>>>>> black
> >>>> screen.
> >>>>>> That is because during suspending, guest called into qemu and
> >>>>>> qemu destroyed all display resources and reset renderer. This
> >>>>>> made the display gone after guest resumed.
> >>>>>> So, I add a new command for virtio-gpu that when guest is
> >>>>>> suspending, it will notify qemu and set
> >>>>>> parameter(preserver_resource) to 1 and then qemu will preserve
> >>>>>> resources, and when resuming, guest will notify qemu to set
> >>>>>> parameter to 0, and then qemu will keep the normal actions. That
> >>>>>> can
> >> help guest's display come back.
> >>>>>> When I upstream above implementation, Parav and MST suggest me
> to
> >>>> use
> >>>>>> the PM capability to fix this problem instead of adding a new
> >>>>>> command or state bit.
> >>>>>> Now, I have tried the PM capability of virtio-gpu, it can't be
> >>>>>> used to solve this problem.
> >>>>>> The reason is:
> >>>>>> during guest suspending, it will write D3 state through PM cap,
> >>>>>> then I can save the resources of virtio-gpu on Qemu side(set
> >>>>>> preserver_resource to 1), but during the process of resuming, the
> >>>>>> state of PM cap will be cleared by qemu
> >>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
> >>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when
> >>>>>> guest reads state from PM cap, it will find the virtio-gpu has
> >>>>>> already been D0 state,
> >>>>> This behavior needs to be fixed. As the spec listed out, " This
> >>>>> 2-bit field is
> >>>> used both to determine the current power state of a Function"
> >>>> Do you mean it is wrong to reset PM cap when virtio_gpu reset in
> >>>> current qemu code? Why?
> >>> Because PM is implementing the support from D3->d0 transition and if
> >>> it
> >> device in D3, the device needs to respond that it is in D3 to match
> >> the PCI spec.
> >>>
> >>>> Shouldn't all device states, including PM registers, be reset
> >>>> during the process of virtio-gpu reset?
> >>> If No_Soft_Reset== 0, no device context to be preserved.
> >>> If No_Soft_Reset == 1, device to preserve minimal registers and
> >>> other
> >> things listed in pci spec.
> >>>
> >>>>
> >>>>>
> >>>>> So device needs to return D3 in the PowerState field.  This is must.
> >>>> But current behavior is a kind of soft reset, I
> >>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu is
> >>>> reasonable.
> >>> What you described is, you need to do No_Soft_Reset=1, so please set
> >>> the
> >> cap accordingly to achieve the restore.
> >>>
> >>> Also in your case the QEMU knows that it will have to resume
> >>> the
> >> device soon.
> >>> Hence, it can well prepare the gpu context before resuming the vcpus.
> >>> In such case, the extra register wouldn’t be necessary.
> >>> Having PM control register is anyway needed to drive the device properly.
> >> Even as you said, the reset behavior of PM in the current QEMU code
> >> is incorrect, and even if it is modified, it also cannot fix this problem with S3.
> > You meant D3?
> >
> >> Because there is twice device reset during resuming.
> >> First is when we trigger a resume command to qemu, qemu will reset
> >> device(qemu_system_wakeup-> qemu_devices_reset->
> >> virtio_vga_base_reset-> virtio_gpu_gl_reset).
> > A device implementation that supports PM should not reset the device.
> Please fix it.
> Do you mean the devices_reset behavior of qemu shouldn't happen when a
> device has PM cap?
Right. Device reset should not happen on a PM-capable device that offers no_soft_reset==1.

> Or just the state of PM cap shouldn't be reset?
> 
> >
> >> After the first time, then second time happens in guest kernel,
> >> virtio_device_restore-> virtio_reset_device. But the PM state changes
> >> (D3 to
> >> D0) happens between those two, so the display resources will be still
> >> destroyed by the second time of device reset.
> > The driver that understands that device supports PM, will skip the reset
> steps.
> But it happens for now.
This should be enhanced in the guest driver.

> In current qemu codes, the bit PCI_PM_CTRL_NO_SOFT_RESET of virtio-pci
> devices isn't set, is it a bug? It means the no_soft_reset==0.
> 
It should be set when/if the device can store and restore the device context on the D3->D0 transition.
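
As an illustration only (not current qemu code, and the exact
pci_add_capability() signature varies across qemu versions; error
handling is elided), advertising that bit when the device model really
preserves its context could look roughly like this:

/* Sketch: expose a PM capability whose No_Soft_Reset bit promises the
 * guest that device context survives the D3hot->D0 transition. Only
 * set it if the device model actually keeps its state. */
static void virtio_pci_add_pm_cap_sketch(PCIDevice *pci_dev)
{
    int pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF);

    /* The PowerState bits are guest-writable ... */
    pci_set_word(pci_dev->wmask + pos + PCI_PM_CTRL,
                 PCI_PM_CTRL_STATE_MASK);
    /* ... while No_Soft_Reset is a read-only promise to the guest. */
    pci_set_word(pci_dev->config + pos + PCI_PM_CTRL,
                 PCI_PM_CTRL_NO_SOFT_RESET);
}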

> > Hence the 2nd reset will also not occur.
> > Therefore, device state will be restored.
> 
> --
> Best regards,
> Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  7:37                                       ` [virtio-comment] " Parav Pandit
@ 2024-01-15  7:48                                         ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-15  7:48 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/15 15:37, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, January 15, 2024 1:03 PM
>>
>> On 2024/1/12 17:44, Parav Pandit wrote:
>>>
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Friday, January 12, 2024 2:55 PM
>>>>
>>>> On 2024/1/12 16:47, Parav Pandit wrote:
>>>>>
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Friday, January 12, 2024 1:55 PM
>>>>>>
>>>>>>
>>>>>> On 2024/1/12 16:02, Parav Pandit wrote:
>>>>>>> Hi Jiqian,
>>>>>>>
>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>> Sent: Friday, January 12, 2024 1:11 PM
>>>>>>>
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>> Sorry to reply late.
>>>>>>>> I don't know if you still remember this problem, let me briefly
>>>>>>>> describe
>>>> it.
>>>>>>>> I am working to implement virtgpu S3 function on Xen.
>>>>>>>> Currently on Xen, if we start a guest through qemu with enabling
>>>>>>>> virtgpu, and then suspend and resume guest. We can find that the
>>>>>>>> guest kernel comes back, but the display doesn't. It just shows a
>>>>>>>> black
>>>>>> screen.
>>>>>>>> That is because during suspending, guest called into qemu and
>>>>>>>> qemu destroyed all display resources and reset renderer. This
>>>>>>>> made the display gone after guest resumed.
>>>>>>>> So, I add a new command for virtio-gpu that when guest is
>>>>>>>> suspending, it will notify qemu and set
>>>>>>>> parameter(preserver_resource) to 1 and then qemu will preserve
>>>>>>>> resources, and when resuming, guest will notify qemu to set
>>>>>>>> parameter to 0, and then qemu will keep the normal actions. That
>>>>>>>> can
>>>> help guest's display come back.
>>>>>>>> When I upstream above implementation, Parav and MST suggest me
>> to
>>>>>> use
>>>>>>>> the PM capability to fix this problem instead of adding a new
>>>>>>>> command or state bit.
>>>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't be
>>>>>>>> used to solve this problem.
>>>>>>>> The reason is:
>>>>>>>> during guest suspending, it will write D3 state through PM cap,
>>>>>>>> then I can save the resources of virtio-gpu on Qemu side(set
>>>>>>>> preserver_resource to 1), but during the process of resuming, the
>>>>>>>> state of PM cap will be cleared by qemu
>>>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that when
>>>>>>>> guest reads state from PM cap, it will find the virtio-gpu has
>>>>>>>> already been D0 state,
>>>>>>> This behavior needs to be fixed. As the spec listed out, " This
>>>>>>> 2-bit field is
>>>>>> used both to determine the current power state of a Function"
>>>>>> Do you mean it is wrong to reset PM cap when virtio_gpu reset in
>>>>>> current qemu code? Why?
>>>>> Because PM is implementing the support from D3->d0 transition and if
>>>>> it
>>>> device in D3, the device needs to respond that it is in D3 to match
>>>> the PCI spec.
>>>>>
>>>>>> Shouldn't all device states, including PM registers, be reset
>>>>>> during the process of virtio-gpu reset?
>>>>> If No_Soft_Reset== 0, no device context to be preserved.
>>>>> If No_Soft_Reset == 1, device to preserve minimal registers and
>>>>> other
>>>> things listed in pci spec.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> So device needs to return D3 in the PowerState field.  This is must.
>>>>>> But current behavior is a kind of soft reset, I
>>>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu is
>>>>>> reasonable.
>>>>> What you described is, you need to do No_Soft_Reset=1, so please set
>>>>> the
>>>> cap accordingly to achieve the restore.
>>>>>
>>>>> Also in your case the QEMU knows that it will have to resume
>>>>> the
>>>> device soon.
>>>>> Hence, it can well prepare the gpu context before resuming the vcpus.
>>>>> In such case, the extra register wouldn’t be necessary.
>>>>> Having PM control register is anyway needed to drive the device properly.
>>>> Even as you said, the reset behavior of PM in the current QEMU code
>>>> is incorrect, and even if it is modified, it also cannot fix this problem with S3.
>>> You meant D3?
>>>
>>>> Because there is twice device reset during resuming.
>>>> First is when we trigger a resume command to qemu, qemu will reset
>>>> device(qemu_system_wakeup-> qemu_devices_reset->
>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
>>> A device implementation that supports PM should not reset the device.
>> Please fix it.
>> Do you mean the devices_reset behavior of qemu shouldn't happen when a
>> device has PM cap?
> Right. Device reset should not happen on PM capable device that offers no_soft_reset==1.
But in the current qemu code, no_soft_reset==0.

> 
>> Or just the state of PM cap shouldn't be reset?
>>
>>>
>>>> After the first time, then second time happens in guest kernel,
>>>> virtio_device_restore-> virtio_reset_device. But the PM state changes
>>>> (D3 to
>>>> D0) happens between those two, so the display resources will be still
>>>> destroyed by the second time of device reset.
>>> The driver that understands that device supports PM, will skip the reset
>> steps.
>> But it happens for now.
> This should be enhanced in the guest driver.
> 
>> In current qemu codes, the bit PCI_PM_CTRL_NO_SOFT_RESET of virtio-pci
>> devices isn't set, is it a bug? It means the no_soft_reset==0.
>>
> It should be set when/if the device can store and restore the device context on d3->d0 transition.
I don't know why it isn't set in the current qemu code.
What do you mean by "the device can store and restore the device context on d3->d0 transition"? The real physical device, or the simulated device on the qemu side?
How do I confirm it?
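
For what it's worth, one way to confirm it from the guest is to read the
PMCSR out of the function's config space (lspci -vv also shows the bit
as "NoSoftRst+/-"). A small illustrative userspace sketch, with a
bare-bones capability walk and no sanity checks (run as root, since
unprivileged reads return only the first 64 bytes of config space):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define PCI_CAPABILITY_LIST        0x34
#define PCI_CAP_ID_PM              0x01
#define PCI_PM_CTRL                4      /* PMCSR offset inside the cap */
#define PCI_PM_CTRL_NO_SOFT_RESET  0x0008

/* Usage: ./nosoftrst /sys/bus/pci/devices/0000:00:02.0/config */
int main(int argc, char **argv)
{
    uint8_t cfg[256];
    int fd;

    if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0)
        return 1;
    if (pread(fd, cfg, sizeof(cfg), 0) != (ssize_t)sizeof(cfg))
        return 1;

    for (uint8_t pos = cfg[PCI_CAPABILITY_LIST] & ~3; pos;
         pos = cfg[pos + 1] & ~3) {
        if (cfg[pos] == PCI_CAP_ID_PM) {
            uint16_t pmcsr = cfg[pos + PCI_PM_CTRL] |
                             (uint16_t)cfg[pos + PCI_PM_CTRL + 1] << 8;
            printf("no_soft_reset=%d\n",
                   !!(pmcsr & PCI_PM_CTRL_NO_SOFT_RESET));
            return 0;
        }
    }
    return 1;   /* no PM capability found */
}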

> 
>>> Hence the 2nd reset will also not occur.
>>> Therefore, device state will be restored.
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  7:48                                         ` Chen, Jiqian
@ 2024-01-15  7:55                                           ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-15  7:55 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Monday, January 15, 2024 1:18 PM
> 
> On 2024/1/15 15:37, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Monday, January 15, 2024 1:03 PM
> >>
> >> On 2024/1/12 17:44, Parav Pandit wrote:
> >>>
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Friday, January 12, 2024 2:55 PM
> >>>>
> >>>> On 2024/1/12 16:47, Parav Pandit wrote:
> >>>>>
> >>>>>
> >>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>> Sent: Friday, January 12, 2024 1:55 PM
> >>>>>>
> >>>>>>
> >>>>>> On 2024/1/12 16:02, Parav Pandit wrote:
> >>>>>>> Hi Jiqian,
> >>>>>>>
> >>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>> Sent: Friday, January 12, 2024 1:11 PM
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>> Sorry to reply late.
> >>>>>>>> I don't know if you still remember this problem, let me briefly
> >>>>>>>> describe
> >>>> it.
> >>>>>>>> I am working to implement virtgpu S3 function on Xen.
> >>>>>>>> Currently on Xen, if we start a guest through qemu with
> >>>>>>>> enabling virtgpu, and then suspend and resume guest. We can
> >>>>>>>> find that the guest kernel comes back, but the display doesn't.
> >>>>>>>> It just shows a black
> >>>>>> screen.
> >>>>>>>> That is because during suspending, guest called into qemu and
> >>>>>>>> qemu destroyed all display resources and reset renderer. This
> >>>>>>>> made the display gone after guest resumed.
> >>>>>>>> So, I add a new command for virtio-gpu that when guest is
> >>>>>>>> suspending, it will notify qemu and set
> >>>>>>>> parameter(preserver_resource) to 1 and then qemu will preserve
> >>>>>>>> resources, and when resuming, guest will notify qemu to set
> >>>>>>>> parameter to 0, and then qemu will keep the normal actions.
> >>>>>>>> That can
> >>>> help guest's display come back.
> >>>>>>>> When I upstream above implementation, Parav and MST suggest me
> >> to
> >>>>>> use
> >>>>>>>> the PM capability to fix this problem instead of adding a new
> >>>>>>>> command or state bit.
> >>>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't be
> >>>>>>>> used to solve this problem.
> >>>>>>>> The reason is:
> >>>>>>>> during guest suspending, it will write D3 state through PM cap,
> >>>>>>>> then I can save the resources of virtio-gpu on Qemu side(set
> >>>>>>>> preserver_resource to 1), but during the process of resuming,
> >>>>>>>> the state of PM cap will be cleared by qemu
> >>>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
> >>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that
> >>>>>>>> when guest reads state from PM cap, it will find the virtio-gpu
> >>>>>>>> has already been D0 state,
> >>>>>>> This behavior needs to be fixed. As the spec listed out, " This
> >>>>>>> 2-bit field is
> >>>>>> used both to determine the current power state of a Function"
> >>>>>> Do you mean it is wrong to reset PM cap when virtio_gpu reset in
> >>>>>> current qemu code? Why?
> >>>>> Because PM is implementing the support from D3->d0 transition and
> >>>>> if it
> >>>> device in D3, the device needs to respond that it is in D3 to match
> >>>> the PCI spec.
> >>>>>
> >>>>>> Shouldn't all device states, including PM registers, be reset
> >>>>>> during the process of virtio-gpu reset?
> >>>>> If No_Soft_Reset== 0, no device context to be preserved.
> >>>>> If No_Soft_Reset == 1, device to preserve minimal registers and
> >>>>> other
> >>>> things listed in pci spec.
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> So device needs to return D3 in the PowerState field.  This is must.
> >>>>>> But current behavior is a kind of soft reset, I
> >>>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu
> is
> >>>>>> reasonable.
> >>>>> What you described is, you need to do No_Soft_Reset=1, so please
> >>>>> set the
> >>>> cap accordingly to achieve the restore.
> >>>>>
> >>>>> Also in your case the QEMU knows that it will have to
> >>>>> resume the
> >>>> device soon.
> >>>>> Hence, it can well prepare the gpu context before resuming the vcpus.
> >>>>> In such case, the extra register wouldn’t be necessary.
> >>>>> Having PM control register is anyway needed to drive the device
> properly.
> >>>> Even as you said, the reset behavior of PM in the current QEMU code
> >>>> is incorrect, and even if it is modified, it also cannot fix this problem with
> S3.
> >>> You meant D3?
> >>>
> >>>> Because there is twice device reset during resuming.
> >>>> First is when we trigger a resume command to qemu, qemu will reset
> >>>> device(qemu_system_wakeup-> qemu_devices_reset->
> >>>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
> >>> A device implementation that supports PM should not reset the device.
> >> Please fix it.
> >> Do you mean the devices_reset behavior of qemu shouldn't happen when
> >> a device has PM cap?
> > Right. Device reset should not happen on PM capable device that offers
> no_soft_reset==1.
> But current qemu code, no_soft_reset==0.
> 
So please fix qemu when you add the functionality of restoring the device context on PM.

> >
> >> Or just the state of PM cap shouldn't be reset?
> >>
> >>>
> >>>> After the first time, then second time happens in guest kernel,
> >>>> virtio_device_restore-> virtio_reset_device. But the PM state
> >>>> changes
> >>>> (D3 to
> >>>> D0) happens between those two, so the display resources will be
> >>>> still destroyed by the second time of device reset.
> >>> The driver that understands that device supports PM, will skip the
> >>> reset
> >> steps.
> >> But it happens for now.
> > This should be enhanced in the guest driver.
> >
> >> In current qemu codes, the bit PCI_PM_CTRL_NO_SOFT_RESET of
> >> virtio-pci devices isn't set, is it a bug? It means the no_soft_reset==0.
> >>
> > It should be set when/if the device can store and restore the device context
> on d3->d0 transition.
> I don't know why it doesn't be set in current qemu code.
Maybe because QEMU is not restoring the context on PM commands.

> What do you mean "the device can store and restore the device context on
> d3->d0 transition"? The real physical device or the simulated devices on qemu
> side?
It does not matter; whichever device implementation is doing the PM should set it.
In your case it is qemu.

> How do I confirm it?
If you mean "I" means guest driver, it will just work as long as it refers to the PM capabilities.

> 
> >
> >>> Hence the 2nd reset will also not occur.
> >>> Therefore, device state will be restored.
> >>
> >> --
> >> Best regards,
> >> Jiqian Chen.
> 
> --
> Best regards,
> Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  7:55                                           ` Parav Pandit
@ 2024-01-15  8:20                                             ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-15  8:20 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/15 15:55, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, January 15, 2024 1:18 PM
>>
>> On 2024/1/15 15:37, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Monday, January 15, 2024 1:03 PM
>>>>
>>>> On 2024/1/12 17:44, Parav Pandit wrote:
>>>>>
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Friday, January 12, 2024 2:55 PM
>>>>>>
>>>>>> On 2024/1/12 16:47, Parav Pandit wrote:
>>>>>>>
>>>>>>>
>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>> Sent: Friday, January 12, 2024 1:55 PM
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2024/1/12 16:02, Parav Pandit wrote:
>>>>>>>>> Hi Jiqian,
>>>>>>>>>
>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>> Sent: Friday, January 12, 2024 1:11 PM
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>> Sorry to reply late.
>>>>>>>>>> I don't know if you still remember this problem, let me briefly
>>>>>>>>>> describe
>>>>>> it.
>>>>>>>>>> I am working to implement virtgpu S3 function on Xen.
>>>>>>>>>> Currently on Xen, if we start a guest through qemu with
>>>>>>>>>> enabling virtgpu, and then suspend and resume guest. We can
>>>>>>>>>> find that the guest kernel comes back, but the display doesn't.
>>>>>>>>>> It just shows a black
>>>>>>>> screen.
>>>>>>>>>> That is because during suspending, guest called into qemu and
>>>>>>>>>> qemu destroyed all display resources and reset renderer. This
>>>>>>>>>> made the display gone after guest resumed.
>>>>>>>>>> So, I add a new command for virtio-gpu that when guest is
>>>>>>>>>> suspending, it will notify qemu and set
>>>>>>>>>> parameter(preserver_resource) to 1 and then qemu will preserve
>>>>>>>>>> resources, and when resuming, guest will notify qemu to set
>>>>>>>>>> parameter to 0, and then qemu will keep the normal actions.
>>>>>>>>>> That can
>>>>>> help guest's display come back.
>>>>>>>>>> When I upstream above implementation, Parav and MST suggest me
>>>> to
>>>>>>>> use
>>>>>>>>>> the PM capability to fix this problem instead of adding a new
>>>>>>>>>> command or state bit.
>>>>>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't be
>>>>>>>>>> used to solve this problem.
>>>>>>>>>> The reason is:
>>>>>>>>>> during guest suspending, it will write D3 state through PM cap,
>>>>>>>>>> then I can save the resources of virtio-gpu on Qemu side(set
>>>>>>>>>> preserver_resource to 1), but during the process of resuming,
>>>>>>>>>> the state of PM cap will be cleared by qemu
>>>>>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>>>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that
>>>>>>>>>> when guest reads state from PM cap, it will find the virtio-gpu
>>>>>>>>>> has already been D0 state,
>>>>>>>>> This behavior needs to be fixed. As the spec listed out, " This
>>>>>>>>> 2-bit field is
>>>>>>>> used both to determine the current power state of a Function"
>>>>>>>> Do you mean it is wrong to reset PM cap when virtio_gpu reset in
>>>>>>>> current qemu code? Why?
>>>>>>> Because PM is implementing the support from D3->d0 transition and
>>>>>>> if it
>>>>>> device in D3, the device needs to respond that it is in D3 to match
>>>>>> the PCI spec.
>>>>>>>
>>>>>>>> Shouldn't all device states, including PM registers, be reset
>>>>>>>> during the process of virtio-gpu reset?
>>>>>>> If No_Soft_Reset== 0, no device context to be preserved.
>>>>>>> If No_Soft_Reset == 1, device to preserve minimal registers and
>>>>>>> other
>>>>>> things listed in pci spec.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> So device needs to return D3 in the PowerState field.  This is must.
>>>>>>>> But current behavior is a kind of soft reset, I
>>>>>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by qemu
>> is
>>>>>>>> reasonable.
>>>>>>> What you described is, you need to do No_Soft_Reset=1, so please
>>>>>>> set the
>>>>>> cap accordingly to achieve the restore.
>>>>>>>
>>>>>>> Also in your case the QEMU knows that it will have to
>>>>>>> resume the
>>>>>> device soon.
>>>>>>> Hence, it can well prepare the gpu context before resuming the vcpus.
>>>>>>> In such case, the extra register wouldn’t be necessary.
>>>>>>> Having PM control register is anyway needed to drive the device
>> properly.
>>>>>> Even as you said, the reset behavior of PM in the current QEMU code
>>>>>> is incorrect, and even if it is modified, it also cannot fix this problem with
>> S3.
>>>>> You meant D3?
>>>>>
>>>>>> Because there is twice device reset during resuming.
>>>>>> First is when we trigger a resume command to qemu, qemu will reset
>>>>>> device(qemu_system_wakeup-> qemu_devices_reset->
>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
>>>>> A device implementation that supports PM should not reset the device.
>>>> Please fix it.
>>>> Do you mean the devices_reset behavior of qemu shouldn't happen when
>>>> a device has PM cap?
>>> Right. Device reset should not happen on PM capable device that offers
>> no_soft_reset==1.
>> But current qemu code, no_soft_reset==0.
>>
> So please fix the qemu when you are adding the functionality of restoring the device context on PM.
My patch didn't add the functionality of restoring the device context on PM.
Must no_soft_reset be 1 whenever a device has the PM cap? Situations where it is not 1 should also be allowed, right?
There can be devices that have the PM cap but no_soft_reset==0, because those devices can't store and restore the device context on the D3->D0 transition, like the current qemu code.
So the fix should work for all situations, whether no_soft_reset is 1 or not.
If we use the PM cap, not every situation can solve the current problem (the display resources of the GPU will be destroyed).
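
In other words, a guest-side decision covering both cases could be
sketched as below; every helper here is hypothetical, and
virtio_gpu_cmd_preserve_resource() stands in for the new command this
series proposes:

/* Sketch: pick a preservation mechanism per device, so suspend works
 * whether or not the device offers no_soft_reset==1. */
static int virtio_gpu_freeze_sketch(struct virtio_gpu_device *vgdev)
{
	/* no_soft_reset==1: PM alone keeps the display resources alive. */
	if (virtio_dev_context_preserved(vgdev->vdev))
		return 0;

	/* no_soft_reset==0 (like current qemu): fall back to the
	 * explicit preserve_resources request. */
	return virtio_gpu_cmd_preserve_resource(vgdev, true);
}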

> 
>>>
>>>> Or just the state of PM cap shouldn't be reset?
>>>>
>>>>>
>>>>>> After the first time, then second time happens in guest kernel,
>>>>>> virtio_device_restore-> virtio_reset_device. But the PM state
>>>>>> changes
>>>>>> (D3 to
>>>>>> D0) happens between those two, so the display resources will be
>>>>>> still destroyed by the second time of device reset.
>>>>> The driver that understands that device supports PM, will skip the
>>>>> reset
>>>> steps.
>>>> But it happens for now.
>>> This should be enhanced in the guest driver.
>>>
>>>> In current qemu codes, the bit PCI_PM_CTRL_NO_SOFT_RESET of
>>>> virtio-pci devices isn't set, is it a bug? It means the no_soft_reset==0.
>>>>
>>> It should be set when/if the device can store and restore the device context
>> on d3->d0 transition.
>> I don't know why it doesn't be set in current qemu code.
> May be because QEMU is not restoring the context on PM commands.
> 
>> What do you mean "the device can store and restore the device context on
>> d3->d0 transition"? The real physical device or the simulated devices on qemu
>> side?
> Does not matter, whichever implementation of device is doing the PM should set it.
> In your case it is qemu.
> 
>> How do I confirm it?
> If you mean "I" means guest driver, it will just work as long as it refers to the PM capabilities.
> 
>>
>>>
>>>>> Hence the 2nd reset will also not occur.
>>>>> Therefore, device state will be restored.
>>>>
>>>> --
>>>> Best regards,
>>>> Jiqian Chen.
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  8:20                                             ` Chen, Jiqian
@ 2024-01-15  8:52                                               ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-15  8:52 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Monday, January 15, 2024 1:51 PM
> 
> On 2024/1/15 15:55, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Monday, January 15, 2024 1:18 PM
> >>
> >> On 2024/1/15 15:37, Parav Pandit wrote:
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Monday, January 15, 2024 1:03 PM
> >>>>
> >>>> On 2024/1/12 17:44, Parav Pandit wrote:
> >>>>>
> >>>>>
> >>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>> Sent: Friday, January 12, 2024 2:55 PM
> >>>>>>
> >>>>>> On 2024/1/12 16:47, Parav Pandit wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>> Sent: Friday, January 12, 2024 1:55 PM
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2024/1/12 16:02, Parav Pandit wrote:
> >>>>>>>>> Hi Jiqian,
> >>>>>>>>>
> >>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>>>> Sent: Friday, January 12, 2024 1:11 PM
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>> Sorry to reply late.
> >>>>>>>>>> I don't know if you still remember this problem, let me
> >>>>>>>>>> briefly descript
> >>>>>> it.
> >>>>>>>>>> I am working to implement virtgpu S3 function on Xen.
> >>>>>>>>>> Currently on Xen, if we start a guest through qemu with
> >>>>>>>>>> enabling virtgpu, and then suspend and resume guest. We can
> >>>>>>>>>> find that the guest kernel comes back, but the display doesn't.
> >>>>>>>>>> It just shown a black
> >>>>>>>> screen.
> >>>>>>>>>> That is because during suspending, guest called into qemu and
> >>>>>>>>>> qemu destroyed all display resources and reset renderer. This
> >>>>>>>>>> made the display gone after guest resumed.
> >>>>>>>>>> So, I add a new command for virtio-gpu that when guest is
> >>>>>>>>>> suspending, it will notify qemu and set
> >>>>>>>>>> parameter(preserver_resource) to 1 and then qemu will
> >>>>>>>>>> preserve resources, and when resuming, guest will notify qemu
> >>>>>>>>>> to set parameter to 0, and then qemu will keep the normal
> actions.
> >>>>>>>>>> That can
> >>>>>> help guest's display come back.
> >>>>>>>>>> When I upstream above implementation, Parav and MST suggest
> >>>>>>>>>> me
> >>>> to
> >>>>>>>> use
> >>>>>>>>>> the PM capability to fix this problem instead of adding a new
> >>>>>>>>>> command or state bit.
> >>>>>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't
> >>>>>>>>>> be used to solve this problem.
> >>>>>>>>>> The reason is:
> >>>>>>>>>> during guest suspending, it will write D3 state through PM
> >>>>>>>>>> cap, then I can save the resources of virtio-gpu on Qemu
> >>>>>>>>>> side(set preserver_resource to 1), but during the process of
> >>>>>>>>>> resuming, the state of PM cap will be cleared by qemu
> >>>>>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
> >>>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that
> >>>>>>>>>> when guest reads state from PM cap, it will find the
> >>>>>>>>>> virtio-gpu has already been D0 state,
> >>>>>>>>> This behavior needs to be fixed. As the spec listed out, "
> >>>>>>>>> This 2-bit field is
> >>>>>>>> used both to determine the current power state of a Function"
> >>>>>>>> Do you mean it is wrong to reset PM cap when vritio_gpu reset
> >>>>>>>> in current qemu code? Why?
> >>>>>>> Because PM is implementing the support from D3->d0 transition
> >>>>>>> and if it
> >>>>>> device in D3, the device needs to respond that it is in D3 to
> >>>>>> match the PCI spec.
> >>>>>>>
> >>>>>>>> Shouldn't all device states, including PM registers, be reset
> >>>>>>>> during the process of virtio-gpu reset?
> >>>>>>> If No_Soft_Reset== 0, no device context to be preserved.
> >>>>>>> If No_Soft_Reset == 1, device to preserve minimal registers and
> >>>>>>> other
> >>>>>> things listed in pci spec.
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> So device needs to return D3 in the PowerState field.  This is must.
> >>>>>>>> But current behavior is a kind of soft reset, I
> >>>>>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by
> qemu
> >> is
> >>>>>>>> reasonable.
> >>>>>>> What you described is, you need to do No_Soft_Reset=1, so please
> >>>>>>> set the
> >>>>>> cap accordingly to achieve the restore.
> >>>>>>>
> >>>>>>> Also in your case the if the QEMU knows that it will have to
> >>>>>>> resume the
> >>>>>> device soon.
> >>>>>>> Hence, it can well prepare the gpu context before resuming the vcpus.
> >>>>>>> In such case, the extra register wouldn’t be necessary.
> >>>>>>> Having PM control register is anyway needed to drive the device
> >> properly.
> >>>>>> Even as you said, the reset behavior of PM in the current QEMU
> >>>>>> code is incorrect, and even if it is modified, it also cannot fix
> >>>>>> this problem with
> >> S3.
> >>>>> You meant D3?
> >>>>>
> >>>>>> Because there is twice device reset during resuming.
> >>>>>> First is when we trigger a resume command to qemu, qemu will
> >>>>>> reset device(qemu_system_wakeup-> qemu_devices_reset->
> >>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
> >>>>> A device implementation that supports PM should not reset the device.
> >>>> Please fix it.
> >>>> Do you mean the devices_reset behavior of qemu shouldn't happen
> >>>> when a device has PM cap?
> >>> Right. Device reset should not happen on PM capable device that
> >>> offers
> >> no_soft_reset==1.
> >> But current qemu code, no_soft_reset==0.
> >>
> > So please fix the qemu when you are adding the functionality of restoring
> the device context on PM.
> My patch didn't add the functionality of restoring the device context on PM.
> Is no_soft_reset must set 1 when a device has PM cap? 

If the device wants to expose the ability to save and restore context, then yes, no_soft_reset = 1, as defined by the PCI spec.
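
For example, something along these lines on the device side (a sketch only, not an actual QEMU patch; "pos" is assumed to hold the PM capability offset set up at realize time):

    /* Advertise No_Soft_Reset in the emulated PM control/status register
     * so the guest can rely on context surviving D3hot->D0. */
    pci_set_word(pci_dev->config + pos + PCI_PM_CTRL,
                 pci_get_word(pci_dev->config + pos + PCI_PM_CTRL) |
                 PCI_PM_CTRL_NO_SOFT_RESET);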

> It should also be
> allowed for situations that are not equal to 1, right?
I don't see the need for it. It would not be aligned with the already-defined no_soft_reset bit.

> There should be devices that has PM cap but no_soft_reset is 0 because that
> devices can't store and restore the device context on d3->d0 transition, like
> current qemu code.
Right, a PM cap with no_soft_reset=0 will not restore the context on D3->D0, as is the case today.

> So the fix should be feasible for all situations that whether the no_soft_reset is
> 1 or not.
I don't follow "the fix should be feasible".
QEMU resetting the device on the D3->D0 transition when no_soft_reset = 0 seems fine, because it follows the defined PCI spec.

> If use PM cap, not every situation can solve current problem (display
> resources of GPU will be destroyed).

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  8:52                                               ` Parav Pandit
@ 2024-01-15  9:09                                                 ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-15  9:09 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/15 16:52, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, January 15, 2024 1:51 PM
>>
>> On 2024/1/15 15:55, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Monday, January 15, 2024 1:18 PM
>>>>
>>>> On 2024/1/15 15:37, Parav Pandit wrote:
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Monday, January 15, 2024 1:03 PM
>>>>>>
>>>>>> On 2024/1/12 17:44, Parav Pandit wrote:
>>>>>>>
>>>>>>>
>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>> Sent: Friday, January 12, 2024 2:55 PM
>>>>>>>>
>>>>>>>> On 2024/1/12 16:47, Parav Pandit wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>> Sent: Friday, January 12, 2024 1:55 PM
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2024/1/12 16:02, Parav Pandit wrote:
>>>>>>>>>>> Hi Jiqian,
>>>>>>>>>>>
>>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>>> Sent: Friday, January 12, 2024 1:11 PM
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>> Sorry to reply late.
>>>>>>>>>>>> I don't know if you still remember this problem, let me
>>>>>>>>>>>> briefly descript
>>>>>>>> it.
>>>>>>>>>>>> I am working to implement virtgpu S3 function on Xen.
>>>>>>>>>>>> Currently on Xen, if we start a guest through qemu with
>>>>>>>>>>>> enabling virtgpu, and then suspend and resume guest. We can
>>>>>>>>>>>> find that the guest kernel comes back, but the display doesn't.
>>>>>>>>>>>> It just shown a black
>>>>>>>>>> screen.
>>>>>>>>>>>> That is because during suspending, guest called into qemu and
>>>>>>>>>>>> qemu destroyed all display resources and reset renderer. This
>>>>>>>>>>>> made the display gone after guest resumed.
>>>>>>>>>>>> So, I add a new command for virtio-gpu that when guest is
>>>>>>>>>>>> suspending, it will notify qemu and set
>>>>>>>>>>>> parameter(preserver_resource) to 1 and then qemu will
>>>>>>>>>>>> preserve resources, and when resuming, guest will notify qemu
>>>>>>>>>>>> to set parameter to 0, and then qemu will keep the normal
>> actions.
>>>>>>>>>>>> That can
>>>>>>>> help guest's display come back.
>>>>>>>>>>>> When I upstream above implementation, Parav and MST suggest
>>>>>>>>>>>> me
>>>>>> to
>>>>>>>>>> use
>>>>>>>>>>>> the PM capability to fix this problem instead of adding a new
>>>>>>>>>>>> command or state bit.
>>>>>>>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't
>>>>>>>>>>>> be used to solve this problem.
>>>>>>>>>>>> The reason is:
>>>>>>>>>>>> during guest suspending, it will write D3 state through PM
>>>>>>>>>>>> cap, then I can save the resources of virtio-gpu on Qemu
>>>>>>>>>>>> side(set preserver_resource to 1), but during the process of
>>>>>>>>>>>> resuming, the state of PM cap will be cleared by qemu
>>>>>>>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>>>>>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes that
>>>>>>>>>>>> when guest reads state from PM cap, it will find the
>>>>>>>>>>>> virtio-gpu has already been D0 state,
>>>>>>>>>>> This behavior needs to be fixed. As the spec listed out, "
>>>>>>>>>>> This 2-bit field is
>>>>>>>>>> used both to determine the current power state of a Function"
>>>>>>>>>> Do you mean it is wrong to reset PM cap when vritio_gpu reset
>>>>>>>>>> in current qemu code? Why?
>>>>>>>>> Because PM is implementing the support from D3->d0 transition
>>>>>>>>> and if it
>>>>>>>> device in D3, the device needs to respond that it is in D3 to
>>>>>>>> match the PCI spec.
>>>>>>>>>
>>>>>>>>>> Shouldn't all device states, including PM registers, be reset
>>>>>>>>>> during the process of virtio-gpu reset?
>>>>>>>>> If No_Soft_Reset== 0, no device context to be preserved.
>>>>>>>>> If No_Soft_Reset == 1, device to preserve minimal registers and
>>>>>>>>> other
>>>>>>>> things listed in pci spec.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> So device needs to return D3 in the PowerState field.  This is must.
>>>>>>>>>> But current behavior is a kind of soft reset, I
>>>>>>>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by
>> qemu
>>>> is
>>>>>>>>>> reasonable.
>>>>>>>>> What you described is, you need to do No_Soft_Reset=1, so please
>>>>>>>>> set the
>>>>>>>> cap accordingly to achieve the restore.
>>>>>>>>>
>>>>>>>>> Also in your case the if the QEMU knows that it will have to
>>>>>>>>> resume the
>>>>>>>> device soon.
>>>>>>>>> Hence, it can well prepare the gpu context before resuming the vcpus.
>>>>>>>>> In such case, the extra register wouldn’t be necessary.
>>>>>>>>> Having PM control register is anyway needed to drive the device
>>>> properly.
>>>>>>>> Even as you said, the reset behavior of PM in the current QEMU
>>>>>>>> code is incorrect, and even if it is modified, it also cannot fix
>>>>>>>> this problem with
>>>> S3.
>>>>>>> You meant D3?
>>>>>>>
>>>>>>>> Because there is twice device reset during resuming.
>>>>>>>> First is when we trigger a resume command to qemu, qemu will
>>>>>>>> reset device(qemu_system_wakeup-> qemu_devices_reset->
>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
>>>>>>> A device implementation that supports PM should not reset the device.
>>>>>> Please fix it.
>>>>>> Do you mean the devices_reset behavior of qemu shouldn't happen
>>>>>> when a device has PM cap?
>>>>> Right. Device reset should not happen on PM capable device that
>>>>> offers
>>>> no_soft_reset==1.
>>>> But current qemu code, no_soft_reset==0.
>>>>
>>> So please fix the qemu when you are adding the functionality of restoring
>> the device context on PM.
>> My patch didn't add the functionality of restoring the device context on PM.
>> Is no_soft_reset must set 1 when a device has PM cap? 
> 
> It the device wants to expose the ability to save and restore context, then yes, no_soft_reset = 1, as defined by the PCI spec.
> 
>> It should also be
>> allowed for situations that are not equal to 1, right?
> I don’t see the need of it. It would not be aligned to no_soft_reset bit already defined bit.
> 
>> There should be devices that has PM cap but no_soft_reset is 0 because that
>> devices can't store and restore the device context on d3->d0 transition, like
>> current qemu code.
> Right, PM cap with no_soft_reset=0, will not restore the context on d3->d0 as today.
> 
>> So the fix should be feasible for all situations that whether the no_soft_reset is
>> 1 or not.
> I don’t follow the "fix should be feasible".
I mean the fix that solves the problem (virtio-gpu's display resources are destroyed during S3) should also be feasible in the situation where no_soft_reset == 0.

> QEMU reset the device on d3->d0 transition when no_soft_reset = 0 seems fine because it follows the defined pci spec.
Right.
Now that you also think the current implementation of QEMU is reasonable, the fix for this problem (display resources are destroyed during S3) should be applicable to the current QEMU code.
Using the PM cap alone cannot solve the problem encountered in the current QEMU code.
I think you may not have had a detailed look at my modifications. I will resend the latest modifications to the Linux kernel and QEMU upstream.
Reviews are welcome.
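
To make the shape of those modifications concrete, the QEMU side amounts to roughly the following sketch; the preserve_resource field is a placeholder name from this thread, not merged code:

    #include "hw/virtio/virtio-gpu.h"

    /* Sketch of the approach under discussion: skip destroying display
     * resources on reset while the guest has asked us to preserve them. */
    static void virtio_gpu_gl_reset(VirtIODevice *vdev)
    {
        VirtIOGPU *g = VIRTIO_GPU(vdev);

        if (g->preserve_resource) {
            /* Guest is in S3: keep scanouts and display resources intact. */
            return;
        }

        /* ... normal path: destroy resources and reset the renderer ... */
    }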

> 
>> If use PM cap, not every situation can solve current problem (display
>> resources of GPU will be destroyed).

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  9:09                                                 ` Chen, Jiqian
@ 2024-01-15  9:16                                                   ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-15  9:16 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Monday, January 15, 2024 2:40 PM
> 
> On 2024/1/15 16:52, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Monday, January 15, 2024 1:51 PM
> >>
> >> On 2024/1/15 15:55, Parav Pandit wrote:
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Monday, January 15, 2024 1:18 PM
> >>>>
> >>>> On 2024/1/15 15:37, Parav Pandit wrote:
> >>>>>
> >>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>> Sent: Monday, January 15, 2024 1:03 PM
> >>>>>>
> >>>>>> On 2024/1/12 17:44, Parav Pandit wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>> Sent: Friday, January 12, 2024 2:55 PM
> >>>>>>>>
> >>>>>>>> On 2024/1/12 16:47, Parav Pandit wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>>>> Sent: Friday, January 12, 2024 1:55 PM
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 2024/1/12 16:02, Parav Pandit wrote:
> >>>>>>>>>>> Hi Jiqian,
> >>>>>>>>>>>
> >>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>>>>>> Sent: Friday, January 12, 2024 1:11 PM
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi all,
> >>>>>>>>>>>> Sorry to reply late.
> >>>>>>>>>>>> I don't know if you still remember this problem, let me
> >>>>>>>>>>>> briefly descript
> >>>>>>>> it.
> >>>>>>>>>>>> I am working to implement virtgpu S3 function on Xen.
> >>>>>>>>>>>> Currently on Xen, if we start a guest through qemu with
> >>>>>>>>>>>> enabling virtgpu, and then suspend and resume guest. We can
> >>>>>>>>>>>> find that the guest kernel comes back, but the display doesn't.
> >>>>>>>>>>>> It just shown a black
> >>>>>>>>>> screen.
> >>>>>>>>>>>> That is because during suspending, guest called into qemu
> >>>>>>>>>>>> and qemu destroyed all display resources and reset
> >>>>>>>>>>>> renderer. This made the display gone after guest resumed.
> >>>>>>>>>>>> So, I add a new command for virtio-gpu that when guest is
> >>>>>>>>>>>> suspending, it will notify qemu and set
> >>>>>>>>>>>> parameter(preserver_resource) to 1 and then qemu will
> >>>>>>>>>>>> preserve resources, and when resuming, guest will notify
> >>>>>>>>>>>> qemu to set parameter to 0, and then qemu will keep the
> >>>>>>>>>>>> normal
> >> actions.
> >>>>>>>>>>>> That can
> >>>>>>>> help guest's display come back.
> >>>>>>>>>>>> When I upstream above implementation, Parav and MST
> suggest
> >>>>>>>>>>>> me
> >>>>>> to
> >>>>>>>>>> use
> >>>>>>>>>>>> the PM capability to fix this problem instead of adding a
> >>>>>>>>>>>> new command or state bit.
> >>>>>>>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't
> >>>>>>>>>>>> be used to solve this problem.
> >>>>>>>>>>>> The reason is:
> >>>>>>>>>>>> during guest suspending, it will write D3 state through PM
> >>>>>>>>>>>> cap, then I can save the resources of virtio-gpu on Qemu
> >>>>>>>>>>>> side(set preserver_resource to 1), but during the process
> >>>>>>>>>>>> of resuming, the state of PM cap will be cleared by qemu
> >>>>>>>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
> >>>>>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes
> >>>>>>>>>>>> that when guest reads state from PM cap, it will find the
> >>>>>>>>>>>> virtio-gpu has already been D0 state,
> >>>>>>>>>>> This behavior needs to be fixed. As the spec listed out, "
> >>>>>>>>>>> This 2-bit field is
> >>>>>>>>>> used both to determine the current power state of a Function"
> >>>>>>>>>> Do you mean it is wrong to reset PM cap when vritio_gpu reset
> >>>>>>>>>> in current qemu code? Why?
> >>>>>>>>> Because PM is implementing the support from D3->d0 transition
> >>>>>>>>> and if it
> >>>>>>>> device in D3, the device needs to respond that it is in D3 to
> >>>>>>>> match the PCI spec.
> >>>>>>>>>
> >>>>>>>>>> Shouldn't all device states, including PM registers, be reset
> >>>>>>>>>> during the process of virtio-gpu reset?
> >>>>>>>>> If No_Soft_Reset== 0, no device context to be preserved.
> >>>>>>>>> If No_Soft_Reset == 1, device to preserve minimal registers
> >>>>>>>>> and other
> >>>>>>>> things listed in pci spec.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> So device needs to return D3 in the PowerState field.  This is
> must.
> >>>>>>>>>> But current behavior is a kind of soft reset, I
> >>>>>>>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by
> >> qemu
> >>>> is
> >>>>>>>>>> reasonable.
> >>>>>>>>> What you described is, you need to do No_Soft_Reset=1, so
> >>>>>>>>> please set the
> >>>>>>>> cap accordingly to achieve the restore.
> >>>>>>>>>
> >>>>>>>>> Also in your case the if the QEMU knows that it will have to
> >>>>>>>>> resume the
> >>>>>>>> device soon.
> >>>>>>>>> Hence, it can well prepare the gpu context before resuming the
> vcpus.
> >>>>>>>>> In such case, the extra register wouldn’t be necessary.
> >>>>>>>>> Having PM control register is anyway needed to drive the
> >>>>>>>>> device
> >>>> properly.
> >>>>>>>> Even as you said, the reset behavior of PM in the current QEMU
> >>>>>>>> code is incorrect, and even if it is modified, it also cannot
> >>>>>>>> fix this problem with
> >>>> S3.
> >>>>>>> You meant D3?
> >>>>>>>
> >>>>>>>> Because there is twice device reset during resuming.
> >>>>>>>> First is when we trigger a resume command to qemu, qemu will
> >>>>>>>> reset device(qemu_system_wakeup-> qemu_devices_reset->
> >>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
> >>>>>>> A device implementation that supports PM should not reset the
> device.
> >>>>>> Please fix it.
> >>>>>> Do you mean the devices_reset behavior of qemu shouldn't happen
> >>>>>> when a device has PM cap?
> >>>>> Right. Device reset should not happen on PM capable device that
> >>>>> offers
> >>>> no_soft_reset==1.
> >>>> But current qemu code, no_soft_reset==0.
> >>>>
> >>> So please fix the qemu when you are adding the functionality of
> >>> restoring
> >> the device context on PM.
> >> My patch didn't add the functionality of restoring the device context on
> PM.
> >> Is no_soft_reset must set 1 when a device has PM cap?
> >
> > It the device wants to expose the ability to save and restore context, then
> yes, no_soft_reset = 1, as defined by the PCI spec.
> >
> >> It should also be
> >> allowed for situations that are not equal to 1, right?
> > I don’t see the need of it. It would not be aligned to no_soft_reset bit already
> defined bit.
> >
> >> There should be devices that has PM cap but no_soft_reset is 0
> >> because that devices can't store and restore the device context on
> >> d3->d0 transition, like current qemu code.
> > Right, PM cap with no_soft_reset=0, will not restore the context on d3->d0
> as today.
> >
> >> So the fix should be feasible for all situations that whether the
> >> no_soft_reset is
> >> 1 or not.
> > I don’t follow the "fix should be feasible".
> I mean the fix that to solve the problem(virtio-gpu's display resources are
> destroyed during S3) should also be feasible to the situation that
> no_soft_reset == 0.
> 
> > QEMU reset the device on d3->d0 transition when no_soft_reset = 0 seems
> fine because it follows the defined pci spec.
> Right.
> Now that you also think the current implementation of QEMU is reasonable,
> the fix to this problem(display resources are destroyed during S3) should be
> applicable to the current QEMU code.
Yes.

To conclude our discussion,

Display resources should only be restored when the following conditions are met:
1. The PCI PM cap is reported.
2. The PCI PM cap has no_soft_reset=1.
3. The virtio guest driver does not perform a reset when the transport offers a restore capability via #1 and #2 (a driver-side sketch follows below).

Do you agree? Yes?
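
A minimal sketch of what #3 could look like on the Linux guest side (oversimplified: virtio_pci_no_soft_reset() is the hypothetical helper sketched earlier in the thread, struct virtio_pci_device is the driver's private state, and returning early glosses over re-enabling interrupts and queues):

    static int virtio_pci_restore(struct device *dev)
    {
            struct pci_dev *pci_dev = to_pci_dev(dev);
            struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev);

            /* Conditions #1 and #2: PM cap present with No_Soft_Reset=1,
             * so the device context survived D3hot->D0. */
            if (virtio_pci_no_soft_reset(pci_dev))
                    return 0;       /* condition #3: skip the reset path */

            /* Otherwise fall back to the full reset-and-reinit path. */
            return virtio_device_restore(&vp_dev->vdev);
    }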

> If using PM cap, it cannot solve the problem encountered in current QEMU
> code.
> I think you may not have had a detailed understanding of my modifications. I
> will resend the latest modifications on the Linux kernel and QEMU to
> upstream.
> Welcome to review.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  9:16                                                   ` Parav Pandit
@ 2024-01-15  9:40                                                     ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-15  9:40 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/15 17:16, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, January 15, 2024 2:40 PM
>>
>> On 2024/1/15 16:52, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Monday, January 15, 2024 1:51 PM
>>>>
>>>> On 2024/1/15 15:55, Parav Pandit wrote:
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Monday, January 15, 2024 1:18 PM
>>>>>>
>>>>>> On 2024/1/15 15:37, Parav Pandit wrote:
>>>>>>>
>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>> Sent: Monday, January 15, 2024 1:03 PM
>>>>>>>>
>>>>>>>> On 2024/1/12 17:44, Parav Pandit wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>> Sent: Friday, January 12, 2024 2:55 PM
>>>>>>>>>>
>>>>>>>>>> On 2024/1/12 16:47, Parav Pandit wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>>> Sent: Friday, January 12, 2024 1:55 PM
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2024/1/12 16:02, Parav Pandit wrote:
>>>>>>>>>>>>> Hi Jiqian,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>>>>>>>> Sent: Friday, January 12, 2024 1:11 PM
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>> Sorry to reply late.
>>>>>>>>>>>>>> I don't know if you still remember this problem, let me
>>>>>>>>>>>>>> briefly descript
>>>>>>>>>> it.
>>>>>>>>>>>>>> I am working to implement virtgpu S3 function on Xen.
>>>>>>>>>>>>>> Currently on Xen, if we start a guest through qemu with
>>>>>>>>>>>>>> enabling virtgpu, and then suspend and resume guest. We can
>>>>>>>>>>>>>> find that the guest kernel comes back, but the display doesn't.
>>>>>>>>>>>>>> It just shown a black
>>>>>>>>>>>> screen.
>>>>>>>>>>>>>> That is because during suspending, guest called into qemu
>>>>>>>>>>>>>> and qemu destroyed all display resources and reset
>>>>>>>>>>>>>> renderer. This made the display gone after guest resumed.
>>>>>>>>>>>>>> So, I add a new command for virtio-gpu that when guest is
>>>>>>>>>>>>>> suspending, it will notify qemu and set
>>>>>>>>>>>>>> parameter(preserver_resource) to 1 and then qemu will
>>>>>>>>>>>>>> preserve resources, and when resuming, guest will notify
>>>>>>>>>>>>>> qemu to set parameter to 0, and then qemu will keep the
>>>>>>>>>>>>>> normal
>>>> actions.
>>>>>>>>>>>>>> That can
>>>>>>>>>> help guest's display come back.
>>>>>>>>>>>>>> When I upstream above implementation, Parav and MST
>> suggest
>>>>>>>>>>>>>> me
>>>>>>>> to
>>>>>>>>>>>> use
>>>>>>>>>>>>>> the PM capability to fix this problem instead of adding a
>>>>>>>>>>>>>> new command or state bit.
>>>>>>>>>>>>>> Now, I have tried the PM capability of virtio-gpu, it can't
>>>>>>>>>>>>>> be used to solve this problem.
>>>>>>>>>>>>>> The reason is:
>>>>>>>>>>>>>> during guest suspending, it will write D3 state through PM
>>>>>>>>>>>>>> cap, then I can save the resources of virtio-gpu on Qemu
>>>>>>>>>>>>>> side(set preserver_resource to 1), but during the process
>>>>>>>>>>>>>> of resuming, the state of PM cap will be cleared by qemu
>>>>>>>>>>>>>> resetting(qemu_system_wakeup-> qemu_devices_reset->
>>>>>>>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset), it causes
>>>>>>>>>>>>>> that when guest reads state from PM cap, it will find the
>>>>>>>>>>>>>> virtio-gpu has already been D0 state,
>>>>>>>>>>>>> This behavior needs to be fixed. As the spec listed out, "
>>>>>>>>>>>>> This 2-bit field is
>>>>>>>>>>>> used both to determine the current power state of a Function"
>>>>>>>>>>>> Do you mean it is wrong to reset PM cap when vritio_gpu reset
>>>>>>>>>>>> in current qemu code? Why?
>>>>>>>>>>> Because PM is implementing the support from D3->d0 transition
>>>>>>>>>>> and if it
>>>>>>>>>> device in D3, the device needs to respond that it is in D3 to
>>>>>>>>>> match the PCI spec.
>>>>>>>>>>>
>>>>>>>>>>>> Shouldn't all device states, including PM registers, be reset
>>>>>>>>>>>> during the process of virtio-gpu reset?
>>>>>>>>>>> If No_Soft_Reset== 0, no device context to be preserved.
>>>>>>>>>>> If No_Soft_Reset == 1, device to preserve minimal registers
>>>>>>>>>>> and other
>>>>>>>>>> things listed in pci spec.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So device needs to return D3 in the PowerState field.  This is
>> must.
>>>>>>>>>>>> But current behavior is a kind of soft reset, I
>>>>>>>>>>>> think(!PCI_PM_CTRL_NO_SOFT_RESET). The pm cap is reset by
>>>> qemu
>>>>>> is
>>>>>>>>>>>> reasonable.
>>>>>>>>>>> What you described is, you need to do No_Soft_Reset=1, so
>>>>>>>>>>> please set the
>>>>>>>>>> cap accordingly to achieve the restore.
>>>>>>>>>>>
>>>>>>>>>>> Also in your case the if the QEMU knows that it will have to
>>>>>>>>>>> resume the
>>>>>>>>>> device soon.
>>>>>>>>>>> Hence, it can well prepare the gpu context before resuming the
>> vcpus.
>>>>>>>>>>> In such case, the extra register wouldn’t be necessary.
>>>>>>>>>>> Having PM control register is anyway needed to drive the
>>>>>>>>>>> device
>>>>>> properly.
>>>>>>>>>> Even as you said, the reset behavior of PM in the current QEMU
>>>>>>>>>> code is incorrect, and even if it is modified, it also cannot
>>>>>>>>>> fix this problem with
>>>>>> S3.
>>>>>>>>> You meant D3?
>>>>>>>>>
>>>>>>>>>> Because there is twice device reset during resuming.
>>>>>>>>>> First is when we trigger a resume command to qemu, qemu will
>>>>>>>>>> reset device(qemu_system_wakeup-> qemu_devices_reset->
>>>>>>>>>> virtio_vga_base_reset-> virtio_gpu_gl_reset).
>>>>>>>>> A device implementation that supports PM should not reset the
>> device.
>>>>>>>> Please fix it.
>>>>>>>> Do you mean the devices_reset behavior of qemu shouldn't happen
>>>>>>>> when a device has PM cap?
>>>>>>> Right. Device reset should not happen on PM capable device that
>>>>>>> offers
>>>>>> no_soft_reset==1.
>>>>>> But current qemu code, no_soft_reset==0.
>>>>>>
>>>>> So please fix the qemu when you are adding the functionality of
>>>>> restoring
>>>> the device context on PM.
>>>> My patch didn't add the functionality of restoring the device context on
>> PM.
>>>> Is no_soft_reset must set 1 when a device has PM cap?
>>>
>>> It the device wants to expose the ability to save and restore context, then
>> yes, no_soft_reset = 1, as defined by the PCI spec.
>>>
>>>> It should also be
>>>> allowed for situations that are not equal to 1, right?
>>> I don’t see the need of it. It would not be aligned to no_soft_reset bit already
>> defined bit.
>>>
>>>> There should be devices that has PM cap but no_soft_reset is 0
>>>> because that devices can't store and restore the device context on
>>>> d3->d0 transition, like current qemu code.
>>> Right, PM cap with no_soft_reset=0, will not restore the context on d3->d0
>> as today.
>>>
>>>> So the fix should be feasible for all situations that whether the
>>>> no_soft_reset is
>>>> 1 or not.
>>> I don’t follow the "fix should be feasible".
>> I mean the fix that to solve the problem(virtio-gpu's display resources are
>> destroyed during S3) should also be feasible to the situation that
>> no_soft_reset == 0.
>>
>>> QEMU reset the device on d3->d0 transition when no_soft_reset = 0 seems
>> fine because it follows the defined pci spec.
>> Right.
>> Now that you also think the current implementation of QEMU is reasonable,
>> the fix to this problem(display resources are destroyed during S3) should be
>> applicable to the current QEMU code.
> Yes.
> 
> To conclude our discussion,
> 
> Display resources should be only restored when following conditions are met.
> 1. PCI PM cap is reported.
> 2. PCI PM cap has non_soft_reset=1
> 3. virtio guest driver do not perform reset when transport offers a restore capability using #1 and #2.
> 
> Do you agree? Yes?
Yes, I think this problem (display resources are destroyed during S3) can be sorted into two situations:
The first is what you said above: in that situation the devices_reset of QEMU is unreasonable; if a device has a PM cap and no_soft_reset=1, QEMU should not reset it.
The second is without #1 or #2, where the reset behavior is fine. My patch fixes this situation by sending a new virtio-gpu command that notifies QEMU to avoid destroying display resources during S3 (a sketch of such a command follows below).
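
For the second situation, the guest-to-device notification could look roughly like the following. The command name and layout are placeholders from this discussion, not a ratified virtio ABI:

    /* Hypothetical control-queue command layout; names are placeholders,
     * only struct virtio_gpu_ctrl_hdr comes from <linux/virtio_gpu.h>. */
    struct virtio_gpu_preserve_resource {
            struct virtio_gpu_ctrl_hdr hdr;
            __le32 preserve;   /* 1 = entering S3, keep resources;
                                * 0 = resumed, back to normal behavior */
            __le32 padding;
    };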

> 
>> If using PM cap, it cannot solve the problem encountered in current QEMU
>> code.
>> I think you may not have had a detailed understanding of my modifications. I
>> will resend the latest modifications on the Linux kernel and QEMU to
>> upstream.
>> Welcome to review.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  9:40                                                     ` Chen, Jiqian
@ 2024-01-15  9:46                                                       ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-15  9:46 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Monday, January 15, 2024 3:10 PM

> > Display resources should be only restored when following conditions are
> met.
> > 1. PCI PM cap is reported.
> > 2. PCI PM cap has non_soft_reset=1
> > 3. virtio guest driver do not perform reset when transport offers a restore
> capability using #1 and #2.
> >
> > Do you agree? Yes?
> Yes, I think this problem (display resources are destroyed during S3) can be
> sorted to two situations:
> First is what you said above, in this situation, the devices_reset of qemu is
> unreasonable, if a device has PM cap and non_soft_reset=1, qemu should not
> do resetting.

> Second is without #1 or #2, the reset behavior is fine. My patch is to fix this
> situation, that sending a new virtio-gpu command to notify qemu to prevent
> destroying display resources during S3.

I still have a hard time following "My patch is to fix this situation...".

When #1 and #2 are not done, there is nothing to restore.
The driver should not send a new virtio-specific command when #1 and #2 are not there.
Instead, if the device wants to restore the context, all of #1, #2 and #3 should be done to implement the restore functionality.

In other words, if one wants to use device context restore on a PM state transition, it should do #1, #2 and #3
(and avoid inventing new infrastructure, because PCI PM already has the necessary pieces).

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15  9:46                                                       ` Parav Pandit
@ 2024-01-15 10:47                                                         ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-15 10:47 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/15 17:46, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, January 15, 2024 3:10 PM
> 
>>> Display resources should be only restored when following conditions are
>> met.
>>> 1. PCI PM cap is reported.
>>> 2. PCI PM cap has non_soft_reset=1
>>> 3. virtio guest driver do not perform reset when transport offers a restore
>> capability using #1 and #2.
>>>
>>> Do you agree? Yes?
>> Yes, I think this problem (display resources are destroyed during S3) can be
>> sorted to two situations:
>> First is what you said above, in this situation, the devices_reset of qemu is
>> unreasonable, if a device has PM cap and non_soft_reset=1, qemu should not
>> do resetting.
> 
>> Second is without #1 or #2, the reset behavior is fine. My patch is to fix this
>> situation, that sending a new virtio-gpu command to notify qemu to prevent
>> destroying display resources during S3.
> 
> I still have hard time following "My patch is to fix this situation...".
> 
> When #1 and #2 is not done, there is nothing to restore. 
> Driver should not send some new virtio specific command when #1 and #2 is not there.
> Instead, if the device wants to restore the context, all #1, #2 and #3 should be done to implement the restore functionality.
When #1 and #2 are done, the "device reset behavior" should not happen, and then the display resources are not destroyed.
I didn't say to send a command to restore the context.
I mean when #1 and #2 are not done: the driver and qemu both reset the device when resuming, and at that time we need a method to prevent the display resources from being destroyed.
I don't mean anything else; I just feel you may not have understood, from the beginning, the issue I encountered during the S3 process and the implementation logic of my patch.
In the current kernel and qemu code, when we do S3 (suspend and resume) for the guest, during the resume qemu and the guest driver both reset the virtio-gpu (because #2 is not done). Once the reset happens, the display resources are destroyed and cannot be restored, so we must add a method to prevent destroying resources in that situation. What my patch does is add a new virtio-gpu command that notifies qemu to check whether the display resources should be destroyed during the reset.
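(Sketch of that idea on the qemu side; the preserve_resource flag below is a hypothetical field set by the new command, and the body is simplified from qemu's actual virtio_gpu_gl_reset().)

#include "hw/virtio/virtio-gpu.h"

static void virtio_gpu_gl_reset(VirtIODevice *vdev)
{
    VirtIOGPU *g = VIRTIO_GPU(vdev);

    /* Hypothetical flag, set to 1 by the new command before suspend. */
    if (g->preserve_resource) {
        return; /* guest is in S3: keep resources and renderer state */
    }

    /* Today's behavior: destroy all resources and reset the renderer. */
    virtio_gpu_reset(vdev);
}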

> 
> In other words, if one wants to use device context restore on PM state transition it should do #1, #2 and #3.
> (and avoid inventing new infrastructure because PCI PM has necessary things).


-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15 10:47                                                         ` Chen, Jiqian
@ 2024-01-15 10:52                                                           ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-15 10:52 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Monday, January 15, 2024 4:17 PM
> 
> On 2024/1/15 17:46, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Monday, January 15, 2024 3:10 PM
> >
> >>> Display resources should be only restored when following conditions
> >>> are
> >> met.
> >>> 1. PCI PM cap is reported.
> >>> 2. PCI PM cap has non_soft_reset=1
> >>> 3. virtio guest driver do not perform reset when transport offers a
> >>> restore
> >> capability using #1 and #2.
> >>>
> >>> Do you agree? Yes?
> >> Yes, I think this problem (display resources are destroyed during S3)
> >> can be sorted to two situations:
> >> First is what you said above, in this situation, the devices_reset of
> >> qemu is unreasonable, if a device has PM cap and non_soft_reset=1,
> >> qemu should not do resetting.
> >
> >> Second is without #1 or #2, the reset behavior is fine. My patch is
> >> to fix this situation, that sending a new virtio-gpu command to
> >> notify qemu to prevent destroying display resources during S3.
> >
> > I still have hard time following "My patch is to fix this situation...".
> >
> > When #1 and #2 is not done, there is nothing to restore.
> > Driver should not send some new virtio specific command when #1 and #2 is
> not there.
> > Instead, if the device wants to restore the context, all #1, #2 and #3 should
> be done to implement the restore functionality.
> When #1 and #2 is done. The "device reset behavior" should not happen. And
> then the display resources are not destroyed.
> I didn’t say send command to restore context.

> I mean when #1 and #2 is not done. Driver and qemu both reset devices when
> resuming, at that time, 
The above behavior is as per the spec guidelines. Hence it is fine.

> we need a method to prevent the display resources from destroying.
I disagree with the above, as it is not per the spec guidelines.
Just follow #1, #2 and #3 to avoid destroying the display resources.
No need for any check command etc.

> I don't mean anything else, I just feel like you may not have understood the
> issues I encountered during the S3 process and the implementation logic of
> my patch from the beginning.
> In current kernel and qemu codes, when we do S3(suspend and resume) for
> guest, during the resuming, qemu and guest driver will reset the virtio-
> gpu(because #2 is not done), but once the resetting happens, the display
> resources will be destroyed and we can't restore, so we must add a method to
> prevent destroying resources at that situation. What my patch do is to add a
> new virtio-gpu command to notify qemu to check if the display resources
> should be destroyed during resetting.
> 
> >
> > In other words, if one wants to use device context restore on PM state
> transition it should do #1, #2 and #3.
> > (and avoid inventing new infrastructure because PCI PM has necessary
> things).
> 
> 
> --
> Best regards,
> Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15 10:52                                                           ` Parav Pandit
@ 2024-01-15 11:07                                                             ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-15 11:07 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/15 18:52, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, January 15, 2024 4:17 PM
>>
>> On 2024/1/15 17:46, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Monday, January 15, 2024 3:10 PM
>>>
>>>>> Display resources should be only restored when following conditions
>>>>> are
>>>> met.
>>>>> 1. PCI PM cap is reported.
>>>>> 2. PCI PM cap has non_soft_reset=1
>>>>> 3. virtio guest driver do not perform reset when transport offers a
>>>>> restore
>>>> capability using #1 and #2.
>>>>>
>>>>> Do you agree? Yes?
>>>> Yes, I think this problem (display resources are destroyed during S3)
>>>> can be sorted to two situations:
>>>> First is what you said above, in this situation, the devices_reset of
>>>> qemu is unreasonable, if a device has PM cap and non_soft_reset=1,
>>>> qemu should not do resetting.
>>>
>>>> Second is without #1 or #2, the reset behavior is fine. My patch is
>>>> to fix this situation, that sending a new virtio-gpu command to
>>>> notify qemu to prevent destroying display resources during S3.
>>>
>>> I still have hard time following "My patch is to fix this situation...".
>>>
>>> When #1 and #2 is not done, there is nothing to restore.
>>> Driver should not send some new virtio specific command when #1 and #2 is
>> not there.
>>> Instead, if the device wants to restore the context, all #1, #2 and #3 should
>> be done to implement the restore functionality.
>> When #1 and #2 is done. The "device reset behavior" should not happen. And
>> then the display resources are not destroyed.
>> I didn’t say send command to restore context.
> 
>> I mean when #1 and #2 is not done. Driver and qemu both reset devices when
>> resuming, at that time, 
> Above behavior is as per the spec guidelines. Hence it is fine.
> 
>> we need a method to prevent the display resources from destroying.
> I disagree to above as it is not as per the spec guidelines.
> Just follow #1, #2 and #3 to not destroy the display resource destroying.
I agree that following #1, #2 and #3 will not destroy the display resources.
So the question comes back: if there is no #2 and you say that the reset behavior complies with the spec guidelines, how can I make the virtio-gpu work properly once the display resources have been destroyed?
What my patch does is not to prevent the reset, but to prevent destroying resources during the reset. Gerd Hoffmann agreed with this point when reviewing my qemu patch.
Even if I use the PM cap, I still need to prevent destroying resources during the reset.
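(A plausible wire format for such a command; the name, command code and fields below are purely illustrative, not the final spec.)

#include <linux/virtio_gpu.h>

/* Hypothetical control command: the guest sets preserve=1 before
 * suspending and preserve=0 after resuming. */
#define VIRTIO_GPU_CMD_PRESERVE_RESOURCE 0x0210 /* illustrative value */

struct virtio_gpu_preserve_resource {
        struct virtio_gpu_ctrl_hdr hdr;
        __le32 preserve; /* 1: keep resources across reset; 0: normal */
        __le32 padding;
};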

> No need for any check command etc.
> 
>> I don't mean anything else, I just feel like you may not have understood the
>> issues I encountered during the S3 process and the implementation logic of
>> my patch from the beginning.
>> In current kernel and qemu codes, when we do S3(suspend and resume) for
>> guest, during the resuming, qemu and guest driver will reset the virtio-
>> gpu(because #2 is not done), but once the resetting happens, the display
>> resources will be destroyed and we can't restore, so we must add a method to
>> prevent destroying resources at that situation. What my patch do is to add a
>> new virtio-gpu command to notify qemu to check if the display resources
>> should be destroyed during resetting.
>>
>>>
>>> In other words, if one wants to use device context restore on PM state
>> transition it should do #1, #2 and #3.
>>> (and avoid inventing new infrastructure because PCI PM has necessary
>> things).
>>
>>
>> --
>> Best regards,
>> Jiqian Chen.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15 11:07                                                             ` [virtio-dev] " Chen, Jiqian
@ 2024-01-15 11:10                                                               ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-15 11:10 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Monday, January 15, 2024 4:37 PM
> 
> On 2024/1/15 18:52, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Monday, January 15, 2024 4:17 PM
> >>
> >> On 2024/1/15 17:46, Parav Pandit wrote:
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Monday, January 15, 2024 3:10 PM
> >>>
> >>>>> Display resources should be only restored when following
> >>>>> conditions are
> >>>> met.
> >>>>> 1. PCI PM cap is reported.
> >>>>> 2. PCI PM cap has non_soft_reset=1 3. virtio guest driver do not
> >>>>> perform reset when transport offers a restore
> >>>> capability using #1 and #2.
> >>>>>
> >>>>> Do you agree? Yes?
> >>>> Yes, I think this problem (display resources are destroyed during
> >>>> S3) can be sorted to two situations:
> >>>> First is what you said above, in this situation, the devices_reset
> >>>> of qemu is unreasonable, if a device has PM cap and
> >>>> non_soft_reset=1, qemu should not do resetting.
> >>>
> >>>> Second is without #1 or #2, the reset behavior is fine. My patch is
> >>>> to fix this situation, that sending a new virtio-gpu command to
> >>>> notify qemu to prevent destroying display resources during S3.
> >>>
> >>> I still have hard time following "My patch is to fix this situation...".
> >>>
> >>> When #1 and #2 is not done, there is nothing to restore.
> >>> Driver should not send some new virtio specific command when #1 and
> >>> #2 is
> >> not there.
> >>> Instead, if the device wants to restore the context, all #1, #2 and
> >>> #3 should
> >> be done to implement the restore functionality.
> >> When #1 and #2 is done. The "device reset behavior" should not
> >> happen. And then the display resources are not destroyed.
> >> I didn’t say send command to restore context.
> >
> >> I mean when #1 and #2 is not done. Driver and qemu both reset devices
> >> when resuming, at that time,
> > Above behavior is as per the spec guidelines. Hence it is fine.
> >
> >> we need a method to prevent the display resources from destroying.
> > I disagree to above as it is not as per the spec guidelines.
> > Just follow #1, #2 and #3 to not destroy the display resource destroying.
> I agree that follow #1, #2 and #3 will not destroy display resource.
> So the question goes back: If there is no #2 and you say that the reset
> behavior complies with the spec guidelines, how can I make the virtio gpu
> work properly with display resources destroyed?
Implement #2. :)

> What my patch do is not to prevent resetting, is to prevent destroying
> resources during resetting. Gerd Hoffmann has agreed to this point in my
> qemu patch reviewing.
> Even if I use PM cap, I still need to prevent destroying resources during
> resetting.
Do you mean that when the virtio device is reset, you want to prevent the destruction?
If so, that is yet another virtio spec hack that must be avoided, as there should be a single reset flow.
Please do #3 to avoid resetting the device.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-15 11:10                                                               ` Parav Pandit
@ 2024-01-16  6:37                                                                 ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-16  6:37 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/15 19:10, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Monday, January 15, 2024 4:37 PM
>>
>> On 2024/1/15 18:52, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Monday, January 15, 2024 4:17 PM
>>>>
>>>> On 2024/1/15 17:46, Parav Pandit wrote:
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Monday, January 15, 2024 3:10 PM
>>>>>
>>>>>>> Display resources should be only restored when following
>>>>>>> conditions are
>>>>>> met.
>>>>>>> 1. PCI PM cap is reported.
>>>>>>> 2. PCI PM cap has non_soft_reset=1 3. virtio guest driver do not
>>>>>>> perform reset when transport offers a restore
>>>>>> capability using #1 and #2.
>>>>>>>
>>>>>>> Do you agree? Yes?
>>>>>> Yes, I think this problem (display resources are destroyed during
>>>>>> S3) can be sorted to two situations:
>>>>>> First is what you said above, in this situation, the devices_reset
>>>>>> of qemu is unreasonable, if a device has PM cap and
>>>>>> non_soft_reset=1, qemu should not do resetting.
>>>>>
>>>>>> Second is without #1 or #2, the reset behavior is fine. My patch is
>>>>>> to fix this situation, that sending a new virtio-gpu command to
>>>>>> notify qemu to prevent destroying display resources during S3.
>>>>>
>>>>> I still have hard time following "My patch is to fix this situation...".
>>>>>
>>>>> When #1 and #2 is not done, there is nothing to restore.
>>>>> Driver should not send some new virtio specific command when #1 and
>>>>> #2 is
>>>> not there.
>>>>> Instead, if the device wants to restore the context, all #1, #2 and
>>>>> #3 should
>>>> be done to implement the restore functionality.
>>>> When #1 and #2 is done. The "device reset behavior" should not
>>>> happen. And then the display resources are not destroyed.
>>>> I didn’t say send command to restore context.
>>>
>>>> I mean when #1 and #2 is not done. Driver and qemu both reset devices
>>>> when resuming, at that time,
>>> Above behavior is as per the spec guidelines. Hence it is fine.
>>>
>>>> we need a method to prevent the display resources from destroying.
>>> I disagree to above as it is not as per the spec guidelines.
>>> Just follow #1, #2 and #3 to not destroy the display resource destroying.
>> I agree that follow #1, #2 and #3 will not destroy display resource.
>> So the question goes back: If there is no #2 and you say that the reset
>> behavior complies with the spec guidelines, how can I make the virtio gpu
>> work properly with display resources destroyed?
> Implement #2. :)
We can't simply implement #2 if a device doesn't have #2; see Section 7.5.2.2 of the PCI Express spec, about No_Soft_Reset:
" If a VF implements the Power Management Capability, the VF's value of this field must be identical to the associated PF's value. "
What's more, a device without #2 is allowed, so we should consider how to handle both situations (No_Soft_Reset=1 and No_Soft_Reset=0), rather than simply implementing #2 for a device that does not have it.
Also in the PCI Express spec:
" Functional context is required to be maintained by Functions in the D3Hot state if the No_Soft_Reset field in the PMCSR is Set. In this case, System Software is not required to re-initialize the Function after a transition from D3Hot to D0 (the Function will be in the D0active state). If the No_Soft_Reset bit is Clear, functional context is not required to be maintained by the Function in the D3Hot state."
This description corresponds to the two situations we are discussing.
In the first, a device has #2 (No_Soft_Reset=1); the device reset will not happen, so the virtio-gpu display-resource problem will not happen.
But in the second, a device doesn't have #2 (No_Soft_Reset=0); it is fine for the system to reset the device, and at this point the virtio-gpu display-resource problem will happen. We should also consider a method to solve that problem, not implement #2.
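(To make the first situation concrete, a sketch of how the device model could honor No_Soft_Reset on wakeup instead of unconditionally resetting; VirtIOPCIProxy is qemu's virtio PCI state, while the pm_cap field and the hook itself are assumptions.)

static void virtio_pci_wakeup_reset(VirtIOPCIProxy *proxy)
{
    /* pm_cap: assumed to hold the PM capability offset in config space. */
    uint16_t pmcsr = pci_get_word(proxy->pci_dev.config +
                                  proxy->pm_cap + PCI_PM_CTRL);

    if (pmcsr & PCI_PM_CTRL_NO_SOFT_RESET) {
        return; /* No_Soft_Reset=1: functional context must be maintained */
    }

    /* No_Soft_Reset=0: a soft reset on the D3hot->D0 transition is allowed. */
    device_cold_reset(DEVICE(proxy));
}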

> 
>> What my patch do is not to prevent resetting, is to prevent destroying
>> resources during resetting. Gerd Hoffmann has agreed to this point in my
>> qemu patch reviewing.
>> Even if I use PM cap, I still need to prevent destroying resources during
>> resetting.
> Do you mean when virtio device is reset, you want to prevent?
I want to protect the display resources, not to prevent the whole device reset process.
Also in the PCI Express spec:
"Note that a Function's software driver participates in the process of transitioning the Function from D0 to D3Hot. It contributes to the process by saving any functional state that would otherwise be lost with removal of main power, and otherwise preparing the Function for the transition to D3Hot."
I let the guest driver notify qemu to protect the display resources that would otherwise be lost during S3, which is also compliant with the spec.

> If so, that is yet additional virtio spec hack that must be avoided as reset flow should be one.
> Please do #3 to avoid resetting the device.
The same reason as above. According to the spec, we should allow for both situations (No_Soft_Reset=1 and No_Soft_Reset=0) and provide a solution for each, rather than collapsing them into one.

-- 
Best regards,
Jiqian Chen.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-16  6:37                                                                 ` Chen, Jiqian
@ 2024-01-16  7:19                                                                   ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-16  7:19 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Tuesday, January 16, 2024 12:08 PM
> 
> On 2024/1/15 19:10, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Monday, January 15, 2024 4:37 PM
> >>
> >> On 2024/1/15 18:52, Parav Pandit wrote:
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Monday, January 15, 2024 4:17 PM
> >>>>
> >>>> On 2024/1/15 17:46, Parav Pandit wrote:
> >>>>>
> >>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>> Sent: Monday, January 15, 2024 3:10 PM
> >>>>>
> >>>>>>> Display resources should be only restored when following
> >>>>>>> conditions are
> >>>>>> met.
> >>>>>>> 1. PCI PM cap is reported.
> >>>>>>> 2. PCI PM cap has non_soft_reset=1 3. virtio guest driver do not
> >>>>>>> perform reset when transport offers a restore
> >>>>>> capability using #1 and #2.
> >>>>>>>
> >>>>>>> Do you agree? Yes?
> >>>>>> Yes, I think this problem (display resources are destroyed during
> >>>>>> S3) can be sorted to two situations:
> >>>>>> First is what you said above, in this situation, the
> >>>>>> devices_reset of qemu is unreasonable, if a device has PM cap and
> >>>>>> non_soft_reset=1, qemu should not do resetting.
> >>>>>
> >>>>>> Second is without #1 or #2, the reset behavior is fine. My patch
> >>>>>> is to fix this situation, that sending a new virtio-gpu command
> >>>>>> to notify qemu to prevent destroying display resources during S3.
> >>>>>
> >>>>> I still have hard time following "My patch is to fix this situation...".
> >>>>>
> >>>>> When #1 and #2 is not done, there is nothing to restore.
> >>>>> Driver should not send some new virtio specific command when #1
> >>>>> and
> >>>>> #2 is
> >>>> not there.
> >>>>> Instead, if the device wants to restore the context, all #1, #2
> >>>>> and
> >>>>> #3 should
> >>>> be done to implement the restore functionality.
> >>>> When #1 and #2 is done. The "device reset behavior" should not
> >>>> happen. And then the display resources are not destroyed.
> >>>> I didn’t say send command to restore context.
> >>>
> >>>> I mean when #1 and #2 is not done. Driver and qemu both reset
> >>>> devices when resuming, at that time,
> >>> Above behavior is as per the spec guidelines. Hence it is fine.
> >>>
> >>>> we need a method to prevent the display resources from destroying.
> >>> I disagree to above as it is not as per the spec guidelines.
> >>> Just follow #1, #2 and #3 to not destroy the display resource destroying.
> >> I agree that follow #1, #2 and #3 will not destroy display resource.
> >> So the question goes back: If there is no #2 and you say that the
> >> reset behavior complies with the spec guidelines, how can I make the
> >> virtio gpu work properly with display resources destroyed?
> > Implement #2. :)
> We can't simply implement #2 if a device doesn't have #2, 
How do you define "We" above?
Isn't "we" == the device?

> see Section 7.5.2.2
> in the PCI Express spec, about No_Soft_Reset:
> "If a VF implements the Power Management Capability, the VF's value of this
> field must be identical to the associated PF's value."
> What's more, a device that doesn't have #2 is allowed; we should consider how
> to handle both situations (No_Soft_Reset=1 and No_Soft_Reset=0), rather than
> simply implementing #2 for a device when it does not have #2.
A device implementation (sw or hw) that wants to offer maintaining the function context will implement No_Soft_Reset=1.

> Also in the PCI Express spec:
> "Functional context is required to be maintained by Functions in the D3Hot
> state if the No_Soft_Reset field in the PMCSR is Set. In this case, System
> Software is not required to re-initialize the Function after a transition from
> D3Hot to D0 (the Function will be in the D0active state). If the No_Soft_Reset
> bit is Clear, functional context is not required to be maintained by the Function
> in the D3Hot state."
Right. It is NOT required to maintain. Hence, let's not maintain it.
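
As a rough illustration (a sketch under assumptions, not text from any of the patches), the #1/#2 check a guest could perform looks like this in C. pci_cfg_read8()/pci_cfg_read16() are hypothetical stand-ins for whatever config-space accessors the guest OS provides; the offsets and the No_Soft_Reset bit follow the PCI PM capability layout:

#include <stdbool.h>
#include <stdint.h>

#define PCI_STATUS                0x06
#define PCI_STATUS_CAP_LIST       0x10   /* capability list is present */
#define PCI_CAPABILITY_LIST       0x34   /* pointer to first capability */
#define PCI_CAP_ID_PM             0x01   /* Power Management capability */
#define PCI_PM_CTRL               0x04   /* PMCSR offset within the PM cap */
#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* No_Soft_Reset, bit 3 of PMCSR */

/* Hypothetical config-space accessors provided by the guest OS. */
uint8_t  pci_cfg_read8(uint16_t off);
uint16_t pci_cfg_read16(uint16_t off);

/* #1: walk the capability list looking for the PM capability. */
static uint8_t find_pm_cap(void)
{
    uint8_t pos;

    if (!(pci_cfg_read16(PCI_STATUS) & PCI_STATUS_CAP_LIST))
        return 0;
    for (pos = pci_cfg_read8(PCI_CAPABILITY_LIST); pos;
         pos = pci_cfg_read8(pos + 1)) {   /* byte 1 of a cap = next ptr */
        if (pci_cfg_read8(pos) == PCI_CAP_ID_PM)
            return pos;
    }
    return 0;
}

/* #1 and #2 together: PM cap present and No_Soft_Reset=1 in PMCSR. */
static bool device_preserves_context_over_d3hot(void)
{
    uint8_t pm = find_pm_cap();

    return pm && (pci_cfg_read16(pm + PCI_PM_CTRL) & PCI_PM_CTRL_NO_SOFT_RESET);
}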

> This description corresponds to the two situations we are discussing.
> In the first, a device has #2 (No_Soft_Reset=1); the device reset will not
> happen, so the virtio-gpu display-resource problem will not happen.
Right.

> But in the second, a device doesn't have #2 (No_Soft_Reset=0), and it is fine
> for the system to reset the device. At this time, the virtio-gpu display-resource
> problem will happen, and we should also consider a method to solve that
> problem, but not by implementing #2.
No, because as written in the spec, "functional context is not required to be maintained". Hence there is no need to maintain it.

> 
> >
> >> What my patch does is not prevent resetting; it prevents
> >> destroying resources during resetting. Gerd Hoffmann agreed with
> >> this point while reviewing my qemu patch.
> >> Even if I use the PM cap, I still need to prevent destroying resources
> >> during resetting.
> > Do you mean that when the virtio device is reset, you want to prevent it?
> I want to protect the display resources, not to prevent the whole process
> of device resetting.
This is contradictory.
When you reset the device, it resets the functional context as well, as guided by PCI.

> Also in the PCI Express spec:
> "Note that a Function's software driver participates in the process of
> transitioning the Function from D0 to D3Hot. It contributes to the process by
> saving any functional state that would otherwise be lost with removal of main
> power, and otherwise preparing the Function for the transition to D3Hot."
> Letting the guest driver notify qemu to protect the display resources that
> would be lost during S3 is also compliant with the spec.
This is already done by the guest PCI driver using the PM control bits. Right?
The only thing needed is to fix the virtio layer to honor the PCI PM capabilities and skip resetting the device.
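
In other words, #3 reduces to a small change in the virtio transport's suspend/resume path. A minimal sketch, assuming hypothetical helper names (device_preserves_context_over_d3hot() is the #1/#2 check sketched earlier; the other helpers stand in for the guest's virtio core):

#include <stdbool.h>

bool device_preserves_context_over_d3hot(void);
void virtio_set_power_state_d3hot(void);
void virtio_set_power_state_d0(void);
void virtio_device_reset_and_reinit(void);

/* Suspend: rely on PCI PM, D0 -> D3hot; the functional context is kept
 * by the device when No_Soft_Reset=1. */
void virtio_pci_freeze(void)
{
    virtio_set_power_state_d3hot();
}

/* Resume: #3 means skipping the reset when #1 and #2 hold. */
void virtio_pci_restore(void)
{
    virtio_set_power_state_d0();            /* D3hot -> D0 */
    if (!device_preserves_context_over_d3hot()) {
        /* No_Soft_Reset=0: the functional context may be gone, so the
         * driver falls back to a full reset and re-initialization. */
        virtio_device_reset_and_reinit();
    }
}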

> 
> > If so, that is yet another virtio spec hack that must be avoided, as
> > there should be a single reset flow.
> > Please do #3 to avoid resetting the device.
> The same reason as above. According to the spec, we should allow for the
> occurrence of both situations (No_Soft_Reset=1 and No_Soft_Reset=0) and
> provide a solution for each, rather than simply collapsing them into one.
> 
With No_Soft_Reset=0 there is no obligation to restore the functional context.

It seems to me that you are trying to maintain the function context EVEN on device reset, with the motivation of not modifying the guest driver.
If that is the motivation, it is certainly not right to work around things that way.
I hope my interpretation is wrong. :)

* [virtio-dev] Re: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-16  7:19                                                                   ` Parav Pandit
@ 2024-01-16  8:20                                                                     ` Chen, Jiqian
  -1 siblings, 0 replies; 76+ messages in thread
From: Chen, Jiqian @ 2024-01-16  8:20 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray, Chen,
	Jiqian

On 2024/1/16 15:19, Parav Pandit wrote:
> 
>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>> Sent: Tuesday, January 16, 2024 12:08 PM
>>
>> On 2024/1/15 19:10, Parav Pandit wrote:
>>>
>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>> Sent: Monday, January 15, 2024 4:37 PM
>>>>
>>>> On 2024/1/15 18:52, Parav Pandit wrote:
>>>>>
>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>> Sent: Monday, January 15, 2024 4:17 PM
>>>>>>
>>>>>> On 2024/1/15 17:46, Parav Pandit wrote:
>>>>>>>
>>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
>>>>>>>> Sent: Monday, January 15, 2024 3:10 PM
>>>>>>>
>>>>>>>>> Display resources should only be restored when the following
>>>>>>>>> conditions are met:
>>>>>>>>> 1. PCI PM cap is reported.
>>>>>>>>> 2. PCI PM cap has No_Soft_Reset=1.
>>>>>>>>> 3. The virtio guest driver does not perform a reset when the
>>>>>>>>> transport offers a restore capability using #1 and #2.
>>>>>>>>>
>>>>>>>>> Do you agree? Yes?
>>>>>>>> Yes, I think this problem (display resources are destroyed during
>>>>>>>> S3) can be sorted into two situations:
>>>>>>>> The first is what you said above. In this situation, qemu's
>>>>>>>> devices_reset is unreasonable; if a device has the PM cap and
>>>>>>>> No_Soft_Reset=1, qemu should not reset it.
>>>>>>>
>>>>>>>> The second is without #1 or #2, where the reset behavior is fine.
>>>>>>>> My patch fixes this situation by sending a new virtio-gpu command
>>>>>>>> to notify qemu not to destroy display resources during S3.
>>>>>>>
>>>>>>> I still have a hard time following "My patch is to fix this situation...".
>>>>>>>
>>>>>>> When #1 and #2 are not done, there is nothing to restore.
>>>>>>> The driver should not send some new virtio-specific command when
>>>>>>> #1 and #2 are not there.
>>>>>>> Instead, if the device wants to restore the context, all of #1, #2
>>>>>>> and #3 should be done to implement the restore functionality.
>>>>>> When #1 and #2 are done, the "device reset behavior" should not
>>>>>> happen, and then the display resources are not destroyed.
>>>>>> I didn’t say to send a command to restore the context.
>>>>>
>>>>>> I mean when #1 and #2 are not done. The driver and qemu both reset
>>>>>> devices when resuming; at that time,
>>>>> The above behavior is as per the spec guidelines. Hence it is fine.
>>>>>
>>>>>> we need a method to prevent the display resources from being destroyed.
>>>>> I disagree with the above, as it is not as per the spec guidelines.
>>>>> Just follow #1, #2 and #3 so the display resources are not destroyed.
>>>> I agree that following #1, #2 and #3 will not destroy display resources.
>>>> So the question goes back: if there is no #2, and you say that the
>>>> reset behavior complies with the spec guidelines, how can I make
>>>> virtio-gpu work properly with the display resources destroyed?
>>> Implement #2. :)
>> We can't simply implement #2 if a device doesn't have #2, 
> How do you define "We" above?
> Isn't "we" == the device?
> 
>> see Section 7.5.2.2
>> in the PCI Express spec, about No_Soft_Reset:
>> "If a VF implements the Power Management Capability, the VF's value of this
>> field must be identical to the associated PF's value."
>> What's more, a device that doesn't have #2 is allowed; we should consider how
>> to handle both situations (No_Soft_Reset=1 and No_Soft_Reset=0), rather than
>> simply implementing #2 for a device when it does not have #2.
> A device implementation (sw or hw) that wants to offer maintaining the function context will implement No_Soft_Reset=1.
> 
>> Also in the PCI Express spec:
>> "Functional context is required to be maintained by Functions in the D3Hot
>> state if the No_Soft_Reset field in the PMCSR is Set. In this case, System
>> Software is not required to re-initialize the Function after a transition from
>> D3Hot to D0 (the Function will be in the D0active state). If the No_Soft_Reset
>> bit is Clear, functional context is not required to be maintained by the Function
>> in the D3Hot state."
> Right. It is NOT required to maintain. Hence, let's not maintain it.
> 
>> This description corresponds to the two situations we are discussing.
>> In the first, a device has #2 (No_Soft_Reset=1); the device reset will not
>> happen, so the virtio-gpu display-resource problem will not happen.
> Right.
> 
>> But in the second, a device doesn't have #2 (No_Soft_Reset=0), and it is fine
>> for the system to reset the device. At this time, the virtio-gpu display-resource
>> problem will happen, and we should also consider a method to solve that
>> problem, but not by implementing #2.
> No, because as written in the spec, "functional context is not required to be maintained". Hence there is no need to maintain it.
> 
>>
>>>
>>>> What my patch does is not prevent resetting; it prevents
>>>> destroying resources during resetting. Gerd Hoffmann agreed with
>>>> this point while reviewing my qemu patch.
>>>> Even if I use the PM cap, I still need to prevent destroying resources
>>>> during resetting.
>>> Do you mean that when the virtio device is reset, you want to prevent it?
>> I want to protect the display resources, not to prevent the whole process
>> of device resetting.
> This is contradictory.
> When you reset the device, it resets the functional context as well, as guided by PCI.
> 
>> Also in the PCI Express spec:
>> "Note that a Function's software driver participates in the process of
>> transitioning the Function from D0 to D3Hot. It contributes to the process by
>> saving any functional state that would otherwise be lost with removal of main
>> power, and otherwise preparing the Function for the transition to D3Hot."
>> Letting the guest driver notify qemu to protect the display resources that
>> would be lost during S3 is also compliant with the spec.
> This is already done by the guest PCI driver using the PM control bits. Right?
> The only thing needed is to fix the virtio layer to honor the PCI PM capabilities and skip resetting the device.
> 
>>
>>> If so, that is yet another virtio spec hack that must be avoided, as
>>> there should be a single reset flow.
>>> Please do #3 to avoid resetting the device.
>> The same reason as above. According to the spec, we should allow for the
>> occurrence of both situations (No_Soft_Reset=1 and No_Soft_Reset=0) and
>> provide a solution for each, rather than simply collapsing them into one.
>>
> With No_Soft_Reset=0 there is no obligation to restore the functional context.
> 
> It seems to me that you are trying to maintain the function context EVEN on device reset, with the motivation of not modifying the guest driver.
> If that is the motivation, it is certainly not right to work around things that way.
> I hope my interpretation is wrong. :)
You are wrong; my motivation is not to avoid modifying the guest driver. The guest driver doesn't have enough information to recreate all the display resources; this has been discussed and agreed upon in my qemu patch review.
I just want to ask you: if a device has no #2 (No_Soft_Reset=0), how do you solve the problem of virtio-gpu's display resources being destroyed? (Not by implementing #2, because this situation exists, is reasonable, and complies with the spec. Shouldn't we also consider how to modify the code so that virtio-gpu works properly in this situation? You can't just say to implement #2 to eliminate the existence of this situation, can you?)
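
For concreteness, the shape of the notification being discussed is roughly the following; the command name and request layout here are placeholders for illustration, not the spec wording of the patch:

#include <stdint.h>

/* Placeholder command value; the real name/number is whatever the
 * patch series defines. */
#define VIRTIO_GPU_CMD_PRESERVE_RESOURCES_EXAMPLE 0x10f

/* This header matches the generic virtio-gpu control header layout. */
struct virtio_gpu_ctrl_hdr {
    uint32_t type;
    uint32_t flags;
    uint64_t fence_id;
    uint32_t ctx_id;
    uint8_t  ring_idx;
    uint8_t  padding[3];
};

/* Hypothetical request body: the guest sets preserve=1 before entering
 * S3 so the device keeps display resources across the coming reset,
 * and preserve=0 after resume. */
struct virtio_gpu_preserve_resources_example {
    struct virtio_gpu_ctrl_hdr hdr;
    uint32_t preserve;
    uint32_t padding;
};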

-- 
Best regards,
Jiqian Chen.

* [virtio-dev] RE: [virtio-comment] RE: [PATCH v6 1/1] content: Add new feature VIRTIO_F_PRESERVE_RESOURCES
  2024-01-16  8:20                                                                     ` Chen, Jiqian
@ 2024-01-16  8:27                                                                       ` Parav Pandit
  -1 siblings, 0 replies; 76+ messages in thread
From: Parav Pandit @ 2024-01-16  8:27 UTC (permalink / raw)
  To: Chen, Jiqian
  Cc: Michael S. Tsirkin, Gerd Hoffmann, Jason Wang, Xuan Zhuo,
	David Airlie, Gurchetan Singh, Chia-I Wu, Marc-André Lureau,
	Robert Beckett, Mikhail Golubev-Ciuchea, virtio-comment,
	virtio-dev, Huang, Honglei1, Zhang, Julia, Huang, Ray


> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> Sent: Tuesday, January 16, 2024 1:51 PM
> 
> On 2024/1/16 15:19, Parav Pandit wrote:
> >
> >> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >> Sent: Tuesday, January 16, 2024 12:08 PM
> >>
> >> On 2024/1/15 19:10, Parav Pandit wrote:
> >>>
> >>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>> Sent: Monday, January 15, 2024 4:37 PM
> >>>>
> >>>> On 2024/1/15 18:52, Parav Pandit wrote:
> >>>>>
> >>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>> Sent: Monday, January 15, 2024 4:17 PM
> >>>>>>
> >>>>>> On 2024/1/15 17:46, Parav Pandit wrote:
> >>>>>>>
> >>>>>>>> From: Chen, Jiqian <Jiqian.Chen@amd.com>
> >>>>>>>> Sent: Monday, January 15, 2024 3:10 PM
> >>>>>>>
> >>>>>>>>> Display resources should only be restored when the following
> >>>>>>>>> conditions are met:
> >>>>>>>>> 1. PCI PM cap is reported.
> >>>>>>>>> 2. PCI PM cap has No_Soft_Reset=1.
> >>>>>>>>> 3. The virtio guest driver does not perform a reset when the
> >>>>>>>>> transport offers a restore capability using #1 and #2.
> >>>>>>>>>
> >>>>>>>>> Do you agree? Yes?
> >>>>>>>> Yes, I think this problem (display resources are destroyed
> >>>>>>>> during S3) can be sorted into two situations:
> >>>>>>>> The first is what you said above. In this situation, qemu's
> >>>>>>>> devices_reset is unreasonable; if a device has the PM cap
> >>>>>>>> and No_Soft_Reset=1, qemu should not reset it.
> >>>>>>>
> >>>>>>>> The second is without #1 or #2, where the reset behavior is
> >>>>>>>> fine. My patch fixes this situation by sending a new virtio-gpu
> >>>>>>>> command to notify qemu not to destroy display resources
> >>>>>>>> during S3.
> >>>>>>>
> >>>>>>> I still have a hard time following "My patch is to fix this situation...".
> >>>>>>>
> >>>>>>> When #1 and #2 are not done, there is nothing to restore.
> >>>>>>> The driver should not send some new virtio-specific command when
> >>>>>>> #1 and #2 are not there.
> >>>>>>> Instead, if the device wants to restore the context, all of #1, #2
> >>>>>>> and #3 should be done to implement the restore functionality.
> >>>>>> When #1 and #2 are done, the "device reset behavior" should not
> >>>>>> happen. And then the display resources are not destroyed.
> >>>>>> I didn’t say to send a command to restore the context.
> >>>>>
> >>>>>> I mean when #1 and #2 are not done. The driver and qemu both reset
> >>>>>> devices when resuming; at that time,
> >>>>> The above behavior is as per the spec guidelines. Hence it is fine.
> >>>>>
> >>>>>> we need a method to prevent the display resources from being destroyed.
> >>>>> I disagree with the above, as it is not as per the spec guidelines.
> >>>>> Just follow #1, #2 and #3 so the display resources are not destroyed.
> >>>> I agree that following #1, #2 and #3 will not destroy display resources.
> >>>> So the question goes back: if there is no #2, and you say that the
> >>>> reset behavior complies with the spec guidelines, how can I make
> >>>> virtio-gpu work properly with the display resources destroyed?
> >>> Implement #2. :)
> >> We can't simply implement #2 if a device doesn't have #2,
> > How do you define "We" above?
> > Isn't "we" == the device?
> >
> >> see Section 7.5.2.2
> >> in the PCI Express spec, about No_Soft_Reset:
> >> "If a VF implements the Power Management Capability, the VF's value
> >> of this field must be identical to the associated PF's value."
> >> What's more, a device that doesn't have #2 is allowed; we should consider
> >> how to handle both situations (No_Soft_Reset=1 and No_Soft_Reset=0),
> >> rather than simply implementing #2 for a device when it does
> >> not have #2.
> > A device implementation (sw or hw) that wants to offer maintaining the
> > function context will implement No_Soft_Reset=1.
> >
> >> Also in the PCI Express spec:
> >> "Functional context is required to be maintained by Functions in the
> >> D3Hot state if the No_Soft_Reset field in the PMCSR is Set. In this
> >> case, System Software is not required to re-initialize the Function
> >> after a transition from D3Hot to D0 (the Function will be in the
> >> D0active state). If the No_Soft_Reset bit is Clear, functional
> >> context is not required to be maintained by the Function in the D3Hot
> >> state."
> > Right. It is NOT required to maintain. Hence, let's not maintain it.
> >
> >> This description corresponds to the two situations we are discussing.
> >> In the first, a device has #2 (No_Soft_Reset=1); the device reset will
> >> not happen, so the virtio-gpu display-resource problem will not
> >> happen.
> > Right.
> >
> >> But in the second, a device doesn't have #2 (No_Soft_Reset=0), and it
> >> is fine for the system to reset the device. At this time, the virtio-gpu
> >> display-resource problem will happen, and we should also consider
> >> a method to solve that problem, but not by implementing #2.
> > No, because as written in the spec, "functional context is not required
> > to be maintained". Hence there is no need to maintain it.
> >
> >>
> >>>
> >>>> What my patch does is not prevent resetting; it prevents
> >>>> destroying resources during resetting. Gerd Hoffmann agreed with
> >>>> this point while reviewing my qemu patch.
> >>>> Even if I use the PM cap, I still need to prevent destroying resources
> >>>> during resetting.
> >>> Do you mean that when the virtio device is reset, you want to prevent it?
> >> I want to protect the display resources, not to prevent the whole process
> >> of device resetting.
> > This is contradictory.
> > When you reset the device, it resets the functional context as well, as
> > guided by PCI.
> >
> >> Also in the PCI Express spec:
> >> "Note that a Function's software driver participates in the process
> >> of transitioning the Function from D0 to D3Hot. It contributes to the
> >> process by saving any functional state that would otherwise be lost
> >> with removal of main power, and otherwise preparing the Function for
> >> the transition to D3Hot."
> >> Letting the guest driver notify qemu to protect the display resources
> >> that would be lost during S3 is also compliant with the spec.
> > This is already done by the guest PCI driver using the PM control bits. Right?
> > The only thing needed is to fix the virtio layer to honor the PCI PM
> > capabilities and skip resetting the device.
> >
> >>
> >>> If so, that is yet another virtio spec hack that must be avoided, as
> >>> there should be a single reset flow.
> >>> Please do #3 to avoid resetting the device.
> >> The same reason as above. According to the spec, we should allow for
> >> the occurrence of both situations (No_Soft_Reset=1 and No_Soft_Reset=0)
> >> and provide a solution for each, rather than simply collapsing them
> >> into one.
> >>
> > With No_Soft_Reset=0 there is no obligation to restore the functional context.
> >
> > It seems to me that you are trying to maintain the function context EVEN
> > on device reset, with the motivation of not modifying the guest driver.
> > If that is the motivation, it is certainly not right to work around things
> > that way.
> > I hope my interpretation is wrong. :)
> You are wrong; my motivation is not to avoid modifying the guest driver.
Ok, good.

> The guest driver doesn't have enough information to recreate all the
> display resources; this has been discussed and agreed upon in my qemu
> patch review.
Ok. I am also in favor of restoring the resources on the D3->D0 transition.
We are aligned on #1 and #3.

> I just want to ask you: if a device has no #2 (No_Soft_Reset=0), how do you
> solve the problem of virtio-gpu's display resources being destroyed?
By implementing #2.

> (Not by implementing #2, because this situation exists,
Which situation exists? Device implementation is missing #2?
If so, let's improve the device implementation to do #2.

> is reasonable, and complies with
> the spec. Shouldn't we also consider how to modify the code so that
> virtio-gpu works properly in this situation? You can't just say to implement
> #2 to eliminate the existence of this situation, can you?)
I didn’t follow; what is the problem in implementing #2?
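
For what "implementing #2" amounts to on the device-model side, here is a minimal sketch under assumed names (add_capability(), cfg_space_set16()/cfg_space_get16() and the hooks are illustrative stand-ins, not any real device-model API):

#include <stdint.h>

#define PCI_CAP_ID_PM             0x01
#define PCI_PM_CTRL               0x04    /* PMCSR offset within the PM cap */
#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008  /* bit 3: context kept in D3hot */
#define PCI_PM_CTRL_STATE_MASK    0x0003  /* PowerState field of PMCSR */

uint8_t  add_capability(uint8_t cap_id);   /* places the cap in config space */
void     cfg_space_set16(uint16_t off, uint16_t val);
uint16_t cfg_space_get16(uint16_t off);

static uint8_t pm_cap;

/* #2: advertise a PM capability whose PMCSR reports No_Soft_Reset=1. */
void device_init_pm_cap(void)
{
    pm_cap = add_capability(PCI_CAP_ID_PM);
    cfg_space_set16(pm_cap + PCI_PM_CTRL, PCI_PM_CTRL_NO_SOFT_RESET);
}

/* With No_Soft_Reset=1, a D3hot -> D0 PowerState write must not wipe
 * the functional context, so the model only updates the PowerState
 * field and keeps its resources (e.g. virtio-gpu display resources). */
void on_pmcsr_write(uint16_t val)
{
    uint16_t pmcsr = cfg_space_get16(pm_cap + PCI_PM_CTRL);

    pmcsr = (pmcsr & ~PCI_PM_CTRL_STATE_MASK) | (val & PCI_PM_CTRL_STATE_MASK);
    cfg_space_set16(pm_cap + PCI_PM_CTRL, pmcsr);
    /* Deliberately no internal reset here: state survives D3hot. */
}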

Thread overview: 76+ messages
2023-10-21  3:51 [virtio-dev] [PATCH v6 0/1] Add new feature VIRTIO_F_PRESERVE_RESOURCES Jiqian Chen
2023-10-21  3:51 ` [virtio-dev] [PATCH v6 1/1] content: " Jiqian Chen
2023-10-23  6:00   ` [virtio-dev] " Parav Pandit
2023-10-23 10:38     ` [virtio-dev] " Chen, Jiqian
2023-10-23 13:35       ` [virtio-dev] " Parav Pandit
2023-10-24 10:35         ` [virtio-dev] " Chen, Jiqian
2023-10-24 10:51           ` [virtio-dev] " Parav Pandit
2023-10-24 12:13             ` [virtio-dev] " Chen, Jiqian
2023-10-25  3:51               ` [virtio-dev] " Parav Pandit
2023-10-26 10:24                 ` [virtio-dev] " Chen, Jiqian
2023-10-26 10:30                   ` [virtio-dev] " Michael S. Tsirkin
2023-10-27  3:03                     ` [virtio-dev] " Chen, Jiqian
2024-01-12  7:41                       ` [virtio-dev] " Chen, Jiqian
2024-01-12  8:02                         ` [virtio-dev] " Parav Pandit
2024-01-12  8:25                           ` [virtio-dev] " Chen, Jiqian
2024-01-12  8:47                             ` [virtio-dev] " Parav Pandit
2024-01-12  9:24                               ` [virtio-dev] " Chen, Jiqian
2024-01-12  9:44                                 ` [virtio-dev] " Parav Pandit
2024-01-15  7:33                                   ` [virtio-dev] " Chen, Jiqian
2024-01-15  7:37                                     ` [virtio-dev] " Parav Pandit
2024-01-15  7:48                                       ` [virtio-dev] " Chen, Jiqian
2024-01-15  7:55                                         ` [virtio-dev] " Parav Pandit
2024-01-15  8:20                                           ` [virtio-dev] " Chen, Jiqian
2024-01-15  8:52                                             ` [virtio-dev] " Parav Pandit
2024-01-15  9:09                                               ` [virtio-dev] " Chen, Jiqian
2024-01-15  9:16                                                 ` [virtio-dev] " Parav Pandit
2024-01-15  9:40                                                   ` [virtio-dev] " Chen, Jiqian
2024-01-15  9:46                                                     ` [virtio-dev] " Parav Pandit
2024-01-15 10:47                                                       ` [virtio-dev] " Chen, Jiqian
2024-01-15 10:52                                                         ` [virtio-dev] " Parav Pandit
2024-01-15 11:07                                                           ` Chen, Jiqian
2024-01-15 11:10                                                             ` [virtio-dev] " Parav Pandit
2024-01-16  6:37                                                               ` [virtio-dev] " Chen, Jiqian
2024-01-16  7:19                                                                 ` [virtio-dev] " Parav Pandit
2024-01-16  8:20                                                                   ` [virtio-dev] " Chen, Jiqian
2024-01-16  8:27                                                                     ` [virtio-dev] " Parav Pandit
2024-01-15  0:25                         ` [virtio-dev] " Jason Wang
2024-01-15  7:25                           ` Chen, Jiqian