All of lore.kernel.org
 help / color / mirror / Atom feed
* [virtio-comment] [PATCH 0/2] Add alternative to deflate-on-oom
@ 2022-01-24 12:56 David Stevens
  2022-01-24 12:56 ` [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue David Stevens
  2022-01-24 12:56 ` [virtio-comment] [PATCH 2/2] virtio-balloon: add a responsive host feature David Stevens
  0 siblings, 2 replies; 16+ messages in thread
From: David Stevens @ 2022-01-24 12:56 UTC (permalink / raw)
  To: virtio-comment; +Cc: David Stevens

This series provides an alternative to deflate-on-oom, by adding a
feature flag that lets the driver assume that all memory in the balloon
will be made available when necessary. To allow the device to better
fulfil that assumption, this series also adds an event queue for the
driver to send OOM events to the device.

Currently, if deflate-on-oom is not provided, then the guest driver does
not know when, or even if, the memory in the balloon will ever be
returned. Because of this, the Linux virtio balloon driver effectively
unplugs memory in the balloon when deflate-on-oom is not provided.
However, this can lead to compatibilty issues with userspace, since very
few userspace programs actually handle hotplug of memory, and Linux's
uAPIs aren't really even suited to potentially high frequency plugging
and unplugging memory. With this alternative to deflate-on-oom, systems
which cannot use deflate-on-oom can still run unmodified Linux programs
in the guest.

Since this series is already adding an event queue, it also adds an
event the guest can use to inform the host of an allocation failure when
trying to inflate the balloon. This provides an explicit signal that
inflation may take a long time or fail altogether, which the device can
then handle as it sees fit (e.g. abort the inflation attempt and tell
the program making resource management decisions that it wasn't able to
free enough memory from the guest).

David Stevens (2):
  virtio-balloon: add an event queue
  virtio-balloon: add a responsive host feature

 conformance.tex |  2 ++
 content.tex     | 75 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)

-- 
2.35.0.rc0.227.g00780c9af4-goog


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-01-24 12:56 [virtio-comment] [PATCH 0/2] Add alternative to deflate-on-oom David Stevens
@ 2022-01-24 12:56 ` David Stevens
  2022-01-28 15:46   ` David Hildenbrand
  2022-08-09 20:52   ` Michael S. Tsirkin
  2022-01-24 12:56 ` [virtio-comment] [PATCH 2/2] virtio-balloon: add a responsive host feature David Stevens
  1 sibling, 2 replies; 16+ messages in thread
From: David Stevens @ 2022-01-24 12:56 UTC (permalink / raw)
  To: virtio-comment; +Cc: David Stevens

Add an event queue to the balloon to allow the driver to send events to
the device, which allows the device to be more responsive to the memory
needs of the guest.

There are two defined events. The first event is an out of memory event.
This event provides a way for the guest to handle out of memory events
on a system where the host does not support deflate-on-oom. The second
event is an out of puff event, which the guest can send when it fails to
allocate memory while trying to inflate the balloon. This serves as an
indication to the device that the driver may not be able to inflate the
balloon in a timely manner.

Signed-off-by: David Stevens <stevensd@chromium.org>
---
 conformance.tex |  2 ++
 content.tex     | 59 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/conformance.tex b/conformance.tex
index 42f853762ebe..5e0baa5ae7df 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -181,6 +181,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
 \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
 \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
+\item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Events}
 \end{itemize}
 
 \conformance{\subsection}{SCSI Host Driver Conformance}\label{sec:Conformance / Driver Conformance / SCSI Host Driver Conformance}
@@ -440,6 +441,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
 \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
 \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
 \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
+\item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Events}
 \end{itemize}
 
 \conformance{\subsection}{SCSI Host Device Conformance}\label{sec:Conformance / Device Conformance / SCSI Host Device Conformance}
diff --git a/content.tex b/content.tex
index 32de6685c50b..3aeb319d31a7 100644
--- a/content.tex
+++ b/content.tex
@@ -5420,6 +5420,7 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
 \item[2] statsq
 \item[3] free_page_vq
 \item[4] reporting_vq
+\item[5] event_vq
 \end{description}
 
   statsq only exists if VIRTIO_BALLOON_F_STATS_VQ is set.
@@ -5428,6 +5429,8 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
 
   reporting_vq only exists if VIRTIO_BALLOON_F_PAGE_REPORTING is set.
 
+  event_vq only exists if VIRTIO_BALLOON_F_EVENT_VQ is set.
+
 \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Feature bits}
 \begin{description}
 \item[VIRTIO_BALLOON_F_MUST_TELL_HOST (0)] Host has to be told before
@@ -5446,6 +5449,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
     Configuration field \field{poison_val} is valid.
 \item[ VIRTIO_BALLOON_F_PAGE_REPORTING(5) ] The device has support for free
     page reporting. A virtqueue for reporting free guest memory is present.
+\item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
+    the driver to the device.
 
 \end{description}
 
@@ -5529,6 +5534,8 @@ \subsection{Device Initialization}\label{sec:Device Types / Memory Balloon Devic
   notify the device about the stats virtqueue buffer.
 \item If the VIRTIO_BALLOON_F_PAGE_REPORTING feature bit is negotiated, then
   begin reporting free pages to the device.
+\item If the VIRTIO_BALLOON_F_EVENT_VQ feature bit is negotiated, then
+  begin reporting events to the device.
 \end{enumerate}
 
 \subsection{Device Operation}\label{sec:Device Types / Memory Balloon Device / Device Operation}
@@ -6031,6 +6038,58 @@ \subsubsection{Free Page Reporting}\label{sec:Device Types / Memory Balloon Devi
 MUST NOT modify the the content of a reported page to a value other than
 \field{poison_val}.
 
+\subsubsection{Events}\label{sec:Device Types / Memory Balloon Device / Device Operation / Events}
+
+The events virtqueue allows the driver to signal events to the
+device.
+
+\begin{lstlisting}
+struct virtio_balloon_event {
+        u32 type;
+        u32 data;
+};
+
+#define VIRTIO_BALLOON_EVENT_OOM        1
+#define VIRTIO_BALLOON_EVENT_OOPUFF     2
+\end{lstlisting}
+
+The \field{type} determines how to interpret the event. The following
+events are defined:
+
+\begin{itemize}
+\item Out of memory
+\begin{lstlisting}
+#define VIRTIO_BALLOON_EVENT_OOM        1
+\end{lstlisting}
+The driver has encountered a situation in which using pages from the balloon
+is necessary for system stability (e.g. if memory is required by applications
+running within the guest). The \field{data} value indicates how many pages
+the driver requires to maintain system stability.
+\item Out of puff
+\begin{lstlisting}
+#define VIRTIO_BALLOON_EVENT_OOPUFF     2
+\end{lstlisting}
+The driver has encountered an allocation failure when trying to inflate
+the balloon. The \field{data} value is unused. This event serves as a signal
+that the balloon may not be able to inflate in a timely manner.
+\end{itemize}
+
+\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
+
+The driver MUST update \field{actual} with any allocated pages before
+sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
+
+The driver SHOULD wait for the device to acknowledge the event
+before trying to further inflate or deflate the balloon.
+
+If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
+SHOULD send an OOM event before using pages from the balloon.
+
+\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
+
+When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
+the balloon by \field{data} pages before acknowledging the event.
+
 \section{SCSI Host Device}\label{sec:Device Types / SCSI Host Device}
 
 The virtio SCSI host device groups together one or more virtual
-- 
2.35.0.rc0.227.g00780c9af4-goog


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [virtio-comment] [PATCH 2/2] virtio-balloon: add a responsive host feature
  2022-01-24 12:56 [virtio-comment] [PATCH 0/2] Add alternative to deflate-on-oom David Stevens
  2022-01-24 12:56 ` [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue David Stevens
@ 2022-01-24 12:56 ` David Stevens
  2022-01-28 15:52   ` David Hildenbrand
  1 sibling, 1 reply; 16+ messages in thread
From: David Stevens @ 2022-01-24 12:56 UTC (permalink / raw)
  To: virtio-comment; +Cc: David Stevens

Add a feature bit that the device can use to indicate that it will
monitor and respond to memory pressure in the guest. This flag allows
the driver to assume that the device will provide memory when necessary
and will not permanently remove memory from the guest via inflating the
balloon.

Signed-off-by: David Stevens <stevensd@chromium.org>
---
 content.tex | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/content.tex b/content.tex
index 3aeb319d31a7..dcb2b71a2b6a 100644
--- a/content.tex
+++ b/content.tex
@@ -5451,6 +5451,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
     page reporting. A virtqueue for reporting free guest memory is present.
 \item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
     the driver to the device.
+\item[ VIRTIO_BALLOON_F_RESPONSIVE_HOST(7) ] The device will respond to memory
+    pressure in the guest by deflating the balloon.
 
 \end{description}
 
@@ -5471,6 +5473,14 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
 bit, and if the driver did not accept this feature bit, the
 device MAY signal failure by failing to set FEATURES_OK
 \field{device status} bit when the driver writes it.
+
+If the device offers the VIRTIO_BALLOON_F_RESPONSIVE_HOST feature
+bit, it MUST also offer the VIRTIO_BALLOON_F_STATS_VQ and
+VIRTIO_BALLOON_F_EVENT_VQ feature bits. Although the device may not
+always be able to immediately respond to memory pressure in the
+guest, the device SHOULD be able to fully deflate the balloon if
+memory pressure persists in the guest.
+
 \subparagraph{Legacy Interface: Feature bits}\label{sec:Device
 Types / Memory Balloon Device / Feature bits / Legacy Interface:
 Feature bits}
@@ -6087,8 +6097,14 @@ \subsubsection{Events}\label{sec:Device Types / Memory Balloon Device / Device O
 
 \devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
 
-When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
-the balloon by \field{data} pages before acknowledging the event.
+If VIRTIO_BALLOON_F_RESPONSIVE_HOST has not been negotiated, when the device
+receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate the balloon by
+\field{data} pages before acknowledging the event.
+
+If VIRTIO_BALLOON_F_RESPONSIVE_HOST has been negotiated, when the device
+receives a VIRTIO_BALLOON_EVENT_OOM event, it MUST deflate the balloon
+by \field{data} pages before acknowledging the event, unless doing so
+would jeopardize stability of the host.
 
 \section{SCSI Host Device}\label{sec:Device Types / SCSI Host Device}
 
-- 
2.35.0.rc0.227.g00780c9af4-goog


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-01-24 12:56 ` [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue David Stevens
@ 2022-01-28 15:46   ` David Hildenbrand
  2022-01-31  7:10     ` David Stevens
  2022-08-09 20:52   ` Michael S. Tsirkin
  1 sibling, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2022-01-28 15:46 UTC (permalink / raw)
  To: David Stevens, virtio-comment

On 24.01.22 13:56, David Stevens wrote:
> Add an event queue to the balloon to allow the driver to send events to
> the device, which allows the device to be more responsive to the memory
> needs of the guest.
> 
> There are two defined events. The first event is an out of memory event.
> This event provides a way for the guest to handle out of memory events
> on a system where the host does not support deflate-on-oom. The second
> event is an out of puff event, which the guest can send when it fails to
> allocate memory while trying to inflate the balloon. This serves as an
> indication to the device that the driver may not be able to inflate the
> balloon in a timely manner.

Do we actually need a queue for that or could a simple member in the
config space be sufficient?

Something like a "logical driver state". By storing specific value, the
the driver can notify the hypervisor about such events.

Just an idea.

Of course, you cannot really come up with an additional payload, but my
guess is that "The \field{data} value indicates how many pages the
driver requires to maintain system stability." will be wrong in 99.999%
of all cases. Relying on anything provided by the shrinker or the OOM
handler does not give you an idea what the system needs overall to "
maintain system stability".

[...]

> +
> +\begin{itemize}
> +\item Out of memory
> +\begin{lstlisting}
> +#define VIRTIO_BALLOON_EVENT_OOM        1
> +\end{lstlisting}
> +The driver has encountered a situation in which using pages from the balloon
> +is necessary for system stability (e.g. if memory is required by applications
> +running within the guest). The \field{data} value indicates how many pages
> +the driver requires to maintain system stability.
> +\item Out of puff
> +\begin{lstlisting}
> +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> +\end{lstlisting}
> +The driver has encountered an allocation failure when trying to inflate
> +the balloon. The \field{data} value is unused. This event serves as a signal
> +that the balloon may not be able to inflate in a timely manner.
> +\end{itemize}
> +
> +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> +
> +The driver MUST update \field{actual} with any allocated pages before
> +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
> +
> +The driver SHOULD wait for the device to acknowledge the event
> +before trying to further inflate or deflate the balloon.
> +
> +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
> +SHOULD send an OOM event before using pages from the balloon.
> +
> +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> +
> +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
> +the balloon by \field{data} pages before acknowledging the event.

The issue is that this is asynchronous. You won't really be able to stop
OOM from killing processes as you usually won't be able to get back
pages fast enough.

Even deflate-on-oom is flawed in that regard. If you run in the OOM
handler you're pretty much screwed already. But I mean, having it is not
making the situation worse :)

-- 
Thanks,

David / dhildenb


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 2/2] virtio-balloon: add a responsive host feature
  2022-01-24 12:56 ` [virtio-comment] [PATCH 2/2] virtio-balloon: add a responsive host feature David Stevens
@ 2022-01-28 15:52   ` David Hildenbrand
  2022-01-31  7:09     ` David Stevens
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2022-01-28 15:52 UTC (permalink / raw)
  To: David Stevens, virtio-comment

On 24.01.22 13:56, David Stevens wrote:
> Add a feature bit that the device can use to indicate that it will
> monitor and respond to memory pressure in the guest. This flag allows
> the driver to assume that the device will provide memory when necessary
> and will not permanently remove memory from the guest via inflating the
> balloon.
> 
> Signed-off-by: David Stevens <stevensd@chromium.org>
> ---
>  content.tex | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/content.tex b/content.tex
> index 3aeb319d31a7..dcb2b71a2b6a 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -5451,6 +5451,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
>      page reporting. A virtqueue for reporting free guest memory is present.
>  \item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
>      the driver to the device.
> +\item[ VIRTIO_BALLOON_F_RESPONSIVE_HOST(7) ] The device will respond to memory
> +    pressure in the guest by deflating the balloon.

s/HOST/DEVICE/ ?

>  
>  \end{description}
>  
> @@ -5471,6 +5473,14 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
>  bit, and if the driver did not accept this feature bit, the
>  device MAY signal failure by failing to set FEATURES_OK
>  \field{device status} bit when the driver writes it.
> +
> +If the device offers the VIRTIO_BALLOON_F_RESPONSIVE_HOST feature
> +bit, it MUST also offer the VIRTIO_BALLOON_F_STATS_VQ and
> +VIRTIO_BALLOON_F_EVENT_VQ feature bits. Although the device may not
> +always be able to immediately respond to memory pressure in the
> +guest, the device SHOULD be able to fully deflate the balloon if
> +memory pressure persists in the guest.

Hm. If you take a look at the history of memory ballooning, it is even
*desired* for the VM to have memory pressure.

The hypervisor has memory pressure, instead of swapping random stuff, it
inflates the memory balloon of the VMs.

The VMs will be *under memory pressure* and either shrink the pagecache
or start swapping what they consider least valuable. Reclaim under
memory pressure.

So memory pressure is intended, you just don't want to destabilize the
VMs. Maybe you actually didn't intend to phrase it that way here?



-- 
Thanks,

David / dhildenb


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 2/2] virtio-balloon: add a responsive host feature
  2022-01-28 15:52   ` David Hildenbrand
@ 2022-01-31  7:09     ` David Stevens
  2022-01-31  8:28       ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: David Stevens @ 2022-01-31  7:09 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-comment

On Sat, Jan 29, 2022 at 12:52 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.01.22 13:56, David Stevens wrote:
> > Add a feature bit that the device can use to indicate that it will
> > monitor and respond to memory pressure in the guest. This flag allows
> > the driver to assume that the device will provide memory when necessary
> > and will not permanently remove memory from the guest via inflating the
> > balloon.
> >
> > Signed-off-by: David Stevens <stevensd@chromium.org>
> > ---
> >  content.tex | 20 ++++++++++++++++++--
> >  1 file changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/content.tex b/content.tex
> > index 3aeb319d31a7..dcb2b71a2b6a 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -5451,6 +5451,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
> >      page reporting. A virtqueue for reporting free guest memory is present.
> >  \item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
> >      the driver to the device.
> > +\item[ VIRTIO_BALLOON_F_RESPONSIVE_HOST(7) ] The device will respond to memory
> > +    pressure in the guest by deflating the balloon.
>
> s/HOST/DEVICE/ ?

I was being consistent with the MUST_TELL_HOST flag, but I can switch
to DEVICE if that's preferred for consistency with the virtio spec as
a whole.

> >
> >  \end{description}
> >
> > @@ -5471,6 +5473,14 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
> >  bit, and if the driver did not accept this feature bit, the
> >  device MAY signal failure by failing to set FEATURES_OK
> >  \field{device status} bit when the driver writes it.
> > +
> > +If the device offers the VIRTIO_BALLOON_F_RESPONSIVE_HOST feature
> > +bit, it MUST also offer the VIRTIO_BALLOON_F_STATS_VQ and
> > +VIRTIO_BALLOON_F_EVENT_VQ feature bits. Although the device may not
> > +always be able to immediately respond to memory pressure in the
> > +guest, the device SHOULD be able to fully deflate the balloon if
> > +memory pressure persists in the guest.
>
> Hm. If you take a look at the history of memory ballooning, it is even
> *desired* for the VM to have memory pressure.
>
> The hypervisor has memory pressure, instead of swapping random stuff, it
> inflates the memory balloon of the VMs.
>
> The VMs will be *under memory pressure* and either shrink the pagecache
> or start swapping what they consider least valuable. Reclaim under
> memory pressure.
>
> So memory pressure is intended, you just don't want to destabilize the
> VMs. Maybe you actually didn't intend to phrase it that way here?

That's a good point, the pressure does go both ways. How about
something like this:

Device requirements: The device SHOULD deflate the balloon if it
determines that memory pressure in the guest is higher than memory
pressure in the host. If memory pressure in the guest continuously
exceeds memory pressure in the host, the device SHOULD be able to
fully deflate the balloon.

-David

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-01-28 15:46   ` David Hildenbrand
@ 2022-01-31  7:10     ` David Stevens
  2022-01-31  8:43       ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: David Stevens @ 2022-01-31  7:10 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-comment

On Sat, Jan 29, 2022 at 12:46 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 24.01.22 13:56, David Stevens wrote:
> > Add an event queue to the balloon to allow the driver to send events to
> > the device, which allows the device to be more responsive to the memory
> > needs of the guest.
> >
> > There are two defined events. The first event is an out of memory event.
> > This event provides a way for the guest to handle out of memory events
> > on a system where the host does not support deflate-on-oom. The second
> > event is an out of puff event, which the guest can send when it fails to
> > allocate memory while trying to inflate the balloon. This serves as an
> > indication to the device that the driver may not be able to inflate the
> > balloon in a timely manner.
>
> Do we actually need a queue for that or could a simple member in the
> config space be sufficient?
>
> Something like a "logical driver state". By storing specific value, the
> the driver can notify the hypervisor about such events.
>
> Just an idea.

Having an event field in the config space should work. I do think
having an ack is important, but that could be done via the device
clearing the event field and triggering a config change callback.

The one drawback of this approach I can think of is that it requires
that the driver writing to the device config space generates events
for the device. At least based on my understanding of the balloon
spec, the device can completely ignore any writes to the device config
space after the DRIVER_OK bit is set. That allows the device to put
the device config on its own page and use KVM to map that page into
the guest, which then skips the need for any vmexits when
reading/writing num_pages or actual in the guest. I guess this would
still technically be achievable even with the event field by making
the offset of the device config non-page-aligned, so that the event
field lies on a new page. However, if any new fields get added after
the event field, they would generate vmexits as well. I don't know if
that's a concern to worry about, though.

> Of course, you cannot really come up with an additional payload, but my
> guess is that "The \field{data} value indicates how many pages the
> driver requires to maintain system stability." will be wrong in 99.999%
> of all cases. Relying on anything provided by the shrinker or the OOM
> handler does not give you an idea what the system needs overall to "
> maintain system stability".

One of the device or the driver needs to determine how much to deflate
the balloon in response to the OOM. It's true that the driver probably
can't guess exactly how much memory is needed, but it's definitely
going to be able to make a better guess than the device. In the end,
the device can do whatever it wants with the request, but I think it's
still useful for the driver to be able to provide a hint.

> [...]
>
> > +
> > +\begin{itemize}
> > +\item Out of memory
> > +\begin{lstlisting}
> > +#define VIRTIO_BALLOON_EVENT_OOM        1
> > +\end{lstlisting}
> > +The driver has encountered a situation in which using pages from the balloon
> > +is necessary for system stability (e.g. if memory is required by applications
> > +running within the guest). The \field{data} value indicates how many pages
> > +the driver requires to maintain system stability.
> > +\item Out of puff
> > +\begin{lstlisting}
> > +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> > +\end{lstlisting}
> > +The driver has encountered an allocation failure when trying to inflate
> > +the balloon. The \field{data} value is unused. This event serves as a signal
> > +that the balloon may not be able to inflate in a timely manner.
> > +\end{itemize}
> > +
> > +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> > +
> > +The driver MUST update \field{actual} with any allocated pages before
> > +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
> > +
> > +The driver SHOULD wait for the device to acknowledge the event
> > +before trying to further inflate or deflate the balloon.
> > +
> > +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
> > +SHOULD send an OOM event before using pages from the balloon.
> > +
> > +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> > +
> > +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
> > +the balloon by \field{data} pages before acknowledging the event.
>
> The issue is that this is asynchronous. You won't really be able to stop
> OOM from killing processes as you usually won't be able to get back
> pages fast enough.

If the device reduces num_pages before acking the message, then the
driver can wait for the ack and deflate the balloon synchronously. For
Linux specifically, blocking in the OOM notifier is fine (at least the
balloon driver already acquires a mutex here). And while it's true
that reclaiming memory might not be fast, my understanding is that
anywhere that could invoke the OOM killer can also invoke swap to
disk, which is also not fast.

-David

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 2/2] virtio-balloon: add a responsive host feature
  2022-01-31  7:09     ` David Stevens
@ 2022-01-31  8:28       ` David Hildenbrand
  0 siblings, 0 replies; 16+ messages in thread
From: David Hildenbrand @ 2022-01-31  8:28 UTC (permalink / raw)
  To: David Stevens; +Cc: virtio-comment

On 31.01.22 08:09, David Stevens wrote:
> On Sat, Jan 29, 2022 at 12:52 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 24.01.22 13:56, David Stevens wrote:
>>> Add a feature bit that the device can use to indicate that it will
>>> monitor and respond to memory pressure in the guest. This flag allows
>>> the driver to assume that the device will provide memory when necessary
>>> and will not permanently remove memory from the guest via inflating the
>>> balloon.
>>>
>>> Signed-off-by: David Stevens <stevensd@chromium.org>
>>> ---
>>>  content.tex | 20 ++++++++++++++++++--
>>>  1 file changed, 18 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/content.tex b/content.tex
>>> index 3aeb319d31a7..dcb2b71a2b6a 100644
>>> --- a/content.tex
>>> +++ b/content.tex
>>> @@ -5451,6 +5451,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
>>>      page reporting. A virtqueue for reporting free guest memory is present.
>>>  \item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
>>>      the driver to the device.
>>> +\item[ VIRTIO_BALLOON_F_RESPONSIVE_HOST(7) ] The device will respond to memory
>>> +    pressure in the guest by deflating the balloon.
>>
>> s/HOST/DEVICE/ ?
> 
> I was being consistent with the MUST_TELL_HOST flag, but I can switch
> to DEVICE if that's preferred for consistency with the virtio spec as
> a whole.

Yeah, IIRC nowadays we avoid using host/guest terminology and instead
use device/driver. MUST_TELL_HOST is quite ancient :)

> 
>>>
>>>  \end{description}
>>>
>>> @@ -5471,6 +5473,14 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
>>>  bit, and if the driver did not accept this feature bit, the
>>>  device MAY signal failure by failing to set FEATURES_OK
>>>  \field{device status} bit when the driver writes it.
>>> +
>>> +If the device offers the VIRTIO_BALLOON_F_RESPONSIVE_HOST feature
>>> +bit, it MUST also offer the VIRTIO_BALLOON_F_STATS_VQ and
>>> +VIRTIO_BALLOON_F_EVENT_VQ feature bits. Although the device may not
>>> +always be able to immediately respond to memory pressure in the
>>> +guest, the device SHOULD be able to fully deflate the balloon if
>>> +memory pressure persists in the guest.
>>
>> Hm. If you take a look at the history of memory ballooning, it is even
>> *desired* for the VM to have memory pressure.
>>
>> The hypervisor has memory pressure, instead of swapping random stuff, it
>> inflates the memory balloon of the VMs.
>>
>> The VMs will be *under memory pressure* and either shrink the pagecache
>> or start swapping what they consider least valuable. Reclaim under
>> memory pressure.
>>
>> So memory pressure is intended, you just don't want to destabilize the
>> VMs. Maybe you actually didn't intend to phrase it that way here?
> 
> That's a good point, the pressure does go both ways. How about
> something like this:
> 
> Device requirements: The device SHOULD deflate the balloon if it
> determines that memory pressure in the guest is higher than memory
> pressure in the host. If memory pressure in the guest continuously
> exceeds memory pressure in the host, the device SHOULD be able to
> fully deflate the balloon.


Or maybe something more generic:

The device SHOULD deflate the balloon if it derives from the memory
statistics reported by the driver that the driver is under severe,
possibly harmful, memory pressure. The device MAY deflate the balloon if
it derives that the driver is continuously under memory pressure, but
MAY decide otherwise, for example, if the device itself is under memory
pressure.



-- 
Thanks,

David / dhildenb


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-01-31  7:10     ` David Stevens
@ 2022-01-31  8:43       ` David Hildenbrand
  2022-02-04  1:41         ` David Stevens
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2022-01-31  8:43 UTC (permalink / raw)
  To: David Stevens; +Cc: virtio-comment

On 31.01.22 08:10, David Stevens wrote:
> On Sat, Jan 29, 2022 at 12:46 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 24.01.22 13:56, David Stevens wrote:
>>> Add an event queue to the balloon to allow the driver to send events to
>>> the device, which allows the device to be more responsive to the memory
>>> needs of the guest.
>>>
>>> There are two defined events. The first event is an out of memory event.
>>> This event provides a way for the guest to handle out of memory events
>>> on a system where the host does not support deflate-on-oom. The second
>>> event is an out of puff event, which the guest can send when it fails to
>>> allocate memory while trying to inflate the balloon. This serves as an
>>> indication to the device that the driver may not be able to inflate the
>>> balloon in a timely manner.
>>
>> Do we actually need a queue for that or could a simple member in the
>> config space be sufficient?
>>
>> Something like a "logical driver state". By storing specific value, the
>> the driver can notify the hypervisor about such events.
>>
>> Just an idea.
> 
> Having an event field in the config space should work. I do think
> having an ack is important, but that could be done via the device
> clearing the event field and triggering a config change callback.
> 

Right, but for an ACK a queue might actually be better.

> The one drawback of this approach I can think of is that it requires
> that the driver writing to the device config space generates events
> for the device. At least based on my understanding of the balloon
> spec, the device can completely ignore any writes to the device config
> space after the DRIVER_OK bit is set. That allows the device to put
> the device config on its own page and use KVM to map that page into
> the guest, which then skips the need for any vmexits when
> reading/writing num_pages or actual in the guest. I guess this would
> still technically be achievable even with the event field by making
> the offset of the device config non-page-aligned, so that the event
> field lies on a new page. However, if any new fields get added after
> the event field, they would generate vmexits as well. I don't know if
> that's a concern to worry about, though.
> 
>> Of course, you cannot really come up with an additional payload, but my
>> guess is that "The \field{data} value indicates how many pages the
>> driver requires to maintain system stability." will be wrong in 99.999%
>> of all cases. Relying on anything provided by the shrinker or the OOM
>> handler does not give you an idea what the system needs overall to "
>> maintain system stability".
> 
> One of the device or the driver needs to determine how much to deflate
> the balloon in response to the OOM. It's true that the driver probably
> can't guess exactly how much memory is needed, but it's definitely
> going to be able to make a better guess than the device. In the end,
> the device can do whatever it wants with the request, but I think it's
> still useful for the driver to be able to provide a hint.

The device is already making tons of guesses by staring at the memory
statistics, so I think we should leave that guessing also to the device.
At least I don't see how the values reported by the driver could be any
better (they will be wrong most of the time either way).

>>> +
>>> +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
>>> +
>>> +The driver MUST update \field{actual} with any allocated pages before
>>> +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
>>> +
>>> +The driver SHOULD wait for the device to acknowledge the event
>>> +before trying to further inflate or deflate the balloon.
>>> +
>>> +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
>>> +SHOULD send an OOM event before using pages from the balloon.
>>> +
>>> +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
>>> +
>>> +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
>>> +the balloon by \field{data} pages before acknowledging the event.
>>
>> The issue is that this is asynchronous. You won't really be able to stop
>> OOM from killing processes as you usually won't be able to get back
>> pages fast enough.
> 
> If the device reduces num_pages before acking the message, then the
> driver can wait for the ack and deflate the balloon synchronously. For
> Linux specifically, blocking in the OOM notifier is fine (at least the
> balloon driver already acquires a mutex here). And while it's true
> that reclaiming memory might not be fast, my understanding is that
> anywhere that could invoke the OOM killer can also invoke swap to
> disk, which is also not fast.

And that's the main issue IIRC. Allocation paths that *cannot* do that
(sleep, trigger the OOM killer) will fail the allocation instead,
essentially destabilizing your system or just crashing with unexpected
behavior. Reclaim can be done mostly synchronous if need be IIRC.

So once *some* path triggers the OOM killer and you try to keep up,
other parts of the system can already start falling apart.

Hooking into the shrinker interface is better, however, has some ugly
side-effects that random memory pressure will completely deflate the
balloon.

See 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
followed by da10329cb057 ("virtio-balloon: switch back to OOM handler
for VIRTIO_BALLOON_F_DEFLATE_ON_OOM")

Especially my note about "The shrinker does not have a concept of
priorities yet, so this behavior cannot be configured."


Long story short: we should avoid hooking into the OOM killer for all
new features.

-- 
Thanks,

David / dhildenb


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-01-31  8:43       ` David Hildenbrand
@ 2022-02-04  1:41         ` David Stevens
  2022-02-10 12:53           ` David Hildenbrand
  0 siblings, 1 reply; 16+ messages in thread
From: David Stevens @ 2022-02-04  1:41 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-comment

> >>> +
> >>> +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> >>> +
> >>> +The driver MUST update \field{actual} with any allocated pages before
> >>> +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
> >>> +
> >>> +The driver SHOULD wait for the device to acknowledge the event
> >>> +before trying to further inflate or deflate the balloon.
> >>> +
> >>> +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
> >>> +SHOULD send an OOM event before using pages from the balloon.
> >>> +
> >>> +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> >>> +
> >>> +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
> >>> +the balloon by \field{data} pages before acknowledging the event.
> >>
> >> The issue is that this is asynchronous. You won't really be able to stop
> >> OOM from killing processes as you usually won't be able to get back
> >> pages fast enough.
> >
> > If the device reduces num_pages before acking the message, then the
> > driver can wait for the ack and deflate the balloon synchronously. For
> > Linux specifically, blocking in the OOM notifier is fine (at least the
> > balloon driver already acquires a mutex here). And while it's true
> > that reclaiming memory might not be fast, my understanding is that
> > anywhere that could invoke the OOM killer can also invoke swap to
> > disk, which is also not fast.
>
> And that's the main issue IIRC. Allocation paths that *cannot* do that
> (sleep, trigger the OOM killer) will fail the allocation instead,
> essentially destabilizing your system or just crashing with unexpected
> behavior. Reclaim can be done mostly synchronous if need be IIRC.
>
> So once *some* path triggers the OOM killer and you try to keep up,
> other parts of the system can already start falling apart.
>
> Hooking into the shrinker interface is better, however, has some ugly
> side-effects that random memory pressure will completely deflate the
> balloon.
>
> See 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> followed by da10329cb057 ("virtio-balloon: switch back to OOM handler
> for VIRTIO_BALLOON_F_DEFLATE_ON_OOM")
>
> Especially my note about "The shrinker does not have a concept of
> priorities yet, so this behavior cannot be configured."
>
>
> Long story short: we should avoid hooking into the OOM killer for all
> new features.

In that case, this could be a more generic
VIRTIO_BALLOON_EVENT_PRESSURE event, which the driver is free to send
at any point where it detects memory pressure. For the Linux driver,
that would be during reclaim. For device requirements, something like:

"The device SHOULD deflate the balloon before acknowledging the event
if it determines that the driver is under severe memory pressure."

Since the device is the one that determines how much to actually
deflate, the balloon could still be used to apply memory pressure to
the guest when needed. Because the device would need stats information
to be able to handle the pressure event in any meaningful way, it
would probably be useful to include memory stats in the event itself
to avoid the need for a round trip back to the driver. That would make
the event struct look something like this:

struct virtio_balloon_event {
  u32 type;
  union {
     struct virtio_balloon_event_pressure pressure;
  };
};

struct virtio_balloon_event_pressure {
  struct virtio_balloon_stat stats[];
};

-David

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-02-04  1:41         ` David Stevens
@ 2022-02-10 12:53           ` David Hildenbrand
  2022-02-14  6:59             ` David Stevens
  0 siblings, 1 reply; 16+ messages in thread
From: David Hildenbrand @ 2022-02-10 12:53 UTC (permalink / raw)
  To: David Stevens; +Cc: virtio-comment

On 04.02.22 02:41, David Stevens wrote:
>>>>> +
>>>>> +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
>>>>> +
>>>>> +The driver MUST update \field{actual} with any allocated pages before
>>>>> +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
>>>>> +
>>>>> +The driver SHOULD wait for the device to acknowledge the event
>>>>> +before trying to further inflate or deflate the balloon.
>>>>> +
>>>>> +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
>>>>> +SHOULD send an OOM event before using pages from the balloon.
>>>>> +
>>>>> +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
>>>>> +
>>>>> +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
>>>>> +the balloon by \field{data} pages before acknowledging the event.
>>>>
>>>> The issue is that this is asynchronous. You won't really be able to stop
>>>> OOM from killing processes as you usually won't be able to get back
>>>> pages fast enough.
>>>
>>> If the device reduces num_pages before acking the message, then the
>>> driver can wait for the ack and deflate the balloon synchronously. For
>>> Linux specifically, blocking in the OOM notifier is fine (at least the
>>> balloon driver already acquires a mutex here). And while it's true
>>> that reclaiming memory might not be fast, my understanding is that
>>> anywhere that could invoke the OOM killer can also invoke swap to
>>> disk, which is also not fast.
>>
>> And that's the main issue IIRC. Allocation paths that *cannot* do that
>> (sleep, trigger the OOM killer) will fail the allocation instead,
>> essentially destabilizing your system or just crashing with unexpected
>> behavior. Reclaim can be done mostly synchronous if need be IIRC.
>>
>> So once *some* path triggers the OOM killer and you try to keep up,
>> other parts of the system can already start falling apart.
>>
>> Hooking into the shrinker interface is better, however, has some ugly
>> side-effects that random memory pressure will completely deflate the
>> balloon.
>>
>> See 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
>> followed by da10329cb057 ("virtio-balloon: switch back to OOM handler
>> for VIRTIO_BALLOON_F_DEFLATE_ON_OOM")
>>
>> Especially my note about "The shrinker does not have a concept of
>> priorities yet, so this behavior cannot be configured."
>>
>>
>> Long story short: we should avoid hooking into the OOM killer for all
>> new features.
> 
> In that case, this could be a more generic
> VIRTIO_BALLOON_EVENT_PRESSURE event, which the driver is free to send
> at any point where it detects memory pressure. For the Linux driver,
> that would be during reclaim. For device requirements, something like:

During reclaim is not sufficient I think. E.g., just inflating the
balloon would trigger reclaim (intended!) and trigger this event.

I think you'd actually want shrinker priorities or similar in Linux, and
really get notified only once some healthy reclaim "let's drop clean
file pages" is no longer possible -- or even if we're close to it no
longer beeing possible.

> 
> "The device SHOULD deflate the balloon before acknowledging the event
> if it determines that the driver is under severe memory pressure."

The "severe memory pressure" part is what we want. The interesting part
is how we could actually obtain that information from Linux.

Having that said, I'm not opposed to these changes, but there should be
a way to actually hook this up to Linux MM and get a reasonable outcome
out of it. As raised, the OOM killer is not really what we want to hook
into.

-- 
Thanks,

David / dhildenb


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-02-10 12:53           ` David Hildenbrand
@ 2022-02-14  6:59             ` David Stevens
  0 siblings, 0 replies; 16+ messages in thread
From: David Stevens @ 2022-02-14  6:59 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-comment

On Thu, Feb 10, 2022 at 9:54 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 04.02.22 02:41, David Stevens wrote:
> >>>>> +
> >>>>> +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> >>>>> +
> >>>>> +The driver MUST update \field{actual} with any allocated pages before
> >>>>> +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
> >>>>> +
> >>>>> +The driver SHOULD wait for the device to acknowledge the event
> >>>>> +before trying to further inflate or deflate the balloon.
> >>>>> +
> >>>>> +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
> >>>>> +SHOULD send an OOM event before using pages from the balloon.
> >>>>> +
> >>>>> +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> >>>>> +
> >>>>> +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
> >>>>> +the balloon by \field{data} pages before acknowledging the event.
> >>>>
> >>>> The issue is that this is asynchronous. You won't really be able to stop
> >>>> OOM from killing processes as you usually won't be able to get back
> >>>> pages fast enough.
> >>>
> >>> If the device reduces num_pages before acking the message, then the
> >>> driver can wait for the ack and deflate the balloon synchronously. For
> >>> Linux specifically, blocking in the OOM notifier is fine (at least the
> >>> balloon driver already acquires a mutex here). And while it's true
> >>> that reclaiming memory might not be fast, my understanding is that
> >>> anywhere that could invoke the OOM killer can also invoke swap to
> >>> disk, which is also not fast.
> >>
> >> And that's the main issue IIRC. Allocation paths that *cannot* do that
> >> (sleep, trigger the OOM killer) will fail the allocation instead,
> >> essentially destabilizing your system or just crashing with unexpected
> >> behavior. Reclaim can be done mostly synchronous if need be IIRC.
> >>
> >> So once *some* path triggers the OOM killer and you try to keep up,
> >> other parts of the system can already start falling apart.
> >>
> >> Hooking into the shrinker interface is better, however, has some ugly
> >> side-effects that random memory pressure will completely deflate the
> >> balloon.
> >>
> >> See 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> >> followed by da10329cb057 ("virtio-balloon: switch back to OOM handler
> >> for VIRTIO_BALLOON_F_DEFLATE_ON_OOM")
> >>
> >> Especially my note about "The shrinker does not have a concept of
> >> priorities yet, so this behavior cannot be configured."
> >>
> >>
> >> Long story short: we should avoid hooking into the OOM killer for all
> >> new features.
> >
> > In that case, this could be a more generic
> > VIRTIO_BALLOON_EVENT_PRESSURE event, which the driver is free to send
> > at any point where it detects memory pressure. For the Linux driver,
> > that would be during reclaim. For device requirements, something like:
>
> During reclaim is not sufficient I think. E.g., just inflating the
> balloon would trigger reclaim (intended!) and trigger this event.
>
> I think you'd actually want shrinker priorities or similar in Linux, and
> really get notified only once some healthy reclaim "let's drop clean
> file pages" is no longer possible -- or even if we're close to it no
> longer beeing possible.

We don't necessarily want to wait until we've reclaimed all/most clean
pages before considering whether or not to deflate the balloon - there
could be memory in the system being used for something less important
than those clean pages (e.g. even more clean file pages in another
VM). Sending memory pressure events to the device as early as possible
leaves more of the decision about how to allocate memory up to the
device. Although if the guest is trying to directly shrink the balloon
while the driver is inflating the balloon, that probably is something
to explicitly include in the message to the device.

> >
> > "The device SHOULD deflate the balloon before acknowledging the event
> > if it determines that the driver is under severe memory pressure."
>
> The "severe memory pressure" part is what we want. The interesting part
> is how we could actually obtain that information from Linux.
>
> Having that said, I'm not opposed to these changes, but there should be
> a way to actually hook this up to Linux MM and get a reasonable outcome
> out of it. As raised, the OOM killer is not really what we want to hook
> into.

It sounds like the rough idea behind this proposal is probably
generally okay, and the main thing to work out is what sort of
information to provide to the device so it can implement effective
heuristics. It's possible that just the memory stats might be enough
to get good results, but if not things like shrinker priorities or
maybe even PSI information could be helpful. I think I should probably
take a while to explore various options and see what actually works
well closer to production.

Thanks,
David

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-01-24 12:56 ` [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue David Stevens
  2022-01-28 15:46   ` David Hildenbrand
@ 2022-08-09 20:52   ` Michael S. Tsirkin
  2022-08-10  7:54     ` David Stevens
  1 sibling, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2022-08-09 20:52 UTC (permalink / raw)
  To: David Stevens; +Cc: virtio-comment

On Mon, Jan 24, 2022 at 09:56:24PM +0900, David Stevens wrote:
> Add an event queue to the balloon to allow the driver to send events to
> the device, which allows the device to be more responsive to the memory
> needs of the guest.
> 
> There are two defined events. The first event is an out of memory event.
> This event provides a way for the guest to handle out of memory events
> on a system where the host does not support deflate-on-oom. The second
> event is an out of puff event, which the guest can send when it fails to
> allocate memory while trying to inflate the balloon. This serves as an
> indication to the device that the driver may not be able to inflate the
> balloon in a timely manner.
> 
> Signed-off-by: David Stevens <stevensd@chromium.org>
> ---
>  conformance.tex |  2 ++
>  content.tex     | 59 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 61 insertions(+)
> 
> diff --git a/conformance.tex b/conformance.tex
> index 42f853762ebe..5e0baa5ae7df 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -181,6 +181,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
>  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
>  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
> +\item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Events}
>  \end{itemize}
>  
>  \conformance{\subsection}{SCSI Host Driver Conformance}\label{sec:Conformance / Driver Conformance / SCSI Host Driver Conformance}
> @@ -440,6 +441,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
>  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
>  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
> +\item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Events}
>  \end{itemize}
>  
>  \conformance{\subsection}{SCSI Host Device Conformance}\label{sec:Conformance / Device Conformance / SCSI Host Device Conformance}
> diff --git a/content.tex b/content.tex
> index 32de6685c50b..3aeb319d31a7 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -5420,6 +5420,7 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
>  \item[2] statsq
>  \item[3] free_page_vq
>  \item[4] reporting_vq
> +\item[5] event_vq
>  \end{description}
>  
>    statsq only exists if VIRTIO_BALLOON_F_STATS_VQ is set.
> @@ -5428,6 +5429,8 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
>  
>    reporting_vq only exists if VIRTIO_BALLOON_F_PAGE_REPORTING is set.
>  
> +  event_vq only exists if VIRTIO_BALLOON_F_EVENT_VQ is set.
> +
>  \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Feature bits}
>  \begin{description}
>  \item[VIRTIO_BALLOON_F_MUST_TELL_HOST (0)] Host has to be told before
> @@ -5446,6 +5449,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
>      Configuration field \field{poison_val} is valid.
>  \item[ VIRTIO_BALLOON_F_PAGE_REPORTING(5) ] The device has support for free
>      page reporting. A virtqueue for reporting free guest memory is present.
> +\item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
> +    the driver to the device.
>  
>  \end{description}
>  
> @@ -5529,6 +5534,8 @@ \subsection{Device Initialization}\label{sec:Device Types / Memory Balloon Devic
>    notify the device about the stats virtqueue buffer.
>  \item If the VIRTIO_BALLOON_F_PAGE_REPORTING feature bit is negotiated, then
>    begin reporting free pages to the device.
> +\item If the VIRTIO_BALLOON_F_EVENT_VQ feature bit is negotiated, then
> +  begin reporting events to the device.
>  \end{enumerate}
>  
>  \subsection{Device Operation}\label{sec:Device Types / Memory Balloon Device / Device Operation}
> @@ -6031,6 +6038,58 @@ \subsubsection{Free Page Reporting}\label{sec:Device Types / Memory Balloon Devi
>  MUST NOT modify the the content of a reported page to a value other than
>  \field{poison_val}.
>  
> +\subsubsection{Events}\label{sec:Device Types / Memory Balloon Device / Device Operation / Events}
> +
> +The events virtqueue allows the driver to signal events to the
> +device.

and the following is the format used for buffers added
to this vq? write or read buffers?

> +
> +\begin{lstlisting}
> +struct virtio_balloon_event {
> +        u32 type;
> +        u32 data;
> +};
> +
> +#define VIRTIO_BALLOON_EVENT_OOM        1
> +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> +\end{lstlisting}
> +
> +The \field{type} determines how to interpret the event. The following
> +events are defined:
> +
> +\begin{itemize}
> +\item Out of memory
> +\begin{lstlisting}
> +#define VIRTIO_BALLOON_EVENT_OOM        1
> +\end{lstlisting}
> +The driver has encountered a situation in which using pages from the balloon
> +is necessary for system stability (e.g. if memory is required by applications
> +running within the guest). The \field{data} value indicates how many pages
> +the driver requires to maintain system stability.

I'd prefer that we report info in bytes or another portable metric
please. Down the road we should switch balloon away from
using pages, too. This might mean we need to make data and type u64.

> +\item Out of puff
> +\begin{lstlisting}
> +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> +\end{lstlisting}
> +The driver has encountered an allocation failure when trying to inflate
> +the balloon. The \field{data} value is unused. This event serves as a signal
> +that the balloon may not be able to inflate in a timely manner.
> +\end{itemize}
> +
> +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> +
> +The driver MUST update \field{actual} with any allocated pages before
> +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
> +
> +The driver SHOULD wait for the device to acknowledge the event
> +before trying to further inflate or deflate the balloon.

Hmm what is the motivation for this?

> +
> +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
> +SHOULD send an OOM event before using pages from the balloon.

what does using mean here? You mean without MUST_TELL_HOST?

> +
> +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> +
> +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
> +the balloon by \field{data} pages before acknowledging the event.

... by modifying the config space field I guess?

> +
>  \section{SCSI Host Device}\label{sec:Device Types / SCSI Host Device}
>  
>  The virtio SCSI host device groups together one or more virtual
> -- 
> 2.35.0.rc0.227.g00780c9af4-goog
> 
> 
> This publicly archived list offers a means to provide input to the
> OASIS Virtual I/O Device (VIRTIO) TC.
> 
> In order to verify user consent to the Feedback License terms and
> to minimize spam in the list archive, subscription is required
> before posting.
> 
> Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> List help: virtio-comment-help@lists.oasis-open.org
> List archive: https://lists.oasis-open.org/archives/virtio-comment/
> Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> Committee: https://www.oasis-open.org/committees/virtio/
> Join OASIS: https://www.oasis-open.org/join/


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-08-09 20:52   ` Michael S. Tsirkin
@ 2022-08-10  7:54     ` David Stevens
  2022-08-10  9:10       ` Michael S. Tsirkin
  0 siblings, 1 reply; 16+ messages in thread
From: David Stevens @ 2022-08-10  7:54 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-comment

On Wed, Aug 10, 2022 at 5:52 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 24, 2022 at 09:56:24PM +0900, David Stevens wrote:
> > Add an event queue to the balloon to allow the driver to send events to
> > the device, which allows the device to be more responsive to the memory
> > needs of the guest.
> >
> > There are two defined events. The first event is an out of memory event.
> > This event provides a way for the guest to handle out of memory events
> > on a system where the host does not support deflate-on-oom. The second
> > event is an out of puff event, which the guest can send when it fails to
> > allocate memory while trying to inflate the balloon. This serves as an
> > indication to the device that the driver may not be able to inflate the
> > balloon in a timely manner.
> >
> > Signed-off-by: David Stevens <stevensd@chromium.org>
> > ---
> >  conformance.tex |  2 ++
> >  content.tex     | 59 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 61 insertions(+)
> >
> > diff --git a/conformance.tex b/conformance.tex
> > index 42f853762ebe..5e0baa5ae7df 100644
> > --- a/conformance.tex
> > +++ b/conformance.tex
> > @@ -181,6 +181,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
> >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
> >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
> > +\item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Events}
> >  \end{itemize}
> >
> >  \conformance{\subsection}{SCSI Host Driver Conformance}\label{sec:Conformance / Driver Conformance / SCSI Host Driver Conformance}
> > @@ -440,6 +441,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
> >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
> >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
> > +\item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Events}
> >  \end{itemize}
> >
> >  \conformance{\subsection}{SCSI Host Device Conformance}\label{sec:Conformance / Device Conformance / SCSI Host Device Conformance}
> > diff --git a/content.tex b/content.tex
> > index 32de6685c50b..3aeb319d31a7 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -5420,6 +5420,7 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
> >  \item[2] statsq
> >  \item[3] free_page_vq
> >  \item[4] reporting_vq
> > +\item[5] event_vq
> >  \end{description}
> >
> >    statsq only exists if VIRTIO_BALLOON_F_STATS_VQ is set.
> > @@ -5428,6 +5429,8 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
> >
> >    reporting_vq only exists if VIRTIO_BALLOON_F_PAGE_REPORTING is set.
> >
> > +  event_vq only exists if VIRTIO_BALLOON_F_EVENT_VQ is set.
> > +
> >  \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Feature bits}
> >  \begin{description}
> >  \item[VIRTIO_BALLOON_F_MUST_TELL_HOST (0)] Host has to be told before
> > @@ -5446,6 +5449,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
> >      Configuration field \field{poison_val} is valid.
> >  \item[ VIRTIO_BALLOON_F_PAGE_REPORTING(5) ] The device has support for free
> >      page reporting. A virtqueue for reporting free guest memory is present.
> > +\item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
> > +    the driver to the device.
> >
> >  \end{description}
> >
> > @@ -5529,6 +5534,8 @@ \subsection{Device Initialization}\label{sec:Device Types / Memory Balloon Devic
> >    notify the device about the stats virtqueue buffer.
> >  \item If the VIRTIO_BALLOON_F_PAGE_REPORTING feature bit is negotiated, then
> >    begin reporting free pages to the device.
> > +\item If the VIRTIO_BALLOON_F_EVENT_VQ feature bit is negotiated, then
> > +  begin reporting events to the device.
> >  \end{enumerate}
> >
> >  \subsection{Device Operation}\label{sec:Device Types / Memory Balloon Device / Device Operation}
> > @@ -6031,6 +6038,58 @@ \subsubsection{Free Page Reporting}\label{sec:Device Types / Memory Balloon Devi
> >  MUST NOT modify the the content of a reported page to a value other than
> >  \field{poison_val}.
> >
> > +\subsubsection{Events}\label{sec:Device Types / Memory Balloon Device / Device Operation / Events}
> > +
> > +The events virtqueue allows the driver to signal events to the
> > +device.
>
> and the following is the format used for buffers added
> to this vq? write or read buffers?

It's the format for messages sent to the virtqueue. Written by the
driver, read-only from the device.

>
> > +
> > +\begin{lstlisting}
> > +struct virtio_balloon_event {
> > +        u32 type;
> > +        u32 data;
> > +};
> > +
> > +#define VIRTIO_BALLOON_EVENT_OOM        1
> > +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> > +\end{lstlisting}
> > +
> > +The \field{type} determines how to interpret the event. The following
> > +events are defined:
> > +
> > +\begin{itemize}
> > +\item Out of memory
> > +\begin{lstlisting}
> > +#define VIRTIO_BALLOON_EVENT_OOM        1
> > +\end{lstlisting}
> > +The driver has encountered a situation in which using pages from the balloon
> > +is necessary for system stability (e.g. if memory is required by applications
> > +running within the guest). The \field{data} value indicates how many pages
> > +the driver requires to maintain system stability.
>
> I'd prefer that we report info in bytes or another portable metric
> please. Down the road we should switch balloon away from
> using pages, too. This might mean we need to make data and type u64.
>
> > +\item Out of puff
> > +\begin{lstlisting}
> > +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> > +\end{lstlisting}
> > +The driver has encountered an allocation failure when trying to inflate
> > +the balloon. The \field{data} value is unused. This event serves as a signal
> > +that the balloon may not be able to inflate in a timely manner.
> > +\end{itemize}
> > +
> > +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> > +
> > +The driver MUST update \field{actual} with any allocated pages before
> > +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
> > +
> > +The driver SHOULD wait for the device to acknowledge the event
> > +before trying to further inflate or deflate the balloon.
>
> Hmm what is the motivation for this?

Having the driver wait gives the device the option to respond to the
event synchronously. For example, since a puff failure when inflating
signals high memory pressure in the guest, the device might want to
reduce num_pages to stop further attempts to inflate the balloon. And
if the device wants to respond asynchronously, it can always do that
after acking the event.

>
> > +
> > +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
> > +SHOULD send an OOM event before using pages from the balloon.
>
> what does using mean here? You mean without MUST_TELL_HOST?
>

With DEFLATE_ON_OOM, "the driver MAY use pages from the balloon when
num_pages is less than or equal to the actual number of pages in the
balloon if this is required for system stability". When the event
queue is enabled, the driver should send a VIRTIO_BALLOON_EVENT_OOM
before doing that and using pages in the balloon. That said, I've been
convinced that deflate-on-oom is questionable at best, so I think it's
better to replace the OOM event with a more generic memory pressure
event.

> > +
> > +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> > +
> > +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
> > +the balloon by \field{data} pages before acknowledging the event.
>
> ... by modifying the config space field I guess?
>

Yes, by reducing num_pages.

> > +
> >  \section{SCSI Host Device}\label{sec:Device Types / SCSI Host Device}
> >
> >  The virtio SCSI host device groups together one or more virtual
> > --
> > 2.35.0.rc0.227.g00780c9af4-goog
> >
> >
> > This publicly archived list offers a means to provide input to the
> > OASIS Virtual I/O Device (VIRTIO) TC.
> >
> > In order to verify user consent to the Feedback License terms and
> > to minimize spam in the list archive, subscription is required
> > before posting.
> >
> > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > List help: virtio-comment-help@lists.oasis-open.org
> > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > Committee: https://www.oasis-open.org/committees/virtio/
> > Join OASIS: https://www.oasis-open.org/join/
>

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-08-10  7:54     ` David Stevens
@ 2022-08-10  9:10       ` Michael S. Tsirkin
  2022-08-16  7:13         ` David Stevens
  0 siblings, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2022-08-10  9:10 UTC (permalink / raw)
  To: David Stevens; +Cc: virtio-comment

On Wed, Aug 10, 2022 at 04:54:01PM +0900, David Stevens wrote:
> On Wed, Aug 10, 2022 at 5:52 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jan 24, 2022 at 09:56:24PM +0900, David Stevens wrote:
> > > Add an event queue to the balloon to allow the driver to send events to
> > > the device, which allows the device to be more responsive to the memory
> > > needs of the guest.
> > >
> > > There are two defined events. The first event is an out of memory event.
> > > This event provides a way for the guest to handle out of memory events
> > > on a system where the host does not support deflate-on-oom. The second
> > > event is an out of puff event, which the guest can send when it fails to
> > > allocate memory while trying to inflate the balloon. This serves as an
> > > indication to the device that the driver may not be able to inflate the
> > > balloon in a timely manner.

Thanks for the answers! Pls include them in the spec too.

> > >
> > > Signed-off-by: David Stevens <stevensd@chromium.org>
> > > ---
> > >  conformance.tex |  2 ++
> > >  content.tex     | 59 +++++++++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 61 insertions(+)
> > >
> > > diff --git a/conformance.tex b/conformance.tex
> > > index 42f853762ebe..5e0baa5ae7df 100644
> > > --- a/conformance.tex
> > > +++ b/conformance.tex
> > > @@ -181,6 +181,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
> > >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
> > >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
> > > +\item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Events}
> > >  \end{itemize}
> > >
> > >  \conformance{\subsection}{SCSI Host Driver Conformance}\label{sec:Conformance / Driver Conformance / SCSI Host Driver Conformance}
> > > @@ -440,6 +441,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
> > >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
> > >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
> > > +\item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Events}
> > >  \end{itemize}
> > >
> > >  \conformance{\subsection}{SCSI Host Device Conformance}\label{sec:Conformance / Device Conformance / SCSI Host Device Conformance}
> > > diff --git a/content.tex b/content.tex
> > > index 32de6685c50b..3aeb319d31a7 100644
> > > --- a/content.tex
> > > +++ b/content.tex
> > > @@ -5420,6 +5420,7 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
> > >  \item[2] statsq
> > >  \item[3] free_page_vq
> > >  \item[4] reporting_vq
> > > +\item[5] event_vq
> > >  \end{description}
> > >
> > >    statsq only exists if VIRTIO_BALLOON_F_STATS_VQ is set.
> > > @@ -5428,6 +5429,8 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
> > >
> > >    reporting_vq only exists if VIRTIO_BALLOON_F_PAGE_REPORTING is set.
> > >
> > > +  event_vq only exists if VIRTIO_BALLOON_F_EVENT_VQ is set.
> > > +
> > >  \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Feature bits}
> > >  \begin{description}
> > >  \item[VIRTIO_BALLOON_F_MUST_TELL_HOST (0)] Host has to be told before
> > > @@ -5446,6 +5449,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
> > >      Configuration field \field{poison_val} is valid.
> > >  \item[ VIRTIO_BALLOON_F_PAGE_REPORTING(5) ] The device has support for free
> > >      page reporting. A virtqueue for reporting free guest memory is present.
> > > +\item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
> > > +    the driver to the device.
> > >
> > >  \end{description}
> > >
> > > @@ -5529,6 +5534,8 @@ \subsection{Device Initialization}\label{sec:Device Types / Memory Balloon Devic
> > >    notify the device about the stats virtqueue buffer.
> > >  \item If the VIRTIO_BALLOON_F_PAGE_REPORTING feature bit is negotiated, then
> > >    begin reporting free pages to the device.
> > > +\item If the VIRTIO_BALLOON_F_EVENT_VQ feature bit is negotiated, then
> > > +  begin reporting events to the device.
> > >  \end{enumerate}
> > >
> > >  \subsection{Device Operation}\label{sec:Device Types / Memory Balloon Device / Device Operation}
> > > @@ -6031,6 +6038,58 @@ \subsubsection{Free Page Reporting}\label{sec:Device Types / Memory Balloon Devi
> > >  MUST NOT modify the the content of a reported page to a value other than
> > >  \field{poison_val}.
> > >
> > > +\subsubsection{Events}\label{sec:Device Types / Memory Balloon Device / Device Operation / Events}
> > > +
> > > +The events virtqueue allows the driver to signal events to the
> > > +device.
> >
> > and the following is the format used for buffers added
> > to this vq? write or read buffers?
> 
> It's the format for messages sent to the virtqueue. Written by the
> driver, read-only from the device.
> 
> >
> > > +
> > > +\begin{lstlisting}
> > > +struct virtio_balloon_event {
> > > +        u32 type;
> > > +        u32 data;
> > > +};
> > > +
> > > +#define VIRTIO_BALLOON_EVENT_OOM        1
> > > +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> > > +\end{lstlisting}
> > > +
> > > +The \field{type} determines how to interpret the event. The following
> > > +events are defined:
> > > +
> > > +\begin{itemize}
> > > +\item Out of memory
> > > +\begin{lstlisting}
> > > +#define VIRTIO_BALLOON_EVENT_OOM        1
> > > +\end{lstlisting}
> > > +The driver has encountered a situation in which using pages from the balloon
> > > +is necessary for system stability (e.g. if memory is required by applications
> > > +running within the guest). The \field{data} value indicates how many pages
> > > +the driver requires to maintain system stability.
> >
> > I'd prefer that we report info in bytes or another portable metric
> > please. Down the road we should switch balloon away from
> > using pages, too. This might mean we need to make data and type u64.
> >
> > > +\item Out of puff
> > > +\begin{lstlisting}
> > > +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> > > +\end{lstlisting}
> > > +The driver has encountered an allocation failure when trying to inflate
> > > +the balloon. The \field{data} value is unused. This event serves as a signal
> > > +that the balloon may not be able to inflate in a timely manner.
> > > +\end{itemize}
> > > +
> > > +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> > > +
> > > +The driver MUST update \field{actual} with any allocated pages before
> > > +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
> > > +
> > > +The driver SHOULD wait for the device to acknowledge the event
> > > +before trying to further inflate or deflate the balloon.
> >
> > Hmm what is the motivation for this?
> 
> Having the driver wait gives the device the option to respond to the
> event synchronously. For example, since a puff failure when inflating
> signals high memory pressure in the guest, the device might want to
> reduce num_pages to stop further attempts to inflate the balloon.
> And
> if the device wants to respond asynchronously, it can always do that
> after acking the event.

Question: how does device know by how much to reduce the num_pages?
Should the current balloon size be sent with this event?


> >
> > > +
> > > +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
> > > +SHOULD send an OOM event before using pages from the balloon.
> >
> > what does using mean here? You mean without MUST_TELL_HOST?
> >
> 
> With DEFLATE_ON_OOM, "the driver MAY use pages from the balloon when
> num_pages is less than or equal to the actual number of pages in the
> balloon if this is required for system stability". When the event
> queue is enabled, the driver should send a VIRTIO_BALLOON_EVENT_OOM
> before doing that and using pages in the balloon.

> That said, I've been
> convinced that deflate-on-oom is questionable at best, so I think it's
> better to replace the OOM event with a more generic memory pressure
> event.




> > > +
> > > +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> > > +
> > > +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
> > > +the balloon by \field{data} pages before acknowledging the event.
> >
> > ... by modifying the config space field I guess?
> >
> 
> Yes, by reducing num_pages.
> 
> > > +
> > >  \section{SCSI Host Device}\label{sec:Device Types / SCSI Host Device}
> > >
> > >  The virtio SCSI host device groups together one or more virtual
> > > --
> > > 2.35.0.rc0.227.g00780c9af4-goog
> > >
> > >
> > > This publicly archived list offers a means to provide input to the
> > > OASIS Virtual I/O Device (VIRTIO) TC.
> > >
> > > In order to verify user consent to the Feedback License terms and
> > > to minimize spam in the list archive, subscription is required
> > > before posting.
> > >
> > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > List help: virtio-comment-help@lists.oasis-open.org
> > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > Committee: https://www.oasis-open.org/committees/virtio/
> > > Join OASIS: https://www.oasis-open.org/join/
> >


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue
  2022-08-10  9:10       ` Michael S. Tsirkin
@ 2022-08-16  7:13         ` David Stevens
  0 siblings, 0 replies; 16+ messages in thread
From: David Stevens @ 2022-08-16  7:13 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-comment

On Wed, Aug 10, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Aug 10, 2022 at 04:54:01PM +0900, David Stevens wrote:
> > On Wed, Aug 10, 2022 at 5:52 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Jan 24, 2022 at 09:56:24PM +0900, David Stevens wrote:
> > > > Add an event queue to the balloon to allow the driver to send events to
> > > > the device, which allows the device to be more responsive to the memory
> > > > needs of the guest.
> > > >
> > > > There are two defined events. The first event is an out of memory event.
> > > > This event provides a way for the guest to handle out of memory events
> > > > on a system where the host does not support deflate-on-oom. The second
> > > > event is an out of puff event, which the guest can send when it fails to
> > > > allocate memory while trying to inflate the balloon. This serves as an
> > > > indication to the device that the driver may not be able to inflate the
> > > > balloon in a timely manner.
>
> Thanks for the answers! Pls include them in the spec too.

Thanks for the feedback. I'll continue the investigation of how well
this works for our system, and if the results are promising I'll send
out a new version of these patches.

> > > >
> > > > Signed-off-by: David Stevens <stevensd@chromium.org>
> > > > ---
> > > >  conformance.tex |  2 ++
> > > >  content.tex     | 59 +++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 61 insertions(+)
> > > >
> > > > diff --git a/conformance.tex b/conformance.tex
> > > > index 42f853762ebe..5e0baa5ae7df 100644
> > > > --- a/conformance.tex
> > > > +++ b/conformance.tex
> > > > @@ -181,6 +181,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > > >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
> > > >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
> > > >  \item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
> > > > +\item \ref{drivernormative:Device Types / Memory Balloon Device / Device Operation / Events}
> > > >  \end{itemize}
> > > >
> > > >  \conformance{\subsection}{SCSI Host Driver Conformance}\label{sec:Conformance / Driver Conformance / SCSI Host Driver Conformance}
> > > > @@ -440,6 +441,7 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
> > > >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Hinting}
> > > >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Page Poison}
> > > >  \item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Free Page Reporting}
> > > > +\item \ref{devicenormative:Device Types / Memory Balloon Device / Device Operation / Events}
> > > >  \end{itemize}
> > > >
> > > >  \conformance{\subsection}{SCSI Host Device Conformance}\label{sec:Conformance / Device Conformance / SCSI Host Device Conformance}
> > > > diff --git a/content.tex b/content.tex
> > > > index 32de6685c50b..3aeb319d31a7 100644
> > > > --- a/content.tex
> > > > +++ b/content.tex
> > > > @@ -5420,6 +5420,7 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
> > > >  \item[2] statsq
> > > >  \item[3] free_page_vq
> > > >  \item[4] reporting_vq
> > > > +\item[5] event_vq
> > > >  \end{description}
> > > >
> > > >    statsq only exists if VIRTIO_BALLOON_F_STATS_VQ is set.
> > > > @@ -5428,6 +5429,8 @@ \subsection{Virtqueues}\label{sec:Device Types / Memory Balloon Device / Virtque
> > > >
> > > >    reporting_vq only exists if VIRTIO_BALLOON_F_PAGE_REPORTING is set.
> > > >
> > > > +  event_vq only exists if VIRTIO_BALLOON_F_EVENT_VQ is set.
> > > > +
> > > >  \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Feature bits}
> > > >  \begin{description}
> > > >  \item[VIRTIO_BALLOON_F_MUST_TELL_HOST (0)] Host has to be told before
> > > > @@ -5446,6 +5449,8 @@ \subsection{Feature bits}\label{sec:Device Types / Memory Balloon Device / Featu
> > > >      Configuration field \field{poison_val} is valid.
> > > >  \item[ VIRTIO_BALLOON_F_PAGE_REPORTING(5) ] The device has support for free
> > > >      page reporting. A virtqueue for reporting free guest memory is present.
> > > > +\item[ VIRTIO_BALLOON_F_EVENT_VQ(6) ] A virtqueue for sending events from
> > > > +    the driver to the device.
> > > >
> > > >  \end{description}
> > > >
> > > > @@ -5529,6 +5534,8 @@ \subsection{Device Initialization}\label{sec:Device Types / Memory Balloon Devic
> > > >    notify the device about the stats virtqueue buffer.
> > > >  \item If the VIRTIO_BALLOON_F_PAGE_REPORTING feature bit is negotiated, then
> > > >    begin reporting free pages to the device.
> > > > +\item If the VIRTIO_BALLOON_F_EVENT_VQ feature bit is negotiated, then
> > > > +  begin reporting events to the device.
> > > >  \end{enumerate}
> > > >
> > > >  \subsection{Device Operation}\label{sec:Device Types / Memory Balloon Device / Device Operation}
> > > > @@ -6031,6 +6038,58 @@ \subsubsection{Free Page Reporting}\label{sec:Device Types / Memory Balloon Devi
> > > >  MUST NOT modify the the content of a reported page to a value other than
> > > >  \field{poison_val}.
> > > >
> > > > +\subsubsection{Events}\label{sec:Device Types / Memory Balloon Device / Device Operation / Events}
> > > > +
> > > > +The events virtqueue allows the driver to signal events to the
> > > > +device.
> > >
> > > and the following is the format used for buffers added
> > > to this vq? write or read buffers?
> >
> > It's the format for messages sent to the virtqueue. Written by the
> > driver, read-only from the device.
> >
> > >
> > > > +
> > > > +\begin{lstlisting}
> > > > +struct virtio_balloon_event {
> > > > +        u32 type;
> > > > +        u32 data;
> > > > +};
> > > > +
> > > > +#define VIRTIO_BALLOON_EVENT_OOM        1
> > > > +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> > > > +\end{lstlisting}
> > > > +
> > > > +The \field{type} determines how to interpret the event. The following
> > > > +events are defined:
> > > > +
> > > > +\begin{itemize}
> > > > +\item Out of memory
> > > > +\begin{lstlisting}
> > > > +#define VIRTIO_BALLOON_EVENT_OOM        1
> > > > +\end{lstlisting}
> > > > +The driver has encountered a situation in which using pages from the balloon
> > > > +is necessary for system stability (e.g. if memory is required by applications
> > > > +running within the guest). The \field{data} value indicates how many pages
> > > > +the driver requires to maintain system stability.
> > >
> > > I'd prefer that we report info in bytes or another portable metric
> > > please. Down the road we should switch balloon away from
> > > using pages, too. This might mean we need to make data and type u64.
> > >
> > > > +\item Out of puff
> > > > +\begin{lstlisting}
> > > > +#define VIRTIO_BALLOON_EVENT_OOPUFF     2
> > > > +\end{lstlisting}
> > > > +The driver has encountered an allocation failure when trying to inflate
> > > > +the balloon. The \field{data} value is unused. This event serves as a signal
> > > > +that the balloon may not be able to inflate in a timely manner.
> > > > +\end{itemize}
> > > > +
> > > > +\drivernormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> > > > +
> > > > +The driver MUST update \field{actual} with any allocated pages before
> > > > +sending a VIRTIO_BALLOON_EVENT_OOPUFF event.
> > > > +
> > > > +The driver SHOULD wait for the device to acknowledge the event
> > > > +before trying to further inflate or deflate the balloon.
> > >
> > > Hmm what is the motivation for this?
> >
> > Having the driver wait gives the device the option to respond to the
> > event synchronously. For example, since a puff failure when inflating
> > signals high memory pressure in the guest, the device might want to
> > reduce num_pages to stop further attempts to inflate the balloon.
> > And
> > if the device wants to respond asynchronously, it can always do that
> > after acking the event.
>
> Question: how does device know by how much to reduce the num_pages?
> Should the current balloon size be sent with this event?

The device can determine the balloon's current size from
virtio_balloon_config.actual. The device could also query memory stats
via the stats queue, although our implementation doesn't currently do
this.

> > >
> > > > +
> > > > +If VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been negotiated, the driver
> > > > +SHOULD send an OOM event before using pages from the balloon.
> > >
> > > what does using mean here? You mean without MUST_TELL_HOST?
> > >
> >
> > With DEFLATE_ON_OOM, "the driver MAY use pages from the balloon when
> > num_pages is less than or equal to the actual number of pages in the
> > balloon if this is required for system stability". When the event
> > queue is enabled, the driver should send a VIRTIO_BALLOON_EVENT_OOM
> > before doing that and using pages in the balloon.
>
> > That said, I've been
> > convinced that deflate-on-oom is questionable at best, so I think it's
> > better to replace the OOM event with a more generic memory pressure
> > event.
>
>
>
>
> > > > +
> > > > +\devicenormative{\paragraph}{Events}{Device Types / Memory Balloon Device / Device Operation / Events}
> > > > +
> > > > +When the device receives a VIRTIO_BALLOON_EVENT_OOM event, it SHOULD deflate
> > > > +the balloon by \field{data} pages before acknowledging the event.
> > >
> > > ... by modifying the config space field I guess?
> > >
> >
> > Yes, by reducing num_pages.
> >
> > > > +
> > > >  \section{SCSI Host Device}\label{sec:Device Types / SCSI Host Device}
> > > >
> > > >  The virtio SCSI host device groups together one or more virtual
> > > > --
> > > > 2.35.0.rc0.227.g00780c9af4-goog
> > > >
> > > >
> > > > This publicly archived list offers a means to provide input to the
> > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > >
> > > > In order to verify user consent to the Feedback License terms and
> > > > to minimize spam in the list archive, subscription is required
> > > > before posting.
> > > >
> > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > Join OASIS: https://www.oasis-open.org/join/
> > >
>

This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-08-16  7:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-24 12:56 [virtio-comment] [PATCH 0/2] Add alternative to deflate-on-oom David Stevens
2022-01-24 12:56 ` [virtio-comment] [PATCH 1/2] virtio-balloon: add an event queue David Stevens
2022-01-28 15:46   ` David Hildenbrand
2022-01-31  7:10     ` David Stevens
2022-01-31  8:43       ` David Hildenbrand
2022-02-04  1:41         ` David Stevens
2022-02-10 12:53           ` David Hildenbrand
2022-02-14  6:59             ` David Stevens
2022-08-09 20:52   ` Michael S. Tsirkin
2022-08-10  7:54     ` David Stevens
2022-08-10  9:10       ` Michael S. Tsirkin
2022-08-16  7:13         ` David Stevens
2022-01-24 12:56 ` [virtio-comment] [PATCH 2/2] virtio-balloon: add a responsive host feature David Stevens
2022-01-28 15:52   ` David Hildenbrand
2022-01-31  7:09     ` David Stevens
2022-01-31  8:28       ` David Hildenbrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.