All of lore.kernel.org
 help / color / mirror / Atom feed
* [virtio-comment] [PATCH RFC 0/3] virtio-rng based entropy leak reporting
@ 2022-11-21 16:30 Michael S. Tsirkin
  2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 1/3] rng: move to a file of its own Michael S. Tsirkin
                   ` (3 more replies)
  0 siblings, 4 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2022-11-21 16:30 UTC (permalink / raw)
  To: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Chalios, Babis, Jason A. Donenfeld

Generally, entropy only grows. However, there are cases where
it goes down - for example, consider generating a one time
pad where someone managed to use a side channel to
steal its contents. By combining the seemingly random
pad with the stolen contents we have reversed the entropy.

This actually happens within VMs e.g. when time is reversed due
to snapshoting. Existing approaches for VMs include Microsoft's
VM GEN ID.

This draft proposes a feature in virtio rng for reporting such
leaks.

Patches 1,2 refactor existing draft text. Patch 3 adds new functionality.

TODO:
	document theory of operation
	add conformance clauses


Michael S. Tsirkin (3):
  rng: move to a file of its own
  rng: be specific about the virtqueue
  rng: leak detection support

 content.tex    |  43 +--------------------
 virtio-rng.tex | 102 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 103 insertions(+), 42 deletions(-)
 create mode 100644 virtio-rng.tex

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] [PATCH RFC 1/3] rng: move to a file of its own
  2022-11-21 16:30 [virtio-comment] [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
@ 2022-11-21 16:30 ` Michael S. Tsirkin
  2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 2/3] rng: be specific about the virtqueue Michael S. Tsirkin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2022-11-21 16:30 UTC (permalink / raw)
  To: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Chalios, Babis, Jason A. Donenfeld

Will make it easier to edit.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 content.tex    | 43 +------------------------------------------
 virtio-rng.tex | 42 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 42 deletions(-)
 create mode 100644 virtio-rng.tex

diff --git a/content.tex b/content.tex
index 7111929..332c257 100644
--- a/content.tex
+++ b/content.tex
@@ -5540,48 +5540,7 @@ \subsubsection{Legacy Interface: Framing Requirements}\label{sec:Device
 negotiated VIRTIO_F_ANY_LAYOUT MUST use only a single
 descriptor for all buffers in the control receiveq and control transmitq.
 
-\section{Entropy Device}\label{sec:Device Types / Entropy Device}
-
-The virtio entropy device supplies high-quality randomness for
-guest use.
-
-\subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
-  4
-
-\subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
-\begin{description}
-\item[0] requestq
-\end{description}
-
-\subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
-  None currently defined
-
-\subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
-  None currently defined.
-
-\subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Device Initialization}
-
-\begin{enumerate}
-\item The virtqueue is initialized
-\end{enumerate}
-
-\subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
-
-When the driver requires random bytes, it places the descriptor
-of one or more buffers in the queue. It will be completely filled
-by random data by the device.
-
-\drivernormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
-
-The driver MUST NOT place device-readable buffers into the queue.
-
-The driver MUST examine the length written by the device to determine
-how many random bytes were received.
-
-\devicenormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
-
-The device MUST place one or more random bytes into the buffer, but it
-MAY use less than the entire buffer length.
+\input{virtio-rng.tex}
 
 \section{Traditional Memory Balloon Device}\label{sec:Device Types / Memory Balloon Device}
 
diff --git a/virtio-rng.tex b/virtio-rng.tex
new file mode 100644
index 0000000..c26f589
--- /dev/null
+++ b/virtio-rng.tex
@@ -0,0 +1,42 @@
+\section{Entropy Device}\label{sec:Device Types / Entropy Device}
+
+The virtio entropy device supplies high-quality randomness for
+guest use.
+
+\subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
+  4
+
+\subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
+\begin{description}
+\item[0] requestq
+\end{description}
+
+\subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
+  None currently defined
+
+\subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
+  None currently defined.
+
+\subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Device Initialization}
+
+\begin{enumerate}
+\item The virtqueue is initialized
+\end{enumerate}
+
+\subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
+
+When the driver requires random bytes, it places the descriptor
+of one or more buffers in the queue. It will be completely filled
+by random data by the device.
+
+\drivernormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
+
+The driver MUST NOT place device-readable buffers into the queue.
+
+The driver MUST examine the length written by the device to determine
+how many random bytes were received.
+
+\devicenormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
+
+The device MUST place one or more random bytes into the buffer, but it
+MAY use less than the entire buffer length.
-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [virtio-comment] [PATCH RFC 2/3] rng: be specific about the virtqueue
  2022-11-21 16:30 [virtio-comment] [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
  2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 1/3] rng: move to a file of its own Michael S. Tsirkin
@ 2022-11-21 16:30 ` Michael S. Tsirkin
  2022-11-21 16:30 ` [virtio-dev] [PATCH RFC 3/3] rng: leak detection support Michael S. Tsirkin
  2023-01-12  7:02 ` [virtio-dev] Re: [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
  3 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2022-11-21 16:30 UTC (permalink / raw)
  To: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Chalios, Babis, Jason A. Donenfeld

As we are going to add more virtqueues, when talking
about the requestq be specific and mention it by name
as opposed to just generally "the virtqueue".

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 virtio-rng.tex | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/virtio-rng.tex b/virtio-rng.tex
index c26f589..1ec7164 100644
--- a/virtio-rng.tex
+++ b/virtio-rng.tex
@@ -20,23 +20,24 @@ \subsection{Device configuration layout}\label{sec:Device Types / Entropy Device
 \subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Device Initialization}
 
 \begin{enumerate}
-\item The virtqueue is initialized
+\item The \field{requestq} virtqueue is initialized
 \end{enumerate}
 
 \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
 
 When the driver requires random bytes, it places the descriptor
-of one or more buffers in the queue. It will be completely filled
+of one or more buffers in the \field{requestq} virtqueue. It will be completely filled
 by random data by the device.
 
 \drivernormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
 
-The driver MUST NOT place device-readable buffers into the queue.
+The driver MUST NOT place device-readable buffers into the \field{requestq} virtqueue.
 
 The driver MUST examine the length written by the device to determine
 how many random bytes were received.
 
 \devicenormative{\subsubsection}{Device Operation}{Device Types / Entropy Device / Device Operation}
 
-The device MUST place one or more random bytes into the buffer, but it
+The device MUST place one or more random bytes into the buffer
+made available to it through \field{requestq}, but it
 MAY use less than the entire buffer length.
-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2022-11-21 16:30 [virtio-comment] [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
  2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 1/3] rng: move to a file of its own Michael S. Tsirkin
  2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 2/3] rng: be specific about the virtqueue Michael S. Tsirkin
@ 2022-11-21 16:30 ` Michael S. Tsirkin
  2022-11-25 12:41   ` [virtio-dev] " Babis Chalios
                     ` (2 more replies)
  2023-01-12  7:02 ` [virtio-dev] Re: [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
  3 siblings, 3 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2022-11-21 16:30 UTC (permalink / raw)
  To: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Chalios, Babis, Jason A. Donenfeld

Add virtqueues to support reporting entropy leaks (similar to virtio based vmgenid).

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 virtio-rng.tex | 61 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 60 insertions(+), 1 deletion(-)

diff --git a/virtio-rng.tex b/virtio-rng.tex
index 1ec7164..4760dfa 100644
--- a/virtio-rng.tex
+++ b/virtio-rng.tex
@@ -9,10 +9,14 @@ \subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
 \subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
 \begin{description}
 \item[0] requestq
+\item[1] leakq1 (only if VIRTIO_RNG_F_LEAK is offered)
+\item[2] leakq2 (only if VIRTIO_RNG_F_LEAK is offered)
 \end{description}
 
 \subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
-  None currently defined
+\begin{description}
+\item[VIRTIO_RNG_F_LEAK (0)] Device can report and handle information leaks.
+\end{description}
 
 \subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
   None currently defined.
@@ -21,6 +25,7 @@ \subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Dev
 
 \begin{enumerate}
 \item The \field{requestq} virtqueue is initialized
+\item If VIRTIO_RNG_F_LEAK has been negotiated, \field{leakq1} and \field{leakq2} are initialized
 \end{enumerate}
 
 \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
@@ -41,3 +46,57 @@ \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device O
 The device MUST place one or more random bytes into the buffer
 made available to it through \field{requestq}, but it
 MAY use less than the entire buffer length.
+
+\subsubsection{Reporting Information Leaks}{Device Types / Entropy Device / Device Operation / Reporting Information Leaks}
+
+The device might, after the fact, detect that some of the entropy
+it supplied to the driver has after the fact degraded in quality
+or leaked to the outside world.  One example is when the device
+is part of the virtual machine undergoing a restore from snapshot
+operation. Another example is when the information leaks from the
+host system through a side-channel.
+
+The driver would typically react by causing regeneration of any
+information that might have leaked and that has to be secret or
+unique.  It is understood that when an information leak has been
+detected it is likely not limited to the entropy received through
+the specific device. In particular, this is the case for
+snapshoting It is thus suggested that the system fully
+regenerate any unique/secret information in this scenario.
+
+If VIRTIO_RNG_F_LEAK has been negotiated the device can report
+such leaks to the driver through a set of dedicated leak
+queues: \field{leakq1} and \field{leakq2}.
+
+Buffers added to the leak queues can have one of two forms:
+\begin{enumerate}
+\item A write-only buffer. It will be completely filled by random data by the device.
+\item A buffer consisting of read-only section followed by a
+write-only section, both of identical size. The
+device will copy data from the read-only section to the write-only
+section.
+\end{enumerate}
+
+The steps for operating the virtqueue are:
+
+\begin{enumerate}
+\item At each time, only one of \field{leakq1}, \field{leakq2} is active
+      (has buffers added/used).
+\item After initialization, \field{leakq1} is active.
+\item Driver adds multiple buffers to the active leak queue.
+\item The buffers are not used until an information leak is
+      detected, as long as that is the case driver can
+      add more buffers to the active queue.
+\item Upon detecting an information leak, device starts
+      using buffers in the active leak queue.
+\item Upon detecting that buffers have been used, driver
+      switches to another leak queue making it active
+      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
+      It then starts adding buffers to the new leak queue.
+\item Device will keep using buffers in the active leak queue
+      until it detects that both the current leak queue is empty and another
+      leak queue has buffers. At that point device switches to
+      another leak queue, making it active.
+\item After the switch, buffers from the new leak queue are not
+      used until an information leak is detected.
+\end{enumerate}
-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [virtio-dev] Re: [PATCH RFC 3/3] rng: leak detection support
  2022-11-21 16:30 ` [virtio-dev] [PATCH RFC 3/3] rng: leak detection support Michael S. Tsirkin
@ 2022-11-25 12:41   ` Babis Chalios
  2022-12-12 10:10     ` Babis Chalios
  2023-01-11 13:57   ` Babis Chalios
  2023-08-31 10:16   ` [virtio-dev] " Babis Chalios
  2 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2022-11-25 12:41 UTC (permalink / raw)
  To: Michael S. Tsirkin, virtio-comment, virtio-dev, Cali, Marco,
	Graf (AWS),
	Alexander, Jason A. Donenfeld

Hi Michael,

And thanks a lot on the effort on that.

On 21/11/22 17:30, Michael S. Tsirkin wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> Add virtqueues to support reporting entropy leaks (similar to virtio based vmgenid).
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   virtio-rng.tex | 61 +++++++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 60 insertions(+), 1 deletion(-)
>
> diff --git a/virtio-rng.tex b/virtio-rng.tex
> index 1ec7164..4760dfa 100644
> --- a/virtio-rng.tex
> +++ b/virtio-rng.tex
> @@ -9,10 +9,14 @@ \subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
>   \subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
>   \begin{description}
>   \item[0] requestq
> +\item[1] leakq1 (only if VIRTIO_RNG_F_LEAK is offered)
> +\item[2] leakq2 (only if VIRTIO_RNG_F_LEAK is offered)
>   \end{description}
>
>   \subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
> -  None currently defined
> +\begin{description}
> +\item[VIRTIO_RNG_F_LEAK (0)] Device can report and handle information leaks.
> +\end{description}
>
>   \subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
>     None currently defined.
> @@ -21,6 +25,7 @@ \subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Dev
>
>   \begin{enumerate}
>   \item The \field{requestq} virtqueue is initialized
> +\item If VIRTIO_RNG_F_LEAK has been negotiated, \field{leakq1} and \field{leakq2} are initialized
>   \end{enumerate}
>
>   \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
> @@ -41,3 +46,57 @@ \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device O
>   The device MUST place one or more random bytes into the buffer
>   made available to it through \field{requestq}, but it
>   MAY use less than the entire buffer length.
> +
> +\subsubsection{Reporting Information Leaks}{Device Types / Entropy Device / Device Operation / Reporting Information Leaks}
> +
> +The device might, after the fact, detect that some of the entropy
> +it supplied to the driver has after the fact degraded in quality
> +or leaked to the outside world.  One example is when the device
> +is part of the virtual machine undergoing a restore from snapshot
> +operation. Another example is when the information leaks from the
> +host system through a side-channel.
> +
> +The driver would typically react by causing regeneration of any
> +information that might have leaked and that has to be secret or
> +unique.  It is understood that when an information leak has been
> +detected it is likely not limited to the entropy received through
> +the specific device. In particular, this is the case for
> +snapshoting It is thus suggested that the system fully
> +regenerate any unique/secret information in this scenario.
> +
> +If VIRTIO_RNG_F_LEAK has been negotiated the device can report
> +such leaks to the driver through a set of dedicated leak
> +queues: \field{leakq1} and \field{leakq2}.
> +
> +Buffers added to the leak queues can have one of two forms:
> +\begin{enumerate}
> +\item A write-only buffer. It will be completely filled by random data by the device.
> +\item A buffer consisting of read-only section followed by a
> +write-only section, both of identical size. The
> +device will copy data from the read-only section to the write-only
> +section.
> +\end{enumerate}
> +
> +The steps for operating the virtqueue are:
> +
> +\begin{enumerate}
> +\item At each time, only one of \field{leakq1}, \field{leakq2} is active
> +      (has buffers added/used).
> +\item After initialization, \field{leakq1} is active.
> +\item Driver adds multiple buffers to the active leak queue.
> +\item The buffers are not used until an information leak is
> +      detected, as long as that is the case driver can
> +      add more buffers to the active queue.
> +\item Upon detecting an information leak, device starts
> +      using buffers in the active leak queue.
> +\item Upon detecting that buffers have been used, driver
> +      switches to another leak queue making it active
> +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> +      It then starts adding buffers to the new leak queue.
> +\item Device will keep using buffers in the active leak queue
> +      until it detects that both the current leak queue is empty and another
> +      leak queue has buffers. At that point device switches to
> +      another leak queue, making it active.
> +\item After the switch, buffers from the new leak queue are not
> +      used until an information leak is detected.
> +\end{enumerate}
> --
> MST
>

I am taking now a closer look to trying to see any potential 
race-conditions but in a first glance it looks ok to me.
Curious to see what Jason thinks too.

Cheers,
Babis
Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] Re: [PATCH RFC 3/3] rng: leak detection support
  2022-11-25 12:41   ` [virtio-dev] " Babis Chalios
@ 2022-12-12 10:10     ` Babis Chalios
  0 siblings, 0 replies; 53+ messages in thread
From: Babis Chalios @ 2022-12-12 10:10 UTC (permalink / raw)
  To: Michael S. Tsirkin, virtio-comment, virtio-dev, Cali, Marco,
	Graf (AWS),
	Alexander, Jason A. Donenfeld

Hi Michael,

On 25/11/22 13:41, Babis Chalios wrote:
> Hi Michael,
>
> And thanks a lot on the effort on that.
>
> On 21/11/22 17:30, Michael S. Tsirkin wrote:
>> CAUTION: This email originated from outside of the organization. Do 
>> not click links or open attachments unless you can confirm the sender 
>> and know the content is safe.
>>
>>
>>
>> Add virtqueues to support reporting entropy leaks (similar to virtio 
>> based vmgenid).
>>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> ---
>>   virtio-rng.tex | 61 +++++++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 60 insertions(+), 1 deletion(-)
>>
>> diff --git a/virtio-rng.tex b/virtio-rng.tex
>> index 1ec7164..4760dfa 100644
>> --- a/virtio-rng.tex
>> +++ b/virtio-rng.tex
>> @@ -9,10 +9,14 @@ \subsection{Device ID}\label{sec:Device Types / 
>> Entropy Device / Device ID}
>>   \subsection{Virtqueues}\label{sec:Device Types / Entropy Device / 
>> Virtqueues}
>>   \begin{description}
>>   \item[0] requestq
>> +\item[1] leakq1 (only if VIRTIO_RNG_F_LEAK is offered)
>> +\item[2] leakq2 (only if VIRTIO_RNG_F_LEAK is offered)
>>   \end{description}
>>
>>   \subsection{Feature bits}\label{sec:Device Types / Entropy Device / 
>> Feature bits}
>> -  None currently defined
>> +\begin{description}
>> +\item[VIRTIO_RNG_F_LEAK (0)] Device can report and handle 
>> information leaks.
>> +\end{description}
>>
>>   \subsection{Device configuration layout}\label{sec:Device Types / 
>> Entropy Device / Device configuration layout}
>>     None currently defined.
>> @@ -21,6 +25,7 @@ \subsection{Device Initialization}\label{sec:Device 
>> Types / Entropy Device / Dev
>>
>>   \begin{enumerate}
>>   \item The \field{requestq} virtqueue is initialized
>> +\item If VIRTIO_RNG_F_LEAK has been negotiated, \field{leakq1} and 
>> \field{leakq2} are initialized
>>   \end{enumerate}
>>
>>   \subsection{Device Operation}\label{sec:Device Types / Entropy 
>> Device / Device Operation}
>> @@ -41,3 +46,57 @@ \subsection{Device Operation}\label{sec:Device 
>> Types / Entropy Device / Device O
>>   The device MUST place one or more random bytes into the buffer
>>   made available to it through \field{requestq}, but it
>>   MAY use less than the entire buffer length.
>> +
>> +\subsubsection{Reporting Information Leaks}{Device Types / Entropy 
>> Device / Device Operation / Reporting Information Leaks}
>> +
>> +The device might, after the fact, detect that some of the entropy
>> +it supplied to the driver has after the fact degraded in quality
>> +or leaked to the outside world.  One example is when the device
>> +is part of the virtual machine undergoing a restore from snapshot
>> +operation. Another example is when the information leaks from the
>> +host system through a side-channel.
>> +
>> +The driver would typically react by causing regeneration of any
>> +information that might have leaked and that has to be secret or
>> +unique.  It is understood that when an information leak has been
>> +detected it is likely not limited to the entropy received through
>> +the specific device. In particular, this is the case for
>> +snapshoting It is thus suggested that the system fully
>> +regenerate any unique/secret information in this scenario.
>> +
>> +If VIRTIO_RNG_F_LEAK has been negotiated the device can report
>> +such leaks to the driver through a set of dedicated leak
>> +queues: \field{leakq1} and \field{leakq2}.
>> +
>> +Buffers added to the leak queues can have one of two forms:
>> +\begin{enumerate}
>> +\item A write-only buffer. It will be completely filled by random 
>> data by the device.
>> +\item A buffer consisting of read-only section followed by a
>> +write-only section, both of identical size. The
>> +device will copy data from the read-only section to the write-only
>> +section.
>> +\end{enumerate}
>> +
>> +The steps for operating the virtqueue are:
>> +
>> +\begin{enumerate}
>> +\item At each time, only one of \field{leakq1}, \field{leakq2} is 
>> active
>> +      (has buffers added/used).
>> +\item After initialization, \field{leakq1} is active.
>> +\item Driver adds multiple buffers to the active leak queue.
>> +\item The buffers are not used until an information leak is
>> +      detected, as long as that is the case driver can
>> +      add more buffers to the active queue.
>> +\item Upon detecting an information leak, device starts
>> +      using buffers in the active leak queue.
>> +\item Upon detecting that buffers have been used, driver
>> +      switches to another leak queue making it active
>> +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
>> +      It then starts adding buffers to the new leak queue.
>> +\item Device will keep using buffers in the active leak queue
>> +      until it detects that both the current leak queue is empty and 
>> another
>> +      leak queue has buffers. At that point device switches to
>> +      another leak queue, making it active.
I assume by "using" here you mean that the device will perform directly 
the action
described by the buffers once they become available to it. Is my 
understanding correct?

Also, what happens if at this point (there are buffers on the "old" 
active leak queue) and a
new information leak is detected? Should we clarify this case?

>> +\item After the switch, buffers from the new leak queue are not
>> +      used until an information leak is detected.
>> +\end{enumerate}
>> -- 
>> MST
>>
>
> I am taking now a closer look to trying to see any potential 
> race-conditions but in a first glance it looks ok to me.
> Curious to see what Jason thinks too.
>
> Cheers,
> Babis
> Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de 
> Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . 
> Folio 102 . Hoja M-401234 . CIF B84570936


Cheers,
Babis

Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-dev] Re: [PATCH RFC 3/3] rng: leak detection support
  2022-11-21 16:30 ` [virtio-dev] [PATCH RFC 3/3] rng: leak detection support Michael S. Tsirkin
  2022-11-25 12:41   ` [virtio-dev] " Babis Chalios
@ 2023-01-11 13:57   ` Babis Chalios
  2023-08-31 10:16   ` [virtio-dev] " Babis Chalios
  2 siblings, 0 replies; 53+ messages in thread
From: Babis Chalios @ 2023-01-11 13:57 UTC (permalink / raw)
  To: Michael S. Tsirkin, virtio-comment, virtio-dev, Cali, Marco,
	Graf (AWS),
	Alexander, Jason A. Donenfeld, amit

CCing Amit

On 21/11/22 17:30, Michael S. Tsirkin wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> Add virtqueues to support reporting entropy leaks (similar to virtio based vmgenid).
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   virtio-rng.tex | 61 +++++++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 60 insertions(+), 1 deletion(-)
>
> diff --git a/virtio-rng.tex b/virtio-rng.tex
> index 1ec7164..4760dfa 100644
> --- a/virtio-rng.tex
> +++ b/virtio-rng.tex
> @@ -9,10 +9,14 @@ \subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
>   \subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
>   \begin{description}
>   \item[0] requestq
> +\item[1] leakq1 (only if VIRTIO_RNG_F_LEAK is offered)
> +\item[2] leakq2 (only if VIRTIO_RNG_F_LEAK is offered)
>   \end{description}
>
>   \subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
> -  None currently defined
> +\begin{description}
> +\item[VIRTIO_RNG_F_LEAK (0)] Device can report and handle information leaks.
> +\end{description}
>
>   \subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
>     None currently defined.
> @@ -21,6 +25,7 @@ \subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Dev
>
>   \begin{enumerate}
>   \item The \field{requestq} virtqueue is initialized
> +\item If VIRTIO_RNG_F_LEAK has been negotiated, \field{leakq1} and \field{leakq2} are initialized
>   \end{enumerate}
>
>   \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
> @@ -41,3 +46,57 @@ \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device O
>   The device MUST place one or more random bytes into the buffer
>   made available to it through \field{requestq}, but it
>   MAY use less than the entire buffer length.
> +
> +\subsubsection{Reporting Information Leaks}{Device Types / Entropy Device / Device Operation / Reporting Information Leaks}
> +
> +The device might, after the fact, detect that some of the entropy
> +it supplied to the driver has after the fact degraded in quality
> +or leaked to the outside world.  One example is when the device
> +is part of the virtual machine undergoing a restore from snapshot
> +operation. Another example is when the information leaks from the
> +host system through a side-channel.
> +
> +The driver would typically react by causing regeneration of any
> +information that might have leaked and that has to be secret or
> +unique.  It is understood that when an information leak has been
> +detected it is likely not limited to the entropy received through
> +the specific device. In particular, this is the case for
> +snapshoting It is thus suggested that the system fully
> +regenerate any unique/secret information in this scenario.
> +
> +If VIRTIO_RNG_F_LEAK has been negotiated the device can report
> +such leaks to the driver through a set of dedicated leak
> +queues: \field{leakq1} and \field{leakq2}.
> +
> +Buffers added to the leak queues can have one of two forms:
> +\begin{enumerate}
> +\item A write-only buffer. It will be completely filled by random data by the device.
> +\item A buffer consisting of read-only section followed by a
> +write-only section, both of identical size. The
> +device will copy data from the read-only section to the write-only
> +section.
> +\end{enumerate}
> +
> +The steps for operating the virtqueue are:
> +
> +\begin{enumerate}
> +\item At each time, only one of \field{leakq1}, \field{leakq2} is active
> +      (has buffers added/used).
> +\item After initialization, \field{leakq1} is active.
> +\item Driver adds multiple buffers to the active leak queue.
> +\item The buffers are not used until an information leak is
> +      detected, as long as that is the case driver can
> +      add more buffers to the active queue.
> +\item Upon detecting an information leak, device starts
> +      using buffers in the active leak queue.
> +\item Upon detecting that buffers have been used, driver
> +      switches to another leak queue making it active
> +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> +      It then starts adding buffers to the new leak queue.
> +\item Device will keep using buffers in the active leak queue
> +      until it detects that both the current leak queue is empty and another
> +      leak queue has buffers. At that point device switches to
> +      another leak queue, making it active.
> +\item After the switch, buffers from the new leak queue are not
> +      used until an information leak is detected.
> +\end{enumerate}
> --
> MST
>

Cheers,
Babis

Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-dev] Re: [PATCH RFC 0/3] virtio-rng based entropy leak reporting
  2022-11-21 16:30 [virtio-comment] [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2022-11-21 16:30 ` [virtio-dev] [PATCH RFC 3/3] rng: leak detection support Michael S. Tsirkin
@ 2023-01-12  7:02 ` Michael S. Tsirkin
  2023-01-16 11:39   ` Babis Chalios
  3 siblings, 1 reply; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-01-12  7:02 UTC (permalink / raw)
  To: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Chalios, Babis, Jason A. Donenfeld

On Mon, Nov 21, 2022 at 11:30:19AM -0500, Michael S. Tsirkin wrote:
> Generally, entropy only grows. However, there are cases where
> it goes down - for example, consider generating a one time
> pad where someone managed to use a side channel to
> steal its contents. By combining the seemingly random
> pad with the stolen contents we have reversed the entropy.
> 
> This actually happens within VMs e.g. when time is reversed due
> to snapshoting. Existing approaches for VMs include Microsoft's
> VM GEN ID.
> 
> This draft proposes a feature in virtio rng for reporting such
> leaks.
> 
> Patches 1,2 refactor existing draft text. Patch 3 adds new functionality.
> 
> TODO:
> 	document theory of operation
> 	add conformance clauses

Guys any input on this? Anyone going to use this?

> 
> Michael S. Tsirkin (3):
>   rng: move to a file of its own
>   rng: be specific about the virtqueue
>   rng: leak detection support
> 
>  content.tex    |  43 +--------------------
>  virtio-rng.tex | 102 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 103 insertions(+), 42 deletions(-)
>  create mode 100644 virtio-rng.tex
> 
> -- 
> MST
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] Re: [PATCH RFC 0/3] virtio-rng based entropy leak reporting
  2023-01-12  7:02 ` [virtio-dev] Re: [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
@ 2023-01-16 11:39   ` Babis Chalios
       [not found]     ` <CAHmME9ry2fss2gsbPs2zVJkY=8Cdeae0XFD9FzCVnW67Xy3thA@mail.gmail.com>
  0 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-01-16 11:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, virtio-comment, virtio-dev, Cali, Marco,
	Graf (AWS),
	Alexander, Jason A. Donenfeld
  Cc: amit

Hi Michael,

On 12/1/23 08:02, Michael S. Tsirkin wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Mon, Nov 21, 2022 at 11:30:19AM -0500, Michael S. Tsirkin wrote:
>> Generally, entropy only grows. However, there are cases where
>> it goes down - for example, consider generating a one time
>> pad where someone managed to use a side channel to
>> steal its contents. By combining the seemingly random
>> pad with the stolen contents we have reversed the entropy.
>>
>> This actually happens within VMs e.g. when time is reversed due
>> to snapshoting. Existing approaches for VMs include Microsoft's
>> VM GEN ID.
>>
>> This draft proposes a feature in virtio rng for reporting such
>> leaks.
>>
>> Patches 1,2 refactor existing draft text. Patch 3 adds new functionality.
>>
>> TODO:
>>        document theory of operation
>>        add conformance clauses
> Guys any input on this? Anyone going to use this?

I plan to post an RFC patch for linux virtio-rng show-casing this with 
Firecracker, this week.
Also, I had sent an e-mail: 
https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg09128.html 
with some questions,
not sure whether you missed it?

>> Michael S. Tsirkin (3):
>>    rng: move to a file of its own
>>    rng: be specific about the virtqueue
>>    rng: leak detection support
>>
>>   content.tex    |  43 +--------------------
>>   virtio-rng.tex | 102 +++++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 103 insertions(+), 42 deletions(-)
>>   create mode 100644 virtio-rng.tex
>>
>> --
>> MST
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>

Cheers,
Babis


Amazon Spain Services sociedad limitada unipersonal, Calle Ramirez de Prado 5, 28045 Madrid. Registro Mercantil de Madrid . Tomo 22458 . Folio 102 . Hoja M-401234 . CIF B84570936

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] Re: [PATCH RFC 0/3] virtio-rng based entropy leak reporting
       [not found]     ` <CAHmME9ry2fss2gsbPs2zVJkY=8Cdeae0XFD9FzCVnW67Xy3thA@mail.gmail.com>
@ 2023-01-16 18:11       ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-01-16 18:11 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Babis Chalios, virtio-comment, virtio-dev, Cali, Marco,
	Graf (AWS),
	Alexander, amit

On Mon, Jan 16, 2023 at 05:45:40PM +0100, Jason A. Donenfeld wrote:
> Hey guys,
> 
> Just FYI, I am still interested in this, but it's currently taken a
> backseat while I focus on some other parts of the ecosystem. I'll be
> back to moving this forward in March.
> 
> Jason

OK sure, no rush. I will drop this for now and just ping me when there's
interest.

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2022-11-21 16:30 ` [virtio-dev] [PATCH RFC 3/3] rng: leak detection support Michael S. Tsirkin
  2022-11-25 12:41   ` [virtio-dev] " Babis Chalios
  2023-01-11 13:57   ` Babis Chalios
@ 2023-08-31 10:16   ` Babis Chalios
  2023-09-12 21:05       ` [virtio-comment] " Michael S. Tsirkin
  2 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-08-31 10:16 UTC (permalink / raw)
  To: Michael S. Tsirkin, virtio-comment, virtio-dev, Cali, Marco,
	Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

Hi Michael,

On 21/11/22 17:30, Michael S. Tsirkin wrote:
> Add virtqueues to support reporting entropy leaks (similar to virtio based vmgenid).
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   virtio-rng.tex | 61 +++++++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 60 insertions(+), 1 deletion(-)
>
> diff --git a/virtio-rng.tex b/virtio-rng.tex
> index 1ec7164..4760dfa 100644
> --- a/virtio-rng.tex
> +++ b/virtio-rng.tex
> @@ -9,10 +9,14 @@ \subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
>   \subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
>   \begin{description}
>   \item[0] requestq
> +\item[1] leakq1 (only if VIRTIO_RNG_F_LEAK is offered)
> +\item[2] leakq2 (only if VIRTIO_RNG_F_LEAK is offered)
>   \end{description}
>   
>   \subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
> -  None currently defined
> +\begin{description}
> +\item[VIRTIO_RNG_F_LEAK (0)] Device can report and handle information leaks.
> +\end{description}
>   
>   \subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
>     None currently defined.
> @@ -21,6 +25,7 @@ \subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Dev
>   
>   \begin{enumerate}
>   \item The \field{requestq} virtqueue is initialized
> +\item If VIRTIO_RNG_F_LEAK has been negotiated, \field{leakq1} and \field{leakq2} are initialized
>   \end{enumerate}
>   
>   \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
> @@ -41,3 +46,57 @@ \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device O
>   The device MUST place one or more random bytes into the buffer
>   made available to it through \field{requestq}, but it
>   MAY use less than the entire buffer length.
> +
> +\subsubsection{Reporting Information Leaks}{Device Types / Entropy Device / Device Operation / Reporting Information Leaks}
> +
> +The device might, after the fact, detect that some of the entropy
> +it supplied to the driver has after the fact degraded in quality
> +or leaked to the outside world.  One example is when the device
> +is part of the virtual machine undergoing a restore from snapshot
> +operation. Another example is when the information leaks from the
> +host system through a side-channel.
> +
> +The driver would typically react by causing regeneration of any
> +information that might have leaked and that has to be secret or
> +unique.  It is understood that when an information leak has been
> +detected it is likely not limited to the entropy received through
> +the specific device. In particular, this is the case for
> +snapshoting It is thus suggested that the system fully
> +regenerate any unique/secret information in this scenario.
> +
> +If VIRTIO_RNG_F_LEAK has been negotiated the device can report
> +such leaks to the driver through a set of dedicated leak
> +queues: \field{leakq1} and \field{leakq2}.
> +
> +Buffers added to the leak queues can have one of two forms:
> +\begin{enumerate}
> +\item A write-only buffer. It will be completely filled by random data by the device.
> +\item A buffer consisting of read-only section followed by a
> +write-only section, both of identical size. The
> +device will copy data from the read-only section to the write-only
> +section.
> +\end{enumerate}
> +
> +The steps for operating the virtqueue are:
> +
> +\begin{enumerate}
> +\item At each time, only one of \field{leakq1}, \field{leakq2} is active
> +      (has buffers added/used).
> +\item After initialization, \field{leakq1} is active.
> +\item Driver adds multiple buffers to the active leak queue.
> +\item The buffers are not used until an information leak is
> +      detected, as long as that is the case driver can
> +      add more buffers to the active queue.
> +\item Upon detecting an information leak, device starts
> +      using buffers in the active leak queue.
> +\item Upon detecting that buffers have been used, driver
> +      switches to another leak queue making it active
> +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> +      It then starts adding buffers to the new leak queue.\

I have been discussing with Alex and we think there's a potential race 
here, between the time the driver
sees the used buffers in the active leak queue until the time it adds 
new buffers to the next leak queue.
If a new entropy leak event arrives the VMM won't find any buffers in 
the queue.

In the last RFC implementing this in Linux we sent to LKML [1] we avoid 
the issue by pre-populating both
queues, but that does not solve the problem if a third entropy leak 
event arrives. The probability of this
happening is indeed small, but we thought of a potential solution to this.

What if we modify the spec here to instruct the VMM to deny taking a 
snapshot if there are not any buffers
in the active leak queue? If we did this, we could even simplify the 
spec to just introduce a single entropy
leak queue, so we could avoid the complexity of switching between active 
leak queues in the driver and
the device. WDYT?

Cheers,
Babis

[1] 
https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-08-31 10:16   ` [virtio-dev] " Babis Chalios
@ 2023-09-12 21:05       ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-12 21:05 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Thu, Aug 31, 2023 at 12:16:22PM +0200, Babis Chalios wrote:
> Hi Michael,
> 
> On 21/11/22 17:30, Michael S. Tsirkin wrote:
> > Add virtqueues to support reporting entropy leaks (similar to virtio based vmgenid).
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >   virtio-rng.tex | 61 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >   1 file changed, 60 insertions(+), 1 deletion(-)
> > 
> > diff --git a/virtio-rng.tex b/virtio-rng.tex
> > index 1ec7164..4760dfa 100644
> > --- a/virtio-rng.tex
> > +++ b/virtio-rng.tex
> > @@ -9,10 +9,14 @@ \subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
> >   \subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
> >   \begin{description}
> >   \item[0] requestq
> > +\item[1] leakq1 (only if VIRTIO_RNG_F_LEAK is offered)
> > +\item[2] leakq2 (only if VIRTIO_RNG_F_LEAK is offered)
> >   \end{description}
> >   \subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
> > -  None currently defined
> > +\begin{description}
> > +\item[VIRTIO_RNG_F_LEAK (0)] Device can report and handle information leaks.
> > +\end{description}
> >   \subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
> >     None currently defined.
> > @@ -21,6 +25,7 @@ \subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Dev
> >   \begin{enumerate}
> >   \item The \field{requestq} virtqueue is initialized
> > +\item If VIRTIO_RNG_F_LEAK has been negotiated, \field{leakq1} and \field{leakq2} are initialized
> >   \end{enumerate}
> >   \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
> > @@ -41,3 +46,57 @@ \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device O
> >   The device MUST place one or more random bytes into the buffer
> >   made available to it through \field{requestq}, but it
> >   MAY use less than the entire buffer length.
> > +
> > +\subsubsection{Reporting Information Leaks}{Device Types / Entropy Device / Device Operation / Reporting Information Leaks}
> > +
> > +The device might, after the fact, detect that some of the entropy
> > +it supplied to the driver has after the fact degraded in quality
> > +or leaked to the outside world.  One example is when the device
> > +is part of the virtual machine undergoing a restore from snapshot
> > +operation. Another example is when the information leaks from the
> > +host system through a side-channel.
> > +
> > +The driver would typically react by causing regeneration of any
> > +information that might have leaked and that has to be secret or
> > +unique.  It is understood that when an information leak has been
> > +detected it is likely not limited to the entropy received through
> > +the specific device. In particular, this is the case for
> > +snapshoting It is thus suggested that the system fully
> > +regenerate any unique/secret information in this scenario.
> > +
> > +If VIRTIO_RNG_F_LEAK has been negotiated the device can report
> > +such leaks to the driver through a set of dedicated leak
> > +queues: \field{leakq1} and \field{leakq2}.
> > +
> > +Buffers added to the leak queues can have one of two forms:
> > +\begin{enumerate}
> > +\item A write-only buffer. It will be completely filled by random data by the device.
> > +\item A buffer consisting of read-only section followed by a
> > +write-only section, both of identical size. The
> > +device will copy data from the read-only section to the write-only
> > +section.
> > +\end{enumerate}
> > +
> > +The steps for operating the virtqueue are:
> > +
> > +\begin{enumerate}
> > +\item At each time, only one of \field{leakq1}, \field{leakq2} is active
> > +      (has buffers added/used).
> > +\item After initialization, \field{leakq1} is active.
> > +\item Driver adds multiple buffers to the active leak queue.
> > +\item The buffers are not used until an information leak is
> > +      detected, as long as that is the case driver can
> > +      add more buffers to the active queue.
> > +\item Upon detecting an information leak, device starts
> > +      using buffers in the active leak queue.
> > +\item Upon detecting that buffers have been used, driver
> > +      switches to another leak queue making it active
> > +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> > +      It then starts adding buffers to the new leak queue.\
> 
> I have been discussing with Alex and we think there's a potential race here,
> between the time the driver
> sees the used buffers in the active leak queue until the time it adds new
> buffers to the next leak queue.
> If a new entropy leak event arrives the VMM won't find any buffers in the
> queue.

I do not understand why this matters though. we know there was a leak,
why does it matter whether there was one or two leaks?

> In the last RFC implementing this in Linux we sent to LKML [1] we avoid the
> issue by pre-populating both
> queues, but that does not solve the problem if a third entropy leak event
> arrives. The probability of this
> happening is indeed small, but we thought of a potential solution to this.
> 
> What if we modify the spec here to instruct the VMM to deny taking a
> snapshot if there are not any buffers
> in the active leak queue? If we did this, we could even simplify the spec to
> just introduce a single entropy
> leak queue, so we could avoid the complexity of switching between active
> leak queues in the driver and
> the device. WDYT?

here's the problem: 

- driver adds batch 1 of buffers
- leak
- device starts using buffers from batch 1
- driver sees some buffers and starts adding batch 2
- device sees batch 2 and thinks this is part of batch 1
  consumes them all




> Cheers,
> Babis
> 
> [1]
> https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-12 21:05       ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-12 21:05 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Thu, Aug 31, 2023 at 12:16:22PM +0200, Babis Chalios wrote:
> Hi Michael,
> 
> On 21/11/22 17:30, Michael S. Tsirkin wrote:
> > Add virtqueues to support reporting entropy leaks (similar to virtio based vmgenid).
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >   virtio-rng.tex | 61 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >   1 file changed, 60 insertions(+), 1 deletion(-)
> > 
> > diff --git a/virtio-rng.tex b/virtio-rng.tex
> > index 1ec7164..4760dfa 100644
> > --- a/virtio-rng.tex
> > +++ b/virtio-rng.tex
> > @@ -9,10 +9,14 @@ \subsection{Device ID}\label{sec:Device Types / Entropy Device / Device ID}
> >   \subsection{Virtqueues}\label{sec:Device Types / Entropy Device / Virtqueues}
> >   \begin{description}
> >   \item[0] requestq
> > +\item[1] leakq1 (only if VIRTIO_RNG_F_LEAK is offered)
> > +\item[2] leakq2 (only if VIRTIO_RNG_F_LEAK is offered)
> >   \end{description}
> >   \subsection{Feature bits}\label{sec:Device Types / Entropy Device / Feature bits}
> > -  None currently defined
> > +\begin{description}
> > +\item[VIRTIO_RNG_F_LEAK (0)] Device can report and handle information leaks.
> > +\end{description}
> >   \subsection{Device configuration layout}\label{sec:Device Types / Entropy Device / Device configuration layout}
> >     None currently defined.
> > @@ -21,6 +25,7 @@ \subsection{Device Initialization}\label{sec:Device Types / Entropy Device / Dev
> >   \begin{enumerate}
> >   \item The \field{requestq} virtqueue is initialized
> > +\item If VIRTIO_RNG_F_LEAK has been negotiated, \field{leakq1} and \field{leakq2} are initialized
> >   \end{enumerate}
> >   \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device Operation}
> > @@ -41,3 +46,57 @@ \subsection{Device Operation}\label{sec:Device Types / Entropy Device / Device O
> >   The device MUST place one or more random bytes into the buffer
> >   made available to it through \field{requestq}, but it
> >   MAY use less than the entire buffer length.
> > +
> > +\subsubsection{Reporting Information Leaks}{Device Types / Entropy Device / Device Operation / Reporting Information Leaks}
> > +
> > +The device might, after the fact, detect that some of the entropy
> > +it supplied to the driver has after the fact degraded in quality
> > +or leaked to the outside world.  One example is when the device
> > +is part of the virtual machine undergoing a restore from snapshot
> > +operation. Another example is when the information leaks from the
> > +host system through a side-channel.
> > +
> > +The driver would typically react by causing regeneration of any
> > +information that might have leaked and that has to be secret or
> > +unique.  It is understood that when an information leak has been
> > +detected it is likely not limited to the entropy received through
> > +the specific device. In particular, this is the case for
> > +snapshoting It is thus suggested that the system fully
> > +regenerate any unique/secret information in this scenario.
> > +
> > +If VIRTIO_RNG_F_LEAK has been negotiated the device can report
> > +such leaks to the driver through a set of dedicated leak
> > +queues: \field{leakq1} and \field{leakq2}.
> > +
> > +Buffers added to the leak queues can have one of two forms:
> > +\begin{enumerate}
> > +\item A write-only buffer. It will be completely filled by random data by the device.
> > +\item A buffer consisting of read-only section followed by a
> > +write-only section, both of identical size. The
> > +device will copy data from the read-only section to the write-only
> > +section.
> > +\end{enumerate}
> > +
> > +The steps for operating the virtqueue are:
> > +
> > +\begin{enumerate}
> > +\item At each time, only one of \field{leakq1}, \field{leakq2} is active
> > +      (has buffers added/used).
> > +\item After initialization, \field{leakq1} is active.
> > +\item Driver adds multiple buffers to the active leak queue.
> > +\item The buffers are not used until an information leak is
> > +      detected, as long as that is the case driver can
> > +      add more buffers to the active queue.
> > +\item Upon detecting an information leak, device starts
> > +      using buffers in the active leak queue.
> > +\item Upon detecting that buffers have been used, driver
> > +      switches to another leak queue making it active
> > +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> > +      It then starts adding buffers to the new leak queue.\
> 
> I have been discussing with Alex and we think there's a potential race here,
> between the time the driver
> sees the used buffers in the active leak queue until the time it adds new
> buffers to the next leak queue.
> If a new entropy leak event arrives the VMM won't find any buffers in the
> queue.

I do not understand why this matters though. we know there was a leak,
why does it matter whether there was one or two leaks?

> In the last RFC implementing this in Linux we sent to LKML [1] we avoid the
> issue by pre-populating both
> queues, but that does not solve the problem if a third entropy leak event
> arrives. The probability of this
> happening is indeed small, but we thought of a potential solution to this.
> 
> What if we modify the spec here to instruct the VMM to deny taking a
> snapshot if there are not any buffers
> in the active leak queue? If we did this, we could even simplify the spec to
> just introduce a single entropy
> leak queue, so we could avoid the complexity of switching between active
> leak queues in the driver and
> the device. WDYT?

here's the problem: 

- driver adds batch 1 of buffers
- leak
- device starts using buffers from batch 1
- driver sees some buffers and starts adding batch 2
- device sees batch 2 and thinks this is part of batch 1
  consumes them all




> Cheers,
> Babis
> 
> [1]
> https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-12 21:05       ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-13  9:32       ` Babis Chalios
  2023-09-13  9:37           ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-09-13  9:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams, bchalios

 > I do not understand why this matters though. we know there was a leak,
 > why does it matter whether there was one or two leaks?
 >
 > > In the last RFC implementing this in Linux we sent to LKML [1] we 
avoid the
 > > issue by pre-populating both
 > > queues, but that does not solve the problem if a third entropy leak 
event
 > > arrives. The probability of this
 > > happening is indeed small, but we thought of a potential solution 
to this.
 > >
 > > What if we modify the spec here to instruct the VMM to deny taking a
 > > snapshot if there are not any buffers
 > > in the active leak queue? If we did this, we could even simplify 
the spec to
 > > just introduce a single entropy
 > > leak queue, so we could avoid the complexity of switching between 
active
 > > leak queues in the driver and
 > > the device. WDYT?
 >
 > here's the problem:
 >
 > - driver adds batch 1 of buffers
 > - leak
 > - device starts using buffers from batch 1
 > - driver sees some buffers and starts adding batch 2

If understand this clause:

 > > +\item Upon detecting that buffers have been used, driver
 > > +      switches to another leak queue making it active
 > > +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
 > > +      It then starts adding buffers to the new leak queue.

correctly:

At this point, the driver will first switch active leak queue and
then add batch 2 to the new leak queue.

and due to this:

 > > +\item Device will keep using buffers in the active leak queue
 > > +      until it detects that both the current leak queue is empty 
and another
 > > +      leak queue has buffers. At that point device switches to
 > > +      another leak queue, making it active.
 > > +\item After the switch, buffers from the new leak queue are not
 > > +      used until an information leak is detected.
 > > +\end{enumerate}

the following won't happen:

 > - device sees batch 2 and thinks this is part of batch 1
 >    consumes them all

Does it make sense?

Cheers,
Babis

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-13  9:32       ` Babis Chalios
@ 2023-09-13  9:37           ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-13  9:37 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Wed, Sep 13, 2023 at 11:32:57AM +0200, Babis Chalios wrote:
> > I do not understand why this matters though. we know there was a leak,
> > why does it matter whether there was one or two leaks?
> >
> > > In the last RFC implementing this in Linux we sent to LKML [1] we avoid
> the
> > > issue by pre-populating both
> > > queues, but that does not solve the problem if a third entropy leak
> event
> > > arrives. The probability of this
> > > happening is indeed small, but we thought of a potential solution to
> this.
> > >
> > > What if we modify the spec here to instruct the VMM to deny taking a
> > > snapshot if there are not any buffers
> > > in the active leak queue? If we did this, we could even simplify the
> spec to
> > > just introduce a single entropy
> > > leak queue, so we could avoid the complexity of switching between active
> > > leak queues in the driver and
> > > the device. WDYT?
> >
> > here's the problem:
> >
> > - driver adds batch 1 of buffers
> > - leak
> > - device starts using buffers from batch 1
> > - driver sees some buffers and starts adding batch 2
> 
> If understand this clause:
> 
> > > +\item Upon detecting that buffers have been used, driver
> > > +      switches to another leak queue making it active
> > > +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> > > +      It then starts adding buffers to the new leak queue.
> 
> correctly:
> 
> At this point, the driver will first switch active leak queue and
> then add batch 2 to the new leak queue.
> 
> and due to this:
> 
> > > +\item Device will keep using buffers in the active leak queue
> > > +      until it detects that both the current leak queue is empty and
> another
> > > +      leak queue has buffers. At that point device switches to
> > > +      another leak queue, making it active.
> > > +\item After the switch, buffers from the new leak queue are not
> > > +      used until an information leak is detected.
> > > +\end{enumerate}
> 
> the following won't happen:
> 
> > - device sees batch 2 and thinks this is part of batch 1
> >    consumes them all
> 
> Does it make sense?
> 
> Cheers,
> Babis

yes, the queue switch is used as a barrier to detect a new leak event.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-13  9:37           ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-13  9:37 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Wed, Sep 13, 2023 at 11:32:57AM +0200, Babis Chalios wrote:
> > I do not understand why this matters though. we know there was a leak,
> > why does it matter whether there was one or two leaks?
> >
> > > In the last RFC implementing this in Linux we sent to LKML [1] we avoid
> the
> > > issue by pre-populating both
> > > queues, but that does not solve the problem if a third entropy leak
> event
> > > arrives. The probability of this
> > > happening is indeed small, but we thought of a potential solution to
> this.
> > >
> > > What if we modify the spec here to instruct the VMM to deny taking a
> > > snapshot if there are not any buffers
> > > in the active leak queue? If we did this, we could even simplify the
> spec to
> > > just introduce a single entropy
> > > leak queue, so we could avoid the complexity of switching between active
> > > leak queues in the driver and
> > > the device. WDYT?
> >
> > here's the problem:
> >
> > - driver adds batch 1 of buffers
> > - leak
> > - device starts using buffers from batch 1
> > - driver sees some buffers and starts adding batch 2
> 
> If understand this clause:
> 
> > > +\item Upon detecting that buffers have been used, driver
> > > +      switches to another leak queue making it active
> > > +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> > > +      It then starts adding buffers to the new leak queue.
> 
> correctly:
> 
> At this point, the driver will first switch active leak queue and
> then add batch 2 to the new leak queue.
> 
> and due to this:
> 
> > > +\item Device will keep using buffers in the active leak queue
> > > +      until it detects that both the current leak queue is empty and
> another
> > > +      leak queue has buffers. At that point device switches to
> > > +      another leak queue, making it active.
> > > +\item After the switch, buffers from the new leak queue are not
> > > +      used until an information leak is detected.
> > > +\end{enumerate}
> 
> the following won't happen:
> 
> > - device sees batch 2 and thinks this is part of batch 1
> >    consumes them all
> 
> Does it make sense?
> 
> Cheers,
> Babis

yes, the queue switch is used as a barrier to detect a new leak event.

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-13  9:37           ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-13 11:19           ` Babis Chalios
  2023-09-18 11:14             ` Babis Chalios
  2023-09-18 12:41               ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 2 replies; 53+ messages in thread
From: Babis Chalios @ 2023-09-13 11:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams



On 13/9/23 11:37, Michael S. Tsirkin wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
>
>
> On Wed, Sep 13, 2023 at 11:32:57AM +0200, Babis Chalios wrote:
>>> I do not understand why this matters though. we know there was a leak,
>>> why does it matter whether there was one or two leaks?
>>>
>>>> In the last RFC implementing this in Linux we sent to LKML [1] we avoid
>> the
>>>> issue by pre-populating both
>>>> queues, but that does not solve the problem if a third entropy leak
>> event
>>>> arrives. The probability of this
>>>> happening is indeed small, but we thought of a potential solution to
>> this.
>>>> What if we modify the spec here to instruct the VMM to deny taking a
>>>> snapshot if there are not any buffers
>>>> in the active leak queue? If we did this, we could even simplify the
>> spec to
>>>> just introduce a single entropy
>>>> leak queue, so we could avoid the complexity of switching between active
>>>> leak queues in the driver and
>>>> the device. WDYT?
>>> here's the problem:
>>>
>>> - driver adds batch 1 of buffers
>>> - leak
>>> - device starts using buffers from batch 1
>>> - driver sees some buffers and starts adding batch 2
>> If understand this clause:
>>
>>>> +\item Upon detecting that buffers have been used, driver
>>>> +      switches to another leak queue making it active
>>>> +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
>>>> +      It then starts adding buffers to the new leak queue.
>> correctly:
>>
>> At this point, the driver will first switch active leak queue and
>> then add batch 2 to the new leak queue.
>>
>> and due to this:
>>
>>>> +\item Device will keep using buffers in the active leak queue
>>>> +      until it detects that both the current leak queue is empty and
>> another
>>>> +      leak queue has buffers. At that point device switches to
>>>> +      another leak queue, making it active.
>>>> +\item After the switch, buffers from the new leak queue are not
>>>> +      used until an information leak is detected.
>>>> +\end{enumerate}
>> the following won't happen:
>>
>>> - device sees batch 2 and thinks this is part of batch 1
>>>      consumes them all
>> Does it make sense?
>>
>> Cheers,
>> Babis
> yes, the queue switch is used as a barrier to detect a new leak event.

Right, so I think that there is a race condition between the time the 
driver sees the used buffers of the first
batch and until it adds the second batch on the next leak queue.

1. driver adds batch 1
2. leak event
3. device uses batch 1
4. driver sees the used buffers and
     a. switches leak queues
     b. adds batch 2.
5. devices finds initial leak queue empty and sees buffers in second 
leak queue.

If a second leak event happens after step 3 above and before all of 
steps 4 complete then batch 2 will not
be processed as part of the second leak event.

> --
> MST
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-13 11:19           ` Babis Chalios
@ 2023-09-18 11:14             ` Babis Chalios
  2023-09-18 12:41               ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Babis Chalios @ 2023-09-18 11:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams


>> yes, the queue switch is used as a barrier to detect a new leak event.
>
> Right, so I think that there is a race condition between the time the 
> driver sees the used buffers of the first
> batch and until it adds the second batch on the next leak queue.
>
> 1. driver adds batch 1
> 2. leak event
> 3. device uses batch 1
> 4. driver sees the used buffers and
>     a. switches leak queues
>     b. adds batch 2.
> 5. devices finds initial leak queue empty and sees buffers in second 
> leak queue.
>
> If a second leak event happens after step 3 above and before all of 
> steps 4 complete then batch 2 will not
> be processed as part of the second leak event.
>

Hey Michael, any thoughts on this? I think the my Linux RFC patch[1] 
showcases the problem in the
`entropy_leak_detected` function which handles the used buffers. If on 
the VMM we receive a new leak
event before `entropy_leak_detected` runs to completion (and adds a new 
batch of buffers) the leak
event will not have any buffers to handle.

[1] 
https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/#m085769c7b9c08f4acac626e7b4ecde11af13a5be

Cheers,
Babis

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-13 11:19           ` Babis Chalios
@ 2023-09-18 12:41               ` Michael S. Tsirkin
  2023-09-18 12:41               ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-18 12:41 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Wed, Sep 13, 2023 at 01:19:49PM +0200, Babis Chalios wrote:
> 
> 
> On 13/9/23 11:37, Michael S. Tsirkin wrote:
> > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> > 
> > 
> > 
> > On Wed, Sep 13, 2023 at 11:32:57AM +0200, Babis Chalios wrote:
> > > > I do not understand why this matters though. we know there was a leak,
> > > > why does it matter whether there was one or two leaks?
> > > > 
> > > > > In the last RFC implementing this in Linux we sent to LKML [1] we avoid
> > > the
> > > > > issue by pre-populating both
> > > > > queues, but that does not solve the problem if a third entropy leak
> > > event
> > > > > arrives. The probability of this
> > > > > happening is indeed small, but we thought of a potential solution to
> > > this.
> > > > > What if we modify the spec here to instruct the VMM to deny taking a
> > > > > snapshot if there are not any buffers
> > > > > in the active leak queue? If we did this, we could even simplify the
> > > spec to
> > > > > just introduce a single entropy
> > > > > leak queue, so we could avoid the complexity of switching between active
> > > > > leak queues in the driver and
> > > > > the device. WDYT?
> > > > here's the problem:
> > > > 
> > > > - driver adds batch 1 of buffers
> > > > - leak
> > > > - device starts using buffers from batch 1
> > > > - driver sees some buffers and starts adding batch 2
> > > If understand this clause:
> > > 
> > > > > +\item Upon detecting that buffers have been used, driver
> > > > > +      switches to another leak queue making it active
> > > > > +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> > > > > +      It then starts adding buffers to the new leak queue.
> > > correctly:
> > > 
> > > At this point, the driver will first switch active leak queue and
> > > then add batch 2 to the new leak queue.
> > > 
> > > and due to this:
> > > 
> > > > > +\item Device will keep using buffers in the active leak queue
> > > > > +      until it detects that both the current leak queue is empty and
> > > another
> > > > > +      leak queue has buffers. At that point device switches to
> > > > > +      another leak queue, making it active.
> > > > > +\item After the switch, buffers from the new leak queue are not
> > > > > +      used until an information leak is detected.
> > > > > +\end{enumerate}
> > > the following won't happen:
> > > 
> > > > - device sees batch 2 and thinks this is part of batch 1
> > > >      consumes them all
> > > Does it make sense?
> > > 
> > > Cheers,
> > > Babis
> > yes, the queue switch is used as a barrier to detect a new leak event.
> 
> Right, so I think that there is a race condition between the time the driver
> sees the used buffers of the first
> batch and until it adds the second batch on the next leak queue.
> 
> 1. driver adds batch 1
> 2. leak event
> 3. device uses batch 1
> 4. driver sees the used buffers and
>     a. switches leak queues
>     b. adds batch 2.
> 5. devices finds initial leak queue empty and sees buffers in second leak
> queue.
> 
> If a second leak event happens after step 3 above and before all of steps 4
> complete then batch 2 will not
> be processed as part of the second leak event.

driver can just pre-add buffers in the second queue.

1. available buffers to queue 1-X
2. available buffers to queue X


3. poll queue X
4. used buffers in queue X
5. avail buffers in queue X
6. poll queue 1-X
7. used buffers in queue X
8. avail buffers in queue X
9. goto 3




> > --
> > MST
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-18 12:41               ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-18 12:41 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Wed, Sep 13, 2023 at 01:19:49PM +0200, Babis Chalios wrote:
> 
> 
> On 13/9/23 11:37, Michael S. Tsirkin wrote:
> > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> > 
> > 
> > 
> > On Wed, Sep 13, 2023 at 11:32:57AM +0200, Babis Chalios wrote:
> > > > I do not understand why this matters though. we know there was a leak,
> > > > why does it matter whether there was one or two leaks?
> > > > 
> > > > > In the last RFC implementing this in Linux we sent to LKML [1] we avoid
> > > the
> > > > > issue by pre-populating both
> > > > > queues, but that does not solve the problem if a third entropy leak
> > > event
> > > > > arrives. The probability of this
> > > > > happening is indeed small, but we thought of a potential solution to
> > > this.
> > > > > What if we modify the spec here to instruct the VMM to deny taking a
> > > > > snapshot if there are not any buffers
> > > > > in the active leak queue? If we did this, we could even simplify the
> > > spec to
> > > > > just introduce a single entropy
> > > > > leak queue, so we could avoid the complexity of switching between active
> > > > > leak queues in the driver and
> > > > > the device. WDYT?
> > > > here's the problem:
> > > > 
> > > > - driver adds batch 1 of buffers
> > > > - leak
> > > > - device starts using buffers from batch 1
> > > > - driver sees some buffers and starts adding batch 2
> > > If understand this clause:
> > > 
> > > > > +\item Upon detecting that buffers have been used, driver
> > > > > +      switches to another leak queue making it active
> > > > > +      (e.g. from \field{leakq1} to \field{leakq2} or vice versa).
> > > > > +      It then starts adding buffers to the new leak queue.
> > > correctly:
> > > 
> > > At this point, the driver will first switch active leak queue and
> > > then add batch 2 to the new leak queue.
> > > 
> > > and due to this:
> > > 
> > > > > +\item Device will keep using buffers in the active leak queue
> > > > > +      until it detects that both the current leak queue is empty and
> > > another
> > > > > +      leak queue has buffers. At that point device switches to
> > > > > +      another leak queue, making it active.
> > > > > +\item After the switch, buffers from the new leak queue are not
> > > > > +      used until an information leak is detected.
> > > > > +\end{enumerate}
> > > the following won't happen:
> > > 
> > > > - device sees batch 2 and thinks this is part of batch 1
> > > >      consumes them all
> > > Does it make sense?
> > > 
> > > Cheers,
> > > Babis
> > yes, the queue switch is used as a barrier to detect a new leak event.
> 
> Right, so I think that there is a race condition between the time the driver
> sees the used buffers of the first
> batch and until it adds the second batch on the next leak queue.
> 
> 1. driver adds batch 1
> 2. leak event
> 3. device uses batch 1
> 4. driver sees the used buffers and
>     a. switches leak queues
>     b. adds batch 2.
> 5. devices finds initial leak queue empty and sees buffers in second leak
> queue.
> 
> If a second leak event happens after step 3 above and before all of steps 4
> complete then batch 2 will not
> be processed as part of the second leak event.

driver can just pre-add buffers in the second queue.

1. available buffers to queue 1-X
2. available buffers to queue X


3. poll queue X
4. used buffers in queue X
5. avail buffers in queue X
6. poll queue 1-X
7. used buffers in queue X
8. avail buffers in queue X
9. goto 3




> > --
> > MST
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-18 12:41               ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-18 13:00               ` Babis Chalios
  2023-09-18 13:58                   ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-09-18 13:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams


>> Right, so I think that there is a race condition between the time the driver
>> sees the used buffers of the first
>> batch and until it adds the second batch on the next leak queue.
>>
>> 1. driver adds batch 1
>> 2. leak event
>> 3. device uses batch 1
>> 4. driver sees the used buffers and
>>      a. switches leak queues
>>      b. adds batch 2.
>> 5. devices finds initial leak queue empty and sees buffers in second leak
>> queue.
>>
>> If a second leak event happens after step 3 above and before all of steps 4
>> complete then batch 2 will not
>> be processed as part of the second leak event.
> driver can just pre-add buffers in the second queue.
>
> 1. available buffers to queue 1-X
> 2. available buffers to queue X
>
>
> 3. poll queue X
> 4. used buffers in queue X
> 5. avail buffers in queue X
> 6. poll queue 1-X
> 7. used buffers in queue X
> 8. avail buffers in queue X
> 9. goto 3
>

Yes, that's what the driver does now in the RFC patch. However, this 
just decreases
the race window, it doesn't eliminate it. If a third leak event happens 
it might not
find any buffers to use:

1. available buffers to queue 1-X
2. available buffers to queue X


3. poll queue X
4. used buffers in queue X       <- leak event 1 will use buffers in X
5. avail buffers in queue X
6. poll queue 1-X                <- leak event 2 will use buffers in 1-X
7. used buffers in queue 1-X
8. avail buffers in queue 1-X
                                  <- leak event 3 (it needs buffers in X, race with step 5)
9. goto 3


If, instead, we define a single leak queue and require that VMM should refuse to take a snapshot
if that queue is empty, we avoid the race condition in all cases and IMHO the protocol becomes
much simpler.


Cheers,
Babis



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-18 13:00               ` Babis Chalios
@ 2023-09-18 13:58                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-18 13:58 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Mon, Sep 18, 2023 at 03:00:43PM +0200, Babis Chalios wrote:
> 
> > > Right, so I think that there is a race condition between the time the driver
> > > sees the used buffers of the first
> > > batch and until it adds the second batch on the next leak queue.
> > > 
> > > 1. driver adds batch 1
> > > 2. leak event
> > > 3. device uses batch 1
> > > 4. driver sees the used buffers and
> > >      a. switches leak queues
> > >      b. adds batch 2.
> > > 5. devices finds initial leak queue empty and sees buffers in second leak
> > > queue.
> > > 
> > > If a second leak event happens after step 3 above and before all of steps 4
> > > complete then batch 2 will not
> > > be processed as part of the second leak event.
> > driver can just pre-add buffers in the second queue.
> > 
> > 1. available buffers to queue 1-X
> > 2. available buffers to queue X
> > 
> > 
> > 3. poll queue X
> > 4. used buffers in queue X
> > 5. avail buffers in queue X
> > 6. poll queue 1-X
> > 7. used buffers in queue X
> > 8. avail buffers in queue X
> > 9. goto 3
> > 
> 
> Yes, that's what the driver does now in the RFC patch. However, this just
> decreases
> the race window, it doesn't eliminate it. If a third leak event happens it
> might not
> find any buffers to use:
> 
> 1. available buffers to queue 1-X
> 2. available buffers to queue X
> 
> 
> 3. poll queue X
> 4. used buffers in queue X       <- leak event 1 will use buffers in X
> 5. avail buffers in queue X
> 6. poll queue 1-X                <- leak event 2 will use buffers in 1-X
> 7. used buffers in queue 1-X
> 8. avail buffers in queue 1-X
>                                  <- leak event 3 (it needs buffers in X, race with step 5)
> 9. goto 3


I don't get it. we added buffers in step 5.

> 
> 
> If, instead, we define a single leak queue and require that VMM should refuse to take a snapshot
> if that queue is empty, we avoid the race condition in all cases and IMHO the protocol becomes
> much simpler.
> 
> 
> Cheers,
> Babis
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-18 13:58                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-18 13:58 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Mon, Sep 18, 2023 at 03:00:43PM +0200, Babis Chalios wrote:
> 
> > > Right, so I think that there is a race condition between the time the driver
> > > sees the used buffers of the first
> > > batch and until it adds the second batch on the next leak queue.
> > > 
> > > 1. driver adds batch 1
> > > 2. leak event
> > > 3. device uses batch 1
> > > 4. driver sees the used buffers and
> > >      a. switches leak queues
> > >      b. adds batch 2.
> > > 5. devices finds initial leak queue empty and sees buffers in second leak
> > > queue.
> > > 
> > > If a second leak event happens after step 3 above and before all of steps 4
> > > complete then batch 2 will not
> > > be processed as part of the second leak event.
> > driver can just pre-add buffers in the second queue.
> > 
> > 1. available buffers to queue 1-X
> > 2. available buffers to queue X
> > 
> > 
> > 3. poll queue X
> > 4. used buffers in queue X
> > 5. avail buffers in queue X
> > 6. poll queue 1-X
> > 7. used buffers in queue X
> > 8. avail buffers in queue X
> > 9. goto 3
> > 
> 
> Yes, that's what the driver does now in the RFC patch. However, this just
> decreases
> the race window, it doesn't eliminate it. If a third leak event happens it
> might not
> find any buffers to use:
> 
> 1. available buffers to queue 1-X
> 2. available buffers to queue X
> 
> 
> 3. poll queue X
> 4. used buffers in queue X       <- leak event 1 will use buffers in X
> 5. avail buffers in queue X
> 6. poll queue 1-X                <- leak event 2 will use buffers in 1-X
> 7. used buffers in queue 1-X
> 8. avail buffers in queue 1-X
>                                  <- leak event 3 (it needs buffers in X, race with step 5)
> 9. goto 3


I don't get it. we added buffers in step 5.

> 
> 
> If, instead, we define a single leak queue and require that VMM should refuse to take a snapshot
> if that queue is empty, we avoid the race condition in all cases and IMHO the protocol becomes
> much simpler.
> 
> 
> Cheers,
> Babis
> 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-18 13:58                   ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-18 14:02                   ` Babis Chalios
  2023-09-18 14:05                       ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-09-18 14:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On 18/9/23 15:58, Michael S. Tsirkin wrote:
> On Mon, Sep 18, 2023 at 03:00:43PM +0200, Babis Chalios wrote:
>>>> Right, so I think that there is a race condition between the time the driver
>>>> sees the used buffers of the first
>>>> batch and until it adds the second batch on the next leak queue.
>>>>
>>>> 1. driver adds batch 1
>>>> 2. leak event
>>>> 3. device uses batch 1
>>>> 4. driver sees the used buffers and
>>>>       a. switches leak queues
>>>>       b. adds batch 2.
>>>> 5. devices finds initial leak queue empty and sees buffers in second leak
>>>> queue.
>>>>
>>>> If a second leak event happens after step 3 above and before all of steps 4
>>>> complete then batch 2 will not
>>>> be processed as part of the second leak event.
>>> driver can just pre-add buffers in the second queue.
>>>
>>> 1. available buffers to queue 1-X
>>> 2. available buffers to queue X
>>>
>>>
>>> 3. poll queue X
>>> 4. used buffers in queue X
>>> 5. avail buffers in queue X
>>> 6. poll queue 1-X
>>> 7. used buffers in queue X
>>> 8. avail buffers in queue X
>>> 9. goto 3
>>>
>> Yes, that's what the driver does now in the RFC patch. However, this just
>> decreases
>> the race window, it doesn't eliminate it. If a third leak event happens it
>> might not
>> find any buffers to use:
>>
>> 1. available buffers to queue 1-X
>> 2. available buffers to queue X
>>
>>
>> 3. poll queue X
>> 4. used buffers in queue X       <- leak event 1 will use buffers in X
>> 5. avail buffers in queue X
>> 6. poll queue 1-X                <- leak event 2 will use buffers in 1-X
>> 7. used buffers in queue 1-X
>> 8. avail buffers in queue 1-X
>>                                   <- leak event 3 (it needs buffers in X, race with step 5)
>> 9. goto 3
>
> I don't get it. we added buffers in step 5.

What if the leak event 3 arrives before step 5 had time to actually add 
the buffers in X and make
them visible to the device?


>
>>
>> If, instead, we define a single leak queue and require that VMM should refuse to take a snapshot
>> if that queue is empty, we avoid the race condition in all cases and IMHO the protocol becomes
>> much simpler.
>>
>>
>> Cheers,
>> Babis
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-18 14:02                   ` Babis Chalios
@ 2023-09-18 14:05                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-18 14:05 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Mon, Sep 18, 2023 at 04:02:54PM +0200, Babis Chalios wrote:
> On 18/9/23 15:58, Michael S. Tsirkin wrote:
> > On Mon, Sep 18, 2023 at 03:00:43PM +0200, Babis Chalios wrote:
> > > > > Right, so I think that there is a race condition between the time the driver
> > > > > sees the used buffers of the first
> > > > > batch and until it adds the second batch on the next leak queue.
> > > > > 
> > > > > 1. driver adds batch 1
> > > > > 2. leak event
> > > > > 3. device uses batch 1
> > > > > 4. driver sees the used buffers and
> > > > >       a. switches leak queues
> > > > >       b. adds batch 2.
> > > > > 5. devices finds initial leak queue empty and sees buffers in second leak
> > > > > queue.
> > > > > 
> > > > > If a second leak event happens after step 3 above and before all of steps 4
> > > > > complete then batch 2 will not
> > > > > be processed as part of the second leak event.
> > > > driver can just pre-add buffers in the second queue.
> > > > 
> > > > 1. available buffers to queue 1-X
> > > > 2. available buffers to queue X
> > > > 
> > > > 
> > > > 3. poll queue X
> > > > 4. used buffers in queue X
> > > > 5. avail buffers in queue X
> > > > 6. poll queue 1-X
> > > > 7. used buffers in queue X
> > > > 8. avail buffers in queue X
> > > > 9. goto 3
> > > > 
> > > Yes, that's what the driver does now in the RFC patch. However, this just
> > > decreases
> > > the race window, it doesn't eliminate it. If a third leak event happens it
> > > might not
> > > find any buffers to use:
> > > 
> > > 1. available buffers to queue 1-X
> > > 2. available buffers to queue X
> > > 
> > > 
> > > 3. poll queue X
> > > 4. used buffers in queue X       <- leak event 1 will use buffers in X
> > > 5. avail buffers in queue X
> > > 6. poll queue 1-X                <- leak event 2 will use buffers in 1-X
> > > 7. used buffers in queue 1-X
> > > 8. avail buffers in queue 1-X
> > >                                   <- leak event 3 (it needs buffers in X, race with step 5)
> > > 9. goto 3
> > 
> > I don't get it. we added buffers in step 5.
> 
> What if the leak event 3 arrives before step 5 had time to actually add the
> buffers in X and make
> them visible to the device?


Then it will see a single event in 1-X instead of two events.  A leak is
a leak though, I don't see does it matter how many triggered.


> 
> > 
> > > 
> > > If, instead, we define a single leak queue and require that VMM should refuse to take a snapshot
> > > if that queue is empty, we avoid the race condition in all cases and IMHO the protocol becomes
> > > much simpler.
> > > 
> > > 
> > > Cheers,
> > > Babis
> > > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-18 14:05                       ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-18 14:05 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Mon, Sep 18, 2023 at 04:02:54PM +0200, Babis Chalios wrote:
> On 18/9/23 15:58, Michael S. Tsirkin wrote:
> > On Mon, Sep 18, 2023 at 03:00:43PM +0200, Babis Chalios wrote:
> > > > > Right, so I think that there is a race condition between the time the driver
> > > > > sees the used buffers of the first
> > > > > batch and until it adds the second batch on the next leak queue.
> > > > > 
> > > > > 1. driver adds batch 1
> > > > > 2. leak event
> > > > > 3. device uses batch 1
> > > > > 4. driver sees the used buffers and
> > > > >       a. switches leak queues
> > > > >       b. adds batch 2.
> > > > > 5. devices finds initial leak queue empty and sees buffers in second leak
> > > > > queue.
> > > > > 
> > > > > If a second leak event happens after step 3 above and before all of steps 4
> > > > > complete then batch 2 will not
> > > > > be processed as part of the second leak event.
> > > > driver can just pre-add buffers in the second queue.
> > > > 
> > > > 1. available buffers to queue 1-X
> > > > 2. available buffers to queue X
> > > > 
> > > > 
> > > > 3. poll queue X
> > > > 4. used buffers in queue X
> > > > 5. avail buffers in queue X
> > > > 6. poll queue 1-X
> > > > 7. used buffers in queue X
> > > > 8. avail buffers in queue X
> > > > 9. goto 3
> > > > 
> > > Yes, that's what the driver does now in the RFC patch. However, this just
> > > decreases
> > > the race window, it doesn't eliminate it. If a third leak event happens it
> > > might not
> > > find any buffers to use:
> > > 
> > > 1. available buffers to queue 1-X
> > > 2. available buffers to queue X
> > > 
> > > 
> > > 3. poll queue X
> > > 4. used buffers in queue X       <- leak event 1 will use buffers in X
> > > 5. avail buffers in queue X
> > > 6. poll queue 1-X                <- leak event 2 will use buffers in 1-X
> > > 7. used buffers in queue 1-X
> > > 8. avail buffers in queue 1-X
> > >                                   <- leak event 3 (it needs buffers in X, race with step 5)
> > > 9. goto 3
> > 
> > I don't get it. we added buffers in step 5.
> 
> What if the leak event 3 arrives before step 5 had time to actually add the
> buffers in X and make
> them visible to the device?


Then it will see a single event in 1-X instead of two events.  A leak is
a leak though, I don't see does it matter how many triggered.


> 
> > 
> > > 
> > > If, instead, we define a single leak queue and require that VMM should refuse to take a snapshot
> > > if that queue is empty, we avoid the race condition in all cases and IMHO the protocol becomes
> > > much simpler.
> > > 
> > > 
> > > Cheers,
> > > Babis
> > > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > 


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-18 14:05                       ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-18 16:30                       ` Babis Chalios
  2023-09-19  7:32                         ` Babis Chalios
  -1 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-09-18 16:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

>>>> Yes, that's what the driver does now in the RFC patch. However, this just
>>>> decreases
>>>> the race window, it doesn't eliminate it. If a third leak event happens it
>>>> might not
>>>> find any buffers to use:
>>>>
>>>> 1. available buffers to queue 1-X
>>>> 2. available buffers to queue X
>>>>
>>>>
>>>> 3. poll queue X
>>>> 4. used buffers in queue X       <- leak event 1 will use buffers in X
>>>> 5. avail buffers in queue X
>>>> 6. poll queue 1-X                <- leak event 2 will use buffers in 1-X
>>>> 7. used buffers in queue 1-X
>>>> 8. avail buffers in queue 1-X
>>>>                                    <- leak event 3 (it needs buffers in X, race with step 5)
>>>> 9. goto 3
>>> I don't get it. we added buffers in step 5.
>> What if the leak event 3 arrives before step 5 had time to actually add the
>> buffers in X and make
>> them visible to the device?
>
> Then it will see a single event in 1-X instead of two events.  A leak is
> a leak though, I don't see does it matter how many triggered.
>

So the scenario I have in mind is the following:

(Epoch here is terminology that I used in the Linux RFC).

         Driver 
                                                              Device

1.     add buffers to 1-X
2.     add buffers to X
3.     poll queue X
4.     vcpu 0: cache getrandom() entropy
         and cache epoch value
5.           First snapshot: use buffers in X
6.     vcpu 1: sees used buffers
7.                      Second snapshot: use buffers in 1-X
8.     vcpu 0: getrandom() observes new
         epoch value & caches it
9.            Third snapshot: no buffers in either queue vcpu 1 (from 
step 6 has not yet finished adding new buffers).
10.   vcpu 1 adds new buffer in X
11.   vcpu 0: getrandom() will not see new
         epoch and gets stale entropy.


In this succession of events, when the third snapshot will happen, it 
won't find any buffers in either queue,
so it won't increase the RNG epoch value. So, any entropy gathered after 
step 8 will be the same across all
snapshots. Am I missing something?

Cheers,
Babis



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-18 16:30                       ` Babis Chalios
@ 2023-09-19  7:32                         ` Babis Chalios
  2023-09-19 10:01                             ` [virtio-comment] " Michael S. Tsirkin
  0 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-09-19  7:32 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

Resending to fix e-mail formatting issues (sorry for the spam)

On 18/9/23 18:30, Babis Chalios wrote:
>>>>> Yes, that's what the driver does now in the RFC patch. However, 
>>>>> this just
>>>>> decreases
>>>>> the race window, it doesn't eliminate it. If a third leak event 
>>>>> happens it
>>>>> might not
>>>>> find any buffers to use:
>>>>>
>>>>> 1. available buffers to queue 1-X
>>>>> 2. available buffers to queue X
>>>>>
>>>>>
>>>>> 3. poll queue X
>>>>> 4. used buffers in queue X       <- leak event 1 will use buffers 
>>>>> in X
>>>>> 5. avail buffers in queue X
>>>>> 6. poll queue 1-X                <- leak event 2 will use buffers 
>>>>> in 1-X
>>>>> 7. used buffers in queue 1-X
>>>>> 8. avail buffers in queue 1-X
>>>>>                                    <- leak event 3 (it needs 
>>>>> buffers in X, race with step 5)
>>>>> 9. goto 3
>>>> I don't get it. we added buffers in step 5.
>>> What if the leak event 3 arrives before step 5 had time to actually 
>>> add the
>>> buffers in X and make
>>> them visible to the device?
>>
>> Then it will see a single event in 1-X instead of two events.  A leak is
>> a leak though, I don't see does it matter how many triggered.
>>


So the scenario I have in mind is the following:

(Epoch here is terminology that I used in the Linux RFC. It is a value 
maintained by random.c
that changes every time a leak event happens).

1. add buffers to 1-X
2. add buffers to X
3. poll queue X
4. vcpu 0: get getrandom() entropy and cache epoch value
5. Device: First snapshot, uses buffers in X
6. vcpu 1: sees used buffers
7. Device: Second snapshot, uses buffers in 1-X
8. vcpu 0: getrandom() observes new  epoch value & caches it
9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 
6 has not yet finished adding new buffers).
10. vcpu 1 adds new buffer in X
11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.


In this succession of events, when the third snapshot will happen, the 
device won't find
any buffers in either queue, so it won't increase the RNG epoch value. 
So, any entropy
gathered after step 8 will be the same across all snapshots. Am I 
missing something?

Cheers,
Babis




---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-19  7:32                         ` Babis Chalios
@ 2023-09-19 10:01                             ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-19 10:01 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> Resending to fix e-mail formatting issues (sorry for the spam)
> 
> On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > However, this just
> > > > > > decreases
> > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > leak event happens it
> > > > > > might not
> > > > > > find any buffers to use:
> > > > > > 
> > > > > > 1. available buffers to queue 1-X
> > > > > > 2. available buffers to queue X
> > > > > > 
> > > > > > 
> > > > > > 3. poll queue X
> > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > use buffers in X
> > > > > > 5. avail buffers in queue X
> > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > use buffers in 1-X
> > > > > > 7. used buffers in queue 1-X
> > > > > > 8. avail buffers in queue 1-X
> > > > > >                                    <- leak event 3 (it
> > > > > > needs buffers in X, race with step 5)
> > > > > > 9. goto 3
> > > > > I don't get it. we added buffers in step 5.
> > > > What if the leak event 3 arrives before step 5 had time to
> > > > actually add the
> > > > buffers in X and make
> > > > them visible to the device?
> > > 
> > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > a leak though, I don't see does it matter how many triggered.
> > > 
> 
> 
> So the scenario I have in mind is the following:
> 
> (Epoch here is terminology that I used in the Linux RFC. It is a value
> maintained by random.c
> that changes every time a leak event happens).
> 
> 1. add buffers to 1-X
> 2. add buffers to X
> 3. poll queue X
> 4. vcpu 0: get getrandom() entropy and cache epoch value
> 5. Device: First snapshot, uses buffers in X
> 6. vcpu 1: sees used buffers
> 7. Device: Second snapshot, uses buffers in 1-X
> 8. vcpu 0: getrandom() observes new  epoch value & caches it
> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> has not yet finished adding new buffers).
> 10. vcpu 1 adds new buffer in X
> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> 
> 
> In this succession of events, when the third snapshot will happen, the
> device won't find
> any buffers in either queue, so it won't increase the RNG epoch value. So,
> any entropy
> gathered after step 8 will be the same across all snapshots. Am I missing
> something?
> 
> Cheers,
> Babis
> 

Yes but notice how this is followed by:

12. vcpu 1: sees used buffers in 1-X

Driver can notify getrandom I guess?

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-19 10:01                             ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-19 10:01 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> Resending to fix e-mail formatting issues (sorry for the spam)
> 
> On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > However, this just
> > > > > > decreases
> > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > leak event happens it
> > > > > > might not
> > > > > > find any buffers to use:
> > > > > > 
> > > > > > 1. available buffers to queue 1-X
> > > > > > 2. available buffers to queue X
> > > > > > 
> > > > > > 
> > > > > > 3. poll queue X
> > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > use buffers in X
> > > > > > 5. avail buffers in queue X
> > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > use buffers in 1-X
> > > > > > 7. used buffers in queue 1-X
> > > > > > 8. avail buffers in queue 1-X
> > > > > >                                    <- leak event 3 (it
> > > > > > needs buffers in X, race with step 5)
> > > > > > 9. goto 3
> > > > > I don't get it. we added buffers in step 5.
> > > > What if the leak event 3 arrives before step 5 had time to
> > > > actually add the
> > > > buffers in X and make
> > > > them visible to the device?
> > > 
> > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > a leak though, I don't see does it matter how many triggered.
> > > 
> 
> 
> So the scenario I have in mind is the following:
> 
> (Epoch here is terminology that I used in the Linux RFC. It is a value
> maintained by random.c
> that changes every time a leak event happens).
> 
> 1. add buffers to 1-X
> 2. add buffers to X
> 3. poll queue X
> 4. vcpu 0: get getrandom() entropy and cache epoch value
> 5. Device: First snapshot, uses buffers in X
> 6. vcpu 1: sees used buffers
> 7. Device: Second snapshot, uses buffers in 1-X
> 8. vcpu 0: getrandom() observes new  epoch value & caches it
> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> has not yet finished adding new buffers).
> 10. vcpu 1 adds new buffer in X
> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> 
> 
> In this succession of events, when the third snapshot will happen, the
> device won't find
> any buffers in either queue, so it won't increase the RNG epoch value. So,
> any entropy
> gathered after step 8 will be the same across all snapshots. Am I missing
> something?
> 
> Cheers,
> Babis
> 

Yes but notice how this is followed by:

12. vcpu 1: sees used buffers in 1-X

Driver can notify getrandom I guess?

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-19 10:01                             ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-19 10:11                             ` Babis Chalios
  2023-09-22 12:30                               ` Babis Chalios
  2023-09-22 15:06                                 ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 2 replies; 53+ messages in thread
From: Babis Chalios @ 2023-09-19 10:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On 19/9/23 12:01, Michael S. Tsirkin wrote:
> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>> Resending to fix e-mail formatting issues (sorry for the spam)
>>
>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>> However, this just
>>>>>>> decreases
>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>> leak event happens it
>>>>>>> might not
>>>>>>> find any buffers to use:
>>>>>>>
>>>>>>> 1. available buffers to queue 1-X
>>>>>>> 2. available buffers to queue X
>>>>>>>
>>>>>>>
>>>>>>> 3. poll queue X
>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>> use buffers in X
>>>>>>> 5. avail buffers in queue X
>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>> use buffers in 1-X
>>>>>>> 7. used buffers in queue 1-X
>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>                                     <- leak event 3 (it
>>>>>>> needs buffers in X, race with step 5)
>>>>>>> 9. goto 3
>>>>>> I don't get it. we added buffers in step 5.
>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>> actually add the
>>>>> buffers in X and make
>>>>> them visible to the device?
>>>> Then it will see a single event in 1-X instead of two events.  A leak is
>>>> a leak though, I don't see does it matter how many triggered.
>>>>
>>
>> So the scenario I have in mind is the following:
>>
>> (Epoch here is terminology that I used in the Linux RFC. It is a value
>> maintained by random.c
>> that changes every time a leak event happens).
>>
>> 1. add buffers to 1-X
>> 2. add buffers to X
>> 3. poll queue X
>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>> 5. Device: First snapshot, uses buffers in X
>> 6. vcpu 1: sees used buffers
>> 7. Device: Second snapshot, uses buffers in 1-X
>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
>> has not yet finished adding new buffers).
>> 10. vcpu 1 adds new buffer in X
>> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
>>
>>
>> In this succession of events, when the third snapshot will happen, the
>> device won't find
>> any buffers in either queue, so it won't increase the RNG epoch value. So,
>> any entropy
>> gathered after step 8 will be the same across all snapshots. Am I missing
>> something?
>>
>> Cheers,
>> Babis
>>
> Yes but notice how this is followed by:
>
> 12. vcpu 1: sees used buffers in 1-X
>
> Driver can notify getrandom I guess?

It could, but then we have the exact race condition that VMGENID had,
userspace has already consumed stale entropy and there's nothing we
can do about that.

Although this is indeed a corner case, it feels like it beats the purpose
of having the hardware update directly userspace (via copy on leak).

How do you feel about the proposal a couple of emails back? It looks to
me that it avoids completely the race condition.

Cheers,
Babis

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-19 10:11                             ` Babis Chalios
@ 2023-09-22 12:30                               ` Babis Chalios
  2023-09-22 15:06                                 ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Babis Chalios @ 2023-09-22 12:30 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

Hi Michael,

On 19/9/23 12:11, Babis Chalios wrote:
> On 19/9/23 12:01, Michael S. Tsirkin wrote:
>> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>>> Resending to fix e-mail formatting issues (sorry for the spam)
>>>
>>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>>> However, this just
>>>>>>>> decreases
>>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>>> leak event happens it
>>>>>>>> might not
>>>>>>>> find any buffers to use:
>>>>>>>>
>>>>>>>> 1. available buffers to queue 1-X
>>>>>>>> 2. available buffers to queue X
>>>>>>>>
>>>>>>>>
>>>>>>>> 3. poll queue X
>>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>>> use buffers in X
>>>>>>>> 5. avail buffers in queue X
>>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>>> use buffers in 1-X
>>>>>>>> 7. used buffers in queue 1-X
>>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>>                                     <- leak event 3 (it
>>>>>>>> needs buffers in X, race with step 5)
>>>>>>>> 9. goto 3
>>>>>>> I don't get it. we added buffers in step 5.
>>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>>> actually add the
>>>>>> buffers in X and make
>>>>>> them visible to the device?
>>>>> Then it will see a single event in 1-X instead of two events.  A 
>>>>> leak is
>>>>> a leak though, I don't see does it matter how many triggered.
>>>>>
>>>
>>> So the scenario I have in mind is the following:
>>>
>>> (Epoch here is terminology that I used in the Linux RFC. It is a value
>>> maintained by random.c
>>> that changes every time a leak event happens).
>>>
>>> 1. add buffers to 1-X
>>> 2. add buffers to X
>>> 3. poll queue X
>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>> 5. Device: First snapshot, uses buffers in X
>>> 6. vcpu 1: sees used buffers
>>> 7. Device: Second snapshot, uses buffers in 1-X
>>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from 
>>> step 6
>>> has not yet finished adding new buffers).
>>> 10. vcpu 1 adds new buffer in X
>>> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
>>>
>>>
>>> In this succession of events, when the third snapshot will happen, the
>>> device won't find
>>> any buffers in either queue, so it won't increase the RNG epoch 
>>> value. So,
>>> any entropy
>>> gathered after step 8 will be the same across all snapshots. Am I 
>>> missing
>>> something?
>>>
>>> Cheers,
>>> Babis
>>>
>> Yes but notice how this is followed by:
>>
>> 12. vcpu 1: sees used buffers in 1-X
>>
>> Driver can notify getrandom I guess?
>
> It could, but then we have the exact race condition that VMGENID had,
> userspace has already consumed stale entropy and there's nothing we
> can do about that.
>
> Although this is indeed a corner case, it feels like it beats the purpose
> of having the hardware update directly userspace (via copy on leak).
>
> How do you feel about the proposal a couple of emails back? It looks to
> me that it avoids completely the race condition.

Any thoughts on this? Sorry for pushing. I want to finalize the details 
on this,
so I can close open fronts on the LKML patch.

Cheers,
Babis

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-19 10:11                             ` Babis Chalios
@ 2023-09-22 15:06                                 ` Michael S. Tsirkin
  2023-09-22 15:06                                 ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-22 15:06 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > 
> > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > However, this just
> > > > > > > > decreases
> > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > leak event happens it
> > > > > > > > might not
> > > > > > > > find any buffers to use:
> > > > > > > > 
> > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > 2. available buffers to queue X
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 3. poll queue X
> > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > use buffers in X
> > > > > > > > 5. avail buffers in queue X
> > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > use buffers in 1-X
> > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > >                                     <- leak event 3 (it
> > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > 9. goto 3
> > > > > > > I don't get it. we added buffers in step 5.
> > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > actually add the
> > > > > > buffers in X and make
> > > > > > them visible to the device?
> > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > a leak though, I don't see does it matter how many triggered.
> > > > > 
> > > 
> > > So the scenario I have in mind is the following:
> > > 
> > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > maintained by random.c
> > > that changes every time a leak event happens).
> > > 
> > > 1. add buffers to 1-X
> > > 2. add buffers to X
> > > 3. poll queue X
> > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > 5. Device: First snapshot, uses buffers in X
> > > 6. vcpu 1: sees used buffers
> > > 7. Device: Second snapshot, uses buffers in 1-X
> > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > has not yet finished adding new buffers).
> > > 10. vcpu 1 adds new buffer in X
> > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > 
> > > 
> > > In this succession of events, when the third snapshot will happen, the
> > > device won't find
> > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > any entropy
> > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > something?
> > > 
> > > Cheers,
> > > Babis
> > > 
> > Yes but notice how this is followed by:
> > 
> > 12. vcpu 1: sees used buffers in 1-X
> > 
> > Driver can notify getrandom I guess?
> 
> It could, but then we have the exact race condition that VMGENID had,
> userspace has already consumed stale entropy and there's nothing we
> can do about that.
> 
> Although this is indeed a corner case, it feels like it beats the purpose
> of having the hardware update directly userspace (via copy on leak).
> 
> How do you feel about the proposal a couple of emails back? It looks to
> me that it avoids completely the race condition.
> 
> Cheers,
> Babis

It does. The problem of course is that this means that e.g.
taking a snapshot of a guest that is stuck won't work well.
I have been thinking of adding MAP/UNMAP descriptors for
a while now. Thus it will be possible to modify
userspace memory without consuming buffers.
Would something like this solve the problem?



> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-22 15:06                                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-22 15:06 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > 
> > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > However, this just
> > > > > > > > decreases
> > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > leak event happens it
> > > > > > > > might not
> > > > > > > > find any buffers to use:
> > > > > > > > 
> > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > 2. available buffers to queue X
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 3. poll queue X
> > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > use buffers in X
> > > > > > > > 5. avail buffers in queue X
> > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > use buffers in 1-X
> > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > >                                     <- leak event 3 (it
> > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > 9. goto 3
> > > > > > > I don't get it. we added buffers in step 5.
> > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > actually add the
> > > > > > buffers in X and make
> > > > > > them visible to the device?
> > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > a leak though, I don't see does it matter how many triggered.
> > > > > 
> > > 
> > > So the scenario I have in mind is the following:
> > > 
> > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > maintained by random.c
> > > that changes every time a leak event happens).
> > > 
> > > 1. add buffers to 1-X
> > > 2. add buffers to X
> > > 3. poll queue X
> > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > 5. Device: First snapshot, uses buffers in X
> > > 6. vcpu 1: sees used buffers
> > > 7. Device: Second snapshot, uses buffers in 1-X
> > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > has not yet finished adding new buffers).
> > > 10. vcpu 1 adds new buffer in X
> > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > 
> > > 
> > > In this succession of events, when the third snapshot will happen, the
> > > device won't find
> > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > any entropy
> > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > something?
> > > 
> > > Cheers,
> > > Babis
> > > 
> > Yes but notice how this is followed by:
> > 
> > 12. vcpu 1: sees used buffers in 1-X
> > 
> > Driver can notify getrandom I guess?
> 
> It could, but then we have the exact race condition that VMGENID had,
> userspace has already consumed stale entropy and there's nothing we
> can do about that.
> 
> Although this is indeed a corner case, it feels like it beats the purpose
> of having the hardware update directly userspace (via copy on leak).
> 
> How do you feel about the proposal a couple of emails back? It looks to
> me that it avoids completely the race condition.
> 
> Cheers,
> Babis

It does. The problem of course is that this means that e.g.
taking a snapshot of a guest that is stuck won't work well.
I have been thinking of adding MAP/UNMAP descriptors for
a while now. Thus it will be possible to modify
userspace memory without consuming buffers.
Would something like this solve the problem?



> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-22 15:06                                 ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-22 15:40                                 ` Babis Chalios
  2023-09-22 16:01                                     ` [virtio-comment] " Michael S. Tsirkin
  2023-11-02 11:25                                     ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 2 replies; 53+ messages in thread
From: Babis Chalios @ 2023-09-22 15:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams



On 22/9/23 17:06, Michael S. Tsirkin wrote:
> On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
>> On 19/9/23 12:01, Michael S. Tsirkin wrote:
>>> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>>>> Resending to fix e-mail formatting issues (sorry for the spam)
>>>>
>>>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>>>> However, this just
>>>>>>>>> decreases
>>>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>>>> leak event happens it
>>>>>>>>> might not
>>>>>>>>> find any buffers to use:
>>>>>>>>>
>>>>>>>>> 1. available buffers to queue 1-X
>>>>>>>>> 2. available buffers to queue X
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 3. poll queue X
>>>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>>>> use buffers in X
>>>>>>>>> 5. avail buffers in queue X
>>>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>>>> use buffers in 1-X
>>>>>>>>> 7. used buffers in queue 1-X
>>>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>>>                                      <- leak event 3 (it
>>>>>>>>> needs buffers in X, race with step 5)
>>>>>>>>> 9. goto 3
>>>>>>>> I don't get it. we added buffers in step 5.
>>>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>>>> actually add the
>>>>>>> buffers in X and make
>>>>>>> them visible to the device?
>>>>>> Then it will see a single event in 1-X instead of two events.  A leak is
>>>>>> a leak though, I don't see does it matter how many triggered.
>>>>>>
>>>> So the scenario I have in mind is the following:
>>>>
>>>> (Epoch here is terminology that I used in the Linux RFC. It is a value
>>>> maintained by random.c
>>>> that changes every time a leak event happens).
>>>>
>>>> 1. add buffers to 1-X
>>>> 2. add buffers to X
>>>> 3. poll queue X
>>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>>> 5. Device: First snapshot, uses buffers in X
>>>> 6. vcpu 1: sees used buffers
>>>> 7. Device: Second snapshot, uses buffers in 1-X
>>>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>>>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
>>>> has not yet finished adding new buffers).
>>>> 10. vcpu 1 adds new buffer in X
>>>> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
>>>>
>>>>
>>>> In this succession of events, when the third snapshot will happen, the
>>>> device won't find
>>>> any buffers in either queue, so it won't increase the RNG epoch value. So,
>>>> any entropy
>>>> gathered after step 8 will be the same across all snapshots. Am I missing
>>>> something?
>>>>
>>>> Cheers,
>>>> Babis
>>>>
>>> Yes but notice how this is followed by:
>>>
>>> 12. vcpu 1: sees used buffers in 1-X
>>>
>>> Driver can notify getrandom I guess?
>> It could, but then we have the exact race condition that VMGENID had,
>> userspace has already consumed stale entropy and there's nothing we
>> can do about that.
>>
>> Although this is indeed a corner case, it feels like it beats the purpose
>> of having the hardware update directly userspace (via copy on leak).
>>
>> How do you feel about the proposal a couple of emails back? It looks to
>> me that it avoids completely the race condition.
>>
>> Cheers,
>> Babis
> It does. The problem of course is that this means that e.g.
> taking a snapshot of a guest that is stuck won't work well.

That is true, but does it matter? The intention of the proposal
is that if it is not safe to take snapshots (i.e. no buffers in the
queue) don't take snapshots.

> I have been thinking of adding MAP/UNMAP descriptors for
> a while now. Thus it will be possible to modify
> userspace memory without consuming buffers.
> Would something like this solve the problem?

I am not familiar with MAP/UNMAP descriptors. Is there
a link where I can read about them?

Cheers,
Babis

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-22 15:40                                 ` Babis Chalios
@ 2023-09-22 16:01                                     ` Michael S. Tsirkin
  2023-11-02 11:25                                     ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-22 16:01 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> 
> 
> On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > 
> > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > However, this just
> > > > > > > > > > decreases
> > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > leak event happens it
> > > > > > > > > > might not
> > > > > > > > > > find any buffers to use:
> > > > > > > > > > 
> > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 3. poll queue X
> > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > use buffers in X
> > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > use buffers in 1-X
> > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > >                                      <- leak event 3 (it
> > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > 9. goto 3
> > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > actually add the
> > > > > > > > buffers in X and make
> > > > > > > > them visible to the device?
> > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > 
> > > > > So the scenario I have in mind is the following:
> > > > > 
> > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > maintained by random.c
> > > > > that changes every time a leak event happens).
> > > > > 
> > > > > 1. add buffers to 1-X
> > > > > 2. add buffers to X
> > > > > 3. poll queue X
> > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > 5. Device: First snapshot, uses buffers in X
> > > > > 6. vcpu 1: sees used buffers
> > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > has not yet finished adding new buffers).
> > > > > 10. vcpu 1 adds new buffer in X
> > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > 
> > > > > 
> > > > > In this succession of events, when the third snapshot will happen, the
> > > > > device won't find
> > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > any entropy
> > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > something?
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > > 
> > > > Yes but notice how this is followed by:
> > > > 
> > > > 12. vcpu 1: sees used buffers in 1-X
> > > > 
> > > > Driver can notify getrandom I guess?
> > > It could, but then we have the exact race condition that VMGENID had,
> > > userspace has already consumed stale entropy and there's nothing we
> > > can do about that.
> > > 
> > > Although this is indeed a corner case, it feels like it beats the purpose
> > > of having the hardware update directly userspace (via copy on leak).
> > > 
> > > How do you feel about the proposal a couple of emails back? It looks to
> > > me that it avoids completely the race condition.
> > > 
> > > Cheers,
> > > Babis
> > It does. The problem of course is that this means that e.g.
> > taking a snapshot of a guest that is stuck won't work well.
> 
> That is true, but does it matter? The intention of the proposal
> is that if it is not safe to take snapshots (i.e. no buffers in the
> queue) don't take snapshots.
> 
> > I have been thinking of adding MAP/UNMAP descriptors for
> > a while now. Thus it will be possible to modify
> > userspace memory without consuming buffers.
> > Would something like this solve the problem?
> 
> I am not familiar with MAP/UNMAP descriptors. Is there
> a link where I can read about them?
> 
> Cheers,
> Babis


Heh no I just came up with the name. Will write up in a couple
of days, but the idea is that driver does get_user_pages,
adds buffer to queue, and device will remember the address
and change that memory on a snapshot. If there are buffers
in the queue it will also use these to tell driver,
but if there are no buffers then it won't.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-22 16:01                                     ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-22 16:01 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> 
> 
> On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > 
> > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > However, this just
> > > > > > > > > > decreases
> > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > leak event happens it
> > > > > > > > > > might not
> > > > > > > > > > find any buffers to use:
> > > > > > > > > > 
> > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 3. poll queue X
> > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > use buffers in X
> > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > use buffers in 1-X
> > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > >                                      <- leak event 3 (it
> > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > 9. goto 3
> > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > actually add the
> > > > > > > > buffers in X and make
> > > > > > > > them visible to the device?
> > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > 
> > > > > So the scenario I have in mind is the following:
> > > > > 
> > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > maintained by random.c
> > > > > that changes every time a leak event happens).
> > > > > 
> > > > > 1. add buffers to 1-X
> > > > > 2. add buffers to X
> > > > > 3. poll queue X
> > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > 5. Device: First snapshot, uses buffers in X
> > > > > 6. vcpu 1: sees used buffers
> > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > has not yet finished adding new buffers).
> > > > > 10. vcpu 1 adds new buffer in X
> > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > 
> > > > > 
> > > > > In this succession of events, when the third snapshot will happen, the
> > > > > device won't find
> > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > any entropy
> > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > something?
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > > 
> > > > Yes but notice how this is followed by:
> > > > 
> > > > 12. vcpu 1: sees used buffers in 1-X
> > > > 
> > > > Driver can notify getrandom I guess?
> > > It could, but then we have the exact race condition that VMGENID had,
> > > userspace has already consumed stale entropy and there's nothing we
> > > can do about that.
> > > 
> > > Although this is indeed a corner case, it feels like it beats the purpose
> > > of having the hardware update directly userspace (via copy on leak).
> > > 
> > > How do you feel about the proposal a couple of emails back? It looks to
> > > me that it avoids completely the race condition.
> > > 
> > > Cheers,
> > > Babis
> > It does. The problem of course is that this means that e.g.
> > taking a snapshot of a guest that is stuck won't work well.
> 
> That is true, but does it matter? The intention of the proposal
> is that if it is not safe to take snapshots (i.e. no buffers in the
> queue) don't take snapshots.
> 
> > I have been thinking of adding MAP/UNMAP descriptors for
> > a while now. Thus it will be possible to modify
> > userspace memory without consuming buffers.
> > Would something like this solve the problem?
> 
> I am not familiar with MAP/UNMAP descriptors. Is there
> a link where I can read about them?
> 
> Cheers,
> Babis


Heh no I just came up with the name. Will write up in a couple
of days, but the idea is that driver does get_user_pages,
adds buffer to queue, and device will remember the address
and change that memory on a snapshot. If there are buffers
in the queue it will also use these to tell driver,
but if there are no buffers then it won't.

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-22 16:01                                     ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-27 10:43                                     ` Babis Chalios
  2023-09-27 21:47                                         ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-09-27 10:43 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On 22/9/23 18:01, Michael S. Tsirkin wrote:
> On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
>>
>> On 22/9/23 17:06, Michael S. Tsirkin wrote:
>>> On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
>>>> On 19/9/23 12:01, Michael S. Tsirkin wrote:
>>>>> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>>>>>> Resending to fix e-mail formatting issues (sorry for the spam)
>>>>>>
>>>>>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>>>>>> However, this just
>>>>>>>>>>> decreases
>>>>>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>>>>>> leak event happens it
>>>>>>>>>>> might not
>>>>>>>>>>> find any buffers to use:
>>>>>>>>>>>
>>>>>>>>>>> 1. available buffers to queue 1-X
>>>>>>>>>>> 2. available buffers to queue X
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 3. poll queue X
>>>>>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>>>>>> use buffers in X
>>>>>>>>>>> 5. avail buffers in queue X
>>>>>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>>>>>> use buffers in 1-X
>>>>>>>>>>> 7. used buffers in queue 1-X
>>>>>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>>>>>                                       <- leak event 3 (it
>>>>>>>>>>> needs buffers in X, race with step 5)
>>>>>>>>>>> 9. goto 3
>>>>>>>>>> I don't get it. we added buffers in step 5.
>>>>>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>>>>>> actually add the
>>>>>>>>> buffers in X and make
>>>>>>>>> them visible to the device?
>>>>>>>> Then it will see a single event in 1-X instead of two events.  A leak is
>>>>>>>> a leak though, I don't see does it matter how many triggered.
>>>>>>>>
>>>>>> So the scenario I have in mind is the following:
>>>>>>
>>>>>> (Epoch here is terminology that I used in the Linux RFC. It is a value
>>>>>> maintained by random.c
>>>>>> that changes every time a leak event happens).
>>>>>>
>>>>>> 1. add buffers to 1-X
>>>>>> 2. add buffers to X
>>>>>> 3. poll queue X
>>>>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>>>>> 5. Device: First snapshot, uses buffers in X
>>>>>> 6. vcpu 1: sees used buffers
>>>>>> 7. Device: Second snapshot, uses buffers in 1-X
>>>>>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>>>>>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
>>>>>> has not yet finished adding new buffers).
>>>>>> 10. vcpu 1 adds new buffer in X
>>>>>> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
>>>>>>
>>>>>>
>>>>>> In this succession of events, when the third snapshot will happen, the
>>>>>> device won't find
>>>>>> any buffers in either queue, so it won't increase the RNG epoch value. So,
>>>>>> any entropy
>>>>>> gathered after step 8 will be the same across all snapshots. Am I missing
>>>>>> something?
>>>>>>
>>>>>> Cheers,
>>>>>> Babis
>>>>>>
>>>>> Yes but notice how this is followed by:
>>>>>
>>>>> 12. vcpu 1: sees used buffers in 1-X
>>>>>
>>>>> Driver can notify getrandom I guess?
>>>> It could, but then we have the exact race condition that VMGENID had,
>>>> userspace has already consumed stale entropy and there's nothing we
>>>> can do about that.
>>>>
>>>> Although this is indeed a corner case, it feels like it beats the purpose
>>>> of having the hardware update directly userspace (via copy on leak).
>>>>
>>>> How do you feel about the proposal a couple of emails back? It looks to
>>>> me that it avoids completely the race condition.
>>>>
>>>> Cheers,
>>>> Babis
>>> It does. The problem of course is that this means that e.g.
>>> taking a snapshot of a guest that is stuck won't work well.
>> That is true, but does it matter? The intention of the proposal
>> is that if it is not safe to take snapshots (i.e. no buffers in the
>> queue) don't take snapshots.
>>
>>> I have been thinking of adding MAP/UNMAP descriptors for
>>> a while now. Thus it will be possible to modify
>>> userspace memory without consuming buffers.
>>> Would something like this solve the problem?
>> I am not familiar with MAP/UNMAP descriptors. Is there
>> a link where I can read about them?
>>
>> Cheers,
>> Babis
>
> Heh no I just came up with the name. Will write up in a couple
> of days, but the idea is that driver does get_user_pages,
> adds buffer to queue, and device will remember the address
> and change that memory on a snapshot. If there are buffers
> in the queue it will also use these to tell driver,
> but if there are no buffers then it won't.

That sounds like a nice mechanism. However in our case the page
holding the counter that gets increased by the hardware is a kernel
page.

The reason for that is that things other than us (virtio-rng) might
want to notify for leak events. For example, I think that Jason
intended to use this mechanism to periodically notify user-space
PRNGs that they need to reseed.

Cheers,
Babis

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-27 10:43                                     ` Babis Chalios
@ 2023-09-27 21:47                                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-27 21:47 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
> On 22/9/23 18:01, Michael S. Tsirkin wrote:
> > On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> > > 
> > > On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > > > 
> > > > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > > > However, this just
> > > > > > > > > > > > decreases
> > > > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > > > leak event happens it
> > > > > > > > > > > > might not
> > > > > > > > > > > > find any buffers to use:
> > > > > > > > > > > > 
> > > > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > > > use buffers in X
> > > > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > > > use buffers in 1-X
> > > > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > > > >                                       <- leak event 3 (it
> > > > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > > > 9. goto 3
> > > > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > > > actually add the
> > > > > > > > > > buffers in X and make
> > > > > > > > > > them visible to the device?
> > > > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > > > 
> > > > > > > So the scenario I have in mind is the following:
> > > > > > > 
> > > > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > > > maintained by random.c
> > > > > > > that changes every time a leak event happens).
> > > > > > > 
> > > > > > > 1. add buffers to 1-X
> > > > > > > 2. add buffers to X
> > > > > > > 3. poll queue X
> > > > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > > > 5. Device: First snapshot, uses buffers in X
> > > > > > > 6. vcpu 1: sees used buffers
> > > > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > > > has not yet finished adding new buffers).
> > > > > > > 10. vcpu 1 adds new buffer in X
> > > > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > > > 
> > > > > > > 
> > > > > > > In this succession of events, when the third snapshot will happen, the
> > > > > > > device won't find
> > > > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > > > any entropy
> > > > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > > > something?
> > > > > > > 
> > > > > > > Cheers,
> > > > > > > Babis
> > > > > > > 
> > > > > > Yes but notice how this is followed by:
> > > > > > 
> > > > > > 12. vcpu 1: sees used buffers in 1-X
> > > > > > 
> > > > > > Driver can notify getrandom I guess?
> > > > > It could, but then we have the exact race condition that VMGENID had,
> > > > > userspace has already consumed stale entropy and there's nothing we
> > > > > can do about that.
> > > > > 
> > > > > Although this is indeed a corner case, it feels like it beats the purpose
> > > > > of having the hardware update directly userspace (via copy on leak).
> > > > > 
> > > > > How do you feel about the proposal a couple of emails back? It looks to
> > > > > me that it avoids completely the race condition.
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > It does. The problem of course is that this means that e.g.
> > > > taking a snapshot of a guest that is stuck won't work well.
> > > That is true, but does it matter? The intention of the proposal
> > > is that if it is not safe to take snapshots (i.e. no buffers in the
> > > queue) don't take snapshots.
> > > 
> > > > I have been thinking of adding MAP/UNMAP descriptors for
> > > > a while now. Thus it will be possible to modify
> > > > userspace memory without consuming buffers.
> > > > Would something like this solve the problem?
> > > I am not familiar with MAP/UNMAP descriptors. Is there
> > > a link where I can read about them?
> > > 
> > > Cheers,
> > > Babis
> > 
> > Heh no I just came up with the name. Will write up in a couple
> > of days, but the idea is that driver does get_user_pages,
> > adds buffer to queue, and device will remember the address
> > and change that memory on a snapshot. If there are buffers
> > in the queue it will also use these to tell driver,
> > but if there are no buffers then it won't.
> 
> That sounds like a nice mechanism. However in our case the page
> holding the counter that gets increased by the hardware is a kernel
> page.
> 
> The reason for that is that things other than us (virtio-rng) might
> want to notify for leak events. For example, I think that Jason
> intended to use this mechanism to periodically notify user-space
> PRNGs that they need to reseed.
> 
> Cheers,
> Babis


Now I'm lost.
when you write, e.g.:
4. vcpu 0: get getrandom() entropy and cache epoch value
how does vcpu access the epoch?

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-09-27 21:47                                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-09-27 21:47 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
> On 22/9/23 18:01, Michael S. Tsirkin wrote:
> > On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> > > 
> > > On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > > > 
> > > > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > > > However, this just
> > > > > > > > > > > > decreases
> > > > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > > > leak event happens it
> > > > > > > > > > > > might not
> > > > > > > > > > > > find any buffers to use:
> > > > > > > > > > > > 
> > > > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > > > use buffers in X
> > > > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > > > use buffers in 1-X
> > > > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > > > >                                       <- leak event 3 (it
> > > > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > > > 9. goto 3
> > > > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > > > actually add the
> > > > > > > > > > buffers in X and make
> > > > > > > > > > them visible to the device?
> > > > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > > > 
> > > > > > > So the scenario I have in mind is the following:
> > > > > > > 
> > > > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > > > maintained by random.c
> > > > > > > that changes every time a leak event happens).
> > > > > > > 
> > > > > > > 1. add buffers to 1-X
> > > > > > > 2. add buffers to X
> > > > > > > 3. poll queue X
> > > > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > > > 5. Device: First snapshot, uses buffers in X
> > > > > > > 6. vcpu 1: sees used buffers
> > > > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > > > has not yet finished adding new buffers).
> > > > > > > 10. vcpu 1 adds new buffer in X
> > > > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > > > 
> > > > > > > 
> > > > > > > In this succession of events, when the third snapshot will happen, the
> > > > > > > device won't find
> > > > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > > > any entropy
> > > > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > > > something?
> > > > > > > 
> > > > > > > Cheers,
> > > > > > > Babis
> > > > > > > 
> > > > > > Yes but notice how this is followed by:
> > > > > > 
> > > > > > 12. vcpu 1: sees used buffers in 1-X
> > > > > > 
> > > > > > Driver can notify getrandom I guess?
> > > > > It could, but then we have the exact race condition that VMGENID had,
> > > > > userspace has already consumed stale entropy and there's nothing we
> > > > > can do about that.
> > > > > 
> > > > > Although this is indeed a corner case, it feels like it beats the purpose
> > > > > of having the hardware update directly userspace (via copy on leak).
> > > > > 
> > > > > How do you feel about the proposal a couple of emails back? It looks to
> > > > > me that it avoids completely the race condition.
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > It does. The problem of course is that this means that e.g.
> > > > taking a snapshot of a guest that is stuck won't work well.
> > > That is true, but does it matter? The intention of the proposal
> > > is that if it is not safe to take snapshots (i.e. no buffers in the
> > > queue) don't take snapshots.
> > > 
> > > > I have been thinking of adding MAP/UNMAP descriptors for
> > > > a while now. Thus it will be possible to modify
> > > > userspace memory without consuming buffers.
> > > > Would something like this solve the problem?
> > > I am not familiar with MAP/UNMAP descriptors. Is there
> > > a link where I can read about them?
> > > 
> > > Cheers,
> > > Babis
> > 
> > Heh no I just came up with the name. Will write up in a couple
> > of days, but the idea is that driver does get_user_pages,
> > adds buffer to queue, and device will remember the address
> > and change that memory on a snapshot. If there are buffers
> > in the queue it will also use these to tell driver,
> > but if there are no buffers then it won't.
> 
> That sounds like a nice mechanism. However in our case the page
> holding the counter that gets increased by the hardware is a kernel
> page.
> 
> The reason for that is that things other than us (virtio-rng) might
> want to notify for leak events. For example, I think that Jason
> intended to use this mechanism to periodically notify user-space
> PRNGs that they need to reseed.
> 
> Cheers,
> Babis


Now I'm lost.
when you write, e.g.:
4. vcpu 0: get getrandom() entropy and cache epoch value
how does vcpu access the epoch?

-- 
MST


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-27 21:47                                         ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-09-28 18:16                                         ` Babis Chalios
  2023-10-13  7:49                                           ` Babis Chalios
  2023-11-02 11:20                                             ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 2 replies; 53+ messages in thread
From: Babis Chalios @ 2023-09-28 18:16 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams



On 27/9/23 23:47, Michael S. Tsirkin wrote:
> On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
>> On 22/9/23 18:01, Michael S. Tsirkin wrote:
>>> On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
>>>> On 22/9/23 17:06, Michael S. Tsirkin wrote:
>>>>> On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
>>>>>> On 19/9/23 12:01, Michael S. Tsirkin wrote:
>>>>>>> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>>>>>>>> Resending to fix e-mail formatting issues (sorry for the spam)
>>>>>>>>
>>>>>>>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>>>>>>>> However, this just
>>>>>>>>>>>>> decreases
>>>>>>>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>>>>>>>> leak event happens it
>>>>>>>>>>>>> might not
>>>>>>>>>>>>> find any buffers to use:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. available buffers to queue 1-X
>>>>>>>>>>>>> 2. available buffers to queue X
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3. poll queue X
>>>>>>>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>>>>>>>> use buffers in X
>>>>>>>>>>>>> 5. avail buffers in queue X
>>>>>>>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>>>>>>>> use buffers in 1-X
>>>>>>>>>>>>> 7. used buffers in queue 1-X
>>>>>>>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>>>>>>>                                        <- leak event 3 (it
>>>>>>>>>>>>> needs buffers in X, race with step 5)
>>>>>>>>>>>>> 9. goto 3
>>>>>>>>>>>> I don't get it. we added buffers in step 5.
>>>>>>>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>>>>>>>> actually add the
>>>>>>>>>>> buffers in X and make
>>>>>>>>>>> them visible to the device?
>>>>>>>>>> Then it will see a single event in 1-X instead of two events.  A leak is
>>>>>>>>>> a leak though, I don't see does it matter how many triggered.
>>>>>>>>>>
>>>>>>>> So the scenario I have in mind is the following:
>>>>>>>>
>>>>>>>> (Epoch here is terminology that I used in the Linux RFC. It is a value
>>>>>>>> maintained by random.c
>>>>>>>> that changes every time a leak event happens).
>>>>>>>>
>>>>>>>> 1. add buffers to 1-X
>>>>>>>> 2. add buffers to X
>>>>>>>> 3. poll queue X
>>>>>>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>>>>>>> 5. Device: First snapshot, uses buffers in X
>>>>>>>> 6. vcpu 1: sees used buffers
>>>>>>>> 7. Device: Second snapshot, uses buffers in 1-X
>>>>>>>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>>>>>>>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
>>>>>>>> has not yet finished adding new buffers).
>>>>>>>> 10. vcpu 1 adds new buffer in X
>>>>>>>> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
>>>>>>>>
>>>>>>>>
>>>>>>>> In this succession of events, when the third snapshot will happen, the
>>>>>>>> device won't find
>>>>>>>> any buffers in either queue, so it won't increase the RNG epoch value. So,
>>>>>>>> any entropy
>>>>>>>> gathered after step 8 will be the same across all snapshots. Am I missing
>>>>>>>> something?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Babis
>>>>>>>>
>>>>>>> Yes but notice how this is followed by:
>>>>>>>
>>>>>>> 12. vcpu 1: sees used buffers in 1-X
>>>>>>>
>>>>>>> Driver can notify getrandom I guess?
>>>>>> It could, but then we have the exact race condition that VMGENID had,
>>>>>> userspace has already consumed stale entropy and there's nothing we
>>>>>> can do about that.
>>>>>>
>>>>>> Although this is indeed a corner case, it feels like it beats the purpose
>>>>>> of having the hardware update directly userspace (via copy on leak).
>>>>>>
>>>>>> How do you feel about the proposal a couple of emails back? It looks to
>>>>>> me that it avoids completely the race condition.
>>>>>>
>>>>>> Cheers,
>>>>>> Babis
>>>>> It does. The problem of course is that this means that e.g.
>>>>> taking a snapshot of a guest that is stuck won't work well.
>>>> That is true, but does it matter? The intention of the proposal
>>>> is that if it is not safe to take snapshots (i.e. no buffers in the
>>>> queue) don't take snapshots.
>>>>
>>>>> I have been thinking of adding MAP/UNMAP descriptors for
>>>>> a while now. Thus it will be possible to modify
>>>>> userspace memory without consuming buffers.
>>>>> Would something like this solve the problem?
>>>> I am not familiar with MAP/UNMAP descriptors. Is there
>>>> a link where I can read about them?
>>>>
>>>> Cheers,
>>>> Babis
>>> Heh no I just came up with the name. Will write up in a couple
>>> of days, but the idea is that driver does get_user_pages,
>>> adds buffer to queue, and device will remember the address
>>> and change that memory on a snapshot. If there are buffers
>>> in the queue it will also use these to tell driver,
>>> but if there are no buffers then it won't.
>> That sounds like a nice mechanism. However in our case the page
>> holding the counter that gets increased by the hardware is a kernel
>> page.
>>
>> The reason for that is that things other than us (virtio-rng) might
>> want to notify for leak events. For example, I think that Jason
>> intended to use this mechanism to periodically notify user-space
>> PRNGs that they need to reseed.
>>
>> Cheers,
>> Babis
>
> Now I'm lost.
> when you write, e.g.:
> 4. vcpu 0: get getrandom() entropy and cache epoch value
> how does vcpu access the epoch?

The kernel provides a user space API to map a pointer to the epoch
value. User space then caches its value and checks it every time it
needs to make sure that no entropy leak has happened before using
cached kernel entropy.

virtio-rng driver adds a copy on leak command to the queue for
increasing this value (that's what we are speaking about in this thread).
But other systems might want to report "leaks", such as random.c
itself.

Cheers,
Babis

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-28 18:16                                         ` Babis Chalios
@ 2023-10-13  7:49                                           ` Babis Chalios
  2023-10-13 13:38                                               ` [virtio-comment] " Michael S. Tsirkin
  2023-11-02 11:20                                             ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-10-13  7:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams



On 28/9/23 20:16, Babis Chalios wrote:
>
>
> On 27/9/23 23:47, Michael S. Tsirkin wrote:
>> On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
>>> On 22/9/23 18:01, Michael S. Tsirkin wrote:
>>>> On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
>>>>> On 22/9/23 17:06, Michael S. Tsirkin wrote:
>>>>>> On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
>>>>>>> On 19/9/23 12:01, Michael S. Tsirkin wrote:
>>>>>>>> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>>>>>>>>> Resending to fix e-mail formatting issues (sorry for the spam)
>>>>>>>>>
>>>>>>>>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>>>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>>>>>>>>> However, this just
>>>>>>>>>>>>>> decreases
>>>>>>>>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>>>>>>>>> leak event happens it
>>>>>>>>>>>>>> might not
>>>>>>>>>>>>>> find any buffers to use:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. available buffers to queue 1-X
>>>>>>>>>>>>>> 2. available buffers to queue X
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. poll queue X
>>>>>>>>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>>>>>>>>> use buffers in X
>>>>>>>>>>>>>> 5. avail buffers in queue X
>>>>>>>>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>>>>>>>>> use buffers in 1-X
>>>>>>>>>>>>>> 7. used buffers in queue 1-X
>>>>>>>>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>>>>>>>> <- leak event 3 (it
>>>>>>>>>>>>>> needs buffers in X, race with step 5)
>>>>>>>>>>>>>> 9. goto 3
>>>>>>>>>>>>> I don't get it. we added buffers in step 5.
>>>>>>>>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>>>>>>>>> actually add the
>>>>>>>>>>>> buffers in X and make
>>>>>>>>>>>> them visible to the device?
>>>>>>>>>>> Then it will see a single event in 1-X instead of two 
>>>>>>>>>>> events.  A leak is
>>>>>>>>>>> a leak though, I don't see does it matter how many triggered.
>>>>>>>>>>>
>>>>>>>>> So the scenario I have in mind is the following:
>>>>>>>>>
>>>>>>>>> (Epoch here is terminology that I used in the Linux RFC. It is 
>>>>>>>>> a value
>>>>>>>>> maintained by random.c
>>>>>>>>> that changes every time a leak event happens).
>>>>>>>>>
>>>>>>>>> 1. add buffers to 1-X
>>>>>>>>> 2. add buffers to X
>>>>>>>>> 3. poll queue X
>>>>>>>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>>>>>>>> 5. Device: First snapshot, uses buffers in X
>>>>>>>>> 6. vcpu 1: sees used buffers
>>>>>>>>> 7. Device: Second snapshot, uses buffers in 1-X
>>>>>>>>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>>>>>>>>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 
>>>>>>>>> from step 6
>>>>>>>>> has not yet finished adding new buffers).
>>>>>>>>> 10. vcpu 1 adds new buffer in X
>>>>>>>>> 11. vcpu 0: getrandom() will not see new epoch and gets stale 
>>>>>>>>> entropy.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> In this succession of events, when the third snapshot will 
>>>>>>>>> happen, the
>>>>>>>>> device won't find
>>>>>>>>> any buffers in either queue, so it won't increase the RNG 
>>>>>>>>> epoch value. So,
>>>>>>>>> any entropy
>>>>>>>>> gathered after step 8 will be the same across all snapshots. 
>>>>>>>>> Am I missing
>>>>>>>>> something?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Babis
>>>>>>>>>
>>>>>>>> Yes but notice how this is followed by:
>>>>>>>>
>>>>>>>> 12. vcpu 1: sees used buffers in 1-X
>>>>>>>>
>>>>>>>> Driver can notify getrandom I guess?
>>>>>>> It could, but then we have the exact race condition that VMGENID 
>>>>>>> had,
>>>>>>> userspace has already consumed stale entropy and there's nothing we
>>>>>>> can do about that.
>>>>>>>
>>>>>>> Although this is indeed a corner case, it feels like it beats 
>>>>>>> the purpose
>>>>>>> of having the hardware update directly userspace (via copy on 
>>>>>>> leak).
>>>>>>>
>>>>>>> How do you feel about the proposal a couple of emails back? It 
>>>>>>> looks to
>>>>>>> me that it avoids completely the race condition.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Babis
>>>>>> It does. The problem of course is that this means that e.g.
>>>>>> taking a snapshot of a guest that is stuck won't work well.
>>>>> That is true, but does it matter? The intention of the proposal
>>>>> is that if it is not safe to take snapshots (i.e. no buffers in the
>>>>> queue) don't take snapshots.
>>>>>
>>>>>> I have been thinking of adding MAP/UNMAP descriptors for
>>>>>> a while now. Thus it will be possible to modify
>>>>>> userspace memory without consuming buffers.
>>>>>> Would something like this solve the problem?
>>>>> I am not familiar with MAP/UNMAP descriptors. Is there
>>>>> a link where I can read about them?
>>>>>
>>>>> Cheers,
>>>>> Babis
>>>> Heh no I just came up with the name. Will write up in a couple
>>>> of days, but the idea is that driver does get_user_pages,
>>>> adds buffer to queue, and device will remember the address
>>>> and change that memory on a snapshot. If there are buffers
>>>> in the queue it will also use these to tell driver,
>>>> but if there are no buffers then it won't.
>>> That sounds like a nice mechanism. However in our case the page
>>> holding the counter that gets increased by the hardware is a kernel
>>> page.
>>>
>>> The reason for that is that things other than us (virtio-rng) might
>>> want to notify for leak events. For example, I think that Jason
>>> intended to use this mechanism to periodically notify user-space
>>> PRNGs that they need to reseed.
>>>
>>> Cheers,
>>> Babis
>>
>> Now I'm lost.
>> when you write, e.g.:
>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>> how does vcpu access the epoch?
>
> The kernel provides a user space API to map a pointer to the epoch
> value. User space then caches its value and checks it every time it
> needs to make sure that no entropy leak has happened before using
> cached kernel entropy.
>
> virtio-rng driver adds a copy on leak command to the queue for
> increasing this value (that's what we are speaking about in this thread).
> But other systems might want to report "leaks", such as random.c
> itself.
>
> Cheers,
> Babis

Hey Michael, does this explain the flow of things?

To summarize, end-to-end things work like this:

1. kernel allocates a page where an epoch value is stored.
2. User-space can mmap this page and look for changes in its value to
     know when it needs to re-seed its PRNGs.
3. virtio-rng driver gets a hold of the address of the page and programs
     device to update it when entropy is leaked
4. Other kernel sub-systems can do the same if they have a concept of
     entropy leaking. For example, random.c might want to do this 
periodically

Regarding your suggestion for MAP/UNMAP buffers, I think it could work
if it was working with kernel addresses as well. The device doesn't care
about user vs kernel space guest addresses after all. However, if we were to
use such a mechanism, it seems that we don't need the two leak queues
anyway.

So, it sounds like we can have a setup with one leak queue only. If we use a
MAP/UNMAP type of buffer then all is fine. If we use normal buffers, we can
add the requirement that the VMM needs to refuse to take a snapshot if there
is not a copy on leak buffer in the queue.

Thoughts?

Cheers,
Babis

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-10-13  7:49                                           ` Babis Chalios
@ 2023-10-13 13:38                                               ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-10-13 13:38 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Fri, Oct 13, 2023 at 09:49:40AM +0200, Babis Chalios wrote:
> 
> 
> On 28/9/23 20:16, Babis Chalios wrote:
> > 
> > 
> > On 27/9/23 23:47, Michael S. Tsirkin wrote:
> > > On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
> > > > On 22/9/23 18:01, Michael S. Tsirkin wrote:
> > > > > On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> > > > > > On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > > > > > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > > > > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > > > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > > > > > > 
> > > > > > > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > > > > > > However, this just
> > > > > > > > > > > > > > > decreases
> > > > > > > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > > > > > > leak event happens it
> > > > > > > > > > > > > > > might not
> > > > > > > > > > > > > > > find any buffers to use:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > > > > > > use buffers in X
> > > > > > > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > > > > > > use buffers in 1-X
> > > > > > > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > > > > > > > <- leak event 3 (it
> > > > > > > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > > > > > > 9. goto 3
> > > > > > > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > > > > > > actually add the
> > > > > > > > > > > > > buffers in X and make
> > > > > > > > > > > > > them visible to the device?
> > > > > > > > > > > > Then it will see a single event
> > > > > > > > > > > > in 1-X instead of two events.  A
> > > > > > > > > > > > leak is
> > > > > > > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > > > > > > 
> > > > > > > > > > So the scenario I have in mind is the following:
> > > > > > > > > > 
> > > > > > > > > > (Epoch here is terminology that I used
> > > > > > > > > > in the Linux RFC. It is a value
> > > > > > > > > > maintained by random.c
> > > > > > > > > > that changes every time a leak event happens).
> > > > > > > > > > 
> > > > > > > > > > 1. add buffers to 1-X
> > > > > > > > > > 2. add buffers to X
> > > > > > > > > > 3. poll queue X
> > > > > > > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > > > > > > 5. Device: First snapshot, uses buffers in X
> > > > > > > > > > 6. vcpu 1: sees used buffers
> > > > > > > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > > > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > > > > > > 9. Device: Third snapshot, no buffers in
> > > > > > > > > > either queue, (vcpu 1 from step 6
> > > > > > > > > > has not yet finished adding new buffers).
> > > > > > > > > > 10. vcpu 1 adds new buffer in X
> > > > > > > > > > 11. vcpu 0: getrandom() will not see new
> > > > > > > > > > epoch and gets stale entropy.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > In this succession of events, when the
> > > > > > > > > > third snapshot will happen, the
> > > > > > > > > > device won't find
> > > > > > > > > > any buffers in either queue, so it won't
> > > > > > > > > > increase the RNG epoch value. So,
> > > > > > > > > > any entropy
> > > > > > > > > > gathered after step 8 will be the same
> > > > > > > > > > across all snapshots. Am I missing
> > > > > > > > > > something?
> > > > > > > > > > 
> > > > > > > > > > Cheers,
> > > > > > > > > > Babis
> > > > > > > > > > 
> > > > > > > > > Yes but notice how this is followed by:
> > > > > > > > > 
> > > > > > > > > 12. vcpu 1: sees used buffers in 1-X
> > > > > > > > > 
> > > > > > > > > Driver can notify getrandom I guess?
> > > > > > > > It could, but then we have the exact race
> > > > > > > > condition that VMGENID had,
> > > > > > > > userspace has already consumed stale entropy and there's nothing we
> > > > > > > > can do about that.
> > > > > > > > 
> > > > > > > > Although this is indeed a corner case, it feels
> > > > > > > > like it beats the purpose
> > > > > > > > of having the hardware update directly userspace
> > > > > > > > (via copy on leak).
> > > > > > > > 
> > > > > > > > How do you feel about the proposal a couple of
> > > > > > > > emails back? It looks to
> > > > > > > > me that it avoids completely the race condition.
> > > > > > > > 
> > > > > > > > Cheers,
> > > > > > > > Babis
> > > > > > > It does. The problem of course is that this means that e.g.
> > > > > > > taking a snapshot of a guest that is stuck won't work well.
> > > > > > That is true, but does it matter? The intention of the proposal
> > > > > > is that if it is not safe to take snapshots (i.e. no buffers in the
> > > > > > queue) don't take snapshots.
> > > > > > 
> > > > > > > I have been thinking of adding MAP/UNMAP descriptors for
> > > > > > > a while now. Thus it will be possible to modify
> > > > > > > userspace memory without consuming buffers.
> > > > > > > Would something like this solve the problem?
> > > > > > I am not familiar with MAP/UNMAP descriptors. Is there
> > > > > > a link where I can read about them?
> > > > > > 
> > > > > > Cheers,
> > > > > > Babis
> > > > > Heh no I just came up with the name. Will write up in a couple
> > > > > of days, but the idea is that driver does get_user_pages,
> > > > > adds buffer to queue, and device will remember the address
> > > > > and change that memory on a snapshot. If there are buffers
> > > > > in the queue it will also use these to tell driver,
> > > > > but if there are no buffers then it won't.
> > > > That sounds like a nice mechanism. However in our case the page
> > > > holding the counter that gets increased by the hardware is a kernel
> > > > page.
> > > > 
> > > > The reason for that is that things other than us (virtio-rng) might
> > > > want to notify for leak events. For example, I think that Jason
> > > > intended to use this mechanism to periodically notify user-space
> > > > PRNGs that they need to reseed.
> > > > 
> > > > Cheers,
> > > > Babis
> > > 
> > > Now I'm lost.
> > > when you write, e.g.:
> > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > how does vcpu access the epoch?
> > 
> > The kernel provides a user space API to map a pointer to the epoch
> > value. User space then caches its value and checks it every time it
> > needs to make sure that no entropy leak has happened before using
> > cached kernel entropy.
> > 
> > virtio-rng driver adds a copy on leak command to the queue for
> > increasing this value (that's what we are speaking about in this thread).
> > But other systems might want to report "leaks", such as random.c
> > itself.
> > 
> > Cheers,
> > Babis
> 
> Hey Michael, does this explain the flow of things?
> 
> To summarize, end-to-end things work like this:
> 
> 1. kernel allocates a page where an epoch value is stored.
> 2. User-space can mmap this page and look for changes in its value to
>     know when it needs to re-seed its PRNGs.
> 3. virtio-rng driver gets a hold of the address of the page and programs
>     device to update it when entropy is leaked
> 4. Other kernel sub-systems can do the same if they have a concept of
>     entropy leaking. For example, random.c might want to do this
> periodically
> 
> Regarding your suggestion for MAP/UNMAP buffers, I think it could work
> if it was working with kernel addresses as well. The device doesn't care
> about user vs kernel space guest addresses after all. However, if we were to
> use such a mechanism, it seems that we don't need the two leak queues
> anyway.
> 
> So, it sounds like we can have a setup with one leak queue only. If we use a
> MAP/UNMAP type of buffer then all is fine. If we use normal buffers, we can
> add the requirement that the VMM needs to refuse to take a snapshot if there
> is not a copy on leak buffer in the queue.
> 
> Thoughts?
> 
> Cheers,
> Babis

Lost context a bit. I'll think it over. Thanks!


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-10-13 13:38                                               ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-10-13 13:38 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Fri, Oct 13, 2023 at 09:49:40AM +0200, Babis Chalios wrote:
> 
> 
> On 28/9/23 20:16, Babis Chalios wrote:
> > 
> > 
> > On 27/9/23 23:47, Michael S. Tsirkin wrote:
> > > On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
> > > > On 22/9/23 18:01, Michael S. Tsirkin wrote:
> > > > > On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> > > > > > On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > > > > > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > > > > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > > > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > > > > > > 
> > > > > > > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > > > > > > However, this just
> > > > > > > > > > > > > > > decreases
> > > > > > > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > > > > > > leak event happens it
> > > > > > > > > > > > > > > might not
> > > > > > > > > > > > > > > find any buffers to use:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > > > > > > use buffers in X
> > > > > > > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > > > > > > use buffers in 1-X
> > > > > > > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > > > > > > > <- leak event 3 (it
> > > > > > > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > > > > > > 9. goto 3
> > > > > > > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > > > > > > actually add the
> > > > > > > > > > > > > buffers in X and make
> > > > > > > > > > > > > them visible to the device?
> > > > > > > > > > > > Then it will see a single event
> > > > > > > > > > > > in 1-X instead of two events.  A
> > > > > > > > > > > > leak is
> > > > > > > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > > > > > > 
> > > > > > > > > > So the scenario I have in mind is the following:
> > > > > > > > > > 
> > > > > > > > > > (Epoch here is terminology that I used
> > > > > > > > > > in the Linux RFC. It is a value
> > > > > > > > > > maintained by random.c
> > > > > > > > > > that changes every time a leak event happens).
> > > > > > > > > > 
> > > > > > > > > > 1. add buffers to 1-X
> > > > > > > > > > 2. add buffers to X
> > > > > > > > > > 3. poll queue X
> > > > > > > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > > > > > > 5. Device: First snapshot, uses buffers in X
> > > > > > > > > > 6. vcpu 1: sees used buffers
> > > > > > > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > > > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > > > > > > 9. Device: Third snapshot, no buffers in
> > > > > > > > > > either queue, (vcpu 1 from step 6
> > > > > > > > > > has not yet finished adding new buffers).
> > > > > > > > > > 10. vcpu 1 adds new buffer in X
> > > > > > > > > > 11. vcpu 0: getrandom() will not see new
> > > > > > > > > > epoch and gets stale entropy.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > In this succession of events, when the
> > > > > > > > > > third snapshot will happen, the
> > > > > > > > > > device won't find
> > > > > > > > > > any buffers in either queue, so it won't
> > > > > > > > > > increase the RNG epoch value. So,
> > > > > > > > > > any entropy
> > > > > > > > > > gathered after step 8 will be the same
> > > > > > > > > > across all snapshots. Am I missing
> > > > > > > > > > something?
> > > > > > > > > > 
> > > > > > > > > > Cheers,
> > > > > > > > > > Babis
> > > > > > > > > > 
> > > > > > > > > Yes but notice how this is followed by:
> > > > > > > > > 
> > > > > > > > > 12. vcpu 1: sees used buffers in 1-X
> > > > > > > > > 
> > > > > > > > > Driver can notify getrandom I guess?
> > > > > > > > It could, but then we have the exact race
> > > > > > > > condition that VMGENID had,
> > > > > > > > userspace has already consumed stale entropy and there's nothing we
> > > > > > > > can do about that.
> > > > > > > > 
> > > > > > > > Although this is indeed a corner case, it feels
> > > > > > > > like it beats the purpose
> > > > > > > > of having the hardware update directly userspace
> > > > > > > > (via copy on leak).
> > > > > > > > 
> > > > > > > > How do you feel about the proposal a couple of
> > > > > > > > emails back? It looks to
> > > > > > > > me that it avoids completely the race condition.
> > > > > > > > 
> > > > > > > > Cheers,
> > > > > > > > Babis
> > > > > > > It does. The problem of course is that this means that e.g.
> > > > > > > taking a snapshot of a guest that is stuck won't work well.
> > > > > > That is true, but does it matter? The intention of the proposal
> > > > > > is that if it is not safe to take snapshots (i.e. no buffers in the
> > > > > > queue) don't take snapshots.
> > > > > > 
> > > > > > > I have been thinking of adding MAP/UNMAP descriptors for
> > > > > > > a while now. Thus it will be possible to modify
> > > > > > > userspace memory without consuming buffers.
> > > > > > > Would something like this solve the problem?
> > > > > > I am not familiar with MAP/UNMAP descriptors. Is there
> > > > > > a link where I can read about them?
> > > > > > 
> > > > > > Cheers,
> > > > > > Babis
> > > > > Heh no I just came up with the name. Will write up in a couple
> > > > > of days, but the idea is that driver does get_user_pages,
> > > > > adds buffer to queue, and device will remember the address
> > > > > and change that memory on a snapshot. If there are buffers
> > > > > in the queue it will also use these to tell driver,
> > > > > but if there are no buffers then it won't.
> > > > That sounds like a nice mechanism. However in our case the page
> > > > holding the counter that gets increased by the hardware is a kernel
> > > > page.
> > > > 
> > > > The reason for that is that things other than us (virtio-rng) might
> > > > want to notify for leak events. For example, I think that Jason
> > > > intended to use this mechanism to periodically notify user-space
> > > > PRNGs that they need to reseed.
> > > > 
> > > > Cheers,
> > > > Babis
> > > 
> > > Now I'm lost.
> > > when you write, e.g.:
> > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > how does vcpu access the epoch?
> > 
> > The kernel provides a user space API to map a pointer to the epoch
> > value. User space then caches its value and checks it every time it
> > needs to make sure that no entropy leak has happened before using
> > cached kernel entropy.
> > 
> > virtio-rng driver adds a copy on leak command to the queue for
> > increasing this value (that's what we are speaking about in this thread).
> > But other systems might want to report "leaks", such as random.c
> > itself.
> > 
> > Cheers,
> > Babis
> 
> Hey Michael, does this explain the flow of things?
> 
> To summarize, end-to-end things work like this:
> 
> 1. kernel allocates a page where an epoch value is stored.
> 2. User-space can mmap this page and look for changes in its value to
>     know when it needs to re-seed its PRNGs.
> 3. virtio-rng driver gets a hold of the address of the page and programs
>     device to update it when entropy is leaked
> 4. Other kernel sub-systems can do the same if they have a concept of
>     entropy leaking. For example, random.c might want to do this
> periodically
> 
> Regarding your suggestion for MAP/UNMAP buffers, I think it could work
> if it was working with kernel addresses as well. The device doesn't care
> about user vs kernel space guest addresses after all. However, if we were to
> use such a mechanism, it seems that we don't need the two leak queues
> anyway.
> 
> So, it sounds like we can have a setup with one leak queue only. If we use a
> MAP/UNMAP type of buffer then all is fine. If we use normal buffers, we can
> add the requirement that the VMM needs to refuse to take a snapshot if there
> is not a copy on leak buffer in the queue.
> 
> Thoughts?
> 
> Cheers,
> Babis

Lost context a bit. I'll think it over. Thanks!


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-28 18:16                                         ` Babis Chalios
@ 2023-11-02 11:20                                             ` Michael S. Tsirkin
  2023-11-02 11:20                                             ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-11-02 11:20 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Thu, Sep 28, 2023 at 08:16:11PM +0200, Babis Chalios wrote:
> 
> 
> On 27/9/23 23:47, Michael S. Tsirkin wrote:
> > On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
> > > On 22/9/23 18:01, Michael S. Tsirkin wrote:
> > > > On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> > > > > On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > > > > > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > > > > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > > > > > 
> > > > > > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > > > > > However, this just
> > > > > > > > > > > > > > decreases
> > > > > > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > > > > > leak event happens it
> > > > > > > > > > > > > > might not
> > > > > > > > > > > > > > find any buffers to use:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > > > > > use buffers in X
> > > > > > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > > > > > use buffers in 1-X
> > > > > > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > > > > > >                                        <- leak event 3 (it
> > > > > > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > > > > > 9. goto 3
> > > > > > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > > > > > actually add the
> > > > > > > > > > > > buffers in X and make
> > > > > > > > > > > > them visible to the device?
> > > > > > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > > > > > 
> > > > > > > > > So the scenario I have in mind is the following:
> > > > > > > > > 
> > > > > > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > > > > > maintained by random.c
> > > > > > > > > that changes every time a leak event happens).
> > > > > > > > > 
> > > > > > > > > 1. add buffers to 1-X
> > > > > > > > > 2. add buffers to X
> > > > > > > > > 3. poll queue X
> > > > > > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > > > > > 5. Device: First snapshot, uses buffers in X
> > > > > > > > > 6. vcpu 1: sees used buffers
> > > > > > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > > > > > has not yet finished adding new buffers).
> > > > > > > > > 10. vcpu 1 adds new buffer in X
> > > > > > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > In this succession of events, when the third snapshot will happen, the
> > > > > > > > > device won't find
> > > > > > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > > > > > any entropy
> > > > > > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > > > > > something?
> > > > > > > > > 
> > > > > > > > > Cheers,
> > > > > > > > > Babis
> > > > > > > > > 
> > > > > > > > Yes but notice how this is followed by:
> > > > > > > > 
> > > > > > > > 12. vcpu 1: sees used buffers in 1-X
> > > > > > > > 
> > > > > > > > Driver can notify getrandom I guess?
> > > > > > > It could, but then we have the exact race condition that VMGENID had,
> > > > > > > userspace has already consumed stale entropy and there's nothing we
> > > > > > > can do about that.
> > > > > > > 
> > > > > > > Although this is indeed a corner case, it feels like it beats the purpose
> > > > > > > of having the hardware update directly userspace (via copy on leak).
> > > > > > > 
> > > > > > > How do you feel about the proposal a couple of emails back? It looks to
> > > > > > > me that it avoids completely the race condition.
> > > > > > > 
> > > > > > > Cheers,
> > > > > > > Babis
> > > > > > It does. The problem of course is that this means that e.g.
> > > > > > taking a snapshot of a guest that is stuck won't work well.
> > > > > That is true, but does it matter? The intention of the proposal
> > > > > is that if it is not safe to take snapshots (i.e. no buffers in the
> > > > > queue) don't take snapshots.
> > > > > 
> > > > > > I have been thinking of adding MAP/UNMAP descriptors for
> > > > > > a while now. Thus it will be possible to modify
> > > > > > userspace memory without consuming buffers.
> > > > > > Would something like this solve the problem?
> > > > > I am not familiar with MAP/UNMAP descriptors. Is there
> > > > > a link where I can read about them?
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > Heh no I just came up with the name. Will write up in a couple
> > > > of days, but the idea is that driver does get_user_pages,
> > > > adds buffer to queue, and device will remember the address
> > > > and change that memory on a snapshot. If there are buffers
> > > > in the queue it will also use these to tell driver,
> > > > but if there are no buffers then it won't.
> > > That sounds like a nice mechanism. However in our case the page
> > > holding the counter that gets increased by the hardware is a kernel
> > > page.
> > > 
> > > The reason for that is that things other than us (virtio-rng) might
> > > want to notify for leak events. For example, I think that Jason
> > > intended to use this mechanism to periodically notify user-space
> > > PRNGs that they need to reseed.
> > > 
> > > Cheers,
> > > Babis
> > 
> > Now I'm lost.
> > when you write, e.g.:
> > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > how does vcpu access the epoch?
> 
> The kernel provides a user space API to map a pointer to the epoch
> value. User space then caches its value and checks it every time it
> needs to make sure that no entropy leak has happened before using
> cached kernel entropy.
> 
> virtio-rng driver adds a copy on leak command to the queue for
> increasing this value (that's what we are speaking about in this thread).
> But other systems might want to report "leaks", such as random.c
> itself.
> 
> Cheers,
> Babis


This idea would be fine but I don't see how it's so different
from VMGENID. Care to explain?


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-11-02 11:20                                             ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-11-02 11:20 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Thu, Sep 28, 2023 at 08:16:11PM +0200, Babis Chalios wrote:
> 
> 
> On 27/9/23 23:47, Michael S. Tsirkin wrote:
> > On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
> > > On 22/9/23 18:01, Michael S. Tsirkin wrote:
> > > > On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> > > > > On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > > > > > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > > > > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > > > > > 
> > > > > > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > > > > > However, this just
> > > > > > > > > > > > > > decreases
> > > > > > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > > > > > leak event happens it
> > > > > > > > > > > > > > might not
> > > > > > > > > > > > > > find any buffers to use:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > > > > > use buffers in X
> > > > > > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > > > > > use buffers in 1-X
> > > > > > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > > > > > >                                        <- leak event 3 (it
> > > > > > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > > > > > 9. goto 3
> > > > > > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > > > > > actually add the
> > > > > > > > > > > > buffers in X and make
> > > > > > > > > > > > them visible to the device?
> > > > > > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > > > > > 
> > > > > > > > > So the scenario I have in mind is the following:
> > > > > > > > > 
> > > > > > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > > > > > maintained by random.c
> > > > > > > > > that changes every time a leak event happens).
> > > > > > > > > 
> > > > > > > > > 1. add buffers to 1-X
> > > > > > > > > 2. add buffers to X
> > > > > > > > > 3. poll queue X
> > > > > > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > > > > > 5. Device: First snapshot, uses buffers in X
> > > > > > > > > 6. vcpu 1: sees used buffers
> > > > > > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > > > > > has not yet finished adding new buffers).
> > > > > > > > > 10. vcpu 1 adds new buffer in X
> > > > > > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > In this succession of events, when the third snapshot will happen, the
> > > > > > > > > device won't find
> > > > > > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > > > > > any entropy
> > > > > > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > > > > > something?
> > > > > > > > > 
> > > > > > > > > Cheers,
> > > > > > > > > Babis
> > > > > > > > > 
> > > > > > > > Yes but notice how this is followed by:
> > > > > > > > 
> > > > > > > > 12. vcpu 1: sees used buffers in 1-X
> > > > > > > > 
> > > > > > > > Driver can notify getrandom I guess?
> > > > > > > It could, but then we have the exact race condition that VMGENID had,
> > > > > > > userspace has already consumed stale entropy and there's nothing we
> > > > > > > can do about that.
> > > > > > > 
> > > > > > > Although this is indeed a corner case, it feels like it beats the purpose
> > > > > > > of having the hardware update directly userspace (via copy on leak).
> > > > > > > 
> > > > > > > How do you feel about the proposal a couple of emails back? It looks to
> > > > > > > me that it avoids completely the race condition.
> > > > > > > 
> > > > > > > Cheers,
> > > > > > > Babis
> > > > > > It does. The problem of course is that this means that e.g.
> > > > > > taking a snapshot of a guest that is stuck won't work well.
> > > > > That is true, but does it matter? The intention of the proposal
> > > > > is that if it is not safe to take snapshots (i.e. no buffers in the
> > > > > queue) don't take snapshots.
> > > > > 
> > > > > > I have been thinking of adding MAP/UNMAP descriptors for
> > > > > > a while now. Thus it will be possible to modify
> > > > > > userspace memory without consuming buffers.
> > > > > > Would something like this solve the problem?
> > > > > I am not familiar with MAP/UNMAP descriptors. Is there
> > > > > a link where I can read about them?
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > Heh no I just came up with the name. Will write up in a couple
> > > > of days, but the idea is that driver does get_user_pages,
> > > > adds buffer to queue, and device will remember the address
> > > > and change that memory on a snapshot. If there are buffers
> > > > in the queue it will also use these to tell driver,
> > > > but if there are no buffers then it won't.
> > > That sounds like a nice mechanism. However in our case the page
> > > holding the counter that gets increased by the hardware is a kernel
> > > page.
> > > 
> > > The reason for that is that things other than us (virtio-rng) might
> > > want to notify for leak events. For example, I think that Jason
> > > intended to use this mechanism to periodically notify user-space
> > > PRNGs that they need to reseed.
> > > 
> > > Cheers,
> > > Babis
> > 
> > Now I'm lost.
> > when you write, e.g.:
> > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > how does vcpu access the epoch?
> 
> The kernel provides a user space API to map a pointer to the epoch
> value. User space then caches its value and checks it every time it
> needs to make sure that no entropy leak has happened before using
> cached kernel entropy.
> 
> virtio-rng driver adds a copy on leak command to the queue for
> increasing this value (that's what we are speaking about in this thread).
> But other systems might want to report "leaks", such as random.c
> itself.
> 
> Cheers,
> Babis


This idea would be fine but I don't see how it's so different
from VMGENID. Care to explain?


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-09-22 15:40                                 ` Babis Chalios
@ 2023-11-02 11:25                                     ` Michael S. Tsirkin
  2023-11-02 11:25                                     ` [virtio-comment] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-11-02 11:25 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> 
> 
> On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > 
> > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > However, this just
> > > > > > > > > > decreases
> > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > leak event happens it
> > > > > > > > > > might not
> > > > > > > > > > find any buffers to use:
> > > > > > > > > > 
> > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 3. poll queue X
> > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > use buffers in X
> > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > use buffers in 1-X
> > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > >                                      <- leak event 3 (it
> > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > 9. goto 3
> > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > actually add the
> > > > > > > > buffers in X and make
> > > > > > > > them visible to the device?
> > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > 
> > > > > So the scenario I have in mind is the following:
> > > > > 
> > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > maintained by random.c
> > > > > that changes every time a leak event happens).
> > > > > 
> > > > > 1. add buffers to 1-X
> > > > > 2. add buffers to X
> > > > > 3. poll queue X
> > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > 5. Device: First snapshot, uses buffers in X
> > > > > 6. vcpu 1: sees used buffers
> > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > has not yet finished adding new buffers).
> > > > > 10. vcpu 1 adds new buffer in X
> > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > 
> > > > > 
> > > > > In this succession of events, when the third snapshot will happen, the
> > > > > device won't find
> > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > any entropy
> > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > something?
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > > 
> > > > Yes but notice how this is followed by:
> > > > 
> > > > 12. vcpu 1: sees used buffers in 1-X
> > > > 
> > > > Driver can notify getrandom I guess?
> > > It could, but then we have the exact race condition that VMGENID had,
> > > userspace has already consumed stale entropy and there's nothing we
> > > can do about that.
> > > 
> > > Although this is indeed a corner case, it feels like it beats the purpose
> > > of having the hardware update directly userspace (via copy on leak).
> > > 
> > > How do you feel about the proposal a couple of emails back? It looks to
> > > me that it avoids completely the race condition.
> > > 
> > > Cheers,
> > > Babis
> > It does. The problem of course is that this means that e.g.
> > taking a snapshot of a guest that is stuck won't work well.
> 
> That is true, but does it matter? The intention of the proposal
> is that if it is not safe to take snapshots (i.e. no buffers in the
> queue) don't take snapshots.

OK. Basically I think if there's a way for device to detect that
guest is stuck and not refilling the queue in a timely
manner, then we are ok - host will make its own decisions
on whether to snapshot or not.

However, I feel in that case we need a way to create a big
backlog of buffers for guest to fill such that this
ring empty condition is very unlikely.
One or even 2 queues does not seem enough then.

For example, I can see a "stop" command that will
tell device: "stop consuming buffers" and device
will stop consuming buffers until the next leak event.




> > I have been thinking of adding MAP/UNMAP descriptors for
> > a while now. Thus it will be possible to modify
> > userspace memory without consuming buffers.
> > Would something like this solve the problem?
> 
> I am not familiar with MAP/UNMAP descriptors. Is there
> a link where I can read about them?
> 
> Cheers,
> Babis


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-11-02 11:25                                     ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-11-02 11:25 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> 
> 
> On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > 
> > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > However, this just
> > > > > > > > > > decreases
> > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > leak event happens it
> > > > > > > > > > might not
> > > > > > > > > > find any buffers to use:
> > > > > > > > > > 
> > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 3. poll queue X
> > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > use buffers in X
> > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > use buffers in 1-X
> > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > >                                      <- leak event 3 (it
> > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > 9. goto 3
> > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > actually add the
> > > > > > > > buffers in X and make
> > > > > > > > them visible to the device?
> > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > 
> > > > > So the scenario I have in mind is the following:
> > > > > 
> > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > maintained by random.c
> > > > > that changes every time a leak event happens).
> > > > > 
> > > > > 1. add buffers to 1-X
> > > > > 2. add buffers to X
> > > > > 3. poll queue X
> > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > 5. Device: First snapshot, uses buffers in X
> > > > > 6. vcpu 1: sees used buffers
> > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > has not yet finished adding new buffers).
> > > > > 10. vcpu 1 adds new buffer in X
> > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > 
> > > > > 
> > > > > In this succession of events, when the third snapshot will happen, the
> > > > > device won't find
> > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > any entropy
> > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > something?
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > > 
> > > > Yes but notice how this is followed by:
> > > > 
> > > > 12. vcpu 1: sees used buffers in 1-X
> > > > 
> > > > Driver can notify getrandom I guess?
> > > It could, but then we have the exact race condition that VMGENID had,
> > > userspace has already consumed stale entropy and there's nothing we
> > > can do about that.
> > > 
> > > Although this is indeed a corner case, it feels like it beats the purpose
> > > of having the hardware update directly userspace (via copy on leak).
> > > 
> > > How do you feel about the proposal a couple of emails back? It looks to
> > > me that it avoids completely the race condition.
> > > 
> > > Cheers,
> > > Babis
> > It does. The problem of course is that this means that e.g.
> > taking a snapshot of a guest that is stuck won't work well.
> 
> That is true, but does it matter? The intention of the proposal
> is that if it is not safe to take snapshots (i.e. no buffers in the
> queue) don't take snapshots.

OK. Basically I think if there's a way for device to detect that
guest is stuck and not refilling the queue in a timely
manner, then we are ok - host will make its own decisions
on whether to snapshot or not.

However, I feel in that case we need a way to create a big
backlog of buffers for guest to fill such that this
ring empty condition is very unlikely.
One or even 2 queues does not seem enough then.

For example, I can see a "stop" command that will
tell device: "stop consuming buffers" and device
will stop consuming buffers until the next leak event.




> > I have been thinking of adding MAP/UNMAP descriptors for
> > a while now. Thus it will be possible to modify
> > userspace memory without consuming buffers.
> > Would something like this solve the problem?
> 
> I am not familiar with MAP/UNMAP descriptors. Is there
> a link where I can read about them?
> 
> Cheers,
> Babis


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-11-02 11:20                                             ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-11-02 11:38                                             ` Babis Chalios
  2023-11-02 11:51                                                 ` [virtio-comment] " Michael S. Tsirkin
  -1 siblings, 1 reply; 53+ messages in thread
From: Babis Chalios @ 2023-11-02 11:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams


On 2/11/23 12:20, Michael S. Tsirkin wrote:
>
> On Thu, Sep 28, 2023 at 08:16:11PM +0200, Babis Chalios wrote:
>>
>> On 27/9/23 23:47, Michael S. Tsirkin wrote:
>>> On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
>>>> On 22/9/23 18:01, Michael S. Tsirkin wrote:
>>>>> On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
>>>>>> On 22/9/23 17:06, Michael S. Tsirkin wrote:
>>>>>>> On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
>>>>>>>> On 19/9/23 12:01, Michael S. Tsirkin wrote:
>>>>>>>>> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>>>>>>>>>> Resending to fix e-mail formatting issues (sorry for the spam)
>>>>>>>>>>
>>>>>>>>>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>>>>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>>>>>>>>>> However, this just
>>>>>>>>>>>>>>> decreases
>>>>>>>>>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>>>>>>>>>> leak event happens it
>>>>>>>>>>>>>>> might not
>>>>>>>>>>>>>>> find any buffers to use:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. available buffers to queue 1-X
>>>>>>>>>>>>>>> 2. available buffers to queue X
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. poll queue X
>>>>>>>>>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>>>>>>>>>> use buffers in X
>>>>>>>>>>>>>>> 5. avail buffers in queue X
>>>>>>>>>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>>>>>>>>>> use buffers in 1-X
>>>>>>>>>>>>>>> 7. used buffers in queue 1-X
>>>>>>>>>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>>>>>>>>>                                         <- leak event 3 (it
>>>>>>>>>>>>>>> needs buffers in X, race with step 5)
>>>>>>>>>>>>>>> 9. goto 3
>>>>>>>>>>>>>> I don't get it. we added buffers in step 5.
>>>>>>>>>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>>>>>>>>>> actually add the
>>>>>>>>>>>>> buffers in X and make
>>>>>>>>>>>>> them visible to the device?
>>>>>>>>>>>> Then it will see a single event in 1-X instead of two events.  A leak is
>>>>>>>>>>>> a leak though, I don't see does it matter how many triggered.
>>>>>>>>>>>>
>>>>>>>>>> So the scenario I have in mind is the following:
>>>>>>>>>>
>>>>>>>>>> (Epoch here is terminology that I used in the Linux RFC. It is a value
>>>>>>>>>> maintained by random.c
>>>>>>>>>> that changes every time a leak event happens).
>>>>>>>>>>
>>>>>>>>>> 1. add buffers to 1-X
>>>>>>>>>> 2. add buffers to X
>>>>>>>>>> 3. poll queue X
>>>>>>>>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>>>>>>>>> 5. Device: First snapshot, uses buffers in X
>>>>>>>>>> 6. vcpu 1: sees used buffers
>>>>>>>>>> 7. Device: Second snapshot, uses buffers in 1-X
>>>>>>>>>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>>>>>>>>>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
>>>>>>>>>> has not yet finished adding new buffers).
>>>>>>>>>> 10. vcpu 1 adds new buffer in X
>>>>>>>>>> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> In this succession of events, when the third snapshot will happen, the
>>>>>>>>>> device won't find
>>>>>>>>>> any buffers in either queue, so it won't increase the RNG epoch value. So,
>>>>>>>>>> any entropy
>>>>>>>>>> gathered after step 8 will be the same across all snapshots. Am I missing
>>>>>>>>>> something?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Babis
>>>>>>>>>>
>>>>>>>>> Yes but notice how this is followed by:
>>>>>>>>>
>>>>>>>>> 12. vcpu 1: sees used buffers in 1-X
>>>>>>>>>
>>>>>>>>> Driver can notify getrandom I guess?
>>>>>>>> It could, but then we have the exact race condition that VMGENID had,
>>>>>>>> userspace has already consumed stale entropy and there's nothing we
>>>>>>>> can do about that.
>>>>>>>>
>>>>>>>> Although this is indeed a corner case, it feels like it beats the purpose
>>>>>>>> of having the hardware update directly userspace (via copy on leak).
>>>>>>>>
>>>>>>>> How do you feel about the proposal a couple of emails back? It looks to
>>>>>>>> me that it avoids completely the race condition.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Babis
>>>>>>> It does. The problem of course is that this means that e.g.
>>>>>>> taking a snapshot of a guest that is stuck won't work well.
>>>>>> That is true, but does it matter? The intention of the proposal
>>>>>> is that if it is not safe to take snapshots (i.e. no buffers in the
>>>>>> queue) don't take snapshots.
>>>>>>
>>>>>>> I have been thinking of adding MAP/UNMAP descriptors for
>>>>>>> a while now. Thus it will be possible to modify
>>>>>>> userspace memory without consuming buffers.
>>>>>>> Would something like this solve the problem?
>>>>>> I am not familiar with MAP/UNMAP descriptors. Is there
>>>>>> a link where I can read about them?
>>>>>>
>>>>>> Cheers,
>>>>>> Babis
>>>>> Heh no I just came up with the name. Will write up in a couple
>>>>> of days, but the idea is that driver does get_user_pages,
>>>>> adds buffer to queue, and device will remember the address
>>>>> and change that memory on a snapshot. If there are buffers
>>>>> in the queue it will also use these to tell driver,
>>>>> but if there are no buffers then it won't.
>>>> That sounds like a nice mechanism. However in our case the page
>>>> holding the counter that gets increased by the hardware is a kernel
>>>> page.
>>>>
>>>> The reason for that is that things other than us (virtio-rng) might
>>>> want to notify for leak events. For example, I think that Jason
>>>> intended to use this mechanism to periodically notify user-space
>>>> PRNGs that they need to reseed.
>>>>
>>>> Cheers,
>>>> Babis
>>> Now I'm lost.
>>> when you write, e.g.:
>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>> how does vcpu access the epoch?
>> The kernel provides a user space API to map a pointer to the epoch
>> value. User space then caches its value and checks it every time it
>> needs to make sure that no entropy leak has happened before using
>> cached kernel entropy.
>>
>> virtio-rng driver adds a copy on leak command to the queue for
>> increasing this value (that's what we are speaking about in this thread).
>> But other systems might want to report "leaks", such as random.c
>> itself.
>>
>> Cheers,
>> Babis
>
> This idea would be fine but I don't see how it's so different
> from VMGENID. Care to explain?
>
It is different in that the memory is owned by the guest kernel, not
the hardware. In this case, random.c maintains it. This allows, systems
other than the hardware, e.g. virtio-rng to notify about entropy leak
events. For example, random.c itself can periodically do that (I think
Jason had that use-case in mind).

The fact that we as well mmap that memory to the user-space is just
for giving user space a mechanism that allows it to know when it needs
to reseed its PRNGs. We _could_ have done the same with VMGENID,
but the fact that, in this case, the underlying physical memory is owned
by the ACPI device blocks us from letting other systems sending 
notifications
as well.

Maybe it makes sense to have this discussion in the patch we sent for
Linux on LKML [1]?

Cheers,
Babis

[1] 
https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/#mb1242999d8296169d9a4ee1a0805005633ec146a

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-11-02 11:25                                     ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-11-02 11:51                                     ` Babis Chalios
  -1 siblings, 0 replies; 53+ messages in thread
From: Babis Chalios @ 2023-11-02 11:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams



On 2/11/23 12:25, Michael S. Tsirkin wrote:
>
>
> On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
>>
>> On 22/9/23 17:06, Michael S. Tsirkin wrote:
>>> On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
>>>> On 19/9/23 12:01, Michael S. Tsirkin wrote:
>>>>> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>>>>>> Resending to fix e-mail formatting issues (sorry for the spam)
>>>>>>
>>>>>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>>>>>> However, this just
>>>>>>>>>>> decreases
>>>>>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>>>>>> leak event happens it
>>>>>>>>>>> might not
>>>>>>>>>>> find any buffers to use:
>>>>>>>>>>>
>>>>>>>>>>> 1. available buffers to queue 1-X
>>>>>>>>>>> 2. available buffers to queue X
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 3. poll queue X
>>>>>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>>>>>> use buffers in X
>>>>>>>>>>> 5. avail buffers in queue X
>>>>>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>>>>>> use buffers in 1-X
>>>>>>>>>>> 7. used buffers in queue 1-X
>>>>>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>>>>>                                       <- leak event 3 (it
>>>>>>>>>>> needs buffers in X, race with step 5)
>>>>>>>>>>> 9. goto 3
>>>>>>>>>> I don't get it. we added buffers in step 5.
>>>>>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>>>>>> actually add the
>>>>>>>>> buffers in X and make
>>>>>>>>> them visible to the device?
>>>>>>>> Then it will see a single event in 1-X instead of two events.  A leak is
>>>>>>>> a leak though, I don't see does it matter how many triggered.
>>>>>>>>
>>>>>> So the scenario I have in mind is the following:
>>>>>>
>>>>>> (Epoch here is terminology that I used in the Linux RFC. It is a value
>>>>>> maintained by random.c
>>>>>> that changes every time a leak event happens).
>>>>>>
>>>>>> 1. add buffers to 1-X
>>>>>> 2. add buffers to X
>>>>>> 3. poll queue X
>>>>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>>>>> 5. Device: First snapshot, uses buffers in X
>>>>>> 6. vcpu 1: sees used buffers
>>>>>> 7. Device: Second snapshot, uses buffers in 1-X
>>>>>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>>>>>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
>>>>>> has not yet finished adding new buffers).
>>>>>> 10. vcpu 1 adds new buffer in X
>>>>>> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
>>>>>>
>>>>>>
>>>>>> In this succession of events, when the third snapshot will happen, the
>>>>>> device won't find
>>>>>> any buffers in either queue, so it won't increase the RNG epoch value. So,
>>>>>> any entropy
>>>>>> gathered after step 8 will be the same across all snapshots. Am I missing
>>>>>> something?
>>>>>>
>>>>>> Cheers,
>>>>>> Babis
>>>>>>
>>>>> Yes but notice how this is followed by:
>>>>>
>>>>> 12. vcpu 1: sees used buffers in 1-X
>>>>>
>>>>> Driver can notify getrandom I guess?
>>>> It could, but then we have the exact race condition that VMGENID had,
>>>> userspace has already consumed stale entropy and there's nothing we
>>>> can do about that.
>>>>
>>>> Although this is indeed a corner case, it feels like it beats the purpose
>>>> of having the hardware update directly userspace (via copy on leak).
>>>>
>>>> How do you feel about the proposal a couple of emails back? It looks to
>>>> me that it avoids completely the race condition.
>>>>
>>>> Cheers,
>>>> Babis
>>> It does. The problem of course is that this means that e.g.
>>> taking a snapshot of a guest that is stuck won't work well.
>> That is true, but does it matter? The intention of the proposal
>> is that if it is not safe to take snapshots (i.e. no buffers in the
>> queue) don't take snapshots.
> OK. Basically I think if there's a way for device to detect that
> guest is stuck and not refilling the queue in a timely
> manner, then we are ok - host will make its own decisions
> on whether to snapshot or not.
>
> However, I feel in that case we need a way to create a big
> backlog of buffers for guest to fill such that this
> ring empty condition is very unlikely.
> One or even 2 queues does not seem enough then.
>
> For example, I can see a "stop" command that will
> tell device: "stop consuming buffers" and device
> will stop consuming buffers until the next leak event.
>
Yup, that seems reasonable to me. In that case, we could
have a single queue, where the driver will fill up with multiple
batches of commands where the last one is "stop" command, and
then back-fill as needed.

That should make it very unlikely for a well-behaving guest to
run out of entropy leak commands in the queue. However, even
with that design, I think there is value in putting something in the
spec about the fact that the host might want to consider if it should,
or not, take a snapshot when it finds the leak queue empty.

Cheers,
Babis

>
>
>>> I have been thinking of adding MAP/UNMAP descriptors for
>>> a while now. Thus it will be possible to modify
>>> userspace memory without consuming buffers.
>>> Would something like this solve the problem?
>> I am not familiar with MAP/UNMAP descriptors. Is there
>> a link where I can read about them?
>>
>> Cheers,
>> Babis


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-11-02 11:38                                             ` Babis Chalios
@ 2023-11-02 11:51                                                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-11-02 11:51 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Thu, Nov 02, 2023 at 12:38:28PM +0100, Babis Chalios wrote:
> 
> On 2/11/23 12:20, Michael S. Tsirkin wrote:
> > 
> > On Thu, Sep 28, 2023 at 08:16:11PM +0200, Babis Chalios wrote:
> > > 
> > > On 27/9/23 23:47, Michael S. Tsirkin wrote:
> > > > On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
> > > > > On 22/9/23 18:01, Michael S. Tsirkin wrote:
> > > > > > On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> > > > > > > On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > > > > > > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > > > > > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > > > > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > > > > > > > 
> > > > > > > > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > > > > > > > However, this just
> > > > > > > > > > > > > > > > decreases
> > > > > > > > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > > > > > > > leak event happens it
> > > > > > > > > > > > > > > > might not
> > > > > > > > > > > > > > > > find any buffers to use:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > > > > > > > use buffers in X
> > > > > > > > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > > > > > > > use buffers in 1-X
> > > > > > > > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > > > > > > > >                                         <- leak event 3 (it
> > > > > > > > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > > > > > > > 9. goto 3
> > > > > > > > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > > > > > > > actually add the
> > > > > > > > > > > > > > buffers in X and make
> > > > > > > > > > > > > > them visible to the device?
> > > > > > > > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > > > > > > > 
> > > > > > > > > > > So the scenario I have in mind is the following:
> > > > > > > > > > > 
> > > > > > > > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > > > > > > > maintained by random.c
> > > > > > > > > > > that changes every time a leak event happens).
> > > > > > > > > > > 
> > > > > > > > > > > 1. add buffers to 1-X
> > > > > > > > > > > 2. add buffers to X
> > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > > > > > > > 5. Device: First snapshot, uses buffers in X
> > > > > > > > > > > 6. vcpu 1: sees used buffers
> > > > > > > > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > > > > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > > > > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > > > > > > > has not yet finished adding new buffers).
> > > > > > > > > > > 10. vcpu 1 adds new buffer in X
> > > > > > > > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > In this succession of events, when the third snapshot will happen, the
> > > > > > > > > > > device won't find
> > > > > > > > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > > > > > > > any entropy
> > > > > > > > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > > > > > > > something?
> > > > > > > > > > > 
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Babis
> > > > > > > > > > > 
> > > > > > > > > > Yes but notice how this is followed by:
> > > > > > > > > > 
> > > > > > > > > > 12. vcpu 1: sees used buffers in 1-X
> > > > > > > > > > 
> > > > > > > > > > Driver can notify getrandom I guess?
> > > > > > > > > It could, but then we have the exact race condition that VMGENID had,
> > > > > > > > > userspace has already consumed stale entropy and there's nothing we
> > > > > > > > > can do about that.
> > > > > > > > > 
> > > > > > > > > Although this is indeed a corner case, it feels like it beats the purpose
> > > > > > > > > of having the hardware update directly userspace (via copy on leak).
> > > > > > > > > 
> > > > > > > > > How do you feel about the proposal a couple of emails back? It looks to
> > > > > > > > > me that it avoids completely the race condition.
> > > > > > > > > 
> > > > > > > > > Cheers,
> > > > > > > > > Babis
> > > > > > > > It does. The problem of course is that this means that e.g.
> > > > > > > > taking a snapshot of a guest that is stuck won't work well.
> > > > > > > That is true, but does it matter? The intention of the proposal
> > > > > > > is that if it is not safe to take snapshots (i.e. no buffers in the
> > > > > > > queue) don't take snapshots.
> > > > > > > 
> > > > > > > > I have been thinking of adding MAP/UNMAP descriptors for
> > > > > > > > a while now. Thus it will be possible to modify
> > > > > > > > userspace memory without consuming buffers.
> > > > > > > > Would something like this solve the problem?
> > > > > > > I am not familiar with MAP/UNMAP descriptors. Is there
> > > > > > > a link where I can read about them?
> > > > > > > 
> > > > > > > Cheers,
> > > > > > > Babis
> > > > > > Heh no I just came up with the name. Will write up in a couple
> > > > > > of days, but the idea is that driver does get_user_pages,
> > > > > > adds buffer to queue, and device will remember the address
> > > > > > and change that memory on a snapshot. If there are buffers
> > > > > > in the queue it will also use these to tell driver,
> > > > > > but if there are no buffers then it won't.
> > > > > That sounds like a nice mechanism. However in our case the page
> > > > > holding the counter that gets increased by the hardware is a kernel
> > > > > page.
> > > > > 
> > > > > The reason for that is that things other than us (virtio-rng) might
> > > > > want to notify for leak events. For example, I think that Jason
> > > > > intended to use this mechanism to periodically notify user-space
> > > > > PRNGs that they need to reseed.
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > Now I'm lost.
> > > > when you write, e.g.:
> > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > how does vcpu access the epoch?
> > > The kernel provides a user space API to map a pointer to the epoch
> > > value. User space then caches its value and checks it every time it
> > > needs to make sure that no entropy leak has happened before using
> > > cached kernel entropy.
> > > 
> > > virtio-rng driver adds a copy on leak command to the queue for
> > > increasing this value (that's what we are speaking about in this thread).
> > > But other systems might want to report "leaks", such as random.c
> > > itself.
> > > 
> > > Cheers,
> > > Babis
> > 
> > This idea would be fine but I don't see how it's so different
> > from VMGENID. Care to explain?
> > 
> It is different in that the memory is owned by the guest kernel, not
> the hardware. In this case, random.c maintains it. This allows, systems
> other than the hardware, e.g. virtio-rng to notify about entropy leak
> events. For example, random.c itself can periodically do that (I think
> Jason had that use-case in mind).

No sure I understand. Example?

> 
> The fact that we as well mmap that memory to the user-space is just
> for giving user space a mechanism that allows it to know when it needs
> to reseed its PRNGs. We _could_ have done the same with VMGENID,
> but the fact that, in this case, the underlying physical memory is owned
> by the ACPI device blocks us from letting other systems sending
> notifications
> as well.
> 
> Maybe it makes sense to have this discussion in the patch we sent for
> Linux on LKML [1]?
> 
> Cheers,
> Babis
> 
> [1] https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/#mb1242999d8296169d9a4ee1a0805005633ec146a


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [virtio-comment] Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
@ 2023-11-02 11:51                                                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2023-11-02 11:51 UTC (permalink / raw)
  To: Babis Chalios
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On Thu, Nov 02, 2023 at 12:38:28PM +0100, Babis Chalios wrote:
> 
> On 2/11/23 12:20, Michael S. Tsirkin wrote:
> > 
> > On Thu, Sep 28, 2023 at 08:16:11PM +0200, Babis Chalios wrote:
> > > 
> > > On 27/9/23 23:47, Michael S. Tsirkin wrote:
> > > > On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
> > > > > On 22/9/23 18:01, Michael S. Tsirkin wrote:
> > > > > > On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
> > > > > > > On 22/9/23 17:06, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
> > > > > > > > > On 19/9/23 12:01, Michael S. Tsirkin wrote:
> > > > > > > > > > On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
> > > > > > > > > > > Resending to fix e-mail formatting issues (sorry for the spam)
> > > > > > > > > > > 
> > > > > > > > > > > On 18/9/23 18:30, Babis Chalios wrote:
> > > > > > > > > > > > > > > > Yes, that's what the driver does now in the RFC patch.
> > > > > > > > > > > > > > > > However, this just
> > > > > > > > > > > > > > > > decreases
> > > > > > > > > > > > > > > > the race window, it doesn't eliminate it. If a third
> > > > > > > > > > > > > > > > leak event happens it
> > > > > > > > > > > > > > > > might not
> > > > > > > > > > > > > > > > find any buffers to use:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > 1. available buffers to queue 1-X
> > > > > > > > > > > > > > > > 2. available buffers to queue X
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > > > > > > 4. used buffers in queue X       <- leak event 1 will
> > > > > > > > > > > > > > > > use buffers in X
> > > > > > > > > > > > > > > > 5. avail buffers in queue X
> > > > > > > > > > > > > > > > 6. poll queue 1-X                <- leak event 2 will
> > > > > > > > > > > > > > > > use buffers in 1-X
> > > > > > > > > > > > > > > > 7. used buffers in queue 1-X
> > > > > > > > > > > > > > > > 8. avail buffers in queue 1-X
> > > > > > > > > > > > > > > >                                         <- leak event 3 (it
> > > > > > > > > > > > > > > > needs buffers in X, race with step 5)
> > > > > > > > > > > > > > > > 9. goto 3
> > > > > > > > > > > > > > > I don't get it. we added buffers in step 5.
> > > > > > > > > > > > > > What if the leak event 3 arrives before step 5 had time to
> > > > > > > > > > > > > > actually add the
> > > > > > > > > > > > > > buffers in X and make
> > > > > > > > > > > > > > them visible to the device?
> > > > > > > > > > > > > Then it will see a single event in 1-X instead of two events.  A leak is
> > > > > > > > > > > > > a leak though, I don't see does it matter how many triggered.
> > > > > > > > > > > > > 
> > > > > > > > > > > So the scenario I have in mind is the following:
> > > > > > > > > > > 
> > > > > > > > > > > (Epoch here is terminology that I used in the Linux RFC. It is a value
> > > > > > > > > > > maintained by random.c
> > > > > > > > > > > that changes every time a leak event happens).
> > > > > > > > > > > 
> > > > > > > > > > > 1. add buffers to 1-X
> > > > > > > > > > > 2. add buffers to X
> > > > > > > > > > > 3. poll queue X
> > > > > > > > > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > > > > > > > > 5. Device: First snapshot, uses buffers in X
> > > > > > > > > > > 6. vcpu 1: sees used buffers
> > > > > > > > > > > 7. Device: Second snapshot, uses buffers in 1-X
> > > > > > > > > > > 8. vcpu 0: getrandom() observes new  epoch value & caches it
> > > > > > > > > > > 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
> > > > > > > > > > > has not yet finished adding new buffers).
> > > > > > > > > > > 10. vcpu 1 adds new buffer in X
> > > > > > > > > > > 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > In this succession of events, when the third snapshot will happen, the
> > > > > > > > > > > device won't find
> > > > > > > > > > > any buffers in either queue, so it won't increase the RNG epoch value. So,
> > > > > > > > > > > any entropy
> > > > > > > > > > > gathered after step 8 will be the same across all snapshots. Am I missing
> > > > > > > > > > > something?
> > > > > > > > > > > 
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Babis
> > > > > > > > > > > 
> > > > > > > > > > Yes but notice how this is followed by:
> > > > > > > > > > 
> > > > > > > > > > 12. vcpu 1: sees used buffers in 1-X
> > > > > > > > > > 
> > > > > > > > > > Driver can notify getrandom I guess?
> > > > > > > > > It could, but then we have the exact race condition that VMGENID had,
> > > > > > > > > userspace has already consumed stale entropy and there's nothing we
> > > > > > > > > can do about that.
> > > > > > > > > 
> > > > > > > > > Although this is indeed a corner case, it feels like it beats the purpose
> > > > > > > > > of having the hardware update directly userspace (via copy on leak).
> > > > > > > > > 
> > > > > > > > > How do you feel about the proposal a couple of emails back? It looks to
> > > > > > > > > me that it avoids completely the race condition.
> > > > > > > > > 
> > > > > > > > > Cheers,
> > > > > > > > > Babis
> > > > > > > > It does. The problem of course is that this means that e.g.
> > > > > > > > taking a snapshot of a guest that is stuck won't work well.
> > > > > > > That is true, but does it matter? The intention of the proposal
> > > > > > > is that if it is not safe to take snapshots (i.e. no buffers in the
> > > > > > > queue) don't take snapshots.
> > > > > > > 
> > > > > > > > I have been thinking of adding MAP/UNMAP descriptors for
> > > > > > > > a while now. Thus it will be possible to modify
> > > > > > > > userspace memory without consuming buffers.
> > > > > > > > Would something like this solve the problem?
> > > > > > > I am not familiar with MAP/UNMAP descriptors. Is there
> > > > > > > a link where I can read about them?
> > > > > > > 
> > > > > > > Cheers,
> > > > > > > Babis
> > > > > > Heh no I just came up with the name. Will write up in a couple
> > > > > > of days, but the idea is that driver does get_user_pages,
> > > > > > adds buffer to queue, and device will remember the address
> > > > > > and change that memory on a snapshot. If there are buffers
> > > > > > in the queue it will also use these to tell driver,
> > > > > > but if there are no buffers then it won't.
> > > > > That sounds like a nice mechanism. However in our case the page
> > > > > holding the counter that gets increased by the hardware is a kernel
> > > > > page.
> > > > > 
> > > > > The reason for that is that things other than us (virtio-rng) might
> > > > > want to notify for leak events. For example, I think that Jason
> > > > > intended to use this mechanism to periodically notify user-space
> > > > > PRNGs that they need to reseed.
> > > > > 
> > > > > Cheers,
> > > > > Babis
> > > > Now I'm lost.
> > > > when you write, e.g.:
> > > > 4. vcpu 0: get getrandom() entropy and cache epoch value
> > > > how does vcpu access the epoch?
> > > The kernel provides a user space API to map a pointer to the epoch
> > > value. User space then caches its value and checks it every time it
> > > needs to make sure that no entropy leak has happened before using
> > > cached kernel entropy.
> > > 
> > > virtio-rng driver adds a copy on leak command to the queue for
> > > increasing this value (that's what we are speaking about in this thread).
> > > But other systems might want to report "leaks", such as random.c
> > > itself.
> > > 
> > > Cheers,
> > > Babis
> > 
> > This idea would be fine but I don't see how it's so different
> > from VMGENID. Care to explain?
> > 
> It is different in that the memory is owned by the guest kernel, not
> the hardware. In this case, random.c maintains it. This allows, systems
> other than the hardware, e.g. virtio-rng to notify about entropy leak
> events. For example, random.c itself can periodically do that (I think
> Jason had that use-case in mind).

No sure I understand. Example?

> 
> The fact that we as well mmap that memory to the user-space is just
> for giving user space a mechanism that allows it to know when it needs
> to reseed its PRNGs. We _could_ have done the same with VMGENID,
> but the fact that, in this case, the underlying physical memory is owned
> by the ACPI device blocks us from letting other systems sending
> notifications
> as well.
> 
> Maybe it makes sense to have this discussion in the patch we sent for
> Linux on LKML [1]?
> 
> Cheers,
> Babis
> 
> [1] https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/#mb1242999d8296169d9a4ee1a0805005633ec146a


This publicly archived list offers a means to provide input to the
OASIS Virtual I/O Device (VIRTIO) TC.

In order to verify user consent to the Feedback License terms and
to minimize spam in the list archive, subscription is required
before posting.

Subscribe: virtio-comment-subscribe@lists.oasis-open.org
Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
List help: virtio-comment-help@lists.oasis-open.org
List archive: https://lists.oasis-open.org/archives/virtio-comment/
Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
Committee: https://www.oasis-open.org/committees/virtio/
Join OASIS: https://www.oasis-open.org/join/


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [virtio-dev] [PATCH RFC 3/3] rng: leak detection support
  2023-11-02 11:51                                                 ` [virtio-comment] " Michael S. Tsirkin
  (?)
@ 2023-11-02 13:42                                                 ` Babis Chalios
  -1 siblings, 0 replies; 53+ messages in thread
From: Babis Chalios @ 2023-11-02 13:42 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-comment, virtio-dev, Cali, Marco, Graf (AWS),
	Alexander, Jason A. Donenfeld, aams

On 2/11/23 12:51, Michael S. Tsirkin wrote:
>
> On Thu, Nov 02, 2023 at 12:38:28PM +0100, Babis Chalios wrote:
>> On 2/11/23 12:20, Michael S. Tsirkin wrote:
>>> On Thu, Sep 28, 2023 at 08:16:11PM +0200, Babis Chalios wrote:
>>>> On 27/9/23 23:47, Michael S. Tsirkin wrote:
>>>>> On Wed, Sep 27, 2023 at 12:43:20PM +0200, Babis Chalios wrote:
>>>>>> On 22/9/23 18:01, Michael S. Tsirkin wrote:
>>>>>>> On Fri, Sep 22, 2023 at 05:40:50PM +0200, Babis Chalios wrote:
>>>>>>>> On 22/9/23 17:06, Michael S. Tsirkin wrote:
>>>>>>>>> On Tue, Sep 19, 2023 at 12:11:37PM +0200, Babis Chalios wrote:
>>>>>>>>>> On 19/9/23 12:01, Michael S. Tsirkin wrote:
>>>>>>>>>>> On Tue, Sep 19, 2023 at 09:32:08AM +0200, Babis Chalios wrote:
>>>>>>>>>>>> Resending to fix e-mail formatting issues (sorry for the spam)
>>>>>>>>>>>>
>>>>>>>>>>>> On 18/9/23 18:30, Babis Chalios wrote:
>>>>>>>>>>>>>>>>> Yes, that's what the driver does now in the RFC patch.
>>>>>>>>>>>>>>>>> However, this just
>>>>>>>>>>>>>>>>> decreases
>>>>>>>>>>>>>>>>> the race window, it doesn't eliminate it. If a third
>>>>>>>>>>>>>>>>> leak event happens it
>>>>>>>>>>>>>>>>> might not
>>>>>>>>>>>>>>>>> find any buffers to use:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. available buffers to queue 1-X
>>>>>>>>>>>>>>>>> 2. available buffers to queue X
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 3. poll queue X
>>>>>>>>>>>>>>>>> 4. used buffers in queue X       <- leak event 1 will
>>>>>>>>>>>>>>>>> use buffers in X
>>>>>>>>>>>>>>>>> 5. avail buffers in queue X
>>>>>>>>>>>>>>>>> 6. poll queue 1-X                <- leak event 2 will
>>>>>>>>>>>>>>>>> use buffers in 1-X
>>>>>>>>>>>>>>>>> 7. used buffers in queue 1-X
>>>>>>>>>>>>>>>>> 8. avail buffers in queue 1-X
>>>>>>>>>>>>>>>>>                                          <- leak event 3 (it
>>>>>>>>>>>>>>>>> needs buffers in X, race with step 5)
>>>>>>>>>>>>>>>>> 9. goto 3
>>>>>>>>>>>>>>>> I don't get it. we added buffers in step 5.
>>>>>>>>>>>>>>> What if the leak event 3 arrives before step 5 had time to
>>>>>>>>>>>>>>> actually add the
>>>>>>>>>>>>>>> buffers in X and make
>>>>>>>>>>>>>>> them visible to the device?
>>>>>>>>>>>>>> Then it will see a single event in 1-X instead of two events.  A leak is
>>>>>>>>>>>>>> a leak though, I don't see does it matter how many triggered.
>>>>>>>>>>>>>>
>>>>>>>>>>>> So the scenario I have in mind is the following:
>>>>>>>>>>>>
>>>>>>>>>>>> (Epoch here is terminology that I used in the Linux RFC. It is a value
>>>>>>>>>>>> maintained by random.c
>>>>>>>>>>>> that changes every time a leak event happens).
>>>>>>>>>>>>
>>>>>>>>>>>> 1. add buffers to 1-X
>>>>>>>>>>>> 2. add buffers to X
>>>>>>>>>>>> 3. poll queue X
>>>>>>>>>>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>>>>>>>>>>> 5. Device: First snapshot, uses buffers in X
>>>>>>>>>>>> 6. vcpu 1: sees used buffers
>>>>>>>>>>>> 7. Device: Second snapshot, uses buffers in 1-X
>>>>>>>>>>>> 8. vcpu 0: getrandom() observes new  epoch value & caches it
>>>>>>>>>>>> 9. Device: Third snapshot, no buffers in either queue, (vcpu 1 from step 6
>>>>>>>>>>>> has not yet finished adding new buffers).
>>>>>>>>>>>> 10. vcpu 1 adds new buffer in X
>>>>>>>>>>>> 11. vcpu 0: getrandom() will not see new epoch and gets stale entropy.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In this succession of events, when the third snapshot will happen, the
>>>>>>>>>>>> device won't find
>>>>>>>>>>>> any buffers in either queue, so it won't increase the RNG epoch value. So,
>>>>>>>>>>>> any entropy
>>>>>>>>>>>> gathered after step 8 will be the same across all snapshots. Am I missing
>>>>>>>>>>>> something?
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Babis
>>>>>>>>>>>>
>>>>>>>>>>> Yes but notice how this is followed by:
>>>>>>>>>>>
>>>>>>>>>>> 12. vcpu 1: sees used buffers in 1-X
>>>>>>>>>>>
>>>>>>>>>>> Driver can notify getrandom I guess?
>>>>>>>>>> It could, but then we have the exact race condition that VMGENID had,
>>>>>>>>>> userspace has already consumed stale entropy and there's nothing we
>>>>>>>>>> can do about that.
>>>>>>>>>>
>>>>>>>>>> Although this is indeed a corner case, it feels like it beats the purpose
>>>>>>>>>> of having the hardware update directly userspace (via copy on leak).
>>>>>>>>>>
>>>>>>>>>> How do you feel about the proposal a couple of emails back? It looks to
>>>>>>>>>> me that it avoids completely the race condition.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Babis
>>>>>>>>> It does. The problem of course is that this means that e.g.
>>>>>>>>> taking a snapshot of a guest that is stuck won't work well.
>>>>>>>> That is true, but does it matter? The intention of the proposal
>>>>>>>> is that if it is not safe to take snapshots (i.e. no buffers in the
>>>>>>>> queue) don't take snapshots.
>>>>>>>>
>>>>>>>>> I have been thinking of adding MAP/UNMAP descriptors for
>>>>>>>>> a while now. Thus it will be possible to modify
>>>>>>>>> userspace memory without consuming buffers.
>>>>>>>>> Would something like this solve the problem?
>>>>>>>> I am not familiar with MAP/UNMAP descriptors. Is there
>>>>>>>> a link where I can read about them?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Babis
>>>>>>> Heh no I just came up with the name. Will write up in a couple
>>>>>>> of days, but the idea is that driver does get_user_pages,
>>>>>>> adds buffer to queue, and device will remember the address
>>>>>>> and change that memory on a snapshot. If there are buffers
>>>>>>> in the queue it will also use these to tell driver,
>>>>>>> but if there are no buffers then it won't.
>>>>>> That sounds like a nice mechanism. However in our case the page
>>>>>> holding the counter that gets increased by the hardware is a kernel
>>>>>> page.
>>>>>>
>>>>>> The reason for that is that things other than us (virtio-rng) might
>>>>>> want to notify for leak events. For example, I think that Jason
>>>>>> intended to use this mechanism to periodically notify user-space
>>>>>> PRNGs that they need to reseed.
>>>>>>
>>>>>> Cheers,
>>>>>> Babis
>>>>> Now I'm lost.
>>>>> when you write, e.g.:
>>>>> 4. vcpu 0: get getrandom() entropy and cache epoch value
>>>>> how does vcpu access the epoch?
>>>> The kernel provides a user space API to map a pointer to the epoch
>>>> value. User space then caches its value and checks it every time it
>>>> needs to make sure that no entropy leak has happened before using
>>>> cached kernel entropy.
>>>>
>>>> virtio-rng driver adds a copy on leak command to the queue for
>>>> increasing this value (that's what we are speaking about in this thread).
>>>> But other systems might want to report "leaks", such as random.c
>>>> itself.
>>>>
>>>> Cheers,
>>>> Babis
>>> This idea would be fine but I don't see how it's so different
>>> from VMGENID. Care to explain?
>>>
>> It is different in that the memory is owned by the guest kernel, not
>> the hardware. In this case, random.c maintains it. This allows, systems
>> other than the hardware, e.g. virtio-rng to notify about entropy leak
>> events. For example, random.c itself can periodically do that (I think
>> Jason had that use-case in mind).
> No sure I understand. Example?

Sure!

Take for example a user-space PRNG that uses getrandom() for seeding.
We would like that this PRNG knows when it is not safe any more to return
new random bits, because the initial seed is not unique any more.
In the Linux patch, we use the copy-on-leak command to do that. What the
copy command does in this case, is increase the value of a memory location.
That memory location is maintained by random.c. random.c allows user
space to mmap the page that stores this value. A user-space PRNG would
then mmap this value and observe it for changes.

Now, there is no reason why only virtio-rng should inform user space PRNGs
about entropy leak events. For example, the kernel PRNG (which is handling
out seeds for the user-space PRNG) might want to mark randomness that it
returned earlier as no longer valid (for reasons unrelated to VMs). With 
VMGENID
this would not be possible. However, now that random.c owns that piece of
memory, more than one entities can use it to notify user space about stale
entropy. And they can work together (details on how we do that are in 
the cover
letter of the LKML patch), we can have both the virtio-rng and the random.c
updating that value.

I hope that makes sense and I am really happy to discuss all this, but 
do you
think it is related with the VirtIO protocol?

Cheers,
Babis

>> The fact that we as well mmap that memory to the user-space is just
>> for giving user space a mechanism that allows it to know when it needs
>> to reseed its PRNGs. We _could_ have done the same with VMGENID,
>> but the fact that, in this case, the underlying physical memory is owned
>> by the ACPI device blocks us from letting other systems sending
>> notifications
>> as well.
>>
>> Maybe it makes sense to have this discussion in the patch we sent for
>> Linux on LKML [1]?
>>
>> Cheers,
>> Babis
>>
>> [1] https://lore.kernel.org/lkml/20230823090107.65749-3-bchalios@amazon.es/T/#mb1242999d8296169d9a4ee1a0805005633ec146a
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2023-11-02 13:42 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-21 16:30 [virtio-comment] [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 1/3] rng: move to a file of its own Michael S. Tsirkin
2022-11-21 16:30 ` [virtio-comment] [PATCH RFC 2/3] rng: be specific about the virtqueue Michael S. Tsirkin
2022-11-21 16:30 ` [virtio-dev] [PATCH RFC 3/3] rng: leak detection support Michael S. Tsirkin
2022-11-25 12:41   ` [virtio-dev] " Babis Chalios
2022-12-12 10:10     ` Babis Chalios
2023-01-11 13:57   ` Babis Chalios
2023-08-31 10:16   ` [virtio-dev] " Babis Chalios
2023-09-12 21:05     ` Michael S. Tsirkin
2023-09-12 21:05       ` [virtio-comment] " Michael S. Tsirkin
2023-09-13  9:32       ` Babis Chalios
2023-09-13  9:37         ` Michael S. Tsirkin
2023-09-13  9:37           ` [virtio-comment] " Michael S. Tsirkin
2023-09-13 11:19           ` Babis Chalios
2023-09-18 11:14             ` Babis Chalios
2023-09-18 12:41             ` Michael S. Tsirkin
2023-09-18 12:41               ` [virtio-comment] " Michael S. Tsirkin
2023-09-18 13:00               ` Babis Chalios
2023-09-18 13:58                 ` Michael S. Tsirkin
2023-09-18 13:58                   ` [virtio-comment] " Michael S. Tsirkin
2023-09-18 14:02                   ` Babis Chalios
2023-09-18 14:05                     ` Michael S. Tsirkin
2023-09-18 14:05                       ` [virtio-comment] " Michael S. Tsirkin
2023-09-18 16:30                       ` Babis Chalios
2023-09-19  7:32                         ` Babis Chalios
2023-09-19 10:01                           ` Michael S. Tsirkin
2023-09-19 10:01                             ` [virtio-comment] " Michael S. Tsirkin
2023-09-19 10:11                             ` Babis Chalios
2023-09-22 12:30                               ` Babis Chalios
2023-09-22 15:06                               ` Michael S. Tsirkin
2023-09-22 15:06                                 ` [virtio-comment] " Michael S. Tsirkin
2023-09-22 15:40                                 ` Babis Chalios
2023-09-22 16:01                                   ` Michael S. Tsirkin
2023-09-22 16:01                                     ` [virtio-comment] " Michael S. Tsirkin
2023-09-27 10:43                                     ` Babis Chalios
2023-09-27 21:47                                       ` Michael S. Tsirkin
2023-09-27 21:47                                         ` [virtio-comment] " Michael S. Tsirkin
2023-09-28 18:16                                         ` Babis Chalios
2023-10-13  7:49                                           ` Babis Chalios
2023-10-13 13:38                                             ` Michael S. Tsirkin
2023-10-13 13:38                                               ` [virtio-comment] " Michael S. Tsirkin
2023-11-02 11:20                                           ` Michael S. Tsirkin
2023-11-02 11:20                                             ` [virtio-comment] " Michael S. Tsirkin
2023-11-02 11:38                                             ` Babis Chalios
2023-11-02 11:51                                               ` Michael S. Tsirkin
2023-11-02 11:51                                                 ` [virtio-comment] " Michael S. Tsirkin
2023-11-02 13:42                                                 ` Babis Chalios
2023-11-02 11:25                                   ` Michael S. Tsirkin
2023-11-02 11:25                                     ` [virtio-comment] " Michael S. Tsirkin
2023-11-02 11:51                                     ` Babis Chalios
2023-01-12  7:02 ` [virtio-dev] Re: [PATCH RFC 0/3] virtio-rng based entropy leak reporting Michael S. Tsirkin
2023-01-16 11:39   ` Babis Chalios
     [not found]     ` <CAHmME9ry2fss2gsbPs2zVJkY=8Cdeae0XFD9FzCVnW67Xy3thA@mail.gmail.com>
2023-01-16 18:11       ` [virtio-comment] " Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.